Issue 807495

Starred by 1 user

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Feature




FR: Ability to make SYNC_COUNT > 1 schedule devices only from same lab

Project Member Reported by dhadd...@chromium.org, Jan 31 2018

Issue description

The autoupdate_P2P test is generally running very well:
https://stainless.corp.google.com/search?view=matrix&row=board&col=build&first_date=2018-01-28&last_date=2018-01-31&test=%5Eautoupdate%5C_P2P%24&exclude_cts=true&exclude_not_run=false&exclude_non_release=true&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=true

However, the number one failure is "P2P update was disabled because no suitable peer DUT was found."

Looking at stainless, you can see that some boards pass every time whereas others fail a lot.

I will investigate what is going on here.

Looking at bob, which fails about 40% of the time, the failures may be due to which DUTs are given to the test.

Passing test:
chromeos6-row3-rack13-host15
chromeos6-row3-rack13-host7

Passing test:
chromeos6-row3-rack13-host7
chromeos6-row4-rack13-host9

Failing test:
chromeos2-row8-rack11-host14
chromeos6-row3-rack13-host11

Failing test:
chromeos2-row8-rack11-host14
chromeos6-row3-rack13-host7

Failing test:
chromeos2-row8-rack11-host14
chromeos6-row3-rack13-host11

The situation is similar on cave, celes, edgar, and espresso.

Passing tests have two DUTs whose hostnames start with the same prefix (chromeos6).
Failing tests have two DUTs whose prefixes differ (chromeos2 vs. chromeos6).

I think this means the test is being given DUTs in different labs.

I checked, though, and chromeos2 and chromeos6 DUTs can ping and ssh into each other.
But apparently they cannot find each other for P2P updates.
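
The hostname pattern above can be checked mechanically. This is only a rough sketch, assuming the lab is identified by the part of the hostname before the first hyphen (e.g. chromeos2, chromeos6); the helper names `lab_prefix` and `in_same_lab` are made up for illustration and are not taken from autotest.

```python
def lab_prefix(hostname):
    """Return the part of a DUT hostname before the first hyphen.

    Assumption: this prefix (e.g. 'chromeos6') identifies the lab.
    """
    return hostname.split('-', 1)[0]


def in_same_lab(hostnames):
    """Return True if every hostname shares exactly one lab prefix."""
    return len({lab_prefix(h) for h in hostnames}) == 1


# Host pairs copied from the passing and failing bob runs listed above.
passing = ['chromeos6-row3-rack13-host15', 'chromeos6-row3-rack13-host7']
failing = ['chromeos2-row8-rack11-host14', 'chromeos6-row3-rack13-host11']

print(in_same_lab(passing))  # True
print(in_same_lab(failing))  # False
```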
Cc: dhadd...@chromium.org
Owner: ahass...@chromium.org
Amin, do you know what the criteria are for P2P updates to work between DUTs?
I think there is more info in this README file:
https://chromium.git.corp.google.com/chromiumos/platform2/+/e3ac496bc7d22a1d2d9080a3b87458eef3dfcf47/p2p/README.md

Can you take a look and see if any information there helps? It seems like the files are advertised through a DNS-SD service.

Could it be some kind of race condition where a device doesn't find another one because the other is in some stale/waiting state?
Cc: pprabhu@chromium.org shuqianz@chromium.org jkop@chromium.org
Summary: with SYNC_COUNT > 1, devices are given from different labs (chromeo2 vs chromeos6) and cannot reach each other for p2p autoupdate (was: autoupdate_P2P failure: "P2P update was disabled because no suitable peer DUT was found.")
+infra deputies for clarification on the network setup 
+pprabhu who did the sync_count magic initially

Deputies a couple of questions:
Does chromeos2 vs chromeos6 mean that the DUTs are in different labs?
So they would be on different LANs and therefore p2p does not work? 

We currently have no support for targeting multi-DUT tests to the same lab. When the DUTs are determined for an incoming multi-DUT test request, *any* two DUTs with matching labels are picked. Since the labels are identical (sync_count just asks for N DUTs with the same labels), these DUTs are always available on a single shard.

So far so good. Now if the DUTs on that shard are split between labs, nothing stops those DUTs from being picked together. 
Re #5: Yep, that reading is correct. Different labs => different vlan => no P2P.
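
To make the missing constraint concrete, here is a hedged sketch of what same-lab selection could look like. It is not the scheduler's actual code: `pick_same_lab_hosts` is a hypothetical function, and it assumes the lab can be read from the hostname prefix before the first hyphen.

```python
from collections import defaultdict


def pick_same_lab_hosts(candidates, sync_count):
    """Pick sync_count hosts that all share one lab prefix.

    candidates: hostnames that already match the requested labels.
    Returns a list of sync_count hostnames, or None if no single lab
    has enough hosts. Assumption: the lab is the hostname prefix
    before the first hyphen.
    """
    by_lab = defaultdict(list)
    for host in candidates:
        by_lab[host.split('-', 1)[0]].append(host)

    # Walk labs in sorted order so the choice is deterministic.
    for lab in sorted(by_lab):
        if len(by_lab[lab]) >= sync_count:
            return by_lab[lab][:sync_count]
    return None


candidates = [
    'chromeos2-row8-rack11-host14',
    'chromeos6-row3-rack13-host11',
    'chromeos6-row3-rack13-host7',
]
print(pick_same_lab_hosts(candidates, 2))
```

With these three candidates, only chromeos6 has two hosts, so the chromeos2 DUT is never paired with a chromeos6 one.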

Project Member

Comment 8 by bugdroid1@chromium.org, Feb 9 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/620ccb3ee7127557ad61e060819264541443870a

commit 620ccb3ee7127557ad61e060819264541443870a
Author: David Haddock <dhaddock@chromium.org>
Date: Fri Feb 09 07:42:29 2018

Skip autoupdate_P2P when scheduler gives us DUTs from different labs.

Also adding in copyright header and renaming control file to be more
consistent.

BUG= chromium:809681 
BUG=chromium:807495
TEST=autoupdate_P2P.local

Change-Id: Id67580664331b1d9a6f64eb8600ecbb0b8a3af85
Reviewed-on: https://chromium-review.googlesource.com/905808
Commit-Ready: David Haddock <dhaddock@chromium.org>
Tested-by: David Haddock <dhaddock@chromium.org>
Reviewed-by: Amin Hassani <ahassani@chromium.org>

[modify] https://crrev.com/620ccb3ee7127557ad61e060819264541443870a/server/site_tests/autoupdate_P2P/autoupdate_P2P.py
[rename] https://crrev.com/620ccb3ee7127557ad61e060819264541443870a/server/site_tests/autoupdate_P2P/control.delta

Labels: -Type-Bug Type-Feature
Owner: ----
Status: Available (was: Untriaged)
Summary: FR: Ability to make SYNC_COUNT > 1 schedule devices only from same lab (was: with SYNC_COUNT > 1, devices are given from different labs (chromeo2 vs chromeos6) and cannot reach each other for p2p autoupdate )
Turning this into an FR then.

Comment 10 by jkop@chromium.org, Feb 9 2018

Cc: -shuqianz@chromium.org -jkop@chromium.org
