Issue 841492

Issue metadata

Status: Fixed
Owner:
Closed: May 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug


Participants' hotlists:
Perf-Swarming-Migration


Migrate bots of "Android Nexus6 WebView Perf" to private swarming server

Project Member Reported by nednguyen@chromium.org, May 9 2018

Issue description

I want to migrate these machines to the private swarming server:
build112-b1--device1, build112-b1--device2, build112-b1--device3, build112-b1--device4, build112-b1--device5, build112-b1--device6, build112-b1--device7, build113-b1--device1, build113-b1--device2, build113-b1--device3, build113-b1--device4, build113-b1--device5, build113-b1--device6, build113-b1--device7, build114-b1--device1, build114-b1--device2, build114-b1--device3, build114-b1--device4, build114-b1--device5, build114-b1--device6, build114-b1--device7
 
Cc: vhang@chromium.org
Comment 2 by bugdroid1@chromium.org, May 10 2018

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chrome-golo/chrome-golo/+/b0a2951da032552149b8f22b7dcb8758af9e6a56

commit b0a2951da032552149b8f22b7dcb8758af9e6a56
Author: John Weathersby <johnw@google.com>
Date: Thu May 10 19:31:08 2018

Comment 3 by bugdroid1@chromium.org, May 10 2018

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infra/puppet/+/c627ddd7203294876dd6acbca225f0a650b203f6

commit c627ddd7203294876dd6acbca225f0a650b203f6
Author: John Weathersby <johnw@google.com>
Date: Thu May 10 23:29:14 2018

Comment 4 by bugdroid1@chromium.org, May 11 2018

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chrome-golo/chrome-golo/+/214122ec535abb83fe229fd4c8465cc414644d47

commit 214122ec535abb83fe229fd4c8465cc414644d47
Author: John Weathersby <johnw@google.com>
Date: Fri May 11 00:00:10 2018

Comment 5 by jo...@chromium.org, May 11 2018

All devices migrated from build{112..114}-b1 -> build{202..204}-b7

Docker changes synced, and devices are seen in private swarming now. pool:unassigned
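
For reference, the brace ranges above can be expanded into full bot ids with a short script. This is only a sketch: the helper name `expand_bot_ids` is hypothetical, and it assumes the new hosts keep the same `--deviceN` suffix and seven devices per host as the old ones. It deliberately does not pair old ids with new ones, since the thread does not state which old host maps to which new host.

```python
def expand_bot_ids(first_host, last_host, suffix, devices_per_host=7):
    """Expand e.g. build{112..114}-b1 into build112-b1--device1 ... build114-b1--device7.

    Hypothetical helper; devices_per_host=7 is an assumption based on the
    device list in the issue description.
    """
    return [
        f"build{host}-{suffix}--device{device}"
        for host in range(first_host, last_host + 1)
        for device in range(1, devices_per_host + 1)
    ]


old_ids = expand_bot_ids(112, 114, "b1")  # ids on the old hosts
new_ids = expand_bot_ids(202, 204, "b7")  # ids on the new hosts

if __name__ == "__main__":
    print(len(old_ids), "old ids,", len(new_ids), "new ids")
    print(old_ids[0], "...", old_ids[-1])
    print(new_ids[0], "...", new_ids[-1])
```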


Comment 6 by bugdroid1@chromium.org, May 11 2018

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/171dbc755c528108b062f2b3863b61520fc6bcf6

commit 171dbc755c528108b062f2b3863b61520fc6bcf6
Author: Nghia Nguyen <nednguyen@google.com>
Date: Fri May 11 12:39:16 2018

Comment 7 by bugdroid1@chromium.org, May 11 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/3fbbeee45018a68f2aef9a747df77fcadc9e8a5e

commit 3fbbeee45018a68f2aef9a747df77fcadc9e8a5e
Author: Nghia Nguyen <nednguyen@google.com>
Date: Fri May 11 12:52:55 2018

Convert 'Nexus 6 Webview Perf' to use private swarming server

Bug:841492
Change-Id: If9b934c7547298db8b18bc6b5d2689eebdcfdad8
TBR=eyaich@chromium.org
Reviewed-on: https://chromium-review.googlesource.com/1055468
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Commit-Queue: Ned Nguyen <nednguyen@google.com>

[modify] https://crrev.com/3fbbeee45018a68f2aef9a747df77fcadc9e8a5e/scripts/slave/recipe_modules/chromium_tests/chromium_perf.py

Comment 8 by bugdroid1@chromium.org, May 11 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/34e65e84f06e9e7401447ebb6d1265c5646def1c

commit 34e65e84f06e9e7401447ebb6d1265c5646def1c
Author: Ned Nguyen <nednguyen@google.com>
Date: Fri May 11 12:53:49 2018

Update swarming dimension of 'Android Nexus6 WebView Perf'

NOTRY=true  # Test covered by PRESUBMIT
TBR=eyaich@chromium.org

Bug:  841492 
Change-Id: Id93784c4f9e91e665c8f251349cacd846560ef75
Reviewed-on: https://chromium-review.googlesource.com/1055469
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Commit-Queue: Ned Nguyen <nednguyen@google.com>
Cr-Commit-Position: refs/heads/master@{#557852}
[modify] https://crrev.com/34e65e84f06e9e7401447ebb6d1265c5646def1c/testing/buildbot/chromium.perf.json
[modify] https://crrev.com/34e65e84f06e9e7401447ebb6d1265c5646def1c/tools/perf/core/benchmark_sharding_map.json
[modify] https://crrev.com/34e65e84f06e9e7401447ebb6d1265c5646def1c/tools/perf/core/perf_data_generator.py

Looks like I misconfigured the ids. The bot ids in #8 were build{202..204}-b1, whereas they're supposed to be build{202..204}-b7.
Comment 11 by bugdroid1@chromium.org, May 11 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/99a9a5d575c8385677f15f0a369dcf3f69566de6

commit 99a9a5d575c8385677f15f0a369dcf3f69566de6
Author: Ned Nguyen <nednguyen@google.com>
Date: Fri May 11 23:31:08 2018

Fix device id of 'Android Nexus6 WebView Perf' bots

NOTRY=true  # test by PRESUBMIT
TBR=eyaich@chromium.org

Bug:  841492 
Change-Id: I1b8bf1c05ea18f86ddbf676c24eeac6637a97348
Reviewed-on: https://chromium-review.googlesource.com/1056299
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Commit-Queue: Ned Nguyen <nednguyen@google.com>
Cr-Commit-Position: refs/heads/master@{#558059}
[modify] https://crrev.com/99a9a5d575c8385677f15f0a369dcf3f69566de6/testing/buildbot/chromium.perf.json
[modify] https://crrev.com/99a9a5d575c8385677f15f0a369dcf3f69566de6/tools/perf/core/benchmark_sharding_map.json
[modify] https://crrev.com/99a9a5d575c8385677f15f0a369dcf3f69566de6/tools/perf/core/perf_data_generator.py

John: somehow a few devices are showing up as dead (e.g. build190-b7--device1) and some are showing up as Ubuntu (e.g. build202-b7--device2). Can you check?

https://chrome-swarming.appspot.com/botlist?c=id&c=os&c=task&c=status&f=pool%3Achrome.tests.perf-webview&l=100&q=poo&s=id%3Aasc
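
The query string in the link above carries the pool being inspected. A minimal sketch of decoding it with Python's standard `urllib.parse` follows; reading `c` as displayed columns, `f` as a filter, and `s` as the sort key is an assumption drawn from the values themselves, not from swarming documentation.

```python
from urllib.parse import parse_qs, urlsplit

# URL copied from the comment above.
BOTLIST_URL = (
    "https://chrome-swarming.appspot.com/botlist"
    "?c=id&c=os&c=task&c=status&f=pool%3Achrome.tests.perf-webview"
    "&l=100&q=poo&s=id%3Aasc"
)

# parse_qs percent-decodes values and collects repeated keys into lists.
params = parse_qs(urlsplit(BOTLIST_URL).query)

columns = params["c"]      # assumed: columns shown in the bot list
filters = params["f"]      # assumed: key:value filters applied to the list
sort_key = params["s"][0]  # assumed: sort column and direction

if __name__ == "__main__":
    print("columns:", columns)
    print("filters:", filters)
    print("sort:", sort_key)
```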

Comment 13 by jo...@chromium.org, May 14 2018

Yeah, these seem like localized problems. One host won't even run lsusb, and the other seems to have gone down. We'll investigate these. Thanks.

Comment 14 by jo...@chromium.org, May 14 2018

build190-b7 had a bad usb isolator which locked up the whole chain. This has been replaced.

build202-b7 needed a reboot to clear usb issues as well.

USB device ZX1G422HK6, port num: 1
USB device ZX1G22KC78, port num: 2
USB device ZX1G22LTRP, port num: 3
USB device ZX1G22KGFM, port num: 4
USB device ZX1G22KNX2, port num: 5
USB device ZX1G22L2NR, port num: 6
USB device NP5A2N0156, port num: 7

USB device 01e1781326604096, port num: 1
USB device 008671d42589ad4c, port num: 2
USB device 01e0a954cc6738c6, port num: 3
USB device 0200121bc2432b31, port num: 4
USB device 01dc418a0c9cc4e8, port num: 5
USB device 01e06d0fccaf2eca, port num: 6
USB device 01e131841ca7cc17, port num: 7


Cc: bpastene@chromium.org
Seeing some weird USB failures on build190-b7, probably causing the quarantine of devices 2-7. I'll try resetting things.
build190-b7 looks better now. Though it's taking an awfully long time to scan for battors:
"Fetched battor serials in 20.02s."

If this happens again, I may have to dig into that.
Cc: charliea@chromium.org
Wait, spoke too soon. All but one of the phones on that host are up:
https://chrome-swarming.appspot.com/bot?id=build190-b7--device4

I think the problem is the battor-scanning issue. We wrap docker's USB interactions in a mutex. There is one thread per device (7 threads in total), and each thread has a 1-minute timeout. If scanning battors takes ~20s while the mutex is held, a few of those threads will end up timing out while they wait their turn. When a timeout happens, a container/swarming bot ends up not getting a device. I'll fix this when I get a chance.
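
To make the arithmetic concrete, here is a rough worst-case model of that contention. It is a sketch, not the actual docker/USB code: the function name `simulate_lock_waits` is hypothetical, and it assumes all threads request the mutex at the same moment and that each one holds it for the full scan time. Real runs can do better than this (only one phone was missing above), so treat it as an upper bound on the damage.

```python
def simulate_lock_waits(num_threads, hold_seconds, timeout_seconds):
    """Model threads queuing on one mutex, all arriving at t=0.

    Returns a list of (thread_number, seconds_waited, acquired) tuples.
    A thread gets the mutex only if it becomes free before the thread's
    timeout expires; a thread that times out never holds the mutex.
    """
    results = []
    lock_free_at = 0.0  # time at which the mutex is next available
    for thread_number in range(1, num_threads + 1):
        if lock_free_at < timeout_seconds:
            results.append((thread_number, lock_free_at, True))
            lock_free_at += hold_seconds
        else:
            results.append((thread_number, timeout_seconds, False))
    return results


# Numbers from the thread: 7 devices, 1 min timeout, ~20.02s battor scan.
outcome = simulate_lock_waits(num_threads=7, hold_seconds=20.02, timeout_seconds=60)
timed_out = [n for n, _, acquired in outcome if not acquired]

if __name__ == "__main__":
    for number, waited, acquired in outcome:
        state = "got the mutex" if acquired else "timed out"
        print(f"thread {number}: waited {waited:.2f}s, {state}")
```

Under these assumptions the fourth thread would need the mutex at 60.06s, just past its 60s timeout, so it and every thread behind it gives up. Shortening the scan or lengthening the timeout changes that quickly: with a 5s hold time, all seven threads get through.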
Status: Fixed (was: Assigned)