
Issue 891848


Issue metadata

Status: Fixed
Owner:
Closed: Oct 4
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: ----




shards expiring for Android Nexus6 WebView Perf

Project Member Reported by sheriff-...@appspot.gserviceaccount.com, Oct 3

Issue description

Filed by sheriff-o-matic@appspot.gserviceaccount.com on behalf of crouleau@google.com

rendering.mobile/idle_power_animated_gif in performance_webview_test_suite failing on chromium.perf/Android Nexus6 WebView Perf

Builders failed on: 
- Android Nexus6 WebView Perf: 
  https://ci.chromium.org/p/chrome/builders/luci.chrome.ci/Android%20Nexus6%20WebView%20Perf


 
Summary: shards expiring for Android Nexus6 WebView Perf (was: rendering.mobile/idle_power_animated_gif in performance_webview_test_suite failing on chromium.perf/Android Nexus6 WebView Perf)
shard #8 expired, not enough capacity

In the last two runs, I see shards expiring because of a lack of capacity:

https://ci.chromium.org/p/chrome/builders/luci.chrome.ci/Android%20Nexus6%20WebView%20Perf/2829
https://ci.chromium.org/p/chrome/builders/luci.chrome.ci/Android%20Nexus6%20WebView%20Perf/2828

Here's one of the swarming tasks that expired: 
https://chrome-swarming.appspot.com/task?id=4053215ef94cb710&refresh=10&show_raw=1
It looks like the problem is that build202-b7--device1 is dead: https://chrome-swarming.appspot.com/bot?id=build202-b7--device1&sort_stats=total%3Adesc 
Cc: -crouleau@google.com crouleau@chromium.org
Components: Infra>Labs
Owner: nedngu...@google.com
Infra>Labs, could you please fix these bots:

1. build202-b7--device1: https://chrome-swarming.appspot.com/bot?id=build202-b7--device1&sort
2. build202-b7--device4: https://chrome-swarming.appspot.com/bot?id=build202-b7--device4&sort_stats=total%3Adesc
3. build202-b7--device5: https://chrome-swarming.appspot.com/bot?id=build202-b7--device5&sort_stats=total%3Adesc

Meanwhile, isn't soft device affinity supposed to switch to a new device when a device is dead? +Ned to answer that question.
Owner: ----
It seems like we don't have enough healthy bots to switch over:

The log lists only 13 healthy bots: https://logs.chromium.org/logs/chrome/buildbucket/cr-buildbucket.appspot.com/8933671138581491520/+/steps/test_pre_run/0/steps/s__trigger__performance_webview_test_suite_on_Android_device_Nexus_6/0/stdout

  Healthy bots: ['build203-b7--device1', 'build203-b7--device3', 'build203-b7--device2', 'build203-b7--device5', 'build203-b7--device4', 'build203-b7--device7', 'build203-b7--device6', 'build204-b7--device2', 'build204-b7--device3', 'build204-b7--device6', 'build204-b7--device7', 'build204-b7--device4', 'build204-b7--device5']
  Dead Bots: ['build204-b7--device1', 'build202-b7--device4', 'build202-b7--device5', 'build202-b7--device6', 'build202-b7--device7', 'build202-b7--device1', 'build202-b7--device2', 'build202-b7--device3']
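
To illustrate why affinity fallback cannot help here, below is a minimal sketch of the idea, not the actual trigger script. The bot lists are copied from the log above. The function name `assign_shards` and the "one shard per bot in the pool" shard count are assumptions made only for illustration; the real shard count is not stated in this thread.

```python
# Bot lists copied from the trigger step log quoted above.
HEALTHY = ['build203-b7--device1', 'build203-b7--device3', 'build203-b7--device2',
           'build203-b7--device5', 'build203-b7--device4', 'build203-b7--device7',
           'build203-b7--device6', 'build204-b7--device2', 'build204-b7--device3',
           'build204-b7--device6', 'build204-b7--device7', 'build204-b7--device4',
           'build204-b7--device5']
DEAD = ['build204-b7--device1', 'build202-b7--device4', 'build202-b7--device5',
        'build202-b7--device6', 'build202-b7--device7', 'build202-b7--device1',
        'build202-b7--device2', 'build202-b7--device3']


def assign_shards(preferred, healthy):
    """Hypothetical soft-affinity assignment (illustration only).

    preferred[i] is the bot that shard i would like to run on.
    Returns (assignment, unplaced): a shard -> bot mapping and the list of
    shards that could not be given any healthy bot.
    """
    healthy_set = set(healthy)
    assignment = {}
    used = set()
    # First pass: keep each shard on its preferred bot when that bot is healthy.
    for shard, bot in enumerate(preferred):
        if bot in healthy_set and bot not in used:
            assignment[shard] = bot
            used.add(bot)
    # Second pass: move the remaining shards to any healthy bot not yet used.
    spare = [bot for bot in healthy if bot not in used]
    unplaced = []
    for shard in range(len(preferred)):
        if shard in assignment:
            continue
        if spare:
            assignment[shard] = spare.pop(0)
        else:
            unplaced.append(shard)
    return assignment, unplaced


# Assumption for illustration: one shard per bot in the pool (21 shards).
preferred = sorted(HEALTHY + DEAD)
assignment, unplaced = assign_shards(preferred, HEALTHY)
print(len(HEALTHY), "healthy,", len(DEAD), "dead,", len(unplaced), "shards unplaced")
```

Under that assumed shard count, every healthy bot is already taken by its own shard, so the shards whose preferred bot is dead have nowhere to go and would sit in the queue until they expire. With fewer shards than healthy bots, the same function places every shard.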
 
Owner: vhang@chromium.org
Status: Assigned (was: Available)
The Nexus 6 phones are almost 4 years old now. We don't have any spare N6 phones left. Is there a reason why we can't upgrade these N6 bots to newer phones like the Pixel 2s?
Cc: perezju@chromium.org
I don't have any objection to an upgrade, but it's a good deal of work to do that (for Labs and for the Benchmarking team). We should probably have a meeting to understand the issues before we do this. It's also concerning that this issue got this bad before we noticed it.

The Q4 OKRs have a plan to add android-pixel, but I don't see any mention of Pixel 2s. See https://docs.google.com/document/d/1eLmzsM9nJoqjB2hW8NvXXT1fWWWrEv7sj8BdV2Asgi0/edit?ts=5bb50dc6#
Cc: nedngu...@google.com
In the meantime, does Labs have any means to mitigate this issue? Otherwise I may look into the sharding rules to try to reduce the number of shards so that we have enough devices to run this.
#8: crouleau@ that's a good idea. You can reduce the number of shards by adjusting https://cs.chromium.org/chromium/src/tools/perf/core/bot_platforms.py?rcl=d2d3ac450ea8c16fbc4bb1ca3a311a4b9813d000&l=161 and then rerunning ./tools/perf/generate_perf_sharding
Owner: crouleau@chromium.org
Okay, doing it. Thanks for the pointer!
I had to additionally change the shard number in perf_data_generator.py. I sent https://chromium-review.googlesource.com/c/chromium/src/+/1260218 out for review :)
Project Member

Comment 12 by bugdroid1@chromium.org, Oct 4

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/66b91f251ec480b4df361e1cf4b7db31d3342604

commit 66b91f251ec480b4df361e1cf4b7db31d3342604
Author: Caleb Rouleau <crouleau@chromium.org>
Date: Thu Oct 04 03:43:36 2018

[Benchmarking] Reduce number of shards for Nexus6.

The N6 devices in the lab are old and failing. Tests are expiring while
waiting to find an open bot. With fewer shards we require fewer
devices.



Bug:  891848 
Cq-Include-Trybots: master.tryserver.chromium.perf:obbs_fyi
Change-Id: Ic14a808f55d405792429feef73a33d9f2603d5e5
Reviewed-on: https://chromium-review.googlesource.com/c/1260218
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Commit-Queue: Caleb Rouleau <crouleau@chromium.org>
Cr-Commit-Position: refs/heads/master@{#596493}
[modify] https://crrev.com/66b91f251ec480b4df361e1cf4b7db31d3342604/testing/buildbot/chromium.perf.json
[modify] https://crrev.com/66b91f251ec480b4df361e1cf4b7db31d3342604/tools/perf/core/bot_platforms.py
[modify] https://crrev.com/66b91f251ec480b4df361e1cf4b7db31d3342604/tools/perf/core/perf_data_generator.py
[modify] https://crrev.com/66b91f251ec480b4df361e1cf4b7db31d3342604/tools/perf/core/shard_maps/android_nexus6_webview_perf_map.json

Status: Fixed (was: Assigned)
There were a couple of passing tests on this bot, and then it started failing for an unrelated reason. This issue is fixed!
Great work, Caleb!
