Intermittent BOT_DIED happening to mac-10_13_laptop_high_end-perf
Issue description

See https://ci.chromium.org/p/chrome/builders/luci.chrome.ci/mac-10_13_laptop_high_end-perf

In many of the builds, like this one:
https://ci.chromium.org/p/chrome/builders/luci.chrome.ci/mac-10_13_laptop_high_end-perf/1954
the swarming bot running one of the performance_test_suite shards dies: "shard #0 had an internal swarming failure"
https://chrome-swarming.appspot.com/task?id=416f28179a594e10&refresh=10&show_raw=1

The swarming bot page says "BOT_DIED", but I can't find any other detail.

It seems to happen on either shard 0 or shard 16.
https://ci.chromium.org/p/chrome/builders/luci.chrome.ci/mac-10_13_laptop_high_end-perf/1949

Bots: build140-a7, build153-a7
Bot for shard 0: https://chrome-swarming.appspot.com/bot?id=build153-a7&sort_stats=total%3Adesc
Bot for shard 16: https://chrome-swarming.appspot.com/bot?id=build140-a7&sort_stats=total%3Adesc

See issue 908515 for the initial investigation. I suspected that one of the recently added test cases was to blame, but the bots kept dying after I reverted that change.
,
Nov 27
,
Nov 27
If a trooper could quarantine build140-a7 and build153-a7 so that Telemetry's soft affinity selects other bots instead, that would quickly solve the issue.
,
Nov 27
This is likely bug 894421. (See vadim's comment in #6 for a summary of what's going on.)
,
Nov 28
Comment 1 by crouleau@chromium.org, Nov 27