Shards completed tasks successfully but were identified as expired
Issue description
This just began to happen on multiple GPU/FYI bots. A few examples:
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win7%20FYI%20Release%20%28AMD%29/3759
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win10%20FYI%20Exp%20Release%20%28NVIDIA%29/20670
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20FYI%2010.14%20Release%20%28Intel%29/741
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20FYI%2010.14%20Release%20%28AMD%29/654
Looking at the last link above, shard 0:
https://chromium-swarm.appspot.com/task?id=41e5dbd25eed0310&refresh=10&show_raw=1
Pending time is 2m 18s and running time is 6.29s. The test output says "all tests passed." However, in the build, the shard was reported as "shard #0 had an internal swarming failure".
,
Dec 21
That Swarming regression appears to have the same cause as bug 917085. See the "connection reset by peer" errors in https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8926594492369970592/+/steps/angle_end2end_tests_on_ATI_GPU_on_Mac_on_Mac-10.14/0/stdout Each error correlates with a missing shard. This is probably due to the switch to the golang version of swarming.py.
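To check the correlation described above, one could count the "connection reset by peer" lines in a saved copy of the step's stdout and compare that number with the number of missing shards. This is only a minimal sketch: the function name and the sample log lines are hypothetical, and the real log format may differ.

```python
def count_connection_resets(log_text: str) -> int:
    """Count log lines that mention a "connection reset by peer" error.

    Hypothetical helper: it does a plain case-insensitive substring match
    and makes no assumption about the rest of the log format.
    """
    needle = "connection reset by peer"
    return sum(1 for line in log_text.splitlines() if needle in line.lower())


if __name__ == "__main__":
    # Made-up sample lines, not copied from the real log.
    sample = (
        "collecting shard 0\n"
        "read tcp: connection reset by peer\n"
        "collecting shard 1\n"
        "Connection reset by peer\n"
    )
    print(count_connection_resets(sample))
```

If the count matches the number of shards reported as missing, that supports the explanation in this comment, though it does not prove it.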
,
Dec 21
Issue 917085 has been merged into this issue.
,
Dec 21
https://chromium-review.googlesource.com/1388166 is up for review. It increases the timeouts on several FYI bots that either have limited capacity or might have limited capacity if we start a graphics driver or OS upgrade.
The failures on these two:
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20FYI%2010.14%20Release%20%28Intel%29/741
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20FYI%2010.14%20Release%20%28AMD%29/654
may be due to the successful runs taking just under 3 hours. There is only 1 Swarming bot backing those two Mac 10.14 configurations right now, and the timeouts were deliberately increased in Issue 871872 to handle this. Does the limited_capacity_bot mixin here:
https://cs.chromium.org/chromium/src/testing/buildbot/mixins.pyl?q=limited_capacity_bot&sq=package:chromium&g=0&l=211
also have to increase hard_timeout? I see 10800 seconds (3 hours) being specified elsewhere, and it looks like that's the limit being hit.
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win7%20FYI%20Release%20%28AMD%29/3759 should be investigated as a possible regression in Swarming.
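As a sanity check on the two limits discussed here, the sketch below classifies a shard from its pending and running times. It assumes that the expiration limit applies to time spent pending and that hard_timeout applies to time spent running. The function name and return strings are hypothetical, and the 3600-second expiration is only the 1-hour default mentioned in this thread, not a value read from any config.

```python
def classify_shard(pending_s: float, running_s: float,
                   expiration_s: float, hard_timeout_s: float) -> str:
    """Classify a shard under the assumed limits.

    Assumption: a shard expires if it waits longer than expiration_s for a
    bot, and times out if it runs longer than hard_timeout_s.
    """
    if pending_s > expiration_s:
        return "expired"
    if running_s > hard_timeout_s:
        return "timed_out"
    return "ok"


if __name__ == "__main__":
    # Shard 0 from the issue description: pending 2m 18s, running 6.29s.
    pending = 2 * 60 + 18
    print(classify_shard(pending, 6.29, expiration_s=3600, hard_timeout_s=10800))
```

Under these assumptions, shard 0 from the issue description is well inside both limits, which is consistent with treating the Win7 failure as a possible Swarming regression rather than a genuine expiration.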
,
Dec 21
This is also happening on one of the CQ bots now: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win_optional_gpu_tests_rel/12629
,
Dec 21
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/e9b8c4cee9e9845baa9a907c7872c5f08510df51

commit e9b8c4cee9e9845baa9a907c7872c5f08510df51
Author: Kenneth Russell <kbr@chromium.org>
Date: Fri Dec 21 22:31:15 2018

Increase timeout on some chromium.gpu.fyi bots.

Some of these configurations have relatively few machines, and the default Swarming shard timeout of 1 hour is too short.

The following machines are affected:
Mac FYI Experimental Release (Intel)
Mac FYI Experimental Retina Release (AMD)
Win10 FYI Exp Release (Intel HD 630)
Win10 FYI Exp Release (NVIDIA)
Win7 ANGLE Tryserver (AMD)
Win7 FYI Debug (AMD)
Win7 FYI Release (AMD)
Win7 FYI Release (NVIDIA)
Win7 FYI dEQP Release (AMD)
Win7 FYI x64 Release (NVIDIA)

Bug: 917183
No-Try: True
Change-Id: I8e602e9713a378c9da43e1410644d214d00c7d24
Reviewed-on: https://chromium-review.googlesource.com/c/1388166
Commit-Queue: Kenneth Russell <kbr@chromium.org>
Reviewed-by: Dirk Pranke <dpranke@chromium.org>
Cr-Commit-Position: refs/heads/master@{#618637}

[modify] https://crrev.com/e9b8c4cee9e9845baa9a907c7872c5f08510df51/testing/buildbot/chromium.gpu.fyi.json
[modify] https://crrev.com/e9b8c4cee9e9845baa9a907c7872c5f08510df51/testing/buildbot/waterfalls.pyl
Comment 1 by kbr@chromium.org, Dec 20
Components: Infra>Platform>Swarming