Milo suggests build not started on transient failure to retrieve logdog stream
Issue description
A developer brought a build (https://chromium-swarm.appspot.com/task?id=3deb1b07326cad10&refresh=10&show_raw=1&wide_logs=true) to my attention yesterday shortly after it finished. The Gerrit plugin reported the build as failed, but the build page reported only that it was unable to load the logdog stream and that the build might not have started. Clicking through to the swarming task showed the raw output (https://chromium-swarm.appspot.com/task?id=3deb1b07326cad10&refresh=10&show_raw=1&wide_logs=true), which includes the steps and shows which of them failed. The problem appears to have been transient and has since resolved, but it can be confusing for developers who look at their trybot failures shortly after they happen.
Jun 6 2018
How did this user get to that page? I thought we'd gotten rid of all references to the chromium-swarm page (which loads a different codepath), but that doesn't seem to be the case. This is important because if the user had loaded the https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac_chromium_rel_ng/63783 page instead, it would have loaded the correct build info from buildbucket and merely said "couldn't load steps from logdog". 850113 won't help because the /task/<id> codepath doesn't touch buildbucket.
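To illustrate the behavior described above, here is a minimal sketch in Python (not Milo's actual code) of a page that takes the build status from one source and degrades gracefully when the steps cannot be loaded. The names `StepsUnavailable`, `BuildPage`, and `render_build_page` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


class StepsUnavailable(Exception):
    """Hypothetical error raised when the log stream cannot be read."""


@dataclass
class BuildPage:
    status: str                           # status as reported by the build record
    steps: List[str] = field(default_factory=list)
    warning: Optional[str] = None         # shown to the user when steps are missing


def render_build_page(build_status: str,
                      load_steps: Callable[[], List[str]]) -> BuildPage:
    """Keep the known build status even if the step list fails to load."""
    try:
        return BuildPage(status=build_status, steps=load_steps())
    except StepsUnavailable:
        # A transient log failure must not be reported as "build not started".
        return BuildPage(status=build_status,
                         warning="couldn't load steps from logdog")
```

The point of the sketch is that the status never depends on the log stream, so a transient log failure changes only the warning shown to the user.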
Jun 6 2018
https://chromium-review.googlesource.com/c/chromium/src/+/1087927/1 -> https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac_chromium_rel_ng/63783 -> "Source: Task ..." ... though the user in question didn't independently go to that page; I suggested it. I believe that what you suggest in #2 didn't in fact happen; the user did load that page, and it was unable to load anything.
Jun 6 2018
Ah, I see. In that case I have no idea what's going on. The behavior I'd expect to see is this: https://screenshot.googleplex.com/HWGrFKwniAQ What you're describing doesn't match what I'd expect. A screenshot would have been helpful.
Jun 7 2018
For swarming tasks that do have a proper buildbucket build ID, should we redirect to the canonical build page? Then we would avoid such problems and the build status would be accurate.
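A rough sketch of the redirect decision suggested above, in Python for illustration only. It assumes the swarming task carries a tag of the form `buildbucket_build_id:<number>` and that the canonical build page lives at `https://<milo host>/b/<build id>`. Both are assumptions made for this example, and `canonical_build_url` is a hypothetical helper, not an existing Milo function.

```python
from typing import Iterable, Optional

# Assumed tag key linking a swarming task to its buildbucket build.
BUILD_ID_TAG = "buildbucket_build_id"


def canonical_build_url(tags: Iterable[str],
                        milo_host: str = "ci.chromium.org") -> Optional[str]:
    """Return the build page URL to redirect to, or None to stay on the task page."""
    for tag in tags:
        key, sep, value = tag.partition(":")
        # Redirect only when the tag is well formed and the ID is numeric.
        if sep and key == BUILD_ID_TAG and value.isdigit():
            return f"https://{milo_host}/b/{value}"  # assumed URL layout
    return None
```

A task page handler could call this helper first and issue a redirect whenever it returns a URL, falling back to the existing task view when it returns None.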
Comment 1 by no...@chromium.org, Jun 6 2018
Components: -Infra>Platform>Milo Infra>Platform>Milo>LUCI
Status: Available (was: Untriaged)