Issue 795981

Issue metadata

Status: WontFix
Owner:
Closed: Dec 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




Tryjob results UI in gerrit failing to display certain tryjobs

Project Member Reported by bpastene@chromium.org, Dec 19 2017

Issue description

Currently, the "Tryjobs" results pane in the gerrit UI, which displays the results of a patchset's tryjobs, occasionally does not display *every* tryjob.

For example, this cl:
https://chromium-review.googlesource.com/c/chromium/src/+/823249/1
is missing this tryjob:
https://ci.chromium.org/buildbot/tryserver.blink/linux_trusty_blink_rel/19976

All tryjob results shown for that patchset are green even though the patchset triggered a failing build. This is problematic because it hides transient failures. Even if it was an infrastructure failure, it should be discoverable: the user should be able to see the flaky bots that delayed their CL from landing.

Here are a few more examples of patchsets, each followed by the tryjobs that aren't being displayed:

https://chromium-review.googlesource.com/c/chromium/src/+/818597/4
https://ci.chromium.org/buildbot/tryserver.chromium.linux/linux_chromium_rel_ng/606821
https://ci.chromium.org/buildbot/tryserver.chromium.win/win_chromium_compile_dbg_ng/560581

https://chromium-review.googlesource.com/c/chromium/src/+/777990/1
https://ci.chromium.org/buildbot/tryserver.chromium.mac/mac_chromium_rel_ng/592932

https://chromium-review.googlesource.com/c/chromium/src/+/821713/1
https://ci.chromium.org/buildbot/tryserver.chromium.android/linux_android_rel_ng/450811
https://ci.chromium.org/buildbot/tryserver.chromium.android/android_n5x_swarming_rel/324101

https://chromium-review.googlesource.com/c/chromium/src/+/767449/1
http://build.chromium.org/p/tryserver.chromium.linux/builders/fuchsia_x64/builds/14720
(Something's really off with this one. It links to the right build, but it's displayed as green in gerrit despite being purple in milo.)

Not sure if it's a gerrit or buildbucket issue, so slapping both labels on.
 

Comment 1 by aga...@chromium.org, Dec 19 2017

Cc: aga...@chromium.org
Components: -Infra>Codereview>Gerrit
It's buildbucket; build 19976 is not in the set of results returned for the query
https://cr-buildbucket.appspot.com/_ah/api/buildbucket/v1/search?max_builds=500&fields=builds(bucket%2Cfailure_reason%2Cid%2Cparameters_json%2Cresult%2Cstatus%2Ctags%2Curl)&tag=buildset%3Apatch%2Fgerrit%2Fchromium-review.googlesource.com%2F823249%2F1
which is what the gerrit plugin issues to get all builds associated with that patchset of that CL.
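
As a rough illustration of how that query is put together, here is a minimal Python sketch that rebuilds the URL above from a Gerrit host, change number, and patchset number. The helper name `patchset_search_url` and its signature are hypothetical; the endpoint, the field list, and the `buildset:patch/gerrit/...` tag format are copied from the URL quoted in this comment. The sketch only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

# Endpoint copied from the query URL quoted in this comment.
SEARCH_ENDPOINT = "https://cr-buildbucket.appspot.com/_ah/api/buildbucket/v1/search"

# Field names copied from the `fields=builds(...)` parameter of that URL.
BUILD_FIELDS = (
    "bucket", "failure_reason", "id", "parameters_json",
    "result", "status", "tags", "url",
)


def patchset_search_url(gerrit_host, change, patchset, max_builds=500):
    """Build the search URL for all builds tagged with one patchset.

    Hypothetical helper: the name and signature are illustrative only.
    """
    buildset = f"buildset:patch/gerrit/{gerrit_host}/{change}/{patchset}"
    query = {
        "max_builds": max_builds,
        "fields": "builds(" + ",".join(BUILD_FIELDS) + ")",
        "tag": buildset,
    }
    # safe="()" keeps the parentheses literal, as in the quoted URL;
    # ',', ':' and '/' are still percent-encoded.
    return SEARCH_ENDPOINT + "?" + urlencode(query, safe="()")


url = patchset_search_url("chromium-review.googlesource.com", 823249, 1)
print(url)
```

For change 823249, patchset 1, this reproduces the query string quoted above. A build whose tags do not match that buildset, or that the search does not return for any other reason, never reaches the Tryjobs pane.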

Comment 2 by aga...@chromium.org, Dec 19 2017

(The "showing a build as the wrong color" bug you mention at the end is probably https://bugs.chromium.org/p/chromium/issues/detail?id=794134&desc=2)
Looks like most (all?) of the missing builds have duplicated buildbucket build ids.

8960372477412147504:
https://ci.chromium.org/buildbot/tryserver.blink/linux_trusty_blink_rel/19976 and https://ci.chromium.org/buildbot/tryserver.blink/linux_trusty_blink_rel/19979

8962624689457061920:
https://ci.chromium.org/buildbot/tryserver.chromium.mac/mac_chromium_rel_ng/592932 and https://ci.chromium.org/buildbot/tryserver.chromium.mac/mac_chromium_rel_ng/592942

8960450914272592480:
https://ci.chromium.org/buildbot/tryserver.chromium.android/android_n5x_swarming_rel/324101 and https://ci.chromium.org/buildbot/tryserver.chromium.android/android_n5x_swarming_rel/324142

etc etc. That might help explain why buildbucket's api is returning only one of them.

nodir@: Is it WAI (working as intended) that multiple builds have the same build id? If this is some transparent retry mechanism, can we change the api to return all attempts?
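
The duplicate-id observation above can be checked mechanically. The following is a small Python sketch, not the tooling actually used here: it groups (build id, buildbot URL) pairs by build id and reports ids that map to more than one buildbot build. The ids and URLs are the ones listed in this comment; the function names are hypothetical.

```python
from collections import defaultdict

# (buildbucket build id, buildbot build URL) pairs listed in this comment.
BUILDS = [
    ("8960372477412147504",
     "https://ci.chromium.org/buildbot/tryserver.blink/linux_trusty_blink_rel/19976"),
    ("8960372477412147504",
     "https://ci.chromium.org/buildbot/tryserver.blink/linux_trusty_blink_rel/19979"),
    ("8962624689457061920",
     "https://ci.chromium.org/buildbot/tryserver.chromium.mac/mac_chromium_rel_ng/592932"),
    ("8962624689457061920",
     "https://ci.chromium.org/buildbot/tryserver.chromium.mac/mac_chromium_rel_ng/592942"),
    ("8960450914272592480",
     "https://ci.chromium.org/buildbot/tryserver.chromium.android/android_n5x_swarming_rel/324101"),
    ("8960450914272592480",
     "https://ci.chromium.org/buildbot/tryserver.chromium.android/android_n5x_swarming_rel/324142"),
]


def group_by_build_id(builds):
    """Map each buildbucket build id to the buildbot URLs that carry it."""
    grouped = defaultdict(list)
    for build_id, url in builds:
        grouped[build_id].append(url)
    return dict(grouped)


def shared_build_ids(builds):
    """Return only the ids that are attached to more than one buildbot build."""
    return {
        build_id: urls
        for build_id, urls in group_by_build_id(builds).items()
        if len(urls) > 1
    }


for build_id, urls in shared_build_ids(BUILDS).items():
    print(build_id, "is shared by", len(urls), "buildbot builds")
```

If an API returns one record per build id, each of these pairs collapses to a single result. That would be consistent with only one of the two buildbot builds appearing in gerrit.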

Comment 4 by no...@chromium.org, Dec 19 2017

Owner: no...@chromium.org
Status: WontFix (was: Untriaged)
Correct, this is a transparent retry mechanism at the buildbucket-buildbot level *specifically* for master restarts (NOT arbitrary infra failures).

The restart was scheduled for 3:07
https://chrome-internal.googlesource.com/infradata/master-manager/+/refs/changes/38/527838/2/desired_master_state.json#1306
Build https://ci.chromium.org/buildbot/tryserver.blink/linux_trusty_blink_rel/19976
was interrupted at 3:14pm.

same for
https://ci.chromium.org/buildbot/tryserver.chromium.win/win_chromium_compile_dbg_ng/560581
restart: https://chrome-internal-review.googlesource.com/c/infradata/master-manager/+/527098

Since this behavior is scoped to master restarts, I think this resolves the "This is problematic because it hides transient failures" concern, as these interrupts are not transient failures.
