failed recipe produces 'status: SUCCESS'
Issue description

Chrome Version: 67.0.3396.79 (Official Build) (64-bit)
OS: Debian buster/sid

What steps will reproduce the problem?
(1) Visit https://ci.chromium.org/raw/build/logs.chromium.org/skia/20180612T114237.071755963Z_00000000006c7f41/+/annotations

What is the expected result?
Under "Results:" I should see "Failure," since the "dm" step failed and the final step is "Failure reason."

What happens instead?
Under "Results:" I see "Success."

This seems to be happening for all builds where we're testing Chromecasts, e.g.:
https://ci.chromium.org/raw/build/logs.chromium.org/skia/20180611T191143.287094426Z_00000000006c6294/+/annotations
https://ci.chromium.org/raw/build/logs.chromium.org/skia/20180611T152225.334094831Z_00000000006c4b4c/+/annotations
https://ci.chromium.org/raw/build/logs.chromium.org/skia/20180610T213751.484731606Z_00000000006c3be1/+/annotations
https://ci.chromium.org/raw/build/logs.chromium.org/skia/20180609T011353.726858078Z_00000000006be712/+/annotations
https://ci.chromium.org/raw/build/logs.chromium.org/skia/20180612T114237.071755963Z_00000000006c7f41/+/annotations

Most likely none of our other builds contain "Write failed: Broken pipe" and "step returned non-zero exit code: 255", although I'm not sure how that could be related.
Jun 12 2018
> fwiw the /raw/build/* endpoints are meant for debugging.

This is what Swarming uses. E.g. https://chromium-swarm.appspot.com/task?id=3e0ce79b0cb33910 "Milo Output" iframes https://ci.chromium.org/raw/build/logs.chromium.org/skia/20180612T130117.248294051Z_00000000006c8469/+/annotations
Jun 14 2018
> fwiw the /raw/build/* endpoints are meant for debugging. We recommend using buildbucket instead.

This is Skia; they cannot use buildbucket directly yet.

--

Indeed, the root step is marked as SUCCESS, so Milo's behavior here is correct. Please reopen if you see Milo showing success for an unsuccessful root step.
Jun 14 2018
AFAICT, the recipe is doing the right thing. Added a test specifically for this case here: https://skia-review.googlesource.com/c/skia/+/135047

The .expected ends with:

    {
      "name": "$result",
      "reason": "Step('nanobench') failed with return_code 255",
      "recipe_result": null,
      "status_code": 255
    }

Is it expected that this would be marked as 'SUCCESS'? Is there a way to see the annotation stream for a recipe example?
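To make the question concrete, here is a minimal sketch of how one might read the "$result" entry quoted above. The helper name recipe_failed is hypothetical, and treating a non-zero status_code as a failure is an assumption of this sketch, not something confirmed about how the recipe engine or Milo computes the overall status.

```python
import json


def recipe_failed(expected_steps):
    """Return True if the "$result" entry reports a non-zero status_code.

    Hypothetical helper: assumes non-zero status_code means the recipe failed.
    """
    for entry in expected_steps:
        if entry.get("name") == "$result":
            return entry.get("status_code", 0) != 0
    raise ValueError('no "$result" entry found')


# Tail of the .expected file, copied from the comment above.
EXPECTED_TAIL = json.loads("""
[
  {
    "name": "$result",
    "reason": "Step('nanobench') failed with return_code 255",
    "recipe_result": null,
    "status_code": 255
  }
]
""")

print(recipe_failed(EXPECTED_TAIL))
```

Under that assumption, the expectation file itself records a failure, which is why a 'SUCCESS' status in the annotations is surprising.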
Jun 15 2018
This is strange. The swarming task does fail, e.g. https://chromium-swarm.appspot.com/task?id=3e1cb42e270cc610&refresh=10

So we're not just swallowing the error. The "Failure reason" step is present, indicating that the recipe engine is aware of the failure, but the overall result is still SUCCESS.
Jun 19 2018
Able to reproduce the issue on the reported Chrome version 67.0.3396.79 (as discussed in the comments above) and on the latest canary, 69.0.3464.0. Since the issue has been seen since M60 (60.0.3112.0), considering it a non-regression and marking it as Untriaged. Adding the required labels accordingly. Thanks!
Jun 19 2018
Jun 20 2018
The test referenced in #4 is for a red failure, but the task linked in #5 is a purple failure. Perhaps something is buggy with infra failures. Shouldn't be, InfraFailure is a subclass of StepFailure...
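As a sketch of why the subclass relationship should make both cases behave the same: the two classes below are stand-ins defined only for this illustration (they are not imported from the recipe engine), and classify is a hypothetical helper. Any handler written against StepFailure also catches InfraFailure.

```python
class StepFailure(Exception):
    """Stand-in for a step failure (a "red" failure); not the real engine class."""


class InfraFailure(StepFailure):
    """Stand-in for an infra failure (a "purple" failure); subclasses StepFailure."""


def classify(exc_type):
    """Raise the given failure type and report which color the handler sees."""
    try:
        raise exc_type("step returned non-zero exit code: 255")
    except StepFailure as e:  # also catches InfraFailure, since it is a subclass
        return "purple" if isinstance(e, InfraFailure) else "red"


print(classify(StepFailure))
print(classify(InfraFailure))
```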
Jun 20 2018
This is a milo bug with how it interprets the result of the raw logs. This view doesn't have the benefit of additional data from the swarming task (like the fact that it failed). This is also partially the fault of how the recipes output the data; I don't think the recipe engine gets a chance to set the 'overall build failed' bit if the swarming task dies out from under it. It would be worthwhile to consider adding a "build finished normally" bit inside the annotation proto so that milo can infer that the build died before the annotations could be completed normally (so if the annotation log is "closed" and that bit isn't set, it can infer a failure)
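A rough sketch of the inference described above, assuming the annotation data exposed a "finished normally" bit. The AnnotationLog type, its field names, and infer_status are all hypothetical; no such bit exists in the annotation proto as far as this thread states.

```python
from dataclasses import dataclass


@dataclass
class AnnotationLog:
    """Hypothetical view of what a viewer knows from the raw annotation stream."""
    closed: bool             # the annotation log stream has been closed
    finished_normally: bool  # the proposed bit, set only on a normal finish
    root_status: str         # status recorded on the root step, e.g. "SUCCESS"


def infer_status(log):
    """Infer the overall build status from the annotation log alone."""
    if not log.closed:
        return "RUNNING"
    if not log.finished_normally:
        # The log closed before the bit was set: the build died underneath
        # the recipe, so the recorded root status cannot be trusted.
        return "FAILURE"
    return log.root_status


print(infer_status(AnnotationLog(closed=True, finished_normally=False, root_status="SUCCESS")))
```

With this rule, a build whose task dies mid-run would no longer show up as a success just because the root step was never updated.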
Jun 21 2018
The above SGTM. Anecdotally, I've seen tasks with BOT_DIED status in Swarming for which Milo shows success.
Jun 21 2018
Re Milo. I don't think this is a bug wrt the /raw/ endpoint. There's no way for Milo to grok swarming if you access a /raw/build/* endpoint. The emitted recipe annotation is 100% of the information Milo has to work with.
Jun 21 2018
Should we then NOT display the overall build status in the raw endpoint if we don't have enough data to do so? All we have is steps and properties. Perhaps that's the only thing we should display.
Jun 21 2018
The annotation still contains a place for the overall build status, and in the context of /raw/build/*, it is still correct. The bug here is that in the larger context of a swarming task, it is not correct.
Jun 21 2018
Sorry, I'm confused by #9 and #10. The original bug report as well as the examples in #4 and #5 are all failures within the recipe. The bot does not die, and the recipe finishes as it should. Re: #8, I don't see any interesting difference in behavior for red (i.e. all of the builds linked in #0) vs. purple (the build linked from #5) WRT the original bug report.
Oct 18
Oct 18
Comment 1 by hinoka@chromium.org, Jun 12 2018