Failing test suites are incorrectly marked as passing.
Issue description

https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/76232 lists "Failure telemetry_perf_unittests Failure webkit_layout_tests Failure telemetry_unittests" in the top line, but a failure in telemetry_perf_unittests doesn't cause the bot to fail, as seen in https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/76231. This makes it harder for a sheriff to see what actually needs fixing.
Dec 6
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/76228 has the same sort of thing for "Failure webkit_layout_tests", so it's not something to do with the specific set of tests.
Dec 6
Milo doesn't decide what to list in the status; that's controlled by the underlying recipe (which is written by the test suite owners). https://screenshot.googleplex.com/TyceAJ5EMZc

I'm a little confused about what's going on here. It looks to me like the build should have turned red, but it's green. Why is that?
Dec 6
Note about that example: For the telemetry tests, it lists 0 expected failures for both telemetry_unittests and telemetry_perf_tests. Those failures are timeouts and unexpected failures respectively. webkit_layout_tests does indeed have many "expected failures", which is an unfortunate and confusing aspect of the Blink web test harness.

General question for Ryan: Is a build always supposed to be failed (red) when there is at least one failed (red) step (and no purple steps)?

Possibly silly question: If a red step is supposed to always make the build red but it's not happening, could it be a recipe engine issue?
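To make the general question concrete, here is a minimal sketch of the rule being asked about: a build is red if and only if at least one step is red and no step is purple (infra failure). The function name `build_status` and the status encoding are mine for illustration; the actual Milo/recipe-engine behavior is exactly what this bug is questioning.

```python
# Hypothetical sketch of the status aggregation rule under discussion;
# not the actual recipe engine or Milo code.
SUCCESS, FAILURE, INFRA_FAILURE = 'SUCCESS', 'FAILURE', 'INFRA_FAILURE'

def build_status(step_statuses):
    """Aggregate step statuses into a build status.

    Purple (INFRA_FAILURE) dominates red (FAILURE), which dominates green.
    """
    if INFRA_FAILURE in step_statuses:
        return INFRA_FAILURE
    if FAILURE in step_statuses:
        return FAILURE
    return SUCCESS

# One red step and no purple steps => the build should be red:
print(build_status([SUCCESS, FAILURE, SUCCESS]))  # -> FAILURE
```

Under that rule, the builds linked above should have been red, since they contained red steps and no purple ones.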
Dec 6
Someone should be in charge of making sure the various bits of the system are coherent: in this case, that when something says "Failed", the build actually fails. This should not be something individual recipe writers are able to get wrong. I'd love to assign this to that person/team instead of Milo, but I don't know the structure of the Infra teams well enough to do that. I only filed it against Milo because that's what the "Feedback" button did.
Dec 6
Re general question: From the Milo side there's no rule for this. This is entirely determined by recipe owners. At the highest level, for "Should this generally be the case for chromium recipes?", the CCI team can answer that.

This doesn't look like a recipe engine issue. This looks like a recipe issue (or it might be entirely intentional). I'll let CCI triage this, since it appears to be a general chromium recipe question.
Dec 7
Looking at: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/76231

There appear to be two problems:
* Several shards timed out. In the shard I checked (https://chromium-swarm.appspot.com/task?id=419b3fe019779c10&refresh=10&show_raw=1), 3 tests were not run. This should have resulted in at least 3 "unexpected failures".
* The failure in telemetry_perf_unittests should have caused the build to go through the normal 'without patch', 'retry with patch' steps. It did not do so and instead succeeded. That's not good.
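The expectation in the first problem above can be sketched as a counting rule (illustrative only; `unexpected_failures` and the result encoding are my assumptions, not the actual chromium_tests code): tests that never ran because a shard timed out should count as unexpected failures, alongside tests that ran and failed.

```python
# Illustrative sketch, not the real recipe code: tests missing from a
# shard's results (e.g. because the shard timed out before running them)
# count as unexpected failures, together with tests that ran and failed.
def unexpected_failures(expected_tests, shard_results):
    """shard_results maps test name -> 'PASS'/'FAIL' for tests that ran."""
    missing = [t for t in expected_tests if t not in shard_results]
    failed = [t for t, r in shard_results.items() if r == 'FAIL']
    return sorted(missing + failed)

# 4 tests expected, but the shard timed out after running only 2 of them:
print(unexpected_failures(['a', 'b', 'c', 'd'],
                          {'a': 'PASS', 'b': 'FAIL'}))  # -> ['b', 'c', 'd']
```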
Dec 7
Oh, the link I sent was not for the CQ. It was for a CI builder. *scratches head*
Dec 7
Note to self: the relevant code is at https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/chromium_tests/api.py?type=cs&q=main_waterfall_steps&g=0&l=1051
Dec 7
I suspect that this is the complementary edge case to https://chromium-review.googlesource.com/c/chromium/tools/build/+/1357281

Based on the lack of output from the build, we know that this line never fires, because failed_tests is empty: https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/chromium_tests/api.py?type=cs&q=main_waterfall_steps&g=0&l=320
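One way the failed_tests-is-empty symptom can arise (a hypothetical illustration of the suspected bug class, not the actual api.py code): if failures are collected only from parsed per-test results, a shard that timed out and produced no results contributes nothing, so failed_tests stays empty and the failure-handling branch never fires even though the suite did not really pass.

```python
# Hypothetical illustration of the suspected edge case, not the real code:
# a timed-out shard has no parsed results and is silently skipped, so
# failed_tests stays empty even though the suite did not really pass.
def collect_failed_tests(shards):
    failed = []
    for shard in shards:
        results = shard.get('results')
        if results is None:   # timed-out shard: buggy silent skip
            continue
        failed.extend(t for t, r in results.items() if r == 'FAIL')
    return failed

shards = [{'results': {'x': 'PASS'}}, {'results': None}]  # 2nd shard timed out
print(collect_failed_tests(shards))  # -> [] : build wrongly looks green
```

A fix for this pattern would treat a shard with no results as a failure (or an infra failure) instead of skipping it.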
Dec 8
Hm. I suspect that this issue is related to validate_task_results(), but I haven't traced through all the logic, so I don't know exactly how this would happen. https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/chromium_tests/steps.py?q=chromium_tests/ste&sq=package:chromium&g=0&l=1799
Dec 10
I believe this was caused by https://chromium-review.googlesource.com/c/chromium/tools/build/+/1333174, and has since been reverted.
Dec 10
It happened again in https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/76257, at 2018-12-07 2:19 AM (PST), after r1333174 was reverted in https://chromium.googlesource.com/chromium/tools/build/+/d832c5d0d61408972ee2eba563aed35d03ecf741 at 2018-12-06 07:35:26.
Dec 10
And I believe that was due to its reland, https://chromium-review.googlesource.com/c/chromium/tools/build/+/1365470, and its subsequent second revert: https://chromium-review.googlesource.com/c/chromium/tools/build/+/1367544

These changes certainly don't have the best track record :(
Dec 12
Comment 1 by hinoka@chromium.org, Dec 6