
Issue 912630


Issue metadata

Status: Duplicate
Owner:
Closed: Dec 12
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Hotlists containing this issue:
Test-Use-Cases



Failing test suites are incorrectly marked as passing.

Project Member Reported by jyasskin@chromium.org, Dec 6

Issue description

https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/76232 lists
"Failure telemetry_perf_unittests Failure webkit_layout_tests Failure telemetry_unittests" in the top line, but a failure in telemetry_perf_unittests doesn't cause the bot to fail, as seen in https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/76231. This makes it harder for a sheriff to see what actually needs fixing.
 
Components: -Infra>Platform>Milo Tests>Telemetry
This is controlled by the telemetry test harness.
Components: -Tests>Telemetry Infra>Platform>Milo
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/76228 has the same sort of thing for "Failure webkit_layout_tests", so it's not something to do with the specific set of tests.
Milo doesn't decide what to list in the status; that's controlled by the underlying recipe (which is written by the test suite owners).

https://screenshot.googleplex.com/TyceAJ5EMZc

I'm a little confused about what's going on here. It looks to me like the build should've turned red, but it's green. Why is that?
Note about that example: For the telemetry tests, it lists 0 expected failures for both telemetry_unittests and telemetry_perf_unittests. Those failures are timeouts and unexpected failures, respectively. webkit_layout_tests does indeed have many "expected failures", which is an unfortunate and confusing aspect of the Blink web test harness.


General question for Ryan: Is a build always supposed to be failed (red) when there is at least one failed (red) step (and no purple steps)?

Possibly silly question: If a red step is supposed to always make the build red but it's not happening, could it be a recipe engine issue?
Someone should be in charge of making sure various bits of the system are coherent: in this case, that when something says "Failed", the build actually fails. This should not be something individual recipe writers are able to get wrong. I'd love to assign this to that person/team instead of Milo, but I don't know the structure of the Infra teams well enough to do that. I only filed it against Milo because that's what the "Feedback" button did.
Components: -Infra>Platform>Milo Infra>Client
Re general question: From the Milo side there's no rule for this. This is entirely determined by recipe owners. At the highest level, for the question "Should this generally be the case for chromium recipes?", the CCI team can answer that.

This doesn't look like a recipe engine issue. It looks like a recipe issue (or it might even be entirely intentional).

I'll let CCI triage this, since it appears to be a general chromium recipe question.
Components: -Infra>Client Infra>Client>Chrome
Cc: jbudorick@chromium.org
Labels: -Pri-2 Pri-1
Looking at: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/76231

There appear to be two problems:
* There are several shards that timed out. In the shard I checked (https://chromium-swarm.appspot.com/task?id=419b3fe019779c10&refresh=10&show_raw=1), 3 tests were not run. This should have resulted in at least 3 "unexpected failures" (see the sketch after this list).
* The failure in telemetry_perf_unittests should have caused the build to go through the normal 'without patch', 'retry with patch' steps. It did not do so and instead succeeded. That's not good.
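As a rough illustration of the first problem, here is a minimal, hypothetical sketch (the function and names are illustrative, not the actual chromium_tests recipe code) of how tests missing from a timed-out shard could be counted as unexpected failures instead of being silently dropped:

```python
# Hypothetical sketch: compare the tests a shard was expected to run against
# the tests that actually produced results, and report anything missing as an
# unexpected failure. Names here are illustrative, not the real recipe API.
def missing_as_unexpected_failures(expected_tests, shard_results):
    """expected_tests: iterable of test names assigned to the shard.
    shard_results: dict mapping test name -> result string; possibly
    incomplete if the shard timed out."""
    ran = set(shard_results)
    missing = sorted(set(expected_tests) - ran)
    # Each test with no result should count as an unexpected failure, so the
    # suite (and the build) goes red instead of green.
    return missing

# For a shard where 3 of 4 assigned tests produced no results, this should
# have reported at least 3 unexpected failures:
print(missing_as_unexpected_failures(
    ['TestA', 'TestB', 'TestC', 'TestD'],
    {'TestA': 'PASS'}))   # ['TestB', 'TestC', 'TestD']
```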


Oh, the link I sent was not for the CQ. It was for a CI builder. *scratches head*
I suspect that this is the complement edge case to https://chromium-review.googlesource.com/c/chromium/tools/build/+/1357281

Based on the lack of output from the build, we know that this line never fires because failed_tests is empty:
https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/chromium_tests/api.py?type=cs&q=main_waterfall_steps&g=0&l=320
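For readers without the source open, a minimal sketch of the control flow being described, assuming illustrative names (StepFailure and Suite are stand-ins, not the real recipe engine or chromium_tests classes): if the per-suite results never populate failed_tests, the branch that fails the build is skipped and the build stays green.

```python
# Hypothetical sketch of the described control flow; not the actual api.py code.
class StepFailure(Exception):
    """Stand-in for the recipe engine's failure exception."""

class Suite:
    def __init__(self, name, failures, has_valid_results=True):
        self.name = name
        self.failures = failures
        self.has_valid_results = has_valid_results

def check_results(test_suites):
    # Collect suites whose results are invalid or contain failures.
    failed_tests = [t for t in test_suites
                    if not t.has_valid_results or t.failures]
    if failed_tests:
        # This is the guard that should turn the build red; the observation
        # above is that it never fires because failed_tests stays empty.
        raise StepFailure('%d test suite(s) failed' % len(failed_tests))

# If telemetry_perf_unittests reports no failures upstream of this point,
# the guard never trips and the build is (incorrectly) green:
check_results([Suite('telemetry_perf_unittests', failures=[])])
```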
Summary: Failing test suites are incorrectly marked as passing. (was: Don't list testsuites that are expected to fail)
Hm. I suspect that this issue is related to validate_task_results(), but I haven't traced through all the logic, so I don't know exactly how this would happen.

https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/chromium_tests/steps.py?q=chromium_tests/ste&sq=package:chromium&g=0&l=1799
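A hedged guess at the kind of bug being suspected, purely as an illustration (this is not the real validate_task_results() in steps.py): if a timed-out shard produces no output and the validator treats that as "valid, zero failures" rather than as invalid results, the failures vanish before they ever reach the summary step.

```python
# Purely illustrative; not the actual chromium_tests/steps.py implementation.
def validate_task_results(shard_output):
    """shard_output: parsed JSON dict from a swarming shard, or None if the
    shard timed out before writing output."""
    if shard_output is None:
        # Suspected bug pattern: a missing/timed-out shard is treated as
        # zero failures instead of as invalid results.
        return {'valid': True, 'failures': []}
    return {'valid': True, 'failures': shard_output.get('failures', [])}

# A timed-out shard then contributes nothing to failed_tests, and the suite
# "passes":
print(validate_task_results(None))   # {'valid': True, 'failures': []}
```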
Owner: tikuta@chromium.org
Status: Fixed (was: Untriaged)
I believe this was caused by https://chromium-review.googlesource.com/c/chromium/tools/build/+/1333174, and has since been reverted.
Cc: bpastene@chromium.org erikc...@chromium.org
Status: Assigned (was: Fixed)
It happened again in https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/76257, at 2018-12-07 2:19 AM (PST), after r1333174 was reverted in https://chromium.googlesource.com/chromium/tools/build/+/d832c5d0d61408972ee2eba563aed35d03ecf741 at 2018-12-06 07:35:26.
And I believe that was due to its reland:
https://chromium-review.googlesource.com/c/chromium/tools/build/+/1365470
And subsequent 2nd revert:
https://chromium-review.googlesource.com/c/chromium/tools/build/+/1367544

These changes certainly don't have the best track record :(
Mergedinto: 914009
Status: Duplicate (was: Assigned)
