
Issue 916935

Starred by 3 users

Issue metadata

Status: Untriaged
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: ----
Type: Feature


Participants' hotlists:
Test-Use-Cases



[LUCI-Feedback] Distinguishing new test failures from the same old failures

Project Member Reported by sortie@google.com, Dec 20

Issue description

At Dart we're implementing a first-class test results workflow. Our systems can compare the current results with the previous test results and with a set of approved test failures, which shouldn't turn builders red.

We'd like to be able to notify developers when a builder has a new test failure, even if the builder is already red. We currently use edge triggering with luci-notify, so developers know when builders turn red, but they are not notified about any additional test failures. Likewise, the Milo console shows red builds in columns where usually only the first red build is interesting, but other interesting builds that add further regressions could be hidden in the same red column.

Buildbot supported the 'Failed Again' outcome, and Milo still supports showing it, but the new LUCI recipes system doesn't let us use that color. If Failed Again were supported, our recipe could readily determine whether there are new regressions and make the builder red, or make it orange (failed again) if only the same test failures were seen. luci-notify could then be put in level-triggered mode to mail on every red build and ignore the orange ones. Developers would then be notified when additional tests break even if the builder is already red, and it would be easy to spot the interesting (red) builds on the Milo console.
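
To make the proposal concrete, here is a minimal sketch of the red/orange decision and of level-triggered mailing. It is only an illustration, not the actual Dart recipe or the luci-notify implementation: the names Outcome, classify and builds_to_mail are hypothetical, and it assumes that "the same test failures" means no failure outside the previous build's failure set.

```python
from enum import Enum
from typing import List, Set


class Outcome(Enum):
    GREEN = "success"
    RED = "failure"          # at least one new test failure
    ORANGE = "failed again"  # only failures already seen in the previous build


def classify(previous_failures: Set[str], current_failures: Set[str]) -> Outcome:
    """Pick the build color from the previous and current failure sets."""
    if not current_failures:
        return Outcome.GREEN
    if current_failures - previous_failures:
        return Outcome.RED
    return Outcome.ORANGE


def builds_to_mail(failures_per_build: List[Set[str]]) -> List[int]:
    """Indices of builds a level-triggered notifier would mail about (red only)."""
    mailed = []
    previous: Set[str] = set()  # assumption: the history starts from a green state
    for index, current in enumerate(failures_per_build):
        if classify(previous, current) is Outcome.RED:
            mailed.append(index)
        previous = current
    return mailed


# Hypothetical build history: each entry is the set of failing tests in one build.
history = [set(), {"a_test"}, {"a_test"}, {"a_test", "b_test"}, {"b_test"}]
print(builds_to_mail(history))  # prints [1, 3]
```

With edge triggering on a plain red/green status, only build 1 would produce mail here; the extra regression in build 3 would go unnoticed because the builder was already red.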

Some extensibility with additional colors could be useful. For instance, we have deflaking and could use a dark green color to mark a build whose flakes were ignored. This is less important.

This feature isn't critical to us shipping first-class test results, but it would improve how well the workflow works and give our developers a better experience.
 
Cc: iannucci@chromium.org
Labels: Type-Feature
Would the recipe talk to some external state to retrieve previous state?
Yes, we store our test results in cloud storage and the recipe compares the current test results with a baseline. For CQ, the baseline is the previous CI results, to see if there are new regressions; for CI, it is a set of approved test failures, to see if any test failures are unapproved.
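
A rough sketch of that comparison, under stated assumptions: the JSON layout (test name mapped to "pass" or "fail") and the function names are hypothetical stand-ins, not the real Dart results format, and the documents are passed in as strings rather than fetched from cloud storage.

```python
import json
from typing import List, Set


def failing_tests(results_json: str) -> Set[str]:
    """Names of tests whose recorded outcome is anything other than "pass"."""
    results = json.loads(results_json)
    return {name for name, outcome in results.items() if outcome != "pass"}


def new_failures(current_json: str, baseline_json: str) -> List[str]:
    """Tests failing now that are not failing in the baseline.

    For CQ the baseline would be the previous CI results; for CI it would be
    the approved failures. The comparison itself is the same in both cases.
    """
    return sorted(failing_tests(current_json) - failing_tests(baseline_json))


# Hypothetical example data, not real test results.
baseline = '{"t1": "pass", "t2": "fail", "t3": "pass"}'
current = '{"t1": "pass", "t2": "fail", "t3": "fail"}'
print(new_failures(current, baseline))  # prints ['t3']
```

In this sketch a test that fails now but is absent from the baseline counts as a new failure, which is one possible policy among several.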
Cc: -iannucci@chromium.org -efoo@chromium.org -no...@chromium.org -whesse@google.com machenb...@chromium.org jbudorick@chromium.org jclinton@chromium.org phosek@chromium.org
Adding representatives of other teams. How do you solve this problem today, or how do you need it to be solved?
Cc: -machenb...@chromium.org -athom@google.com -phosek@chromium.org -jclinton@chromium.org machenbach@google.com no...@chromium.org iannu...@google.com whesse@google.com
Cc: jclinton@google.com phosek@chromium.org athom@google.com
(monorail on phone is hard)
Browser infrastructure doesn't have a good way to handle this case. As designed, I believe SoM is supposed to obviate the need for this for the sheriff use case, though I'm not sure how well that works in practice (and it only addresses the sheriff use case, of course).

I do think this would be a useful capability.
Cc: akes...@chromium.org
