Reduce false rejects to consistently be <1%.
Issue description

False rejects are one of two top-level metrics tracked by the ChOps CQ SLO. The false reject rate is roughly defined as the percentage of landed CLs that needed to have the CQ button ticked twice [with no code changes] because the first CQ run flakily failed.

There are two primary sources of false rejects:

1) Some test suites are not robust to errors and return INVALID_TEST_RESULTS when they could fail gracefully. INVALID_TEST_RESULTS should only occur in exceptional circumstances that require trooper attention. Right now they occur frequently and are mostly ignored.

2) Some test suites are not robust to flaky failures. For example, flaky Blink layout test failures are exceedingly likely to fail again on retry. Having test suites that are robust to flaky failures is very important, since retrying at the test suite layer is cheap, O(seconds), whereas retrying at any other layer [swarming, recipe, CQ] has O(minutes)++ of overhead.

Once we drive the false reject rate sufficiently low, build failures no longer need to be retried at the CQ layer. This will roughly cut the CQ cycle time of CQ runs with build failures in half. Given that 1/3 of all CQ runs had build failures [as of 9/12 -- see crbug.com/882969#c8], this should significantly reduce the 90th percentile of CQ cycle time.

Reducing the false reject rate to 0 primarily consists of investigating and fixing sources of false rejects. This is a large task and will be broken into subtasks [split by builder]. Each builder will have its own tracking bug.
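The rough definition of the false reject rate above can be sketched in code. This is only an illustration under stated assumptions: the `LandedCL` record and its fields are hypothetical and do not reflect the real CQ data schema or how the SLO dashboard actually computes the metric.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class LandedCL:
    """Hypothetical record for one landed CL (not a real CQ schema)."""
    cl_id: str
    # CQ attempts on the patchset that finally landed, i.e. with no code
    # changes between attempts.
    attempts_on_landed_patchset: int
    # True if the first of those attempts failed.
    first_attempt_failed: bool


def false_reject_rate(landed_cls: Iterable[LandedCL]) -> float:
    """Return the percentage of landed CLs that were falsely rejected.

    A CL counts as a false reject when its landed patchset needed at least
    two CQ attempts and the first attempt failed.
    """
    cls = list(landed_cls)
    if not cls:
        return 0.0
    rejected = sum(
        1
        for cl in cls
        if cl.attempts_on_landed_patchset >= 2 and cl.first_attempt_failed
    )
    return 100.0 * rejected / len(cls)


# Made-up sample data for illustration only.
sample = [
    LandedCL("cl-1", 1, False),
    LandedCL("cl-2", 2, True),   # needed a second CQ tick, no code change
    LandedCL("cl-3", 1, False),
    LandedCL("cl-4", 1, False),
]
print(false_reject_rate(sample))
```

With this made-up sample, one of four landed CLs needed a second CQ attempt, so the function reports 25.0.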
Oct 4
> It seems unlikely to me that we'll get all the way to zero, at least not for sustained periods of time;

Agreed

> I do think "sufficiently low" is feasible. Maybe we should focus on what that number is?

Proposal: I would like "false rejects" to be consistently <1%. We can gate removing full CQ retries on "false rejects" staying below 1%, even with full CQ retries disabled.
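One way to read "consistently <1%" as a gate is to require every recent measurement to stay under the threshold. The sketch below is an assumption, not something the thread specifies: the per-day granularity, the `min_days` window of 14, and the function name are all hypothetical choices.

```python
from typing import Sequence


def retries_can_be_removed(
    daily_rates_percent: Sequence[float],
    threshold_percent: float = 1.0,  # the <1% target proposed above
    min_days: int = 14,              # assumed window; not from the thread
) -> bool:
    """Return True if the false reject rate stayed below the threshold
    on each of the most recent `min_days` days."""
    if len(daily_rates_percent) < min_days:
        # Not enough history to call the rate "consistent".
        return False
    recent = daily_rates_percent[-min_days:]
    return all(rate < threshold_percent for rate in recent)


# Made-up daily rates (percent) for illustration only.
steady = [0.4, 0.6, 0.5] * 5                 # 15 days, all under 1%
spiky = [0.4, 0.6, 0.5] * 4 + [2.3, 0.5, 0.4]  # one bad day in the window
print(retries_can_be_removed(steady))
print(retries_can_be_removed(spiky))
```

A stricter or looser gate (for example, a weekly average instead of a per-day check) would be equally compatible with the proposal; the thread leaves that choice open.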
Oct 4
Oct 6
> I would like "false rejects" to be consistently <1%. We can gate removing full
> CQ retries on "false rejects" staying below 1%, even with full CQ retries disabled.

I was happy with <5%, so under 1% would be terrific. That seems like reasonable criteria for removing the retry, and I'm open to a more lenient threshold if we can't get to under 1%.
Dec 4
Comment 1 by dpranke@chromium.org, Oct 4