On CQ, reject CLs that introduce __new tests__ that are flaky
Issue description
Currently, we have Flake Analyzer to analyze flaky tests in the post-submit stage, and we have a recipe that reruns a given gtest N times to calculate its pass rate. According to the data in Flake Analyzer, a good number of tests are already flaky in the CLs that introduce them. They pass CQ try-jobs (perhaps because of the 3 retries upon failure), and then surface as flaky on the waterfall or in the CQ try-jobs of other CLs.
It would be much better to prevent a CL from landing if it introduces __new tests__ that are flaky, by rerunning the new tests N times (e.g. 20, or even 100). We could reject the CL if the pass rate of any new test is below 100%.
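To make the proposal concrete, here is a minimal sketch of such a check. This is not actual recipe code; the rerun loop, the command line, and the exit-code convention are assumptions:

    import subprocess
    import sys

    def pass_rate(test_binary, test_name, runs=20):
        # Run the binary `runs` times filtered to a single test and
        # return the fraction of runs in which the test passed.
        passes = 0
        for _ in range(runs):
            result = subprocess.run(
                [test_binary, '--gtest_filter=' + test_name])
            if result.returncode == 0:
                passes += 1
        return passes / float(runs)

    if __name__ == '__main__':
        # Usage (hypothetical): check_new_test.py <test_binary> <Suite.Test>
        binary, test = sys.argv[1], sys.argv[2]
        # Reject the CL unless the new test passed on every single run.
        sys.exit(0 if pass_rate(binary, test) == 1.0 else 1)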
However, we might not have enough machine resources to do exactly the same for __existing tests__.
We should, though, have enough machine resources to verify whether a CL fixes the flakiness of __existing tests__:
1. The CL owner adds a tag "FIX_FLAKY_TESTS={'browser_tests': 'suite.test'}" to the CL description.
2. The trybot recipe reads that tag and reruns the given tests N times (see the sketch after this list).
3. The try-job fails if the pass rate of any of those tests is below 100%.
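A minimal sketch of steps 1-2, assuming the tag format shown above and that the recipe has the CL description available as a string (the helper name and regex are hypothetical):

    import ast
    import re

    # Matches a line like: FIX_FLAKY_TESTS={'browser_tests': 'suite.test'}
    TAG_RE = re.compile(r"^FIX_FLAKY_TESTS=(.+)$", re.MULTILINE)

    def parse_fix_flaky_tests(cl_description):
        # Return the {test_suite: test_name} mapping from the tag, or an
        # empty dict when the CL carries no tag.
        match = TAG_RE.search(cl_description)
        if not match:
            return {}
        # The tag value is a Python dict literal; literal_eval parses it
        # without executing arbitrary code.
        return ast.literal_eval(match.group(1))

    description = (
        "Fix flaky shutdown test.\n\n"
        "FIX_FLAKY_TESTS={'browser_tests': 'suite.test'}")
    assert parse_fix_flaky_tests(description) == {
        'browser_tests': 'suite.test'}

Each parsed entry would then be fed into the same rerun-N-times check sketched above, failing the try-job on any pass rate below 100%.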
This would give developers confidence when such a CL passes the check and lands, and would also avoid the commit-revert-fix-commit-revert-fix-... cycle.
It is also helpful when developers don't have the test config (OS, GPU, etc.) at hand, or when the flakiness is reproducible only on bots and not on a local workstation.
Mar 3 2017
This is somewhat similar to issue 694603.
Nov 20 2017
There are ~10k *new* gtests added in between Sept 14 and Nov 18.
Comment 1 by st...@chromium.org, Mar 3 2017
Summary: On CQ, reject CLs that introduce __new tests__ that are flaky (was: On CQ, reject CLs that introduce new tests that are flaky)