Flake detection idea: re-run to differentiate a consistent failure from a flaky test
Issue description

One thing that prevents Flake Detection from being applied to other projects such as webrtc is that it relies on the "without patch" step to differentiate a consistent failure from a flaky test. For projects without the "without patch" mechanism, one idea is to re-run the test to achieve the same goal.

Here is how it works. Imagine a world without the "without patch" rerun:

CL1: cq attempt1 build1: test t failed.
CL1: cq attempt2 build2: all tests passed.
Flake Detection caught t as a flaky test, so it reruns t with the failed hash 30 times to decide whether it is a consistent failure (which may or may not be related to the patch).

CL2: cq attempt1 build3: test t failed.
CL2: cq attempt2 build4: all tests passed.
Flake Detection caught t as a flaky test, so it reruns t with the failed hash 30 times to decide whether it is a consistent failure (which may or may not be related to the patch).

CL3: cq attempt1 build5: test t failed.
CL3: cq attempt2 build6: all tests passed.
Flake Detection caught t as a flaky test, so it reruns t with the failed hash 30 times to decide whether it is a consistent failure (which may or may not be related to the patch).

There are no code changes between the two attempts for each CL (except for rebase). Because test t flaked on 3 different CLs, it is considered a flaky test. Before filing a bug, Flake Detection checks whether all three occurrences above are consistent failures. If yes, DON'T file a bug; otherwise, file a bug.

Why does it work?

1. If t is a consistent failure that was committed to the code, then the re-runs for all three occurrences would fail consistently, so a bug WON'T be filed.
2. If t is a consistent failure caused by a specific patch, the above situation is unlikely to happen because we look at 3 different CLs.
3. If t is a flaky test, the re-runs won't fail consistently, so a bug WILL be filed.
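The filing decision described above can be sketched roughly as follows. This is only an illustration of the proposal, not Flake Detection's actual code: `Occurrence`, `is_consistent_failure`, `should_file_bug`, and the `rerun` callback are hypothetical names, and the sketch assumes a re-run reports one pass/fail boolean per iteration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

RERUN_COUNT = 30       # number of re-runs per failed hash, as in the description
MIN_DISTINCT_CLS = 3   # number of different CLs required, as in the description


@dataclass(frozen=True)
class Occurrence:
    """One flaky occurrence: a failed cq attempt later followed by a passing one.

    All field names are hypothetical.
    """
    cl: str            # e.g. "CL1"
    failed_build: str  # e.g. "build1"
    failed_hash: str   # hash of the test binary/inputs from the failed build


def is_consistent_failure(results: Sequence[bool]) -> bool:
    """True if every re-run failed (results holds True for pass, False for fail)."""
    return len(results) > 0 and not any(results)


def should_file_bug(
    occurrences: Sequence[Occurrence],
    rerun: Callable[[str, int], Sequence[bool]],
) -> bool:
    """Decide whether to file a flake bug for one test.

    `rerun(failed_hash, count)` is an assumed callback that re-runs the test
    `count` times against the given hash and returns one boolean per run.
    """
    distinct_cls = {o.cl for o in occurrences}
    if len(distinct_cls) < MIN_DISTINCT_CLS:
        return False  # not enough evidence yet to call the test flaky

    all_consistent = all(
        is_consistent_failure(rerun(o.failed_hash, RERUN_COUNT))
        for o in occurrences
    )
    # Consistent failures everywhere suggest a committed breakage: no flake bug.
    return not all_consistent
```

Passing `rerun` in as a callback keeps the decision logic separate from however the re-runs are actually scheduled, which matters if VM capacity limits how many of them can run.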
Chatted with Roberto: the build_index makes it quick to find the hash given a build configuration and isolate target name. Will explore more later.
Jun 7 2018
This sounds like a working idea! One factor to consider is VM capacity.
Aug 2
Comment 1 by st...@chromium.org, Jun 7 2018