Issue 809705

Starred by 2 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug

Blocking:
issue 774675
issue 826087




[Findit] Flake Analyzer - Stable failing to flaky may not be reliable

Project Member Reported by lijeffrey@chromium.org, Feb 6 2018

Issue description

https://findit-for-me.appspot.com/waterfall/flake?key=ag9zfmZpbmRpdC1mb3ItbWVymwELEhdNYXN0ZXJGbGFrZUFuYWx5c2lzUm9vdCJlY2hyb21pdW0ud2luL1dpbjcgVGVzdHMgKGRiZykoMSkvNjYxNTAvYnJvd3Nlcl90ZXN0cy9Rbkp2ZDNOcGJtZEVZWFJoVW1WdGIzWmxja0p5YjNkelpYSlVaWE4wTGtOaFkyaGwMCxITTWFzdGVyRmxha2VBbmFseXNpcxgBDA

In this analysis, the "culprit" CL was a revert of an earlier CL that broke everything. The test was already flaky before the breakage, but because Findit only looks for a stable -> flaky transition to identify a culprit, it blamed the revert CL.

For cases like this, a few options to consider:
1. The lookback algorithm should continue to look further back in time. This would produce a more thorough history, but could also miss cases where the culprit is in fact correct (e.g. a fixing CL that didn't fully fix the test).
2. If the culprit is a revert CL, ignore it and don't send a culprit notification.
3. A combination: if the first suspect is a revert, Findit should continue to search further back in time.
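Option 3 could be sketched roughly as below. This is only an illustration, not Findit code: `DataPoint`, `is_stable`, `find_culprit`, the `is_revert` flag and the 2% tolerance are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataPoint:
    """One analyzed commit (hypothetical structure, not Findit's model)."""
    commit_position: int
    pass_rate: float  # fraction of passing runs, 0.0 to 1.0
    is_revert: bool = False


def is_stable(pass_rate: float, tolerance: float = 0.02) -> bool:
    """Stable means (almost) always failing or (almost) always passing."""
    return pass_rate <= tolerance or pass_rate >= 1.0 - tolerance


def find_culprit(points: List[DataPoint]) -> Optional[DataPoint]:
    """Walk from newest to oldest looking for a stable -> flaky transition.

    `points` must be sorted by ascending commit position. If the commit at
    the transition is a revert, skip it and keep searching further back.
    """
    for i in range(len(points) - 1, 0, -1):
        current, previous = points[i], points[i - 1]
        if not is_stable(current.pass_rate) and is_stable(previous.pass_rate):
            if current.is_revert:
                continue  # assumed policy: never blame a revert CL
            return current
    return None


history = [
    DataPoint(98, 1.0),                  # stable passing
    DataPoint(99, 0.7),                  # test becomes flaky here
    DataPoint(100, 0.0),                 # breaking CL: stable failing
    DataPoint(101, 0.7, is_revert=True)  # revert restores the flaky state
]
culprit = find_culprit(history)
print(culprit.commit_position if culprit else None)
```

With the revert flagged, the search passes over commit 101 and lands on commit 99. The trade-off from option 1 still applies: a revert that genuinely introduced flakiness would be skipped as well.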
 

Comment 1 by wylieb@chromium.org, Feb 22 2018

Owner: lijeffrey@chromium.org

Comment 2 by wylieb@chromium.org, Feb 22 2018

Labels: -Pri-3 Pri-2
Status: Assigned (was: Available)

Comment 3 by wylieb@chromium.org, Mar 27 2018

Blocking: 826087
Cc: chanli@chromium.org robert...@chromium.org lijeffrey@chromium.org st...@chromium.org
Issue 723046 has been merged into this issue.
Cc: wylieb@chromium.org
Issue 811623 has been merged into this issue.

Comment 6 by st...@chromium.org, Jun 14 2018

Cc: -robert...@chromium.org -wylieb@chromium.org -chanli@chromium.org -lijeffrey@chromium.org -st...@chromium.org
From January 1 to October 8, 2018, there were:
* 848 unique culprits identified
* 72 of those culprits were stable failing --> flaky
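For scale, the share of culprits affected can be worked out from the two counts above. The percentage is derived here, not quoted from the analysis, and the variable names are illustrative.

```python
total_culprits = 848
stable_failing_to_flaky = 72

# Fraction of all identified culprits that came from a
# stable failing --> flaky transition.
share = stable_failing_to_flaky / total_culprits
print(f"{share:.1%}")  # prints "8.5%"
```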

Based on a random sampling of these analyses, almost all cases were false positives:
* A revert of a CL that caused outright breakage (majority of cases)
* Some low flakiness cases (always failing --> mostly failing) (fairly common)
* Something went wrong during swarming tasks, giving a false 0% pass rate before the culprit (very rare)

We may want to revisit a way for Findit to detect this case and act on it properly, but in the meantime, bailing out of auto actions should silence the false positives.
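A minimal sketch of such a bail-out check, assuming pass rates are fractions between 0 and 1. The function name, parameters and 2% tolerance are hypothetical and not taken from the Findit source.

```python
def should_take_auto_action(pass_rate_before_culprit: float,
                            pass_rate_at_culprit: float,
                            tolerance: float = 0.02) -> bool:
    """Return False when the culprit is a stable failing --> flaky transition.

    Hypothetical helper: auto actions (such as notifying or reverting) are
    skipped when the test was (almost) always failing before the suspected
    culprit and only became flaky at it.
    """
    was_stable_failing = pass_rate_before_culprit <= tolerance
    is_flaky = tolerance < pass_rate_at_culprit < 1.0 - tolerance
    if was_stable_failing and is_flaky:
        return False  # almost always a false positive, so bail out
    return True


# Stable failing (0%) before, flaky (70%) at the culprit: bail out.
print(should_take_auto_action(0.0, 0.7))  # False
# Stable passing (100%) before, flaky at the culprit: proceed.
print(should_take_auto_action(1.0, 0.7))  # True
```

The analysis result itself is unaffected in this sketch; only the automatic follow-up is suppressed, so the culprit can still be inspected manually.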

Comment 8 by bugdroid1@chromium.org, Oct 11

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/5fe6769f4911a1dc1c56bb6248533222cea9cc9f

commit 5fe6769f4911a1dc1c56bb6248533222cea9cc9f
Author: Jeffrey Li <lijeffrey@chromium.org>
Date: Thu Oct 11 19:32:23 2018

[Findit] Flake Analyzer - Bail out of auto actions for stable failing --> flaky culprits

Bail out of auto actions for stable failing --> flaky results, which are almost always false
positives.

Bug: 809705
Change-Id: I2c535623556546862298392fcda49a27276df2c3
Reviewed-on: https://chromium-review.googlesource.com/c/1272220
Commit-Queue: Jeffrey Li <lijeffrey@chromium.org>
Reviewed-by: Yuke Liao <liaoyuke@chromium.org>
Cr-Commit-Position: refs/heads/master@{#18250}
[modify] https://crrev.com/5fe6769f4911a1dc1c56bb6248533222cea9cc9f/appengine/findit/services/flake_failure/flake_analysis_util.py
[modify] https://crrev.com/5fe6769f4911a1dc1c56bb6248533222cea9cc9f/appengine/findit/services/flake_failure/pass_rate_util.py
[modify] https://crrev.com/5fe6769f4911a1dc1c56bb6248533222cea9cc9f/appengine/findit/services/flake_failure/test/flake_analysis_util_test.py
[modify] https://crrev.com/5fe6769f4911a1dc1c56bb6248533222cea9cc9f/appengine/findit/services/flake_failure/test/pass_rate_util_test.py
[modify] https://crrev.com/5fe6769f4911a1dc1c56bb6248533222cea9cc9f/appengine/findit/pipelines/flake_failure/analyze_flake_pipeline.py
[modify] https://crrev.com/5fe6769f4911a1dc1c56bb6248533222cea9cc9f/appengine/findit/pipelines/flake_failure/test/analyze_flake_pipeline_test.py
