False positive in MediaRouterIntegrationBrowserTest
Issue description
Page URL: https://findit-for-me.appspot.com/waterfall/flake?key=ag9zfmZpbmRpdC1mb3ItbWVytwELEhdNYXN0ZXJGbGFrZUFuYWx5c2lzUm9vdCKAAWNocm9taXVtLm1lbW9yeS9MaW51eCBNU2FuIFRlc3RzLzk2MjUvYnJvd3Nlcl90ZXN0cy9UV1ZrYVdGU2IzVjBaWEpKYm5SbFozSmhkR2x2YmtKeWIzZHpaWEpVWlhOMExrWmhhV3hmVW1WamIyNXVaV04wVTJWemMybHZiZz09DAsSE01hc3RlckZsYWtlQW5hbHlzaXMYAQw
Description: False positive in filing bug.
May 24 2018
The following revision refers to this bug:
https://chromium.googlesource.com/infra/infra/+/b4c0dcb9d49688898db4c86b9cc3f59f2e7babb3

commit b4c0dcb9d49688898db4c86b9cc3f59f2e7babb3
Author: Jeffrey Li <lijeffrey@chromium.org>
Date: Thu May 24 22:48:24 2018

[Findit] Flake Analyzer - Avoid false positives with better confidence score

1. Raise the bar for stable --> flaky points requiring 2+ fully (100% or 0% instead of 98%) stable points before considering it for 70% confidence to avoid filing bugs/notifying culprits that are unlikely correct. 2 stable in a row is still strict enough to avoid most false positives without bailing out unnecessarily.
2. If the data point is proposed to have 70% confidence, use max(.7, steppiness) so not all culprits are either 100% vs 70% which looks silly.
3. Relax requirement of 3+ fully-stable --> flaky to just 2 for notifying culprits, since bugs are still filed regardless. Filing bugs and sending notifications should follow similar criteria, since bugs filed usually are assigned back to the CL owner anyway for an initial investigation.
4. Fallback to steppiness in all other cases.

With this change, because of the requirement for 2+ fully-stable points before a culprit will be assigned a 0.7+ confidence score, a few more false negatives may be observed, but the trade off is a large reduction in false positives. Historically, many cases observe pass rate patterns of 99% -> 100% -> 75%, which were incorrect, however 100% -> 100% -> 75% were much more reliable. In the former case, steppiness would be the primary scoring mechanism, which would assign a lower confidence score, and in the latter, 0.7+ would be used.

Bug: 840413, 840074
Change-Id: I57fb0e40fb60d39b2e5b44c018c7b030f9c080fc
Reviewed-on: https://chromium-review.googlesource.com/1069832
Commit-Queue: Jeffrey Li <lijeffrey@chromium.org>
Reviewed-by: Shuotao Gao <stgao@chromium.org>

[modify] https://crrev.com/b4c0dcb9d49688898db4c86b9cc3f59f2e7babb3/appengine/findit/services/flake_failure/culprit_util.py
[modify] https://crrev.com/b4c0dcb9d49688898db4c86b9cc3f59f2e7babb3/appengine/findit/services/flake_failure/pass_rate_util.py
[modify] https://crrev.com/b4c0dcb9d49688898db4c86b9cc3f59f2e7babb3/appengine/findit/services/flake_failure/confidence_score_util.py
[modify] https://crrev.com/b4c0dcb9d49688898db4c86b9cc3f59f2e7babb3/appengine/findit/services/flake_failure/data_point_util.py
[modify] https://crrev.com/b4c0dcb9d49688898db4c86b9cc3f59f2e7babb3/appengine/findit/services/flake_failure/test/culprit_util_test.py
[modify] https://crrev.com/b4c0dcb9d49688898db4c86b9cc3f59f2e7babb3/appengine/findit/services/flake_failure/test/confidence_score_util_test.py
[modify] https://crrev.com/b4c0dcb9d49688898db4c86b9cc3f59f2e7babb3/appengine/findit/services/flake_failure/flake_constants.py
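
For readers skimming the thread, the scoring rule described in the commit message can be sketched roughly as follows. This is not the code from confidence_score_util.py; the function names, parameters, and the way the stable run is counted are hypothetical, and the steppiness value is simply taken as an input because the commit message does not say how it is computed.

```python
def is_fully_stable(pass_rate):
    """Fully stable means exactly 0% or 100%, not merely close to it."""
    return pass_rate in (0.0, 1.0)


def is_flaky(pass_rate):
    """Anything strictly between 0% and 100% counts as flaky here."""
    return 0.0 < pass_rate < 1.0


def confidence_score(pass_rates_before, culprit_pass_rate, steppiness,
                     min_stable_points=2, floor=0.7):
    """Hypothetical sketch of the rule in the commit message.

    pass_rates_before: pass rates (0.0 to 1.0) of the data points that
        precede the suspected culprit, oldest first.
    culprit_pass_rate: pass rate at the suspected culprit.
    steppiness: a precomputed steppiness score, assumed to be in [0, 1].
    """
    if not is_flaky(culprit_pass_rate):
        # Not a stable --> flaky transition: fall back to steppiness.
        return steppiness

    # Count the fully stable points immediately before the culprit.
    # Simplification: a 0% point and a 100% point both count.
    stable_run = 0
    for rate in reversed(pass_rates_before):
        if not is_fully_stable(rate):
            break
        stable_run += 1

    if stable_run >= min_stable_points:
        # Enough fully stable history: at least 0.7, more if steppiness is higher.
        return max(floor, steppiness)

    # Patterns such as 99% -> 100% -> 75% end up here.
    return steppiness
```

Under these assumptions, `confidence_score([0.99, 1.0], 0.75, 0.4)` falls back to the steppiness value because only one fully stable point precedes the culprit, while `confidence_score([1.0, 1.0], 0.75, 0.4)` is raised to the 0.7 floor.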
Comment 1 by st...@chromium.org, May 13 2018