
Issue 654434


Issue metadata

Status: WontFix
Owner: ----
Closed: Aug 13
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug

Blocking:
issue 615468




Group tests by step names to make sure that analyze won't affect group flakiness

Project Member Reported by serg...@chromium.org, Oct 10 2016

Issue description

Currently we compute flakiness for a group of tests (e.g. all tests in a dir, all tests owned by some team, all tests in a given test suite) by looking at the runs that passed with 4 tries. Given these runs, we check which ones would have failed with just 3 tries for individual tests and divide that number by the total number of runs that passed with 4 tries.
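
For illustration, the computation described above could be sketched roughly as follows. This is a minimal sketch, not the actual pipeline code: the data model (a run as a dict mapping test name to the number of tries the test needed to pass, or None if it never passed) and the names passed_within and group_flakiness are assumptions made up for this example.

```python
from typing import Dict, List, Optional

# Hypothetical data model (not from the real pipeline): one run maps each
# test name to the number of tries it needed to pass (1..4), or None if
# the test never passed.
Run = Dict[str, Optional[int]]

MAX_TRIES = 4


def passed_within(run: Run, tries: int) -> bool:
    """Return True if every test in the run passed using at most `tries` tries."""
    return all(t is not None and t <= tries for t in run.values())


def group_flakiness(runs: List[Run]) -> Optional[float]:
    """Fraction of runs passing with 4 tries that would have failed with 3."""
    passing = [r for r in runs if passed_within(r, MAX_TRIES)]
    if not passing:
        return None  # No passing runs, so the ratio is undefined.
    would_fail = sum(1 for r in passing if not passed_within(r, MAX_TRIES - 1))
    return would_fail / len(passing)


runs = [
    {"a": 1, "b": 1},     # passes with 3 tries
    {"a": 4, "b": 1},     # passes only thanks to the 4th try
    {"a": None, "b": 1},  # fails even with 4 tries, so it is excluded
    {"a": 2, "b": 3},     # passes with 3 tries
]
print(group_flakiness(runs))
```

In this example three runs pass with 4 tries and one of them needs the 4th try, so the group flakiness is one third.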

However, we did not account for the fact that some runs execute only a subset of tests because of the analyze step, so they are not necessarily comparable to each other. On the other hand, it may be that analyze only selects which test suites must be run rather than selecting specific tests. We need to figure it out...
 
Cc: phajdan.jr@chromium.org
Pawel, does the analyze step select specific tests in a step to be run, or does it only select which steps are compiled/run?
Blocking: 615468
Cc: phajdan@google.com
Analyze only works at compile targets / step level. Inside a step, it does not apply any filters.
Thanks, Pawel. In this case it's sufficient to group things by step name when computing group flakiness. We may need to take care of it later, but atm, since the system is only used for webkit_tests, it is not an issue.
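
A rough sketch of what grouping by step name could look like, under the same assumptions as the issue description (a run counts only if it passed with 4 tries, and it is flaky if it needed the 4th try). The tuple layout and the function name flakiness_by_step are hypothetical and exist only for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical record: (step_name, {test_name: tries needed to pass, or None}).
StepRun = Tuple[str, Dict[str, Optional[int]]]


def flakiness_by_step(step_runs: Iterable[StepRun],
                      max_tries: int = 4) -> Dict[str, float]:
    """Compute group flakiness separately for each step name."""
    by_step: Dict[str, List[Dict[str, Optional[int]]]] = defaultdict(list)
    for step, results in step_runs:
        by_step[step].append(results)

    flakiness: Dict[str, float] = {}
    for step, runs in by_step.items():
        # Keep only runs where every test passed within max_tries.
        passing = [
            r for r in runs
            if all(t is not None and t <= max_tries for t in r.values())
        ]
        if not passing:
            continue  # No passing runs for this step, ratio is undefined.
        # A passing run would have failed with one fewer try if any test
        # needed the last try.
        flaky = sum(1 for r in passing if any(t == max_tries for t in r.values()))
        flakiness[step] = flaky / len(passing)
    return flakiness


step_runs = [
    ("webkit_tests", {"a": 1}),
    ("webkit_tests", {"a": 4}),
    ("browser_tests", {"x": 1}),
    ("browser_tests", {"x": None}),
]
print(flakiness_by_step(step_runs))
```

Since analyze only decides which steps run, runs sharing a step name cover the same set of tests and stay comparable to each other.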
Components: -Infra>Flakiness>Pipeline Infra>Flakiness
Labels: Flakiness-Surface Milestone-Dogfood
Summary: Group tests by step names to make sure that analyze won't affect group flakiness (was: See what effect does "analyze" step have on computed flakiness of a group of tests)
Adding to Flakiness Surface project since this will block us from extending our system to all Chromium tests.
Labels: -Pri-2 -Milestone-Dogfood Milestone-Launch Pri-3
As part of issue 670329, we have already sacrificed "pureness of data" by processing runs where some tests fail fully. We remove these failed tests from the results and process the rest as a group despite not running all tests. This was necessary to get a sufficient amount of data to approximate flakiness well. Detailed reasoning for our choice is outlined here: https://docs.google.com/document/d/1g587hOo4KySenWGNzZnItINI7zFP8hsNDn_CyhQ6X5Q/edit#heading=h.czasg68x2l6e.
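
As a rough illustration of the filtering described above (not the actual change made for issue 670329), fully failed tests could be dropped from a run before the rest is processed as a group. The data model and the name drop_fully_failed are assumptions for this sketch.

```python
from typing import Dict, Optional


def drop_fully_failed(run: Dict[str, Optional[int]]) -> Dict[str, int]:
    """Remove tests that never passed (None) and keep the rest of the run.

    Hypothetical data model: test name -> tries needed to pass, or None if
    the test failed on every try.
    """
    return {name: tries for name, tries in run.items() if tries is not None}


run = {"a": 1, "b": None, "c": 4}
print(drop_fully_failed(run))
```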

Considering the above, I am not sure whether it makes sense to spend time addressing this bug. I'll keep it open, but move it to a later milestone and probably only fix this if someone actually reports an issue related to analyze.
Cc: seanmccullough@chromium.org
CC Sean, who will be taking over the flakiness effort as I'm transitioning to another team.
Project Member

Comment 8 by sheriffbot@chromium.org, Aug 13

Labels: Hotlist-Recharge-Cold
Status: Untriaged (was: Available)
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue.

Sorry for the inconvenience if the bug really should have been left as Available.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Status: WontFix (was: Untriaged)
