
Issue 683184

Starred by 2 users

Issue metadata

Status: Untriaged
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug

Blocking:
issue 681662
issue 687733
issue 695653
issue 699771



Hotlists containing this issue:
speed-bisect



Bisect - Doesn't do well on regressions where values are mostly the same

Project Member Reported by simonhatch@chromium.org, Jan 20 2017

Issue description

From crbug.com/681662:

You can see that all the values are nearly identical, save for one massive outlier in sampleA (which is the regression).

{
  "result": {
    "U": 210, 
    "p": 0.34090381592813035, 
    "significance": "FAIL_TO_REJECT"
  }, 
  "sampleA": [
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    21259320, 
    2294840
  ], 
  "sampleB": [
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840, 
    2294840
  ]
}
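To make the numbers above easier to check, here is a minimal sketch of a two-sided Mann-Whitney U test in plain Python. It assumes the normal approximation with a tie correction and a continuity correction; the thread does not say which variant the bisect tool uses, so that choice and the function name `mann_whitney_u` are assumptions made for illustration.

```python
import math
from collections import Counter


def mann_whitney_u(a, b):
    """Return (U, p) for a two-sided Mann-Whitney U test.

    Assumption: normal approximation with tie and continuity corrections.
    """
    n1, n2 = len(a), len(b)
    n = n1 + n2
    counts = Counter(a + b)

    # Assign each distinct value the average rank of its tie group.
    ranks = {}
    seen = 0
    for value in sorted(counts):
        t = counts[value]
        ranks[value] = seen + (t + 1) / 2.0
        seen += t

    u1 = sum(ranks[v] for v in a) - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)

    # Variance of U, reduced to account for tied values.
    tie_term = sum(t ** 3 - t for t in counts.values()) / (n * (n - 1))
    sigma = math.sqrt(n1 * n2 / 12.0 * ((n + 1) - tie_term))
    if sigma == 0:
        return u, 1.0  # every value is identical

    z = max(abs(u - n1 * n2 / 2.0) - 0.5, 0.0) / sigma
    p = math.erfc(z / math.sqrt(2))  # two-sided tail probability
    return u, p


# The samples from the issue description.
sample_a = [2294840] * 19 + [21259320] + [2294840]
sample_b = [2294840] * 21

u, p = mann_whitney_u(sample_a, sample_b)
print(u, p)
```

Under these assumptions the sketch gives U = 210 and p of about 0.34, in line with the reported result: a single outlier among 21 heavily tied values shifts the rank sum very little, so the test fails to reject.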
 

Comment 1 by dtu@chromium.org, Jan 23 2017

I don't think we can say that one outlier makes them different. What if we ran one more iteration in sampleB and it turned out the same as the outlier? Then the two samples would effectively be the same.

I think MWU (the Mann-Whitney U test) will give us the answer we want if more than ~30% of the values are outliers.
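The ~30% estimate can be explored with a small sweep: put k outliers into one 21-value sample, keep the other sample constant, and find the smallest k for which the test rejects. This is only a sketch. The significance thresholds (0.05 and 0.01) are assumptions, since the thread does not state the bisect's threshold, and the helper names are hypothetical. It uses the same normal approximation with tie and continuity corrections as a common textbook MWU variant.

```python
import math
from collections import Counter


def mwu_p_value(a, b):
    """Two-sided MWU p-value (normal approximation, tie + continuity corrections)."""
    n1, n2 = len(a), len(b)
    n = n1 + n2
    counts = Counter(a + b)
    ranks, seen = {}, 0
    for value in sorted(counts):
        t = counts[value]
        ranks[value] = seen + (t + 1) / 2.0  # average rank of the tie group
        seen += t
    u1 = sum(ranks[v] for v in a) - n1 * (n1 + 1) / 2.0
    tie_term = sum(t ** 3 - t for t in counts.values()) / (n * (n - 1))
    sigma = math.sqrt(n1 * n2 / 12.0 * ((n + 1) - tie_term))
    if sigma == 0:
        return 1.0
    z = max(abs(u1 - n1 * n2 / 2.0) - 0.5, 0.0) / sigma
    return math.erfc(z / math.sqrt(2))


def min_outliers_to_reject(n, alpha, base=2294840, outlier=21259320):
    """Smallest number of outliers in sample A (of size n) giving p < alpha."""
    sample_b = [base] * n
    for k in range(1, n + 1):
        sample_a = [outlier] * k + [base] * (n - k)
        if mwu_p_value(sample_a, sample_b) < alpha:
            return k
    return None


for alpha in (0.05, 0.01):  # assumed thresholds, not taken from the bisect tool
    k = min_outliers_to_reject(21, alpha)
    print(alpha, k, round(100.0 * k / 21))
```

With 21 iterations per sample, this sketch needs 4 outliers (about 19%) to reject at 0.05 and 6 outliers (about 29%) at 0.01, so the ~30% figure is plausible for a strict threshold.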

Comment 2 by dtu@chromium.org, Jan 23 2017

I think we can also become confident about the result if we run lots of iterations. Letting the user continue to rerun the test would help.
Yeah, true. I think these outlier-style regressions won't really be handled well until we're using Pinpoint and can guide or change the test.
Blocking: 687733
Components: Speed>Bisection
Blocking: 695653
Cc: hubbe@chromium.org crouleau@chromium.org simonhatch@chromium.org
Two ideas to fix this:

1. What about a bisect that uses the standard deviation of the results rather than the mean of the results?

2. What if we let the user of the bisect tool decide how confident the bisect needs to be? I would normally want my bisect to continue even when it is not very confident, and then I would want to look at the data myself at the end. The default value could stay the same, but for users who know what they are doing, being able to adjust the p-value threshold would fix this problem.
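As a rough illustration of the first idea, the sketch below compares the sample standard deviations of the two samples from the issue description. It is not how the bisect tool works; it only shows that the spread separates these two samples clearly where the rank-based test does not. The variable names are made up for the example.

```python
import statistics

# The samples from the issue description.
sample_a = [2294840] * 19 + [21259320] + [2294840]
sample_b = [2294840] * 21

stdev_a = statistics.stdev(sample_a)  # dominated by the single outlier
stdev_b = statistics.stdev(sample_b)  # zero, every value is identical

print(stdev_a, stdev_b)
```

Comparing spread instead of central tendency would flag this pair, but it would also react to any single noisy run, so it would still need some significance criterion of its own.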
Blocking: 699771
