Issue 605842

Issue metadata

Status: Duplicate
Merged: issue 605566
Owner:
Closed: May 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression



2.3% regression in smoothness.tough_filters_cases at 388340:388365

Reported by toyoshim@chromium.org, Apr 22 2016

Issue description

This might be noise, but let me run a bisect for 388292:388365 just in case.
Note: the alert is for 388340:388365, but the range could be wrong because of a large noise spike just before the alert.
 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=605842

Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?keys=agxzfmNocm9tZXBlcmZyFAsSB0Fub21hbHkYgICgxNK-uwoM


Bot(s) for this bug's original alert(s):

chromium-rel-mac-retina

Comment 2 by 42576172...@developer.gserviceaccount.com, Apr 22 2016

Cc: vmp...@chromium.org
Owner: vmp...@chromium.org

=== Auto-CCing suspected CL author vmpstr@chromium.org ===

Hi vmpstr@chromium.org, the bisect results pointed to your CL below as possibly
causing a regression. Please have a look at this info and see whether
your CL may be related.


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : base: Remove a copy of and copy ctor from SequenceAndSortKey.
Author  : vmpstr
Commit description:
  
This patch removes a copy of SequenceAndSortKey. As a result, the copy
ctor is no longer needed (there won't be an implicit one generated).

R=danakj, gab

Review URL: https://codereview.chromium.org/1901223003

Cr-Commit-Position: refs/heads/master@{#388340}
Commit  : fc05fd34560e44a262d55be0ff84771eaa76c4f5
Date    : Tue Apr 19 22:53:07 2016


===== TESTED REVISIONS =====
Revision                Mean Value  Std. Dev.   Num Values  Good?
chromium@388339         20.990258   0.272402    5           good
chromium@388340         22.316125   0.268895    5           bad         <-
chromium@388341         22.107053   0.28179     5           bad
chromium@388343         22.009273   0.205363    5           bad
chromium@388346         22.047633   0.258895    5           bad
chromium@388352         22.050085   0.144337    5           bad
chromium@388365         21.967943   0.216991    5           bad

Bisect job ran on: mac_retina_perf_bisect
Bug ID: 605842

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests smoothness.tough_filters_cases
Test Metric: frame_times/http___letmespellitoutforyou.com_samples_svg_filter_terrain.svg
Relative Change: 4.66%
Score: 99.9

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/mac_retina_perf_bisect/builds/1257
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9014704831760595824


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=605842

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!

Comment 3 by vmp...@chromium.org, Apr 22 2016

Hmm, that patch doesn't change much in terms of behavior...

Instead of doing a copy, we capture a const ref, then we do a copy of the scoped_refptr (which would've happened before when we did the full copy of the object).

We can std::move the scoped_refptr and save a ref, but that's about it... I'll put the patch up for that. However, I'm very skeptical that this is the patch that regressed (or that this is a real regression).

Comment 4 by fdoray@chromium.org, Apr 22 2016

The CL identified by the bisect is certainly not responsible for the regression. base/task_scheduler isn't used anywhere yet.

Comment 5 by vmp...@chromium.org, Apr 22 2016

Owner: toyoshim@chromium.org
OK. It seems to be a pretty high-confidence bisect, but can we run it again?
I kicked off another bisect for the same revision range.
If it points to a different suspect revision, this regression could be noise. Otherwise, can we revert the CL to see if the revert changes the graph?

Comment 7 by 42576172...@developer.gserviceaccount.com, Apr 25 2016

Cc: brettw@chromium.org
Owner: brettw@chromium.org

=== Auto-CCing suspected CL author brettw@chromium.org ===

Hi brettw@chromium.org, the bisect results pointed to your CL below as possibly
causing a regression. Please have a look at this info and see whether
your CL may be related.


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : Add documentation for exec_script and gypi_to_gn
Author  : brettw
Commit description:
  
There have been several questions about this recently.

Review URL: https://codereview.chromium.org/1905433002

Cr-Commit-Position: refs/heads/master@{#388341}
Commit  : d385ecffc8fbe77cdb5e8b839896291a53b1c988
Date    : Tue Apr 19 22:55:02 2016


===== TESTED REVISIONS =====
Revision                Mean Value  Std. Dev.   Num Values  Good?
chromium@388340         20.310184   0.094215    5           good
chromium@388341         21.093991   0.553035    5           bad         <-
chromium@388342         20.995851   0.435649    5           bad
chromium@388344         20.918366   0.285318    5           bad
chromium@388347         21.033695   0.248704    5           bad
chromium@388353         21.030824   0.524076    5           bad
chromium@388365         21.118116   0.180184    5           bad

Bisect job ran on: mac_retina_perf_bisect
Bug ID: 605842

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests smoothness.tough_filters_cases
Test Metric: frame_times/http___letmespellitoutforyou.com_samples_svg_filter_terrain.svg
Relative Change: 3.98%
Score: 95.0

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/mac_retina_perf_bisect/builds/1260
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9014431996108730736


Owner: toyoshim@chromium.org
Can I try another bisect with a wide range?
Cc: senorblanco@chromium.org
cc test owner

Comment 10 by 42576172...@developer.gserviceaccount.com, Apr 25 2016

Cc: erikc...@chromium.org
Owner: erikc...@chromium.org

=== Auto-CCing suspected CL author erikchen@chromium.org ===

Hi erikchen@chromium.org, the bisect results pointed to your CL below as possibly
causing a regression. Please have a look at this info and see whether
your CL may be related.


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : nacl: Remove unnecessary reference to BrokerAddTargetPeer.
Author  : erikchen
Commit description:
  
BUG= 493414 

Review URL: https://codereview.chromium.org/1882353002

Cr-Commit-Position: refs/heads/master@{#388256}
Commit  : eab4594d16f3c58b6ff44841ffa7370804657470
Date    : Tue Apr 19 18:44:47 2016


===== TESTED REVISIONS =====
Revision                Mean Value  Std. Dev.   Num Values  Good?
chromium@388244         20.264687   0.270774    5           good
chromium@388250         20.174612   0.170989    5           good
chromium@388253         20.15411    0.206507    5           good
chromium@388255         20.241238   0.256638    5           good
chromium@388256         21.455647   0.140408    5           bad         <-
chromium@388271         21.484481   0.101624    5           bad
chromium@388305         21.610749   0.992362    5           bad
chromium@388365         21.82358    0.215758    5           bad

Bisect job ran on: mac_retina_perf_bisect
Bug ID: 605842

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests smoothness.tough_filters_cases
Test Metric: frame_times/http___letmespellitoutforyou.com_samples_svg_filter_terrain.svg
Relative Change: 7.69%
Score: 99.9

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/mac_retina_perf_bisect/builds/1262
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9014421199705869040


Cc: robertphillips@chromium.org
I'm suspecting https://chromium.googlesource.com/skia.git/+/718a5adc6da857f08578cae434bcf81ea3f5aa3d

The regression is quite visible on the graphs. We may be able to live with it, since it's pretty small, but I'm a little surprised that it's there. Robert, could you take a look?
Owner: robertphillips@chromium.org
Cc: nyerramilli@chromium.org
Labels: TE-Triaged
robertphillips@, gentle ping.

Comment 14 by dtu@chromium.org, May 6 2016

The bisects are really clear and I'm not sure why they're pointing to the wrong culprits.
Here are graphs showing the bisect results.
Attachments: comment_2.png (85.6 KB), comment_7.png (89.6 KB), comment_10.png (95.9 KB)
Cc: nednguyen@chromium.org sullivan@chromium.org
sullivan: Could someone from the Speed-Infra team take a look at this as a case-study for why/how bisect is giving bad results? 

Each time bisect points at a culprit, it claims to be very confident, but each time bisect is rerun, it points at a totally new culprit.

Standard deviation is not particularly meaningful on small sample sizes (5), but looking at the maximum reported standard deviation (0.99) does give us an idea of the "real" deviation between different test runs. 

If the standard deviation is actually 0.99, then we don't expect to be able to bisect regressions less than 3 * stddev, or about 15%. This bug is reporting a 2.3% regression, so we would expect bisect to always give random results. There may be a real regression, but 2% is far too small to reasonably be able to bisect.
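
For illustration, the noise-floor arithmetic in the comment above can be sketched like this. It is only a rough sketch: the baseline mean of about 20.2 is an approximate value read off the "good" rows of the bisect tables, the function name is made up, and the 3 * stddev rule of thumb is the commenter's, not something the bisect tool is confirmed to use.

```python
# Rough noise-floor estimate using the 3 * stddev rule of thumb from the comment.
# Assumption: baseline_mean is an approximate "good" frame time from the bisect tables.

def min_detectable_change_pct(stddev: float, baseline_mean: float, sigmas: float = 3.0) -> float:
    """Smallest relative change (in percent) that stands clear of run-to-run noise."""
    return 100.0 * sigmas * stddev / baseline_mean

max_stddev = 0.99              # largest std. dev. reported in the bisects (chromium@388305)
baseline_mean = 20.2           # approximate "good" mean frame time (assumption)
reported_regression_pct = 2.3  # size of the regression in the bug title

threshold_pct = min_detectable_change_pct(max_stddev, baseline_mean)
print(f"changes below ~{threshold_pct:.1f}% are within the noise")
print("bisectable" if reported_regression_pct >= threshold_pct else "below the noise floor")
```

With these inputs the threshold comes out a little under 15%, which matches the "about 15%" estimate and is well above the reported 2.3%.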
Owner: sullivan@chromium.org
Cc: dtu@chromium.org
Mergedinto: 605566
Status: Duplicate (was: Assigned)

===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : Switch SkColorFilterImageFilter over to new onFilterImage interface (again)
Author  : robertphillips
Commit description:
  
Back when this was originally reverted I was able to easily repro the perf regression locally. At ToT Skia/Chrome I can no longer repro the perf regression with this CL (in fact there is a modest perf improvement).

I propose landing this and then watching the Chromium perf bots.

BUG= 602300 ,598028
TBR=reed@google.com

GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1901513002

Review URL: https://codereview.chromium.org/1901513002
Commit  : 718a5adc6da857f08578cae434bcf81ea3f5aa3d
Date    : Tue Apr 19 17:21:03 2016


===== TESTED REVISIONS =====
Revision                         Mean     Std Dev    N  Good?
chromium@388291                  20.1802  0.0470737  5  good
chromium@388328                  20.2975  0.0652044  5  good
chromium@388347                  20.3265  0.129832   5  good
chromium@388356                  20.1722  0.0533827  5  good
chromium@388361                  20.3267  0.108784   5  good
chromium@388363                  20.2936  0.0981873  5  good
chromium@388364                  20.2727  0.0567071  5  good
chromium@388364,skia@7831295c63  20.2987  0.128545   5  good
chromium@388364,skia@718a5adc6d  20.7884  0.193712   5  bad    <--
chromium@388364,skia@e05bbbba79  20.699   0.0799315  5  bad
chromium@388364,skia@05db63b5fc  20.6872  0.0487923  5  bad
chromium@388365                  20.7591  0.168949   5  bad

Bisect job ran on: mac_retina_perf_bisect
Bug ID: 605842

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests smoothness.tough_filters_cases
Test Metric: frame_times/http___letmespellitoutforyou.com_samples_svg_filter_terrain.svg
Relative Change: 2.87%
Score: 99.8

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/mac_retina_perf_bisect/builds/1289
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9013383781254651840


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5826102107308032


Comment 19 by dtu@chromium.org, May 6 2016

Someone clicked the "bad_bisect" links on these bisects, so I've studied these bisects by myself, and we've also taken a look as a group, and are still not sure why these bisects are giving the wrong results.


What we know:

Yes, the results are almost definitely wrong. Your change was entirely in an `#if defined(OS_WIN)`, and the bisect reported it as a culprit on Mac Retina. And it is a regression in frame_times, not a micro-metric that could be affected by binary layout or size.

The logs indicate that there weren't any technical errors, like running the wrong Chrome build or the wrong test.


Further research:

What could be wrong with the bisect? The best theories we have are:

1. Something else is (unintentionally) running on the machine, starting and stopping and screwing up results. We are planning on upgrading the OS on these bots (currently OS X 10.9.5), so maybe a fresh image will clear out whatever there is there.

2. The test is not independent. Subsequent runs are affected by previous runs. The first two bisects show this kind of pattern -- the first run was different from the rest. The last bisect doesn't really -- if you dig into the order in which the tests ran, the first run was good, the next 4 runs were bad, and the last 3 runs were good again. But this kind of pattern is consistent with something stopping and starting on the machine. I kicked off a fourth bisect to get more information.


How the statistics work:

The mean and std dev are given as a convenience to the user, because people seem to like those kinds of numbers. Internally, the bisect algorithm does not use mean or std dev, because those rely on an implicit assumption that the data is normally distributed, which is usually not true for computing performance.

Instead, the bisect measures the overlap between the two samples. If you draw two samples of size 5 from the same population, the probability that they don't overlap at all is ~1.2%. In each one of these bisects, there was no overlap between the "good" and "bad" groups, so we said with high confidence that they came from different populations, implying a regression. There was overlap between all the samples within each of the "good" and "bad" groups.

(This explanation is a simplification. For more details, see: https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test )
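
As a rough check on the overlap argument above, here is a minimal sketch of the no-overlap probability for two samples of size 5. It is not the bisect tool's code, and the function names are made up for illustration. It computes the probability two ways: by exact counting of rank arrangements, and with the normal approximation to the Mann-Whitney U statistic (with continuity correction). That the bisect used the approximation is an assumption.

```python
import math

def exact_no_overlap_probability(n1: int, n2: int) -> float:
    """P(two samples from one continuous population do not overlap at all).

    Under the null hypothesis every split of the n1 + n2 ranks is equally
    likely, and only 2 of the C(n1 + n2, n1) splits put one sample entirely
    below the other.
    """
    return 2 / math.comb(n1 + n2, n1)

def normal_approx_p_value(u: float, n1: int, n2: int) -> float:
    """Two-sided Mann-Whitney p-value, normal approximation with continuity correction."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (abs(u - mean_u) - 0.5) / sd_u
    return math.erfc(z / math.sqrt(2))  # equals 2 * (1 - Phi(z))

exact = exact_no_overlap_probability(5, 5)
approx = normal_approx_p_value(0, 5, 5)  # U = 0 means the samples do not overlap
print(f"exact: {exact:.4f}, normal approximation: {approx:.4f}")
```

The exact count gives about 0.8% and the approximation about 1.2%. Either way, a no-overlap result with 5 runs per side is unlikely to come from identical populations, but it says nothing about whether the populations differ because of the CL or because of something else changing on the machine.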
Owner: dtu@chromium.org
Status: Assigned (was: Duplicate)
Assigning to Dave as he has been investigating this.

Please click the "Not what you expected?" link to report bad bisects; we do regular triage. I do also read through all the perf bugs and flag suspect ones myself, which is why the team started investigating this one. Sorry the comments on it were not clear earlier; we usually file a separate bug, but mostly investigated in offline discussions this time around.
For "Something else is (unintentionally) running on the machine", can we add some logging to figure out which other processes are running at the same time the test is running? 

I think we do also have traces for this test since it's a smoothness test and bisect now does --upload-results. Not sure if those would help us understand anything strange going on here.
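
One way the process logging suggested above could look, as a hypothetical sketch only: take a `ps` snapshot before and after each test run and log anything using noticeable CPU. The function names, the 5% threshold, and the sample output are all made up for illustration, and nothing here reflects how the perf bots are actually set up. The `ps -A -o pid=,pcpu=,comm=` form uses standard `ps` options; the empty `=` labels suppress the header line.

```python
import subprocess

def parse_ps_output(text: str) -> list[tuple[int, float, str]]:
    """Parse lines of 'PID %CPU COMMAND' into (pid, cpu_percent, command) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 2)  # the command may itself contain spaces
        if len(parts) == 3:
            rows.append((int(parts[0]), float(parts[1]), parts[2]))
    return rows

def busy_processes(rows, min_cpu=5.0):
    """Return processes at or above min_cpu percent CPU, busiest first."""
    return sorted((r for r in rows if r[1] >= min_cpu), key=lambda r: r[1], reverse=True)

def snapshot_processes():
    """Take one live snapshot with ps (not called in this example)."""
    out = subprocess.run(["ps", "-A", "-o", "pid=,pcpu=,comm="],
                         capture_output=True, text=True, check=True).stdout
    return parse_ps_output(out)

# Hypothetical sample output, used here instead of a live snapshot.
sample = """\
  101   0.0 /usr/sbin/syslogd
  202  37.5 /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
  303  12.0 /usr/libexec/mds_stores
"""
for pid, cpu, cmd in busy_processes(parse_ps_output(sample)):
    print(pid, cpu, cmd)
```

Comparing the busy list across "good" and "bad" runs would show whether some background process lines up with the slower frame times.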

Comment 23 by dtu@chromium.org, May 6 2016

Hmm, another possibility: maybe the builds have some variance because they came from different builders? These bisects have a mixture of perfbot and trybot builds.
Friendly Sheriff ping!
Dave@, is there any update on this?

The original bug got associated with crbug.com/605566, and from the graphs it looks very noisy: https://chromeperf.appspot.com/group_report?bug_id=605566

Comment 25 by dtu@chromium.org, May 19 2016

Status: Duplicate (was: Assigned)
The bisect in comment 18 was correct, and so merging it with issue 605566 is fine.
I opened a bug for the monitoring suggested in comment 21: issue 613063.

Comment 26 by bugdroid1@chromium.org, Jun 29 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/2f793dcc427d8345550acb237aa54c144e68ccce

commit 2f793dcc427d8345550acb237aa54c144e68ccce
Author: catapult-deps-roller <catapult-deps-roller@chromium.org>
Date: Wed Jun 29 18:12:37 2016

Roll src/third_party/catapult/ f49c20888..a45087135 (9 commits).

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/f49c20888bb8..a4508713566b

$ git log f49c20888..a45087135 --date=short --no-merges --format='%ad %ae %s'

BUG= 619045 , 609252 , 605842 

TBR=catapult-sheriff@chromium.org

Review-Url: https://codereview.chromium.org/2111593002
Cr-Commit-Position: refs/heads/master@{#402870}

[modify] https://crrev.com/2f793dcc427d8345550acb237aa54c144e68ccce/DEPS


Comment 27 by bugdroid1@chromium.org, Jul 7 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/14b4d7e08c4ff427b6db64d795c1768f860cbb56

commit 14b4d7e08c4ff427b6db64d795c1768f860cbb56
Author: catapult-deps-roller <catapult-deps-roller@chromium.org>
Date: Thu Jul 07 04:30:07 2016

Roll src/third_party/catapult/ 9f8a0a8c0..71bab5766 (15 commits).

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/9f8a0a8c0d45..71bab5766221

$ git log 9f8a0a8c0..71bab5766 --date=short --no-merges --format='%ad %ae %s'

BUG= 352807 , 622948 , 625064 ,472699, 605842 ,472699

TBR=catapult-sheriff@chromium.org

Review-Url: https://codereview.chromium.org/2125883005
Cr-Commit-Position: refs/heads/master@{#404074}

[modify] https://crrev.com/14b4d7e08c4ff427b6db64d795c1768f860cbb56/DEPS


Comment 28 by bugdroid1@chromium.org, Jul 13 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/6f04d86267fdbb9b0a9b7cda9841f7f7126ef0ef

commit 6f04d86267fdbb9b0a9b7cda9841f7f7126ef0ef
Author: catapult-deps-roller <catapult-deps-roller@chromium.org>
Date: Wed Jul 13 02:19:51 2016

Roll src/third_party/catapult/ 4160831d2..72fb0b506 (24 commits).

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/4160831d2082..72fb0b5062af

$ git log 4160831d2..72fb0b506 --date=short --no-merges --format='%ad %ae %s'

BUG= 605842 ,531641, 605842 ,450171,589726, 627221 , 625852 ,589726, 622290 ,589726

TBR=catapult-sheriff@chromium.org

Review-Url: https://codereview.chromium.org/2147823002
Cr-Commit-Position: refs/heads/master@{#404903}

[modify] https://crrev.com/6f04d86267fdbb9b0a9b7cda9841f7f7126ef0ef/DEPS


Comment 29 by bugdroid1@chromium.org, Jul 15 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3a742dd0484c95bea5218bcd918c622c0565c638

commit 3a742dd0484c95bea5218bcd918c622c0565c638
Author: catapult-deps-roller <catapult-deps-roller@chromium.org>
Date: Fri Jul 15 01:25:28 2016

Roll src/third_party/catapult/ 93f763181..581af8674 (3 commits).

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/93f7631816bc..581af8674af7

$ git log 93f763181..581af8674 --date=short --no-merges --format='%ad %ae %s'

BUG= 605842 

TBR=catapult-sheriff@chromium.org

Review-Url: https://codereview.chromium.org/2154433003
Cr-Commit-Position: refs/heads/master@{#405665}

[modify] https://crrev.com/3a742dd0484c95bea5218bcd918c622c0565c638/DEPS


Comment 30 by bugdroid1@chromium.org, Jul 19 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e0b2c25be9142a0655312888cf0ad8c4390076be

commit e0b2c25be9142a0655312888cf0ad8c4390076be
Author: catapult-deps-roller <catapult-deps-roller@chromium.org>
Date: Tue Jul 19 07:31:55 2016

Roll src/third_party/catapult/ 3c2ec0209..7aeded9c4 (6 commits).

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/3c2ec02098fa..7aeded9c4ce9

$ git log 3c2ec0209..7aeded9c4 --date=short --no-merges --format='%ad %ae %s'

BUG= 605842 

TBR=catapult-sheriff@chromium.org

Review-Url: https://codereview.chromium.org/2157313002
Cr-Commit-Position: refs/heads/master@{#406228}

[modify] https://crrev.com/e0b2c25be9142a0655312888cf0ad8c4390076be/DEPS


Comment 31 by bugdroid1@chromium.org, Aug 11 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3112c1d94578ecdda5c7baf4c5140a434c905867

commit 3112c1d94578ecdda5c7baf4c5140a434c905867
Author: catapult-deps-roller <catapult-deps-roller@chromium.org>
Date: Thu Aug 11 04:55:08 2016

Roll src/third_party/catapult/ 49d354d56..0f569374f (3 commits).

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/49d354d564ab..0f569374f524

$ git log 49d354d56..0f569374f --date=short --no-merges --format='%ad %ae %s'

BUG= 605842 

TBR=catapult-sheriff@chromium.org

Review-Url: https://codereview.chromium.org/2237753002
Cr-Commit-Position: refs/heads/master@{#411268}

[modify] https://crrev.com/3112c1d94578ecdda5c7baf4c5140a434c905867/DEPS
