
Issue 762304

Starred by 1 user

Issue metadata

Status: Fixed
Merged: issue 760567
Owner:
Closed: Sep 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression




1.3% regression in system_health.memory_desktop at 498176:498270

Project Member Reported by briander...@chromium.org, Sep 5 2017

Issue description

See the link to graphs below.
 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=762304

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=e4b6872173d743ef1b48d87140a40af15cef66d1d332b4438ecf4a221af2687d


Bot(s) for this bug's original alert(s):

chromium-rel-win10
Cc: bmcquade@chromium.org
Owner: bmcquade@chromium.org
Status: Assigned (was: Untriaged)

=== Auto-CCing suspected CL author bmcquade@chromium.org ===

Hi bmcquade@chromium.org, the bisect results pointed to your CL; please take a look at the results.


=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : Bryan McQuade
  Commit : 1d51d85bfb91069759f21f8b059d9a6810d838b7
  Date   : Tue Aug 29 22:36:00 2017
  Subject: Unconditionally track and send page load metrics from the render process.

Bisect Details
  Configuration: winx64_10_perf_bisect
  Benchmark    : system_health.memory_desktop
  Metric       : memory:chrome:all_processes:reported_by_chrome:malloc:effective_size_avg/load_news/load_news_cnn
  Change       : 0.26% | 26429752.3571 -> 26497729.2143

Revision             Result                   N
chromium@498175      26429752 +- 252225       14      good
chromium@498222      26421512 +- 195151       14      good
chromium@498247      26454682 +- 394491       21      good
chromium@498259      26414570 +- 172688       14      good
chromium@498260      26426501 +- 29399.2      6       good
chromium@498261      26461978 +- 107657       14      bad       <--
chromium@498262      26503531 +- 181601       9       bad
chromium@498265      26485064 +- 142308       9       bad
chromium@498270      26497729 +- 153982       14      bad

Please refer to the following doc on diagnosing memory regressions:
  https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=load.news.cnn system_health.memory_desktop

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8969252291124802208


For feedback, file a bug with component Speed>Bisection
Issue 762305 has been merged into this issue.
Status: WontFix (was: Assigned)
Looking at the graph and the results, this seems like an incorrect diagnosis to me.

The change flagged as causing a regression is 498261. Here are the results for three revisions: the suspected change, one before it, and one after:

chromium@498247      26454682 +- 394491
chromium@498261      26461978 +- 107657
chromium@498262      26503531 +- 181601

We can see that the increase after this change is far larger than the increase leading up to it:
26503531-26461978=41553
26461978-26454682=7296

We see an increase of just 7296 from the preceding change to the suspected change, but a much larger subsequent increase of 41553 from the suspected change to the one after it.

Additionally, the +- windows are much larger than the deltas. This seems like an invalid bisect result. Closing as invalid.
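
As a concrete illustration of the reasoning above, here is a minimal Python sketch that recomputes the step sizes from the bisect table and compares them against the reported +- values, treating those values as rough noise windows (that interpretation is an assumption, since the bot output does not define them):

# Recompute the deltas around the flagged revision and compare them with
# the reported +- values, treated here as rough noise windows.
results = [
    # (revision, mean, plus_minus) -- copied from the bisect output above
    (498247, 26454682, 394491),   # good
    (498261, 26461978, 107657),   # flagged as the first bad revision
    (498262, 26503531, 181601),   # bad
]

for (rev_a, mean_a, pm_a), (rev_b, mean_b, pm_b) in zip(results, results[1:]):
    delta = mean_b - mean_a
    window = pm_a + pm_b
    print(f"{rev_a} -> {rev_b}: delta = {delta:+d}, combined +- window = {window}")

# Prints:
#   498247 -> 498261: delta = +7296, combined +- window = 502148
#   498261 -> 498262: delta = +41553, combined +- window = 289258
# Both deltas sit well inside the noise windows, and the jump into the
# flagged revision is the smaller of the two.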
Cc: perezju@chromium.org
Owner: ----
Status: (was: WontFix)
Please unassign yourself with a comment if you think the bisect is invalid. Don't WontFix, since we still need to dig into finding a culprit. See https://chromium.googlesource.com/chromium/src/+/master/docs/speed/addressing_performance_regressions.md#If-you-don_t-believe-your-CL-could-be-the-cause (linked by the bisect bot as http://g.co/ChromePerformanceRegressions)

This looks like multiple regressions; cc-ing test owner perezju about how deeply we should investigate, and kicking off bisects on the individual jumps.
Annie, any idea what happened to:

https://chromeperf.appspot.com/group_report?bug_id=762304 ?

It's now empty.

Clicking on "Original alerts at time of bug-filing" shows alerts have been moved to other bugs.
There was a dup bug filed for this: https://bugs.chromium.org/p/chromium/issues/detail?id=762305 which the alerts may have been migrated to: https://chromeperf.appspot.com/group_report?bug_id=762305
That's strange. Alerts should have moved in the other direction, right? From the one closed as duplicate (issue 762305) to the one merged into (this one).
Not sure; I kept the first-filed issue as the canonical one and marked the subsequently filed duplicate as a dupe.

If the perf system moved graphs from one issue to another, perhaps it should have marked the issue it was moving data from as the dupe, to keep everything in sync.
Ah, got it, sorry. You manually marked that one as a dupe of this one; that makes sense, because then the alerts are not updated.

I thought it was an auto-dupe by the perf bisect system. Now it all makes sense.

Still unclear why we got two bugs for the same alert; but anyway.

I manually moved the alert back to here just to keep things sane:
https://chromeperf.appspot.com/group_report?bug_id=762304

Let's see if the bisects in #7 and #8 come back with something meaningful.
Mergedinto: 760567
Status: Duplicate

=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : Dave Tapuska
  Commit : f238a1a8f2f73d226bdb38394a6e085a59cf0e41
  Date   : Mon Aug 28 21:24:39 2017
  Subject: Move Input Injection from Chrome IPC to mojo.

Bisect Details
  Configuration: winx64_10_perf_bisect
  Benchmark    : system_health.memory_desktop
  Metric       : memory:chrome:all_processes:reported_by_chrome:malloc:effective_size_avg/load_news/load_news_cnn
  Change       : 0.59% | 26294536.8333 -> 26448583.0

Revision             Result                   N
chromium@497832      26294537 +- 181343       6      good
chromium@497879      26222544 +- 217334       6      good
chromium@497880      26446177 +- 225695       6      bad       <--
chromium@497881      26475429 +- 273970       6      bad
chromium@497882      26430082 +- 52062.3      6      bad
chromium@497885      26459585 +- 114780       6      bad
chromium@497891      26413452 +- 81601.0      6      bad
chromium@497902      26449250 +- 182922       6      bad
chromium@497925      26448583 +- 78807.1      6      bad

Please refer to the following doc on diagnosing memory regressions:
  https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=load.news.cnn system_health.memory_desktop

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8969193654497079744


For feedback, file a bug with component Speed>Bisection
Cc: altimin@chromium.org
Owner: altimin@chromium.org
Status: Assigned (was: Duplicate)

=== Auto-CCing suspected CL author altimin@chromium.org ===

Hi altimin@chromium.org, the bisect results pointed to your CL; please take a look at the results.


=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : Alexander Timin
  Commit : e6bd0b61905fb2661959fdd32174d780c8b325b8
  Date   : Wed Aug 30 13:12:17 2017
  Subject: [scheduler] Rename per-frame task queues

Bisect Details
  Configuration: winx64_10_perf_bisect
  Benchmark    : system_health.memory_desktop
  Metric       : memory:chrome:all_processes:reported_by_chrome:malloc:effective_size_avg/load_news/load_news_cnn
  Change       : 0.51% | 26498693.1429 -> 26633359.0714

Revision             Result                   N
chromium@498437      26498693 +- 339656       14      good
chromium@498439      26547745 +- 486664       14      good
chromium@498440      26661503 +- 325769       14      bad       <--
chromium@498442      26711832 +- 247564       6       bad
chromium@498446      26649118 +- 59327.7      6       bad
chromium@498455      26661513 +- 95271.8      6       bad
chromium@498473      26656651 +- 112851       6       bad
chromium@498508      26633359 +- 418912       14      bad

Please refer to the following doc on diagnosing memory regressions:
  https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=load.news.cnn system_health.memory_desktop

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8969193646654967568


For feedback, file a bug with component Speed>Bisection
Owner: perezju@chromium.org
Assigning to myself to figure out what's going on. Bisect seems to be blaming random CLs.
Status: Fixed (was: Assigned)
Hey, Annie was right. There were in fact two distinct regressions near the alert position.

There is nothing in the first bisect (#3), just noise, so +bmcquade is off the hook.

The bisect in #14 does show a ~160KiB regression in the browser process, quite visible here:
https://chromeperf.appspot.com/report?sid=a9f091bd3b8185f40016d20be556da5169aa167d53905898278c829d982bbacb&start_rev=492256&end_rev=500184

It confirms r497880 (dtapuska) as the culprit, but that has already been addressed in issue 760567.

Then the bisect in #15 found a second ~150KiB regression in the renderers, also visible here:
https://chromeperf.appspot.com/report?sid=cb39b58c228e41931186ed97cdf25666d2c8f105665a00bd38a49211e6ced0bc&start_rev=494704&end_rev=499416

It confirms r498440 (altimin) as the culprit, but as can be seen on the graph above, it has already recovered.

So there is nothing else left to address here! Thanks all for the patience, and sorry for the confusion.
Attachment: malloc.png (50.2 KB)
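
For reference, the sizes quoted in the comment above can be roughly cross-checked against the two bisect reports; the short sketch below converts their good-to-bad means into KiB. These are the all-process malloc numbers, while the browser/renderer breakdown comes from the linked graphs, so only rough agreement is expected.

# Convert the good -> bad means from the two bisect reports above into KiB.
# These are all-process malloc effective_size means; the per-process split
# referred to in the comment comes from the linked graphs.
bisects = {
    "r497880 bisect": (26294536.8333, 26448583.0),
    "r498440 bisect": (26498693.1429, 26633359.0714),
}

for label, (good_mean, bad_mean) in bisects.items():
    print(f"{label}: +{(bad_mean - good_mean) / 1024:.0f} KiB")

# Prints roughly +150 KiB and +132 KiB, in the same ballpark as the
# ~160 KiB and ~150 KiB regressions described above.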
Status: Assigned (was: Fixed)
Thanks for the analysis!

I'm re-opening this temporarily just so my suggestions below don't get lost. Please feel free to open new bugs for them and then close this out if they are better tracked elsewhere.

Suggestion: could we update the bisect bot's logic to prevent it from opening a bug like this one in the future, when it produces results like these:

chromium@498175      26429752 +- 252225       14      good
chromium@498222      26421512 +- 195151       14      good
chromium@498247      26454682 +- 394491       21      good
chromium@498259      26414570 +- 172688       14      good
chromium@498260      26426501 +- 29399.2      6       good
chromium@498261      26461978 +- 107657       14      bad       <--
chromium@498262      26503531 +- 181601       9       bad
chromium@498265      26485064 +- 142308       9       bad
chromium@498270      26497729 +- 153982       14      bad

I'm not sure how the bot decided that my change was the culprit; that seems really bogus when you scan the results and notice that some of the deltas before my change are much smaller than the deltas after it (see my original analysis above).

I think it's important that the bisect bot is as accurate as possible, so that engineers take it seriously. Otherwise we end up in a "cry wolf" situation where engineers just ignore new bugs. The last few times I've had a bug from a bisect bot assigned to me, the results were wrong, and I'm starting to assume by default that bisect bot results are not useful, which seems undesirable because it makes real regressions more likely to be overlooked.

A couple ideas:

* your team should track a success metric for the % of these bugs that incorrectly diagnose the root cause, and have an OKR to drive that percentage down to a reasonable level over time, both to prevent "cry wolf" issues and to minimize wasted engineer time.

* update the bisect bot's bad-change detection logic to scan results shortly before and shortly after the initially selected "bad" revision, to make sure the jump at the chosen revision is consistently larger than the deltas around it. Use statistics to show that the probability of the flagged change being truly bad clears some significance threshold. If the result is not statistically significant, maybe still open the bug but assign it to the team that owns the bot so it can be analyzed further (see the sketch below).
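
Here is a minimal sketch of the kind of significance check suggested in the last bullet, applied to the numbers from the first bisect table. It assumes the +- values are standard deviations (the bot output does not say what they are) and uses Welch's t-test from scipy as one possible test; it is not how the bisect bot is actually implemented.

# Sketch of a significance check on a suspected culprit. Assumes the
# bisect's "+-" column is a standard deviation (an assumption; the bot
# output does not define it). Not how the real bisect bot works.
from scipy.stats import ttest_ind_from_stats

# Last good and first bad revisions from the first bisect table:
# (mean, std, n)
good = (26426501, 29399.2, 6)    # chromium@498260
bad = (26461978, 107657, 14)     # chromium@498261, the flagged culprit

_, p_value = ttest_ind_from_stats(
    mean1=good[0], std1=good[1], nobs1=good[2],
    mean2=bad[0], std2=bad[1], nobs2=bad[2],
    equal_var=False,  # Welch's t-test
)

if p_value < 0.05:
    print(f"jump looks significant (p = {p_value:.3f}); assign to the CL author")
else:
    print(f"jump is not significant (p = {p_value:.3f}); route to the bot owners")

With these particular numbers the difference does not come out significant at the 5% level, which lines up with the earlier conclusion in the thread that this bisect result was noise.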
Owner: sullivan@chromium.org
Assigning to Annie in case she wants to add anything else.

The TL;DR is that yes, we're aware of these issues (metrics are tracked and OKRs are set!), and her team is working hard on a replacement for the entire bisect system, squarely targeted at addressing concerns like the ones you raise.
Cc: dtu@chromium.org
Owner: simonhatch@chromium.org
Status: Fixed (was: Assigned)
Adding Simon and Dave, who are working on Pinpoint, the new version of bisect: https://docs.google.com/document/d/1FKPRNU2kbPJ15p6XHO0itCjYtfvCpGt2IHblriTX1tg/edit

This was a bit of a corner case in that there were two regressions very close together. Pinpoint will handle this much better, but we'll still have to select the range correctly, and that turns out to be harder than it sounds. We do track metrics and send them in a weekly mail to chrome-speed-operations@, and have goals around improving accuracy. We'll also be tweaking the classification algorithm post-launch.

Marking fixed, but Simon, feel free to reopen if need be.
