Issue metadata
Sign in to add a comment
|
1.3% regression in system_health.memory_desktop at 498176:498270 |
||||||||||||||||||||||
Issue descriptionSee the link to graphs below.
,
Sep 5 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8969252291124802208
,
Sep 6 2017
=== Auto-CCing suspected CL author bmcquade@chromium.org === Hi bmcquade@chromium.org, the bisect results pointed to your CL, please take a look at the results. === BISECT JOB RESULTS === Perf regression found with culprit Suspected Commit Author : Bryan McQuade Commit : 1d51d85bfb91069759f21f8b059d9a6810d838b7 Date : Tue Aug 29 22:36:00 2017 Subject: Unconditionally track and send page load metrics from the render process. Bisect Details Configuration: winx64_10_perf_bisect Benchmark : system_health.memory_desktop Metric : memory:chrome:all_processes:reported_by_chrome:malloc:effective_size_avg/load_news/load_news_cnn Change : 0.26% | 26429752.3571 -> 26497729.2143 Revision Result N chromium@498175 26429752 +- 252225 14 good chromium@498222 26421512 +- 195151 14 good chromium@498247 26454682 +- 394491 21 good chromium@498259 26414570 +- 172688 14 good chromium@498260 26426501 +- 29399.2 6 good chromium@498261 26461978 +- 107657 14 bad <-- chromium@498262 26503531 +- 181601 9 bad chromium@498265 26485064 +- 142308 9 bad chromium@498270 26497729 +- 153982 14 bad Please refer to the following doc on diagnosing memory regressions: https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md To Run This Test src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=load.news.cnn system_health.memory_desktop More information on addressing performance regressions: http://g.co/ChromePerformanceRegressions Debug information about this bisect: https://chromeperf.appspot.com/buildbucket_job_status/8969252291124802208 For feedback, file a bug with component Speed>Bisection
,
Sep 6 2017
Issue 762305 has been merged into this issue.
,
Sep 6 2017
Looking at the graph and the results, this seems like an incorrect diagnosis to me. The change flagged as causing a regression is 498261. Here are results from 3 runs: the suspected change, one before that change, and one after: chromium@498247 26454682 +- 394491 chromium@498261 26461978 +- 107657 chromium@498262 26503531 +- 181601 We can see that there's far more of an increase after this change, than from one before: 26503531-26461978=41553 26461978-26454682=7296 We see an increase of just 7296 from a change before to the suspected change, but a subsequent much larger increase of 41553 from the suspected change to a change after. Additionally, the +- windows are much larger than the deltas. This seems like an invalid bisect result. Closing as invalid.
,
Sep 6 2017
Please unassign yourself with a comment if you think the bisect is invalid. Don't WontFix, since we still need to dig into finding a culprit. See https://chromium.googlesource.com/chromium/src/+/master/docs/speed/addressing_performance_regressions.md#If-you-don_t-believe-your-CL-could-be-the-cause (linked by the bisect bot as http://g.co/ChromePerformanceRegressions) This looks like multiple regressions, cc-ing test owner perezju about how deeply we should investigate and kicking off bisects on the individual jumps.
,
Sep 6 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8969193654497079744
,
Sep 6 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8969193646654967568
,
Sep 6 2017
Annie, any idea what happened to: https://chromeperf.appspot.com/group_report?bug_id=762304 ? It's now empty. Clicking on "Original alerts at time of bug-filing" shows alerts have been moved to other bugs.
,
Sep 6 2017
There was a dup bug filed for this: https://bugs.chromium.org/p/chromium/issues/detail?id=762305 which the alerts may have been migrated to: https://chromeperf.appspot.com/group_report?bug_id=762305
,
Sep 6 2017
That's strange. Alerts should have moved in the other direction, right? From the one closed as duplicate ( issue 762305 ) to the one merged into (this one).
,
Sep 6 2017
Not sure - I kept the first filed issue as the canonical and marked the subsequently filed duplicate as a dupe. If the perf system moved graphs from one issue to another, perhaps it should have marked the issue it was moving data from as the dupe, to keep everything in sync.
,
Sep 6 2017
Ah, got it. Sorry. You manually marked that one as dupe of this. That makes sense. Because then alerts are not updated. I thought it was an auto-dupe by the perf bisect system. Now all makes sense. Still unclear why we got two bugs for the same alert; but anyway. I manually moved the alert back to here just to keep things sane: https://chromeperf.appspot.com/group_report?bug_id=762304 Let's see if the bisects in #7 and #8 come back with something meaningful.
,
Sep 6 2017
=== BISECT JOB RESULTS === Perf regression found with culprit Suspected Commit Author : Dave Tapuska Commit : f238a1a8f2f73d226bdb38394a6e085a59cf0e41 Date : Mon Aug 28 21:24:39 2017 Subject: Move Input Injection from Chrome IPC to mojo. Bisect Details Configuration: winx64_10_perf_bisect Benchmark : system_health.memory_desktop Metric : memory:chrome:all_processes:reported_by_chrome:malloc:effective_size_avg/load_news/load_news_cnn Change : 0.59% | 26294536.8333 -> 26448583.0 Revision Result N chromium@497832 26294537 +- 181343 6 good chromium@497879 26222544 +- 217334 6 good chromium@497880 26446177 +- 225695 6 bad <-- chromium@497881 26475429 +- 273970 6 bad chromium@497882 26430082 +- 52062.3 6 bad chromium@497885 26459585 +- 114780 6 bad chromium@497891 26413452 +- 81601.0 6 bad chromium@497902 26449250 +- 182922 6 bad chromium@497925 26448583 +- 78807.1 6 bad Please refer to the following doc on diagnosing memory regressions: https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md To Run This Test src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=load.news.cnn system_health.memory_desktop More information on addressing performance regressions: http://g.co/ChromePerformanceRegressions Debug information about this bisect: https://chromeperf.appspot.com/buildbucket_job_status/8969193654497079744 For feedback, file a bug with component Speed>Bisection
,
Sep 6 2017
=== Auto-CCing suspected CL author altimin@chromium.org === Hi altimin@chromium.org, the bisect results pointed to your CL, please take a look at the results. === BISECT JOB RESULTS === Perf regression found with culprit Suspected Commit Author : Alexander Timin Commit : e6bd0b61905fb2661959fdd32174d780c8b325b8 Date : Wed Aug 30 13:12:17 2017 Subject: [scheduler] Rename per-frame task queues Bisect Details Configuration: winx64_10_perf_bisect Benchmark : system_health.memory_desktop Metric : memory:chrome:all_processes:reported_by_chrome:malloc:effective_size_avg/load_news/load_news_cnn Change : 0.51% | 26498693.1429 -> 26633359.0714 Revision Result N chromium@498437 26498693 +- 339656 14 good chromium@498439 26547745 +- 486664 14 good chromium@498440 26661503 +- 325769 14 bad <-- chromium@498442 26711832 +- 247564 6 bad chromium@498446 26649118 +- 59327.7 6 bad chromium@498455 26661513 +- 95271.8 6 bad chromium@498473 26656651 +- 112851 6 bad chromium@498508 26633359 +- 418912 14 bad Please refer to the following doc on diagnosing memory regressions: https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md To Run This Test src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=load.news.cnn system_health.memory_desktop More information on addressing performance regressions: http://g.co/ChromePerformanceRegressions Debug information about this bisect: https://chromeperf.appspot.com/buildbucket_job_status/8969193646654967568 For feedback, file a bug with component Speed>Bisection
,
Sep 6 2017
Assigning to myself to figure out what's going on. Bisect seems to be blaming random CLs.
,
Sep 7 2017
Hey, Annie was right. There were in fact two distinct regressions near to the alert position. There is nothing in the first bisect (#3) just noise, so +bmcquade is off the hook. The bisect in #14 does show a ~160KiB regression in the browser process, quite visible here: https://chromeperf.appspot.com/report?sid=a9f091bd3b8185f40016d20be556da5169aa167d53905898278c829d982bbacb&start_rev=492256&end_rev=500184 And confirms r497880 (dtapuska) as the culprit; but that has already been addressed at issue 760567 . Then the bisect in #15 found a second ~150KiB regression in the renderers, also visible here: https://chromeperf.appspot.com/report?sid=cb39b58c228e41931186ed97cdf25666d2c8f105665a00bd38a49211e6ced0bc&start_rev=494704&end_rev=499416 And confirms r498440 (altimin) as the culprit; but as it can be seen on that graph above it has already recovered. So nothing else left to address here! Thanks all for the patience and sorry for the confusions.
,
Sep 7 2017
Thanks for the analysis! I'm re-opening this temporarily just so my suggestions below don't get lost. Please feel free to open new bugs for the suggestions below & then close this out if these are better tracked elsewhere. Suggestion: could we update the bisect bot's logic to prevent it from opening a bug like this one in the future, when it produces results like these: chromium@498175 26429752 +- 252225 14 good chromium@498222 26421512 +- 195151 14 good chromium@498247 26454682 +- 394491 21 good chromium@498259 26414570 +- 172688 14 good chromium@498260 26426501 +- 29399.2 6 good chromium@498261 26461978 +- 107657 14 bad <-- chromium@498262 26503531 +- 181601 9 bad chromium@498265 26485064 +- 142308 9 bad chromium@498270 26497729 +- 153982 14 bad I'm not sure how this decided that my change was the culprit - that seems really bogus when you scan the results and notice that some of the deltas before my change are much smaller than deltas after (see my original analysis above). I think it's important that the bisect bot is as accurate as possible, so engineers take it seriously. Otherwise, we end up in a "cry wolf" situation where engineers just ignore new bugs. I know that the last few times I've had a bug from bisect bots assigned to me the results were wrong, and I'm starting to take a default position of assuming bisect bot results are not useful by default, which seems undesirable as we will be more likely to let real regressions get overlooked. A couple ideas: * your team should track a success metric to keep track of the % of these bugs that incorrectly diagnose the root cause, and have an OKR to drive that down to a reasonable level over time, both to prevent "cry wolf" issues and to minimize wasted engineer time. * update existing bisect bot bad change detection logic to scan results shortly before and shortly after the initially selected "bad" revision to make sure the deltas before the chosen revision are consistently larger than the deltas after. Use stats to show that the probability of the given bad change being truly bad is above some p value. If not statistically significant, maybe open the bug but assign it to the team that owns the bot so it can be analyzed further.
,
Sep 7 2017
Assigning to Annie in case she wants to add anything else. The TL;DR is that yes, we're aware of these issues (metrics are tracked and OKR's set!) and her team is working hard on a replacement of the entire bisect system, squarely targeted at addressing concerns like the ones you express.
,
Sep 7 2017
Adding Simon and Dave, who are working on pinpoint, the new version of bisect: https://docs.google.com/document/d/1FKPRNU2kbPJ15p6XHO0itCjYtfvCpGt2IHblriTX1tg/edit This was a bit of a corner case in that there were two regressions very close together. Pinpoint will handle this much better, but we'll still have to select the range correctly, and that turns out to be harder than it sounds. We do track metrics and send them in a weekly mail to chrome-speed-operations@, and have goals around improving accuracy. We'll also be tweaking the classification algorithm post-launch. Marking fixed, but Simon, feel free to reopen if need be. |
|||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||
Comment 1 by 42576172...@developer.gserviceaccount.com
, Sep 5 2017