Regression due to extending memory dump wait time
Issue description: See the link to graphs below.
Jun 28 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8975584062646688784
Jun 28 2017
=== Auto-CCing suspected CL author perezju@chromium.org ===
Hi perezju@chromium.org, the bisect results pointed to your CL, please take a look at the results.

=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author  : perezju
  Commit  : b4eb70c319cb0f38d1a7764e7d9fb56dc96f8bcf
  Date    : Mon Jun 26 14:25:41 2017
  Subject : [System Health] Extend dump time to 5 seconds

Bisect Details
  Configuration : android_webview_arm64_aosp_perf_bisect
  Benchmark     : system_health.memory_mobile
  Metric        : memory:webview:all_processes:reported_by_os:system_memory:java_heap:proportional_resident_size_avg/load_media/load_media_dailymotion
  Change        : 36.12% | 5027498.66667 -> 6843392.0

Revision                             Result               N
chromium@482136                      5027499 +- 95475.8   6  good
chromium@482271                      5272576 +- 104548    6  good
chromium@482280                      5781845 +- 114117    6  good
chromium@482282                      6063445 +- 84275.5   6  good
chromium@482282,catapult@b4eb70c319  8299179 +- 100401    6  bad  <--
chromium@482283                      8167424 +- 161675    6  bad
chromium@482284                      7837696 +- 214307    6  bad
chromium@482288                      7489877 +- 140633    6  bad
chromium@482305                      7315115 +- 139795    6  bad
chromium@482339                      7214080 +- 138872    6  bad
chromium@482406                      6843392 +- 221108    6  bad

Please refer to the following doc on diagnosing memory regressions:
https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md

To Run This Test
src/tools/perf/run_benchmark -v --browser=android-webview --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=load.media.dailymotion system_health.memory_mobile

Debug Info
https://chromeperf.appspot.com/buildbucket_job_status/8975584062646688784

Is this bisect wrong? https://chromeperf.appspot.com/bad_bisect?try_job_id=6153471146852352

Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq for more information on addressing perf regression bugs. For feedback, file a bug with component Speed>Bisection. Thank you!
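[Editor's note, not part of the bisect output: the reported 36.12% change is simply the relative difference between the first good and last bad medians in the table above. A minimal sketch that reproduces it:]

```python
def percent_change(before, after):
    """Relative change between two measurements, as a percentage."""
    return 100.0 * (after - before) / before

# Good and bad values from the bisect table above
# (bytes of proportional resident size).
print(percent_change(5027498.66667, 6843392.0))  # ~36.12
```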
Jun 28 2017
Issue 737354 has been merged into this issue.
Jun 28 2017
Issue 737356 has been merged into this issue.
Jun 28 2017
Issue 737364 has been merged into this issue.
Jun 28 2017
Issue 737361 has been merged into this issue.
Jun 28 2017
Issue 737367 has been merged into this issue.
Jun 28 2017
Unsurprisingly, extending the dump time caused a bunch of memory metrics to jiggle around. These are probably fine, but I will have a careful look at them. The first thing is to clean up the list of alerts, which currently has "WARNING: Non-overlapping alerts are grouped together.", meaning regressions not related to my change could have been mistakenly assigned to this bug.
Jun 28 2017
Issue 737471 has been merged into this issue.
Jun 28 2017
Issue 737369 has been merged into this issue.
Jun 28 2017
Issue 737476 has been merged into this issue.
Jun 28 2017
This CL is causing huge variability (on the order of +/-30 MB) in a previously stable benchmark: https://chromeperf.appspot.com/report?sid=88c5b38cef32f976acd3712c66ce2ddfbd6180708bea8ad8f63e92f88c2905e1&rev=482343 Shouldn't we instead try to reduce the sampling interval (i.e., increase the number of samples) to reduce the noise coming from the sampling time?
Jun 28 2017
I'm planning to take another large dump of measurements during the coming weekend and assess the impact that this CL had. Re: increasing the number of samples: that's an end goal, but it's blocked on moving the logic that triggers memory dumps into Chrome. At the moment we still rely on a single memory dump taken per story, and run the story 3 times to get 3 samples.
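[Editor's note, a back-of-the-envelope illustration of why more samples would help: for independent samples, the standard error of the mean shrinks as 1/sqrt(n), so quadrupling the sample count halves the noise. A minimal sketch; the 30 MB figure is the order of variability reported above:]

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean for n independent samples with std dev sigma."""
    return sigma / math.sqrt(n)

sigma = 30.0  # MB, the order of variability reported in this thread
for n in (3, 6, 12):
    print(n, round(standard_error(sigma, n), 1))  # 3 -> 17.3, 6 -> 12.2, 12 -> 8.7
```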
Jun 28 2017
Annie, +simonhatch: Unrelated alerts (whose range does not include my CL) keep being added to the list in: https://chromeperf.appspot.com/group_report?bug_id=737353 After my earlier cleanup, I'm again getting "WARNING: Non-overlapping alerts are grouped together.", likely due to some recent dupe. When a culprit CL is found, alerts on ranges that do not include said CL should probably be split off onto new bugs.
Jun 28 2017
For short-running benchmarks we are probably getting samples at the beginning of the story and losing coverage of memory usage at the end of the story. This may accidentally improve the variability of the benchmark if there is not much happening at the beginning of the story. But that is a false improvement, since it comes from ignoring data.
> Re: increasing the number of samples: that's an end goal, but it's blocked on moving the logic that triggers memory dumps into Chrome.
This CL seems to me like a step in the opposite direction. Why not keep the current state until the blocking issue is resolved?
Jun 28 2017
On system health benchmarks we always take a single memory dump at the end of the story. For the broader discussion of why this change, and why now, see issue 734853.
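[Editor's note, a sketch of the idea under discussion, not the actual telemetry code; the function names are hypothetical. The CL extends the idle wait before that single end-of-story dump, conceptually:]

```python
import time

DUMP_WAIT_SECONDS = 5  # the CL under discussion extended this wait

def measure_story_memory(take_memory_dump):
    """Hypothetical end-of-story measurement: idle-wait, then one dump."""
    # Wait for the page to settle so late allocations and GC activity
    # are reflected in the single dump taken per story run.
    time.sleep(DUMP_WAIT_SECONDS)
    return take_memory_dump()
```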
Jun 28 2017
Oh, I misunderstood what the dumping interval means. I thought it was the interval between periodic memory dumps. Sorry for that; please ignore comment #17. Since the new dumping interval increases the running time of a story, shouldn't the stories be re-recorded? The stories that regress with the CL are probably now throwing errors because of missing resources.
Jun 28 2017
That's a good point. While assessing the impact of the CL I'll check if any of the stories need re-recording.
Jun 28 2017
re: #c16 Yeah, it'd be nice if we could split off just the ranges affected by the CL; not sure whether that would be a manual or automated process.
Jun 29 2017
A bunch of memory regressions happened very close to each other, also near to my change (how lucky!). Keeping track of them here so far:
r482282 - this issue
r482305 - issue 737360
r482333 - issue 737264
r482346 .. r482354 - issue 737525
A couple of them are still waiting for bisects: issues 737921, 737922. I'll have to carefully see whether some of the regressions still attached to this bug may be due to some of the other issues (or a combination of a few of them!)
Jun 29 2017
Issue 737921 has been merged into this issue.
Jun 29 2017
Issue 737922 has been merged into this issue.
Jul 3 2017
Issue 738905 has been merged into this issue.
Jul 3 2017
Issue 738903 has been merged into this issue.
Jul 3 2017
Issue 738875 has been merged into this issue.
Jul 3 2017
Issue 738906 has been merged into this issue.
Jul 4 2017
Issue 739140 has been merged into this issue.
Jul 4 2017
After more careful splitting off of alerts not related to this CL (new issues 738883 and 738884 were found!), ~200 alerts remain that are attributable to my change. They can be put together into the following 4 groups:

A. Regressions in v8 / native_heap (105 alerts)
- Affected platforms: Windows (both) and Mac (v8 only).
- Bisect confirmed r482282 as culprit for the v8 regression (54 MiB, load_news_qq on win).
- Not confirmed on native_heap, but the worst offenders coincide in magnitude and affected stories/platforms.
- Worst offenders: load_news_qq (133 MiB), load_tools_weather (32 MiB), load_tools_docs (19 MiB), load_tools_drive (13 MiB).

B. Regressions in skia / gpu (18 alerts)
- Affected platforms: mostly Android.
- Bisects confirmed r482282 as culprit for both skia and gpu (4.1 MiB, browse_media_youtube on android).
- Worst offenders: load_games_spychase (8.0 MiB on android), browse_media_youtube (4.1 MiB on android), load_news_qq (5.3 MiB on win and mac).
- 2 alerts on cc for load_tools_weather are also possibly related and confirmed by bisect.

C. Regressions in java_heap (11 alerts)
- Affected platforms: Android.
- Bisect confirmed r482282 as culprit (6.7 MiB, load_tools_weather).
- Worst offenders: load_tools_weather (7.8 MiB), load_games_spychase (3.4 MiB), load_news_irctc (2.1 MiB).

D. Regressions in malloc (22 alerts)
- Affected platforms: Mac and Windows.
- Not confirmed by bisect, but the worst offenders show both the ref and regular builds moving together, e.g.: https://chromeperf.appspot.com/report?sid=9b2ba8352519ac52c32f9e482ae41f88052054452887de1a90cef80e6a70a730&rev=482287
- Worst offenders: load_tools_weather (5.2 MiB), load_games_alphabetty (3.1 MiB), load_tools_drive (2.0 MiB), all on Mac.

================

Due to their magnitude, alerts in A probably merit more investigation. Could these be due to a leak in Chrome? (Or maybe just some javascript on the story hungrily eating more memory as time progresses?) I'm thinking of calling alerts in B, C, and D an unfortunate but unavoidable effect of the change in the testing framework. Thoughts on these?

As a follow-up, I'll now proceed to analyse the effect on noise levels (I just finished gathering data points from the waterfall during this last weekend).
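[Editor's note, not part of the original comment: a minimal sketch of the kind of triage done above, grouping alerts by metric and surfacing the worst offender per group. The alert records below are made up; real ones come from the chromeperf dashboard.]

```python
from collections import defaultdict

# Hypothetical alert records modelled on the groups above.
alerts = [
    {"metric": "v8", "story": "load_news_qq", "platform": "win", "delta_mib": 133.0},
    {"metric": "java_heap", "story": "load_tools_weather", "platform": "android", "delta_mib": 7.8},
    {"metric": "malloc", "story": "load_tools_weather", "platform": "mac", "delta_mib": 5.2},
]

by_metric = defaultdict(list)
for alert in alerts:
    by_metric[alert["metric"]].append(alert)

for metric, group in sorted(by_metric.items()):
    worst = max(group, key=lambda a: a["delta_mib"])
    print("%s: %d alert(s), worst: %s (%.1f MiB on %s)"
          % (metric, len(group), worst["story"], worst["delta_mib"], worst["platform"]))
```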
Jul 4 2017
So B is fun: after the GPU discussion that triggered all this, I was expecting them to go down, not up :/

C is fine and very plausible; it's very likely that will change with your CL when you send USR1 anyway to get rid of Issue 724032.

D is very likely a consequence of A (v8's malloc memory for parsing and such).

A, I agree, is curious. Possibly, as you say, it's the page trying to do something more (timers etc.) after we did the GC and hence accumulating more memory.
Jul 6 2017
For the past couple of days I've been crunching through data gathered this past weekend and comparing it against my previous dump (from early May), to analyze changes in noise patterns for system health across the entire perf waterfall. Colab with lots of pretty pictures: https://colab.corp.google.com/v2/notebook#fileId=0B586tdnIdbhKYUVfVXlZRHY4VlU

I've got good news and bad news.
GOOD: Noise in gpu_memory decreased across most stories on N5 and N7!
BAD: Not due to my change. :-(

Some related graphs: https://chromeperf.appspot.com/report?sid=1c9dea2d396c0a40f0f6f20b22540e6ff79445c416f9b70d97f484c968df747c&start_rev=459675&end_rev=484479
The drop in noise happened around: http://test-results.appspot.com/revision_range?start=477936&end=481246
The CL to increase the dump wait time happened later, at r482282.

While investigating this I also found a few other cases where noise increased, also not due to my change:
- v8 on load:news:qq, on mac and win: https://chromeperf.appspot.com/report?sid=e0133c1ed628c1c51489400ea515bca31a7cbc48b674d03e8f58e14623f640ad&start_rev=459675&end_rev=484479
- cc on load:media:imgur: https://chromeperf.appspot.com/report?sid=7c6076f9eaf0c3a0750bf512d6460ec819bfdff89d385325a9420e968dc29baf&start_rev=459675&end_rev=484479
- private dirty on mac: https://chromeperf.appspot.com/report?sid=5126eabea0cbdfe7a8c38f9083f1dd6b9c528da545803e2918d4a10b75f48f75&start_rev=459675&end_rev=484479

A few other time series got a bit more or a bit less noisy, but overall noise levels stayed about the same. Several alerts (both improvements and regressions) were caught at the time my CL landed: https://chromeperf.appspot.com/group_report?rev=482282

But given that noise did not go down that much anyway, I'm now considering reverting it. (Probably on the weekend, to avoid having it cluster again with other regressions?) Any other thoughts or opinions?
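[Editor's note, not from the original comment: one simple, scale-free way to compare noise between two data dumps is the coefficient of variation per time series. A minimal sketch; the sample values are made up for illustration.]

```python
import statistics

def coefficient_of_variation(samples):
    """Std dev relative to the mean; a scale-free measure of noise."""
    return statistics.stdev(samples) / statistics.mean(samples)

# Hypothetical before/after samples for one (bot, story, metric) series, in MiB.
may_dump = [5.0, 5.2, 4.9, 5.1, 5.0]
july_dump = [5.1, 5.6, 4.6, 5.3, 4.8]
print("May: %.3f  July: %.3f" % (coefficient_of_variation(may_dump),
                                 coefficient_of_variation(july_dump)))
```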
Jul 7 2017
Issue 737331 has been merged into this issue.
Jul 7 2017
Yeah, given that gpu improved regardless, at this point I'd revert, so that we don't have to worry about whether any of these regressions are due to the change in the delay or not. I think we just have to figure out a better strategy with the gpu folks here.
Jul 31 2017
Issue 739319 has been merged into this issue.