Issue metadata
Sign in to add a comment
|
1.3%-1.5% regression in memory.top_10_mobile at 518659:518766 |
||||||||||||||||||
Issue descriptionSee the link to graphs below.
,
Dec 5 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8961038381820465952
,
Dec 5 2017
=== Auto-CCing suspected CL author morlovich@chromium.org === Hi morlovich@chromium.org, the bisect results pointed to your CL, please take a look at the results. === BISECT JOB RESULTS === Perf regression found with culprit Suspected Commit Author : Maks Orlovich Commit : 6bf5e45d34018dfd5fbb114d88d422581aa33be4 Date : Wed Nov 22 21:30:29 2017 Subject: Add fieldtrial_testing_config for SimpleCachePrefetch for platforms Bisect Details Configuration: android_nexus5X_perf_bisect Benchmark : memory.top_10_mobile Metric : memory:chrome:all_processes:reported_by_chrome:malloc:effective_size_avg/background/after_https_m_facebook_com_rihanna Change : 1.03% | 43223465.3333 -> 43668876.8889 Revision Result N chromium@518658 43223465 +- 811157 9 good chromium@518712 42863501 +- 521870 6 good chromium@518739 42826885 +- 678000 9 good chromium@518746 42834563 +- 776329 6 good chromium@518750 42938060 +- 783754 9 good chromium@518752 42853037 +- 436398 6 good chromium@518753 43477405 +- 905928 9 bad <-- chromium@518766 43668877 +- 711847 9 bad Please refer to the following doc on diagnosing memory regressions: https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md To Run This Test src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=https.m.facebook.com.rihanna memory.top_10_mobile More information on addressing performance regressions: http://g.co/ChromePerformanceRegressions Debug information about this bisect: https://chromeperf.appspot.com/buildbucket_job_status/8961038381820465952 For feedback, file a bug with component Speed>Bisection
,
Dec 5 2017
FYI re: possibility of launching this...
,
Dec 5 2017
hmm .. 430 KiB is massive .. On the other hand we could argue that this memory bump happens more at the time when the page is loading, which is typically earlier than the mem peak happens. Hence Chrome is unlikely to be killed more often or cause more background apps to be killed more often with this feature ON. Another way could be to limit the amount of concurrent prefetched-n-unused entries per backend. This looks not too involved, but I may miss some important details in the state machine. Maks, can you start a discussion on, say, loading-dev@ to converge on the solution?
,
Dec 5 2017
@xunjieli --- any chance you have time to give me some advice/pointers here? Not really familiar with the loading-dev list, but hmm, more immediately, I can't make much sense out of traces. I think it only measures at the end --- and that point the net usage is basically zero, so it's not even pre-read buffers directly? Also, I think the difference seems to be in the renderer process, from picking a couple of traces. Maybe the change is a bad idea for CacheStorage or something? The numbers for that certainly look worse --- comparing the ratio of ReadStream1FromPrefetched = true to SyncOpenDidPrefetch = true (which basically measures the portion of opens that were useful) between SimpleCache.App. and SimpleCache.Http is quite drastic. If this (or some other bot) is tracking canary, may be worth trying turning off pre-read just for cache_type_ == APP_CACHE, since that would be a trivial CL.
,
Dec 5 2017
I looked at the traces. I can't tell what's going on. You can try to see if this reproducible locally or use perf trybots, and from there add some probes to see what had grown. (https://chromium.googlesource.com/chromium/src/+/master/docs/speed/perf_trybots.md)
,
Dec 5 2017
Hmm, can I get correlated comparisons with other memory metrics for the regression run? Doing a trybot run with what's basically a revert produced a mixture where some memory metrics improved and some didn't... I wonder if the original change was like that, too? Here is the one from my trybot run of what is probably a revert: https://00e9e64bac23adef1c7a6a9b63031e0608cdc50e2c2e7023e8-apidata.googleusercontent.com/download/storage/v1/b/chromium-telemetry/o/html-results%2Fresults-2017-12-05_13-14-45?qk=AD5uMEu9hWKHqDB1Oj4VTLX2NReby7lfTOt-4-UwsNj_klnnr2wvPIigAm4T5IRaWkD8xV7YLFhpmCeiuPQlXNVAdZQOyEsrwxP-Eagr97Uyfo8E9cXqfrOvNfoJ6067LaSYexs4wxgQrJDdoVhqmlvLOrT13XSYbD9M_gDYnsXAY6_rBoWvPFw2ufRV3nb7Mh_GirZUbnFJ_J2H2FWFuq874EdvZ_Iz0bzUCxEdbf_UyMyaamEwmSAr16d3-WuNyVRkc7mP8gar2AhnJDBLeSIJ1XcutYgjWXiumxyqqqtONvx4pK_-GsAgkXRk0sMdW91jPV9AvtJI93IBtL0OsqKHKHkujuT8KJlYbrdMWccUbGmqzgsHIZYN4hz1bb4vHZ1hNJhQt4bh7uuI8DFkj4bbgkS3Y2iEImc4t4pVPK5wnI6YOmJjkTgSNxGi5EUAhGLdqVbjVRbE-B7MWhBKwwhIySU97UDljFQVRCPN7ygwah2tAfcQ0FuZ2xKxZh8sjHsCHgPrKDfYJB8Fkj9f4ZAi0afAAMtyW_hz9lJq6uJdCgD5r0icvAWAvMWrZu1JvTZpsyeYxdxfFO2ice79YUYsUZcuuQBwyz-hLvPdnsrU1-T0Pw90aXbp_UIu8Z2tBpumufpWqZRoIrlCRotICNAlmoSAzQBjrT1364uLzgooCOUYyE45q2Vv4gjJF5vl9rPrx6YIQ_Iv3Q8ujj4qo79YZwlY6RKMRCTmxrNsK9UOAFI2Xb4UvAecdZNGVWIdxCcH6sjqukMgLKQW_QR_Kq8FzM2JHHlw0w#r=TOT&s=%25%CE%94avg&g=name&c=0&n=0
,
Dec 5 2017
Re#8: To get all perf data for a revision, you can use this url: https://chromeperf.appspot.com/group_report?rev=518753
,
Dec 6 2017
Hmm, how would I get it to report the metric the report is about? (And how do I find out what exactly it means)?
,
Dec 7 2017
re #8: not a good link :) The blah-apidata domains come and go, but there is a way to share a working link if you grab it before some redirects kick in. Kudos to inventers of this convenience. This metric is a newfangled perf infra metric, written in javascript, living in the catapult repo, summarized into a fancy interactive html page (which can also be embedded into a trace). So for instance this regression has a "View trace from this run" link if you click on the exclamation point on the dashboard (and scroll into the hidden area). Which leads here: https://storage.cloud.google.com/chrome-telemetry-output/after_https_m_facebook_com_rihanna_2017-11-22_22-36-43_86103.html (does not open with @chromium.org account for me - convenience everywhere) When the trace opens (with the blah-apidata dontshareme URL), you have the vertical tab called "Metrics", then choose dropdown value "memoryMetric", then there is a row "memory:chrome:all_processes:reported_by_chrome:effective_size" with one sample (of course, we ran the page only once). Then you can click on the blue bars next to it. More documentation about this UI: https://github.com/catapult-project/catapult/blob/master/docs/metrics-results-ui.md With perf trybot you'd get a link to uploaded report, where you'd see essentially the same UI, but multiple samples and a side by side comparison. Hopefully you'd be able to dig into details from there, but I usually cannot, then I give up and run the benchmark locally, hack it to be able to grab all the raw data and filter/slice it appropriately in a colab notebook. I could not convince enough people to get me raw data at all time easily. Yet. Maybe yet. Good luck!
,
Dec 7 2017
Maybe a useful URL of trying what's essentially a revert on the bots: https://console.developers.google.com/m/cloudstorage/b/chromium-telemetry/o/html-results/results-2017-12-06_14-26-38#r=TOT&s=%25%ce%94avg This does show a change of 430K... (well, at p=.14); and changing to just preload http/media caches doesn't. Hmm. Probably the browser process, too. Helen explained some of this to me --- what I was missing was that it was possible could split up the effective_size metric to just the malloc row by clicking the labels under the bar chart. Opening up traces from it, it seems like the difference is under malloc -> allocated_objects -> <unspecified> which isn't very helpful (but maybe I can hack the code to make things specified somehow...) Still, even with this I am not sure what exactly effective_size is counting, and what's the after_ supposed to mean...
,
Dec 8 2017
+ more folks to help with digging into details of "malloc -> allocated_objects -> <unspecified>". There were recent changes in the memory infra AFAIR.
,
Dec 8 2017
So looking at netlog, this is actually a /cold/ load, so only opens seem to just read stream 0, then write ~12 bytes to stream 2, and usually with in a millisecond or two, with one instance taking ~7ms. Preread is kinda useless there, but it's hard to see how that yields any sort of permanent memory usage change, yet my revert seems to show memory improvement pretty consistently. I wonder if any one the scenarios cover warm loads?
,
Dec 8 2017
Hmm, I am reminded of this: https://chromium.googlesource.com/chromium/src/+/248782b9cda664a1ea6d347dd508354d23160a5a ... Since this is cross-thread as well, but more intentionally so.
,
Dec 11 2017
Experimenting with some hacks to avoid it, cross-thread alloc/delete probably isn't the issue (besides, that was about system_memory:native_heap:proportional_resident_size_avg)
,
Dec 12 2017
Re #13: "malloc -> allocated_objects -> <unspecified>" on Android could increase because of 2 reasons. The app allocated 400KB extra (this could either be Chrome or indirect android allocation) or it allocated and freed the 400K, so dlmalloc kept around the memory in freelists. If you can reproduce the regression locally, then run with enabling heap profiler and check if chrome allocations have increased. If so, then we can see stack trace of where these allocations came from. If not we would need to get an android native heap dump.
,
Dec 12 2017
Thanks, that's helpful. "Allocated and freed" is more consistent with what it should be doing, though the total amount is surprising --- perhaps the large number of threads involved is a factor. I am not sure profiling would be that helpful since I know what the extra allocations are (well, at least the ones I am making, maybe changes in timing make other things allocated more) --- just I expected them to be very short term, and timed in a way that they make way for later usage[1]. So the android native heap dump may be the thing to do to understand how they interact with things. Are they any recent recommended docs for that? (Android ones seem to be "use Android studio", which may be more than sufficient) [1] hmm, entries being reopened to write out V8 compilation cache might be different in that respect, though. Actually some of the allocations for those are kinda useless...
,
Dec 14 2017
Hmm, allocated_object_size_avg has the same curve shape --- does that also potentially include things in freelists? Also kinda weird that other devices don't show the same symptom.
,
Jan 10 2018
So... Rounding up all the "new" allocations to 66,000 bytes seems to fix it... as does rounding them up to 65,000. The experiment was meant to show that the per-thread cache was relevant (given its cap on Android seems to be 65,536 by default), but I am not sure of how to understand this result.
,
Jul 30
|
|||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||
Comment 1 by 42576172...@developer.gserviceaccount.com
, Dec 5 2017