Discount memory-infra service memory use.
Issue description: Seems to be limited to android bots.
Jul 31 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8972540375208027568
Jul 31 2017
=== Auto-CCing suspected CL author hjd@google.com ===
Hi hjd@google.com, the bisect results pointed to your CL, please take a look at the results.

=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author  : Hector Dearman
  Commit  : 3d197c0b3c6b2ff9cbce4990775f7bb120838231
  Date    : Fri Jul 28 11:34:31 2017
  Subject : memory-infra: Start using new RequestOSMemoryDump API

Bisect Details
  Configuration : android_nexus5_perf_bisect
  Benchmark     : system_health.memory_mobile
  Metric        : memory:chrome:all_processes:reported_by_chrome:malloc:effective_size_avg/load_games/load_games_spychase
  Change        : 3.03% | 20707537.3333 -> 21334402.6667

Revision          Result                N
chromium@490179   20707537 +- 201410    6   good
chromium@490280   20826725 +- 197181    6   good
chromium@490330   20772580 +- 292103    6   good
chromium@490355   20816724 +- 139181    6   good
chromium@490359   20810340 +- 244497    6   good
chromium@490360   20812675 +- 190797    6   good
chromium@490361   21380413 +- 119745    6   bad   <--
chromium@490362   21355447 +- 107814    6   bad
chromium@490368   21309777 +- 157651    6   bad
chromium@490380   21334403 +- 213126    6   bad

Please refer to the following doc on diagnosing memory regressions:
https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md

To Run This Test
src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=load.games.spychase system_health.memory_mobile

More information on addressing performance regressions: http://g.co/ChromePerformanceRegressions
Debug information about this bisect: https://chromeperf.appspot.com/buildbucket_job_status/8972540375208027568
For feedback, file a bug with component Speed>Bisection
Jul 31 2017
Issue 750654 has been merged into this issue.
Aug 1 2017
+primiano@, +ssid@, perezju@: Using the separate OS dumps mechanism, memory-infra seems to have regressed memory by up to 1MB :( I find it a little hard to believe this is accurate; it seems more likely I messed something up during the switch. Taking a look now.
Aug 1 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8972471000360007152
Aug 1 2017
Split off the regressions that predate my CL into issue 751052.
Aug 1 2017
=== BISECT JOB RESULTS ===
NO Perf regression found

Bisect Details
  Configuration : android_webview_arm64_aosp_perf_bisect
  Benchmark     : system_health.memory_mobile
  Metric        : memory:webview:all_processes:reported_by_chrome:v8:effective_size_avg/load_tools/load_tools_stackoverflow

Revision          Result               N
chromium@490076   5980704 +- 959941    21   good
chromium@490135   6015339 +- 861047    21   bad

Please refer to the following doc on diagnosing memory regressions:
https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md

To Run This Test
src/tools/perf/run_benchmark -v --browser=android-webview --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=load.tools.stackoverflow system_health.memory_mobile

More information on addressing performance regressions: http://g.co/ChromePerformanceRegressions
Debug information about this bisect: https://chromeperf.appspot.com/buildbucket_job_status/8972471000360007152
For feedback, file a bug with component Speed>Bisection
Aug 1 2017
Hmm, take a look at the traces; this feels more like we are missing something in the new dumps, or maybe it's just timing. When you look at the traces, where does the 1MB come from?
Aug 1 2017
After getting sidetracked a lot, I ran load:search:amazon with and without my CL with --enable-heap-profiling.
I am deeply suspicious of the following entry, which changes:

        Type                                             Total Size   Count
Before: memory_instrumentation::mojom::ClientProcess       2.3KiB        21
After:  memory_instrumentation::mojom::ClientProcess     553.2KiB     5,727

Why do we have 5,706 more instances of memory_instrumentation::mojom::ClientProcess after my CL? :(
This probably explains the how; now to work out the why...
Before: https://screenshot.googleplex.com/M6kDnkfq4ot.png
After: https://screenshot.googleplex.com/oVQxdtb8oeF.png
Before: https://hjd.users.x20web.corp.google.com/www/2017-08-01/load_search_amazon22017-08-01_16-35-17.html
After: https://hjd.users.x20web.corp.google.com/www/2017-08-01/load_search_amazon22017-08-01_16-25-50.html
Aug 1 2017
So I have a native heap profile; unfortunately, I'm still confused.
https://screenshot.googleplex.com/Kdgx5XXJMvJ.png
https://hjd.users.x20web.corp.google.com/www/2017-08-01/native-heap-profile.html
Aug 3 2017
Actually, misleading heap profile aside, this makes sense: a test on my z840 shows that opening a new Chrome and taking a memory dump leads to us sending 7745 memory maps to the browser. Each memory map is a struct like this: https://cs.chromium.org/chromium/src/services/resource_coordinator/public/interfaces/memory_instrumentation/memory_instrumentation.mojom?l=50

Optimistically that is at least: uint64 * 9 + uint32 * 1 + string* * 1 = 84 bytes, and 84 bytes * 7745 = 650580 bytes ≈ 635.3 KiB. 635 KiB is the right order of magnitude (and this doesn't include the string itself or the overhead from the mojo unique_ptrs for each map). So all our infrastructure works correctly!

We don't take detailed dumps in the wild (right?), so this shouldn't regress for actual users, but it kind of sucks that all our benchmarks are going to be affected by these allocations. Also note that this puts +/- 500 KB of noise in browser malloc, since we may or may not see these allocations depending on the ordering of os_dumps vs chrome_dumps.
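For reference, a minimal sketch of that back-of-the-envelope math; the per-field breakdown and the 7745-map count are just the numbers quoted above, not the actual mojom struct layout:

// Rough lower bound on the heap cost of the buffered memory maps, using the
// field breakdown quoted above: 9 x uint64, 1 x uint32 and one pointer-sized
// string reference per map. Ignores the string payloads and the per-map mojo
// unique_ptr overhead, so the real number is somewhat larger.
#include <cstddef>
#include <cstdint>
#include <cstdio>

int main() {
  const size_t kBytesPerMap =
      9 * sizeof(uint64_t) + sizeof(uint32_t) + sizeof(char*);  // 84 on 64-bit
  const size_t kMapCount = 7745;  // maps observed in the z840 test above
  const size_t kTotal = kBytesPerMap * kMapCount;
  printf("%zu bytes/map * %zu maps = %zu bytes (~%.1f KiB)\n", kBytesPerMap,
         kMapCount, kTotal, kTotal / 1024.0);
  return 0;
}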
Aug 3 2017
Ahh right, good point. So w.r.t. overhead, we have a mechanism to discount it using TraceEventMemoryOverhead, so it won't show up in benchmarks. About sequencing, I need to understand the problem a bit more: where does the memory accumulate, client side or service side?
Aug 3 2017
Mostly in the service. The overhead in the clients should have doubled for now, but that should disappear when my CL to remove process_metrics_dump_provider lands.
Aug 3 2017
I see, so I guess it accumulates until all clients reply and the data gets serialized. I think that if we use TraceEventMemoryOverhead right, that should go away.
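To make that concrete, here is a hedged sketch of one way the service-side buffering could be made visible to the discount logic via the usual memory-infra dump-provider pattern; the class name, dump name and EstimatePendingSerializedBytes() helper are hypothetical, and the actual fix may well go through TraceEventMemoryOverhead as suggested above:

// Sketch only: attribute the coordinator's temporary per-dump allocations to a
// dedicated allocator dump so benchmark-side discounting can subtract them,
// instead of leaving them as anonymous browser malloc.
#include <cstdint>

#include "base/trace_event/memory_allocator_dump.h"
#include "base/trace_event/memory_dump_provider.h"
#include "base/trace_event/process_memory_dump.h"

class CoordinatorOverheadDumpProvider
    : public base::trace_event::MemoryDumpProvider {
 public:
  bool OnMemoryDump(const base::trace_event::MemoryDumpArgs& args,
                    base::trace_event::ProcessMemoryDump* pmd) override {
    base::trace_event::MemoryAllocatorDump* dump =
        pmd->CreateAllocatorDump("memory_instrumentation/coordinator");
    dump->AddScalar(base::trace_event::MemoryAllocatorDump::kNameSize,
                    base::trace_event::MemoryAllocatorDump::kUnitsBytes,
                    EstimatePendingSerializedBytes());
    return true;
  }

 private:
  // Hypothetical helper: bytes of mojom structs (memory maps etc.) buffered in
  // the service while it waits for all clients to reply.
  uint64_t EstimatePendingSerializedBytes() const { return 0; }
};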
Aug 7 2017
Issue 750707 has been merged into this issue.
Aug 8 2017
Issue 750815 has been merged into this issue.
Aug 8 2017
Issue 750707 has been merged into this issue.
Aug 10 2017
Issue 753955 has been merged into this issue.
Aug 11 2017
Issue 753957 has been merged into this issue.
Sep 18 2017
hjd: any update on this?
Sep 23 2017
After the dust settles from the remaining memory-infra refactoring, I'll figure out how to add this to our discount logic. I've updated the bug title and removed the sheriff label (since this is a measurement artifact).