Issue metadata
Sign in to add a comment
|
1605.8%-1626208.4% regression in memory.top_10_mobile at 413305:414853 |
||||||||||||||||||||
Issue descriptionSee the link to graphs below.
,
Aug 29 2016
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/9002983862732741504
,
Aug 29 2016
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/9002982605007655616
,
Aug 29 2016
Some traces: Before: https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_13-2016-08-26_15-11-24-91033.html After: https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_13-2016-08-26_17-07-52-41032.html "After" trace has a lot more entries in Android>Ashmem PSS (see screenshots)
,
Aug 29 2016
(adding "before" screenshot which should have been in comment #4)
,
Aug 29 2016
perezju@: most of these are huge regressions. I have kicked off several bisects, but they deserve a shortcut given the number and magnitude of regressions. Any clue might be causing these?
,
Aug 29 2016
Any clue what might have caused the huge memory regressions? Cc-ing all CL owners who have memory related changes in the range. https://chromium.googlesource.com/chromium/src/+log/1d1f13a9f7961b2b891b858c769dc7f2b787e8f8%5E..dbd715219c65c162e08fda7e797a16e4aaeec330?pretty=fuller
,
Aug 29 2016
Hmm, it's not https://codereview.chromium.org/2286833004, which was my CL.
,
Aug 29 2016
===== BISECT JOB RESULTS ===== Status: failed ===== TESTED REVISIONS ===== Revision Mean Std Dev N Good? chromium@414844 4260928 0.0 5 bad Bisect job ran on: android_nexus6_perf_bisect Bug ID: 641962 Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --also-run-disabled-tests memory.top_10_mobile Test Metric: background-memory:chrome:all_processes:reported_by_chrome:gpu:effective_size_avg/after_https_m_facebook_com_rihanna Relative Change: None Score: 0 Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus6_perf_bisect/builds/2524 Job details: https://chromeperf.appspot.com/buildbucket_job_status/9002983862732741504 Not what you expected? We'll investigate and get back to you! https://chromeperf.appspot.com/bad_bisect?try_job_id=5845001534177280 | O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq | X | for more information addressing perf regression bugs. For feedback, | / \ | file a bug with component Tests>AutoBisect. Thank you!
,
Aug 29 2016
My CL in the range was to avoid allocating GPU memory to fix a regression on the same test suite. It reverted to old behaviour to avoid making an object we won't use. So not likely..
,
Aug 29 2016
Can we make it try bisect again?
,
Aug 29 2016
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/9002963010413304000
,
Aug 29 2016
Kicked off few more bisects. Something fundamental about memory must have changed. We have 600+ alerts since Friday: https://bugs.chromium.org/p/chromium/issues/list?can=2&q=reporter%3Amustaq%40chromium.org+opened%3E2016-08-25+opened%3C2016-08-30+Type%3DBug-Regression+summary%3Amemory.top_10_mobile&colspec=ID+Pri+M+Stars+ReleaseBlock+Component+Status+Owner+Summary+OS+Modified&x=m&y=releaseblock&cells=ids
,
Aug 29 2016
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/9002962624753336224
,
Aug 30 2016
=== Auto-CCing suspected CL author sievers@chromium.org === Hi sievers@chromium.org, the bisect results pointed to your CL below as possibly causing a regression. Please have a look at this info and see whether your CL be related. ===== BISECT JOB RESULTS ===== Status: completed ===== SUSPECTED CL(s) ===== Subject : Android: Trigger tab placeholder update when activity is paused Author : sievers Commit description: This makes sure we have a placeholder bitmap when we resume and don't have a frame from the renderer yet. BUG= 636630 Review-Url: https://codereview.chromium.org/2186453004 Cr-Commit-Position: refs/heads/master@{#414822} Commit : 0a1f48d5cb6a5d263f85bc5322912b7f703c40ac Date : Fri Aug 26 21:50:30 2016 ===== TESTED REVISIONS ===== Revision Mean Std Dev N Good? chromium@414813 141394 182.91 5 good chromium@414819 141394 182.91 5 good chromium@414821 141394 182.91 5 good chromium@414822 2410578 182.91 5 bad <-- chromium@414824 2410578 182.91 5 bad Bisect job ran on: android_nexus5X_perf_bisect Bug ID: 641962 Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --also-run-disabled-tests memory.top_10_mobile Test Metric: background-memory:chrome:all_processes:reported_by_os:system_memory:ashmem:proportional_resident_size_avg/after_http_www_amazon_com_gp_aw_s_k_nexus Relative Change: 1604.87% Score: 99.9 Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus5X_perf_bisect/builds/615 Job details: https://chromeperf.appspot.com/buildbucket_job_status/9002982605007655616 Not what you expected? We'll investigate and get back to you! https://chromeperf.appspot.com/bad_bisect?try_job_id=5607765391704064 | O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq | X | for more information addressing perf regression bugs. For feedback, | / \ | file a bug with component Tests>AutoBisect. Thank you!
,
Aug 30 2016
,
Aug 30 2016
===== BISECT JOB RESULTS ===== Status: completed ===== SUSPECTED CL(s) ===== Subject : Android: Trigger tab placeholder update when activity is paused Author : sievers Commit description: This makes sure we have a placeholder bitmap when we resume and don't have a frame from the renderer yet. BUG= 636630 Review-Url: https://codereview.chromium.org/2186453004 Cr-Commit-Position: refs/heads/master@{#414822} Commit : 0a1f48d5cb6a5d263f85bc5322912b7f703c40ac Date : Fri Aug 26 21:50:30 2016 ===== TESTED REVISIONS ===== Revision Mean Std Dev N Good? chromium@414803 141394 182.91 5 good chromium@414814 141394 182.91 5 good chromium@414819 141394 182.91 5 good chromium@414821 141394 182.91 5 good chromium@414822 2410578 182.91 5 bad <-- chromium@414824 2410578 182.91 5 bad chromium@414844 2410578 182.91 5 bad Bisect job ran on: android_nexus5X_perf_bisect Bug ID: 641962 Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --also-run-disabled-tests memory.top_10_mobile Test Metric: background-memory:chrome:all_processes:reported_by_os:system_memory:ashmem:proportional_resident_size_avg/after_http_yandex_ru_touchsearch_text_science Relative Change: 1604.87% Score: 99.9 Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus5X_perf_bisect/builds/618 Job details: https://chromeperf.appspot.com/buildbucket_job_status/9002963010413304000 Not what you expected? We'll investigate and get back to you! https://chromeperf.appspot.com/bad_bisect?try_job_id=5255948380143616 | O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq | X | for more information addressing perf regression bugs. For feedback, | / \ | file a bug with component Tests>AutoBisect. Thank you!
,
Aug 30 2016
===== BISECT JOB RESULTS ===== Status: completed ===== SUSPECTED CL(s) ===== Subject : Android: Trigger tab placeholder update when activity is paused Author : sievers Commit description: This makes sure we have a placeholder bitmap when we resume and don't have a frame from the renderer yet. BUG= 636630 Review-Url: https://codereview.chromium.org/2186453004 Cr-Commit-Position: refs/heads/master@{#414822} Commit : 0a1f48d5cb6a5d263f85bc5322912b7f703c40ac Date : Fri Aug 26 21:50:30 2016 ===== TESTED REVISIONS ===== Revision Mean Std Dev N Good? chromium@414803 141394 182.91 5 good chromium@414814 141394 182.91 5 good chromium@414819 141394 182.91 5 good chromium@414821 141394 182.91 5 good chromium@414822 2410578 182.91 5 bad <-- chromium@414824 2410578 182.91 5 bad chromium@414844 2410578 182.91 5 bad Bisect job ran on: android_nexus5X_perf_bisect Bug ID: 641962 Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --also-run-disabled-tests memory.top_10_mobile Test Metric: background-memory:chrome:all_processes:reported_by_os:system_memory:ashmem:proportional_resident_size_avg/after_http_m_intl_taobao_com_group_purchase_html Relative Change: 1604.87% Score: 99.9 Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus5X_perf_bisect/builds/619 Job details: https://chromeperf.appspot.com/buildbucket_job_status/9002962624753336224 Not what you expected? We'll investigate and get back to you! https://chromeperf.appspot.com/bad_bisect?try_job_id=5838779200307200 | O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq | X | for more information addressing perf regression bugs. For feedback, | / \ | file a bug with component Tests>AutoBisect. Thank you!
,
Aug 30 2016
sievers, could you have a look at this? Looks like your CL caused regressions on tons of memory metrics on several different bots. note that you can run the benchmark on trybots to test a change using e.g. src/tools/perf/run_benchmark try android-nexus5 memory.top_10_mobile If you don't find a fix soon, maybe we can revert that CL? Possible other duplicates of this issue are issue 641956 and issue 641980.
,
Aug 31 2016
Ping, any updates here? This is a large regression, and needs to be fixed soon. (Note: this landed after branch point, so it's not blocking 54)
,
Sep 1 2016
,
Sep 1 2016
,
Sep 1 2016
I will revert the CL now.
,
Sep 1 2016
Please wait a bit. We were hoping to ping the author and reviewers of the CL soon as MTV wakes up. It's unclear whether the revert might also have other unintended consequences.
,
Sep 1 2016
Okay, held back the revert: https://codereview.chromium.org/2306623002
,
Sep 1 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/be54d2fa0b7a766b0391de082eebeeee53888397 commit be54d2fa0b7a766b0391de082eebeeee53888397 Author: mustaq <mustaq@chromium.org> Date: Thu Sep 01 17:44:45 2016 Revert of Android: Trigger tab placeholder update when activity is paused (patchset #3 id:40001 of https://codereview.chromium.org/2186453004/ ) Reason for revert: Seeing lots of memory regression since this landed. Bisects point to this CL with high confidence. See crbug.com/641962 . Original issue's description: > Android: Trigger tab placeholder update when activity is paused > > This makes sure we have a placeholder bitmap when we resume > and don't have a frame from the renderer yet. > > BUG= 636630 > > Committed: https://crrev.com/0a1f48d5cb6a5d263f85bc5322912b7f703c40ac > Cr-Commit-Position: refs/heads/master@{#414822} TBR=dtrainor@chromium.org,sievers@chromium.org # Not skipping CQ checks because original CL landed more than 1 days ago. BUG= 636630 , 641962 Review-Url: https://codereview.chromium.org/2306623002 Cr-Commit-Position: refs/heads/master@{#415990} [modify] https://crrev.com/be54d2fa0b7a766b0391de082eebeeee53888397/chrome/android/java/src/org/chromium/chrome/browser/ChromeActivity.java
,
Sep 1 2016
Ha, thanks to the memory infra dumps in the traces, we immediately were able to figure this out: This is ashmem allocated from the readback path, which gets lazily allocated. Before my change it seems like the test never triggered a tab contents bitmap readback. After my patch it does. However, in the real world it constantly does this during tab switching. So this is less of a regression in the real world, but it's a great find since it made me realize that we probably never ever trim this ashmem, so there is room to improve memory not only in the background but also in the foreground.
,
Sep 2 2016
I'm glad to hear the comments in #27! Also note: please *DO NOT* merge the revert into M54. The original CL introducing the regression landed after branch point anyway. The urgency to eliminate the regression was because it was blocking some planned benchmark/infra changes we have; and we would rather have numbers at their baseline to simplify the required accounting. On the chrome perf graphs (link in #0) numbers *did* return to their baseline after the revert, so that is great! I'm still waiting for this to roll into internal bots in order to confirm everything is fine there too. So please wait a bit for us on the blocking bug to finish the changes on our side before trying to re-land or land a new fix. Should be just a matter of a couple of days.
,
Sep 2 2016
adding JasonT as an FYI
,
Sep 2 2016
All metrics has returned to normal levels after the revert except the ones on master=ChromiumPerfFyi. I isolated the latter group to crbug.com/643670. This bug can be closed once the blocking bug is fixed. Thanks perezju@ & sievers@.
,
Sep 3 2016
,
Sep 6 2016
All metrics have recovered.
,
Sep 6 2016
Note, authors of the original CL might still want to keep this open to track work to re-land while avoiding the regression. sievers, if there is a separate bug to track that, please feel free to close this but do copy over the blocking bug.
,
Sep 27 2016
,
Sep 27 2016
Ok I'm relanding my patch in https://codereview.chromium.org/2372393002/ It should not cause a test regression anymore now that https://codereview.chromium.org/2336043004/ is landed. (Again, note that this is an overall improvement to free this ashmem occasionally, which we used to never do. And the test case is arguably unrepresentative wrt the baseline as it previously never caused a readback, but in the real world we would do a read back as soon as a user switches tabs.)
,
Sep 27 2016
,
Sep 28 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/98f7c133cdef22c26922c56371f84989f7a2cd60 commit 98f7c133cdef22c26922c56371f84989f7a2cd60 Author: sievers <sievers@chromium.org> Date: Tue Sep 27 22:11:04 2016 Android: Free GLHelper context ashmem when it makes sense Free the 'mapped_memory' that is used in readbacks through CopyFromCompositingSurface() when there is memory pressure in the system (or no activities are running and a readback completes). This memory gets lazily allocated during the first readback and it'd be the size of the texture after scaling. Freeing it is harmless at a slight reallocation cost when the next readback is triggered. BUG= 641962 TBR=boliu@chromium.org Review-Url: https://codereview.chromium.org/2336043004 Cr-Commit-Position: refs/heads/master@{#421357} [modify] https://crrev.com/98f7c133cdef22c26922c56371f84989f7a2cd60/content/browser/renderer_host/render_widget_host_view_android.cc [modify] https://crrev.com/98f7c133cdef22c26922c56371f84989f7a2cd60/content/common/gpu/client/context_provider_command_buffer.cc [modify] https://crrev.com/98f7c133cdef22c26922c56371f84989f7a2cd60/content/common/gpu/client/context_provider_command_buffer.h
,
Sep 28 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/5d34220f9b827cfd514e03ba6215286f2467f29d commit 5d34220f9b827cfd514e03ba6215286f2467f29d Author: sievers <sievers@chromium.org> Date: Tue Sep 27 23:59:31 2016 Reland of Android: Trigger tab placeholder update when activity is paused (patchset #1 id:1 of https://codereview.chromium.org/2306623002/ ) Reason for reland (revert-of-revert): Now that https://codereview.chromium.org/2336043004/ has landed, we will free GL context-related ashmem after the readback and when the app goes to the background. Original issue's description: > Revert of Android: Trigger tab placeholder update when activity is paused (patchset #3 id:40001 of https://codereview.chromium.org/2186453004/ ) > > Reason for revert: > Seeing lots of memory regression since this landed. Bisects point to this CL with high confidence. See crbug.com/641962 . > > Original issue's description: > > Android: Trigger tab placeholder update when activity is paused > > > > This makes sure we have a placeholder bitmap when we resume > > and don't have a frame from the renderer yet. > > > > BUG= 636630 > > > > Committed: https://crrev.com/0a1f48d5cb6a5d263f85bc5322912b7f703c40ac > > Cr-Commit-Position: refs/heads/master@{#414822} > > TBR=dtrainor@chromium.org,sievers@chromium.org > # Not skipping CQ checks because original CL landed more than 1 days ago. > BUG= 636630 , 641962 > > Committed: https://crrev.com/be54d2fa0b7a766b0391de082eebeeee53888397 > Cr-Commit-Position: refs/heads/master@{#415990} TBR=dtrainor@chromium.org,primiano@chromium.org,mustaq@chromium.org # Not skipping CQ checks because original CL landed more than 1 days ago. BUG= 636630 , 641962 Review-Url: https://codereview.chromium.org/2372393002 Cr-Commit-Position: refs/heads/master@{#421390} [modify] https://crrev.com/5d34220f9b827cfd514e03ba6215286f2467f29d/chrome/android/java/src/org/chromium/chrome/browser/ChromeActivity.java
,
Sep 29 2016
,
Sep 29 2016
,
Sep 29 2016
So looks like the mapped memory actually doesn't get freed. I think I'm missing some synchronization with the service (since the token to clear the free memory is updated at the end of the readback from UnmapBuffer() and DeleteBuffer()), but even if I try a Finish() before FreeUnusedSharedMemory() (and after the aforementioned calls) it doesn't work. Debugging further... I guess there's always the option of deleting the whole context when we go into the background, maybe I'll just do that.
,
Sep 29 2016
That'd require less ways to free memory on context providers too. :)
,
Sep 29 2016
But then I think I want to remove the part I added where it tries to free up things when there's memory pressure after every readback, because when you're in that situation then creating and deleting a GL context every time you switch tabs seems silly. But maybe that's ok - that was just sort of icing on the cake.
,
Sep 30 2016
Ah, mystery solved. It's the combination of this:
The async readback creates Queries. They use the same mapped memory allocator as the mapped buffers. But the query mapped memory gets freed only with an explicit call to QueryTracker::Shrink()
Then FencedAllocator::InUse() will see a block that's still in use.
void GLES2Implementation::FreeUnusedSharedMemory() {
mapped_memory_->FreeUnused(); // doesn't call query_tracker_->Shrink() unlike FreeEverything()
}
,
Oct 3 2016
So #39 is causing crashes https://bugs.chromium.org/p/chromium/issues/detail?id=652050 and we probably need to revert that. Daniel mentioned he was thinking about just deleting the context entirely when the app is backgrounded. There is https://codereview.chromium.org/2383613005/ which exists, but I'm not sure if it works correctly. bo can you help out with what you think the right way forward is?
,
Oct 3 2016
Ok, the history I got.. https://codereview.chromium.org/2186453004 is part of fix for a polish bug ( crbug.com/636630 ) in clank, and caused this memory regression The fix for the memory regression (https://codereview.chromium.org/2336043004) caused a null pointer crash: crbug.com/652050. Also apparently it did *not* fix the memory regression. So then I think the safest thing right now is to just revert https://codereview.chromium.org/2336043004 to fix the crash first. We should have a bit more time to fix memory regression, or make the decision to fix or revert. Also the polish bug is blocking m54 stable, because reasons.
,
Oct 3 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/4201167e8b04762db61cc037cc83537c298e2f53 commit 4201167e8b04762db61cc037cc83537c298e2f53 Author: boliu <boliu@chromium.org> Date: Mon Oct 03 22:42:58 2016 Revert of Android: Free GLHelper context ashmem when it makes sense (patchset #4 id:60001 of https://codereview.chromium.org/2336043004/ ) Reason for revert: Causes a null pointer crash. And also apparently doesn't work? BUG=652050 Original issue's description: > Android: Free GLHelper context ashmem when it makes sense > > Free the 'mapped_memory' that is used in readbacks through > CopyFromCompositingSurface() when there is memory pressure > in the system (or no activities are running and a readback > completes). > > This memory gets lazily allocated during the first readback > and it'd be the size of the texture after scaling. Freeing it > is harmless at a slight reallocation cost when the next readback > is triggered. > > BUG= 641962 > TBR=boliu@chromium.org > > Committed: https://crrev.com/98f7c133cdef22c26922c56371f84989f7a2cd60 > Cr-Commit-Position: refs/heads/master@{#421357} TBR=danakj@chromium.org,sievers@chromium.org # Not skipping CQ checks because original CL landed more than 1 days ago. BUG= 641962 Review-Url: https://codereview.chromium.org/2392623003 Cr-Commit-Position: refs/heads/master@{#422574} [modify] https://crrev.com/4201167e8b04762db61cc037cc83537c298e2f53/content/browser/renderer_host/render_widget_host_view_android.cc [modify] https://crrev.com/4201167e8b04762db61cc037cc83537c298e2f53/content/common/gpu/client/context_provider_command_buffer.cc [modify] https://crrev.com/4201167e8b04762db61cc037cc83537c298e2f53/content/common/gpu/client/context_provider_command_buffer.h
,
Oct 5 2016
Be aware that the memory test regression still exists until https://codereview.chromium.org/2383613005/ lands (hopefully tonight).
,
Oct 5 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/c9afe77bc5411c3bf0760c9acc37941e6e24540a commit c9afe77bc5411c3bf0760c9acc37941e6e24540a Author: sievers <sievers@chromium.org> Date: Wed Oct 05 01:42:52 2016 Android: Second attempt at freeing GLHelper ashmem This tears down the GLHelperHolder in RenderWidgetHostViewAndroid when all activities get stopped. But it only does this if there are no more readback requests in flight. BUG= 641962 , 636630 Review-Url: https://codereview.chromium.org/2383613005 Cr-Commit-Position: refs/heads/master@{#423045} [modify] https://crrev.com/c9afe77bc5411c3bf0760c9acc37941e6e24540a/content/browser/renderer_host/render_widget_host_view_android.cc [modify] https://crrev.com/c9afe77bc5411c3bf0760c9acc37941e6e24540a/content/browser/renderer_host/render_widget_host_view_android.h
,
Oct 5 2016
Looks good again after aforementioned patch landed.
,
Oct 5 2016
Issue 651098 has been merged into this issue.
,
Oct 6 2016
Thanks! Just confirming some nice ~1.5MB improvements in ashmem and overall pss for Chrome on Android https://chrome-health.googleplex.com/health-plan/android-chrome/memory/nexus5/?from_commit=421806&to_commit=423417 |
|||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||
Comment 1 by mustaq@chromium.org
, Aug 29 2016