332.5% regression in memory.blink_memory_mobile at 437651:437694
Issue description: See the link to graphs below.
Dec 11 2016
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8993528327177687648
Dec 11 2016
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8993528318717572096
Dec 12 2016
=== PERF REGRESSION ===
=== Auto-CCing suspected CL author sunnyps@chromium.org ===
Hi sunnyps@chromium.org, the bisect results pointed to your CL, please take a look at the results.

===== BISECT JOB RESULTS =====
Status: completed

===== SUSPECTED CL(s) =====
Subject: gpu: Thread-safe command buffer state lookup.
Author: sunnyps
Commit description:
  This makes client side (in-process / proxy) command buffer state thread-safe
  and adds release count to the client-server shared state. This allows the
  compositor thread to monitor worker context sync token in the service.

  Client side command buffer state is synchronized using a lock. Extra caching
  of the state is added to command buffer helper so that it doesn't acquire the
  lock for every command.

  Also fixes command buffer memory tracing so that it happens on the same
  thread which the context provider is bound to.

  R=piman@chromium.org,jbauman@chromium.org
  BUG=514813, 638862
  CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

  Review-Url: https://codereview.chromium.org/2550583002
  Cr-Commit-Position: refs/heads/master@{#437651}
Commit: 1285660590d371fedced6253c43be569c2d054ee
Date: Fri Dec 09 21:09:57 2016

===== TESTED REVISIONS =====
Revision          Mean     Std Dev   N   Good?
chromium@437650   969856   12995377  60  good
chromium@437651   4194240  0.0       12  bad  <--
chromium@437652   4194240  0.0       12  bad
chromium@437653   4194240  0.0       9   bad
chromium@437656   4194240  0.0       11  bad
chromium@437661   4194240  0.0       12  bad
chromium@437672   4194240  0.0       11  bad
chromium@437694   4194240  0.0       10  bad

Bisect job ran on: android_nexus5X_perf_bisect
Bug ID: 673185

Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests memory.blink_memory_mobile
Test Metric: memory:chrome:renderer_processes:reported_by_chrome:gpu:effective_size_avg/memory:chrome:renderer_processes:reported_by_chrome:gpu:effective_size_avg
Relative Change: 332.46%

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus5X_perf_bisect/builds/978
Job details: https://chromeperf.appspot.com/buildbucket_job_status/8993528318717572096

Not what you expected? We'll investigate and get back to you!
https://chromeperf.appspot.com/bad_bisect?try_job_id=5575345011425280

Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq for more information addressing perf regression bugs. For feedback, file a bug with component Tests>AutoBisect. Thank you!
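The suspect commit description mentions guarding the client-side command buffer state with a lock while caching a snapshot in the helper so the lock isn't taken for every command. A rough Python sketch of that pattern (illustrative only; the real code is C++ in the gpu command buffer client, and these class names are made up):

```python
import threading

class SharedState:
    """Illustrative stand-in for client-side command buffer state shared
    across threads; not the actual Chromium classes."""
    def __init__(self):
        self._lock = threading.Lock()
        self._release_count = 0

    def update(self, release_count):
        with self._lock:
            self._release_count = max(self._release_count, release_count)

    def release_count(self):
        with self._lock:
            return self._release_count

class Helper:
    """Caches the last observed state so the lock isn't acquired per command."""
    def __init__(self, shared):
        self._shared = shared
        self._cached_release_count = 0

    def has_released(self, fence):
        # Fast path: answer from the cached snapshot without locking.
        if self._cached_release_count >= fence:
            return True
        # Slow path: refresh the cache under the lock and check again.
        self._cached_release_count = self._shared.release_count()
        return self._cached_release_count >= fence
```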
Dec 12 2016
===== BISECT JOB RESULTS =====
Status: completed

===== SUSPECTED CL(s) =====
Subject: gpu: Thread-safe command buffer state lookup.
Author: sunnyps
Commit description:
  This makes client side (in-process / proxy) command buffer state thread-safe
  and adds release count to the client-server shared state. This allows the
  compositor thread to monitor worker context sync token in the service.

  Client side command buffer state is synchronized using a lock. Extra caching
  of the state is added to command buffer helper so that it doesn't acquire the
  lock for every command.

  Also fixes command buffer memory tracing so that it happens on the same
  thread which the context provider is bound to.

  R=piman@chromium.org,jbauman@chromium.org
  BUG=514813, 638862
  CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

  Review-Url: https://codereview.chromium.org/2550583002
  Cr-Commit-Position: refs/heads/master@{#437651}
Commit: 1285660590d371fedced6253c43be569c2d054ee
Date: Fri Dec 09 21:09:57 2016

===== TESTED REVISIONS =====
Revision          Mean     Std Dev   N   Good?
chromium@437650   830048   12107728  60  good
chromium@437651   4194240  0.0       12  bad  <--
chromium@437652   4194240  0.0       11  bad
chromium@437653   4194240  0.0       11  bad
chromium@437656   4194240  0.0       11  bad
chromium@437661   4194240  0.0       11  bad
chromium@437672   4194240  0.0       12  bad
chromium@437694   4194240  0.0       10  bad

Bisect job ran on: android_nexus5X_perf_bisect
Bug ID: 673185

Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests memory.blink_memory_mobile
Test Metric: memory:chrome:renderer_processes:reported_by_chrome:gpu:effective_size_avg/memory:chrome:renderer_processes:reported_by_chrome:gpu:effective_size_avg
Relative Change: 405.30%

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus5X_perf_bisect/builds/977
Job details: https://chromeperf.appspot.com/buildbucket_job_status/8993528327177687648

Not what you expected? We'll investigate and get back to you!
https://chromeperf.appspot.com/bad_bisect?try_job_id=5849010596741120

Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq for more information addressing perf regression bugs. For feedback, file a bug with component Tests>AutoBisect. Thank you!
Dec 12 2016
sunnyps@ and piman@ seem OOO. jbauman@: Could you please triage this? Is this an expected result of https://codereview.chromium.org/2550583002 ?
Dec 13 2016
Maybe this could be related to a change in the way that memory dumps are reported. Or it's possible scheduling has changed in a way that would cause issues.
Dec 13 2016
This doesn't really look like a regression to me, but more of an issue with how our memory reporting system works. What appears to have happened is that a number of pages stopped reporting any memory for the memory:chrome:renderer_processes:reported_by_chrome:gpu:effective_size_avg category. Rather than averaging these values in as "0", the system ignores them and only considers the other, much higher, values when computing the average.

It seems like we should count all pages when calculating an average for a category - otherwise it makes it hard to compare between category averages. primiano@, WDYT?

There's the side issue of whether these values should have gone to 0, but it doesn't seem that weird to me (depends on timing)... I'll look a bit more.
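To make the averaging behavior concrete, here is a minimal sketch (not the actual dashboard code; page names and values are illustrative) of how dropping pages that report nothing inflates the category average, while counting them as zero would have pushed the metric the other way:

```python
# Illustrative samples: one page still reports a GPU effective size,
# the others stopped emitting the memory row entirely.
samples = {
    "page_a": 4194240,
    "page_b": None,
    "page_c": None,
    "page_d": None,
}

def avg_ignore_missing(values):
    """Current behavior: pages that emit nothing are dropped from the mean."""
    reported = [v for v in values if v is not None]
    return sum(reported) / len(reported) if reported else 0.0

def avg_missing_as_zero(values):
    """Alternative: treat a missing row as an explicit 0."""
    return sum(v if v is not None else 0 for v in values) / len(values)

print(avg_ignore_missing(samples.values()))   # 4194240.0 -> looks like a huge regression
print(avg_missing_as_zero(samples.values()))  # 1048560.0 -> the metric would drop instead
```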
Dec 13 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/c7b57e009932dd8c560120b9c350cbc4ba8a2781

commit c7b57e009932dd8c560120b9c350cbc4ba8a2781
Author: jbauman <jbauman@chromium.org>
Date: Tue Dec 13 21:49:52 2016

Fix memory dumping for command buffers with no gr_context_.

BUG=673185
Review-Url: https://codereview.chromium.org/2567283003
Cr-Commit-Position: refs/heads/master@{#438295}

[modify] https://crrev.com/c7b57e009932dd8c560120b9c350cbc4ba8a2781/content/common/gpu/client/context_provider_command_buffer.cc
Dec 13 2016
I think that patch should fix the memory dumping issue.
Dec 14 2016
Confirmed. Thanks for the fix!
Dec 16 2016
Re #8: ericrk: what you describe is correct, but it was a conscious decision. The rationale we applied is: if something decides not to report a memory row, it means that the row doesn't apply to that particular scenario, and we only care about monitoring the cases where the author of the probe (MDP) decided it makes sense. In other words, the MDP has the power to emit a 0 or to emit nothing, and the two have different semantics. The concrete case we thought about was: if we have a row for memory used for video decoding, that row should be averaged only over the pages that actually play a video; the pages that don't have a video shouldn't drag the average down with zeros.

Anyway, even if we had made the opposite choice (treating an empty value as 0), I don't think it would have changed much here: the metric would have gone down instead of up. IMHO the real issue is that we alert on the average across different pages instead of per single page. This is non-ideal for a number of reasons. Unfortunately it is a tradeoff we have to accept today given the current sheriffing bandwidth, the bisect success ratio, and the dashboard's lack of cleverness w.r.t. grouping similar regressions on different pages. All of these are aspects the speed-infra team is working on, so one day we should be able to move away from this situation.

In other words: this wouldn't have been a problem if the alert were on the specific pages rather than on the average of the pages. But doing that creates too much sheriffing load, as it multiplies the alerts by roughly 10x.
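A small sketch of the semantic distinction described above (illustrative only; this is not Chromium's memory-infra API, and the class and field names are made up): a dump provider either emits an explicit 0, meaning "this row applies here and happens to be zero", or emits nothing, meaning "this row doesn't apply to this page and should be excluded from the category average":

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Page:
    plays_video: bool
    decoder_bytes: int = 0

def video_decoder_row(page: Page) -> Optional[int]:
    """Hypothetical dump provider: None means 'does not apply here' (excluded
    from the average); 0 means 'applies here and is currently zero' (averaged in)."""
    if not page.plays_video:
        return None
    return page.decoder_bytes

pages = [Page(plays_video=True, decoder_bytes=8_000_000), Page(plays_video=False)]
rows = [video_decoder_row(p) for p in pages]          # [8000000, None]
reported = [r for r in rows if r is not None]
print(sum(reported) / len(reported))                  # 8000000.0 -- only the video page counts
```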
Dec 16 2016