New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 673185 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
not on Chrome anymore
Closed: Dec 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression

Blocked on:
issue 673457



Sign in to add a comment

332.5% regression in memory.blink_memory_mobile at 437651:437694

Project Member Reported by bashi@chromium.org, Dec 11 2016

Issue description

See the link to graphs below.
 

Comment 1 by bashi@chromium.org, Dec 11 2016

All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=673185

Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?keys=agxzfmNocm9tZXBlcmZyFAsSB0Fub21hbHkYgICg1-PirgoM


Bot(s) for this bug's original alert(s):

android-nexus5X
Project Member

Comment 4 by 42576172...@developer.gserviceaccount.com, Dec 12 2016

Cc: sunn...@chromium.org
Owner: sunn...@chromium.org

=== PERF REGRESSION ===


=== Auto-CCing suspected CL author sunnyps@chromium.org ===

Hi sunnyps@chromium.org, the bisect results pointed to your CL, please take a look at the
results.


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : gpu: Thread-safe command buffer state lookup.
Author  : sunnyps
Commit description:
  
This makes client side (in-process / proxy) command buffer state thread-
safe and adds release count to the client-server shared state. This
allows the compositor thread to monitor worker context sync token in the
service.

Client side command buffer state is synchronized using a lock. Extra
caching of the state is added to command buffer helper so that it
doesn't acquire the lock for every command.

Also fixes command buffer memory tracing so that it happens on the same
thread which the context provider is bound to.

R=piman@chromium.org,jbauman@chromium.org
BUG= 514813 , 638862 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2550583002
Cr-Commit-Position: refs/heads/master@{#437651}
Commit  : 1285660590d371fedced6253c43be569c2d054ee
Date    : Fri Dec 09 21:09:57 2016


===== TESTED REVISIONS =====
Revision         Mean     Std Dev   N   Good?
chromium@437650  969856   12995377  60  good
chromium@437651  4194240  0.0       12  bad    <--
chromium@437652  4194240  0.0       12  bad
chromium@437653  4194240  0.0       9   bad
chromium@437656  4194240  0.0       11  bad
chromium@437661  4194240  0.0       12  bad
chromium@437672  4194240  0.0       11  bad
chromium@437694  4194240  0.0       10  bad

Bisect job ran on: android_nexus5X_perf_bisect
Bug ID: 673185

Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests memory.blink_memory_mobile
Test Metric: memory:chrome:renderer_processes:reported_by_chrome:gpu:effective_size_avg/memory:chrome:renderer_processes:reported_by_chrome:gpu:effective_size_avg
Relative Change: 332.46%

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus5X_perf_bisect/builds/978
Job details: https://chromeperf.appspot.com/buildbucket_job_status/8993528318717572096


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5575345011425280

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
Project Member

Comment 5 by 42576172...@developer.gserviceaccount.com, Dec 12 2016


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : gpu: Thread-safe command buffer state lookup.
Author  : sunnyps
Commit description:
  
This makes client side (in-process / proxy) command buffer state thread-
safe and adds release count to the client-server shared state. This
allows the compositor thread to monitor worker context sync token in the
service.

Client side command buffer state is synchronized using a lock. Extra
caching of the state is added to command buffer helper so that it
doesn't acquire the lock for every command.

Also fixes command buffer memory tracing so that it happens on the same
thread which the context provider is bound to.

R=piman@chromium.org,jbauman@chromium.org
BUG= 514813 , 638862 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2550583002
Cr-Commit-Position: refs/heads/master@{#437651}
Commit  : 1285660590d371fedced6253c43be569c2d054ee
Date    : Fri Dec 09 21:09:57 2016


===== TESTED REVISIONS =====
Revision         Mean     Std Dev   N   Good?
chromium@437650  830048   12107728  60  good
chromium@437651  4194240  0.0       12  bad    <--
chromium@437652  4194240  0.0       11  bad
chromium@437653  4194240  0.0       11  bad
chromium@437656  4194240  0.0       11  bad
chromium@437661  4194240  0.0       11  bad
chromium@437672  4194240  0.0       12  bad
chromium@437694  4194240  0.0       10  bad

Bisect job ran on: android_nexus5X_perf_bisect
Bug ID: 673185

Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests memory.blink_memory_mobile
Test Metric: memory:chrome:renderer_processes:reported_by_chrome:gpu:effective_size_avg/memory:chrome:renderer_processes:reported_by_chrome:gpu:effective_size_avg
Relative Change: 405.30%

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus5X_perf_bisect/builds/977
Job details: https://chromeperf.appspot.com/buildbucket_job_status/8993528327177687648


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5849010596741120

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!

Comment 6 by bashi@chromium.org, Dec 12 2016

Cc: piman@chromium.org jbau...@chromium.org
Owner: jbau...@chromium.org
Status: Assigned (was: Untriaged)
sunnyps@ and piman@ seem OOO.

jbauman@: Could you please triage this? Is this an expected result of https://codereview.chromium.org/2550583002 ?

Cc: ericrk@chromium.org
Maybe this could be related to a change in the way that memory dumps are reported. Or it's possible scheduling has changed in a way that would cause issues.

Comment 8 by ericrk@chromium.org, Dec 13 2016

Cc: primiano@chromium.org
This doesn't really look like a regression to me, but more of an issue with how our memory reporting system works.

What appears to have happened is that a number of pages stopped reporting any memory for the memory:chrome:renderer_processes:reported_by_chrome:gpu:effective_size_avg category. Rather than averaging these values in as "0", the system ignores them and only considers the other, much higher, values when computing the average.

It seems like we should count all pages when calculating an average for a category - otherwise it makes it hard to compare between category averages.

primiano@, WDYT?

There's the side issue of whether these values should have gone to 0, but it doesn't seem that weird to me (depends on timing)... I'll look a bit more.
Project Member

Comment 9 by bugdroid1@chromium.org, Dec 13 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c7b57e009932dd8c560120b9c350cbc4ba8a2781

commit c7b57e009932dd8c560120b9c350cbc4ba8a2781
Author: jbauman <jbauman@chromium.org>
Date: Tue Dec 13 21:49:52 2016

Fix memory dumping for command buffers with no gr_context_.

BUG= 673185 

Review-Url: https://codereview.chromium.org/2567283003
Cr-Commit-Position: refs/heads/master@{#438295}

[modify] https://crrev.com/c7b57e009932dd8c560120b9c350cbc4ba8a2781/content/common/gpu/client/context_provider_command_buffer.cc

Status: Fixed (was: Assigned)
I think that patch should fix the memory dumping issue.

Comment 11 by bashi@chromium.org, Dec 14 2016

Confirmed. Thanks for the fix!
Re #8: 
ericrk: what you describe is correct but was a conscious decision. The rationale we applied is: if something decides to not report a memory row, it means that such row doesn't apply for that particular scenario and we care only about monitoring the ones where the author of the probe (MDP) decided it makes sense (read the MDP has the power to emit a 0 or not emit anything, and they have different semantics).
From a concrete viewpoint, the case we thought about was: if we have a row for memory used for video decoding, that row should be averaged only for the pages that actually play a video. The other pages that don't have a video shouldn't weight down the average with zeros.

Anyways, even if we made the opposite choice (considering empty as a 0) I don't think it would have changed much here (the metric would have gone down instead of up).
IMHO the real issue here is the fact that we alert on the average of different pages, instead of per single page.
This is non ideal for a wide number of reasons. Unfortunately that is a tradeoff that we had to accept today given the current sheriffing bandwidth: bisect success ratio and lack of cleverness of the dashboard (w.r.t. grouping similar regressions on different pages). All these are aspects that the speed-infra is working on, so one day we should be able to move away from this situation.

In other words: this wouldn't have been a problem if the alert was on the specific pages and not on the average of the pages. But doing such creates too much sheriffing load, as it multiplies the alerts x10.

Blockedon: 673457

Sign in to add a comment