New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 659066 link

Starred by 2 users

Issue metadata

Status: WontFix
Owner:
Closed: Feb 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression



Sign in to add a comment

2.8% regression in memory.top_10_mobile at 425605:425619

Project Member Reported by alexclarke@chromium.org, Oct 25 2016

Issue description

See the link to graphs below.
 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=659066

Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?keys=agxzfmNocm9tZXBlcmZyFAsSB0Fub21hbHkYgICg7bHqrwoM


Bot(s) for this bug's original alert(s):

android-nexus9
Project Member

Comment 3 by 42576172...@developer.gserviceaccount.com, Oct 25 2016


===== BISECT JOB RESULTS =====
Status: completed


=== Bisection aborted ===
The bisect was aborted because The metric values for the initial "good" and "bad" revisions do not represent a clear regression.
Please contact the the team (see below) if you believe this is in error.

=== Warnings ===
The following warnings were raised by the bisect job:

 * Bisect failed to reproduce the regression with enough confidence.

===== TESTED REVISIONS =====
Revision         Mean       Std Dev  N   Good?
chromium@425604  198104405  2211608  18  good
chromium@425619  199327744  2843351  18  bad

Bisect job ran on: android_nexus9_perf_bisect
Bug ID: 659066

Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests memory.top_10_mobile
Test Metric: memory:chrome:all_processes:reported_by_os:gpu_memory:proportional_resident_size_avg/foreground/http_m_youtube_com_results_q_science
Relative Change: 0.70%
Score: 0

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus9_perf_bisect/builds/2212
Job details: https://chromeperf.appspot.com/buildbucket_job_status/8997836052732832768


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5876196141694976

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
Project Member

Comment 5 by 42576172...@developer.gserviceaccount.com, Oct 25 2016


===== BISECT JOB RESULTS =====
Status: failed


===== TESTED REVISIONS =====
Revision         Mean       Std Dev  N   Good?
chromium@422000  191306138  2174373  5   good
chromium@424500  192344519  2471347  18  good
chromium@425125  193094542  2943322  18  good
chromium@425750  196269397  2071174  12  bad
chromium@427000  197230592  2965821  5   bad

Bisect job ran on: android_nexus9_perf_bisect
Bug ID: 659066

Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=http.m.youtube.com.results.q.science memory.top_10_mobile
Test Metric: memory:chrome:all_processes:reported_by_os:gpu_memory:proportional_resident_size_avg/foreground/http_m_youtube_com_results_q_science
Relative Change: 3.10%
Score: 0

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus9_perf_bisect/builds/2214
Job details: https://chromeperf.appspot.com/buildbucket_job_status/8997814577932324912


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5901491351584768

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
Project Member

Comment 7 by 42576172...@developer.gserviceaccount.com, Oct 26 2016

Cc: e...@chromium.org
Owner: e...@chromium.org

=== Auto-CCing suspected CL author eae@chromium.org ===

Hi eae@chromium.org, the bisect results pointed to your CL below as possibly
causing a regression. Please have a look at this info and see whether
your CL be related.


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : Remove unsafe getFontMetrics methods
Author  : eae
Commit description:
  
Remove the ComputedStyle::getFontMetrics and Font::getFontMetrics helper
methods as they both assume that the primaryFont method always returns a
valid SimpleFontDataObject. That assumption is both incorrect and unsafe
as it can return nullptr in case fallback to the last-resort-font fails.

By replacing the convince calls with explicit calls and null checks that
type of problems becomes more apparent and can be handled appropriately.

R=wkorman@chromium.org
BUG=655815
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_layout_tests_slimming_paint_v2

Review-Url: https://codereview.chromium.org/2416033003
Cr-Commit-Position: refs/heads/master@{#425437}
Commit  : fdcbab80bc37108f6e03d6906f27831228690350
Date    : Fri Oct 14 19:51:13 2016


===== TESTED REVISIONS =====
Revision         Mean       Std Dev  N  Good?
chromium@422000  192479232  2415579  8  good
chromium@424500  191253709  1875750  5  good
chromium@425125  190729421  2584479  5  good
chromium@425282  192092570  2297315  5  good
chromium@425360  190545920  2117533  5  good
chromium@425399  191135744  2536748  8  good
chromium@425419  190824448  2626526  8  good
chromium@425429  190365696  1146813  8  good
chromium@425434  191791104  2095981  8  good
chromium@425436  191777997  1464257  5  good
chromium@425437  197125734  3176164  5  bad    <--
chromium@425438  198593741  3223412  5  bad
chromium@425750  198908314  1793341  5  bad
chromium@427000  196116480  2266988  8  bad

Bisect job ran on: android_nexus9_perf_bisect
Bug ID: 659066

Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=http.m.youtube.com.results.q.science memory.top_10_mobile
Test Metric: memory:chrome:all_processes:reported_by_os:gpu_memory:proportional_resident_size_avg/foreground/http_m_youtube_com_results_q_science
Relative Change: 2.04%
Score: 99.0

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus9_perf_bisect/builds/2217
Job details: https://chromeperf.appspot.com/buildbucket_job_status/8997749640850039200


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=6096348179333120

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
Cc: primiano@chromium.org perezju@chromium.org
Note the memory.top_10_mobile regression appears to be only on gpu memory and on that particular youtube story. Still, from the alerts, it does seem to reproduce reliably on several different device types.

+primiano FYI
So I looked more in the traces and logs and unfortunately there seems a consistent regression:
- about gpu memory, both on metric reported by the chrome gpu probe (what chrome estimates to be using in terms of gpu resources) and the value reported by the OS.
- seems to affect only the youtube
- seems to affect all devices

The regression range is quite large (425416 - 425604), so I kicked another bisect (#9) on the GPU metric to see if this confirms #7


https://chromeperf.appspot.com/report?sid=11af42e5d7d28050d0c63c3f23ceb6a38eff00a76c8d6a2e7d96813cf8898a7b&start_rev=425200&end_rev=425913&rev=425619

425416 - 425604

Comment 11 by e...@chromium.org, Oct 31 2016

My change from #7 adds a set of null-checks to text rendering, seems unlikely to have an effect on GPU memory but let's see what the bisect comes back with.
Project Member

Comment 12 by 42576172...@developer.gserviceaccount.com, Oct 31 2016


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : Remove unsafe getFontMetrics methods
Author  : eae
Commit description:
  
Remove the ComputedStyle::getFontMetrics and Font::getFontMetrics helper
methods as they both assume that the primaryFont method always returns a
valid SimpleFontDataObject. That assumption is both incorrect and unsafe
as it can return nullptr in case fallback to the last-resort-font fails.

By replacing the convince calls with explicit calls and null checks that
type of problems becomes more apparent and can be handled appropriately.

R=wkorman@chromium.org
BUG=655815
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_layout_tests_slimming_paint_v2

Review-Url: https://codereview.chromium.org/2416033003
Cr-Commit-Position: refs/heads/master@{#425437}
Commit  : fdcbab80bc37108f6e03d6906f27831228690350
Date    : Fri Oct 14 19:51:13 2016


===== TESTED REVISIONS =====
Revision         Mean      Std Dev  N  Good?
chromium@425399  40528254  56044.6  5  good
chromium@425425  40500251  80462.1  5  good
chromium@425432  40522642  68195.5  5  good
chromium@425435  40540062  137223   5  good
chromium@425436  40541256  79410.8  5  good
chromium@425437  44097653  13361.0  5  bad    <--
chromium@425438  44037602  81263.5  5  bad
chromium@425451  44289310  122758   5  bad
chromium@425502  44228158  186904   5  bad
chromium@425604  44804622  1256851  5  bad

Bisect job ran on: android_nexus9_perf_bisect
Bug ID: 659066

Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=http.m.youtube.com.results.q.science memory.top_10_mobile
Test Metric: memory:chrome:gpu_process:reported_by_chrome:effective_size_avg/foreground/http_m_youtube_com_results_q_science
Relative Change: 10.55%
Score: 99.9

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus9_perf_bisect/builds/2224
Job details: https://chromeperf.appspot.com/buildbucket_job_status/8997278229931472960


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5906205627645952

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
Yeah I just took a look at the CL and I am extremely skeptical that #7 is the real culprit. also the output of that bisect shows jumps in the order of <1 MB. Very likely just hit the noise of bisecting on the grand total value.
Here I'm after a multi-MB on gpu, which is a quite sharp jump over a clean metric (see link above in #10). Hopefully the bisect on #9 will give us the right answer.
The bisect on #9 returned on #12 and is pointing to the same CL.
Ping.

Note the bisect in #9 did flag the same CL r425437, and cleanly reproduced a ~3MiB regression.

eae@, could you look into this?
cr659066.png
16.4 KB View Download
Cc: benhenry@chromium.org
 Issue 672281  has been merged into this issue.
Cc: drott@chromium.org
+benhenry, as this quite big regression seems real.

Thanks Juan for triple-checking. Looks like my speculation was wrong and this is indeed a real regression.

I took a look to the traces for youtube from the dashboard.
This is what I observe by diffing before [1] vs after [2]
- the font_caches/shape_caches marginally decreases (-0.4 K)
- the font_cache/font_platform_data_cache marginally increases (+0.2 K)
So far we are talking about ridiculously small amounts of memory. I am pointing them out just to highlight some possible correlations (note: these font-cache metrics use to be extermely stable, so this small change is real and not noise).

Now the big part:
GL memory in the gpu process increases by +3 MB. This increase is confirmed by both the memory reported by the chrome probe (gpu process, column: gpu -> gpu/gl) and the memtrack_pss reported by the gpu driver (gpu process, column: gpu -> memtrack_pss)

So, being absolute ignorant about fonts, my reading of the situation is that this CL seems to require the gl driver to do *something* and keeps textures around.
eae@ does this make any sense to you?
I glanced through https://codereview.chromium.org/2416033003 and I see that lot of places got calls to getPrimaryFont().
getPrimaryFont() seems to have some code paths that populates some internal font cache [3]. Maybe this ends up triggering some GL function to compute font metrics that we weren't hitting before ?

CC-ing drott in case he has some insights here.

[1] https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_4-2016-10-14_14-11-33-2220.html
[2] https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_4-2016-10-16_20-34-57-18449.html
[3] https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/fonts/FontFallbackList.h?rcl=1482135521&l=76

Comment 18 by drott@chromium.org, Dec 19 2016

Cc: bunge...@chromium.org
Thanks for the analysis, Primiano. I unfortunately do not have good insight into how a differing usage of font metrics functions leads to a difference in GL memory consumption. Ben, any suggestions here? 

Speculatively, Emil, could it be that the primaryFont() calls resolve to different font objects and we do metrics computation on more font instances than before, which has the GL side effects?

Tangentially relevant: I am planning to work on untangling FontPlatformData/SimpleFontData and simplifying our font handle caching. I have vague hopes this helps here, too.

Another thing I realized by still looking at the traces in #17 is the fact that also the fonts in the skia cache are different. Again the diff in terms of memory in skia is ridiculous (~KB) but might be a relevant signal.
After the CL I see an extra sans-serif font (sans_serif_3, see screenshot) and a bunch of extra skia/gpu_resources.
Hope it helps and I'm not just spamming the bug with noise.
before.png
137 KB View Download
after.png
137 KB View Download
I'm not entirely sure how to get to the data (cached webpage) this test is actually using, but does it look different after the blamed CL? Is it possible something just changed size or is drawing differently so the compositor just has more to composite (something is no longer just 'blank' and optimized away)?

As far as Skia's use of gpu, there is only ever one 'gray' texture, one 'lcd texture', and one 'color' texture for caching glyphs. So there wouldn't be more gpu memory used just because another font got used (unless the numbers coming back from the gpu are reporting parts of buffers that haven't been touched yet as 'free').
You should be able to run locally on a Nexus 5 attached to your workstation with something like:

  $SRC/tools/perf/run_benchmark run --browser android-chromium --story-filter youtube.com memory.top_10_mobile

(that should also download the required cached webpages for you).

Also on a checkout with a local patch you can run:

  $SRC/tools/perf/run_benchmark try android-nexus5 memory.top_10_mobile --story-filter youtube.com

to run a tryjob comparing metrics with/without the patch.
Status: Assigned (was: Untriaged)
Status: WontFix (was: Assigned)
These only failed on N9s, which we just deprecated. WontFix.
Labels: Performance-Memory

Sign in to add a comment