Issue metadata
Sign in to add a comment
|
2.6%-3.8% regression in page_cycler.intl_es_fr_pt-BR at 403875:403894 |
||||||||||||||||||||
Issue descriptionSee the link to graphs below.
,
Jul 7 2016
=== Auto-CCing suspected CL author drott@chromium.org === Hi drott@chromium.org, the bisect results pointed to your CL below as possibly causing a regression. Please have a look at this info and see whether your CL be related. ===== BISECT JOB RESULTS ===== Status: completed ===== SUSPECTED CL(s) ===== Subject : Activate complex text path on Android Author : drott Commit description: In Chromium commit 10749067720a341 catapult was rolled, switching to default teardowns between pageset repeats, compare [1]. Activating complex text on Android again after this change in order to see how measurements are affected by this. If no memory regression shows up in the memory health measurements for memory pagesets this would indicate that we may have a memory-leakage over time problem in webview single-process mode but no issues in the usual tab process lifecycle on Android. Activating and TBR'ing to collect bot results, potentially reverting later if needed for further investigation. [1] https://github.com/catapult-project/catapult/issues/2294 BUG=577306 TBR=eae Review-Url: https://codereview.chromium.org/2123653005 Cr-Commit-Position: refs/heads/master@{#403884} Commit : 5009e3effbb04c74140d262fb71bc51cbf38913a Date : Wed Jul 06 11:39:32 2016 ===== TESTED REVISIONS ===== Revision Mean Std Dev N Good? chromium@403882 2630.39 16.0673 8 good chromium@403883 2606.69 32.7234 8 good chromium@403884 2711.11 42.4625 5 bad <-- chromium@403885 2723.89 47.3177 5 bad Bisect job ran on: android_nexus9_perf_bisect Bug ID: 626274 Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --also-run-disabled-tests page_cycler.intl_es_fr_pt-BR Test Metric: warm_times/page_load_time Relative Change: 3.56% Score: 98.0 Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus9_perf_bisect/builds/1871 Job details: https://chromeperf.appspot.com/buildbucket_job_status/9007804845261227776 Not what you expected? We'll investigate and get back to you! https://chromeperf.appspot.com/bad_bisect?try_job_id=5866986970546176 | O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq | X | for more information addressing perf regression bugs. For feedback, | / \ | file a bug with component Tests>AutoBisect. Thank you!
,
Jul 18 2016
eae@, some performance regressions here for activating complex on Android. Some of them are expected, some need evaluation - could you help assess the severity?
,
Jul 19 2016
We expected the micro benchmarks blink_perf.parser and blink_perf.layout to regress and isn't a big deal as they intentionally stress the text stack. We're doing more work to correctly shape and as they stress text layout that is to be expected. This also happened when we flipped the switch on destkop and there was no corresponding regression on either the page cycler on real world sites. On some platforms (notably android-nexus5X) it's a bit lager than expected though. For the intl page cyclers (page_cycler.intl_ar_fa_he etc) we're trading a small (2%-3#) regression for correctness and a much higher text rendering quality. This is on par with what we expected and should be considered WontFix. The page_cycler.top_10_mobile regressions (5%-9%) is a bit more concerning and it would be useful to get a breakdown to see which sites are showing a regression. Some perf hit is to be expected given that we're doing more work to support kerning, ligatures, and proper font selection. Finally, once we remove the simple text path there are a lot of cleanup and optimizations we can do to drive down the performance overhead. None of this can happen before we flip the switch and kill the old (unmaintained) simple text code path though.
,
Jul 21 2016
I looked in more detail at the benchmark results for page_cycler.top_10_mobile where we see a 7.7% increase overall. The other page cycler results for page_cycler.intl_ar_fa_he, page_cycler.intl_es_fr_pt-BR and page_cycler.intl_ko_th_vi are less impactful and vary between 2.2 and 3.3%. Details of the analysis in this sheet: https://docs.google.com/a/chromium.org/spreadsheets/d/1Qsjdb6X1kSdYBhTZX3V6m8fUuVNRhxl9ZYnBIm7gaug/edit?usp=sharing I focused on the slowest and the fastest Android device. Key results: * Most of the pages are only affected by a small amount up to 5%. * Three outliers are affected more: - The Wikipedia Science article is affected by 17.67% on the slower device, 15.56% on the faster device. This is attributable to an higher initial shaping cost. - Baidu.com Search for keyword Google: This affected 18.22% on the slwoer and 14.46% on the faster device: Baidu specifies "arial, sans-serif" as two fonts that are not supporting Chinese characters by default. Additional shaping run attempts are done with those fonts until a fallback font is chosen. - Justin Bieber's twitter stream is affected by around 7% - it's a little more unclear what is the bottleneck on this page. So, overall we do have hits 1) on long Wikipedia articles and on pages with suboptimal font choice and 2) long fallback chains. There are some promising memory and performance preliminary results on https://codereview.chromium.org/1980913002 for improving shaping performance due to better utilization of Skia glyph bounds caches which probably will help improving 1). For 2) we should consider implementing an optimization to do less shaping work on a font that fails to have glyphs for the characters in the text runs - possibly by providing a OpenType feature-reduced font to HarfBuzz for a scan run, or by additional HarfBuzz API for a fast check. Overall: * A small increase was to be expected since we do more correct text processing and shaping, better script run segmentation, better emoji segmentation (cmp. 526095, 504745, 580074) provide better typography and OpenType font integration by default. * We gain consistency with our other platforms since Mac, Windows, and Linux are running complex text by default already. * This change allows us to remove the simple text code path (roughly -2200 lines), which has caused a number of text rendering bugs. Removing that in turn enables us to do further optimizations and cleanups on the font code base - which again should help with performance. One low hanging change: Disabling more character-by-character scanning to actually decide which code path to use.
,
Jul 22 2016
Issue 626272 has been merged into this issue.
,
Aug 5 2016
cc-ing bmcquade as this has an impact on page load time (although these are the old page cyclers, which measure onload event, and do not have network simulation)
,
Aug 8 2016
sullivan@, do we have before/after values for this issue report using new page cyclers? Is there a link where I could look at such results? Thanks.
,
Aug 8 2016
+Kouhei and Ned, see comment 8: It looks like the data for pcv2 on this pageset only goes back to r404643? https://chromeperf.appspot.com/report?sid=029117e9ce56d2d65ef704b78d9b16b8f26ffcd9b135cc245a59a1f32a397f9b&start_rev=404643&end_rev=406212 Should drott try running a perf try job with a local revert on pcv2 to see the difference?
,
Aug 8 2016
(slightly off-topic: the naming confuses me. the data points are named: ChromiumPerf/android-nexus5/page_cycler_v2.intl_es_fr_pt-BR/firstMeaningfulPaint_avg/pcv1-cold "page_cycler_v2" vs "pcv1-cold")
,
Aug 8 2016
"pcv1-cold" here refers to the cache policy we apply for page_cycler_v2 pages. It's basically copied from the old page_cycler benchmarks, hence the name "pcv1-cold"
,
Aug 8 2016
To comment in #9: running a perf try job with a local revert against pcv2 is a good idea.
,
Aug 8 2016
Locally run results of simpletext (without CL) vs complex text (with CL applied) are here: http://roettsch.es/pcv2_complextext/results.html Warm time to onload avg improves much (probably thanks to the word cache), but cold time to onload is 3.86% worse. First meaningful paint avg is 1.98% worse, ranging from a 4% improvement to 7% hit.
,
Aug 8 2016
These were measured on a Z840 with: $ tools/perf/run_benchmark -v --browser android-chromium page_cycler_v2.intl_es_fr_pt-BR
,
Aug 10 2016
c#13 result looks w/in the noise range to me. PCv2 seems to have problem ending traces too early on intl_es_fr_pt-BR page_set. I can follow up on this Friday (Thurs is holiday in Japan)
,
Aug 18 2016
Perf sheriff ping: reminder to follow up on possible performance issues
,
Aug 22 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/5cbfbf15751cfaff33008bd0dd54699b7791158b commit 5cbfbf15751cfaff33008bd0dd54699b7791158b Author: kouhei <kouhei@chromium.org> Date: Mon Aug 22 07:09:25 2016 Introduce PageCyclerStory and let the stories wait for onload+TTI BUG= 626274 Review-Url: https://codereview.chromium.org/2244943002 Cr-Commit-Position: refs/heads/master@{#413414} [modify] https://crrev.com/5cbfbf15751cfaff33008bd0dd54699b7791158b/tools/perf/page_sets/intl_ar_fa_he.py [modify] https://crrev.com/5cbfbf15751cfaff33008bd0dd54699b7791158b/tools/perf/page_sets/intl_es_fr_pt-BR.py [modify] https://crrev.com/5cbfbf15751cfaff33008bd0dd54699b7791158b/tools/perf/page_sets/intl_hi_ru.py [modify] https://crrev.com/5cbfbf15751cfaff33008bd0dd54699b7791158b/tools/perf/page_sets/intl_ja_zh.py [modify] https://crrev.com/5cbfbf15751cfaff33008bd0dd54699b7791158b/tools/perf/page_sets/intl_ko_th_vi.py [add] https://crrev.com/5cbfbf15751cfaff33008bd0dd54699b7791158b/tools/perf/page_sets/page_cycler_story.py
,
Aug 22 2016
I tried a local repro w/ and w/o the patch [1]. [1] https://codereview.chromium.org/2123653005
,
Aug 22 2016
What is ToT2 and Disable2 in your results file? And how do you interpret the results?
,
Aug 23 2016
I'm marking this bug as FIXED, as I couldn't find any significant difference in time-to-first-meaningful-paint in ToT w/ and w/o complex path. ToT runs have complex path always enabled, and disabled runs revert to old behavior before drott@'s change. The only notable difference was a increase in TTFCP on "http://elmundo.es" case. Where the old behavior runs was >200ms faster in its initial 2 runs. I run additional 10 cycles on each build to confirm, but I couldn't reproduce the regression. There was no increase in TTFMP (which supercedes TTFCP) in the all runs, so I think we can safely ignore this. |
|||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||
Comment 1 by alexclarke@chromium.org
, Jul 7 2016