New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 632651 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Sep 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: Bug-Regression



Sign in to add a comment

21.7%-76.1% regression in smoothness.top_25_smooth at 408311:408338

Project Member Reported by alexclarke@chromium.org, Jul 29 2016

Issue description

See the link to graphs below.
 
Project Member

Comment 3 by 42576172...@developer.gserviceaccount.com, Jul 29 2016

Cc: dpranke@chromium.org
Owner: dpranke@chromium.org

=== Auto-CCing suspected CL author dpranke@chromium.org ===

Hi dpranke@chromium.org, the bisect results pointed to your CL below as possibly
causing a regression. Please have a look at this info and see whether
your CL be related.


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : Try again to flip official win builders to GN.
Author  : dpranke
Commit description:
  
We believe we've fixed the remaining known issues w/ the upload/
signing scripts, so it's time to try again. This affects:

- chromium.perf
  - Win Builder
- official.desktop
  - win
  - win64
- official.desktop.continuous
  - win beta
  - win stable
  - win trunk
  - win64 trunk

(Though the beta and stable builders aren't actually affected and
won't be until they are on M54).

TBR=sebmarchand@chromium.org, brettw@chromium.org, brucedawson@chromium.org
BUG= 623659 

Review-Url: https://codereview.chromium.org/2187613004
Cr-Commit-Position: refs/heads/master@{#408318}
Commit  : e28fd5d84cebeb446e2c35cf6b766e35aa56ec5f
Date    : Thu Jul 28 02:44:20 2016


===== TESTED REVISIONS =====
Revision         Mean     Std Dev  N  Good?
chromium@408317  29.8174  4.14333  8  good
chromium@408318  45.6054  0.85834  5  bad    <--
chromium@408319  40.798   6.1371   8  bad
chromium@408320  37.7143  7.35066  8  bad

Bisect job ran on: win_perf_bisect
Bug ID: 632651

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests smoothness.top_25_smooth
Test Metric: mean_input_event_latency/Pinterest
Relative Change: 38.25%
Score: 99.8

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/win_perf_bisect/builds/6749
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9005812508803417968


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5882907583840256

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
I hope that's not true.  Lets try another bisect.
Cc: brucedaw...@chromium.org
I just kicked one off as well, but cc brucedawson just in case.
How do the other benchmarks look? The change from gyp to gn could certainly change some CPU intensive benchmarks due to subtle changes in link order or build settings, but the smoothness benchmark is not a very good metric for CPU performance issues, so I'd want this corroborated.

My understanding is that the smoothness benchmark is a better measure of task scheduling issues, although even then it isn't directly measuring what we care about.

Project Member

Comment 9 by 42576172...@developer.gserviceaccount.com, Jul 30 2016


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : Try again to flip official win builders to GN.
Author  : dpranke
Commit description:
  
We believe we've fixed the remaining known issues w/ the upload/
signing scripts, so it's time to try again. This affects:

- chromium.perf
  - Win Builder
- official.desktop
  - win
  - win64
- official.desktop.continuous
  - win beta
  - win stable
  - win trunk
  - win64 trunk

(Though the beta and stable builders aren't actually affected and
won't be until they are on M54).

TBR=sebmarchand@chromium.org, brettw@chromium.org, brucedawson@chromium.org
BUG= 623659 

Review-Url: https://codereview.chromium.org/2187613004
Cr-Commit-Position: refs/heads/master@{#408318}
Commit  : e28fd5d84cebeb446e2c35cf6b766e35aa56ec5f
Date    : Thu Jul 28 02:44:20 2016


===== TESTED REVISIONS =====
Revision         Mean     Std Dev   N  Good?
chromium@408317  33.0768  0.23349   5  good
chromium@408318  26.4016  0.193072  5  bad    <--
chromium@408319  26.7606  0.510709  5  bad
chromium@408320  26.4132  0.224334  5  bad
chromium@408322  26.669   0.479578  5  bad
chromium@408328  26.7234  0.475523  5  bad
chromium@408338  26.7126  0.557479  5  bad

Bisect job ran on: win_perf_bisect
Bug ID: 632651

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests smoothness.top_25_smooth
Test Metric: mean_input_event_latency/Docs  (1 open document tab)
Relative Change: 19.24%
Score: 99.9

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/win_perf_bisect/builds/6758
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9005794714395180560


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5794384818208768

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
Project Member

Comment 10 by 42576172...@developer.gserviceaccount.com, Jul 30 2016


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : Try again to flip official win builders to GN.
Author  : dpranke
Commit description:
  
We believe we've fixed the remaining known issues w/ the upload/
signing scripts, so it's time to try again. This affects:

- chromium.perf
  - Win Builder
- official.desktop
  - win
  - win64
- official.desktop.continuous
  - win beta
  - win stable
  - win trunk
  - win64 trunk

(Though the beta and stable builders aren't actually affected and
won't be until they are on M54).

TBR=sebmarchand@chromium.org, brettw@chromium.org, brucedawson@chromium.org
BUG= 623659 

Review-Url: https://codereview.chromium.org/2187613004
Cr-Commit-Position: refs/heads/master@{#408318}
Commit  : e28fd5d84cebeb446e2c35cf6b766e35aa56ec5f
Date    : Thu Jul 28 02:44:20 2016


===== TESTED REVISIONS =====
Revision         Mean     Std Dev    N  Good?
chromium@408200  32.9982  0.11501    5  good
chromium@408301  32.8188  0.0948404  5  good
chromium@408315  33.1062  0.212663   5  good
chromium@408317  32.929   0.257558   5  good
chromium@408318  26.567   0.347436   5  bad    <--
chromium@408319  26.5536  0.201192   5  bad
chromium@408322  26.6814  0.493747   5  bad
chromium@408328  26.9518  0.392173   5  bad
chromium@408352  26.6788  0.216391   5  bad
chromium@408400  26.2846  0.221256   5  bad

Bisect job ran on: win_perf_bisect
Bug ID: 632651

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests smoothness.top_25_smooth
Test Metric: mean_input_event_latency/Docs  (1 open document tab)
Relative Change: 20.35%
Score: 99.9

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/win_perf_bisect/builds/6757
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9005794727919602304


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5776423801847808

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
Cc: brettw@chromium.org
Components: -Internals>GPU Build
Labels: Proj-GN-Migration-Ship
Owner: brucedaw...@chromium.org
@bruce, do you want to own this, or should we punt this to brettw?
Labels: OS-Windows
I'm happy to own it. It is my specialty.
Perf sheriff ping: reminder to follow up on possible performance issues
I'm working on this now.

I did GYP and GN official builds from e28fd5d84cebeb446e2c35cf6b766e35aa56ec5f with official 32-bit settings. Those are:

set GYP_DEFINES=branding=Chrome buildtype=Official
gn_args: is_chrome_branded=true is_official_build=true is_debug=false target_cpu = "x86"

The code at that hash is currently not working so I had to cherry-pick 16a129d49fbd09f6a1f012845973d56d7d1f32f1 to get it to build. I'll investigate that later.

I then ran the problematic Pinterest story in both browsers and verified that mean_input_event_latency was noticeably higher with the gn build. These are the commands I used:

python tools/perf/run_benchmark --browser=exact --browser-executable=d:\src\chromium\src\out\release\chrome.exe smoothness.top_25_smooth --extra-browser-args=--no-sandbox --story-filter=Pinterest --reset-results 2>&1 | find /v "queueing_durations" | find /v "frame_times" | find "mean_input_event_latency"

python tools/perf/run_benchmark --browser=exact --browser-executable=d:\src\chromium\src\out\gn_release_official\chrome.exe smoothness.top_25_smooth --extra-browser-args=--no-sandbox --story-filter=Pinterest --reset-results 2>&1 | find /v "queueing_durations" | find /v "frame_times" | find "mean_input_event_latency"

Then I recorded an ETW trace of both of the run_benchmark runs. The results are pretty clear - the gn build is using significantly more CPU time than the gyp build, in the renderer process. 9.397 s versus 5.564 s, spread out over about 6.3 s.

In particular the chrome_child.dll!SkNWayCanvas::onDrawRect function consumes about .765 s in the gyp build, but 4.686 s on the gn build. My best guess is that this means that some files are not being LTCG compiled and therefore some cross-module inlining optimizations are being missed. I'm still drilling down to find out what is going on but thought I'd share the progress.

I raised the priority to 1.

More precisely, the gn build spends about 4.373 s inside of SkMaskFilter::filterPath, and most of that (4.159 s) is inside of draw_nine.

In the gyp build SkMaskFilter::filterPath consumes just .254 s, and draw_nine is either never called or is inlined.
I looked at the build settings for third_party\skia\src\core\SkMaskFilter.cpp to see if there were any suspicious differences. There are.

The most important differences are these compiler flags:
gyp: /fp:precise /GF /GL /GS /O2 /Ot /Oy-
gn:  /O1

/fp:precise - this would actually cause gyp to generate slightly slower FP code. Losing this is probably fine now that we are relying on SSE.
/GF  - enable read-only string pooling, we should have this on
/GL  - LTCG, I have no idea why this is off on the gn builds
/GS  - enable security checks, how did *this* switch end up turned off!
/O2  - maximize speed, 'nuff said
/Ot  - favor code speed
/Oy- - use frame pointers

/O1  - minimize space

So, the loss of LTCG and /O2 is enough to explain this I think. Some of the other changes might actually benefit gn, but not significantly.

In addition there are some gyp only defines:
-D_WINSOCK_DEPRECATED_NO_WARNINGS
-DALLOCATOR_SHIM
-DCR_CLANG_REVISION=274369-1
-DENABLE_HANGOUT_SERVICES_EXTENSION=1
-DENABLE_SWIFTSHADER
-DUSE_LIBJPEG_TURBO=1
-DUSE_LIBPCI=1
-DV8_USE_EXTERNAL_STARTUP_DATA

there are also some differences in what warnings are disabled.

Labels: -Pri-2 Pri-1
With the latest tentative changes in crrev.com/2270693006 the situation is much better, but still not back to normal. The gn build appears to consume about .3 s more CPU time than the gyp build. This may not be statistically meaningful, but profiling shows that SkMaskFilter::filterPath is still more expensive - ~391 ms versus ~216 ms.

All of the extra time is inside of GrSingleTextureEffect::~GrSingleTextureEffect, which seems odd. I'm making the remaining compile settings match to see if that resolves the problem.

This time I'm actually changing the priority to 1.
Project Member

Comment 19 by bugdroid1@chromium.org, Aug 25 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3e80255c5139698d5fca9d8f1595a307f6cb3639

commit 3e80255c5139698d5fca9d8f1595a307f6cb3639
Author: brucedawson <brucedawson@chromium.org>
Date: Thu Aug 25 09:08:10 2016

Adjust skia build settings to match gyp, improve perf

When official builds switched from gyp to gn some smoothness tests
regressed. Investigation showed that some skia functions were taking
up to 15x longer to execute. This was due to /GL /O2 being changed to
/O1 (LTCG and optimize-for-speed changing to optimize-for-size). This
change resolves most of the compiler option differences in skia and
dramatically reduces the CPU usage in the gn build, from 9.4 s to 5.3,
within spitting distance of the gyp build.

This increases the size of chrome_child.dll from 48,593,408 to
49,213,440 bytes. The gyp version is 49,032,192 bytes and if necessary
some more size optimizations can be applied to the gn build.

The discrepancy between gyp and gn optimization settings in skia occurred
when crrev.com/1410883008 landed for gyp only.

There may be other differences - investigation continues - but this fixes
the main issues noticed.

BUG= 632651 

Review-Url: https://codereview.chromium.org/2270693006
Cr-Commit-Position: refs/heads/master@{#414368}

[modify] https://crrev.com/3e80255c5139698d5fca9d8f1595a307f6cb3639/skia/BUILD.gn

As predicted this caused a size regression -  crbug.com/641017 . This is normal and expected and not really a problem. We decided last year to make the size/speed tradeoff in Skia, and this bug shows that that decision was justified.
I checked the graphs this morning and it is clear that the regression is fixed.

thread_times.tough_scrolling_cases /  thread_total_all_cpu_time_per_frame
 - from 8.0 ms back down to 6.5 ms
smoothness.tough_path_rendering_cases / frame_times
 - from 30 ms back down to 27.5 ms
smoothness.key_desktop_move_cases / mean_input_event_latency
 - from 22.5 ms back down to 17.5 ms
smoothness.top_25_smooth / first_gesture_scroll_update_latency /
 - from 12.9 down to 11.3, was originally around 10.1

The cpu_time metric is the most important. Some of the other metrics I don't trust to measure this regression accurately. However even the cpu_time metric did not capture the full magnitude of this regression. I saw the gn build using about 69% more CPU time than the gyp build, but none of the metrics (and especially not the CPU time metrics) regressed by more than 29%. This seems curious, and possibly worth investigating.

I don't know why smoothness.top_25_smooth / first_gesture_scroll_update_latency hasn't fully recovered. I don't know how meaningful that metric is. However I am still investigating some code-gen issues that *might* (but probably not) explain that. I may open a separate bug for that.
Status: Fixed (was: Assigned)
I am satisfied with the fix.

Sign in to add a comment