New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 726031 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Jun 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 1
Type: Bug-Regression



Sign in to add a comment

14.5% regression in smoothness.tough_path_rendering_cases at 474085:474135

Project Member Reported by pmeenan@chromium.org, May 24 2017

Issue description

See the link to graphs below.
 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=726031

Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?keys=agxzfmNocm9tZXBlcmZyFAsSB0Fub21hbHkYgIDguubvvgoM


Bot(s) for this bug's original alert(s):

android-webview-nexus5X
Project Member

Comment 3 by 42576172...@developer.gserviceaccount.com, May 25 2017

Cc: danakj@chromium.org
Owner: danakj@chromium.org

=== Auto-CCing suspected CL author danakj@chromium.org ===

Hi danakj@chromium.org, the bisect results pointed to your CL, please take a look at the
results.


=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : danakj
  Commit : cfc477b43f1c91e84e17180ee937728d6f496aa6
  Date   : Wed May 24 00:53:52 2017
  Subject: Add PaintOpBuffer::PlaybackRanges() to play a set of ranges of ops.

Bisect Details
  Configuration: android_webview_arm64_aosp_perf_bisect
  Benchmark    : smoothness.tough_path_rendering_cases
  Metric       : frame_times/https___testdrive-archive.azurewebsites.net_performance_chalkboard_
  Change       : 11.32% | 73.3561456602 -> 81.6598374232

Revision             Result                  N
chromium@474084      73.3561 +- 1.01852      6      good
chromium@474110      73.3842 +- 1.03053      6      good
chromium@474123      73.4723 +- 1.5184       6      good
chromium@474124      81.8637 +- 1.35298      6      bad       <--
chromium@474125      81.8079 +- 1.42103      6      bad
chromium@474126      82.2371 +- 1.6823       6      bad
chromium@474129      81.788 +- 1.11127       6      bad
chromium@474135      81.6598 +- 1.90845      6      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=android-webview --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=https...testdrive.archive.azurewebsites.net.performance.chalkboard. smoothness.tough_path_rendering_cases

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8978687642704029664

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=6461613332234240


| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Speed>Bisection.  Thank you!

Comment 4 by danakj@chromium.org, May 25 2017

Cc: vmp...@chromium.org enne@chromium.org
This adds the new method but doesn't make any significant change to the work being done as its still rastering the whole set of ops always..

Comment 5 by danakj@chromium.org, May 26 2017

Oh ok, so before this patch:

PaintOpBuffer has two playback methods:

1) playback(SkCanvas)
2) playback(SkCanvas, AbortCallback)

The first is more complex, deals with save/restore optimizations and such.
The second is a simple tight loop.

DrawingDisplayItem used the second, but defers to the first when the callback is null.

This patch makes 1) a bit more complex by supporting ranges. But also makes DrawingDisplayItem use 1) as well.

Things I am trying:
- Before the patch I ran the benchmark and get 102ms
- I deleted the 2) and merged it into 1), so that we always use it, and it can abort. This did not change the benchmark, still 102ms
- I marked NextOp() as NOINLINE, now 103ms
- I removed NOINLINE, added 2 const std::vector<size_t>& args and a size_t* arg to NextOp(), construct two vectors at the top of playback(). This bumps the time significantly even though they are not used to 119ms.
- I marked the larger-argument NextOp() as NOINLINE again, down to 112ms.. still much higher.
- I removed the arguments from NextOp() so the only delta is constructing 2 vectors for playback(). Notably these vectors already exist and are used for picking DisplayItems right now, but we're constructing a second set of vectors from {0} for right now in the CL. This is 118ms.

So it looks like the intermediate state requiring construction of vectors is bad.

Comment 6 by danakj@chromium.org, May 26 2017

I took the current CL but instead of making vectors on each call to playback() I made a static vector of {0} to pass on every call, and this gets us back to 104ms, so 2ms more than before.

Comment 7 by danakj@chromium.org, May 26 2017

Labels: ReleaseBlock-Stable
Status: Started (was: Untriaged)
Removing the additional branching in NextOp recovers the last 2ms..

Comment 8 by danakj@chromium.org, May 26 2017

Labels: -Pri-2 Pri-1
Project Member

Comment 9 by sheriffbot@chromium.org, May 29 2017

This issue is marked as a release blocker with no OS labels associated. Please add an appropriate OS label.

All release blocking issues should have OS labels associated to it, so that the issue can tracked and promptly verified, once it gets fixed.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Project Member

Comment 10 by 42576172...@developer.gserviceaccount.com, May 30 2017

Cc: toyoshim@chromium.org
 Issue 727503  has been merged into this issue.
Please tag with appropriate OSs.  Thanks.
Labels: OS-All
Project Member

Comment 13 by bugdroid1@chromium.org, May 31 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4deb94b53a551930b3db578c4c339ec0ef95bc6b

commit 4deb94b53a551930b3db578c4c339ec0ef95bc6b
Author: danakj <danakj@chromium.org>
Date: Wed May 31 16:25:38 2017

cc: Improve performance of PaintOpBuffer::Playback with no ranges.

In this case we were constructing 2 vectors on each call to playback()
which causes us to do malloc/free twice as well. Instead, define a
static vector and pass that which recovers basically all the lost time.

The other changes in NextOp try to improve the codegen, and seem to
maybe have up to 1ms improvement of the remaining delta of 2ms from
before cfc477b43f1c91e84e17180ee937728d6f496aa6, but it's in the
noise. However I think it's nicer with less nesting and only writing
the operator++ once anyways.

BUG= 726031 
CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel

Review-Url: https://codereview.chromium.org/2905383002
Cr-Commit-Position: refs/heads/master@{#475936}

[modify] https://crrev.com/4deb94b53a551930b3db578c4c339ec0ef95bc6b/cc/paint/paint_op_buffer.cc

Labels: Merge-Request-60
Status: Fixed (was: Started)
Project Member

Comment 17 by sheriffbot@chromium.org, Jun 1 2017

Labels: -Merge-Request-60 Hotlist-Merge-Approved Merge-Approved-60
Your change meets the bar and is auto-approved for M60. Please go ahead and merge the CL to branch 3112 manually. Please contact milestone owner if you have questions.
Owners: amineer@(Android), cmasso@(iOS), josafat@(ChromeOS), bustamante@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: Merge-Merged
Project Member

Comment 19 by bugdroid1@chromium.org, Jun 1 2017

Labels: -merge-approved-60 merge-merged-3112
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/feb6693d3c26cef1c1a1bb8ee1b2af015c4fb31a

commit feb6693d3c26cef1c1a1bb8ee1b2af015c4fb31a
Author: danakj <danakj@chromium.org>
Date: Thu Jun 01 17:43:08 2017

cc: Improve performance of PaintOpBuffer::Playback with no ranges.

In this case we were constructing 2 vectors on each call to playback()
which causes us to do malloc/free twice as well. Instead, define a
static vector and pass that which recovers basically all the lost time.

The other changes in NextOp try to improve the codegen, and seem to
maybe have up to 1ms improvement of the remaining delta of 2ms from
before cfc477b43f1c91e84e17180ee937728d6f496aa6, but it's in the
noise. However I think it's nicer with less nesting and only writing
the operator++ once anyways.

BUG= 726031 
CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel

Review-Url: https://codereview.chromium.org/2905383002
Cr-Original-Commit-Position: refs/heads/master@{#475936}
Review-Url: https://codereview.chromium.org/2916703004 .
Cr-Commit-Position: refs/branch-heads/3112@{#95}
Cr-Branched-From: b6460e24cf59f429d69de255538d0fc7a425ccf9-refs/heads/master@{#474897}

[modify] https://crrev.com/feb6693d3c26cef1c1a1bb8ee1b2af015c4fb31a/cc/paint/paint_op_buffer.cc

Sign in to add a comment