
Issue 606069

Starred by 9 users

Issue metadata

Status: Fixed
Owner:
Closed: Jun 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug

Blocked on:
issue 646176

Blocking:
issue 606673




Paint invalidation of 2D transformed elements is slow

Project Member Reported by chrishtr@chromium.org, Apr 22 2016

Issue description

.
 
Components: Blink>Paint>Invalidation
It takes about 3x that on my MacBook Pro. Digging. Most likely it's expensive
geometry code.
Paint invalidation is taking a long time mostly because all elements on the
page have a transform on them. This causes us to use the slow-path rect
mapping when computing their paint invalidation rects. It has been this way
for a long time.

Looking into how to relax that.
Cc: wangxianzhu@chromium.org
I think the cleanest way is to simply implement the tree walks specified in
the Web Page Geometries design doc. This will properly handle all transforms,
since it works in terms of mapped rects, not points.
Summary: Paint invalidation of 2D transformed elements is slow (was: Paint invalidation takes 10ms per frame on a Z620 for Animometer / Design)
Cc: gab@chromium.org chrishtr@chromium.org
 Issue 424918  has been merged into this issue.
Cc: enne@chromium.org
New idea for a short-term improvement: if X% of a composited layer has already
been invalidated, then go ahead and invalidate the whole thing and forget about
optimizing invalidation rects further for that layer.

Will gather some data and implement behind a flag.
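
A minimal sketch of the "X% already invalidated" heuristic described above. The names Rect, ClippedArea and LayerInvalidationTracker are hypothetical and are not Blink or cc classes, and the threshold is left as a parameter because the comment does not give a value. Summing areas over-counts overlapping rects, so this only approximates the invalidated fraction.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical integer rect; stands in for whatever rect type the layer uses.
struct Rect {
  int x, y, width, height;
};

// Returns the area of the part of |r| that lies inside |bounds|.
std::int64_t ClippedArea(const Rect& r, const Rect& bounds) {
  const int left = std::max(r.x, bounds.x);
  const int top = std::max(r.y, bounds.y);
  const int right = std::min(r.x + r.width, bounds.x + bounds.width);
  const int bottom = std::min(r.y + r.height, bounds.y + bounds.height);
  if (right <= left || bottom <= top)
    return 0;
  return static_cast<std::int64_t>(right - left) * (bottom - top);
}

// Tracks how much of one composited layer has been invalidated this frame.
class LayerInvalidationTracker {
 public:
  // |full_fraction| is the assumed "X%" threshold, e.g. 0.5 for 50%.
  LayerInvalidationTracker(const Rect& layer_bounds, double full_fraction)
      : bounds_(layer_bounds), full_fraction_(full_fraction) {}

  // Returns true if the caller should still record |r| as a separate rect,
  // and false once the whole layer is treated as invalid.
  bool Invalidate(const Rect& r) {
    if (fully_invalidated_)
      return false;
    invalidated_area_ += ClippedArea(r, bounds_);
    const double layer_area =
        static_cast<double>(bounds_.width) * bounds_.height;
    if (static_cast<double>(invalidated_area_) >= full_fraction_ * layer_area) {
      fully_invalidated_ = true;
      return false;
    }
    return true;
  }

  bool fully_invalidated() const { return fully_invalidated_; }

 private:
  Rect bounds_;
  double full_fraction_;
  std::int64_t invalidated_area_ = 0;
  bool fully_invalidated_ = false;
};
```

Once fully_invalidated() is true, the caller would invalidate the layer's full bounds and skip computing further rects for that layer in the current frame.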
Note that if you skip paint invalidation of a subtree, the previousPaintInvalidationRects of the objects in the subtree will be invalid. The next time an object in the subtree needs paint invalidation, we can't accurately invalidate its previousPaintInvalidationRect. Perhaps we can assume the object's previousPaintInvalidationRect covers the whole GraphicsLayer.
Good point...hmm.

Comment 10 by gab@chromium.org, May 19 2016

Cc: -gab@chromium.org

Comment 11 Deleted

For the design benchmark here: https://pr.gg/animometer/developer.html

Avoiding the slow rect mapping in PaintInvalidationState saves about half of the
paint invalidation time in the particular run I just did.
The paint invalidation code in WebKit has the same limitation as Blink under transform.

See shouldDisableLayoutStateForSubtree here: https://trac.webkit.org/browser/trunk/Source/WebCore/rendering/RenderView.cpp

and the computation method computeRectForRepaint here: https://trac.webkit.org/browser/trunk/Source/WebCore/rendering/RenderBox.cpp
Presumably theirs is faster because there is no extra tree walk, and there
may be some more overhead to our current implementation of PaintInvalidationState, whereas theirs appears to be shared with their equivalent of LayoutGeometryMap (called LayoutState in WebKit).
I have a new idea though. Coding it up to see if it works.
Continuing to try to find an improvement for  issue 606069 . Here is my current idea:

https://codereview.chromium.org/2000053002 (WIP, probably still has bugs)

Basically, treat 2D transforms as resetting paint invalidation state, just like a paint invalidation container. Add special code to compute local paint invalidation offsets for children, then map them up using the paint invalidation state of the element which is the parent of the transform, plus some one-off code to go from transform to parent.

It appears that this roughly doubles the speed of paint invalidation for the Design benchmark.


Question now is whether this is a good enough approach, and/or common enough to justify the additional
complexity.
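
A rough sketch of the idea in this comment, not the code from the linked review. Below a 2D-transformed element, descendants only accumulate a cheap offset relative to that element. Each rect is then mapped once through the element's transform and shifted by the offset the parent's state already knows. All types and names here (Offset, RectF, Affine2D, TransformRootState, MapToContainer) are hypothetical.

```cpp
#include <algorithm>

// Hypothetical value types; not Blink's LayoutRect or TransformationMatrix.
struct Offset {
  double x, y;
};
struct RectF {
  double x, y, width, height;
};

// 2D affine transform: x' = a*x + c*y + e, y' = b*x + d*y + f.
struct Affine2D {
  double a, b, c, d, e, f;
};

// Maps a rect through the transform and returns the axis-aligned bounding
// box of its four mapped corners.
RectF MapRect(const Affine2D& t, const RectF& r) {
  const double xs[2] = {r.x, r.x + r.width};
  const double ys[2] = {r.y, r.y + r.height};
  double min_x = 0, max_x = 0, min_y = 0, max_y = 0;
  bool first = true;
  for (double x : xs) {
    for (double y : ys) {
      const double mx = t.a * x + t.c * y + t.e;
      const double my = t.b * x + t.d * y + t.f;
      if (first) {
        min_x = max_x = mx;
        min_y = max_y = my;
        first = false;
      } else {
        min_x = std::min(min_x, mx);
        max_x = std::max(max_x, mx);
        min_y = std::min(min_y, my);
        max_y = std::max(max_y, my);
      }
    }
  }
  return {min_x, min_y, max_x - min_x, max_y - min_y};
}

// State established at a transformed element and shared by its subtree.
struct TransformRootState {
  Affine2D transform;        // the element's own 2D transform
  Offset root_in_container;  // offset taken from the parent's state
};

// Maps a descendant's rect to the paint invalidation container:
// offset within the transformed element, one transform mapping, then the
// parent's offset.
RectF MapToContainer(const RectF& local_rect,
                     const Offset& offset_in_root,
                     const TransformRootState& state) {
  const RectF in_root = {local_rect.x + offset_in_root.x,
                         local_rect.y + offset_in_root.y,
                         local_rect.width, local_rect.height};
  RectF mapped = MapRect(state.transform, in_root);
  mapped.x += state.root_in_container.x;
  mapped.y += state.root_in_container.y;
  return mapped;
}
```

The per-object cost is one offset addition plus one rect mapping, independent of how deep the object sits below the transformed element.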
The method SGTM. The added complexity looks fine. The method also seems applicable to some of the other slow-path cases.
Cc: bokan@chromium.org vollick@chromium.org
 Issue 614408  has been merged into this issue.

Comment 19 by dharm...@gmail.com, May 26 2016

Does this affect 3D transformed elements (translate3d)?
No, this bug is just about 2D.

Comment 21 by dharm...@gmail.com, May 27 2016


Issue 614408 is not just about rotate; it's also about translate3d.

On Chrome 50, translate3d and rotate/rotate3d perform significantly worse than 2D translate (10 fps vs 40 fps on average). See attachment.


Timeline: no transform vs translate3d vs translate - https://www.dropbox.com/s/yf5eeoy6mgapvwc/translate3d-TimelineRawData-20160527T112149.json?dl=0

I can't reproduce this in Chrome 49 or Chrome 44.


I think Issue 614408 is not related to the merged issue.

Attachment: Screenshot 2016-05-27 11.08.41.png (131 KB)
Blocking: 606673
Part of the measured time for paint invalidation is in cc::InvalidationRegion.

cc::InvalidationRegion accumulates damage rects in an SkRegion.  Rects are accumulated until the SkRegion has a complexity of kMaxInvalidationRectCount (256), after which the region is reduced to a single bounding box, then accumulation resumes.
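
The accumulate-then-collapse rule can be sketched as follows. This is an illustration only, not the cc::InvalidationRegion source: a plain vector's length stands in for SkRegion complexity, and Rect, UnionRects and BoundedInvalidation are hypothetical names. The max_rects constructor argument plays the role of kMaxInvalidationRectCount.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical integer rect; stands in for gfx::Rect.
struct Rect {
  int x, y, width, height;
};

// Returns the smallest rect that contains both |a| and |b|.
Rect UnionRects(const Rect& a, const Rect& b) {
  const int left = std::min(a.x, b.x);
  const int top = std::min(a.y, b.y);
  const int right = std::max(a.x + a.width, b.x + b.width);
  const int bottom = std::max(a.y + a.height, b.y + b.height);
  return {left, top, right - left, bottom - top};
}

// Accumulates damage rects; once more than |max_rects| are held, they are
// replaced by a single bounding box and accumulation resumes.
class BoundedInvalidation {
 public:
  explicit BoundedInvalidation(std::size_t max_rects)
      : max_rects_(max_rects) {}

  void Union(const Rect& r) {
    if (r.width <= 0 || r.height <= 0)
      return;  // Empty rects add no damage.
    rects_.push_back(r);
    if (rects_.size() > max_rects_)
      Collapse();
  }

  const std::vector<Rect>& rects() const { return rects_; }

 private:
  void Collapse() {
    Rect bounds = rects_[0];
    for (std::size_t i = 1; i < rects_.size(); ++i)
      bounds = UnionRects(bounds, rects_[i]);
    rects_.assign(1, bounds);
  }

  std::size_t max_rects_;
  std::vector<Rect> rects_;
};
```

A smaller max_rects trades invalidation accuracy (a larger repainted area) for less bookkeeping per rect.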

Following are Animometer scores on my MBP with kMaxInvalidationRectCount at different counts.  

kMaxInvalidationRectCount = 256
Leaves 1106.14
Design 131.82

kMaxInvalidationRectCount = 16
Leaves 1204.91
Design 140.25

kMaxInvalidationRectCount = 1
Leaves 1206.21
Design 153.73

Some random thoughts on options to reduce time here:

 - We could just reduce kMaxInvalidationRectCount and check that we don't regress real world sites.
 - We could accumulate rects more cheaply in a vector<> instead of in an SkRegion, if we only care about a small # of invalidation rects.
 - If we still want to compute accurate non-overlapping regions, it may be more efficient to first accumulate all the rects, then run a line-sweep contour finding algorithm.  Hard to say if this is worth it.
Looking a bit higher up the stack, for each invalidation rect we call CompositedLayerMapping::setContentsNeedDisplayInRect() which runs ApplyToGraphicsLayers to transform and pass the rect to cc::Layer, etc.

A possible optimization is to collect the invalidation rects somewhere (like CompositedLayerMapping), then apply them to graphics layers in batch.
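
A minimal sketch of that batching idea, under the assumption that the per-rect cost is dominated by the walk over graphics layers. InvalidationBatcher and its callback are hypothetical names, not part of CompositedLayerMapping.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical integer rect; stands in for IntRect.
struct Rect {
  int x, y, width, height;
};

// Collects invalidation rects and hands them to |apply| in one call, so the
// expensive per-layer work runs once per flush instead of once per rect.
class InvalidationBatcher {
 public:
  using ApplyFn = std::function<void(const std::vector<Rect>&)>;

  explicit InvalidationBatcher(ApplyFn apply) : apply_(std::move(apply)) {}

  void Add(const Rect& r) { pending_.push_back(r); }

  // Applies all pending rects in one batch; does nothing if there are none.
  void Flush() {
    if (pending_.empty())
      return;
    apply_(pending_);
    pending_.clear();
  }

  std::size_t pending_count() const { return pending_.size(); }

 private:
  ApplyFn apply_;
  std::vector<Rect> pending_;
};
```

The callback would do whatever offsetting and rect conversion each graphics layer needs, once per batch.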

Comment 25 by enne@chromium.org, Jun 2 2016

As far as I can tell, it looks like nothing depends on InvalidationRegion being a region, so we could probably experiment with making it a vector and see whether it's the SkRegion complication or just the total rect count that makes a difference here.
https://codereview.chromium.org/2033513003 tests out the theory in #24.  It gives only ~3% extra improvement on the Design benchmark.

There are a number of inconveniences to batching the invalidations due to the needed offsetting, debug invalidation tracking, and conversions from IntRect -> blink::WebRect -> gfx::Rect.

Maybe something to visit after more SPv2 advancement.

I don't like the complexity of the solution in comment 16. We should just
optimize kMaxInvalidationRectCount, and I'll pursue GeometryMapper directly.

I can work on kMaxInvalidationRectCount on Monday.

Comment 28 by bugdroid1@chromium.org, Jun 9 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/62045ceab0fd83013d6cb781ba727d3d3ce68e18

commit 62045ceab0fd83013d6cb781ba727d3d3ce68e18
Author: vmiura <vmiura@chromium.org>
Date: Thu Jun 09 01:13:46 2016

Speed up InvalidationRegion

Staging invalidation rectangles in a vector allows us to skip building
a complex SkRegion for cases with many invalidations (> 256).  For cases
with fewer invalidations we will still build a full SkRegion.

R=enne@chromium.org
BUG= 606069 
CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel

Review-Url: https://codereview.chromium.org/2054473002
Cr-Commit-Position: refs/heads/master@{#398755}

[modify] https://crrev.com/62045ceab0fd83013d6cb781ba727d3d3ce68e18/cc/base/invalidation_region.cc
[modify] https://crrev.com/62045ceab0fd83013d6cb781ba727d3d3ce68e18/cc/base/invalidation_region.h

Labels: -Pri-1 Pri-2
Blockedon: 646176
Status: Fixed (was: Assigned)
Slimming Paint invalidation has fixed this bug. For example, on the "design"
MotionMark benchmark, paint invalidation at fixed complexity on my Z620
goes from 7 ms per frame to 1.7 ms per frame.
