Paint invalidation of 2D transformed elements is slow
Issue description.
,
Apr 22 2016
It takes about 3x that on my MacBook Pro. Digging. Most likely it's expensive geometry code.
,
May 3 2016
Paint invalidation is taking a long time mostly because all elements on the page have a transform on them. This causes us to use the slow-path rect mapping when computing their paint invalidation rects. It has been this way for a long time. Looking into how to relax that.
,
May 3 2016
I think the cleanest way is to simply implement the tree walks specified in the Web Page Geometries design doc. This will properly handle all transforms, since it works in terms of mapped rects, not points.
,
May 4 2016
,
May 9 2016
,
May 18 2016
New idea for a short-term improvement: if X% of a composited layer has already been invalidated, then go ahead and invalidate the whole thing and forget about optimizing invalidation rects further for that layer. Will gather some data and implement behind a flag.
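The threshold idea above could be sketched roughly as follows. This is only an illustration, not Blink code: `LayerInvalidationBudget`, `Rect`, `Intersect`, and the `full_threshold` parameter are all hypothetical names, and the sketch sums rect areas without de-duplicating overlaps, so it may reach the threshold earlier than an exact coverage computation would.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical integer rect; not Blink's IntRect or gfx::Rect.
struct Rect {
  int x, y, width, height;
  bool IsEmpty() const { return width <= 0 || height <= 0; }
  int64_t Area() const { return static_cast<int64_t>(width) * height; }
};

// Intersection of two rects, or an empty rect if they do not overlap.
Rect Intersect(const Rect& a, const Rect& b) {
  int left = std::max(a.x, b.x);
  int top = std::max(a.y, b.y);
  int right = std::min(a.x + a.width, b.x + b.width);
  int bottom = std::min(a.y + a.height, b.y + b.height);
  if (right <= left || bottom <= top) return {0, 0, 0, 0};
  return {left, top, right - left, bottom - top};
}

// Tracks invalidations for one composited layer. Once the summed area of the
// recorded rects reaches full_threshold (the "X%" above, as a fraction) of the
// layer area, the whole layer is invalidated and later rects are ignored.
class LayerInvalidationBudget {
 public:
  LayerInvalidationBudget(const Rect& layer_bounds, double full_threshold)
      : layer_bounds_(layer_bounds), full_threshold_(full_threshold) {
    assert(!layer_bounds.IsEmpty());
    assert(full_threshold > 0.0 && full_threshold <= 1.0);
  }

  void Invalidate(const Rect& rect) {
    if (fully_invalidated_) return;  // Nothing left to refine.
    Rect clipped = Intersect(rect, layer_bounds_);
    if (clipped.IsEmpty()) return;
    rects_.push_back(clipped);
    // Simplification: overlapping rects are counted more than once.
    invalidated_area_ += clipped.Area();
    if (invalidated_area_ >= full_threshold_ * layer_bounds_.Area()) {
      fully_invalidated_ = true;
      rects_.assign(1, layer_bounds_);
    }
  }

  bool fully_invalidated() const { return fully_invalidated_; }
  const std::vector<Rect>& rects() const { return rects_; }

 private:
  Rect layer_bounds_;
  double full_threshold_;
  int64_t invalidated_area_ = 0;
  bool fully_invalidated_ = false;
  std::vector<Rect> rects_;
};
```

The sketch only covers the bookkeeping on the layer side. It does not decide which subtrees can skip rect computation once the layer is fully invalidated.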
,
May 18 2016
Note that if you skip paint invalidation of a subtree, the previousPaintInvalidationRects of the objects in the subtree will be invalid. The next time an object in the subtree needs paint invalidation, we can't accurately invalidate its previousPaintInvalidationRect. Perhaps we can assume the object's previousPaintInvalidationRect covers the whole GraphicsLayer.
,
May 18 2016
Good point...hmm.
,
May 19 2016
,
May 20 2016
For the Design benchmark here: https://pr.gg/animometer/developer.html, avoiding the slow rect mapping in PaintInvalidationState saves about half of the paint invalidation time in the particular run I just did.
,
May 20 2016
The paint invalidation code in WebKit has the same limitation as Blink under transform. See shouldDisableLayoutStateForSubtree here: https://trac.webkit.org/browser/trunk/Source/WebCore/rendering/RenderView.cpp and the computation method computeRectForRepaint here: https://trac.webkit.org/browser/trunk/Source/WebCore/rendering/RenderBox.cpp
,
May 20 2016
Presumably theirs is faster because there is no extra tree walk, and there may be some more overhead to our current implementation of PaintInvalidationState, whereas theirs appears to be shared with their equivalent of LayoutGeometryMap (called LayoutState in WebKit).
,
May 20 2016
I have a new idea though. Coding it up to see if it works.
,
May 23 2016
Continuing to try to find an improvement for issue 606069. Here is my current idea: https://codereview.chromium.org/2000053002 (WIP, probably still has bugs). Basically, treat 2D transforms as resetting paint invalidation state just like a paint invalidation container. Add special code to compute local paint invalidation offsets for children, then map them up using the paint invalidation state of the element which is the parent of the transform, plus some one-off code to go from transform to parent. It appears that this roughly doubles the speed of paint invalidation for the Design benchmark. The question now is whether this is a good enough approach, and/or common enough to justify the additional complexity.
,
May 23 2016
The method SGTM. The added complexity looks fine. The method also seems applicable to some of the other slow-path cases.
,
May 26 2016
,
May 26 2016
Does this affect 3D transformed elements (translate3d)?
,
May 27 2016
No, this bug is just about 2D.
,
May 27 2016
Issue 614408 is not just about rotate; it's also about translate3d. On Chrome 50, translate3d and rotate/rotate3d perform significantly worse than 2D translate (10fps vs 40fps on average). See attachment. Timeline (no transform vs translate3d vs translate): https://www.dropbox.com/s/yf5eeoy6mgapvwc/translate3d-TimelineRawData-20160527T112149.json?dl=0 I can't reproduce this in Chrome 49 or Chrome 44. I think issue 614408 is not related to the merged issue.
,
Jun 1 2016
,
Jun 2 2016
Part of the measured time for paint invalidation is in cc::InvalidationRegion. cc::InvalidationRegion accumulates damage rects in an SkRegion. Rects are accumulated until the SkRegion has a complexity of kMaxInvalidationRectCount (256), after which the region is reduced to a single bounding box, then accumulation resumes.

Following are Animometer scores on my MBP with kMaxInvalidationRectCount at different counts.

kMaxInvalidationRectCount    Leaves     Design
256                          1106.14    131.82
16                           1204.91    140.25
1                            1206.21    153.73

Some random thoughts on options to reduce time here:
- We could just reduce kMaxInvalidationRectCount and check that we don't regress real world sites.
- We could accumulate rects more cheaply in a vector<> instead of in an SkRegion, if we only care about a small # of invalidation rects.
- If we still want to compute accurate non-overlapping regions, it may be more efficient to first accumulate all the rects, then run a line-sweep contour finding algorithm. Hard to say if this is worth it.
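The accumulate-then-collapse behaviour described above, combined with the vector<> option, might look something like the sketch below. It is an assumption-laden illustration rather than the cc implementation: `CappedInvalidationList`, `Rect`, and `UnionRects` are made-up names, and the cap is applied to the raw rect count, whereas the real code caps the complexity of an SkRegion.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical integer rect; not gfx::Rect.
struct Rect {
  int x, y, width, height;
  int right() const { return x + width; }
  int bottom() const { return y + height; }
};

// Smallest rect that contains both a and b.
Rect UnionRects(const Rect& a, const Rect& b) {
  int left = std::min(a.x, b.x);
  int top = std::min(a.y, b.y);
  int right = std::max(a.right(), b.right());
  int bottom = std::max(a.bottom(), b.bottom());
  return {left, top, right - left, bottom - top};
}

// Accumulates damage rects in a plain vector. When the number of stored rects
// exceeds max_rects, everything is reduced to a single bounding box and
// accumulation resumes from there.
class CappedInvalidationList {
 public:
  explicit CappedInvalidationList(std::size_t max_rects)
      : max_rects_(max_rects) {
    assert(max_rects >= 1);
  }

  void Union(const Rect& rect) {
    if (rect.width <= 0 || rect.height <= 0) return;  // Ignore empty rects.
    rects_.push_back(rect);
    if (rects_.size() > max_rects_) Collapse();
  }

  const std::vector<Rect>& rects() const { return rects_; }

 private:
  void Collapse() {
    Rect bounds = rects_.front();
    for (const Rect& r : rects_) bounds = UnionRects(bounds, r);
    rects_.assign(1, bounds);
  }

  std::size_t max_rects_;
  std::vector<Rect> rects_;
};
```

Unlike a region, the vector does not merge overlapping rects, so it trades some precision for cheaper accumulation. Whether that trade is acceptable would depend on what consumes the rects.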
,
Jun 2 2016
Looking a bit higher up the stack, for each invalidation rect we call CompositedLayerMapping::setContentsNeedDisplayInRect() which runs ApplyToGraphicsLayers to transform and pass the rect to cc::Layer, etc. A possible optimization is to collect the invalidation rects somewhere (like CompositedLayerMapping), then apply them to graphics layers in batch.
,
Jun 2 2016
As far as I can tell, it looks like nothing depends on InvalidationRegion being a region, so we could probably experiment with making it a vector and see whether it's the SkRegion complication or just the total rect count that makes a difference here.
,
Jun 2 2016
https://codereview.chromium.org/2033513003 tests out the theory in #24. It gives only ~3% extra improvement on the Design benchmark. There are a number of inconveniences to batching the invalidations, due to the needed offsetting, debug invalidation tracking, and conversions from IntRect -> blink::WebRect -> gfx::Rect. Maybe something to revisit after more SPv2 advancement.
,
Jun 3 2016
I don't like the complexity of the solution in comment 16. We should just optimize kMaxInvalidationRectCount, and I'll pursue GeometryMapper directly. I can work on kMaxInvalidationRectCount on Monday.
,
Jun 9 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/62045ceab0fd83013d6cb781ba727d3d3ce68e18

commit 62045ceab0fd83013d6cb781ba727d3d3ce68e18
Author: vmiura <vmiura@chromium.org>
Date: Thu Jun 09 01:13:46 2016

Speed up InvalidationRegion

Staging invalidation rectangles in a vector allows us to skip building a complex SkRegion for cases with many invalidations (> 256). For cases with fewer invalidations we will still build a full SkRegion.

R=enne@chromium.org
BUG= 606069
CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel

Review-Url: https://codereview.chromium.org/2054473002
Cr-Commit-Position: refs/heads/master@{#398755}

[modify] https://crrev.com/62045ceab0fd83013d6cb781ba727d3d3ce68e18/cc/base/invalidation_region.cc
[modify] https://crrev.com/62045ceab0fd83013d6cb781ba727d3d3ce68e18/cc/base/invalidation_region.h
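The staging approach in the commit message can be illustrated with a minimal sketch. It is not the code from the linked change: `StagedInvalidations`, `Finalize`, `Rect`, and `UnionRects` are hypothetical names, and the "few rects" branch simply returns the staged rects where the real code builds an SkRegion.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical integer rect; not gfx::Rect.
struct Rect {
  int x, y, width, height;
};

// Smallest rect that contains both a and b.
Rect UnionRects(const Rect& a, const Rect& b) {
  int left = std::min(a.x, b.x);
  int top = std::min(a.y, b.y);
  int right = std::max(a.x + a.width, b.x + b.width);
  int bottom = std::max(a.y + a.height, b.y + b.height);
  return {left, top, right - left, bottom - top};
}

// Stages rects cheaply and defers all region work until Finalize().
class StagedInvalidations {
 public:
  explicit StagedInvalidations(std::size_t max_rects) : max_rects_(max_rects) {
    assert(max_rects >= 1);
  }

  // Staging is just an append; no region arithmetic happens here.
  void Union(const Rect& rect) {
    if (rect.width <= 0 || rect.height <= 0) return;
    pending_.push_back(rect);
  }

  // Returns the rects to report and clears the staging vector. With more than
  // max_rects staged, a single bounding box is returned and the precise
  // region is never built.
  std::vector<Rect> Finalize() {
    std::vector<Rect> result;
    if (pending_.size() > max_rects_) {
      Rect bounds = pending_.front();
      for (const Rect& r : pending_) bounds = UnionRects(bounds, r);
      result.push_back(bounds);
    } else {
      // Assumption: stands in for building a precise region from few rects.
      result = pending_;
    }
    pending_.clear();
    return result;
  }

 private:
  std::size_t max_rects_;
  std::vector<Rect> pending_;
};
```

The point of the design is that the expensive structure is only built when the rect count is small enough for precision to be worth it.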
,
Jul 11 2016
,
Oct 4 2016
,
Jun 27 2017
Slimming Paint invalidation has fixed this bug. For example, on the "design" MotionMark benchmark, paint invalidation with fixed complexity on my Z620 goes from 7ms per frame to 1.7ms per frame.
Comment 1 by chrishtr@chromium.org
, Apr 22 2016