[Animometer] Bouncing Clipped Rects Regresses under GPU |
||||||||||||
Issue descriptionThe bouncing clipped rects tests (both CSS and SVG) under the Animometer benchmark both regress with GPU rasterization enabled. These two benchmarks (and the Suites benchmark, which doesn't regress, but could probably be improved) all seem to suffer from the same problem. In each case, we are drawing hundreds of clipped shapes. In each of these benchmarks, we end up uploading a mask image for each shape, which results in hundreds of uploads per frame. In these benchmarks there are either 1 or 4 unique clips, which we could hopefully get more re-use out of (rather than uploading 100s per frame). Perhaps treating these clips similar to text (where we upload distance field for a glyph and then re-use that) would lead to huge speed ups? The tests in question are: SVG Bouncing Clipped Rects: https://pr.gg/animometer/developer.html?test-interval=200&display=progress-bar&controller=ramp&frame-rate=50&kalman-process-error=1&kalman-measurement-error=4&time-measurement=performance&suite-name=SVGsuite&test-name=SVGbouncingclippedrects CSS Bouncing Clipped Rects: https://pr.gg/animometer/developer.html?test-interval=200&display=progress-bar&controller=ramp&frame-rate=50&kalman-process-error=1&kalman-measurement-error=4&time-measurement=performance&suite-name=HTMLsuite&test-name=CSSbouncingclippedrects Suits: https://pr.gg/animometer/developer.html?test-interval=200&display=progress-bar&controller=ramp&frame-rate=50&kalman-process-error=1&kalman-measurement-error=4&time-measurement=performance&suite-name=Animometer&test-name=Suits&complexity=100
,
Apr 26 2016
,
May 20 2016
Any updates on this issue?
,
May 20 2016
Re-assinging to bsalomon@ for triage - someone on Skia is probably better equipped to look at this.
,
May 20 2016
Florin, are both the SVG and CSS tests covered by your changes?
,
May 21 2016
I don't think so: my changes deal with removing unnecessary clips when drawing rounded-border box background layers, but these tests use arbitrary clipping to form the shapes (CSS webkit-clip-path, SVG clip-path).
The draw sequence for both CSS/SVG bouncing clip rects:
save
concat(matrix)
save
clipPath(shape)
drawRect(shape_bounds)
restore
restore
That doesn't look unreasonable, but since the clipped shape is fully contained in the following draw dest it could be reduced to
save
concat(matrix)
drawPath(shape)
restore
We might be able to detect these simple cases in SVG thanks to its simplified paint model, but I think it would be really tricky to pull this optimization for general CSS clips.
@Robert is this something that could be done as an SKP peephole optimization?
,
May 23 2016
It definitely could be a peep-hole optimization but that probably isn't the main performance bottleneck. If the pages are creating anti-aliased concave path clips it would probably be more productive to explore caching the generated masks on the Skia side and reusing the same path on the Blink side. The reuse of the paths is important since Skia uses the path ID as the key for finding cached results.
,
May 23 2016
Here is the skp for the SVGbouncingclipedrects case. The clip paths appear to be identical (i.e., positioned via a transform) but the Path IDs are all different. I haven't yet determined if this is an artifact of serialization.
,
May 23 2016
It looks like each instance of the star clip path is getting rebuild anew so all the Path IDs are different by the time they get to Skia (for the SVG bouncing clipped rects case).
,
May 23 2016
Can you tell if they are built in canvas space or are built in a canonical space and transformed into the canvas?
,
May 23 2016
[SVG case] The benchmark uses objectBoundingBox clipPath units => clips are defined in a normalized [0..1] space and then mapped onto the target rect. The SVG clipper code caches the normalized path, and then we apply the mapping on the fly, generating new paths on every element/paint: https://code.google.com/p/chromium/codesearch#chromium/src/third_party/WebKit/Source/core/layout/svg/LayoutSVGResourceClipper.cpp&l=140 We should be able to avoid that, looking.
,
May 23 2016
Ok there will be some work on the Skia side to take advantage of that as we currently apply transformations when storing clip paths in the SkClipStack.
,
May 23 2016
Here is the skp for the suits test case. It seems to randomly switch between drawing the concave shape as a rect through a clip and as a direct path draw. In all the direct path draw cases the path looks to be consistent (i.e., there is only one for each suit - hearts, clubs, spades and diamonds) and in a normalized coordinate system. For the clips the path is scaled and is thus pretty arbitrary. In both cases the path IDs always appear to be different.
,
May 23 2016
I believe the different path IDs in drawPath case is an artifact of the serialization (in the suits example). Looking at the raw path IDs we do get some reuse in the drawPath case but not as much as one would hope (i.e., there are way more than 4 path IDs passed to drawPath).
,
May 23 2016
The SkTMultiMap perf issues repro with the suits.skp file.
,
May 24 2016
,
May 24 2016
,
May 24 2016
The following revision refers to this bug: https://skia.googlesource.com/skia.git/+/c4ed68426649dd4ca2c3119cdafdd562d3c3ba28 commit c4ed68426649dd4ca2c3119cdafdd562d3c3ba28 Author: robertphillips <robertphillips@google.com> Date: Tue May 24 21:17:12 2016 Don't store resources with a unique key in GrResourceCache's fScratchMap The reasoning here is that resources with a unique key are never selected from fScratchMap and just clog up the search for an available resource. This knocks a 200x loop over the SVGbouncingrects case from 264ms down to 164ms. BUG=603969 GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2008083002 Review-Url: https://codereview.chromium.org/2008083002 [modify] https://crrev.com/c4ed68426649dd4ca2c3119cdafdd562d3c3ba28/src/core/SkTMultiMap.h [modify] https://crrev.com/c4ed68426649dd4ca2c3119cdafdd562d3c3ba28/src/gpu/GrResourceCache.cpp
,
May 25 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/1ccc328e95809a94e7c065e9427cd5d475ec541d commit 1ccc328e95809a94e7c065e9427cd5d475ec541d Author: skia-deps-roller <skia-deps-roller@chromium.org> Date: Wed May 25 02:08:40 2016 Roll src/third_party/skia/ 5d04fda87..c4ed68426 (2 commits). https://chromium.googlesource.com/skia.git/+log/5d04fda87747..c4ed68426649 $ git log 5d04fda87..c4ed68426 --date=short --no-merges --format='%ad %ae %s' 2016-05-24 robertphillips Don't store resources with a unique key in GrResourceCache's fScratchMap 2016-05-24 bsalomon Make Sk32ToBool inline again to silence compiler warning. GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2007753003 BUG=603969 CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel TBR=kjlubick@google.com Review-Url: https://codereview.chromium.org/2012593003 Cr-Commit-Position: refs/heads/master@{#395778} [modify] https://crrev.com/1ccc328e95809a94e7c065e9427cd5d475ec541d/DEPS
,
May 26 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/c64bd21780da13051846ae45b54a1e6c6bf6ac52 commit c64bd21780da13051846ae45b54a1e6c6bf6ac52 Author: fmalita <fmalita@chromium.org> Date: Thu May 26 18:17:00 2016 Consider ClipPathDisplayItems for GPU veto. Based on senorblanco's investigation, we currently don't veto complex clipPaths. Building on top of http://crrev.com/2000423005, refactor DisplayItem GPU veto to also feed ClipPathDisplayItem signals into SkPictureGpuAnalyzer. BUG=603969 R=chrishtr@chromium.org,senorblanco@chromium.org Review-Url: https://codereview.chromium.org/2013723002 Cr-Commit-Position: refs/heads/master@{#396232} [modify] https://crrev.com/c64bd21780da13051846ae45b54a1e6c6bf6ac52/third_party/WebKit/Source/platform/graphics/paint/ClipPathDisplayItem.cpp [modify] https://crrev.com/c64bd21780da13051846ae45b54a1e6c6bf6ac52/third_party/WebKit/Source/platform/graphics/paint/ClipPathDisplayItem.h [modify] https://crrev.com/c64bd21780da13051846ae45b54a1e6c6bf6ac52/third_party/WebKit/Source/platform/graphics/paint/DisplayItem.h [modify] https://crrev.com/c64bd21780da13051846ae45b54a1e6c6bf6ac52/third_party/WebKit/Source/platform/graphics/paint/DisplayItemList.cpp [modify] https://crrev.com/c64bd21780da13051846ae45b54a1e6c6bf6ac52/third_party/WebKit/Source/platform/graphics/paint/DrawingDisplayItem.cpp [modify] https://crrev.com/c64bd21780da13051846ae45b54a1e6c6bf6ac52/third_party/WebKit/Source/platform/graphics/paint/DrawingDisplayItem.h [modify] https://crrev.com/c64bd21780da13051846ae45b54a1e6c6bf6ac52/third_party/WebKit/Source/platform/graphics/paint/PaintController.cpp [modify] https://crrev.com/c64bd21780da13051846ae45b54a1e6c6bf6ac52/third_party/WebKit/Source/platform/graphics/paint/PaintControllerTest.cpp
,
May 26 2016
senorblanco discovered that we're not taking complex clip paths into account for GPU veto. c64bd21780da13051846ae45b54a1e6c6bf6ac52 fixes that, and causes us to use MSAA (when available) or software fallback for these concave-clip-heavy tests. Some numbers for a Linux/XeonE5-2690/Quadro K2200/MSAA-enabled box (maintain target FPS mode): ( CSS clips | SVG clips | Suites ) * CPU: 397-431 | 709-735 | 621-646 * GPU-prev: 298-334 | 376-395 | 510-546 * GPU-ToT: 481-532 | 744-788 | 667-723 Looks like the regression is fixed (worked around?) on ToT. Would be interesting to see some Mac (and possibly non-MSAA - although I'd expect these to be about the same as CPU) numbers, if anyone has the hw and bandwidth.
,
May 26 2016
\o/
,
Jun 7 2016
Is this fix confirmed?
,
Jun 7 2016
Unfortunately there's an issue with this change. Once we veto ganesh to software, the page remains in software until reloaded. On my MBP with Intel GPU, the 'CSS bouncing clipped rects' test improves a lot, but subsequent tests (if ran in a row) regress because GPU is off. CSS bouncing clipped rects +129.26% CSS bouncing gradient circles -37.80% CSS bouncing blend circles -69.41% CSS bouncing filter circles -53.78% CSS bouncing SVG images -85.35% CSS bouncing tagged images -79.13% Leaves 2.0 -36.58% DOM particles, SVG masks -55.21%
,
Jun 7 2016
In general I don't think the GPU Veto is a good idea, we need to dynamically decide when to software and when to GPU raster. That's what CG is doing.
,
Jun 7 2016
That's unfortunate. In the medium term, we should look at applying the GPU veto on a more frequent basis. In the short term, we should probably revert just the change to ClipPathDisplayItem, so we can choose to re-enable it later.
,
Jun 7 2016
Re c#24: I find the sticky GPU veto surprising. @chrishtr is this expected? Aren't we supposed to reset the state when the display item list gets updated?
,
Jun 7 2016
I see, the stickiness is not from Blink/recording side but from CC/LayerTreeHost: https://cs.chromium.org/chromium/src/cc/trees/layer_tree_host.cc?rcl=0&l=737 I'll revert the clipPath CL while we decide whether that's something we want to change. (it's possible to construct arbitrary benchmarks which trigger the same issue due to sticky drawPath vetoes)
,
Jun 7 2016
One of the problems is that our GPU raster is using very different tile sizes. I think CG will pick software or gpu on a per CALayer basis, and Safari is using a tile per layer, so they get veto on a per tile basis.
,
Jun 7 2016
If we veto-ed per-tile, you'd likely see artifacts along CPU/GPU borders. Skia's rendering can differ quite markedly between CPU and GPU. This was the original motivation for the "sticky" veto, IIRC: to avoid flashing if a layer toggled between veto/non-veto. OTOH, Chrome already does that in some other cases, e.g., a layer can toggle between accelerated/non-accelerated as it's animated/stops animating. But those differences tend to be smaller than Ganesh vs. non-Ganesh, and more predictable in timing (start/end of animation).
,
Jun 7 2016
Yeah, it'd be interesting to study WebKit running the benchmarks and see what's really going on. Perhaps this isn't a software vs hardware thing for them, perhaps I'm wrong about the per tile software mode as well.
,
Jun 7 2016
We could try to go for a system where we veto per-layer. The current rasterizer could support hybrid software & GPU raster, modulo some trickiness with managing image decoding and caching. The main concern with changing raster mode per-layer is that if layering decisions change back & forth in a dynamic way, there could be visual issues such as flicker or shaking.
,
Jun 7 2016
I think CG/Safari handle that with periodic stickiness times. For example they only reconsider the giant tiles decision every 500ms, so you don't jump back and forth between two tile sizes every frame.
,
Jun 7 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/4e4d5e1f086786afff93d0d7e0a6eb5f5fcb276f commit 4e4d5e1f086786afff93d0d7e0a6eb5f5fcb276f Author: fmalita <fmalita@chromium.org> Date: Tue Jun 07 21:26:08 2016 Temporarily disable ClipPathDisplayItem accounting for GPU veto The GPU veto is persistent per root layer. Reverting http://crrev.com/2013723002 while we sort some regressions out. BUG=603969 R=senorblanco@chromium.org,chrishtr@chromium.org Review-Url: https://codereview.chromium.org/2039373003 Cr-Commit-Position: refs/heads/master@{#398371} [modify] https://crrev.com/4e4d5e1f086786afff93d0d7e0a6eb5f5fcb276f/third_party/WebKit/Source/platform/graphics/paint/ClipPathDisplayItem.cpp [modify] https://crrev.com/4e4d5e1f086786afff93d0d7e0a6eb5f5fcb276f/third_party/WebKit/Source/platform/graphics/paint/PaintControllerTest.cpp
,
Jun 8 2016
,
Jun 12 2016
,
Oct 25 2016
We landed a change to update the GPU rasterization veto on a frequent (1s?) basis. Should we re-land the change from #34.
,
Mar 7 2017
|
||||||||||||
►
Sign in to add a comment |
||||||||||||
Comment 1 by ojan@chromium.org
, Apr 22 2016