[Animometer] CSS Bouncing Gradient Circles Performance |
|||||||||
Issue descriptionthe Animometer benchmark has a test "CSS Bouncing Gradient Circles" - this test is worse in GPU raster than in SW, while most similar tests (without gradients) are much better with GPU raster. Additionally, there is a similar test "SVG Bouncing Gradient Circles", which performs a similar animation, but animates an SVG rather than divs with CSS. In the SVG version, both SW and GPU drastically outperform the CSS version. Additionally, we see a significant improvement from using GPU. Given that both of these appear to be producing the same output, there might be some fast path from the SVG version that could be applied to the CSS version. The tests in question can be found here: CSS Bouncing Gradient Circles: https://pr.gg/animometer/developer.html?test-interval=200&display=progress-bar&controller=ramp&frame-rate=50&kalman-process-error=1&kalman-measurement-error=4&time-measurement=performance&suite-name=HTMLsuite&test-name=CSSbouncinggradientcircles SVG Bouncing Gradient Circles: https://pr.gg/animometer/developer.html?test-interval=200&display=progress-bar&controller=ramp&frame-rate=50&kalman-process-error=1&kalman-measurement-error=4&time-measurement=performance&suite-name=SVGsuite&test-name=SVGbouncinggradientcircles
,
Apr 15 2016
,
Apr 20 2016
I grabbed a SKP of the CSS version. The circles are being rendered by blink as rectangles drawn through a circular clip. We have a plan to be able to batch across clip changes (in many cases) but it is pretty far off. In the short term the way to boost the speed here would be to draw circles instead of using circular clips on rectangles.
,
Apr 20 2016
The Skia translation is certainly more complicated than it needs to be:
save
clipRect(AA: false)
clipRRect(AA: true)
save
clipRect(AA: false)
drawRect()
restore
retore
Technically this could be expressed as a single drawRRect op.
BoxPainter already has a fast path which applies this sort of optimization, but it currently only works for simple colors: https://code.google.com/p/chromium/codesearch#chromium/src/third_party/WebKit/Source/core/paint/BoxPainter.cpp&l=304
I don't think it'd be terribly complicated to extend the fast path for images/shaders, just a bunch of plumbing because the GC image drawing API is currently ill suited for the task. Looking into it, feel free to assign to me if this is the only lead.
,
Apr 20 2016
,
Apr 22 2016
,
May 5 2016
Experimental CL which adds a rounded-rect image fast path: https://codereview.chromium.org/1949253004 (the draw sequence in c#4 is collapsed into a single drawRRect) CSS Bouncing Gradient Circles on my Linux workstation shows a ~12x improvement for Ganesh (28->360), and ~2.5x improvement for SW (136->340). So the approach is promising, but intrusive and doesn't play nice with CC's ImageDecodeController (images drawn via shaders are not intercepted). The good news is we're also looking at refactoring how draw image info is passed to CC, so this will likely fit much better on top of the new model (shipping ImageDisplayItems to CC). But it'll probably take a while for all the pieces to fall into place.
,
May 6 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/26528184b3b96533811a96106193d33dd8693e87 commit 26528184b3b96533811a96106193d33dd8693e87 Author: fmalita <fmalita@chromium.org> Date: Fri May 06 20:48:47 2016 Avoid unnecessary clipping of gradient/generated images Refactor GradientGeneratedImage::draw() to a) use an adjusted draw rect instead of explicit/unconditional clipping b) completely avoid save/restores when not needed (identity transform) This speeds up Animometer/CSS Bouncing Gradient Circles by ~25% in software mode (no noticeable improvement for Ganesh). BUG= 603966 R=schenney@chromium.org,reed@google.com Review-Url: https://codereview.chromium.org/1955683002 Cr-Commit-Position: refs/heads/master@{#392145} [modify] https://crrev.com/26528184b3b96533811a96106193d33dd8693e87/third_party/WebKit/LayoutTests/TestExpectations [modify] https://crrev.com/26528184b3b96533811a96106193d33dd8693e87/third_party/WebKit/Source/platform/graphics/GradientGeneratedImage.cpp
,
May 6 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/1f8fcd6a9439737814f2d6fa6294a7e77b6c0420 commit 1f8fcd6a9439737814f2d6fa6294a7e77b6c0420 Author: Rebaseline Bot <blink-rebaseline-bot@chromium.org> Date: Fri May 06 21:44:51 2016 Auto-rebaseline for r392145 https://chromium.googlesource.com/chromium/src/+/26528184b BUG= 603966 TBR=fmalita@chromium.org Review URL: https://codereview.chromium.org/1960773002 . Cr-Commit-Position: refs/heads/master@{#392166} [modify] https://crrev.com/1f8fcd6a9439737814f2d6fa6294a7e77b6c0420/third_party/WebKit/LayoutTests/TestExpectations [modify] https://crrev.com/1f8fcd6a9439737814f2d6fa6294a7e77b6c0420/third_party/WebKit/LayoutTests/platform/linux/fast/css/transformed-mask-expected.png [modify] https://crrev.com/1f8fcd6a9439737814f2d6fa6294a7e77b6c0420/third_party/WebKit/LayoutTests/platform/mac/fast/css/transformed-mask-expected.png [modify] https://crrev.com/1f8fcd6a9439737814f2d6fa6294a7e77b6c0420/third_party/WebKit/LayoutTests/platform/mac/fast/reflections/reflection-with-zoom-expected.png [modify] https://crrev.com/1f8fcd6a9439737814f2d6fa6294a7e77b6c0420/third_party/WebKit/LayoutTests/platform/win/fast/css/transformed-mask-expected.png [modify] https://crrev.com/1f8fcd6a9439737814f2d6fa6294a7e77b6c0420/third_party/WebKit/LayoutTests/platform/win/fast/reflections/reflection-with-zoom-expected.png
,
May 10 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/bef2b571b0d6d9571876f88341f47b350af3f6ce commit bef2b571b0d6d9571876f88341f47b350af3f6ce Author: fmalita <fmalita@chromium.org> Date: Tue May 10 21:48:22 2016 Avoid unneeded RoundedInnerRectClipper rect clips ClipDisplayItem always applies a rectangular clip, and optionally a list of rounded-rect clips. As its client, RoundedInnerRectClipper is only interested in applying the rounded-rect clips - so it passes a synthetic "infinite" rect clip. This clip is pure overhead. Updated ClipDisplayItem to detect and ignore this clip. While the hood's up, remove the explicit FloatRoundedRect -> SkRRect conversion (FloatRoundedRect already implements a SkRRect conversion operator). This yields a ~23% improvement for Animometer/CSS Bouncing Gradient Circles, in software mode only (no change for Ganesh). BUG= 603966 R=chrishtr@chromium.org CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel Review-Url: https://codereview.chromium.org/1961693002 Cr-Commit-Position: refs/heads/master@{#392734} [modify] https://crrev.com/bef2b571b0d6d9571876f88341f47b350af3f6ce/third_party/WebKit/Source/platform/graphics/paint/ClipDisplayItem.cpp
,
May 24 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/4c1400381d16b6dcd8921a8f19a2ae9af4aac4cf commit 4c1400381d16b6dcd8921a8f19a2ae9af4aac4cf Author: fmalita <fmalita@chromium.org> Date: Tue May 24 00:53:09 2016 Rounded background image fast path The current box background paint logic relies on clipping for drawing rounded-rect background images. This can yield some pathological rasterization slowdowns on the GPU backend. In the case where a single image tile covers the whole box (no fancy tiling involved/required), a more efficient way to draw is to use a Skia shader and just fill the rounded rect shape (skipping the unnecessary clip). This CL implements such an optimization: 1) Introduce a new Image virtual (applyShader) which configures the paint for filling arbitrary shapes with the image content. 2) Add a new GraphicsContext method for drawing rounded-rect images (drawRRectImage) using shaders build via #1. 3) Expand BoxPainter's fillFastBottomLayer() fast path to detect when tiling is not required and use #2 for painting rounded-rect background images. This avoids clipping for simple rounded-rect background images/gradients, and improves GPU rasterization times significantly in some cases. In local testing, Animometer/CSS-bouncing-gradient-circles shows a ~26% score improvement with the software backend (300 -> 370) and a 14X score improvement with Ganesh (27 -> 380). BUG= 603966 R=fs@opera.com,chrishtr@chromium.org,schenney@chromium.org CQ_INCLUDE_TRYBOTS=tryserver.chromium.win:win_optional_gpu_tests_rel;tryserver.chromium.mac:mac_optional_gpu_tests_rel Review-Url: https://codereview.chromium.org/1949253004 Cr-Commit-Position: refs/heads/master@{#395492} [modify] https://crrev.com/4c1400381d16b6dcd8921a8f19a2ae9af4aac4cf/third_party/WebKit/Source/core/paint/BackgroundImageGeometry.h [modify] https://crrev.com/4c1400381d16b6dcd8921a8f19a2ae9af4aac4cf/third_party/WebKit/Source/core/paint/BoxPainter.cpp [modify] https://crrev.com/4c1400381d16b6dcd8921a8f19a2ae9af4aac4cf/third_party/WebKit/Source/platform/graphics/GradientGeneratedImage.cpp [modify] https://crrev.com/4c1400381d16b6dcd8921a8f19a2ae9af4aac4cf/third_party/WebKit/Source/platform/graphics/GradientGeneratedImage.h [modify] https://crrev.com/4c1400381d16b6dcd8921a8f19a2ae9af4aac4cf/third_party/WebKit/Source/platform/graphics/GraphicsContext.cpp [modify] https://crrev.com/4c1400381d16b6dcd8921a8f19a2ae9af4aac4cf/third_party/WebKit/Source/platform/graphics/GraphicsContext.h [modify] https://crrev.com/4c1400381d16b6dcd8921a8f19a2ae9af4aac4cf/third_party/WebKit/Source/platform/graphics/Image.cpp [modify] https://crrev.com/4c1400381d16b6dcd8921a8f19a2ae9af4aac4cf/third_party/WebKit/Source/platform/graphics/Image.h
,
May 24 2016
After c#11, the GPU score is on par with the CPU. The numbers are somewhat noisy, but on a Xeon E5-2690/Quadro K2200/Linux, I'm getting the following ranges (Ramp @30fps) - before 4c1400381d16b6dcd8921a8f19a2ae9af4aac4cf: 645-724 [cpu] vs. 40-55 [gpu] - after 4c1400381d16b6dcd8921a8f19a2ae9af4aac4cf: 708-810 [cpu] vs. 703-812 [gpu] These are still much lower than the SVG scores on the same HW (~1500), but since the draw commands are roughly the same (save-concat-drawRRect-restore vs. save-concat-drawOval-restore), I suspect the bottleneck is no longer in rasterization.
,
May 24 2016
Can you take a quick trace and post any high-level summary? In particular, is it now bound by paint invalidation?
,
May 25 2016
Attaching traces & high-level screenshots for CSS gradients, SVG gradients and Suits (added for comparison), all in GPU raster mode. Someone better than myself at reading traces should take a look, but here's my interpretation: * By its nature, this benchmark pushes the system to the limit - in maintain target FPS mode it increases the number of objects until it can no longer maintain the desired FPS. This means that we're always going to bottleneck on *something*. * Looking at various processes/threads in traces, the busiest one is likely to be the bottleneck. * Given the above: - both CSS and SVG gradients are now main-thread-bound; this is consistent with the earlier observation that CPU-vs-GPU snumbers are roughly equivalent for these benchmarks. - Suits (known to be slow due to GPU clipping/rasterization) OTOH shows a much lighter main thread workload and is instead bound by the GPU thread. * The main difference between the CSS and SVG main thread workloads appears to be style recalc: the former spends ~7ms/frame on updateStyle() while for the latter doesn't even show this phase. Un-assigning myself in case someone wants to pursue this further.
,
May 25 2016
SVG uses an attribute to represent transform, so there's no need to recalc style: https://code.google.com/p/chromium/codesearch#chromium/src/third_party/WebKit/Source/core/svg/SVGGraphicsElement.cpp&q=SVGElement%20transformAttr&sq=package:chromium&type=cs&l=180 How do we compare to Safari on both of those?
,
May 25 2016
I don't have a Mac handy to check, can you (or someone else) try it? (IIRC Mike looked at CSS gradients on his Mac workstation and we were beating Safari after the patch, but I don't remember the numbers)
,
Oct 25 2016
Most recent numbers are better than Safari on all the bouncing gradients tests. Seems like we can just close this out (other bugs are tracking some of the other issues mentioned, such as suits). |
|||||||||
►
Sign in to add a comment |
|||||||||
Comment 1 by ericrk@chromium.org
, Apr 15 2016