Project: skia Issues People Development process History Sign in
New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.
Starred by 4 users
Status: Accepted
Owner:
Cc:
Area: GPU
Priority: Medium
Type: Performance

Blocked on:
issue 3388



Sign in to add a comment
desk_carsvg.skp is slow on msaa and nvprmsaa
Project Member Reported by kkinnu...@nvidia.com, Nov 12 2014 Back to list
According to skiaperf.com, desk_carsvg.skp is slow on NVPR.
This maybe fixable

 
Project Member Comment 1 by kkinnu...@nvidia.com, Nov 12 2014
Cc: bsalo...@google.com
Brian, sorry to bother you with this, but:
Would it be possible to know from which web page desk_carsvg.skp was recorded from?
Project Member Comment 2 by bsalo...@google.com, Nov 12 2014
desk_carsvg.skp
241 KB Download
Project Member Comment 3 by kkinnu...@nvidia.com, Nov 13 2014
Cc: cdal...@nvidia.com vbuzi...@nvidia.com
The content is http://commons.wikimedia.org/wiki/File:Car_Yellow.svg

So this slowness is common to both msaa4 and nvprmsaa4. 

The content has lots of sequences with "save, clip, save layer, drawpath, restore, restore" and similar patterns.

The additional calls these generate are similar to the pattern below: 

../../src/gpu/gl/GrGpuGL.cpp(1012) : GL: BindTexture(0x0DE1, 0)
../../src/gpu/gl/GrGpuGL.cpp(816) : GL: GenFramebuffers(1, &idDesc->fTexFBOID)
../../src/gpu/gl/GrGpuGL.cpp(827) : GL: GenFramebuffers(1, &idDesc->fRTFBOID)
../../src/gpu/gl/GrGpuGL.cpp(828) : GL: GenRenderbuffers(1, &idDesc->fMSColorRenderbufferID)
../../src/gpu/gl/GrGpuGL.cpp(847) : GL: BindRenderbuffer(0x8D41, idDesc->fMSColorRenderbufferID)
../../src/gpu/gl/GrGpuGL.cpp(777) : GL: RenderbufferStorageMultisample(0x8D41, sampleCount, format, width, height)
../../src/gpu/gl/GrGpuGL.cpp(855) : GL: BindFramebuffer(0x8D40, idDesc->fRTFBOID)
../../src/gpu/gl/GrGpuGL.cpp(859) : GL: FramebufferRenderbuffer(0x8D40, 0x8CE0, 0x8D41, idDesc->fMSColorRenderbufferID)
../../src/gpu/gl/GrGpuGL.cpp(862) : GL: CheckFramebufferStatus(0x8D40)
../../src/gpu/gl/GrGpuGL.cpp(870) : GL: BindFramebuffer(0x8D40, idDesc->fTexFBOID)
../../src/gpu/gl/GrGpuGL.cpp(881) : GL: FramebufferTexture2D(0x8D40, 0x8CE0, 0x0DE1, texID, 0)
../../src/gpu/gl/GrGpuGL.cpp(885) : GL: CheckFramebufferStatus(0x8D40)
../../src/gpu/gl/GrGpuGL.cpp(1214) : GL: BindFramebuffer(0x8D40, fbo)
../../src/gpu/gl/GrGpuGL.cpp(1217) : GL: FramebufferRenderbuffer(0x8D40, 0x8D20, 0x8D41, rb)
../../src/gpu/gl/GrGpuGL.cpp(1225) : GL: FramebufferRenderbuffer(0x8D40, 0x8D00, 0x8D41, 0)
../../src/gpu/gl/GrGpuGL.cpp(1230) : GL: CheckFramebufferStatus(0x8D40)
Ugh, Car_Yellow.svg has many <feGaussianBlur> (separable) filters scattered through it.

I'm guessing that's the problem.

(I thought this might be a non-1.0 group opacity issue but I don't see that as a problem.)

The GPU should be good at Gaussian blurs.  Compute shaders make these very efficient as they are separable.

But if OpenGL FBOs are being dynamically allocated, that's going to perform terribly.
Project Member Comment 5 by kkinnu...@nvidia.com, Dec 19 2014
Cc: robertph...@google.com
Filters are not an issue at this point, since they are not part of the .skp recordings at all. The bug for that is:
https://code.google.com/p/chromium/issues/detail?id=409945

The one part of the  slowdown is due to following factors:
1) The testcase has paths that contain css opacity < 1. Blink creates transparency layers (SkCanvas::saveLayer() ) for these.

2) The saveLayers are not optimized away, rather most paths are drawn to their own Ε•ender targets and then blit to textures and then blended to the original content.

3) Each layer is MSAA16. This causes the 96mb resource cache to be filled almost immediately, since some of the paths are large. This causes each new render target allocation to purge an existing render target. This runs GL delete code such as glDeleteRenderbuffers.

4) Smaller problem is that the picture contains rect clips. These end up stencilled with NVPR as paths. A new SkPath is created for each rect by the clip stack manager. This means that the clip rects as paths are not cacheable, because cache works with path ids. When the picture is played back enough times, these paths fill the cache
(item count increases until it reaches maximum). The new clips force random resources to be purged. These resources might be render targets or stencil buffers or paths.

It appears that in order to trigger the adverse behavior the paths that are drawn with css opacity need to also have a pattern as fill. I assume this is an optimization in Skia's recording optimization process or layer hoisting system: it notices that a single and simple drawPath is inside saveLayer()/restore(), so the process transfers the opacity in the color of the drawpath and removes saveLayer().

So as an example, following SVG path has an problem:

<path style="opacity:0.7625;fill:url(#grad);stroke:none" d="...." />

One partial fix for the slowdown would be to remove the need for the layers for this case. Maybe the layer reduction could apply also to a draw with a complex fill?

This maybe could be done either in Blink or then as part of the (assumed) Skia picture recording optimization code. 

CC:robertphillips: would you have an suggestion where (and whether) I should try to fix this?

save-layer-reduction-2.svg
643 bytes Download
Project Member Comment 6 by kkinnu...@nvidia.com, Dec 19 2014
Summary: desk_carsvg.skp is slow on msaa and nvprmsaa (was: desk_carsvg.skp is slow on NVPR)
Some filters are being captured in this skp (only the SkPictureImageFilters should be missing in general). I have updated the debugger so the existing filters are visible.

The SaveLayer|DrawPath|Restores can't, in most cases, be (easily) optimized away since the gradient applied to the path is partially transparent and the layer is adding its own transparency. 

Another horrible pattern that is appearing in car.svg is: SaveLayer1|SaveLayer2|Restore|Restore. There could be a missing SkPictureImageFilter on SaveLayer2 but, as is, SaveLayer2 currently just has an SkColorFilterImageFilter (which prevents it from being optimized away) but the whole pattern doesn't draw anything.

One thing to try is to add a peep hole optimizer that looks for SaveLayer|Draw|Restore patterns (where the Draw just has a LinearGradient) and then folding the alpha from the layer into the draw's paint. (I'm actually not sure if this will work.)


Project Member Comment 8 by robertph...@google.com, Dec 19 2014
Cc: reed@google.com
This should be as easy as removing the "paint.getShader()" check from HasAnyEffect in SkRecordOpts.cpp. This would allow alpha folding for single draw layers where the draw's paint's alpha is opaque.
Project Member Comment 9 by bugdroid1@chromium.org, Dec 21 2014
The following revision refers to this bug:
  https://skia.googlesource.com/skia.git/+/ca32da7533ed86fbbe50e078c751b0f1c198bb5b

commit ca32da7533ed86fbbe50e078c751b0f1c198bb5b
Author: robertphillips <robertphillips@google.com>
Date: Sun Dec 21 10:52:01 2014 -0800

Allow the alpha folding optimization if the single draw's paint has a shader

Let's land this and then leave for 2 weeks.

BUG=skia:3119

Review URL: https://codereview.chromium.org/817033002

[modify] http://crrev.com/ca32da7533ed86fbbe50e078c751b0f1c198bb5b/src/core/SkRecordOpts.cpp

Project Member Comment 10 by robertph...@google.com, Dec 21 2014
The alpha folding CL removes 57 out of the 179 layers in the desk_carsvg test case. The remaining instances seem to fall into the following cases:

1) SaveLayer|DrawPath|DrawPath|Restore
2) SaveLayer1|SaveLayer2|Restore|Restore where SL2 has a SkColorFilterImageFilter
3) SaveLayer|DrawPath|Restore where both paints have an alpha

Case 3 could be pretty easily optimized away with some care. Case 1 is here to stay since AFAICT, in all cases, the two paths are the same for both draws but have different paints. Case 2 could go away (unless there is some missing SkPictureImageFilter that is getting deleted) but will take some work.
Project Member Comment 11 by robertph...@google.com, Dec 22 2014
I have captured an skp with the PictureImageFilters (attached but to use one needs to comment out the isCrossProcess lines in SkPictureImageFilter.cpp). The SaveLayer1|SaveLayer2|Restore|Restore patterns do indeed contain an SkPictureImageFilter. The exact filter DAG (on SaveLayer2) appears to be:

ColorFilterImageFilter1 (TableColorFilter) <- BlurImageFilter <- ColorFilterImageFilter2 (TableColorFilter) <- PictureImageFilter

where the TableColorFilters have different tables. So ... the SaveLayer1|SaveLayer2|Restore|Restore patterns would need to be changed upstream in the Blink code.
car-w-picfilters.skp
297 KB Download
Project Member Comment 12 by bugdroid1@chromium.org, Jan 16 2015
The following revision refers to this bug:
  https://skia.googlesource.com/skia.git/+/678c1b019ac98bc7d94841132c8105a77490bc64

commit 678c1b019ac98bc7d94841132c8105a77490bc64
Author: kkinnunen <kkinnunen@nvidia.com>
Date: Fri Jan 16 05:04:36 2015 -0800

Fold alpha to the draw in savelayer-draw-restore patterns with non-opaque draw

Fold alpha of a save layer call to the subsequent draw call paint, even when
the draw call paint color is already non-opaque.

Comparing the difference of the unoptimized and the optimized call pattern
with all (layer alpha, draw alpha) combinations, this produces
off-by-one pixels ~50% of the time.

Reduces layers of current desk_carsvg.skp (recorded with cross-process
picture image filters allowed) from 122 to 115.

BUG=skia:3119

Review URL: https://codereview.chromium.org/840483005

[modify] http://crrev.com/678c1b019ac98bc7d94841132c8105a77490bc64/src/core/SkRecordOpts.cpp
[modify] http://crrev.com/678c1b019ac98bc7d94841132c8105a77490bc64/tests/RecordOptsTest.cpp

Project Member Comment 13 by bugdroid1@chromium.org, Jan 26 2015
The following revision refers to this bug:
  https://skia.googlesource.com/skia.git/+/dc0f408a961b6ce4e7631e06031bbfe1d8d7bcec

commit dc0f408a961b6ce4e7631e06031bbfe1d8d7bcec
Author: kkinnunen <kkinnunen@nvidia.com>
Date: Mon Jan 26 00:14:26 2015 -0800

Fold alpha to the inner savelayer in savelayer-savelayer-restore patterns

Fold alpha to the inner savelayer in savelayer-savelayer-restore
patterns such as this:

  SaveLayer (non-opaque)
    Save
      ClipRect
      SaveLayer
      Restore
    Restore
  Restore

Current blink generates these for example for SVG content such as this:

<path style="opacity:0.5 filter:url(#blur_filter)"/>

The outer save layer is due to the opacity and the inner one is due to
blur filter being implemented with picture image filter.

Reduces layers in desk_carsvg.skp testcase from 115 to 78.

BUG=skia:3119

Review URL: https://codereview.chromium.org/835973005

[add] http://crrev.com/dc0f408a961b6ce4e7631e06031bbfe1d8d7bcec/gm/recordopts.cpp
[modify] http://crrev.com/dc0f408a961b6ce4e7631e06031bbfe1d8d7bcec/gyp/gmslides.gypi
[modify] http://crrev.com/dc0f408a961b6ce4e7631e06031bbfe1d8d7bcec/src/core/SkRecordOpts.cpp
[modify] http://crrev.com/dc0f408a961b6ce4e7631e06031bbfe1d8d7bcec/src/core/SkRecordOpts.h
[modify] http://crrev.com/dc0f408a961b6ce4e7631e06031bbfe1d8d7bcec/src/core/SkRecordPattern.h
[modify] http://crrev.com/dc0f408a961b6ce4e7631e06031bbfe1d8d7bcec/tests/RecordOptsTest.cpp

Project Member Comment 14 by bugdroid1@chromium.org, Jan 27 2015
The following revision refers to this bug:
  https://skia.googlesource.com/skia.git/+/36c57dfb4fe9bbaca436942d5eaa75b142ba251d

commit 36c57dfb4fe9bbaca436942d5eaa75b142ba251d
Author: kkinnunen <kkinnunen@nvidia.com>
Date: Tue Jan 27 00:30:18 2015 -0800

Make stencil buffers uncached for uncached render target textures

Make new stencil buffers of uncached render target textures not affect the
cache budgets. This is consistent with render buffer storage of uncached
render target textures.

Affects only newly created stencil buffers. An uncached render target
might still receive a cached stencil buffer if such is available from
cache.

BUG=skia:3119
BUG=skia:3301

Review URL: https://codereview.chromium.org/859013002

[modify] http://crrev.com/36c57dfb4fe9bbaca436942d5eaa75b142ba251d/src/gpu/GrGpu.cpp
[modify] http://crrev.com/36c57dfb4fe9bbaca436942d5eaa75b142ba251d/src/gpu/GrGpu.h
[modify] http://crrev.com/36c57dfb4fe9bbaca436942d5eaa75b142ba251d/src/gpu/GrStencilBuffer.h
[modify] http://crrev.com/36c57dfb4fe9bbaca436942d5eaa75b142ba251d/src/gpu/GrTest.cpp
[modify] http://crrev.com/36c57dfb4fe9bbaca436942d5eaa75b142ba251d/src/gpu/gl/GrGLGpu.cpp
[modify] http://crrev.com/36c57dfb4fe9bbaca436942d5eaa75b142ba251d/src/gpu/gl/GrGLGpu.h
[modify] http://crrev.com/36c57dfb4fe9bbaca436942d5eaa75b142ba251d/src/gpu/gl/GrGLStencilBuffer.h

Project Member Comment 15 by kkinnu...@nvidia.com, Feb 3 2015
Blockedon: skia:3388
Project Member Comment 16 by ethannicholas@google.com, Aug 12 2015
Labels: -Type-Defect Type-Performance
Sign in to add a comment