New issue
Advanced search Search tips
Starred by 2 users

Issue metadata

Status: Accepted
Area: ----
Priority: Medium
Type: Defect

Sign in to add a comment

nanobench and SKPBench are suboptimal for msaa

Project Member Reported by, Jan 13 2015 Back to list

Issue description

This issue came up while inspecting MSAA (NVPR, ofc) performance with nanobench:

With MSAA backends, using nanobench to benchmark non-trivial .skp files runs into performance problems related to gpu resource cache filling up of surfaces, even though in "real world", the cache might not or should not fill up for these.

Consider nanobench drawing a skp that is 1000x1000:

1) nanobench creates a big surface of 1000x1000
2) SKPBench creates n tile surfaces
3) SKPBench replays the picture to the tile surfaces (loops x samples) times
4) SKPBench draws tiles to the big surface

The problem is that if picture creates any layers, it will probably start trashing render targets. The cache fills up due to nanobench surface and skp tile surfaces. Any layer created by the picture causes a render target deletion. 

Deleting render targets is very slow, and it's not *that* representative of Chromium rasterization, as the nanobench surface is not present there.

The issue details:

The render target *texture* is created uncached, but MSAA render target storage is created cached.

One screenful of MSAA16 needs at least 1000x1000x4x16 + 1000x1000x1x16 = 64M + 4M in cache. Twice this is >100M.

Each nanobench frame will clear the nanobench surface and resolve the msaa, even though the skpbench does not draw anything there within the frame. This skews the preformance report a bit (unneccessary clear and resolve), but also forces re-allocation of the big surface render targets if cache was able to free it, for each frame.

Would it be possible to change nanobench in following way:

1) in "per canvas draw" testcases, the big canvas would be allocated only after the per canvas draw timing test was executed. E.g. big canvas would maybe be used only for the per canvas post draw screenshot.

2) the clears would apply to the tiled canvases, not the big canvas

How would this be implemented, and who would want to implement it? (I can, if it is ok)

Would for example following design be ok:

a) Make current Benchmark a baseclass that had Benchmark::time or similar function which would do the draw looop

b) Make the surface and canvas allocation responsibility of the benchmark

c) Make normal tests be instances of DrawBenchmark (which is-a Benchmark)

d) Make per canvas tests be instances of PerCanvasDrawBenchmark (which is-a Benchmark)

This way "::draw(Canvas* canvas)" methods would be only on benchmarks that actually did draw to the canvas. Also "::onPerCanvasPreDraw(SomeInfoOnHowToAllocateSurfaces*)" would be only on benchmarks that do tiling (the name might be referring to allocation instead of predraw, too)

Project Member

Comment 1 by, Jan 13 2015

mtklein@ would have a better idea of the implementation details but the proposal seems ok to me. I think the large canvas should only be needed for a screenshot and drawing to it shouldn't be timed.
Project Member

Comment 2 by, Jan 14 2015

Hmm.  Yeah, the more I look at this the more SKPBench doesn't seem to fit.  I agree:
  1) The calls to canvas->clear(SK_ColorWHITE) and canvas->flush() are pointless if we're not drawing to that canvas.
  2) Allocating the giant 1000x1000 surface is pointless/harmful if we don't draw into it.
  3) We only need that giant surface if we're going to write out the images.
  4) If we're just using it to write out images, we can just read-back each tile independently into CPU RAM for compositing instead of using GPU resources.
  5) We shouldn't be timing the compositing, PNG encoding, or file IO.

As far as I can tell, we're at least OK on 5) for now.  The others need some thinking.  I think probably the best way to fix this is admit SKPBench should not inherit from Benchmark, and just have its own path for timing.
Project Member

Comment 3 by, Jan 15 2015

>Hmm.  Yeah, the more I look at this the more SKPBench doesn't seem to fit.  I agree:

Ok. What about the last question, eg. "who would want to implement it?"
Project Member

Comment 4 by, Jan 15 2015

From chatting with Mike yesterday, it sounded like he wanted to take a crack at it.

I may have made it so that the MSAA surface does not count against the cache budget here:

I think it was a mistake that the MSAA surface was counted against the cache budget when the texture was not supposed to count.
Project Member

Comment 5 by, Jan 27 2015

The following revision refers to this bug:

commit 36c57dfb4fe9bbaca436942d5eaa75b142ba251d
Author: kkinnunen <>
Date: Tue Jan 27 00:30:18 2015 -0800

Make stencil buffers uncached for uncached render target textures

Make new stencil buffers of uncached render target textures not affect the
cache budgets. This is consistent with render buffer storage of uncached
render target textures.

Affects only newly created stencil buffers. An uncached render target
might still receive a cached stencil buffer if such is available from


Review URL:


Project Member

Comment 6 by, Nov 17 2016

Any more work to do here or can we close this out?  +Chris for perspective on newer skpbench work

Sign in to add a comment