Consider options to early out layer iteration in RasterTilePriorityQueueAll & RasterTilePriorityQueueRequired |
||||
Issue descriptionThe RasterTilePriorityQueues* are used by cc::TileManager to iterate tiles for rasterization in TileManager::PrepareTiles(), TileManager::IsReadyToActivate() and TileManager::IsReadyToDraw(). Together these can make up to 30% of CC Impl thread time at high layer counts. [Overview] These classes iterate all PictureLayerImpls, and build a list of TilingSetRasterQueue* objects corresponding to potential sets of raster tiles to iterate on each layer. In the case of RasterTilePriorityQueueAll, the vector is heap sorted, so that the TilingSetRasterQueues can be iterated in priority order. In the case of RasterTilePriorityQueueRequired, ordering is not needed. In the RasterTilePriorityQueueRequired for Activation case, we collect TilingSetRasterQueues for both Pending and Active trees. I'm not sure why that is, but presume it is to avoid some form of flashing if a content area is rastered on the Pending tree but not on the Active tree. Currently the TilingSetRasterQueue object is built for each layer, and then appended to the list only if it's not empty. If empty, it's discarded. There is some cost to allocating and constructing the TilingSetRasterQueue even when it will be empty. Once the TilingSetRasterQueues are collected, they are iterated to get tiles that require raster in each priority rectangle of the TilingSetRasterQueue. These are in order: 'visible|soon|skewport|eventually' rectangles. Iteration can involve skipping through already rastered tiles. In profiling of TileManager::PrepareTiles(), I am seeing iteration of tiles in the 'eventually' bin take the most time. [Proposal] It would be better if we could skip some of this allocation and iteration. An option is to mark layers layers that have content in their priority rectangles beforehand, such as in PictureLayerTilingSet::UpdateTilePriorities(). By limiting to 'visible|soon|skewport' rectangles, layers that are far out of viewport could be skipped early. However, the 'eventually' rectangles are very large and can cover layers that are within a wide distance of the viewport, so these would need some special handling. The good news is with Ganesh (GPU Rasterization) we don't use the 'eventually' bin. If we passed this knowledge to the RasterTilePriorityQueues, then we can cut a lot of processing from tile iterators. A rough prototype seems to give between 50% and 90% time reduction in tile iteration.
,
Sep 4
,
Sep 4
We could consider some larger changes here as well. I think the tile iterators really exist to handle large numbers of tiles, especially during pinch zoom, especially during software raster. We might want to reconsider our general data structure for prioritizing and iterating tiles, especially given gpu raster's larger tile size.
,
Sep 4
Re #3: Agreed. I tested an iterative (pun intended) early-out approach here to get an idea of what may be on the table for doing a larger change: CL: https://chromium-review.googlesource.com/c/chromium/src/+/1201346/6/cc/tiles/raster_tile_priority_queue_all.cc . Pinpoint results with CL: https://pinpoint-dot-chromeperf.appspot.com/job/12c72073640000 Changes on js_paint_plus_n_layers_99: PrepareTiles 3.879 ms -> 0.561 ms IsReadyToActivate 1.104 ms -> 0.297 IsReadyToDraw 0.497 -> 0.137
,
Sep 18
Re #1: Just wanted to understand more on this. These classes iterate all PictureLayerImpls, and build a list of TilingSetRasterQueue* objects corresponding to potential sets of raster tiles to iterate on each layer. In the case of RasterTilePriorityQueueAll, the vector is heap sorted, so that the TilingSetRasterQueues can be iterated in priority order. In the case of RasterTilePriorityQueueRequired, ordering is not needed. -- Is lazy building of queues expected here? -- Can we keep queues with layers, so that we can iterate queues of layers instead of cumulative big queue? In the RasterTilePriorityQueueRequired for Activation case, we collect TilingSetRasterQueues for both Pending and Active trees. I'm not sure why that is, but presume it is to avoid some form of flashing if a content area is rastered on the Pending tree but not on the Active tree. -- Do we need tiles from pending tree layers while preparing tiles for active tree layers? Currently the TilingSetRasterQueue object is built for each layer, and then appended to the list only if it's not empty. If empty, it's discarded. There is some cost to allocating and constructing the TilingSetRasterQueue even when it will be empty. -- What if we create the queue at the time of layer creation itself and store it in layer. Iterators would work on rects (around, consider, ignore) passed? If this issue is not on anybody's plate, I would be happy to work on it.
,
Sep 18
I checked linux latest dev build with url 55bbs.com by zooming and scrolling content. The LayerTreeHostImpl::BuildRasterQueue() takes "CPU self time" from min ~0.003 ms to max ~0.996 ms with average time of 0.071 ms. Looks like early out in building queue will help reduce iterator building time.
,
Sep 18
Hi Prashant, it will take me a few days to get back to you but I'm happy to provide some guidance on the design here. A rough change to early-out queue building for layers without tile content in priority rectangles is here (along with other optimizations). http://crrev.com/c/1201346 See early-outs in cc/tiles/raster_tile_priority_queue_all.cc and cc/tiles/raster_tile_priority_queue_required.cc. tiling_set->has_priority_rects() is only roughly implemented in this patch. For AppendTilingSetRequiredQueues the solution is simpler, we just have to compute if the Visible, or Soon rectangles overlap a tiling's content area. For CreateTilingSetRasterQueues, if we're using GPU Raster then we need to check if Visible, Soon, or Skewport rectangles overlap tiling's content area.
,
Oct 25
If we optimize PictureLayerTilingSet::UpdateTilePriorities() to update tilings only when priority rects are changed, then we can save some resources. Kindly refer WIP patch http://crrev.com/c/1299093. I tested it manually to www.amazon.com page for scrolling and pinch zoom on Samsung Galaxy S8 device and results are as given below - 1. For scrolling, out of 12060 UpdateTilePriorities(), 3425 updates to tilings can be saved. 2. For pinch zoom, out of 19944 UpdateTilePriorities(), 1461 updates to tilings can be saved. Can we have this into the design being considered here? Also I noticed that not all rects may be changing on every update. (We can check here if we can reuse the built queues if viewport and tiling rects like skew/soon/eventual are not changing). |
||||
►
Sign in to add a comment |
||||
Comment 1 by vmi...@chromium.org
, Sep 4