New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 768799 link

Starred by 4 users

Issue metadata

Status: Fixed
Owner:
Closed: Dec 2017
Cc:
Components:
EstimatedDays: ----
NextAction: 2017-11-13
OS: ----
Pri: 2
Type: Bug



Sign in to add a comment

Slow/thrashing image decodes

Project Member Reported by jakearchibald@chromium.org, Sep 26 2017

Issue description

https://jakearchibald.github.io/getting-touchy-presentation/

* Click the first slide to make it fill the browser window
* Press the right arrow key to advance

This takes Chrome many seconds to change to the next slide, but other browsers manage it almost instantly.

The problem seems to start when the slide fills the browser view. Chrome maxes out the CPU decoding images.

The slide framework stacks all the slides, so it isn't the best approach, but it's interesting how much we suffer here compared to other browsers.

Trace attached.
 
trace_touchy.json.gz
2.9 MB Download

Comment 1 by rbyers@chromium.org, Sep 26 2017

Cc: vmp...@chromium.org enne@chromium.org
Perhaps related to the discussion of async image decodes?  Async is still opt-in in Safari though, right?  Would be good to understand why we're so much worse than other browsers here.

Comment 2 by r...@rbyers.net, Sep 26 2017

Discussion on twitter: https://twitter.com/patrick_h_lauke/status/911962086875521026

Clearly keeping all these big images decoded at once is never going to work well, but what are other browsers doing differently?

Comment 3 by danakj@chromium.org, Oct 27 2017

Owner: khushals...@chromium.org
Status: Assigned (was: Untriaged)
Cc: khushals...@chromium.org
Components: -Internals>GPU>Image Blink>Paint
Labels: Needs-Feedback
Owner: chrishtr@chromium.org
Interestingly this didn't repro with software raster, only gpu. And the reason for this is the difference in tile sizes between the 2 modes. Here is the situation.

All content is on a single layer of size 3226x783, visible bounds 1440x783. In Gpu raster, we get tilings of size 1472x256, since the tile width matches the viewport width in this mode. The images other than the one visible on top start at about [2000, 0] in the layer's DisplayList. The problem happens when we start pre-painting. Since the pre-paint tiles cover a large width, the first SOON bin tiles we schedule have ~230 images. In the general case the images would be decoded before raster, but this case easily blows through the limit for how much we can decode and lock before rasterization, so this tile goes into at-raster mode, which means most images are decoded during rasterization. This is in particular worse with GPU rasterization because this raster task locks the worker context and will block any other raster work from starting. So now decoding of all these images is on the critical path.

I don't think this is something we should be handling in cc, no page should try to display that many images at once and not expect jank. I'm curious about how all these images were added to the DisplayList. Blink paints extra if that content could become visible on a compositor scroll, I didn't see that happening here. Should this have been painted by blink at all? +chrishtr@ for feedback.
And here is a trace with some events that make it easy to see the problem.
trace_touch_presentation.json.gz
3.9 MB Download
All of the images are in opaque slides stacked on top of each other.
My guess is that other browsers can detect the other images are obscured by
slides on top, and Chrome cannot.
This might be related to https://bugs.chromium.org/p/chromium/issues/detail?id=593430, where we no no opaqueness information until the image is rasterized, even if the header tells us the image must be opaque.

The discussion on that bug concluded that it was better to draw content behind the image before we decode the image even if that content will be obscured later. Maybe we need to rethink that.
NextAction: 2017-11-13
Are we sure its related to 593430? IIUC, that bug says that the occlusion optimizations during painting are missed if the image doesn't report that its opaque. I hacked up a patch[1] so that all signals provided externally by BitmapImage report it as opaque and I can still repro it.

[1]: https://chromium-review.googlesource.com/c/chromium/src/+/744327
As I understand it the issue is on the first paint, or close to first. Opaqueness is updated but it comes after we have already cached image meta data, and we never re-ask for new data. We had no mechanism to push the "opaqueness information needs updating" all the way back up into Blink.

It might of course not be related. I was speculating. But to know that one image occludes a while stack of images we would need accurate opacity information. 
I get that we need accurate opacity info, and a notification for when that changes to do the occlusion accurately. The patch in #9 makes it so the image always says its opaque, in which case the paint code should be able to assume that the image on top occludes everything else. The fact that this didn't happen makes me think the bug is where occlusion using this info is happening.
The NextAction date has arrived: 2017-11-13
Tiny addendum (possibly obvious from the discussion so far): using visibility:hidden (rather than display:none) on non-active slides also works to avoid the thrashing in Chrome.
Labels: -Pri-3 Pri-2
Updates:

1. The reason other browsers are able to avoid the jank is because they are
better about detecting off-screen images and avoiding a decode of them in this
case.

2. To make sure it's clear: the page described in this bug has an issue only
in GPU raster path in Chrome, which is because (a) it's only GPU raster that
does not allow more than one raster task at once, and (b) GPU raster tiles
are much bigger than software ones, leading to pre-painting a tile that intersects
with the images even though they are offscreen by 50%. I'm landing this patch:
https://chromium-review.googlesource.com/c/chromium/src/+/806994
which I think is of independent value, but fixing this bug's use-case is one
of them.

3. Another patch (https://chromium-review.googlesource.com/c/chromium/src/+/806855)
to avoid long raster tasks for pre-paint is in development. Without the patch
from #2 above, this patch will will still improve things on the example page,
by avoiding a blocking raster, but will still schedule the image decodes in parallel.
Project Member

Comment 15 by bugdroid1@chromium.org, Dec 5 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a3e4477c29d087d3708dbcb4dc5a90985c2b45e3

commit a3e4477c29d087d3708dbcb4dc5a90985c2b45e3
Author: Chris Harrelson <chrishtr@chromium.org>
Date: Tue Dec 05 01:51:58 2017

[PE] Turn on half-width tiles for all platforms and devices.

Full-width offscreen tiles become increasingly expensive on
larger displays.

Bug:  768799 
Cq-Include-Trybots: master.tryserver.blink:linux_trusty_blink_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel
Change-Id: Ibb2423c6dc956a7ad81a85d12aefd9d58ee28872
Reviewed-on: https://chromium-review.googlesource.com/806994
Reviewed-by: Antoine Labour <piman@chromium.org>
Reviewed-by: Eric Karl <ericrk@chromium.org>
Commit-Queue: Chris Harrelson <chrishtr@chromium.org>
Cr-Commit-Position: refs/heads/master@{#521586}
[modify] https://crrev.com/a3e4477c29d087d3708dbcb4dc5a90985c2b45e3/cc/layers/picture_layer_impl.cc
[modify] https://crrev.com/a3e4477c29d087d3708dbcb4dc5a90985c2b45e3/cc/layers/picture_layer_impl_unittest.cc
[modify] https://crrev.com/a3e4477c29d087d3708dbcb4dc5a90985c2b45e3/cc/trees/layer_tree_host_unittest.cc
[modify] https://crrev.com/a3e4477c29d087d3708dbcb4dc5a90985c2b45e3/cc/trees/layer_tree_settings.h
[modify] https://crrev.com/a3e4477c29d087d3708dbcb4dc5a90985c2b45e3/content/renderer/gpu/render_widget_compositor.cc

Project Member

Comment 16 by bugdroid1@chromium.org, Dec 6 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f77264597cf2b51ede6995221acec18dd79a3601

commit f77264597cf2b51ede6995221acec18dd79a3601
Author: Chris Harrelson <chrishtr@chromium.org>
Date: Wed Dec 06 03:04:18 2017

[PE] Don't allow raster tasks for pre-paint tiles with at-raster images.

In GPU raster mode, only one raster task can be active at a time. This means
that these rasters have to happen quickly, or they will block other work.
For this reason, we should not allow pre-paint GPU raster tasks which might
have very slow performance to execute, because otherwise they will block
future high-priority on-screen raster tasks.

Raster tasks can have at-raster image decoding if the memory for storing
pre-decoded images has been exhausted. This is the case in the website
referenced in the bug. In these cases, rather than allowing the raster
to proceed and potentially jank display due to long image decodes, we
abort the raster, but still proceed with the image decodes (which don't
block other raster, and may be useful to decode in any case).

Bug:  768799 
Cq-Include-Trybots: master.tryserver.blink:linux_trusty_blink_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel
Change-Id: I2ec78361739b7b8f455f26211f21a5f8e995a8b6
Reviewed-on: https://chromium-review.googlesource.com/806855
Commit-Queue: Chris Harrelson <chrishtr@chromium.org>
Reviewed-by: Khushal <khushalsagar@chromium.org>
Reviewed-by: vmpstr <vmpstr@chromium.org>
Cr-Commit-Position: refs/heads/master@{#521975}
[modify] https://crrev.com/f77264597cf2b51ede6995221acec18dd79a3601/cc/tiles/tile.h
[modify] https://crrev.com/f77264597cf2b51ede6995221acec18dd79a3601/cc/tiles/tile_manager.cc
[modify] https://crrev.com/f77264597cf2b51ede6995221acec18dd79a3601/cc/tiles/tile_manager.h
[modify] https://crrev.com/f77264597cf2b51ede6995221acec18dd79a3601/cc/tiles/tile_manager_unittest.cc

Status: Fixed (was: Assigned)

Sign in to add a comment