
Issue 367355

Starred by 25 users

Issue metadata

Status: Fixed
Owner:
Closed: Sep 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 2
Type: Bug

Blocking:
issue 422000




2D Canvas: Panning a large image is Janky

Project Member Reported by junov@chromium.org, Apr 25 2014

Issue description

This issue is a spin-off of issue 315025.

Problem: when a 2D canvas is used as a panning viewport to expose a very large image, we hit jank caused by large texture uploads.

The problem happens when the displayed image is larger than the maximum texture size supported by the GPU, which forces the image to be rendered in tiles. When a new tile is exposed, we get a rendering hiccup due to a miss in Skia's texture cache.
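
To illustrate the tiling described above, here is a minimal Python sketch (not Skia's actual code). The 8192-pixel maximum texture size is an assumed value, and `tile_grid` and `visible_tiles` are hypothetical helper names; the 32000x900 image size is the one used by a test page linked in this thread. Each time the viewport crosses a tile boundary, a tile that was not drawn recently becomes visible, which is where a texture cache miss can occur.

```python
def tile_grid(image_w, image_h, max_texture_size):
    """Split an image into (x, y, w, h) tiles no larger than max_texture_size per side."""
    tiles = []
    for y in range(0, image_h, max_texture_size):
        for x in range(0, image_w, max_texture_size):
            w = min(max_texture_size, image_w - x)
            h = min(max_texture_size, image_h - y)
            tiles.append((x, y, w, h))
    return tiles


def visible_tiles(tiles, view_x, view_y, view_w, view_h):
    """Return the tiles that overlap the viewport rectangle."""
    return [
        (x, y, w, h)
        for (x, y, w, h) in tiles
        if x < view_x + view_w and view_x < x + w
        and y < view_y + view_h and view_y < y + h
    ]


MAX_TEXTURE_SIZE = 8192  # assumed GPU limit, not taken from the thread
tiles = tile_grid(32000, 900, MAX_TEXTURE_SIZE)
print(len(tiles))                                     # 4 tiles side by side
print(len(visible_tiles(tiles, 0, 0, 1000, 900)))     # 1: viewport inside the first tile
print(len(visible_tiles(tiles, 7500, 0, 1000, 900)))  # 2: viewport straddles a tile boundary
```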
 
Cc: vangelis@chromium.org
A panning demo over a 32000x900px image can be found here:
http://packages.gkny.fr/canvas_tiling/index_large.html


I didn't manage to reproduce the problem, except for really small lags. In addition, the lags are even less frequent on an HP Chromebook than on a desktop (Linux) with an NVIDIA GPU.


Attached: a capture showing a big gap in the CrGpuMain thread at 7840ms. It seems to be approximately when I saw a little lag in the animation...
trace_big_gap.json.tar.bz2 (1.7 MB)
Timing the calls to SkGpuDevice::drawTiledBitmap() shows that the average time spent in this method is around 0.1 ms. Occasionally (no more than once every ~10 seconds) I measure about 2 ms, but that is not frequent enough to explain the small lags we're seeing.

Comment 4 by jer...@duckware.com, Jun 16 2014

A test case for this was already published Oct 9, 2013 in issue 305617.

Comment 5 by jer...@duckware.com, Jun 16 2014

See http://www.duckware.com/test/chrome/hugebug.html, which details the cause of the problem (at least on my hardware).

Comment 6 by junov@chromium.org, Jul 11 2014

Owner: piotaixr@chromium.org

Comment 7 by junov@chromium.org, Jan 6 2015

Owner: junov@chromium.org

Comment 8 by junov@chromium.org, Dec 7 2015

Labels: Hotlist-Polish

Comment 9 by jer...@duckware.com, Jul 27 2016

Any update on this issue?

More info... Go to vsynctester.com, click on the gear icon, and under "Background image", check the 'huge' option.  On one test system, the delays are so large, the scrolling effectively stops.  On a second test system, scrolling continues, but with periodic vsync spikes (can not maintain vsync).
Attached are results of testing as per comment 9 for various browsers on a Dell Inspiron 15 3543 (Intel HD Graphics 5500).
iexplorer-huge-good.jpg (74.2 KB)
firefox-huge-good.jpg (84.0 KB)
chrome-huge-bad.jpg (71.8 KB)

Comment 11 by junov@chromium.org, Jul 27 2016

Cc: -vangelis@chromium.org junov@chromium.org
Components: Internals>Skia
Owner: bsalomon@chromium.org
The solution to this probably lies in the skia GPU backend implementation.
@bsalomon: could you take a look?

Comment 12 by junov@chromium.org, Jul 27 2016

Status: WontFix (was: Assigned)
I was able to reproduce in 53.0.2778.2, but the problem appears to be fixed in my developer build (rev 408115).

On the GPU process, the calls to GLES2DecoderImpl::HandlePostSubBufferCHROMIUM are more than 10x faster than before.  Not sure how this got fixed, but it's no longer reproducible in Chrome 54.
The issue is absolutely NOT resolved.  Just re-ran the test again with r408125 and there is no change.  vsync spikes exactly like chrome-huge-bad.jpg in comment #10.  Please re-open this issue.  Are you testing on Windows?  Are you testing with older Intel drivers?

Comment 14 by junov@chromium.org, Jul 27 2016

Status: Assigned (was: WontFix)
Re-opening.
I was on windows with NVIDIA GPU/drivers when I observed the speedup.

Could you record a trace and attach it to this bug?
Instructions: https://www.chromium.org/developers/how-tos/submitting-a-performance-bug
Here are traces from r408125:

The Intel 3000 trace has pauses over 1000 ms (effectively hangs the computer).  See issue 315025.

The Intel 5500 trace captures what is seen in chrome-huge-bad.jpg from comment 10 above.
intel3000trace.json.gz (1.4 MB)
intel5500trace.json.gz (1.3 MB)

Comment 16 by junov@chromium.org, Jul 27 2016

Those traces are dramatically different from what we observe with NVIDIA.
It's really the GPU flushes that are excruciatingly slow with these Intel GPUs, probably because of the dependency on the prior texture upload.

@bsalomon: Really looks like Skia is using a slow path in the Intel driver.
I'm confirming that it is due to cache purging. However, I'm not sure that I have an Intel GPU Windows machine to reproduce the slowness. May have to acquire something.

Comment 18 by junov@chromium.org, Jul 27 2016

Cache purging also shows up on NVIDIA but the texture re-upload only takes about one millisecond, so it does not affect the frame rate.  How come the texture uploads are so slow here on Intel?  In the past, I've seen this kind of issue where the drivers were optimized for specific texture internal/external format combos.  I'm just shooting in the dark here, but it might be worth trying RGBA instead of BGRA on some windows (dx11?) configs.
In the huge mode I see that we are regularly purging 3.6 to 4.2 MB resources. I'm wondering if we have a slow texture upload path in the command buffer/ANGLE.

This could be related: https://software.intel.com/en-us/forums/developing-games-and-graphics-on-intel/topic/537884

jerryj@, I'm curious whether you see the same issue if you run with --d3d11?

I'm going to look into getting access to a system that can reproduce the slowness. Unfortunately my Windows system is a Xeon with no integrated GPU.
On the Intel 3000 system, I can't force d3d11 -- it always wants to run in disable_d3d11 mode.  Can I force d3d11 somehow?

On the Intel 5500, chrome runs in d3d11 mode.  When I use --disable_d3d11, it runs in d3d9ex, with the same vsync spikes (same chrome-huge-bad.jpg result seen in comment 10).
OK, on the Intel 3000 system, ran with --disable-gpu-driver-bug-workarounds, and d3d11 was used.  It certainly runs a LOT better (occasional spikes instead of almost every frame a spike), but still has spikes (much more like the intel 5500 system).  See attached.
trace.json.gz (2.6 MB)
intel3000-d9-in-d11-mode.jpg (145 KB)
Even the Intel 5500 (a newish notebook computer) with Direct3D11 can not handle this test:

  - Visit http://www.duckware.com/test/chrome/hugebug.html
  - click on the "D810ex" link
  - mouse wheel over the canvas (to scroll the image)

Scrolling should take no time.  Why is Chrome so incredibly slow, and can Chrome be fixed?

And please don't blame drivers or anything else, because both Firefox and IE on the same hardware work just fine and are incredibly fast (and are even very fast on the Intel 3000 hardware, where Chrome seems even worse).

No one is blaming drivers. We're postulating that we may be calling drivers in a way that may put us on a slow path.
I think Chrome is thrashing the GPU memory.
Verified.  Chrome is thrashing GPU memory, and evicting cache (new issue 631166) too early.  When that is combined with 'slow texture upload', that is killer.

After writing a GPU memory thrasher and speed testing all browsers, it is clear that any 'slow texture upload' is affecting all browsers equally.

What remains is that (1) Chrome can not handle large textures (the well known and old 96MB thrashing issue) and (2) Chrome now appears to be evicting cache too early (issue 631166).

I have verified the issue on yet another Intel HD 4400 system.

Chrome must: (1) address the 96MB issue and (2) be much smarter about cache eviction.  Fixing these two items will fix this issue.

ps: it would help the community a lot to test more using Intel HD GPU systems.

The slow path is Chrome.  As stated by ericrk@chromium.org, "[The image] exceeds Ganesh's cache limits, which means that for each tile we render we will re-upload the entire image - not only that, but this upload appears to exceed the transfer buffer size used by TexImage2D (16mb), so we'll chunk the upload over the command buffer, which is fairly inefficient."
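
A rough back-of-the-envelope check of the quoted explanation, as a Python sketch. The 96 MB cache limit and the 16 MB transfer buffer are the figures given in this thread. The 4 bytes per pixel, the 44 MP image size (the large vsynctester.com image discussed in later comments), and reading MB as MiB are assumptions, and `upload_cost` is a hypothetical helper.

```python
import math

BYTES_PER_PIXEL = 4  # assumption: 8-bit RGBA or BGRA
MIB = 1024 * 1024


def upload_cost(megapixels, cache_budget_mib=96, transfer_buffer_mib=16):
    """Return (image bytes, fits in cache?, number of transfer-buffer chunks)."""
    image_bytes = int(megapixels * 1_000_000) * BYTES_PER_PIXEL
    fits_in_cache = image_bytes <= cache_budget_mib * MIB
    chunks = math.ceil(image_bytes / (transfer_buffer_mib * MIB))
    return image_bytes, fits_in_cache, chunks


for mp in (5, 44):
    size, fits, chunks = upload_cost(mp)
    print(f"{mp} MP: {size / MIB:.1f} MiB, fits in cache: {fits}, upload chunks: {chunks}")
```

Under these assumptions the 44 MP image cannot stay resident in a 96 MB cache, so every re-upload pays for many chunked transfers, while a 5 MP image fits comfortably.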

Blocking: 422000

Comment 28 by junov@chromium.org, Aug 11 2016

Two things to do  here:
1) There is a massive refactoring effort underway to centralize GPU resource management in the GPU process, which should allow us to avoid setting arbitrary cache size limits all over the place, and in particular in the Render process.
2) When a texture upload is necessary, find workarounds to avoid driver slow paths. (Every driver is different, they each have a preferred texture format)
Cc: geoffl...@chromium.org
I am now actively looking at improving the cache replacement policy. It will help in cases where the total per frame usage exceeds the budget by a small amount but not much when it is vastly exceeded.

The work mentioned in #28 to centralize management is pretty far off I believe. I'm assuming that this means the MUS work, but correct me if not. I think we should consider dynamically setting the cache budget for c2d based on total GPU memory. Or the browser could make explicitly texture backed SkImages that wouldn't be subject to cache purging.

Also, on Windows we are using BGRA not RGBA which I believe is slowing down uploads. This decision was made long ago for D3D9 performance and GDI interop. Now that most systems are using ANGLE's D3D11 backend, uploads would be faster if we used RGBA. I'm not sure if GDI interop is at all relevant these days for native control UI rendering. +Geoff Lang in case I didn't describe this situation correctly.
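
As an illustration of why channel order can matter, here is a minimal Python sketch of a BGRA-to-RGBA swizzle. It only shows the kind of extra per-pixel work that would be needed if some layer had to convert between the two orders during upload; whether and where Chrome, ANGLE, or a driver actually does this is not established in this thread. `bgra_to_rgba` is a hypothetical helper.

```python
def bgra_to_rgba(pixels: bytes) -> bytes:
    """Swap the blue and red channels of tightly packed 8-bit BGRA pixels."""
    if len(pixels) % 4 != 0:
        raise ValueError("expected 4 bytes per pixel")
    out = bytearray(pixels)
    out[0::4] = pixels[2::4]  # red moves into byte 0
    out[2::4] = pixels[0::4]  # blue moves into byte 2
    return bytes(out)


one_blue_pixel_bgra = bytes([255, 0, 0, 255])   # B=255, G=0, R=0, A=255
print(list(bgra_to_rgba(one_blue_pixel_bgra)))  # [0, 0, 255, 255]
```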
> I am now actively looking at improving the cache replacement policy.

No cache replacement policy will fix the issue that Chrome uses a very small hard coded GPU cache limit.  Why spend any time and effort on an algorithm that ultimately will *not* fix the problem -- at all?

The only fix is to change Chrome's GPU memory limit: Set a GPU cache limit based upon the maximum of (1) some hard coded number and (2) some percentage of GPU memory (system memory for integrated intel gpu).
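
The proposal above can be sketched in a few lines of Python. The 96 MB floor and the 10,240 MB system are figures from this thread; the 25% fraction and the name `adaptive_cache_limit_mb` are purely hypothetical, chosen only to show the shape of the calculation.

```python
def adaptive_cache_limit_mb(total_gpu_memory_mb, floor_mb=96, fraction=0.25):
    """Cache limit = max(hard coded floor, fraction of total GPU memory)."""
    return max(floor_mb, int(total_gpu_memory_mb * fraction))


print(adaptive_cache_limit_mb(10240))  # large system: the percentage term wins
print(adaptive_cache_limit_mb(256))    # small system: the hard coded floor wins
print(f"{96 / 10240:.2%}")             # share of a 10,240 MB system that 96 MB represents
```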

Every other issue (cache replacement policy, BGRA/RGBA, etc) pales in comparison to the problem of Chrome maxing out using 96MB of GPU memory on a system with 10,240MB of GPU memory.

Am I wrong?
As long as there is some cache limit it is desirable to have a replacement strategy that doesn't fail catastrophically when it is just barely exceeded. So it is worth some amount of effort. This will benefit all clients of Skia not just Chromium.

However, I agree that the hard coded limit is a problem that should be addressed. If the cache size were larger the vsynctester page would work fine and the other problems I mentioned wouldn't be a problem for this particular page. junov@ is the lead for canvas2D in the browser and can comment better than I on the feasibility of doing this.
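
To show what "fail catastrophically when it is just barely exceeded" can look like, here is a small Python simulation. It assumes a least-recently-used (LRU) cache and a workload that touches the same tiles in the same order every frame; the thread does not say which policy Skia uses or switches to, so the most-recently-used alternative is only an illustrative contrast, and `hit_rate` is a hypothetical helper.

```python
from collections import OrderedDict


def hit_rate(capacity, num_tiles, passes, evict_most_recent=False):
    """Simulate cyclic access to num_tiles tiles through a fixed-size cache."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = total = 0
    for _ in range(passes):
        for tile in range(num_tiles):
            total += 1
            if tile in cache:
                hits += 1
                cache.move_to_end(tile)  # mark as most recently used
                continue
            if len(cache) >= capacity:
                # last=False evicts the least recently used entry (LRU),
                # last=True evicts the most recently used entry instead.
                cache.popitem(last=evict_most_recent)
            cache[tile] = True
    return hits / total


# Working set of 5 tiles, cache holds 4: the budget is exceeded by one tile.
print(hit_rate(capacity=4, num_tiles=5, passes=10))                          # LRU
print(hit_rate(capacity=4, num_tiles=5, passes=10, evict_most_recent=True))  # alternative
```

With LRU, each tile is evicted just before it is needed again, so the cache never hits even though only one tile does not fit. The alternative policy keeps most of the working set resident in this access pattern.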
> As long as there is some cache limit it is desirable to have
> a replacement strategy that doesn't fail catastrophically when
> it is just barely exceeded.

Exactly my point.  Attempting to find/create a replacement strategy when 96MB is barely exceeded -- on a desktop system with 10240MB of GPU memory -- is an academic exercise in banging your head up against a wall.  The best 'replacement strategy' if the 96MB hard coded limit is "just barely exceeded" is to set a new GPU memory limit to the new "barely exceeded" amount.  Especially when 96MB represents less than 1% of GPU memory.

The hard coded 96MB has not kept up with the times.  What was the typical GPU memory amount when that value was created?  What percentage of GPU memory is that?  Or, asked another way, what maximum percentage of GPU memory was 96MB supposed to represent?

Hard coded values are a relic of the past.  Adaptive values must now be used.  Chrome has not even kept up with Safari and what GPU memory it can use.

> So it is worth some amount of effort.

Not now.  Maybe AFTER the "FIXED" GPU memory limit has been changed to be adaptive.  Because using an adaptive GPU limit will (dramatically) increase the GPU limit on all modern systems, totally eliminating the need for any cache replacement algorithm in the first place on the vast majority of systems.

junov@: On the surface it appears that a 'maximum of a hard coded value, and a percentage of GPU memory' would be crazy easy (and fast) to implement.  Is it?  If so, can we get an adaptive GPU memory limit in Chrome ASAP?

Cc: bsalomon@chromium.org
Owner: junov@chromium.org
I am in the process of putting up a set of changes that will cause Skia to switch replacement policies when it detects that we're consistently over budget. It helps quite a bit in cases when we are a little over budget. However, in cases like the vsynctester page where the budget is exceeded by a lot it makes little difference. Reassigning this to junov@ for possible improvements on the Chrome side (e.g. dynamic budget or pinning SkImages as textures).
On a brand new intel i7-7600 laptop, this issue is making chrome pretty much unusable for my purposes.  Presumably because having a 4K display results in the GPU cache getting thrashed harder.

Edge is as smooth as butter, chrome is a slideshow. 
Project Member

Comment 36 by bugdroid1@chromium.org, Sep 28 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/03b1013d9a855aba76fa626a640b08b95146154f

commit 03b1013d9a855aba76fa626a640b08b95146154f
Author: junov <junov@chromium.org>
Date: Wed Sep 28 21:52:55 2016

Cancel GPU acceleration for 2D canvas when drawing very large images

GPU texture upload overhead is often prohibitively expensive when
drawing very large images into a canvas. This change adds an
heuristic to trigger a fallback to software rendering when this
happens

TBR=kbr@chromium.org
BUG= 367355 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2362363002
Cr-Commit-Position: refs/heads/master@{#421651}

[modify] https://crrev.com/03b1013d9a855aba76fa626a640b08b95146154f/third_party/WebKit/Source/core/frame/ImageBitmap.cpp
[modify] https://crrev.com/03b1013d9a855aba76fa626a640b08b95146154f/third_party/WebKit/Source/core/frame/ImageBitmap.h
[modify] https://crrev.com/03b1013d9a855aba76fa626a640b08b95146154f/third_party/WebKit/Source/core/html/HTMLCanvasElement.cpp
[modify] https://crrev.com/03b1013d9a855aba76fa626a640b08b95146154f/third_party/WebKit/Source/core/html/HTMLCanvasElement.h
[modify] https://crrev.com/03b1013d9a855aba76fa626a640b08b95146154f/third_party/WebKit/Source/core/html/HTMLImageElement.h
[modify] https://crrev.com/03b1013d9a855aba76fa626a640b08b95146154f/third_party/WebKit/Source/core/html/HTMLVideoElement.h
[modify] https://crrev.com/03b1013d9a855aba76fa626a640b08b95146154f/third_party/WebKit/Source/core/html/canvas/CanvasImageSource.h
[modify] https://crrev.com/03b1013d9a855aba76fa626a640b08b95146154f/third_party/WebKit/Source/core/offscreencanvas/OffscreenCanvas.cpp
[modify] https://crrev.com/03b1013d9a855aba76fa626a640b08b95146154f/third_party/WebKit/Source/core/offscreencanvas/OffscreenCanvas.h
[modify] https://crrev.com/03b1013d9a855aba76fa626a640b08b95146154f/third_party/WebKit/Source/modules/canvas2d/BaseRenderingContext2D.cpp
[modify] https://crrev.com/03b1013d9a855aba76fa626a640b08b95146154f/third_party/WebKit/Source/modules/canvas2d/CanvasRenderingContext2DTest.cpp
[modify] https://crrev.com/03b1013d9a855aba76fa626a640b08b95146154f/third_party/WebKit/Source/modules/canvas2d/CanvasRenderingContext2DUsageTrackingTest.cpp
[modify] https://crrev.com/03b1013d9a855aba76fa626a640b08b95146154f/third_party/WebKit/Source/modules/webgl/WebGLRenderingContextBase.cpp
[modify] https://crrev.com/03b1013d9a855aba76fa626a640b08b95146154f/third_party/WebKit/Source/platform/graphics/ExpensiveCanvasHeuristicParameters.h
[modify] https://crrev.com/03b1013d9a855aba76fa626a640b08b95146154f/third_party/WebKit/Source/platform/graphics/ImageBuffer.cpp

The patch makes things MUCH worse (things that used to work ARE NOW BROKEN) -- please revert.  Manipulation of small 5MP images is broken:

  1. vsynctester.com
  2. gear icon
  3. background image MP to 5

result: busted

replicated on several laptops.

Comment 38 by junov@chromium.org, Sep 29 2016

A 5MP image should not trigger the fix. Nothing under 16MP should be affected. 
I am taking a look now...
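
A minimal Python sketch of the kind of size-based fallback heuristic the commit describes, assuming a simple pixel-count threshold. The 16 MP figure comes from the comment above; the exact value, the constant name, and the function `should_fall_back_to_software` are hypothetical and are not taken from ExpensiveCanvasHeuristicParameters.h.

```python
# Hypothetical threshold: "Nothing under 16MP should be affected."
LARGE_IMAGE_PIXEL_THRESHOLD = 16_000_000


def should_fall_back_to_software(image_w, image_h, canvas_is_accelerated,
                                 threshold=LARGE_IMAGE_PIXEL_THRESHOLD):
    """Return True when drawing this image should disable GPU acceleration."""
    return canvas_is_accelerated and image_w * image_h > threshold


print(should_fall_back_to_software(2560, 1920, True))   # roughly 5 MP: stays on the GPU
print(should_fall_back_to_software(32000, 900, True))   # 28.8 MP: falls back to software
print(should_fall_back_to_software(32000, 900, False))  # already software: nothing to change
```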

Comment 39 by junov@chromium.org, Sep 29 2016

Tested vsynctester.com on Chrome 55.0.2875.0 (Canary) Windows 64-bit.
It works well as far as I can tell.  I get a steady 60fps for both cases, 44MP and 5MP images.

Can you give us more information about what you are observing?
What do you mean by "busted"? Is the page not rendering correctly?
Could you attach the chrome://gpu page from a system where you are observing the failure?
junov, why are we taking the draconian path of disabling the GPU when large images are being used -- rather than increasing the maximum amount of GPU memory that Chrome can use?

-----

I am seeing the attached on multiple notebook systems.  Of course, 5MP is nothing and should work great, but...

Review the attached r421646 for a 5MP running on vsynctester.com -- it works.

Review the attached r421661 for a 5MP running on vsynctester.com -- it fails, horribly.

The only change: r421646 vs r421661.

Unless there is another change in the range that screams that it is causing this -- just revert the change until you understand what is going on.

This is getting really annoying and very frustrating -- that every other web browser CAN handle large images with the GPU, but Chrome can not -- and now you appear to be on a path to 'fix' that by not using the GPU at all.  Bad.  Very bad.
r421646.jpg (166 KB)
r421661.jpg (158 KB)
gpu.txt (6.3 KB)
And regarding: "GPU texture upload overhead is often prohibitively expensive when drawing very large images into a canvas".

That is not the whole truth!

The whole truth is that Chrome *itself* is thrashing GPU memory (Chrome is shooting itself in the foot), because Chrome refuses to use more than 96MB of GPU memory, even on GPUs with 10GB.

Chrome's GPU thrashing can be eliminated by allowing Chrome to use more GPU memory.

The 5MP problem has now replicated on yet another notebook computer.  Revert the patch, and increase allowed GPU memory.

This is what I see on several notebook systems: http://www.duckware.com/test/chrome/20160929_121257.mp4

(starts out at 44MP, then switch to 5MP and failure)

Comment 43 by junov@chromium.org, Sep 29 2016

Wow, that is really bad. I borrowed a Windows laptop with an Intel GPU from a co-worker and was able to reproduce the issue. I see what is happening. I'll have this fixed in a few days.  I am going to spin off a separate bug, which I will mark as a release blocker to make sure we do not ship a stable release in this state.

FWIW, the cache limit will be lifted as soon as we have centralized GPU resource management. Because of Chrome's multi-process architecture, coordinating GPU resource allocations is a lot harder for us than for other browsers that do not use process isolation (for security). That's why we have low arbitrary limits in several places.  We are still several months away from having that more general solution in place.

Comment 44 by junov@chromium.org, Sep 29 2016

Status: Fixed (was: Assigned)
Closing this issue.
Follow-up in issue 651517.
What issue is tracking "centralized GPU resource management"?  If none, can one be created?
