New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Regression in canvas performance between M68 and M70

Reported by dat...@gmail.com, Aug 8

Issue description

UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36

Steps to reproduce the problem:
1. Turn Hardware Acceleration Off in Google Chrome
2. Go to https://dev3.orgchartnow.com
3. login using the following creds: google@google.com
4. Password: tbs2rt
5. A chart will open
6. Click on various tabs at the top (REFRESH tab for example)
7. AW SNAP Message...

What is the expected behavior?
Worked in Chrome 67. FireFox has no issues with same code.

What went wrong?
See the following video for details.

https://dev3.orgchartnow.com/downloads/awsnap.mp4

Crashed report ID: I enabled crash reporting but I don't see any entry.

How much crashed? Just one tab

Is it a problem with a plugin? N/A 

Did this work before? Yes Chrome 67

Chrome version: 68.0.3440.84  Channel: stable
OS Version: 10.0
Flash Version: 

Also crashes in Chrome 69 Beta.
 
Cc: abdulsyed@chromium.org jbanavatu@chromium.org manoranj...@chromium.org
Components: Blink>Canvas
Labels: -Pri-2 hasbisect-per-revision ReleaseBlock-Stable M-68 M-69 Target-69 Target-70 RegressedIn-68 M-70 FoundIn-70 Target-68 FoundIn-68 FoundIn-69 OS-Linux OS-Mac Pri-1
Owner: khushals...@chromium.org
Status: Assigned (was: Unconfirmed)
Tested in chrome reported version #68.0.3440.84,Latest Dev #69.0.3493.23 & Latest canary #70.0.3515.0 on Linux Debian Rodete, Win 10,Mac OS 10.13.6. Able to reproduce the issue.

Below are the Bisect Details:

Using the per-revision bisect providing the bisect results,
Good Build: 68.0.3417.0
Bad Build: 68.0.3418.0

CHANGE-LOG URL:
---------------
https://chromium.googlesource.com/chromium/src/+log/21ac0c57e05f8555d4557575815b49d2b3d85516..49fc8a9c4a9eb46faa2258daea5bdbee48da7dcb

From the CL above, assigning the issue to the concern owner

@ khushalsagar: Could you please look into the issue, pardon me if it has nothing to do with your changes and if possible please assign it to concern owner.Adding RB stable for M68 please remove if not the case.

Reviewed-on: https://chromium-review.googlesource.com/858570 

Thanks!
I'm having trouble repro this bug. Tried with the current chrome canary: 70.0.3516.0 on MacOS 10.13.3. Turning off hardware acceleration from chrome://settings definitely forces canvas2d to use software raster but I don't see a renderer crash (aw snap).

jbanavatu@, since you already have a repro, could you share a crash stack?

datnow@, could you try looking for the crash report id in chrome://crashes/, and posting it here if available?
Okay. Got a crash!

Received signal 6
#0 0x7f71d71f044c base::debug::StackTrace::StackTrace()
#1 0x7f71d71effb1 base::debug::(anonymous namespace)::StackDumpSignalHandler()
#2 0x7f71cdba70c0 <unknown>
#3 0x7f71cba3afcf gsignal
#4 0x7f71cba3c3fa abort
#5 0x7f71d6b56275 sk_abort_no_print()
#6 0x7f71d6e6d9a9 SkSurface_Raster::onCopyOnWrite()
#7 0x7f71d6e6cdfa SkSurface_Base::aboutToDraw()
#8 0x7f71d6d8f35d SkCanvas::onDrawImageRect()
#9 0x7f71d6d8c0bf SkCanvas::drawImageRect()
#10 0x7f71d6da1784 SkColorSpaceXformCanvas::onDrawImageRect()
#11 0x7f71d6d8c0bf SkCanvas::drawImageRect()
#12 0x7f71d65760da cc::DrawImageRectOp::RasterWithFlags()
#13 0x7f71d658c5bd cc::SkiaPaintCanvas::drawImageRect()
#14 0x7f71cf44c700 blink::StaticBitmapImage::DrawHelper()
#15 0x7f71cf44edc4 blink::UnacceleratedStaticBitmapImage::Draw()
#16 0x7f71cec40900 blink::BaseRenderingContext2D::DrawImageInternal()
#17 0x7f71cec3ffeb blink::BaseRenderingContext2D::drawImage()
#18 0x7f71cea2d0d3 blink::V8OffscreenCanvasRenderingContext2D::drawImageMethodCallback()
#19 0x7f71d15d1261 v8::internal::FunctionCallbackArguments::Call()
#20 0x7f71d15d0766 v8::internal::(anonymous namespace)::HandleApiCallHelper<>()
#21 0x7f71d15cfef6 v8::internal::Builtin_Impl_HandleApiCall()
Is there time frame for fixing this bug? I have thousands of customers that are impacted.
I'll put up a patch today, I have narrowed down what it is.
Project Member

Comment 7 by bugdroid1@chromium.org, Aug 10

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/2a72048f91a5bc9d26c81d091b1acff7643883fb

commit 2a72048f91a5bc9d26c81d091b1acff7643883fb
Author: Khushal <khushalsagar@chromium.org>
Date: Fri Aug 10 01:25:35 2018

cc: Fix image locking by 2d canvas for bitmap images.

Canvas2d uses cc's ImageDecodeCache in at-raster mode, which means the
cache internally locks and unlocks images after every draw. If an image
should be locked for the duration of a draw, it needs to be managed
external to the cache. Avoiding the continous ref/un-ref churn was
particularly important with GPU discardable.

For this reason, canvas2d keeps images locked uptill a particular
budget. The tracking for whether the budget has been exceeded is in the
image cache and the information is passed to the CanvasImageProvider in
DecodedDrawImage. This was resulting in the issue where for images which
can directly be used in raster (bitmaps in sw case), the cache would
always indicate them as budgeted so these images were not being released
until the end of a canvas frame, causing OOM in certain cases.

To fix the above, we could indicate these cases as unbudgeted from the
cache but in that canvas the CanvasImageProvider drops all current
decodes. The correct fix here would be to kill this locking of images
in CanvasImageProvider and instead have the cc cache internally lock
images uptill a budget for this use-case. But as a bandaid fix, this
change makes it so we never hit the cache for these cases, and the
CanvasImageProvider never locks these images if the
ScopedDecodedDrawImage doesn't need to be unlocked.

Bug:  872117 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: I48089d6aa3d1523a3c4d45821c7e307c6ee43c8c
Reviewed-on: https://chromium-review.googlesource.com/1169902
Reviewed-by: enne <enne@chromium.org>
Commit-Queue: Khushal <khushalsagar@chromium.org>
Cr-Commit-Position: refs/heads/master@{#581999}
[modify] https://crrev.com/2a72048f91a5bc9d26c81d091b1acff7643883fb/cc/raster/playback_image_provider.cc
[modify] https://crrev.com/2a72048f91a5bc9d26c81d091b1acff7643883fb/cc/raster/playback_image_provider_unittest.cc
[modify] https://crrev.com/2a72048f91a5bc9d26c81d091b1acff7643883fb/cc/test/stub_decode_cache.cc
[modify] https://crrev.com/2a72048f91a5bc9d26c81d091b1acff7643883fb/cc/test/stub_decode_cache.h
[modify] https://crrev.com/2a72048f91a5bc9d26c81d091b1acff7643883fb/cc/tiles/gpu_image_decode_cache.cc
[modify] https://crrev.com/2a72048f91a5bc9d26c81d091b1acff7643883fb/cc/tiles/gpu_image_decode_cache.h
[modify] https://crrev.com/2a72048f91a5bc9d26c81d091b1acff7643883fb/cc/tiles/image_decode_cache.h
[modify] https://crrev.com/2a72048f91a5bc9d26c81d091b1acff7643883fb/cc/tiles/software_image_decode_cache.cc
[modify] https://crrev.com/2a72048f91a5bc9d26c81d091b1acff7643883fb/cc/tiles/software_image_decode_cache.h
[modify] https://crrev.com/2a72048f91a5bc9d26c81d091b1acff7643883fb/cc/tiles/software_image_decode_cache_unittest.cc

How is the change listed at #20 looking in canary? And how safe it will be to merge to M69 this late in cycle? 

Pls note this is M68 regression, so unless fix is super safe I would hesitate to take this merge in for M69. 
Labels: Merge-Request-69
I checked on the latest canary and the issue is fixed. This is a very targeted minor change and includes automated tests, I think its a safe merge. The bug can result in OOM crashes for canvas cases similar to the page above, making them completely unusable so I think it's an important fix to merge as well.
Project Member

Comment 10 by sheriffbot@chromium.org, Aug 11

Labels: -Merge-Request-69 Merge-Review-69 Hotlist-Merge-Review
This bug requires manual review: M69 has already been promoted to the beta branch, so this requires manual review
Please contact the milestone owner if you have questions.
Owners: amineer@(Android), kariahda@(iOS), cindyb@(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
NextAction: 2018-08-13
Pls update bug with canary result on Monday morning to get more canary coverage. Thank you. 
The NextAction date has arrived: 2018-08-13
Cc: rbasuvula@chromium.org
w.r.t comment #11

Tested the issue on Windows-10, Debian Rodete and Mac OS 10.13.6 using chrome latest Canary M70-70.0.3521.0 by following steps mentioned in the original comment. Observed that no crashes are observed and working as expected. 

Please find the screen cast(Mac) for reference.

Thank you!
872117.mp4
1.0 MB View Download
Re #11, I double checked on canary as well and its fixed.
Labels: -Merge-Review-69 Merge-Approved-69
Approving merge to M69 branch 3497 based on comment #9 and #14. Please merge now and mark bug as fixed after the merge if nothing else is pending. Thank you.
I am testing with Canary 70.0.3521.0. The issue is NOT fully resolved. There is a still a major memory leak.

To reproduce:

1. Turn Hardware Acceleration On in Google Chrome
2. Go to https://dev3.orgchartnow.com
3. login using the following creds: google@google.com
4. Password: tbs2rt
5. A chart will open
6. Click on the gear icon while holding down the shift key
7. Select 'Test Performance'
8. Repeat many times. Each time you will notice the the Frame per second goes down and memory to google process goes up.

Please let me know if you want me to make a video.


Labels: -Merge-Approved-69 Merge-Review-69
Moving back to Merge-Review-69 based on comment #16.
Addition to comment #16: If you turn hardware acceleration off the issue seems to be fully addressed both the frame rate and the memory usage are stable across multiple tests.
Labels: -M-68 -Target-68
Removing M68. Let's target M69+
The change in #7 definitely fixes the issue for when hardware acceleration is turned off, requesting a merge for that.

I'm investigating the issue with hardware acceleration in the meanwhile. datnow@, is this a regression you only see in 68, is M67 fine?
Labels: -Merge-Review-69 Merge-Approved-69
Approving merge to M69 branch 3497 based on comment #15 and #20. Please merge now. 
Project Member

Comment 22 by bugdroid1@chromium.org, Aug 13

Labels: -merge-approved-69 merge-merged-3497
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/5cee3747339bc1bfdc080cc7bef8b280eb0c77c6

commit 5cee3747339bc1bfdc080cc7bef8b280eb0c77c6
Author: Khushal <khushalsagar@chromium.org>
Date: Mon Aug 13 19:00:11 2018

cc: Fix image locking by 2d canvas for bitmap images.

Canvas2d uses cc's ImageDecodeCache in at-raster mode, which means the
cache internally locks and unlocks images after every draw. If an image
should be locked for the duration of a draw, it needs to be managed
external to the cache. Avoiding the continous ref/un-ref churn was
particularly important with GPU discardable.

For this reason, canvas2d keeps images locked uptill a particular
budget. The tracking for whether the budget has been exceeded is in the
image cache and the information is passed to the CanvasImageProvider in
DecodedDrawImage. This was resulting in the issue where for images which
can directly be used in raster (bitmaps in sw case), the cache would
always indicate them as budgeted so these images were not being released
until the end of a canvas frame, causing OOM in certain cases.

To fix the above, we could indicate these cases as unbudgeted from the
cache but in that canvas the CanvasImageProvider drops all current
decodes. The correct fix here would be to kill this locking of images
in CanvasImageProvider and instead have the cc cache internally lock
images uptill a budget for this use-case. But as a bandaid fix, this
change makes it so we never hit the cache for these cases, and the
CanvasImageProvider never locks these images if the
ScopedDecodedDrawImage doesn't need to be unlocked.

Bug:  872117 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: I48089d6aa3d1523a3c4d45821c7e307c6ee43c8c
Reviewed-on: https://chromium-review.googlesource.com/1169902
Reviewed-by: enne <enne@chromium.org>
Commit-Queue: Khushal <khushalsagar@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#581999}(cherry picked from commit 2a72048f91a5bc9d26c81d091b1acff7643883fb)
Reviewed-on: https://chromium-review.googlesource.com/1173168
Reviewed-by: Khushal <khushalsagar@chromium.org>
Cr-Commit-Position: refs/branch-heads/3497@{#573}
Cr-Branched-From: 271eaf50594eb818c9295dc78d364aea18c82ea8-refs/heads/master@{#576753}
[modify] https://crrev.com/5cee3747339bc1bfdc080cc7bef8b280eb0c77c6/cc/raster/playback_image_provider.cc
[modify] https://crrev.com/5cee3747339bc1bfdc080cc7bef8b280eb0c77c6/cc/raster/playback_image_provider_unittest.cc
[modify] https://crrev.com/5cee3747339bc1bfdc080cc7bef8b280eb0c77c6/cc/test/stub_decode_cache.cc
[modify] https://crrev.com/5cee3747339bc1bfdc080cc7bef8b280eb0c77c6/cc/test/stub_decode_cache.h
[modify] https://crrev.com/5cee3747339bc1bfdc080cc7bef8b280eb0c77c6/cc/tiles/gpu_image_decode_cache.cc
[modify] https://crrev.com/5cee3747339bc1bfdc080cc7bef8b280eb0c77c6/cc/tiles/gpu_image_decode_cache.h
[modify] https://crrev.com/5cee3747339bc1bfdc080cc7bef8b280eb0c77c6/cc/tiles/image_decode_cache.h
[modify] https://crrev.com/5cee3747339bc1bfdc080cc7bef8b280eb0c77c6/cc/tiles/software_image_decode_cache.cc
[modify] https://crrev.com/5cee3747339bc1bfdc080cc7bef8b280eb0c77c6/cc/tiles/software_image_decode_cache.h
[modify] https://crrev.com/5cee3747339bc1bfdc080cc7bef8b280eb0c77c6/cc/tiles/software_image_decode_cache_unittest.cc

M69 Stable promotion is coming VERY soon. Your bug is labelled as Stable ReleaseBlock, pls make sure to land the fix and request a merge into the release branch ASAP. Thank you.


Re #16, could you share a video of what you're seeing and verify whether this is a M68 regression?

The regression in question resulted from using a different image caching stack for canvas2d. While the software version for it went into M68, using it for the hardware accelerated case went in M66. I've tried disabling this code as well, to draw a comparison with memory usage, and I'm not seeing anything that would point to a memory leak. I'm attaching a couple of chromium traces that I'm using to verify this.
trace_canvas_different_image_cache.json.gz
2.5 MB Download
trace_canvas_tot.json.gz
15.1 MB Download
Labels: -ReleaseBlock-Stable
Removing ReleaseBlock-Stable since the OOM for the non hardware accelerated case is fixed. I'll update this again if the investigation for the hardware accelerated case brings up a different issue.

Comment 26 Deleted

In response to #20, all of machines have auto-upgraded to 68. Is there a way to downgrade to 67?
Re #27, could you share a video with the steps, and how you're measuring the memory use. Maybe the test team could try to bisect this as well, and figure out if this regressed in a particular version.
Steps to reproduce the problem:
1. Turn Hardware Acceleration On/Off in Google Chrome as appropriate
2. Go to https://dev3.orgchartnow.com
3. login using the following creds: google@google.com
4. Password: tbs2rt
5. A chart will open
6. Click on the gear icon while holding down the shift key
7. Select 'Test Performance'
8. Repeat many times. Please see video below for results on 68,69 and 70

https://dev3.orgchartnow.com/downloads/chromecanvas.mp4
Any updates?
Cc: fs...@chromium.org khushals...@chromium.org
Owner: rbasuvula@chromium.org
Thanks for the detailed repro steps! I'm pretty confident that this is unrelated to the original bug by #7 for the software rendering case. Here is a bit more context.

The bug was introduced by a change which locks/caches images used during canvas draw which can result in high peak memory usage. While the change which introduced this caching landed in M66 (https://chromium-review.googlesource.com/884881), the code path was only being used with hardware acceleration where it worked correctly. This was enabled for software rendering in M68 (https://chromium-review.googlesource.com/858570), for which it did not handle a case resulting in a OOM crash we see for M68. The fix has already been merged for M69 and it should be present in the next beta release.

There is definitely a bug with decrease in frame rate from M69, coupled with a memory increase. rbasuvula@, could you run a bisect with the steps outlined in #29?
Cc: altimin@chromium.org ericrk@chromium.org
Labels: ReleaseBlock-Stable
Owner: khushals...@chromium.org
Eric narrowed down that the issue was related to the image locking code above, but that there has been a scheduling change in M69 due to which these images are not being released as frequently. The cleanup work should be triggered from CanvasRenderingContext::DidProcessTask, which is expected to be called after every JS task that mutates the canvas (https://cs.chromium.org/chromium/src/third_party/blink/renderer/core/html/canvas/canvas_rendering_context.cc?sq=package:chromium&g=0&l=132).

+altmin, has there been any scheduling change in M69 that could affect this?

In the meanwhile I'll send a CL for just scheduling a PostTask to do this cleanup, since the current mechanism is pretty brittle.
Tested Version 70.0.3524.0 (Official Build) canary (64-bit) (See comment #29).Hardware Accel on.
1. First run 10.8 FPS
2. Second run 7.3 FPS
3. Third run 4.48 FPS
4. Fourth run 3.56 FPS
5. Fifth run 2.44 FPS

Version 68.0.3440.106 (Official Build) (64-bit). Hardware Accel on.
1. First run 21 FPS
2. Second run 21 FPS
3. Third run 22 FPS
4. Fourth run 21 FPS

FireFox 61.0.2 (64-bit)
1. First run 18 FPS
2. Second run 18.5 FPS
3. Third run 18 FPS





Labels: -hasbisect-per-revision Needs-Bisect
A bisect to narrow the change in M69 that regressed this would still be good.
Labels: -Needs-Bisect hasbisect-per-revision
#W.r.to #31&35
 
Tested in chrome stable #68.0.3440.106 and beta #69.0.3497.42 on Windows 10 and below are the observations.
Case 1: Turn Hardware Acceleration On: All the builds working fine without any crash.
Case 2: Turn Hardware Acceleration Off: M68 #68.0.3440.106 - M70 #70.0.3517.0 - Unable to check the memory usage due to after giving the login credentials in provided URL, crash is happening and observed high peak memory usage in Task manager.

Not able to reproduce the issue in canary #70.0.3525.0 and dev #70.0.3521.2 (Note: Its working fine after comment #22 fix) providing the reverse bisect details.

Below are the Bisect Details:

Bisect Info:
=============
Good Build: 70.0.3518.0 (Revision- 582029)
Bad Build: 70.0.3517.0 (Revision- 581729)

Bisect URL:
=========== 
You are probably looking for a change made after 581998 (known good), but no later than 581999 (first known bad).

CHANGELOG URL:

https://chromium.googlesource.com/chromium/src/+log/30029f7b62764903416a8648bec3a1c6ff509347..2a72048f91a5bc9d26c81d091b1acff7643883fb

Removing Needs-Bisect label as of now.Please add if its required.

Thank You!
Re #32: could you grab before / after trace with toplevel,benchmark,blink,renderer.scheduler,disabled-by-default-renderer.scheduler,sequence_manager,disabled-by-default-sequence_manager categories?
Project Member

Comment 38 by bugdroid1@chromium.org, Aug 17

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a8ae8cd395ac425d4a521cbbec134c95c3e0e2f9

commit a8ae8cd395ac425d4a521cbbec134c95c3e0e2f9
Author: Khushal <khushalsagar@chromium.org>
Date: Fri Aug 17 18:14:24 2018

blink/canvas: Queue cleanup task to purge locked images for canvas2d.

The CanvasImageProvider locks images, limited by a budget, to avoid GPU
discardable ref churn. In the non deferral mode, these images are
released on an external signal indicating the end of a task which
mutated the canvas.

The above mechanism is brittle, and can result in holding this memory
for long durations depending on when the release is triggered. Instead
queue a task each time we start adding images to this list, to ensure
they are eventually cleared.

R=ericrk@chromium.org, fserb@chromium.org

Bug:  872117 
Cq-Include-Trybots: luci.chromium.try:linux_layout_tests_slimming_paint_v2;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: Ie9179e498ef70c78ac091a15aa871b16ae5d80ca
Reviewed-on: https://chromium-review.googlesource.com/1179376
Reviewed-by: Fernando Serboncini <fserb@chromium.org>
Commit-Queue: Khushal <khushalsagar@chromium.org>
Cr-Commit-Position: refs/heads/master@{#584128}
[modify] https://crrev.com/a8ae8cd395ac425d4a521cbbec134c95c3e0e2f9/third_party/blink/renderer/platform/graphics/canvas_2d_layer_bridge_test.cc
[modify] https://crrev.com/a8ae8cd395ac425d4a521cbbec134c95c3e0e2f9/third_party/blink/renderer/platform/graphics/canvas_resource_provider.cc
[modify] https://crrev.com/a8ae8cd395ac425d4a521cbbec134c95c3e0e2f9/third_party/blink/renderer/platform/graphics/canvas_resource_provider.h

I ran a bisect and it came back to this change:
Make OffscreenCanvas 2d contexts use CanvasResource
https://chromium.googlesource.com/chromium/src/+/79531c77e491dc9536c985d6c4f5dcd8c6bb48b8

Firstly, there has not been a change with running DidProcessTask which triggers FinalizeFrame which should schedule cleanup of these locked images, but the logic does not run for OffscreenCanvas. Since the actual CanvasResourceProvider used for rendering exists in OffscreenCanvas, not HTMLCanvasElement. So this memory leak from locked images has always been an issue for OffscreenCanvas and should be fixed by #38. I'll request a merge for it after its been in today's canary.

The reason the patch above made it worse seems to be a change with ID generation for PaintImages which is causing cache misses in the GPU ImageDecodeCache and having it grow continuously. These are bitmaps so software rendering is unaffected, but it causes repeated uploads in the GPU case which will definitely kill performance. I'm doing some more tracing to narrow down what's invalidating these IDs.

Cc: -altimin@chromium.org
https://chromium-review.googlesource.com/c/chromium/src/+/1180423 addresses the issue of multiple uploads.
When will this issue be resolved in 68? This is seriously impacting revenue for my company.
Merge request for the change in #38. I've tested this on 70.0.3528.0 and its working as intended.

Re# 41, sorry but these fixes won't be applied to 68. It's too late in the cycle for new changes to be merged and releasing another update. I'm merging all the fixes to 69, and it is expected to be rolled out from September 4.
These bugs are restricted to offscreen canvas, you can avoid using it in your code as a workaround in the meantime, if possible.
Also adding, the merge is completely safe and the changes includes a test.
Current beta channel has Version 69.0.3497.42 (Official Build) beta (64-bit) posted. When will a fixed version be posted to the beta channel?
The next beta will be rolled out on Wednesday. I'll post an update here when this change is released on the beta channel.
Approving merge for cl listed at #38 to M69 branch 3497 based on comment #42, #43 and per offline chat with  khushalsagar@.
Project Member

Comment 47 by bugdroid1@chromium.org, Aug 20

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4174e43d14347282965dc96315d2e872bd18861f

commit 4174e43d14347282965dc96315d2e872bd18861f
Author: Khushal <khushalsagar@chromium.org>
Date: Mon Aug 20 18:07:16 2018

blink/canvas: Queue cleanup task to purge locked images for canvas2d.

The CanvasImageProvider locks images, limited by a budget, to avoid GPU
discardable ref churn. In the non deferral mode, these images are
released on an external signal indicating the end of a task which
mutated the canvas.

The above mechanism is brittle, and can result in holding this memory
for long durations depending on when the release is triggered. Instead
queue a task each time we start adding images to this list, to ensure
they are eventually cleared.

R=​ericrk@chromium.org, fserb@chromium.org

Bug:  872117 
Cq-Include-Trybots: luci.chromium.try:linux_layout_tests_slimming_paint_v2;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: Ie9179e498ef70c78ac091a15aa871b16ae5d80ca
Reviewed-on: https://chromium-review.googlesource.com/1179376
Reviewed-by: Fernando Serboncini <fserb@chromium.org>
Commit-Queue: Khushal <khushalsagar@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#584128}(cherry picked from commit a8ae8cd395ac425d4a521cbbec134c95c3e0e2f9)
Reviewed-on: https://chromium-review.googlesource.com/1181741
Reviewed-by: Khushal <khushalsagar@chromium.org>
Cr-Commit-Position: refs/branch-heads/3497@{#718}
Cr-Branched-From: 271eaf50594eb818c9295dc78d364aea18c82ea8-refs/heads/master@{#576753}
[modify] https://crrev.com/4174e43d14347282965dc96315d2e872bd18861f/third_party/blink/renderer/platform/graphics/canvas_2d_layer_bridge_test.cc
[modify] https://crrev.com/4174e43d14347282965dc96315d2e872bd18861f/third_party/blink/renderer/platform/graphics/canvas_resource_provider.cc
[modify] https://crrev.com/4174e43d14347282965dc96315d2e872bd18861f/third_party/blink/renderer/platform/graphics/canvas_resource_provider.h

Correction to #45. The tentative date for the next beta is Thursday. I'll make sure to post an update when it is released.
Project Member

Comment 49 by bugdroid1@chromium.org, Aug 21

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f93f8b7c3d3080586deba2a1a7f6b47bcfc4e340

commit f93f8b7c3d3080586deba2a1a7f6b47bcfc4e340
Author: Khushal <khushalsagar@chromium.org>
Date: Tue Aug 21 01:12:26 2018

blink/canvas: Ensure constant PaintImage::ContentId for canvas snapshots.

Its important for Canvas snapshots to have a constant
PaintImage::ContentId to get a cache-hit in cc's ImageDecodeCache. The
logic was missing the case where the CanvasResourceProvider has a valid
ContextProvider but is still using software raster, thus creating
bitmaps which were being allocated new ids.

R=fserb@chromium.org

Bug:  868369 ,  872117 
Cq-Include-Trybots: luci.chromium.try:linux_layout_tests_slimming_paint_v2;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: Iadc76f35483c0aed9c31aba38a7808aee1dc8863
Reviewed-on: https://chromium-review.googlesource.com/1180423
Reviewed-by: Fernando Serboncini <fserb@chromium.org>
Commit-Queue: Khushal <khushalsagar@chromium.org>
Cr-Commit-Position: refs/heads/master@{#584588}
[modify] https://crrev.com/f93f8b7c3d3080586deba2a1a7f6b47bcfc4e340/third_party/blink/renderer/platform/graphics/canvas_resource_provider.cc
[modify] https://crrev.com/f93f8b7c3d3080586deba2a1a7f6b47bcfc4e340/third_party/blink/renderer/platform/graphics/canvas_resource_provider.h

CL listed at 49 didn't make it to canary. Pls merge the change to canary branch #3529. So I trigger Desktop canary from same branch. Thank you.
Project Member

Comment 51 by bugdroid1@chromium.org, Aug 21

Labels: merge-merged-3529
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/bb3941cc8f025f1a63a1134e7e8085e5b41c820b

commit bb3941cc8f025f1a63a1134e7e8085e5b41c820b
Author: Khushal <khushalsagar@chromium.org>
Date: Tue Aug 21 03:32:06 2018

blink/canvas: Ensure constant PaintImage::ContentId for canvas snapshots.

Its important for Canvas snapshots to have a constant
PaintImage::ContentId to get a cache-hit in cc's ImageDecodeCache. The
logic was missing the case where the CanvasResourceProvider has a valid
ContextProvider but is still using software raster, thus creating
bitmaps which were being allocated new ids.

R=​fserb@chromium.org

Bug:  868369 ,  872117 
Cq-Include-Trybots: luci.chromium.try:linux_layout_tests_slimming_paint_v2;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: Iadc76f35483c0aed9c31aba38a7808aee1dc8863
Reviewed-on: https://chromium-review.googlesource.com/1180423
Reviewed-by: Fernando Serboncini <fserb@chromium.org>
Commit-Queue: Khushal <khushalsagar@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#584588}(cherry picked from commit f93f8b7c3d3080586deba2a1a7f6b47bcfc4e340)
Reviewed-on: https://chromium-review.googlesource.com/1182921
Reviewed-by: Krishna Govind <govind@chromium.org>
Cr-Commit-Position: refs/branch-heads/3529@{#2}
Cr-Branched-From: 6e0e7cd9bf10a61f443ba63115e86e9204d061ae-refs/heads/master@{#584585}
[modify] https://crrev.com/bb3941cc8f025f1a63a1134e7e8085e5b41c820b/third_party/blink/renderer/platform/graphics/canvas_resource_provider.cc
[modify] https://crrev.com/bb3941cc8f025f1a63a1134e7e8085e5b41c820b/third_party/blink/renderer/platform/graphics/canvas_resource_provider.h

Merged the change to canary branch 3529 at #51. Triggering new canary now from same branch.

Comment 53 Deleted

NextAction: 2018-08-21
khushalsagar@, pls update bug with canary result tomorrow. Thank you.
Pls check canary data on version #70.0.3529.3.
The NextAction date has arrived: 2018-08-21
Labels: TE-Verified-M70 TE-Verified-70.0.3529.3
Able to reproduce the issue as per comment#16 on reported version hence verifying the fix on latest canary 70.0.3529.3 using Windows 10 , Mac 10.13.6 & Debian.

Now, no major decrease in Frame per second is  and no major increase in memory usage is seen for google process.Attaching screen-cast for reference.

As fix is working as expected adding verified labels.

Thanks!


M70_withfix.webm
14.8 MB View Download
Project Member

Comment 58 by bugdroid1@chromium.org, Aug 21

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9500d6e2d1035ac42928e80f4d6d88cf675efccf

commit 9500d6e2d1035ac42928e80f4d6d88cf675efccf
Author: Khushal <khushalsagar@chromium.org>
Date: Tue Aug 21 15:55:12 2018

blink/canvas: Limit cleanup tasks for locked images.

Don't queue a cleanup task if one is already pending. This avoid
queueing multiple redundant tasks for the case where images are being
purged mid-draw due to exceeding budget constraints.

R=fserb@chromium.org

Cq-Include-Trybots: luci.chromium.try:linux_layout_tests_slimming_paint_v2;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: I17ba7c59ee6031bf2c18f81e3691685a78634f3c
Bug:  876001 ,  872117 
Reviewed-on: https://chromium-review.googlesource.com/1182601
Reviewed-by: Fernando Serboncini <fserb@chromium.org>
Commit-Queue: Khushal <khushalsagar@chromium.org>
Cr-Commit-Position: refs/heads/master@{#584759}
[modify] https://crrev.com/9500d6e2d1035ac42928e80f4d6d88cf675efccf/third_party/blink/renderer/platform/graphics/canvas_resource_provider.cc
[modify] https://crrev.com/9500d6e2d1035ac42928e80f4d6d88cf675efccf/third_party/blink/renderer/platform/graphics/canvas_resource_provider.h

Tested 70.0.3529.3 this morning. Windows 10. Hardware Accel On. Frame rate is about 13 FPS over multiple tests. See Comment #33. FPS is still much slower than Chrome 67.

Re #59, what config are you testing on? And could you share a "rendering" trace when running the test (https://www.chromium.org/developers/how-tos/trace-event-profiling-tool/recording-tracing-runs#TOC-Capture-a-trace-on-Chrome-desktop has instructions on how to capture a trace).

I tested this on 70.0.3529.3 and 68.0.3440.106 and I see the same framerate. Attaching a video with the test.
canvas_hw_accelerated.mov
12.5 MB View Download
^ The video above is too low-res. Uploaded a better version here: https://drive.google.com/file/d/1XPgWHpPbHE8sDcymFG1QA-thu6uYWFOm
M69 Stable promotion is coming VERY soon. Your bug is labelled as Stable ReleaseBlock, pls make sure to land the fix and request a merge into the release branch ASAP. Merge has to happen latest by 4:00 PM PT Friday (08/24/18) in order to make it to next week stable cut. Thank you.
Labels: -OS-Linux -ReleaseBlock-Stable -OS-Mac
Owner: fs...@chromium.org
Summary: Regression in canvas performance between M68 and M70 (was: Chrome 68 Crashes with AWSNAP Message with HTML5 2D Canvas)
I could see a perf regression for this page on Windows between 68.0.3440.106 and 70.0.3529.3 and this doesn't look related to images now. The changes above definitely address the issue of multiple uploads and performing regular cleanup for images it locks. I think the changes that have been merged to 69 here are sufficient for addressing the spike in memory usage for both the software and hardware accelerated modes. The change in #51 for avoiding multiple uploads and in #58 for avoiding redundant cleanup tasks can wait until 70. I'm removing the RBS-Stable label.

I'm attaching the traces from Windows and the only difference I can see is that the renderer main thread is blocked on the GPU thread more often in CommandBufferProxyImpl::WaitForGetOffset. An inadvertent side-effect of avoiding repeated uploads is that the command buffer is flushed less often, since uploading a new image was causing a flush from the gpu cache. Removing that could result in less parallelism between the renderer main and GPU thread. Might also be worth taking a look at https://chromium.googlesource.com/chromium/src/+/79531c77e491dc9536c985d6c4f5dcd8c6bb48b8, in case there is something else here I missed.
fserb@, could you investigate this further from the canvas side.
Actually attaching the windows traces.
trace_canvas_70.json.gz
3.9 MB Download
trace_canvas_68.json.gz
3.8 MB Download
The beta released today (69.0.3497.57) has the fixes merged here.
69.0.3497.57 tested (See Comment #29). Crash and memory issues appear to be resolved.

Using document.createElement("canvas") to create offscreen canvas elements performance is:
* Hardware Acceleration OFF is about 22 FPS. 
* Hardware Acceleration ON is about  13.5 FPS. 

Using 'new OffscreenCanvas' to create offscreen canvas elements performance is:
* Hardware Acceleration OFF is about 20.5 FPS.
* Hardware Acceleration ON is about  11.5 FPS. 

Conclusions:
0. With HW Acceleration Off 69 seems to be close to the performance of 68.
1. There is still a defect with respect to Hardware Acceleration - GPU+CPU should be faster than unassisted CPU. Should I log defect for this?
2. The OffscreenCanvas canvas construct creates a performance penalty. Is this expected behavior?
Additional comment for reference:
* FireFox is 24.5 FPS with HW Accel Off
* FireFox is 18.5 FPS with HW Accel On (about 40% faster)
 
document.createElement("canvas") is used to create offscreen canvas elements
 
Status: Fixed (was: Assigned)
I'm not sure what you mean by "using createElement to create OffscreenCanvas". Could you clarify?

Hardware acceleration shouldn't be slower, but enabling in doesn't imply that all canvas are hardware accelerated.

Please do fill bugs for those other issues. I'll be closing this bug.
Re #68

There definitely seems to be a perf regression from 68 for this page on Win. Is that worth investigation to understand the root cause and how common it is before closing this?
Cc: gov...@chromium.org
Labels: ReleaseBlock-Stable
Status: Assigned (was: Fixed)
Re-opening due to still reproducible regression on M69 branch.
Labels: Merge-Request-69
Making another merge request for M69 for the change in #49. The repeated uploads have severely regressed performance on many canvas2d cases. 
Project Member

Comment 72 by sheriffbot@chromium.org, Sep 8

Labels: -Merge-Request-69 Merge-Review-69
This bug requires manual review: Request affecting a post-stable build
Please contact the milestone owner if you have questions.
Owners: amineer@(Android), kariahda@(iOS), cindyb@(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Cc: pbomm...@chromium.org mlamouri@chromium.org ajha@chromium.org vmi...@chromium.org vamshi.kommuri@chromium.org
 Issue 880620  has been merged into this issue.
Labels: Merge-Request-69
Request to merge the change at Comment 51.

commit bb3941cc8f025f1a63a1134e7e8085e5b41c820b
Author: Khushal <khushalsagar@chromium.org>
Date: Tue Aug 21 03:32:06 2018

blink/canvas: Ensure constant PaintImage::ContentId for canvas snapshots.
Project Member

Comment 75 by sheriffbot@chromium.org, Sep 8

Labels: -Merge-Request-69
This bug requires manual review: Request affecting a post-stable build
Please contact the milestone owner if you have questions.
Owners: amineer@(Android), kariahda@(iOS), cindyb@(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Merge-Review-69 Merge-Approved-69
Approving merge for commit bb3941cc8f025f1a63a1134e7e8085e5b41c820b to M69 branch 3497 based on comment #71, #74 and per offline chat with vmiura@. 

Also as this cl is in M70 & M71, we don't see  Issue 880620   reported on stable there, correct?
khushaksagar@, also requesting a postmortem for this, pls see go/chrome-postmortems for more details. Thank you.
> Also as this cl is in M70 & M71, we don't see   Issue 880620    reported on stable there, correct?

Correct, I tried both with the steps in  Issue 880620  and they do not have the regression we saw on M69 release branch.
Thank you vmiura@ for taking a look, bisecting and testing on canary & dev.
Project Member

Comment 80 by bugdroid1@chromium.org, Sep 8

Labels: -merge-approved-69
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8b8f1dbacda50770d9afa2f72da765cc6d27441a

commit 8b8f1dbacda50770d9afa2f72da765cc6d27441a
Author: Khushal <khushalsagar@chromium.org>
Date: Sat Sep 08 02:08:08 2018

blink/canvas: Ensure constant PaintImage::ContentId for canvas snapshots.

Its important for Canvas snapshots to have a constant
PaintImage::ContentId to get a cache-hit in cc's ImageDecodeCache. The
logic was missing the case where the CanvasResourceProvider has a valid
ContextProvider but is still using software raster, thus creating
bitmaps which were being allocated new ids.

R=​fserb@chromium.org

Bug:  868369 ,  872117 
Cq-Include-Trybots: luci.chromium.try:linux_layout_tests_slimming_paint_v2;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: Iadc76f35483c0aed9c31aba38a7808aee1dc8863
Reviewed-on: https://chromium-review.googlesource.com/1180423
Reviewed-by: Fernando Serboncini <fserb@chromium.org>
Commit-Queue: Khushal <khushalsagar@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#584588}(cherry picked from commit f93f8b7c3d3080586deba2a1a7f6b47bcfc4e340)
Reviewed-on: https://chromium-review.googlesource.com/1214753
Reviewed-by: Khushal <khushalsagar@chromium.org>
Cr-Commit-Position: refs/branch-heads/3497@{#912}
Cr-Branched-From: 271eaf50594eb818c9295dc78d364aea18c82ea8-refs/heads/master@{#576753}
[modify] https://crrev.com/8b8f1dbacda50770d9afa2f72da765cc6d27441a/third_party/blink/renderer/platform/graphics/canvas_resource_provider.cc
[modify] https://crrev.com/8b8f1dbacda50770d9afa2f72da765cc6d27441a/third_party/blink/renderer/platform/graphics/canvas_resource_provider.h

Status: Fixed (was: Assigned)
Closing this now since the fix has been merged to 69.
Labels: TE-Verified-M69 TE-Verified-69.0.3497.91
Tested the issue on Windows-10,Mac 10.13.6 using chrome latest M69 M69-69.0.3497.91 by following steps mentioned in the comment #29. Observed that no crashes and no major memory spikes are observed and working as expected.Hence adding TE-Verified label.

Please find the screen shot for reference.

Thank you!
872117 (1).mp4
12.6 MB View Download
Update to comment #82 Please find the "Hardware Acceleration OFF"  screen cast. 
872117(1).mp4
8.6 MB View Download
Cc: junov@chromium.org lafo...@chromium.org kbr@chromium.org senorblanco@chromium.org aaronhk@chromium.org sunn...@chromium.org davidqu@chromium.org
 Issue 881824  has been merged into this issue.
Labels: TE-Verified-69.0.3497.92
Tested the issue on Windows-10 using chrome latest Stable RC M69 M69-69.0.3497.92 by following steps mentioned in the comment #29. Observed that no crashes and no major memory spikes are observed and working as expected.Hence adding TE-Verified label.

Please find the screen shot for reference.

Thank you!
872117(M69).mp4
12.1 MB View Download
Good news: 69.0.3497.92 is now available with the fix on Chrome Stable. You can force an update by visiting chrome://chrome. Thank you.
69.0.3497.92 has certainly improved my drawImage performance issues (over the previous 69.0.3497.81 version). I have the default GPU settings (Accelerated 2D canvas enabled). 

The operations were lots of draws (of small areas -- say 32x64) from a 2048x512 canvas onto a larger canvas (say 2400 x 1000). Each drawImage would take 10ms or more!


Cc: susan.boorgula@chromium.org
 Issue 882792  has been merged into this issue.
Will the .92 version be rolled out on Android? One of my users can generate a crash using 69.0.3497.91 Chrome on Android 8.
Re #89, could file a bug for it with the repro steps?
Requesting a postmortem for this, pls see go/chrome-postmortems for more details.

Sign in to add a comment