New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 701925 link

Starred by 6 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug



Sign in to add a comment

canvas toBlob should use asynchronous GPU readbacks

Project Member Reported by junov@chromium.org, Mar 15 2017

Issue description

HTMLCanvasElement.toBlob and OffscreenCanvas.convertToBlob currently performa synchronous readPixels operation when the underlying pixel buffer is on the GPU.  This read back should be made asynchronous, especially considering that toBlob is an async API.


 

Comment 1 by xlai@chromium.org, Mar 15 2017

Does the problem come from the call of HTMLCanvasElement::toImageData() for both toDataURL and toBlob?

Comment 2 by junov@chromium.org, Sep 27 2017

@!: yes, but it will have to be refactored because that path is shared with synchronous APIs (toDataURL + getImageData) which still need to be blocking.  What yoiu want to do is get rid of the underlying call to SkImage::readPixels (for the GPU-accelerated case), an a async GPU readback instead.

Comment 3 by xlai@chromium.org, Oct 16 2017

junov@: emircan's CL (https://chromium-review.googlesource.com/c/chromium/src/+/691041) on improving readPixels is simply switching to peekPixels() whenever the pixels are retained in memory. It's not really changing it to async.

Looks like we need Skia support in async readback?

Comment 4 by xlai@chromium.org, Dec 19 2017

Cc: junov@chromium.org xlai@chromium.org zakerinasab@chromium.org noel@chromium.org
 Issue 771209  has been merged into this issue.

Comment 5 by xlai@chromium.org, Jan 9 2018

Labels: -Pri-3 Pri-2
Status: Fixed (was: Assigned)
While doing performance analysis, I have some exciting discoveries about canvas.toBlob.

Previously,  crbug.com/771209  spot a regression in frame_times in the smoothness test of canvas.toBlob affected by noel@'s CL r505447 that uses a faster encoder:

  Configuration: win_8_perf_bisect
  Benchmark    : smoothness.tough_canvas_cases
  Metric       : frame_times/tough_canvas_cases_canvas_toBlob.html
  Change       : 2.06% | 21.4123169791 -> 21.8523687874

But when I took a look at the same perf graph now, I see a big improvement between r521699 and r521790:
https://chromeperf.appspot.com/report?sid=ad23ce6a8561133c328a6bda767fc746b75a86ba9c1b31530bf853ae897c6de4&start_rev=504319&end_rev=523782
(attached perf graph snapshot). Below is the statistics of improvement:

  Median of segment before: 22.148706666666804
  Median of segment after: 18.17900790755844
  Relative change: 17.9%
  Test: ChromiumPerf/chromium-rel-win8-
        dual/smoothness.tough_canvas_cases/frame_times/tough_canvas_cases_canvas_toBlob.html

Inside this range it is Reza's r521715 ("Refactor CanvasAsyncBlobCreator to use StaticBitmapImage", https://chromium-review.googlesource.com/c/chromium/src/+/780279). Although I have not
done a more granular bisecting on this range, I am quite sure that this CL is the cause of improvement. 

In  crbug.com/771209 , we had discussion about the reason of regression and know that a canvas.toBlob operation mainly consists
of two steps--(1) read image data (2) encoding. The step (2) is carried out in idle periods but the step (1) was a synchronous GPU readback, which occupies main-thread resources. In the performance test, the faster the image encoder, the more canvas.toBlob calls would occur in a fixed time frame, resulting in a decreasing frame time. So the only way to reverse this
regression is to improve the step (1) by avoiding the underlying call to SkImage::readPixels.

r521715 effectively does that. It calls HTMLCanvasElement::ToStaticBitmapImage, which gets the SkImage from SkSurface::makeImageSnapshot. Then inside
CanvasAsyncBlobCreator, it calls SkImage::peekPixels to directly access the pixels if they are still retained in the memory. The switch from SkImage::readPixels to SkImage::peekPixels changes the "copy pixels" behavior to "directly access in-memory pixels" and is thus a huge improvement.


Screenshot from 2018-01-09 11:50:50.png
171 KB View Download

Comment 6 by junov@chromium.org, Jan 9 2018

Status: Assigned (was: Fixed)
That is great! But this bug still stands though.
Performance may have been improved, but the implementation is still performing synchronous ReadPixels on the main thread's GL context.
The problem is in CanvasAsyncBlobCreator::LoadStaticBitmapImage(). It calls image_->MakeUnaccelerated().

So there is still an opportunity to greatly improve performance here.

Comment 7 by xlai@chromium.org, May 4 2018

Cc: -zakerinasab@chromium.org
Owner: zakerinasab@chromium.org
Assigning this to Reza, who have been contributed to the performance improvement of toBlob() before.
Cc: -junov@chromium.org
Owner: ----
Status: Available (was: Assigned)

Sign in to add a comment