New issue
Advanced search Search tips
Starred by 5 users
Status: Fixed
Owner:
Closed: Oct 11
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug

Blocking:
issue 563816



Sign in to add a comment
OffscreenCanvas using commit() from worker thread flickers when main thread blocked
Project Member Reported by kbr@chromium.org, Sep 6 Back to list
Chrome Version: 63.0.3207.0 (Official Build) canary (64-bit)
OS: macOS 10.12.5

What steps will reproduce the problem?
(1) Go to https://baicoianu.com/~bai/scratch/offscreencanvasworkertest.html
(2) Click "Block main thread for 5 seconds"
(3) Scroll up and down a little

What is the expected result?

Expect the animation to continue running.


What happens instead?

The animation flickers a little. It looks like it's dropping frames and occasionally rendering the canvas as black.

I think this is a known issue but couldn't find it in the issue tracker. This was reported by mr. doob on Twitter. Justin, could you own this, or reassign it?

 
Cc: fsam...@chromium.org
Status: Assigned
There was a flickering issue in the past. It was resolved.  This one has a slightly different use case with the scrolling and blocking.  Thanks for reporting!
Cc: kylec...@chromium.org
This could be related to surface references. +kylechar.
I just reproduced on macOS Canary 63.0.3207.0. I can see the flicker but Compositing.SurfaceAggregator.SurfaceDrawQuad.MissingSurface UMA shows no missing Surfaces when I go to chrome://histograms. I don't think this is related to surface references.
Owner: xlai@chromium.org
Labels: Needs-Bisect
So I tried this demo in both Stable Chrome and Canary in Mac and see only the flickering in Canary. I couldn't see obvious difference in stable Chrome. So I guess this might be a regression.
I didn't update my Chrome so it is the 60.0.3112.113 that's working fine. But 61.0.3163.91, the latest stable Chrome, is showing flickering. Bisecting between this range now.
Cc: kbr@chromium.org
It is very hard to perform a reliable bisect on this, as the flickering is not easy to detect using raw eyes. I would prefer to look into the code to do a static analysis instead.

Based on my repeated attempts, it seems that this flickering only happens on Mac platform; Linux and Win are all doing right. Also, the problem occurs much less frequently in a high-performance Mac machine (2.7 GHz 12-core, 64GB memory, 3072MB graphics card). The problem is easier to detect in a low-performance Mac machine (2.2 GHz i7 4-core, 8GB memory, 1536MB graphics).

The image looks like an old frame getting painted on the screen. I suspect that this is because some compositor frames are out of order in Mac.

Ken@: are you aware of any other similar issues in Mac platform that have frames displaying in missing/out-of-order fashion?
Cc: sunn...@chromium.org
Olivia: yes, actually -- see  Issue 765729 . Basically, running that demo with --disable-features=WebGLImageChromium is causing flickering.

Meant to bisect that yesterday, but was focused on fixing the underlying problem with the canvas rendering transparent. Will talk with sunnyps@ who mentioned there was a specific recent CL that might be at fault.

Cc: susanjuniab@chromium.org
Labels: Needs-Triage-M63 M-63
Able to reproduce the issue on Mac 10.12.6 using chrome stable version 61.0.3163.91 and latest canary 63.0.3218.0 and below is the bisect information.

Bisect Information:
=====================
Good build: 61.0.3161.0	 
Bad Build : 61.0.3162.0	

Change Log URL: 
https://chromium.googlesource.com/chromium/src/+log/55ed8ccd43eadef318b5e7a6bcf2519b078b1213..a61014fad30efbfc90135865dc7d79299ce7e44d

From the above change log suspecting below change
Review URL: https://chromium.googlesource.com/chromium/src/+/a61014fad30efbfc90135865dc7d79299ce7e44d

The above changelog looks to be irrelevant, as this issue was inconsistent when performing bisect.

Keeping the 'Needs-Bisect' label and requesting Dev for further help in triaging  this issue.

Thanks...!!

Labels: -M-63 -Needs-Triage-M63
I found another low-end Mac machine to do a more reliable and effective bisecting, and can confirm that the last good commit is 479169 and first bad commit is 479185. All of these commits have already been rolled into stable Chrome.

Looking at the commits in this range https://chromium.googlesource.com/chromium/src/+log/16289b3ef759d892a4058671e0657c016101be23..b46c5f152d4b888cae95e06d26815325133b268a

I feel that the possible culprit CL is "Enable gpu scheduler on waterfall and developer builds." by sunnyps@ but I'm not sure how it caused so. The other CLs are doing formatting, histogram, extension change etc so they look less likely to be the one.

sunnyps@: Could you take a look at this issue to see why this CL is causing a flickering on WebGL canvas on worker? 
@xlai: You could disable the gpu scheduler on your local build to confirm whether this is related to the problem. If this is the case, it could be an indication that sync tokens are being used improperly in OffscreenCanvas code.
Project Member Comment 12 by bugdroid1@chromium.org, Sep 20
Labels: -Needs-Bisect
Removing the Needs-Bisect label as it is Assigned to dev and fix is already landed for this issue.

Thanks..
Labels: Needs-Bisect
The above CL is not a fix, just an aid in debugging.
Quick observation:

In CanvasLayer2DBridge::ReleaseFrameResources we don't wait (or defer the wait) on the sync token returned by the browser. That seems wrong.

https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/graphics/Canvas2DLayerBridge.cpp?l=1022

Compare this with VideoResourceUpdater::ReturnTexture which either waits for the sync token there or saves the sync token and defers the wait.

https://cs.chromium.org/chromium/src/cc/resources/video_resource_updater.cc?l=599
A better comparison would be with DrawingBuffer, which also saves the returns the sync token and defers the wait to when the buffer would be reused:

https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/graphics/gpu/DrawingBuffer.cpp?l=441

I'm pretty sure this incorrect synchronization is the cause for the bug.
Cc: tkent@chromium.org
Labels: -Needs-Bisect
Re-tested this issue again and able to reproduce the issue on Mac 10.12.6 using chrome stable version 61.0.3163.91 and latest canary 63.0.3220.0 and got the same good and bad builds as mentioned in the comment #9.

Bisect Information:
=====================
Good build: 61.0.3161.0	 
Bad Build : 61.0.3162.0	

Re-bisected using the above good and bad builds and got a different change log this time.

Change Log URL: 
https://chromium.googlesource.com/chromium/src/+log/0a3b0613ab242aad2c002709c4673b204539fdbb..6999b61a3be6b84ae0e5f22679d068e57c751ed2

From the above change log suspecting below change
Review URL: https://chromium.googlesource.com/chromium/src/+/6999b61a3be6b84ae0e5f22679d068e57c751ed2

Adding tkent@ to CC list and removing the 'Needs-Bisect' label..
tkent@ Can you please check if this issue is caused with respect to your change, if not please help us in assigning it to the right owner.


Thanks...!!
susanjuniab@: Please see comment #10. I have already done the basic bisecting few days ago.

junov@: Yes, the animation will be smooth when I pass "--disable-features=GpuScheduler" to Canary.

sunnyps@: I tested on the latest Canary with your new flag passed in. The Offscreen WebGL animation is still flickering when I run Canary with "--disable-gpu-async-worker-context"; so it's not the worker context issue. Need to look into it further. 


Cc: -tkent@chromium.org
xlai@ told me that we use a surface for offscreen canvas and use SetTransferableResourceToStaticImageBitmap: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/graphics/OffscreenCanvasResourceProvider.cpp?l=155

In this case, context_provider_wrapper_ isn't set on the FrameResource, and so the return path (ReclaimResources) will not do a wait sync token.

https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/graphics/OffscreenCanvasResourceProvider.cpp?l=192

(BTW CanvasLayer2DBridge also seems wrong in its return path when used with TextureLayer - see comments #15 and #16)
sunnyps@: The sync token is indeed wrongly implemented and I'll follow up with a CL on that; but this is not the cause of this bug.

The real cause lies in the way FrameResource is released in OffscreenCanvasResourceProvider (https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/graphics/OffscreenCanvasResourceProvider.cpp?l=208). The resource must be explicitly released twice in order to open the lock to really get released, which can then be recycled in CreateOrRecycleFrameResource() when the next image frame needs to use a resource.

The two explicit calls to release the resource come from:
1. When GPU compositor sends back an acknowledgement that it has received the compositor frame (https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/graphics/OffscreenCanvasFrameDispatcherImpl.cpp?l=319)
2. When HTML canvas on main thread finishes storing the previous image on its  placeholder_frame_. (https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/graphics/OffscreenCanvasPlaceholder.cpp?l=79)

In the case when the main thread is blocked, HTML canvas will not release the resource. So for each new frame, the worker has to re-create new resource, thereby causing the flickering.

I commented out all the code that post image from worker to main thread as well as the locks on resource, the animation becomes smooth. junov@: I am thinking of releasing resource locks if the main thread stops or lags for an unexpected long time, so that worker performance would not be compromised; do you think this is an acceptable approach?

One doubt left is that I am not quite sure why this problem is only seen when the new GPU scheduler is enabled.
The GPU scheduler increases command latency in some cases, and that can cause the transfer buffer to be filled up (see issue 742345) and the client to wait for free space. This might be causing the main thread to be blocked as you said.


@#21: Releasing the locks if the main theead falls behind is a bad idea.  The locks are important to prevent race conditions.  A better idea would be to set a maximum number of resources that the main thread is allowed to hold locks on (needs to be at least 2 for things to work). When the threshold is reached, the OffscreenCanvas should stop sending new frames to the main thread. The sending of frames can resume when the main thread is unblocked and starts releasing resources again.

The missing sync_token logic that Sunny pointed out was removed by mistake in this CL: https://chromium-review.googlesource.com/c/chromium/src/+/579690
This is likely to cause flickering and other weird behavior.  I will file a separate bug because this is a regression that we'll need to fix in the 62 branch as well.

Status: Started
Project Member Comment 25 by bugdroid1@chromium.org, Oct 11
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c690099e270d9be683f5fae295ff5bddb4f833a3

commit c690099e270d9be683f5fae295ff5bddb4f833a3
Author: xlai-o <xlai@chromium.org>
Date: Wed Oct 11 15:54:30 2017

Limit placeholder canvas to hold at most 3 pending image frames on main thread


When main thread is blocked and worker thread keeps animating, there will be
many accumulated callbacks that post images from OffscreenCanvas to the
placeholder canvas. This CL stops posting more images when there has been 3
images sent to the placeholder and none has returned. Once the main thread
becomes unblocked, placeholder returns back with a ReclaimResource request,
which will then trigger OffscreenCanvas to post the latest unposted image to
placeholder.

Bug:  762647 
Change-Id: I3145ef424b1d9b1f268e9b4dbe6c36177ecfa7a4
Reviewed-on: https://chromium-review.googlesource.com/690935
Reviewed-by: Justin Novosad <junov@chromium.org>
Commit-Queue: Olivia Lai <xlai@chromium.org>
Cr-Commit-Position: refs/heads/master@{#507991}
[modify] https://crrev.com/c690099e270d9be683f5fae295ff5bddb4f833a3/third_party/WebKit/Source/platform/BUILD.gn
[modify] https://crrev.com/c690099e270d9be683f5fae295ff5bddb4f833a3/third_party/WebKit/Source/platform/graphics/OffscreenCanvasFrameDispatcherImpl.cpp
[modify] https://crrev.com/c690099e270d9be683f5fae295ff5bddb4f833a3/third_party/WebKit/Source/platform/graphics/OffscreenCanvasFrameDispatcherImpl.h
[add] https://crrev.com/c690099e270d9be683f5fae295ff5bddb4f833a3/third_party/WebKit/Source/platform/graphics/OffscreenCanvasFrameDispatcherImplTest.cpp
[modify] https://crrev.com/c690099e270d9be683f5fae295ff5bddb4f833a3/third_party/WebKit/Source/platform/graphics/OffscreenCanvasResourceProvider.cpp
[modify] https://crrev.com/c690099e270d9be683f5fae295ff5bddb4f833a3/third_party/WebKit/Source/platform/graphics/OffscreenCanvasResourceProvider.h

Status: Fixed
Sign in to add a comment