New issue
Advanced search Search tips

Issue 828425 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: May 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug

Blocking:
issue 430155


Participants' hotlists:
Hostlist-AnimationWorklet-OT


Sign in to add a comment

Animation Worklet - reloading page causes freeze when devtools is open

Project Member Reported by majidvp@chromium.org, Apr 3 2018

Issue description

Try reloading this page while devtools is open: 
https://aw-bug-hunt.glitch.me/freeze.html

The page is going to freeze.





 
surma@:  FYI, I can only reproduce this when devtools is open.
This could be related to issue 662812. 

There was also a similar issue ( issue 774753 ) occurring with AudioWorklet which is another threaded worklet. That has been resolved.
This reproduces 100% on 65, but on 66.0.3359.81 I have to refresh the page a few times before the deadlock occurs. It felt like refreshing the page whilst it was still loading or soon after it loaded was more likely to reproduce the issue.

On 67.0.3393.4 the repro page logs 'The target element is not composited'. I'll make a local version and try again.

You can tell when its reproducing as your devtools element window will go blank.

So with the attached repro, I believe this this no longer reproduces on canary. I'll attempt a bisect to see if I can figure out when it got fixed.

freeze.html
1.2 KB View Download
Hrm. I did a bisect, but it just pointed to a CL by Jeremy, and a V8 roll. 

You are probably looking for a change made after 542482 (known bad), but no later than 542484 (first known good).
CHANGELOG URL:
  https://chromium.googlesource.com/chromium/src/+log/37496521d8c159eabafda209e80cb53307b65ddb..ade1dc7a1a967b82df4f1f7af2bec379efa26ce1

ade1dc7 Throw a DataCloneError if a neutered ArrayBuffer occurs in the transfer list. by Jeremy Roman · 4 weeks ago
264efd1 Update V8 to version 6.7.34. by v8-autoroll · 4 weeks ago

The v8 list is:

7e11bad Version 6.7.34 by v8-autoroll · 4 weeks ago 6.7.34 6.7.34
53c8152 Move function definition into right #ifdef range by Sigurd Schneider · 4 weeks ago
14a41cb Skip failing debug test on x64 msvc. by Yang Guo · 4 weeks ago
d81b7aa [debug] materialize arguments and receiver for break-at-entry condition. by Yang Guo · 4 weeks ago
e58bc01 Reland "Skip Execution::Call in CompileFunctionInContext." by Yang Guo · 4 weeks ago
923a7ad Update V8 DEPS. by v8-autoroll · 4 weeks ago

I'm not 100% confident in the bisect since the issue is hard to reproduce in later Chromium versions (lots of refreshing of the page, waiting, not waiting, etc). But it certainly could be that the hang was somewhere in V8 and was fixed?
Cc: -surma@chromium.org smcgruer@chromium.org majidvp@chromium.org
Owner: surma@chromium.org
Surma - can you update the original demo so that it works on Canary (you need to give the boxes width, I think, so we will composite them. This may be it's own Chrome bug for us as well later), and try to reproduce with later than 67.0.3393.4?
Labels: -Type-Feature Hotlist-Experimental Type-Bug

Comment 8 by surma@chromium.org, Apr 12 2018

Owner: smcgruer@chromium.org
Updated the demo with the sizes for the boxes.

Testing with 67.0.3395.0, it has gotten better, but after 3-4 reloads the loading spinner seems to stick around infinitely and the tabs takes >5 seconds to close. DevTools stays responsive throughout this.
Owner: majidvp@chromium.org
Status: Started (was: Assigned)
So I managed to reproduce this rather consistently in --single-process-mode and after working on it a bit managed to get to occur in a debugger without having it crash.
With single process mode the devtools is not responsive either. The whole thing freezes. In mutil-process model this still happens (more rare) but devtools remains responsive which is perhaps because it is running in a separate process.


I now have a theory why this is happening: it is a deadlock!

# Background 

There is a golden rule to avoid deadlock between compositor thread and main thread: never blocks compositor on main thread.

# Possible deadlock

I think we have now introduced a path where this rule is broken. This seems to happen when devtools inspector is attached when page is being reloaded.
Here is the chain of blocked threads:
   - Main thread is blocked on compositor thread when it is ready to commit. [1] 
   - Compositor is producing impl frame and running animation which at the moment block on animation thread to receive its update [2]
   - Animation Thread (all worklet threads) can be paused waiting on inspecter task runner if one is attached. [3] 
   - I suspect somehow that the inspector task runner is either itself running on main thread or indirectly blocked on main thread. Haven't fully trace this part yet.
    
All this combined can lead to a deadlock.

[1] https://codesearch.chromium.org/chromium/src/cc/trees/proxy_main.cc?q=ProxyMain::BeginMainFrame&sq=package:chromium&dr=CSs&l=326
[2] https://codesearch.chromium.org/chromium/src/third_party/blink/renderer/platform/graphics/compositor_mutator_impl.cc?q=CompositorMutatorImpl::Mutate&dr=CSs&l=68
[3] https://codesearch.chromium.org/chromium/src/third_party/blink/renderer/core/workers/worker_thread.cc?type=cs&sq=package:chromium&g=0&l=276

# Additional confirmation

To do a quick confirmation, I modified the compositor mutator so that compositor thread do not wait on animation thread.
After that it no longer occurs. 

Further more, at the bottom are some stack traces when freeze occurs which I think further confirms my hypothesis.


## Solution

I am still thinking how best to fix this. Basically we need to break this cycle. 
Here are two options I can think of:
 1- No longer block the compositor thread on animation worklet
 2- See if we can special case animation thread to no be blocked on devtools inspector.

any ideas?

# Relevant Stack traces


## AnimationWorklet thread
#0  0x00007ffff7bc915f in pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007fffdfa7ae7d in WTF::ThreadCondition::Wait(WTF::Mutex&) (this=0x2dfa42f04660, mutex=...) at ../../third_party/blink/renderer/platform/wtf/threading_pthreads.cc:210
#2  0x00007fffdfa7afc8 in WTF::ThreadCondition::TimedWait(WTF::Mutex&, double) (this=0x2dfa42f04660, mutex=..., absolute_time=1.7976931348623157e+308) at ../../third_party/blink/renderer/platform/wtf/threading_pthreads.cc:222
#3  0x00007fffe19cc82f in blink::InspectorTaskRunner::TakeNextTask(blink::InspectorTaskRunner::WaitMode) (this=0x2dfa42f04610, wait_mode=blink::InspectorTaskRunner::kWaitForTask) at ../../third_party/blink/renderer/core/inspector/inspector_task_runner.cc:92
#4  0x00007fffe19cc5f7 in blink::InspectorTaskRunner::WaitForAndRunSingleTask() (this=0x2dfa42f04610) at ../../third_party/blink/renderer/core/inspector/inspector_task_runner.cc:67
#5  0x00007fffe21f2ec3 in blink::WorkerThread::StartRunningDebuggerTasksOnPauseOnWorkerThread() (this=0x3822a5103220) at ../../third_party/blink/renderer/core/workers/worker_thread.cc:278
#6  0x00007fffe21f06ed in blink::WorkerThread::InitializeOnWorkerThread(std::__1::unique_ptr<blink::GlobalScopeCreationParams, std::__1::default_delete<blink::GlobalScopeCreationParams> >, base::Optional<blink::WorkerBackingThreadStartupData> const&, blink::WorkerInspectorProxy::PauseOnWorkerStart) (this=0x3822a51032


## Main thread
#1  0x00007ffff79441a0 in base::ConditionVariable::Wait() (this=0x7fffa4078108) at ../../base/synchronization/condition_variable_posix.cc:71
#2  0x00007ffff7945d89 in base::WaitableEvent::TimedWaitUntil(base::TimeTicks const&) (this=0x7fffa4078e18, end_time=...) at ../../base/synchronization/waitable_event_posix.cc:223
#3  0x00007ffff7945a0c in base::WaitableEvent::Wait() (this=0x7fffa4078e18) at ../../base/synchronization/waitable_event_posix.cc:154
#4  0x00007fffee6b3812 in cc::CompletionEvent::Wait() (this=0x7fffa4078e18) at ../../cc/base/completion_event.h:43
#5  0x00007fffee84e8eb in cc::ProxyMain::BeginMainFrame(std::__1::unique_ptr<cc::BeginMainFrameAndCommitState, std::__1::default_delete<cc::BeginMainFrameAndCommitState> >) (this=0x3822a4d559b0, begin_main_frame_state=...) at ../../cc/trees/proxy_main.cc:326
#6  0x00007fffee84a6d3 in base::internal::FunctorTraits<void (cc::ProxyMain::*)(std::__1::unique_ptr<cc::BeginMainFrameAndCommitState, std::__1::default_delete<cc::BeginMainFrameAndCommitState> >), void>::Invoke<void (cc::ProxyMain::*)(std::__1::unique_ptr<cc::BeginMainFrameAndCommitState, std::__1::default_delete<cc


## Compositor thread
#0  0x00007ffff7bc915f in pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007ffff79441a0 in base::ConditionVariable::Wait() (this=0x7fffa2455858) at ../../base/synchronization/condition_variable_posix.cc:71
#2  0x00007ffff7945d89 in base::WaitableEvent::TimedWaitUntil(base::TimeTicks const&) (this=0x36ea4ac4dee0, end_time=...) at ../../base/synchronization/waitable_event_posix.cc:223
#3  0x00007ffff7945a0c in base::WaitableEvent::Wait() (this=0x36ea4ac4dee0) at ../../base/synchronization/waitable_event_posix.cc:154
#4  0x00007fffded91410 in blink::WaitableEvent::Wait() (this=0x7fffa2455b80) at ../../third_party/blink/renderer/platform/waitable_event.cc:32
#5  0x00007fffdea6d7f2 in blink::CompositorMutatorImpl::Mutate(std::__1::unique_ptr<cc::MutatorInputState, std::__1::default_delete<cc::MutatorInputState> >) (this=0x36ea449b08a0, state=...) at ../../third_party/blink/renderer/platform/graphics/compositor_mutator_impl.cc:68
#6  0x00007fffdea6c251 in blink::CompositorMutatorClient::Mutate(std::__1::unique_ptr<cc::MutatorInputState, std::__1::default_delete<cc::MutatorInputState> >) (this=0x36ea436fe660, input_state=...) at ../../third_party/blink/renderer/platform/graphics/compositor_mutator_client.cc:29
#7  0x00007fffe44c66a6 in cc::AnimationHost::TickAnimations(base::TimeTicks, cc::ScrollTree const&) (this=0x36ea45709e30, monotonic_time=..., scroll_tree=...) at ../../cc/animation/animation_host.cc:331
#8  0x00007fffee7aa1b8 in cc::LayerTreeHostImpl::AnimateLayers(base::TimeTicks, bool) (this=0x36ea44b1f420, monotonic_time=..., is_active_tree=true) at ../../cc/trees/layer_tree_host_impl.cc:4377
#9  0x00007fffee786352 in cc::LayerTreeHostImpl::AnimateInternal() (this=0x36ea44b1f420) at ../../cc/trees/layer_tree_host_impl.cc:583
#10 0x00007fffee785345 in cc::LayerTreeHostImpl::Animate() (this=0x36ea44b1f420) at ../../cc/trees/layer_tree_host_impl.cc:555
#11 0x00007fffee799f44 in cc::LayerTreeHostImpl::WillBeginImplFrame(viz::BeginFrameArgs const&) (this=0x36ea44b1f420, args=...) at ../../cc/trees/layer_tree_host_impl.cc:2277
#12 0x00007fffee8441d5 in cc::ProxyImpl::WillBeginImplFrame(viz::BeginFrameArgs const&) (this=0x36ea44b59de0, args=...) at ../../cc/trees/proxy_impl.cc:494
#13 0x00007fffee66b0ae in cc::Scheduler::BeginImplFrame(viz::BeginFrameArgs const&, base::TimeTicks) (this=0x36ea43700ba0, args=..., now=...) at ../../cc/scheduler/scheduler.cc:552
#14 0x00007fffee669ff1 in cc::Scheduler::BeginImplFrameWithDeadline(viz::BeginFrameArgs const&) (this=0x36ea43700ba0, args=...) at ../../cc/scheduler/scheduler.cc:483
#15 0x00007fffee668989 in cc::Scheduler::OnBeginFrameDerivedImpl(viz::BeginFrameArgs const&) (this=0x36ea43700ba0, args=...) at ../../cc/scheduler/scheduler.cc:312
#16 0x00007fffee0c2401 in viz::BeginFrameObserverBase::OnBeginFrame(viz::BeginFrameArgs const&) (this=0x36ea43700ba0, args=...) at ../../components/viz/common/frame_sinks/begin_frame_source.cc:66



As we discussed, it seems strange for the compositor thread to be trying to use the worklet thread before it has finished initializing. We should probably delay the call to register the global scope at least until we've finished blink::WorkerThread::InitializeOnWorkerThread

Once we start using the worklet thread, it should no longer block on the main thread, as in the future we may make blocking calls from the main thread to get non-accelerated animation values for main thread animation updates.
FWIW, I found at least one case where main thread is going to append a debug task immediately after it creates a worker. So perhaps when main thread is
blocked on impl it will never append that task which the worker thread is
paused waiting for it.

See: https://codesearch.chromium.org/chromium/src/third_party/blink/renderer/core/exported/web_shared_worker_impl.cc?type=cs&g=0&l=364

I have a WIP CL which delays registering the animation worklet global scope and it seems to be solving the issue. I just have to figure out how best to test it. 

https://chromium-review.googlesource.com/c/chromium/src/+/1062947 
Project Member

Comment 15 by bugdroid1@chromium.org, May 18 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/0ce9ac1f4475b3ffa835230f2152bca9890bda10

commit 0ce9ac1f4475b3ffa835230f2152bca9890bda10
Author: Majid Valipour <majidvp@chromium.org>
Date: Fri May 18 17:23:51 2018

[animation-worklet] Lazily register animation worklet global scope

Register Animation Worklet Global Scope (AWGS) with its proxy client only when
it has at least one animator definition.

Previously AWGS was being registered with its proxy client as soon as it was
constructed.

Turns out worklet threads may be indefinitely blocked to receive debugger tasks.
In practice this would happen when devtools is open and page is reloaded.

The was leading to deadlock because it was creating a cycle where compositor
thread could get indirectly blocked on main via the newly register animation
worklet global scope. Moreover, registering the global scope too eagerly when it
does not have any work to do is wasteful.

Lazily registering AWGS addresses the initialization process deadlock and
it is more efficient.

TEST: modules/animationworklet/animation_worklet_global_scope_test.cc
     and manual testing based on the instruction in the bug

BUG:  828425 
Change-Id: Iac8e1653d8a79bd3e27d9d9c5bfaf19174a59664
Reviewed-on: https://chromium-review.googlesource.com/1062947
Reviewed-by: Majid Valipour <majidvp@chromium.org>
Reviewed-by: Stephen McGruer <smcgruer@chromium.org>
Commit-Queue: Majid Valipour <majidvp@chromium.org>
Cr-Commit-Position: refs/heads/master@{#559936}
[modify] https://crrev.com/0ce9ac1f4475b3ffa835230f2152bca9890bda10/third_party/blink/renderer/modules/animationworklet/animation_worklet_global_scope.cc
[modify] https://crrev.com/0ce9ac1f4475b3ffa835230f2152bca9890bda10/third_party/blink/renderer/modules/animationworklet/animation_worklet_global_scope.h
[modify] https://crrev.com/0ce9ac1f4475b3ffa835230f2152bca9890bda10/third_party/blink/renderer/modules/animationworklet/animation_worklet_global_scope_test.cc
[modify] https://crrev.com/0ce9ac1f4475b3ffa835230f2152bca9890bda10/third_party/blink/renderer/modules/animationworklet/animation_worklet_proxy_client_impl.cc

Project Member

Comment 16 by bugdroid1@chromium.org, May 18 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8812fdd2571377c30bbadaf774bdc56c37037194

commit 8812fdd2571377c30bbadaf774bdc56c37037194
Author: Thomas Anderson <thomasanderson@chromium.org>
Date: Fri May 18 18:04:07 2018

Revert "[animation-worklet] Lazily register animation worklet global scope"

This reverts commit 0ce9ac1f4475b3ffa835230f2152bca9890bda10.

Reason for revert: Breaks jumbo build:
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/linux-jumbo-rel/1640

Original change's description:
> [animation-worklet] Lazily register animation worklet global scope
> 
> Register Animation Worklet Global Scope (AWGS) with its proxy client only when
> it has at least one animator definition.
> 
> Previously AWGS was being registered with its proxy client as soon as it was
> constructed.
> 
> Turns out worklet threads may be indefinitely blocked to receive debugger tasks.
> In practice this would happen when devtools is open and page is reloaded.
> 
> The was leading to deadlock because it was creating a cycle where compositor
> thread could get indirectly blocked on main via the newly register animation
> worklet global scope. Moreover, registering the global scope too eagerly when it
> does not have any work to do is wasteful.
> 
> Lazily registering AWGS addresses the initialization process deadlock and
> it is more efficient.
> 
> TEST: modules/animationworklet/animation_worklet_global_scope_test.cc
>      and manual testing based on the instruction in the bug
> 
> BUG:  828425 
> Change-Id: Iac8e1653d8a79bd3e27d9d9c5bfaf19174a59664
> Reviewed-on: https://chromium-review.googlesource.com/1062947
> Reviewed-by: Majid Valipour <majidvp@chromium.org>
> Reviewed-by: Stephen McGruer <smcgruer@chromium.org>
> Commit-Queue: Majid Valipour <majidvp@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#559936}

TBR=majidvp@chromium.org,smcgruer@chromium.org

Change-Id: Id7f0c78ea8676ab7a1c8ebabdd3ce19b5ca83d04
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://chromium-review.googlesource.com/1066154
Reviewed-by: Thomas Anderson <thomasanderson@chromium.org>
Commit-Queue: Thomas Anderson <thomasanderson@chromium.org>
Cr-Commit-Position: refs/heads/master@{#559954}
[modify] https://crrev.com/8812fdd2571377c30bbadaf774bdc56c37037194/third_party/blink/renderer/modules/animationworklet/animation_worklet_global_scope.cc
[modify] https://crrev.com/8812fdd2571377c30bbadaf774bdc56c37037194/third_party/blink/renderer/modules/animationworklet/animation_worklet_global_scope.h
[modify] https://crrev.com/8812fdd2571377c30bbadaf774bdc56c37037194/third_party/blink/renderer/modules/animationworklet/animation_worklet_global_scope_test.cc
[modify] https://crrev.com/8812fdd2571377c30bbadaf774bdc56c37037194/third_party/blink/renderer/modules/animationworklet/animation_worklet_proxy_client_impl.cc

Project Member

Comment 17 by bugdroid1@chromium.org, May 18 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/0b2f6b4c90d1c52d444095396ec2055913054448

commit 0b2f6b4c90d1c52d444095396ec2055913054448
Author: Majid Valipour <majidvp@chromium.org>
Date: Fri May 18 19:48:28 2018

Reland "[animation-worklet] Lazily register animation worklet global scope"

Fix jumbo build issue by s/TestAnimationWorkletProxyClient/MockAnimationWorkletProxyClient/


This is a reland of 0ce9ac1f4475b3ffa835230f2152bca9890bda10

Original change's description:
> [animation-worklet] Lazily register animation worklet global scope
>
> Register Animation Worklet Global Scope (AWGS) with its proxy client only when
> it has at least one animator definition.
>
> Previously AWGS was being registered with its proxy client as soon as it was
> constructed.
>
> Turns out worklet threads may be indefinitely blocked to receive debugger tasks.
> In practice this would happen when devtools is open and page is reloaded.
>
> The was leading to deadlock because it was creating a cycle where compositor
> thread could get indirectly blocked on main via the newly register animation
> worklet global scope. Moreover, registering the global scope too eagerly when it
> does not have any work to do is wasteful.
>
> Lazily registering AWGS addresses the initialization process deadlock and
> it is more efficient.
>
> TEST: modules/animationworklet/animation_worklet_global_scope_test.cc
>      and manual testing based on the instruction in the bug
>
> BUG:  828425 
> Change-Id: Iac8e1653d8a79bd3e27d9d9c5bfaf19174a59664
> Reviewed-on: https://chromium-review.googlesource.com/1062947
> Reviewed-by: Majid Valipour <majidvp@chromium.org>
> Reviewed-by: Stephen McGruer <smcgruer@chromium.org>
> Commit-Queue: Majid Valipour <majidvp@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#559936}

TBR: smcgruer@chromium.org

Cq-Include-Trybots: master.tryserver.blink:linux_trusty_blink_rel;master.tryserver.chromium.linux:linux_layout_tests_slimming_paint_v2

Change-Id: I64843615f4af78418682500bb75053bf3f00ce1f
Reviewed-on: https://chromium-review.googlesource.com/1065431
Reviewed-by: Majid Valipour <majidvp@chromium.org>
Reviewed-by: Stephen McGruer <smcgruer@chromium.org>
Commit-Queue: Stephen McGruer <smcgruer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#559997}
[modify] https://crrev.com/0b2f6b4c90d1c52d444095396ec2055913054448/third_party/blink/renderer/modules/animationworklet/animation_worklet_global_scope.cc
[modify] https://crrev.com/0b2f6b4c90d1c52d444095396ec2055913054448/third_party/blink/renderer/modules/animationworklet/animation_worklet_global_scope.h
[modify] https://crrev.com/0b2f6b4c90d1c52d444095396ec2055913054448/third_party/blink/renderer/modules/animationworklet/animation_worklet_global_scope_test.cc
[modify] https://crrev.com/0b2f6b4c90d1c52d444095396ec2055913054448/third_party/blink/renderer/modules/animationworklet/animation_worklet_proxy_client_impl.cc

Status: Fixed (was: Started)
This should be fixed now!

Sign in to add a comment