Issue 807297

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Apr 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 2
Type: Bug



Scheduler crash when exiting nested run loop

Project Member Reported by skyos...@chromium.org, Jan 30 2018

Issue description

Crash link: https://goto.google.com/kpuzq

Most of the crashes seem to occur when exiting a nested run loop, for example when printing or unpausing the debugger. A sample stack:

0x000007fed102a587	(chrome_child.dll -intrusive_heap.h:215 )	blink::scheduler::IntrusiveHeap<blink::scheduler::internal::WorkQueueSets::OldestTaskEnqueueOrder>::MoveHoleDownAndFillWithLeafElement(unsigned __int64,blink::scheduler::internal::WorkQueueSets::OldestTaskEnqueueOrder &&)
0x000007fed207b7de	(chrome_child.dll -work_queue_sets.cc:61 )	blink::scheduler::internal::WorkQueueSets::OnFrontTaskChanged(blink::scheduler::internal::WorkQueue *)
0x000007fed207b6a1	(chrome_child.dll -work_queue.cc:113 )	blink::scheduler::internal::WorkQueue::PushNonNestableTaskToFront(blink::scheduler::internal::TaskQueueImpl::Task)
0x000007fed2077e43	(chrome_child.dll -task_queue_impl.cc:887 )	blink::scheduler::internal::TaskQueueImpl::RequeueDeferredNonNestableTask(blink::scheduler::internal::TaskQueueImpl::Task &&,blink::scheduler::internal::Sequence::WorkType)
0x000007fed207816a	(chrome_child.dll -task_queue_manager.cc:210 )	blink::scheduler::TaskQueueManager::OnExitNestedRunLoop()
0x000007fed2193eaf	(chrome_child.dll -run_loop.cc:326 )	base::RunLoop::AfterRun()
0x000007fed2332acc	(chrome_child.dll -ipc_sync_channel.cc:701 )	IPC::SyncChannel::WaitForReplyWithNestedMessageLoop(IPC::SyncChannel::SyncContext *)
0x000007fed11a4d20	(chrome_child.dll -ipc_sync_channel.cc:691 )	IPC::SyncChannel::WaitForReply(mojo::SyncHandleRegistry *,IPC::SyncChannel::SyncContext *,bool)
0x000007fed0efea7b	(chrome_child.dll -ipc_sync_channel.cc:635 )	IPC::SyncChannel::Send(IPC::Message *)
0x000007fed0efe90e	(chrome_child.dll -render_thread_impl.cc:1069 )	content::RenderThreadImpl::Send(IPC::Message *)
0x000007fed373fe5f	(chrome_child.dll -print_render_frame_helper.cc:2024 )	printing::PrintRenderFrameHelper::RequestPrintPreview(printing::PrintRenderFrameHelper::PrintPreviewRequestType)
0x000007fed354c811	(chrome_child.dll -render_frame_impl.cc:1652 )	content::RenderFrameImpl::ScriptedPrint(bool)
0x000007fed337ff92	(chrome_child.dll -ChromeClient.cpp:249 )	blink::ChromeClient::Print(blink::LocalFrame *)
 
Cc: -alexclarke@chromium.org altimin@chromium.org
Components: Blink>Scheduling
Labels: -Pri-3 Pri-2
Owner: alexclarke@chromium.org
Updated link with all Chrome versions, filtered by 'TaskQueueManager::OnExitNestedRunLoop': https://goto.google.com/djdiw

The crashes started around Jan 6, so the suspected cause is https://chromium-review.googlesource.com/817595.
Project Member

Comment 2 by bugdroid1@chromium.org, Feb 1 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e803632e9b6c578015ef112babf8c3fc515d527b

commit e803632e9b6c578015ef112babf8c3fc515d527b
Author: Alexander Timin <altimin@chromium.org>
Date: Thu Feb 01 11:56:00 2018

[scheduler] Fix crash involving fences and non-nestable tasks.

When a non-nestable task is posted inside a nested message loop and a
queue blocked by fence subsequently that means that a blocked queue
will have work posted before inserting a fence and will become unblocked.

Check for this case to correctly notify WorkQueueSets about this
situation (OnTaskPushedToEmptyQueue instead of OnFrontTaskChanged).

R=alexclarke@chromium.org
BUG= 807297 

Change-Id: I66acf84003da655771b4f7a44f77b519ce72170b
Reviewed-on: https://chromium-review.googlesource.com/895587
Commit-Queue: Alexander Timin <altimin@chromium.org>
Reviewed-by: Alex Clarke <alexclarke@chromium.org>
Cr-Commit-Position: refs/heads/master@{#533642}
[modify] https://crrev.com/e803632e9b6c578015ef112babf8c3fc515d527b/third_party/WebKit/Source/platform/scheduler/base/task_queue_manager_unittest.cc
[modify] https://crrev.com/e803632e9b6c578015ef112babf8c3fc515d527b/third_party/WebKit/Source/platform/scheduler/base/work_queue.cc
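
The case described in the commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code from work_queue.cc: ToyQueueSets, ToyWorkQueue and DemoFenceUnblockedByRequeue are hypothetical names, a fence value of 0 is assumed to mean "no fence", and the set of tracked queues stands in for the intrusive heap seen in the stack trace. It shows a queue that is blocked by a fence (and therefore not tracked) becoming unblocked when an older non-nestable task is pushed to its front, and why that transition has to be reported as OnTaskPushedToEmptyQueue rather than OnFrontTaskChanged.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <set>

struct ToyWorkQueue;

// Hypothetical stand-in for WorkQueueSets: it only remembers which queues it
// currently tracks as runnable (the real class keeps them in an intrusive heap).
struct ToyQueueSets {
  std::set<const ToyWorkQueue*> tracked;

  // Starts tracking a queue that was previously empty or blocked.
  void OnTaskPushedToEmptyQueue(const ToyWorkQueue* queue) {
    tracked.insert(queue);
  }

  // Only valid for a queue that is already tracked.
  void OnFrontTaskChanged(const ToyWorkQueue* queue) {
    assert(tracked.count(queue) == 1);
  }
};

struct ToyWorkQueue {
  std::deque<std::uint64_t> tasks;  // Enqueue orders; front() runs first.
  std::uint64_t fence = 0;          // 0 means "no fence" in this toy model.
  ToyQueueSets* sets = nullptr;

  // Assumed rule: a fenced queue is blocked when it is empty or when its
  // front task was enqueued at or after the fence.
  bool BlockedByFence() const {
    if (fence == 0) return false;
    return tasks.empty() || tasks.front() >= fence;
  }

  void PushNonNestableTaskToFront(std::uint64_t enqueue_order) {
    const bool was_empty = tasks.empty();
    const bool was_blocked = BlockedByFence();
    tasks.push_front(enqueue_order);

    if (BlockedByFence()) return;  // Still blocked: nothing to report.

    if (was_empty || was_blocked) {
      sets->OnTaskPushedToEmptyQueue(this);  // Queue just became runnable.
    } else {
      sets->OnFrontTaskChanged(this);  // Queue was already tracked.
    }
  }
};

// Builds the scenario from the commit message and reports whether the queue
// ends up unblocked and tracked by the sets.
bool DemoFenceUnblockedByRequeue() {
  ToyQueueSets sets;
  ToyWorkQueue queue;
  queue.sets = &sets;
  queue.tasks.push_back(10);  // Task posted after the fence.
  queue.fence = 5;            // Queue is blocked, so the sets do not track it.

  queue.PushNonNestableTaskToFront(3);  // Deferred task predates the fence.
  return !queue.BlockedByFence() && sets.tracked.count(&queue) == 1;
}
```

In this model, dropping the was_blocked check would send the requeue down the OnFrontTaskChanged path for a queue the sets have never seen, which is the kind of inconsistency the fix is described as guarding against.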

Project Member

Comment 3 by sheriffbot@chromium.org, Feb 6 2018

Labels: FoundIn-M-65 OS-Windows Fracas
Users experienced this crash on the following builds:

Win Dev 65.0.3325.31 -  0.11 CPM, 42 reports, 41 clients (signature blink::scheduler::IntrusiveHeap<blink::scheduler::internal::WorkQueueSets::OldestTaskEnqueueOrder>::MoveHoleDownAndFillWithLeafElement)

If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates.

- Go/Fracas
Cc: brajkumar@chromium.org
Crash instances are still being observed on the latest Chrome beta, 65.0.3325.51, with 128 instances. Per the link below, crashes continue on the latest M65. This crash is currently ranked #30 for the renderer process on Windows. As of now, no crash instances are seen on the current dev and canary builds; the last crash was seen on 66.0.3336.6, with 1 instance.

Link to list of the builds:
----------------------------
https://crash.corp.google.com/browse?q=product.name%3D%27Chrome%27%20%20AND%20expanded_custom_data.ChromeCrashProto.ptype%3D%27renderer%27%20AND%20expanded_custom_data.ChromeCrashProto.magic_signature_1.name%3D%27blink%3A%3Ascheduler%3A%3AIntrusiveHeap%3Cblink%3A%3Ascheduler%3A%3Ainternal%3A%3AWorkQueueSets%3A%3AOldestTaskEnqueueOrder%3E%3A%3AMoveHoleDownAndFillWithLeafElement%27#-samplereports,productversion:1000,-magicsignature:50,-magicsignature2:50,-stablesignature:50,-magicsignaturesorted:50

alexclarke@ Could you please take a look at this issue?

Thanks!
Labels: Merge-Request-65
Owner: altimin@chromium.org
There are no crashes on canary since 66.0.3336.6, so we just need to merge the fix back to M65.
Project Member

Comment 6 by sheriffbot@chromium.org, Feb 14 2018

Labels: -Merge-Request-65 Merge-Review-65 Hotlist-Merge-Review
This bug requires manual review: M65 has already been promoted to the beta branch, so this requires manual review
Please contact the milestone owner if you have questions.
Owners: cmasso@(Android), cmasso@(iOS), bhthompson@(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Comment 7 by gov...@chromium.org, Feb 14 2018


Before we approve the merge to M65, could you please confirm the following?

Is the change well baked/verified in Canary, with enough automated test coverage, and safe to merge?
Are there any other important details to justify the merge?

Please note M65 is already promoted to Beta, so the merge bar is very high. Thank you.

Re 7: Yes, it's been in Canary for 2 weeks, has test coverage and the number of crashes went down.

Comment 9 by gov...@chromium.org, Feb 16 2018

Labels: -Merge-Review-65 Merge-Approved-65
Approving merge to M65 branch 3325 based on comment #8. Please merge ASAP so we can pick it up for next week's beta release. Thank you.
Project Member

Comment 10 by sheriffbot@chromium.org, Feb 19 2018

Cc: gov...@chromium.org
This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible!

If all merges have been completed, please remove any remaining Merge-Approved labels from this issue.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Project Member

Comment 11 by bugdroid1@chromium.org, Feb 20 2018

Labels: -merge-approved-65 merge-merged-3325
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/586ef45901b422fcf8e89600a1ad00024679223c

commit 586ef45901b422fcf8e89600a1ad00024679223c
Author: Alexander Timin <altimin@chromium.org>
Date: Tue Feb 20 17:58:38 2018

[scheduler] Fix crash involving fences and non-nestable tasks.

When a non-nestable task is posted inside a nested message loop and a
queue blocked by fence subsequently that means that a blocked queue
will have work posted before inserting a fence and will become unblocked.

Check for this case to correctly notify WorkQueueSets about this
situation (OnTaskPushedToEmptyQueue instead of OnFrontTaskChanged).

R=alexclarke@chromium.org
TBR=altimin@chromium.org
BUG= 807297 

(cherry picked from commit e803632e9b6c578015ef112babf8c3fc515d527b)

Change-Id: I66acf84003da655771b4f7a44f77b519ce72170b
Reviewed-on: https://chromium-review.googlesource.com/895587
Commit-Queue: Alexander Timin <altimin@chromium.org>
Reviewed-by: Alex Clarke <alexclarke@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#533642}
Reviewed-on: https://chromium-review.googlesource.com/926941
Reviewed-by: Alexander Timin <altimin@chromium.org>
Cr-Commit-Position: refs/branch-heads/3325@{#507}
Cr-Branched-From: bc084a8b5afa3744a74927344e304c02ae54189f-refs/heads/master@{#530369}
[modify] https://crrev.com/586ef45901b422fcf8e89600a1ad00024679223c/third_party/WebKit/Source/platform/scheduler/base/task_queue_manager_unittest.cc
[modify] https://crrev.com/586ef45901b422fcf8e89600a1ad00024679223c/third_party/WebKit/Source/platform/scheduler/base/work_queue.cc

Status: Fixed (was: Assigned)
