New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 901345 link

Starred by 1 user

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocked on: View detail
issue 901800
issue 901805

Blocking:
issue 891670



Sign in to add a comment

Lock-free cross-thread task posting support

Project Member Reported by altimin@chromium.org, Nov 2

Issue description

At the moment ThreadController uses lock to protect against race conditions when creating the pump. Port lock-free logic from MessageLoop to avoid locking.
 
Cc: gab@chromium.org
Labels: -Pri-3 Pri-1
Summary: Lock-free cross-thread task posting support (was: Remove locking from ThreadControllerWithMessagePump)
It turns out that we need to implement lock-free task posting in SequenceManager.
- We'll need to copy atomic trick from MessageLoop to TaskQueueImpl
- DelayedIncomingQueue can become lock-free.
- We can refactor TimeDomain to be const and avoid locking there. 
Blocking: 891670
Recap of offline discussion : The most important part is not lock-free logic at creation. It's lock-free logic while invoking ScheduleWork() to avoid regressing  issue 890978 .

The MessageLoop logic does tell MessageLoopTaskRunner to stop accepting tasks during shutdown. But since it invokes ScheduleWork() without a lock, we need the atomic detach logic in MessageLoop::Controller to ensure no use-after-frees (i.e. MessageLoop::Controller is the transactional funnel point to all post tasks + detach).

This lock is currently by far the most held lock on the Blink main thread : https://uma.googleplex.com/p/chrome/callstacks?sid=4fac8fb314eef317844ae3377042b066 (though @ 0.5% it's not as bad as  issue 890978  was in the browser; likely because the browser uses background priority threads more and suffered more from priority inversions induced by this).

On BrowserThread::UI, that improvement reduced LockImpl::Lock overhead on the main thread by 66% at runtime (https://uma.googleplex.com/p/chrome/callstacks?sid=2272e8f261252fc2f5e75502b96d45ca) and by 85% on startup (https://uma.googleplex.com/p/chrome/callstacks?sid=987a001cf2533ab9fe77ef7d0ceb9f9d)
Note that's only true when ThreadControllerWithMessagePumpImpl is used.  We should think about migrating blink over to ThreadControllerWithMessagePumpImpl. 
No I'm wrong. We do have a lock inside ThreadControllerWithMessagePumpImpl::ScheduleWork.  That should use atomics instead.
You hold a lock for the entirety of any cross-thread PostTask (which are the most likely to ScheduleWork()) @ https://cs.chromium.org/chromium/src/base/task/sequence_manager/task_queue_proxy.cc?type=cs&q=file:task_queue_proxy.cc+AutoLock&g=0&l=34

This is specifically the scenario for which MessageLoopTaskRunner releases the queue lock before relying on MessageLoop::Controller to ScheduleWork() atomically.
Blockedon: 901805 901800
Re. #5-6 : So you have two problematic locks to address:
 1) ThreadControllerWithMessagePumpImpl::ScheduleWork()
 2) TaskQueueProxy::PostTask() (will need to keep the lock for queuing but must release it before scheduling work)
πŸ“ Job complete. See results below.
https://pinpoint-dot-chromeperf.appspot.com/job/168f665be40000
πŸ“ Job complete. See results below.
https://pinpoint-dot-chromeperf.appspot.com/job/1004de60140000
Project Member

Comment 13 by bugdroid1@chromium.org, Nov 20

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/0554cbe7f341f1e84ccbeb0953270c197e1990ef

commit 0554cbe7f341f1e84ccbeb0953270c197e1990ef
Author: Carlos Caballero <carlscab@google.com>
Date: Tue Nov 20 12:18:50 2018

[task] New OperationsController class

Create a helper class to manage critical multi-threaded operations
without locks.

This class will be used to remove the lock in
base/task/sequence_manager/task_queue_proxy.h and to replace similar
code in base/task/task_scheduler/task_tracker.h, and
base/message_loop/message_loop_impl.cc

Inspired from MessageLoopImpl::Controller.

Bug: 901345
Change-Id: I56d2f2143e396c7bc81a76a23fe4731d280455d1
Reviewed-on: https://chromium-review.googlesource.com/c/1335942
Commit-Queue: Carlos Caballero <carlscab@google.com>
Reviewed-by: Gabriel Charette <gab@chromium.org>
Reviewed-by: Alexander Timin <altimin@chromium.org>
Reviewed-by: Alex Clarke <alexclarke@chromium.org>
Reviewed-by: Sami KyΓΆstilΓ€ <skyostil@chromium.org>
Cr-Commit-Position: refs/heads/master@{#609670}
[modify] https://crrev.com/0554cbe7f341f1e84ccbeb0953270c197e1990ef/base/BUILD.gn
[add] https://crrev.com/0554cbe7f341f1e84ccbeb0953270c197e1990ef/base/task/common/operations_controller.cc
[add] https://crrev.com/0554cbe7f341f1e84ccbeb0953270c197e1990ef/base/task/common/operations_controller.h
[add] https://crrev.com/0554cbe7f341f1e84ccbeb0953270c197e1990ef/base/task/common/operations_controller_unittest.cc

Project Member

Comment 14 by bugdroid1@chromium.org, Nov 21

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/5458566c937877111800a8124c895b42a34108f8

commit 5458566c937877111800a8124c895b42a34108f8
Author: Carlos Caballero <carlscab@google.com>
Date: Wed Nov 21 09:09:06 2018

Use OperationsController in MessageLoopImpl::Controller

Replace all the state checking and tracking code with the new and shiny
OperationsController.

Bug: 901345
Change-Id: I8023e78ebfee679fd565de465e810dbb9c6ee2ec
Reviewed-on: https://chromium-review.googlesource.com/c/1341525
Commit-Queue: Gabriel Charette <gab@chromium.org>
Reviewed-by: Gabriel Charette <gab@chromium.org>
Cr-Commit-Position: refs/heads/master@{#609960}
[modify] https://crrev.com/5458566c937877111800a8124c895b42a34108f8/base/message_loop/message_loop_impl.cc

Project Member

Comment 15 by bugdroid1@chromium.org, Nov 21

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1e6a6033f1b2927d266240ac0521f1807aa73ca0

commit 1e6a6033f1b2927d266240ac0521f1807aa73ca0
Author: Carlos Caballero <carlscab@google.com>
Date: Wed Nov 21 10:22:10 2018

Replace lock with base::internal::OperationsController

BUG=901345

Change-Id: I90afa3439fedc68032ab69158d2ecfae2ba52884
Reviewed-on: https://chromium-review.googlesource.com/c/1343088
Commit-Queue: Carlos Caballero <carlscab@google.com>
Reviewed-by: Alex Clarke <alexclarke@chromium.org>
Cr-Commit-Position: refs/heads/master@{#609984}
[modify] https://crrev.com/1e6a6033f1b2927d266240ac0521f1807aa73ca0/base/task/sequence_manager/thread_controller_with_message_pump_impl.cc
[modify] https://crrev.com/1e6a6033f1b2927d266240ac0521f1807aa73ca0/base/task/sequence_manager/thread_controller_with_message_pump_impl.h

Project Member

Comment 16 by bugdroid1@chromium.org, Nov 21

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/967d3a9b2ec4a39d32721d98837c774982f5269d

commit 967d3a9b2ec4a39d32721d98837c774982f5269d
Author: Carlos Caballero <carlscab@google.com>
Date: Wed Nov 21 21:29:23 2018

Nit: Remove superfluous global namespace prefix

Bug: 901345
Change-Id: I3fb2bcdab9a61fc818da6a4cc6439635f9421b31
Reviewed-on: https://chromium-review.googlesource.com/c/1346834
Reviewed-by: Gabriel Charette <gab@chromium.org>
Commit-Queue: Gabriel Charette <gab@chromium.org>
Cr-Commit-Position: refs/heads/master@{#610220}
[modify] https://crrev.com/967d3a9b2ec4a39d32721d98837c774982f5269d/base/task/sequence_manager/thread_controller_with_message_pump_impl.h

Project Member

Comment 18 by bugdroid1@chromium.org, Jan 16 (6 days ago)

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3887ea2cf3939e25f397440da3cedfb32df9f2a0

commit 3887ea2cf3939e25f397440da3cedfb32df9f2a0
Author: Carlos Caballero <carlscab@google.com>
Date: Wed Jan 16 13:37:41 2019

Remove thread checks in OperationsController

There is no need to perform those checks in this class. It should be up
to the user to determine what the threading requirements are. In
particular there are use cases where StartAcceptingOperations and
ShutdownAndWaitForZeroOperations are called from different threads.

Bug: 901345

Change-Id: I15be7eddd49d490bff398f2a06b029c66c3bc8f9
Reviewed-on: https://chromium-review.googlesource.com/c/1407191
Commit-Queue: Carlos Caballero <carlscab@google.com>
Reviewed-by: Gabriel Charette <gab@chromium.org>
Reviewed-by: Alexander Timin <altimin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#623215}
[modify] https://crrev.com/3887ea2cf3939e25f397440da3cedfb32df9f2a0/base/message_loop/message_loop_impl.cc
[modify] https://crrev.com/3887ea2cf3939e25f397440da3cedfb32df9f2a0/base/task/common/operations_controller.cc
[modify] https://crrev.com/3887ea2cf3939e25f397440da3cedfb32df9f2a0/base/task/common/operations_controller.h

Project Member

Comment 19 by bugdroid1@chromium.org, Jan 16 (6 days ago)

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/71484617b8149e9c6f6572006d384aa609bb2b93

commit 71484617b8149e9c6f6572006d384aa609bb2b93
Author: Carlos Caballero <carlscab@google.com>
Date: Wed Jan 16 14:28:39 2019

Remove TaskQueueProxy

Simplify the code by using an base::internal::OperationsController. This
will also get rid of of an unnecessary lock.

Bug: 901345

Change-Id: Ie78b3338d9ab6643f30834167a8159f02bd7e5cd
Reviewed-on: https://chromium-review.googlesource.com/c/1407078
Commit-Queue: Carlos Caballero <carlscab@google.com>
Reviewed-by: Alex Clarke <alexclarke@chromium.org>
Reviewed-by: Alexander Timin <altimin@chromium.org>
Reviewed-by: Gabriel Charette <gab@chromium.org>
Cr-Commit-Position: refs/heads/master@{#623227}
[modify] https://crrev.com/71484617b8149e9c6f6572006d384aa609bb2b93/base/BUILD.gn
[modify] https://crrev.com/71484617b8149e9c6f6572006d384aa609bb2b93/base/task/sequence_manager/task_queue.cc
[modify] https://crrev.com/71484617b8149e9c6f6572006d384aa609bb2b93/base/task/sequence_manager/task_queue_impl.cc
[modify] https://crrev.com/71484617b8149e9c6f6572006d384aa609bb2b93/base/task/sequence_manager/task_queue_impl.h
[delete] https://crrev.com/85f4cd7f81ea9fe24d70d3de8e6ae13b78c46846/base/task/sequence_manager/task_queue_proxy.cc
[delete] https://crrev.com/85f4cd7f81ea9fe24d70d3de8e6ae13b78c46846/base/task/sequence_manager/task_queue_proxy.h
[delete] https://crrev.com/85f4cd7f81ea9fe24d70d3de8e6ae13b78c46846/base/task/sequence_manager/task_queue_task_runner.cc
[delete] https://crrev.com/85f4cd7f81ea9fe24d70d3de8e6ae13b78c46846/base/task/sequence_manager/task_queue_task_runner.h

Project Member

Comment 20 by 42576172...@developer.gserviceaccount.com, Jan 16 (6 days ago)

😿 Pinpoint job stopped with an error.
https://pinpoint-dot-chromeperf.appspot.com/job/1588ffa8540000

All of the runs failed. The most common error (1/20 runs) was:
IOError: [Errno 2] No such file or directory: '/b/s/w/itbqk7Mj/tmpW8rrfutelemetry/histograms.json'

Sign in to add a comment