Investigate using Callback::IsCancelled() in MessageLoop / TaskScheduler |
|||||||
Issue descriptionI'm going to measure how many tasks are actually cancelled in the current MessageLoop / WorkerPool implementations. If we see a fair number of cancellations, we may want to implement the same optimization Blink's task scheduler uses.
,
Oct 6 2016
You're probably thinking of this which was for the renderer: https://docs.google.com/document/d/1IummzxGNDkgBdPOZFxyPPFhPIGLqrB5N8UpX65vOEjk/edit I don't have any intuition over how frequent task cancellation is for the browser, however I would not be surprised if optimizations do proved worthwhile.
,
Oct 6 2016
Measuring quantity should be pretty straightforward: I'm just going to instrument TaskAnnotator::RunTask and see what the raw numbers look like. Measuring time saved will require an actual perf test. The good news: MessagePump already has a perf test. Yay! The bad news: the perf test hasn't actually been run in ages, as it crashes somewhere in the middle =)
,
Oct 6 2016
The TaskScheduler has no workload currently by default so you won't see much there :-). We have a work item for ourselves to look into this once we do have a large workload.
,
Oct 6 2016
I threw together a rough patch [1] to measure cancellations and did some emails / code review in the built Chrome binary.
The current numbers look something like this:
Histogram: TaskRunner.DispatchedTaskIsCancelled recorded 2801777 samples, mean = 0.0 (flags = 0x1)
0 ------------------------------------------------------------------------O (2735971 = 97.7%)
1 --O (65806 = 2.3%) {97.7%}
2 O (0 = 0.0%) {100.0%}
Histogram: TaskRunner.ExecutingDelayedTaskIsCancelled recorded 437362 samples, mean = 0.1 (flags = 0x41)
0 ------------------------------------------------------------------------O (384761 = 88.0%)
1 ----------O (52601 = 12.0%) {88.0%}
2 O (0 = 0.0%) {100.0%}
Histogram: TaskRunner.ScheduledTaskIsCancelled recorded 260288 samples, mean = 0.1 (flags = 0x41)
0 ------------------------------------------------------------------------O (229681 = 88.2%)
1 ----------O (30607 = 11.8%) {88.2%}
2 O (0 = 0.0%) {100.0%}
Histogram: TaskRunner.WokeUpTooSoonTaskIsCancelled recorded 3142498 samples, mean = 0.1 (flags = 0x41)
0 ------------------------------------------------------------------------O (2811652 = 89.5%)
1 --------O (330846 = 10.5%) {89.5%}
2 O (0 = 0.0%) {100.0%}
Several things to note:
- The message loop mostly dispatches immediate tasks:
we dispatched ~2.8M immediate tasks and only ~430K delayed tasks.
- Cancelled task % has been trending down slowly over time.
- Unclear why we end up in WokeUpTooSoon so much: I suppose there's
things other than
In addition, the cost of dispatching / scheduling for the delayed task is unclear. It turns out we have a MessagePump perf test, which still builds. However, no one has run it in a long time it seems: the perftest aborts before the run is complete.
[1] https://codereview.chromium.org/2398263002/
,
Oct 7 2016
Interesting. The ratio of cancelled tasks seems high enough that we might want to prune them -- especially for the extra wakeups. WDYT?
,
Oct 7 2016
Re. WokeUpTooSoon (on Windows) : from this doc (https://docs.google.com/document/d/1dYaoZTcdzFQCf7hg3azMontkjzF_zzcKEtxPb7VET6c/edit), it appears we might be under sleeping by a tiny amount fairly often on Windows (fdoray repro'ed this in the lab for task scheduler too and we are exploring solutions): "when message pump waits for a delayed work it is very likely to get out of sleep before it is time to run the earliest delayed task"
,
Oct 7 2016
Actually it does look like an attempt to address this was made : https://cs.chromium.org/chromium/src/base/message_loop/message_pump_default.cc?rcl=1475824766&l=60 but under sleeping is still a possibility (ref. WaitableEvent::TimedWait() in waitable_event_win.cc).
,
Oct 7 2016
Yeah, I think it might be worth trying. I'll see what the numbers look like after leaving Chrome idle overnight when I get in today. I actually prepared a patch for this [1] before I wrote the metrics, but it appears I don't know how to use the perf trybots. One other thing I'd like to try out is measuring the cost of running cancelled tasks, but I need to fix the broken perftest first. [1] https://codereview.chromium.org/2394833004/
,
Nov 7 2016
,
May 9 2017
,
May 9 2017
Ping, any update? Issue 716187 is an example where a cancellation functionality is handy. In the network stack, we post a delayed task for every socket connect. These socket connects finish quickly, but the delayed tasks are left over in IO thread's task queue for a long time. The more network requests we make the longer our delayed task queue is. It will be nice if we have a general solution for problems like this, as other parts of the codebase do use similar patterns (https://cs.chromium.org/search/?q=%26%5Ba-z:%5D*timeout+package:%5Echromium$&type=cs)
,
May 31 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/6a32dcf0c4968f022b2f603912ce2804606636e1 commit 6a32dcf0c4968f022b2f603912ce2804606636e1 Author: Daniel Cheng <dcheng@chromium.org> Date: Wed May 31 21:32:10 2017 Use task cancellation status in base::MessageLoop. MessageLoop now sweeps cancelled tasks from the front of the work queue before dispatching or scheduling work. Previously, MessageLoop did not use the task cancellation status. It often scheduled wakeups for cancelled tasks or dispatched cancelled tasks, relying on base::Callback to turn it into a no-op. This was problematic for several reasons: - Scheduling wakeups based on cancelled tasks means the wakeup is spurious. - MessageLoop only processes one non-delayed and one delayed task per tick, so having many cancelled tasks results in spinning the message loop for awhile to drain them. Bug: 653394 Change-Id: I7287d5d430c1791535f7274c56c40f1e3f8669a5 Reviewed-on: https://chromium-review.googlesource.com/505274 Commit-Queue: Daniel Cheng <dcheng@chromium.org> Reviewed-by: danakj <danakj@chromium.org> Reviewed-by: Mike Wittman <wittman@chromium.org> Reviewed-by: Alexei Svitkine <asvitkine@chromium.org> Reviewed-by: Wez <wez@chromium.org> Cr-Commit-Position: refs/heads/master@{#476033} [modify] https://crrev.com/6a32dcf0c4968f022b2f603912ce2804606636e1/base/message_loop/message_loop.cc [modify] https://crrev.com/6a32dcf0c4968f022b2f603912ce2804606636e1/base/message_loop/message_loop.h [modify] https://crrev.com/6a32dcf0c4968f022b2f603912ce2804606636e1/chrome/browser/metrics/thread_watcher.cc [modify] https://crrev.com/6a32dcf0c4968f022b2f603912ce2804606636e1/chrome/browser/metrics/thread_watcher.h [modify] https://crrev.com/6a32dcf0c4968f022b2f603912ce2804606636e1/chrome/browser/metrics/thread_watcher_unittest.cc
,
Jun 7 2017
,
Mar 13 2018
Hey Daniel, I'm going to assume you're not actively working on this. I'll pick this up as part of addressing issue 786597 and issue 716187 which may need to drive this further. Thanks for prior work here!
,
Jul 20
How that the thread-safe WeakPtr::MaybeValid() and Callback::MaybeCancelled() have landed (or almost : https://chromium-review.googlesource.com/c/chromium/src/+/1144208). I've put more thought into how we can do optimistic cancellation of TaskScheduler delayed tasks. The tricky part is that task_scheduler::DelayedTaskManager posts its delayed tasks to service thread's MessageLoop (task_scheduler::ServiceThread). But MessageLoop doesn't know that it's not safe to delete these tasks there if cancelled. So while MessageLoop (IncomingTaskQueue::DelayedQueue::HasTasks()) can start checking MaybeCancelled() instead of IsCancelled()), the tasks themselves need to ensure deletion on the proper sequence. To do this, I think DelayedTaskManager::AddDelayedTaskNow() has to post a relay class that will do the right thing: i.e. void DelayedTaskManager::AddDelayedTaskNow( Task task, TimeDelta delay, PostTaskNowCallback post_task_now_callback) { ... // Posts |task| when destroyed. This ensure that |task| is handed to the right // sequence whether the underlying MessageLoop intended to run it or delete it // (e.g. if cancelled -- in that case it will no-op when it gets on its // sequence and be properly deleted there as its BindState expects). struct DelayedTaskRelay { ~DelayedTaskRelay() { std::move(post_task_now_callback).Run(std::move(task)); } PostTaskNowCallback post_task_now_callback; Task task; }; service_thread_task_runner_->PostDelayedTask( FROM_HERE, BindOnce(DoNothing<DelayedTaskRelay>(), DelayedTaskRelay{std::move(post_task_now_callback), std::move(task)}), delay); ... } Also : the DelayedTaskRelay will need custom CallbackCancellationTraits in order to forward the cancellation state of |task|.
,
Jul 23
Rather than the overhead of the relay wrapper in DelayedTaskManager, could we collate the list of cancelled Tasks, when we check MaybeCancelled(), into e.g. a vector<> that can be posted to the target Sequence to delete them?
,
Jul 23
The thing is, it's the ServiceThread's MessageLoop which holds the delayed tasks. But MessageLoop doesn't know where it's supposed to run/delete tasks (it assumes on its bound thread -- and ServiceManager gives it an indirection for tasks to bounce when ran (never cancelled at the moment per indirection losing the IsCancelled() bit)). To do what you have in mind we'd have to have DelayedTaskManager keep the delayed tasks and only send the next shortest delay to MessageLoop (not a bad idea and in fact that's what I want to do with base::Timer : https://docs.google.com/document/d/11gcfpL-orj8SWY8bUBa1_ZGe8kPLUcjtGf2_WD7nBKw/edit)
,
Jul 24
Re #18: Yes, the idea would basically be to batch up cancelled tasks for deletion - I believe the only hard constraint is that each cancelled task must be deleted before any later task runs, so so long as we keep the batch in-order, and ensure to release the tasks in it sometime before the next non-cancelled task that follows them, then we otherwise have flexibility in precisely when/how that happens.
,
Jul 24
Deleting now vs later is not the tricky/costly part though. The tricky part is for the cancellation to be picked up by MessageLoop and somehow forwarded where it should happen.
,
Jul 25
,
Jul 30
Re #20: How does the service-thread know where to _run_ the delayed tasks, but not know where to delete them? Are you already wrapping delayed tasks with the sequence TaskRunner, so that when they're Run() they get PostTask'd to the sequence, or something?
,
Jul 31
Yes, see DelayedTaskManager::PostTaskNowCallback (these are what's queued on the service thread's MessageLoop) |
|||||||
►
Sign in to add a comment |
|||||||
Comment 1 by skyos...@chromium.org
, Oct 6 2016