It should be possible to free the main thread during commit |
||||||
Issue descriptionDuring commit we block the main thread so that the compositor thread can copy data out of the main-thread cc layer structures. However, unless blink is trying to contend with the compositor thread (ie is running the compositor life cycle), it should be free to do other things in the meantime such as js and layout. The suggestion here is to return from commit in ProxyMain immediately, instead of waiting on a mutex. Then in each entry point to the compositor on the main thread, check if we're inside commit, and if so, grab a mutex that gets signaled when commit completes. One tricky thing to solve: we need to signal the main thread *after* cleaning up the commit and running the commit ending stuff. But we need to run those on the main thread. To measure: How much time does the main thread actually stay blocked, is this worth thinking about? This came to mind reading fdoray's proposal doc for base thread locking: https://docs.google.com/document/d/1zRg0P_bQYE8Oc5x5Shrd4CIkIevPiMBzxgq6awRrt-A/edit#heading=h.xgjl2srtytjt
,
Sep 19 2017
Running Motion Mark: http://browserbench.org/MotionMark/, looking at the ProxyMain::BeginMainFrame::commit trace event which measures the main thread waiting on the WaitableEvent: Wall Duration 0.362 ms CPU Duration 0.082 ms Wall Duration 3.403 ms CPU Duration 0.052 ms Wall Duration 2.556 ms CPU Duration 0.083 ms Wall Duration 0.433 ms CPU Duration 0.060 ms Wall Duration 3.997 ms CPU Duration 0.050 ms These are some representative samples during different tests. The thread blocks for up to 3-4ms, while sometimes finishing in < 1ms, though less common by looking at the numbers a bit anecdotally. I am not sure how to collect statistics from traces.
,
Sep 19 2017
I threw a histogram in there:
diff --git a/cc/trees/proxy_main.cc b/cc/trees/proxy_main.cc
index 056a856330de..a28c0ef15f2f 100644
--- a/cc/trees/proxy_main.cc
+++ b/cc/trees/proxy_main.cc
@@ -276,6 +276,7 @@ void ProxyMain::BeginMainFrame(
// coordinated by the Scheduler.
{
TRACE_EVENT0("cc", "ProxyMain::BeginMainFrame::commit");
+ base::TimeTicks start = base::TimeTicks::Now();
DebugScopedSetMainThreadBlocked main_thread_blocked(task_runner_provider_);
@@ -295,6 +296,9 @@ void ProxyMain::BeginMainFrame(
layer_tree_host_, begin_main_frame_start_time,
hold_commit_for_activation));
completion.Wait();
+
+ base::TimeTicks end = base::TimeTicks::Now();
+ UMA_HISTOGRAM_TIMES("Commit.Blocked", end - start);
}
current_pipeline_stage_ = NO_PIPELINE_STAGE;
And ran Motion Mark twice, and get 86.6% of frames blocking < 1ms, and 13.4% of frames blocking more.
Histogram: Commit.Blocked recorded 30622 samples, mean = 0.3 (flags = 0x41)
0 ------------------------------------------------------------------------O (26526 = 86.6%)
1 -----O (1864 = 6.1%) {86.6%}
2 ----O (1443 = 4.7%) {92.7%}
3 --O (631 = 2.1%) {97.4%}
4 O (136 = 0.4%) {99.5%}
5 O (9 = 0.0%) {99.9%}
6 O (7 = 0.0%) {100.0%}
7 O (3 = 0.0%) {100.0%}
8 O (1 = 0.0%) {100.0%}
10 O (0 = 0.0%) {100.0%}
12 O (1 = 0.0%) {100.0%}
14 ...
633 O (1 = 0.0%) {100.0%}
752 ...
,
Sep 19 2017
> For the tricky thing: basically in all the entrypoints, if you're in that > state, you want to reentrantly execute the post-commit main-thread work after > you acquired the mutex That's smart, thanks. Exactly what I was looking for.
,
Sep 19 2017
Poster circle commits are pretty quick, with me scrolling and selecting text. Though one did block the main thread for 12ms x.x
Histogram: Commit.Blocked recorded 637 samples, mean = 0.0 (flags = 0x41)
0 ------------------------------------------------------------------------O (632 = 99.2%)
1 O (2 = 0.3%) {99.2%}
2 ...
4 O (1 = 0.2%) {99.5%}
5 O (0 = 0.0%) {99.7%}
6 O (1 = 0.2%) {99.7%}
7 ...
12 O (1 = 0.2%) {99.8%}
14 ...
I think there's not really many layer commits going on here as the animation is fully specified.
,
Sep 19 2017
,
Sep 19 2017
Re-ran Motion Mark just the Animometer suite from http://browserbench.org/MotionMark/developer.html Similar results. Did see More than 60 frames block more than 5ms. Histogram: Commit.Blocked recorded 14580 samples, mean = 0.3 (flags = 0x41) 0 ------------------------------------------------------------------------O (12652 = 86.8%) 1 ----O (737 = 5.1%) {86.8%} 2 ----O (621 = 4.3%) {91.8%} 3 --O (364 = 2.5%) {96.1%} 4 -O (145 = 1.0%) {98.6%} 5 O (50 = 0.3%) {99.6%} 6 O (7 = 0.0%) {99.9%} 7 O (2 = 0.0%) {100.0%} 8 O (1 = 0.0%) {100.0%} 10 O (0 = 0.0%) {100.0%} 12 O (1 = 0.0%) {100.0%} 14 ...
,
Sep 19 2017
senorblanco@ suggested running the test with fixed complexity instead of the default "ramp" mode to get more consistent results, so here' is that:
Histogram: Commit.Blocked recorded 14351 samples, mean = 0.6 (flags = 0x41)
0 ------------------------------------------------------------------------O (11129 = 77.5%)
1 --O (328 = 2.3%) {77.5%}
2 --------O (1293 = 9.0%) {79.8%}
3 -------O (1129 = 7.9%) {88.8%}
4 ---O (466 = 3.2%) {96.7%}
5 O (1 = 0.0%) {100.0%}
6 O (1 = 0.0%) {100.0%}
7 O (1 = 0.0%) {100.0%}
8 O (1 = 0.0%) {100.0%}
10 ...
14 O (1 = 0.0%) {100.0%}
17 O (0 = 0.0%) {100.0%}
20 O (1 = 0.0%) {100.0%}
24 ...
,
Sep 20 2017
I wrote a hacky prototype and ran motion mark. I got this:
Histogram: Commit.Blocked recorded 3706 samples, mean = 0.0 (flags = 0x41)
0 ------------------------------------------------------------------------O (3703 = 99.9%)
1 O (2 = 0.1%) {99.9%}
2 O (1 = 0.0%) {100.0%}
3 ...
This means that 3706 frames blocked at all, instead of 14351 above. Of those, only 3 were >= 1 ms.
I'm going to build super profiling build and collect some test scores.
,
Sep 20 2017
FTR these 3706 frames are where we requested the commit block until activation, as in that case I leave the main thread blocked for now. We can probably stop doing that too if we think about it.
,
Sep 20 2017
With patch (fixed complexity 50fps): 353.93 -1.73% / +1.70% Score 543.00 474.00 431.00 1276.00 6147.00 56.00 76.00 74.00 318.00 ========== With patch (ramp): 337.22 -6.52% / +4.20% Score 416.02 437.57 460.10 1501.33 5752.02 57.78 71.10 60.52 313.59 ========== Without patch (fixed complexity 50fps): 382.56 -1.74% / +1.70% Score 450.00 507.00 532.00 1442.00 6642.00 63.00 88.00 74.00 368.00 ========== Without patch (ramp): 305.53 -9.03% / +7.18% Score 442.76 459.03 415.98 1143.87 4508.51 44.71 64.09 54.47 341.00 These numbers don't seem very reliable TBH, they vary quite a lot between runs. hrmm. Going to look at a trace and make sure nothing silly is happening also.
,
Sep 20 2017
Looking at traces, it doesn't seem like MotionMark is a good example for this. While the main thread isn't blocking on commit with my patch, it's just sitting mostly idle instead, as it doesn't do much of anything (just a bit of TaskQueueManager::ProcessTaskFromWorkQueue) until the next BeginMainFrame happens after the commit/activation is done. Presumably on other sites, blink would have more to do in between? Also need to look at --enable-main-frame-before-activation, and see if I can make that go before the commit is done even.
,
Sep 20 2017
,
Sep 21
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue. Sorry for the inconvenience if the bug really should have been left as Available. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Sep 21
danakj: Is this something you would like to still investigate? Should we leave this open?
,
Sep 21
We need a real world example of a web page that would be improved by doing this to motivate the work. I think we can leave this open and if someone can come up with a website/benchmark that benefits from releasing the main thread to do work sooner, then we can use that to push ahead. |
||||||
►
Sign in to add a comment |
||||||
Comment 1 by piman@chromium.org
, Sep 19 2017