New issue
Advanced search Search tips

Issue 766749 link

Starred by 3 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug



Sign in to add a comment

It should be possible to free the main thread during commit

Project Member Reported by danakj@chromium.org, Sep 19 2017

Issue description

During commit we block the main thread so that the compositor thread can copy data out of the main-thread cc layer structures.

However, unless blink is trying to contend with the compositor thread (ie is running the compositor life cycle), it should be free to do other things in the meantime such as js and layout.

The suggestion here is to return from commit in ProxyMain immediately, instead of waiting on a mutex. Then in each entry point to the compositor on the main thread, check if we're inside commit, and if so, grab a mutex that gets signaled when commit completes.

One tricky thing to solve: we need to signal the main thread *after* cleaning up the commit and running the commit ending stuff. But we need to run those on the main thread.

To measure: How much time does the main thread actually stay blocked, is this worth thinking about?


This came to mind reading fdoray's proposal doc for base thread locking: https://docs.google.com/document/d/1zRg0P_bQYE8Oc5x5Shrd4CIkIevPiMBzxgq6awRrt-A/edit#heading=h.xgjl2srtytjt
 

Comment 1 by piman@chromium.org, Sep 19 2017

For the tricky thing: basically in all the entrypoints, if you're in that state, you want to reentrantly execute the post-commit main-thread work after you acquired the mutex (note, most likely you still want a WaitableEvent, not a mutex) - except for the callbacks, which you wouldn't want to call reentrantly most likely. Instead, I think it would be ok to call them as soon as we sent the task to the impl thread - they will be called early but the argument is that they will see consistent state if they try to access the compositor (because they would wait then until the state is consistent). Double-subtle, but I think most read accessors don't actually need the wait logic, because for the most part values are not modified on the impl thread or the post-commit work.
Truth is, beyond the callbacks, I think most state change in the post-commit is never read from the impl thread either (with the exception of source_frame_number_), so the reentrant work is very minimal - or we could even pass the source_frame_number_  with the posted task so that it becomes nothing at all.

Comment 2 by danakj@chromium.org, Sep 19 2017

Running Motion Mark: http://browserbench.org/MotionMark/, looking at the ProxyMain::BeginMainFrame::commit trace event which measures the main thread waiting on the WaitableEvent:

Wall Duration	0.362 ms
CPU Duration	0.082 ms

Wall Duration	3.403 ms
CPU Duration	0.052 ms

Wall Duration	2.556 ms
CPU Duration	0.083 ms

Wall Duration	0.433 ms
CPU Duration	0.060 ms

Wall Duration	3.997 ms
CPU Duration	0.050 ms

These are some representative samples during different tests. The thread blocks for up to 3-4ms, while sometimes finishing in < 1ms, though less common by looking at the numbers a bit anecdotally.

I am not sure how to collect statistics from traces.

Comment 3 by danakj@chromium.org, Sep 19 2017

I threw a histogram in there:

diff --git a/cc/trees/proxy_main.cc b/cc/trees/proxy_main.cc
index 056a856330de..a28c0ef15f2f 100644
--- a/cc/trees/proxy_main.cc
+++ b/cc/trees/proxy_main.cc
@@ -276,6 +276,7 @@ void ProxyMain::BeginMainFrame(
   // coordinated by the Scheduler.
   {
     TRACE_EVENT0("cc", "ProxyMain::BeginMainFrame::commit");
+    base::TimeTicks start = base::TimeTicks::Now();
 
     DebugScopedSetMainThreadBlocked main_thread_blocked(task_runner_provider_);
 
@@ -295,6 +296,9 @@ void ProxyMain::BeginMainFrame(
                        layer_tree_host_, begin_main_frame_start_time,
                        hold_commit_for_activation));
     completion.Wait();
+
+    base::TimeTicks end = base::TimeTicks::Now();
+    UMA_HISTOGRAM_TIMES("Commit.Blocked", end - start);
   }
 
   current_pipeline_stage_ = NO_PIPELINE_STAGE;

And ran Motion Mark twice, and get 86.6% of frames blocking < 1ms, and 13.4% of frames blocking more.

Histogram: Commit.Blocked recorded 30622 samples, mean = 0.3 (flags = 0x41)
0    ------------------------------------------------------------------------O (26526 = 86.6%)
1    -----O                                                                    (1864 = 6.1%) {86.6%}
2    ----O                                                                     (1443 = 4.7%) {92.7%}
3    --O                                                                       (631 = 2.1%) {97.4%}
4    O                                                                         (136 = 0.4%) {99.5%}
5    O                                                                         (9 = 0.0%) {99.9%}
6    O                                                                         (7 = 0.0%) {100.0%}
7    O                                                                         (3 = 0.0%) {100.0%}
8    O                                                                         (1 = 0.0%) {100.0%}
10   O                                                                         (0 = 0.0%) {100.0%}
12   O                                                                         (1 = 0.0%) {100.0%}
14   ... 
633  O                                                                         (1 = 0.0%) {100.0%}
752  ... 

Comment 4 by danakj@chromium.org, Sep 19 2017

> For the tricky thing: basically in all the entrypoints, if you're in that
> state, you want to reentrantly execute the post-commit main-thread work after
> you acquired the mutex

That's smart, thanks. Exactly what I was looking for.

Comment 5 by danakj@chromium.org, Sep 19 2017

Poster circle commits are pretty quick, with me scrolling and selecting text. Though one did block the main thread for 12ms x.x

Histogram: Commit.Blocked recorded 637 samples, mean = 0.0 (flags = 0x41)
0   ------------------------------------------------------------------------O (632 = 99.2%)
1   O                                                                         (2 = 0.3%) {99.2%}
2   ... 
4   O                                                                         (1 = 0.2%) {99.5%}
5   O                                                                         (0 = 0.0%) {99.7%}
6   O                                                                         (1 = 0.2%) {99.7%}
7   ... 
12  O                                                                         (1 = 0.2%) {99.8%}
14  ... 

I think there's not really many layer commits going on here as the animation is fully specified.

Comment 6 by danakj@chromium.org, Sep 19 2017

Description: Show this description

Comment 7 by danakj@chromium.org, Sep 19 2017

Re-ran Motion Mark just the Animometer suite from http://browserbench.org/MotionMark/developer.html

Similar results. Did see More than 60 frames block more than 5ms.

Histogram: Commit.Blocked recorded 14580 samples, mean = 0.3 (flags = 0x41)
0   ------------------------------------------------------------------------O (12652 = 86.8%)
1   ----O                                                                     (737 = 5.1%) {86.8%}
2   ----O                                                                     (621 = 4.3%) {91.8%}
3   --O                                                                       (364 = 2.5%) {96.1%}
4   -O                                                                        (145 = 1.0%) {98.6%}
5   O                                                                         (50 = 0.3%) {99.6%}
6   O                                                                         (7 = 0.0%) {99.9%}
7   O                                                                         (2 = 0.0%) {100.0%}
8   O                                                                         (1 = 0.0%) {100.0%}
10  O                                                                         (0 = 0.0%) {100.0%}
12  O                                                                         (1 = 0.0%) {100.0%}
14  ... 

Comment 8 by danakj@chromium.org, Sep 19 2017

senorblanco@ suggested running the test with fixed complexity instead of the default "ramp" mode to get more consistent results, so here' is that:

Histogram: Commit.Blocked recorded 14351 samples, mean = 0.6 (flags = 0x41)
0   ------------------------------------------------------------------------O (11129 = 77.5%)
1   --O                                                                       (328 = 2.3%) {77.5%}
2   --------O                                                                 (1293 = 9.0%) {79.8%}
3   -------O                                                                  (1129 = 7.9%) {88.8%}
4   ---O                                                                      (466 = 3.2%) {96.7%}
5   O                                                                         (1 = 0.0%) {100.0%}
6   O                                                                         (1 = 0.0%) {100.0%}
7   O                                                                         (1 = 0.0%) {100.0%}
8   O                                                                         (1 = 0.0%) {100.0%}
10  ... 
14  O                                                                         (1 = 0.0%) {100.0%}
17  O                                                                         (0 = 0.0%) {100.0%}
20  O                                                                         (1 = 0.0%) {100.0%}
24  ... 

Comment 9 by danakj@chromium.org, Sep 20 2017

I wrote a hacky prototype and ran motion mark. I got this:

Histogram: Commit.Blocked recorded 3706 samples, mean = 0.0 (flags = 0x41)
0  ------------------------------------------------------------------------O (3703 = 99.9%)
1  O                                                                         (2 = 0.1%) {99.9%}
2  O                                                                         (1 = 0.0%) {100.0%}
3  ... 

This means that 3706 frames blocked at all, instead of 14351 above. Of those, only 3 were >= 1 ms.

I'm going to build super profiling build and collect some test scores.
FTR these 3706 frames are where we requested the commit block until activation, as in that case I leave the main thread blocked for now. We can probably stop doing that too if we think about it.
With patch (fixed complexity 50fps):

353.93

-1.73% / +1.70%

Score
543.00
474.00
431.00
1276.00
6147.00
56.00
76.00
74.00
318.00

==========

With patch (ramp):

337.22

-6.52% / +4.20%

Score
416.02
437.57
460.10
1501.33
5752.02
57.78
71.10
60.52
313.59

==========

Without patch (fixed complexity 50fps):

382.56

-1.74% / +1.70%

Score
450.00
507.00
532.00
1442.00
6642.00
63.00
88.00
74.00
368.00


==========

Without patch (ramp):

305.53

-9.03% / +7.18%

Score
442.76
459.03
415.98
1143.87
4508.51
44.71
64.09
54.47
341.00


These numbers don't seem very reliable TBH, they vary quite a lot between runs. hrmm. Going to look at a trace and make sure nothing silly is happening also.
Cc: sunn...@chromium.org
Looking at traces, it doesn't seem like MotionMark is a good example for this.

While the main thread isn't blocking on commit with my patch, it's just sitting mostly idle instead, as it doesn't do much of anything (just a bit of TaskQueueManager::ProcessTaskFromWorkQueue) until the next BeginMainFrame happens after the commit/activation is done. Presumably on other sites, blink would have more to do in between?

Also need to look at --enable-main-frame-before-activation, and see if I can make that go before the commit is done even.
Project Member

Comment 14 by sheriffbot@chromium.org, Sep 21

Labels: Hotlist-Recharge-Cold
Status: Untriaged (was: Available)
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue.

Sorry for the inconvenience if the bug really should have been left as Available.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Owner: danakj@chromium.org
Status: Assigned (was: Untriaged)
danakj: Is this something you would like to still investigate? Should we leave this open?
Owner: ----
Status: Available (was: Assigned)
We need a real world example of a web page that would be improved by doing this to motivate the work. I think we can leave this open and if someone can come up with a website/benchmark that benefits from releasing the main thread to do work sooner, then we can use that to push ahead.

Sign in to add a comment