Issue metadata
Sign in to add a comment
|
132.9% regression in smoothness.top_25_smooth at 367701:367719 |
||||||||||||||||||||
Issue descriptionSee the link to graphs below.
,
Mar 4 2016
This regression has occurred twice - the only patch common to both cases is https://chromium.googlesource.com/chromium/src/+/7f045e4f2e9f7142c79e2eeb30b70716a1ea8f0b. This looks extremely likely to be the issue. sievers@, can you take a look?
,
Mar 14 2016
friendly ping
,
Mar 15 2016
Sorry it'll still be a few days until I'll be able to get to this. So sorry.
,
Mar 25 2016
sievers: ping?
,
Mar 25 2016
How does one find out what's being measured where in the code for a metric?
,
Mar 25 2016
(code search for 'mean_input_event_latency' is not very conclusive)
,
Mar 25 2016
48.0.2564.95 does not have my change. 49.0.2623.91 does have my change. Now I've looked at all the interesting histograms computed in render_widget_host_latency_tracker.cc:ComputeInputLatencyHistograms() which seem to cover anything in between touch event and swap happening on the GPU thread in various combinations. But I don't see any interesting changes there in the mean values between the two releases.
,
Mar 25 2016
Oh and I also missed that this is specific to youtube.com, where instead of around 12ms we now have a more unstable measure between roughly 9 an 28ms. That looks like an extra frame's time (on top of the 12ms baseline) in a lot of the measurements. But this really needs understanding what we're measuring here.
,
Mar 28 2016
tdresser, vmiura, is there documentation on mean_input_event_latency anywhere? The smoothness benchmark documentation links to http://www.chromium.org/developers/design-documents/rendering-benchmarks but it does not include anything about input.
,
Mar 29 2016
rbyers@ could you help with the question above on mean_input_event_latency?
,
Mar 29 2016
I've added: "mean_input_event_latency: time between receiving a touchmove event and performing the frame swap for the corresponding scroll event." To https://sites.google.com/a/chromium.org/dev/developers/design-documents/rendering-benchmarks. Is that enough information?
,
Mar 29 2016
So basically the mean of this?
void ComputeScrollLatencyHistograms(
...
for (size_t i = 0; i < original_component.event_count; i++) {
UMA_HISTOGRAM_TOUCH_TO_SCROLL_LATENCY(
"Event.Latency.TouchToScrollUpdateSwapBegin", original_component,
gpu_swap_begin_component);
}
,
Mar 30 2016
Yup, it's the same values. mean_input_event_latency doesn't give you any results because the mean is prepended separately. See https://code.google.com/p/chromium/codesearch#chromium/src/third_party/catapult/telemetry/telemetry/web_perf/metrics/smoothness.py&l=69 for a reasonable place to start, or https://code.google.com/p/chromium/codesearch#chromium/src/third_party/catapult/telemetry/telemetry/web_perf/metrics/rendering_stats.py&l=66 where the latency is actually determined.
,
Mar 30 2016
Cool, thanks, that'll get me started - I'll need to look at some traces.
,
Apr 1 2016
Ok here's what I've found out so far: The slow frames that are distorting the mean have a browser composite (besides just being a frame from the renderer) which is caused by the top-controls logic (omnibox hiding... we disable the actual movement for tests... I think to reduce noise, but still go through the code.. so looks like we did that at the wrong level). The reason that this shows up on youtube only is I think because the scroll is really short on this page. I get around 15 frames of which the ones around the ends (which trigger the top-controls path and browser composite) are slow which is good enough to distort the mean. I do see that there is some slight change in scheduling now that the browser compositor uses cc scheduler. It overall does more work, and it splits work up into more tasks so things drift a little more. I'm not sure this explains it alone yet, I have to try to restore the old behavior and compare side-by-side. Again note that this only affects the frames when the top controls are shifting. One other thing that stands out and could be responsible related to that is this sequence I see now on bad frames: BC: browser compositor DC: display scheduler 1) VSYNC 2) BC_beginframe 3) BC_NotifyReadyToCommit 4) OnSwapCompositorFrame (from renderer) 5) BC_OnBeginImplFrameDeadline 6) DS_OnBeginImplFrameDeadline *IIUC* and from following the scroll update flow event, this actually adds a frame of latency because the browser commit 3) happens before 4) so the display scheduler is probably latching the previous frame (remember it has to for it to match the top controls scroll offset - we force that by deleting the renderer's Surface in 4)). This would explain things since before my patch we'd have more likely forced the sequence of: OnSwapCompositorFrame->Browser_Composite. And finally +enne@ for just something flaky I noticed on the way: The CommitVSyncParameters() logic and how it interacts with the display scheduler is probably a bit prone to drift. When we call CommitVSyncParameters() is (totally unintentionally) dependent on whether there is a browser composite (and that requests a vsync). This will totally get better with the UBF stuff enne@ is doing. I don't think it's the main problem here though.
,
Apr 1 2016
The display scheduler latches the previous frame because 4 happens before 5, right? You said it happens because 3 happens before 4 and I don't see why that's a problem.
,
Apr 1 2016
I actually haven't dug into how this all works in detail wrt display scheduler and surface aggregation. In RWHVAndroid we destroy the current surface and recreate the browser compositor's SurfaceLayer if the top control offset changes in 4). And FWIU that has the intended side effect of needing a new browser frame for this new renderer frame. So I was guessing that a browser commit after 4) would produce a proper Surface state for all of this to go on the screen together.
,
Apr 1 2016
So in case it wasn't clear: the intention of the code is to basically bypass the direct Surface submission from the renderer when the top control offset changes because there is a layer in the *browser* Surface with a texture for the omnibox which needs to be positioned according to the scroll offset of the renderer Surface's current frame (send through CompositorFrame metadata). But my comments might be inaccurate in terms of how and if that actually works in detail.
,
Apr 1 2016
,
Apr 4 2016
Issue 600375 has been merged into this issue.
,
Apr 5 2016
Ok looks like I can't defer the browser compositor with a deadline to wait for the renderer frame as long as there is an update to the main tree involved, because BeginMainFrame will still happen immediately and if UpdateLayerTreeHost() causes any 'needs commit' then it will clobber my deadline to immediately. On the other hand, deferring BF itself also seems like a hack. Sunny suggests that it'd be easier if the animations on the browser compositor tree were impl-side. So that'd have to at least cover omnibox hiding, but maybe also overscroll glow.
,
Apr 7 2016
I can restore the previous ordering which yields lower latency by simply posting an immediate BeginFrame when NeedsBeginFrames() turns on (when the renderer frame arrived and the change to the SurfaceLayer causes SetNeedsCommit).
Then on the next vsync it will defer the BeginFrame because nothing changed (*) but immediately post it when the top controls offset changes again on the next OnSwapCompositorFrame. And so forth.
(*)
I did have to remove this requestRender() call here (which causes SetNeedsAnimate() for no reason):
CompositorViewHolder.java:
public void onVisibleContentOffsetChanged(float offset) {
mLastVisibleContentOffset = offset;
propagateViewportToLayouts(getWidth(), getHeight());
requestRender(); <---
}
,
Apr 8 2016
Ok two more things I noticed while debugging this: a) retro begin frame args cause badness because we don't clean them up when we go idle and turn off the BFS. Once we come back from idle it causes a skipped deferred first frame (effectively pushing us into high latency mode), I think because the timestamp is too old. b) we are not particularly good with SetNeedsBeginMainFrame() so you can also end up in high latency mode because we start another BeginMainFrame though there were no changes. This is because WillBeginMainFrame() can go on and off repeatedly before there is an actual BeginFrame started. For example if you request SetNeedsCommit() and SetNeedsAnimate() from the same callstack, it causes two BeginMainFrames with SingleThreadProxy.
,
Apr 15 2016
Were you able to land the changes you described?
,
Apr 18 2016
Not yet, still discussing what really needs to happen for this to work best, while at the same time the BeginFrame plumbing is changing (also see crbug.com/604007 which I think is similar).
,
Apr 18 2016
https://codereview.chromium.org/1887243002/ + https://codereview.chromium.org/1879833002/ might be what works to address this.
,
Apr 22 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/6414a4526b77dd087581d8a9c1cec50affc66333 commit 6414a4526b77dd087581d8a9c1cec50affc66333 Author: sievers <sievers@chromium.org> Date: Fri Apr 22 21:33:20 2016 Android: Browser-side scheduling latency tweaks - Emit an immediate 'missed' BeginFrame when an BFObserver is added instead of waiting for next vsync - Remove unneeded requestRender() (=SetNeedsAnimate()) call The former helps both the browser compositor and display scheduler (now that it uses the same BeginFrameSource). Regarding the browser compositor: When the omnibox is hiding (or selection handles are showing) we need to browser composite with the correct scroll offset for a given renderer frame. Without this patch we run: ScrollBegin VSync SwapCompositorFrame VSync BrowserComposite Now we do: ScrollBegin VSync SwapCF MissedBF BrowserComposite VSync In the next interval, this works because if nothing changed yet, the scheduler will defer the BeginMainFrame until there is another change in the browser compositor tree (MODE_LATE). Calling SetNeedsAnimate()) from CompositorViewHolder when the top control offset changes is unnecessary, since this is a renderer-driven animation (causes an invalidation in the browser compositor tree), so will already cause the necessary UpdateLayerTreeHost(). Requesting SetNeedsAnimate() causes an extra BeginMainFrame() (SingleThreadProxy+scheduler are not particularly good at tracking these between needs_commit and needs_animate) and puts us into high latency mode otherwise. Fileed bug crbug.com/602489 to make this more robust in SingleThreadProxy and scheduler. TBR=dtrainor@chromium.org BUG= 591725 , 602489 , 602485 , 604007 Review URL: https://codereview.chromium.org/1879833002 Cr-Commit-Position: refs/heads/master@{#389248} [modify] https://crrev.com/6414a4526b77dd087581d8a9c1cec50affc66333/chrome/android/java/src/org/chromium/chrome/browser/compositor/CompositorViewHolder.java [modify] https://crrev.com/6414a4526b77dd087581d8a9c1cec50affc66333/content/browser/renderer_host/compositor_impl_android.cc
,
Apr 29 2016
The CL mentioned in #27 (Remove retro frames from scheduler.) Has not yet landed, and the metric has not recovered. @sunnyps, could you comment?
,
Apr 29 2016
The CL in #28 should have addressed this.
,
May 6 2016
The graph may have gotten a little better, but it's not fully resolved. sievers, can you investigate again?
,
May 12 2016
Friendly Perf Sheriff ping! sievers@chromium.org, Did you get chance to investigate #31
,
Jun 3 2016
Friendly perf-sheriff ping. Any update re #31?
,
Jul 11 2016
Sievers, ping! This is still pretty bad.
,
Jul 22 2016
sievers, this still needs some investigation. Though it is only a single bot/metric, so maybe we're okay...
,
Aug 5 2016
Emailed sievers.
,
Aug 5 2016
Downgrading since this only affects latency while the omnibox is animating and therefore also wasn't noticeable in the overall UMA histograms from the field for these latency measurements. It was only noticeable in one test IIRC that did a very short scroll of a few frames duration during which the scroll offset changed the whole time - ironically the omnibox hiding was supposed to be disabled for noise like this in tests, but it turns out it's disabled 'visually' (the layer doesn't move) but still hits this codepath with the side effects. The reason this happens is that we use cc surfaces and the cc (as well as display) scheduler now. Before we had some special cased logic that dealt with this specific use case a little better. Unfortunately, now it's up to the cc scheduling logic to handle this use case in an ideal fashion. It's probably an uncommon and Android-specific use case where the renderer frame causes the UI to invalidate (because of the scroll offset and omnibox needing to move in the UI layer tree in the browser). In fact, with cc Surfaces the whole design was centered around having less dependencies (and less of a hierarchy) between the different compositors. However, maybe it's not totally unreasonable to support this optimization where the browser compositor handles the MISSED BeginFrame from SetNeedsBeginFrame() going high when the renderer sends a frame where the scroll offset changed. I thought I had it pretty much working with the patch I landed. I did see the retroactive begin frame logic in the scheduler causing issues occasionally, so it's possible that finishing and landing https://codereview.chromium.org/1887243002/ would help here. It's also possible though that at the end of the day it remains a bit fragile, esp. if there are independent invalidations for the UI also (the browser compositor scheduler would have to know that it might want to wait for a renderer frame). But then that was sort of the case before also. In other words, the previous logic wasn't necessarily better overall (it was simpler, and less generic, so hard to say).
,
Aug 5 2016
,
Aug 5 2016
I should say though that in terms of weighing complexity (in terms of modifying the scheduler) vs. the severity of the problem, the task of removing retro frames from the scheduler will actually remove complexity :) So definitely worth seeing how this helps.
,
Aug 5 2016
In terms of the tests we do here I think it would be better to: - totally disable omnibox hiding for all the regular smoothness tests - maybe have a specific test that measures the latency for omnibox hiding specifically Not sure how to find someone to do that though.
,
Aug 18 2016
Perf sheriff ping: reminder to follow up on possible performance issues
,
Oct 5 2016
Over to Sunny. It's possible that things are better with the retroactive frame logic gone.
,
Oct 12 2016
Ping!
,
Dec 2 2016
Anyway we can get some of Sunny's time to investigate?
,
Dec 3 2016
+ more scheduling folks. I'm afraid Sunny may not have time to look at this till Jan.
,
Aug 16 2017
This hasn't moved in 8 months. The graph eventually came back down. Should we WontFix?
,
Aug 16 2017
#46, agree this looks like it fully recovered at one point. |
|||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||
Comment 1 by tdres...@chromium.org
, Mar 3 2016