New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 678119 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Jun 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 2
Type: Bug



Sign in to add a comment

Browser-side animations jank scrolling

Project Member Reported by aelias@chromium.org, Jan 3 2017

Issue description

No repro: 54.0.2840.85
Repro: 55.0.2883.91, 56.0.2924.23 (Nexus 6P / Pixel XL)

See internal bug http://b/33751067 for details including video of symptoms.

I exactly bisected this to https://codereview.chromium.org/2083423006 "Remove default begin frame sources from the renderer"

Repro steps:
1. Visit http://en.m.wikipedia.org/wiki/Martini_(cocktail)
2. Fling down to the bottom of the page.
3. While the overscroll glow is still animating, fling upwards.
4. Observe low scrolling framerate (~20 fps) for the duration of the animation.

Although this bug made it to 55 stable channel, it would be nice to fix for 56 if safe.
 

Comment 1 by enne@chromium.org, Jan 5 2017

Thanks.  I'll try to take a look this week.  That's awfully mysterious that *that* particular change regressed things.  There was another regression related to vsync changing (that follows up a week or so after this one), but I think the throttling changes should have been the only behavior change here.  I'll trace locally etc etc and see what I can see.

Comment 2 by ericrk@chromium.org, Jan 11 2017

Pri2 + releaseblock-stable M56 is a bit weird :D - we should probably bump to pri1 or drop the releaseblock.

Given that this already made it into M55, maybe just drop releaseblock?

Comment 3 by aelias@chromium.org, Jan 11 2017

Labels: -Pri-2 Pri-1
Well, it was marked P1 in the internal bug http://b/33751067.

Comment 4 by enne@chromium.org, Jan 11 2017

Cc: danakj@chromium.org
I was able to repro this on my nexus 5x.

It does look it's that patch, and it looks like the line that's doing it is the LayerTreeSettings change to default "use_output_surface_begin_frame_source = true;".

Previously, the Android browser compositor was driven by a synthetic begin frame source with the default of no offset and renderer setting display refresh interval (from LTHI initialization).  These parameters don't get updated.

With this patch, it changes the Android browser compositor to be driven by its ExternalBeginFrameSource from OnVSync from WindowAndroidCompositor.  This is the begin frame source that is also driving the renderer.

If I try to recall back to what I was thinking in the middle of a patch set sequence in August/September, I think I believed that the Android browser compositor was already being driven by its ExternalBeginFrameSource and not a default synthetic one, and that this patch was a noop in that respect.

It does seem a little bit odd that driving the browser compositor with an arbitrarily offset begin frame source is less janky than driving it with the ExternalBeginFrameSource.

To me, that smells a little bit like receiving a frame from the renderer here that wants to kick off a frame in the browser, but the browser had already done work the frame.

I'm going to attempt to channel the spirit of sievers and see what else I can find.

Comment 5 by enne@chromium.org, Jan 12 2017

Cc: boliu@chromium.org sunn...@chromium.org briander...@chromium.org
This does appear to be a weird scheduling interaction with the overscroll controller (and not omnibox hiding or something else).  If I disable that, then everything runs very smoothly.

(https://codereview.chromium.org/1879833002 is the sievers patch from the past that this reminded me of)

Comment 6 by aelias@chromium.org, Jan 18 2017

Labels: -M-56 M-57
OK, punting to M57 since it's getting late in the cycle and at least it only affects one type of browser animation.

Comment 7 by enne@chromium.org, Jan 20 2017

I managed to capture traces at ToT and with a version of ToT that hacks in the old synthetic begin frame source for the browser compositor.  It looks like in both cases, the browser compositor is running at 60hz, the renderer compositor is running at 60hz, but the order of events is slightly different.

ToT (janky scroll up starts at 1s in trace)
-------------------------------------------
VSync (begin frame to renderer and browser)
CompositorImpl animate overscroll + commit
CompositorImpl deadline + draw
RWHVA::SwapCompositorFrame
DisplayScheduler deadline + draw
(repeats every VSync)

hacked synthetic begin frame source (starting at 1s)
-----------------------------------
browser compositor begin frame (synthetic)
CompositorImpl animate overscroll + commit
CompositorImpl deadline + draw
VSync (begin frame to renderer)
DisplayScheduler deadline + draw
RWHVA::SwapCompositorFrame
(repeats)

It seems like the RWHVA adjusts a bunch of properties on the overscroll controller on swap compositor frame, which then get used the next time it animates, and so maybe not having a chance to animate and apply those properties before the frame gets used messes something up?

I am a bit lost in the Android browser compositor and the overscroll controller.  Is there somebody who knows more about how this is supposed to work? Yes, my patch regressed this, but it seems like this was quite fragile before.
overscroll_jank_tot.json.gz
1.4 MB Download
overscroll_jank_hacked_fixed.json.gz
1.0 MB Download

Comment 8 by danakj@chromium.org, Jan 20 2017

Cc: bokan@chromium.org

Comment 9 by aelias@chromium.org, Jan 20 2017

Cc: skyos...@chromium.org
Well, jdduke@ and sievers@ worked on this the most and they've both left the team, but skyostil@ was also involved and might know something.

Could you upload your hack so other folks could apply it locally if needed?
Cc: khushals...@chromium.org
+khushalsagar@
I managed to get a trace where the renderer is falling to 30 fps. Looks like while the browser is animating, the browser's commit + draw has the DisplayScheduler drawing before the renderer can submit a frame. And now the renderer's draw is throttled till the next Vsync because it is awaiting a frame ack from the browser. Now on the next vsync, the renderer skips an impl frame this vsync to recover latency, since the pending ack throttles the draw. It gets the ack and this keeps repeating.

Finally once the browser animation has stopped, its only the renderer that's driving the display's draw, so it gets acks for submitted frames within the same vsync and things get back to normal.
trace_overscroll_animation_jank.json
22.7 MB Download
Took another look at the code. I think its the child surface damage tracking logic in DisplayScheduler. It takes surfaces damaged in the last 2 consecutive frames as active surfaces to expect damage frame from in the next frame. Once the browser compositor (root surface) draws, the scheduler only waits for the active surfaces otherwise the draw is triggered immediately. So the display scheduler draws before the deadline sent to the renderer, causing the renderer to assume its running in high latency and skipping a frame on the next vsync, causing the display scheduler to assume its inactive...
Cc: eseckler@chromium.org
+eseckler@ who has dug into this recently and I believe has a patch to adjust the way DisplayScheduler waits for updates.
The patch Sami mentioned is this one: https://codereview.chromium.org/2527283003/. It replaces the child surface heuristics with acknowledgements sent by the corresponding BeginFrameObservers. That change should resolve the issue you are describing, since waiting for child frames won't depend on whether the surface was active in the prior frame.

However, there's still a bunch of work to do before the Display scheduler can start using BeginFrameAcks in this way. I'm currently in the process of landing the patch above in smaller pieces. Realistically, I'll complete that towards the end of Q1. So I'm not sure if you want to be blocked on it ;)
Cc: m...@chromium.org
Thanks for figuring out the root cause!

These heuristics can't handle all the cases perfectly, so it'll be nice to have eseckler's work in.

For the short term though, khushalsagar@ came by wondering if we could play around with the heuristic. We'll have to be careful in particular with A) 24fps and 30fps content on Chromecast and B) 60 fps UI animations on their own.

A)
In the past, I had a patch to consider a surface active if it submitted a frame recently, which should theoretically fix this bug, but it regresses Chromecast use cases:

"cc: Consider Surface active if frame received recently" - https://codereview.chromium.org/1251693002
Alerts: https://chromeperf.appspot.com/group_report?rev=340299

It's revert - https://codereview.chromium.org/1258273005
Alerts: https://chromeperf.appspot.com/group_report?rev=340725

B)
We could always wait a little for a Renderer even if the root surface is invalidated, but we would have to be careful not to drop UI animations to 30fps, which occurred in  issue 492035  and was fixed by https://codereview.chromium.org/1155183004.

Some other short-term alternatives:
C) Send a flag with the BeginFrame that says "disable latency recovery because there's a UI animation". This is a lot of plumbing for a short term solution.

I'd personally go with A at this point and then wait for eseckler's work to fix Chromecast performance. Or provide a DisplayScheduler setting that Chromecast can set to select the old heuristic if the regression is unacceptable for them. +miu for comment.
Cc: enne@chromium.org
Labels: -Pri-1 -ReleaseBlock-Stable -M-57 M-58 Pri-2
Owner: eseckler@chromium.org
OK, thanks for the investigation.  Although I had marked this for quick fix before, now that we understand this, I think I'm okay blocking this on eseckler@'s work.  Reasons:

1) This regression looks entirely specific to overscroll glow.  No obvious problem for tablet tab spinner case http://jsbin.com/suluteh, nor URL-bar-hiding, nor pull-to-refresh.
2) Workaround A still looks complicated to me, and the past history of getting reverted for causing another regression isn't encouraging.  Seems like a can of worms and risks introducing regressions worse than the one we'd aiming to fix here.  I'd rather we spend our time getting the proper fix in earlier.
Project Member

Comment 18 by bugdroid1@chromium.org, May 26 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e356c64f253923983248656d304d8aa471aad5f0

commit e356c64f253923983248656d304d8aa471aad5f0
Author: eseckler <eseckler@chromium.org>
Date: Fri May 26 12:59:36 2017

[cc] Plumb BeginFrameAcks through SurfaceManager to DisplayScheduler.

Plumbs the BeginFrameAcks from DidNotProduceFrame and
SubmitCompositorFrame through new SurfaceObserver methods from
CompositorFrameSinks to the DisplayScheduler.

Also replaces the expected surface damage heuristics previously used
by DisplayScheduler with tracking BeginFrame(Ack)s and surface damage.

This is work towards unified BeginFrame acknowledgments, see:
Tracking bug:  https://crbug.com/697086 
Design doc: http://bit.ly/beginframeacks

BUG= 697086 , 646774,  678119 ,  703079 ,  707513 
CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel;master.tryserver.chromium.linux:linux_site_isolation

Review-Url: https://codereview.chromium.org/2854163003
Cr-Commit-Position: refs/heads/master@{#474989}

[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/BUILD.gn
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/compositor_frame_sink_support.cc
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/compositor_frame_sink_support_unittest.cc
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/direct_compositor_frame_sink_unittest.cc
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/display.cc
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/display.h
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/display_scheduler.cc
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/display_scheduler.h
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/display_scheduler_unittest.cc
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/display_unittest.cc
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/surface.cc
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/surface.h
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/surface_manager.cc
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/surface_manager.h
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/surface_observer.h
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/surface_synchronization_unittest.cc
[add] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/test/fake_surface_observer.cc
[add] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/test/fake_surface_observer.h
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/components/viz/frame_sinks/mojo_frame_sink_manager.cc
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/components/viz/frame_sinks/mojo_frame_sink_manager.h
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/content/browser/frame_host/render_widget_host_view_child_frame_unittest.cc
[modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/content/browser/renderer_host/render_widget_host_view_aura_unittest.cc

Status: Fixed (was: Assigned)
This should be fixed in M61 since above patch landed \o/

Sign in to add a comment