Browser-side animations jank scrolling |
|||||||||||
Issue descriptionNo repro: 54.0.2840.85 Repro: 55.0.2883.91, 56.0.2924.23 (Nexus 6P / Pixel XL) See internal bug http://b/33751067 for details including video of symptoms. I exactly bisected this to https://codereview.chromium.org/2083423006 "Remove default begin frame sources from the renderer" Repro steps: 1. Visit http://en.m.wikipedia.org/wiki/Martini_(cocktail) 2. Fling down to the bottom of the page. 3. While the overscroll glow is still animating, fling upwards. 4. Observe low scrolling framerate (~20 fps) for the duration of the animation. Although this bug made it to 55 stable channel, it would be nice to fix for 56 if safe.
,
Jan 11 2017
Pri2 + releaseblock-stable M56 is a bit weird :D - we should probably bump to pri1 or drop the releaseblock. Given that this already made it into M55, maybe just drop releaseblock?
,
Jan 11 2017
,
Jan 11 2017
I was able to repro this on my nexus 5x. It does look it's that patch, and it looks like the line that's doing it is the LayerTreeSettings change to default "use_output_surface_begin_frame_source = true;". Previously, the Android browser compositor was driven by a synthetic begin frame source with the default of no offset and renderer setting display refresh interval (from LTHI initialization). These parameters don't get updated. With this patch, it changes the Android browser compositor to be driven by its ExternalBeginFrameSource from OnVSync from WindowAndroidCompositor. This is the begin frame source that is also driving the renderer. If I try to recall back to what I was thinking in the middle of a patch set sequence in August/September, I think I believed that the Android browser compositor was already being driven by its ExternalBeginFrameSource and not a default synthetic one, and that this patch was a noop in that respect. It does seem a little bit odd that driving the browser compositor with an arbitrarily offset begin frame source is less janky than driving it with the ExternalBeginFrameSource. To me, that smells a little bit like receiving a frame from the renderer here that wants to kick off a frame in the browser, but the browser had already done work the frame. I'm going to attempt to channel the spirit of sievers and see what else I can find.
,
Jan 12 2017
This does appear to be a weird scheduling interaction with the overscroll controller (and not omnibox hiding or something else). If I disable that, then everything runs very smoothly. (https://codereview.chromium.org/1879833002 is the sievers patch from the past that this reminded me of)
,
Jan 18 2017
OK, punting to M57 since it's getting late in the cycle and at least it only affects one type of browser animation.
,
Jan 20 2017
I managed to capture traces at ToT and with a version of ToT that hacks in the old synthetic begin frame source for the browser compositor. It looks like in both cases, the browser compositor is running at 60hz, the renderer compositor is running at 60hz, but the order of events is slightly different. ToT (janky scroll up starts at 1s in trace) ------------------------------------------- VSync (begin frame to renderer and browser) CompositorImpl animate overscroll + commit CompositorImpl deadline + draw RWHVA::SwapCompositorFrame DisplayScheduler deadline + draw (repeats every VSync) hacked synthetic begin frame source (starting at 1s) ----------------------------------- browser compositor begin frame (synthetic) CompositorImpl animate overscroll + commit CompositorImpl deadline + draw VSync (begin frame to renderer) DisplayScheduler deadline + draw RWHVA::SwapCompositorFrame (repeats) It seems like the RWHVA adjusts a bunch of properties on the overscroll controller on swap compositor frame, which then get used the next time it animates, and so maybe not having a chance to animate and apply those properties before the frame gets used messes something up? I am a bit lost in the Android browser compositor and the overscroll controller. Is there somebody who knows more about how this is supposed to work? Yes, my patch regressed this, but it seems like this was quite fragile before.
,
Jan 20 2017
,
Jan 20 2017
Well, jdduke@ and sievers@ worked on this the most and they've both left the team, but skyostil@ was also involved and might know something. Could you upload your hack so other folks could apply it locally if needed?
,
Jan 20 2017
Good thought! https://codereview.chromium.org/2648583005
,
Jan 21 2017
+khushalsagar@
,
Jan 23 2017
I managed to get a trace where the renderer is falling to 30 fps. Looks like while the browser is animating, the browser's commit + draw has the DisplayScheduler drawing before the renderer can submit a frame. And now the renderer's draw is throttled till the next Vsync because it is awaiting a frame ack from the browser. Now on the next vsync, the renderer skips an impl frame this vsync to recover latency, since the pending ack throttles the draw. It gets the ack and this keeps repeating. Finally once the browser animation has stopped, its only the renderer that's driving the display's draw, so it gets acks for submitted frames within the same vsync and things get back to normal.
,
Jan 23 2017
Took another look at the code. I think its the child surface damage tracking logic in DisplayScheduler. It takes surfaces damaged in the last 2 consecutive frames as active surfaces to expect damage frame from in the next frame. Once the browser compositor (root surface) draws, the scheduler only waits for the active surfaces otherwise the draw is triggered immediately. So the display scheduler draws before the deadline sent to the renderer, causing the renderer to assume its running in high latency and skipping a frame on the next vsync, causing the display scheduler to assume its inactive...
,
Jan 25 2017
+eseckler@ who has dug into this recently and I believe has a patch to adjust the way DisplayScheduler waits for updates.
,
Jan 25 2017
The patch Sami mentioned is this one: https://codereview.chromium.org/2527283003/. It replaces the child surface heuristics with acknowledgements sent by the corresponding BeginFrameObservers. That change should resolve the issue you are describing, since waiting for child frames won't depend on whether the surface was active in the prior frame. However, there's still a bunch of work to do before the Display scheduler can start using BeginFrameAcks in this way. I'm currently in the process of landing the patch above in smaller pieces. Realistically, I'll complete that towards the end of Q1. So I'm not sure if you want to be blocked on it ;)
,
Jan 25 2017
Thanks for figuring out the root cause! These heuristics can't handle all the cases perfectly, so it'll be nice to have eseckler's work in. For the short term though, khushalsagar@ came by wondering if we could play around with the heuristic. We'll have to be careful in particular with A) 24fps and 30fps content on Chromecast and B) 60 fps UI animations on their own. A) In the past, I had a patch to consider a surface active if it submitted a frame recently, which should theoretically fix this bug, but it regresses Chromecast use cases: "cc: Consider Surface active if frame received recently" - https://codereview.chromium.org/1251693002 Alerts: https://chromeperf.appspot.com/group_report?rev=340299 It's revert - https://codereview.chromium.org/1258273005 Alerts: https://chromeperf.appspot.com/group_report?rev=340725 B) We could always wait a little for a Renderer even if the root surface is invalidated, but we would have to be careful not to drop UI animations to 30fps, which occurred in issue 492035 and was fixed by https://codereview.chromium.org/1155183004. Some other short-term alternatives: C) Send a flag with the BeginFrame that says "disable latency recovery because there's a UI animation". This is a lot of plumbing for a short term solution. I'd personally go with A at this point and then wait for eseckler's work to fix Chromecast performance. Or provide a DisplayScheduler setting that Chromecast can set to select the old heuristic if the regression is unacceptable for them. +miu for comment.
,
Jan 25 2017
OK, thanks for the investigation. Although I had marked this for quick fix before, now that we understand this, I think I'm okay blocking this on eseckler@'s work. Reasons: 1) This regression looks entirely specific to overscroll glow. No obvious problem for tablet tab spinner case http://jsbin.com/suluteh, nor URL-bar-hiding, nor pull-to-refresh. 2) Workaround A still looks complicated to me, and the past history of getting reverted for causing another regression isn't encouraging. Seems like a can of worms and risks introducing regressions worse than the one we'd aiming to fix here. I'd rather we spend our time getting the proper fix in earlier.
,
May 26 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/e356c64f253923983248656d304d8aa471aad5f0 commit e356c64f253923983248656d304d8aa471aad5f0 Author: eseckler <eseckler@chromium.org> Date: Fri May 26 12:59:36 2017 [cc] Plumb BeginFrameAcks through SurfaceManager to DisplayScheduler. Plumbs the BeginFrameAcks from DidNotProduceFrame and SubmitCompositorFrame through new SurfaceObserver methods from CompositorFrameSinks to the DisplayScheduler. Also replaces the expected surface damage heuristics previously used by DisplayScheduler with tracking BeginFrame(Ack)s and surface damage. This is work towards unified BeginFrame acknowledgments, see: Tracking bug: https://crbug.com/697086 Design doc: http://bit.ly/beginframeacks BUG= 697086 , 646774, 678119 , 703079 , 707513 CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel;master.tryserver.chromium.linux:linux_site_isolation Review-Url: https://codereview.chromium.org/2854163003 Cr-Commit-Position: refs/heads/master@{#474989} [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/BUILD.gn [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/compositor_frame_sink_support.cc [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/compositor_frame_sink_support_unittest.cc [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/direct_compositor_frame_sink_unittest.cc [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/display.cc [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/display.h [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/display_scheduler.cc [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/display_scheduler.h [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/display_scheduler_unittest.cc [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/display_unittest.cc [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/surface.cc [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/surface.h [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/surface_manager.cc [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/surface_manager.h [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/surface_observer.h [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/surfaces/surface_synchronization_unittest.cc [add] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/test/fake_surface_observer.cc [add] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/cc/test/fake_surface_observer.h [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/components/viz/frame_sinks/mojo_frame_sink_manager.cc [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/components/viz/frame_sinks/mojo_frame_sink_manager.h [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/content/browser/frame_host/render_widget_host_view_child_frame_unittest.cc [modify] https://crrev.com/e356c64f253923983248656d304d8aa471aad5f0/content/browser/renderer_host/render_widget_host_view_aura_unittest.cc
,
Jun 15 2017
This should be fixed in M61 since above patch landed \o/ |
|||||||||||
►
Sign in to add a comment |
|||||||||||
Comment 1 by enne@chromium.org
, Jan 5 2017