Noticeable flicker as zoom level propagates to out-of-process-iframes (oopifs) |
||||||||||
Issue descriptionWhen you change the zoom level on a page containing out-of-process iframes, there is are noticeable flicker artifacts as the zoom level and viewport changes propagate to child frames. Steps to reproduce: - Run chrome with --top-document-isolation. - Visit http://csreis.github.io/tests/cross-site-iframe.html - Click 'Go cross-site (complex page)' - Hit Ctrl+Plus and Ctrl+Minus to zoom in and out. It feels both laggy and flickery as the zoom level propagates. Here is a slow-motion recording of the above steps: https://www.youtube.com/watch?v=dLi-A-ctRts In this video, I see two distinct phases: 1. guttering/cropping phase: the iframe is resized, showing the inner content at the old zoom level, but guttered/cropped to the new viewport size 2. flicker phase: after the cropped phase, there seems to be one frame that is different but not final. Perhaps we are seeing the old viewport, with the new zoom level applied? My sense is that it's behavior #2 that feels so jarring, and should be fixed. This is probably responsible for the sensation of flicker when you watch this at full frame rate. The guttering behavior (#1) is probably unavoidable in general, but maybe we could have a fast path that would let us skip it if we produce frames fast enough. It contributes a sense of lagginess.
,
May 2 2016
For #1, we currently have optimistic scroll syncing (if the main thread is fast enough, in theory it can keep in sync with compositor scrolls). Applying something like that here makes sense to me. I agree that #2 looks like a bug that we should be able to fix. I think there's a third problem here too. If I zoom this page without TDI, it seems pretty snappy. With TDI, it's super slow looking. I can't tell if that's just my perception though. Maybe we should measure time from start of zoom until the iframe is done resizing with and without TDI to make sure there isn't a more core bug here?
,
May 2 2016
This is apparently an expensive page to layout, but that seems to be the case with and without TDI. I see layout events in the subframe process taking about 90ms. Without TDI, the layout of the whole page takes about 100ms. A big difference seems to be that in TDI, the top document manages to paint while the subdocument is still doing layout. In my trace there's about a 140ms delay between activity on the main process compositor thread, and the start of activity on the subframe process compositor thread. So that's many frame intervals where the gutters are visible, which might make things feel more sluggish, even though the main page was technically more responsive? If you hit Ctrl+Plus three times in rapid succession on this page, or scroll using Ctrl+Mousewheel, TDI actually feel quite a bit more responsive overall (at least for drawing the outer page). It seems like the more scroll events you queue up, the longer the delay is before we see the iframe at the final zoom level.
,
May 2 2016
^^ s/scroll event/zoom change/ in the last sentence above ^^
,
May 2 2016
,
May 3 2016
I just occurred to me (as I explained at the meeting), that this could have to do with the subframe receiving zoom-change information (from the web-contents [in the browser process]) and frameRectSize information (from the parent frame [in another renderer]) at slightly different times. This could potentially lead to two layout cycles, where one would be guaranteed to be at the wrong frameRect size. Nick, we talked about this in relation to the two frame-resize events we were sometimes seeing in the tests. As ojan@ suggested, one possibility is to have the parent frame send the zoom information together with the frameRect size change (although could this be laggy if it has to propagate through several parent-child boundaries?). Another possibility would be to treat frameRect size changes due to zoom differently; for example, if a subframe receives a zoom change, it *knows* to expect a frame re-size with it, so perhaps it could anticipate the new frame size based on the current one and the new zoom factor? I'm not sure of what the implications are in terms of the layout process, but it's food for thought.
,
May 3 2016
#3: 90ms layout for the subframe is pretty long, and it probably explains what is happening pretty well. 1. Zoom is getting sent to both renderer processes, prompting relayout. 2. The parent frame finishes first, and during its layout it sends an IPC to change the size of the child frame. 3. The browser compositor receives the compositor frame from the parent frame, at which point the gutter appears. 4. The child frame finishes layout from the zoom change, and emits a compositor frame, but by this point it has received the resize so it has to do layout again. 5. The browser compositor receives the zoomed but not resized compositor frame from the child frame, and subsequently the resized one. That produces the flicker that is visible in the video. It should solve the flicker to propagate zoom changes down through nested OOPIFs, along with size information, rather than simultaneously broadcasting them to all frames. This kind of makes sense, since zoom will basically always change the iframe rect, and so the child frame knowing its zoom in advance of its new size isn't useful. With nested OOPIFs this delay will be worse, but I don't think there is a way around that.
,
May 3 2016
,
May 4 2016
I suspect there is similar flickering when the top-level window resizes and the OOPIF tries to keep up. I wonder if we should try batch geometry changes like this into the BeginFrame message so there's no way to see the broken intermediate state.
,
May 4 2016
The problem is that, before the geometry is known for a subframe, the parent frame has to go through a layout. So then entire geometry change isn't known at the time BeginFrame is send. And since any single layout can be long, and there's now a nested chain of layouts, it seems unlikely layout could be completed withing a single frame's budget. I've been thinking that a way to avoid getting multiple layouts *per subframe*, as we sometimes do now, is to bundle zoom levels for subframes into ViewMessage_UpdateScreenRects sent from RenderWidgetHostImpl::SendScreenRects(). The receiver can update it's rects and its zoom level at the same time, and go through a single layout before propagating the zoom level and new frame rects to any subframes. A (small?) disadvantage is that we might have to force sending of screen rects in cases where we do not currently: e.g. imagine an iframe positioned top-left in a body with no border/margines, where the iframe has a width and height relative to the size of the parent frame. But I don't know if this happens a lot at present, and at most forces one IPC (at present we typically send two, one for rects and one for zoom level). Thoughts?
,
May 4 2016
This reminds me of a thread from a few weeks ago where nested message loops in resize came up: https://groups.google.com/a/chromium.org/d/msg/platform-architecture-dev/siuQ00BaKKM/tfsmz2ZFAgAJ I asked piman@ about this today. Some comments from the chat: piman@: the surfaces are supposed to deal with this I think piman@: though I'm not sure how the timeout is handled piman@: at the end of the day, if it takes a second or so for the inner frame to relayout/repaint, something has to gutter piman@: iiuc you're supposed to create a new surface on resize Also CCing jbauman@ as suggested
,
May 4 2016
I believe surfaces should be able to handle that. When resizing the outer frame, you'd create a new surface id, so that the surface aggregator knows to wait for a CompositorFrame for the inner frame. +jbauman
,
May 4 2016
Certainly, it seems impossible to avoid guttering in every case.
,
May 4 2016
piman: I'm not sure I entirely understand this. If the timer smooths out the guttering during resize then that should be okay, but the zoom issue is largely about the time difference between the child frame renderer getting a zoom change and it getting a resize, which causes it to dispatch two very different compositor frames (potentially both with new SurfaceIds, I believe). Also, and I guess John can comment better, but the new compositor frame received from the parent frame (after zoom or resize) will still have the old SurfaceID from the child frame. Is the surface aggregator smart enough to not use the last received compositor frame submitted for that surface?
,
May 4 2016
I guess my point is that the parent frame needs to use a new SurfaceID for the child if you want atomicity of the overall frame (parent+child) across resizes. You'd want to communicate that new SurfaceID together with the resize and/or zoom change to the child. Then you're not dependent on timing.
,
May 4 2016
Yeah, a different way to do this would be keep track of the zoom for a frame in ViewHostMsg_SwapCompositorFrame, and create a new SurfaceId if that doesn't match the previous zoom for a frame. The parent renderer can then decide what to do with that SurfaceId (probably throw it away, as it's the wrong size). Still, the child renderer would have done all the work for a useless frame (correct zoom, incorrect size), so I think it makes sense to ensure the child gets its new zoom and new size at the same time.
,
May 5 2016
This is a general problem though right? If a parent frame does a layout which resizes the iframe, the iframe will always be one frame behind if both frames get the BeginMainFrame IPC at the same time.
,
May 5 2016
Yeah, there is definitely an inherent race condition here. Even if you knew the zoom level for every frame, at some point you might either have to choose the old contents or nothing and there will be some kind of flash. It could be that painting blank and then filling it in later is less jarring on average, but that will require some experimentation. Seems like a P3.
,
May 6 2016
I think it's a bummer if we have time to produce a perfect frame, but we produce a bad one. We do have the tools to make it non-racy.
,
May 6 2016
Tangentially related: we have similar issues around fullscreen and orientation switching[1], so if we want to build something general to solve this, let's look at those cases too. [1] https://docs.google.com/document/d/1NXluXrFqgEjqtbfq4yninCXUqEl3OwykvJGDQ_M9Z0w/edit#heading=h.4ev17mmzrbhy
,
May 25 2016
,
Dec 13 2016
Issue 672373 has been merged into this issue.
,
Jan 23 2017
,
May 5 2017
,
May 8 2018
,
Nov 27
***UI Mass Triage *** Adding labels for expert review. |
||||||||||
►
Sign in to add a comment |
||||||||||
Comment 1 by wjmaclean@chromium.org
, May 2 2016