New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Noticeable flicker as zoom level propagates to out-of-process-iframes (oopifs)

Project Member Reported by nick@chromium.org, May 2 2016

Issue description

When you change the zoom level on a page containing out-of-process iframes, there is are noticeable flicker artifacts as the zoom level and viewport changes propagate to child frames.

Steps to reproduce:
 - Run chrome with --top-document-isolation.
 - Visit http://csreis.github.io/tests/cross-site-iframe.html
 - Click 'Go cross-site (complex page)'
 - Hit Ctrl+Plus and Ctrl+Minus to zoom in and out.

It feels both laggy and flickery as the zoom level propagates. Here is a slow-motion recording of the above steps: https://www.youtube.com/watch?v=dLi-A-ctRts

In this video, I see two distinct phases:
 1. guttering/cropping phase: the iframe is resized, showing the inner content at the old zoom level, but guttered/cropped to the new viewport size
 2. flicker phase: after the cropped phase, there seems to be one frame that is different but not final. Perhaps we are seeing the old viewport, with the new zoom level applied?

My sense is that it's behavior #2 that feels so jarring, and should be fixed. This is probably responsible for the sensation of flicker when you watch this at full frame rate.

The guttering behavior (#1) is probably unavoidable in general, but maybe we could have a fast path that would let us skip it if we produce frames fast enough. It contributes a sense of lagginess.
 
I had noticed this too. It seems like ultimately to fix this really well it would be nice to synchronize the application of the new zoom levels via the compositor frame currently being drawn (not sure how this would work).

But I'm not sure that varying layout times (e.g. the parent frame is simple and the child frame is complex) mightn't mean that what we really need to synchronize is the *end* of layout, not the start of it? For example, what does it look like when you embed a simple frame inside a complex one?

We're using WebContentsImpl::SendPageMessage() here: I'm assuming there's no significant lag between the IPCs it sends to different processes, is there?

Comment 2 by ojan@chromium.org, May 2 2016

Cc: skyos...@chromium.org esprehn@chromium.org enne@chromium.org brettw@chromium.org
For #1, we currently have optimistic scroll syncing (if the main thread is fast enough, in theory it can keep in sync with compositor scrolls). Applying something like that here makes sense to me.

I agree that #2 looks like a bug that we should be able to fix.

I think there's a third problem here too. If I zoom this page without TDI, it seems pretty snappy. With TDI, it's super slow looking. I can't tell if that's just my perception though. Maybe we should measure time from start of zoom until the iframe is done resizing with and without TDI to make sure there isn't a more core bug here?

Comment 3 by nick@chromium.org, May 2 2016

This is apparently an expensive page to layout, but that seems to be the case with and without TDI. I see layout events in the subframe process taking about 90ms. Without TDI, the layout of the whole page takes about 100ms.

A big difference seems to be that in TDI, the top document manages to paint while the subdocument is still doing layout. In my trace there's about a 140ms delay between activity on the main process compositor thread, and the start of activity on the subframe process compositor thread. So that's many frame intervals where the gutters are visible, which might make things feel more sluggish, even though the main page was technically more responsive?

If you hit Ctrl+Plus three times in rapid succession on this page, or scroll using Ctrl+Mousewheel, TDI actually feel quite a bit more responsive overall (at least for drawing the outer page). It seems like the more scroll events you queue up, the longer the delay is before we see the iframe at the final zoom level.

Comment 4 by nick@chromium.org, May 2 2016

^^ s/scroll event/zoom change/ in the last sentence above ^^

Comment 5 by nick@chromium.org, May 2 2016

Components: UI>Browser>Zoom
I just occurred to me (as I explained at the meeting), that this could have to do with the subframe receiving zoom-change information (from the web-contents [in the browser process]) and frameRectSize information (from the parent frame [in another renderer]) at slightly different times. This could potentially lead to two layout cycles, where one would be guaranteed to be at the wrong frameRect size. Nick, we talked about this in relation to the two frame-resize events we were sometimes seeing in the tests.

As ojan@ suggested, one possibility is to have the parent frame send the zoom information together with the frameRect size change (although could this be laggy if it has to propagate through several parent-child boundaries?).

Another possibility would be to treat frameRect size changes due to zoom differently; for example, if a subframe receives a zoom change, it *knows* to expect a frame re-size with it, so perhaps it could anticipate the new frame size based on the current one and the new zoom factor? I'm not sure of what the implications are in terms of the layout process, but it's food for thought.

Comment 7 by kenrb@chromium.org, May 3 2016

#3: 90ms layout for the subframe is pretty long, and it probably explains what is happening pretty well.

1. Zoom is getting sent to both renderer processes, prompting relayout.
2. The parent frame finishes first, and during its layout it sends an IPC to change the size of the child frame.
3. The browser compositor receives the compositor frame from the parent frame, at which point the gutter appears.
4. The child frame finishes layout from the zoom change, and emits a compositor frame, but by this point it has received the resize so it has to do layout again.
5. The browser compositor receives the zoomed but not resized compositor frame from the child frame, and subsequently the resized one. That produces the flicker that is visible in the video.

It should solve the flicker to propagate zoom changes down through nested OOPIFs, along with size information, rather than simultaneously broadcasting them to all frames. This kind of makes sense, since zoom will basically always change the iframe rect, and so the child frame knowing its zoom in advance of its new size isn't useful. With nested OOPIFs this delay will be worse, but I don't think there is a way around that.
Owner: wjmaclean@chromium.org
Status: Started (was: Available)
I suspect there is similar flickering when the top-level window resizes and the OOPIF tries to keep up. I wonder if we should try batch geometry changes like this into the BeginFrame message so there's no way to see the broken intermediate state.
The problem is that, before the geometry is known for a subframe, the parent frame has to go through a layout. So then entire geometry change isn't known at the time BeginFrame is send. And since any single layout can be long, and there's now a nested chain of layouts, it seems unlikely layout could be completed withing a single frame's budget.

I've been thinking that a way to avoid getting multiple layouts *per subframe*, as we sometimes do now, is to bundle zoom levels for subframes into ViewMessage_UpdateScreenRects sent from RenderWidgetHostImpl::SendScreenRects().

The receiver can update it's rects and its zoom level at the same time, and go through a single layout before propagating the zoom level and new frame rects to any subframes.

A (small?) disadvantage is that we might have to force sending of screen rects in cases where we do not currently: e.g. imagine an iframe positioned top-left in a body with no border/margines, where the iframe has a width and height relative to the size of the parent frame. But I don't know if this happens a lot at present, and at most forces one IPC (at present we typically send two, one for rects and one for zoom level).

Thoughts?
Cc: jbau...@chromium.org
This reminds me of a thread from a few weeks ago where nested message loops in resize came up: https://groups.google.com/a/chromium.org/d/msg/platform-architecture-dev/siuQ00BaKKM/tfsmz2ZFAgAJ

I asked piman@ about this today. Some comments from the chat:
piman@: the surfaces are supposed to deal with this I think
piman@: though I'm not sure how the timeout is handled
piman@: at the end of the day, if it takes a second or so for the inner frame to relayout/repaint, something has to gutter
piman@: iiuc you're supposed to create a new surface on resize

Also CCing jbauman@ as suggested
I believe surfaces should be able to handle that. When resizing the outer frame, you'd create a new surface id, so that the surface aggregator knows to wait for a CompositorFrame for the inner frame. +jbauman
Certainly, it seems impossible to avoid guttering in every case.
piman: I'm not sure I entirely understand this. If the timer smooths out the guttering during resize then that should be okay, but the zoom issue is largely about the time difference between the child frame renderer getting a zoom change and it getting a resize, which causes it to dispatch two very different compositor frames (potentially both with new SurfaceIds, I believe).

Also, and I guess John can comment better, but the new compositor frame received from the parent frame (after zoom or resize) will still have the old SurfaceID from the child frame. Is the surface aggregator smart enough to not use the last received compositor frame submitted for that surface?
I guess my point is that the parent frame needs to use a new SurfaceID for the child if you want atomicity of the overall frame (parent+child) across resizes.
You'd want to communicate that new SurfaceID together with the resize and/or zoom change to the child.
Then you're not dependent on timing.
Yeah, a different way to do this would be keep track of the zoom for a frame in ViewHostMsg_SwapCompositorFrame, and create a new SurfaceId if that doesn't match the previous zoom for a frame. The parent renderer can then decide what to do with that SurfaceId (probably throw it away, as it's the wrong size).

Still, the child renderer would have done all the work for a useless frame (correct zoom, incorrect size), so I think it makes sense to ensure the child gets its new zoom and new size at the same time.
This is a general problem though right? If a parent frame does a layout which resizes the iframe, the iframe will always be one frame behind if both frames get the BeginMainFrame IPC at the same time.
Yeah, there is definitely an inherent race condition here. Even if you knew the zoom level for every frame, at some point you might either have to choose the old contents or nothing and there will be some kind of flash. It could be that painting blank and then filling it in later is less jarring on average, but that will require some experimentation. Seems like a P3.
I think it's a bummer if we have time to produce a perfect frame, but we produce a bad one. We do have the tools to make it non-racy.
Tangentially related: we have similar issues around fullscreen and orientation switching[1], so if we want to build something general to solve this, let's look at those cases too.

[1] https://docs.google.com/document/d/1NXluXrFqgEjqtbfq4yninCXUqEl3OwykvJGDQ_M9Z0w/edit#heading=h.4ev17mmzrbhy
Status: Assigned (was: Started)
Cc: nyerramilli@chromium.org mcnee@chromium.org ranjitkan@chromium.org
 Issue 672373  has been merged into this issue.

Comment 23 by mcnee@chromium.org, Jan 23 2017

Cc: msrchandra@chromium.org dsinclair@chromium.org
 Issue 681798  has been merged into this issue.
Cc: rjkroege@chromium.org danakj@chromium.org sadrul@chromium.org kylec...@chromium.org piman@chromium.org fsam...@chromium.org

Comment 25 by ojan@chromium.org, May 8 2018

Cc: -ojan@chromium.org
Labels: Hotlist-DesktopUIToolingRequired Hotlist-DesktopUIChecked
***UI Mass Triage ***
Adding labels for expert review.

Sign in to add a comment