New issue
Advanced search Search tips

Issue 865446 link

Starred by 0 users

Issue metadata

Status: Fixed
Owner:
Closed: Aug 7
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug



Sign in to add a comment

scrollIntoView{behavior: "smooth"} broken for OOPIFs

Project Member Reported by ekaramad@chromium.org, Jul 19

Issue description

Chrome Version: 67.0.3396.99
OS: All

What steps will reproduce the problem?
(1) Navigate a new tab to: https://jsfiddle.net/ekaramad/fr9uj8s4/show
  Note: This tab contains a cross-origin frame which itself contains another cross-origin frame back to jsfiddle.net. So this is a A (B ( A ) ) scenario.
(2) Press key '3':
  Note: the test case is wired up so that the inner most frame has focus. Pressing '3' will trigger button.scrollIntoView({behavior: 'smooth'}).

What is the expected result?
After some short delay, the button should smoothly scroll into view.

What happens instead?
The button is not visible after the scrolling animation is over.

This issue only occurs with --site-per-process.

 
Cc: alex...@chromium.org
Status: Assigned (was: Available)
I am not quite sure how to fix it but I suspect the issue is with NotifyOffsetChanged call which does not seem to be called right away when the behavior is smooth.

This could be complicated to fix, so I wonder if we could avoid smooth scrolling for OOIPFs instead. This will make sure pages won't break.

# The suggestion in comment 1 was for a temporary hack until we get the smooth scroll working for OOPIFs.
I tested this and found out that all the values of offsets in recursions to LayoutBox::ScrollRectToVisibleRecursive as well as those sent to browser are identical. I wonder if the bug is on the compositor side.
When smooth scrolling an element in a series of nested div I see that animations and scroll start by scrolling the outermost div first. If this is related and absolutely the only way then I don't see any practical solutions for this problem. Digging more.
Cc: creis@chromium.org
I'm not very familiar with smooth scrolling, but it looks like it's a feature where you click an anchor link and the browser gradually scrolls to the destination in the page rather than jumping straight there?  It would be unfortunate to turn it off for all OOPIFs, but that's perhaps better than scrolling to the wrong place for the time being.

If this only affected nested OOPIFs and not a single level of OOPIFs, can we allow smooth scrolling to still work with a single level of OOPIFs, or is that too hard to detect?

Also curious if bokan@ has thoughts on how to fix this properly when he gets back.
IIUC, we do have security measures in place against fragmented scrolling to avoid "frame sniffing".

https://cs.chromium.org/chromium/src/third_party/blink/renderer/core/loader/frame_loader.cc?rcl=19d339e6db10af4e46d2f956fb7877eba0e9c4b7&l=1405

So this should mean a fragmented scroll will never propagate to embedder and is fully contained within the local root.

I did test this by setting iframe.src = "corss-origin#fragment" and I don't see the embedder being scrolling.

So perhaps if do have to force the scroll to instant for OOPIFs, it only needs to be applied to those that actually propagate cross origin. A base |element.scrollIntoView()| is one such case.

I also need to investigate more to verify if the issue is happening to the deeply nested frames only (that was the impression I had when reproducing this bug).
Cc: sunyunjia@chromium.org
Re#5: Yeah, smooth scrolling is a CSS feature that allows pages to specify that programmatic scrolls should be "smoothly animated". i.e. calling scrollIntoView be smooth. Since multiple things are required to scroll, we queue the scrolls and perform them in sequence.

To achieve this, the class SmoothScrollSequencer keeps a queue and once we've processed all the ancestor layout boxes and determined what their final scroll position should be, we play back the scroll into view animations one by one.

Where I think this goes wrong is that we keep SmoothScrollSequencer on Page. In the A(B(A)) case here, both A's are a local root but share the same Page, right? In that case, we'll queue and run the animations for the inner most A, then do the same for B (this is happening in parallel with OOPIF), then we'll start the scroll into view for the outer A. When we reach the outer A we cancel the inner A's animations and reset the queue since it looks like a "new" scrollIntoView call.

The simplest solution here is to move the sequencer to be per-local-root rather than per-page. This should all "just work" - even though the behavior will differ somewhat from what happens in the non-OOPIF case since we'll be running the animations in parallel rather than a sequence. I'm not sure if that's worth fixing since it's just a UX difference. To really get right we'd have to defer running the queue for an OOPIF and pass a signal down the frame tree to kick start the sequencer in each nested OOPIF. That should be fairly straightforward AFICT.

+sunyunjia@ who is the domain expert here. 
Thanks David! Refactoring sounds good. IIUC, for a frame tree A->B->C the ideal behavior is scroll A, B, and C in order one after.

If we only refactor and do not send down the message for the same frame tree we will most probably end up with running the scrolling task for C, then B, and then A, right?  So supposedly if B or C are way outside of viewable region, then by the time smooth scrolling for A is done, B and C have already scrolled which perhaps would look instant.

So to implement the actual behavior, we need that IPC to travel downwards in frame tree and get the scrolling going. Two conerns here:

1- This means we should know the frame tree route we took for a scroll sequence, so would there be any issues here? Basically frame A would know that there is a C inside a B underneath.

2- Would the implementation be as simple as dequeuing one scroll task per received trigger IPC or do we need to remember more information for each queued task? Also is it OK to block any forthcoming smooth scroll tasks to handle the head of line task (seems like the correct thing to do but from comment #7 I understand there is more kinds of programmatic scrolling which are not necessarily recursive)?

> If we only refactor and do not send down the message for the same frame tree we will most probably end up with running the scrolling task for C, then B, and then A, right?

That's correct. But the final positions should all be correct and IIRC the spec gives a fair amount of latitude in how the UA chooses to sequence and perform scrolls so it's not a severe issue.

> 1- This means we should know the frame tree route we took for a scroll sequence, so would there be any issues here? Basically frame A would know that there is a C inside a B underneath.

I don't think so. I think Frame A would only need to know about Frame B somehow (e.g. Sequencer stores the Element being scrolled into view - check if it's an OOPIF iframe). Frame B would ping C in the same way once it's done.

> 2- Would the implementation be as simple as dequeuing one scroll task per received trigger IPC or do we need to remember more information for each queued task? 

I think it be best to process the entire queue in a given LocalRoot. When the queue is emptied, and the element being scrolled into view is an OOPIF, IPC the OOPIF to run it's sequenced animations. E.g:

[ scroll A1 -> scroll A2 -> scroll A3 -> IPC B ] -> [ scroll B1 -> scroll B2 -> IPC C ] -> [ scroll C1 -> scroll C2 ]

> Also is it OK to block any forthcoming smooth scroll tasks to handle the head of line task (seems like the correct thing to do but from comment #7 I understand there is more kinds of programmatic scrolling which are not necessarily recursive)?

I'm not sure I understand your question here. Do you mean what to do when another smooth scroll is started? Today we cancel any existing smooth scroll and reset the queue. This would be tricky to do across processes and we'd need another IPC to clear out the sequences in child processes. I can imagine this being difficult to get right in all edge cases. My advice would be to implement just the per-local-root refactoring. I'm not sure the added complexity of totally-correct sequencing is worth the small UX win in a potentially rare edge case. We can reevaluate if we see some real-world examples.
> I don't think so. I think Frame A would only need to know about Frame B somehow (e.g. Sequencer stores the Element being scrolled into view - check if it's an OOPIF iframe). Frame B would ping C in the same way once it's done.

Yes agreed but that's an extra bit of routing logic to consider.

> I'm not sure I understand your question here. Do you mean what to do when another smooth scroll is started? Today we cancel any existing smooth scroll and reset the queue. This would be tricky to do across processes and we'd need another IPC to clear out the sequences in child processes. I can imagine this being difficult to get right in all edge cases. My advice would be to implement just the per-local-root refactoring. I'm not sure the added complexity of totally-correct sequencing is worth the small UX win in a potentially rare edge case. We can reevaluate if we see some real-world examples.

Yes my concern was something along those lines - i.e., a situation which a scroll related task would make us resync with other processes.


All in all I also agree the refactor is a good (and the right) move. I think even if we decide to do more later, it would help to have this logic refactored into a LocalRoot. That said I highly doubt implementation is easy: to track scroll sequences across frames, route scroll IPCs to the right frames (some bookkeeping required), canceling current scrolls for all frames, making scroll tasks aware of their frame, etc. which also might introduce more room for bugs and potential deadlocks.

I can take over the refactoring part but I am not super familiar with the logic to make a call on how simple the fix is. If we are targeting M69 the fix can't be too large. A temporary disabling of smooth scroll for OOPIFs is still on the table.
I suspect moving the logic to be per-LocalRoot will be conceptually rather simple. The only defect at that point would be that each local root will run their sequences in parallel (with each other). My comment in #9 was that I'm not convinced fixing this parallelism is worth the cost in any time frame.
Labels: M-69
Status: Started (was: Assigned)
Thanks. I will poke at the refactor then. Hopefully will be small enough for a merge.
Project Member

Comment 13 by bugdroid1@chromium.org, Aug 3

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8504f3e0526fa3490866b954eae4533d64c70950

commit 8504f3e0526fa3490866b954eae4533d64c70950
Author: Ehsan Karamad <ekaramad@chromium.org>
Date: Fri Aug 03 22:00:32 2018

Move SmoothScrollSequence to LocalRoot

The smooth-scroll animations are local frame root concepts. This move
will avoid a nested OOPIF same process with a parent to use the same
sequencer so their animations are independently scheduled.

Bug:  865446 
Cq-Include-Trybots: luci.chromium.try:linux_layout_tests_slimming_paint_v2;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: Ic07c22eef5d7a6ec25fe51c32d3e22e056398522
Reviewed-on: https://chromium-review.googlesource.com/1159189
Commit-Queue: Ehsan Karamad <ekaramad@chromium.org>
Reviewed-by: David Bokan <bokan@chromium.org>
Reviewed-by: Sandra Sun <sunyunjia@chromium.org>
Reviewed-by: Alex Moshchuk <alexmos@chromium.org>
Cr-Commit-Position: refs/heads/master@{#580663}
[modify] https://crrev.com/8504f3e0526fa3490866b954eae4533d64c70950/content/browser/site_per_process_browsertest.cc
[modify] https://crrev.com/8504f3e0526fa3490866b954eae4533d64c70950/third_party/blink/renderer/core/frame/local_frame.cc
[modify] https://crrev.com/8504f3e0526fa3490866b954eae4533d64c70950/third_party/blink/renderer/core/frame/local_frame.h
[modify] https://crrev.com/8504f3e0526fa3490866b954eae4533d64c70950/third_party/blink/renderer/core/layout/layout_object.cc
[modify] https://crrev.com/8504f3e0526fa3490866b954eae4533d64c70950/third_party/blink/renderer/core/page/page.cc
[modify] https://crrev.com/8504f3e0526fa3490866b954eae4533d64c70950/third_party/blink/renderer/core/page/page.h
[modify] https://crrev.com/8504f3e0526fa3490866b954eae4533d64c70950/third_party/blink/renderer/core/paint/paint_layer_scrollable_area.cc

Thanks ekaramad@ and bokan@!  Looks like this is working in Windows Canary 70.0.3514.0, after r580663 landed in 70.0.3512.0.  Can this be marked fixed?

In terms of merge, are there any known sites affected by this?  If so and it's a safe + easy merge, we should consider it, but otherwise it might be safest to wait for M70.
Status: Fixed (was: Started)
Marking the bug as fixed.

The fix is very small and I don't believe it should cause any troubles. Leaving the decision to bokan@.
I believe this is only broken when there are multiple local roots in a process and we do a smooth scroll-into-view, the combination of which should be quite rare. So unless we find real-world case I'm ok leaving this to move into stable as usual.

Sign in to add a comment