New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 765366 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner: ----
Closed: Nov 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug



Sign in to add a comment

headless: History navigation can block virtual time forever

Project Member Reported by zoeclifford@chromium.org, Sep 14 2017

Issue description

Chrome Version: 62.0.3201.0 (Developer Build) (64-bit)
OS: Ubuntu Linux

What steps will reproduce the problem?
(1) Look at the attached HTML in headless mode with virtual time. For example "chrome --headless --disable-gpu --virtual-time-budget=1000 --screenshot file://attachment.html"
(2) Try again but without history.back()

What is the expected result?
A screenshot is produced in both cases

What happens instead?
The command runs forever if history.back is called.

Details:
The issue here is with virtual time. Virtual time is a feature of headless Chromium that allows for deterministic page loading and JavaScript execution.

To prevent a page-visible race between the renderer process and the browser process, the virtual time scheduler code in the renderer tries to wait for a navigation from the browser after calling history.back before allowing virtual time to advance.

It does this incorrectly, by assuming that history.back will always result in a single navigation on the same frame [1][2].

Actually history.back may:
* Trigger a navigation on a different frame, as in the testcase
* Trigger multiple navigations on different frames

Since the number of navigations is determined by the browser process [3], the current approach is not workable since the renderer has no way of knowing if all history.back navigations have been generated or not.

Instead the renderer needs to know when an entire batch of navigations is complete. This will likely require a new IPC message, or extra information bolted onto OnNavigate's IPC message.

[1]
WebViewSchedulerImpl::WillNavigateBackForwardSoon
https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/scheduler/renderer/web_view_scheduler_impl.cc?rcl=9e3ad303633058ab8c4fec1d7526aae8c7635368&l=256

[2]
WebViewSchedulerImpl::DidEndProvisionalLoad
https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/scheduler/renderer/web_view_scheduler_impl.cc?rcl=9e3ad303633058ab8c4fec1d7526aae8c7635368&l=269

[3]
NavigationControllerImpl::NavigateToPendingEntryInternal
https://cs.chromium.org/chromium/src/content/browser/frame_host/navigation_controller_impl.cc?rcl=9e3ad303633058ab8c4fec1d7526aae8c7635368&l=2078


 
See attached.
index.html
347 bytes View Download

Comment 2 by creis@chromium.org, Sep 14 2017

Cc: creis@chromium.org
Components: Internals>Sandbox>SiteIsolation
Hmm, why is virtual time managed in the renderer process?  That seems incompatible with out-of-process iframes, when a page has frames spread across multiple processes.  (Maybe this is a sub-problem of issue 715924?)  Note that uses of out-of-process iframes have already launched in M56, and more are on the way.

Maybe we should discuss this feature a bit more to see what the right way forward is.  I did the session history refactor, so I'm happy to chat more about it.
I'd be happy to chat, but we should also rope in alexclarke and/or skyostil since they did the bulk of the virtual time implementation and are generally a lot more familiar with headless than I am. See  http://crbug.com/696001 

As I understand it the renderer process needs to be aware of virtual time so that JavaScript can wait on certain browser logic to call back into the renderer before continuing. Otherwise you end up with a bunch of places where the browser and the renderer race in a page visible manner.

For example after calling history.back the JavaScript may schedule a setTimeout to inspect the contents of the iframe. If virtual time isn't blocked it probably won't see the effects of the navigation before the virtual time budget (which can go faster than real time) expires.

(I'm not yet familiar with how OOPIF will play into all this)


Comment 4 by creis@chromium.org, Sep 14 2017

Cc: skyos...@chromium.org alexclarke@chromium.org
Thanks.  If any of you will be at BlinkOn next week, we could chat there.  Otherwise let's set up a meeting the following week!

Comment 5 by dcheng@chromium.org, Sep 15 2017

Cc: dcheng@chromium.org
Status: Available (was: Untriaged)
I've dug into this a bit. To summarize the problem:

A page has one more iframes (some/all) of have navigation history.  JS calls history.back() which calls LocalFrameClientImpl::NavigateBackForward() on the root document which calls virtual_time_pauser_.PauseVirtualTime(true);

Subsequently some child iframes have a history navigation but the root LocalFrameClientImpl never hears about it and virtual time gets stuck.

I think instead of LocalFrameClientImpl::NavigateBackForward pausing virtual time I think that should be up to RenderViewImpl and we should arrange for something to tell it once the history navigation has been committed.

In a OOPIF world that would have to be a new IPC.  Of course OOPIF & VirtualTime is a whole other problem.

Ignoring OOPIF for now RenderFrameImpl::NavigateInternal could tell the RenderViewImpl once it's seen a history navigation.
> 
Ignoring OOPIF for now RenderFrameImpl::NavigateInternal could tell the RenderViewImpl once it's seen a history navigation.

One issue is that a single call to history.go may trigger more than one history navigation.

Consider a page with the following navigations:
* A -> B
* B contains two iframes, 1A and 2A
* 1A -> 1B
* 2A -> 2B

At this point:
* history.go(-1) will navigate 2B to 2A
* history.go(-2) will navigate 2B to 2A and 1B to 1A
* history.go(-3) will navigate B to A.

It seems like the number of navigations cannot be determined from RenderFrameImpl::NavigateInternal, that's why I thought we would need to change IPC.
Maybe it doesn't matter if there are multiple navigations from one history navigation? Once the navigation(s) have started, resource loading and the presence of one or more background html parsers will block virtual time.  

The tricky bit is blocking and then unblocking virtual time while we wait for a navigation to start.  LocalFrameClientImpl::NavigateBackForward tries to do that but it doesn't always unblock it.

My contention (and I could be wrong here) is that it's sufficient to keep state at the RenderView level which pauses virtual time if we're expecting a history navigation but have't seen one start yet.  Does that make sense? 
Project Member

Comment 10 by bugdroid1@chromium.org, Nov 24 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/bf3dceb7cbd46c62d31715f9f7ef28f4b4e20ebf

commit bf3dceb7cbd46c62d31715f9f7ef28f4b4e20ebf
Author: Alex Clarke <alexclarke@chromium.org>
Date: Fri Nov 24 12:27:43 2017

Fix a bug with history navigation and virtual time

Its possible for the main frame to have several child frames with navigation
history.  If history.back() is called those child frame will navigate but
the main frame wont.  This is a problem for virtual time because the main
frame's LocalFrameClientImpl::NavigateBackForward will (correctly) block
virtual time but only the child frames will recieve calls to
LocalFrameClientImpl::DispatchDidCommitLoad.

This patch moves tracking of pending history navigations to RenderViewImpl
which solves the problem.

Bug:  765366 ,  777763 
Change-Id: I8f97cfec5164541948910d92b7485d591c91d9d8
Reviewed-on: https://chromium-review.googlesource.com/787671
Reviewed-by: Jochen Eisinger <jochen@chromium.org>
Reviewed-by: Sami Kyöstilä <skyostil@chromium.org>
Commit-Queue: Alex Clarke <alexclarke@chromium.org>
Cr-Commit-Position: refs/heads/master@{#519090}
[modify] https://crrev.com/bf3dceb7cbd46c62d31715f9f7ef28f4b4e20ebf/content/renderer/render_frame_impl.cc
[modify] https://crrev.com/bf3dceb7cbd46c62d31715f9f7ef28f4b4e20ebf/content/renderer/render_view_impl.cc
[modify] https://crrev.com/bf3dceb7cbd46c62d31715f9f7ef28f4b4e20ebf/content/renderer/render_view_impl.h
[modify] https://crrev.com/bf3dceb7cbd46c62d31715f9f7ef28f4b4e20ebf/headless/lib/virtual_time_browsertest.cc
[modify] https://crrev.com/bf3dceb7cbd46c62d31715f9f7ef28f4b4e20ebf/third_party/WebKit/Source/core/exported/LocalFrameClientImpl.cpp

Status: Fixed (was: Available)
I am not quite sure about this, but I assume that the fix from #c10 doesn't yet address the OOPIF issues referred to in #c7.  Is that assumption correct?  Do we want to open a new bug to track OOPIF-vs-HeadlessVirtualTime issue?
A new bug wouldn't hurt but frankly I've no idea how to make OOPIF work with VirtualTime except via something expensive such as allowing it to advance 1ms at a time in lockstep.  I trust we'll be able to turn it off via some flag.
RE: A new bug wouldn't hurt [...]

I've opened issue 788850 :-)

Sign in to add a comment