FrameOwner is left without |contentFrame| after two cross-process navigations in a row |
|||||||
Issue descriptionSteps to reproduce: 1.) Assume the familiar cross-site redirection logic and test data from browsertests. 2.) Navigate to `http://a.com/frame_tree/page_with_one_frame.html`: <html> <head></head> <body> This page has one cross-site iframe. <iframe id="child0" src="/cross-site/baz.com/title1.html"></iframe> </body> </html> 3.) Navigate the subframe from JS: document.getElementById("child0").src = "http://a.com/cross-site/baz.com/title2.html"; 4.) Evaluate: document.getElementById("child0").contentWindow; Expectation: The navigation succeeds and `contentWindow` is non-null. Reality: -- With both `--enable-browser-side-navigation` + `--site-per-process` enabled, the navigation succeeds but `contentWindow` becomes null. The HTMLFrameOwnerElement has no |contentFrame|. It seems that it is deleted in response to the renderer receiving a FrameMsg_Delete message (and nothing else afterwards). -- With only `--site-per-process`, the navigation takes place, but the loading never terminates on the browser side. See repro: https://codereview.chromium.org/2635523002/
,
Jan 13 2017
Adding dcheng@, as this might be Blink side issue.
,
Jan 13 2017
This probably has something to do with the fact that you are using GetURL("a.com", "/cross-site/baz.com/title2.html") for allowed_subframe_url. Using just GetURL("baz.com", "/title2.html") works fine in both modes, so I suspect this has something to do with the transfer logic not handling this case properly.
What's happening here is the subframe is at baz.com, then we navigate that to a.com, which transfers back to baz.com. With only --site-per-process, it appears we get a DidCommit message for the redirected URL (baz.com/title2.html), but no DidStopLoading.
,
Jan 16 2017
So did some investigating. Considerd the starting scenario: main frame: a.com child0: b.com Trigger navigation in `child0`: a.com --redirect--> c.com: 1.) The speculative RFH for the navigation in `child0` is created for site instance `a.com`. 2.) On the renderer side, this corresponds to creating a provisional WebLocalFrame in the parent's process. -- This is initially created with a DummyOwner. -- But soon afterwards, the provisional WebLocalFrame's frame owner is set to the real HTMLIFrameElement (whose contentFrame is the real `child0` showing b.com). 3.) After the redirect, it is discovered that the speculative RFH is no good, and is discarded. -- FrameMsg_Delete sent to the provisional WebLocalFrame. -- Frame::detach() -> Frame::disconnectOwnerElement() -> FrameOwner::clearContentFrame() on the real owner. :( 4.) Ultimately, this turns out to be a remote -> remote subframe navigation, so no further administration is needed or done on frames/proxies on the renderer side (I think).
,
Jan 16 2017
That's unfortunate... I guess we can make clearContentFrame() become a no-op if it doesn't point back at the frame that called it, but that becomes pretty hacky. clamy/nasko: any estimate of when we'll be able to get rid of provisional frames? =P
,
Jan 16 2017
,
Jan 17 2017
OK I've prepared a patch for the Blink-side issue, but it looks like it's still broken without --enable-browser-side-navigation... hmm...
,
Jan 17 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/0401e95afc66f0b87733d04fad5d233c04d35aa3 commit 0401e95afc66f0b87733d04fad5d233c04d35aa3 Author: dcheng <dcheng@chromium.org> Date: Tue Jan 17 11:07:01 2017 Avoid mutating frame owner's when detaching a provisional frame. When detaching a provisional frame, Blink was incorrectly clearing the content frame of the provisional frame's frame owner. A provisional frame is only partially attached to a frame tree. While it may have a frame owner set, the frame owner's content frame will not point back at the provisional frame. Similarly, though it may have a parent frame, the provisional frame will not appear in the parent frame's list of child nodes. Thus, it is important not to affect the frame tree structure when detaching a provisional frame. BUG= 681077 ,578349 Review-Url: https://codereview.chromium.org/2631233003 Cr-Commit-Position: refs/heads/master@{#444021} [modify] https://crrev.com/0401e95afc66f0b87733d04fad5d233c04d35aa3/third_party/WebKit/Source/core/frame/Frame.cpp [modify] https://crrev.com/0401e95afc66f0b87733d04fad5d233c04d35aa3/third_party/WebKit/Source/core/frame/FrameOwner.h [modify] https://crrev.com/0401e95afc66f0b87733d04fad5d233c04d35aa3/third_party/WebKit/Source/core/html/HTMLFrameElementBase.h [modify] https://crrev.com/0401e95afc66f0b87733d04fad5d233c04d35aa3/third_party/WebKit/Source/core/html/HTMLFrameOwnerElement.h [modify] https://crrev.com/0401e95afc66f0b87733d04fad5d233c04d35aa3/third_party/WebKit/Source/core/html/HTMLIFrameElement.h [modify] https://crrev.com/0401e95afc66f0b87733d04fad5d233c04d35aa3/third_party/WebKit/Source/web/RemoteFrameOwner.h [modify] https://crrev.com/0401e95afc66f0b87733d04fad5d233c04d35aa3/third_party/WebKit/Source/web/tests/FrameTestHelpers.cpp [modify] https://crrev.com/0401e95afc66f0b87733d04fad5d233c04d35aa3/third_party/WebKit/Source/web/tests/FrameTestHelpers.h [modify] https://crrev.com/0401e95afc66f0b87733d04fad5d233c04d35aa3/third_party/WebKit/Source/web/tests/WebFrameTest.cpp [modify] https://crrev.com/0401e95afc66f0b87733d04fad5d233c04d35aa3/third_party/WebKit/public/web/WebLocalFrame.h
,
Jan 18 2017
Following up on #8: I looked into the browser-side issues for this with --site-per-process but without --enable-browser-side-navigation a bit more, and found a couple of interesting things, which help explain why this still doesn't work after Daniel's fix. The setup is: we've got a a current RFH for baz.com in the subframe, let's say RFH1. We navigate that to a.com/cross-site/baz.com/title2.html. This creates a pending RFH, say RFH2, for a.com, which transfers back to baz.com. 1) During the transfer, we end up picking the RFH1 as the new navigation target. That happens as the first thing in UpdateStateForNavigate: if (!frame_tree_node_->IsMainFrame() && !CanSubframeSwapProcess(dest_url, source_instance, dest_instance)) { // Note: Do not add code here to determine whether the subframe should swap // or not. Add it to CanSubframeSwapProcess instead. return render_frame_host_.get(); } This (correctly) returns RFH1, but it seems to skips some crucial steps just below, namely, calling Transfer() on the transfer_navigation_handle_ (we're in the middle of transfer here after all), and canceling the pending RFH. Not canceling the pending RFH possibly leads to leaking RFH2, AFAICT. Skipping the Transfer() call is more interesting, as it later leads to RFH1's transferred baz.com request to be dropped in ResourceDispatcherHostImpl::CompleteTransfer, here: if (it == pending_loaders_.end()) { // Renderer sent transferred_request_request_id and/or // transferred_request_child_id that doesn't have a corresponding entry on // the browser side. // TODO(lukasza): https://crbug.com/659613: Need to understand the scenario // that can lead here (and then attempt to reintroduce a renderer kill // below). return; } So this could be one explanation for why we saw the RDH_TRANSFERRING_REQUEST_NOT_FOUND kill in issues 659613 and 660407. CC-ing Charlie and Lukasz for that. I've verified that doing the Transfer() work and canceling the pending RFH for the !CanSubframeSwapProcess case in UpdateStateForNavigate seems to fix all this, and makes Balasz's test pass with --site-per-process. I'll work on a CL for this.
,
Jan 18 2017
This also explains why the `contentWindow` issue did not manifest without PlzNavigate. Based on my observations, RFH2 (which was a provisional frame in the parent's SiteInstance) was so much not canceled that it ended up loading the document, and it was RFH1 that got swapped out. So we essentially transferred back to the parent frame, meaning that RFH2 was never deleted, thus the issue with deleting a provisional frame was masked. After patching in https://codereview.chromium.org/2636193003/, I can confirm that now we correctly discard RFH2, and complete the navigation in RFH1. Blink side frame structure looks good too. Something still seems to be off with compositing/rendering, though. The frame is not interactive after the navigation, and keeps displaying the old content. Who would be the best person to look into that aspect?
,
Jan 27 2017
I've just filed issue 686184 for the remaining work here to fix the transfers and look at any compositing/rendering issues. With that, I think we can close this one as fixed.
,
Mar 1 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/0431b6bc6c9bf5624cccf1121285392b07b74ec6 commit 0431b6bc6c9bf5624cccf1121285392b07b74ec6 Author: alexmos <alexmos@chromium.org> Date: Wed Mar 01 02:52:43 2017 Fix cross-site subframe navigations that transfer back to original RFH. This fixes a bug where if an OOPIF at site A navigates to site B which redirects back to A, the navigation timed out because Transfer() was never called on transfer_navigation_handle_, and the pending RFH for B was never canceled. This is because UpdateStateForNavigate returned the current RFH early, due to checking CanSubframeSwapProcess before reaching those steps. Not calling Transfer() later led to the transferred request being dropped in ResourceDispatcherHostImpl::CompleteTransfer, due to not finding a matching entry in pending_loaders_. This CL fixes the CanSubframeSwapProcess check to be performed after calling Transfer() and canceling the pending RFH (if necessary). This uncovered another complication, which is that while redirects to data: and about: URLs are explicitly disallowed as unsafe (e.g., see DataProtocolHandler::IsSafeRedirectTarget), they are allowed when an extension uses the webRequest API to redirect certain URLs. One of the tests, ExtensionWebRequestApiTest.WebRequestDeclarative1, was exercising this: it had a subframe which made a same-site request that was redirected to data:,. Previously, in --site-per-process, it worked due to the ordering problem that this CL fixes: we first decided to transfer the request (IsRendererTransferNeededForNavigation returned true), then we returned the current RFH early due to CanSubframeSwapProcess returning |false| before calling Transfer(). With the fixed ordering, we were now also calling Transfer(), as we were expecting the data: request to be put into a new SiteInstance which didn't match the current one, yet the request was never transferred (we weren't spinning up a new pending RFH, and we weren't sending a Navigate IPC to the current RFH due to the is_transfer_to_same case in NavigateToEntry, so no one actually made a request to the data: URL). To address this, CanSubframeSwapProcess is changed to allow a process swap for such transfers to data: and about:. This changes behavior where these redirects were previously kept in the original SiteInstance, but this should be safer, and it should not break any scripting relationships (the redirect URLs are provided by an extension; the site that made the request that's being redirected can't expect to script the redirect target). BUG= 681077 ,660407,659613 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation Review-Url: https://codereview.chromium.org/2636193003 Cr-Commit-Position: refs/heads/master@{#453800} [modify] https://crrev.com/0431b6bc6c9bf5624cccf1121285392b07b74ec6/content/browser/frame_host/render_frame_host_manager.cc [modify] https://crrev.com/0431b6bc6c9bf5624cccf1121285392b07b74ec6/content/browser/frame_host/render_frame_host_manager.h [modify] https://crrev.com/0431b6bc6c9bf5624cccf1121285392b07b74ec6/content/browser/site_per_process_browsertest.cc |
|||||||
►
Sign in to add a comment |
|||||||
Comment 1 by engedy@chromium.org
, Jan 13 2017Summary: FrameOwner is left without |contentFrame| after two cross-process navigations in a row (was: FrameOwner is left without contentFrame after two cross-process navigations in a row)