New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 776884 link

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: Oct 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Windows , Mac
Pri: 1
Type: Bug-Regression



Sign in to add a comment

Behavior change in M63 for cross-origin main frame request interception

Project Member Reported by marshall@chromium.org, Oct 20 2017

Issue description

Chrome Version: Chromium master revision adb61db1 (#508578) (M63 branch point)
OS: Windows 10 64-bit

What steps will reproduce the problem?
(1) In the URLRequestInterceptor::MaybeInterceptRequest method call ResourceRequestInfo::GetRenderFrameForRequest.
(2) Navigate the main frame cross-origin.

What is the expected result?
GetRenderFrameForRequest should return the routing IDs for the RenderFrame that initiated the main frame cross-origin navigation.

What happens instead?
GetRenderFrameForRequest returns (-1, MSG_ROUTING_NONE).

I'm testing with a third-party application using the Content API but you should be able to reproduce the problem for debugging purposes by adding the GetRenderFrameForRequest call in a content/ or chrome/ location like AppCacheInterceptor::MaybeInterceptRequest.
 
Cc: creis@chromium.org jam@chromium.org nasko@chromium.org mmenke@chromium.org
Components: UI>Browser>Navigation Internals>Network
Labels: Proj-PlzNavigate
Owner: clamy@chromium.org
Status: Assigned (was: Untriaged)
Thanks for the report!  To repro this, I inserted the following into AppCacheInterceptor::MaybeInterceptRequest():

  int render_process_id = -1;
  int render_frame_id = MSG_ROUTING_NONE;
  bool result = ResourceRequestInfo::GetRenderFrameForRequest(
      request, &render_process_id, &render_frame_id);
  LOG(ERROR) << "MaybeInterceptRequest for " << request->url();
  LOG(ERROR) << "  result = " << result;
  LOG(ERROR) << "  process id = " << render_process_id;
  LOG(ERROR) << "  routing id = " << render_frame_id;

Running chrome and navigating to, say, https://csreis.github.io/tests/ gives me a trace like this:

[86615:86645:1020/145950.521691:ERROR:appcache_interceptor.cc(147)] MaybeInterceptRequest for https://csreis.github.io/tests/
[86615:86645:1020/145950.521726:ERROR:appcache_interceptor.cc(148)]   result = 1
[86615:86645:1020/145950.521737:ERROR:appcache_interceptor.cc(149)]   process id = -1
[86615:86645:1020/145950.521747:ERROR:appcache_interceptor.cc(150)]   routing id = -1
[86615:86645:1020/145950.541122:ERROR:appcache_interceptor.cc(147)] MaybeInterceptRequest for blob:https://csreis.github.io/838229c5-bbcd-4edd-869f-501f89443233
[86615:86645:1020/145950.541151:ERROR:appcache_interceptor.cc(148)]   result = 1
[86615:86645:1020/145950.541160:ERROR:appcache_interceptor.cc(149)]   process id = 3
[86615:86645:1020/145950.541168:ERROR:appcache_interceptor.cc(150)]   routing id = 2
[86615:86645:1020/145950.550836:ERROR:appcache_interceptor.cc(147)] MaybeInterceptRequest for https://csreis.github.io/tests/tests.css
[86615:86645:1020/145950.550864:ERROR:appcache_interceptor.cc(148)]   result = 1
[86615:86645:1020/145950.550876:ERROR:appcache_interceptor.cc(149)]   process id = 3
[86615:86645:1020/145950.550887:ERROR:appcache_interceptor.cc(150)]   routing id = 2
[86615:86645:1020/145950.717631:ERROR:appcache_interceptor.cc(147)] MaybeInterceptRequest for https://csreis.github.io/favicon.ico
[86615:86645:1020/145950.717666:ERROR:appcache_interceptor.cc(148)]   result = 1
[86615:86645:1020/145950.717676:ERROR:appcache_interceptor.cc(149)]   process id = 3
[86615:86645:1020/145950.717690:ERROR:appcache_interceptor.cc(150)]   routing id = 2

So yes, the process and routing IDs for the first MaybeInterceptRequest aren't valid (the routing ID is -1 though, and not MSG_ROUTING_NONE which is -2).  The subsequent request looks like the PlzNavigate blob URL for the response stream, and that seems to have the valid IDs, as do the subsequent subresource requests.  Interestingly, this doesn't appear to be just cross-origin: clicking on the same-site "simple.html" link, I also get:

[86615:86645:1020/150203.948652:ERROR:appcache_interceptor.cc(147)] MaybeInterceptRequest for https://csreis.github.io/tests/simple.html
[86615:86645:1020/150203.948693:ERROR:appcache_interceptor.cc(148)]   result = 1
[86615:86645:1020/150203.948706:ERROR:appcache_interceptor.cc(149)]   process id = -1
[86615:86645:1020/150203.948717:ERROR:appcache_interceptor.cc(150)]   routing id = -1
[86615:86645:1020/150203.993960:ERROR:appcache_interceptor.cc(147)] MaybeInterceptRequest for blob:https://csreis.github.io/7b876e37-aa1e-4965-be80-956ec6fce8b0
[86615:86645:1020/150203.993995:ERROR:appcache_interceptor.cc(148)]   result = 1
[86615:86645:1020/150203.994006:ERROR:appcache_interceptor.cc(149)]   process id = 3
[86615:86645:1020/150203.994016:ERROR:appcache_interceptor.cc(150)]   routing id = 2

I've tried reverting my https://crrev.com/27caae83 always-swap-with-a-proxy CL locally, and still get this behavior, so this does not appear to be related to that CL.

However, disabling PlzNavigate via --disable-browser-side-navigation, I get the following:

[87053:87083:1020/150339.708765:ERROR:appcache_interceptor.cc(147)] MaybeInterceptRequest for http://csreis.github.io/tests/
[87053:87083:1020/150339.708809:ERROR:appcache_interceptor.cc(148)]   result = 1
[87053:87083:1020/150339.708826:ERROR:appcache_interceptor.cc(149)]   process id = 3
[87053:87083:1020/150339.708842:ERROR:appcache_interceptor.cc(150)]   routing id = 2
[87053:87083:1020/150339.762858:ERROR:appcache_interceptor.cc(147)] MaybeInterceptRequest for http://csreis.github.io/tests/tests.css
[87053:87083:1020/150339.762904:ERROR:appcache_interceptor.cc(148)]   result = 1
[87053:87083:1020/150339.762919:ERROR:appcache_interceptor.cc(149)]   process id = 3
[87053:87083:1020/150339.762932:ERROR:appcache_interceptor.cc(150)]   routing id = 2
[87053:87083:1020/150339.921384:ERROR:appcache_interceptor.cc(147)] MaybeInterceptRequest for http://csreis.github.io/favicon.ico
[87053:87083:1020/150339.921423:ERROR:appcache_interceptor.cc(148)]   result = 1
[87053:87083:1020/150339.921434:ERROR:appcache_interceptor.cc(149)]   process id = 3
[87053:87083:1020/150339.921444:ERROR:appcache_interceptor.cc(150)]   routing id = 2

where the process/routing IDs in the first request are now correct.  marshall@: can you confirm whether running with --disable-browser-side-navigation fixes the problem you're having?

I don't know these APIs too well, and in particular don't know whether their semantics are expected to change with PlzNavigate.  It does appear that you can get the IDs you want if you watch the second blob: request.  clamy@: would you mind triaging this from the PlzNavigate side, to see if there's an issue here?

Comment 2 by mmenke@chromium.org, Oct 20 2017

I'm not navigation expert, but this is likely expected - with PlzNavigate, there is no RenderFrame until commit.  We don't even know what process it will end up in.
@comment#1: I can confirm that --disable-browser-side-navigation fixes the problem for me.
@comment#2: In my application I associate RenderFrames with an application-specific object (like the Browser object in Chrome) by tracking calls to WebContentsObserver::RenderFrameCreated/RenderFrameDeleted and storing the associated routing IDs. It would be sufficient for my use case if the URLRequest was associated with the currently-existing RenderFrame.

Comment 5 by mmenke@chromium.org, Oct 20 2017

So...Erm...URLRequests will eventually be in another process (Which is likely to give you as much fun as it's giving us getting to that point), but until then, ResourceRequestInfo has a FrameTreeNodeIdGetter, which I believe you have even when you don't yet have a destination RenderFrame.

Comment 6 by nasko@chromium.org, Oct 20 2017

Is there any reason why you aren't also listening to WebContentsObserver::RenderFrameHostChanged? For cross-process navigations this method is called to inform listeners that the underlying RenderFrame has been changed. I would strongly suggest looking at whether this callback will help solve your scenario. We are moving towards more and more RenderFrameHost changes in the future on bigger percentage of navigations (potentially even if they are in the same process).

It very much depends on what the goal of this code is, but I would be very weary of using FrameTreeNodeIdGetter mixed with routing ids.
Do you mean that URLRequests will no longer be in the main process? That will indeed lead to exciting times :)

I'm open to any and all suggestions on the best (and more forward-compatible) way to reliably map URLRequests (which live on the IO thread) to RenderFrameHosts (which live on the UI thread). I can then retrieve the associated WebContents using WebContents::FromRenderFrameHost. My top-level object is attached to the WebContents via UserData.

My plan for OOPIFs is to expose an API like "DoSomething(frame_id, ...)" where |frame_id| is mapped by my top-level object to a particular RenderFrameHost.

Comment 8 by mmenke@chromium.org, Oct 21 2017

marshall:  Indeed, URLRequests will be in the network process instead.  You can play with it now by using "--enable-features=NetworkService" and watch things fall over.

Anyhow, I'll defer to creis, nasko, and kinuko on how to do whatever navigation-related thing you're doing.

Comment 9 by clamy@chromium.org, Oct 23 2017

If you are listening to WebContents notification, WebContentsObserver::ReadyToCommitNavigation gives you a NavigationHandle. You can get the GlobalRequestID of the URLRequest that was used for navigation from the NavigationHandle. This should allow you to match the request with the navigation.
Thanks for the good feedback so far. I think the picture is becoming clearer for me :)

Taking a step back, I need to locate the associated WebContents for two use cases that run on the IO thread:

(A) Methods that receive a URLRequest argument (URLRequestInterceptor, NetworkDelegateImpl, ResourceDispatcherHostDelegate, etc).

(B) QuotaPermissionContext::RequestQuotaPermission (and maybe other methods?) which get a (render_process_id, render_frame_id) pair.

@clamy/@nasko: A few questions:

1. Are there cases where a RenderFrame is not associated with a WebContents but has an in-flight URLRequest?

2. Is it possible to associate URLRequests and WebContents without the intermediate step of tracking RenderFrame lifespan? (This seems to be the suggestion of comment#9 if I'm understanding correctly.)

3. If I wish to reliably associate all RenderFrames with a WebContents which callbacks should I be using?
 There are currently a number of WebContentsObserver callbacks related to RenderFrame life span: RenderFrameCreated, RenderFrameDeleted, RenderFrameHostChanged, FrameDeleted. Previously, I was using RenderFrameCreated/RenderFrameDeleted but that doesn't work well with swapped-out RenderFrames.
+nick@ since the question which WCO callbacks one should listen to has recently came up in a Task Manager CL - task_manager::WebContentsEntry used to only override RenderFrameHostChanged and RenderFrameDeleted, but this had to be fixed in r472325 to properly detect frames created when recovering from a sad frame.
Cc: nick@chromium.org
+nick@ for real this time...

Comment 13 by clamy@chromium.org, Oct 24 2017

So one thing to note: it's not possible to access a WebContents from the IO thread. It's a purely UI thread class. It doesn't have IDs either. So if you want to interact with it in any way, you have to be on the UI thread.

For your questions:
1) No it's not possible.
2) Yes. URLRequets's RequestInfo has a WebContentsGetter. It's a callback that you can run on the UI thread and that will return the WebContents associated with the request.
3) If you wish to access the WebContents from the RenderFrameHost, use WebContents::FromRenderFrameHost. If you want to go in the other direction, you can access the current main frame easily, but accessing other frames is harder.
@clamy: Thank you for the responses. In order to address the "not possible to access a WebContents from the IO thread" problem I currently store a thread-safe mapping of (render_process_id, render_frame_id) pairs to my object. There's a 1:1 relationship between my object and the WebContents (my object owns the WebContents and only interacts with it on the UI thread).

My assumptions up to this point have been that:

(a) (render_process_id, render_frame_id) pairs uniquely identify a single RenderFrame for the lifespan of the main process, and;
(b) a RenderFrame will only be associated with a single WebContents during its lifespan, and;
(c) a RenderFrame will have an assigned (render_process_id, render_frame_id) pair before it has any in-flight URLRequests.

It sounds like (c) is now false, which means that I need to associate early URLRequests directly with the WebContents instead of the RenderFrame.

For a callback like:

net::URLRequestJob* MaybeInterceptRequest(
    net::URLRequest* request,
    net::NetworkDelegate* network_delegate) const;

How can I retrieve the GlobalRequestID associated with |request|?

For URLRequest-related callbacks that come after the RenderFrame is assigned routing IDs I would like to map them to my thread-safe RenderFrame representation using the routing ID pair. For my mapping approach to work I need to add/remove entries from the map at the appropriate points in RenderFrame lifespan. Specifically, I need to add the entry ASAP after the RenderFrame is assigned routing IDs and remove the entry after the last URLRequest has completed/canceled for the RenderFrame.

What WebContentsObserver callbacks should I use to properly manage this map?
To answer one of my own questions:

> How can I retrieve the GlobalRequestID associated with |request|?

ResourceRequestInfo has a GetGlobalRequestID() method. So based on prior feedback I should be able to use WebContentsObserver::ReadyToCommitNavigation to track the association between GlobalRequest and WebContents.

Comment 16 by clamy@chromium.org, Oct 25 2017

No, (c) is still true. However there are requests (navigation requests) which are not associated with a RenderFrame since they did not originate in a RenderFrame but in the browser process directly.

I'm a bit worried by the thread-safe mapping of RFH/RPH ids on the IO thread. We have avoided doing that, because it would be likely to get out of date and be wrong. I suspect this would only get worse as we make net into a service. Or by thread-safe, do you mean an object that is shared on the UI and the IO thread, and whose access is synchronized?
> Or by thread-safe, do you mean an object that is shared on the UI and the IO thread, and whose access is synchronized?

Correct. The code that maintains the mapping of routing ID to my object uses base::Lock to synchronize access, and my object itself is base::RefCountedThreadSafe.

Perhaps I'm going about this the wrong way. Maybe I should instead be associating my object with the URLRequest via NavigationHandle and eventually SetUserData. My object could then be associated with the request in WebContentsObserver::ReadyToCommitNavigation (on the UI thread) and remain associated for the lifespan of the URLRequest object. I'm not sure yet what the plumbing for that would look like or if the resulting changes would be useful/acceptable to the larger Chromium community.
> What WebContentsObserver callbacks should I use to properly manage this map?

Based on prior feedback and a careful reading of the WebContentsObserver comments it sounds like I should use RenderFrameCreated and RenderFrameHostChanged for RFH creation and FrameDeleted for RFH deletion. I'll go with that and report back if it doesn't work.
@clamy: Thanks for your help so far. I'm moving on to a slightly different topic with this next question:

My application currently creates an object in the renderer process that gets associated with a RenderView. This object stores state in the renderer process that is associated with a WebContents in the main process. I rely on the relationship between RenderFrames and RenderViews to associate RenderFrames with my object (e.g. main frame and sub-frame RenderFrames in the same process and same frame tree return the same RenderView).

I'm assuming that in the future there will still be instances where the RenderFrame trees for multiple WebContents coexist in the same renderer process. For example, if two WebContents navigate to google.com then perhaps the RenderFrame trees for both WebContents will exist in the same renderer process. Without RenderViews how should I track the association between RenderFrames so that I can map all RenderFrames in the same frame tree to my object instance?

One idea is that I could walk the frame tree via blink::WebFrame::Top() to identify the top-level (main) WebFrame, retrieve the associated main RenderFrame via RenderFrame::FromWebFrame(), and use that main RenderFrame as a direct substitution for the RenderView in my current logic. Is it safe to assume that the main RenderFrame  will be the same object/pointer for all RenderFrames from the same frame tree in a given process?
@comment#15: It looks like the GlobalRequestID approach to mapping URLRequests prior to RenderFrameHost creation won't work. The GlobalRequestID is not available from NavigationHandle until after the RenderFrameHost is created (NavigationHandleImpl::WillProcessResponse is called with both the RFH and the GlobalRequestID). I'll try to instead make the association between my object and the URLRequest from WebContentsObserver::DidStartNavigation using the approach proposed in comment#17.

Comment 21 by clamy@chromium.org, Oct 26 2017

I'm sorry, I'm having a bit of trouble following what you're trying to do. From what I understand, it's quite likely that you will need to rethink your approach as it won't work with the architectural changes that we're doing in Chrome. Could you explain to us what you're trying to achieve and what are your requirements? This would help us in guiding you towards a solution that work in the long term.
Status: WontFix (was: Assigned)
Going to go ahead and close this - not to discourage further conversation, but because things are working as intended, and this is not a bug, or a publicly supported API.
Cc: sgu...@chromium.org
@clamy: Maybe it would help to consider a concrete example.

Android WebView has the AwRequestInterceptor::MaybeInterceptRequest method [1] with logic similar to what I'm trying to accomplish in my application. For the top-level frame navigation the GetCorrespondingIoThreadClient call doesn't find RFH routing IDs via ResourceRequestInfo::GetRenderFrameForRequest so instead falls back to looking up the FrameTreeNodeId via AwContentsIoThreadClient::FromID. Failure to find the AwContentsIoThreadClient will likely result in the request failing because it will go over the network stack instead of being handled by AndroidStreamReaderURLRequestJob in some application-specific manner (for example, being read from an application bundle on disk).

How should Android WebView re-implement this functionality given the upcoming architectural changes?

(Adding sgurun@ to CC as the owner of this WebView code.)

[1] https://cs.chromium.org/chromium/src/android_webview/browser/net/aw_request_interceptor.cc?type=cs&q=AwRequestInterceptor::MaybeInterceptRequest&sq=package:chromium&l=148
@comment#23: I should also mention that Android WebView is using the RenderFrameCreated/RenderFrameDeleted callbacks to maintain the FrameTreeNodeId map: https://cs.chromium.org/chromium/src/android_webview/browser/aw_contents_io_thread_client.cc?type=cs&q=ClientMapEntryUpdater::RenderFrameCreated&sq=package:chromium&l=174

Comment 25 by nasko@chromium.org, Oct 26 2017

Cc: boliu@chromium.org
Adding boliu@, who has looked at WebView and PlzNavigate.
Looks like falling back to using FrameTreeNodeId in webview was precisely to address this problem, when there is no render id associated associated with a arequest. See the CL that added this: https://codereview.chromium.org/2737663004

That seem to always work based on some local testing.

At this point, I'm curious if using FrameTreeNodeId only for this look up is good enough..
Testing with master revision adb61db1 (#508578) it appears that RenderFrameCreated/RenderFrameHostChanged are not called for popup browsers created via window.open(...). It's necessary in this case to also listen for the RenderViewCreated notification in order to maintain the FrameTreeNodeId map for the popup's MainFrame.

Comment 28 by boliu@chromium.org, Nov 28 2017

I'd expect popup to be a new WebContents, so you'd need to register Observer on the new WebContents when it's created.

Or rather, you probably want to watch for new pop up WebContents with WebContentsDelegate::AddNewContents, and just check the current render frames and whatnot in addition to registering another WebContentsObserver.

Hmm... I'm not sure if webview actually does this correctly either.

Sign in to add a comment