Remove use of ChildProcessSecurityPolicyImpl from ResourceDispatcherHostImpl
Issue description

Looking into how to address ShouldServiceRequest() in RDHI. https://cs.chromium.org/chromium/src/content/browser/loader/resource_dispatcher_host_impl.cc?type=cs&q=shouldservicerequest+resourcedisp&sq=package:chromium&l=295

https://codereview.chromium.org/2098923002/ and https://codereview.chromium.org/2108643004/ started toward helping with this with a delegate from c/b/loader to content. This allows some notifications to work, and it's designed as an async interface so that we can change it to a mojo interface in the future. However, ShouldServiceRequest() needs to decide synchronously whether to continue the request in ResourceDispatcherHostImpl::BeginRequest().

1. One approach would be to break up BeginRequest() into before/after the decision. It would Post to the delegate, and the delegate would Post back once it has decided. This would be relatively simple, but it seems likely to be too expensive for something like BeginRequest() (which I assume is on the critical path for loads).

2. Another approach might be to push some data from the delegate into the loader (i.e. cache a bit of information on the other side of the future process boundary), so that the loader can decide itself, synchronously. The methods that ShouldServiceRequest() needs to call on ChildProcessSecurityPolicy are CanRequestURL(), CanCommitURL(), CanReadFile(), and CanReadFileSystemFile(). Looking at ChildProcessSecurityPolicyImpl::CanRequestURL, this seems non-trivial, partly because it delegates synchronously back to GetContentClient()->browser()->IsHandledURL(). It's also not obviously structured as "a bunch of data" that could be cached.

3. A third approach would be to break ChildProcessSecurityPolicy into two pieces, one of which lives only in the browser and one that is implemented on the other side. Then both "sides" could have those functions compiled in, and a lot of the implementation could stay the same. There's still some data that would have to be passed over, in particular the data stored in ChildProcessSecurityPolicyImpl::SecurityState (some maps and sets from a quick look).

I'm not sure what the best approach is. (Perhaps a combination of #2 and #3?) Any thoughts welcome.
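To make the constraint concrete, here's a minimal, self-contained sketch of the kind of synchronous gate described above. The names and signatures (SecurityPolicy, ResourceRequest, the scheme-based checks) are simplified placeholders, not the real Chromium ones; the point is only that BeginRequest() needs a yes/no answer inline, with no place to wait.

#include <set>
#include <string>
#include <utility>

// Stand-in for ChildProcessSecurityPolicy: per-child grants held in-process.
class SecurityPolicy {
 public:
  void GrantScheme(int child_id, const std::string& scheme) {
    granted_schemes_.insert({child_id, scheme});
  }
  void GrantReadFile(int child_id, const std::string& path) {
    readable_files_.insert({child_id, path});
  }
  bool CanRequestScheme(int child_id, const std::string& scheme) const {
    return granted_schemes_.count({child_id, scheme}) > 0;
  }
  bool CanReadFile(int child_id, const std::string& path) const {
    return readable_files_.count({child_id, path}) > 0;
  }

 private:
  std::set<std::pair<int, std::string>> granted_schemes_;
  std::set<std::pair<int, std::string>> readable_files_;
};

struct ResourceRequest {
  std::string scheme;     // Simplified: the real check looks at the full URL.
  std::string file_path;  // Non-empty for requests that touch local files.
};

// The shape of the problem: BeginRequest() needs this answer synchronously,
// so once the policy lives in another process it must either block on a
// round trip (approach #1) or consult locally held state (approaches #2/#3).
bool ShouldServiceRequest(const SecurityPolicy& policy, int child_id,
                          const ResourceRequest& request) {
  if (!policy.CanRequestScheme(child_id, request.scheme))
    return false;
  if (!request.file_path.empty() &&
      !policy.CanReadFile(child_id, request.file_path))
    return false;
  return true;
}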
Jun 29 2016
I've been looking into this one too, so I've had some thoughts. Good thing you filed this! (I'll work on something else, but will put my thoughts here.)

Whatever solution we pick (in general, not just for this bug) we have to keep in mind:
a) the network service will eventually run in a separate process
b) we don't want to add any additional slowdowns (i.e. pausing a request while waiting for another process to tell us some information)

1) would mean pausing every request in a future network process while waiting for the delegate, which would be in the browser process. So that becomes a no-go for performance reasons.

2) wouldn't work because ChildProcessSecurityPolicy is used for many different tasks. Some of them are networking related (i.e. this bug), while others are not. So if the whole class is pushed to the network process, then it would contain non-networking-related methods which would have to be called by the browser process (or other processes), and that would slow them down.

So I think we should do 3.
Jun 29 2016
I thought that the order of mojo messages wasn't guaranteed, so if we had to pass mojo messages over to a ChildProcessSecurityPolicy implementation in the network process, there's no way to guarantee it's been updated before the network handles requests that use a new process. Also, what is the plan for what the RDH will know about, in a mojo world, in terms of process IDs, renderer IDs, etc.?
Jun 29 2016
@mmenke: regarding the order of mojo messages not being guaranteed, that's currently the case for legacy IPC and mojo between the renderer process and the browser UI thread. That's being fixed. However, that's a bit orthogonal to this issue, since:
a) while we just have an async C++ interface from content/browser to content/browser/loader, there'll be no mojo involved
b) in the future, when the network service is running in a separate process and the interface is a mojo interface, there are different pipes from renderer to network process vs. browser to network, so by definition there's no order. When designing that interface we will have to ensure that changes to the loader code's ChildProcessSecurityPolicy are done in a way that they're processed before requests from the network process that depend on them. I don't see any hard stumbling blocks there, i.e. we need careful planning, but IMO let's not get bogged down worrying about that problem now.

Regarding process IDs, RDH uses these extensively so I don't see a reason to remove that. Once legacy Chrome IPCs are completely removed from the codebase, we shouldn't need routing IDs and can probably just use frame tree node IDs, but that's orthogonal to this refactoring.
Jun 29 2016
The ResourceThrottle approach sounds nice in that it would be great to generalize/simplify like that. On the other hand, we don't have a solution for throttles yet afaik, so for now that only defers finding a solution. I guess the drawback of that approach is that we'd cause a suspend on every BeginRequest() -- is that what you were saying, jam?

Re: #2 (pushing some data over to the network-service side), I was thinking we'd have to have the data on *both* sides, with one just being a cache of the other. So I guess #2 and #3 are basically the same -- we're going to have to split CPSP into two chunks of code (one chunk will be network+browser and the other will be browser-only) and both sides will need a copy of some data (i.e. some of this stuff: https://cs.chromium.org/chromium/src/content/browser/child_process_security_policy_impl.cc?rcl=0&l=271 ).

(Going to deassign myself for now as I'm likely not going to be able to get this done before I'm out for vacation. Feel free to continue this one, jam... or alternatively ignore it and I'll pick it up when I'm back. :)
Jun 30 2016
Yep, using a ResourceThrottle means that we would have to pause every request to do an IPC to the browser process, so I think we want to avoid that.
Jul 1 2016
I tried this a couple of ways, but I didn't get too far, as it seemed to get out of hand. So I'm thinking, as a first step, of turning ChildProcessSecurityPolicyImpl into a facade that forwards to an implementation class in a location that browser and loader share, and then trying to figure out how to break up the browser-only and browser+loader code in further, smaller steps (hopefully more like one method at a time). I started that here: https://codereview.chromium.org/2111343002 (I'm not sure what to call that shared location; it's just in c/b/shared right now, but that's just a placeholder.)
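As a rough illustration of that facade step -- with placeholder method names and a toy data model rather than the real CPSP API -- the shape would be something like:

#include <memory>
#include <set>
#include <string>
#include <utility>

// Would live in the shared browser+loader location (c/b/shared is just the
// placeholder name from the CL above).
class SharedSecurityPolicy {
 public:
  void GrantRequestScheme(int child_id, const std::string& scheme) {
    grants_.insert({child_id, scheme});
  }
  bool CanRequestScheme(int child_id, const std::string& scheme) const {
    return grants_.count({child_id, scheme}) > 0;
  }
  // ...remaining methods would migrate here one at a time.

 private:
  std::set<std::pair<int, std::string>> grants_;
};

// Browser-side facade: keeps the existing public surface, forwards everything.
class ChildProcessSecurityPolicyImpl {
 public:
  void GrantRequestScheme(int child_id, const std::string& scheme) {
    shared_->GrantRequestScheme(child_id, scheme);
  }
  bool CanRequestScheme(int child_id, const std::string& scheme) const {
    return shared_->CanRequestScheme(child_id, scheme);
  }

 private:
  std::unique_ptr<SharedSecurityPolicy> shared_ =
      std::make_unique<SharedSecurityPolicy>();
};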
Jul 1 2016
I also outlined the parts of ChildProcessSecurityPolicyImpl that ResourceDispatcherHostImpl needs: https://docs.google.com/document/d/1Xy-4uB0aIaroJ7Eu0HxUh_vXjGVDhAjKLLfNdIzesJ8/edit?usp=sharing . It's not huge, but it is non-trivial code and data, so leaning towards shared code seems necessary.
Jul 1 2016
I looked through usage a bit more carefully, and realized that both "sides" need substantially all of the data stored by both ChildProcessSecurityPolicyImpl and ChildProcessSecurityPolicyImpl::SecurityState. So wrapping as in #7 probably isn't too useful.

The most useful observation I have at this point is that the loader side is (almost) read-only, which makes it almost plausible for it to hold a read-only cache of the data. The only write to the data is via ResourceDispatcherHostImpl::RegisterDownloadedTempFile -> RemoveDownloadFileFromChildSecurityPolicy -> RevokeAllPermissionsForFile. I don't understand the intricate details of the edge case described in that function well enough to know whether it would be OK for that permission to be revoked asynchronously, i.e. have the revoke (data write) occur in the browser in response to a message from the loader, followed by propagating the read-only data back to the loader containing that change.

If it is, then the other question is whether it's generally OK to have the loader's view of the data be a cache that's pushed from the browser's source of truth (i.e. not synchronously updated). I don't really know if it is, but it seems like that boils down to a message-ordering question, i.e. can the "cache update" messages to the loader be put in the same queue as all other requests to the loader? I'm not sure. I guess the next step is to audit all the calls that cause updates to the CPSPI data to try to get a handle on understanding ordering.

(If a read-only cache is doable, then I guess the approach would be to pull a "Data" interface out of CPSPI, have CPSPI's code built into both sides, and use a different "Data" implementation for the data store on either side.)
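A minimal sketch of what pulling a "Data" interface out might look like, assuming hypothetical names and a much-simplified picture of what SecurityState stores:

#include <map>
#include <set>
#include <string>
#include <utility>

// Roughly what SecurityState holds per child (maps and sets), much reduced.
struct SecurityStateSnapshot {
  std::set<std::string> granted_schemes;
  std::set<std::string> readable_files;
};

// The interface the shared checking code reads through.
class SecurityPolicyData {
 public:
  virtual ~SecurityPolicyData() = default;
  virtual const SecurityStateSnapshot* GetState(int child_id) const = 0;
};

// Loader-side implementation: a read-only cache that the browser's
// authoritative copy pushes updates into; the browser-side implementation
// would instead wrap the source-of-truth data directly.
class CachedSecurityPolicyData : public SecurityPolicyData {
 public:
  void ApplyUpdate(int child_id, SecurityStateSnapshot snapshot) {
    cache_[child_id] = std::move(snapshot);
  }
  const SecurityStateSnapshot* GetState(int child_id) const override {
    auto it = cache_.find(child_id);
    return it == cache_.end() ? nullptr : &it->second;
  }

 private:
  std::map<int, SecurityStateSnapshot> cache_;
};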
Mar 4 2017
I started looking at this again; unfortunately it hasn't gotten simpler. :) To summarize the history above, we need to make ChildProcessSecurityPolicyImpl available in both the browser and the loader. The browser is primarily responsible for modifying the data, but fundamentally it has to be available in two places. Here's a short outline of a plan for the first step of making progress on this: https://docs.google.com/a/chromium.org/document/d/1KUi4tDIBBzeo2gzt0fv4J0NaRGTb9YoV4YgbRgsOuHo/edit?usp=sharing .
Apr 3 2017
I talked to nick and creis on Friday (thanks!) about ChildProcessSecurityPolicy to get their perspective on how it might evolve. I just wanted to record what I remember about what we discussed (mostly for my benefit so I don't forget, and can hopefully stumble in a productive direction).

CPSP is a conglomeration of "kicking the can down the road". It's an unusual class for Chrome in that it's accessed from all over the place on various threads, and just has a big lock to make that work. But there's no real consistency of ordering on accesses, so as a result it's particularly resistant to being shared across multiple processes. We discussed 3 versions of things that might help with synchronization:

1. Have the data (or a subset) replicated in the network process. The complexity here is in replication, because threading and timing of use are unclear in the browser. For example, access might be granted and then a request made that requires that access. In a single process this works because the grant takes effect right away (via lock + global data), but across multiple processes this would be a race. To mitigate this, on denied requests the network could defer the request, round-trip a message to the browser to ensure that any pending grants had made it through, and then process the deferred request (hopefully I got that proposal right). In this case we noted that revocation of grants isn't too important from a security perspective (revocations might be harder to implement this way), because an exploiting requester is presumed to be able to avoid doing whatever it was that got the grant revoked in the first place, if it already had the grant at some point.

2. Undertake a refactoring that attempts to pull each permission out to be represented by a mojo pipe. Initialization/grant by the browser would take the form of the browser creating a pipe that represents the granted permission and passing one end to the network process and the other to the renderer. The renderer would then make any request that required that permission on that channel. This seems nice because it rolls the permission up into an existing primitive. The nice feature is that there's no racing between network/browser, because mojo would take care of making sure that both sides are up and running and any messages would just be queued until both sides are connected. But it would mean that there were many different places that things would be requested, potentially making message ordering in the renderer more complex. (This idea is still a bit vaguely formed for me.)

3. Pass some unforgeable data to the renderer that represents the permissions being granted. If only a token were given to the renderer, this would be no better than just updating the network's data (because that update would still race), but the clever idea here was to pass the actual permission *data* to the renderer. To accomplish this, the data has to be encrypted to avoid renderer spoofing. This design avoids races between the network and browser entirely, because all grants are browser->renderer, with subsequent communication renderer<->network. A sketch of this idea:
- the browser and network share a secret key at startup
- when granting a permission, the browser creates a message containing the permissions and encrypts it using the secret key
- the browser gives this encrypted data to the renderer, which then includes it in requests to the network.
(I wasn't clear on this on Friday, but I realize now that the permission data would not have to be included on every request; it could simply be that the renderer forwards the data to the network on receipt to grant itself permissions. This is basically just using the renderer to serialize the messages, making it similar to #1 but without the drawback of having to defer and round-trip for every non-granted request.)

[#1 feels like kicking the proverbial can down the road a bit farther, and adds otherwise unnecessary deferrals. #2 isn't quite fleshed out yet, but seems like it could be pretty complex. #3 seems neat, but is perhaps a bit too clever, and adds some otherwise unnecessary encrypt/decrypt work.]

Also noted: CPSP is largely a per-process thing, maybe at odds with the more frame-centric world to which we're moving. (?) Also, it's not clear that the network process should necessarily be doing all the permissions checks that ResourceDispatcherHostImpl currently does. e.g. CanSetAsOriginHeader() makes sense, but maybe [some of] CanCommitURL() belongs more back in Chrome. How do downloads fit in? (I might be getting any/all of this wrong as I was mostly just listening. Please feel free to correct/clarify or add further thoughts!)
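Here is a small, self-contained sketch of idea #3. Since unforgeability is the property that actually matters, the sketch uses a keyed MAC over the grant rather than literal encryption, and the toy ComputeMac() stands in for a real primitive like HMAC-SHA256; all names here are placeholders, not Chromium APIs.

#include <cstdint>
#include <functional>
#include <string>

struct PermissionGrant {
  int child_id = 0;
  std::string granted_scheme;  // e.g. "file" -- stands in for richer data.
};

// Toy keyed digest, NOT cryptographically secure; a real implementation would
// use something like HMAC-SHA256 with the key shared at startup.
uint64_t ComputeMac(const std::string& key, const PermissionGrant& grant) {
  return std::hash<std::string>{}(key + "|" + std::to_string(grant.child_id) +
                                  "|" + grant.granted_scheme);
}

struct SignedGrant {
  PermissionGrant grant;
  uint64_t mac = 0;
};

// Browser side: mints a grant and hands it to the renderer, which forwards it
// to the network process (with requests, or immediately on receipt).
SignedGrant MintGrant(const std::string& key, const PermissionGrant& grant) {
  return SignedGrant{grant, ComputeMac(key, grant)};
}

// Network side: accepts the grant only if the MAC verifies, so a compromised
// renderer cannot fabricate permissions the browser never gave it.
bool VerifyGrant(const std::string& key, const SignedGrant& signed_grant) {
  return ComputeMac(key, signed_grant.grant) == signed_grant.mac;
}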
Apr 3 2017
Do we know all permissions at commit time, or can they change? The interface I'd like to see is that at commit time, when (with PlzNavigate) we pass along the mojo pipe for the frame's root document, we also pass in a pipe with permissions just for that frame (and with SW/AppCache set up for that frame as well). If it's a file URL, for instance, we'd allow file:// access on it, but if it's not, file:// access wouldn't be allowed, etc. Anyhow, that sounds rather like #2.
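A very rough sketch of what the network-side end of such a frame-scoped pipe could hold, under the assumption (hypothetical names, heavily simplified checks) that the frame's permissions are fixed at commit time:

#include <string>
#include <utility>

// Permissions computed once by the browser at commit time for one frame.
struct FramePermissions {
  bool allow_file_urls = false;   // e.g. set when committing a file: URL.
  std::string committed_origin;   // Requests could be checked against this.
};

// Conceptually the network-side end of the frame-scoped pipe: every request
// arriving on it is validated against permissions fixed at commit, so no
// child IDs or route IDs travel with individual requests.
class FrameRequestHandler {
 public:
  explicit FrameRequestHandler(FramePermissions permissions)
      : permissions_(std::move(permissions)) {}

  bool CanFetch(const std::string& url) const {
    if (url.rfind("file:", 0) == 0)  // Starts-with check on the scheme.
      return permissions_.allow_file_urls;
    return true;  // Checks for other schemes elided in this sketch.
  }

 private:
  const FramePermissions permissions_;
};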
Apr 3 2017
That pipe could also have a side channel for network->UI calls for things like opening an auth dialog for that frame and such, so we wouldn't need child IDs all over the place (or route IDs, or request IDs).
Apr 3 2017
This is a good summary overall. Some existing calls to CanCommitURL can be thought of as proxies for "is this request same-origin with the context in which it was requested?" For example: https://cs.chromium.org/chromium/src/content/browser/blob_storage/blob_dispatcher_host.cc?type=cs&q=CanCommitURL&l=295 So those could be made stronger by actually having them check against the committed origin, which is the check we'd prefer to do if those permissions were enforced on the UI thread. On the IO thread, the practice has been to use CanCommitURL instead, as a way of asking "is it plausible for a frame with this origin to be in this child process?"
Apr 3 2017
mmenke @14 -- I think that approach is generally viable, though I haven't thought deeply about RDH clients that aren't RenderFrames (service worker, plugins?). Requiring PlzNavigate seems appropriate here, even if it could be made to work without PlzNavigate. One thing to contemplate is what happens when we create a new iframe with a blank document (which commits synchronously). We have a choice there: either share the pipe of the parent document, or create a new pipe that we post to the browser process UI thread, where its permissions are computed and the pipe is forwarded to the network service. AFAIK the existing CanCommitURL permissions for an origin ought to be granted before commit, because any later and the permission grants might not be there in time for the page's subresource requests to arrive on the IO thread. A lot of CanCommitURL/CanRequestURL as implemented today is actually static, based on the scheme. The dynamic part is really about non-web-safe schemes (chrome:// URLs, extensions), which are granted to that child_id the first time a navigation is placed there.
Apr 4 2017
More talking with Ananta... summarizing that discussion:

With a "#2"-style approach (comment @14), it seems like we could potentially be introducing raciness for non-navigation requests. If the browser grants a permission based on some action and then notifies the renderer, the subsequent actions would be racing that. (Or does that not happen, based on your last paragraph in comment @17, Nick?)

We also talked about what's mostly a reframing of "#1" that makes it feel a bit nicer to me: viewing it as a "pull" from the browser. The network's read-only copy of the CPSP data would start as empty == deny everything. Then, for any request that would be denied, the request is suspended and the network asks the browser for updated data. When the response to that request for updated data is received, the network resumes outstanding requests and continues as normal. Since actually-denied requests shouldn't be common (?), this shouldn't have a performance impact in the normal case. (And, because we're not worrying about revocation currently, we don't have to worry about the Grant->Request->Revoke->UpdateCache ordering.) This is a less audacious plan, but seems more tractable given that we haven't set up a lot of the other machinery around the network process yet. So I'm somewhat leaning this way now.
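A minimal sketch of the pull model, assuming placeholder names and hand-waving the actual message plumbing into callbacks:

#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Read-only view of the policy data, starting empty == deny everything.
struct PolicySnapshot {
  std::set<std::pair<int, std::string>> allowed;  // (child_id, scheme) pairs.
  bool Allows(int child_id, const std::string& scheme) const {
    return allowed.count({child_id, scheme}) > 0;
  }
};

class PullThroughPolicy {
 public:
  using RefreshBrowserCallback = std::function<void()>;      // Async "pull".
  using ResumeCallback = std::function<void(bool allowed)>;

  explicit PullThroughPolicy(RefreshBrowserCallback ask_browser)
      : ask_browser_(std::move(ask_browser)) {}

  // Called per request. Allowed requests proceed immediately; requests that
  // would be denied are suspended and a data refresh is requested instead.
  void CheckRequest(int child_id, const std::string& scheme,
                    ResumeCallback resume) {
    if (snapshot_.Allows(child_id, scheme)) {
      resume(true);
      return;
    }
    pending_.push_back({child_id, scheme, std::move(resume)});
    ask_browser_();  // Browser answers via OnSnapshotUpdated().
  }

  // Fresh data arrived from the browser; re-check everything deferred above.
  void OnSnapshotUpdated(PolicySnapshot snapshot) {
    snapshot_ = std::move(snapshot);
    std::vector<Pending> pending = std::move(pending_);
    pending_.clear();
    for (auto& p : pending)
      p.resume(snapshot_.Allows(p.child_id, p.scheme));
  }

 private:
  struct Pending {
    int child_id;
    std::string scheme;
    ResumeCallback resume;
  };
  PolicySnapshot snapshot_;
  std::vector<Pending> pending_;
  RefreshBrowserCallback ask_browser_;
};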
Apr 4 2017
Do we ever grant new network permissions to a frame based on user actions? I know we do for devtools, but that's already racy, and failure there just results in hiding a little data for the first request right when it opens. Anything other than that (and with more problematic behavior if we fail closed)?
Jun 22 2017
Since we changed approaches for the network service, this isn't needed anymore.
Jul 17 2017
Closing this for now since RDH isn't used with the Network Service. CPSP will still need to be consulted, of course. For frame requests, the code is still in the browser process. We will have to audit subresource requests from the renderer as we convert networking glue for various features.
Nov 7 2017
Apologies, applied the wrong component in bulk.