Currently, after the browser process fetches the request with PlzNavigate, it writes it into a stream and sends the blob url to the renderer. Given that we're seeing non-trivial performance regression, it might be worth trying to do this with mojo data pipes and send that to the renderer instead.
btw we can do this orthogonal to the mojo loading work I think. We have the ability to send mojo handles in old chrome IPCs, seem Darin's cl https://codereview.chromium.org/2422793002/ as an example (this sends mojo message pipe, but sending data pipe is straightforward from that one). Then we can reuse the data pipe reader perhaps in the child process?
Did a quick code skimming... so after we kick loading in RDHI::BeginNavigationRequest -> initialize Stream in NavigationResourceHandler::OnResponseStarted, the Blob URL gets passes to the renderer in RenderFrameHostImpl::CommitNavigation, where we use old IPC.
Looks like Darin's mojo handle transfer code uses traits code in ipc/ipc_mojo_param_traits which basically reads / writes attachment as ScopedHandle, and data pipe is basically a mojo handle too so maybe we can just do the same.
I was thinking that we could utilize mojo loading code (recently horo@ did similar thing for Service Worker nav preload using mojo loader), but as John suggests directly using data pipe might be more straightforward.
FYI: clamy@ and carlosk@ have worked on a prototype 1 year ago:
https://codereview.chromium.org/1693563002
I don't know if it is still relevant though, but it is probably useful to know about it.
#6- I see thanks for clarification. So we write data into data pipe, send it to the renderer process and use mojo loading code to do regular loading (instead of loading from blob/stream URL) for navigation, is my understanding right? Looking into clamy & carlosk's patch linked from #5 (thanks!) it feels we might be able to simplify the approach now that we have mojo loading code.
I took a (incomplete) shot at it today-- reusing some producer code from MojoAsyncResourceHandler in NavigationResourceHandler, sending consumer handle to renderer process on CommitNavigation and then attaching URLLoaderClientImpl to it to read out the data seems to somewhat work (though still broken, needs debugging):
https://codereview.chromium.org/2797443005/
We might be able to just use URLLoaderFactoryImpl but it feels it'd needs more changes in the browser code.
Just a quick update, one reason for the big perf regression was found and fixed, which brought metrics much closer to what we expect to see. It will be great to move over to using Mojo, but at this time it is better understood that that stream read is not the cause of huge delays.
Thanks Kinuko for looking at this!
As Nasko mentions here, it turns out that the perf regression was due to incorrect cache flags set on back-forward navigations. Considering this, I would prefer if we waited on PlzNavigate to ship before shifting to data pipes, as we're currently trying to bring stability with PlzNavigate to acceptable levels. Switching to the data pipe right now could cause a certain number of regressions and potentially set back PlzNavigate's launch (which I would like to avoid :)).
I'm also thinking that it will be easier to land once we have done a bit of cleanup in the blink loader code.
Wdyt?
Thanks for the update! If we were able to fix the perf regression and we can just ship PlzNavigate I'm happy too =)
My patch seems to work but there's no reason that it can't wait for the launch (and I agree that it could likely cause other regressions), so please go ahead!
Thanks Kinuko for getting that done quickly. While fortunately the egregious perf issues are fixed, we'll still want this code soon (I agree after launch though, to decrease moving parts). One other reason is that it'll be helpful with the network service, as I don't see a reason at this point why we need streams anymore (now that we have data pipe).
(btw for others following; we've synced with Kinuko about this)
Scott is landing Kinuko's change in https://codereview.chromium.org/2813243002/ but only for network service, so that we don't add risk to PlzNavigate. We do need to use PlzNavigate for the network service work, and we don't want to add support for streams there and just wanted to reuse data pipes.
Thanks a lot Kinuko again, this unblocked us.
I've heard the plan is to ship this soon (great!).
I'm wondering about the SetController IPC race for service workers (my comment on https://docs.google.com/document/d/1fQzym0Sqg9OrqtISgZV0XvH2swBMmDnmphM3WQD68JU/edit#)
It seems plausible to me that this will cause a race. We might need to do what issue 778898 did for S13nServiceWorker and set the controller as part of commit navigation, to ensure the renderer knows about the controller before the first response comes in.
Thanks falken@!
I tried to add a usleep(10000) in ServiceWorkerProviderHost::SendSetControllerServiceWorker like you did.
Indeed, it makes several tests failing.
I will take a look at it.
Thanks Arthur. It also seems good to minimize the difference between S13nServiceWorker and non-S13nServiceWorker anyway where possible.
Do you have an idea about how to plumb the controller info to the CommitNavigation function for non-NetworkService code? For NetworkService, https://chromium-review.googlesource.com/#/c/742961/ it comes from the SubresourceLoaderParams which go from:
ServiceWorkerControlleeRequestHandler -> NavigationURLLoaderNetworkService::URLLoaderRequestController -> NavigationURLLoaderNetworkService(::OnReceiveResponse()) -> NavigationRequest(::OnResponseStarted())
Then NavigationRequest::CommitNavigation() passes it to RenderFrameHostImpl::CommitNavigation().
In the non-NetworkService path, we seem to go:
NavigationResourceHandler::OnResponseStarted -> NavigationURLLoaderImplCore::NotifyResponseStarted -> NavigationURLLoaderImpl::NotifyResponseStarted -> NavigationRequest::OnResponseStarted
And NavigationResourceHandler::OnResponseStarted only has a ResourceResponse, ResourceController, NavigationData, URLRequest.
The controller is first determined at ServiceWorkerControlleeRequestHandler::DidLookupRegistrationForMainResource, so somehow we have to get SWControlleeRequestHandler and NavigationResourceHandler talking to each other, I assume.
Ah I guess we can stuff it as URLRequest data in ServiceWorkerResponseInfo.
I haven't tried to update NavigationResourceHandler for the moment because it is used when NavigationMojoResponse is disabled. For the moment I only tried to update the other code path (the one affected by this race condition).
I get the controller in NavigationURLLoaderNetworkService::Controller::OnReceiveResponse using
* ServiceWorkerRequestHandler::GetProviderHost(url_request)
* ServiceWorkerProviderHost::GetOrCreateServiceWorkerHandle()
Then I think I will put it into the SubresourceLoaderParams that is already here (I don't know if it make sense though)
Finally I will adapt thing that need to be updated after that.
I haven't tought about updating the non-NavigationMojoResponse path yet, but I think they are almost the same, except they are using different set of classes:
non NavigationMojoResponse | NavigationMojoResponse
--------------------------------------------------------------------------
NavigationURLLoader | NavigationURLLoaderNetworkService
NavigationURLLoaderCore | NavigationURLLoaderNetworkService::Controller
NavigationResourceLoader | MojoAsyncResourceHandler
I agree that it might be good to do it even if there are no race condition in this mode.
Ah sorry seems like you've got it figured out. I didn't realize the extent of NavigationMojoResponse. It'd probably be sufficient to just fix it in NMR and not worry about non-NMR.
I would like to merge in M65 two patches together:
* https://crbug.com/705744 (https://crrev.com/535339) <- Here in comment 56.
* https://crbug.com/808071 (https://crrev.com/534706)
They have been on M66 canary for a few days in 66.0.3344.0 (Feb 8) and 66.0.3342.0 (Feb 6).
They are fixing a potential race condition issue between Service workers and NavigationMojoResponse.
NavigationMojoResponse is disabled by default and finch is used. There is a 50%/50% experiment on M65 Canary. I would like to merge these two patches in order to move the experiment from Canary to Dev and get some performance statistics. I do not plan to continue the experiment further than beta in M65.
Risk is low because these two patches are modifying a code path that is not used by default (NavigationMojoResponse and/or S13nServiceWorker). Finch can be used to disable the experiment on Canary/Dev if needed.
FYI: NavigationMojoResponse launch bug: https://crbug.com/805851.
This bug requires manual review: Reverts referenced in bugdroid comments after merge request.
Please contact the milestone owner if you have questions.
Owners: cmasso@(Android), cmasso@(iOS), bhthompson@(ChromeOS), govind@(Desktop)
For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Approving merge to M65 branch 3325 for both cl listed at #57 per comment #57 and offline chat with jam@. Pls merge ASAP so they can be picked up for this week M65 Beta release. Thank you.
Pls apply appropriate OSs labels too.
I merged the https://crrev.com/534706.
Unfortunately, for the second one https://crbug.com/705744 . There are a lot of merge conflicts that need to be fixed. One of them is not obvious. It is very late now in my timezone. I need to take a closer look tomorrow.
Requesting merge targeting M66 for CL landed in comment 78. It will enable by default NavigationMojoResponse, which has been on beta as Finch trial. Merging this will enable it in WebView to ensure we don't have issues before shipping it to stable in M66.
This bug requires manual review: We are only 13 days from stable.
Please contact the milestone owner if you have questions.
Owners: cmasso@(Android), cmasso@(iOS), josafat@(ChromeOS), abdulsyed@(Desktop)
For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Comment 1 by jam@chromium.org
, Mar 27 2017