New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 817909 link

Starred by 4 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 2
Type: Feature



Sign in to add a comment

Speed: Service Worker-controlled navigations invoke SafeBrowsing on the critical path

Project Member Reported by slightlyoff@chromium.org, Mar 1 2018

Issue description

We're seeing slowness in PWA startup because SafeBrowsing is being consulted for these navigations which are, by definition, Service Worker controlled. The attached trace shows ~300ms of SafeBrowsing time for Google Maps Go loading the top-level resource which was returned directly from the Cache Storage API by the Service Worker.

To get a Service Worker controlling a navigation, the site must have been previously visited. Presumably, it doesn't make sense for us to continue to check the URL of something that is being returned from local script.

Related: https://crbug.com/802818
 
maps_pwa_512mb_webapk_warm.html
13.5 MB View Download
Even if a site had been visited in the past, issues may have been detected
since then. It could be that this isn't worth the latency tradeoff due to
unlikely security concern, but could see a case for where it makes sense.
Yeah, we've talked about this before and our current conclusion is that resource loading from SW would still need to go through SafeBrowsing checks.

Questions to SB people: would there be a way to somehow detect updates in SB database and only apply checks for the URLs, or could it be okay to lazily run the checks (and evict resource from the cache) after the critical resource load is done?
For S13nSW: in the current implementation, doesn't the service worker intercept before safe browsing? Could it make sense for safe browsing to just sit in front of network service, so if the SW responds with a resource from Cache Storage we don't go to SB?
This would be solved by implementing local Safe Browsing cache on the client (similar to how desktop Safe Browsing works). This is on the 2018 roadmap of the Safe Browsing team.

The 300ms blocking is an outlier though (99.95th %ile):
https://uma.googleplex.com/p/chrome/histograms/?endDate=20180228&dayCount=1&histograms=SB2.RemoteCall.CheckDispatchTime&fixupData=true&showMax=true&filters=platform%2Ceq%2CA%2Cisofficial%2Ceq%2CTrue&implicitFilters=isofficial
Labels: -Type-Bug-Regression Type-Feature
From a security model perspective, Safebrowsing's assumptions just don't make sense in a SW-controlled world.

Safebrowsing matches individual URLs: https://developers.google.com/safe-browsing/v4/lookup-api
Service Workers operate on *scopes*: https://developers.google.com/web/ilt/pwa/introduction-to-service-worker#registration_and_scope

It's meaningful to check a destination URL for uncontrolled pageloads (first visit to a site). It stops making sense for Service Worker controlled page loads because the actor is no longer the server; it's the local script proxy which may have been installed at some point in the past.

It *MAY* be useful to (asynchronously) check on the Service Worker (although I don't know that Safebrowsing tracks pwn'd serviceworkers) and uninstall them or perhaps async check the URL (to, again, uninstall the SW and refresh the page), but blocking the navigation is pointless.

csharrison@: this trace is on a slow device at Chrome Startup. This might be unusual for Chrome navigations overall, but that doesn't really speak to how painful it makes the "tap the homescreen icon to launch the app" experience. As PWAs become more common, so will this experience.


Ah gotcha thanks for the clarification. We should definitely add a histogram for the first check. I think there is another Android bug for something related to startup too... found it, issue 802818.

In fact, the Safe Browsing check is _worse_ than 300ms, that's just the task that dispatches the check to GMSCore (I'm not quite sure why it takes so long but I think it is aggressively descheduling the thread). The response comes in at the 4200ms mark, making the total check more like 900ms!

However, I haven't stared at the trace for long but is the navigation actually blocked on Safe Browsing? The 300ms time is spent on a thread pool and is mostly descheduled (not consuming a core), so it shouldn't be blocking critical work. I'm not used to analyzing SW traces but I would expect to see a "blocked by delegate: Safe Browsing..." somewhere in the netlog if we're actually paused waiting for SB results.

In terms of the security model for SafeBrowsing + SW, I'm not sure I have enough experience of either to make a good judgement call. Maybe SB team can comment.
Thanks for digging into that and for the context, csharrison!

From a security model perspective, in addition to the mismatch with Safebrowsing when Service Workers are active (generally), consider that the URL being launched is *very* trusted from the user's perspective when invoked from the homescreen. 

Even if we're not happy removing this check in the common case for SWs (although I argue we should at least be happy to make it very asynchronous and/or skip it), it seems just punitive to punish startup in this specific case.
Cc: bgalbs@google.com
Status: Available (was: Unconfirmed)
Thinking more about the security model question, I'd like to propose that:

 - We avoid the synchronous check for SW-controlled navigations
 - We add a Safebrowsing check when issuing an update check for the Service Worker script: https://w3c.github.io/ServiceWorker/#soft-update-algorithm

The rationale for this change:

 - Checking URLs whose content is being returned by a Service Worker doesn't do what we assume it does. The content is the product of previous navigations, not the current one, so checking it in the moment isn't useful per sae
 - That said, we *do* issue a background check to the network asynchronously to understand if the SW script is fresh. This fetch cannot be intercepted by the existing Service Worker; this makes it a good time and place to check a URL
 - Since the Service Worker being checked is the result of a navigation, we can check the URL last navigated to
 - A failed SafeBrowsing check at SW update time should:
   - Unregister the Service Worker for the origin and perhaps clobber all state (ala Clear-Site-Data: https://www.w3.org/TR/clear-site-data/)
   - Replace tab contents for all controlled windows (tabs) with the safebrowsing interstitial

Thoughts?
Cc: nparker@chromium.org vakh@chromium.org
+vakh,nparker explicitly: Can you comment on #9?

Comment 11 by vakh@chromium.org, Mar 8 2018

I do not understand the details of how SWs work so the following may not be 100% correct from SW perspective:

> We avoid the synchronous check for SW-controlled navigations
We do not have any synchronous checks in Android. For a SW, since requests in its scope may be returned locally, the SB check would always lose that race and cause a wait. Is that what you meant?

> We add a Safebrowsing check when issuing an update check for the Service Worker script
Yes, does it not already happen? If not, that's a bug because it means we are executing code locally without checking its SB reputation.

> The content is the product of previous navigations, not the current one, so checking it in the moment isn't useful per se
Yes, but as pointed out in #c1: "Even if a site had been visited in the past, issues may have been detected since then." and the SW is caching content that we have since discovered to be bad.
We want to evaluate the reputation of each URL as it is requested, irrespective of its past reputation.

> This fetch cannot be intercepted by the existing Service Worker; this makes it a good time and place to check a URL.
In order to ensure that the user is safe, we must verify that Safe Browsing thinks they are not unsafe -- whether it is at SW install time, or update time.

> A failed SafeBrowsing check at SW update time should ...
Yes, I agree with both of those proposals.

Comment 12 by vakh@chromium.org, Mar 8 2018

> A failed SafeBrowsing check at SW update time should ...
Yes, I agree with both of those proposals but I don't think that would be sufficient since a malicious SW will choose to not update itself at all or for a long period of time if it knows that an update would force an SB check.
Service Workers don't get to decide how often the check is performed. We issue the check on their behalf with a maximum lifetime of 24 hours. That is, you can specify whatever caching headers you want on your SW, but if you navigate and we would check for an updated SW, then we only respect cache headers up to a single day.
vakh@, csharrison@: does the above make sense? If not, happy to describe the SW lifecycle in more details and/or discuss in a meeting.
There could be different types of resource loads: SW scripts and other resources (main resource and sub-resources).

For scripts situation looks clearer:

* For installed SW scripts that come from internal storage we actually already skip SB checks.

* For updating SW scripts they should be checked against SB on update time (I'll check the current behavior again, but the desirable state would be we do the check).

* I agree that busting the SW if we find issues in SB check during updates would make sense.

For non-script resources:

* SW updates may not really load (install) all the resources that are to be used for the subsequent navigations.  Skipping all resource loads from controlled page that go to SW doesn't seem to make sense.

* If we agree that using the resources that come from Cache Storage is safe we can possibly change the current layering from: [page -> SB -> SW -> SB -> Net] to: [page -> SW -> SB -> Net].  This is NOT easy for non-S13N code path to do this though. I can imagine we can do this for S13N code path in a reasonable timeframe though.

Well, for the last case for non-S13N code path case: we can possibly do some filtering for installed resources somewhere not to go through SB, it'd be hacky but sounds possible.

I think the biggest question we should have consensus is this:
"If we agree that using the resources that come from Cache Storage is safe"

This seems a bit sketchy.  Or... the proposal is to distinguish resources that are stored during installation/updates and others?

Comment 17 by vakh@chromium.org, Mar 16 2018

Labels: SafeBrowsing-Triaged
csharrison: Any update on "This would be solved by implementing local Safe Browsing cache on the client (similar to how desktop Safe Browsing works)."?

This looks related to  issue 873575 . Safe Browsing is one of several URLLoaderThrottles. It's looking like the plan is to apply throttles before service worker interception, partially because some throttles mutate the request and it would make sense for the SW to see the mutated request.
Sorry I don't know, I haven't been active in Safe Browsing for a while. vakh may know.
You can follow the updates here on Issue 747428.

The current status is that I have the design doc ready and a working version of the CL ready but I am trying to figure out how to measure the memory/performance impact of the local blacklist.

Sign in to add a comment