Need way to ignore WebGL blacklist in Android WebView if EXT_robustness not supported
Issue description

This software_rendering_list.json entry: https://cs.chromium.org/chromium/src/gpu/config/software_rendering_list.json?l=572 is causing issues for some developers (Scirra in particular) who use Cordova to compile HTML apps into native Android executables. Some phones, even recent Samsung phones, don't advertise either EXT_robustness or KHR_robustness, which means that WebGL is irrevocably disabled on them.

To help these developers we at least need a way for them to bypass this blacklist entry. Maybe, looking at https://developer.android.com/reference/android/webkit/WebView.html , a manifest entry such as: <meta-data android:name="android.webkit.WebView.ForceEnableWebGL" android:value="true" /> ? Or make the blacklist entry only take effect for Chrome and not WebView? Other ideas?
,
Feb 9 2018
However, without the EXT_robustness or KHR_robustness mechanisms, isn't it dangerous to enable WebGL for developers?
,
Feb 10 2018
It's not a big deal in this use case. While lack of robustness extensions means that the WebView could hang the phone, the user would probably just stop using the app which has embedded that WebView. I don't know whether embedding apps can pass command line arguments to the WebView they instantiate. It looks like that manifest mechanism is one supported way of configuring the WebView.
,
Feb 10 2018
Also, I'm trying to understand whether this should be configurable on an app-by-app basis, or whether we should just change that blacklist entry to not apply to WebView.
,
Feb 10 2018
I think the key distinction is Cordova apps only run local content, which can be assumed to be safe - the app could use OpenGL ES directly after all. So it's fine to blacklist WebGL in the webview for remote content, but AFAICT there's no reason to for local content. So Cordova Android apps are unnecessarily limited in their ability to use WebGL.
,
Feb 12 2018
problem summary for other webview people: webgl on android is currently only enabled if the robustness extension exists *or* if the gpu is running in its own process. Since webview always runs the gpu in the browser process, only the robustness extension part of that check applies. And it turns out lots of even newer android devices don't have the robustness extension, so building something that relies on webgl in webview is problematic.

ken: is the robustness extension check only there for stability and *not security*? imo there are good arguments for keeping things as is if removing the check relaxes security defenses.

> So it's fine to blacklist WebGL in the webview for remote content, but AFAICT there's no reason to for local content.

local != trusted
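
The check summarized at the top of this comment can be sketched as follows. This is only an illustrative model in TypeScript; `GpuEnvironment` and `isWebGLAllowed` are hypothetical names, not Chromium code, and the real rule lives in the blacklist entry linked in the issue description.

```typescript
// Hypothetical model of the rule described in this comment; not Chromium code.
interface GpuEnvironment {
  // True if the driver advertises EXT_robustness or KHR_robustness.
  hasRobustnessExtension: boolean;
  // True if the GPU runs in its own process; per this thread, always false in WebView.
  gpuInSeparateProcess: boolean;
}

// WebGL is enabled if either condition holds.
function isWebGLAllowed(env: GpuEnvironment): boolean {
  return env.hasRobustnessExtension || env.gpuInSeparateProcess;
}

// WebView on a device without the extension: neither condition holds.
const webViewWithoutRobustness: GpuEnvironment = {
  hasRobustnessExtension: false,
  gpuInSeparateProcess: false,
};
console.log(isWebGLAllowed(webViewWithoutRobustness)); // false
```

Because the second field is always false for WebView, the extension check is the only thing that can enable WebGL there.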
,
Feb 12 2018
I think it's undoubtedly relaxing security restrictions to enable WebGL in the absence of robustness extensions. However, I think there might be a case for allowing applications to opt in to this. I think the application is the only thing that can say for sure whether it trusts the content it loads. It might be wrong (accidentally - but probably not maliciously), but given request interception and the potential to load untrusted content from file:// urls, there's no way for webview code to make that decision.

I think we could go with a metadata tag to control this, but would developers adopt it, given that there will always be some devices that would not enable WebGL (because of having not updated webview) and that there's no reasonable fallback path? I guess shifting the balance of supported devices might be enough for some people, even if it doesn't get to complete support.
,
Feb 12 2018
Assigning to kbr to drive the work, since it's P1. Please reassign as appropriate
,
Feb 13 2018
Ignoring stability and assuming this is a security concern..

The argument for allowing the app to control this is simple. Some apps (eg gmail) only use webview to load trusted content. So for these apps, the blacklist is unnecessarily holding back webgl.

Despite that, *my* argument for keeping things as is is that security decisions are often very subtle, and allowing app developers to make this decision will often lead to bad results. A lot of app developers only really care about getting things working, and don't care about security. There will be a "how to enable webgl in webview" stackoverflow thread that doesn't mention security concerns, and a lot of apps will just copy that code more or less blindly. Even for more knowledgeable developers who read official docs and decide they load trusted content only, it's still a subtle decision because developers may not know all the webviews used in their app. Eg if the app uses a third party library for ads, then that library could create webviews that load untrusted content.

So this is definitely a feature-security trade off, and I'm not sure where the right line is.

Other things

> Also, I'm trying to understand whether this should be configurable on an app-by-app basis, or whether we should just change that blacklist entry to not apply to WebView.

There are browsers based on webview with the same security concerns as chrome.

> I think we could go with a metadata tag to control this

imo a metadata tag is a bad idea because it would be impossible to secure the "app with trusted content but also loads webview ads" case. The setting has to be per-webview to make that use case secure. And the only way to do that is to add a new API. Webview can only update implementation, not API, through the play store, which means the API will appear in a yet-to-be-released android version, which will take even more time to gain enough market share to really matter.
,
Feb 13 2018
We should have someone from the security team weigh in (perhaps Justin can help redirect). Assuming that we can do it safely (or the risk is manageable), I'd personally like to see us resolve this in favor of developers to encourage usage of more advanced web capabilities on Mobile.
,
Feb 13 2018
Android's update cycle is so slow that if this is only addressed in a future Android version and up, we will be forced to work around it anyway. Our workaround could involve a Cordova plugin that implements OpenGL ES 2 solely to bypass this blacklist entry. So then the "how to enable WebGL" StackOverflow thread you're worried about will just say "use that Cordova plugin". Then the end security situation is worse, since I doubt we'll be able to add as thorough validation as WebGL does.
,
Feb 13 2018
Cordova apps do *not* only run local content in general - maybe yours don't, but other people's certainly do. There is no class of WebView app for which this is definitely safe; it's always a case-by-case basis for every individual app. I agree with Bo that providing a global switch for this is very risky as many developers will just enable it even though their app loads arbitrary third party content (either because they don't care or because they don't realise that it does). Having a Cordova plugin that does something dubious is much less of a concern for the ecosystem, because many developers don't use Cordova, or won't care enough about this to go to the trouble of adding a plugin. If it's simply a manifest entry it will get copy-pasted into a bunch of apps that don't even require it in the first place, because cargo-cult coding is common in this space :(
,
Feb 13 2018
Hey Robert, Would you mind commenting on whether the proposed change would be a concern to the security team?
,
Feb 13 2018
The use of the EXT_robustness and KHR_robustness extensions in this context is *only* for stability, *not* security. The availability of these extensions implies – but does not guarantee – the presence of a GPU watchdog timer that will kill errant content submitted to the GPU. The main issue with running WebGL content without these extensions present is an increased chance that the phone or device might lock up with badly written or malicious WebGL content. Historically, the Chrome team hasn't considered such issues security issues, because people will just stop either visiting the web page, or running the installed WebView based app. Even if these extensions are present, Chrome still does all validity checking and rewrites shaders to ensure that they can't access arrays or vertex data out-of-bounds. It seems to me that honoring a new meta-data flag in the app's manifest is a viable option. Does this sound acceptable given that this is a stability-only change?
,
Feb 13 2018
if security team agrees this isn't a security concern, then I'd go further and say just ignore robustness extension on webview altogether, as suggested in #4 above. avoids the hidden setting that probably will just end up being copied around to a lot of apps anyway, eventually
,
Feb 14 2018
I had a chat with kbr@ and the takeaways were: The robustness extension (https://www.khronos.org/registry/EGL/extensions/EXT/EGL_EXT_create_context_robustness.txt) is _largely_ for stability, but it does have some security affordances. The primary one is a watchdog timer to interrupt content that runs for too long on the GPU. There is no good way to replicate the watchdog in userspace software. The secondary affordance protects against out-of-bounds accesses in shader programs, but the Chromium-side command buffer filter already applies these protections. However we would be reducing defense-in-depth by only relying on the command buffer filter, since the GL extension wouldn't backstop it. The main concern then is really around the failure mode in the case where we don't have a watchdog timer. The concern is that an ad unit rendered in WebView could lock up the entire device for an indefinite amount of time, potentially forcing a hard-reboot. kbr@ is going to run some tests using WebView to see how various devices that do not have the extension handle this scenario. (N.B.: WebView build instructions here: https://docs.google.com/document/d/1RvorDadxWRNFDv7QyeG1WdwasMexJJvSaiAKyaImEbs/edit#). I agree that in an ideal world we would have an API that would allow ignoring the EXT_robustness blacklisting behavior on a per-WebView instance basis. But as noted in #9, we can't add new API surface in a backwards-facing manner. The question then is whether an application-wide (i.e. metadata level) manifest option is better than not having one at all. I _think_ the answer is yes, because it is at least gating this on some explicit opt-in. But WebView team have more experience with seeing how those kinds of things are actually used by developers. I do not have a problem generally with allowing apps to control the behavior, since WebView allows plenty of opportunities to shoot yourself in the foot security-wise.
,
Feb 14 2018
I think the main concern with having an application-level opt-in is that some (possibly large) number of apps which use ad SDKs that create webviews will opt in, either intentionally because they actually use it in their own webviews, or because they copied it from somewhere, and so these apps will be exposing these capabilities to ad networks in general. I think the webview team's general experience is that many app developers aren't very aware of exactly what ad SDKs are doing and don't consider their ads when touching the few things in WebView that have process-global effects. As Robert notes there are a large number of existing ways to achieve security-related foot-shooting with WebView already, though (some of which may be being used by ad SDKs themselves). So.. it sounds like this probably isn't a sufficiently large additional risk to be worried about it, to me..
,
Feb 14 2018
How about giving the option a scary-sounding name like "enable unreliable WebGL" to discourage casual copy-pasting?
,
Feb 14 2018
(note to self: rename WebSettings.allowFileAccessFromFiles to allowDangerousFileUrlExploits) :)
,
Feb 27 2018
Here's an example of a typical user report which we believe stems from this: https://www.scirra.com/forum/slowdown-with-phonegap-build-works-fine-on-mobile-browsers_t201167
,
Mar 8 2018
Is there any news on this? Reports continue to come in, e.g.: https://www.scirra.com/forum/framerate-stutter-achieve-smooth-fps_t201444
,
Mar 8 2018
So kbr/bo: what do you think? We've got three options here (other than doing nothing):

1) stop paying attention to this entirely as Bo suggests in #15
2) let people control it with a metadata flag and then just accept that a bunch of apps are going to copy that for no reason (but many also will not, so it will be a different outcome from ignoring it entirely)
3) add a new API to control it per-webview in a future OS version and then roll out access to this via the upcoming support library so that people can use it on older OS versions (many apps will probably still turn it on for no reason but we can at least try to document that you should only do it for trusted content you control)
,
Mar 8 2018
According to #16, kbr@ was going to run some tests?
,
Mar 9 2018
We have a bit more to explore internally and will likely have updates towards the end of the month.
,
Mar 16 2018
boliu@ and I talked today; I need to ask for help from the WebView team. The testing that is needed is:

1) Build a local copy of WebView (the testing shell; I don't know what this is) that has WebGL enabled
2) Try the following stress tests on a few devices that don't have the EXT_robustness extension:
https://www.khronos.org/registry/webgl/sdk/tests/extra/lots-of-polys-example.html
https://www.khronos.org/registry/webgl/sdk/tests/extra/lots-of-polys-shader-example.html
https://www.khronos.org/registry/webgl/sdk/tests/extra/slow-shader-example.html
3) See whether the behavior of these tests is markedly different or worse than in Chrome on the same device.

Basically, as long as the app switcher will let the WebView be killed, and Chrome doesn't handle this much more gracefully, then it should be generally OK to stop paying attention to the presence of this extension. Of course, this doesn't prove that all devices will work as well, but this is the initial testing that's needed to gain confidence in our decision. Thanks for your help doing these tests.
,
Mar 16 2018
-> jamwalla to try the steps in #25
,
Mar 20 2018
I'm having a bit of trouble finding a device to test this on. What devices were you working with? I tried a couple Galaxy S6s (SM-G920I, SM-G920F, SM 6920-T, SM-G928G) and a lot of other devices with Mali GPUs. The only ones I could find without the robustness extension had Mali-400 MP GPUs, but are too old to install updates to webview on.
,
Mar 21 2018
+ash to provide feedback on specific Android devices tested

jamwalla@ I thought the plan was to not install a WebView update but one of the Chromium build targets that installs a side-by-side WebView-like harness.
,
Mar 21 2018
> jamwalla@ I thought the plan was to not install a WebView update but one of the Chromium build targets that installs a side-by-side WebView-like harness.

That's to get around not having root, not for installing the harness on an older build of android that doesn't support updatable webview. There is no point in testing builds that don't support updatable webview anyway.
,
Mar 21 2018
Scirra claimed that the Samsung Galaxy S6 didn't support EXT_robustness, but that was probably an incorrect assertion. The value in testing one or two of the older devices which don't support updatable WebView is to understand the consequence of enabling WebGL on devices which don't advertise this extension. That was the point of this exercise.
,
Mar 21 2018
> The value in testing one or two of the older devices which don't support updatable WebView is to understand the consequence of enabling WebGL on devices which don't advertise this extension. That was the point of this exercise.

We can't do anything about the webview in those devices, so I'm not sure what the point is. Also the test harness only supports L and up (versions of android that support updatable webview), and it's going to take some non-trivial work to make it run on older versions of android.
,
Mar 21 2018
Sorting this list: https://opengles.gpuinfo.org/gles_listreports.php?extensionunsupported=GL_EXT_robustness by Date, as well as querying individual Android releases, it's difficult to find any recent and reasonable devices. There are some, like this one: https://opengles.gpuinfo.org/gles_generatereport.php?reportID=2101 which claims to be a Samsung Galaxy S8+, but the cpuarch of the report is i686, so I think it's probably something like the ARC++ runtime. Marking this as Needs-Feedback from Scirra. Thanks jamwalla@ for your attempts to reproduce so far.
,
Mar 21 2018
I just tested some random devices from our cabinet. They all support robustness. Huawei P9 (EVA-AL00), Samsung Galaxy Note 8 (GT-SM-N950F/DS), Oneplus One (A0001), and Samsung Galaxy S7 (SM-G930F).
,
Mar 21 2018
None of our devices directly support this either - we just regularly get reports that performance is fine in Chrome, but is really slow in the web view. From talking to users we found it seemed to be because WebGL was disabled, and we eventually tracked down this blacklist entry as the most likely culprit. It was an LG G Stylo mentioned in the first report that came to us: https://github.com/Scirra/Construct-3-bugs/issues/467 I've asked for more details about affected devices. Unfortunately I've also filed crbug.com/822727 (janky performance in webview, even with WebGL) which muddies the waters - we often see reports like "it's slow/juddery in the web view" and it can be hard to tell if it's this issue or the other one.
,
Mar 22 2018
I just remembered, according to this site about 23% of OpenGL ES submissions (presumably devices?) for Android lack GL_EXT_robustness. You can view a list here, which includes device names: https://opengles.gpuinfo.org/gles_listreports.php?extensionunsupported=GL_EXT_robustness The list includes devices like the Nexus 10, Samsung Galaxy S6, Samsung Galaxy Tab 2/3/S, etc. I thought 23% of devices was worryingly high (which added to my concern about this issue) but AFAIK there's no public data on actual usage for devices like these.
,
Mar 22 2018
Looking back, I think I missed a few comments where that site was already mentioned :P Sorry, it's the only data we have to go on though. It does say the S6 is unsupported but I guess they have regional models that vary in support or something.
,
Mar 22 2018
The availability of GL_EXT_robustness is basically 100% for >=M devices. For Android L and L MR1 about 70% of devices grouped by board and 80% by build fingerprint have the extension.
,
Mar 22 2018
As far as I'm able to ascertain there are no LG G Stylo devices that lack that extension.
,
Mar 22 2018
Given the prevalence of reports we get along the lines of "the game runs fine in Chrome but is slow in an APK", perhaps those are all coming from ~30% of Android 5.x devices? If it's not this then I'm not sure what to make of all the reports we get. I apologise if we've ended up mis-attributing the reports we get to this but it's the best conclusion we could come to based on what we were hearing.
,
Mar 22 2018
Webview is updatable only on L and up, which means there is nothing we can do if majority of these reports are coming from K (android 4.4.2 was the first version of webview that even supports webgl at all iirc) or older. So I think the premise here was probably off to begin with. There's probably more users that can't update their webview than users that can update webview, but don't have webgl due to lack of robustness extension. (Eyeballing data from https://developer.android.com/about/dashboards/index.html)
,
Mar 22 2018
Construct 3 only supports Android 5.0+, so presumably none of the reports are for 4.x.
,
Mar 22 2018
Ok. Well, back to actually finding one of these mythical devices then..
,
Mar 22 2018
If we're talking about 30% of android 5.x devices that's about 7.5% of devices in the field and dropping over time. It doesn't seem like introducing another confusing webview configuration option to accommodate these devices is particularly worth it, to me. If you want to actually establish whether this is one of the causes of the issues you see, then probably your framework/library should query the GL features of the device and check the support; do you have analytics/metrics where you could see how common this is in your population? You could at least surface this to the app developer if you aren't already, so they know whether this is a likely cause of issues or not.
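
The suggestion above, that the framework check support itself, could look roughly like the sketch below when run inside the app's web content. It is a minimal sketch, not Construct's code: `CanvasLike`, `WebGLSupport` and `detectWebGL` are hypothetical names, and the structural `CanvasLike` type is only there so the example does not depend on DOM typings. It reports whether a WebGL context can be created at all; it cannot tell whether a missing robustness extension is the reason for a failure.

```typescript
// Minimal structural type so the sketch does not depend on DOM typings.
interface CanvasLike {
  getContext(contextId: string): unknown;
}

type WebGLSupport = "webgl" | "experimental-webgl" | "none";

// Try the standard context name first, then the legacy one.
function detectWebGL(canvas: CanvasLike): WebGLSupport {
  for (const id of ["webgl", "experimental-webgl"] as const) {
    try {
      if (canvas.getContext(id)) {
        return id;
      }
    } catch {
      // Treat a thrown error the same as a null result: no context available.
    }
  }
  return "none";
}

// In a page, this would be called as detectWebGL(document.createElement("canvas")),
// and the result surfaced to the app developer or to whatever analytics already exist.
```

Collecting this result alongside the Android version and device model would show how often WebGL is actually unavailable in the affected population.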
,
Mar 26 2018
I tested these three sites (see c25) on a Nexus 10 running L (LMY49J):

https://www.khronos.org/registry/webgl/sdk/tests/extra/lots-of-polys-example.html
https://www.khronos.org/registry/webgl/sdk/tests/extra/lots-of-polys-shader-example.html
https://www.khronos.org/registry/webgl/sdk/tests/extra/slow-shader-example.html

On both webview and chrome, the first two tests pass (chrome says "WebGL hit a snag"; webview says nothing; neither draws anything). The third test (slow-shader-example) causes ANRs on both applications. Chrome is eventually killed after a gpu hang; android will eventually let you kill webview. So, I think basically the same behavior. Let me know if there's more tests to run or if you need more details.
,
Mar 26 2018
To clarify, ANR is the android system dialog that comes up when the UI thread of the app hangs. That's not actually what happens to chrome. When chrome runs slow-shader-example, the entire android system UI hangs and eventually dies and restarts. But then chrome generally survives the whole thing just fine. Sometimes chrome gpu process suicides due to watchdog triggering, but that's it. Webview does ANR. So webview's behavior is arguably more user friendly.
,
Mar 29 2018
Thanks jamwalla@ for testing. boliu@, what do you think we should do in response? Should we remove that blacklist entry for Android devices, or leave things as they are?
,
Mar 29 2018
Extrapolating from a single test/device.. I guess it means dropping robustness requirement for webview is fine. Webview ANRs either way?
,
Mar 29 2018
If we're okay with just dropping the requirement that's fine by me; my objection is only to adding a config setting for it :)
,
Apr 2 2018
jamwalla@: in your experiments in #44, was Chrome running in single-process mode (i.e., in-process GPU)? If so: could you please tell us? If not, could you please force Chrome into this mode, re-run your tests, and tell us its behavior? Thanks.
,
Apr 3 2018
I didn't run those experiments with chrome in single-process mode. Unfortunately I don't have access to the devices this week; unless boliu@ is able to do it I can run these experiments next monday. Sorry for the inconvenience.
,
Apr 4 2018
Tried on chrome with in-process GPU. Essentially same thing as out of process GPU. But SystemUI never manages to generate the ANR dialog for chrome (even though at least in one trace I took, it looks like the UI thread in chrome is hanging as well). And SystemUI suicides due to its own hang monitor. I can't explain the difference between chrome and webview honestly, why SystemUI manages to throw up the ANR dialog for webview but not chrome. So the thing that wasn't made clear in the testing above (because I didn't personally try it) is that once chrome or webview starts spinning the GPU, SystemUI (ie the UI for the entire OS) becomes extremely sluggish as well, to the point of hanging and killing itself. Assuming this user experience generalizes to other devices, I think we shouldn't do this for webview, purely out of the terrible user experience. Maybe we can enable our gpu watchdog for single-process as well, but only enable it when there are webgl contexts around? I do agree there isn't really any difference between webview and in-process chrome, and whatever we end up doing, we should do the same thing for both of them.
,
Apr 6 2018
Turning on the GPU watchdog in single-process mode when WebGL contexts are active sounds like a good experiment. To keep the code changes simple, we could just try enabling it all the time in single-process mode and then run some of these stress tests. The result will either be that the browser terminates and the system becomes responsive again, which is the best result, or that we don't have the ability to recover from this, and we abandon the effort to turn on WebGL on these devices. Right? I assume that we can't provide more information to Android about why the browser shut itself down. Bo, is there any chance you or jamwalla@ can try this experiment too?
,
Apr 6 2018
CL I tried locally: https://chromium-review.googlesource.com/c/chromium/src/+/1000095

Watchdog works. Chrome/webview suicides after 10 seconds. However this earlier observation actually makes a difference between chrome and webview:

> I can't explain the difference between chrome and webview honestly, why SystemUI manages to throw up the ANR dialog for webview but not chrome.

With webview, SystemUI is still usable, just super slow. But things recover after webview suicides. I'd consider this acceptable.

With chrome, SystemUI appears to just hang. Since chrome's watchdog timeout is 10s, and SystemUI's timeout is presumably 5s (timeout for ANR), SystemUI still suicides (after chrome dies). That doesn't seem acceptable.

The problem is I don't understand what's the difference between webview and chrome that's causing this difference in behavior. So I can't reason if this is just a fluke on this device, or there is reason to believe the same behavior generalizes to other devices as well.
,
Apr 7 2018
Thanks for testing this. Hmm. Could you try turning down the GPU watchdog's timeout to 4 seconds and see what happens with Chrome? I'm guessing the issue is that Chrome's main thread isn't actually blocked, the GPU thread is, but in WebView the main thread is blocked, so the system discovers the problem sooner? Maybe in Chrome more of the pathological draw calls are issued to the system. We've certainly seen that behavior in the past when wiring up the DoS defense mechanisms for TDR. If the ~4 second watchdog timer is sufficient then could you change GpuInit::InitializeInProcess to be able to configure the GPU watchdog with a 4 second timeout? Thanks much for continuing to test this.
,
Apr 9 2018
> Maybe in Chrome more of the pathological draw calls are issued to the system.

That sounds plausible. I guess chrome's "proper" solution there would involve the gpu scheduler? Should someone explore that, or do we not care much about chrome here and only worry about webview?

4s seems too aggressive for production. Probably going to catch a lot of false positives.
,
Apr 11 2018
sunnyps@ and I discussed whether the GPU scheduler could help for this purpose and the short answer is that it doesn't have enough information to do so. Let's focus on WebView for this case. Do we have an agreed-upon resolution? Should we add a mechanism to specialize GPU blacklist entries to WebView/non-WebView, and restrict the GPU blacklist entry here: https://cs.chromium.org/chromium/src/gpu/config/software_rendering_list.json?l=576 so it only applies to non-WebView?
,
Apr 11 2018
For webview-only, I'm okay with these together:
* enable watchdog (for in-process gpu) only when webgl context is present
* remove blacklist for webgl
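
The first bullet above amounts to gating the in-process watchdog on the number of live WebGL contexts. A rough TypeScript model of that gating is below; `InProcessWatchdogGate` and its methods are hypothetical names for illustration and say nothing about how Chromium would actually wire this up. Chrome's existing out-of-process watchdog is not modeled.

```typescript
// Hypothetical model of "enable watchdog only when a webgl context is present".
class InProcessWatchdogGate {
  private liveContexts = 0;

  // Call when a WebGL context is created.
  onContextCreated(): void {
    this.liveContexts += 1;
  }

  // Call when a WebGL context is destroyed; never drops below zero.
  onContextDestroyed(): void {
    this.liveContexts = Math.max(0, this.liveContexts - 1);
  }

  // The watchdog should run only while at least one context is alive.
  get watchdogEnabled(): boolean {
    return this.liveContexts > 0;
  }
}
```

Apps that never create a WebGL context would keep the current behavior, with no watchdog running in the browser process.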
,
Jun 18 2018
,
Jun 22 2018
Based on this thread a possible device to test for this is an Asus ZenFone 5: https://www.construct.net/forum/construct-3/how-do-i-8/how-to-make-c3-games-work-fast-135318
,
Sep 10
boliu@ suggested in a conversation today that the way to make a blacklist entry WebView-specific would be to add an Android-only "IsWebView()" method to GpuClient and override it in the WebView embedder.
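
The shape of that suggestion, sketched in TypeScript purely for illustration. The real classes are C++ and may look quite different; `WebViewGpuClient`, `BlacklistEntry`, `appliesToWebView` and `entryApplies` are hypothetical names, and only `GpuClient` and `IsWebView()` come from the comment above.

```typescript
// Illustrative only: the real GpuClient is a C++ class.
class GpuClient {
  // Default embedder (e.g. Chrome) is not WebView.
  isWebView(): boolean {
    return false;
  }
}

// The WebView embedder overrides the method to identify itself.
class WebViewGpuClient extends GpuClient {
  isWebView(): boolean {
    return true;
  }
}

// Hypothetical blacklist entry that can be limited to non-WebView embedders.
interface BlacklistEntry {
  description: string;
  appliesToWebView: boolean;
}

// An entry marked as not applying to WebView is skipped for the WebView client.
function entryApplies(entry: BlacklistEntry, client: GpuClient): boolean {
  return entry.appliesToWebView || !client.isWebView();
}
```

With the robustness entry marked `appliesToWebView: false`, Chrome would keep the current behavior while WebView would skip the entry.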
Comment 1 by zmo@chromium.org
, Feb 9 2018