Issue 591959

Starred by 15 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 2
Type: Bug


Hotlists containing this issue:
Hotlist-1
cr39


Network change causes WebView to be unable to communicate

Reported by j...@saxobank.com, Mar 4 2016

Issue description

Device name: Samsung Galaxy S6 edge (SM-G925F)
Android version: 5.1.1
WebView version (from system settings -> Apps -> Android System WebView):  49.0.2623.63
Application: WebView Developer Browser
Application version: 1.2

URLs (if applicable): www.google.com

The Android System WebView seems to be unable to pick up network changes, often causing the communication to hang.

We see this problem with all of our apps using WebView, on both version 48 and 49 beta of the WebView.

The same scenario in Android Chrome (48.0.2564.95) on the same device works without problems.

The particular app used for reproducing this is available at:
https://play.google.com/store/apps/details?id=com.webviewbrowser&hl=en
Source code for the app is at:
https://github.com/austindavidbrown/WebView-Developer-Browser


Preconditions:
The device has access to both Wi-Fi and cellular, both are enabled.

Steps to reproduce:
1. Open the WebView Developer Browser app and navigate to www.google.com.
2. Search for "goat"
3. Swipe from the top of the screen to open the quick settings, and disable Wi-Fi
4. Search for "cheese"

Expected result:
Google search results for "cheese".

Actual result:
The page appears to hang and no new search results are shown.

 

Comment 1 by b...@chromium.org, Mar 4 2016

Components: Internals>Network>Connectivity

Comment 2 by torne@chromium.org, Mar 5 2016

Cc: timvolod...@chromium.org
WebView doesn't observe network changes (and never has), but I wouldn't expect that to be necessary for the scenario you describe, since the second search would send a new request through whatever the default network connection was at the time.

Comment 3 by j...@saxobank.com, Mar 5 2016

I believe what is happening is that WebView is using an old stale connection for the second search. (Admittedly, this is conjecture on my behalf.)
This seems like an important bug to fix.  There are many things in Chrome's (and hence WebView's) networking stack that must be flushed upon network changes (e.g. socket pools, HTTP/2 and QUIC connection pools, DNS result caches, etc.) to ensure correctness and prevent hangs.  This is the reason we have the NetworkChangeNotifier.  It looks like WebView's AwNetworkChangeNotifier::OnConnectionTypeChanged() does nothing.
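
To illustrate why a no-op change handler matters, here is a minimal, self-contained Java sketch. It is not Chromium code: `ConnectionTypeObserver`, `ToyDnsCache` and `ToyNotifier` are hypothetical names invented for this example, and the cache merely stands in for any state (socket pools, DNS results) that is only valid on the network it was learned on.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical observer interface; not the real Chromium API.
interface ConnectionTypeObserver {
    void onConnectionTypeChanged(String newType);
}

// Toy cache standing in for state that is only valid on one network.
final class ToyDnsCache implements ConnectionTypeObserver {
    private final Map<String, String> entries = new HashMap<>();

    void put(String host, String address) {
        entries.put(host, address);
    }

    String lookup(String host) {
        return entries.get(host); // null when nothing is cached
    }

    @Override
    public void onConnectionTypeChanged(String newType) {
        entries.clear(); // flush everything learned on the old network
    }
}

// Toy notifier. With forwarding == false it swallows every change,
// mimicking a handler that does nothing.
final class ToyNotifier {
    private final List<ConnectionTypeObserver> observers = new ArrayList<>();
    private final boolean forwarding;

    ToyNotifier(boolean forwarding) {
        this.forwarding = forwarding;
    }

    void addObserver(ConnectionTypeObserver observer) {
        observers.add(observer);
    }

    void onConnectionTypeChanged(String newType) {
        if (!forwarding) {
            return;
        }
        for (ConnectionTypeObserver observer : observers) {
            observer.onConnectionTypeChanged(newType);
        }
    }
}

final class StaleStateDemo {
    public static void main(String[] args) {
        for (boolean forwarding : new boolean[] {false, true}) {
            ToyDnsCache cache = new ToyDnsCache();
            cache.put("www.google.com", "address-seen-on-wifi");
            ToyNotifier notifier = new ToyNotifier(forwarding);
            notifier.addObserver(cache);
            notifier.onConnectionTypeChanged("cellular");
            System.out.println("forwarding=" + forwarding
                    + " cached=" + cache.lookup("www.google.com"));
        }
    }
}
```

With forwarding disabled the entry learned on Wi-Fi survives the switch to cellular, which is the kind of stale state that could make a later request hang.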

Comment 5 by torne@chromium.org, Mar 16 2016

Yes, we know, but our attempts to enable this currently completely break the ability to use the network at all: when you create the first webview and start up the network stack, the first connectivity change event comes in at the perfect time to immediately nuke all the connections you are making to do the first pageload, resulting in very frequent failures to load the first page.

Chrome doesn't seem to have this problem because it registers for network change notifications early on in its startup process, long before there is a tab being shown. WebView doesn't have an early startup, and is frequently loading a page immediately after creation.
@torne: There should be no initial signal from the NCN.  Signals should only be generated when actual network changes are witnessed.

Comment 7 by torne@chromium.org, Mar 16 2016

I'm pretty sure we observed this repeatably. Maybe something was hooked up
wrong for WebView but it seemed to happen for Chrome as well (just at an
early enough point not to disrupt things).
Adding relevant bug for reference: crbug.com/520088.

The issues mentioned above arise because the NCN is not always enabled. In Chrome it is tied to ApplicationStatus, in WebView to the presence of live webviews. So the new signal can come when a new webview is created. pauljensen@ and I had a similar discussion about this; see e.g. crbug.com/520088 #39.

However, it's probably worth investigating. One idea would be to enable more of the stack in WebView incrementally rather than all at once, for example by extending what we already did for the Network Information API, while making sure to avoid the issue we had with crbug.com/520088.

If you unregister the NCN when no WebViews are present and then register it when a WebView comes along, which is how I think the AwNCN works, you could get network change signals when the NCN is registered if there have been network changes in the intervening time (when there were no WebViews).  If the Chrome network stack is alive while there are no WebViews, then this behavior is required for correctness.  If the Chrome network stack is not alive while there are no WebViews, then you could shut down and delete the NCN completely while there are no WebViews, which would prevent any possible initial signal when the NCN is registered again.

Comment 10 by torne@chromium.org, Mar 17 2016

Yes, the network stack is always live, we don't have any way to shut it off, and yes, the change signal is required for correctness, but it's basically *always* guaranteed to come at the worst possible time as a result of this, and so unless we can find another strategy we can't really do it :)

The point of unregistering it when there's no WebView is to avoid the app being woken for every network change just because it used a WebView at some time in the past. Ideally we would unregister it any time there wasn't a WebView in the foreground, even if they existed (which is how Chrome does it), but we don't have a reliable signal for whether we're in the foreground or not, and so at least handling the case of "no webviews existing" was the minimum bar.
Chrome's network stack cannot reliably be used without listening for network change signals, so we need to fix this.

What about a solution like:
1. When AwNCN is unregistered send out a network change signal to flush any state that could become stale.
2. When AwNCN is registered tell the Java NCN to re-initialize (basically just rerun lines 420-424 of NetworkChangeNotifierAutoDetect.java)
3. Reconnect the AwNetworkChangeNotifier::OnConnectionTypeChanged() to send the network change signals
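
The three steps above can be sketched as a small, self-contained Java model. This is only an illustration under stated assumptions: `ProposedNotifier` and its methods are hypothetical names, not the real AwNCN or Java NCN, and "re-initialize" in step 2 is assumed here to mean "take the current network as the new baseline without signalling".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the proposal; not the real AwNetworkChangeNotifier.
final class ProposedNotifier {
    interface Observer {
        void onNetworkChanged();
    }

    private final List<Observer> observers = new ArrayList<>();
    private boolean registered = false;
    private String lastSeenNetwork = null;

    void addObserver(Observer observer) {
        observers.add(observer);
    }

    // Step 1: on unregistering, signal once so state that could go stale
    // while nobody is listening gets flushed now.
    void unregister() {
        if (!registered) {
            return;
        }
        registered = false;
        notifyObservers();
    }

    // Step 2 (assumed meaning): adopt the current network as the baseline
    // silently, so registering never produces an initial signal.
    void register(String currentNetwork) {
        registered = true;
        lastSeenNetwork = currentNetwork;
    }

    // Step 3: while registered, forward genuine changes to observers.
    void onConnectionTypeChanged(String currentNetwork) {
        if (!registered || currentNetwork.equals(lastSeenNetwork)) {
            return;
        }
        lastSeenNetwork = currentNetwork;
        notifyObservers();
    }

    private void notifyObservers() {
        for (Observer observer : observers) {
            observer.onNetworkChanged();
        }
    }
}

final class ProposalDemo {
    public static void main(String[] args) {
        int[] signals = {0};
        ProposedNotifier notifier = new ProposedNotifier();
        notifier.addObserver(() -> signals[0]++);

        notifier.register("wifi");
        notifier.unregister();                        // flush signal
        notifier.onConnectionTypeChanged("cellular"); // unobserved change
        notifier.register("cellular");                // silent re-baseline
        notifier.onConnectionTypeChanged("wifi");     // genuine change
        System.out.println("signals delivered: " + signals[0]);
    }
}
```

The point of the design is that the disruptive signal moves from registration time, when a page load is likely starting, to unregistration time, when little should be using the network.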

Comment 12 by torne@chromium.org, Mar 17 2016

A couple of questions, sorry if I'm just not understanding:

1) What does Chrome (on Android) actually do currently? Does it send out a network change signal every time chrome goes into the background as you're proposing here?

2) If we send out a network change signal when we unregister the notifier, isn't that going to kill all current connections that are doing something at that time, instead of killing them when we re-register? I guess we're less likely to be doing useful network communication at that point (especially right now when it's based on the last webview being deleted; I guess only service workers or other weird cases might be using the network then?), but if we do ever find a sensible way to tie this to background state, I'm not sure that we'd expect all inflight network requests to be cancelled upon going into the background.
1) No. Chrome on Android unregisters when the app is backgrounded.  When the app is foregrounded it re-registers, and if there has been a network change in the intervening time, it sends out a network change signal.

2) If there are no WebViews when we unregister then I don't imagine users receiving any ERR_NETWORK_CHANGED errors, so there shouldn't be significant fallout.  If we do tie this to background state, I imagine we could avoid using this work-around (to send a network change signal on unregistering).

Comment 14 by torne@chromium.org, Mar 23 2016

How does the timing work out for Chrome? For example, is there actually anything preventing this sequence of events:

1) Send Chrome to the background, NCN unregisters
2) Network changes, Chrome doesn't observe this because it's unregistered.
3) Tap a link in another application that causes a new Chrome tab to appear in the foreground
4) New tab starts sending network requests to load resources
5) NCN re-registers and discovers the network has changed
6) New tab's network requests get cancelled with ERR_NETWORK_CHANGED

This appears to be analogous to what's breaking it in WebView. If this sequence can't happen in Chrome, how come?
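
The ordering problem in steps 4 to 6 can be shown with a toy model. The Java below is a sketch only: `ToyNetworkStack` is a hypothetical class, and it assumes that a change signal fails every request in flight at that moment with ERR_NETWORK_CHANGED, while requests started afterwards are unaffected.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a network stack; not Chromium code.
final class ToyNetworkStack {
    private final Map<String, String> results = new LinkedHashMap<>();
    private final List<String> inFlight = new ArrayList<>();

    void startRequest(String url) {
        inFlight.add(url);
    }

    // A change signal cancels whatever is in flight right now.
    void onNetworkChanged() {
        for (String url : inFlight) {
            results.put(url, "ERR_NETWORK_CHANGED");
        }
        inFlight.clear();
    }

    // Requests that survived until now complete normally.
    void completeAll() {
        for (String url : inFlight) {
            results.put(url, "OK");
        }
        inFlight.clear();
    }

    String resultOf(String url) {
        return results.get(url);
    }
}

final class RaceDemo {
    public static void main(String[] args) {
        // Request starts first, then the late change signal arrives.
        ToyNetworkStack requestFirst = new ToyNetworkStack();
        requestFirst.startRequest("page");
        requestFirst.onNetworkChanged();
        requestFirst.completeAll();
        System.out.println("request first: " + requestFirst.resultOf("page"));

        // Change signal is processed before the request starts.
        ToyNetworkStack signalFirst = new ToyNetworkStack();
        signalFirst.onNetworkChanged();
        signalFirst.startRequest("page");
        signalFirst.completeAll();
        System.out.println("signal first: " + signalFirst.resultOf("page"));
    }
}
```

Only the relative order of the first request and the delayed signal differs between the two runs, which is why the outcome depends on timing.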

If there are no WebViews the risk is lower, yes, as there are only a few things like service workers that might be using the network stack then. We can't tie it to the background state in WebView as we have no sensible way to observe it (we're just a View, we don't have a background state). The only useful signal we have right now is whether any WebViews exist; we might be able to come up with some other signals but haven't got them currently.
There isn't anything preventing the sequence of events you mention for Chrome.  I imagine this doesn't cause too many issues for Chrome (but could for WebView) for a couple reasons:
1. Chrome will fire up a new renderer to load the link.  Firing up the renderer is probably so slow that by the time it goes to issue network requests, the network change signal has already finished propagating through the network stack (and won't cause an ERR_NETWORK_CHANGED).
2. Chrome has network error pages that retry.  So even if an ERR_NETWORK_CHANGED occurs, the page may reload instantly and the user might not see the error.

Comment 16 by torne@chromium.org, Mar 31 2016

OK. So that's the problem for WebView: it's *very* common that what will happen in an app is "new WebView()" (causing us to register the NCN if there were previously 0 webviews) immediately followed by "webview.loadUrl("whatever")", which seems to cause this sequence of events to happen maybe 25% of the time. Our network errors don't retry (we can't do this really as we signal errors to the app to let them handle them if they want), and the time lag is much smaller. :/

So, to enable it in WebView we'd need to find a way to stop this from happening...
My proposal in Comment 11 should work for WebView.  I didn't know WebView supported service workers...that complicates things.  One possible solution to the sub-problem of service workers would be to register for network events when the service workers are running; this shouldn't have a significant cost as the service worker is already keeping the CPU active and probably exchanging network traffic.

Comment 18 by torne@chromium.org, Mar 31 2016

Our service worker support is a bit weird and probably doesn't work quite right at the moment; when we fix it we should probably make them count as "alive" instead of just counting WebView instances, yes.
Yes, interesting idea, actually I am currently looking at the various edge cases for service workers in webview...

I'm not exactly sure regarding #11; I will need to take a deeper look at this. My current thinking is that the NCN propagating changes to its various observers can still be an issue. Upon the very first initialization of the NCN there are no observers, so we can set any state without triggering 'rebalancing' of the network stack. After the browser process is initialized there are observers, so any change in state will propagate to them, potentially causing ongoing network requests to be dropped.

Owner: timvolod...@chromium.org
Status: Assigned (was: Unconfirmed)
assigned to tim to remove from bug cop queue :)

Comment 21 by j...@saxobank.com, Apr 5 2016

Thanks for looking into this issue, your assistance is very much appreciated. I realize that this is harder to fix than it looks from the outside.
While waiting for the proper solution inside of the WebView, is there something we could do in our native application to mitigate this?
For example, if we listen for the network change signal, could we destroy and recreate the WebView to flush the connections? The downside is that we might destroy the WebView in situations where no stale connections were cached.
Any good suggestions are welcome.
Destroying and recreating the webview won't do anything; the state of the chromium network stack is global and will not be reset by that.

If you wanted to work around it, you should listen for network changes and then just do a new navigation in the WebView to cancel any previous navigations, i.e. call "loadUrl" again.
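
A minimal sketch of that workaround, written as plain Java so it can run outside Android. `PageLoader` and `ReloadOnNetworkChange` are hypothetical names: `PageLoader` stands in for the single `loadUrl` call on the WebView, and `onNetworkChanged()` is assumed to be invoked by whatever network listener the app already has (for instance a `ConnectivityManager.NetworkCallback`), dispatched to the UI thread. That wiring is left out here.

```java
// Hypothetical abstraction over the one WebView call the workaround needs.
interface PageLoader {
    void loadUrl(String url);
}

// Remembers the last URL the app navigated to and repeats the navigation
// when the app reports a network change.
final class ReloadOnNetworkChange {
    private final PageLoader loader;
    private String currentUrl; // null until the first navigation

    ReloadOnNetworkChange(PageLoader loader) {
        this.loader = loader;
    }

    // Use this instead of calling loadUrl directly so the URL is tracked.
    void navigate(String url) {
        currentUrl = url;
        loader.loadUrl(url);
    }

    // Call from the app's own network change listener.
    void onNetworkChanged() {
        if (currentUrl != null) {
            // A new navigation cancels any previous, possibly hung, one.
            loader.loadUrl(currentUrl);
        }
    }
}

final class WorkaroundDemo {
    public static void main(String[] args) {
        ReloadOnNetworkChange helper =
                new ReloadOnNetworkChange(url -> System.out.println("loadUrl " + url));
        helper.navigate("https://www.google.com/search?q=goat");
        helper.onNetworkChanged(); // e.g. Wi-Fi was just disabled
    }
}
```

Note that this only restarts top-level navigations known to the app; it does not flush the underlying connection state, so it is a mitigation rather than a fix.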
Issue 585836 has been merged into this issue.
I just tried the steps from the original description, both with the listed app (WebView Developer Browser) and with our SystemWebViewShell. In both cases I get a "net::ERR_INTERNET_DISCONNECTED" after switching off Wi-Fi, as expected.

device: Nexus9, Android N, WebView 53
@timvolodine comment 24: This may not be easily reproduced, as it involves stale, unflushed network stack state interfering with future network requests.  I should also note that trying to reproduce this with a Nexus 9 will fail to show the bug unless your Nexus 9 also has an active LTE connection; otherwise, turning off Wi-Fi leaves the Nexus 9 with no network connection at all, so everything will fail instantly.  Note how the original report came from a phone.

Comment 26 by xing...@intel.com, Apr 8 2017

I reproduced this issue on a Xiaomi 4 (ARM), Android 6.0.1/55.0.2883.91, with WebViewDemo (using the default WebView: 55.0.2883.91). The key to reproducing it is to click reload very quickly after turning off Wi-Fi.

Case code:
https://github.com/axinging/AndroidDemo/blob/master/WebViewDemo

Steps:
1) With Wi-Fi on, wait for the page load to finish, turn Wi-Fi off, then quickly click the RELOAD button.


xiaomi4_webview_reload.mp4 (6.2 MB)

Comment 27 Deleted

Comment 28 by ro...@aruodas.lt, Mar 26 2018

This still exists. The app is as fresh as possible; all libraries were updated before this test.
In the video you can see that the app works after the first network "change", but not after the second one, where a link click is followed by an instant refresh.
Kapture 2018-03-26 at 11.44.19.mp4 (1.4 MB)
