Update WebView.loadDataWithBaseUrl documentation |
|||||||||||||
Issue descriptionThe documentation for loadDataWithBaseUrl isn't very specific about what the exact effects of changing the base url and history url actually are. The "base URL" is not really used the same way as the HTML spec concept of the base URI (it's not just for resolving relative links). It would be good to investigate its actual current behaviour to make sure we are clear about the semantics that it has (and that apps may be relying on), and update the documentation to describe it more clearly and ideally with terminology compatible with Web specifications.
,
Dec 15 2016
,
Dec 16 2016
I won't have time to look at this for a long time (I'm not familiar with loadDataWithBaseUrl either), so this will probably just be lying dormant for a while - if that's alright I'll have a look whenever I get time.
,
Mar 9 2017
,
Mar 9 2017
,
Sep 8 2017
,
Sep 8 2017
,
Sep 13 2017
,
Sep 18 2017
I played with inputs to this api and put together a table of things I noticed. I'd love feedback if anyone has other ideas of effects to look at or of what the meaning of this behavior is. https://docs.google.com/a/google.com/document/d/1VY6TzRJXo_IXVxc3vDJzrW9xxXtef0yus1qjgl7gdTo/edit?usp=sharing
,
Sep 19 2017
The main effects that are interesting and need documenting are the effects on web content and the web platform, not the other Java APIs in WebView. I don't think we're going to be able to get the information we need just by trying things out - we likely need to investigate the implementation and then run experiments to confirm our understanding is correct. For example: the "base URL" determines the actual URL of the document, reflected in window.location and related properties, and therefore also determines the origin of the document, with all the usual security-related consequences. This is hinted at by the current documentation suggesting using loadDataWithBaseUrl instead of loadData if you need a particular origin for same-origin policy reasons, but it's not really explicitly stated anywhere and it's confusingly named (since there's already a "base URI" concept in HTML that means something else).
,
Sep 19 2017
not much investigation needed in that case then? base url should be the only url that the web platform sees
,
Sep 19 2017
Doesn't the history URL end up in the session history as well and exposed? I'm not even sure under what circumstances you end up navigating there :)
,
Sep 19 2017
well... "awhile" ago, refreshing a loadDataWithBaseURL would crash, so ymmv.. I guess a page can see its own history state. That's the only corner case then, probably. I guess the part that always confuses me is, which url am I gonna see in onPageStarted/Finished? In navigation/resource interception? etc etc.
,
Sep 19 2017
I'm new to this, so I'm not sure if there are other places to look at session history, but I know that canGoBack will return false if all you do is loadDataWithBaseUrl. (reload is fine, though) onPageFinished pretty much just depends on baseUrl. I could check abou the other cases!
,
Sep 19 2017
It's the current entry in the session history, so if you've only loaded one URL then there won't be anything to go back to, no. If you navigate somewhere else first then go back, it will go back to the history URL..
,
Sep 19 2017
stackoverflow says history.state isn't implemented in webkit. history.current/prev/next don't appear to work either. Maybe that's intentional? Either way, js doesn't seem to be able to get at the history url. maybe should double check all the callbacks on WebViewClient and WebChromeClient that pass up a url parameter, and check which url is passed up. There's a lot of them. Probably not all of them apply to loadDataWithBaseURL. torne, you think that's useful?
,
Sep 19 2017
The fact that when you go forward and back you end up somewhere different is itself an observable difference, is my point. The current documentation just says "The historyUrl is used for the history entry" and nothing else, which doesn't really explain what that actually means in practise. I didn't include any reference in this bug to the context of why I was filing it, unfortunately, and I'm not sure I remember :/
,
Sep 19 2017
> The fact that when you go forward and back you end up somewhere different is itself an observable difference, is my point. Err... bug? I mean if we have to save that gigantic data url in the navigation history in order to correctly support forward/back, then we probably should do that. I thought we did do that though. > The current documentation just says "The historyUrl is used for the history entry" and nothing else, which doesn't really explain what that actually means in practise. WebViewClient.doUpdateVisitedHistory? Just guessing..
,
Sep 19 2017
Isn't that exactly the point of having the history URL in the first place, specifically so that it *will* return to the history URL instead of the data? I thought that was the whole reason..
,
Sep 19 2017
who knows what the reason was. the design decision to make doUpdateVisitedHistory special makes to me right now So we should double check that, along with all the other callbacks?
,
Sep 19 2017
err, makes *no sense* to me
,
Sep 20 2017
I mean the actual back/forward navigation, not just doUpdateVisitedHistory. So I guess someone should check if that's even how history navigation actually works or if I'm mistaken.
,
Sep 20 2017
Yes, there's WebBackForwardList, ie the list of navigation entries for the webview. jamwalla is double checking all the APIs now
,
Sep 20 2017
No, I mean literally "what URL do you end up on after you navigate forward and then back from a page that was loaded with a historyUrl != baseUrl" - i.e. the observable behaviour to a user looking at the WebView, nothing to do with the APIs.
,
Sep 20 2017
I'd expect it be the same result as the original loadDataWithBaseURL, ie it should not hit the network.
,
Sep 21 2017
,
Sep 21 2017
Right, but I expect the opposite: if you call loadDataWithBaseUrl("http://foo.com", "hi", "text/html", "UTF-8", "http://www.google.com"), then loadUrl("about:blank"), then goBack(), I would expect the WebView to go hit the network and load www.google.com. If that's not what it does then I'm surprised, and that's why I think we need to document the effects better :)
,
Sep 21 2017
i.e. when it says that's the URL used for the history entry, I expect that it will *actually visit that URL* when it's accessed again via history.
,
Sep 21 2017
definitely not how that works :p eg a fragment "navigation" (ie just jump to a different part of the same document) still creates a new navigation entry. certainly wouldn't expect clicking back in on a fragment navigation in a loadDataWithBaseURL to start loading a different page..
,
Sep 21 2017
I tried the following:
webView.loadDataWithBaseUrl("http://foo.com", "hi", "text/html", "utf-8", "http://www.google.com)
webView.loadUrl("about:blank")
webView.goBack()
after the call to goBack:
webView.getUrl() is "http://www.google.com"
webView.getOriginalUrl() is "data:text/html;charset=utf-8;base64,"
onPageStarted is not called
onPageFinished is called with url "http://foo.com/"
Does that align with what we're expecting? How else can I check if webView is hitting the network?
,
Sep 21 2017
does the page still say "hi"?
,
Sep 21 2017
yup
,
Sep 21 2017
Wait, so does that mean the first time we go to the page the URL is http://foo.com but then after going forward and back it becomes http://www.google.com, but in both cases we just render the data? That's... *even more* weird and surprising than either option I would have guessed, if that's correct. What on earth is historyUrl even for? I'm having a hard time imagining why this feature even exists now :)
,
Sep 21 2017
The outputs in the first step (calling loadDataWithBaseUrl) and the third (going back to that same page) are actually almost identical: according to webView.getUrl(), we are always on http://www.google.com according to onPageFinished, we are always on http://foo.com The only difference I can see is that onPageStarted is not triggered by goBack()
,
Sep 21 2017
> Wait, so does that mean the first time we go to the page the URL is http://foo.com I would expect it to be google.com the first time as well. ie historyUrl always gets returned from webview.getUrl. surprise! > What on earth is historyUrl even for? I'm having a hard time imagining why this feature even exists now :) I had the same question a few replies up
,
Sep 21 2017
Huh. So if the difference between baseUrl and historyUrl here is that window.location is one thing and webview.getUrl is another thing then, uhm.. that's definitely surprising indeed. I thought the point of historyUrl was to enable you to have some sensible behaviour on going back (i.e. going to a real URL that could for example be intercepted by the app or loaded from a real server), because I thought we *didn't* save the actual data in history. I guess we do, though, so that doesn't work out. This is what I mean when I say the documentation needs to be *way* more explicit about the purpose/meaning of these things, though :) Ideally we would include actual practical examples of why you might want to do different things, but if we don't actually even know why these features exist at all that might be a challenge. (yet another question: if you loadDataWithBaseUrl on a huge data: URL and then call saveState, does your saved state bundle just explode from being too big? That's yet another reason I thought we didn't save it and used the historyUrl instead)
,
Sep 21 2017
I think you are agreeing with me now we should go through all the APIs and check which url is returned by which API :p > (yet another question: if you loadDataWithBaseUrl on a huge data: URL and then call saveState, does your saved state bundle just explode from being too big? That's yet another reason I thought we didn't save it and used the historyUrl instead) Looking at this: https://cs.chromium.org/chromium/src/android_webview/browser/state_serializer.cc?rcl=1b52ed8e9c06da6d5c846d2668408cbc95bbbb85&l=168 I don't think we used to, but we don't anymore, and that's a bug.. When I wrote state_serializer, the data was made into a url and just saved in the url field (iirc). But since then, there's now an optimization to put the data into another refcounted string field to avoid many multiple copies, but that field isn't saved there. arrggg
,
Sep 21 2017
> I don't think we used to, but we don't anymore, and that's a bug.. err, I *think* we used to
,
Sep 21 2017
arrg, I can't read (the version check threw me off..) GetDataURLAsString is saved. We are good. And yes, the gigantic data string will end up in the state bundle.
,
Sep 21 2017
Here's another wrinkle (no idea if this is clarifying haha): if i call
webView.loadDataWithBaseUrl("http://foo.com", "hi", "text/html", "utf-8", "http://www.google.com")
webView.loadUrl("http://www.google.com")
webView.goBack()
The webview can't go back to the originally loaded data. That is, if we loadDataWithBaseUrl and a given historyUrl and then navigate to that historyUrl, the second navigation (call to loadUrl(historyUrl)) doesn't create a new entry in the WebBackForwardList but instead changes the original one. In this case the WebBackForwardList has a single entry with url "http://google.com" and originalUrl "data:text/html;charset=utf-8;base64," when I would expect it to have 2 entries.
,
Sep 21 2017
Hmm... it's being treated as a refresh probably?
,
Sep 22 2017
So, re: historyUrl, it doesn't show up in doUpdatedHistory or in the WebBackForwardList. The only place it does appear is webView.getUrl(). In the WebView Shell Browser the only place getUrl called is if the browser is restoring state from a previous session. But otherwise (like, to update the URL bar) the browser uses the url from onPageFinished, which is baseUrl, not historyUrl. When do we intend for webView.getUrl() to be called?
,
Sep 23 2017
Hmm.. I wonder if we should just deprecate history url parameter and just always make it equal base url instead..
,
Sep 25 2017
Maybe? I dunno how we'd find out whether apps are relying on anything here. Also: I wonder if this behaviour has accidentally changed at some point. Does the old WebView show the same behaviours here? It seems plausible that this behaviour is weird and complex enough that we've changed it without anyone particularly noticing, though if that's the case that'd be evidence that apps aren't really relying on the details :)
,
Sep 26 2017
In org.chromium.content_public.browser.LoadUrlParams, the field that historyUrl is passed into is called mVirtualUrlForDataUrl, and the docs refer to it as a virtual URL. Is that a term that makes more sense for WebView's documentation?
,
Sep 26 2017
I like the docs for chrome's <webview/> tag here: https://developer.chrome.com/apps/tags/webview#method-loadDataWithBaseUrl "Loads a data URL with a specified base URL used for relative links. Optionally, a virtual URL can be provided to be shown to the user instead of the data URL."
,
Sep 26 2017
,
Sep 27 2017
I *think* webview tag copied us. Anyway, looking at code doesn't help much. Can you check older versions of webview (maybe try kitkat), and webview classic in jellybean and see if they all behave the same way
,
Sep 28 2017
Yeah, that kind of text in #46 isn't necessarily going to make sense for WebView - what URL gets shown to the user in WebView use cases (if there is even a URL shown at all) depends how the app implements its URL display and might be based on several different possible webview callbacks, and the earlier results in this bug suggest many of those callbacks don't return the history URL anyway.
,
Oct 10 2017
Revisiting this - behavior does change slightly between Jellybean and Oreo, though the behavior of getUrl and onPageStarted/Finished doesn't change. Here's yet another sheet with these behaviors. https://docs.google.com/a/google.com/spreadsheets/d/1OAQKGfYYY_p53EYceUv0BbbiTvNKojZ_cj_XEz2kAac/edit?usp=sharing
,
Oct 12 2017
When you mention the different OS versions there, I assume you mean with whichever version of webview is preinstalled on that OS version? If the URLs we return from these calls have changed then that's probably actual bugs we've introduced and should investigate to see why things have changed.. :(
,
Oct 12 2017
Hmm, that history to empty data url transition doesn't match anything I can think of, looking at things from git log -Gdata_url_as_string -- content/renderer/render_frame_impl.cc maybe worth figuring out which CL introduced that change
,
Oct 13 2017
That change - from historyUrl to empty data url in getOriginalUrl, doUpdateVisitedHistory, onPageCommitVisible, and WebBackForwardList - happened between chrome versions 49.0.2573.0 and 49.0.2574.0. This is probably the commit in question: https://chromium.googlesource.com/chromium/src/+/15890e4b345bfef3854566899a0a462f7c5939a9
,
Oct 14 2017
so behavior changed a looong time ago, and at least I've not heard anything from that. perhaps that's a good indication we can just deprecated history url, and replace it with base url instead for all the APIs?
,
Oct 16 2017
Since getUrl (and only getUrl) still returns the historyUrl, it seems like there is a use case where the baseUrl is important (for the reasons described in the documentation) but we prefer to show another url (maybe to the user? and usually about:blank) through getUrl. Feels like a significant change to start returning baseUrl in getUrl, but maybe it wouldn't actually matter? In the mean time, would it be correct to update the documentation to refer to getUrl (ie "the history url is used for calls to getUrl")? The current docs say the historyUrl is used for the "history entry," which, if this refers to the WebBackForwardList, is no longer true.
,
Oct 30 2017
(ping - thoughts on above comment?) I'm going to add a UMA metric to see how historyUrl is used. The bug for this is crbug.com/779661 .
,
Oct 30 2017
I think we should just deprecate history url, and make all APIs return the base url instead. What happens to documentation depends on what we decide here, so hold off on that too I guess
,
Oct 31 2017
Yeah, getting some metrics seems like a good idea, because without some idea of how this is being used in the field I'm not sure what deprecating it would really mean :) Bo, I assume what you actually mean is "we should just change the existing behaviour so that the parameter is ignored", which would presumably be roughly equivalent to clients always passing the same URL for base and history in terms of the effects? Just for apps that target a new API level, or what?
,
Oct 31 2017
Yeah that. For all apps. It already happened inadvertently once before (#53), so it just looks like no one actually cares
,
Oct 31 2017
For all apps isn't really a deprecation, then, but yes, I guess we could try it and see what happens, or collect some stats first?
,
Oct 31 2017
We could collect data just for having the data available for other things in the future. But if we want to use data to decide whether to break the API or not, then what data do you want to look for and how should the decision be made specifically? (Going with advice from that stats class, that we should state the experiment first before collecting data)
,
Oct 31 2017
If there's a significant amount of usage where (historyUrl != baseUrl && historyUrl != null) then that suggests to me that there might be some risk here. I'm still not entirely convinced that the return values of some of WebView's Java APIs are the only things affected by historyUrl, so the fact that we've changed those in the past doesn't quite guarantee that changing more of them is safe. If almost all usage is either using the same URL for both, or just passing null/emptystring for historyUrl (which I don't think is likely to be an interesting case, since we treat that as about:blank if I understand correctly, and that's probably Not Important?) then just ignoring it seems pretty safe.
,
Oct 31 2017
> "we should just change the existing behaviour so that the parameter is ignored" Do we know if this will break CTS or integration tests? Uploading an experimental CL implementing this might be a good first step to check.
,
Nov 1 2017
Here's a CL that sets historyUrl = baseUrl: https://chromium-review.googlesource.com/c/chromium/src/+/748304 It breaks some tests: https://storage.cloud.google.com/chromium-result-details/html/webview_instrumentation_test_apk_android_n5x_swarming_rel_294368_2017_11_01_T02_58_47-UTC I think these are all expected?
,
Nov 1 2017
yeah that's expected
,
Nov 11 2017
Another weird thing I saw: issue 779671 (the app calls `loadDataWithBaseUrl(null)`)
,
Nov 11 2017
Er, that's issue 783942
,
Mar 15 2018
Ping, any update on this bug? Also, boliu@ tagged this bug in this PlzNavigate-related test [1], is this bug actually relevant to that? [1] https://cs.chromium.org/chromium/src/android_webview/javatests/src/org/chromium/android_webview/test/AwContentsClientShouldInterceptRequestTest.java?q=AwContentsClientShouldInterceptRequestTest+testLoadDataWithBaseUrlTriggersShouldInterceptRequest&sq=package:chromium&l=987
,
Mar 15 2018
I proposed in #43 above to deprecate historyUrl. How that TODO gets resolved will depend on the historyUrl decision
,
Mar 15 2018
If the UMA results for this are good (it's been 3 months now) I think we could deprecate it: https://uma.googleplex.com/p/android_webview/histograms?endDate=latest&dayCount=28&histograms=WebView.LoadDataWithBaseUrl.HistoryUrl&fixupData=true&showMax=true&filters=channel%2Ceq%2C4%2Cisofficial%2Ceq%2CTrue&implicitFilters=isofficial what do you all think?
,
Mar 15 2018
imo definitely good enough to update the documentation to say the parameter is deprecated maybe wait until we have finch before actively changing behavior of things
,
Mar 15 2018
Do we have a buganizer to track the docs change?
,
Jun 1 2018
Docs change tracked internally as http://b/75033113
,
Jun 5 2018
We tried to send a CL to android to change the documentation, but they prefer waiting until the next release and actually deprecating the parameter then. I updated b/75033113 to reflect that.
,
Jun 18 2018
Passing to Changwan - after discussion ag/4074610, this bug should just be to deprecate the historyUrl parameter (i.e. don't change documentation until the parameter is actually deprecated)
,
Jan 15
|
|||||||||||||
►
Sign in to add a comment |
|||||||||||||
Comment 1 by ti...@chromium.org
, Dec 15 2016Status: Assigned (was: Available)