New issue
Advanced search Search tips

Issue 669885 link

Starred by 2 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 3
Type: Bug

Blocked on:
issue 848305



Sign in to add a comment

Update WebView.loadDataWithBaseUrl documentation

Project Member Reported by torne@chromium.org, Nov 30 2016

Issue description

The documentation for loadDataWithBaseUrl isn't very specific about what the exact effects of changing the base url and history url actually are. The "base URL" is not really used the same way as the HTML spec concept of the base URI (it's not just for resolving relative links).

It would be good to investigate its actual current behaviour to make sure we are clear about the semantics that it has (and that apps may be relying on), and update the documentation to describe it more clearly and ideally with terminology compatible with Web specifications.
 

Comment 1 by ti...@chromium.org, Dec 15 2016

Owner: sgu...@chromium.org
Status: Assigned (was: Available)
Selim, please help triage.

Comment 2 by sgu...@chromium.org, Dec 15 2016

Owner: gsennton@chromium.org
I won't have time to look at this for a long time (I'm not familiar with loadDataWithBaseUrl either), so this will probably just be lying dormant for a while - if that's alright I'll have a look whenever I get time. 
Owner: sgu...@chromium.org
Labels: -Pri-3 Pri-2
Owner: jamwalla@chromium.org
Labels: -Pri-2 Pri-3
Cc: paulmiller@chromium.org
Cc: sgu...@chromium.org gsennton@chromium.org
I played with inputs to this api and put together a table of things I noticed. I'd love feedback if anyone has other ideas of effects to look at or of what the meaning of this behavior is.

https://docs.google.com/a/google.com/document/d/1VY6TzRJXo_IXVxc3vDJzrW9xxXtef0yus1qjgl7gdTo/edit?usp=sharing

Comment 10 by torne@chromium.org, Sep 19 2017

The main effects that are interesting and need documenting are the effects on web content and the web platform, not the other Java APIs in WebView. I don't think we're going to be able to get the information we need just by trying things out - we likely need to investigate the implementation and then run experiments to confirm our understanding is correct.

For example: the "base URL" determines the actual URL of the document, reflected in window.location and related properties, and therefore also determines the origin of the document, with all the usual security-related consequences. This is hinted at by the current documentation suggesting using loadDataWithBaseUrl instead of loadData if you need a particular origin for same-origin policy reasons, but it's not really explicitly stated anywhere and it's confusingly named (since there's already a "base URI" concept in HTML that means something else).

Comment 11 by boliu@chromium.org, Sep 19 2017

not much investigation needed in that case then? base url should be the only url that the web platform sees

Comment 12 by torne@chromium.org, Sep 19 2017

Doesn't the history URL end up in the session history as well and exposed? I'm not even sure under what circumstances you end up navigating there :)

Comment 13 by boliu@chromium.org, Sep 19 2017

well... "awhile" ago, refreshing a loadDataWithBaseURL would crash, so ymmv..

I guess a page can see its own history state. That's the only corner case then, probably.

I guess the part that always confuses me is, which url am I gonna see in onPageStarted/Finished? In navigation/resource interception? etc etc.
Cc: -sgu...@chromium.org -gsennton@chromium.org boliu@chromium.org
I'm new to this, so I'm not sure if there are other places to look at session history, but I know that canGoBack will return false if all you do is loadDataWithBaseUrl. (reload is fine, though)

onPageFinished pretty much just depends on baseUrl. I could check abou the other cases!

Comment 15 by torne@chromium.org, Sep 19 2017

It's the current entry in the session history, so if you've only loaded one URL then there won't be anything to go back to, no. If you navigate somewhere else first then go back, it will go back to the history URL..

Comment 16 by boliu@chromium.org, Sep 19 2017

stackoverflow says history.state isn't implemented in webkit. history.current/prev/next don't appear to work either. Maybe that's intentional? Either way, js doesn't seem to be able to get at the history url.

maybe should double check all the callbacks on WebViewClient and WebChromeClient that pass up a url parameter, and check which url is passed up. There's a lot of them. Probably not all of them apply to loadDataWithBaseURL. torne, you think that's useful?

Comment 17 by torne@chromium.org, Sep 19 2017

The fact that when you go forward and back you end up somewhere different is itself an observable difference, is my point. The current documentation just says "The historyUrl is used for the history entry" and nothing else, which doesn't really explain what that actually means in practise.

I didn't include any reference in this bug to the context of why I was filing it, unfortunately, and I'm not sure I remember :/

Comment 18 by boliu@chromium.org, Sep 19 2017

> The fact that when you go forward and back you end up somewhere different is itself an observable difference, is my point.

Err... bug? I mean if we have to save that gigantic data url in the navigation history in order to correctly support forward/back, then we probably should do that. I thought we did do that though.

> The current documentation just says "The historyUrl is used for the history entry" and nothing else, which doesn't really explain what that actually means in practise.

WebViewClient.doUpdateVisitedHistory? Just guessing..

Comment 19 by torne@chromium.org, Sep 19 2017

Isn't that exactly the point of having the history URL in the first place, specifically so that it *will* return to the history URL instead of the data? I thought that was the whole reason..

Comment 20 by boliu@chromium.org, Sep 19 2017

who knows what the reason was. the design decision to make doUpdateVisitedHistory special makes to me right now

So we should double check that, along with all the other callbacks?

Comment 21 by boliu@chromium.org, Sep 19 2017

err, makes *no sense* to me

Comment 22 by torne@chromium.org, Sep 20 2017

I mean the actual back/forward navigation, not just doUpdateVisitedHistory.

So I guess someone should check if that's even how history navigation actually works or if I'm mistaken.

Comment 23 by boliu@chromium.org, Sep 20 2017

Yes, there's WebBackForwardList, ie the list of navigation entries for the webview.

jamwalla is double checking all the APIs now

Comment 24 by torne@chromium.org, Sep 20 2017

No, I mean literally "what URL do you end up on after you navigate forward and then back from a page that was loaded with a historyUrl != baseUrl" - i.e. the observable behaviour to a user looking at the WebView, nothing to do with the APIs.

Comment 25 by boliu@chromium.org, Sep 20 2017

I'd expect it be the same result as the original loadDataWithBaseURL, ie it should not hit the network.
Cc: ntfschr@chromium.org

Comment 27 by torne@chromium.org, Sep 21 2017

Right, but I expect the opposite: if you call loadDataWithBaseUrl("http://foo.com", "hi", "text/html", "UTF-8", "http://www.google.com"), then loadUrl("about:blank"), then goBack(), I would expect the WebView to go hit the network and load www.google.com. If that's not what it does then I'm surprised, and that's why I think we need to document the effects better :)

Comment 28 by torne@chromium.org, Sep 21 2017

i.e. when it says that's the URL used for the history entry, I expect that it will *actually visit that URL* when it's accessed again via history.

Comment 29 by boliu@chromium.org, Sep 21 2017

definitely not how that works :p

eg a fragment "navigation" (ie just jump to a different part of the same document) still creates a new navigation entry. certainly wouldn't expect clicking back in on a fragment navigation in a loadDataWithBaseURL to start loading a different page..
I tried the following:

webView.loadDataWithBaseUrl("http://foo.com", "hi", "text/html", "utf-8", "http://www.google.com)
webView.loadUrl("about:blank")
webView.goBack()

after the call to goBack:
webView.getUrl() is "http://www.google.com"
webView.getOriginalUrl() is "data:text/html;charset=utf-8;base64,"
onPageStarted is not called
onPageFinished is called with url "http://foo.com/"

Does that align with what we're expecting? How else can I check if webView is hitting the network?

Comment 31 by boliu@chromium.org, Sep 21 2017

does the page still say "hi"?
yup

Comment 33 by torne@chromium.org, Sep 21 2017

Wait, so does that mean the first time we go to the page the URL is http://foo.com but then after going forward and back it becomes http://www.google.com, but in both cases we just render the data?

That's... *even more* weird and surprising than either option I would have guessed, if that's correct.

What on earth is historyUrl even for? I'm having a hard time imagining why this feature even exists now :)
The outputs in the first step (calling loadDataWithBaseUrl) and the third (going back to that same page) are actually almost identical:

according to webView.getUrl(), we are always on http://www.google.com
according to onPageFinished, we are always on http://foo.com

The only difference I can see is that onPageStarted is not triggered by goBack()

Comment 35 by boliu@chromium.org, Sep 21 2017

> Wait, so does that mean the first time we go to the page the URL is http://foo.com 

I would expect it to be google.com the first time as well. ie historyUrl always gets returned from webview.getUrl. surprise!

> What on earth is historyUrl even for? I'm having a hard time imagining why this feature even exists now :)

I had the same question a few replies up

Comment 36 by torne@chromium.org, Sep 21 2017

Huh. So if the difference between baseUrl and historyUrl here is that window.location is one thing and webview.getUrl is another thing then, uhm.. that's definitely surprising indeed.

I thought the point of historyUrl was to enable you to have some sensible behaviour on going back (i.e. going to a real URL that could for example be intercepted by the app or loaded from a real server), because I thought we *didn't* save the actual data in history. I guess we do, though, so that doesn't work out.

This is what I mean when I say the documentation needs to be *way* more explicit about the purpose/meaning of these things, though :) Ideally we would include actual practical examples of why you might want to do different things, but if we don't actually even know why these features exist at all that might be a challenge.

(yet another question: if you loadDataWithBaseUrl on a huge data: URL and then call saveState, does your saved state bundle just explode from being too big? That's yet another reason I thought we didn't save it and used the historyUrl instead)

Comment 37 by boliu@chromium.org, Sep 21 2017

I think you are agreeing with me now we should go through all the APIs and check which url is returned by which API :p

> (yet another question: if you loadDataWithBaseUrl on a huge data: URL and then call saveState, does your saved state bundle just explode from being too big? That's yet another reason I thought we didn't save it and used the historyUrl instead)

Looking at this: https://cs.chromium.org/chromium/src/android_webview/browser/state_serializer.cc?rcl=1b52ed8e9c06da6d5c846d2668408cbc95bbbb85&l=168

I don't think we used to, but we don't anymore, and that's a bug..

When I wrote state_serializer, the data was made into a url and just saved in the url field (iirc). But since then, there's now an optimization to put the data into another refcounted string field to avoid many multiple copies, but that field isn't saved there. arrggg

Comment 38 by boliu@chromium.org, Sep 21 2017

> I don't think we used to, but we don't anymore, and that's a bug..

err, I *think* we used to

Comment 39 by boliu@chromium.org, Sep 21 2017

arrg, I can't read (the version check threw me off..) GetDataURLAsString is saved. We are good. And yes, the gigantic data string will end up in the state bundle.
Here's another wrinkle (no idea if this is clarifying haha): if i call

webView.loadDataWithBaseUrl("http://foo.com", "hi", "text/html", "utf-8", "http://www.google.com")
webView.loadUrl("http://www.google.com")
webView.goBack()

The webview can't go back to the originally loaded data. That is, if we loadDataWithBaseUrl and a given historyUrl and then navigate to that historyUrl, the second navigation (call to loadUrl(historyUrl)) doesn't create a new entry in the WebBackForwardList but instead changes the original one. In this case the WebBackForwardList has a single entry with url "http://google.com" and originalUrl "data:text/html;charset=utf-8;base64," when I would expect it to have 2 entries.


Comment 41 by boliu@chromium.org, Sep 21 2017

Hmm... it's being treated as a refresh probably?
So, re: historyUrl, it doesn't show up in doUpdatedHistory or in the WebBackForwardList. The only place it does appear is webView.getUrl().

In the WebView Shell Browser the only place getUrl called is if the browser is restoring state from a previous session. But otherwise (like, to update the URL bar) the browser uses the url from onPageFinished, which is baseUrl, not historyUrl.

When do we intend for webView.getUrl() to be called?

Comment 43 by boliu@chromium.org, Sep 23 2017

Hmm.. I wonder if we should just deprecate history url parameter and just always make it equal base url instead..

Comment 44 by torne@chromium.org, Sep 25 2017

Maybe? I dunno how we'd find out whether apps are relying on anything here.

Also: I wonder if this behaviour has accidentally changed at some point. Does the old WebView show the same behaviours here? It seems plausible that this behaviour is weird and complex enough that we've changed it without anyone particularly noticing, though if that's the case that'd be evidence that apps aren't really relying on the details :)
In org.chromium.content_public.browser.LoadUrlParams, the field that historyUrl is passed into is called mVirtualUrlForDataUrl, and the docs refer to it as a virtual URL. Is that a term that makes more sense for WebView's documentation?
I like the docs for chrome's <webview/> tag here: https://developer.chrome.com/apps/tags/webview#method-loadDataWithBaseUrl

"Loads a data URL with a specified base URL used for relative links. Optionally, a virtual URL can be provided to be shown to the user instead of the data URL."
Cc: changwan@chromium.org

Comment 48 by boliu@chromium.org, Sep 27 2017

I *think* webview tag copied us. Anyway, looking at code doesn't help much.

Can you check older versions of webview (maybe try kitkat), and webview classic in jellybean and see if they all behave the same way

Comment 49 by torne@chromium.org, Sep 28 2017

Yeah, that kind of text in #46 isn't necessarily going to make sense for WebView - what URL gets shown to the user in WebView use cases (if there is even a URL shown at all) depends how the app implements its URL display and might be based on several different possible webview callbacks, and the earlier results in this bug suggest many of those callbacks don't return the history URL anyway.
Revisiting this - behavior does change slightly between Jellybean and Oreo, though the behavior of getUrl and onPageStarted/Finished doesn't change. Here's yet another sheet with these behaviors.

https://docs.google.com/a/google.com/spreadsheets/d/1OAQKGfYYY_p53EYceUv0BbbiTvNKojZ_cj_XEz2kAac/edit?usp=sharing

Comment 51 by torne@chromium.org, Oct 12 2017

When you mention the different OS versions there, I assume you mean with whichever version of webview is preinstalled on that OS version?

If the URLs we return from these calls have changed then that's probably actual bugs we've introduced and should investigate to see why things have changed.. :(

Comment 52 by boliu@chromium.org, Oct 12 2017

Hmm, that history to empty data url transition doesn't match anything I can think of, looking at things from
git log -Gdata_url_as_string -- content/renderer/render_frame_impl.cc

maybe worth figuring out which CL introduced that change
That change - from historyUrl to empty data url in getOriginalUrl, doUpdateVisitedHistory, onPageCommitVisible, and WebBackForwardList - happened between chrome versions 49.0.2573.0 and 49.0.2574.0.

This is probably the commit in question: https://chromium.googlesource.com/chromium/src/+/15890e4b345bfef3854566899a0a462f7c5939a9

Comment 54 by boliu@chromium.org, Oct 14 2017

so behavior changed a looong time ago, and at least I've not heard anything from that. perhaps that's a good indication we can just deprecated history url, and replace it with base url instead for all the APIs?
Since getUrl (and only getUrl) still returns the historyUrl, it seems like there is a use case where the baseUrl is important (for the reasons described in the documentation) but we prefer to show another url (maybe to the user? and usually about:blank) through getUrl. Feels like a significant change to start returning baseUrl in getUrl, but maybe it wouldn't actually matter?

In the mean time, would it be correct to update the documentation to refer to getUrl (ie "the history url is used for calls to getUrl")? The current docs say the historyUrl is used for the "history entry," which, if this refers to the WebBackForwardList, is no longer true.
(ping - thoughts on above comment?)

I'm going to add a UMA metric to see how historyUrl is used. The bug for this is  crbug.com/779661 .

Comment 57 by boliu@chromium.org, Oct 30 2017

I think we should just deprecate history url, and make all APIs return the base url instead.

What happens to documentation depends on what we decide here, so hold off on that too I guess

Comment 58 by torne@chromium.org, Oct 31 2017

Yeah, getting some metrics seems like a good idea, because without some idea of how this is being used in the field I'm not sure what deprecating it would really mean :)

Bo, I assume what you actually mean is "we should just change the existing behaviour so that the parameter is ignored", which would presumably be roughly equivalent to clients always passing the same URL for base and history in terms of the effects? Just for apps that target a new API level, or what?

Comment 59 by boliu@chromium.org, Oct 31 2017

Yeah that. For all apps. It already happened inadvertently once before (#53), so it just looks like no one actually cares

Comment 60 by torne@chromium.org, Oct 31 2017

For all apps isn't really a deprecation, then, but yes, I guess we could try it and see what happens, or collect some stats first?

Comment 61 by boliu@chromium.org, Oct 31 2017

We could collect data just for having the data available for other things in the future. But if we want to use data to decide whether to break the API or not, then what data do you want to look for and how should the decision be made specifically? (Going with advice from that stats class, that we should state the experiment first before collecting data)

Comment 62 by torne@chromium.org, Oct 31 2017

If there's a significant amount of usage where (historyUrl != baseUrl && historyUrl != null) then that suggests to me that there might be some risk here. I'm still not entirely convinced that the return values of some of WebView's Java APIs are the only things affected by historyUrl, so the fact that we've changed those in the past doesn't quite guarantee that changing more of them is safe.

If almost all usage is either using the same URL for both, or just passing null/emptystring for historyUrl (which I don't think is likely to be an interesting case, since we treat that as about:blank if I understand correctly, and that's probably Not Important?) then just ignoring it seems pretty safe.
> "we should just change the existing behaviour so that the parameter is ignored"

Do we know if this will break CTS or integration tests? Uploading an experimental CL implementing this might be a good first step to check.
yeah that's expected
Another weird thing I saw: issue 779671 (the app calls `loadDataWithBaseUrl(null)`)
Er, that's issue 783942

Comment 69 by boliu@chromium.org, Mar 15 2018

I proposed in #43 above to deprecate historyUrl. How that TODO gets resolved will depend on the historyUrl decision

Comment 71 by boliu@chromium.org, Mar 15 2018

imo definitely good enough to update the documentation to say the parameter is deprecated

maybe wait until we have finch before actively changing behavior of things
Do we have a buganizer to track the docs change?
Docs change tracked internally as http://b/75033113
We tried to send a CL to android to change the documentation, but they prefer waiting until the next release and actually deprecating the parameter then. I updated b/75033113 to reflect that.
Owner: changwan@chromium.org
Passing to Changwan - after discussion ag/4074610, this bug should just be to deprecate the historyUrl parameter (i.e. don't change documentation until the parameter is actually deprecated)
Blockedon: 848305

Sign in to add a comment