New issue
Advanced search Search tips

Issue 707179 link

Starred by 1 user

Issue metadata

Status: Untriaged
Owner: ----
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug

Blocking:
issue 690383



Sign in to add a comment

Extend WebContentsObserver to report manifest URLs

Project Member Reported by mastiz@chromium.org, Mar 31 2017

Issue description

This was discussed via e-mail but I'd rather have a well-scoped discussion here.

It is my impression that it would be desirable to extend WebContentsObserver with something analogous to DidUpdateFaviconURL(), but for web manifest URL (base::Optional<GURL>()).

This would be helpful to verify if a webmanifest is cached (current proposal relies on favicons for this, but regardless) without actually fetching and parsing the JSON. This is currently not possible with the existing WebContents::GetManifest().

dominickn@ assigning to you, since you mentioned earlier that you weren't very convinced about this.

 
For background, we talked about this a little in  https://crbug.com/689173 , where a PWA wasn't passing our checks because it used Javascript to inject a manifest after the page finished loading. At the time, I was fairly hesitant to add a new renderer -> browser IPC and plumbing for a relatively niche use case.

The key question to me is whether we are caching the full manifest data, or just storing names/short names? I don't quite see how exposing a base::Optional<GURL> getter on WebContentsObserver will help with caching if we are only storing titles.

Even then, we would still have to at least fetch (and possibly parse) the manifest itself to determine if the cached data was out of date, e.g. new icons have been added or a new title set.

Otherwise, I still don't see too much demand for manifest injection after the load event finishes, which is the point when we trigger our installability tests.

Please correct me if I'm missing anything. :)
"Even then, we would still have to at least fetch (and possibly parse) the manifest itself to determine if the cached data was out of date, e.g. new icons have been added or a new title set."

I disagree with this statement: the URL could be sufficient, together with a cache timestamp, to skip fetching altogether. We can assume that manifests don't change all that often, can't we? (we certainly do this for favicons)

This argument is stronger if we cache favicons only. In pkotwicz's last proposal, the manifest content is going to be ignored anyway if there's a cached favicon.

Does that make sense?
Hmm, I see your point here. I guess I'd like to see exactly how much extra data and/or time this might save before going ahead with it (i.e. we should prioritise some data collection on manifest sizes and how often users browse to pages with manifests).

The trade-off here is that we could implement all of this and find that it makes a negligible difference to just fetching the manifest all the time (the main difference with favicons being that pretty much every page has at least one favicon, which is definitely not going to be the case for a while with manifests. Plus, manifests are probably significantly smaller than icons on average).

Additionally, we actually did fetch the manifest and manifest icons (if they existed) on page load up until Chrome ~54, when we moved the manifest fetch behind a site engagement cutoff.
Should we then focus on measuring the data consumed by manifest-fetching, similarly to https://bugs.chromium.org/p/chromium/issues/detail?id=698347?
That definitely makes sense as a first step here. We can look into this in SYD, liaising with the WebAPKs folks since they've also done some manifest scraping and analysis (from the server side).
Meanwhile, to understand the bigger picture: how would complexity would it be involved in reporting the URL itself? It really seems like a low-hanging fruit.
Yes, it would be easy. We actually had a manifest existence method on WebContents a little while back (https://chromium.googlesource.com/chromium/src/+/b8c25a180344260005e9ecd261dd594b84d9c903/content/public/browser/web_contents.h). We removed it since there wasn't a compelling reason at the time to have two separate IPCs to check if a manifest existed, then try to fetch it.

Before we add something like it back in (reporting the URL instead of just existence), I'd like to have a bit more certainty that we'd use it. But if you're pretty sure it'll be useful, I can be convinced otherwise. :)
When we write the Web Manifest icon fetching code for favicons we need to ensure that we don't make URL loading on the desktop look like the video that I have attached. I modified the FaviconHandler for the sake of the video. URL loading does not look like this now

Notice that the tab's favicon briefly switches to the default icon when amazon.com is loading
out2 (1).avi
188 KB Download

Comment 9 by mastiz@chromium.org, Apr 10 2017

Dominick: wrt #7, I wrote https://codereview.chromium.org/2799273002/ which shows how we'd consume the manifest URL for fetching favicons, while still avoiding repeated downloads of the manifest. It's still WIP, but is essentially following the principles discussed in the design doc ( https://docs.google.com/document/d/1JlvRhNunF6OeP88kSCxf2U_0YvR5Xa-6n38aFeIUWQ0).
Thanks for the WIP CL! That helps greatly, and I think I'm happy with how that looks.

I think this would also benefit if we potentially moved the currently renderer-side manifest cache (ManifestManager) to the browser process (since that would then cut down on IPC overhead back and forth for getting the manifest object itself in favicons / InstallableManager).
If we move the renderer-side manifest cache to the browser side, what will we do with |ManifestManager::manifest_dirty_| ?
Moving the cache to the browser side would also involve moving as much of the ManifestManager state and logic to the browser. This is markedly easier and cleaner now with Mojo instead of Chrome IPC.

For example, in ManifestManager::DidChangeManifest, you could use Mojo to IPC directly to the browser process and inform it of the change in manifest. We'd have the manifest_dirty_ concept in the browser side cache rather than the renderer side.
Blocking: 690383
Marking as blocker for crbug.com/690383, since I believe we reached consensus on this being the right thing to do.

dominickn@: is this something your team could try to prioritize?

Perhaps, before submitting the patch, we could wait for preliminary feedback from UI review, which I plan to request ASAP.
Cc: sa...@chromium.org
Yep, let's wait for feedback from UI review on what they think, and based on that we can prioritise as necessary.
FYI, we're blocked on our PM being OOO (sick), so it might take a few days. I'll ping this bug with updates.
We now have LGTM from UI review for this intent-to-implement, see https://bugs.chromium.org/p/chromium/issues/detail?id=711187&desc=4

dominickn@: how can we do progress here?

I know there are hopes to always download manifests on page load, independently of this effort. This would make the scope of this bug obsolete, but since it's in an experimental stage, I assume this is not something we'd like to depend on?
mastiz: yes, we probably shouldn't block this on separate efforts to eagerly download manifests. How about we either meet or put a doc together on what APIs you need exposed and we can add them as sammc@ Mojo-ifies manifests (I think he has most of that done now, it's just awaiting a clean up and then review).
Thanks! I started a dedicated section "Extensions to WebContents API" in go/manifest-icons-design, please add comments there or here in the bug.

It's fairly simple: we need something like DidUpdateWebManifestURL() which delivers a URL to the browser process and ultimately to WebContentsObserver. It's important to also trigger it for the empty case, i.e. using an empty URL (or base::nullopt if base::Optional is used).
Cc: -sa...@chromium.org dominickn@chromium.org
Owner: sa...@chromium.org
Over to sammc@: as part of the Mojoification of manifests (or possibly immediately after), can you look into adding a DidUpdateWebManifestURL() method on WebContentsObserver for if/when the main frame's manifest <link> element is changed?
Also, a rough ETA would be appreciated. BP for M60 is May the 25th and that'd be an important milestone. Thx!
Friendly ping for sammc@, thanks!
sammc@ I was looking at your CL. I am concerned that it will be difficult to determine the order in which to expect WebContentsObserver::DidUpdateWebManifestURL() and WebContentsObserver::DidUpdateFaviconURL()

FaviconHandler needs to know when both:
The "favicon URL" is known (<link rel="icon" />)
The "manifest URL" is known (<link rel="manifest" />)

It is possible for a page to not have a manifest URL. We don't want to wait for a "manifest URL" when there is none.

From the point of view of FaviconHandler, it would be most convenient if the manifest URL was sent to the UI via WebContentsObserver::DidUpdateFaviconURL(). I realize that this is not a very generic API.
Project Member

Comment 24 by bugdroid1@chromium.org, May 11 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/145fb1704d655ca174061c5978ab476cd4e8a520

commit 145fb1704d655ca174061c5978ab476cd4e8a520
Author: Sam McNally <sammc@chromium.org>
Date: Thu May 11 03:39:29 2017

Report manifest URL changes to WebContentsObservers.

Bug: 707179
Change-Id: I57c859d27c7f940002cceb2923e6418a75acb92a
Reviewed-on: https://chromium-review.googlesource.com/497233
Commit-Queue: Sam McNally <sammc@chromium.org>
Reviewed-by: Nasko Oskov <nasko@chromium.org>
Cr-Commit-Position: refs/heads/master@{#470796}
[modify] https://crrev.com/145fb1704d655ca174061c5978ab476cd4e8a520/content/browser/manifest/manifest_browsertest.cc
[modify] https://crrev.com/145fb1704d655ca174061c5978ab476cd4e8a520/content/browser/manifest/manifest_manager_host.cc
[modify] https://crrev.com/145fb1704d655ca174061c5978ab476cd4e8a520/content/browser/manifest/manifest_manager_host.h
[modify] https://crrev.com/145fb1704d655ca174061c5978ab476cd4e8a520/content/browser/web_contents/web_contents_impl.cc
[modify] https://crrev.com/145fb1704d655ca174061c5978ab476cd4e8a520/content/browser/web_contents/web_contents_impl.h
[modify] https://crrev.com/145fb1704d655ca174061c5978ab476cd4e8a520/content/common/BUILD.gn
[add] https://crrev.com/145fb1704d655ca174061c5978ab476cd4e8a520/content/common/manifest_observer.mojom
[modify] https://crrev.com/145fb1704d655ca174061c5978ab476cd4e8a520/content/public/browser/web_contents_observer.h
[modify] https://crrev.com/145fb1704d655ca174061c5978ab476cd4e8a520/content/renderer/manifest/manifest_manager.cc
[modify] https://crrev.com/145fb1704d655ca174061c5978ab476cd4e8a520/content/renderer/manifest/manifest_manager.h
[modify] https://crrev.com/145fb1704d655ca174061c5978ab476cd4e8a520/content/test/data/manifest/dynamic-manifest.html

sammc@ Stupid question: Will there be an issue similar to  crbug.com/391585  where it is possible for WebContentsImpl::NotifyManifestUrlChanged() to be called with a manifest_url for a previous page.

I wasn't able to get this to occur locally
I see that in your CL you guarantee the ordering of DidUpdateWebManifestURL() and DidUpdateFaviconURL(). Please ignore comment #23

My (stupid) question in comment #25 still stands

Comment 27 by sa...@chromium.org, May 15 2017

Status: Fixed (was: Assigned)
I'm not sure; navigation is complicated.
Status: Assigned (was: Fixed)
Reopening the bug to get an answer on comment #25. For WebContentsImpl::OnUpdateFaviconURL() checks RenderViewHostImpl::is_active()

I really really don't want to deal with the bug where some users have incorrect favicons again. It was a pain to debug since favicons are both synced and cached in a database.
sammc@ Ping!
sammc@: friendly ping, can we wrap this up?
Owner: ----
Status: Available (was: Assigned)
Status: Untriaged (was: Available)
Available, but no owner or component? Please find a component, as no one will ever find this without one.

Sign in to add a comment