Issue metadata
Sign in to add a comment
|
Save linked page for later fails on Google search result |
||||||||||||||||||||||||
Issue descriptionOpen a search result page in Chrome, long-tap the title of a result and choose the "Save linked page for later" context menu item. The download is executed, but the downloaded page has no title and its URL is the Google redirect URL (instead of the URL that it eventually redirects to). Probably, (all) redirects should be resolved before downloading a page.
,
Aug 30 2016
As was discussed before, in certain scenarios, the final (redirected) URL should be used as a url of the saved page. User-initiated download can be one of those scenarios. API-initiated may not be, since it'll refer to the page by the original (non-redirected) URL. Perhaps a new bit for per-namespace policy.
,
Aug 30 2016
Is this a problem for the Downloads feature originated requests or just the developer context menu experiment? We do have existing bug found in the developer context menu experiment: 618716
,
Aug 31 2016
,
Aug 31 2016
Dmitry points out this is different issue - seem to not be saving redirected to page now. His repro: "...It looks to me that we are snapshotting the page that is a redirect request. I played with results of Google Search and sometimes I get the final redirected page but often the redirect page itself (empty body, the reload actually redirects to the target). Here are my steps: 1. open www.google.com 2. search for term "search" 3. Long-press, "save later" on entry for Bing - observe actual Bing page to be loaded 4. Long-press, "save later" on other entries - observe the redirect page to be saved." I repro-ed myself and opened offline internals page and see the redirect URL rather than its target URL.
,
Sep 1 2016
Repro-ed with some logging, the last committed URL I am getting from the WebContents from the prerenderer is indeed the search result URL instead of a redirected URL. So I need to look more at the WebContents and then perhaps at prerenderer.
I tried another case from the offline-internals page and did get a redirect saved for same domain ("http://ebay.com" => "http://www.ebay.com") so I wonder if particular to search or cross domains.
,
Sep 1 2016
Wild guess: Prerender does not follow a redirect if it has "Follow-Only-When-Prerender-Shown" in response headers (case-insensitive). Another wild guess: they are might be using client side redirect (i.e. via javascript), which could be based on page visibility state. Not sure why they would do this (it is slow), could be an A/B experiment running in search, but worth confirming.
,
Sep 1 2016
Thanks for the ideas Egor! I think maybe the second idea is close to the mark. Added some logging and I don't see any WillRedirectRequest() calls. If I click on link in browser, I see a NavigationHandleImpl::NavigationStart on the final URL but when I request save later, then I see the NavigationStart on the google search URL, it completes loading (and we save the page) and then I do see NavigationStart on the target URL before we kill the prerender. I will try adding some delay after the OnLoad before we snapshot to see what happens.
,
Sep 1 2016
Yep, this looks like a right-moment-to-snapshot challenge. Waiting 7 seconds after DOMContentLoaded of original search URL gave me a good target page mhtml on WIFI but not on GIN-3g.
,
Sep 1 2016
Adding logging in PrerenderContents to see what it sees. One interesting point is that it looks like we could get additional OnPrerenderDomContentLoaded and OnPrerenderStopLoading callbacks to background loader level. This run has a 7 second delay from the original OnPrerenderDomContentLoaded so we do end up seeing the target page before we trigger the snapshot. 09-01 15:45:47.293 [prerendering_loader.cc(31)] XXXXX LoadPage https://www.google.com/url?sa=t&source=web&rct=j&url=https://duckduckgo.com/&ved=0ahUKEwiczKLGn-_OAhUU82MKHQZ2ANkQFgg2MAs&usg=AFQjCNErpcUya7_GKBR5LAdayJyuKMsTtA 09-01 15:45:47.334 [prerender_contents.cc(516)] XXXXX PrerenderContents::RenderFrameCreated 09-01 15:45:48.694 [prerender_contents.cc(544)] XXXXX PrerenderContents::DidStartProvisionalLoadForFrame has parent: 0 09-01 15:45:48.774 [prerender_contents.cc(570)] XXXXX PrerenderContents::DidNavigateMainFrame redirects: 1 09-01 15:45:48.774 [prerender_contents.cc(590)] XXXXX PrerenderContents::DidNavigateMainFrame AddAliasURL: https://www.google.com/url?sa=t&source=web&rct=j&url=https://duckduckgo.com/&ved=0ahUKEwiczKLGn-_OAhUU82MKHQZ2ANkQFgg2MAs&usg=AFQjCNErpcUya7_GKBR5LAdayJyuKMsTtA 09-01 15:45:48.869 [prerender_contents.cc(533)] XXXXX PrerenderContents::DocumentLoadedInFrame has parent: 0 09-01 15:45:48.869 [prerendering_loader.cc(106)] XXXXX OnPrerenderDomContentLoaded 09-01 15:45:48.875 [prerender_contents.cc(562)] XXXXX PrerenderContents::DidFinishLoad has parent: 0 09-01 15:45:48.877 [prerender_contents.cc(526)] XXXXX PrerenderContents::DidStopLoading notify 09-01 15:45:48.878 [prerendering_loader.cc(96)] XXXXX OnPrerenderStopLoading <<<<< This is the point where we currently trigger snapshot (and so wouldn't see the target content) 09-01 15:45:48.880 [prerender_contents.cc(544)] XXXXX PrerenderContents::DidStartProvisionalLoadForFrame has parent: 0 09-01 15:45:49.145 [prerender_contents.cc(570)] XXXXX PrerenderContents::DidNavigateMainFrame redirects: 2 09-01 15:45:49.145 [prerender_contents.cc(590)] XXXXX PrerenderContents::DidNavigateMainFrame AddAliasURL: https://www.google.com/url?sa=t&source=web&rct=j&url=https://duckduckgo.com/&ved=0ahUKEwiczKLGn-_OAhUU82MKHQZ2ANkQFgg2MAs&usg=AFQjCNErpcUya7_GKBR5LAdayJyuKMsTtA 09-01 15:45:49.145 [prerender_contents.cc(590)] XXXXX PrerenderContents::DidNavigateMainFrame AddAliasURL: https://duckduckgo.com/ 09-01 15:45:51.499 [prerender_contents.cc(533)] XXXXX PrerenderContents::DocumentLoadedInFrame has parent: 0 09-01 15:45:51.499 [prerendering_loader.cc(106)] XXXXX OnPrerenderDomContentLoaded <<<<< This is the first point where background loader could see there is more in progress (with its currently observed events). 09-01 15:45:51.673 [prerender_contents.cc(562)] XXXXX PrerenderContents::DidFinishLoad has parent: 0 09-01 15:45:51.677 [prerender_contents.cc(526)] XXXXX PrerenderContents::DidStopLoading notify 09-01 15:45:51.677 [prerendering_loader.cc(96)] XXXXX OnPrerenderStopLoading 09-01 15:45:55.870 [prerendering_offliner.cc(63)] XXXXX PrerenderingOffliner::OnLoadPageDone save page for original url: https://www.google.com/url?sa=t&source=web&rct=j&url=https://duckduckgo.com/&ved=0ahUKEwiczKLGn-_OAhUU82MKHQZ2ANkQFgg2MAs&usg=AFQjCNErpcUya7_GKBR5LAdayJyuKMsTtA 09-01 15:45:55.871 [prerendering_offliner.cc(64)] XXXXX PrerenderingOffliner::OnLoadPageDone save page with committed url: https://duckduckgo.com/
,
Sep 23 2016
What is the redirect mechanism here? Should we rather include the detection of that in our redesign of SnapshotController? THe delays may never address the issue properly.
,
Jan 26 2017
Fixed per some bookends support added by jian? Would be good to check behavior of new Download Link long press option on google search results to see if they work or have this issue. Assigning to romax to check this behavior.
,
Jan 27 2017
Updating on this one. Currently there's no "Save linked page for later" in long press menu, however there's a "Download link" which I think serves the same. In a search result page, downloading a link now will have the post-redirect url and also saves without issue. So I'm closing this bug as cannot reproduce, please feel free to reopen if it occurs again.
,
Jan 30 2017
However now for some search results the saved page has the correct metadata (title/url) but not the correct one in notification bar. Opening another bug to track that issue (crbug/685908). |
|||||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||||
Comment 1 by sheriffbot@chromium.org
, Aug 30 2016