Saving page as complete html doesn't work when ad-blocking extensions are present
Reported by
netspide...@gmail.com,
Mar 11 2016
|
|||||||||||||||
Issue descriptionUserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36 Example URL: Steps to reproduce the problem: 1. Go to any site, for example https://www.google.com/chrome/browser/desktop/index.html 2. Hit STRL+S to save page. 3. Saving page stops after a few bytes and you can't resume it. What is the expected behavior? Web page should be saved as html-file. What went wrong? Saving pages doesn't work. I can't resume downloading, only cancel it. Did this work before? Yes On previous version, 48. Chrome version: 49.0.2623.87 Channel: stable OS Version: 10.0 Flash Version: Shockwave Flash 21.0 r0 This happened with the latest Canary version too.
,
Mar 11 2016
,
Mar 11 2016
Here it is.
,
Mar 11 2016
I can confirm that this issue is happening to me too, same version of Chrome. Only differences are that the downloads are permanently stuck on "Starting...", and I'm using Windows 8.1.
,
Mar 12 2016
Tested this on Win 7 and Win8.1 with same version Chrome/49.0.2623.87 - it is working fine- not able to repo. Mac /version Chrome/49.0.2623.75 and Chrome/49.0.2623.87- Not able to repro. Thanks
,
Mar 12 2016
mmohammad@ 64-bit or 32-bit versions? I'm using 64-bit version. 51.0.2674.0 canary (64-bit) also got this bug.
,
Mar 12 2016
I can confirm that OS X version of Chrome/49.0.2623.87 is not affected.
,
Mar 13 2016
I am struggling from this bug on Windows 2003 running 49.0.2623.87 m
,
Mar 14 2016
@asanka: Hey, would you mind checking the above issue as per comment#3 ? As, I am not able to reproduce the above issue on Windows 10 with chrome version '49.0.2623.87' I really appreciate your help. Thank you!
,
Mar 14 2016
lukasza@ ^^^ Let me know if you'd like me to have a look at it instead. I haven't kept up-to-date with your work there, hence tossing it at you.
,
Mar 14 2016
I've dusted off my Win10 VM and tried the repro steps with 49.0.2623.87 (Official Build) m (64-bit) but got no repro... :-( I assume that the problem reported here always consistently repros for some users and we need to figure out what variable (locale? anti-virus? extensions?) is different for me. We've already ruled out 32-vs-64-bit difference (i.e. per #c6 I've tried 64-bit Chrome to repro). I think we can probably also rule out the language/locale difference - I've switched Windows and Chrome language to Russian and I still could not repro. There is also one more difference I see from the original repro. When saving as "complete html", the download bar shows the progress by displaying the number of saved and total files. In all my repro attempts I see 31/31 files being saved, but the screenshots attached to the original repro are showing 3/32 files. One (blind/random) guess would be that maybe an extension is injecting extra frames or resources into the page being saved? One way to limit the number of variables that might affect the repro would be to see if the problem repros in incognito mode. netspiderua@, daph@, luispvital@ - could you please report if the problem repros for you in incognito mode?
,
Mar 14 2016
I've also tried looking at the net internals log in #c3 (thanks netspiderua@ :-). In this file, I see the following net_errors: - 2 cases of net_error=-348 => PAC_NOT_IN_DHCP - 17 cases of net_error=-804 => DNS_CACHE_MISS - 7 cases of net_error=-2 => FAILED :-/ - 6 cases of net_error=-20 => BLOCKED_BY_CLIENT - 24 cases of net_error=-105 => NAME_NOT_RESOLVED (os_error_string = "Этот хост неизвестен" = "This host us unknown" - 16 cases of net_error=-3 => ABORTED (with description "Closing all sessions") FWIW, I don't get these error kinds when I try to repro the problem with 49.0.2623.87 (Official Build) (64-bit) [I tried once on Linux and once on Windows]. One exception is -2/FAILED (but then even in this case I got only 2 instances of this error, rather than 7 instances as in the file attached to #c3). @asanka - do these error codes ring any bells? Also - would you have any hints or tools for reading the log from #c3? (i.e. how would I find out which URLs got stuck).
,
Mar 15 2016
@lukasza I can save a page in incognito mode
,
Mar 15 2016
I have no extension that is disabled in incognito though
,
Mar 15 2016
I have an update on the situation. Any kind of ad blocking extension seems to be the cause of this issue. Both Adblock and Adblock Plus allow the download to start, but they get stuck on 1 KB. uBlock Origin, however, keeps the download stuck on "Starting...". Having none active allows me to download pages successfully. Perhaps an update has caused a conflict between the HTML download and ad blocking extensions.
,
Mar 16 2016
luispvital@, thanks a lot for narrowing down the repro - it helps a lot. I confirm that I can repro the issue with ABP and Chrome 49.0.2623.87 on Win10, Mac (OS X El Capitan), Linux. Since this issue seems to affect Save-Page-As for all users of ABP (and possibly other similar extensions), I am bumping up the priority of this issue. One immediate observation from the repro is that ABP shows 0 blocked ads after the page has fully loaded, but starts to show 2 blocked ads while Save-Page-As-Complete-Html is in progress / hangs. Same behavior (ABP showing 2 blocked ads) can be observed on Chrome 48, but Save-Page-As continues fine (i.e. doesn't hang) in this version. Let me try to dig a bit further...
,
Mar 16 2016
For one of savable resources SaveFileManager::SaveFinished(..., is_sucess=false) is called when SaveFileManager does not yet have a SaveFile and in this case ignores this callback. This means that SavePackage doesn't get notification about the status of its SaveItem and waits for it forever...
,
Mar 16 2016
Issue 595345 has been merged into this issue.
,
Mar 16 2016
FWIW, the root cause seems to be https://codereview.chromium.org/1484093002 (initially in 49.0.2587.0): commit 870cf11e7b1548dabca2c0deb62e346568ea4e8b Author: lukasza <lukasza@chromium.org> Date: Wed Dec 9 11:25:26 2015 -0800 Allowing multiple SaveItems to have same URLs.
,
Mar 16 2016
I've started putting together a fix @ https://crrev.com/1812693002 At this point I think I have the right fix (OTOH, I still need to work on CL description to convince reviewers and myself that this is indeed the right fix). I don't yet have a browser test to prevent future regressions in this scenario (what I have in the CL always passes even without the fix) - I'll try to work on that tomorrow. Worst case scenario - I'll push out just the product fix for review at the end of Thursday.
,
Mar 22 2016
The following revision should have refered to this bug: https://chromium.googlesource.com/chromium/src.git/+/2813991c65e09a65de095074c867e88af557088e commit 2813991c65e09a65de095074c867e88af557088e Author: lukasza <lukasza@chromium.org> Date: Tue Mar 22 15:54:53 2016 Fix how Save-Page-As responds to web requests blocked by extensions. AdBlocker extensions exercise (via chrome.webRequest API) a code path where ResourceHandler (and in particular SaveFileResourceHandler) reports a failure via OnResponseCompleted without first calling OnResponseStarted. This code path is also executed when the request fails before HTTP headers are received and parsed (i.e. if the host name could not be resolved by DNS). In this code path, SaveFileManager::SaveFinished is called before SaveFileManager::StartSave got called. This means that when SaveFinished calls LookupSaveFile, it won't find anything. This behavior has regressed in https://crrev.com/1484093002, which removed the "else" clause in SaveFinished method, incorrectly thinking that save_item_id-based lookup will always succeed because of the other changes made in the CL. This CL fixes the problem by always calling OnSaveFinished (even if LookupSaveFile failed). To make this work, we need to make sure that SavePackage lookup in SaveFinished succeeds, even if OnStartSave was not called (this is where |package_| map was populated before the fix). To accomplish this, the fix moves populating the |package_| map (SaveItemId to SavePackage* map) earlier - from OnStartSave into SaveURL method. BUG= 595345 Review URL: https://codereview.chromium.org/1812693002 Cr-Commit-Position: refs/heads/master@{#382577} [modify] https://crrev.com/2813991c65e09a65de095074c867e88af557088e/content/browser/download/save_file_manager.cc [modify] https://crrev.com/2813991c65e09a65de095074c867e88af557088e/content/browser/download/save_file_manager.h [modify] https://crrev.com/2813991c65e09a65de095074c867e88af557088e/content/browser/download/save_package.cc
,
Mar 22 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/d19c1443dccf9bc0b2cd7b6e9afc2b080e95cd02 commit d19c1443dccf9bc0b2cd7b6e9afc2b080e95cd02 Author: lukasza <lukasza@chromium.org> Date: Tue Mar 22 22:56:59 2016 Regression test for handling early failure during network download. The test would cause a hang/timeout before the fix in crrev.com/1812693002. BUG= 594219 , 595345 Review URL: https://codereview.chromium.org/1820363002 Cr-Commit-Position: refs/heads/master@{#382718} [modify] https://crrev.com/d19c1443dccf9bc0b2cd7b6e9afc2b080e95cd02/chrome/browser/download/save_page_browsertest.cc [modify] https://crrev.com/d19c1443dccf9bc0b2cd7b6e9afc2b080e95cd02/chrome/test/data/save_page/broken-image.htm
,
Mar 23 2016
CL from https://crbug.com/594219#c21 above has ended up in 51.0.2688.0 Using this version of Chrome Canary, I can no longer repro the problem (even though ABP shows 2 blocked downloads). Let's try merging the fix to Beta and Stable...
,
Mar 23 2016
Your change meets the bar and is auto-approved for M50 (branch: 2661)
,
Mar 23 2016
The following revision should have refered to this bug: https://chromium.googlesource.com/chromium/src.git/+/c9bda843237953f3ebba5e6a3234bcc96421df48 commit c9bda843237953f3ebba5e6a3234bcc96421df48 Author: Lukasz Anforowicz <lukasza@chromium.org> Date: Wed Mar 23 22:15:53 2016 Fix how Save-Page-As responds to web requests blocked by extensions. AdBlocker extensions exercise (via chrome.webRequest API) a code path where ResourceHandler (and in particular SaveFileResourceHandler) reports a failure via OnResponseCompleted without first calling OnResponseStarted. This code path is also executed when the request fails before HTTP headers are received and parsed (i.e. if the host name could not be resolved by DNS). In this code path, SaveFileManager::SaveFinished is called before SaveFileManager::StartSave got called. This means that when SaveFinished calls LookupSaveFile, it won't find anything. This behavior has regressed in https://crrev.com/1484093002, which removed the "else" clause in SaveFinished method, incorrectly thinking that save_item_id-based lookup will always succeed because of the other changes made in the CL. This CL fixes the problem by always calling OnSaveFinished (even if LookupSaveFile failed). To make this work, we need to make sure that SavePackage lookup in SaveFinished succeeds, even if OnStartSave was not called (this is where |package_| map was populated before the fix). To accomplish this, the fix moves populating the |package_| map (SaveItemId to SavePackage* map) earlier - from OnStartSave into SaveURL method. BUG= 595345 Review URL: https://codereview.chromium.org/1812693002 Cr-Commit-Position: refs/heads/master@{#382577} (cherry picked from commit 2813991c65e09a65de095074c867e88af557088e) Review URL: https://codereview.chromium.org/1829893003 . Cr-Commit-Position: refs/branch-heads/2661@{#367} Cr-Branched-From: ef6f6ae5e4c96622286b563658d5cd62a6cf1197-refs/heads/master@{#378081} [modify] https://crrev.com/c9bda843237953f3ebba5e6a3234bcc96421df48/content/browser/download/save_file_manager.cc [modify] https://crrev.com/c9bda843237953f3ebba5e6a3234bcc96421df48/content/browser/download/save_file_manager.h [modify] https://crrev.com/c9bda843237953f3ebba5e6a3234bcc96421df48/content/browser/download/save_package.cc
,
Mar 24 2016
The fix is merged into Beta: commit 2813991c65e09a65de095074c867e88af557088e was: initially in 51.0.2688.0 merged to 50.0.2661.51 (as c9bda843237953f3ebba5e6a3234bcc96421df48) I am not sure if we need any "bake" time in Beta channel, but I assume that we eventually want to merge the fix to the Stable channel (we probably already missed one Stable refresh, but hopefully the fix will end up in the next one).
,
Mar 24 2016
[Automated comment] Request affecting a post-stable build (M49), manual review required.
,
Mar 24 2016
Let me reiterate information about this bug, to hopefully help with triaging the M49 merge request: - Impact when hit: Save-Page-As-Complete-HTML (default UI option on Windows/Mac/Linux) hangs / doesn't make any progress (the user can't complete the save, but can [sort-of] "recover" by canceling the save altogether). - How easy to hit: Impacts saving of all pages if an ad-block extension is present and blocks a subresource on the given page (bug is triggered when network request triggered by Save-Page-As fails before parsing headers - this is most visible with ad-blockers but not really limited to this scenario) - Risk of fix: relatively small (the fix is very small, merges easily and without conflicts, got to CRs + I think I thoroughly reviewed the correctness in https://codereview.chromium.org/1812693002/#msg9) - Bake time of the fix: rather short at this time: initially in 51.0.2688.0, merged to 50.0.2661.51
,
Mar 24 2016
@lukasza you didn't initialize local variable int64_t bytes_so_far. Style guide require you to initialize local variables to eliminate possible nondeterministic behavior after refactoring or in case of a bug.
,
Mar 24 2016
daph@, thanks for pointing this out - I appreciate it. I'll follow-up in https://crrev.com/1833723002. Let me note that AFAICT this should have no impact on the fix behavior and/or the M49 merge request.
,
Mar 24 2016
Per commement #25, this is already merged to M50 branch 2661. Hence, removing "Merge-Approved-50" label.
,
Mar 24 2016
Just so I understand, based on c#29, is the CL expected to change? If so, please remove the merge request and re-request once that change is in and baked.
,
Mar 24 2016
sshruthi@, the fix is not expected the change - the change suggested in c#29 is about following c++ style guide, not about the correctness of the fix.
,
Mar 24 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/34f48dde1eb15cdf568cba166aaf7db7f71a679e commit 34f48dde1eb15cdf568cba166aaf7db7f71a679e Author: lukasza <lukasza@chromium.org> Date: Thu Mar 24 22:24:24 2016 Initialize |bytes_so_far| local variable in SaveFileManager::SaveFinished. As pointed out in https://crbug.com/594219#c29 the local variable |bytes_so_far| is not initialized at the point of declaration. This is okay in the current code (this variable *does* get intialized in all the possible code flows), but the lack of initialization violates the style guide [1] and is an unnecessary refactoring risk. [1] https://google.github.io/styleguide/cppguide.html#Local_Variables says: Place a function's variables in the narrowest scope possible, and initialize variables in the declaration. BUG= 594219 Review URL: https://codereview.chromium.org/1833723002 Cr-Commit-Position: refs/heads/master@{#383163} [modify] https://crrev.com/34f48dde1eb15cdf568cba166aaf7db7f71a679e/content/browser/download/save_file_manager.cc
,
Mar 28 2016
Marking as fixed (in ToT and M50), while waiting for M49 merge request to be considered ... (the only CL to be merged is 2813991 from #c21).
,
Mar 28 2016
sshruthi@, let me know if you need any other data to review the M49 merge request
,
Mar 29 2016
Merge approved for M49 (branch 2623)
,
Mar 29 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/ad3e3ad15e43f110839c80fe078fd3b9c21eaea7 commit ad3e3ad15e43f110839c80fe078fd3b9c21eaea7 Author: Lukasz Anforowicz <lukasza@chromium.org> Date: Tue Mar 29 20:09:05 2016 Fix how Save-Page-As responds to web requests blocked by extensions. AdBlocker extensions exercise (via chrome.webRequest API) a code path where ResourceHandler (and in particular SaveFileResourceHandler) reports a failure via OnResponseCompleted without first calling OnResponseStarted. This code path is also executed when the request fails before HTTP headers are received and parsed (i.e. if the host name could not be resolved by DNS). In this code path, SaveFileManager::SaveFinished is called before SaveFileManager::StartSave got called. This means that when SaveFinished calls LookupSaveFile, it won't find anything. This behavior has regressed in https://crrev.com/1484093002, which removed the "else" clause in SaveFinished method, incorrectly thinking that save_item_id-based lookup will always succeed because of the other changes made in the CL. This CL fixes the problem by always calling OnSaveFinished (even if LookupSaveFile failed). To make this work, we need to make sure that SavePackage lookup in SaveFinished succeeds, even if OnStartSave was not called (this is where |package_| map was populated before the fix). To accomplish this, the fix moves populating the |package_| map (SaveItemId to SavePackage* map) earlier - from OnStartSave into SaveURL method. BUG= 594219 Review URL: https://codereview.chromium.org/1812693002 Cr-Commit-Position: refs/heads/master@{#382577} (cherry picked from commit 2813991c65e09a65de095074c867e88af557088e) Review URL: https://codereview.chromium.org/1842873003 . Cr-Commit-Position: refs/branch-heads/2623@{#661} Cr-Branched-From: 92d77538a86529ca35f9220bd3cd512cbea1f086-refs/heads/master@{#369907} [modify] https://crrev.com/ad3e3ad15e43f110839c80fe078fd3b9c21eaea7/content/browser/download/save_file_manager.cc [modify] https://crrev.com/ad3e3ad15e43f110839c80fe078fd3b9c21eaea7/content/browser/download/save_file_manager.h [modify] https://crrev.com/ad3e3ad15e43f110839c80fe078fd3b9c21eaea7/content/browser/download/save_package.cc
,
Mar 30 2016
Verified the fix on Windows 7, MAC (10.11.3) & Ubuntu Trusty (14.04) for Google Chrome Beta Version - 50.0.2661.57 Screen-recording is attached. Added TE-Verified label.
,
Apr 7 2016
Verified the fix on Windows 7, MAC (10.11.3) & Ubuntu Trusty (14.04) for Google Chrome Stable Version - 49.0.2623.112 Screen-recording is attached. Added TE-Verified label. |
|||||||||||||||
►
Sign in to add a comment |
|||||||||||||||
Comment 1 by csharrison@chromium.org
, Mar 11 2016