Clear-Site-Data cache deletion can be slow and break loading
Reported by
pskub...@opera.com,
Sep 6 2017
|
|||||||||||||
Issue descriptionUserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3180.0 Safari/537.36 Example URL: https://global.rakuten.com/en/ Steps to reproduce the problem: 1. Go to https://global.rakuten.com/en/ What is the expected behavior? Site loads properly What went wrong? on 62.0.3180.0 (Developer Build) (64-bit) site is loading and stay on waiting loop on Google Chrome 61.0.3163.79 (Official Build) (64-bit) site is not fully loaded, so it's damaged Does it occur on multiple sites: N/A Is it a problem with a plugin? No Did this work before? Yes 60.0.3112.113 Does this work in other browsers? Yes Chrome version: 62.0.3180.0 Channel: dev OS Version: 10.0 Flash Version:
,
Sep 7 2017
Hrm...Something's going haywire - The page kills about:net-internals. Stopping the page after a couple seconds of loading, I'm seeing 46622 DISK_CACHE_ENTRY sources being opened and then closed, for only 223 URL_REQUESTS. Most of the requests are for random google URLS: 15 opens for https://clients5.google.com/pagead/drt/dn/dn.js, 225 opens for https://mail.google.com/mail/u/0/images/cleardot.gif, 20 for https://ssl.gstatic.com/gb/images/v1_e3444bc5.png, etc. I don't see all that stuff when loading a functional page, like cnn.com. [+maksim.sisov]: Has someone started doing anything new and exciting with the cache lately? [+bnc]: The ERR_NETWORK_CHANGED looks to be an H2 bug. We look to be failing requests with ERR_NETWORK_CHANGED in response to receiving an ABANDONED message from a server. Is this expected? Sample failure: t=4569 [st= 0] +REQUEST_ALIVE [dt=135] --> priority = "HIGHEST" --> url = "https://global.rakuten.com/en/common/css/ranking_widget_customized.css" t=4570 [st= 1] URL_REQUEST_DELEGATE [dt=0] t=4570 [st= 1] +URL_REQUEST_START_JOB [dt=134] --> load_flags = 33024 (MAYBE_USER_GESTURE | VERIFY_EV_CERT) --> method = "GET" --> url = "https://global.rakuten.com/en/common/css/ranking_widget_customized.css" t=4570 [st= 1] URL_REQUEST_DELEGATE [dt=0] t=4570 [st= 1] HTTP_CACHE_GET_BACKEND [dt=0] t=4570 [st= 1] HTTP_CACHE_OPEN_ENTRY [dt=2] --> net_error = -2 (ERR_FAILED) t=4572 [st= 3] HTTP_CACHE_CREATE_ENTRY [dt=1] t=4573 [st= 4] HTTP_CACHE_ADD_TO_ENTRY [dt=0] t=4573 [st= 4] +HTTP_STREAM_REQUEST [dt=87] t=4573 [st= 4] HTTP_STREAM_JOB_CONTROLLER_BOUND --> source_dependency = 572421 (HTTP_STREAM_JOB_CONTROLLER) t=4660 [st= 91] -HTTP_STREAM_REQUEST t=4660 [st= 91] +HTTP_TRANSACTION_SEND_REQUEST [dt=4] t=4660 [st= 91] HTTP_TRANSACTION_HTTP2_SEND_REQUEST_HEADERS --> :authority: global.rakuten.com :method: GET :path: /en/common/css/ranking_widget_customized.css :scheme: https accept: text/css,*/*;q=0.1 accept-encoding: gzip, deflate, br accept-language: en-US,en;q=0.9 referer: https://global.rakuten.com/en/ user-agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3207.0 Safari/537.36 t=4664 [st= 95] -HTTP_TRANSACTION_SEND_REQUEST t=4664 [st= 95] +HTTP_TRANSACTION_READ_HEADERS [dt=40] t=4704 [st=135] HTTP2_STREAM_ERROR --> description = "ABANDONED (stream_id=9): https://global.rakuten.com/en/common/css/ranking_widget_customized.css" --> net_error = "ERR_NETWORK_CHANGED" --> stream_id = 9 t=4704 [st=135] -HTTP_TRANSACTION_READ_HEADERS --> net_error = -21 (ERR_NETWORK_CHANGED) t=4704 [st=135] -URL_REQUEST_START_JOB --> net_error = -21 (ERR_NETWORK_CHANGED) t=4704 [st=135] URL_REQUEST_DELEGATE [dt=0] t=4704 [st=135] -REQUEST_ALIVE --> net_error = -21 (ERR_NETWORK_CHANGED)
,
Sep 7 2017
,
Sep 7 2017
Repros on - Linux M61 - Linux ToT (broken.png from yhirano) - Canary M63 Work as expected on: - Windows M60
,
Sep 7 2017
Able to reproduce the issue on Windows 7, Mac 10.12.6 & Ubuntu 14.04 using latest Canary#63.0.3207.0 and reported version-62.0.3180.0 as per the below fiddle. https://global.rakuten.com/en/ Manual Bisect info: ------------------ Good-61.0.3151.4- Revision-484781 Bad-61.0.3152.0- Revision-485105 Per revision bisect info: ----------------------- You are probably looking for a change made after 484889 (known good), but no later than 484890 (first known bad). CHANGELOG URL: The script might not always return single CL as suspect as some perf builds might get missing due to failure. https://chromium.googlesource.com/chromium/src/+log/f251d6591f25c3b769e47aee16f82ffabafb19b0..7d08794ea091f1a9d5684b0056fbf88fed090298 Review-Url: https://codereview.chromium.org/2972083002 msramek@,Could you please take a look and reassign to the right owner if it is not related to your change. Thanks..!
,
Sep 7 2017
URGENT - PTAL. Your bug is labelled as Stable ReleaseBlock, pls make sure to land the fix and get it merged into the release branch ASAP. Thank you! Note: We're planning to cut M61 Stable RC on Monday, 09/11 @ 4:00 PM PT.
,
Sep 7 2017
Ok, so I think the issue here is that clear data does the same thing as when we flush the cache/cookies - which includes tearing down any active connections, since those are essentially cached data (Cache DNS entry, cached authentication info, cache session cookies). This isn't expected for the clear data header, so causes problems. Openning the same entry multiple times when clearing data seems weird, though - maybe some cache enumeration issue? Anyhow, not the core thing going on here.
,
Sep 7 2017
This website is a real-life stress test for Clear-Site-Data. The landing page and multiple resources on it send the header. That doesn't really make sense for a website that has a sign-in section. Furthermore, the landing page also sends Set-Cookie headers, which, as the spec (https://www.w3.org/TR/clear-site-data/) explains, will be overriden by Clear-Site-Data. So this particular website would end up broken regardless of the bug. For a long-term solution, we might need to rethink how clearing "cache" works. The fact that things are cleared before the navigation/resource load finishes, and that deletion incorporates the kind of data mmenke@ mentioned, are intentional. Maybe we'll always need to restart the request when "cache" is deleted? But anyways - for a short-term M61 solution, we can disable the "cache" deletion. The page appears to load correctly with "cookies" and "storage" only. I'll send out a CL.
,
Sep 7 2017
Hi! чт, 7 сент. 2017 г. в 3:01, mmenke via monorail < monorail+v2.3079273537@chromium.org>: Sorry, I have no idea. I've been working on mus and ozone for the last half a year and didn't have time for networking.
,
Sep 7 2017
[+morlovich] Sorry, wrong maksim. :)
,
Sep 7 2017
I'll take a look at the DISK_CACHE_... spam on the assumption that was the intent of the CC.
,
Sep 7 2017
Yea, that was. The main issue has been figured out, creating the same DISK_CACHE entry 20 times was not (I assume there's a fair bit of overhead in repeatedly recreating cache entries, though could be wrong).
,
Sep 7 2017
I think most of it it's actually iterating through stuff using the iterator interface from ConditionalCacheDeletionHelper, used from StoragePartitionHttpCacheDataRemover... which really isn't a thing we should be doing on the main Http cache; it's performance would be utterly horrific (at least for simple cache, block might actually do a lot better, not sure).
,
Sep 7 2017
For the record, the CL is here: https://canary-chromium-review.googlesource.com/c/chromium/src/+/655718 I will land this on trunk in any case. Regarding M61 merge, I discussed this with govind@, who reminded me that 61 is already in stable and so there's obviously a very high bar for even a quick fix. We expect the breakage to be low given that: 1. This is a new API that we're shipping just now, and we're the first browser to ship it, so the usage is currently low. 2. The API has been hidden behind an experimental flag for some time, and we know that it has been experimented with by some websites, yet this is the first report we get of a breakage. It seems that https://global.rakuten.com/en/ has started using the header while it was still experimental without ever testing it. 3. This website would probably not break if it didn't overuse the header. That doesn't mean it's not a bug in Chrome, but it means that most developers experimenting with header will not have this problem, or will immediately notice when it occurs. We can reach out to this particular website to inform them of the problem. With that, we lean towards not merging to M61. We should probably revisit the decision if any other report comes tomorrow (Friday), so that we can proceed with merge by Monday.
,
Sep 7 2017
Yeah, so the thing pushing the big red button on cache iteration through the many layers is unsurprisingly the implementation of Clear-Site-Data; and, well, as far I can see Clear-Site-Data for cache isn't really implementable sanely with SimpleCache w/o major architecturalish changes. That's because deleting all cache entries for a domain requires opening ~all the cache files, so think something to the order of 10,000 - 20,000 files. That would be because the keys are not actually kept in memory, and keeping them in memory would cost too much. A bit of context: out/Default/cachetool /usr/local/google/home/morlovich/.cache/google-chrome-beta/Default/Cache/ simple list_keys | wc 11269 11268 2671645 ... That is, on this machine there are 11,269 cache entries, and their keys take up 2,671,645 bytes. (Perhaps there is an intermediate solution somewhere, of course, like only storing domains --- especially if they are atomized/normalized --- or storing some secondary index, etc). Blockfile can in principle do better (can't say what it actually does), but it would still need to open separate files for every entry > 16K (roughly 1900 of those here), and also pretty much scan through a large portion of the block files, too.
,
Sep 7 2017
Thank you msramek@. Please request a merge for change listed at #14 once change is well baked/verified in Canary and safe to merge to M61 if we see more reports coming in. Applying Android and Chrome as this is Blink related issue. + abdulsyed@ (Chrome on Desktop TPM)
,
Sep 7 2017
,
Sep 7 2017
Re #15: How does this compare with deleting cache entries based on timestamps, which is what chrome://settings/clearBrowserData does? Would it help to extend the disk_cache::Backend interface with URL-based deletion methods, similarly as we already have timestamp-based ones?
,
Sep 7 2017
Going by timestamps only is pretty efficient for Simple (for Block is depends a lot on the range in question, but it can benefit from it too), if the methods specific to that are used --- I don't know what that UI actually does, but at any rate, it's not so bad to spend a few seconds churning disk in response to such an action as compared to an HTTP header. A special interface would only help theoretically, as the current data organization makes an efficient implementation impossible.
,
Sep 7 2017
Thanks for the explanation. I guess we'll have to revisit the "cache" datatype entirely. I guess it still makes sense in the spec, because a browser can have an implementation where efficient deletion is possible. But maybe Chrome will choose not to use it, or only use it with the blockfile backend, or simply ship it with a disclaimer that this can take a long time based on the user's existing cache size.
,
Sep 7 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/63d6097e91187b1f6cb171b7e44d88ab0a91124c commit 63d6097e91187b1f6cb171b7e44d88ab0a91124c Author: Martin Sramek <msramek@chromium.org> Date: Thu Sep 07 22:40:53 2017 Temporarily remove support for the "cache" datatype in Clear-Site-Data "cookies" and "storage" are still supported. As it turns out, deleting cache while a site is being loaded breaks some sites. See crbug.com/762417 for details. Bug: 762417 Change-Id: If1ddead4ea5249c3eab71cd70b6fdf088c096cd7 Reviewed-on: https://chromium-review.googlesource.com/655718 Reviewed-by: Matt Menke <mmenke@chromium.org> Commit-Queue: Martin Šrámek <msramek@chromium.org> Cr-Commit-Position: refs/heads/master@{#500416} [modify] https://crrev.com/63d6097e91187b1f6cb171b7e44d88ab0a91124c/content/browser/browsing_data/clear_site_data_throttle.cc [modify] https://crrev.com/63d6097e91187b1f6cb171b7e44d88ab0a91124c/content/browser/browsing_data/clear_site_data_throttle_browsertest.cc [modify] https://crrev.com/63d6097e91187b1f6cb171b7e44d88ab0a91124c/content/browser/browsing_data/clear_site_data_throttle_unittest.cc
,
Sep 8 2017
https://global.rakuten.com/en/ has removed their usage of Clear-Site-Data, so this particular issue has been resolved on their side. Repurposing the bug to the general discussion about cache. Also, requesting a merge of the fix to M62.
,
Sep 8 2017
Approved for M62. Branch: 3202
,
Sep 8 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/bf5cd3d2ae42c7b1708a419f88e14bb82cfe82c3 commit bf5cd3d2ae42c7b1708a419f88e14bb82cfe82c3 Author: Martin Sramek <msramek@chromium.org> Date: Fri Sep 08 23:41:49 2017 Temporarily remove support for the "cache" datatype in Clear-Site-Data "cookies" and "storage" are still supported. As it turns out, deleting cache while a site is being loaded breaks some sites. See crbug.com/762417 for details. Bug: 762417 Change-Id: If1ddead4ea5249c3eab71cd70b6fdf088c096cd7 Reviewed-on: https://chromium-review.googlesource.com/655718 Reviewed-by: Matt Menke <mmenke@chromium.org> Commit-Queue: Martin Šrámek <msramek@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#500416}(cherry picked from commit 63d6097e91187b1f6cb171b7e44d88ab0a91124c) Reviewed-on: https://chromium-review.googlesource.com/657841 Reviewed-by: Martin Šrámek <msramek@chromium.org> Cr-Commit-Position: refs/branch-heads/3202@{#105} Cr-Branched-From: fa6a5d87adff761bc16afc5498c3f5944c1daa68-refs/heads/master@{#499098} [modify] https://crrev.com/bf5cd3d2ae42c7b1708a419f88e14bb82cfe82c3/content/browser/browsing_data/clear_site_data_throttle.cc [modify] https://crrev.com/bf5cd3d2ae42c7b1708a419f88e14bb82cfe82c3/content/browser/browsing_data/clear_site_data_throttle_browsertest.cc [modify] https://crrev.com/bf5cd3d2ae42c7b1708a419f88e14bb82cfe82c3/content/browser/browsing_data/clear_site_data_throttle_unittest.cc
,
Sep 11 2017
Verified this issue on chrome Windows-10, Ubuntu 14.04 and Mac OS 10.12.6 using chrome latest canary #63.0.3212.0 by following steps mentioned in the original comment, observed the page loads fine with no crashes or network error. Hence adding TE-Verified label for M-63. Thanks!
,
Sep 14 2017
Re comment #2: SpdySessionPool::OnIPAddressChanged() calls SpdySession::StartGoingAway(ERR_NETWORK_CHANGED) on all connections, which calls LogAbandonedActiveStream() or LogAbandonedStream(), which generate the description starting with "ABANDONED" then call SpdyStream::LogStreamError(), which logs HTTP2_STREAM_ERROR. So ERR_NETWORK_CHANGED can be correct here, though the error message might be confusing.
,
Dec 7 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/d0976380e9c7cbf22f4689b640d1e260b10daddc commit d0976380e9c7cbf22f4689b640d1e260b10daddc Author: Martin Sramek <msramek@chromium.org> Date: Thu Dec 07 14:10:32 2017 Reenable the "cache" datatype in Clear-Site-Data Testing under heavy load (on the order of thousands of resource loads writing into the cache and clearing it) did not produce crashes or other adverse effects. Therefore, we reenable the cache datatype for M65. With plenty of time to test on unstable channels of Chromium, and to perform additional heavy load tests, we should be able to re-ship with confidence in M65. Bug: 762417 Change-Id: I17bc1efb38fb9d3f5b7e91de2048420dbe75f6ac Reviewed-on: https://chromium-review.googlesource.com/809128 Reviewed-by: Mike West <mkwst@chromium.org> Commit-Queue: Martin Šrámek <msramek@chromium.org> Cr-Commit-Position: refs/heads/master@{#522414} [modify] https://crrev.com/d0976380e9c7cbf22f4689b640d1e260b10daddc/content/browser/browsing_data/clear_site_data_throttle.cc [modify] https://crrev.com/d0976380e9c7cbf22f4689b640d1e260b10daddc/content/browser/browsing_data/clear_site_data_throttle_browsertest.cc [modify] https://crrev.com/d0976380e9c7cbf22f4689b640d1e260b10daddc/content/browser/browsing_data/clear_site_data_throttle_unittest.cc
,
Dec 8 2017
On what platform/machine did you test this, and with what kind of storage device, and with what kind of size of existing cache?
,
Jan 3 2018
Sorry, I didn't manage to get to the bottom of this before holidays. I put together a demo at https://clear-site-data-demo.appspot.com and from there, tested it on various devices (my Linux workstation, some Nexus Android devices, even Android emulator) and in all cases cache deletion succeeded. I'm not sure if any of those was sufficiently low-end, and I didn't fill the cache beyond 150MB (it takes some time to populate it between deletions), so I'll do some more testing. My demo should theoretically be able to populate the cache quickly by loading a large number of distinct random resources, but in practice it doesn't - do you have any tips for doing that from your testing experience?
,
Jan 3 2018
And by the way, I filed issue 798760. While I wasn't able to reproduce broken loading with cache, I was able to do so with cookies. That might be a completely different problem (which would be sad, as we would have two broken datatypes :( ), but it might be also related. |
|||||||||||||
►
Sign in to add a comment |
|||||||||||||
Comment 1 by pbomm...@chromium.org
, Sep 6 2017Components: Internals>Network Blink>Loader
Labels: ReleaseBlock-Stable M-61 Needs-Bisect OS-Linux OS-Mac
Status: Available (was: Unconfirmed)