New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 762417 link

Starred by 2 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Android , Windows , Chrome , Mac
Pri: 1
Type: Bug-Regression



Sign in to add a comment

Clear-Site-Data cache deletion can be slow and break loading

Reported by pskub...@opera.com, Sep 6 2017

Issue description

UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3180.0 Safari/537.36

Example URL:
https://global.rakuten.com/en/

Steps to reproduce the problem:
1. Go to https://global.rakuten.com/en/ 

What is the expected behavior?
Site loads properly

What went wrong?
on 62.0.3180.0 (Developer Build) (64-bit) site is loading and stay on waiting loop
on Google Chrome 61.0.3163.79 (Official Build) (64-bit) site is not fully loaded, so it's damaged 

Does it occur on multiple sites: N/A

Is it a problem with a plugin? No 

Did this work before? Yes 60.0.3112.113

Does this work in other browsers? Yes

Chrome version: 62.0.3180.0  Channel: dev
OS Version: 10.0
Flash Version:
 
Cc: kinuko@chromium.org kouhei@chromium.org mmenke@chromium.org
Components: Internals>Network Blink>Loader
Labels: ReleaseBlock-Stable M-61 Needs-Bisect OS-Linux OS-Mac
Status: Available (was: Unconfirmed)
Abel to reproduce the issue on latest Chrome stable i.e., 61.0.3163 on Windows 7,10, Mac and Linux.

When tried to bisect I saw that few builds the page doesn't even load and see lot of network error in JS console like "Failed to load resource: net::ERR_NETWORK_CHANGED" and the builds where it loads I see the page is completely broken please seen attached screen shot for reference. 


Note : This is a regression since the page loads correctly oin M60 and doesn't on Chrome dev(62.0.3202.9) and Canary(63.0.3207.0) as well.






Cc: b...@chromium.org maksim.sisov@chromium.org
Hrm...Something's going haywire - The page kills about:net-internals.  Stopping the page after a couple seconds of loading, I'm seeing 46622 DISK_CACHE_ENTRY sources being opened and then closed, for only 223 URL_REQUESTS.  Most of the requests are for random google URLS:  15 opens for 	https://clients5.google.com/pagead/drt/dn/dn.js, 225 opens for https://mail.google.com/mail/u/0/images/cleardot.gif, 20 for https://ssl.gstatic.com/gb/images/v1_e3444bc5.png, etc.

I don't see all that stuff when loading a functional page, like cnn.com.

[+maksim.sisov]:  Has someone started doing anything new and exciting with the cache lately?

[+bnc]:  The ERR_NETWORK_CHANGED looks to be an H2 bug.  We look to be failing requests with ERR_NETWORK_CHANGED in response to receiving an ABANDONED message from a server.  Is this expected?  Sample failure:

t=4569 [st=  0] +REQUEST_ALIVE  [dt=135]
                 --> priority = "HIGHEST"
                 --> url = "https://global.rakuten.com/en/common/css/ranking_widget_customized.css"
t=4570 [st=  1]    URL_REQUEST_DELEGATE  [dt=0]
t=4570 [st=  1]   +URL_REQUEST_START_JOB  [dt=134]
                   --> load_flags = 33024 (MAYBE_USER_GESTURE | VERIFY_EV_CERT)
                   --> method = "GET"
                   --> url = "https://global.rakuten.com/en/common/css/ranking_widget_customized.css"
t=4570 [st=  1]      URL_REQUEST_DELEGATE  [dt=0]
t=4570 [st=  1]      HTTP_CACHE_GET_BACKEND  [dt=0]
t=4570 [st=  1]      HTTP_CACHE_OPEN_ENTRY  [dt=2]
                     --> net_error = -2 (ERR_FAILED)
t=4572 [st=  3]      HTTP_CACHE_CREATE_ENTRY  [dt=1]
t=4573 [st=  4]      HTTP_CACHE_ADD_TO_ENTRY  [dt=0]
t=4573 [st=  4]     +HTTP_STREAM_REQUEST  [dt=87]
t=4573 [st=  4]        HTTP_STREAM_JOB_CONTROLLER_BOUND
                       --> source_dependency = 572421 (HTTP_STREAM_JOB_CONTROLLER)
t=4660 [st= 91]     -HTTP_STREAM_REQUEST
t=4660 [st= 91]     +HTTP_TRANSACTION_SEND_REQUEST  [dt=4]
t=4660 [st= 91]        HTTP_TRANSACTION_HTTP2_SEND_REQUEST_HEADERS
                       --> :authority: global.rakuten.com
                           :method: GET
                           :path: /en/common/css/ranking_widget_customized.css
                           :scheme: https
                           accept: text/css,*/*;q=0.1
                           accept-encoding: gzip, deflate, br
                           accept-language: en-US,en;q=0.9
                           referer: https://global.rakuten.com/en/
                           user-agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3207.0 Safari/537.36
t=4664 [st= 95]     -HTTP_TRANSACTION_SEND_REQUEST
t=4664 [st= 95]     +HTTP_TRANSACTION_READ_HEADERS  [dt=40]
t=4704 [st=135]        HTTP2_STREAM_ERROR
                       --> description = "ABANDONED (stream_id=9): https://global.rakuten.com/en/common/css/ranking_widget_customized.css"
                       --> net_error = "ERR_NETWORK_CHANGED"
                       --> stream_id = 9
t=4704 [st=135]     -HTTP_TRANSACTION_READ_HEADERS
                     --> net_error = -21 (ERR_NETWORK_CHANGED)
t=4704 [st=135]   -URL_REQUEST_START_JOB
                   --> net_error = -21 (ERR_NETWORK_CHANGED)
t=4704 [st=135]    URL_REQUEST_DELEGATE  [dt=0]
t=4704 [st=135] -REQUEST_ALIVE
                 --> net_error = -21 (ERR_NETWORK_CHANGED)
Labels: -Pri-2 Pri-1
Repros on
- Linux M61
- Linux ToT (broken.png from yhirano)
- Canary M63

Work as expected on:
- Windows M60
broken.png
302 KB View Download
Cc: jmukthavaram@chromium.org
Labels: -Type-Compat -Needs-Bisect hasbisect-per-revision Needs-Triage-M62 Type-Bug-Regression
Owner: msramek@chromium.org
Status: Assigned (was: Available)
Able to reproduce the issue on Windows 7, Mac 10.12.6 & Ubuntu 14.04 using latest Canary#63.0.3207.0 and reported version-62.0.3180.0 as per the below fiddle.

https://global.rakuten.com/en/

Manual Bisect info:
------------------
Good-61.0.3151.4- Revision-484781
Bad-61.0.3152.0- Revision-485105

Per revision bisect info:
-----------------------
You are probably looking for a change made after 484889 (known good), but no later than 484890 (first known bad).
CHANGELOG URL:
The script might not always return single CL as suspect as some perf builds might get missing due to failure.
https://chromium.googlesource.com/chromium/src/+log/f251d6591f25c3b769e47aee16f82ffabafb19b0..7d08794ea091f1a9d5684b0056fbf88fed090298

Review-Url: https://codereview.chromium.org/2972083002

msramek@,Could you please take a look and reassign to the right owner if it is not related to your change.

Thanks..!



URGENT - PTAL.
Your bug is labelled as Stable ReleaseBlock, pls make sure to land the fix and get it merged into the release branch ASAP. Thank you!


Note: We're planning to cut M61 Stable RC on Monday, 09/11 @ 4:00 PM PT.
Ok, so I think the issue here is that clear data does the same thing as when we flush the cache/cookies - which includes tearing down any active connections, since those are essentially cached data (Cache DNS entry, cached authentication info, cache session cookies).  This isn't expected for the clear data header, so causes problems.

Openning the same entry multiple times when clearing data seems weird, though - maybe some cache enumeration issue?  Anyhow, not the core thing going on here.
Cc: mkwst@chromium.org
This website is a real-life stress test for Clear-Site-Data. The landing page and multiple resources on it send the header. That doesn't really make sense for a website that has a sign-in section. Furthermore, the landing page also sends Set-Cookie headers, which, as the spec (https://www.w3.org/TR/clear-site-data/) explains, will be overriden by Clear-Site-Data.

So this particular website would end up broken regardless of the bug.

For a long-term solution, we might need to rethink how clearing "cache" works. The fact that things are cleared before the navigation/resource load finishes, and that deletion incorporates the kind of data mmenke@ mentioned, are intentional. Maybe we'll always need to restart the request when "cache" is deleted?

But anyways - for a short-term M61 solution, we can disable the "cache" deletion. The page appears to load correctly with "cookies" and "storage" only. I'll send out a CL.
Hi!

чт, 7 сент. 2017 г. в 3:01, mmenke via monorail <
monorail+v2.3079273537@chromium.org>:

Sorry, I have no idea. I've been working on mus and ozone for the last half
a year and didn't have time for networking.
Cc: -maksim.sisov@chromium.org morlovich@chromium.org
[+morlovich] Sorry, wrong maksim.  :)
I'll take a look at the DISK_CACHE_... spam on the assumption that was the intent of the CC. 
Yea, that was.  The main issue has been figured out, creating the same DISK_CACHE entry 20 times was not (I assume there's a fair bit of overhead in repeatedly recreating cache entries, though could be wrong).
I think most of it it's actually iterating through stuff using the iterator interface from ConditionalCacheDeletionHelper, used from StoragePartitionHttpCacheDataRemover... which really isn't a thing we should be doing on the main Http cache; it's performance would be utterly horrific (at least for simple cache, block might actually do a lot better, not sure).




Cc: gov...@chromium.org
Labels: -ReleaseBlock-Stable
For the record, the CL is here: https://canary-chromium-review.googlesource.com/c/chromium/src/+/655718

I will land this on trunk in any case.

Regarding M61 merge, I discussed this with govind@, who reminded me that 61 is already in stable and so there's obviously a very high bar for even a quick fix.

We expect the breakage to be low given that:
1. This is a new API that we're shipping just now, and we're the first browser to ship it, so the usage is currently low.
2. The API has been hidden behind an experimental flag for some time, and we know that it has been experimented with by some websites, yet this is the first report we get of a breakage. It seems that https://global.rakuten.com/en/ has started using the header while it was still experimental without ever testing it.
3. This website would probably not break if it didn't overuse the header. That doesn't mean it's not a bug in Chrome, but it means that most developers experimenting with header will not have this problem, or will immediately notice when it occurs. We can reach out to this particular website to inform them of the problem.

With that, we lean towards not merging to M61. We should probably revisit the decision if any other report comes tomorrow (Friday), so that we can proceed with merge by Monday.
Yeah, so the thing pushing the big red button on cache iteration through the many layers is unsurprisingly the implementation of Clear-Site-Data; and, well, as far I can see Clear-Site-Data for cache isn't really implementable sanely with SimpleCache w/o major architecturalish changes. That's because deleting all cache entries for a domain requires opening ~all the cache files, so think something to the order of 10,000 - 20,000 files. That would be because the keys are not actually kept in memory, and keeping them in memory would cost too much. A bit of context:

 out/Default/cachetool /usr/local/google/home/morlovich/.cache/google-chrome-beta/Default/Cache/ simple list_keys | wc
11269   11268 2671645
... That is, on this machine there are 11,269 cache entries, and their keys take up 2,671,645 bytes. (Perhaps there is an intermediate solution somewhere, of course, like only storing domains --- especially if they are atomized/normalized --- or storing some secondary index, etc).

Blockfile can in principle do better (can't say what it actually does), but it would still need to open separate files for every entry > 16K (roughly 1900 of those here), and also pretty much scan through a large portion of the block files, too.

Cc: abdulsyed@chromium.org
Thank you msramek@. Please request a merge for change listed at #14 once change is well baked/verified in Canary and safe to merge to M61 if we see more reports coming in.

Applying Android and Chrome as this is Blink related issue.

+ abdulsyed@ (Chrome on Desktop TPM)
Labels: OS-Android OS-Chrome
Re #15:

How does this compare with deleting cache entries based on timestamps, which is what chrome://settings/clearBrowserData does?

Would it help to extend the disk_cache::Backend interface with URL-based deletion methods, similarly as we already have timestamp-based ones?
Going by timestamps only is pretty efficient for Simple (for Block is depends a lot on the range in question, but it can benefit from it too), if the methods specific to that are used --- I don't know what that UI actually does, but at any rate, it's not so bad to spend a few seconds churning disk in response to such an action as compared to an HTTP header. 

A special interface would only help theoretically, as the current data organization makes an efficient implementation impossible. 


Thanks for the explanation. I guess we'll have to revisit the "cache" datatype entirely.

I guess it still makes sense in the spec, because a browser can have an implementation where efficient deletion is possible. But maybe Chrome will choose not to use it, or only use it with the blockfile backend, or simply ship it with a disclaimer that this can take a long time based on the user's existing cache size.
Project Member

Comment 21 by bugdroid1@chromium.org, Sep 7 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/63d6097e91187b1f6cb171b7e44d88ab0a91124c

commit 63d6097e91187b1f6cb171b7e44d88ab0a91124c
Author: Martin Sramek <msramek@chromium.org>
Date: Thu Sep 07 22:40:53 2017

Temporarily remove support for the "cache" datatype in Clear-Site-Data

"cookies" and "storage" are still supported. As it turns out, deleting
cache while a site is being loaded breaks some sites.

See crbug.com/762417 for details.

Bug: 762417
Change-Id: If1ddead4ea5249c3eab71cd70b6fdf088c096cd7
Reviewed-on: https://chromium-review.googlesource.com/655718
Reviewed-by: Matt Menke <mmenke@chromium.org>
Commit-Queue: Martin Šrámek <msramek@chromium.org>
Cr-Commit-Position: refs/heads/master@{#500416}
[modify] https://crrev.com/63d6097e91187b1f6cb171b7e44d88ab0a91124c/content/browser/browsing_data/clear_site_data_throttle.cc
[modify] https://crrev.com/63d6097e91187b1f6cb171b7e44d88ab0a91124c/content/browser/browsing_data/clear_site_data_throttle_browsertest.cc
[modify] https://crrev.com/63d6097e91187b1f6cb171b7e44d88ab0a91124c/content/browser/browsing_data/clear_site_data_throttle_unittest.cc

Labels: Merge-Request-62
Summary: Clear-Site-Data cache deletion can be slow and break loading (was: Waiting loop on global.rakuten.com/en)
https://global.rakuten.com/en/ has removed their usage of Clear-Site-Data, so this particular issue has been resolved on their side.

Repurposing the bug to the general discussion about cache.

Also, requesting a merge of the fix to M62.
Labels: -Merge-Request-62 Merge-Approved-62
Approved for M62. Branch: 3202
Project Member

Comment 24 by bugdroid1@chromium.org, Sep 8 2017

Labels: -merge-approved-62 merge-merged-3202
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/bf5cd3d2ae42c7b1708a419f88e14bb82cfe82c3

commit bf5cd3d2ae42c7b1708a419f88e14bb82cfe82c3
Author: Martin Sramek <msramek@chromium.org>
Date: Fri Sep 08 23:41:49 2017

Temporarily remove support for the "cache" datatype in Clear-Site-Data

"cookies" and "storage" are still supported. As it turns out, deleting
cache while a site is being loaded breaks some sites.

See crbug.com/762417 for details.

Bug: 762417
Change-Id: If1ddead4ea5249c3eab71cd70b6fdf088c096cd7
Reviewed-on: https://chromium-review.googlesource.com/655718
Reviewed-by: Matt Menke <mmenke@chromium.org>
Commit-Queue: Martin Šrámek <msramek@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#500416}(cherry picked from commit 63d6097e91187b1f6cb171b7e44d88ab0a91124c)
Reviewed-on: https://chromium-review.googlesource.com/657841
Reviewed-by: Martin Šrámek <msramek@chromium.org>
Cr-Commit-Position: refs/branch-heads/3202@{#105}
Cr-Branched-From: fa6a5d87adff761bc16afc5498c3f5944c1daa68-refs/heads/master@{#499098}
[modify] https://crrev.com/bf5cd3d2ae42c7b1708a419f88e14bb82cfe82c3/content/browser/browsing_data/clear_site_data_throttle.cc
[modify] https://crrev.com/bf5cd3d2ae42c7b1708a419f88e14bb82cfe82c3/content/browser/browsing_data/clear_site_data_throttle_browsertest.cc
[modify] https://crrev.com/bf5cd3d2ae42c7b1708a419f88e14bb82cfe82c3/content/browser/browsing_data/clear_site_data_throttle_unittest.cc

Labels: TE-Verified-M63 TE-Verified-63.0.3212.0
Verified this issue on chrome Windows-10, Ubuntu 14.04 and Mac OS 10.12.6 using chrome latest canary #63.0.3212.0 by following steps mentioned in the original comment, observed the page loads fine with no crashes or network error. Hence adding TE-Verified label for M-63.

Thanks!

Comment 26 by b...@chromium.org, Sep 14 2017

Re comment #2: SpdySessionPool::OnIPAddressChanged() calls SpdySession::StartGoingAway(ERR_NETWORK_CHANGED) on all connections, which calls LogAbandonedActiveStream() or LogAbandonedStream(), which generate the description starting with "ABANDONED" then call SpdyStream::LogStreamError(), which logs HTTP2_STREAM_ERROR.  So ERR_NETWORK_CHANGED can be correct here, though the error message might be confusing.
Project Member

Comment 27 by bugdroid1@chromium.org, Dec 7 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d0976380e9c7cbf22f4689b640d1e260b10daddc

commit d0976380e9c7cbf22f4689b640d1e260b10daddc
Author: Martin Sramek <msramek@chromium.org>
Date: Thu Dec 07 14:10:32 2017

Reenable the "cache" datatype in Clear-Site-Data

Testing under heavy load (on the order of thousands of resource loads
writing into the cache and clearing it) did not produce crashes or other
adverse effects.

Therefore, we reenable the cache datatype for M65. With plenty of time
to test on unstable channels of Chromium, and to perform additional
heavy load tests, we should be able to re-ship with confidence in M65.

Bug: 762417
Change-Id: I17bc1efb38fb9d3f5b7e91de2048420dbe75f6ac
Reviewed-on: https://chromium-review.googlesource.com/809128
Reviewed-by: Mike West <mkwst@chromium.org>
Commit-Queue: Martin Šrámek <msramek@chromium.org>
Cr-Commit-Position: refs/heads/master@{#522414}
[modify] https://crrev.com/d0976380e9c7cbf22f4689b640d1e260b10daddc/content/browser/browsing_data/clear_site_data_throttle.cc
[modify] https://crrev.com/d0976380e9c7cbf22f4689b640d1e260b10daddc/content/browser/browsing_data/clear_site_data_throttle_browsertest.cc
[modify] https://crrev.com/d0976380e9c7cbf22f4689b640d1e260b10daddc/content/browser/browsing_data/clear_site_data_throttle_unittest.cc

On what platform/machine did you test this, and with what kind of storage device, and with what kind of size of existing cache?

Sorry, I didn't manage to get to the bottom of this before holidays. I put together a demo at https://clear-site-data-demo.appspot.com and from there, tested it on various devices (my Linux workstation, some Nexus Android devices, even Android emulator) and in all cases cache deletion succeeded.

I'm not sure if any of those was sufficiently low-end, and I didn't fill the cache beyond 150MB (it takes some time to populate it between deletions), so I'll do some more testing. My demo should theoretically be able to populate the cache quickly by loading a large number of distinct random resources, but in practice it doesn't - do you have any tips for doing that from your testing experience?
And by the way, I filed issue 798760. While I wasn't able to reproduce broken loading with cache, I was able to do so with cookies. That might be a completely different problem (which would be sad, as we would have two broken datatypes :( ), but it might be also related.

Sign in to add a comment