Blob references block Cache Storage removal via Clear Browsing Data/Clear Site Data/DevTools
Reported by
drmrbre...@gmail.com,
Dec 18 2017
|
||||||||||||||||||
Issue descriptionUserAgent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.108 Safari/537.36 Steps to reproduce the problem: (1) open new non-incognito tab (2) go to https://www.chromestatus.com/features (3) F12 and go to Application > Clear Storage section (4) leave all checkboxes ticked and hit Clear Site Data button What is the expected behavior? Since "Cache Storage" checkbox is ticked, I'd expect the Cache Storage to be cleared. What went wrong? But it isn't... the red pie section for Cache Storage remains non-zero. Maybe it will be explained away that clearing cache doesn't actually clear it completely but merely marks it as being available to clear. But even so, I think if the user hits Clear Site Data then they expect that to happen NOW and not at some time in future. Plus... doing the repro steps in an INCOGNITO window *does* result in the Cache Storage going to zero after hitting the button... that's what I'd expect to happen in a non-incognito window too. Did this work before? N/A Chrome version: 63.0.3239.108 Channel: stable OS Version: 6.1 (Windows 7, Windows Server 2008 R2) Flash Version:
,
Dec 19 2017
Can repro on M63. ToT works fine.
,
Dec 19 2017
Also seems to be an issue on Android. My tethered device (Pixel 2016) doesn't clear the storage. I keep getting Quota Exceeded errors until I manually clear the site's data on the phone.
,
Dec 19 2017
,
Jan 17 2018
I looked into this and I could reproduce the issue by going to https://www.chromestatus.com/features and navigating around until some cache storage is collected (like going to "All features", "Releases", "Samples"). Clicking "Clear site data" will sometimes not remove this storage, sometimes it does. I also tried to delete cookies and cache from Clear Browsing Data and it doesn't finish when devtools fail to delete as well, so this issue is not just related to devtools. Sadly I couldn't create a way to trigger this reliably. Further debugging showed that StoragePartitionImpl tells QuotaManager to delete cache storage, which calls CacheStorageQuotaClient::DeleteOriginData. This task never succeeds. I created a CL to add some log output: https://crrev.com/c/870770 Closing Chrome gives the following error: [200149:200181:0117/155716.372996:ERROR:storage_partition_impl.cc(134)] Couldn't remove data of type 0 for origin https://www.chromestatus.com/. Status: 17 falken@: Could you take a look at this?
,
Jan 17 2018
,
Jan 17 2018
Cache Storage folks would be better... applying the Component and re-assigning to cmumford.
,
Jan 17 2018
bsittler@: I remember we saw a very similar bug for chromestatus.com/features. If I'm right, can you please close as a duplicate? If I'm wrong, can you please try to repro in Stable (to make sure we have a repro) and then in Canary (to see if the Cache Storage fix covers it)?
,
Jan 17 2018
Possibly a duplicate of https://bugs.chromium.org/p/chromium/issues/detail?id=801024 but have not yet repro'ed so still unsure
,
Jan 18 2018
I reproduced the issue in Chromium on master (crrev.com/529292) The issue doesn't look like incorrect reporting to QuotaManager as CacheStorageQuotaClient::DeleteOriginData never calls its callback, I guess it doesn't finish at all.
,
Jan 19 2018
#c10 - does the apparent usage vanish when you reload the page? so far that's what i'm seeing, reproducibly from M63 all the way through ToT
,
Jan 19 2018
,
Jan 22 2018
I played with it a bit more and recorded a video of Cache Storage never disappearing on a new Chrome profile. If I only select the Cache Storage checkbox, the data is not deleted after reload: https://drive.google.com/file/d/1FzSJSBM4PSCzb_jEDXDpYq13fubEwWvq/view?usp=sharing Afterwards I tried it again but I kept all checkboxes on. This time it deleted cache storage after reload.
,
Jan 22 2018
If that deletion does not include the Service Worker and the Service Worker keeps a handle for a cache around, I suspect the behavior you see may be more or less expected. I don't see how the Service Worker could release such a handle before it goes out of scope and gets cleaned up by a subsequent garbage collection. Do you know whether the Service Worker on that page releases cache handles aggressively?
,
Jan 23 2018
I have no idea, what the code on this site is doing. I'm mostly concerned that we have a problem that impacts deletions from Clear Browsing Data and maybe also the Clear-Site-Data header. Would it be possible to detect that there are open cache handles and clear the content of the cache instead of deleting the caches? If you keep the site open and try to clear cookies and cache from chrome://settings/clearBrowserData, the operation will not finish until the chrome status tab is closed. This is a huge issue if more sites start to use service workers. Screencast: https://drive.google.com/open?id=1JsjM-T87K_iCSMFR_QAx6AXWi5s75nYU
,
Jan 23 2018
Should clearing any sort of storage through the developer tools UI or site settings UI result in a reload of all affected tabs?
,
Jan 23 2018
(and any serviceworkers registered on the origin, provided the storage is SW-accessible?)
,
Jan 23 2018
The behavior for cache, cookie and other site data is that the data is just deleted without reloading tabs. Would it be possible to do the same here as well?
,
Jan 23 2018
For stateful APIs, "the data is just deleted" is not that simple; without reloading the tabs we need to define the behavior of the API when the backing store is suddenly gone. Yes, we absolutely should clear the data. But "just" can understate the complexity. FYI, for Indexed DB we forcibly close connections when browsing data is deleted; this was in violation of the spec until v2 where we defined this behavior, and introduced an event to signal pages that the connection was closed out from underneath them. ... It would also be nice to clarify this bug relative to issue 777462. I'm guessing it was originally a dupe (e.g. see comment #2), but if it still repros in TOT (see comment #10) then the fix put in https://chromium-review.googlesource.com/755844 is insufficient?
,
Jan 24 2018
Thanks, I think this is independent of the quota manager bug from issue 777462. I think there are two issues: 1. Cache storage doesn't get deleted unless you also delete the service worker and reload the tab. 2. Cache storage deletion doesn't finish until all tabs using cache storage are closed or reloaded. This causes Clear Browsing Data to wait indefinitely. If the specs don't allow removing caches that are currently open, would it be possible to keep caches alive but remove their content? I think this could solve both issues but I don't really know how Cache storage works and how it is used ;)
,
Jan 24 2018
Still trying to follow the fixes from crbug.com/777462 with the aid of some debug logging, but I did notice that a reload is not required to reclaim the usage - merely telling devtools to stop the running service worker is sufficient
,
Jan 24 2018
(that is, merely stopping it without even unregistering it is sufficient)
,
Jan 24 2018
Also, CacheStorage::GetSizeThenCloseAllCachesImpl is indeed being called, and each of the caches is found via cache_index_->ordered_cache_metadata(), but no caches are found in doomed_caches_
,
Jan 31 2018
Any updates on this issue? (I noticed that in #20 I confused issue 777462 with issue 801024 from above. 777462 seems to describe the same problem as this one but the fix wasn't sufficient)
,
Jan 31 2018
After talking with cmumford@, it sounds like I was mis-remembering the scope of issue 777462 - that covers caches that should be deleted but haven't been yet (i.e. "doomed"). In this case, the SW is still holding the cache, so (as noted in #20) the data is retained until the SW goes away. This matches what bsittler@ sees in #23 where there isn't anything "doomed" in this case. So back to my comment in #19 (or #20) - we need rework the whole data deletion path for cache storage to put the caches into a new "sorry, your data got deleted out from under you, everything is an error" state even if that violates the current spec, and actually delete the data so that CBD doesn't wait. A bunch more engineering work. :( (Thanks for persisting on this, dullweber@ and bsittler@ !)
,
Jan 31 2018
Since I have a hard time finding the UI, what dullweber@ mentions in #5 is: Settings > Privacy and security > Content settings > Cookies > See all cookie and site data And then filter by site, e.g.: chrome://settings/cookies/detail?site=www.chromestatus.com ... and then drill in and you can see - or try to delete - Cache Storage, e.g. without also deleting Service Workers.
,
Jan 31 2018
,
Feb 1 2018
The UI I mentionend in #5 is in DevTools > Application > Clear Storage. You can see it in the video from #12 but the settings UI is probably doing the same. Also thanks for investigating the issue!
,
Feb 3 2018
Mini-status update: I'm currently experimenting with a change to delete all the individual cache entries when deleting a cache through the devtools or privacy and security UIs so that more storage will be reclaimed sooner. It's semi-equivalent to doing this prior to deleting the cache and GC'ing: for (let req of await cache.keys()) await cache.delete(req); I believe this will cause the user-expected behavior without requiring changes to the API surface. The end result is not dissimilar to the broken states caused by any other partial storage deletion, so I think it's not a significant new platform wrinkle. If a service worker or other script has a cache handle open at the time of the UI-initiated deletion, that handle will still "work", but the cache itself will disappear too as soon as the last reference is gone, and will cease appearing in enumerations and working in open() immediately. All old entries will immediately cease appearing in enumerations and working in match() etc., and each entry will also no longer actually use storage space once any outstanding references to them are dropped. Because you can't find the to-be-deleted cache via any enumaration of caches after the deletion UI is used, I think it poses no significant risk of being used post-deletion by newly started code. The code that already had the handle open already knew of its (former) existence and could have stashed copies of (or references to) its contents elsewhere already — for instance, in JS globals. If these side-effects are undesirable, the devtools also allow the service workers to be stopped, and open tabs can be closed to terminate non-SW scripts. Semi-related: filed https://github.com/w3c/ServiceWorker/issues/1276 proposing to change the service worker spec to explicitly admit to the possibility of user-initiated cache/cache entry deletion.
,
Feb 3 2018
for clarity and attribution: this is exactly the "keep caches alive but remove their content" suggestion from #20
,
Feb 16 2018
I hit this issue while fetch a page with the Clear-Site-Data header through a service worker. The page hangs forever. Will try not calling cacheStorage.open() as a workaround.
,
Feb 20 2018
,
Mar 19 2018
Any updates?
,
Apr 12 2018
Is there any update on this bug? Whenever I hit the quota limit on incognito, and trying to use cache API to delete all the known caches, the quota does not return to 0. I suspect it is somewhat related to this bug.
,
Apr 12 2018
I won't be able to complete this work. I have two different old in-progress attempts to resolve this issue and I hope to upload them soon to help future fixers
,
Apr 19 2018
I think I hit this problem on Mac, M65 Stable, today. Would be good to get attention on this bug.
,
Apr 19 2018
jsbell@: Could someone from the storage team pick this up after bsittler@?
,
Apr 24 2018
I've had this (or something similar) repeatedly while working on a PWA. I find that if I make sure the service workers are stopped, by clicking on 'Service Workers' and then 'stop' by any service worker that is still running, and then clear the storage in 'Clear storage', then I get much more predictable behaviour. It seems to be that the 'Clear storage' page needs a 'Stop all service workers' checkbox in addition to 'Unregister service workers' (perhaps they should be combined).
,
May 22 2018
Any chance that someone from the storage team could work on this?
,
May 23 2018
We believe that this is a serious privacy issue because Clear Browsing Data, Clear Site Data and a data deletion related feature in Dev Tools are currently not working correctly. The bug was unassigned for about a month, could you give us some estimate of when this will get fixed?
,
May 23 2018
After digging into the code, looking at proposed patches, and reviewing the call paths... it seems like everything *should* be working as-is after the fix for issue 777462 (which landed in M64). Handles to caches are closed and the directory is deleted. I'm unable to repro in a tip-of-tree build or M66, but I'm likely missing something. -------------------------------------------------- Clear Browsing Data 1. Visit https://www.chromestatus.com/ 2. Click "All features", "Releases", "Samples" to create/fill the cache 3. Menu > More tools > Clear browsing data… 4. Leave option as Last hour 5. Click Clear data 6. Open new tab to chrome://settings/siteData Expect: no www.chromestatus.com entry Actual: as expected (logging indicates DeleteOriginDidDeleteDir was called; inspection of Default/Service Worker/CacheStorage shows the directory was indeed deleted) -------------------------------------------------- Clear Site Data 1. Visit https://www.chromestatus.com/ 2. Click "All features", "Releases", "Samples" to create/fill the cache 3. Open new tab to chrome://settings/siteData 4. Click on www.chromestatus.com entry 5. Click X next to Cache Storage 6. Close tab 7. Open new tab to chrome://settings/siteData (note: results cached, so use new tab) 8. Click on www.chromestatus.com entry Expect: no Cache Storage line Actual: as expected (logging indicates DeleteOriginDidDeleteDir was called; inspection of Default/Service Worker/CacheStorage shows the directory was indeed deleted) -------------------------------------------------- Dev Tools 1. Visit https://www.chromestatus.com/ 2. Click "All features", "Releases", "Samples" to create/fill the cache 3. Menu > More tools > Developer Tools 4. Select Application tab 5. Click Application > Clear storage (on the left) 6. Uncheck everything but Cache storage 7. Click Clear site data Expect: Cache Storage pie segment disappears Actual: as expected (logging indicates DeleteOriginDidDeleteDir was called; inspection of Default/Service Worker/CacheStorage shows the directory was indeed deleted) -------------------------------------------------- This doesn't jibe with comments #5 and #10 though, indicating a test at r529292 (and the fix for 777462 was at r514535) dullweber@ - are you still able to repro?
,
May 24 2018
I'm still able to reproduce the issue on 66.0.3359.181 :(
,
May 24 2018
Thanks for confirming. I just observed it happening on my Chromebook too. We'll get working on this.
,
May 24 2018
For reference, here's an updated set of repro steps that seems to work in 66 and ToT: 1. Visit https://www.chromestatus.com/ 2. Click "All features", "Releases", "Samples" to create/fill the cache 3. Menu > More tools > Developer Tools 4. Select Application tab; observe that there's a bunch of stuff in the cache 5. Close the tab (but not the browser, use an empty tab if needed) 6. Visit https://www.chromestatus.com/ again 7. Menu > More tools > Developer Tools 8. Select Application tab; observe there's still a bunch of stuff in cache 9. Click Application > Clear storage (on the left) 10. Uncheck everything but Cache storage 11. Click Clear site data Expect: Cache Storage pie segment disappears Actual: Pie doesn't change.
,
May 24 2018
And to confirm the observation in #5/#10, progress is just stopping on the callback chain. More proximally, CacheStorage::GetSizeThenCloseAllCachesImpl sets up a BarrierClosure for SizeRetrievedFromAllCaches which is never executed.
,
May 24 2018
Further note to self: With four caches, GetSizeThenClose/GetSizeThenCloseDidGetSize/CloseImpl all run for each. But only 3 run the scheduled DeleteBackendCompletedIO. Maybe destruction is happening early and c/o weakptr magic it never runs?
,
May 25 2018
Nope, not destruction happening early. Well, destruction of the CacheStorageCache at least. The callback does eventually run on browser shutdown. (I'm nervous about the use of method/weakpointer callbacks being used for this work in general since it means things like barrier closures will never get run if destruction occurs unexpectedly, even if that turns out not to be the cause here...) Also, there seems to be another bug. In a test run where the deletion did occur (callbacks all fired, directory was deleted, etc), the Application tab's pie chart stubbornly refused to update until browser shutdown. So something was caching incorrect values. Wheee.
,
May 25 2018
Looks like it's a CacheStorageCacheDataHandle that can keep an entry alive, which prevents the cleanup from completing. Will need to talk to the Blob experts.
,
May 25 2018
Okay, handing off to mek@
From the blob side of things, this would be "by design" since you don't generally want to delete the data out from behind a blob. But this is a case where things reading the blob should get errors.
Looks like this will need something like:
* CacheStorageCache tracks CacheStorageCacheDataHandles it has minted, which can be invalidated on close. Without leaking?
* BlobDataItems can support invalidation; reading that slice would error (???)
* CacheStorageCacheDataHandle is given a pointer to its corresponding BlobDataItem and can invalidate it (???)
The other issue I noted ("deletion did occur... Application tab's pie chart stubbornly refused to update") seems to be caused by failure to report accurate deltas from the Cache Storage system to the Quota system. I can repro if I run through the repro steps for this bug w/o a browser restart, so it will likely go away when the main bug is fixed.
,
May 25 2018
,
May 25 2018
Is this a spec problem? Shouldn't the spec arrange for all origin pages to be detached somehow while the storage invalidation occurs? Perhaps by triggering the page reloads, but delaying those loads from progressing until invalidation actually completes.
,
May 25 2018
* CacheStorageCache tracks CacheStorageCacheDataHandles it has minted, which can be invalidated on close. Without leaking? Probably just by keeping raw pointers to these in a list and having the CacheStorageCacheDataHandle destructor remove itself from that list? * BlobDataItems can support invalidation; reading that slice would error (???) * CacheStorageCacheDataHandle is given a pointer to its corresponding BlobDataItem and can invalidate it (???) That doesn't really work (at least not easily), since arbitrarily many BlobDataItems can reference the same DataHandle (i.e. when you slice a blob we'll create a new DataItem sharing the DataHandle with the existing blob). So what I think I'll do is add an IsValid() method to the DataHandle interface, and make sure to check that any time before attempting to read the data. Longer term I'd love to somehow make DataHandle own things like the disk_cache::Entry* and expose some abstract interface for actually reading the data. That way the blob system could stop having to know anything specific about cache storage (and similarly about the filesystem API implementation). But that's a separate thing and shouldn't be needed to fix this.
,
May 25 2018
Re #52, in the case of blob handles holding on to the actual data there is nothing that would limit that to same origin pages. The blob can just as happily have been transferred to a different origin page.
,
May 25 2018
> That doesn't really work .... And that's why I passed this off to mek@ :) > Is this a spec problem? Sort of, but we shouldn't block this on the spec. Related notes: * Clearing just one storage type, or clearing storage w/o also detaching pages, doesn't guarantee privacy since state can still be tracked. That's why the Clear-Site-Data header (https://www.w3.org/TR/clear-site-data/) includes the "executionContexts" type. * Most specs underspecify the behavior when storage is cleared. We did introduce a close() event on IDBDatabase to handle this case. But as noted in offline discussion w/ mek@ that doesn't cover the behavior of blobs stored in the database; in chrome, at least, the files will be deleted and the blobs will hit I/O errors, which breaks the contract that blobs are immutable. I suspect we'd want Storage to have hooks for other specs e.g. "write 'cache' data into the default bucket of origin X" and "When 'cache' data is cleared from a bucket, ...", use those hooks in IDB, SW, etc, and and Clear-Site-Data and then browser UX would align with those, and then stuff in FileAPI would cover Blobs backed by data that's no longer available.
,
May 29 2018
> Re #52, in the case of blob handles holding on to the actual data there is nothing that would limit that to same origin pages. The blob can just as happily have been transferred to a different origin page. So, you mean `Blob` object that is passed to a cross-origin context via `postMessage()`? Shouldn't that structure clone such that the resulting blob is functionally a copy of the original? At least from the spec perspective... > Sort of, but we shouldn't block this on the spec. Yea, I'm not trying to block your implementation. I'm just trying to figure out if this is something that needs to be addressed at the spec level. We're trying to figure out if we can/should implement Clear-Site-Data in firefox. I'll just file a spec issue to take my discussion there: https://github.com/w3c/webappsec-clear-site-data/issues/49 Sorry for the noise here.
,
May 29 2018
> So, you mean `Blob` object that is passed to a cross-origin context via `postMessage()`? Shouldn't that structure clone such that the resulting blob is functionally a copy of the original? At least from the spec perspective... Yes, from the spec perspective every structured clone effectively copies the data of the blob, but in at least our implementation we almost never actually copy any blob data, and instead just have multiple blobs reference the same underlying data. Similar to how a blob can also be backed by a file on disk. If I would want to spec the chrome behavior I'm currently implementing here for the cache case, I think it would be en extension of the snapshot state concept (although that concept itself is already horribly underspecified). I.e. deleting a cache like this would "invalidate" the snapshot state of any blobs that were created by the cache, similar to how a file backed blob becomes invalid if the file on disk changes.
,
May 29 2018
In firefox we currently "spill" large memory blobs into tmp files to make them file-backed. Does chrome do the same? I was wondering if it would be possible to move the backing file to this kind of tmp storage for any outstanding blobs when the origin storage is purged. (I imagine its probably way harder than it sounds...)
,
May 29 2018
> In firefox we currently "spill" large memory blobs into tmp files to make them file-backed. Does chrome do the same? I was wondering if it would be possible to move the backing file to this kind of tmp storage for any outstanding blobs when the origin storage is purged. (I imagine its probably way harder than it sounds...) Yeah, we pretty much do the same for large memory backed blobs. But it seems a bit weird to copy your cache storage data somewhere else when you're trying to delete it just to keep blobs alive, and doing it preemptively would be unfortunate too. Also not sure how our cache storage actually exists on disk. In the blob implementation we're literally just referencing the disk cache entries, without needing to know anything about the underlying storage mechanism used by the disk cache.
,
May 29 2018
> But it seems a bit weird to copy your cache storage data somewhere else when you're trying to delete it just to keep blobs alive, and doing it preemptively would be unfortunate too. Sure, but this is for blobs that were structured cloned and technically should be copies, no? And I guess in my simplified mental model it would be more of a file rename instead of an actual copy. *waves hands* Anyway, I'll stop hijacking this issue. Sounds like the blob lifecycle stuff is a bigger known issue with the specs.
,
May 29 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/3fb12fb8b5fe9ae8cb1d9f8a2f98b0ebc4c48db1 commit 3fb12fb8b5fe9ae8cb1d9f8a2f98b0ebc4c48db1 Author: Marijn Kruisselbrink <mek@chromium.org> Date: Tue May 29 22:12:58 2018 [CacheStorage] Invalidate cache-backed blobs when cache is closed. This adds a new IsValid() method to BlobDataItem::DataHandle, and uses that in CacheStorageCache to make it possible to invalidate a blob backed by cache storage. All callsites that try to access the cache entry for a blob data item have been updated to check for validity before trying to derefence the entry. Also some refactoring to MojoBlobReader to unify error handling and make sure all paths actually handle read errors correctly. Bug: 795701 Change-Id: I78865380efa9c6e18a70ab122c5ceb26ffc38a0d Reviewed-on: https://chromium-review.googlesource.com/1073868 Commit-Queue: Marijn Kruisselbrink <mek@chromium.org> Reviewed-by: Joshua Bell <jsbell@chromium.org> Reviewed-by: Daniel Murphy <dmurph@chromium.org> Cr-Commit-Position: refs/heads/master@{#562623} [modify] https://crrev.com/3fb12fb8b5fe9ae8cb1d9f8a2f98b0ebc4c48db1/content/browser/cache_storage/cache_storage_cache.cc [modify] https://crrev.com/3fb12fb8b5fe9ae8cb1d9f8a2f98b0ebc4c48db1/content/browser/cache_storage/cache_storage_cache.h [modify] https://crrev.com/3fb12fb8b5fe9ae8cb1d9f8a2f98b0ebc4c48db1/content/browser/cache_storage/cache_storage_cache_unittest.cc [modify] https://crrev.com/3fb12fb8b5fe9ae8cb1d9f8a2f98b0ebc4c48db1/storage/browser/blob/blob_data_item.cc [modify] https://crrev.com/3fb12fb8b5fe9ae8cb1d9f8a2f98b0ebc4c48db1/storage/browser/blob/blob_data_item.h [modify] https://crrev.com/3fb12fb8b5fe9ae8cb1d9f8a2f98b0ebc4c48db1/storage/browser/blob/blob_impl.cc [modify] https://crrev.com/3fb12fb8b5fe9ae8cb1d9f8a2f98b0ebc4c48db1/storage/browser/blob/blob_reader.cc [modify] https://crrev.com/3fb12fb8b5fe9ae8cb1d9f8a2f98b0ebc4c48db1/storage/browser/blob/mojo_blob_reader.cc [modify] https://crrev.com/3fb12fb8b5fe9ae8cb1d9f8a2f98b0ebc4c48db1/storage/browser/blob/view_blob_internals_job.cc
,
May 29 2018
Tried a local build and I'm no longer able to repro using the steps from comment #45, and similar variants using Clear Browsing Data and Clear Site Data. dullweber@ - can you give a local build or the next canary a try?
,
May 29 2018
,
May 30 2018
I also can't reproduce the issue anymore. Thanks for fixing it! |
||||||||||||||||||
►
Sign in to add a comment |
||||||||||||||||||
Comment 1 by krajshree@chromium.org
, Dec 18 2017