IndexedDB: Deleting records does not free up space
Issue description

Repro:
1. Create a database
2. Write 50MB of records to a store
3. Use navigator.storage.estimate() to get the estimated usage
4. Delete all the records in the store
5. Use navigator.storage.estimate() to get the estimated usage

Expected: Usage is lower
Actual: Usage is unchanged (or possibly higher) until a compaction happens.

We force a compaction when a database is deleted, but not on other operations. This means sites can't easily manage data intelligently in low quota situations (e.g. by purging some content).

Maybe we want to trigger a compaction in other scenarios?
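The repro steps above could be scripted roughly as follows. This is only a sketch: the database name "usage-repro", the store name "records", and the layout of 50 records of 1 MiB each are illustrative assumptions, not part of the original report. The helper names (requestToPromise, transactionDone, reclaimedBytes, runRepro) are hypothetical. Random bytes are used so the result does not depend on whether the storage engine compresses values.

```javascript
// Wrap an IDBRequest in a Promise.
function requestToPromise(request) {
  return new Promise((resolve, reject) => {
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// Resolve when a transaction commits; reject if it aborts or errors.
function transactionDone(tx) {
  return new Promise((resolve, reject) => {
    tx.oncomplete = () => resolve();
    tx.onabort = tx.onerror = () => reject(tx.error);
  });
}

// Pure helper: bytes the delete appears to have given back (may be <= 0).
function reclaimedBytes(usageAfterWrite, usageAfterDelete) {
  return usageAfterWrite - usageAfterDelete;
}

// Browser-only: runs steps 1-5 and returns the observed usage numbers.
async function runRepro() {
  // Step 1: create the database (names are assumptions for this sketch).
  const openRequest = indexedDB.open("usage-repro", 1);
  openRequest.onupgradeneeded = () => {
    openRequest.result.createObjectStore("records", { autoIncrement: true });
  };
  const db = await requestToPromise(openRequest);

  // Step 2: write 50 records of 1 MiB of random bytes each.
  const writeTx = db.transaction("records", "readwrite");
  const store = writeTx.objectStore("records");
  for (let i = 0; i < 50; i++) {
    const bytes = new Uint8Array(1024 * 1024);
    // getRandomValues accepts at most 65536 bytes per call.
    for (let offset = 0; offset < bytes.length; offset += 65536) {
      crypto.getRandomValues(bytes.subarray(offset, offset + 65536));
    }
    store.add(bytes);
  }
  await transactionDone(writeTx);

  // Step 3: usage after writing.
  const afterWrite = (await navigator.storage.estimate()).usage;

  // Step 4: delete all the records in the store.
  const deleteTx = db.transaction("records", "readwrite");
  deleteTx.objectStore("records").clear();
  await transactionDone(deleteTx);

  // Step 5: usage after deleting.
  const afterDelete = (await navigator.storage.estimate()).usage;
  db.close();
  return { afterWrite, afterDelete, reclaimed: reclaimedBytes(afterWrite, afterDelete) };
}
```

Calling runRepro() from a page or the DevTools console and logging the result should show whether "reclaimed" is close to zero, which is the behavior this issue describes.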
Oct 30 2017
Does this apply to navigator.storage.estimate() only? Or does it also apply to queryUsageAndQuota()?
Oct 30 2017
Both. The APIs talk to the same back end. The issue is that, like most modern databases, Indexed DB reclaims space lazily, trading accuracy for performance. That's one reason we called the newer API "estimate".
Oct 30 2017
What are the criteria for reclaiming space, or at least for the estimate to update? Is it based on time passing, on certain events, or on the number of records updated/deleted? We definitely have logic that looks up the quota and keeps deleting data until we're back within quota. This bug suggests that this behavior might be broken, and that there is no way to implement it correctly today?
Oct 31 2017
#4, #6: Thank you very much for sharing your implementation experience! You are correct -- there currently is no correct cross-browser way of deleting data and stopping precisely when your quota falls below a certain value. I think we should have an answer to this problem, but right now I don't know what that answer should be.

To the best of my knowledge, IndexedDB can be implemented on top of SQLite (Firefox, Safari), LevelDB (Chrome), or a proprietary storage engine (Edge). I don't know anything about the last case, but I know that LevelDB and SQLite return free space to the operating system in batched operations (compaction / vacuuming) that are triggered using implementation-specific methods.

In both storage engines that I know about, free space reclaiming can be manually triggered. However, at least in LevelDB's case, the operation is not idempotent -- invoking compaction repeatedly might result in more space getting freed up. Reclaiming all the space used by a database would be very expensive, essentially requiring re-reading and re-writing the entire database. This makes me think we can't just provide an API for reclaiming free space, even if we could figure out how to prevent the abuse of such a feature.

I'm sorry, I know that what I wrote doesn't solve your problem. I hope it explains why it's not an easy problem to fix, though.
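To make the failure mode concrete, here is a small sketch of the "delete until we're back in quota" loop from the question, next to one possible mitigation. Everything in it is an assumption for illustration: estimateUsage and deleteOldestBatch are hypothetical callbacks supplied by the application, makeLaggingStore is a test double whose reported usage never drops, and the bounded variant is not a recommendation made in this thread. Reported usage includes storage overhead, so the byte budget in the bounded variant is only approximate.

```javascript
// Naive purge: keep deleting until the reported usage drops below the target.
// If the reported usage lags behind deletes, this deletes everything.
async function purgeUntilUnderQuota(estimateUsage, deleteOldestBatch, targetBytes) {
  let deletedBytes = 0;
  while ((await estimateUsage()) > targetBytes) {
    const freed = await deleteOldestBatch();
    if (freed === 0) break; // nothing left to delete
    deletedBytes += freed;
  }
  return deletedBytes;
}

// Bounded purge: read the usage once, compute how many logical bytes to
// delete, and stop once that many bytes have been deleted.
async function purgeBounded(estimateUsage, deleteOldestBatch, targetBytes) {
  const excess = (await estimateUsage()) - targetBytes;
  let deletedBytes = 0;
  while (deletedBytes < excess) {
    const freed = await deleteOldestBatch();
    if (freed === 0) break; // nothing left to delete
    deletedBytes += freed;
  }
  return deletedBytes;
}

// Test double: a store whose reported usage stays at its initial value,
// as if no compaction ever ran. Each delete removes the oldest record and
// returns its logical size in bytes (0 when the store is empty).
function makeLaggingStore(recordSizes) {
  const records = [...recordSizes];
  const reportedUsage = records.reduce((sum, size) => sum + size, 0);
  return {
    estimateUsage: async () => reportedUsage,
    deleteOldestBatch: async () => (records.length ? records.shift() : 0),
    remaining: () => records.length,
  };
}
```

With four 10-byte records and a 25-byte target, the naive loop empties the fake store, while the bounded loop stops after deleting two records.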
Nov 1 2017
So determining disk quota is dependent on actually reclaiming the space? (i.e. we can't provide a more accurate disk quota number without actually compacting?) Also, is there a way we could force IDB to trigger compaction (short of deleting the data)? Would closing and reopening IDB trigger a compaction, for example?
Nov 2 2017
#8: Here are the answers to your two questions, in the order in which they were asked.

1) We would need design changes to be able to return a number that is effectively the result of iterating through the entire database and summing up the key and value sizes. Deploying design changes would require a full database read for every database, so it'd be tricky.

Separately, it's not clear to me that quota usage should be based on the method that you suggest. While that would make the Web platform more rational (which has obvious benefits for developers), the current quota numbers do reflect true disk usage. Would it make you feel better to get a "1GB" usage number if your database actually uses up 5GB of disk space? (And 5GB is the number we'd use to make eviction decisions.)

By the way, I do not mean to imply that the issue you raised is without merit. As I said above, we do need to come up with a sensible way for you to manage your disk usage.

2) In most cases, closing and reopening the IndexedDB would not trigger a compaction. I was trying to explain above that even if you could trigger a compaction, this would not necessarily make the quota usage (which equals disk space usage) reflect all the delete operations. Longer explanation below.

LevelDB uses an LSM (log-structured merge) tree. Delete operations are stored as deletion markers, and data is actually removed when the table containing the deletion marker is merged with the table containing the data that was deleted. In a nutshell, a compaction operation selects a bunch of tables at the same level and merges them into a new table that gets placed at a higher level.

Note that it is possible that by the time you issue a delete operation, the deleted key's data has made it to a higher level table. If this happens, doing _one_ compaction would not make disk usage go down. We'd need to do enough compactions to get the deletion marker merged with the data it deletes. In the worst case, this'd require going through the entire database.
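The deletion-marker behavior described above can be illustrated with a toy model. This is a deliberately simplified sketch, not LevelDB's actual data structures or compaction policy: each level holds a single table (a Map from key to value size in bytes), a deletion marker is assumed to cost 1 byte, and markers are only dropped when they reach the bottom level. The names TOMBSTONE, diskUsage, compact, and demo are made up for this example.

```javascript
const TOMBSTONE = Symbol("tombstone");
const TOMBSTONE_BYTES = 1; // assumed on-disk cost of a deletion marker

// levels[0] is the newest level; each level maps key -> size in bytes or TOMBSTONE.
function diskUsage(levels) {
  let total = 0;
  for (const table of levels) {
    for (const entry of table.values()) {
      total += entry === TOMBSTONE ? TOMBSTONE_BYTES : entry;
    }
  }
  return total;
}

// Merge level `from` into the next level. Newer entries win. A deletion
// marker can only be dropped at the bottom level, because otherwise an
// older copy of the key may still exist further down.
function compact(levels, from) {
  const to = from + 1;
  const isBottom = to === levels.length - 1;
  const merged = new Map(levels[to]);
  for (const [key, entry] of levels[from]) {
    if (entry === TOMBSTONE && isBottom) {
      merged.delete(key); // marker meets the data: both go away
    } else {
      merged.set(key, entry);
    }
  }
  levels[to] = merged;
  levels[from] = new Map();
}

// A 1000-byte value already lives at the bottom level when it is deleted.
function demo() {
  const levels = [new Map(), new Map(), new Map([["a", 1000]])];
  levels[0].set("a", TOMBSTONE); // delete "a": only a marker is written
  const afterDelete = diskUsage(levels);
  compact(levels, 0); // marker moves to level 1, data untouched
  const afterFirstCompaction = diskUsage(levels);
  compact(levels, 1); // marker finally reaches the data
  const afterSecondCompaction = diskUsage(levels);
  return [afterDelete, afterFirstCompaction, afterSecondCompaction];
}
```

In this model demo() returns [1001, 1001, 0]: usage goes up slightly on delete, is unchanged by the first compaction, and only drops once the marker is merged with the data it deletes.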
Nov 3 2017
1) Yes, that makes sense. It would probably make things more confusing.

2)
a) Would disk space *eventually* trend towards effective data size? Seems like compaction doesn't necessarily solve the problem entirely, but progressive compactions could make disk space usage more efficient over time?
b) Comment 1 suggests that today compaction only happens on database deletion. Does that mean it's possible that a database keeps growing and never reclaims the space?
c) Again from comment 1, is creating & deleting a dummy database on the domain a possible workaround to force compaction?
d) What is your best recommendation for quota management, given what we have today? When we reach our maximum quota, should we delete a large number of items from our database (maybe half), and then wait until the database eventually compacts, which would allow us to reclaim the dead space?
Nov 3 2017
I know these aren't very good answers. I'm really sorry I don't have better answers right now.

2a) Yes.

2b) Chrome: LevelDB compactions happen "regularly". Whether "regularly" is defined in terms of wall clock or number of operations is an implementation detail. Other browsers: SQLite vacuuming should also happen regularly, assuming correct configuration (the autovacuum option must be turned on via a PRAGMA and compile-time support must be enabled).

2c) This is a clever idea. Unfortunately, the workaround wouldn't be very efficient. Details: We do ask LevelDB to do a compaction when an IndexedDB database is deleted. All IndexedDB databases for an origin are stored in the same LevelDB database, so the compaction request would hit the correct LevelDB database. However, when we issue the compaction request, we hint it with the key range of the deleted database. This hint will most likely result in the wrong tables (from the perspective of your workaround) being selected for the merge.

2d) Today, the only recommendation I can give is storing most data in Blobs. Sadly, even this weak recommendation is Chrome-dependent. In the short term, once we re-enable IndexedDB large value wrapping, any large objects will automatically be stored in Blobs. IIRC, you use large objects, so you should get the benefits described below automatically.

Details: Blobs are managed as separate files in a per-origin directory, so all the LevelDB concerns don't apply to them. After the IndexedDB object containing a Blob is deleted (meaning the transaction containing the delete commits) and the renderer releases all references to the Blob (you're not hanging on to it in your JavaScript), the file is deleted. Modulo some bugs that we're looking at, the quota usage should quickly reflect Blob deletion.
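The Blob recommendation in 2d) could look roughly like the sketch below. The 64 KiB threshold, the record shape { id, payload }, and the function names toRecord and putRecord are assumptions made for this example. They are not Chrome constants or the behavior of the large value wrapping feature mentioned above. The object store is assumed to have been created with keyPath "id".

```javascript
// Assumed cutoff above which a payload is stored as a Blob.
const BLOB_THRESHOLD_BYTES = 64 * 1024;

// Build the record to store: large payloads become Blobs, small ones stay inline.
// `bytes` is expected to be a typed array such as Uint8Array.
function toRecord(id, bytes, threshold = BLOB_THRESHOLD_BYTES) {
  const payload = bytes.byteLength > threshold ? new Blob([bytes]) : bytes;
  return { id, payload };
}

// Browser-only: write the record and resolve once the transaction commits.
// The Blob is not kept in any variable afterwards, so after the record is
// deleted later, no script reference keeps the backing file alive.
function putRecord(db, storeName, id, bytes) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction(storeName, "readwrite");
    tx.objectStore(storeName).put(toRecord(id, bytes));
    tx.oncomplete = () => resolve();
    tx.onabort = tx.onerror = () => reject(tx.error);
  });
}
```

Whether usage drops promptly after deleting such a record still depends on the browser, as noted above.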
Nov 3 2017
Thank you for the detailed & thoughtful answer. RE: "Whether "regularly" is defined in terms of wall clock or number of operations is an implementation detail." I understand this is currently an implementation detail, but knowing the current state of the implementation in Chrome could help us work around this issue. So is there any more information you can share about how often/when compaction happens in Chrome?
May 16 2018
Comment 1 by sheriffbot@chromium.org, Oct 30 2017
Status: Untriaged (was: Available)