
Issue 770307


Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Android , Windows , Chrome , Mac , Fuchsia
Pri: 2
Type: Bug

Blocked on:
issue 918937




[Storage] Size accounting & reporting for SessionStorage

Project Member Reported by dmu...@chromium.org, Sep 29 2017

Issue description

Session storage currently doesn't report its size at all, which becomes a problem when it grows very large. We have an example device with >400MB of session storage (Chrome on Android), but none of it is reported, and the user is shown that websites are taking up only ~80MB.


We need to report this size, and it looks like we'll need to do some decent plumbing/thinking to get this information available.
 

Comment 1 by mek@chromium.org, Sep 29 2017

DOMStorageContext::GetSessionStorageUsage is generally where I'd expect size to be reported, but unlike the GetLocalStorageUsage method, GetSessionStorageUsage only returns origins+namespace IDs, but doesn't include any kind of size.

Actually being able to report per-origin sizes would probably involve some change to the database format similar to how localstorage keeps track of per-origin usage. And that would still not match actual on-disk usage, but not sure if that matters.
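
To make the per-origin accounting idea concrete, here is a minimal in-memory sketch. It assumes usage is counted as key bytes plus value bytes. The class name SessionStorageUsageTracker and its methods are hypothetical and are not Chromium APIs. As noted above, a logical count like this would still not match on-disk usage.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical helper (not a Chromium class): keeps a running per-origin
// byte count that is updated on every write and delete.
class SessionStorageUsageTracker {
 public:
  // Stores key -> value for the origin and adjusts the origin's usage.
  void Put(const std::string& origin, const std::string& key,
           const std::string& value) {
    auto& area = areas_[origin];
    auto it = area.find(key);
    if (it != area.end()) {
      // Overwrite: drop the old entry's size before adding the new one.
      usage_[origin] -= EntrySize(key, it->second);
      it->second = value;
    } else {
      area.emplace(key, value);
    }
    usage_[origin] += EntrySize(key, value);
  }

  // Removes the key (if present) and subtracts its size.
  void Remove(const std::string& origin, const std::string& key) {
    auto area_it = areas_.find(origin);
    if (area_it == areas_.end()) return;
    auto it = area_it->second.find(key);
    if (it == area_it->second.end()) return;
    const std::uint64_t size = EntrySize(key, it->second);
    assert(usage_[origin] >= size);
    usage_[origin] -= size;
    area_it->second.erase(it);
  }

  // Logical bytes currently attributed to the origin (0 if unknown).
  std::uint64_t UsageFor(const std::string& origin) const {
    auto it = usage_.find(origin);
    return it == usage_.end() ? 0 : it->second;
  }

 private:
  static std::uint64_t EntrySize(const std::string& key,
                                 const std::string& value) {
    return key.size() + value.size();
  }

  std::map<std::string, std::map<std::string, std::string>> areas_;
  std::map<std::string, std::uint64_t> usage_;
};
```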

Comment 2 by mek@chromium.org, Sep 29 2017

Having > 400MB of session storage also sounds like there are other issues going on btw, as (this being session storage) there shouldn't be any storage for origins that aren't open anymore, and there is code that tries to delete unused data...

Comment 3 by mek@chromium.org, Oct 5 2017

Hopefully I'm missing something, but it appears to me that StartScavengingUnusedSessionStorage (the code responsible for cleaning up no longer needed session storage after startup) isn't actually called from any code that is compiled on android... Combined with android probably being more likely to get the browser process killed, and thus not cleaning up on shutdown, that could be pretty bad.
Yeah it looks like everything is done on session restore, and we don't do session restore on Android. We might want to rope in some Android folks to see how they would want to solve this, as it's nontrivial w/ tabs that we have killed but we still want them to work. Also WebView is an unknown here.
Cc: tedc...@chromium.org ssid@chromium.org
My understanding is that on Android when the browser is killed we don't want to clean up session storage immediately because the kills are rather frequent and more of an implementation detail to the user who may be switching between their IM and their browser.

However, sounds like as a result we never actually remove any session storage from disk then?

I think the question here is what defines a session on Android. At the very least we should be able to remove storage associated with sites that were closed by the user. Not sure what we should do for the user who never closes tabs. Maybe remove it when we start without savedInstanceState?

Comment 6 by mek@chromium.org, Oct 11 2017

For what it's worth, from at least a quick test it seems we don't actually restore session storage at all currently on android. For example if I have a tab with some values stored in session storage, kill chrome (by swiping chrome away), and start chrome again my tab no longer has data in session storage.

Comment 7 by ssid@chromium.org, Oct 11 2017

I see 3 options:
1. Do not store session storage on disk in Android. This would cause 2 problems:
The data has to be stored in-memory and will cause memory regressions. This would get much worse over long sessions.

2. Store it on disk and clear the files at startup. I do not see any cons. The amount of disk writes or memory would not regress.

3. Restore the database and delete all origins that are not currently active, else the data would be ever growing. This has multiple open questions:
It is tricky to detect if the origin is still open (stale pages, iframes) and could introduce more bugs. Might also cause regressions if some page left open in the background keeps adding more data. I think on desktop the storage is only restored in session restore. Now we would want to restore always on Android? What would happen to the user with 400MB storage if we restore?

I would suggest we just clear the files and maybe gather more ideas of how to improve the session on Android.
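
For illustration only, option 2 could look roughly like the sketch below, which wipes and recreates a storage directory at startup using std::filesystem. The function name ResetSessionStorageDirectory and the idea of a single directory holding the database are assumptions for this example. This is not the change that eventually landed.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical startup hook: deletes whatever a previous run left in `dir`
// and recreates it empty. Returns true on success.
bool ResetSessionStorageDirectory(const fs::path& dir) {
  // Guard against accidentally passing an empty path.
  assert(!dir.empty());

  std::error_code ec;
  fs::remove_all(dir, ec);  // A missing directory is not an error.
  if (ec) return false;

  fs::create_directories(dir, ec);
  return !ec;
}
```

Deleting a directory tree is bounded by the number of files in it, so for a very large database this work may be better scheduled off the critical startup path.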

Comment 8 by mek@chromium.org, Oct 11 2017

Option 2 is more or less what we agreed on earlier, so that sounds good to me. It's a bit weird that session storage on android doesn't survive browser kills, but people don't seem to be complaining, and we wouldn't be regressing anything, so probably indeed the best way to go.

Comment 9 by dmu...@chromium.org, Oct 11 2017

Yes let's do option 2 - especially because we don't store session storage namespaces for tabs serialized to disk anyways on android.
Owner: dmu...@chromium.org

Comment 11 by pasko@chromium.org, Oct 13 2017

Cc: pasko@chromium.org

Comment 12 by pasko@chromium.org, Oct 13 2017

ssid: Are you saying that session storage currently may be offloaded to disk? In which circumstances does this happen? How does it work with the sync API? Are we allowing the renderer to jank for the duration of a round trip to disk?

Potentially (2) and (3) will slow down startup, so I'd be careful.

There are other options:

4. running a job to clear stale data when the device is charging and the screen is off

5. complicated hack, mentioning for completeness: implement something like 'hidden_env.h' as disk backend for session storage leveldb on Android that would only create unlinked files (guaranteed by the kernel not to survive restarts/crashes, but readable/writable as long as the file descriptor is open) - on pre-3.11 (without O_TMPFILE support) it is not fully race-free, but the mitigation is easy and cheap. This hack would work on all POSIX systems, but not on Windows.
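
A minimal sketch of the unlinked-file idea from option 5, using the portable POSIX fallback (mkstemp followed immediately by unlink) rather than O_TMPFILE. The function name OpenUnlinkedFile is hypothetical. This is not a leveldb Env implementation, only the file-creation step such a backend would rely on.

```cpp
#include <cassert>
#include <stdlib.h>
#include <string>
#include <unistd.h>

// Hypothetical helper: creates a file under `dir`, removes its directory
// entry right away, and returns the still-open descriptor (or -1 on error).
// The data stays readable and writable through the descriptor and is
// reclaimed by the kernel once the last descriptor is closed.
int OpenUnlinkedFile(const std::string& dir) {
  assert(!dir.empty());

  std::string path = dir + "/session-XXXXXX";
  int fd = mkstemp(&path[0]);  // Fills in the XXXXXX suffix in place.
  if (fd < 0) return -1;

  // Between mkstemp and unlink the file is briefly visible by name; this is
  // the race that O_TMPFILE avoids on kernels that support it.
  if (unlink(path.c_str()) != 0) {
    close(fd);
    return -1;
  }
  return fd;
}
```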
It happens like this:

* In the renderer, we have a cache of session storage in memory.
* All writes are also sent to disk.
* If there is a crash & session restore (desktop only, not android), then we reload session storage from disk.
* When we create a new window or do a navigation, session storage is cloned like this: https://html.spec.whatwg.org/multipage/browsers.html#copy-session-storage - where we read that data from disk (as the renderer that has it in memory can be gone). BUT we currently don't limit the session storage to just the origin of the browsing context; we actually copy all origin data (so iframes will still have data from the first tab). We can remove this, as it's not to spec.

Basically the reason to still have the data on disk is for reading it back in for navigations or child pages (clicking on a link that opens a new page).

If we wanted to move everything in-memory (and totally avoid disk for everything other than session restore), it gets more fragile, as we need to move memory between two renderers, so there has to be a good system to do that. This is definitely possible now due to mojo, but I don't know if we're at the point where we can move this data like that.

4. This might not be too hard, as we can probably reuse the cleaning code from desktop.

5. We can almost definitely do this for our disk storage. This might be useful for blob storage as well.

Comment 14 by pasko@chromium.org, Oct 16 2017

Cc: primiano@chromium.org lizeb@chromium.org
dmurph: that explains it for me, thanks!

curious: can several renderers serve the same origin in a single session? The cleaning still sounds like the best/simplest option in the short term, though we may want to come back to it to reduce disk write pressure, which is .. 'somewhat possible'.

Suppose we have a key-value storage with these properties:
* disk-based, but can be locked to memory
* when unlocked, the kernel decides whether to push it out to disk based on memory pressure (it does not know about disk wear, which is unfortunate, but existing heuristics could be 'enough for everyone')
* no more than a single copy of the database is stored in memory at all times
* goes away on crashes/restarts
* hopefully does not have to deal with multiple processes accessing the data at the same time

Sounds like we can do this on top of sqlite/mmap/mlock/mincore/O_TMPFILE easily and cheaply, but not so well with leveldb .. (moving back to sqlite is kinda unfortunate I guess)

So if the above is possible, we could 'unlock' the thing from the renderer on navigations, keep the database open in the browser process, and attempt to lock it back when we need a restore. Bonuses: the data is always consistent (modulo disk corruptions that may kill the browser, but .. rare), the data is not flushed to disk unless there is memory pressure. Cons: need to keep a couple of file descriptors open for each origin in the browsing session - new risk of running out of FDs?

I think we would still have the same issues with that. Here are the operations we need to support:

1: Fast in-renderer reading/writing to session storage
Javascript can write and read key-values to and from session storage. This needs to be in-memory, in the renderer.

Note: A session is per-tab, and it stores data on a per-origin basis (so iframes would save things to the same session storage bucket, and data access is restricted by origin).

2: 'Cloning' the session storage to a new browsing context when a navigation happens.
When we navigate, this is considered part of the same session, and the storage should be the same. This applies to both in-tab navigation and navigation that opens new tabs or windows. Those tabs or windows must also have an in-memory copy of the data for fast access and modification.

Note: We currently seem to clone too much vs what the spec says - we only need to clone the parent frame's origin's session storage, not all third party (iframes) session storage as well.
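
As a rough illustration of the narrower clone described in the note above, the sketch below copies only one origin's area out of a parent namespace. The type aliases and the function CloneForOrigin are made up for this example and do not reflect Chromium's actual data structures.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical types: one key/value area per origin within a session.
using StorageArea = std::map<std::string, std::string>;
using SessionNamespace = std::map<std::string, StorageArea>;

// Returns a new namespace containing only `origin`'s area from `parent`.
// Areas belonging to other origins (e.g. third-party iframes) are not copied.
SessionNamespace CloneForOrigin(const SessionNamespace& parent,
                                const std::string& origin) {
  SessionNamespace child;
  auto it = parent.find(origin);
  if (it != parent.end()) {
    child.emplace(it->first, it->second);
  }
  assert(child.size() <= 1);
  return child;
}
```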

3. 'Restoring tabs' - When we restore chrome after a crash, shutdown, or whatever, the session restore feature needs to restore the session storage data to all tabs that use it.


Our current implementation:
* Renderer has the key-value map in memory. Renderer sends write or clear operations (batched) to the browser.
* Browser only has the keys in-memory, and stores everything in a database on disk. It tries to optimize storage with copy-on-write semantics for what is stored on disk.
* On session restore, we use that database to restore the session storage for all tabs.
* On navigation - we currently read in the session storage from disk for new browsing contexts. This *could* be optimized to send the information from the 'parent' renderer now with mojo - but this might not be possible yet.
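
The copy-on-write behaviour mentioned above can be sketched in memory as follows. This is a single-threaded analogue with a hypothetical class name (CowArea). The real implementation applies the idea to what is stored in the on-disk database, which this sketch does not model.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical copy-on-write storage area: clones share one map until
// either side writes, at which point the writer takes a private copy.
class CowArea {
 private:
  using Map = std::map<std::string, std::string>;

 public:
  CowArea() : data_(std::make_shared<Map>()) {}

  // Cheap clone: only the shared pointer is copied.
  CowArea Clone() const { return *this; }

  void Set(const std::string& key, const std::string& value) {
    Detach();
    (*data_)[key] = value;
  }

  // Returns a pointer to the stored value, or nullptr if the key is absent.
  const std::string* Get(const std::string& key) const {
    auto it = data_->find(key);
    return it == data_->end() ? nullptr : &it->second;
  }

  bool SharesStorageWith(const CowArea& other) const {
    return data_ == other.data_;
  }

 private:
  // Makes sure this area owns its map exclusively before a write.
  void Detach() {
    if (data_.use_count() > 1) {
      data_ = std::make_shared<Map>(*data_);
    }
    assert(data_.use_count() == 1);
  }

  std::shared_ptr<Map> data_;
};
```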

Comment 16 by bugdroid1@chromium.org, Nov 18 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3ba651931865078e17b50057d8cdda7115ba2dd4

commit 3ba651931865078e17b50057d8cdda7115ba2dd4
Author: Daniel Murphy <dmurph@chromium.org>
Date: Sat Nov 18 00:14:15 2017

[Session Storage] Deleting database on startup for Andriod

R: mek@chromium.org
Bug: 770307
Change-Id: If5dfbdb3064e1b4f336680d45dfcff1d1ef27add
Reviewed-on: https://chromium-review.googlesource.com/775674
Commit-Queue: Daniel Murphy <dmurph@chromium.org>
Reviewed-by: Marijn Kruisselbrink <mek@chromium.org>
Cr-Commit-Position: refs/heads/master@{#517628}
[modify] https://crrev.com/3ba651931865078e17b50057d8cdda7115ba2dd4/content/browser/dom_storage/dom_storage_context_impl_unittest.cc
[modify] https://crrev.com/3ba651931865078e17b50057d8cdda7115ba2dd4/content/browser/dom_storage/session_storage_database.cc

Status: Assigned (was: Available)
Owner: ----
Status: Available (was: Assigned)
Blockedon: 918937
