non-media cache filled with streaming media from google play music
Issue description

Within a couple hours of listening to streaming music on my Linux Desktop (which is now running simple cache) the cache contains almost entirely music, effectively clearing my cache. This is especially painful given that the resources aren't even conditionalizable. This applies to YouTube streams as well. Presumably this is an effect of the SimpleCache LRU eviction algorithm.
,
Jun 14 2016
This data should be in the media cache instead of the general cache. Google Play Music (which is what I've seen this issue with) uses XHR to get its data, and presumably XHRs don't get tagged as media resources.
,
Jun 14 2016
,
Jun 14 2016
What's the solution here? Tagging XHRs into the media cache in some situations? A better eviction algorithm for simple cache?
,
Jun 20 2016
I think a better eviction algorithm is the right approach. Any reason we couldn't drop the media cache after doing so?
,
Jun 20 2016
YT also uses XHRs for HTML5 media playback, including on mobile.
,
Jun 20 2016
jkarlin@, why do you suppose a better eviction algorithm is the right approach here? I'm nervous in general of complicating simple cache's eviction algorithm if we can get most of the wins by tagging these for use in media cache. I understand the desire to have one, sane cache, but it seems like a longer term goal than fixing this bug.
,
Jun 20 2016
How do you detect if an XHR request is for media or not at request time? That's the point that we decide which cache to use.
,
Jun 20 2016
Good point, the only thing I can think of is to use the URL, but that's probably way too hacky. Any ideas to improve the eviction algorithm in a simple way?
,
Jun 20 2016
gavinp@, csharrison@, jkarlin@, and shivanisha@ met and decided to proceed as follows:

We'll create separate bins within the SimpleCache based upon response content-type. Each bin will have a max-percentage of the cache that it's allowed to use. One of the bins will be a media bin. Eviction will continue to be LRU but will be run on the given bin. If a new entry will be too large for its bin it won't write to cache. The end result will be that all media requests (with appropriately labeled content types) will wind up in the media bin of the cache, fixing the XHR problem. This will also allow us to remove the media cache.

Next steps:
1) Add UMA to track bytes written to cache for various content types. This is to give us an idea of what people's caches look like today.
2) Wait for SimpleCache to launch everywhere
3) Wait for SimpleCache to improve its index reliability
4) Implement the SimpleCache changes
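The binning proposal above can be illustrated with a small toy model. This is only a sketch under assumptions: the class name `BinnedLruCache`, the two bins ("media" and "other"), the content-type mapping, and the percentages are all hypothetical and not taken from SimpleCache.

```python
from collections import OrderedDict


class BinnedLruCache:
    """Toy model: one LRU list per content-type bin, each with its own byte limit."""

    def __init__(self, max_bytes, bin_shares):
        # bin_shares maps a bin name to its fraction of max_bytes (made-up values).
        self.limits = {name: int(max_bytes * share) for name, share in bin_shares.items()}
        self.bins = {name: OrderedDict() for name in bin_shares}
        self.used = {name: 0 for name in bin_shares}

    @staticmethod
    def bin_for(content_type):
        # Hypothetical mapping; the thread does not say which bins would exist.
        if content_type.startswith(("audio/", "video/")):
            return "media"
        return "other"

    def put(self, key, size, content_type):
        name = self.bin_for(content_type)
        if size > self.limits[name]:
            return False  # too large for its bin: the entry is not written
        entries = self.bins[name]
        if key in entries:
            self.used[name] -= entries.pop(key)
        # LRU eviction, restricted to the bin the new entry belongs to.
        while self.used[name] + size > self.limits[name]:
            _, evicted_size = entries.popitem(last=False)
            self.used[name] -= evicted_size
        entries[key] = size
        self.used[name] += size
        return True

    def get(self, key):
        for entries in self.bins.values():
            if key in entries:
                entries.move_to_end(key)  # mark as most recently used
                return True
        return False


cache = BinnedLruCache(100, {"media": 0.5, "other": 0.5})
cache.put("app.js", 30, "text/javascript")
for i in range(10):
    cache.put(f"chunk{i}", 20, "video/mp4")
```

In this model the stream of video chunks only ever evicts other video chunks, so `app.js` survives. Note that the sketch assumes the entry size is known at `put` time, which later comments point out is often not the case.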
,
Jun 20 2016
"If a new entry will be too large for its bin it won't write to cache."

Are we able to enact this policy at response time?

cc pasko@ as FYI
,
Jun 20 2016
In general, getting rid of a backend instance for Media Cache sounds good. We do not always know which bin the entry should be in when creating the entry. So that'd be more flexible, and also fewer index files to flush to disk (yay).

> > "If a new entry will be too large for its bin it won't write to cache."
> > Are we able to enact this policy at response time?

I am not sure this is a good idea. We often do not know the size of the entry until we fully write to it. What this proposal seems to mean is that we need to be able to discard writing to a large entry (and return error?) when eviction starts, which complicates the (already complex) entry state machine (what if the entry is being doomed? in optimistic write?)

Meta question: I see a lot of "wait" in the plan. Can we run experiments (probably Clovis-like) to see what cache splitting strategy is better for cycling across a bunch of popular sites?
,
Jun 20 2016
> > > "If a new entry will be too large for its bin it won't write to cache."
> > > Are we able to enact this policy at response time?
> I am not sure this is a good idea.

Me neither, but if it's simple it's worth at least looking into when feasible.

> Meta question: I see a lot of "wait" in the plan. Can we run experiments (probably Clovis-like) to see what cache splitting strategy is better for cycling across a bunch of popular sites?

Do we have a reasonable approximation for what users' navigations look like? E.g., playing music in the background all day while navigating other sites?
,
Jun 20 2016
> Do we have a reasonable approximation for what users navigations look like?
> E.g., playing music in the background all day while navigating other sites?

We both know that we don't know what it looks like :/ Even this basic question is unclear: how to estimate the proportion of users that are affected by XHR eating their caches. I think this use case is non-typical (esp. on mobile), but that is rather a wild guess of mine. Ideas are very welcome.

I am not even sure what I am trying to say :) Perhaps that the content type heuristic is cool, but there could be something else that is simpler and/or more efficient? Also, we should make sure that by splitting the cache into separate bins we do not regress UX for those who do not play music in the background and do not watch YT, but I did not easily understand how the plan guarantees it.

To check for that I thought we could "just" cycle through X URLs simulating a user session. But now thinking a little more about it, any possible way we invent would raise super many questions :/ Maybe it makes sense to run an origin-controlled experiment on YT that would save all YT XHRs in a separate cache backend when in the experiment group?

Note: doing experiments around cache eviction takes super long; we need to clear caches in the Control group and wait until caches fill up (several weeks AFAIR).
,
Jun 20 2016
Agree that mobile is less likely to have this issue as streaming often happens in dedicated apps. If the main cache grows by the size of the deleted media cache, and the media bin is the same size as the old media cache, then we won't regress UX. It'll be the same.
,
Jun 20 2016
I should mention that I'm not a huge fan of binning either. It leaves some areas of the cache potentially underutilized, as you say. Ideally a simple eviction strategy would just work well, regardless of what we put in it. We have some time to come up with something better.
,
Sep 23 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/81c04644f176ac5873c84db8f1cc8193762f88f3

commit 81c04644f176ac5873c84db8f1cc8193762f88f3
Author: jkarlin <jkarlin@chromium.org>
Date: Fri Sep 23 18:08:59 2016

[HttpCache] Add cache metrics for audio/video behavior

Cleans up the content-type metrics in http_cache_transaction and adds metrics for two new content types, audio and video.

BUG=617620
Review-Url: https://codereview.chromium.org/2350183002
Cr-Commit-Position: refs/heads/master@{#420661}

[modify] https://crrev.com/81c04644f176ac5873c84db8f1cc8193762f88f3/net/http/http_cache_transaction.cc
[modify] https://crrev.com/81c04644f176ac5873c84db8f1cc8193762f88f3/tools/metrics/histograms/histograms.xml
,
Sep 26 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/9e25895414b4a327440262769ca8cd785fdbcddb

commit 9e25895414b4a327440262769ca8cd785fdbcddb
Author: jkarlin <jkarlin@chromium.org>
Date: Mon Sep 26 18:35:29 2016

[SimpleCache] Track some SimpleCache size stats on initialization

Once the SimpleCache has initialized, record its size, its max size, how full it is, and how many entries it has. This gives us insight into the utilization of caches.

BUG=617620
Review-Url: https://codereview.chromium.org/2355493003
Cr-Commit-Position: refs/heads/master@{#420941}

[modify] https://crrev.com/9e25895414b4a327440262769ca8cd785fdbcddb/net/disk_cache/simple/simple_index.cc
[modify] https://crrev.com/9e25895414b4a327440262769ca8cd785fdbcddb/tools/metrics/histograms/histograms.xml
,
Apr 18 2017
Given the delay on shipping Simple Cache everywhere, I guess this is on indefinite hold? For servicification of the network stack, I really don't want to support an API that lets us bifurcate the cache based on resource type (or even just bifurcating the cache and providing separate mojo interfaces to use each one).
,
Apr 19 2017
Right, I don't see anyone getting to this issue anytime soon. Given that we're already trashing the main cache anyway for lots of sites (e.g., youtube) it might be reasonable to argue that you don't need to support it. How about starting up a Finch trial and seeing how removing the media cache affects the ratio of bytes from network and page load performance?
,
Apr 19 2017
+hubbe we definitely have a bunch of bug reports from when we accidentally did this...
,
Aug 24
What about doing something like the Linux page cache (https://lwn.net/Articles/584101/, merged in 3.15): split the cache in two halves, one containing entries that were never hit and one containing entries that were hit at least once. Seems like a simple and cheap heuristic that will prevent YouTube and Play Music from thrashing the entire cache unless one plays all videos twice. :-)
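The "hit at least once" idea can be sketched as a cache with two equally sized LRU segments. This is a toy model under assumptions, not the Linux or SimpleCache implementation: the names `TwoSegmentCache`, `once`, and `reused` are hypothetical, and an entry pushed out of the reused half is simply evicted here rather than demoted.

```python
from collections import OrderedDict


class TwoSegmentCache:
    """Toy model: new entries and re-used entries live in separate LRU halves."""

    def __init__(self, max_bytes):
        self.half = max_bytes // 2
        self.once = OrderedDict()    # entries never hit since they were written
        self.reused = OrderedDict()  # entries hit at least once

    @staticmethod
    def _trim(segment, limit):
        # Evict least recently used entries until the segment fits its half.
        while sum(segment.values()) > limit:
            segment.popitem(last=False)

    def put(self, key, size):
        if size > self.half:
            return  # sketch only: oversized entries are not stored
        # A rewritten entry is treated as new (a simplification).
        self.reused.pop(key, None)
        self.once.pop(key, None)
        self.once[key] = size
        self._trim(self.once, self.half)

    def get(self, key):
        if key in self.reused:
            self.reused.move_to_end(key)
            return True
        if key in self.once:
            # First hit: promote the entry into the re-used half.
            self.reused[key] = self.once.pop(key)
            self._trim(self.reused, self.half)
            return True
        return False


cache = TwoSegmentCache(100)
cache.put("app.js", 10)
cache.get("app.js")           # one hit moves app.js to the re-used half
for i in range(50):
    cache.put(f"video{i}", 20)  # one-shot video fragments churn only the other half
```

With this split, a long run of fragments that are never read again can only displace other never-hit entries, so the script that was hit once stays cached.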
,
Aug 27
So the SimpleCache backend now prioritizes by age and size. Media will get evicted sooner than other things. I believe the next plan is to remove the media cache and increase the size of the main cache. Of course, this doesn't work particularly well with the blockfile backend...
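As a rough illustration of why prioritizing by age and size evicts media sooner, here is a sketch that scores each entry by the product of its age and its size and evicts the highest score first. The exact scoring used by SimpleCache is not given in this thread, so the formula and the function name `eviction_order` are assumptions.

```python
def eviction_order(entries, now):
    """Return keys ordered from first-to-evict to last-to-evict.

    entries maps key -> (last_used_time, size_in_bytes).
    The score (age * size) is an assumed stand-in for the real heuristic.
    """
    def score(key):
        last_used, size = entries[key]
        return (now - last_used) * size

    return sorted(entries, key=score, reverse=True)


entries = {
    "app.js": (90, 1_000_000),        # 1 MB script, used 10 time units ago
    "video-chunk": (95, 50_000_000),  # 50 MB fragment, used 5 time units ago
    "old.css": (0, 10_000),           # small file, untouched for a long time
}
order = eviction_order(entries, now=100)
```

Under this assumed scoring the large video fragment is evicted first even though it was used most recently, which matches the observation that media goes before other things.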
,
Aug 27
,
Aug 27
Yeah, I noticed, but in the modern world all too many pages load 1M of javascript which needs to be cached for the website to be usable outside of gigabit links (in other words, outside of selected areas of the West). With the default cache size of 320M, half an hour of watching youtube evicts these cached JS files. With --disk-cache-size=2000000000, one needs to watch youtube for several hours to evict that JS. But that's not a theoretical scenario: after reading the source code of SimpleCache and disk_cache and whatnot and deciding that 2G of cache should help, I went home, happened to binge some dashcam videos until 2am, and then again had to wait half a minute for Jira to load (normally it'd load in under 5 seconds). I mean, what's the point of evicting a javascript file that has seen hundreds of cache hits every day to store a 2M fragment of a youtube video that is never going to be played again?
,
Aug 27
On what platform? SimpleCache isn't used on Windows or Mac.
,
Aug 27
Linux.
,
Aug 27
Great, thanks. Balancing the eviction knobs is slow-going work, but should be done.
,
Aug 27
FYI: When I implemented the age-and-size priority in simplecache, I also tried a lot of different methods for deciding what entries to evict. My observation was that most of these algorithms gave rather small improvements, and may not be worth it. Increasing the cache size made a much bigger difference. My suggestion would be to have a more flexible upper limit for the cache size. Normally the cache size is only a couple of percent of the available storage, but it's probably ok for chrome to use more than that while people are actually using chrome. This could be implemented by simply ignoring recently used entries when deciding if it's time to evict stuff or not. Another interpretation would be to have a limit of cache size * 1.5 for as long as the browser is considered "active", then trigger a cache eviction without the 1.5 when the browser goes "idle". Either way, the increased cache size will also make the cache eviction more efficient, as it will have more entries to choose from when evicting.
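The two interpretations of a flexible upper limit can be written down as small helper functions. This is only a sketch: the function names, the notion of a "recent window", and the units are hypothetical; only the 1.5 factor comes from the comment above.

```python
ACTIVE_FACTOR = 1.5  # multiplier mentioned in the comment above


def size_limit(base_limit, browser_active):
    """Second interpretation: allow a larger cache while the browser is active."""
    return int(base_limit * ACTIVE_FACTOR) if browser_active else base_limit


def bytes_over_limit(used_bytes, base_limit, browser_active):
    """How much would have to be evicted right now under the adaptive limit."""
    return max(0, used_bytes - size_limit(base_limit, browser_active))


def countable_bytes(entries, now, recent_window):
    """First interpretation: ignore recently used entries when measuring size.

    entries maps key -> (last_used_time, size_in_bytes); recent_window is a
    hypothetical tuning knob.
    """
    return sum(size for last_used, size in entries.values()
               if now - last_used > recent_window)


# With a 320 MB base limit (sizes in MB), 400 MB fits while active but not when idle.
while_active = bytes_over_limit(400, 320, browser_active=True)
when_idle = bytes_over_limit(400, 320, browser_active=False)
```

Either way, eviction triggered at idle time has a larger pool of entries to choose from than eviction triggered at the fixed limit.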
,
Aug 27
An adaptive cache size is intriguing. We've previously studied the effects of statically doubling the cache size. It had a marginal improvement (1-2%) to cache hit rate.
,
Aug 27
Well, I'm just one data point of many, but:
- I'm not willing to dedicate more than cca 4G to chrome cache
- with 4G of cache, I think I can still reproduce the issue, but I'd have to watch 4K videos for a few hours :-)
- I have no idea what other effects the "hit at least once" heuristic has, I'm convinced it'd fix my problem but it may break other use cases :-/
,
Aug 28
hubbe: AFAIR the usage pattern you experimented with was not involving watching 4K videos for a few hours :)

re: the "hit at least once" heuristic:
+ clearly solving one problem, seems nice
+ easy to implement in simplecache? could occupy one bit in the index entry, and then some simple magic in eviction
+ dividing in two equal parts looks arbitrary, but also could be tuned
+ the only drawback I can see is that the cache is less efficient when a user drastically changes access pattern (and half of the cache is not enough for that) - should not happen often?

re: adaptive size:
+ attractive, the current sizing is mostly accidental
,
Aug 28
Agree that it should be fairly simple to implement. I worry about how the various eviction strategies will behave together. We need some simulations.
,
Aug 28
see also squid eviction policies http://www.hpl.hp.com/techreports/1999/HPL-1999-142.pdf
,
Aug 28
Blockfile actually implements something like this, but more complex (of course). I am a bit worried about things like metadata writes on JS files producing an extra touch, etc., though maybe that's a feature :p
,
Aug 28
> I worry about how the various eviction strategies will behave together. We need some simulations.

I don't understand. I thought we have a framework of cache clearing that prevents strategies playing together in the field...

For simulations we could go the approach from hubbe@ - recording local access patterns and then simulating those. I've recently explored how to, among other things, dump data from Clank (go/dumpium - sorry, googlers only), it is more complicated than necessary for this purpose, but feel free to take the ideas/snippets. We could even try to dump access patterns for watching-videos-for-hours. That would be the most difficult part of the project.
,
Aug 28
Ah, I'm saying if we were to layer the frequency bit on top of Hubbe's age+recentness strategy, it could be strange. Yes, we can test them in whatever combination we like.
,
Aug 28
yea, I agree it is not the easiest thing to reason about
,
Aug 30
One crazy thought I had was to try to have a cache API that lets one influence eviction scoring based on in-memory hints. Only SimpleCache having those, and general difficulty making this a generically meaningful API is a big counter-argument, as is a lack of a versioning story for that... but also including HINT_UNUSABLE_PER_CACHING_HEADERS using the same mechanism as some sort of re-used bit would be nice.
,
Aug 30
If the consumer knew that something shouldn't be stored then couldn't it request the resource with a DO_NOT_CACHE (or whatever it is) flag?
,
Aug 30
Well, yes, but rewinding videos (Or replaying short videos) probably isn't too uncommon. So ideally, we'd cache video, but evict it more aggressively (Or only cache so much of it), particularly given how big videos can be (Which means fetching a video can be expensive, but so is caching it).
,
Aug 30
Ah, yes. So now I'm imagining a consumer that tracks media resources on a page and once the page has been navigated away from it can tell the backend to prioritize deleting those resources.
Comment 1 by jkarlin@chromium.org
, Jun 6 2016