Looping video downloaded twice
Issue description

Chrome Version: ToT (M59)
OS: Linux

Steps to reproduce:
1) Open Chrome and clear the cache.
2) Open DevTools (F12) and switch to the Network tab.
3) Load https://jsfiddle.net/abidCharlotte49er/qdw41j1j/

Observe how the video is downloaded twice; only on the third playthrough does the request say "from cache". Downloading it once should be enough.

This is a re-file of bug 697112. The original bandwidth hog has been fixed, but another problem was found during verification.
,
Mar 9 2017
Could this be a bug with clearing the cache? If you start a fresh Chrome profile, does it still show the bad behavior?
,
Mar 9 2017
If I disable "simple disk cache" in about:flags, it works better. Assigning to pasko@ for additional triage.
,
Mar 11 2017
Here is another, and perhaps better, example of how the "simple HTTP cache" can make things not work. (Thanks to david.emmett@zoodigital.com.)
,
Mar 11 2017
Adding some other folks. Thanks for the examples.
,
Mar 13 2017
I tried the jsfiddle from the first comment and it caches it the first time and plays from cache the second. Can't reproduce. When I try from ToT it doesn't actually play the video at all. I don't know what's going on with that.
,
Mar 13 2017
Can you attach a chrome://net-internals log including both the first playthrough and then the second attempt?
,
Mar 13 2017
Attached net-internals logs both with and without simple cache.
,
Mar 13 2017
,
Mar 15 2017
Thanks for the logs! Strange. The simple cache has the entry the second time around, but the HttpCache::Transaction decides to skip it. Will have to dig into this. I wish I could reproduce it though.
,
Mar 16 2017
Did a little debugging, and while I don't have any real answers yet, I discovered that in partial_data.cc, line 159, "cached_start_" is ~20971176 (it's slightly different in each run) during the second play-through, which makes http_cache_transaction think the range is not in the cache. Any ideas for where to look next?
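To make the symptom concrete, here is a tiny self-contained sketch (illustration only -- the GetAvailableRange stand-in and its return values are made up from the numbers above, not taken from partial_data.cc): if the first cached byte the backend reports is ~20 MB into the file, byte 0 looks uncached and the transaction goes back to the network.

```cpp
// Illustration only -- not the real net/http/partial_data.cc logic.
#include <cstdint>
#include <iostream>

struct AvailableRange {
  int64_t start;   // first cached byte at or after the queried offset
  int64_t length;  // number of contiguous cached bytes from |start|
};

// Hypothetical stand-in for asking the sparse cache entry what it holds.
// The numbers mimic what was observed here: the first ~20 MB are simply gone.
AvailableRange GetAvailableRange(int64_t /*query_offset*/) {
  return {20971176, 32768};
}

int main() {
  const int64_t requested_start = 0;  // the looping video starts over at byte 0
  const AvailableRange range = GetAvailableRange(requested_start);

  // If the cached data starts after the byte we want, treat this range as a
  // miss and go back to the network.
  const bool hit = (range.start <= requested_start && range.length > 0);
  std::cout << (hit ? "serve from cache" : "go to network") << "\n";
}
```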
,
Mar 16 2017
Dug a little deeper; I added code to write out all the sparse entries inside SimpleSynchronousEntry::GetAvailableRange. During the second loop, the first entry has an offset around ~20 MB (but has a file offset of 90). During the third loop, the entries start at 0 (but the file offset for this entry is somewhere in the middle of the file). It's like it decided to start caching 20 MB into the file. Alternatively, something evicted the first 20 MB of the file in the middle of playback.
,
Mar 16 2017
I enabled -vmodule=simple_synchronous_entry=2 and got this:

[14812:14842:0316/140048.891477:VERBOSE1:simple_synchronous_entry.cc(539)] Truncating sparse data file (20962320 + 32768 > 20971520)

This seems to neatly explain what is going on. Digging deeper, it seems like the 20971520 comes from 10% of the max cache size, and the max cache size is kDefaultCacheSize * 2.5 (because that is between 1 and 10 percent of my available disk space). (It seems that 8.6 GB of free disk space is not enough to cache a 22 MB video?)

What I don't understand is why the entry doesn't get truncated again during the next play-through. It seems like the "sparse data size" is calculated wrong, because on the second playthrough, the 2 MB that were cached from the first playthrough are not counted in the sparse data size, saving the entry from getting truncated again somehow....
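Spelling out the arithmetic behind those numbers (a back-of-the-envelope sketch, assuming kDefaultCacheSize is 80 MiB, which is what the 20971520 figure works out to; none of this is code from the tree):

```cpp
// Back-of-the-envelope check of the numbers in the VERBOSE1 log line above.
#include <cstdint>
#include <iostream>

int main() {
  const int64_t kDefaultCacheSize = 80 * 1024 * 1024;        // assumed: 80 MiB
  const int64_t max_cache_size = kDefaultCacheSize * 5 / 2;  // * 2.5 = 200 MiB
  const int64_t max_sparse_size = max_cache_size / 10;       // 20971520 (20 MiB)

  // The write that tripped the truncation: existing sparse data plus the
  // next 32 KiB chunk exceeds the per-entry sparse limit.
  const int64_t sparse_data_size = 20962320;
  const int64_t write_size = 32768;
  std::cout << "limit:    " << max_sparse_size << "\n"
            << "would be: " << (sparse_data_size + write_size) << "\n"
            << "truncate? " << std::boolalpha
            << (sparse_data_size + write_size > max_sparse_size) << "\n";
}
```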
,
Mar 17 2017
Thanks for digging in, hubbe. That helps tremendously.

As for cache size, it appears that most Linux media caches are ~300MB, so they should be able to hold 30MB. Perhaps this is why it worked for me and not for you, since your disk has less free space and hence a smaller media cache. Anyway, I agree that even 30MB is pretty tiny for a media cache. I believe there is a difference between the SimpleCache and the BlockFileCache in determining how much space a single entry is allowed to use, which explains why one worked for you but not the other.

I'm a bit confused as to what is happening on the 2nd request in your SimpleCache log. It sends a request for the first 20MB of the file (the part after it truncated on the first attempt) but it looks like the request is cancelled after about 17MB? How does the video keep playing? Does Blink store some data and play from its own buffer? Then the third request happens and it successfully reads from cache, but it doesn't get to the 17MB point to find out what would happen then.
,
Mar 17 2017
Yes, we (media/blink/multibuffer*) have our own buffer, so that could be why it doesn't need all the data. The buffer is basically LRU, though, so for sufficiently large looping files, it really won't help much.
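To make it concrete why LRU doesn't help here, a toy model (not the real media/blink/multibuffer code; the block count and capacity are made up): once the loop is longer than the buffer, every block from the previous pass has been evicted by the time playback wraps around, so the second pass gets zero hits.

```cpp
// Toy LRU block cache, purely illustrative of the looping-video problem.
#include <cstddef>
#include <iostream>
#include <list>
#include <unordered_map>

class LruBlocks {
 public:
  explicit LruBlocks(size_t capacity) : capacity_(capacity) {}

  // Returns true if |block| was already buffered; touches or inserts it either way.
  bool Access(int block) {
    auto it = index_.find(block);
    if (it != index_.end()) {
      order_.splice(order_.begin(), order_, it->second);  // move to front
      return true;
    }
    order_.push_front(block);
    index_[block] = order_.begin();
    if (order_.size() > capacity_) {  // evict the least recently used block
      index_.erase(order_.back());
      order_.pop_back();
    }
    return false;
  }

 private:
  size_t capacity_;
  std::list<int> order_;
  std::unordered_map<int, std::list<int>::iterator> index_;
};

int main() {
  const int kVideoBlocks = 100;       // pretend the video is 100 blocks long
  LruBlocks buffer(/*capacity=*/40);  // but the buffer only holds 40 blocks

  int hits = 0;
  for (int loop = 0; loop < 2; ++loop)
    for (int b = 0; b < kVideoBlocks; ++b)
      hits += buffer.Access(b);

  // Every block from the first loop is evicted before it is needed again,
  // so the second loop gets zero hits and everything is re-fetched.
  std::cout << "hits across both loops: " << hits << "\n";
}
```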
,
Mar 17 2017
Note that rdsmith@, mmenke@, and I were musing over not caching range requests in the browser process, and letting clients (e.g., media and pdf) handle their own caching. The HTTP cache doesn't seem to do a great job of it anyway, and it adds a ridiculous amount of complexity to the cache. WDYT?
,
Mar 17 2017
,
Mar 17 2017
,
Mar 17 2017
Re #16: I think that's a terrible idea. Sparse entries do add complexity, but it's not any easier to do it on the client side. In fact, we'd have to re-invent all the things the http cache does, possibly duplicating it once per client. The client-side buffer we have today is memory-backed, which limits its usefulness as a cache. Its primary role is to implement pre-fetching, which we need to avoid jank in video playback.
,
Mar 17 2017
hubbe@: I hear you, but I'd like to explore the question a bit further. My reason for thinking that getting rid of range request caching might make sense is that the behavior of caching for video/audio/streams seems pretty basically different than "regular entries". Specifically:

* It's a lot bigger, so there's a higher cost for caching an entity.
* I would expect the "useful" caching would have a different pattern. Specifically, for most entries you want all or none. But for streams you care about different portions of the stream differently: e.g. you *definitely* want to cache anything just ahead of the watch point, you might well want to cache the past couple of minutes (for "Wait, what was that?" moments) but you wouldn't often want to cache the whole thing for a rewatch.
* I wouldn't expect that caching would be particularly useful between clients--I don't think there would be any common access patterns between clients for watching/listening. So a per-client cache makes *more* sense to me than a global one.

Can you give me a sense as to why you disagree with those thoughts?
,
Mar 17 2017
(I'll also note that the requirements on a video cache and a network cache are different; as noted in issue 700197, trying to present HTTP header semantics on top of the cache is a losing proposition, and wouldn't be needed in a specialty cache targeted at video/audio. But I think that's a relatively minor point.)
,
Mar 17 2017
Seems like caching range requests has long been a source of brokenness for the cache. The current design requires two caches, which has been a source of complexity and brokenness all over the place from content/ to chrome/ (SessionStorage knows about the split, ProfileIOData has a ton of code to deal with two caches for every single URLRequestContext, we consume twice as much cache space on disk as a result of the split, BrowsingDataRemover needs to know about it, etc). It also gives us things like https://crbug.com/541149 and https://crbug.com/700197, has caused crashes in the memory cache, etc. Seems like we should either ditch caching range requests in the cache, or staff it and completely re-architect it.
,
Mar 17 2017
Re #20:

> * It's a lot bigger, so there's a higher cost for caching an entity.

That's why we need ranges.... Although, ranged requests can make sense for other data types too. I've been pondering using range requests for reading JPEGs so that we can avoid reading the high-frequency portion of progressive JPEGs if we don't need it. Long looped GIFs should probably also use range requests. And of course, ultra-super-long text files (logs) should be using range requests as well. Any really large file can benefit from range requests.

> * I would expect the "useful" caching would have a different pattern.
> Specifically, for most entries you want all or none. But for streams you
> care about different portions of the stream differently: e.g. you
> *definitely* want to cache anything just ahead of the watch point, you
> might well want to cache the past couple of minutes (for "Wait, what
> was that?" moments) but you wouldn't often want to cache the whole
> thing for a rewatch.

Looping videos really need to cache the whole thing, or we're going to eat bandwidth forever. All other browsers do this. Videos are often part of the design of a web page, and for anybody who uses the "open-another-tab-and-read-it-after-it's-loaded" pattern (like me), having a shared cache is pretty important.

> * I wouldn't expect that caching would be particularly useful between
> clients--I don't think there would be any common access patterns between
> clients for watching/listening. So a per-client cache makes *more* sense
> to me than a global one.

Per-client also means per-tab, so opening the same site in two tabs will load the video twice.

Re #22:

> Seems like we should either ditch caching range requests in the cache,
> or staff it and completely re-architect it.

I don't know if it needs to be completely re-architected or not, but I completely agree that we need to figure out what the right architecture is and make sure that we have enough resources to implement and maintain it. It's possible that the final answer is to solve this on the client side, but it's going to be roughly equal in complexity and cost (both in terms of manpower and disk space) if we do this on the client side or in the http cache, so I don't really buy those particular arguments. Maybe the solution is to get rid of the disk cache and use squid instead?
,
Jul 20 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/a9eeb01aaf3e555b843f78298de19b2f1ac29fa8

commit a9eeb01aaf3e555b843f78298de19b2f1ac29fa8
Author: hubbe <hubbe@chromium.org>
Date: Thu Jul 20 00:13:20 2017

Changes the eviction algorithm to take the size of the entry into account.

Weighting by size means that large entries will get thrown out of the cache sooner and the end result is that the cache will have more, but smaller, entries.

Design doc: https://docs.google.com/document/d/1N07uwDiOipRcMG3SanxR8X8flvxrh9Axi1SLpfRNKCY/edit

BUG=700102, 736437
Review-Url: https://codereview.chromium.org/2918893002
Cr-Commit-Position: refs/heads/master@{#488058}

[modify] https://crrev.com/a9eeb01aaf3e555b843f78298de19b2f1ac29fa8/net/disk_cache/simple/simple_experiment.cc
[modify] https://crrev.com/a9eeb01aaf3e555b843f78298de19b2f1ac29fa8/net/disk_cache/simple/simple_experiment.h
[modify] https://crrev.com/a9eeb01aaf3e555b843f78298de19b2f1ac29fa8/net/disk_cache/simple/simple_experiment_unittest.cc
[modify] https://crrev.com/a9eeb01aaf3e555b843f78298de19b2f1ac29fa8/net/disk_cache/simple/simple_index.cc
[modify] https://crrev.com/a9eeb01aaf3e555b843f78298de19b2f1ac29fa8/net/disk_cache/simple/simple_index_unittest.cc
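For readers of this bug, a rough sketch of the idea described in the design doc (the ranking function below is hypothetical; the actual weighting in simple_index.cc may well be computed differently): entries are ordered for eviction by a score that discounts recency in proportion to size, so large entries fall out of the cache sooner.

```cpp
// Hypothetical sketch of size-aware eviction ordering, not the real code.
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

struct EntryInfo {
  std::string key;
  int64_t last_used_ms;  // larger == more recently used
  int64_t size_bytes;
};

// Penalize an entry's recency in proportion to its size, so a big entry is
// treated as "older" than a small one that was used at the same time.
int64_t EvictionRank(const EntryInfo& e, int64_t bytes_per_ms_penalty) {
  return e.last_used_ms - e.size_bytes / bytes_per_ms_penalty;
}

int main() {
  std::vector<EntryInfo> entries = {
      {"small-and-old", 1000, 16 * 1024},
      {"huge-but-recent", 5000, 30 * 1024 * 1024},
      {"medium", 4000, 512 * 1024},
  };

  // Evict in ascending rank order until enough space has been freed.
  const int64_t kPenalty = 8 * 1024;  // arbitrary tuning knob for the sketch
  std::sort(entries.begin(), entries.end(),
            [&](const EntryInfo& a, const EntryInfo& b) {
              return EvictionRank(a, kPenalty) < EvictionRank(b, kPenalty);
            });

  // Note the 30 MiB entry now ranks ahead of the smaller, older "medium" one.
  for (const auto& e : entries)
    std::cout << e.key << " rank=" << EvictionRank(e, kPenalty) << "\n";
}
```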
,
Nov 28 2017
I haven't fully confirmed it, but it looks like 788600 is another one of these -- it has a 31MiB file, and caches on CrOS are usually 200MiB. There may be a few silly things we can do to alleviate this somewhat, perhaps by giving a larger share of the cache size to sparse files for MEDIA_CACHE? The truncate thing in SimpleCache (pretty much forced by the brilliantly minimalist/simple way that sparse is done) seems like a problem, but I think even with piece-wise LRU you would end up just cycling through things, throwing out stuff that was max-permitted-size old anyway, wouldn't you?
,
Nov 28 2017
I've been meaning to change the threshold from 1/10th of the cache size to something like 1/3rd of the cache size; now that the cache throws out large entries first, that should not be as big of a problem as it used to be. My other suggestion is to implement something that lets the cache grow larger than the 200MiB temporarily. I was thinking that it would be pretty simple to just add the size of recently (last 30 minutes?) used entries to the max size of the cache, up to some limit, like maybe 50% of the size of the cache or something.
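Roughly what I have in mind for the temporary-growth part, as a sketch (all names and numbers here are made up; this is not a patch):

```cpp
// Sketch of the "temporary growth" idea: let the cache's effective limit grow
// by the bytes of recently used entries, capped at 50% of the base limit.
#include <cstdint>
#include <iostream>

int64_t EffectiveMaxSize(int64_t base_max_size,
                         int64_t recently_used_bytes,   // entries touched in
                                                        // the last ~30 minutes
                         double growth_cap_fraction) {  // e.g. 0.5
  const int64_t cap =
      static_cast<int64_t>(base_max_size * growth_cap_fraction);
  const int64_t bonus = recently_used_bytes < cap ? recently_used_bytes : cap;
  return base_max_size + bonus;
}

int main() {
  const int64_t kBase = 200LL * 1024 * 1024;        // a 200 MiB cache
  const int64_t kActiveVideo = 31LL * 1024 * 1024;  // a 31 MiB looping video

  // While the video is actively playing, the cache may temporarily grow to
  // ~231 MiB instead of truncating the video's sparse entry.
  std::cout << EffectiveMaxSize(kBase, kActiveVideo, 0.5) << "\n";
}
```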
,
Nov 29 2017
What's a bit curious re: the temporary growth suggestion is that the simple backend kind of behaves like that already if an open entry gets evicted -- it's still around but not counted. (blockfile, I think, just doesn't let open entries be candidates for eviction.) The layering really isn't anywhere near right to take advantage of that, plus we probably can't exactly let websites reserve users' hard disk space like that... Oh, one more bit of silliness: non-sparse files actually use 1/8th as the limit (kMaxFileRatio) vs. sparse files' 1/10th (kMaxSparseDataSizeDivisor).
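For concreteness, here is what those two divisors work out to for the ~200 MiB cache from the logs in this bug (just arithmetic based on the comment above, not code from the tree):

```cpp
// Quick arithmetic on the two per-entry limits, assuming a 200 MiB cache.
#include <cstdint>
#include <iostream>

int main() {
  const int64_t kMaxCacheSize = 200LL * 1024 * 1024;
  const int64_t kMaxFileRatio = 8;               // non-sparse per-entry divisor
  const int64_t kMaxSparseDataSizeDivisor = 10;  // sparse per-entry divisor

  const int64_t non_sparse_limit = kMaxCacheSize / kMaxFileRatio;           // 25 MiB
  const int64_t sparse_limit = kMaxCacheSize / kMaxSparseDataSizeDivisor;   // 20 MiB

  // The ~22 MB video from earlier in this bug would fit under the non-sparse
  // limit but not under the sparse one, which is why it keeps being truncated.
  std::cout << "non-sparse limit: " << non_sparse_limit << "\n"
            << "sparse limit:     " << sparse_limit << "\n";
}
```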
,
Nov 29 2017
As long as there is an upper limit to how much space the "active set" is allowed to use, I think it's still OK. Right now we set the cache limits very cautiously, because we don't want the user to be bothered by the amount of disk space we use, but I don't think users will object to some extra disk use *while they are actively using Chrome*.
,
Dec 11 2017
,
Dec 11 2017
Have we decided on how to approach this issue?
,
Dec 11 2017
Another backend to keep in mind is the memory backend. It will fail writes to the cache after the entry hits its limit.
,
Dec 11 2017
From a network service standpoint, we'll probably want to get away from having a separate media cache. "NetworkContext::CloneButUseANewCacheAndAutomaticallyTearDownIfSourceNetoworkContextIsDestroyed" isn't an API I want to provide.
,
Dec 11 2017
Sorry, for #31 I meant to say it'll fail the writes after the backend hits its limit. It doesn't let individual entries grow beyond the backend limit like the other backends do. In regards to #32, I agree. Let's kill the media cache.
,
Feb 16 2018
,
Dec 7 2018
Hello! This bug is receiving this notice because there has been no acknowledgment of its existence in quite some time.

- If you are currently working on this bug, please provide an update.
- If you are currently affected by this bug, please update with your current symptoms and relevant logs.

If no updates have been provided by EOD Wednesday, 12/12/18 (5pm EST), this bug will be archived and can be re-opened at any time deemed necessary. Thank you!
,
Dec 13 2018
Due to lack of action, this bug has been Archived. If work is still being done on this issue or you are still experiencing it, please feel free to re-open with the appropriate information.