New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 650201 link

Starred by 2 users

Issue metadata

Status: WontFix
Owner:
Last visit > 30 days ago
Closed: Mar 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 3
Type: Bug



Sign in to add a comment

Should multibuffer evict lower blocks first after inserting multiple blocks?

Project Member Reported by wdzierza...@opera.com, Sep 26 2016

Issue description

When seeking, the change in reader position can cause multiple "present" blocks to transition from pinned to unpinned state, and thus enter the LRU all at once.  Should these blocks then be evicted (by Pop()) in the lowest-position-first order to avoid an MRU behavior in the LRU cache?

This bug complements the discussion in https://codereview.chromium.org/2363953003/.
 
The images posted so far:

> This shows the current eviction order (the green bars are the "present" blocks):
> http://pasteboard.co/6H74RxXQq.png
> 
> This is how it looks like with this CL: http://pasteboard.co/6H83AjEie.png
> 
> Notice how in the former case the blocks are evicted in the MRU order. I assumed
> this wasn't intentional in a cache with the LRU strategy.

...are here: reads-seek-ffmpeg-evict-higher-first.png (original) vs. reads-seek-ffmpeg-evict-lower-first.png (CL).
reads-seek-ffmpeg-evict-higher-first.png
39.5 KB View Download
reads-seek-ffmpeg-evict-lower-first.png
33.3 KB View Download
> Unfortunately I have no idea which of these two patterns are actually *better*.
> One seems to optimize for "seeking back to where I was before seeking" while
> the other optimizes for "seeking to the beginning of the file". Not sure which of
> those is actually more common in real life.

I have some doubts about the "seeking to the beginning of the file" optimization. This CL seems neutral to several such scenarios. More pictures! :)

1. It's obviously neutral when the seek distance is too small to cause a bulk eviction of multiple blocks: reads-play10-seek0-play10.png (original) vs. reads-play10-seek0-play10.cl.png (CL)
2. Medium seek distance: reads-play30-seek0-play10.png (original) vs. reads-play30-seek0-play10.cl.png (CL)
3. Large seek distance: reads-play40-seek0-play20.png (original) vs. reads-play40-seek0-play20.cl.png (CL)

I came up with one scenario where the "seeking to the beginning of the file" optimization is visible. It's when the seek distance is large enough for the pinned ranges to no longer overlap _and_ when the seek to 0 is _quickly_ followed by a second seek to a position that is very close to the position from before the first seek. Compare reads-play38-seek0-play2-seek40-play5.png (original) vs. reads-play38-seek0-play2-seek40-play5.cl.png (CL). Sounds like a version of "seeking to where I was before seeking", only the first seek is backwards in this case.

I'm only writing this because it occurred to me that saying that the current behavior optimizes for "seeking to the beginning of the file" is probably making it sound more general (and thus desirable) than it is in reality.
reads-play10-seek0-play10.png
262 KB View Download
reads-play10-seek0-play10.cl.png
250 KB View Download
reads-play30-seek0-play10.png
363 KB View Download
reads-play30-seek0-play10.cl.png
363 KB View Download
reads-play40-seek0-play20.png
397 KB View Download
reads-play40-seek0-play20.cl.png
396 KB View Download
reads-play38-seek0-play2-seek40-play5.png
363 KB View Download
reads-play38-seek0-play2-seek40-play5.cl.png
361 KB View Download
> Btw, I love the graphs you created, they are excellent for visualizing what's going on
> in the cache. I don't suppose you could share the code and/or scripts to used to
> create those?

It's just the right log statements (0001-Multibuffer-diagnostics.patch), grep (plot.sh), and gnuplot (reads.gnuplot).  I'm sure it could be automated further :)

Oh and some of the log statements probably access some data structures on the wrong thread.
0001-Multibuffer-diagnostics.patch
6.9 KB Download
plot.sh
543 bytes View Download
reads.gnuplot
883 bytes Download

Comment 4 by hubbe@chromium.org, Sep 26 2016

I've looked over the images, and generally I seem to prefer freeing from back-to-front, as that seems to do better if someone seeks back to where they were before. I also like that in some of your plots, freeing from back to front avoids splitting the cached region into two.  However, it occurs to me that maybe "distance from previous read position" would be better...

Either way, I agree that your CL is mostly neutral, but maybe I missed some cases where your CL actually makes things better?

Comment 5 by hubbe@chromium.org, Sep 26 2016

Hmm the reads-play38-seek0-play2-seek40-play5.png makes me a bit worried though.
It looks like there might be a bug when we seek to 40 and don't immediately start to buffer until we actually run out of buffer. Might be a bug in the multibuffer code...

Status: Assigned (was: Untriaged)
Description: Show this description
> Either way, I agree that your CL is mostly neutral, but maybe I missed some cases
> where your CL actually makes things better?

By evicting the lower blocks first after seeking, it applies the LRU eviction strategy (the lower blocks were fetched and consumed earlier than the higher blocks), and thus favors "seek forwards and then backwards to where I was before seeking" scenarios.  Due to some peculiar read patterns this actually saves the day in Opera in certain cases, so we're probably going to use it at least in those cases anyway.

Is it generally better or worse than the current strategy that favors "seek backwards and then forwards to where I was before seeking"?  Frankly, I'm not sure.  I don't know if one type of user behavior is more common than the other.  Apart from user behavior, from time to time, people will come across a web-unfriendly MP4 file that requires the decoding library (_any_ decoding library) to read a bit from the beginning, then reach to the end of the file for a container header, then continue reading from the beginning.  I have no idea if it's common enough to optimize for.

That said, I think the CL makes the cache eviction more consistent and predictable.  I was surprised when I noticed the cache exhibiting MRU behavior in some cases.  The design document only talks about LRU (for now).  I went looking in the code and the only explanation I found was the comment implying this particular order of operations in a loop executes faster -- which I don't believe is true given the data structures the LRU class operates on internally.

Comment 9 by hubbe@chromium.org, Sep 27 2016

I should update the comment at the very least, because the comment is supposed to convey the message that the freeing from-back-to-front is intentional in order to get things into the LRU in that order. I didn't do it because I thought it was faster, I did it because I thought it might improve the caching slightly.

I may have a source of data that can tell me what people "usually" do, which I should be able to use to construct a benchmark for this, then we could tweak the algorithms to optimize that benchmark. It's not going to be on the top of my todo list though.
Cc: wdzierza...@opera.com
Owner: hubbe@chromium.org
Alright, fair enough.  I'm re-assigning to you then and I'm looking forward to seeing such a benchmark.

Thanks for looking into this!
Cc: -hubbe@chromium.org dalecur...@chromium.org
@hubbe, can you detail your ideas here so wdzierzanowski@ can try his hand at the benchmark if he wants? I don't think it's fair to hold up his efforts (thanks for your contribution!) here for something you don't think you'll get to anytime soon.

Comment 12 by hubbe@chromium.org, Sep 27 2016

I was going to use session logs from chromecast mirroring dogfooders, which might have data on seeks in it to construct the benchmark. Since timing is irrelevant in this test, I would want to create a clockless test which plays back the seek operations from the logs. 

We could also create the same test and populate it with synthetic data. Since we wouldn't know what uses cases are actually more common, it would be most useful for optimizations that don't make some use cases worse.
Thanks, that sounds interesting.  Since freeing from back to front is intentional, and not a side effect of an implementation-detail level decision as I previously assumed, I understand the need for a more structured assessment before deciding which way to go.

Unfortunately, as I said in https://codereview.chromium.org/2363953003/, I don't have the time to work on the benchmark now.
Status: WontFix (was: Assigned)
Unfortunately, the session logs I was thinking of using doesn't have the required data, so benchmarks would have to be synthetic.
Marking this as WontFix for now, feel free to re-open if additional work is needed.

Sign in to add a comment