Should we use madv_free on content::ResourceBuffer and gpuTransferBuffers?
Issue description

Both use shared memory regions. Both don't operate at full capacity. When we free memory in each, should we be using madv_free to actually free the underlying resident pages?

+avi to comment for content/
+vmiura to comment for gpu/
Apr 20 2017
Looks like content::ResourceBuffer is only used in AsyncResourceHandler, which is owned by the network team. What's the advantage/potential disadvantage of using madv_free on content::ResourceBuffer? What's the current behavior?
Apr 20 2017
content::ResourceBuffer is a circular buffer, implemented with a contiguous base::SharedMemory region and some metadata [start, end]. When we first create the SharedMemory region, the unused part of the region is not present in physical memory [it has no resident footprint]. As we start to use the content::ResourceBuffer, the OS lazily grabs and reserves pages. However, as we update the metadata to "free" part of the region, we never actually free the underlying physical pages. The OS doesn't know that the pages can be reclaimed, even though they're not in use. e.g.

1) |start end -----------------------------|   resident: 0   ResourceBuffer accounting: 0
2) |start ---------------- end ------------|   resident: 5   ResourceBuffer accounting: 5
3) |---------------- start end ------------|   resident: 5   ResourceBuffer accounting: 0

I haven't verified this in detail; this is just what I suspect is happening. If instead we used madv_free when we updated the internal accounting, resident memory should go down as well. This may potentially have a large impact on Android, where saving a couple of MBs here might be very significant. Adding some people at the intersection of Android + Memory, as this may be very interesting to them.
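For illustration only, here is a minimal sketch (not Chromium code; all names invented) of what "free the pages when the accounting frees a range" could look like. It assumes POSIX madvise and uses MADV_DONTNEED; for a shared, tmpfs/ashmem-backed region the right request might instead be MADV_REMOVE or ashmem unpinning, so the flag here is an assumption, not a recommendation.

#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>

// Advise the kernel that [base + offset, base + offset + length) is no longer
// needed, but only for the whole pages fully contained in that range.
// Assumes `base` is the page-aligned start of the mapped region.
bool ReleaseFreedRange(void* base, size_t offset, size_t length) {
  const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  const uintptr_t begin = reinterpret_cast<uintptr_t>(base) + offset;
  const uintptr_t end = begin + length;
  // Round inward to page boundaries so we never discard bytes that still
  // belong to live data on either side of the freed range.
  const uintptr_t aligned_begin = (begin + page - 1) & ~(page - 1);
  const uintptr_t aligned_end = end & ~(page - 1);
  if (aligned_begin >= aligned_end)
    return true;  // The freed range doesn't cover a full page; nothing to do.
  return madvise(reinterpret_cast<void*>(aligned_begin),
                 aligned_end - aligned_begin, MADV_DONTNEED) == 0;
}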
Apr 20 2017
Maybe the takeaway here is that anywhere we implement our own allocator [PartitionAlloc; content::ResourceBuffer, which I think of as a super simple allocator; transfer buffers; etc.], we should be using madv_free to make sure that the OS knows which memory is actually in use, and which can be reclaimed.
Apr 20 2017
Apr 20 2017
AsyncResourceHandlers are generally short-lived for small resources. However, it's possible that we get many outstanding requests, in which case we will have many of these buffers around. I'm adding UMA (https://codereview.chromium.org/2828733003/) to see how many outstanding requests we get from renderers. Are there any metrics that we can collect from the wild to see if content::ResourceBuffer is a problem?
Apr 20 2017
Interesting. I believe PartitionAlloc already madvises memory away. How big is that buffer, and is it reported to memory-infra?
Apr 20 2017
AsyncResourceHandler is going away very, very soon (As in we're running field trials now that use Mojo instead). So as far as AsyncResourceHandler is concerned, I don't think we want to invest time in improving it. No idea about gpuTransferBuffers.
Apr 20 2017
mmenke: Is the same problem going to be present in the new Mojo version?
Apr 20 2017
No idea. If it is, it's in mojo's code for transferring streams of data, not content/browser/loader code.
Apr 20 2017
+rockot, jam to answer the Mojo question.
Apr 20 2017
Removing network label, since the focus is no longer network code.
Apr 21 2017
I'm not sure I fully understand what the question is. My understanding is that you're referring to the general scenario of having a large shared memory buffer where some region is unused (or already consumed) and could thus be madv_free'd. I assume the Mojo question here is in reference to Mojo data pipes, which use a shared ring buffer for internal storage. madv_free'ing chunks of the pipe buffer as they're consumed seems reasonable to me. Is there any significant reason not to do this? I can't imagine it's an expensive operation.
Apr 21 2017
> Is there any significant reason not to do this? I can't imagine it's an expensive operation.

I don't know of any. This seems like a straightforward optimization.
Apr 21 2017
On Linux/Android madv_free has non-zero cost (although I don't have any data handy). From what I recall, the kernel walks the mm and frees up the pages, unlinking the PTEs from them, which in turn requires a TLB flush IIRC. While doing this it holds the mmap_sem lock, which will block any other attempt to mmap from other threads within the same process. If it isn't too hard, this is something I'd do in an idle handler or in some quiescent moment, not every single time the ring buffer reader advances.
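To make the "idle handler / quiescent moment" idea concrete, here is an invented sketch (none of these names exist in Chromium): the per-advance fast path only does arithmetic, and the syscall happens once enough consumed bytes have piled up or when an idle callback runs. It assumes a page-aligned buffer and monotonic offsets (ring wraparound is ignored for brevity), and MADV_DONTNEED stands in for whatever request is right for the mapping.

#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>

class LazyDiscarder {
 public:
  LazyDiscarder(void* buffer, size_t threshold_bytes)
      : buffer_(static_cast<uint8_t*>(buffer)), threshold_(threshold_bytes) {}

  // Fast path: called on every reader advance; no syscall here.
  void OnBytesConsumed(size_t n) {
    consumed_end_ += n;
    if (consumed_end_ - discarded_end_ >= threshold_)
      Flush();
  }

  // Slow path: also called from an idle handler / quiescent moment.
  void Flush() {
    const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    // Only discard whole pages that lie entirely inside the consumed prefix.
    const size_t aligned_end = consumed_end_ & ~(page - 1);
    if (aligned_end > discarded_end_) {
      madvise(buffer_ + discarded_end_, aligned_end - discarded_end_,
              MADV_DONTNEED);
      discarded_end_ = aligned_end;
    }
  }

 private:
  uint8_t* const buffer_;     // assumed page-aligned (e.g. from mmap)
  const size_t threshold_;
  size_t consumed_end_ = 0;   // bytes consumed so far
  size_t discarded_end_ = 0;  // page-aligned prefix already given back
};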
Apr 21 2017
If we intend on doing work that is *relatively fast*, we should do it immediately, rather than schedule work to be done at some undetermined time in the future. Especially if it frees resources that may very well be necessary elsewhere. I would recommend: be aware of page boundaries, and don't bother making the syscall unless we cross page boundaries.
Apr 25 2017
> If we intend on doing work that is *relatively fast*, we should do it immediately

This really depends on whether the code that would call madvise is a fastpath or not. I would definitely advise against madvise (like any other syscall) in a fastpath.
Apr 25 2017
I did some experiments a while ago, and madvise(MADV_DONTNEED) on a 17 KiB buffer was roughly equivalent to malloc+free on Android. On Linux it was several times slower (i.e. it was faster to malloc+free). What sizes are we talking about here? I don't see any figures in the previous messages.
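For reference, a rough microbenchmark along these lines (not the experiment mentioned above, whose code isn't in the thread) could look like the following. It compares repeatedly dirtying and discarding a 17 KiB private anonymous mapping against a malloc + touch + free cycle of the same size; the madvise loop deliberately includes the cost of re-faulting the pages, since that is part of what discarding costs you.

#include <sys/mman.h>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

int main() {
  constexpr size_t kSize = 17 * 1024;
  constexpr int kIterations = 1000;

  // madvise path: keep one mapping, repeatedly dirty it and discard it.
  void* region = mmap(nullptr, kSize, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (region == MAP_FAILED)
    return 1;
  const auto t0 = std::chrono::steady_clock::now();
  for (int i = 0; i < kIterations; ++i) {
    memset(region, 1, kSize);               // fault the pages back in
    madvise(region, kSize, MADV_DONTNEED);  // drop the resident pages
  }
  const auto t1 = std::chrono::steady_clock::now();

  // malloc/free path, touching the same amount of memory each iteration.
  const auto t2 = std::chrono::steady_clock::now();
  for (int i = 0; i < kIterations; ++i) {
    void* p = malloc(kSize);
    if (!p)
      return 1;
    memset(p, 1, kSize);
    free(p);
  }
  const auto t3 = std::chrono::steady_clock::now();

  const auto us = [](auto a, auto b) {
    return std::chrono::duration_cast<std::chrono::microseconds>(b - a)
        .count();
  };
  printf("madvise loop: %lld us, malloc/free loop: %lld us\n",
         static_cast<long long>(us(t0, t1)),
         static_cast<long long>(us(t2, t3)));
  munmap(region, kSize);
  return 0;
}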
Apr 25 2017
For URL loading, AFAICT the data pipes are configured with a buffer size of 512 kB. But these are ring buffers and ultimately we're talking about xfering arbitrarily large resources through them, so at least on the order of (data_size / 512kB) madvise calls.
Apr 25 2017
Worth noting that most responses are less than 512k (or at least I believe they are), and we generally only fill the buffer ~32k at a time, so depending on how often the call is made, we may be seeing many more (or fewer) madvise calls.
Apr 25 2017
Hmm, I wonder if the simpler way to improve this is to reset start/end to the beginning when they become equal. I.e. instead of moving start/end pair through the whole buffer (and dirtying it in the process), we would operate on the beginning of the buffer. Or maybe it already works that way?
Apr 25 2017
We read and write to different parts of the buffer, so we never stall on processing data (Assuming processing + IPC rtt is never longer than it takes to read 512k from the network).
Apr 25 2017
It does not work that way today. Doing so would require a little extra synchronization between producer and consumer -- essentially, if the producer is able to detect that the buffer was empty before doing a write, it can send an extra message to the consumer indicating a buffer reset before signaling the write itself.

Also, to clarify, the important behavior here wrt madvise is the consumer's read operation, rather than the write operation. So if we typically write in 32k chunks for loading, how do we read? Do we always just flush all available bytes in the consumer?

If the common case is loading sub-512 kB resources, this is probably all moot anyway, because the whole buffer will just be freed before anything interesting can happen.
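As an invented illustration of that reset handshake (this is not Mojo's actual data pipe code), the producer-side bookkeeping might look roughly like this; the copy of the payload into the shared region and wraparound handling are elided.

#include <cstddef>
#include <cstdint>
#include <vector>

struct ControlMessage {
  enum class Type { kBufferReset, kDataWritten };
  Type type;
  uint32_t offset = 0;  // start of the new data (kDataWritten only)
  uint32_t length = 0;  // number of bytes written (kDataWritten only)
};

class RingProducer {
 public:
  explicit RingProducer(size_t capacity) : capacity_(capacity) {}

  // `bytes_acked` is how much the consumer has reported reading since the
  // last call. Returns the control messages to send, in order.
  std::vector<ControlMessage> Produce(size_t length, size_t bytes_acked) {
    std::vector<ControlMessage> messages;
    read_offset_ = (read_offset_ + bytes_acked) % capacity_;
    if (read_offset_ == write_offset_ && write_offset_ != 0) {
      // Buffer is empty: rewind to the start and tell the consumer *before*
      // signaling the new data, so both sides agree on offsets.
      read_offset_ = write_offset_ = 0;
      messages.push_back({ControlMessage::Type::kBufferReset});
    }
    messages.push_back({ControlMessage::Type::kDataWritten,
                        static_cast<uint32_t>(write_offset_),
                        static_cast<uint32_t>(length)});
    write_offset_ = (write_offset_ + length) % capacity_;
    return messages;
  }

 private:
  const size_t capacity_;
  size_t read_offset_ = 0;   // consumer's position, as known to the producer
  size_t write_offset_ = 0;  // producer's position
};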
Apr 25 2017
We always flush all data we've received to the consumer.
Jun 20 2017
Jun 21 2018
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue. Sorry for the inconvenience if the bug really should have been left as Available. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Oct 17
Nov 15
Backlog triage. I notice this discussion fizzled out. It sounds like madv_free may be too expensive to call indiscriminately on every data pipe read, but we could still consider adding an explicit Mojo API for a consumer to signal that it expects the pipe to be idle for some time. Consumers of particularly wide data pipes could call it as a *potential* memory optimization after they finish reading a large stream.
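To make that concrete, a purely hypothetical shape for such a hint (nothing like this exists in Mojo today; every name below is invented) might be:

#include <cstdint>

// Purely hypothetical, sketched only to make the proposal above concrete.
// A consumer-side, advisory hint on a data pipe: the implementation would be
// free to discard the resident pages of its internal ring buffer (e.g. via
// madvise) and let them fault back in lazily on the next write.
enum class PipeIdleHintResult : uint32_t {
  kOk,           // hint accepted; resident pages may be discarded
  kUnsupported,  // platform or backing store cannot discard pages
};

// Stand-in for the real consumer handle type.
class HypotheticalDataPipeConsumer {
 public:
  // The caller promises it does not expect to read from the pipe for a while.
  // Advisory only: correctness must never depend on whether pages were freed.
  PipeIdleHintResult HintExpectIdle();
};

// Hypothetical call site in a consumer of a wide data pipe:
//   if (finished_reading_large_stream)
//     consumer.HintExpectIdle();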