Issue 713775

Starred by 3 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Feature



Should we use madv_free on content::ResourceBuffer and gpuTransferBuffers?

Project Member Reported by erikc...@chromium.org, Apr 20 2017

Issue description

Both use shared memory regions, and neither operates at full capacity. When we free memory in each, should we be using madv_free to actually free the underlying resident pages?

+avi to comment for content/
+vmiura to comment for gpu/
 
Cc: xunji...@chromium.org
Components: Internals>Network Blink>Loader
Looks like content::ResourceBuffer is only used in AsyncResourceHandler, which is owned by the network team. 
What's the advantage/potential disadvantage of using madv_free on content::ResourceBuffer? What's the current behavior?
Cc: mariakho...@chromium.org haraken@chromium.org
content::ResourceBuffer is a circular buffer, implemented with a contiguous base::SharedMemory region, and some metadata [start, end]. 

When we first create the SharedMemory region, the unused part of the region is not present in physical memory [has no resident footprint]. As we start to use the content::ResourceBuffer, the OS lazily grabs and reserves pages. However, as we update the metadata to "free" part of the region, we never actually free the underlying physical pages. The OS doesn't know that the pages can be reclaimed, even though they're not in use.

e.g.
1)
|start end -----------------------------|
resident:0
ResourceBuffer accounting: 0

2)
|start ---------------- end ------------|
resident:5
ResourceBuffer accounting: 5

3)
|---------------- start end ------------|
resident:5
ResourceBuffer accounting: 0

I haven't verified this in detail; this is just what I suspect is happening.

If instead we used madv_free when we updated the internal accounting, resident memory should go down as well. This could have a large impact on Android, where saving a couple of MB here might be very significant.
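
For illustration, a minimal standalone sketch of the effect being described (not Chromium code): it maps a private anonymous region as a stand-in for the shared-memory buffer, faults every page in, then uses mincore() to count resident pages before and after madvise()ing half the region away. MADV_DONTNEED is used here because it drops pages immediately and works on a private mapping; MADV_FREE (Linux 4.5+, private anonymous memory only) merely marks pages reclaimable, and the right advice for an actual base::SharedMemory-backed region may differ.

// Standalone sketch: count resident pages before and after discarding the
// "freed" half of a buffer. The mapping is a private anonymous stand-in for
// the shared memory region discussed above.
#include <sys/mman.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>
#include <vector>

// Returns how many pages of [addr, addr + length) are resident in RAM.
static size_t CountResidentPages(void* addr, size_t length) {
  const size_t page = sysconf(_SC_PAGESIZE);
  std::vector<unsigned char> vec((length + page - 1) / page);
  if (mincore(addr, length, vec.data()) != 0)
    return 0;
  size_t resident = 0;
  for (unsigned char v : vec)
    resident += v & 1;
  return resident;
}

int main() {
  const size_t page = sysconf(_SC_PAGESIZE);
  const size_t kBufferSize = 128 * page;

  void* buf = mmap(nullptr, kBufferSize, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (buf == MAP_FAILED)
    return 1;

  printf("after mmap:    %zu resident pages\n",
         CountResidentPages(buf, kBufferSize));

  memset(buf, 0xAB, kBufferSize);  // Fault every page in.
  printf("after touch:   %zu resident pages\n",
         CountResidentPages(buf, kBufferSize));

  // Pretend the first half of the ring buffer has been consumed and discard
  // it. MADV_DONTNEED drops the pages immediately; MADV_FREE would only mark
  // them reclaimable, so the count would not drop until memory pressure.
  madvise(buf, kBufferSize / 2, MADV_DONTNEED);
  printf("after madvise: %zu resident pages\n",
         CountResidentPages(buf, kBufferSize));

  munmap(buf, kBufferSize);
  return 0;
}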

Adding some people on the intersection of Android + Memory, as this may be very interesting to them.
Maybe the takeaway here is that anywhere we implement our own allocators [PartitionAlloc, transfer buffers, etc., and in this case content::ResourceBuffer, which I think of as a very simple allocator], we should be using madv_free to make sure the OS knows which memory is actually in use and which can be reclaimed.
Cc: dskiba@chromium.org
AsyncResourceHandlers are generally short-lived for small resources. However, it's possible that we get many outstanding requests, in which case we will have many of these buffers around. I'm adding UMA to see how many outstanding requests we get from renderers: https://codereview.chromium.org/2828733003/
Are there any metrics we can collect from the wild to see whether content::ResourceBuffer is a problem?

Comment 7 by dskiba@chromium.org, Apr 20 2017

Interesting. I believe PartitionAlloc already madvises memory away.

How big is that buffer, and is it reported to memory-infra? 

Comment 8 by mmenke@chromium.org, Apr 20 2017

AsyncResourceHandler is going away very, very soon (as in, we're running field trials now that use Mojo instead). So as far as AsyncResourceHandler is concerned, I don't think we want to invest time in improving it. No idea about gpuTransferBuffers.
mmenke: Is the same problem going to be present in the new Mojo version?
No idea. If it is, it's in Mojo's code for transferring streams of data, not content/browser/loader code.
Cc: roc...@chromium.org jam@chromium.org
+rockot, jam to answer the Mojo question.
Components: -Internals>Network -Blink>Loader Internals>Mojo
Removing network label, since the focus is no longer network code.
I'm not sure I fully understand what the question is. My understanding is that you're referring to the general scenario of having a large shared memory buffer where some region is unused (or already consumed) and could thus be madv_free'd.

I assume the Mojo question here is in reference to Mojo data pipes, which use a shared ring buffer for internal storage. madv_free'ing chunks of the pipe buffer as they're consumed seems reasonable to me.

Is there any significant reason not to do this? I can't imagine it's an expensive operation.
> Is there any significant reason not to do this? I can't imagine it's an expensive operation.

I don't know of any. This seems like a straightforward optimization.
On Linux/Android madv_free has non-zero cost (although I don't have any data handy). From what I recall, the kernel walks the mm and frees up the pages, unlinking the PTEs from them, which in turn requires a TLB flush IIRC. While doing this it holds the mmap_sem lock, which will block any other attempt to mmap from other threads within the same process. If it isn't too hard, this is something I'd do in an idle handler or in some quiescent moment, not every single time the ring buffer reader advances.
If we intend to do work that is *relatively fast*, we should do it immediately, rather than schedule work to be done at some undetermined time in the future, especially if it frees resources that may very well be necessary elsewhere.

I would recommend: be aware of page boundaries, and don't bother making the syscall unless we cross page boundaries. 
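
To make the page-boundary suggestion concrete, here is a rough sketch of a consumer-side discard helper (hypothetical name, not actual ResourceBuffer or Mojo code) that only issues the syscall when the consumed range covers at least one whole page:

// Hypothetical helper: given the byte range [consumed_begin, consumed_end)
// that the reader has finished with, discard only the whole pages fully
// contained in it, so small advances never pay for a syscall.
// |buffer_base| is assumed to be the page-aligned start of the mapped region.
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>

// Returns the number of bytes handed back to the kernel (0 if no full page
// was covered and no syscall was made).
size_t MaybeDiscardConsumedRange(uint8_t* buffer_base,
                                 size_t consumed_begin,
                                 size_t consumed_end) {
  const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  // Round the start of the range up and the end down to page boundaries.
  const size_t first = (consumed_begin + page - 1) & ~(page - 1);
  const size_t last = consumed_end & ~(page - 1);
  if (first >= last)
    return 0;  // No full page is covered yet; skip the syscall entirely.

  // MADV_DONTNEED drops the pages immediately; whether MADV_FREE (or some
  // other mechanism) is appropriate for a shared-memory-backed buffer is
  // exactly the open question in this bug.
  madvise(buffer_base + first, last - first, MADV_DONTNEED);
  return last - first;
}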
> If we intend to do work that is *relatively fast*, we should do it immediately,
This really depends on whether the code that would call madvise is a fast path or not.
I would definitely advise against madvise (like any other syscall) in a fast path.
I did some experiments a while ago, and madvise(MADV_DONTNEED) on a 17 KiB buffer was roughly equivalent to malloc+free on Android. On Linux it was several times slower (i.e. it was faster to malloc+free).

What sizes are we talking about here? I don't see any figures in the previous messages.
For URL loading, AFAICT the data pipes are configured with a buffer size of 512 kB. But these are ring buffers, and ultimately we're talking about transferring arbitrarily large resources through them, so at least on the order of (data_size / 512 kB) madvise calls.
Worth noting that most responses are less than 512 kB (or at least I believe they are), and we generally only fill the buffer ~32 kB at a time, so depending on how often the call is made, we may be seeing many more or fewer madvise calls.
Hmm, I wonder if a simpler way to improve this is to reset start/end to the beginning when they become equal.

I.e. instead of moving start/end pair through the whole buffer (and dirtying it in the process), we would operate on the beginning of the buffer.

Or maybe it already works that way?
We read and write to different parts of the buffer, so we never stall on processing data (assuming processing + IPC RTT is never longer than it takes to read 512 kB from the network).
It does not work that way today. Doing so would require a little extra synchronization between producer and consumer -- essentially, if the producer is able to detect that the buffer was empty before doing a write, it can send an extra message to the consumer indicating a buffer reset before signaling the write itself.
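
As a very rough sketch of what that producer-side reset could look like (hypothetical names and plain byte offsets, not the real data pipe implementation):

#include <cstddef>

// Hypothetical single-producer/single-consumer ring state. Offsets are byte
// positions into the shared buffer; pending_reset records that a "buffer
// reset" control message must reach the consumer before the next data signal.
struct RingState {
  size_t read_offset = 0;   // Last position the consumer reported draining to.
  size_t write_offset = 0;  // Next position the producer will write at.
  bool pending_reset = false;
};

// Producer-side check run before appending new data, given the consumer's
// latest acknowledged read offset.
void MaybeResetBeforeWrite(RingState* state, size_t consumer_read_offset) {
  state->read_offset = consumer_read_offset;
  if (state->read_offset == state->write_offset && state->write_offset != 0) {
    // The buffer is empty: write from the front again instead of dirtying
    // fresh pages further into the region, and remember to tell the consumer.
    state->read_offset = 0;
    state->write_offset = 0;
    state->pending_reset = true;
  }
}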

Also to clarify, the important behavior here wrt madvise is the consumer's read operation, rather than the write operation. So if we typically write in 32k chunks for loading, how do we read? Do we always just flush all available bytes in the consumer?

If the common case is loading sub-512 kB resources, this is probably all moot anyway, because the whole buffer will just be freed before anything interesting can happen.
We always flush all data we've received to the consumer.
Labels: -Type-Bug Type-Feature
Status: Available (was: Untriaged)
Project Member Comment 26 by sheriffbot@chromium.org, Jun 21 2018

Labels: Hotlist-Recharge-Cold
Status: Untriaged (was: Available)
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue.

Sorry for the inconvenience if the bug really should have been left as Available.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Cc: -roc...@chromium.org rockot@google.com
Components: -Internals>Mojo Internals>Mojo>Core
Labels: -Hotlist-Recharge-Cold
Status: Available (was: Untriaged)
Backlog triage. I notice this discussion fizzled out.

It sounds like madv_free may be too expensive to call indiscriminately on every data pipe read, but we could still consider adding an explicit Mojo API for a consumer to signal that it expects the pipe to be idle for some time. Consumers of particularly wide data pipes could call it as a *potential* memory optimization after they finish reading a large stream.
