New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 713768 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Jul 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug
Hotlist-MemoryInfra

Blocked on:
issue 741984
issue 742042

Blocking:
issue 707021
issue 737671



Sign in to add a comment

base::SharedMemoryTracker counts virtual, but not resident/dirty shared memory.

Project Member Reported by erikc...@chromium.org, Apr 20 2017

Issue description

While investigating calculations for shared memory in Chromium:
https://docs.google.com/document/d/1vltgFPqylHqpxkyyCM9taVPWNOTJkzu_GjuqdEwYofM/edit#heading=h.nzl03y839y4u

I noticed the following: If I start Chromium with a clean profile, the browser process has 10MB of virtual shared memory, but only 3MB of resident/dirty.

I added some super simple tracking and noticed that Chromium also counted 10MB of mapped shared memory. 

I modified base::SharedMemory to fault in all pages after mapping, and this brought up resident/dirty to 10MB.

Looking into consumers of shared memory, the two largest are:
  content::ResourceBuffer [content/browser/loader/resource_buffer.cc.]
  gpu Transfer Buffers [gpu/ipc/service/gpu_command_buffer_stub.cc, see GpuCommandBufferStub::OnRegisterTransferBuffer]

Both of these are buffers used to transfer resources between processes, and most of the time, not all pages will be in use.

A legitimate argument can be made that we care about virtual size, not resident/dirty. 

Regardless, I need to be able to count resident/dirty memory for calculating the private memory footprint on macOS:
https://docs.google.com/document/d/1vltgFPqylHqpxkyyCM9taVPWNOTJkzu_GjuqdEwYofM/edit#heading=h.d4pkr7n5m4fr

The easiest way to do this would be to register the base::SharedMemory object itself during map, and then we can have

base::SharedMemoryTracker::GetResidentSize()
  resident = 0
  for shared_memory in shared_memory_regions:
    resident += shared_memory.GetResidentSize()

base::SharedMemory::GetResidentSize would use syscalls to get the resident size of the shared memory region.
    
There are other approaches that I think scale less well [e.g. make more memory dump providers for content::ResourceProvider, gpu transfer buffers that keep track of in-use memory.]

Thoughts?
 
Blocking: 707021
Cc: xunji...@chromium.org
AsyncResourceHandler is one consumer of ResourceBuffer ( Issue 622363 ). It's on my TODO list to investigate the 512K ring buffer per resource. Please keep me updated as well.
Cc: ssid@chromium.org
Here's my TL;DR thought on this:
 Short term (~ until Q3):
  Are GetResidentSize() performances tolerable? *
  YES: use GetResidentSize()
  NO: just assume resident == virtual
 Long term: (>Q3)
  Use the data from the more specific MDP (net/gpu) to compute resident, where available, and either fallback to GetResidentSize() or just assume resident==virtual for the rest.

Why not long-term right now?
This is because in the MVP of memory-infra UMA we won't have yet the effective_size (which is computed using the memory-infra graphs and its edges) until we move all that logic from trace-viewer to chrome, which won't happen before Q3.
So right now we could make it work only for chrome://Tracing /  telemetry / slow-reports (and anything that uses trace-viewer / catapult) but not the in-chrome memory-infra service and hence memory-infra UMA.


* Concretely GetResidentSize() is a matter of doing mincore() calls on linux/android/OSX, which is one syscall for each contiguous virtual address range (i.e. one per Shared memory). Something very similar on Windows (QueryWorkingSetEx). See ProcessMemoryDump::CountResidentBytes() which does that today.
IIRC +ssid a while ago even collected some data, ssid do you remember in which of our M or bugs is that? :)

I'm not sure if this is related to this issue or not, but in DiscardableSharedMemoryManager::OnMemoryDump, "virtual_size", "locked_size" and "resident_size" are used instead of "size". IIUC "resident_size" should be "size" but why does this use different words?
As the experience of crbug/604726, we found that GetResidentSize that calls mincore can't be called from renderers/gpu processes due to sandboxes, but we need to know resident sizes in SharedMemoryTracker::OnMemoryDump that can be called in any processes.

ssid@, I thought you had an idea to tackle this (sorry if I'm misunderstanding).

Comment 6 by ssid@chromium.org, Jun 15 2017

Re #3: the performance of mincore calls in linux and Android are bad. I don't remember the numbers, but high enough that we couldn't enable it on light dumps which happened every 250ms. So the time consumed is comparable to 200ms just for discardable memory segments.

Re #4: discardable OnMemoryDump doesn't add size because it expects the renderer to add the "size"attribute using the global dump trick. Any process can add "size" to the global dump and both processes' dumps associated with the global dump gets the "size" from the global dump. If you see discardable dump from renderer side it'll add size. None of locked size, resident size or virtual size is taken as size.

Re #5: yes we have a way as I mentioned just above to propagate size to dumps in renderer if they can't call mincore. But the real issue I think is the performance of mincore.

Also yes, if we want this information in uma then we need the effective size calculation logic or some have similar to what we did for summary is needed. We can get to that once we solve performance issue in all platforms. I think Erik has a solution for fast resident size calculation in macos.
To get resident size of a shared memory region on macOS, you can use mach_vm_region(...,TOP_INFO)... which is a single sys-call that checks from OS-counters for the underlying memory object.

example: https://cs.chromium.org/chromium/src/base/process/process_metrics_mac.cc?type=cs&q=VM_REGION_TOP_INFO+package:%5Echromium$&l=481
Labels: -Pri-3 Pri-2
Owner: hajimehoshi@chromium.org
Blocking: 737671
#6

>  If you see discardable dump from renderer side it'll add size. None of locked size, resident size or virtual size is taken as size.

Thanks, I understood that 'size' is added at renderer side (discardable_shared_memory_heap.cc), but I couldn't find a file to take 'resident_size' or 'virtual_size' when 'size' doesn't exist.
#6

> Re #5: yes we have a way as I mentioned just above to propagate size to dumps in renderer if they can't call mincore.

You mean we use sync IPC?

IIUC, base::SharedMemory is created at browser process in most cases and mmap-ed 
in any processes. Would it make sense to track all 'created' SharedMemory in browser process and report their resident sizes?

The current SharedMemoryTracker dumps only mmap-ed shared memory objects, so we would need to fix this.
> Thanks, I understood that 'size' is added at renderer side (discardable_shared_memory_heap.cc), but I couldn't find a file to take 'resident_size' or 'virtual_size' when 'size' doesn't exist.

I think discarable dumps "most" of the sizes on the renderer side (renderer is the only one who knows about freelists) but dumps the resident size via mincore from the browser side (specifically here:
https://cs.chromium.org/chromium/src/components/discardable_memory/service/discardable_shared_memory_manager.cc?rcl=2541c478ddb42e82cfb019d50eca544878ed6464&l=337)

> IIUC, base::SharedMemory is created at browser process in most cases and mmap-ed 
> in any processes. Would it make sense to track all 'created' SharedMemory in browser process and report their resident sizes?
If we can rely on the fact that one side of the sharedmemory is always in the browser, yes, I think a good strategy is to have always the browser (which is generally more privileged) to dump the resident information.

> The current SharedMemoryTracker dumps only mmap-ed shared memory objects, so we would need to fix this.
WDYM here?
Status: Started (was: Untriaged)
> > Thanks, I understood that 'size' is added at renderer side (discardable_shared_memory_heap.cc), but I couldn't find a file to take 'resident_size' or 'virtual_size' when 'size' doesn't exist.
> 
> I think discarable dumps "most" of the sizes on the renderer side (renderer is the only one who knows about freelists) but dumps the resident size via mincore from the browser side (specifically here:
https://cs.chromium.org/chromium/src/components/discardable_memory/service/discardable_shared_memory_manager.cc?rcl=2541c478ddb42e82cfb019d50eca544878ed6464&l=337)

Yeah I understand resident size is calculated on browser side, but my question was, when 'size' is not registered, 'resident_size' or 'virtual_size' is taken as 'size' at memory-infra as ssid@ said, but what file does this?

> > IIUC, base::SharedMemory is created at browser process in most cases and mmap-ed 
> > in any processes. Would it make sense to track all 'created' SharedMemory in browser process and report their resident sizes?
> If we can rely on the fact that one side of the sharedmemory is always in the browser, yes, I think a good strategy is to have always the browser (which is generally more privileged) to dump the resident information.

Not always but mostly, SharedMemory is created at browser process:

https://docs.google.com/spreadsheets/d/13GGbVPyAh6WlBfCZTeDYqw5zodZKjhRpSTsNAWU8Xqo/edit?usp=sharing

There are few shared memory created at other processes, but it's rare. So, I'm thinking to track created shared memory and dump them with resident size. The remaining concern is that mincore is expensive. As erikchen@ suggested at #15, mach_vm_region is available but isn't this still expensive? How about Linux's mincore or Windows? Probably we need to take some measurements.

> > The current SharedMemoryTracker dumps only mmap-ed shared memory objects, so we would need to fix this.
> WDYM here?

The current SharedMemoryTracker tracks only mmapp-ed shared memory, and we need to fix this to track also created memory, right?
I've written a design doc: https://docs.google.com/document/d/1pNwUJVarTCZYaBbiZ4LUuRyTGAocCHKicnats20mLSM/edit#
, although I'm still not sure the design, especially regarding attributes like 'size' or 'resident_size'.
Cc: danakj@chromium.org
+danakj@
With --disable-gpu, shared memory is created in the renderer by the layer compositor for raster. This API is used to tell the browser about them. https://cs.chromium.org/chromium/src/cc/ipc/shared_bitmap_allocation_notifier.mojom?rcl=1bf67f8508b01d475944d1b028813cf65439abef&l=15
Blockedon: 742042
Blockedon: 741984
Re #16:

Thank you for the information, it seems rare but it actually happens.


As primiano@ and I have discussed, counting resident bytes of only created shared memory objects on browser process seems not good idea. There are created but not mmapp-ed shared memory objects in browser process, and mmap is required just to get resident size for them. Consuming virtual address space for such purpose would not be allowed.

Finally we concluded to use mincore. Now mincore is allowed on child processes ( crbug.com/741984 ), and performance seems not bad as I described on the doc (https://docs.google.com/document/d/1pNwUJVarTCZYaBbiZ4LUuRyTGAocCHKicnats20mLSM/edit#). On other platforms than Linux, CountResidentBytes is working even in sandbox.

One blocking task is to replace mincore usages with mach_vm_region on macOS for performance ( crbug.com/742042 ), and this is still work in progress.

I have updated the doc: https://docs.google.com/document/d/1pNwUJVarTCZYaBbiZ4LUuRyTGAocCHKicnats20mLSM/edit#

We have confirmed that mincore in POSIX (or other counterparts on the other platforms) is not too expensive and it looks like we can use them in SharedMemoryTracker::OnMemoryDump.
Awesome, thanks for doing the analysis!

Good to see that mach_vm_region is quite fast. :)
I have created a CL to use resident size for shared memory, but this doesn't work until sandbox is fixed (crbug/741984):

https://chromium-review.googlesource.com/c/582065/
Project Member

Comment 23 by bugdroid1@chromium.org, Jul 31 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/7679db6eae7ce7ce4614f0980dad28df4ba0df27

commit 7679db6eae7ce7ce4614f0980dad28df4ba0df27
Author: Hajime Hoshi <hajimehoshi@chromium.org>
Date: Mon Jul 31 09:01:50 2017

Use resident bytes for tracking SharedMemory usage

This CL fixes SharedMemoryTracker::OnMemoryDump to use resident bytes
instead of virtual mmap-ed size so that memory-infra allocator dumps
will show accurate memory usages of shared memory.

SharedMemoryTarcker::OnMemoryDump will use CountResidentBytesInSharedMemory,
that uses syscalls (e.g. mincore on Linux). Calling those adds 1-2 [ms] in total to
create allocation dumps with one renderer, one browser and one gpu process.
See the document regarding the experiments of performance:
https://docs.google.com/document/d/1pNwUJVarTCZYaBbiZ4LUuRyTGAocCHKicnats20mLSM/edit#

Bug:  713768 
Change-Id: Ied163094f9adcf4eba1c109f8ac39a1dae6bb6d3
Reviewed-on: https://chromium-review.googlesource.com/582065
Commit-Queue: Hajime Hoshi <hajimehoshi@chromium.org>
Reviewed-by: danakj <danakj@chromium.org>
Reviewed-by: Primiano Tucci <primiano@chromium.org>
Reviewed-by: Siddhartha S <ssid@chromium.org>
Cr-Commit-Position: refs/heads/master@{#490732}
[modify] https://crrev.com/7679db6eae7ce7ce4614f0980dad28df4ba0df27/base/memory/shared_memory_tracker.cc

Status: Fixed (was: Started)

Sign in to add a comment