New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 812346 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2018
Cc:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 1
Type: Bug



Sign in to add a comment

Update macOS accounting for private_memory_footprint.

Project Member Reported by erikc...@chromium.org, Feb 14 2018

Issue description

On macOS, both phys_footprint and [internal + compressed] overcounts private memory footprint by including anonymous, shared memory that is:
  * faulted 
  * resident

So let's say that there's a shared memory object [that has been mapped in another process and has 10 dirty pages]. The shared memory object is mapped three times into the current process, but only 1 page of one of the memory regions is faulted.

Then phys_footprint will overcount by exactly 1 page. This was confirmed with a test program at https://docs.google.com/document/d/1vltgFPqylHqpxkyyCM9taVPWNOTJkzu_GjuqdEwYofM/edit?ts=58e54b8e#heading=h.wwg2vj3pnloa

So I see three ways forward:

1) Don't touch the implementation of private_memory_footprint. It will be off by O(5MB) per renderer. O(20MB) per browser.

2) Attempt to do some discounting. Unfortunately, there is no way to determine if a given page has been faulted. We can *guess* that all resident, shared memory pages have been faulted. 

3) Fault every page on mapping a shared memory region, and then discounting will be accurate.  


primiano, lalitm, hjd: Thoughts?
 
Chatted with primiano. After some discussion, he seemed to like (2), where we discount by SharedMemoryTracker's resident_size(). 
Project Member

Comment 2 by bugdroid1@chromium.org, Feb 23 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d81c5a39cfdae8e19e88e51088f4f99024c897f2

commit d81c5a39cfdae8e19e88e51088f4f99024c897f2
Author: erikchen <erikchen@chromium.org>
Date: Fri Feb 23 17:15:16 2018

Fix platform private footprint on macOS.

The previous calculation used platform, process counters, which includes
anonymous, faulted, shared memory. Since we now explicitly track all shared
memory regions, we discount by anonymous, resident, shared memory. This
introduces a slight source of error [difference between resident vs faulted],
but the error should be lower than previous.

Measured locally, this change will cause PrivateMemoryFootprint to be lower by
0-8MB for most processes, up to 30MB [rarely] for the browser.

Change-Id: I8b562fc185d215debbcfde60a3ad5c35302271b9
Bug:  812346 
Reviewed-on: https://chromium-review.googlesource.com/924541
Commit-Queue: Erik Chen <erikchen@chromium.org>
Reviewed-by: Primiano Tucci <primiano@chromium.org>
Cr-Commit-Position: refs/heads/master@{#538812}
[modify] https://crrev.com/d81c5a39cfdae8e19e88e51088f4f99024c897f2/services/resource_coordinator/memory_instrumentation/queued_request_dispatcher.cc

Status: Fixed (was: Assigned)
Project Member

Comment 4 by bugdroid1@chromium.org, Mar 15 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/db5d5779e8571c186dc2c046367f4fcb87d59034

commit db5d5779e8571c186dc2c046367f4fcb87d59034
Author: erikchen <erikchen@chromium.org>
Date: Thu Mar 15 22:56:28 2018

Fix underflow in computation of PrivateMemoryFootprint on macOS.

The underflow was occurring during subtraction of two uints, before the
saturated_cast could fix the result.

Bug:  812346 
Change-Id: I08acf91e137880fe7aba0d41f8ea362c9f81315c
Reviewed-on: https://chromium-review.googlesource.com/955908
Reviewed-by: Primiano Tucci <primiano@chromium.org>
Commit-Queue: Erik Chen <erikchen@chromium.org>
Cr-Commit-Position: refs/heads/master@{#543533}
[modify] https://crrev.com/db5d5779e8571c186dc2c046367f4fcb87d59034/services/resource_coordinator/memory_instrumentation/queued_request_dispatcher.cc

Labels: -Pri-3 Merge-Request-66 M-66 Pri-1
Project Member

Comment 6 by sheriffbot@chromium.org, Mar 20 2018

Labels: -Merge-Request-66 Merge-Review-66 Hotlist-Merge-Review
This bug requires manual review: M66 has already been promoted to the beta branch, so this requires manual review
Please contact the milestone owner if you have questions.
Owners: cmasso@(Android), cmasso@(iOS), josafat@(ChromeOS), abdulsyed@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: OS-Mac
How safe is this merge overall? 
Very safe. It's a tiny accounting change.
Labels: -Merge-Review-66 Merge-Approved-66
Approved - branch:3359
Status: Started (was: Fixed)
Beh. I'm actually going to revert the first change from M-66 [also a very small change]. There are still some accounting issues I want to sort out in Canary.
Cc: sullivan@chromium.org charliea@chromium.org tandrii@chromium.org dpranke@chromium.org erikc...@chromium.org
 Issue 822722  has been merged into this issue.
Unfortunately, this logic doesn't work as-is. The problem is that the browser's implementation of DiscardableSharedMemory pathologically violates the assumption that resident shared memory is faulted.

To reproduce this issue, navigate a tab to flickr and start scrolling. Note that browser memory footprint drops to 0.

In the current implementation of DiscardableSharedMemory, the manager [the browser process] holds a mapping of the shared memory segment. On memory pressure signal, the browser sets a bit in the first page [using compare-and-swap] to indicate that the segment has been discarded. If the CAS was successful, the browser calls madvise(MADV_FREE_REUSABLE) to mark the pages as reusable. 

The renderer uses DiscardableSharedMemory for its decoded image cache, whose contents is shared wit hteh GPU for rendering. This results in all the pages being resident, but [almost] none of them being faulted in the browser process.
Project Member

Comment 13 by bugdroid1@chromium.org, Mar 20 2018

Labels: -merge-approved-66 merge-merged-3359
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/236210c254fbee4c2ef24b7575245f88fe1e77f7

commit 236210c254fbee4c2ef24b7575245f88fe1e77f7
Author: erikchen <erikchen@chromium.org>
Date: Tue Mar 20 17:05:46 2018

[Branch 3359] Revert "Fix platform private footprint on macOS."

There is a pathological use case of shared memory that causes this logic
to drastically undercount browser memory usage.

https://bugs.chromium.org/p/chromium/issues/detail?id=812346#c12

This reverts commit d81c5a39cfdae8e19e88e51088f4f99024c897f2.

Bug:  812346 
Change-Id: I7ce0e1f229b4e44cd7a6bce1f39a37aac1bb6978
Reviewed-on: https://chromium-review.googlesource.com/971207
Reviewed-by: Erik Chen <erikchen@chromium.org>
Cr-Commit-Position: refs/branch-heads/3359@{#346}
Cr-Branched-From: 66afc5e5d10127546cc4b98b9117aff588b5e66b-refs/heads/master@{#540276}
[modify] https://crrev.com/236210c254fbee4c2ef24b7575245f88fe1e77f7/services/resource_coordinator/memory_instrumentation/queued_request_dispatcher.cc

And just to satisfy my own curiosity, I confirmed that calling madvise(...MADV_FREE_REUSABLE) on a shared memory segment has the correct behavior [updates private memory footprint in all mapped processes].
Cc: -tandrii@chromium.org
I came up with a solution.

We rely on two assumptions:

1) Consumers of shared memory use it from front to back. [If byte N is resident, then all bytes 0 to N will be resident] - unenforceable, but I believe this is currently true for consumers of shared memory in Chrome. e.g. there are no relevant calls to madvise(MADVISE_FREE_REUSABLE) that do anything different.

2) We always run SharedMemoryTracker calculations before private_memory_footprint calculations.

Immediately following calculation for resident size in SharedMemoryTracker, we proceed to touch the first (RESIDENT_SIZE / PAGE_SIZE) pages of the shared memory region. 

* Most of the time, we expect the pages to already be resident and faulted, thus incurring a cache penalty read hit [since we have to touch all the pages to force faults]. 

* Rarely, we expect the pages to be resident but not faulted, resulting in soft faults + cache penalty. 

I expect this performance hit to be relatively small, and only occurred when trying to calculate private memory footprint.


Then, when we run private_memory_footprint calculations, we expect resident == faulted [modulo changes that occurred since SharedMemoryTracker calculations].
> 1) Consumers of shared memory use it from front to back. 
Yup, not enforceable but reasonable

> 
2) We always run SharedMemoryTracker calculations before private_memory_footprint calculations.

I think this is by design and we can depend on it, right?

> Immediately following calculation for resident size in SharedMemoryTracker, we proceed to touch the first (RESIDENT_SIZE / PAGE_SIZE) pages of the shared memory region. 

A bit of a hack, but effective:  touching 1 byte per page shouldn't be that bad.

> I expect this performance hit to be relatively small, and only occurred when trying to calculate private memory footprint.

Yup. Realistically I don't think I've ever seen > 32M shmem pages. 32M / 4K = a loop with 8k accesses which should resolve in most cases in a noop.

SGTM.

> Yup. Realistically I don't think I've ever seen > 32M shmem pages. 32M / 4K = a loop with 8k accesses which should resolve in most cases in a noop.

The reason this is relevant is because the implementation of DiscardableSharedMemory on macOS has up to 500MB of cached, decoded [un-faulted] images in the browser process. There's an open bug to reduce this number:

https://bugs.chromium.org/p/chromium/issues/detail?id=810486#c12
Project Member

Comment 19 by bugdroid1@chromium.org, Mar 22 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/7c1d03f55ec7a248aece22abfdef25e100bcf4f4

commit 7c1d03f55ec7a248aece22abfdef25e100bcf4f4
Author: erikchen <erikchen@chromium.org>
Date: Thu Mar 22 15:46:04 2018

Correct macOS private memory footprint calculations.

On macOS, measurements for private memory footprint overcount by
faulted pages in anonymous shared memory. To discount for this, this CL touches
all the resident pages in anonymous shared memory, thus making them
faulted as well. This relies on two assumptions:

1) Consumers use shared memory from front to back. Thus, if there are
(N) resident pages, those pages represent the first N * PAGE_SIZE bytes in
the shared memory region.

2) The faulting logic is run shortly before the logic that calculates
phys_footprint, thus ensuring that the discrepancy between faulted and
resident pages is minimal.

The performance penalty is expected to be small.

* Most of the time, we expect the pages to already be resident and faulted,
thus incurring a cache penalty read hit [since we read from each resident
page].

* Rarely, we expect the pages to be resident but not faulted, resulting in
soft faults + cache penalty.

* If assumption (1) is invalid, this will potentially fault some
previously non-resident pages, thus increasing memory usage, without fixing
the accounting.

Bug:  812346 
Change-Id: I04e91bc09bf6bdf2f9179dd8d678c92425e98fed
Reviewed-on: https://chromium-review.googlesource.com/973883
Reviewed-by: Primiano Tucci <primiano@chromium.org>
Reviewed-by: Albert J. Wong <ajwong@chromium.org>
Commit-Queue: Erik Chen <erikchen@chromium.org>
Cr-Commit-Position: refs/heads/master@{#545086}
[modify] https://crrev.com/7c1d03f55ec7a248aece22abfdef25e100bcf4f4/base/trace_event/process_memory_dump.cc
[modify] https://crrev.com/7c1d03f55ec7a248aece22abfdef25e100bcf4f4/services/resource_coordinator/memory_instrumentation/queued_request_dispatcher.cc
[modify] https://crrev.com/7c1d03f55ec7a248aece22abfdef25e100bcf4f4/services/resource_coordinator/public/cpp/memory_instrumentation/client_process_impl.cc
[modify] https://crrev.com/7c1d03f55ec7a248aece22abfdef25e100bcf4f4/services/resource_coordinator/public/cpp/memory_instrumentation/client_process_impl.h

Status: Fixed (was: Started)

Sign in to add a comment