Update macOS accounting for private_memory_footprint.
Issue description

On macOS, both phys_footprint and [internal + compressed] overcount the private memory footprint by including anonymous shared memory that is:
* faulted
* resident

For example, suppose there is a shared memory object [that has been mapped in another process and has 10 dirty pages]. The shared memory object is mapped three times into the current process, but only 1 page of one of those mappings is faulted. Then phys_footprint will overcount by exactly 1 page. This was confirmed with a test program at https://docs.google.com/document/d/1vltgFPqylHqpxkyyCM9taVPWNOTJkzu_GjuqdEwYofM/edit?ts=58e54b8e#heading=h.wwg2vj3pnloa

So I see three ways forward:
1) Don't touch the implementation of private_memory_footprint. It will be off by O(5MB) per renderer and O(20MB) for the browser.
2) Attempt to do some discounting. Unfortunately, there is no way to determine whether a given page has been faulted. We can *guess* that all resident shared memory pages have been faulted.
3) Fault every page when mapping a shared memory region; discounting will then be accurate.

primiano, lalitm, hjd: Thoughts?
Feb 23 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/d81c5a39cfdae8e19e88e51088f4f99024c897f2

commit d81c5a39cfdae8e19e88e51088f4f99024c897f2
Author: erikchen <erikchen@chromium.org>
Date: Fri Feb 23 17:15:16 2018

Fix platform private footprint on macOS.

The previous calculation used platform process counters, which include anonymous, faulted, shared memory. Since we now explicitly track all shared memory regions, we discount by anonymous, resident, shared memory. This introduces a slight source of error [the difference between resident and faulted pages], but the error should be lower than before. Measured locally, this change causes PrivateMemoryFootprint to be lower by 0-8MB for most processes, and up to 30MB [rarely] for the browser.

Change-Id: I8b562fc185d215debbcfde60a3ad5c35302271b9
Bug: 812346
Reviewed-on: https://chromium-review.googlesource.com/924541
Commit-Queue: Erik Chen <erikchen@chromium.org>
Reviewed-by: Primiano Tucci <primiano@chromium.org>
Cr-Commit-Position: refs/heads/master@{#538812}

[modify] https://crrev.com/d81c5a39cfdae8e19e88e51088f4f99024c897f2/services/resource_coordinator/memory_instrumentation/queued_request_dispatcher.cc
Mar 15 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/db5d5779e8571c186dc2c046367f4fcb87d59034

commit db5d5779e8571c186dc2c046367f4fcb87d59034
Author: erikchen <erikchen@chromium.org>
Date: Thu Mar 15 22:56:28 2018

Fix underflow in computation of PrivateMemoryFootprint on macOS.

The underflow was occurring during subtraction of two uints, before the saturated_cast could fix the result.

Bug: 812346
Change-Id: I08acf91e137880fe7aba0d41f8ea362c9f81315c
Reviewed-on: https://chromium-review.googlesource.com/955908
Reviewed-by: Primiano Tucci <primiano@chromium.org>
Commit-Queue: Erik Chen <erikchen@chromium.org>
Cr-Commit-Position: refs/heads/master@{#543533}

[modify] https://crrev.com/db5d5779e8571c186dc2c046367f4fcb87d59034/services/resource_coordinator/memory_instrumentation/queued_request_dispatcher.cc
Mar 20 2018
This bug requires manual review: M66 has already been promoted to the beta branch. Please contact the milestone owner if you have questions. Owners: cmasso@ (Android), cmasso@ (iOS), josafat@ (ChromeOS), abdulsyed@ (Desktop). For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Mar 20 2018
How safe is this merge overall?
Mar 20 2018
Very safe. It's a tiny accounting change.
Mar 20 2018
Approved - branch:3359
Mar 20 2018
Beh. I'm actually going to revert the first change from M-66 [also a very small change]. There are still some accounting issues I want to sort out in Canary.
Mar 20 2018
Issue 822722 has been merged into this issue.
Mar 20 2018
Unfortunately, this logic doesn't work as-is. The problem is that the browser's implementation of DiscardableSharedMemory pathologically violates the assumption that resident shared memory has been faulted. To reproduce this issue, navigate a tab to flickr and start scrolling. Note that the browser memory footprint drops to 0.

In the current implementation of DiscardableSharedMemory, the manager [the browser process] holds a mapping of the shared memory segment. On a memory pressure signal, the browser sets a bit in the first page [using compare-and-swap] to indicate that the segment has been discarded. If the CAS was successful, the browser calls madvise(MADV_FREE_REUSABLE) to mark the pages as reusable. The renderer uses DiscardableSharedMemory for its decoded image cache, whose contents are shared with the GPU for rendering. This results in all the pages being resident, but [almost] none of them being faulted in the browser process.
Mar 20 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/236210c254fbee4c2ef24b7575245f88fe1e77f7

commit 236210c254fbee4c2ef24b7575245f88fe1e77f7
Author: erikchen <erikchen@chromium.org>
Date: Tue Mar 20 17:05:46 2018

[Branch 3359] Revert "Fix platform private footprint on macOS."

There is a pathological use case of shared memory that causes this logic to drastically undercount browser memory usage. https://bugs.chromium.org/p/chromium/issues/detail?id=812346#c12

This reverts commit d81c5a39cfdae8e19e88e51088f4f99024c897f2.

Bug: 812346
Change-Id: I7ce0e1f229b4e44cd7a6bce1f39a37aac1bb6978
Reviewed-on: https://chromium-review.googlesource.com/971207
Reviewed-by: Erik Chen <erikchen@chromium.org>
Cr-Commit-Position: refs/branch-heads/3359@{#346}
Cr-Branched-From: 66afc5e5d10127546cc4b98b9117aff588b5e66b-refs/heads/master@{#540276}

[modify] https://crrev.com/236210c254fbee4c2ef24b7575245f88fe1e77f7/services/resource_coordinator/memory_instrumentation/queued_request_dispatcher.cc
Mar 20 2018
And just to satisfy my own curiosity, I confirmed that calling madvise(MADV_FREE_REUSABLE) on a shared memory segment has the correct behavior [it updates the private memory footprint in all mapped processes].
,
Mar 20 2018
I came up with a solution. We rely on two assumptions:

1) Consumers of shared memory use it from front to back. [If byte N is resident, then all bytes 0 to N will be resident.] This is unenforceable, but I believe it is currently true for consumers of shared memory in Chrome; e.g. there are no relevant calls to madvise(MADV_FREE_REUSABLE) that do anything different.
2) We always run the SharedMemoryTracker calculations before the private_memory_footprint calculations.

Immediately after calculating the resident size in SharedMemoryTracker, we touch the first (RESIDENT_SIZE / PAGE_SIZE) pages of the shared memory region.
* Most of the time, we expect the pages to already be resident and faulted, so we incur only a cache read penalty [since we have to touch all the pages to force faults].
* Rarely, we expect the pages to be resident but not faulted, resulting in soft faults plus the cache penalty.

I expect this performance hit to be relatively small, and it is only incurred when calculating the private memory footprint. Then, when we run the private_memory_footprint calculations, we expect resident == faulted [modulo changes that occurred since the SharedMemoryTracker calculations].
Mar 21 2018
> 1) Consumers of shared memory use it from front to back.

Yup, not enforceable but reasonable.

> 2) We always run SharedMemoryTracker calculations before private_memory_footprint calculations.

I think this is by design and we can depend on it, right?

> Immediately after calculating the resident size in SharedMemoryTracker, we touch the first (RESIDENT_SIZE / PAGE_SIZE) pages of the shared memory region.

A bit of a hack, but effective: touching 1 byte per page shouldn't be that bad.

> I expect this performance hit to be relatively small, and it is only incurred when calculating the private memory footprint.

Yup. Realistically I don't think I've ever seen > 32MB of shmem pages. 32MB / 4KB = a loop with 8k accesses, which should in most cases resolve to a no-op. SGTM.
Mar 21 2018
> Yup. Realistically I don't think I've ever seen > 32MB of shmem pages.

The reason this is relevant is that the implementation of DiscardableSharedMemory on macOS holds up to 500MB of cached, decoded [un-faulted] images in the browser process. There's an open bug to reduce this number: https://bugs.chromium.org/p/chromium/issues/detail?id=810486#c12
Mar 22 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/7c1d03f55ec7a248aece22abfdef25e100bcf4f4

commit 7c1d03f55ec7a248aece22abfdef25e100bcf4f4
Author: erikchen <erikchen@chromium.org>
Date: Thu Mar 22 15:46:04 2018

Correct macOS private memory footprint calculations.

On macOS, measurements of the private memory footprint overcount by the faulted pages in anonymous shared memory. To discount for this, this CL touches all the resident pages in anonymous shared memory, thus making them faulted as well. This relies on two assumptions:

1) Consumers use shared memory from front to back. Thus, if there are N resident pages, those pages represent the first N * PAGE_SIZE bytes in the shared memory region.
2) The faulting logic is run shortly before the logic that calculates phys_footprint, ensuring that the discrepancy between faulted and resident pages is minimal.

The performance penalty is expected to be small.
* Most of the time, we expect the pages to already be resident and faulted, incurring only a cache read penalty [since we read from each resident page].
* Rarely, we expect the pages to be resident but not faulted, resulting in soft faults plus the cache penalty.
* If assumption (1) is invalid, this will potentially fault some previously non-resident pages, increasing memory usage without fixing the accounting.

Bug: 812346
Change-Id: I04e91bc09bf6bdf2f9179dd8d678c92425e98fed
Reviewed-on: https://chromium-review.googlesource.com/973883
Reviewed-by: Primiano Tucci <primiano@chromium.org>
Reviewed-by: Albert J. Wong <ajwong@chromium.org>
Commit-Queue: Erik Chen <erikchen@chromium.org>
Cr-Commit-Position: refs/heads/master@{#545086}

[modify] https://crrev.com/7c1d03f55ec7a248aece22abfdef25e100bcf4f4/base/trace_event/process_memory_dump.cc
[modify] https://crrev.com/7c1d03f55ec7a248aece22abfdef25e100bcf4f4/services/resource_coordinator/memory_instrumentation/queued_request_dispatcher.cc
[modify] https://crrev.com/7c1d03f55ec7a248aece22abfdef25e100bcf4f4/services/resource_coordinator/public/cpp/memory_instrumentation/client_process_impl.cc
[modify] https://crrev.com/7c1d03f55ec7a248aece22abfdef25e100bcf4f4/services/resource_coordinator/public/cpp/memory_instrumentation/client_process_impl.h
Mar 26 2018
[The issue description above is Comment 1 by erikc...@chromium.org, Feb 15 2018.]