Implement a custom calculation for MemoryPressure on macOS.

Issue description
The current implementation uses sysctlbyname("kern.memorystatus_vm_pressure_level",...)
https://cs.chromium.org/chromium/src/base/memory/memory_pressure_monitor_mac.cc?q=memorypressure+package:%5Echromium$&dr=Ss&l=105
This is implemented in xnu-3789.1.32/bsd/kern/kern_memorystatus.c:6612, which just forwards to xnu-3789.1.32/osfmk/vm/vm_pageout.c:4341 (vm_pressure_response).
In that function, two values are calculated: an enum [memorystatus_vm_pressure_level] and a percentage [memorystatus_level].
-----------------------------------------
memorystatus_level calculation:
Roughly, memorystatus_level = AVAILABLE_NON_COMPRESSED_MEMORY * 100 / total_pages, where:
AVAILABLE_NON_COMPRESSED_MEMORY = vm_page_active_count + vm_page_inactive_count + vm_page_free_count + vm_page_speculative_count
total_pages = AVAILABLE_NON_COMPRESSED_MEMORY + VM_PAGE_COMPRESSOR_COUNT + wired pages
You can run this calculation yourself using the values from vm_stat, and confirm that it produces the "System-wide memory free percentage" reported by the memory_pressure tool.
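A minimal sketch of that arithmetic in user space (my code, not the kernel's; it assumes host_statistics64's counters are an adequate stand-in for the kernel-internal vm_page_* counters, and relies on free_count already including speculative pages -- see the host_statistics64 notes later in this thread):

// Sketch: reproduce memorystatus_level via host_statistics64.
#include <mach/mach.h>
#include <cstdint>
#include <cstdio>

int main() {
  vm_statistics64_data_t stats;
  mach_msg_type_number_t count = HOST_VM_INFO64_COUNT;
  if (host_statistics64(mach_host_self(), HOST_VM_INFO64,
                        reinterpret_cast<host_info64_t>(&stats),
                        &count) != KERN_SUCCESS) {
    return 1;
  }
  // active + inactive + free (free_count already includes speculative pages).
  uint64_t available = stats.active_count + stats.inactive_count +
                       static_cast<uint64_t>(stats.free_count);
  uint64_t total = available + stats.compressor_page_count + stats.wire_count;
  printf("System-wide memory free percentage: %llu%%\n",
         available * 100 / total);
  return 0;
}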
-----------------------------------------
memorystatus_vm_pressure_level is determined by a state machine. The transition functions are at xnu-3789.1.32/osfmk/vm/vm_pageout.c:11120
VM_PRESSURE_WARNING_TO_CRITICAL() pseudocode:
  if compressed memory > 0.98 * min(64 GB, 16 * physical_memory):
    return true
  if compressor's memory footprint > 0.98 * min(64 GB, 16 * physical_memory):
    return true
  if max_mem < 3 GB:
    if AVAILABLE_NON_COMPRESSED_MEMORY < 1.2 * 0.5 * max_mem:
      return true
  if max_mem >= 3 GB:
    if AVAILABLE_NON_COMPRESSED_MEMORY < 1.2 * 0.28 * max_mem:
      return true
  return false
The first two conditions are almost impossible to hit. The latter two conditions...are kind of crazy. Keep in mind that AVAILABLE_NON_COMPRESSED_MEMORY includes memory that's resident [in use]. The only types of memory that are *not* included are compressed and wired memory.
VM_PRESSURE_NORMAL_TO_WARNING() pseudocode:
  if max_mem < 3 GB:
    if AVAILABLE_NON_COMPRESSED_MEMORY < 0.9 * max_mem:
      return true
  if max_mem >= 3 GB:
    if AVAILABLE_NON_COMPRESSED_MEMORY < 0.5 * max_mem:
      return true
  return false
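For reference, a user-space approximation of these transition checks (a sketch under assumptions: max_mem is taken from the hw.memsize sysctl, AVAILABLE_NON_COMPRESSED_MEMORY from host_statistics64, and the two compressor-exhaustion conditions are dropped since, as noted above, they are almost impossible to hit):

#include <mach/mach.h>
#include <sys/sysctl.h>
#include <cstdint>

uint64_t PhysicalMemoryBytes() {
  uint64_t mem = 0;
  size_t size = sizeof(mem);
  sysctlbyname("hw.memsize", &mem, &size, nullptr, 0);
  return mem;
}

uint64_t AvailableNonCompressedBytes() {
  vm_statistics64_data_t stats;
  mach_msg_type_number_t count = HOST_VM_INFO64_COUNT;
  if (host_statistics64(mach_host_self(), HOST_VM_INFO64,
                        reinterpret_cast<host_info64_t>(&stats),
                        &count) != KERN_SUCCESS) {
    return 0;
  }
  // free_count already includes speculative pages.
  uint64_t pages = stats.active_count + stats.inactive_count +
                   static_cast<uint64_t>(stats.free_count);
  return pages * vm_page_size;
}

bool NormalToWarning() {
  uint64_t max_mem = PhysicalMemoryBytes();
  double factor = (max_mem < (3ull << 30)) ? 0.9 : 0.5;
  return AvailableNonCompressedBytes() < factor * max_mem;
}

bool WarningToCritical() {
  uint64_t max_mem = PhysicalMemoryBytes();
  double factor = 1.2 * ((max_mem < (3ull << 30)) ? 0.5 : 0.28);
  return AvailableNonCompressedBytes() < factor * max_mem;
}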
-----------------------------------------
With all of this said, perhaps the most important point is that we had a massive memory leak (56GB of leaked GL textures), and the OS memory pressure levels never left NORMAL.
https://bugs.chromium.org/p/chromium/issues/detail?id=700928#c52
I think we need a new calculation. As a rough starting point, I propose:
total_pages_in_use = host_statistics64.wire_count + host_statistics64.inactive_count + host_statistics64.active_count + host_statistics64.total_uncompressed_pages_in_compressor
GetMemoryPressureLevel:
  if pages_free > 0.1 * physical_memory:
    return normal
  if total_pages_in_use < 1.0 * physical_memory:
    return normal
  if total_pages_in_use < 1.5 * physical_memory:
    return warning
  return critical
This is very rough, and I'd appreciate feedback/revisions, but it seems like a better starting point than the OS-level counters.
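A minimal sketch of that proposal, assuming host_statistics64 counters and the hw.memsize sysctl (the threshold constants are just the rough numbers above and would need tuning):

#include <mach/mach.h>
#include <sys/sysctl.h>
#include <cstdint>

enum class PressureLevel { kNormal, kWarning, kCritical };

PressureLevel GetMemoryPressureLevel() {
  vm_statistics64_data_t stats;
  mach_msg_type_number_t count = HOST_VM_INFO64_COUNT;
  if (host_statistics64(mach_host_self(), HOST_VM_INFO64,
                        reinterpret_cast<host_info64_t>(&stats),
                        &count) != KERN_SUCCESS) {
    return PressureLevel::kNormal;  // Fail open: assume no pressure.
  }

  uint64_t physical_memory = 0;
  size_t size = sizeof(physical_memory);
  sysctlbyname("hw.memsize", &physical_memory, &size, nullptr, 0);
  double physical_pages = static_cast<double>(physical_memory / vm_page_size);

  uint64_t pages_in_use =
      stats.wire_count + stats.inactive_count + stats.active_count +
      stats.total_uncompressed_pages_in_compressor;

  if (stats.free_count > 0.1 * physical_pages)
    return PressureLevel::kNormal;
  if (pages_in_use < 1.0 * physical_pages)
    return PressureLevel::kNormal;
  if (pages_in_use < 1.5 * physical_pages)
    return PressureLevel::kWarning;
  return PressureLevel::kCritical;
}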
Apr 20 2017
> Leaking is surely a bad thing, but I'm not sure it should be reflected in memory pressure. Does the leak actually affect the system's responsiveness? IMHO memory pressure should represent something like "we are running out of physical memory", regardless of whether there are leaks. Probably we should have other ways to detect leaks?

The system was literally unusable [most applications were suspended, and could not continue until Chrome was killed]. I'm just using this leak as an example. My point is that a system can be unresponsive because of excessive memory usage while the memory pressure level still shows "normal". This just means that we shouldn't trust the macOS memory pressure level.
Apr 20 2017
Another real-world example, where a user saw 5-second page loads due to severe memory pressure while the memory pressure level was only "warning": https://bugs.chromium.org/p/chromium/issues/detail?id=700580#c32
Apr 20 2017
I agree with Erik that the memory pressure level should be calculated so that the 'critical' state is set before a terrible UX happens. (I'm not familiar with macOS's implementation, but) if we cannot trust the OS's pressure level, I agree that we should calculate the pressure level manually. I'd like to hear shrike@'s thoughts.
Apr 20 2017
erikchen@ - can you relay more details about why you want to synthesize a memory pressure value? I'm not saying it doesn't make sense, I'm just wondering where it would be used. Originally I thought this was related to MemoryMonitorMac, which has a similar concept (it outputs an estimate of free memory, which the MemoryCoordinator uses to estimate memory pressure).
Apr 20 2017
shrike: Perhaps I'm mistaken, but I thought that MemoryPressureMonitor was the entity used by Memory Coordinator to decide policy [such as whether to discard tabs]. The two examples I give in c#1 and c#3 show cases where the system is under a significant amount of memory pressure, but the OS thinks the memory pressure level is normal, or warning, but not critical. The goal of implementing a synthetic memory pressure value is to allow the memory coordinator to make better policy decisions.
Apr 20 2017
OK, got it. Yes, if the goal is to create a better MemoryPressureMonitor[Mac], generating the pressure level synthetically seems like a good plan. The MemoryCoordinator system has changed over time, so I'm not exactly sure of its current design, but the original plan was to replace the MemoryPressureMonitors with per-platform "MemoryMonitors", which return an estimate of available free memory. There currently is no MemoryMonitorMac, which is in part why the MemoryCoordinator does not run on the Mac. It occurred to me that what you're describing for the synthetic memory pressure signal could be what we want to return from the MemoryMonitorMac.
Apr 20 2017
Re: #2 Thanks for the clarification. If the system slows down but still thinks its memory state is normal, that's bad. In that case, distrusting the system's pressure level makes sense. Re: #7 Yes, your explanation is perfect. The original plan was to replace the existing MemoryPressureMonitor (located in base/) with MemoryMonitor (located in content/browser/memory). But as I wrote in #1, it may make sense to use MemoryPressureMonitor for the memory coordinator. I'll think it over a bit more.
Apr 21 2017
Throwing an idea out there: a UMA / slow-reports study which:
- collects a bunch of memory stats; with reference to the example above, things like total_pages_in_use, pages_free, total_pages, wired_pages, etc.
- measures the time that it takes to do: x = malloc(8 MB), memset(x), free(x)
Essentially the idea here would be to grab raw data from the field and see if/where there is a correlation with the time it takes to do pure memory-based operations. If so, come up with a model that matches what we measured from the field. See the sketch of the probe below.
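A sketch of such a probe (a hypothetical helper; the 8 MB size is just the number floated above):

#include <chrono>
#include <cstdlib>
#include <cstring>

// Times a malloc/memset/free round trip. Under heavy pressure, touching the
// pages can stall on the compressor/pager, so the elapsed time should
// correlate with how overcommitted the system is.
double TimeMemoryProbeMs() {
  constexpr size_t kProbeBytes = 8 * 1024 * 1024;
  auto start = std::chrono::steady_clock::now();
  void* x = malloc(kProbeBytes);
  if (!x)
    return -1.0;  // Allocation failure is itself a strong pressure signal.
  memset(x, 1, kProbeBytes);  // Touch every page to force them resident.
  free(x);
  auto end = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(end - start).count();
}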
Apr 21 2017
Hey ericrk, vmiura: Does CC discardable memory currently rely on these signals? If so, we may want to reconsider how we work with image caches in the near future, since these signals don't work very well. haraken: Are there other sources of discardable memory that currently rely on these signals?
Apr 22 2017
Discardable memory is used by many things -- CC, Skia, Blink resources, etc. Our plan is to replace the pressure signals with Memory Coordinator (not only for discardable memory, but for everything). Can we try to implement a "right" pressure triggerer for Memory Coordinator?
Apr 22 2017
> Can we try to implement a "right" pressure triggerer for Memory Coordinator? This sounds good. :)
Jun 5 2017
erikchen@ - in your definition of AVAILABLE_NON_COMPRESSED_MEMORY, what are vm_page_active_count, vm_page_inactive_count, and vm_page_speculative_count (I assume it's clear what vm_page_free_count is)?
Jun 6 2017
This definition for AVAILABLE_NON_COMPRESSED_MEMORY comes from ./osfmk/vm/vm_compressor.h. The terms used in the definition are directly available via host_statistics64 as well as vm_stat. Note that in host_statistics64, free_count = vm_page_free_count + vm_page_speculative_count, and speculative_count = vm_page_speculative_count, whereas in vm_stat, "Pages free" is just vm_page_free_count. Also note that for vm_stat, [free + active + inactive + speculative + wired + compressor] pages add up to the number of pages of physical memory. :)
Meanings of these numbers:
- free: pages available for use by the system.
- speculative: pages that have been speculatively paged in, but are available for use by the system.
- active: page is in use [referenced by a pmap].
- inactive: page *was* in use, but has not been faulted recently, and is ready to be compressed.
- wired: page is *always* in use [e.g. cannot be compressed/paged out].
- compressor: pages used by the compressor.
The reason that AVAILABLE_NON_COMPRESSED_MEMORY includes active and inactive pages is that those pages *could* be used by the compressor to compress more memory, which in turn frees up more pages to be used.
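To make the mapping concrete, a small sketch (my illustration) that fetches these counters via host_statistics64 and checks the identity above; the vm_stat-style "Pages free" is recovered by subtracting speculative_count from free_count:

#include <mach/mach.h>
#include <sys/sysctl.h>
#include <cstdint>
#include <cstdio>

int main() {
  vm_statistics64_data_t s;
  mach_msg_type_number_t count = HOST_VM_INFO64_COUNT;
  if (host_statistics64(mach_host_self(), HOST_VM_INFO64,
                        reinterpret_cast<host_info64_t>(&s),
                        &count) != KERN_SUCCESS) {
    return 1;
  }
  // vm_stat's "Pages free" excludes speculative pages; subtract them out.
  printf("free:        %u\n", s.free_count - s.speculative_count);
  printf("speculative: %u\n", s.speculative_count);
  printf("active:      %u\n", s.active_count);
  printf("inactive:    %u\n", s.inactive_count);
  printf("wired:       %u\n", s.wire_count);
  printf("compressor:  %u\n", s.compressor_page_count);

  // free_count already folds in speculative, so don't add it twice.
  uint64_t sum = static_cast<uint64_t>(s.free_count) + s.active_count +
                 s.inactive_count + s.wire_count + s.compressor_page_count;
  uint64_t mem = 0;
  size_t size = sizeof(mem);
  sysctlbyname("hw.memsize", &mem, &size, nullptr, 0);
  printf("sum: %llu, physical pages: %llu\n", sum, mem / vm_page_size);
  return 0;
}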
Mar 20 2018
Another potentially useful syscall:
"""
* mach_vm_pressure_monitor() collects past statistics about memory pressure.
* The caller provides the number of seconds ("nsecs") worth of statistics
* it wants, up to 30 seconds.
* It computes the number of pages reclaimed in the past "nsecs" seconds and
* also returns the number of pages the system still needs to reclaim at this
* moment in time.
"""
Jan 11
Available, but no owner or component? Please find a component, as no one will ever find this without one.
Jan 14
Erik, can you route?
Jan 14
+ chrisha -- what are next steps here? Does Seb have a framework/ideas for how to best do this on Windows? It would be easy to whip up a simple implementation for macOS, but we'd need to be careful to confirm that we're causing net improvements.
Jan 15
FWIW, we're experimenting with manually dispatching critical memory pressure signals on Android when memory usage reaches 95th-percentile numbers (e.g., 150 MB on 1 GB Android devices). The rationale is that tasak's ablation study has demonstrated a high correlation between high memory usage and regressions on user-facing performance metrics [we'll publish a doc soon]. The assumption is that dispatching critical memory pressure in 95th-percentile cases will lead to a better user experience. We'll send a design doc soon :)
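For illustration, the dispatch side of that experiment might look roughly like this (a sketch, not the actual experiment code; the threshold constant and the caller supplying the footprint are hypothetical placeholders):

#include <cstdint>
#include "base/memory/memory_pressure_listener.h"

// Hypothetical: synthesize a critical pressure signal once the browser's
// footprint crosses the per-device-class 95th-percentile threshold.
void MaybeDispatchSyntheticPressure(uint64_t private_footprint_bytes) {
  constexpr uint64_t kThresholdBytes = 150 * 1024 * 1024;  // 1 GB device class.
  if (private_footprint_bytes >= kThresholdBytes) {
    base::MemoryPressureListener::NotifyMemoryPressure(
        base::MemoryPressureListener::MEMORY_PRESSURE_LEVEL_CRITICAL);
  }
}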