Reading /proc/<pid>/totmaps is very slow under memory pressure
Issue description
Chrome Version: 69.0.3497.0
OS: Chrome OS
What steps will reproduce the problem?
I opened several very memory-hungry tabs and cycled through them constantly to put the system under heavy memory pressure. Sometimes the system visibly froze for seconds at a time. I also noticed that resource_coordinator::TabManager::OnMemoryPressure() can sometimes block the UI thread for 500+ or even 1000+ ms.
Then I profiled the browser process UI thread with perf:
# /usr/bin/perf record -a -g -t 1377 -F 100
In a time interval where TabManager::OnMemoryPressure() took 500+ ms, I got:
93.45% 0.00% chrome chrome [.] _start
|
---_start
__libc_start_main
ChromeMain
service_manager::Main
(deleted for brevity. Refer to the attachment for deleted content.)
base::MemoryPressureListener::Notify
resource_coordinator::TabManager::OnMemoryPressure
resource_coordinator::TabManager::LogMemoryAndDiscardTab
resource_coordinator::TabManagerDelegate::LowMemoryKill
resource_coordinator::TabManagerDelegate::LowMemoryKillImpl
|
|--43.73%--resource_coordinator::TabManagerDelegate::KillTab
| resource_coordinator::TabLifecycleUnitSource::TabLifecycleUnit::Discard
| resource_coordinator::TabLifecycleUnitSource::TabLifecycleUnit::FinishDiscard
| |
(deleted for brevity. Refer to the attachment for deleted content.)
|
|--42.98%--resource_coordinator::TabLifecycleUnitSource::TabLifecycleUnit::GetEstimatedMemoryFreedOnDiscardKB
| base::ProcessMetrics::GetTotalsSummary
| base::ReadFileToStringWithMaxSize
| GI_libc_read
| entry_SYSCALL_64_fastpath
| sys_read
| __vfs_read
| seq_read
| totmaps_proc_show
| walk_page_vma
| __walk_page_range
| smaps_pte_range
| |
| --30.82%--swp_swapcount
| |
| |--18.78%--swap_info_get
| | _raw_spin_lock
| | do_raw_spin_lock
| |
| --12.04%--_raw_spin_unlock
| |
| --6.08%--do_raw_spin_unlock
The trace shows that to estimate how much memory killing a process would free (USS + swap), we read /proc/<pid>/totmaps, which can be very slow under memory pressure, when the UI thread and kswapd step on each other's toes (30% of the time is spent in swp_swapcount, with many cycles spent in spin locks). The estimation done to handle memory pressure actually worsens the memory pressure. We could consider a lighter but less accurate estimate when handling memory pressure, so that system performance doesn't fall off a cliff once free memory drops below some level.
,
Aug 8
That's an interesting result: 43% of the time in LowMemoryKillImpl is spent reading totmaps? Am I reading it correctly? I think we're using totmaps (which is a Chrome OS specific thing; we need to switch to smaps_rollup, which is the upstream version) because we used to rely on RSS from /proc/<pid>/stat, but that wasn't very accurate. We could revisit that and see if RSS from /proc/<pid>/stat is good enough.
,
Aug 9
,
Aug 9
Yes, LowMemoryKillImpl can spend 43% of its time reading totmaps. Anything that walks the address space of the process, like smaps or smaps_rollup, is expected to give a similar result. We should consider using RSS + swap as a faster approximation. When a tab process consumes lots of memory, its shared memory may be only a small fraction of the total, so RSS is closer to USS. I need to run experiments to get some numbers.
,
Aug 10
Some experiment results with the top 10 web sites on https://moz.com/top500 : RSS of a tab process ranges from 300 MB to 110 MB. Corresponding USS numbers are 200 MB to 33 MB. Shared memory for these processes ranges from 100 to 70 MB. We need to be careful when using RSS to estimate the memory freed by killing a process: it is an overestimate and could lead to too few processes being killed. Then the system doesn't recover from the low memory condition and has to go through the path from low memory notification to killing a process again, which keeps the system under memory pressure for longer. We can apply a negative offset to the freed-memory estimate derived from RSS. Adding hysteresis to tab discard ( https://crbug.com/872253 ) also alleviates the precision problem of RSS.
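A minimal sketch of the negative-offset idea, not the code that landed. The function name and the offset value are hypothetical; the thread only reports that shared memory was in the 70 to 100 MB range, so the offset used below is purely illustrative.

```python
def estimate_freed_from_rss_kb(rss_kb, shared_offset_kb):
    """Estimate memory freed by killing a process, from RSS alone.

    RSS includes pages shared with other processes, which are not freed
    when this process dies, so a fixed offset is subtracted. The offset
    is an assumed tuning value, not something measured by the kernel.
    """
    # Clamp at zero so a small process never yields a negative estimate.
    return max(0, rss_kb - shared_offset_kb)


# Illustrative values only: a 300 MB RSS with a 100 MB offset.
example_kb = estimate_freed_from_rss_kb(300 * 1024, 100 * 1024)
```

A larger offset makes the estimate more conservative, which leans toward killing one process too many rather than leaving the system under pressure.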
,
Aug 17
It looks like in 4.14 and later /proc/<pid>/status contains enough information to get a USS number that is accurate for estimating memory freed by discard. /proc/<pid>/statm reports rss, which is a sum of anon, file, and shmem. File can be a large amount of memory, which is mostly used for text. In the 4.14 status file we have anon, file, and shmem broken out:
VmPeak: 614280 kB
VmSize: 612744 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 164448 kB
VmRSS: 134844 kB
RssAnon: 106320 kB
RssFile: 28244 kB
RssShmem: 280 kB
VmData: 189276 kB
VmStk: 132 kB
VmExe: 145020 kB
VmLib: 44712 kB
VmPTE: 1044 kB
VmPMD: 496 kB
VmSwap: 0 kB
So we can just parse out RssAnon from this and use that as our estimation. We will need to backport this stat to older kernels, but that should be relatively easy.
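Parsing RssAnon out of the status file could look roughly like the sketch below. The function names are hypothetical, and adding VmSwap to RssAnon is an assumption based on the earlier "USS + swap" description, not something this comment specifies. The sample text is copied from the status excerpt above.

```python
import re

# Sample lines copied from the /proc/<pid>/status excerpt in this comment.
SAMPLE_STATUS = """\
VmRSS: 134844 kB
RssAnon: 106320 kB
RssFile: 28244 kB
RssShmem: 280 kB
VmSwap: 0 kB
"""

# Matches lines of the form "Name:   <number> kB".
_KB_LINE = re.compile(r"^(\w+):\s+(\d+)\s+kB\s*$")


def parse_status_kb(status_text):
    """Return a dict of all kB-valued fields in a status file's text."""
    fields = {}
    for line in status_text.splitlines():
        match = _KB_LINE.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def estimate_freed_kb(status_text):
    """Approximate USS + swap as RssAnon + VmSwap (assumed formula)."""
    fields = parse_status_kb(status_text)
    return fields.get("RssAnon", 0) + fields.get("VmSwap", 0)


def read_status(pid):
    """Read /proc/<pid>/status; a single small read, no page-table walk."""
    with open(f"/proc/{pid}/status") as status_file:
        return status_file.read()
```

On the sample, VmRSS equals RssAnon + RssFile + RssShmem, which matches the breakdown described above.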
,
Aug 31
To use RssAnon as an approximation to USS, I backported
* https://chromium.googlesource.com/chromiumos/third_party/kernel/+/eca56ff906bdd0239485e8b47154a6e73dd9a2f3
and
* https://chromium.googlesource.com/chromiumos/third_party/kernel/+/8cee852ec53fb530f10ccabf1596734209ae336b
to v4.4 and tested on my device with the top 10 web sites (search and mail were tested on google.com). Using RssAnon as our estimation on these sites works pretty well. Deltas between RssAnon and USS of the renderer processes consistently fall within 35 to 40 MB in 9 of 10 sites. The only exception among the top 10 sites is wikipedia.org, where RssAnon - USS is 27 MB (values in kB):
Tab              USS      Anon     Delta
========================================
Facebook.com     127004   166664   39660
Twitter.com       84948   125592   40644
Google search     49536    90324   40788
Gmail            185808   222292   36484
Youtube.com       94664   134620   39956
Instagram.com     44052    84532   40480
Linkedin.com     119060   159332   40272
Wordpress.org     34564    70104   35540
Pinterest.com     77832   118300   40468
Wikipedia.org     57924    85824   27900
Wordpress.com     57860    98140   40280
The delta comes mostly from anonymous memory shared between the zygote and renderer processes, like the following:
5a673fe00000-5a6741c00000 r-xp 00000000 00:00 0
Size: 30720 kB
Rss: 30720 kB
Pss: 1616 kB
Shared_Clean: 0 kB
Shared_Dirty: 30720 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 30720 kB
Anonymous: 30720 kB
AnonHugePages: 30720 kB
(omitted for brevity)
And with zygote, we can expect that there won't be many file-mapped private pages. I am going to proceed with backporting to the v3.x kernels and then the Chrome part.
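To illustrate why such a mapping counts toward RssAnon but not USS, here is a small sketch that parses the smaps entry quoted above. It assumes the usual definition of USS as Private_Clean + Private_Dirty; the helper name is hypothetical.

```python
# The smaps entry quoted in this comment (the omitted tail is left out).
SAMPLE_SMAPS_ENTRY = """\
5a673fe00000-5a6741c00000 r-xp 00000000 00:00 0
Size: 30720 kB
Rss: 30720 kB
Pss: 1616 kB
Shared_Clean: 0 kB
Shared_Dirty: 30720 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 30720 kB
Anonymous: 30720 kB
AnonHugePages: 30720 kB
"""


def parse_smaps_entry_kb(entry_text):
    """Return the kB-valued fields of one smaps entry as a dict."""
    fields = {}
    for line in entry_text.splitlines():
        parts = line.split()
        # Field lines look like "Name: <number> kB"; the mapping header
        # line has a different shape and is skipped.
        if (len(parts) == 3 and parts[0].endswith(":")
                and parts[1].isdigit() and parts[2] == "kB"):
            fields[parts[0][:-1]] = int(parts[1])
    return fields


entry = parse_smaps_entry_kb(SAMPLE_SMAPS_ENTRY)
# Assumed USS definition: only private pages count.
uss_contribution_kb = entry["Private_Clean"] + entry["Private_Dirty"]
anon_contribution_kb = entry["Anonymous"]
delta_kb = anon_contribution_kb - uss_contribution_kb
```

For this entry the USS contribution is 0 kB while the anonymous contribution is 30720 kB, so this single mapping accounts for about 30 MB of the observed delta.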
,
Sep 7
Re #7: that's interesting. The PSS of that process is also tiny, which to me says this shared anon memory VMA is probably shared among all of the renderer processes. I agree that it seems like a good enough approximation for what we need.
,
Sep 12
,
Sep 20
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/891c27ed82da443e7373152fdef17984852b6eae

commit 891c27ed82da443e7373152fdef17984852b6eae
Author: Chinglin Yu <chinglinyu@chromium.org>
Date: Thu Sep 20 06:38:53 2018

Estimate freed memory in killing a process faster.

Use memory_instrumentation::OSMetrics::FillOSMemoryDump(), which reads /proc/<pid>/statm, to get private bytes of a process, to avoid contention with kswapd under heavy memory pressure.

BUG= chromium:872253
TEST=manual
R=cylee@chromium.org, sonnyrao@chromium.org, fdoray@chromium.org

Change-Id: I4d43933a39c3c89d8ebb81e3ccef20277cedb258
Reviewed-on: https://chromium-review.googlesource.com/1212246
Commit-Queue: Chinglin Yu <chinglinyu@chromium.org>
Reviewed-by: François Doray <fdoray@chromium.org>
Reviewed-by: Cheng-Yu Lee <cylee@chromium.org>
Cr-Commit-Position: refs/heads/master@{#592701}

[modify] https://crrev.com/891c27ed82da443e7373152fdef17984852b6eae/chrome/browser/resource_coordinator/tab_lifecycle_unit.cc
[modify] https://crrev.com/891c27ed82da443e7373152fdef17984852b6eae/chrome/browser/resource_coordinator/tab_manager_delegate_chromeos.cc
[modify] https://crrev.com/891c27ed82da443e7373152fdef17984852b6eae/chrome/browser/resource_coordinator/tab_manager_delegate_chromeos.h
[modify] https://crrev.com/891c27ed82da443e7373152fdef17984852b6eae/chrome/browser/resource_coordinator/utils.cc
[modify] https://crrev.com/891c27ed82da443e7373152fdef17984852b6eae/chrome/browser/resource_coordinator/utils.h
,
Oct 5
Comment 1 by chinglinyu@chromium.org
, Aug 8