chrome exhibits high cpu use and is very janky |
|||||||||
Issue descriptionChrome exhibits high cpu use and is very janky. I took a sample and symbolized it. Looks like UI thread is spending a ton of time in CoreAnimation. Chrome 69.0.3472.3.
,
Jul 9
Physical footprint: 4.6G Physical footprint (peak): 13.3G Looks like the root problem is memory leak, causing slow container lookups. This version of Chrome includes my recent fix: https://chromium-review.googlesource.com/1103677 so it must be another issue. Can you run: "vmmap -v -interleaved <pid>" and "heap -addresses=all -guessNonObjects <pid>"
,
Jul 13
My browser process is at 2.4G now. So let's grab the diagnostics. In the zip is the vmmap, heap and sample outputs. The sample was after shortly unlocking my screen when Chrome was using a lot of CPU (190%?) Erik, mind taking a look? By the way, this time Chrome wasn't very janky yet - after it "finished" using a lot of CPU and reached a steady state of minimal cpu use (5%). But it's possible it hasn't fully gotten to the user visible "bad" state.
,
Jul 13
This is without MacViews. Here's the leaks <pid> output.
,
Jul 13
Thanks asvitkine. There's probably more than 1 problem at play. I'm looking into the memor yusage right now. The root symptom is that malloc has 90% fragmentation - the default zone has 1.6GB of PMF for only 181MB of allocated objects. looking more closely, there are 501 completely empty MALLOC_NANO regions [each of which consumes 128 kb]. There are 3968 non-empty MALLOC_NANO regions, but they're still mostly empty. This suggests that they're something very broken with malloc on macOS. Continuing to investigate.
,
Jul 13
The implementation of NanoMalloc never frees memory. I confirmed both by looking at the source, see: https://opensource.apple.com/tarballs/libmalloc/libmalloc-140.40.1.tar.gz -- ./src/nano_malloc.c Also see my attached test program for repro steps. Now the question is - what is causing Chrome to use nano malloc in the wild?
,
Jul 13
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b32a54d5896276f5f7be79e120e410e1e2537c63 commit b32a54d5896276f5f7be79e120e410e1e2537c63 Author: erikchen <erikchen@chromium.org> Date: Fri Jul 13 23:01:46 2018 Add the LSEnvironment variable MallocNanoZone=0. There appears to be a bug in Launch Services that causes MallocNanoZone=1 to always be set by default for all applications [this affects applications launched by Finder, and on restart, but not applications launched from the Terminal]. I filed https://bugreport.apple.com/web/?problemID=42181038 to track this. MallocNanoZone causes libMalloc to use the nano malloc zone by default [affects allocations <= 256 bytes]. The implementation of nano malloc never frees memory. See https://bugs.chromium.org/p/chromium/issues/detail?id=861939#c6 for more details. This has been observed to cause > 1GB of wasted memory in the browser process. Possibly a lot more when added up across all processes. To test this: 1) Build Chrome with patch. 2) Run /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/LaunchServices.framework/Versions/A/Support/lsregister -f <path_to_chrome> 3) Launch Chrome from Finder 4) Use "vmmap <pid>" to observe whether there is a MALLOC_NANO summary. Bug: 861939 Change-Id: I740bcbb006e4ce641d0dde6f6e53dcf4f032ff0b Reviewed-on: https://chromium-review.googlesource.com/1137131 Reviewed-by: Avi Drissman <avi@chromium.org> Reviewed-by: Robert Sesek <rsesek@chromium.org> Commit-Queue: Erik Chen <erikchen@chromium.org> Cr-Commit-Position: refs/heads/master@{#575108} [modify] https://crrev.com/b32a54d5896276f5f7be79e120e410e1e2537c63/chrome/app/app-Info.plist [modify] https://crrev.com/b32a54d5896276f5f7be79e120e410e1e2537c63/chrome/app/helper-Info.plist
,
Jul 13
asvitkine: Can we make this bug public?
,
Jul 16
Making public.
,
Jul 16
Issue 850154 has been merged into this issue.
,
Jul 16
Initial UMA stats are promising. I expect further improvements as time passes - as this memory leaks has a much larger affect on longer running instances of Chrome. I'd like to merge this to M-68. If we start to see significant memory improvements, I may even request a merge and spin to M-67.
,
Jul 16
,
Jul 16
This bug requires manual review: We are only 7 days from stable. Please contact the milestone owner if you have questions. Owners: cmasso@(Android), kariahda@(iOS), bhthompson@(ChromeOS), abdulsyed@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jul 16
Approved - branch:3440
,
Jul 16
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/818d165fd2156f65e80ffd8306639a489f794392 commit 818d165fd2156f65e80ffd8306639a489f794392 Author: erikchen <erikchen@chromium.org> Date: Mon Jul 16 17:49:43 2018 [Merge to 3440] Add the LSEnvironment variable MallocNanoZone=0. There appears to be a bug in Launch Services that causes MallocNanoZone=1 to always be set by default for all applications [this affects applications launched by Finder, and on restart, but not applications launched from the Terminal]. I filed https://bugreport.apple.com/web/?problemID=42181038 to track this. MallocNanoZone causes libMalloc to use the nano malloc zone by default [affects allocations <= 256 bytes]. The implementation of nano malloc never frees memory. See https://bugs.chromium.org/p/chromium/issues/detail?id=861939#c6 for more details. This has been observed to cause > 1GB of wasted memory in the browser process. Possibly a lot more when added up across all processes. To test this: 1) Build Chrome with patch. 2) Run /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/LaunchServices.framework/Versions/A/Support/lsregister -f <path_to_chrome> 3) Launch Chrome from Finder 4) Use "vmmap <pid>" to observe whether there is a MALLOC_NANO summary. Bug: 861939 Change-Id: I740bcbb006e4ce641d0dde6f6e53dcf4f032ff0b Reviewed-on: https://chromium-review.googlesource.com/1137131 Reviewed-by: Avi Drissman <avi@chromium.org> Reviewed-by: Robert Sesek <rsesek@chromium.org> Commit-Queue: Erik Chen <erikchen@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#575108}(cherry picked from commit b32a54d5896276f5f7be79e120e410e1e2537c63) Reviewed-on: https://chromium-review.googlesource.com/1138294 Reviewed-by: Erik Chen <erikchen@chromium.org> Cr-Commit-Position: refs/branch-heads/3440@{#680} Cr-Branched-From: 010ddcfda246975d194964ccf20038ebbdec6084-refs/heads/master@{#561733} [modify] https://crrev.com/818d165fd2156f65e80ffd8306639a489f794392/chrome/app/app-Info.plist [modify] https://crrev.com/818d165fd2156f65e80ffd8306639a489f794392/chrome/app/helper-Info.plist
,
Jul 16
I wrote a dtrace script (attached, and it requires SIP to be disabled) to track where _POSIX_SPAWN_NANO_ALLOCATOR is set.
When opening an app through Dock/Finder/open(1), LaunchServices sends a message to launchd, which then starts the process through xpcproxy.
dtrace: script 'posix_spawn_flags.d' matched 7 probes
CPU ID FUNCTION:NAME
0 216758 posix_spawn:entry PID 1 (launchd) posix_spawn(xpcproxy) with psa_flags = 0x400c
1 216758 posix_spawn:entry PID 1448 (xpcproxy) posix_spawn(/Applications/Utilities/Console.app/Contents/MacOS/Console) with psa_flags = 0x42c0
1 216758 posix_spawn:entry
kernel`unix_syscall64+0x268
kernel`hndl_unix_scall64+0x16
libsystem_kernel.dylib`__posix_spawn+0xa
libsystem_c.dylib`posix_spawnp+0x20a
0x10a27ecdd
0x10a27f91c
libdyld.dylib`start+0x1
Console`0x2
The spawning of xpcproxy creates a new process with CLOEXEC_DEFAULT|SETSIGDEF|SETSIGMASK. In the new process, xpcproxy then uses the posix_spawn variant that _just_ execs with CLOEXEC_DEFAULT|NANO_ALLOCATOR|SETEXEC|START_SUSPENDED.
When the process that is opening the app receives notification that the process has started, it sends SIGCONT to un-suspend the process. E.g., when using `open -a Calculator`:
0 226 kill:entry pid 2135 (open) sending 19 to 2136
0 226 kill:entry
libsystem_kernel.dylib`__kill+0xa
LaunchServices`_LSLaunchApplicationInternal(LSSessionID, __CFDictionary const*, __CFDictionary const*, __CFArray const*, char const* const*, __CFDictionary const**)+0x1406
LaunchServices`_LSLaunchApplication+0x359
LaunchServices`_LSLaunch(LSContext*, FSNodeStruct*, unsigned int, void*, __CFArray const*, AEDesc const*, AEDesc const*, __CFArray const*, __CFDictionary const*, unsigned int, audit_token_t const*, _LSOpen2Options const*, ProcessSerialNumber*, unsigned char*, NSError**)+0x2506
LaunchServices`_LSOpenApp(LSOpenState*, unsigned int, FSNodeStruct*, unsigned int, unsigned char*, ProcessSerialNumber*)+0x165
LaunchServices`_LSOpenItemsWithHandler_CFDictionaryApplier(void const*, void const*, void*)+0x2f6
CoreFoundation`__CFDictionaryApplyFunction_block_invoke+0x16
CoreFoundation`CFBasicHashApply+0x80
CoreFoundation`CFDictionaryApplyFunction+0xc0
LaunchServices`_LSOpenStuffCallLocal+0xc8d
LaunchServices`_LSOpenStuff+0x9d
LaunchServices`_LSOpenURLsWithRole_Common+0x38
LaunchServices`_LSOpenURLsWithRole_CommonLegacy(__CFArray const*, unsigned int, AEKeyDesc const*, LSApplicationParameters_V1 const*, _LSOpen2Options const*, ProcessSerialNumber*, long, __CFURL const**, unsigned char*)+0x5a
LaunchServices`LSOpenURLsWithRole+0x20
open`0x000000010a139131+0x248e
libdyld.dylib`start+0x1
open`0x3
I haven't figured out what is influencing xpcproxy to set _POSIX_SPAWN_NANO_ALLOCATOR, though.
,
Jul 17
An amusing page to look at is https://github.com/jemalloc/jemalloc/issues/1123 . The interesting part (and relevant one) is: """ Now, the interesting problem is that this doesn't happen consistently: if you run the app from the command line, you don't get this problem. Only if you run it with open or by clicking on it in the Dock/Finder/whatever. The difference between those two is that the $whatever that is responsible for spawning application processes in the latter case passes _POSIX_SPAWN_NANO_ALLOCATOR to posix_spawn, which you can't find any public references to outside the XNU source code. Running from the command line doesn't. """ This isn't the first time that someone ran into the mysterious _POSIX_SPAWN_NANO_ALLOCATOR and didn't understand where it came from.
,
Jul 18
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/41e25953327f25cd93aad5978f2874c840ac174f commit 41e25953327f25cd93aad5978f2874c840ac174f Author: erikchen <erikchen@chromium.org> Date: Wed Jul 18 15:13:13 2018 Disable MallocNanoZone for apps. Bug: 861939 Change-Id: Ia86ab5aaa039dd4df38ca722d853907262c16930 Reviewed-on: https://chromium-review.googlesource.com/1140197 Reviewed-by: Trent Apted <tapted@chromium.org> Commit-Queue: Erik Chen <erikchen@chromium.org> Cr-Commit-Position: refs/heads/master@{#576064} [modify] https://crrev.com/41e25953327f25cd93aad5978f2874c840ac174f/chrome/app_shim/app_mode-Info.plist
,
Jul 23
Issue 865424 has been merged into this issue.
,
Aug 1
Issue 869828 has been merged into this issue. |
|||||||||
►
Sign in to add a comment |
|||||||||
Comment 1 by asvitk...@chromium.org
, Jul 9