tcmalloc aggressively uses MADV_FREE when switching to a new Linux sysroot even when unsupported by the kernel |
||||||
Issue descriptionTo reproduce: $ git cl patch 598737 $ gclient sync Do a release build using the arguments of linux_chromium_rel_ng: dcheck_always_on = true ffmpeg_branding = "Chrome" is_component_build = false is_debug = false proprietary_codecs = true strip_absolute_paths_from_debug_symbols = true symbol_level = 1 use_goma = true MemoryInstrumentationTest.PrivateFootprintComputatio fails with: ../../content/browser/tracing/memory_instrumentation_browsertest.cc:107: Failure Value of: after_kb - before_kb Expected: (is >= -6656) and (is <= 6656) Actual: 66732 ../../content/browser/tracing/memory_instrumentation_browsertest.cc:111: Failure Value of: during_kb - after_kb Expected: (is >= 63560) and (is <= 69560) Actual: -8
,
Aug 8 2017
So I spent some time investigating this today, repo was easy with the instructions above (Thanks Tom!). Looks like we correctly see the allocation but then never see the deallocation although the system knows the memory is free and will reuse it. We saw this optimisation on MacOS for small allocations (<64mb) but now we're seeing even for very large allocations (>640mb) on Linux. I tried various experiments with 2 allocations: // numbers in kb, alloc and free are the amount of memory allocated/freed the rest are measurements of browser_dump->os_dump->private_footprint_kb // 650mb then 650mb before1 7560 alloc1 665600 during1 674524 free1 665600 after1 674536 before2 674536 alloc2 665600 during2 674536 free2 665600 after2 674540 // 650mb then 325mb before1 7436 alloc1 665600 during1 674436 free1 665600 after1 674440 before2 674440 alloc2 332800 during2 674440 free2 332800 after2 674440 // 650mb then 1300mb before1 7584 alloc1 665600 during1 674552 free1 665600 after1 674552 before2 674556 alloc2 1331200 during2 2008380 free2 1331200 after2 2008380 // 65kb then 65kb before1 7404 alloc1 65 during1 9596 free1 65 after1 9688 before2 9688 alloc2 65 during2 9688 free2 65 after2 9756 // 650mb overlapped with 650mb before1 7516 alloc1 665600 before2 674464 alloc2 665600 during1 1341376 during2 1341376 free2 665600 after2 1341376 free1 665600 after1 1341380 Note that is all cases we never observe the memory reported by private footprint to go down. I also tried putting in a bunch of sleeps in case it was a timing issue but no dice. Next I'm going to try catting statm directly while running the test to check it isn't a bug with the service code.
,
Aug 9 2017
Which OS do I need in order to reproduce this? is this just sysroot dependent or depends also on the host OS (i.e. the actual libc version?) I need to take a look with strace to see what happens under the hoods
,
Aug 9 2017
I think it's only sysroot dependent and you can repro on at least Ubuntu Trusty
,
Aug 11 2017
So I did a quick check with strace and looks like we really end up mmap-ing the 65 MB array (note 65 not 650 as per #2, which is consistent with kAllocSize = 65 * 1024 * 1024; in the code) but never munmap-ing it. The actual computation code is working fine, so I have two theories here: 1) The switch to the sysroot is causing base::MakeUnique<char[]> to somehow mess up the deleter (see https://groups.google.com/a/chromium.org/d/msg/chromium-dev/iQgMedVA8-k/IP6qa__3BQAJ). In which case this would be a real bug because we use MakeUnique<.*[]> in chrome and that would mean we'd start leaking those unique pointers. 2) This is some weird optimization in the allocator. However there are two things I cannot get in this case: (i) "retaining 65 MB after a free() seems very hard to digest as a plausible optimization"; (ii) AFAIK the allocator is still part of libc.so and unless I am missing something, we are not statically linking it here. If this is the case this wouldn't explain how I get this repro on Trusty by just tweaking the sysroot. thomasanderson@, unfortunately I am about to leave. Could you please check that this isn't actually (1) ? If we are confident that there isn't a problem that causes us to leak memory but it's just a problem with the test: - assign back the bug to hjd@ - hjd@: just remove deallocation coverage in the test to unblock the sysroot work - keep this bug open and assigned to me because I still need to understand what's going on.
,
Aug 11 2017
I created a minified repro. I can rule out 1 and still convinced the test does the right thing. I can repro even with malloc/free (see https://chromium-review.googlesource.com/c/612070/1/test.cc) which suggests to me for whatever reason the allocator is seriously flawn or I am missing something gigantic here The minified repro works fine without the sysroot CL. Specifically with reference to the minified repro above. With the old sysroot I get: ----- VirtSize=19312 KB, RSS=2852 KB malloc done, press a key to .reset() VirtSize=86004 KB, RSS=69852 KB reset done, press a key to exit() VirtSize=86004 KB, RSS=3328 KB ----- With new sysroot: ----- VirtSize=19312 KB, RSS=2460 KB malloc done, press a key to .reset() VirtSize=86004 KB, RSS=69660 KB reset done, press a key to exit() VirtSize=86004 KB, RSS=69828 KB <- !!!!!!! ----- I think the key is in the strace output: old: --- mmap(0x1fafa6be1000, 68157440, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x1fafa6be1000 ... madvise(0x1fafa6be1000, 68157440, MADV_DONTNEED) = 0 --- new: --- mmap(0xa2e967af000, 68157440, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xa2e967af000 ... madvise(0xa2e967af000, 68157440, 0x8 /* MADV_??? */) = -1 EINVAL (Invalid argument) --- So it looks like: - free() does NOT munmap() but just madvise, this behavior exists also in the current sysroot. - in the new sysroot free() uses a madvise argument that the trusty kernel does not understand. So, this is a real bug and could lead to real memory leaks in production code.
,
Aug 11 2017
,
Aug 11 2017
Ah there we go. Kernel 4.5 introduced recently MADV_FREE, which brings delayed free semantics https://github.com/torvalds/linux/commit/854e9ed09dedf0c19ac8640e91bcc74bc3f9e5c9 which means that memory gets freed in a delayed only upon pressure. In turn the "stretch" sysroot seem to have adopted that aggressively without performing and backwards compatible checks. So in essence free() in the new sysroot is using a syscall argument that kernels < 4.5 do not understand, effectively turning free() into a no-op (still a bit surprised, I though that free was immplemented in libc.so not in the sysroot, but maybe there is some header int he middle). I am afraid that we cannot use this sysroot as-is, at least not in this state. Also I wonder how many other surprises like this are there in the new sysroot (in terms of using bleeding edge kernel syscalls which are unsupported on current kernel versions)
,
Aug 11 2017
First of all, kudos to hjd@ for adding this test! It was the only test failure, and needless to say, it would have been disastrous to let this fly under the radar. As for the fix, I think the issue is in tcmalloc. 0x8 is MADV_FREE, which was added in Linux kernel 4.5 https://github.com/gperftools/gperftools/issues/780 But there are some places in tcmalloc where we test if we should use MADV_FREE if the macro is defined. eg. https://cs.chromium.org/chromium/src/third_party/tcmalloc/vendor/src/system-alloc.cc?type=cs&q=MADV_FREE&sq=package:chromium&l=67 Sure enough, MADV_FREE was *not* defined in the Jesise sysroot, but is in the Stretch sysroot. I think a good short-term solution is to switch tcmalloc to always use MADV_DONTNEED on Linux, and long-term to add runtime detection for MADV_FREE. primiano@ wdyt?
,
Aug 11 2017
c#8 beat me to it! We're using tcmalloc, not the libc allocator, correct?
,
Aug 11 2017
Ahhh I completely forgot about tcmalloc (double facepalm, as I am on of the two owners of //base/allocator). Yes I see the problem now, that code in tcmalloc was speculating on the fact that only FreeBSD supports MADV_FREE. Adding an #if defined(OS_LINUX) to that (note tcmalloc/chromium not tcmalloc/vendor, the latter is the upstream reference) seems the right thing to do. Also we should check if this has been fixed in tcmalloc upstream and if possible just cherry-pick that patch.
,
Aug 11 2017
,
Aug 14 2017
Fix has landed: https://chromium.googlesource.com/chromium/src.git/+/4c2ef4092040a7a1fce76b6c543d26de4f2f0f29 |
||||||
►
Sign in to add a comment |
||||||
Comment 1 by hjd@chromium.org
, Aug 8 2017Status: Started (was: Assigned)