New issue
Advanced search Search tips

Issue 722911 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Apr 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Windows
Pri: 2
Type: Bug



Sign in to add a comment

PartitionAllocTest.DumpMemoryStats crashes OOM when running PartitionAllocTest.* tests

Project Member Reported by gab@chromium.org, May 16 2017

Issue description

OS: Win10

out\Debug\base_unittests --gtest_filter=PartitionAllocTest.DumpMemoryStats --gtest_repeat=10

-> works fine.

out\Debug\base_unittests --gtest_filter=PartitionAllocTest.*

crashes.

So having multiple tests in same process causes issue when reaching PartitionAllocTest.DumpMemoryStats.


Stack:

KERNELBASE!RaiseException+0x62
base!base::PartitionOutOfMemory+0x51 [d:\src\chrome\src\base\allocator\partition_allocator\partition_alloc.cc @ 267] 
base!base::PartitionAllocSlowPath+0x6d0 [d:\src\chrome\src\base\allocator\partition_allocator\partition_alloc.cc @ 818] 
base_unittests!base::PartitionBucketAlloc+0x1b2 [d:\src\chrome\src\base\allocator\partition_allocator\partition_alloc.h @ 674] 
base_unittests!base::PartitionAlloc+0x187 [d:\src\chrome\src\base\allocator\partition_allocator\partition_alloc.h @ 714] 
base_unittests!base::PartitionAllocTest_DumpMemoryStats_Test::TestBody+0x36 [d:\src\chrome\src\base\allocator\partition_allocator\partition_alloc_unittest.cc @ 1443] 
base_unittests!testing::internal::HandleExceptionsInMethodIfSupported<testing::Test,void>+0x34 [d:\src\chrome\src\third_party\googletest\src\googletest\src\gtest.cc @ 2460] 
base_unittests!testing::Test::Run+0x87 [d:\src\chrome\src\third_party\googletest\src\googletest\src\gtest.cc @ 2474] 
base_unittests!testing::TestInfo::Run+0xad [d:\src\chrome\src\third_party\googletest\src\googletest\src\gtest.cc @ 2660] 
base_unittests!testing::TestCase::Run+0xbf [d:\src\chrome\src\third_party\googletest\src\googletest\src\gtest.cc @ 2775] 
base_unittests!testing::internal::UnitTestImpl::RunAllTests+0x295 [d:\src\chrome\src\third_party\googletest\src\googletest\src\gtest.cc @ 4651] 
base_unittests!testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,bool>+0x34 [d:\src\chrome\src\third_party\googletest\src\googletest\src\gtest.cc @ 2460] 
base_unittests!testing::UnitTest::Run+0xcf [d:\src\chrome\src\third_party\googletest\src\googletest\src\gtest.cc @ 4258] 
base_unittests!RUN_ALL_TESTS+0xf [d:\src\chrome\src\third_party\googletest\src\googletest\include\gtest\gtest.h @ 2238] 
base_unittests!base::TestSuite::Run+0x73 [d:\src\chrome\src\base\test\test_suite.cc @ 271] 
base_unittests!base::internal::FunctorTraits<int (__thiscall base::TestSuite::*)(void),void>::Invoke<base::TestSuite *>+0x12 [d:\src\chrome\src\base\bind_internal.h @ 215] 
base_unittests!base::internal::InvokeHelper<0,int>::MakeItSo<int (__thiscall base::TestSuite::*const &)(void),base::TestSuite *>+0x24 [d:\src\chrome\src\base\bind_internal.h @ 285] 
base_unittests!base::internal::Invoker<base::internal::BindState<int (__thiscall base::TestSuite::*)(void),base::internal::UnretainedWrapper<base::TestSuite> >,int __cdecl(void)>::RunImpl<int (__thiscall base::TestSuite::*const &)(void),std::tuple<base::internal::UnretainedWrapper<base::TestSuite> > const &,0>+0x3b [d:\src\chrome\src\base\bind_internal.h @ 361] 
base_unittests!base::internal::Invoker<base::internal::BindState<int (__thiscall base::TestSuite::*)(void),base::internal::UnretainedWrapper<base::TestSuite> >,int __cdecl(void)>::Run+0x24 [d:\src\chrome\src\base\bind_internal.h @ 339] 
base_unittests!base::Callback<int __cdecl(void),1,1>::Run+0x21 [d:\src\chrome\src\base\callback.h @ 80] 
base_unittests!base::`anonymous namespace'::LaunchUnitTestsInternal+0x118 [d:\src\chrome\src\base\test\launcher\unit_test_launcher.cc @ 216] 
base_unittests!base::LaunchUnitTests+0x51 [d:\src\chrome\src\base\test\launcher\unit_test_launcher.cc @ 458] 
base_unittests!main+0x56 [d:\src\chrome\src\base\test\run_all_base_unittests.cc @ 22] 
base_unittests!invoke_main+0x1e [f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl @ 64] 
base_unittests!__scrt_common_main_seh+0x150 [f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl @ 253] 
base_unittests!__scrt_common_main+0xd [f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl @ 296] 
base_unittests!mainCRTStartup+0x8 [f:\dd\vctools\crt\vcstartup\src\startup\exe_main.cpp @ 17] 
KERNEL32!BaseThreadInitThunk+0x24
ntdll!__RtlUserThreadStart+0x2f
ntdll!_RtlUserThreadStart+0x1b


 
Cc: primiano@chromium.org
Owner: palmer@chromium.org
moving to palmer who owns PA

Comment 2 by palmer@chromium.org, May 23 2017

I can't reproduce the problem on either Linux or Windows. However, these tests are 'aggressive' and OOM-y by nature; we've had issues in the past.

What's your hardware? Perhaps my machines just have so much RAM that I can't happen upon the problem. I'm on 64-bit Z840s with 64 GiB RAM for both Linux and Windows.

Comment 3 by gab@chromium.org, May 23 2017

Z840, running 32-bit build though.

Comment 4 by palmer@chromium.org, May 23 2017

Ahh. What are your build flags? So I can do the exact same as you.

Comment 5 by palmer@chromium.org, May 25 2017

Labels: OS-Linux OS-Windows
OK, I can reproduce this now (on Linux, and probably all 32-bit platforms):

~/chromium/src $ ./out/Default/base_unittests --gtest_filter=PartitionAllocTest.*
...
1 test crashed:
    PartitionAllocTest.DumpMemoryStats (../../base/allocator/partition_allocator/partition_alloc_unittest.cc:1440)
Tests took 15 seconds.

~/chromium/src $ cat out/Default/args.gn 
target_cpu = "x86"
is_debug=true
is_component_build=true
use_goma = true


Comment 6 by palmer@chromium.org, May 26 2017

I suspect the problem might be that PA is running out of address space (not actual memory):. From https://cs.chromium.org/chromium/src/base/allocator/partition_allocator/PartitionAlloc.md:

"""PartitionAlloc guarantees that different partitions exist in different regions of the process' address space. When the caller has freed all objects contained in a page in a partition, PartitionAlloc returns the physical memory to the operating system, but continues to reserve the region of address space. PartitionAlloc will only reuse an address space region for the same partition."""

However, it might be the case that the tests leak either memory or address space. I find the function `TestSetup` somewhat suspicious:

```
  // Zero the allocator structs to clear out traces
  // from previous test.
  memset(&allocator, 0, sizeof(allocator));
  memset(&generic_allocator, 0, sizeof(generic_allocator));

```

Working on a refactor that, if it succeeds, might help.
Project Member

Comment 7 by bugdroid1@chromium.org, May 26 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3cc4bc0d06591e905b80959e4273666fbcc06a3f

commit 3cc4bc0d06591e905b80959e4273666fbcc06a3f
Author: palmer <palmer@chromium.org>
Date: Fri May 26 21:18:48 2017

Refactor Partition Alloc unit tests.

Use testing::Test subclasses and TEST_F(...) instead of TEST(...) with
dependencies on global state.

BUG= 722911 

Review-Url: https://codereview.chromium.org/2903393002
Cr-Commit-Position: refs/heads/master@{#475131}

[modify] https://crrev.com/3cc4bc0d06591e905b80959e4273666fbcc06a3f/base/allocator/partition_allocator/partition_alloc_unittest.cc

Comment 8 by palmer@chromium.org, May 26 2017

Note: this refactor doesn't fix the bug.
Project Member

Comment 9 by bugdroid1@chromium.org, Sep 20 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/61538ef94febd23ee6ea84f0851489961661a6db

commit 61538ef94febd23ee6ea84f0851489961661a6db
Author: Erik Chen <erikchen@chromium.org>
Date: Wed Sep 20 21:41:45 2017

Refactor PartitionAllocator to allow local instances.

Previously, PartitionAllocator and related classes were trivially constructible,
but relied on 0-initialization of internal fields. This meant that only global
instances of PartitionAllocator could be used. This makes testing
PartitionAllocator difficult.

This CL adds constructors to PartitionAllocator classes, and switches globals to
LazyInstances, but otherwise should not cause behavior change. The expected
performance impact should be unnoticeable, as PartitionAllocator already has
non-trivial initialization code. This CL should not affect performance of the
allocation path itself.

Bug: 765735,  722911 
Change-Id: I8cd456772a3d71d316161a67d966e89fb9b81a63
Reviewed-on: https://chromium-review.googlesource.com/673726
Commit-Queue: Erik Chen <erikchen@chromium.org>
Reviewed-by: Kentaro Hara <haraken@chromium.org>
Cr-Commit-Position: refs/heads/master@{#503257}
[modify] https://crrev.com/61538ef94febd23ee6ea84f0851489961661a6db/base/allocator/partition_allocator/address_space_randomization.cc
[modify] https://crrev.com/61538ef94febd23ee6ea84f0851489961661a6db/base/allocator/partition_allocator/page_allocator.cc
[modify] https://crrev.com/61538ef94febd23ee6ea84f0851489961661a6db/base/allocator/partition_allocator/partition_alloc.cc
[modify] https://crrev.com/61538ef94febd23ee6ea84f0851489961661a6db/base/allocator/partition_allocator/partition_alloc.h
[modify] https://crrev.com/61538ef94febd23ee6ea84f0851489961661a6db/base/allocator/partition_allocator/partition_alloc_unittest.cc
[modify] https://crrev.com/61538ef94febd23ee6ea84f0851489961661a6db/base/allocator/partition_allocator/spin_lock.cc
[modify] https://crrev.com/61538ef94febd23ee6ea84f0851489961661a6db/base/allocator/partition_allocator/spin_lock.h
[modify] https://crrev.com/61538ef94febd23ee6ea84f0851489961661a6db/third_party/WebKit/Source/platform/wtf/allocator/Partitions.cpp
[modify] https://crrev.com/61538ef94febd23ee6ea84f0851489961661a6db/third_party/WebKit/Source/platform/wtf/allocator/Partitions.h

Comment 10 by gab@chromium.org, Feb 13 2018

 Issue 803704  has been merged into this issue.

Comment 11 by gab@chromium.org, Feb 13 2018

Cc: bbudge@chromium.org
Still a problem, repros 100% on my Win10 machine.

These days the error is :

$ out\Release\base_unittests.exe --gtest_filter=PartitionAllocTest.*

(...)
[19/24] PartitionAllocTest.DumpMemoryStats (CRASHED)
Received fatal exception 0xe0000008"
PartitionAllocTest.DumpMemoryStats (CRASHED)
(...)

Comment 12 by gab@chromium.org, Feb 13 2018

Labels: -Pri-3 M-66 Pri-2
This is the only test failing consistently when running all base_unittests locally (e.g. out\Release\base_unittests.exe --test-launcher-jobs=42), would be nice to address.

Comment 13 by gab@chromium.org, Mar 28 2018

Any updates? Still failing in every single local run.
I also see this on every run. I'm running a 32-bit version of base_unittests. It always succeeds on the retry.
Owner: davidbienvenu@chromium.org
@palmer - Chris, brucedawson asked me to look at this, so I'm going to have a whack at it. 
removing this test PartitionAllocTest.RepeatedReturnNullDirect fixes the issue, so I suspect there's an issue with that test.
Cc: palmer@chromium.org
Components: -Internals>Core Blink>MemoryAllocator>Partition
OK, great. Please CC me on the CL. Thanks!
The issue appears to be that the 32 bit version of MAYBE_RepeatedReturnNull allocates all the memory accessible to the process with generic_allocator. Although the test frees all the memory it allocates, generic_allocator never returns any memory to the OS, so the next test, DumpMemoryStats, can't allocate any memory because it uses the allocator, and there's no OS memory available to it.

I've been trying to figure out how to get the generic allocator to free its pages at the end of the test, w/o much luck so far. There are other, hackier, approaches, like letting the allocator object reserve some memory so that it can still run after generic_allocator has exhausted OS memory.  
#18: Is there a way to express that a test should always run in its own process, and never be batched together with other tests? If so, that might work.
I don't know if there's a way to force a test into it's own process (I'll look), but I am considering just splitting the OOM tests into their own file, which would accomplish the same thing.
#20: That SGTM. Thanks!
Project Member

Comment 22 by bugdroid1@chromium.org, Apr 10 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/7980674378877d07a2b41f0a55a235fe13056cb1

commit 7980674378877d07a2b41f0a55a235fe13056cb1
Author: David Bienvenu <davidbienvenu@chromium.org>
Date: Tue Apr 10 20:39:34 2018

Put ReturnNull tests in their own class

This should fix an OOM crash in PartitionAllocTest.DumpMemoryStats
because it will make the ReturnNull tests run in a separate process,
and not exhaust process memory for subsequent tests.

Bug:  722911 
Change-Id: I841b846ea3322328a1d10391332c190b59db6db0
Reviewed-on: https://chromium-review.googlesource.com/1005506
Reviewed-by: Tom Sepez <tsepez@chromium.org>
Reviewed-by: Kentaro Hara <haraken@chromium.org>
Commit-Queue: David Bienvenu <davidbienvenu@chromium.org>
Cr-Commit-Position: refs/heads/master@{#549635}
[modify] https://crrev.com/7980674378877d07a2b41f0a55a235fe13056cb1/base/allocator/partition_allocator/partition_alloc_unittest.cc

Status: Fixed (was: Assigned)

Sign in to add a comment