New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 674665 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Last visit > 30 days ago
Closed: Dec 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug



Sign in to add a comment

clang: PartitionAllocTest sets rlimit that is lower than memory reserved by coverage=edge

Project Member Reported by krasin@chromium.org, Dec 15 2016

Issue description

Chrome Version: tip
OS: Linux x86-64
Clang: tip

What steps will reproduce the problem?
(1) Compile base_unittests with edge coverage and Clang ToT:
gn gen out/edge-tot '--args=is_debug=false sanitizer_coverage_flags = "edge"  is_cfi=false symbol_level=2 dcheck_always_on=true clang_use_chrome_plugins=false clang_base_path="/usr/lo
cal/google/home/krasin/src/llvm.org/release-build"' --check

(the path to your Clang ToT might differ)

(2) Run the test:
$ ./out/edge-tot/base_unittests --gtest_filter=PartitionAllocTest.RepeatedReturnNull                                                                                                     

IMPORTANT DEBUGGING NOTE: batches of tests are run inside their
own process. For debugging a test inside a debugger, use the
--gtest_filter=<your_test_name> flag along with
--single-process-tests.
Using sharding settings from environment. This is shard 0/1
Using 1 parallel jobs.
Note: Google Test filter = PartitionAllocTest.RepeatedReturnNull
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from PartitionAllocTest
[ RUN      ] PartitionAllocTest.RepeatedReturnNull
Received signal 4 ILL_ILLOPN 0000019c485e
#0 0x0000017f1138 base::debug::(anonymous namespace)::StackDumpSignalHandler()
#1 0x7fc5b73d3330 <unknown>
#2 0x0000019c485e base::partitionOutOfMemory()
#3 0x0000019c2177 base::partitionAllocSlowPath()
#4 0x000000631d6d base::DoReturnNullTest()
#5 0x000001a79d0c testing::Test::Run()
#6 0x000001a7b619 testing::TestInfo::Run()
#7 0x000001a7c049 testing::TestCase::Run()
#8 0x000001a8d359 testing::internal::UnitTestImpl::RunAllTests()
#9 0x000001a8cc16 testing::UnitTest::Run()
#10 0x0000019fbe05 base::TestSuite::Run()
#11 0x000001a1a93a base::(anonymous namespace)::LaunchUnitTestsInternal()
#12 0x000001a1a6bb base::LaunchUnitTests()
#13 0x0000019e3635 main
#14 0x7fc5b6e09f45 __libc_start_main
#15 0x000000562a85 <unknown>
  r8: 0000000008000000  r9: 0000000000000000 r10: 0000000000000022 r11: 0000000000000246
 r12: 0000000002e3cb80 r13: 0000000000000000 r14: 0000000002c64008 r15: 0000000000190020
  di: 0000000002e3d12c  si: 0000000000003557  bp: 00007ffd414e2700  bx: 0000000000000000
  dx: 00000000019c485d  ax: 00007fc576d612d0  cx: 00000000023df1c0  sp: 00007ffd414e26f0
  ip: 00000000019c485e efl: 0000000000010206 cgf: 0000000000000033 erf: 0000000000000000
 trp: 0000000000000006 msk: 0000000000000000 cr2: 0000000000000000
[end of stack trace]
[142511:142512:1215/114620.167633:1384094597216:ERROR:kill_posix.cc(84)] Unable to terminate process group 142513: No such process
[1/1] PartitionAllocTest.RepeatedReturnNull (CRASHED)
1 test crashed:
    PartitionAllocTest.RepeatedReturnNull (../../base/allocator/partition_allocator/partition_alloc_unittest.cc:1312)

Something bad landed in LLVM trunk recently and broke coverage. I will investigate.
 

Comment 1 by thakis@chromium.org, Dec 15 2016

Blocking: 674274
Ot sounds like this too blocks a roll. We really need to get one out, so thanks for taking a look.

Comment 2 by thakis@chromium.org, Dec 15 2016

First bad: https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxUBSanVptr%20tester/builds/972

That has got_clang revision 289657, prior good build has 289636

Comment 3 by h...@chromium.org, Dec 15 2016

Looking at https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxUBSanVptr%20tester/
it seems like

Last good Clang: r289636
First bad: r289657

Comment 4 by thakis@chromium.org, Dec 15 2016

svn log -r289636:289657 https://nico@llvm.org/svn/llvm-project

Nothing obvious

Comment 5 by krasin@chromium.org, Dec 15 2016

I am building r289645 right now.

Comment 6 by krasin@chromium.org, Dec 15 2016

r289645 is bad. I am now building 289651.

Comment 7 by krasin@chromium.org, Dec 15 2016

r289651 is bad, but that's certainly expected, as I went the wrong side.

I am now building r289640

Comment 8 by h...@chromium.org, Dec 15 2016

I rolled back llvm, clang and compiler-rt to r289636 but still get that failure :-/

Could this be due to a Chromium change?

Last good: 5fb8c41aea7e2a5a7ec8f33a3c6bee46eb86d95d
First bad: a20ca9fe5726b33fb6b0d9f3b10b545fd6e6b195

There's only one change to base/ in that range:
https://codereview.chromium.org/2518253002
"Move Partition Allocator into Chromium base."

Incidentally, that's the test that's failing.

Comment 9 by krasin@chromium.org, Dec 15 2016

r289640 is bad.

Actually, yes, the ToT was not running wtf_unittests which is where these tests were before that move. I will now try r289600
krasin: I think Hans is saying that it might be bad at 289575 (the currently pinned clang) too, and might've been bad forever, and we just didn't notice 'cause that test didn't run until recently.
r289600 is bad.
I ran a sanity check on the revision we currently have as the stable toolchain, r289575. It's also bad.

https://build.chromium.org/p/chromium.fyi/builders/UBSanVptr Linux does not catch this, as it does not have edge coverage set. That makes this bug a non-blocker for the roll: we already have this bug in src/ and the roll does not make it any worse.
Blocking: -674274
Just spotted #8 and #10 (sorry, I was on the meeting, and only made occasional writes). Yes, #12 confirms that the currently rolled Clang is already bad.
Summary: clang: edge coverage causes a miscompilation in PartitionAllocTest (was: Clang ToT causes UBSan bot to fail due to a coverage miscompilation)
r288000 is bad.
r287000 is bad. Trying r285000.
r285000 is bad. Last attempt to claim it's a regression: r280000. After that I will be looking at the disassembly code.
r280000 is bad too. Turning into the investigation mode...
I have built Clang ToT (r290001) and spent some time digging. My current status is that this mmap with the same arguments succeeds for regular case and fails (return -1) when coverage=edge is enabled:
https://cs.chromium.org/chromium/src/base/allocator/partition_allocator/page_allocator.cc?sq=package:chromium&dr=CSs&rcl=1481905173&l=66

static void* systemAllocPages(
    void* hint,
    size_t len,
    PageAccessibilityConfiguration pageAccessibility) {
...
ret = mmap(hint, len, accessFlag, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
...

In both cases mmap leads to the tcmalloc hook:
https://cs.chromium.org/chromium/src/third_party/tcmalloc/chromium/src/malloc_hook_mmap_linux.h?q=third_party/tcmalloc/chromium/src/malloc_hook_mmap_linux.h:181&sq=package:chromium&l=181

Before I dive any deeper, I would like to take some time and reflect on my current findings. 
Summary: clang: PartitionAllocTest sets rlimit that is lower than memory reserved by coverage=edge (was: clang: edge coverage causes a miscompilation in PartitionAllocTest)
So, this is just rlimit. In case of coverage=edge, it reserves around 5GB of memory for bookkeeping. The real usage is much usually lower, but the test fails with OOM anyway, just because we're already above the limit when it's set.

https://cs.chromium.org/chromium/src/base/allocator/partition_allocator/partition_alloc_unittest.cc?sq=package:chromium&dr=CSs&l=70


The fix is sent for a review: https://codereview.chromium.org/2584123002/
Project Member

Comment 22 by bugdroid1@chromium.org, Dec 18 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/57d9a8022c40f1612774d7eb6a2554cf4a0cefe2

commit 57d9a8022c40f1612774d7eb6a2554cf4a0cefe2
Author: krasin <krasin@chromium.org>
Date: Sun Dec 18 16:36:13 2016

PartitionAllocTest: increase address space limit to 6 GB.

In some cases, such as sanitizer_coverage_flags=edge,
more than 5 GB of address space is reserved at a startup.
Raising the address space limit to be higher than that.

Unfortunately, Linux has a broken implementation of getrusage,
and the proper way of fixing this (by setting the current usage + 4 GB)
is not trivial / impossible under a sandbox. This is the riskless
change I could make up: increase the address limit, while still
not increasing the amount of memory allocated to avoid issues on
Android.

BUG= 674665 

Review-Url: https://codereview.chromium.org/2584123002
Cr-Commit-Position: refs/heads/master@{#439372}

[modify] https://crrev.com/57d9a8022c40f1612774d7eb6a2554cf4a0cefe2/base/allocator/partition_allocator/partition_alloc_unittest.cc

Status: Fixed (was: Untriaged)
The UBSanVptr ToT bot is now green:
https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxUBSanVptr%20tester/builds/983

Sign in to add a comment