Starred by 2 users
Status: WontFix
Owner:
Closed: Nov 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



x86-generic-tot-asan-informational failures ASAN unable to mmap / pthread_create() failure
Project Member Reported by steve...@chromium.org, Sep 20 2016
The x86-generic-tot-asan-informational builder:
https://build.chromium.org/p/chromiumos.chromium/builders/x86-generic-tot-asan-informational

is failing almost constantly, mostly in login_Cryptohome

This may or may not be related to issue 618392

 
The last 5 runs all have this failure message:
Unhandled unicode: Unhandled DevtoolsTargetCrashException: Devtools target crashed

The same message is also appearing in other tests, e.g.
login_OwnershipNotRetaken
login_LogoutProcessCleanup

Output from the most recent failure:

https://pantheon.corp.google.com/storage/browser/chromeos-image-archive/x86-generic-tot-asan-informational/R55-8819.0.0-b11989/vm_test_results_1/test_harness/all/SimpleTestVerify/1_autotest_tests/results-33-login_Cryptohome/debug/

Output snippet:

09/20 08:27:06.773 INFO |              oobe:0039| Invoking Oobe.loginForTesting
09/20 08:27:09.798 ERROR|           browser:0062| Failure while starting browser backend.
Traceback (most recent call last):
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/browser/browser.py", line 55, in __init__
    self._browser_backend.Start()
  File "/usr/local/telemetry/src/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 52, in traced_function
    return func(*args, **kwargs)
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome/cros_browser_backend.py", line 163, in Start
    self._gaia_id, not self.browser_options.disable_gaia_services)
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome/oobe.py", line 61, in NavigateFakeLogin
    enterprise_enroll)
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome/oobe.py", line 40, in _ExecuteOobeApi
    self.WaitForJavaScriptExpression("typeof Oobe == 'function'", 20)
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/browser/web_contents.py", line 123, in WaitForJavaScriptExpression
    util.WaitFor(IsJavaScriptExpressionTrue, timeout)
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/core/util.py", line 86, in WaitFor
    res = condition()
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/browser/web_contents.py", line 116, in IsJavaScriptExpressionTrue
    return bool(self.EvaluateJavaScript(expr))
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/browser/web_contents.py", line 187, in EvaluateJavaScript
    expr, context_id=None, timeout=timeout)
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/browser/web_contents.py", line 215, in EvaluateJavaScriptInContext
    expr, context_id=context_id, timeout=timeout)
  File "/usr/local/telemetry/src/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 52, in traced_function
    return func(*args, **kwargs)
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py", line 37, in inner
    inspector_backend._ConvertExceptionFromInspectorWebsocket(e)
  File "/usr/local/telemetry/src/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 52, in traced_function
    return func(*args, **kwargs)
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py", line 34, in inner
    return func(inspector_backend, *args, **kwargs)
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py", line 208, in EvaluateJavaScript
    return self._runtime.Evaluate(expr, context_id, timeout)
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_runtime.py", line 45, in Evaluate
    res = self._inspector_websocket.SyncRequest(request, timeout)
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_websocket.py", line 110, in SyncRequest
    res = self._Receive(timeout)
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_websocket.py", line 149, in _Receive
    data = self._socket.recv()
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/third_party/websocket-client/websocket.py", line 596, in recv
    opcode, data = self.recv_data()
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/third_party/websocket-client/websocket.py", line 606, in recv_data
    frame = self.recv_frame()
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/third_party/websocket-client/websocket.py", line 637, in recv_frame
    self._frame_header = self._recv_strict(2)
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/third_party/websocket-client/websocket.py", line 746, in _recv_strict
    bytes = self._recv(shortage)
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/third_party/websocket-client/websocket.py", line 739, in _recv
    raise WebSocketConnectionClosedException()
DevtoolsTargetCrashException: Devtools target crashed
********************************************************************************
(/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py:394 _ConvertExceptionFromInspectorWebsocket) Original exception:

********************************************************************************
(/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py:415 _AddDebuggingInformation) Received a socket error in the browser connection and the tab no longer exists. The tab probably crashed.
********************************************************************************
(/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py:416 _AddDebuggingInformation) Debugger url: ws://127.0.0.1:60286/devtools/page/cb1638a9-1128-4679-8367-0d910dab0a2a
Found Minidump: False
Stack Trace:
********************************************************************************
	Cannot get stack trace on CrOS
********************************************************************************
Standard output:
********************************************************************************
	Cannot get standard output on CrOS
********************************************************************************
Scanning through the chrome logs I do not see any smoking guns.

However, there are two ASAN log entries:

==26899==ERROR: AddressSanitizer failed to allocate 0xb09000 (11571200) bytes of FakeStack (error code: 12)
ERROR: Failed to mmap
==26899==Process memory map follows:
	0x00155000-0x00156000	

AND:

==28561==ERROR: AddressSanitizer failed to allocate 0xb09000 (11571200) bytes of FakeStack (error code: 12)
==28561==Process memory map follows:
...
==28561==End of process memory map.
==28561==AddressSanitizer CHECK failed: /var/tmp/portage/sys-devel/llvm-3.9_pre265926-r11/work/llvm-3.9_pre265926/projects/compiler-rt/lib/sanitizer_common/sanitizer_common.cc:183 "((0 && "unable to mmap")) != (0)" (0x0, 0x0)
    #0 0x56812206  (/opt/google/chrome/chrome+0x1201206)
    #1 0x56819c59  (/opt/google/chrome/chrome+0x1208c59)
    #2 0x56819e6d  (/opt/google/chrome/chrome+0x1208e6d)
    #3 0x56820f81  (/opt/google/chrome/chrome+0x120ff81)
    #4 0x56766541  (/opt/google/chrome/chrome+0x1155541)
    #5 0x568167c6  (/opt/google/chrome/chrome+0x12057c6)
    #6 0x56766c58  (/opt/google/chrome/chrome+0x1155c58)
    #7 0x5f4fa959  (/opt/google/chrome/chrome+0x9ee9959)
    #8 0x5f4fcb03  (/opt/google/chrome/chrome+0x9eebb03)
    #9 0x56816bff  (/opt/google/chrome/chrome+0x1205bff)
    #10 0x5676cb9e  (/opt/google/chrome/chrome+0x115bb9e)
    #11 0x5534d584  (/lib/libpthread.so.0+0x6584)
Cc: ihf@chromium.org
achuith@, ihf@, I don't suppose either of you have any idea of how to get symbols from the asan_log files?

Cc: osh...@chromium.org
or oshima@?
Comment 6 by ihf@chromium.org, Sep 20 2016
I looked at it. Can't help with symbols, but this crash is typically OOM. ASAN uses extra memory, and maybe we are bumping against the limits? Or maybe there is a leak that it doesn't catch.

I think memory looks ok before/after
results-31-desktopui_KillRestart/desktopui_KillRestart.session/sysinfo/iteration.1
MemTotal:        2068296 kB
MemFree:         1004168 kB
MemAvailable:    1662676 kB

before results-32-security_RootCA/security_RootCA/sysinfo/iteration.1 we have
MemTotal:        2068296 kB
MemFree:          157196 kB
MemAvailable:     817508 kB

Very curious what happened between the two?
Now results-33-login_Cryptohome/login_Cryptohome/sysinfo/iteration.1

before:
MemTotal:        2068296 kB
MemFree:           51124 kB
MemAvailable:     686152 kB

after (crash seems to clean up usage)
MemTotal:        2068296 kB
MemFree:         1042492 kB
MemAvailable:    1660620 kB

In results-34-security_RestartJob/security_RestartJob/sysinfo/iteration.1 usage remains reasonable
MemTotal:        2068296 kB
MemFree:          425784 kB
MemAvailable:    1046288 kB
/
MemTotal:        2068296 kB
MemFree:          422540 kB
MemAvailable:    1043044 kB
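
The before/after comparison above can be done mechanically. This is a minimal sketch, assuming the sysinfo snapshots are plain /proc/meminfo-style text; the helper names parse_meminfo and available_drop_kb are made up for illustration and are not part of autotest.

```python
import re


def parse_meminfo(text):
    """Return {field: kB} for lines shaped like 'MemFree:   157196 kB'."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\w+):\s+(\d+)\s*kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def available_drop_kb(before, after):
    """How much MemAvailable shrank between two snapshots, in kB."""
    return (parse_meminfo(before)["MemAvailable"]
            - parse_meminfo(after)["MemAvailable"])


# Snapshots copied from the desktopui_KillRestart and security_RootCA
# results quoted in this comment.
kill_restart = """MemTotal:        2068296 kB
MemFree:         1004168 kB
MemAvailable:    1662676 kB"""
root_ca = """MemTotal:        2068296 kB
MemFree:          157196 kB
MemAvailable:     817508 kB"""

drop = available_drop_kb(kill_restart, root_ca)
print("MemAvailable dropped by %d kB" % drop)
```

Running the same comparison across consecutive results-N directories would show which test the drop first appears in.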

Investigating a different failure with similar symptoms:
https://build.chromium.org/p/chromiumos.chromium/builders/x86-generic-tot-asan-informational/builds/11996

VMTest1 only fails in desktopui_ScreenLocker:
https://pantheon.corp.google.com/storage/browser/chromeos-image-archive/x86-generic-tot-asan-informational/R55-8823.0.0-b11996/vm_test_results_1/test_harness/all/SimpleTestVerify/1_autotest_tests/results-20-desktopui_ScreenLocker

There are 2 asan_log files that start with:
==15117==ERROR: AddressSanitizer failed to allocate 0xb09000 (11571200) bytes of FakeStack (error code: 12)

However, the starting meminfo looks fine:
MemTotal: 2067756 kB 
MemFree: 218796 kB 
MemAvailable: 896920 kB 

It also contains similar debug info (from desktopui_ScreenLocker.INFO):

09/21 05:47:29.093 INFO |              oobe:0039| Invoking Oobe.loginForTesting
09/21 05:47:29.911 ERROR|           browser:0062| Failure while starting browser backend.
Traceback (most recent call last):
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/browser/browser.py", line 55, in __init__
    self._browser_backend.Start()
...
  File "/usr/local/telemetry/src/third_party/catapult/telemetry/third_party/websocket-client/websocket.py", line 739, in _recv
    raise WebSocketConnectionClosedException()
DevtoolsTargetCrashException: Devtools target crashed

********************************************************************************
(/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py:394 _ConvertExceptionFromInspectorWebsocket) Original exception:

********************************************************************************
(/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py:415 _AddDebuggingInformation) Received a socket error in the browser connection and the tab no longer exists. The tab probably crashed.

My question now is: is the catapult output the actual failure (which suggests that the renderer process is crashing), or a symptom of ASAN failures catching something else?

Cc: steve...@chromium.org
Owner: jdufault@chromium.org
Status: Assigned
I tried to reproduce this locally but failed:

(cr) ~/trunk/src/scripts $ KEEP_CHROME_DEBUG_SYMBOLS=1 USE="accessibility asan autotest build_tests buildcheck chrome_debug chrome_remoting clang cups evdev_gestures fonts gn gold highdpi nacl opengles ozone runhooks v4l2_codec vaapi xkbcommon -X -afdo_use -app_shell -chrome_debug_tests chrome_internal -chrome_media -component_build -envoy -hardfp -internal_gles_conform -internal_khronos_glcts -mojo -opengl -v4lplugin -verbose -vtable_verify" emerge-${BOARD} chromeos-chrome
(cr) ~/trunk/src/scripts $ ./build_image --board=${BOARD} test && ./image_to_vm.sh --board=${BOARD} --test_image

~/trunk/src/scripts $ ./bin/cros_run_vm_test --no_graphics --board=x86-generic suite:smoke --results_dir_root /tmp/vm_tests_asan

All tests passed.

->current gardener to try and investigate

Note: In comment #8 $BOARD = x86-generic

Cc: x...@chromium.org
Issue 658774 has been merged into this issue.
Owner: x...@chromium.org
-> Current gardener

Cc: jdufault@chromium.org
Owner: jamescook@chromium.org
Still failing this week -> current gardener (me!)

Steven, for your attempt to repro in #8 -- where did you get those USE flags?

So I found this occasionally in /var/log/ui/ui.<foo> for failing test runs:

[4311:4312:1031/131541:ERROR:platform_thread_posix.cc(119)] pthread_create: Resource temporarily unavailable

[4311:4312:1031/131541:FATAL:child_thread_impl.cc(160)] Check failed: CreateWaitAndExitThread(base::TimeDelta::FromSeconds(60)). 
#0 0x00005672d785 __interceptor_backtrace
#1 0x00005eb34cca base::debug::StackTrace::StackTrace()
#2 0x00005eb88915 logging::LogMessage::~LogMessage()
#3 0x000068551633 content::(anonymous namespace)::SuicideOnChannelErrorFilter::OnChannelError()
#4 0x000061b29270 IPC::ChannelProxy::Context::OnChannelError()
#5 0x000061b5ec78 IPC::SyncChannel::SyncContext::OnChannelError()
#6 0x000061b2ac06 IPC::ChannelProxy::Context::OnSendMessage()
#7 0x000061b3112b _ZN4base8internal7InvokerINS0_9BindStateIMN3IPC12ChannelProxy7ContextEFvSt10unique_ptrINS3_7MessageESt14default_deleteIS7_EEEJ13scoped_refptrIS5_ENS0_13PassedWrapperISA_EEEEEFvvEE3RunEPNS0_13BindStateBaseE
#8 0x00005ed91a5e base::debug::TaskAnnotator::RunTask()
#9 0x00005ebae1a7 base::MessageLoop::RunTask()
#10 0x00005ebaf0ab base::MessageLoop::DeferOrRunPendingTask()
#11 0x00005ebb0381 base::MessageLoop::DoWork()
#12 0x00005ebbae78 base::MessagePumpLibevent::Run()
#13 0x00005ebad658 base::MessageLoop::RunHandler()
#14 0x00005ec49267 base::RunLoop::Run()
#15 0x00005ecce121 base::Thread::Run()
#16 0x00005ecce4d6 base::Thread::ThreadMain()
#17 0x00005ecbb0ab base::(anonymous namespace)::ThreadFunc()
#18 0x0000567c4080 __asan::AsanThread::ThreadStart()
#19 0x00005671a01f asan_thread_start()
#20 0x0000552b3585 <unknown>
#21 0x000054d9700e clone

There's some discussion of this in issue 552097. It would happen if pthread_create failed.

So far this doesn't seem to correlate exactly with failing tests, but it seems suspicious.

Talked to rockot@ about this. That stack is suspicious for memory exhaustion. (It could be some other system resource, but I'm not sure what).

I think what's happening in that SuicideOnChannelErrorFilter class is:
* Renderer detects that browser process died (got an error on the IPC channel)
* Renderer tries to spawn a thread to kill itself 60 seconds later
* Renderer can't spawn the thread because pthread_create() fails
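
"Resource temporarily unavailable" is the message text for EAGAIN, which pthread_create() returns when it cannot get the resources for a new thread. To check how often this shows up across test runs, something like the following sketch could scan the ui logs. The log prefix layout is inferred only from the two lines quoted above, and find_thread_failures is a made-up helper name.

```python
import re

# Assumed prefix shape, inferred from the quoted log lines:
# [pid:tid:MMDD/HHMMSS:LEVEL:file(line)] message
LOG_RE = re.compile(
    r"\[(?P<pid>\d+):(?P<tid>\d+):(?P<date>\d{4})/(?P<time>\d{6}):"
    r"(?P<level>[A-Z]+):(?P<file>[^()\]]+)\((?P<line>\d+)\)\]\s*(?P<msg>.*)"
)


def find_thread_failures(log_text):
    """Return (pid, level, message) for every pthread_create log entry."""
    hits = []
    for raw in log_text.splitlines():
        m = LOG_RE.match(raw.strip())
        if m and "pthread_create" in m.group("msg"):
            hits.append((int(m.group("pid")), m.group("level"), m.group("msg")))
    return hits


sample = (
    "[4311:4312:1031/131541:ERROR:platform_thread_posix.cc(119)] "
    "pthread_create: Resource temporarily unavailable\n"
    "[4311:4312:1031/131541:FATAL:child_thread_impl.cc(160)] "
    "Check failed: CreateWaitAndExitThread(base::TimeDelta::FromSeconds(60)). "
)
failures = find_thread_failures(sample)
print(failures)
```

Only the ERROR line matches here; the FATAL line is the consequence of the failed thread creation and does not mention pthread_create itself.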

Also, I can reproduce the problem in a local VM, using flags similar to #8, but only if I run the entire test suite. If I just run the first failing test (security_NetworkListeners) it passes.

Does anyone know how to get cros_run_vm_test to just run 2 (or N) tests instead of a whole suite? Or how to get memory information out of the VM?

James - is it possible that the VM has too little memory? We run it with 2 GB, but that may not be sufficient for the ASAN build. It's set here:
https://cs.corp.google.com/chromeos_public/src/scripts/lib/cros_vm_lib.sh?l=294

We increase it for moblab here: https://cs.corp.google.com/chromeos_public/src/scripts/lib/cros_vm_lib.sh?l=248

We may want to increase it for asan by using this function:
https://cs.corp.google.com/chromeos_public/src/third_party/autotest/files/client/cros/asan.py

Do you want to see if this works locally?
Giving the VM more memory didn't seem to help.

I changed it to 4 GB by changing https://cs.corp.google.com/chromeos_public/src/scripts/lib/cros_vm_lib.sh?l=294 to "-m 4G" just like moblab does.

The VM picked up the change. Here's meminfo.before from the first test:
MemTotal:        4148272 kB
MemFree:         2542696 kB
MemAvailable:    3227604 kB

However, security_NetworkListeners still fails (and is the first test to fail):
Unhandled DevtoolsTargetCrashException: Devtools target crashed

For the record, here are all the failing tests:
results-12-security_NetworkListeners/security_NetworkListeners               FAIL: Unhandled unicode: Unhandled DevtoolsTargetCrashException: Devtools target crashed
results-20-desktopui_ScreenLocker/desktopui_ScreenLocker                     FAIL: Unhandled unicode: Unhandled DevtoolsTargetCrashException: Devtools target crashed
results-23-security_SandboxedServices/security_SandboxedServices             FAIL: One or more processes failed sandboxing
results-25-login_OwnershipTaken/login_OwnershipTaken                         FAIL: Unhandled unicode: Unhandled DevtoolsTargetCrashException: Devtools target crashed
results-35-login_Cryptohome/login_Cryptohome                                 FAIL: Unhandled unicode: Unhandled DevtoolsTargetCrashException: Devtools target crashed
results-38-login_OwnershipNotRetaken/login_OwnershipNotRetaken               FAIL: Unhandled unicode: Unhandled DevtoolsTargetCrashException: Devtools target crashed
results-42-login_LogoutProcessCleanup/login_LogoutProcessCleanup             FAIL: Unhandled unicode: Unhandled DevtoolsTargetCrashException: Devtools target crashed

Maybe this isn't memory exhaustion after all. I see some discussion of other things that can cause pthread_create to fail:
http://unix.stackexchange.com/questions/253903/creating-threads-fails-with-resource-temporarily-unavailable-with-4-3-kernel

Other ideas?

Are we sure there's not a real memory leak here? Nothing in the asan logs?
Summary: x86-generic-tot-asan-informational failures ASAN unable to mmap / pthread_create() failure since Oct 13 (was: login_Cryptohome fails nearly constantly on x86-generic-tot-asan-informational)
Hrm, I assume this is the right place to look for ASAN logs: results-1-security_NetworkListeners/security_NetworkListeners/sysinfo/var/log_diff/asan

If so, yes there is an error there:
==3025==ERROR: AddressSanitizer failed to allocate 0xb09000 (11571200) bytes of FakeStack (error code: 12)
==3025==Process memory map follows:
<<<big memory map>>>
==3025==AddressSanitizer CHECK failed: /var/tmp/portage/sys-devel/llvm-3.9_pre265926-r13/work/llvm-3.9_pre265926/projects/compiler-rt/lib/sanitizer_common/sanitizer_common.cc:183 "((0 && "unable to mmap")) != (0)" (0x0, 0x0)
    #0 0x567ed4c6  (/opt/google/chrome/chrome+0x12484c6)
    #1 0x567f4f19  (/opt/google/chrome/chrome+0x124ff19)
    #2 0x567f512d  (/opt/google/chrome/chrome+0x125012d)
    #3 0x567fc241  (/opt/google/chrome/chrome+0x1257241)
    #4 0x56741801  (/opt/google/chrome/chrome+0x119c801)
    #5 0x567f1a86  (/opt/google/chrome/chrome+0x124ca86)
    #6 0x56741f18  (/opt/google/chrome/chrome+0x119cf18)
    #7 0x5ecddc79  (/opt/google/chrome/chrome+0x9738c79)
    #8 0x5ecdfe23  (/opt/google/chrome/chrome+0x973ae23)
    #9 0x567f1ebf  (/opt/google/chrome/chrome+0x124cebf)
    #10 0x56747e5e  (/opt/google/chrome/chrome+0x11a2e5e)
    #11 0x552e1584  (/lib/libpthread.so.0+0x6584)

So something (ASAN itself?) is attempting to allocate 11 MB of "FakeStack" and failing, which causes a CHECK inside ASAN.
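
For reference, the numbers in that ERROR line can be decoded as below. This is only a sketch: the regular expression matches the exact wording of the line quoted above and may not match other ASAN versions, and parse_asan_alloc_failure is a made-up helper name. Error code 12 is ENOMEM on Linux.

```python
import errno
import re

ASAN_ALLOC_RE = re.compile(
    r"==(?P<pid>\d+)==ERROR: AddressSanitizer failed to allocate "
    r"0x(?P<hex>[0-9a-fA-F]+) \((?P<dec>\d+)\) bytes of (?P<what>\w+) "
    r"\(error code: (?P<err>\d+)\)"
)


def parse_asan_alloc_failure(line):
    """Extract pid, region name, size, and errno name from the ERROR line."""
    m = ASAN_ALLOC_RE.search(line)
    if not m:
        return None
    size = int(m.group("hex"), 16)
    return {
        "pid": int(m.group("pid")),
        "what": m.group("what"),
        "bytes": size,
        "mib": size / (1024.0 * 1024.0),
        # Map the numeric error code to its symbolic name on this platform.
        "errno_name": errno.errorcode.get(int(m.group("err")), "unknown"),
    }


line = ("==3025==ERROR: AddressSanitizer failed to allocate 0xb09000 "
        "(11571200) bytes of FakeStack (error code: 12)")
info = parse_asan_alloc_failure(line)
print(info)
```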

I see that libpthread is the last thing on the stack, so maybe this is the pthread_create() call resulting in ASAN dying internally.

Full asan log attached.

I see the ASAN CHECK listed in issue 508949, but the sheriffs there just disabled the failing tests.

https://github.com/google/sanitizers/issues/165 mentions this CHECK error message.

Does anyone know how to symbolize the above stack?

Attachment: asan.3025 (20.6 KB)
The good news is I got a stack. The bad news is that I'm not sure what to make of it.

jamescook@rubella2:/w/chrome/src (configview)$ tools/valgrind/asan/asan_symbolize.py < ~/asan_stack.txt 
    #0 0x567ed4c6 in __asan::AsanCheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) crtstuff.c:?
    #1 0x567f4f19 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) ??:?
    #2 0x567f512d in __sanitizer::ReportMmapFailureAndDie(unsigned long, char const*, char const*, int, bool) crtstuff.c:?
    #3 0x567fc241 in __sanitizer::MmapOrDie(unsigned long, char const*, bool) crtstuff.c:?
    #4 0x56741801 in __asan::FakeStack::Create(unsigned long) crtstuff.c:?
    #5 0x567f1a86 in __asan::AsanThread::AsyncSignalSafeLazyInitFakeStack() crtstuff.c:?
    #6 0x56741f18 in __asan_stack_malloc_0 ??:?
    #7 0x5ecddc79 in ?? ../../../chromeos-cache/distfiles/target/chrome-src/src/base/threading/platform_thread_linux.cc:80:0
    #8 0x5ecdfe23 in SetCurrentThreadPriority ../../../chromeos-cache/distfiles/target/chrome-src/src/base/threading/platform_thread_posix.cc:249:7
    #9 0x5ecdfe23 in ThreadFunc ../../../chromeos-cache/distfiles/target/chrome-src/src/base/threading/platform_thread_posix.cc:63:0
    #10 0x567f1ebf in __asan::AsanThread::ThreadStart(unsigned long, __sanitizer::atomic_uintptr_t*) crtstuff.c:?
    #11 0x56747e5e in asan_thread_start(void*) crtstuff.c:?
    #12 0x552e1584 in __pthread_get_minstack ??:?

The stack looks correct to me (apart from #12). This is the code starting at line 80 in platform_thread_linux.cc:

bool SetCurrentThreadPriorityForPlatform(ThreadPriority priority) {
#if !defined(OS_NACL)
  FilePath cpuset_directory = ThreadPriorityToCpusetDirectory(priority);
  ...
}

__asan_stack_malloc() sounds like it's just allocating some space on the stack. Hmm.


Recipe: I manually edited the paths in the above stack trace to look like this:
    #0 0x567ed4c6  (/x/chromeos/chroot/var/cache/chromeos-chrome/chrome-src/src/out_x86-generic/Release/chrome+0x12484c6)
    #1 0x567f4f19  (/x/chromeos/chroot/var/cache/chromeos-chrome/chrome-src/src/out_x86-generic/Release/chrome+0x124ff19)
    #2 0x567f512d  (/x/chromeos/chroot/var/cache/chromeos-chrome/chrome-src/src/out_x86-generic/Release/chrome+0x125012d)
    #3 0x567fc241  (/x/chromeos/chroot/var/cache/chromeos-chrome/chrome-src/src/out_x86-generic/Release/chrome+0x1257241)
    #4 0x56741801  (/x/chromeos/chroot/var/cache/chromeos-chrome/chrome-src/src/out_x86-generic/Release/chrome+0x119c801)
    #5 0x567f1a86  (/x/chromeos/chroot/var/cache/chromeos-chrome/chrome-src/src/out_x86-generic/Release/chrome+0x124ca86)
    #6 0x56741f18  (/x/chromeos/chroot/var/cache/chromeos-chrome/chrome-src/src/out_x86-generic/Release/chrome+0x119cf18)
    #7 0x5ecddc79  (/x/chromeos/chroot/var/cache/chromeos-chrome/chrome-src/src/out_x86-generic/Release/chrome+0x9738c79)
    #8 0x5ecdfe23  (/x/chromeos/chroot/var/cache/chromeos-chrome/chrome-src/src/out_x86-generic/Release/chrome+0x973ae23)
    #9 0x567f1ebf  (/x/chromeos/chroot/var/cache/chromeos-chrome/chrome-src/src/out_x86-generic/Release/chrome+0x124cebf)
    #10 0x56747e5e  (/x/chromeos/chroot/var/cache/chromeos-chrome/chrome-src/src/out_x86-generic/Release/chrome+0x11a2e5e)
    #11 0x552e1584  (/x/chromeos/chroot/build/x86-generic/lib/libpthread.so.0+0x6584)

and ran asan_symbolize.py from my own chromium checkout.
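
The manual path editing in this recipe could be scripted. A minimal sketch follows, assuming frames look exactly like the ones in the asan_log above; the PATH_MAP entries are the local paths from this comment and will differ on other machines, and rewrite_frame is a made-up helper name.

```python
import re

# On-device path -> local path with symbols. These values are specific to
# the checkout described above; adjust them for your own chroot.
PATH_MAP = {
    "/opt/google/chrome/chrome":
        "/x/chromeos/chroot/var/cache/chromeos-chrome/chrome-src/src/"
        "out_x86-generic/Release/chrome",
    "/lib/libpthread.so.0":
        "/x/chromeos/chroot/build/x86-generic/lib/libpthread.so.0",
}

# Matches frames such as: "    #11 0x552e1584  (/lib/libpthread.so.0+0x6584)"
FRAME_RE = re.compile(
    r"^(\s*#\d+ 0x[0-9a-f]+\s+\()([^+()]+)(\+0x[0-9a-f]+\))\s*$"
)


def rewrite_frame(line, path_map=PATH_MAP):
    """Swap the module path in one unsymbolized frame; pass others through."""
    m = FRAME_RE.match(line)
    if not m:
        return line
    prefix, path, suffix = m.groups()
    return prefix + path_map.get(path, path) + suffix


frame = "    #11 0x552e1584  (/lib/libpthread.so.0+0x6584)"
print(rewrite_frame(frame))
```

The rewritten frames can then be saved to a file and fed to tools/valgrind/asan/asan_symbolize.py on stdin, as in the command shown earlier.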

Cc: kcc@chromium.org euge...@chromium.org
It's trying to allocate "fake stack" which is actually heap. It's used to detect stack-use-after-free.

It's out of 32-bit address space. The memory dump attached in #19 shows just 43MB of free address space in total, and the largest hole is smaller than 8MB, so the roughly 11MB FakeStack request cannot be satisfied.
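
The free-space figures can be derived from the "Process memory map" section of an asan_log. The sketch below assumes each mapping line starts with a 0xSTART-0xEND range, as in the excerpt in the original report, and it treats the whole 4 GiB range as usable, which overstates what user space really gets on a 32-bit kernel. free_holes and summarize are made-up helper names, and the toy map is synthetic, not taken from the attachment.

```python
import re

RANGE_RE = re.compile(r"^\s*0x([0-9a-fA-F]+)-0x([0-9a-fA-F]+)")


def free_holes(map_text, space_end=1 << 32):
    """Return (start, end) gaps between mappings, up to space_end."""
    ranges = []
    for line in map_text.splitlines():
        m = RANGE_RE.match(line)
        if m:
            ranges.append((int(m.group(1), 16), int(m.group(2), 16)))
    ranges.sort()
    holes, cursor = [], 0
    for start, end in ranges:
        if start > cursor:
            holes.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < space_end:
        holes.append((cursor, space_end))
    return holes


def summarize(map_text, space_end=1 << 32):
    """Return (total free bytes, largest single hole in bytes)."""
    sizes = [end - start for start, end in free_holes(map_text, space_end)]
    largest = max(sizes) if sizes else 0
    return sum(sizes), largest


# Synthetic example with a tiny 0x5000-byte address space.
toy_map = "\t0x00000000-0x00001000\ta\n\t0x00003000-0x00004000\tb\n"
total_free, largest_hole = summarize(toy_map, space_end=0x5000)
print(total_free, largest_hole)
```

An mmap of N contiguous bytes can only succeed if the largest hole is at least N, which is why total free space alone is not the interesting number.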
Now that we know the cause of the problem, I think we should turn off this bot. All shipping Intel Chromebooks are 64-bit and we have a 64-bit ASAN bot. See issue 661347 for discussion.

Summary: x86-generic-tot-asan-informational failures ASAN unable to mmap / pthread_create() failure (was: x86-generic-tot-asan-informational failures ASAN unable to mmap / pthread_create() failure since Oct 13)
Retitled since this has been broken at least as far back as August 27, which is over 2 months.

https://build.chromium.org/p/chromiumos.chromium/builders/x86-generic-tot-asan-informational/builds/11790

It's hard to know the exact date since the failures are somewhat flaky and there's no overview view of the builds that far back.

Also, I don't think this is Chrome itself consuming too much memory. I just checked the 64-bit Intel ASAN bot and it's running a 2 GB VM. We don't exhaust memory there when running the same tests.

Typical meminfo.after on the 64-bit bot:
MemTotal:        2046644 kB
MemFree:         1333116 kB
MemAvailable:    1538384 kB

Comment 25 by ihf@chromium.org, Nov 2 2016
Let me back off a little. There seems to be a slow memory leak on 32-bit. There are 50-some tests to run. We pass the first 40-something and then run out of memory. Before turning down the builder, we could just reboot the VM between tests. But obviously this will hide the memory or address space leak.
I get the same failures running locally with a single test.

./bin/cros_run_vm_test --no_graphics --board=x86-generic --results_dir_root=JAMES_TOT "security_NetworkListeners"

Fails >50% of the time, even in a 4GB VM.

eugenis noted in an email thread that we are using 860MB of address space for stacks, 20MB per thread. We're not sure where that is coming from, though. Most of the thread creation code uses 0 for requested stack size, which means OS / pthread default.

https://cs.chromium.org/chromium/src/base/threading/platform_thread_linux.cc?sq=package:chromium&dr=C&rcl=1478101093&l=157

OS default on 64-bit Chrome OS is 8MB (output of ulimit -s). I'm not sure what the default is on 32-bit.
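
One way to check the default on the 32-bit image would be to read the stack rlimit, which is what `ulimit -s` reports and what glibc normally uses as the default thread stack size when the limit is not unlimited. This is a sketch for a Unix system with Python available; default_stack_limit_kb and implied_thread_count are made-up helper names, and the thread count is only the arithmetic implied by the 860MB and 20MB figures above.

```python
import resource


def default_stack_limit_kb():
    """Soft RLIMIT_STACK in kB, or None if the limit is unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    if soft == resource.RLIM_INFINITY:
        return None
    return soft // 1024


def implied_thread_count(total_stack_mb, per_thread_mb):
    """How many threads a given total stack footprint corresponds to."""
    return total_stack_mb // per_thread_mb


print("stack soft limit (kB):", default_stack_limit_kb())
print("threads implied by 860MB at 20MB each:", implied_thread_count(860, 20))
```

If the limit turns out to be 8MB on 32-bit as well, the 20MB per thread would have to come from somewhere other than the default stack alone.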

I've tried configuring ASAN to use less memory, but I still get failures running locally. See https://codereview.chromium.org/2471333003/
Comment 28 by ihf@chromium.org, Nov 3 2016
Wow, these stack sizes are huge. I had to reduce stack sizes for Flash to < 1MB at some point.
Status: WontFix
We turned down this bot, per discussion on issue 661347. Thanks for the help, everyone!
