
Issue 764522


Issue metadata

Status: Fixed
Owner:
Closed: Sep 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 3
Type: Bug
Hotlist-MemoryInfra




backtrace() fails in OOP memlog on Linux official GPU process

Project Member Reported by brettw@chromium.org, Sep 12 2017

Issue description

Heap profiling, with either the older --enable-heap-profiling=native mode or the newer out-of-process --memlog mode, sometimes causes painting problems on Linux.

This manifests as a completely transparent window in which nothing paints, while the browser seems to function properly underneath.

It does not happen 100% of the time and seems to vary by version. On my system, a debug build works fine, but an official build is broken. In my experience a build is either completely OK or completely broken.

I was able to track this down to the Linux GPU process being hung. It hangs while taking a stack trace:

#0  pthread_once () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:94
#1  0x00007f348cf2fa0c in __GI___backtrace (array=<optimized out>, size=62)
    at ../sysdeps/x86_64/backtrace.c:103
#2  0x0000004781688f27 in base::debug::StackTrace::StackTrace(unsigned long) ()
#3  0x00000d54b75e7e40 in ?? ()
#4  0x0000004782973469 in profiling::AllocatorShimLogAlloc(profiling::AllocatorType, void*, unsigned long, char const*) ()
#5  0x0000100000000000 in ?? ()
#6  0x00007fffb37aabf0 in ?? ()
#7  0x00007fffb37aabd0 in ?? ()
#8  0x00000000b77a55b0 in ?? ()
#9  0x00000d54b7486050 in ?? ()
#10 0x00000d54b770e000 in ?? ()
#11 0x00007f3483ca1b2a in ?? ()
   from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
#12 0x00007f3483ca1b66 in ?? ()
   from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
#13 0x00007f3483ca3cd1 in ?? ()

I suspect the NVIDIA GPU driver does something to the stack that causes the Linux backtrace() call to hang.

In the meantime, I'm going to disable --memlog for the GPU process on Linux.
 

Comment 1 by brettw@chromium.org, Sep 12 2017

Since this seems to be related to the optimization level on my system, it could be the result of some optimization in our GPU code, in which case the NVIDIA driver on the stack is a red herring.
There's a comment in stack_trace_posix.cc that seems to imply we need to warm up backtrace(); otherwise it will hang with a very similar-looking trace.

https://cs.chromium.org/chromium/src/base/debug/stack_trace_posix.cc?type=cs&q=StackTrace::StackTrace&sq=package:chromium&l=440
"""
  // Warm up stack trace infrastructure. It turns out that on the first
  // call glibc initializes some internal data structures using pthread_once,
  // and even backtrace() can call malloc(), leading to hangs.
  //
  // Example stack trace snippet (with tcmalloc):
  //
  // #8  0x0000000000a173b5 in tc_malloc
  //             at ./third_party/tcmalloc/chromium/src/debugallocation.cc:1161
  // #9  0x00007ffff7de7900 in _dl_map_object_deps at dl-deps.c:517
  // #10 0x00007ffff7ded8a9 in dl_open_worker at dl-open.c:262
  // #11 0x00007ffff7de9176 in _dl_catch_error at dl-error.c:178
  // #12 0x00007ffff7ded31a in _dl_open (file=0x7ffff625e298 "libgcc_s.so.1")
  //             at dl-open.c:639
  // #13 0x00007ffff6215602 in do_dlopen at dl-libc.c:89
  // #14 0x00007ffff7de9176 in _dl_catch_error at dl-error.c:178
  // #15 0x00007ffff62156c4 in dlerror_run at dl-libc.c:48
  // #16 __GI___libc_dlopen_mode at dl-libc.c:165
  // #17 0x00007ffff61ef8f5 in init
  //             at ../sysdeps/x86_64/../ia64/backtrace.c:53
  // #18 0x00007ffff6aad400 in pthread_once
  //             at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:104
  // #19 0x00007ffff61efa14 in __GI___backtrace
  //             at ../sysdeps/x86_64/../ia64/backtrace.c:104
  // #20 0x0000000000752a54 in base::debug::StackTrace::StackTrace
  //             at base/debug/stack_trace_posix.cc:175
  // #21 0x00000000007a4ae5 in
  //             base::(anonymous namespace)::StackDumpSignalHandler
  //             at base/process_util_posix.cc:172
  // #22 <signal handler called>
"""
Cc: primiano@chromium.org
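
The warm-up that the quoted comment describes can be sketched roughly as follows. This is a minimal illustration that assumes glibc's backtrace() from <execinfo.h>; WarmUpBacktrace and CaptureStack are hypothetical names chosen here, not the functions in stack_trace_posix.cc.

```cpp
#include <execinfo.h>  // glibc: backtrace()

#include <cassert>

// Hypothetical helper, not the Chromium implementation. Calling backtrace()
// once up front lets glibc run its one-time setup (which may load
// libgcc_s.so.1 and call malloc) at a point where that is harmless.
bool WarmUpBacktrace() {
  void* frames[2];
  // backtrace() returns the number of frames it stored; at least the
  // current frame is expected when unwinding works.
  return backtrace(frames, 2) > 0;
}

// Hypothetical capture routine intended to run later, for example from a
// signal handler or an allocation hook, after WarmUpBacktrace() has run.
int CaptureStack(void** frames, int max_depth) {
  assert(max_depth > 0);
  return backtrace(frames, max_depth);
}
```

A process would call WarmUpBacktrace() once during startup, before installing anything that may call CaptureStack() from a context where malloc is unsafe.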

Comment 4 by brettw@chromium.org, Sep 12 2017

Owner: brettw@chromium.org
Status: Started (was: Available)

Comment 5 by brettw@chromium.org, Sep 12 2017

Summary: backtrace() fails in OOP memlog on Linux official GPU process (was: nvidia driver causes backtrace() to fail, breaking memory logging)

Comment 6 by bugdroid1@chromium.org, Sep 13 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/193e9210f91a8fc07f26d405cb1be69183060e9d

commit 193e9210f91a8fc07f26d405cb1be69183060e9d
Author: Brett Wilson <brettw@chromium.org>
Date: Wed Sep 13 16:09:10 2017

Initialize stack dumping in OOP memlog.

Failing to explicitly initialize stack dumping before adding a malloc hook that
uses it can cause problems. In particular, on Linux calling backtrace() can
load a library which calls malloc, meaning the hook will be recursive.

Bug: 764522
Change-Id: If602902aa4238903541ca52f765927369dbf6e71
Reviewed-on: https://chromium-review.googlesource.com/664301
Reviewed-by: Erik Chen <erikchen@chromium.org>
Commit-Queue: Erik Chen <erikchen@chromium.org>
Cr-Commit-Position: refs/heads/master@{#501644}
[modify] https://crrev.com/193e9210f91a8fc07f26d405cb1be69183060e9d/chrome/common/profiling/memlog_allocator_shim.cc
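
The ordering described in the commit message (initialize stack dumping first, then add the hook) might look something like the sketch below. Every name in it is hypothetical and it does not reproduce memlog_allocator_shim.cc. It assumes glibc's backtrace(), and the per-thread reentrancy guard is an extra precaution added for illustration; the commit message itself only describes the initialization step.

```cpp
#include <execinfo.h>  // glibc: backtrace()

#include <cassert>
#include <cstddef>

namespace sketch {  // All names in this namespace are hypothetical.

constexpr int kMaxFrames = 62;  // Same size as in the hung trace above.

bool g_hook_enabled = false;
thread_local bool t_in_hook = false;  // True while this thread is logging.
int g_logged = 0;                     // Allocations recorded with a stack.
int g_skipped_reentrant = 0;          // Nested calls that were dropped.

// Runs glibc's one-time backtrace() setup before any hook depends on it.
void InitStackDumping() {
  void* frames[2];
  backtrace(frames, 2);
}

// Warm up first, then enable the hook. In the reverse order, the hook could
// be entered from inside backtrace()'s own first-call initialization.
void EnableAllocHook() {
  InitStackDumping();
  g_hook_enabled = true;
}

// Stand-in for the function an allocator shim would call per allocation.
// `nested` simulates an allocation made while the stack is being captured.
void OnAlloc(void* ptr, size_t size, void (*nested)() = nullptr) {
  (void)ptr;
  (void)size;
  if (!g_hook_enabled) return;
  if (t_in_hook) {  // Already logging on this thread: do not recurse.
    ++g_skipped_reentrant;
    return;
  }
  t_in_hook = true;
  void* frames[kMaxFrames];
  int depth = backtrace(frames, kMaxFrames);
  assert(depth >= 0 && depth <= kMaxFrames);
  (void)depth;
  if (nested) nested();
  ++g_logged;
  t_in_hook = false;
}

}  // namespace sketch
```

Calling sketch::EnableAllocHook() once at startup and then routing allocations through sketch::OnAlloc() keeps the first backtrace() call outside the hook. The guard covers any allocation that still happens during capture.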

Comment 7 by brettw@chromium.org, Sep 13 2017

Status: Fixed (was: Started)
