Sampling heap profiler locks on TLS destruction
Issue description
Here's the call stack:
#0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007f91b7dbeb95 in __GI___pthread_mutex_lock (mutex=0x5629465ffc90 <base::SamplingHeapProfiler::GetInstance()::instance+8>)
at ../nptl/pthread_mutex_lock.c:80
#2 0x00005629420b3576 in base::internal::LockImpl::Lock() ()
#3 0x00005629420a2b33 in base::SamplingHeapProfiler::DoRecordAlloc(unsigned long, unsigned long, void*, unsigned int) ()
#4 0x00005629420a3503 in base::(anonymous namespace)::AllocFn(base::allocator::AllocatorDispatch const*, unsigned long, void*) ()
#5 0x00005629420fa58e in operator new(unsigned long, std::nothrow_t const&) ()
#6 0x00005629420cdf25 in (anonymous namespace)::ConstructTlsVector() ()
#7 0x00005629420ce082 in base::ThreadLocalStorage::Slot::Set(void*) ()
#8 0x00005629420a30d9 in base::SamplingHeapProfiler::DoRecordFree(void*) ()
#9 0x00005629420a35eb in base::(anonymous namespace)::FreeFn(base::allocator::AllocatorDispatch const*, void*, void*) ()
#10 0x00007f91b7fe4957 in _dl_deallocate_tls () from /lib64/ld-linux-x86-64.so.2
#11 0x00007f91b7dbb736 in __free_stacks (limit=limit@entry=41943040) at allocatestack.c:284
#12 0x00007f91b7dbb88a in queue_stack (stack=0x7f918e715700) at allocatestack.c:312
#13 __deallocate_stack (pd=pd@entry=0x7f918e715700) at allocatestack.c:763
#14 0x00007f91b7dbc3a9 in __free_tcb (pd=pd@entry=0x7f918e715700) at pthread_create.c:243
#15 0x00007f91b7dbc714 in start_thread (arg=0x7f918e715700) at pthread_create.c:453
#16 0x00007f91b1f2ea8f in clone () from /lib/x86_64-linux-gnu/libc.so.6
Looks like we have one of the following scenarios:
1. OnThreadExitInternal has already been called twice, so the TLS slot holds kUninitialized.
2. TLS was never used on the thread, and the first use happens in SamplingHeapProfiler::DoRecordFree, which is called from _dl_deallocate_tls.
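Scenario 2 can be illustrated with a minimal sketch. It is not Chromium's code: HookedAlloc, HookedFree, SlotSet and the namespace are hypothetical stand-ins for the shim hooks and ThreadLocalStorage::Slot::Set, assuming the per-thread vector is created lazily on first use. It only shows that when a thread's first TLS access happens inside the free hook, that access allocates and therefore re-enters the alloc hook.

```cpp
#include <pthread.h>

#include <cassert>
#include <cstddef>
#include <cstdlib>

namespace lazy_tls_sketch {

pthread_key_t g_key;
int g_alloc_hook_calls = 0;  // Written by the worker, read after join.

// Hypothetical stand-in for the allocator shim's alloc hook. A real hook
// would call into the profiler, which may take the profiler lock.
void* HookedAlloc(std::size_t size) {
  ++g_alloc_hook_calls;
  return std::malloc(size);
}

// pthread key destructor: releases the per-thread vector at thread exit.
void FreeTlsVector(void* vector) { std::free(vector); }

// Hypothetical stand-in for Slot::Set: the per-thread vector is created
// lazily, so the first call on a thread allocates through the hook.
void SlotSet(void* value) {
  void** vector = static_cast<void**>(pthread_getspecific(g_key));
  if (vector == nullptr) {
    vector = static_cast<void**>(HookedAlloc(sizeof(void*)));
    assert(vector != nullptr);
    pthread_setspecific(g_key, vector);
  }
  vector[0] = value;
}

// Hypothetical stand-in for the free hook: it marks "inside the hook" in TLS.
void HookedFree(void* address) {
  SlotSet(&g_key);  // First TLS use on this thread: allocates.
  std::free(address);
  SlotSet(nullptr);
}

void* ThreadMain(void*) {
  // The thread never touches TLS before its first free.
  HookedFree(std::malloc(16));
  return nullptr;
}

// Returns how many times the alloc hook ran on a thread whose first TLS
// access happened inside the free hook.
int CountAllocHookCallsOnFirstTlsUse() {
  g_alloc_hook_calls = 0;
  int rc = pthread_key_create(&g_key, FreeTlsVector);
  assert(rc == 0);
  pthread_t thread;
  rc = pthread_create(&thread, nullptr, ThreadMain, nullptr);
  assert(rc == 0);
  (void)rc;
  pthread_join(thread, nullptr);
  pthread_key_delete(g_key);
  return g_alloc_hook_calls;
}

}  // namespace lazy_tls_sketch
```

Calling lazy_tls_sketch::CountAllocHookCallsOnFirstTlsUse() (link with -pthread) reports one alloc-hook call that was triggered from inside the free hook, which is the same shape as frames #5 to #8 in the stack above.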
,
Apr 17 2018
Checked the glibc sources: this (i.e. a call to __free_tcb) should happen only for detached threads. So this is coming from some call to CreateNonJoinableThread or Thread::StartWithOptions(joinable=false). The bad thing is that this happens while the TLS storage here is not yet initialized. But I am missing something here: don't we have a call to initialize the TLS slot before we turn on the shim hooks? The other option is that this is a non-joinable thread that is started during late shutdown, when the TLS vector has been torn down. How did you hit this?
,
Apr 17 2018
On second thought, isn't this a dupe of Issue 825218?
,
Apr 17 2018
Thanks for pointing that out. It does indeed seem to be a duplicate. I was under the impression that the issue was fixed with the introduction of ThreadLocalStorage::HasBeenDestroyed, but that seems not to be the case. I hit it while running chrome with the --sampling-heap-profiler flag. We do indeed create a TLS slot early, and the slot is there. The problem is that during thread destruction we create the TlsVector for a thread (if the thread never accessed TLS) or recreate it (if it has already been deleted).
,
Apr 17 2018
There are two separate problems here:
1) Attempting to access Chrome TLS after teardown of Chrome TLS during thread destruction is not supported. We have a DCHECK in Slot::Get() to check this, but it doesn't catch two cases:
* TLS was never initialized.
* We've made 2 passes of pthread key destruction, and thus the pthread key looks uninitialized.
In Issue 825218, I'm considering moving away from Chrome's implementation of TLS for OOP HP.
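The "2 passes" case can be reproduced with plain pthreads. The sketch below is not Chrome's TLS implementation; kInUse, kDestroyed and OnThreadExit are hypothetical names, and it assumes a scheme where the first destructor pass re-sets the key to a "destroyed" sentinel. It relies only on documented POSIX behavior: the key is set to NULL before its destructor runs, and the destructors run again if a key is non-NULL afterwards.

```cpp
#include <pthread.h>

#include <atomic>
#include <cassert>

namespace tls_passes_sketch {

pthread_key_t g_key;
std::atomic<int> g_destructor_passes{0};

// Hypothetical sentinel values stored directly in the pthread key.
void* const kInUse = reinterpret_cast<void*>(1);
void* const kDestroyed = reinterpret_cast<void*>(2);

// Key destructor. POSIX has already reset the key to NULL before this runs.
void OnThreadExit(void* value) {
  ++g_destructor_passes;
  if (value == kInUse) {
    // First pass: re-arm the key with a sentinel so later readers on this
    // thread can tell that teardown has started. A non-NULL value makes
    // the implementation call this destructor again.
    pthread_setspecific(g_key, kDestroyed);
  }
  // Second pass (value == kDestroyed): the key is left as NULL, which is
  // indistinguishable from a thread that never initialized it.
}

void* ThreadMain(void*) {
  pthread_setspecific(g_key, kInUse);
  return nullptr;
}

// Runs one thread and returns how many times the key destructor was called.
int CountDestructorPasses() {
  g_destructor_passes = 0;
  int rc = pthread_key_create(&g_key, OnThreadExit);
  assert(rc == 0);
  pthread_t thread;
  rc = pthread_create(&thread, nullptr, ThreadMain, nullptr);
  assert(rc == 0);
  (void)rc;
  pthread_join(thread, nullptr);
  pthread_key_delete(g_key);
  return g_destructor_passes.load();
}

}  // namespace tls_passes_sketch
```

After the second pass, pthread_getspecific would return NULL for the remainder of thread teardown, so any code that runs later (such as a free hook called while the thread's stack and TLS block are released) cannot tell this state apart from "never initialized" by looking at the key alone.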
,
Sep 13
Comment 1 by alph@chromium.org
, Apr 16 2018