Issue 833654

Issue metadata

Status: Duplicate
Owner:
Closed: Sep 13
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




Sampling heap profiler locks on TLS destruction

Project Member Reported by alph@chromium.org, Apr 16 2018

Issue description

Here's the call stack:

#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f91b7dbeb95 in __GI___pthread_mutex_lock (mutex=0x5629465ffc90 <base::SamplingHeapProfiler::GetInstance()::instance+8>)
    at ../nptl/pthread_mutex_lock.c:80
#2  0x00005629420b3576 in base::internal::LockImpl::Lock() ()
#3  0x00005629420a2b33 in base::SamplingHeapProfiler::DoRecordAlloc(unsigned long, unsigned long, void*, unsigned int) ()
#4  0x00005629420a3503 in base::(anonymous namespace)::AllocFn(base::allocator::AllocatorDispatch const*, unsigned long, void*) ()
#5  0x00005629420fa58e in operator new(unsigned long, std::nothrow_t const&) ()
#6  0x00005629420cdf25 in (anonymous namespace)::ConstructTlsVector() ()
#7  0x00005629420ce082 in base::ThreadLocalStorage::Slot::Set(void*) ()
#8  0x00005629420a30d9 in base::SamplingHeapProfiler::DoRecordFree(void*) ()
#9  0x00005629420a35eb in base::(anonymous namespace)::FreeFn(base::allocator::AllocatorDispatch const*, void*, void*) ()
#10 0x00007f91b7fe4957 in _dl_deallocate_tls () from /lib64/ld-linux-x86-64.so.2
#11 0x00007f91b7dbb736 in __free_stacks (limit=limit@entry=41943040) at allocatestack.c:284
#12 0x00007f91b7dbb88a in queue_stack (stack=0x7f918e715700) at allocatestack.c:312
#13 __deallocate_stack (pd=pd@entry=0x7f918e715700) at allocatestack.c:763
#14 0x00007f91b7dbc3a9 in __free_tcb (pd=pd@entry=0x7f918e715700) at pthread_create.c:243
#15 0x00007f91b7dbc714 in start_thread (arg=0x7f918e715700) at pthread_create.c:453
#16 0x00007f91b1f2ea8f in clone () from /lib/x86_64-linux-gnu/libc.so.6

Looks like we have one of the following scenarios:
1. The OnThreadExitInternal has already been called twice, so we have kUninitialized in the TLS slot.
2. We never used TLS on the thread and the first use falls into SamplingHeapProfiler::DoRecordFree that is called from _dl_deallocate_tls.
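
Scenario 1 depends on how POSIX runs pthread key destructors: the key's value is reset to NULL before each destructor call, and a destructor that stores a value again triggers another pass. The following is a minimal standalone sketch of that behavior, not Chromium code. The names g_key, OnThreadExit, CountDestructorPasses and ValueWasNullInsideDestructor are hypothetical and exist only for illustration.

```cpp
#include <pthread.h>
#include <cassert>

namespace {

pthread_key_t g_key;
int g_destructor_calls = 0;               // written by the worker, read after join
bool g_value_null_in_destructor = false;  // what the destructor observed for the key

// Destructor registered with pthread_key_create. POSIX sets the key's value
// to NULL before calling it, so pthread_getspecific() returns nullptr here.
void OnThreadExit(void* value) {
  ++g_destructor_calls;
  g_value_null_in_destructor = (pthread_getspecific(g_key) == nullptr);
  if (g_destructor_calls == 1) {
    // Storing a non-NULL value again forces one more destructor pass.
    pthread_setspecific(g_key, value);
  }
}

void* ThreadMain(void*) {
  static int dummy;
  pthread_setspecific(g_key, &dummy);  // any non-NULL value arms the destructor
  return nullptr;
}

}  // namespace

// Runs one short-lived thread and returns how often the destructor ran.
int CountDestructorPasses() {
  g_destructor_calls = 0;
  int rv = pthread_key_create(&g_key, &OnThreadExit);
  assert(rv == 0);
  pthread_t thread;
  rv = pthread_create(&thread, nullptr, &ThreadMain, nullptr);
  assert(rv == 0);
  pthread_join(thread, nullptr);  // destructors have finished once join returns
  pthread_key_delete(g_key);
  return g_destructor_calls;
}

bool ValueWasNullInsideDestructor() { return g_value_null_in_destructor; }
```

Calling CountDestructorPasses() should report two passes here, and in both of them the key reads as NULL inside the destructor. Code that treats a NULL key value as "never initialized" therefore cannot tell a fresh thread from one that is already being torn down.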

 

Comment 1 by alph@chromium.org, Apr 16 2018

Cc: primiano@chromium.org erikc...@chromium.org
I checked the glibc sources: this (i.e. a call to __free_tcb) should happen only for detached threads. So this is coming from some call to CreateNonJoinableThread or Thread::StartWithOptions(joinable=false).

The bad thing is that this happens while the TLS storage here is not yet initialized. But I am missing something here: don't we have a call to initialize the TLS slot before we turn on the shim hooks?
The other option is that this is a non-joinable thread that is started during late shutdown, when the TLS vector has been torn down.

How did you hit this?
On second thought, isn't this a dupe of Issue 825218?

Comment 4 by alph@chromium.org, Apr 17 2018

Thanks for pointing that out. It does indeed seem to be a duplicate. I was under the impression that the issue was fixed with the introduction of ThreadLocalStorage::HasBeenDestroyed, but that seems not to be the case.

I hit it while running chrome with the --sampling-heap-profiler flag.
We do indeed create a TLS slot early, and the slot is there.
The problem is that we create (if the thread never accessed TLS) or recreate (if it has already been deleted) the TlsVector for a thread during thread destruction.
There are two separate problems here:

1) Attempting to access Chrome TLS after teardown of Chrome TLS during thread destruction is not supported. We have a DCHECK in Slot::Get() to check this, but it doesn't catch two cases:
  * TLS was never initialized
  * We've made 2 passes of pthread key destruction, and thus the pthread key looks uninitialized.

In Issue 825218, I'm considering moving away from Chrome's implementation of TLS for OOP HP.
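
One way to keep the hooks from allocating is a re-entrancy guard held in compiler-provided TLS rather than in a lazily allocated TLS vector. The following is a hedged sketch of that idea only. It is not the change made for Issue 825218, and OnAlloc, RecordSample, t_inside_hook and the counters are hypothetical names. Whether a thread_local access can itself allocate depends on the platform and TLS model, so the sketch assumes a trivially initialized variable in a statically linked binary.

```cpp
#include <cstddef>

namespace {

// Trivially initialized and trivially destructible, so the compiler does not
// need a lazy constructor or a thread-exit destructor for it.
thread_local bool t_inside_hook = false;

// Counters for the demonstration only; not thread-safe.
struct HookStats {
  int recorded = 0;
  int skipped_reentrant = 0;
};
HookStats g_stats;

}  // namespace

void OnAlloc(std::size_t size);

// Stand-in for profiler bookkeeping. Real bookkeeping may allocate, which
// would call back into the allocation hook; that is simulated here.
void RecordSample(std::size_t size) {
  (void)size;
  ++g_stats.recorded;
  OnAlloc(sizeof(void*));  // simulated nested allocation
}

// Hypothetical allocation hook: nested calls on the same thread return early
// instead of re-entering the bookkeeping (and any lock it takes).
void OnAlloc(std::size_t size) {
  if (t_inside_hook) {
    ++g_stats.skipped_reentrant;
    return;
  }
  t_inside_hook = true;
  RecordSample(size);
  t_inside_hook = false;
}

int RecordedCount() { return g_stats.recorded; }
int SkippedCount() { return g_stats.skipped_reentrant; }
```

A single OnAlloc(64) call records one sample and skips the nested call, so the bookkeeping is entered only once per outermost allocation on a thread.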
Mergedinto: 881352
Status: Duplicate (was: Assigned)
