New issue
Advanced search Search tips

Issue 772593 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Nov 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 3
Type: Bug



Sign in to add a comment

Browser crash, race condition in tracing_controller_impl.cc

Project Member Reported by erikc...@chromium.org, Oct 7 2017

Issue description

I built from ToT: 32ea531671b06e5005c6b8ffe33085f6a95a0e76, with a commit reverted: 1ca0e5fc031b33fc99f8e78bd367986398b06fb3

gn args:
"""
 11 is_component_build = false                                                      
 12 is_debug = false                                                                
 13 symbol_level = 1                                                                
 14 use_goma = true     
"""

I ran with --memlog=all, and used chrome://tracing to take a memory-infra trace. That worked. I went to take another trace [by hitting Record, again]. This crashed by browser process.

Here's the stack trace
"""
base::trace_event::TraceConfig::ToDict() const (in Chromium Framework) (trace_config.cc:480)
base::trace_event::TraceConfig::ToString() const (in Chromium Framework) (string:1322)
content::TracingControllerImpl::GenerateTracingMetadataDict() const (in Chromium Framework) (string:1221)
content::TracingControllerImpl::StopTracing(scoped_refptr<content::TracingController::TraceDataSink> const&) (in Chromium Framework) (tracing_controller_impl.cc:312)
content::(anonymous namespace)::OnTracingRequest(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, base::RepeatingCallback<void (scoped_refptr<base::RefCountedMemory>)> const&) (in Chromium Framework) (tracing_ui.cc:136)
content::WebUIDataSourceImpl::StartDataRequest(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, base::RepeatingCallback<content::WebContents* ()> const&, base::RepeatingCallback<void (scoped_refptr<base::RefCountedMemory>)> const&) (in Chromium Framework) (web_ui_data_source_impl.cc:264)
content::URLDataManagerBackend::CallStartRequest(scoped_refptr<content::URLDataSourceImpl>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, base::RepeatingCallback<content::WebContents* ()> const&, int) (in Chromium Framework) (callback_internal.h:148)
base::internal::Invoker<base::internal::BindState<void (*)(scoped_refptr<content::URLDataSourceImpl>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, base::RepeatingCallback<content::WebContents* ()> const&, int), base::internal::RetainedRefWrapper<content::URLDataSourceImpl>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, base::RepeatingCallback<content::WebContents* ()>, int>, void ()>::RunOnce(base::internal::BindStateBase*) (in Chromium Framework) (ref_counted.h:571)
base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) (in Chromium Framework) (callback_forward.h:11)
base::MessageLoop::RunTask(base::PendingTask*) (in Chromium Framework) (vector:639)
base::MessageLoop::DoWork() (in Chromium Framework) (message_loop.cc:524)
base::MessagePumpCFRunLoopBase::RunWork() (in Chromium Framework) (message_pump_mac.mm:452)
base::mac::CallWithEHFrame(void () block_pointer) (in Chromium Framework) (call_with_eh_frame_asm.S:38)
base::MessagePumpCFRunLoopBase::RunWorkSource(void*) (in Chromium Framework) (message_pump_mac.mm:431)
"""

My guess is that the failure happens in: metadata_dict->SetString("trace-config", trace_config_->ToString()); and that trace_config_ is mutated during the call. Looking through the code, the member is mutated in StartTracing() and StopTracing(). I'm guessing there's a race somewhere around here.
 
Cc: chiniforooshan@chromium.org
chiniforooshan: Would this change after https://chromium-review.googlesource.com/c/chromium/src/+/575716 ?
ToDict is a const method. It should not mutate trace_config_.

It's hard to say if https://chromium-review.googlesource.com/c/chromium/src/+/575716 fixes this without knowing the reason. I'll try to see if I can reproduce the crash tomorrow and if the coming refactoring fixes it.

Could you please attach to full error log?
> Could you please attach to full error log?

What are you referring to? This was on a custom build, and the symbolization was done manually with a symbolized binary that no longer exists. There's no other information, sorry.
Owner: chiniforooshan@chromium.org
Status: Assigned (was: Untriaged)
I just hit a similar looking crash on Canary, macOS: 64.0.3257.0.

https://crash.corp.google.com/browse?q=reportid=%270ed7e78504cd6472%27#3

stack:
"""
...
main thread:
0x00007fffafeaf313	(CoreFoundation + 0x00006313 )	__CFSearchStringROM
0x00007fffafeaecc7	(CoreFoundation + 0x00005cc7 )	__CFStringCreateImmutableFunnel3
0x00007fffafeae29c	(CoreFoundation + 0x0000529c )	CFStringCreateWithCString
0x00007fffaff63e6c	(CoreFoundation + 0x000bae6c )	CFLocaleCreateLocaleIdentifierFromComponents
0x00007fffaff638d6	(CoreFoundation + 0x000ba8d6 )	__CFCalendarCreateUCalendar
0x00007fffaff6372a	(CoreFoundation + 0x000ba72a )	CFCalendarCreateWithIdentifier
0x000000010d9f203c	(Google Chrome Framework -time_mac.cc:238 )	base::Time::Explode(bool, base::Time::Exploded*) const
0x000000010c796c1b	(Google Chrome Framework -time.h:614 )	content::TracingControllerImpl::GenerateMetadataDict() const
0x000000010c797cd9	(Google Chrome Framework -bind_internal.h:209 )	base::internal::Invoker<base::internal::BindState<std::__1::unique_ptr<base::DictionaryValue, std::__1::default_delete<base::DictionaryValue> > (content::TracingControllerImpl::*)() const, base::internal::UnretainedWrapper<content::TracingControllerImpl> >, std::__1::unique_ptr<base::DictionaryValue, std::__1::default_delete<base::DictionaryValue> > ()>::Run(base::internal::BindStateBase*)
0x000000010ed68dd9	(Google Chrome Framework -callback.h:92 )	tracing::ChromeTraceEventAgent::StopAndFlush(mojo::InterfacePtr<tracing::mojom::Recorder>)
0x000000010c1898c1	(Google Chrome Framework -tracing.mojom.cc:914 )	tracing::mojom::AgentStubDispatch::Accept(tracing::mojom::Agent*, mojo::Message*)
0x000000010da387bd	(Google Chrome Framework -multiplex_router.cc:880 )	mojo::internal::MultiplexRouter::ProcessIncomingMessage(mojo::internal::MultiplexRouter::MessageWrapper*, mojo::internal::MultiplexRouter::ClientCallBehavior, base::SequencedTaskRunner*)
0x000000010da38006	(Google Chrome Framework -multiplex_router.cc:604 )	mojo::internal::MultiplexRouter::Accept(mojo::Message*)
0x000000010da3084c	(Google Chrome Framework -connector.cc:440 )	mojo::Connector::ReadSingleMessage(unsigned int*)
0x000000010da30e80	(Google Chrome Framework -connector.cc:469 )	mojo::Connector::ReadAllAvailableMessages()
...

thread 46 [CRASHED]:

Stack Quality75%Show frame trust levels
0x000000010c949d43	(Google Chrome Framework -__functional_base:55 )	tracing::Coordinator::TraceStreamer::CreateAndSendRecorder(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, tracing::mojom::TraceDataType, base::WeakPtr<tracing::AgentRegistry::AgentEntry>)
0x000000010c949f88	(Google Chrome Framework -bind_internal.h:194 )	base::internal::Invoker<base::internal::BindState<void (tracing::Coordinator::TraceStreamer::*)(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, tracing::mojom::TraceDataType, base::WeakPtr<tracing::AgentRegistry::AgentEntry>), base::WeakPtr<tracing::Coordinator::TraceStreamer>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, tracing::mojom::TraceDataType, base::WeakPtr<tracing::AgentRegistry::AgentEntry> >, void ()>::RunOnce(base::internal::BindStateBase*)
0x000000010d944ffb	(Google Chrome Framework -callback.h:64 )	base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*)
0x000000010d9aef16	(Google Chrome Framework -task_tracker.cc:411 )	base::internal::TaskTracker::RunOrSkipTask(std::__1::unique_ptr<base::internal::Task, std::__1::default_delete<base::internal::Task> >, base::internal::Sequence*, bool)
0x000000010d9afacb	(Google Chrome Framework -task_tracker_posix.cc:23 )	base::internal::TaskTrackerPosix::RunOrSkipTask(std::__1::unique_ptr<base::internal::Task, std::__1::default_delete<base::internal::Task> >, base::internal::Sequence*, bool)
0x000000010d9ae846	(Google Chrome Framework -task_tracker.cc:311 )	base::internal::TaskTracker::RunNextTask(scoped_refptr<base::internal::Sequence>, base::internal::CanScheduleSequenceObserver*)
0x000000010d9a96ff	(Google Chrome Framework -scheduler_worker.cc:72 )	base::internal::SchedulerWorker::Thread::ThreadMain()
0x000000010d9ba7b6	(Google Chrome Framework -platform_thread_posix.cc:75 )	base::(anonymous namespace)::ThreadFunc(void*)
0x00007fffc58c993a	(libsystem_pthread.dylib + 0x0000393a )	_pthread_body
0x00007fffc58c9886	(libsystem_pthread.dylib + 0x00003886 )	_pthread_start
0x00007fffc58c908c	(libsystem_pthread.dylib + 0x0000308c )	thread_start
"""

Not sure if same issue or different. Filed issue 782355.
Hmmm, it's probably the same issue that only looks different because of the tracing controller refactoring.

Does the crash happen right after you hit "Record" for the 2nd time or when you hit "Stop" for the 2nd time? (I've not been able to reproduce the crash yet).
Sorry, ignore my question... it must be when you stop because the crash is inside StopAndFlush.
Owner: erikc...@chromium.org
Actually, I wasn't using Chrome at the time - although it was in the foreground. It could be that the OOP HP background triggers were trying to take a trace for a heap dump. Code at: https://cs.chromium.org/chromium/src/chrome/browser/profiling_host/profiling_process_host.cc?l=404

It looks like the problem is that there are *two* different threads trying to simultaneously stop tracing:

thread 0, main thread:
"""
0x00007fffc57e019e	(libsystem_kernel.dylib + 0x0001b19e )	poll
0x0000000108608417	(Google Chrome Framework -memlog_allocator_shim.cc:77 )	profiling::(anonymous namespace)::DoSend(void const*, void const*, unsigned long, profiling::(anonymous namespace)::SendBuffer*)
0x00000001086082ee	(Google Chrome Framework -memlog_allocator_shim.cc:347 )	profiling::AllocatorShimLogAlloc(profiling::AllocatorType, void*, unsigned long, char const*)
0x0000000108608841	(Google Chrome Framework -memlog_allocator_shim.cc:134 )	profiling::(anonymous namespace)::HookAlloc(base::allocator::AllocatorDispatch const*, unsigned long, void*)
0x0000000107193ce4	(Google Chrome Framework -allocator_shim.cc:194 )	base::allocator::MallocZoneFunctionsToReplaceDefault()::$_1::__invoke(_malloc_zone_t*, unsigned long)
0x000000010860882f	(Google Chrome Framework -memlog_allocator_shim.cc:133 )	profiling::(anonymous namespace)::HookAlloc(base::allocator::AllocatorDispatch const*, unsigned long, void*)
0x0000000107193ce4	(Google Chrome Framework -allocator_shim.cc:194 )	base::allocator::MallocZoneFunctionsToReplaceDefault()::$_1::__invoke(_malloc_zone_t*, unsigned long)
0x00007fffc5832281	(libsystem_malloc.dylib + 0x00002281 )	malloc_zone_malloc
0x00007fffc58311ff	(libsystem_malloc.dylib + 0x000011ff )	malloc
0x00007fffc42b9e0d	(libc++abi.dylib + 0x00022e0d )	operator new(unsigned long)
0x000000010718821c	(Google Chrome Framework -new:226 )	std::__1::__tree_iterator<std::__1::__value_type<base::trace_event::TraceLog::AsyncEnabledStateObserver*, base::trace_event::TraceLog::RegisteredAsyncObserver>, std::__1::__tree_node<std::__1::__value_type<base::trace_event::TraceLog::AsyncEnabledStateObserver*, base::trace_event::TraceLog::RegisteredAsyncObserver>, void*>*, long> std::__1::__tree<std::__1::__value_type<base::trace_event::TraceLog::AsyncEnabledStateObserver*, base::trace_event::TraceLog::RegisteredAsyncObserver>, std::__1::__map_value_compare<base::trace_event::TraceLog::AsyncEnabledStateObserver*, std::__1::__value_type<base::trace_event::TraceLog::AsyncEnabledStateObserver*, base::trace_event::TraceLog::RegisteredAsyncObserver>, std::__1::less<base::trace_event::TraceLog::AsyncEnabledStateObserver*>, true>, std::__1::allocator<std::__1::__value_type<base::trace_event::TraceLog::AsyncEnabledStateObserver*, base::trace_event::TraceLog::RegisteredAsyncObserver> > >::__emplace_hint_unique_key_args<base::trace_event::TraceLog::AsyncEnabledStateObserver*, std::__1::pair<base::trace_event::TraceLog::AsyncEnabledStateObserver* const, base::trace_event::TraceLog::RegisteredAsyncObserver> const&>(std::__1::__tree_const_iterator<std::__1::__value_type<base::trace_event::TraceLog::AsyncEnabledStateObserver*, base::trace_event::TraceLog::RegisteredAsyncObserver>, std::__1::__tree_node<std::__1::__value_type<base::trace_event::TraceLog::AsyncEnabledStateObserver*, base::trace_event::TraceLog::RegisteredAsyncObserver>, void*>*, long>, base::trace_event::TraceLog::AsyncEnabledStateObserver* const&, std::__1::pair<base::trace_event::TraceLog::AsyncEnabledStateObserver* const, base::trace_event::TraceLog::RegisteredAsyncObserver> const&&&)
0x0000000107183a92	(Google Chrome Framework -__tree:1278 )	base::trace_event::TraceLog::SetDisabledWhileLocked(unsigned char)
0x0000000107183ca6	(Google Chrome Framework -trace_log.cc:661 )	base::trace_event::TraceLog::SetDisabled(unsigned char)
0x0000000108506a73	(Google Chrome Framework -chrome_trace_event_agent.cc:93 )	tracing::ChromeTraceEventAgent::StopAndFlush(mojo::InterfacePtr<tracing::mojom::Recorder>)
0x0000000105944931	(Google Chrome Framework -tracing.mojom.cc:914 )	tracing::mojom::AgentStubDispatch::Accept(tracing::mojom::Agent*, mojo::Message*)
0x00000001071de2ad	(Google Chrome Framework -multiplex_router.cc:880 )	mojo::internal::MultiplexRouter::ProcessIncomingMessage(mojo::internal::MultiplexRouter::MessageWrapper*, mojo::internal::MultiplexRouter::ClientCallBehavior, base::SequencedTaskRunner*)
0x00000001071ddaf6	(Google Chrome Framework -multiplex_router.cc:604 )	mojo::internal::MultiplexRouter::Accept(mojo::Message*)
0x00000001071d633c	(Google Chrome Framework -connector.cc:440 )	mojo::Connector::ReadSingleMessage(unsigned int*)
0x00000001071d6970	(Google Chrome Framework -connector.cc:469 )	mojo::Connector::ReadAllAvailableMessages()
"""

thread 13, IO thread:
"""
0x00007fffc57dec22	(libsystem_kernel.dylib + 0x00019c22 )	__psynch_mutexwait
0x0000000107146fe7	(Google Chrome Framework -lock_impl_posix.cc:73 )	base::internal::LockImpl::Lock()
0x00000001086083de	(Google Chrome Framework -lock.h:26 )	profiling::(anonymous namespace)::DoSend(void const*, void const*, unsigned long, profiling::(anonymous namespace)::SendBuffer*)
0x00000001086085ed	(Google Chrome Framework -memlog_allocator_shim.cc:366 )	profiling::AllocatorShimLogFree(void*)
0x0000000108608a4e	(Google Chrome Framework -memlog_allocator_shim.cc:210 )	profiling::(anonymous namespace)::HookFreeDefiniteSize(base::allocator::AllocatorDispatch const*, void*, unsigned long, void*)
0x00000001084ecaa0	(Google Chrome Framework -message_pipe_dispatcher.cc:290 )	mojo::edk::MessagePipeDispatcher::~MessagePipeDispatcher()
0x00000001084e3ba6	(Google Chrome Framework -ref_counted.h:397 )	mojo::edk::Core::Close(unsigned int)
0x00000001071d6082	(Google Chrome Framework -handle.h:169 )	mojo::Connector::HandleError(bool, bool)
0x00000001071d8b5b	(Google Chrome Framework -interface_endpoint_client.cc:207 )	mojo::(anonymous namespace)::ResponderThunk::~ResponderThunk()
0x00000001071d8b6d	(Google Chrome Framework -interface_endpoint_client.cc:47 )	mojo::(anonymous namespace)::ResponderThunk::~ResponderThunk()
0x0000000105947fb6	(Google Chrome Framework -memory:2233 )	base::internal::BindState<void (tracing::mojom::Coordinator_StopAndFlush_ProxyToResponder::*)(std::__1::unique_ptr<base::DictionaryValue, std::__1::default_delete<base::DictionaryValue> >), base::internal::PassedWrapper<std::__1::unique_ptr<tracing::mojom::Coordinator_StopAndFlush_ProxyToResponder, std::__1::default_delete<tracing::mojom::Coordinator_StopAndFlush_ProxyToResponder> > > >::Destroy(base::internal::BindStateBase const*)
0x00000001070e4a36	(Google Chrome Framework -callback_internal.cc:21 )	base::internal::CallbackBaseCopyable::operator=(base::internal::CallbackBaseCopyable const&)
0x00000001060f1462	(Google Chrome Framework -callback.h:81 )	tracing::Coordinator::StopAndFlushAgent(mojo::ScopedHandleBase<mojo::DataPipeProducerHandle>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, base::RepeatingCallback<void (std::__1::unique_ptr<base::DictionaryValue, std::__1::default_delete<base::DictionaryValue> >)> const&)
0x00000001060f12c8	(Google Chrome Framework -coordinator.cc:368 )	tracing::Coordinator::StopAndFlush(mojo::ScopedHandleBase<mojo::DataPipeProducerHandle>, base::RepeatingCallback<void (std::__1::unique_ptr<base::DictionaryValue, std::__1::default_delete<base::DictionaryValue> >)> const&)
0x0000000105946f0f	(Google Chrome Framework -tracing.mojom.cc:2645 )	tracing::mojom::CoordinatorStubDispatch::AcceptWithResponder(tracing::mojom::Coordinator*, mojo::Message*, std::__1::unique_ptr<mojo::MessageReceiverWithStatus, std::__1::default_delete<mojo::MessageReceiverWithStatus> >)
0x00000001060f2855	(Google Chrome Framework -tracing.mojom.h:575 )	tracing::mojom::CoordinatorStub<mojo::RawPtrImplRefTraits<tracing::mojom::Coordinator> >::AcceptWithResponder(mojo::Message*, std::__1::unique_ptr<mojo::MessageReceiverWithStatus, std::__1::default_delete<mojo::MessageReceiverWithStatus> >)
0x00000001071d799b	(Google Chrome Framework -interface_endpoint_client.cc:393 )	mojo::InterfaceEndpointClient::HandleValidatedMessage(mojo::Message*)
0x00000001071de2ad	(Google Chrome Framework -multiplex_router.cc:880 )	mojo::internal::MultiplexRouter::ProcessIncomingMessage(mojo::internal::MultiplexRouter::MessageWrapper*, mojo::internal::MultiplexRouter::ClientCallBehavior, base::SequencedTaskRunner*)
0x00000001071ddaf6	(Google Chrome Framework -multiplex_router.cc:604 )	mojo::internal::MultiplexRouter::Accept(mojo::Message*)
0x00000001071d633c	(Google Chrome Framework -connector.cc:440 )	mojo::Connector::ReadSingleMessage(unsigned int*)
0x00000001071d6970	(Google Chrome Framework -connector.cc:469 )	mojo::Connector::ReadAllAvailableMessages()
0x0000000105b5abf4	(Google Chrome Framework -callback.h:92 )	mojo::SimpleWatcher::DiscardReadyState(base::RepeatingCallback<void (unsigned int)> const&, unsigned int, mojo::HandleSignalsState const&)
0x00000001071e702c	(Google Chrome Framework -callback.h:92 )	mojo::SimpleWatcher::OnHandleReady(int, unsigned int, mojo::HandleSignalsState const&)
0x00000001070eab0b	(Google Chrome Framework -callback.h:64 )	base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*)
...
"""

It looks like TracingControllerImpl is intended to only be used from the UI thread. There are DCHECKs that enforce this, but this particular code path isn't tested by OOP HP on the waterfall. 
Project Member

Comment 9 by bugdroid1@chromium.org, Nov 8 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/76c34eb77f21263a35da917a12c4f11a0f6d6754

commit 76c34eb77f21263a35da917a12c4f11a0f6d6754
Author: Erik Chen <erikchen@chromium.org>
Date: Wed Nov 08 22:49:03 2017

Background profiling trigger for OOP HP should trgger trace on UI thread.

TracingControllerImpl is not thread safe.

Bug:  772593 
Change-Id: I9ac3c3174172f930e4bce5458af901aaf63b1a55
Reviewed-on: https://chromium-review.googlesource.com/757659
Commit-Queue: Albert J. Wong <ajwong@chromium.org>
Reviewed-by: Albert J. Wong <ajwong@chromium.org>
Cr-Commit-Position: refs/heads/master@{#514981}
[modify] https://crrev.com/76c34eb77f21263a35da917a12c4f11a0f6d6754/chrome/browser/profiling_host/background_profiling_triggers.cc
[modify] https://crrev.com/76c34eb77f21263a35da917a12c4f11a0f6d6754/chrome/browser/profiling_host/background_profiling_triggers.h
[modify] https://crrev.com/76c34eb77f21263a35da917a12c4f11a0f6d6754/chrome/browser/profiling_host/profiling_process_host.cc

Issue 782355 has been merged into this issue.
Status: Fixed (was: Assigned)

Sign in to add a comment