Browser crash, race condition in tracing_controller_impl.cc |
||||
Issue descriptionI built from ToT: 32ea531671b06e5005c6b8ffe33085f6a95a0e76, with a commit reverted: 1ca0e5fc031b33fc99f8e78bd367986398b06fb3 gn args: """ 11 is_component_build = false 12 is_debug = false 13 symbol_level = 1 14 use_goma = true """ I ran with --memlog=all, and used chrome://tracing to take a memory-infra trace. That worked. I went to take another trace [by hitting Record, again]. This crashed by browser process. Here's the stack trace """ base::trace_event::TraceConfig::ToDict() const (in Chromium Framework) (trace_config.cc:480) base::trace_event::TraceConfig::ToString() const (in Chromium Framework) (string:1322) content::TracingControllerImpl::GenerateTracingMetadataDict() const (in Chromium Framework) (string:1221) content::TracingControllerImpl::StopTracing(scoped_refptr<content::TracingController::TraceDataSink> const&) (in Chromium Framework) (tracing_controller_impl.cc:312) content::(anonymous namespace)::OnTracingRequest(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, base::RepeatingCallback<void (scoped_refptr<base::RefCountedMemory>)> const&) (in Chromium Framework) (tracing_ui.cc:136) content::WebUIDataSourceImpl::StartDataRequest(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, base::RepeatingCallback<content::WebContents* ()> const&, base::RepeatingCallback<void (scoped_refptr<base::RefCountedMemory>)> const&) (in Chromium Framework) (web_ui_data_source_impl.cc:264) content::URLDataManagerBackend::CallStartRequest(scoped_refptr<content::URLDataSourceImpl>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, base::RepeatingCallback<content::WebContents* ()> const&, int) (in Chromium Framework) (callback_internal.h:148) base::internal::Invoker<base::internal::BindState<void (*)(scoped_refptr<content::URLDataSourceImpl>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, base::RepeatingCallback<content::WebContents* ()> const&, int), base::internal::RetainedRefWrapper<content::URLDataSourceImpl>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, base::RepeatingCallback<content::WebContents* ()>, int>, void ()>::RunOnce(base::internal::BindStateBase*) (in Chromium Framework) (ref_counted.h:571) base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) (in Chromium Framework) (callback_forward.h:11) base::MessageLoop::RunTask(base::PendingTask*) (in Chromium Framework) (vector:639) base::MessageLoop::DoWork() (in Chromium Framework) (message_loop.cc:524) base::MessagePumpCFRunLoopBase::RunWork() (in Chromium Framework) (message_pump_mac.mm:452) base::mac::CallWithEHFrame(void () block_pointer) (in Chromium Framework) (call_with_eh_frame_asm.S:38) base::MessagePumpCFRunLoopBase::RunWorkSource(void*) (in Chromium Framework) (message_pump_mac.mm:431) """ My guess is that the failure happens in: metadata_dict->SetString("trace-config", trace_config_->ToString()); and that trace_config_ is mutated during the call. Looking through the code, the member is mutated in StartTracing() and StopTracing(). I'm guessing there's a race somewhere around here.
,
Oct 9 2017
ToDict is a const method. It should not mutate trace_config_. It's hard to say if https://chromium-review.googlesource.com/c/chromium/src/+/575716 fixes this without knowing the reason. I'll try to see if I can reproduce the crash tomorrow and if the coming refactoring fixes it. Could you please attach to full error log?
,
Oct 9 2017
> Could you please attach to full error log? What are you referring to? This was on a custom build, and the symbolization was done manually with a symbolized binary that no longer exists. There's no other information, sorry.
,
Oct 18 2017
,
Nov 7 2017
I just hit a similar looking crash on Canary, macOS: 64.0.3257.0. https://crash.corp.google.com/browse?q=reportid=%270ed7e78504cd6472%27#3 stack: """ ... main thread: 0x00007fffafeaf313 (CoreFoundation + 0x00006313 ) __CFSearchStringROM 0x00007fffafeaecc7 (CoreFoundation + 0x00005cc7 ) __CFStringCreateImmutableFunnel3 0x00007fffafeae29c (CoreFoundation + 0x0000529c ) CFStringCreateWithCString 0x00007fffaff63e6c (CoreFoundation + 0x000bae6c ) CFLocaleCreateLocaleIdentifierFromComponents 0x00007fffaff638d6 (CoreFoundation + 0x000ba8d6 ) __CFCalendarCreateUCalendar 0x00007fffaff6372a (CoreFoundation + 0x000ba72a ) CFCalendarCreateWithIdentifier 0x000000010d9f203c (Google Chrome Framework -time_mac.cc:238 ) base::Time::Explode(bool, base::Time::Exploded*) const 0x000000010c796c1b (Google Chrome Framework -time.h:614 ) content::TracingControllerImpl::GenerateMetadataDict() const 0x000000010c797cd9 (Google Chrome Framework -bind_internal.h:209 ) base::internal::Invoker<base::internal::BindState<std::__1::unique_ptr<base::DictionaryValue, std::__1::default_delete<base::DictionaryValue> > (content::TracingControllerImpl::*)() const, base::internal::UnretainedWrapper<content::TracingControllerImpl> >, std::__1::unique_ptr<base::DictionaryValue, std::__1::default_delete<base::DictionaryValue> > ()>::Run(base::internal::BindStateBase*) 0x000000010ed68dd9 (Google Chrome Framework -callback.h:92 ) tracing::ChromeTraceEventAgent::StopAndFlush(mojo::InterfacePtr<tracing::mojom::Recorder>) 0x000000010c1898c1 (Google Chrome Framework -tracing.mojom.cc:914 ) tracing::mojom::AgentStubDispatch::Accept(tracing::mojom::Agent*, mojo::Message*) 0x000000010da387bd (Google Chrome Framework -multiplex_router.cc:880 ) mojo::internal::MultiplexRouter::ProcessIncomingMessage(mojo::internal::MultiplexRouter::MessageWrapper*, mojo::internal::MultiplexRouter::ClientCallBehavior, base::SequencedTaskRunner*) 0x000000010da38006 (Google Chrome Framework -multiplex_router.cc:604 ) mojo::internal::MultiplexRouter::Accept(mojo::Message*) 0x000000010da3084c (Google Chrome Framework -connector.cc:440 ) mojo::Connector::ReadSingleMessage(unsigned int*) 0x000000010da30e80 (Google Chrome Framework -connector.cc:469 ) mojo::Connector::ReadAllAvailableMessages() ... thread 46 [CRASHED]: Stack Quality75%Show frame trust levels 0x000000010c949d43 (Google Chrome Framework -__functional_base:55 ) tracing::Coordinator::TraceStreamer::CreateAndSendRecorder(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, tracing::mojom::TraceDataType, base::WeakPtr<tracing::AgentRegistry::AgentEntry>) 0x000000010c949f88 (Google Chrome Framework -bind_internal.h:194 ) base::internal::Invoker<base::internal::BindState<void (tracing::Coordinator::TraceStreamer::*)(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, tracing::mojom::TraceDataType, base::WeakPtr<tracing::AgentRegistry::AgentEntry>), base::WeakPtr<tracing::Coordinator::TraceStreamer>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, tracing::mojom::TraceDataType, base::WeakPtr<tracing::AgentRegistry::AgentEntry> >, void ()>::RunOnce(base::internal::BindStateBase*) 0x000000010d944ffb (Google Chrome Framework -callback.h:64 ) base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) 0x000000010d9aef16 (Google Chrome Framework -task_tracker.cc:411 ) base::internal::TaskTracker::RunOrSkipTask(std::__1::unique_ptr<base::internal::Task, std::__1::default_delete<base::internal::Task> >, base::internal::Sequence*, bool) 0x000000010d9afacb (Google Chrome Framework -task_tracker_posix.cc:23 ) base::internal::TaskTrackerPosix::RunOrSkipTask(std::__1::unique_ptr<base::internal::Task, std::__1::default_delete<base::internal::Task> >, base::internal::Sequence*, bool) 0x000000010d9ae846 (Google Chrome Framework -task_tracker.cc:311 ) base::internal::TaskTracker::RunNextTask(scoped_refptr<base::internal::Sequence>, base::internal::CanScheduleSequenceObserver*) 0x000000010d9a96ff (Google Chrome Framework -scheduler_worker.cc:72 ) base::internal::SchedulerWorker::Thread::ThreadMain() 0x000000010d9ba7b6 (Google Chrome Framework -platform_thread_posix.cc:75 ) base::(anonymous namespace)::ThreadFunc(void*) 0x00007fffc58c993a (libsystem_pthread.dylib + 0x0000393a ) _pthread_body 0x00007fffc58c9886 (libsystem_pthread.dylib + 0x00003886 ) _pthread_start 0x00007fffc58c908c (libsystem_pthread.dylib + 0x0000308c ) thread_start """ Not sure if same issue or different. Filed issue 782355.
,
Nov 7 2017
Hmmm, it's probably the same issue that only looks different because of the tracing controller refactoring. Does the crash happen right after you hit "Record" for the 2nd time or when you hit "Stop" for the 2nd time? (I've not been able to reproduce the crash yet).
,
Nov 7 2017
Sorry, ignore my question... it must be when you stop because the crash is inside StopAndFlush.
,
Nov 7 2017
Actually, I wasn't using Chrome at the time - although it was in the foreground. It could be that the OOP HP background triggers were trying to take a trace for a heap dump. Code at: https://cs.chromium.org/chromium/src/chrome/browser/profiling_host/profiling_process_host.cc?l=404 It looks like the problem is that there are *two* different threads trying to simultaneously stop tracing: thread 0, main thread: """ 0x00007fffc57e019e (libsystem_kernel.dylib + 0x0001b19e ) poll 0x0000000108608417 (Google Chrome Framework -memlog_allocator_shim.cc:77 ) profiling::(anonymous namespace)::DoSend(void const*, void const*, unsigned long, profiling::(anonymous namespace)::SendBuffer*) 0x00000001086082ee (Google Chrome Framework -memlog_allocator_shim.cc:347 ) profiling::AllocatorShimLogAlloc(profiling::AllocatorType, void*, unsigned long, char const*) 0x0000000108608841 (Google Chrome Framework -memlog_allocator_shim.cc:134 ) profiling::(anonymous namespace)::HookAlloc(base::allocator::AllocatorDispatch const*, unsigned long, void*) 0x0000000107193ce4 (Google Chrome Framework -allocator_shim.cc:194 ) base::allocator::MallocZoneFunctionsToReplaceDefault()::$_1::__invoke(_malloc_zone_t*, unsigned long) 0x000000010860882f (Google Chrome Framework -memlog_allocator_shim.cc:133 ) profiling::(anonymous namespace)::HookAlloc(base::allocator::AllocatorDispatch const*, unsigned long, void*) 0x0000000107193ce4 (Google Chrome Framework -allocator_shim.cc:194 ) base::allocator::MallocZoneFunctionsToReplaceDefault()::$_1::__invoke(_malloc_zone_t*, unsigned long) 0x00007fffc5832281 (libsystem_malloc.dylib + 0x00002281 ) malloc_zone_malloc 0x00007fffc58311ff (libsystem_malloc.dylib + 0x000011ff ) malloc 0x00007fffc42b9e0d (libc++abi.dylib + 0x00022e0d ) operator new(unsigned long) 0x000000010718821c (Google Chrome Framework -new:226 ) std::__1::__tree_iterator<std::__1::__value_type<base::trace_event::TraceLog::AsyncEnabledStateObserver*, base::trace_event::TraceLog::RegisteredAsyncObserver>, std::__1::__tree_node<std::__1::__value_type<base::trace_event::TraceLog::AsyncEnabledStateObserver*, base::trace_event::TraceLog::RegisteredAsyncObserver>, void*>*, long> std::__1::__tree<std::__1::__value_type<base::trace_event::TraceLog::AsyncEnabledStateObserver*, base::trace_event::TraceLog::RegisteredAsyncObserver>, std::__1::__map_value_compare<base::trace_event::TraceLog::AsyncEnabledStateObserver*, std::__1::__value_type<base::trace_event::TraceLog::AsyncEnabledStateObserver*, base::trace_event::TraceLog::RegisteredAsyncObserver>, std::__1::less<base::trace_event::TraceLog::AsyncEnabledStateObserver*>, true>, std::__1::allocator<std::__1::__value_type<base::trace_event::TraceLog::AsyncEnabledStateObserver*, base::trace_event::TraceLog::RegisteredAsyncObserver> > >::__emplace_hint_unique_key_args<base::trace_event::TraceLog::AsyncEnabledStateObserver*, std::__1::pair<base::trace_event::TraceLog::AsyncEnabledStateObserver* const, base::trace_event::TraceLog::RegisteredAsyncObserver> const&>(std::__1::__tree_const_iterator<std::__1::__value_type<base::trace_event::TraceLog::AsyncEnabledStateObserver*, base::trace_event::TraceLog::RegisteredAsyncObserver>, std::__1::__tree_node<std::__1::__value_type<base::trace_event::TraceLog::AsyncEnabledStateObserver*, base::trace_event::TraceLog::RegisteredAsyncObserver>, void*>*, long>, base::trace_event::TraceLog::AsyncEnabledStateObserver* const&, std::__1::pair<base::trace_event::TraceLog::AsyncEnabledStateObserver* const, base::trace_event::TraceLog::RegisteredAsyncObserver> const&&&) 0x0000000107183a92 (Google Chrome Framework -__tree:1278 ) base::trace_event::TraceLog::SetDisabledWhileLocked(unsigned char) 0x0000000107183ca6 (Google Chrome Framework -trace_log.cc:661 ) base::trace_event::TraceLog::SetDisabled(unsigned char) 0x0000000108506a73 (Google Chrome Framework -chrome_trace_event_agent.cc:93 ) tracing::ChromeTraceEventAgent::StopAndFlush(mojo::InterfacePtr<tracing::mojom::Recorder>) 0x0000000105944931 (Google Chrome Framework -tracing.mojom.cc:914 ) tracing::mojom::AgentStubDispatch::Accept(tracing::mojom::Agent*, mojo::Message*) 0x00000001071de2ad (Google Chrome Framework -multiplex_router.cc:880 ) mojo::internal::MultiplexRouter::ProcessIncomingMessage(mojo::internal::MultiplexRouter::MessageWrapper*, mojo::internal::MultiplexRouter::ClientCallBehavior, base::SequencedTaskRunner*) 0x00000001071ddaf6 (Google Chrome Framework -multiplex_router.cc:604 ) mojo::internal::MultiplexRouter::Accept(mojo::Message*) 0x00000001071d633c (Google Chrome Framework -connector.cc:440 ) mojo::Connector::ReadSingleMessage(unsigned int*) 0x00000001071d6970 (Google Chrome Framework -connector.cc:469 ) mojo::Connector::ReadAllAvailableMessages() """ thread 13, IO thread: """ 0x00007fffc57dec22 (libsystem_kernel.dylib + 0x00019c22 ) __psynch_mutexwait 0x0000000107146fe7 (Google Chrome Framework -lock_impl_posix.cc:73 ) base::internal::LockImpl::Lock() 0x00000001086083de (Google Chrome Framework -lock.h:26 ) profiling::(anonymous namespace)::DoSend(void const*, void const*, unsigned long, profiling::(anonymous namespace)::SendBuffer*) 0x00000001086085ed (Google Chrome Framework -memlog_allocator_shim.cc:366 ) profiling::AllocatorShimLogFree(void*) 0x0000000108608a4e (Google Chrome Framework -memlog_allocator_shim.cc:210 ) profiling::(anonymous namespace)::HookFreeDefiniteSize(base::allocator::AllocatorDispatch const*, void*, unsigned long, void*) 0x00000001084ecaa0 (Google Chrome Framework -message_pipe_dispatcher.cc:290 ) mojo::edk::MessagePipeDispatcher::~MessagePipeDispatcher() 0x00000001084e3ba6 (Google Chrome Framework -ref_counted.h:397 ) mojo::edk::Core::Close(unsigned int) 0x00000001071d6082 (Google Chrome Framework -handle.h:169 ) mojo::Connector::HandleError(bool, bool) 0x00000001071d8b5b (Google Chrome Framework -interface_endpoint_client.cc:207 ) mojo::(anonymous namespace)::ResponderThunk::~ResponderThunk() 0x00000001071d8b6d (Google Chrome Framework -interface_endpoint_client.cc:47 ) mojo::(anonymous namespace)::ResponderThunk::~ResponderThunk() 0x0000000105947fb6 (Google Chrome Framework -memory:2233 ) base::internal::BindState<void (tracing::mojom::Coordinator_StopAndFlush_ProxyToResponder::*)(std::__1::unique_ptr<base::DictionaryValue, std::__1::default_delete<base::DictionaryValue> >), base::internal::PassedWrapper<std::__1::unique_ptr<tracing::mojom::Coordinator_StopAndFlush_ProxyToResponder, std::__1::default_delete<tracing::mojom::Coordinator_StopAndFlush_ProxyToResponder> > > >::Destroy(base::internal::BindStateBase const*) 0x00000001070e4a36 (Google Chrome Framework -callback_internal.cc:21 ) base::internal::CallbackBaseCopyable::operator=(base::internal::CallbackBaseCopyable const&) 0x00000001060f1462 (Google Chrome Framework -callback.h:81 ) tracing::Coordinator::StopAndFlushAgent(mojo::ScopedHandleBase<mojo::DataPipeProducerHandle>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, base::RepeatingCallback<void (std::__1::unique_ptr<base::DictionaryValue, std::__1::default_delete<base::DictionaryValue> >)> const&) 0x00000001060f12c8 (Google Chrome Framework -coordinator.cc:368 ) tracing::Coordinator::StopAndFlush(mojo::ScopedHandleBase<mojo::DataPipeProducerHandle>, base::RepeatingCallback<void (std::__1::unique_ptr<base::DictionaryValue, std::__1::default_delete<base::DictionaryValue> >)> const&) 0x0000000105946f0f (Google Chrome Framework -tracing.mojom.cc:2645 ) tracing::mojom::CoordinatorStubDispatch::AcceptWithResponder(tracing::mojom::Coordinator*, mojo::Message*, std::__1::unique_ptr<mojo::MessageReceiverWithStatus, std::__1::default_delete<mojo::MessageReceiverWithStatus> >) 0x00000001060f2855 (Google Chrome Framework -tracing.mojom.h:575 ) tracing::mojom::CoordinatorStub<mojo::RawPtrImplRefTraits<tracing::mojom::Coordinator> >::AcceptWithResponder(mojo::Message*, std::__1::unique_ptr<mojo::MessageReceiverWithStatus, std::__1::default_delete<mojo::MessageReceiverWithStatus> >) 0x00000001071d799b (Google Chrome Framework -interface_endpoint_client.cc:393 ) mojo::InterfaceEndpointClient::HandleValidatedMessage(mojo::Message*) 0x00000001071de2ad (Google Chrome Framework -multiplex_router.cc:880 ) mojo::internal::MultiplexRouter::ProcessIncomingMessage(mojo::internal::MultiplexRouter::MessageWrapper*, mojo::internal::MultiplexRouter::ClientCallBehavior, base::SequencedTaskRunner*) 0x00000001071ddaf6 (Google Chrome Framework -multiplex_router.cc:604 ) mojo::internal::MultiplexRouter::Accept(mojo::Message*) 0x00000001071d633c (Google Chrome Framework -connector.cc:440 ) mojo::Connector::ReadSingleMessage(unsigned int*) 0x00000001071d6970 (Google Chrome Framework -connector.cc:469 ) mojo::Connector::ReadAllAvailableMessages() 0x0000000105b5abf4 (Google Chrome Framework -callback.h:92 ) mojo::SimpleWatcher::DiscardReadyState(base::RepeatingCallback<void (unsigned int)> const&, unsigned int, mojo::HandleSignalsState const&) 0x00000001071e702c (Google Chrome Framework -callback.h:92 ) mojo::SimpleWatcher::OnHandleReady(int, unsigned int, mojo::HandleSignalsState const&) 0x00000001070eab0b (Google Chrome Framework -callback.h:64 ) base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) ... """ It looks like TracingControllerImpl is intended to only be used from the UI thread. There are DCHECKs that enforce this, but this particular code path isn't tested by OOP HP on the waterfall.
,
Nov 8 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/76c34eb77f21263a35da917a12c4f11a0f6d6754 commit 76c34eb77f21263a35da917a12c4f11a0f6d6754 Author: Erik Chen <erikchen@chromium.org> Date: Wed Nov 08 22:49:03 2017 Background profiling trigger for OOP HP should trgger trace on UI thread. TracingControllerImpl is not thread safe. Bug: 772593 Change-Id: I9ac3c3174172f930e4bce5458af901aaf63b1a55 Reviewed-on: https://chromium-review.googlesource.com/757659 Commit-Queue: Albert J. Wong <ajwong@chromium.org> Reviewed-by: Albert J. Wong <ajwong@chromium.org> Cr-Commit-Position: refs/heads/master@{#514981} [modify] https://crrev.com/76c34eb77f21263a35da917a12c4f11a0f6d6754/chrome/browser/profiling_host/background_profiling_triggers.cc [modify] https://crrev.com/76c34eb77f21263a35da917a12c4f11a0f6d6754/chrome/browser/profiling_host/background_profiling_triggers.h [modify] https://crrev.com/76c34eb77f21263a35da917a12c4f11a0f6d6754/chrome/browser/profiling_host/profiling_process_host.cc
,
Nov 8 2017
Issue 782355 has been merged into this issue.
,
Nov 12 2017
|
||||
►
Sign in to add a comment |
||||
Comment 1 by oysteine@chromium.org
, Oct 9 2017