Sync hang (deadlock) in Chrome 60.0.3112.113 |
|||||||||||||
Issue descriptionPieter (CC'ed) was trying to use the DevTools profiling functionality on Chrome Canary 62 but starting the profiler kept hanging the render processes. He then switched to Chrome Stable 60.0.3112.113 but its devtools profiler would also hang the render process. Then a short time later, the Chrome Stable browser completely froze, and Pieter captured the attached hang stacks. Looking at the stacks, it looks like page JavaScript was setting a cookie, which triggered a Sync event: blink::V8Document::cookieAttributeGetterCallback [ExceptionState.h:120] blink::Document::cookie [Document.cpp:0] blink::Cookies [CookieJar.cpp:52] content::RendererWebCookieJarImpl::Cookies [renderer_webcookiejar_impl.cc:31] content::mojom::RenderFrameMessageFilterProxy::GetCookies [memory:2580] mojo::InterfaceEndpointClient::AcceptWithResponder [weak_ptr.h:225] mojo::SyncEventWatcher::SyncWatch [sync_event_watcher.cc:42] mojo::SyncHandleRegistry::Wait [sync_handle_registry.cc:96] mojo::WaitSet::State::Wait [wait_set.cc:204] base::WaitableEvent::WaitMany [waitable_event_posix.cc:129] _pthread_cond_wait __psynch_cvwait But the Browser process never returned from sending the sync event to the network: base::debug::TaskAnnotator::RunTask [callback_forward.h:29] syncer::SyncSchedulerImpl::TrySyncCycleJobImpl [sync_scheduler_impl.cc:717] syncer::SyncSchedulerImpl::DoNudgeSyncCycleJob [sync_scheduler_impl.cc:441] syncer::Syncer::NormalSyncShare [memory:2556] syncer::Syncer::BuildAndPostCommits [syncer.cc:171] syncer::Commit::PostAndProcessResponse [commit.cc:141] syncer::SyncerProtoUtil::PostClientToServerMessage [syncer_proto_util.cc:381] syncer::SyncerProtoUtil::PostAndProcessHeaders [syncer_proto_util.cc:337] syncer::ServerConnectionManager::PostBufferWithCachedAuth [server_connection_manager.cc:252] syncer::ServerConnectionManager::PostBufferToPath [server_connection_manager.cc:284] syncer::SyncBridgedConnection::Init [sync_server_connection_manager.cc:54] syncer::HttpBridge::MakeSynchronousPost [lock.h:26] base::WaitableEvent::Wait [waitable_event_posix.cc:159] base::WaitableEvent::TimedWaitUntil [waitable_event_posix.cc:225] _pthread_cond_wait __psynch_cvwait It looks like a tracing event from ServiceWorker might have blocked the I/O thread? content::BrowserThreadImpl::IOThreadRun [browser_thread_impl.cc:279] base::RunLoop::Run [run_loop.cc:112] base::MessagePumpLibevent::Run [message_pump_libevent.cc:224] event_base_loop [event.c:382] base::MessagePumpLibevent::OnLibeventNotification [message_pump_libevent.cc:97] mojo::edk:: [channel_posix.cc:317] mojo::edk::Channel::OnReadComplete [memory:2580] mojo::edk::NodeChannel::OnChannelMessage [node_channel.cc:0] mojo::edk::RequestContext::~RequestContext [request_context.cc:31] mojo::edk::Watch::InvokeCallback [lock.h:27] mojo::SimpleWatcher::Context::CallNotify [simple_watcher.cc:59] mojo::SimpleWatcher::OnHandleReady [weak_ptr.h:225] mojo::Connector::ReadAllAvailableMessages [connector.cc:419] mojo::Connector::ReadSingleMessage [connector.cc:390] mojo::internal::MultiplexRouter::Accept [multiplex_router.cc:610] mojo::internal::MultiplexRouter::ProcessIncomingMessage [multiplex_router.cc:0] content::mojom::ServiceWorkerFetchResponseCallbackStubDispatch::Accept [service_worker_event_dispatcher.mojom.cc:330] content::ServiceWorkerFetchDispatcher::ResponseCallback::OnFallback [memory:2580] content::ServiceWorkerFetchDispatcher::ResponseCallback::HandleResponse [memory:2580] content::ServiceWorkerFetchDispatcher::Complete [memory:2580] base::internal::Invoker<base::internal::BindState<void [memory:2580] content::ServiceWorkerURLRequestJob::DidDispatchFetchEvent [service_worker_url_request_job.cc:812] net::URLRequest::Restart [url_request.cc:689] net::URLRequest::PrepareToRestart [memory:2580] <name omitted> [net_log_with_source.cc:86] net::NetLog::AddEntry [net_log.cc:103] net::TraceNetLogObserver::OnAddEntry [trace_event.h:729] base::trace_event::TraceLog::AddTraceEventWithThreadIdAndTimestamp [trace_log.cc:1288] base::trace_event::TraceLog::ThreadLocalEventBuffer::AddTraceEvent [lock.h:27] base::trace_event::TraceLog::SetDisabledWhileLocked [iterator:1354] net::NetLog::DeprecatedRemoveObserver [vector:1465] base::internal::LockImpl::Lock [activity_tracker.h:1284] _pthread_mutex_lock_wait [0x7fffe07e2dfa] __psynch_mutexwait [0x7fffe06f7c22] Possibly related: the browser process's main thread was waiting on the compositor during the hang: base::MessageLoop::RunTask [vector:644] base::debug::TaskAnnotator::RunTask [callback_forward.h:29] ui:: [callback_forward.h:29] ui::Compositor::RequestNewCompositorFrameSink [weak_ptr.h:246] content::GpuProcessTransportFactory::CreateCompositorFrameSink [callback_internal.h:150] content::BrowserGpuChannelHostFactory::EstablishGpuChannel [ref_counted.h:472] base::internal::Invoker<base::internal::BindState<void [bind_internal.h:343] void base::internal::FunctorTraits<void [ref_counted.h:472] content::GpuProcessTransportFactory::EstablishedGpuChannel [gpu_process_transport_factory.cc:468] ui::ContextProviderCommandBuffer::BindToCurrentThread [context_provider_command_buffer.cc:258] gpu::CommandBufferProxyImpl::Create [command_buffer_proxy_impl.cc:91] gpu::CommandBufferProxyImpl::Initialize [command_buffer_proxy_impl.cc:229] IPC::SyncChannel::Send [atomic:922] IPC::SyncChannel::WaitForReply [ref_counted.h:476] mojo::SyncHandleRegistry::Wait [sync_handle_registry.cc:96] mojo::WaitSet::State::Wait [wait_set.cc:204] base::WaitableEvent::WaitMany [waitable_event_posix.cc:129] _pthread_cond_wait __psynch_cvwait
,
Sep 15 2017
[mac bug triage] Looping in blink and devtools triage.
,
Sep 17 2017
,
Sep 27 2017
As far as we can tell sync is waiting for the same thing that dev tools is; the sync thread being blocked wouldn't lock up the whole process in this scenario. For that matter, we don't sync cookies, so the fact that cookies are being set in a separate thread is a coincidence. +ServiceWorker folks in case this looks familiar.
,
Sep 27 2017
Looking at main thread trace at command_buffer_proxy_impl.cc:229, possibly to do with http://crbug.com/125248 ?
,
Sep 27 2017
The IO thread is blocked. That's the root of your problem. It will block the UI thread no matter what (and even if it didn't, it would prevent the UI from updating since we use the GPU process, which needs the IO thread to communicate with).
,
Sep 28 2017
Service worker doesn't seem greatly related. It just tried to restart the URLRequest, which triggered adding a netlog event, and netlog hung trying to get a lock in DeprecatedRemoveObserver which was likely held by sync scheduler? +net
,
Sep 28 2017
,
Sep 28 2017
,
Sep 28 2017
Looks like a bug in TraceNetLogObserver / TraceLog. NetLog's API contract is that observer's can't add or remove themselves during even notifications.
,
Sep 28 2017
I will take a look.
,
Sep 29 2017
,
Sep 29 2017
I uploaded a CL for review: https://chromium-review.googlesource.com/c/chromium/src/+/693215
,
Oct 3 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/e4d00f0d4d07f3694800b3094378c04956c5c0c9 commit e4d00f0d4d07f3694800b3094378c04956c5c0c9 Author: Helen Li <xunjieli@chromium.org> Date: Tue Oct 03 16:13:47 2017 Change TraceNetLogObserver to use TraceLog::AsyncEnabledStateObserver Addition of a trace event can disable TraceLog if TraceBuffer becomes full. This CL makes TraceNetLogObserver an AsyncEnabledStateObserver so it receives disable/enable notification asynchronously from TraceLog. When OnAddEntry() changes TraceLog's enable state, TraceNetLogObserver will not remove itself from NetLog observer list inside of NetLog's event notification ( OnAddEntry()). This CL adds a regression test. Bug: 760817 Change-Id: Ia761a5905306aebe2714b1e3aabaf998368dc8e1 Reviewed-on: https://chromium-review.googlesource.com/693215 Commit-Queue: Helen Li <xunjieli@chromium.org> Reviewed-by: oysteine <oysteine@chromium.org> Reviewed-by: Matt Menke <mmenke@chromium.org> Cr-Commit-Position: refs/heads/master@{#506059} [modify] https://crrev.com/e4d00f0d4d07f3694800b3094378c04956c5c0c9/base/trace_event/trace_log.cc [modify] https://crrev.com/e4d00f0d4d07f3694800b3094378c04956c5c0c9/base/trace_event/trace_log.h [modify] https://crrev.com/e4d00f0d4d07f3694800b3094378c04956c5c0c9/components/net_log/net_export_file_writer_unittest.cc [modify] https://crrev.com/e4d00f0d4d07f3694800b3094378c04956c5c0c9/net/log/trace_net_log_observer.cc [modify] https://crrev.com/e4d00f0d4d07f3694800b3094378c04956c5c0c9/net/log/trace_net_log_observer.h [modify] https://crrev.com/e4d00f0d4d07f3694800b3094378c04956c5c0c9/net/log/trace_net_log_observer_unittest.cc
,
Oct 3 2017
,
Oct 3 2017
|
|||||||||||||
►
Sign in to add a comment |
|||||||||||||
Comment 1 by pav...@chromium.org
, Sep 6 2017