New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 760817 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Last visit > 30 days ago
Closed: Oct 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 3
Type: Bug



Sign in to add a comment

Sync hang (deadlock) in Chrome 60.0.3112.113

Project Member Reported by vdje...@fb.com, Aug 31 2017

Issue description

Pieter (CC'ed) was trying to use the DevTools profiling functionality on Chrome Canary 62 but starting the profiler kept hanging the render processes. He then switched to Chrome Stable 60.0.3112.113 but its devtools profiler would also hang the render process.

Then a short time later, the Chrome Stable browser completely froze, and Pieter captured the attached hang stacks.


Looking at the stacks, it looks like page JavaScript was setting a cookie, which triggered a Sync event:

blink::V8Document::cookieAttributeGetterCallback [ExceptionState.h:120]
blink::Document::cookie [Document.cpp:0]
blink::Cookies [CookieJar.cpp:52]
content::RendererWebCookieJarImpl::Cookies [renderer_webcookiejar_impl.cc:31]
content::mojom::RenderFrameMessageFilterProxy::GetCookies [memory:2580]
mojo::InterfaceEndpointClient::AcceptWithResponder [weak_ptr.h:225]
mojo::SyncEventWatcher::SyncWatch [sync_event_watcher.cc:42]
mojo::SyncHandleRegistry::Wait [sync_handle_registry.cc:96]
mojo::WaitSet::State::Wait [wait_set.cc:204]
base::WaitableEvent::WaitMany [waitable_event_posix.cc:129]
_pthread_cond_wait
__psynch_cvwait


But the Browser process never returned from sending the sync event to the network:

base::debug::TaskAnnotator::RunTask [callback_forward.h:29]
syncer::SyncSchedulerImpl::TrySyncCycleJobImpl [sync_scheduler_impl.cc:717]
syncer::SyncSchedulerImpl::DoNudgeSyncCycleJob [sync_scheduler_impl.cc:441]
syncer::Syncer::NormalSyncShare [memory:2556]
syncer::Syncer::BuildAndPostCommits [syncer.cc:171]
syncer::Commit::PostAndProcessResponse [commit.cc:141]
syncer::SyncerProtoUtil::PostClientToServerMessage [syncer_proto_util.cc:381]
syncer::SyncerProtoUtil::PostAndProcessHeaders [syncer_proto_util.cc:337]
syncer::ServerConnectionManager::PostBufferWithCachedAuth [server_connection_manager.cc:252]
syncer::ServerConnectionManager::PostBufferToPath [server_connection_manager.cc:284]
syncer::SyncBridgedConnection::Init [sync_server_connection_manager.cc:54]
syncer::HttpBridge::MakeSynchronousPost [lock.h:26]
base::WaitableEvent::Wait [waitable_event_posix.cc:159]
base::WaitableEvent::TimedWaitUntil [waitable_event_posix.cc:225]
_pthread_cond_wait
__psynch_cvwait


It looks like a tracing event from ServiceWorker might have blocked the I/O thread?

content::BrowserThreadImpl::IOThreadRun [browser_thread_impl.cc:279]
base::RunLoop::Run [run_loop.cc:112]
base::MessagePumpLibevent::Run [message_pump_libevent.cc:224]
event_base_loop   [event.c:382]
base::MessagePumpLibevent::OnLibeventNotification [message_pump_libevent.cc:97]
mojo::edk:: [channel_posix.cc:317]
mojo::edk::Channel::OnReadComplete [memory:2580]
mojo::edk::NodeChannel::OnChannelMessage [node_channel.cc:0]
mojo::edk::RequestContext::~RequestContext [request_context.cc:31]
mojo::edk::Watch::InvokeCallback [lock.h:27]
mojo::SimpleWatcher::Context::CallNotify [simple_watcher.cc:59]
mojo::SimpleWatcher::OnHandleReady [weak_ptr.h:225]
mojo::Connector::ReadAllAvailableMessages [connector.cc:419]
mojo::Connector::ReadSingleMessage [connector.cc:390]
mojo::internal::MultiplexRouter::Accept [multiplex_router.cc:610]
mojo::internal::MultiplexRouter::ProcessIncomingMessage [multiplex_router.cc:0]
content::mojom::ServiceWorkerFetchResponseCallbackStubDispatch::Accept [service_worker_event_dispatcher.mojom.cc:330]
content::ServiceWorkerFetchDispatcher::ResponseCallback::OnFallback [memory:2580]
content::ServiceWorkerFetchDispatcher::ResponseCallback::HandleResponse [memory:2580]
content::ServiceWorkerFetchDispatcher::Complete [memory:2580]
base::internal::Invoker<base::internal::BindState<void  [memory:2580]
content::ServiceWorkerURLRequestJob::DidDispatchFetchEvent [service_worker_url_request_job.cc:812]
net::URLRequest::Restart [url_request.cc:689]
net::URLRequest::PrepareToRestart [memory:2580]
<name omitted>   [net_log_with_source.cc:86]
net::NetLog::AddEntry [net_log.cc:103]
net::TraceNetLogObserver::OnAddEntry [trace_event.h:729]
base::trace_event::TraceLog::AddTraceEventWithThreadIdAndTimestamp [trace_log.cc:1288]
base::trace_event::TraceLog::ThreadLocalEventBuffer::AddTraceEvent [lock.h:27]
base::trace_event::TraceLog::SetDisabledWhileLocked [iterator:1354]
net::NetLog::DeprecatedRemoveObserver [vector:1465]
base::internal::LockImpl::Lock [activity_tracker.h:1284]
_pthread_mutex_lock_wait   [0x7fffe07e2dfa]
__psynch_mutexwait   [0x7fffe06f7c22]


Possibly related: the browser process's main thread was waiting on the compositor during the hang:

base::MessageLoop::RunTask [vector:644]
base::debug::TaskAnnotator::RunTask [callback_forward.h:29]
ui:: [callback_forward.h:29]
ui::Compositor::RequestNewCompositorFrameSink [weak_ptr.h:246]
content::GpuProcessTransportFactory::CreateCompositorFrameSink [callback_internal.h:150]
content::BrowserGpuChannelHostFactory::EstablishGpuChannel [ref_counted.h:472]
base::internal::Invoker<base::internal::BindState<void  [bind_internal.h:343]
void base::internal::FunctorTraits<void  [ref_counted.h:472]
content::GpuProcessTransportFactory::EstablishedGpuChannel [gpu_process_transport_factory.cc:468]
ui::ContextProviderCommandBuffer::BindToCurrentThread [context_provider_command_buffer.cc:258]
gpu::CommandBufferProxyImpl::Create [command_buffer_proxy_impl.cc:91]
gpu::CommandBufferProxyImpl::Initialize [command_buffer_proxy_impl.cc:229]
IPC::SyncChannel::Send [atomic:922]
IPC::SyncChannel::WaitForReply [ref_counted.h:476]
mojo::SyncHandleRegistry::Wait [sync_handle_registry.cc:96]
mojo::WaitSet::State::Wait [wait_set.cc:204]
base::WaitableEvent::WaitMany [waitable_event_posix.cc:129]
_pthread_cond_wait
__psynch_cvwait
 
GoogleChromeHelperSymbolicated.txt
82.9 KB View Download
GoogleChromeBrowserSymbolicated.txt
128 KB View Download
Components: Speed>Tracing
Sync thread was blocked waiting for http request to complete. I guess it was blocked by IO thread. I tried to find calls to [Deprecated]RemoveObserver in trace_log.cc, but no luck so far. Likely code has changed since M60.

Adding Speed>Tracing component, hopefully somebody has more background on TraceLog behavior.

Comment 2 by shrike@chromium.org, Sep 15 2017

Components: Platform>Apps>DevTools Blink
[mac bug triage] Looping in blink and devtools triage.
Components: -Blink Blink>Loader
Components: Blink>ServiceWorker
Labels: Sync-Triaged
As far as we can tell sync is waiting for the same thing that dev tools is; the sync thread being blocked wouldn't lock up the whole process in this scenario. For that matter, we don't sync cookies, so the fact that cookies are being set in a separate thread is a coincidence. 

+ServiceWorker folks in case this looks familiar. 

Comment 5 by yiinho@chromium.org, Sep 27 2017

Cc: piman@chromium.org
Looking at main thread trace at command_buffer_proxy_impl.cc:229, possibly to do with http://crbug.com/125248 ?

Comment 6 by piman@chromium.org, Sep 27 2017

The IO thread is blocked. That's the root of your problem. It will block the UI thread no matter what (and even if it didn't, it would prevent the UI from updating since we use the GPU process, which needs the IO thread to communicate with).

Comment 7 by falken@chromium.org, Sep 28 2017

Components: Internals>Network
Service worker doesn't seem greatly related. It just tried to restart the URLRequest, which triggered adding a netlog event, and netlog hung trying to get a lock in DeprecatedRemoveObserver which was likely held by sync scheduler? +net

Comment 8 by mef@chromium.org, Sep 28 2017

Cc: xunji...@chromium.org eroman@chromium.org

Comment 9 by mef@chromium.org, Sep 28 2017

Components: Internals>Network>Logging
Looks like a bug in TraceNetLogObserver / TraceLog.  NetLog's API contract is that observer's can't add or remove themselves during even notifications.
Owner: xunji...@chromium.org
Status: Assigned (was: Untriaged)
I will take a look.
Components: -Blink>ServiceWorker
Components: -Internals>Network -Blink>Loader -Services>Sync
Status: Started (was: Assigned)
I uploaded a CL for review: https://chromium-review.googlesource.com/c/chromium/src/+/693215

Project Member

Comment 14 by bugdroid1@chromium.org, Oct 3 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e4d00f0d4d07f3694800b3094378c04956c5c0c9

commit e4d00f0d4d07f3694800b3094378c04956c5c0c9
Author: Helen Li <xunjieli@chromium.org>
Date: Tue Oct 03 16:13:47 2017

Change TraceNetLogObserver to use TraceLog::AsyncEnabledStateObserver

Addition of a trace event can disable TraceLog if TraceBuffer becomes full.
This CL makes TraceNetLogObserver an AsyncEnabledStateObserver so it receives
disable/enable notification asynchronously from TraceLog. When OnAddEntry()
changes TraceLog's enable state, TraceNetLogObserver will not remove
itself from NetLog observer list inside of NetLog's event notification (
OnAddEntry()).

This CL adds a regression test.

Bug:  760817 
Change-Id: Ia761a5905306aebe2714b1e3aabaf998368dc8e1
Reviewed-on: https://chromium-review.googlesource.com/693215
Commit-Queue: Helen Li <xunjieli@chromium.org>
Reviewed-by: oysteine <oysteine@chromium.org>
Reviewed-by: Matt Menke <mmenke@chromium.org>
Cr-Commit-Position: refs/heads/master@{#506059}
[modify] https://crrev.com/e4d00f0d4d07f3694800b3094378c04956c5c0c9/base/trace_event/trace_log.cc
[modify] https://crrev.com/e4d00f0d4d07f3694800b3094378c04956c5c0c9/base/trace_event/trace_log.h
[modify] https://crrev.com/e4d00f0d4d07f3694800b3094378c04956c5c0c9/components/net_log/net_export_file_writer_unittest.cc
[modify] https://crrev.com/e4d00f0d4d07f3694800b3094378c04956c5c0c9/net/log/trace_net_log_observer.cc
[modify] https://crrev.com/e4d00f0d4d07f3694800b3094378c04956c5c0c9/net/log/trace_net_log_observer.h
[modify] https://crrev.com/e4d00f0d4d07f3694800b3094378c04956c5c0c9/net/log/trace_net_log_observer_unittest.cc

Status: Fixed (was: Started)
Labels: M-63

Sign in to add a comment