Crash at net::UDPSocketPosix::Close()
Reported by j...@snapchat.com, Aug 28
Issue description
Steps to reproduce the problem:
We are using the Cronet library in our iOS app and saw the following crash in production:
Thread 19 Crashed:
0 Snapchat 0x00000001004f4624 base::debug::BreakDebugger() + 20
1 Snapchat 0x000000010050db74 logging::LogMessage::~LogMessage() + 1408
2 Snapchat 0x000000010050dea8 logging::ErrnoLogMessage::~ErrnoLogMessage() + 140
3 Snapchat 0x0000000100756ca8 net::UDPSocketPosix::Close() + 204
4 Snapchat 0x00000001006db8d8 net::QuicChromiumClientSession::OnConnectionClosed(net::QuicErrorCode, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, net::ConnectionCloseSource) + 1344
5 Snapchat 0x000000010070af8c net::QuicConnection::TearDownLocalConnectionState(net::QuicErrorCode, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, net::ConnectionCloseSource) + 76
6 Snapchat 0x000000010070cc28 net::QuicConnection::OnWriteError(int) + 152
7 Snapchat 0x000000010070dc54 net::QuicConnection::WritePacket(net::SerializedPacket*) + 824
8 Snapchat 0x000000010070e1d8 net::QuicConnection::SendOrQueuePacket(net::SerializedPacket*) + 68
9 Snapchat 0x000000010071c358 net::QuicPacketCreator::OnSerializedPacket() + 84
10 Snapchat 0x000000010071c0b0 net::QuicPacketCreator::ReserializeAllFrames(net::QuicPendingRetransmission const&, char*, unsigned long) + 444
11 Snapchat 0x000000010070d7cc net::QuicConnection::WritePendingRetransmissions() + 128
12 Snapchat 0x000000010070d600 net::QuicConnection::OnCanWrite() + 44
13 Snapchat 0x000000010070e56c net::QuicConnection::OnRetransmissionTimeout() + 160
14 Snapchat 0x00000001004f5cd0 base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) + 156
15 Snapchat 0x0000000100516a28 base::MessageLoop::RunTask(base::PendingTask*) + 280
16 Snapchat 0x0000000100517070 base::MessageLoop::DoDelayedWork(base::TimeTicks*) + 308
17 Snapchat 0x000000010059aa64 base::MessagePumpCFRunLoopBase::RunWork() + 92
18 Snapchat 0x000000010059a494 base::MessagePumpCFRunLoopBase::RunWorkSource(void*) + 64
19 CoreFoundation 0x000000018522697c __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 20
20 CoreFoundation 0x00000001852268fc __CFRunLoopDoSource0 + 84
21 CoreFoundation 0x0000000185226184 __CFRunLoopDoSources0 + 200
22 CoreFoundation 0x0000000185223d5c __CFRunLoopRun + 1044
23 CoreFoundation 0x0000000185143e58 CFRunLoopRunSpecific + 432
24 Foundation 0x0000000185b79594 -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 300
25 Snapchat 0x000000010059b2d8 base::MessagePumpNSRunLoop::DoRun(base::MessagePump::Delegate*) + 132
26 Snapchat 0x000000010059a0e8 base::MessagePumpCFRunLoopBase::Run(base::MessagePump::Delegate*) + 104
27 Snapchat 0x000000010053754c base::RunLoop::Run() + 84
28 Snapchat 0x0000000100568ab0 base::Thread::ThreadMain() + 384
29 Snapchat 0x00000001005615c4 base::(anonymous namespace)::ThreadFunc(void*) + 96
30 libsystem_pthread.dylib 0x0000000184ea42b4 _pthread_body + 304
31 libsystem_pthread.dylib 0x0000000184ea4180 _pthread_start + 308
32 libsystem_pthread.dylib 0x0000000184ea2b74 thread_start + 0
Currently we are unable to reproduce this locally, but it is now one of our most frequent crashes in prod. I searched existing bugs and found a possibly related thread: https://groups.google.com/a/chromium.org/forum/#!topic/chromium-dev/W7YNqPabRm4/discussion
I am wondering whether they are the same issue and whether there is any workaround.
What is the expected behavior?
What went wrong? There are crashes when the QUIC connection closes.
Crashed report ID:
How much crashed? Just one tab
Is it a problem with a plugin? N/A
Did this work before? N/A
Chrome version: 65.0.3325.152 Channel: stable
OS Version: iOS10/11/12
Flash Version:
,
Aug 28
,
Aug 28
Thanks for reporting the crash! One thing of note is that the current stable version on iOS is 68.0.3440.83, so 65.0.3325.152 is pretty far behind.
,
Aug 28
Looking at UDPSocketPosix::Close(), it seems that the crash is coming from this check: PCHECK(IGNORE_EINTR(guarded_close_np(socket_, &kSocketFdGuard)) == 0); I'm not familiar with guarded_close_np, but I'm not entirely surprised that the socket close() is failing, given the previously reported write error (see net::QuicConnection::OnWriteError()). I wonder whether a PCHECK is warranted in this situation.
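To make the question about PCHECK concrete: PCHECK logs errno and aborts when its condition is false, so any failing close takes the whole process down. The sketch below shows the alternative of reporting the failure to the caller. It is not Chromium code: CloseReportingErrno is a hypothetical helper, and it uses plain POSIX close() rather than guarded_close_np. It mirrors IGNORE_EINTR by treating EINTR as success without retrying.

```cpp
#include <cassert>
#include <cerrno>
#include <unistd.h>

// Hypothetical helper, not Chromium code: closes fd and returns 0 on success
// or the errno value on failure, instead of aborting the process.
int CloseReportingErrno(int fd) {
  assert(fd >= 0 && "caller must pass a descriptor number it once owned");
  if (close(fd) == 0) {
    return 0;
  }
  // Same policy as IGNORE_EINTR: the descriptor state after EINTR is
  // unspecified, so do not retry; treat the descriptor as closed.
  if (errno == EINTR) {
    return 0;
  }
  return errno;  // For example EBADF when the descriptor is already closed.
}
```

A caller could log the returned value and continue tearing down the session; whether that is preferable to crashing depends on whether a failed close is considered a recoverable condition.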
,
Aug 28
Yes, I am upgrading to 68.0.3440.70 and will see whether it resolves this crash. While 68.0.3440.83 is shown as the latest stable version, I cannot switch to it even after 'gclient sync --with_branch_heads --with_tags'.
,
Aug 31
,
Sep 20
I've upgraded cronet to version 69.0.3497.91. Will wait and see if that resolves the issue here.
,
Nov 12
Re #4: Unfortunately, bumping the Cronet version to the latest did not mitigate this crash. We are still experiencing it with exactly the same stack on prod. Can we bump up the priority of this one?
,
Nov 13
Probably not related, but we've seen crashes like this in the past when other parts of an app accidentally double-closed file descriptors (e.g. Issue 640281 and b/113174967), which inadvertently closed Cronet's file descriptors.
,
Nov 19
Re #9: I am not able to see the ticket you mentioned. Would you mind sharing the bug description or stack trace here? And is there any possible way to identify and fix this? Thanks!
,
Nov 19
One other bug was identified using the file descriptor sanitizer (fdsan): https://android.googlesource.com/platform/bionic/+/master/docs/fdsan.md
It found a double-close outside Cronet (in their app) that inadvertently closed Cronet's file descriptors and, I believe, caused a crash similar to this one. It looks like OSX offers something similar to fdsan with guarded_close_np.
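The double-close scenario described here can be reproduced in miniature. The sketch below is a hypothetical single-threaded demonstration, not code from Cronet or the app: pipes stand in for both the buggy component's descriptor and the UDP socket, and SimulateStrayDoubleClose is an invented name. It relies on the POSIX rule that new descriptors get the lowest free number, so a stale number is reused almost immediately.

```cpp
#include <cassert>
#include <cerrno>
#include <unistd.h>

// Hypothetical demonstration, not Chromium code. Returns the errno that the
// "victim" owner sees when closing its own descriptor after unrelated code
// double-closed a stale descriptor number, or -1 if the number was not reused.
int SimulateStrayDoubleClose() {
  int buggy[2];
  int rc = pipe(buggy);
  assert(rc == 0);
  const int stale = buggy[0];
  close(stale);  // First close: legitimate.

  // The victim opens new descriptors. POSIX allocates the lowest free
  // numbers, so in a single-threaded process one of them reuses `stale`.
  int victim[2];
  rc = pipe(victim);
  assert(rc == 0);
  (void)rc;
  const int reused = (victim[0] == stale) ? 0 : (victim[1] == stale) ? 1 : -1;

  int seen = -1;
  if (reused >= 0) {
    close(stale);  // Second close by the buggy code: closes the victim's fd.
    // The victim now closes what it believes is still its own descriptor.
    seen = (close(victim[reused]) == 0) ? 0 : errno;
    close(victim[1 - reused]);
  } else {
    close(victim[0]);
    close(victim[1]);
  }
  close(buggy[1]);
  return seen;
}
```

The victim did nothing wrong, yet its close fails with EBADF, which is the kind of failure that would trip a PCHECK in the owner's code rather than at the site of the actual bug.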
,
Nov 19
I also realized this is a long-standing issue. We saw similar crashes more than a year ago, with a similar crash stack:
Thread 14 Crashed:
0 Snapchat 0x0111bb8e base::debug::BreakDebugger() + 20
1 Snapchat 0x01125e61 logging::LogMessage::~LogMessage() + 1506
2 Snapchat 0x0112604d logging::ErrnoLogMessage::~ErrnoLogMessage() + 86
3 Snapchat 0x0125a381 net::UDPSocketPosix::Close() + 176
4 Snapchat 0x0120e5d9 net::QuicChromiumClientSession::OnConnectionClosed(net::QuicErrorCode, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, net::ConnectionCloseSource) + 1200
5 Snapchat 0x0122fb55 net::QuicConnection::TearDownLocalConnectionState(net::QuicErrorCode, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, net::ConnectionCloseSource) + 44
6 Snapchat 0x01231001 net::QuicConnection::OnWriteError(int) + 110
7 Snapchat 0x01231c0b net::QuicConnection::WritePacket(net::SerializedPacket*) + 678
8 Snapchat 0x0123203f net::QuicConnection::SendOrQueuePacket(net::SerializedPacket*) + 30
9 Snapchat 0x0123ae8f net::QuicPacketCreator::OnSerializedPacket() + 30
10 Snapchat 0x0123af93 net::QuicPacketCreator::Flush() + 76
11 Snapchat 0x0123bb87 net::QuicPacketGenerator::ConsumeData(unsigned int, net::QuicIOVector, unsigned long long, net::StreamSendingState, net::QuicReferenceCountedPointer<net::QuicAckListenerInterface>) + 580
12 Snapchat 0x012310cb net::QuicConnection::SendStreamData(unsigned int, net::QuicIOVector, unsigned long long, net::StreamSendingState, net::QuicReferenceCountedPointer<net::QuicAckListenerInterface>) + 190
13 Snapchat 0x0123f81b net::QuicSession::WritevData(net::QuicStream*, unsigned int, net::QuicIOVector, unsigned long long, net::StreamSendingState, net::QuicReferenceCountedPointer<net::QuicAckListenerInterface>) + 240
14 Snapchat 0x012466f1 net::QuicStream::WritevDataInner(net::QuicIOVector, unsigned long long, bool, net::QuicReferenceCountedPointer<net::QuicAckListenerInterface>) + 86
15 Snapchat 0x0124633d net::QuicStream::WritevData(iovec const*, int, bool, net::QuicReferenceCountedPointer<net::QuicAckListenerInterface>) + 258
16 Snapchat 0x012461a7 net::QuicStream::WriteOrBufferData(base::BasicStringPiece<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, bool, net::QuicReferenceCountedPointer<net::QuicAckListenerInterface>) + 210
17 Snapchat 0x01234dfb net::QuicCryptoStream::SendHandshakeMessage(net::CryptoHandshakeMessage const&) + 60
18 Snapchat 0x012340eb net::QuicCryptoClientStream::DoSendCHLO(net::QuicCryptoClientConfig::CachedState*) + 868
19 Snapchat 0x0123395f net::QuicCryptoClientStream::DoHandshakeLoop(net::CryptoHandshakeMessage const*) + 280
20 Snapchat 0x01233ce7 net::QuicCryptoClientStream::CryptoConnect() + 18
21 Snapchat 0x0120d885 net::QuicChromiumClientSession::CryptoConnect(base::Callback<void (int), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&) + 42
22 Snapchat 0x0121681f net::QuicStreamFactory::Job::DoConnect() + 288
23 Snapchat 0x0121655b net::QuicStreamFactory::Job::DoLoop(int) + 200
24 Snapchat 0x01216473 net::QuicStreamFactory::Job::Run(base::Callback<void (int), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&) + 12
25 Snapchat 0x012174b7 net::QuicStreamFactory::Create(net::QuicServerId const&, net::HostPortPair const&, int, GURL const&, base::BasicStringPiece<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, net::NetLogWithSource const&, net::QuicStreamRequest*) + 724
26 Snapchat 0x012171b3 net::QuicStreamRequest::Request(net::HostPortPair const&, net::PrivacyMode, int, GURL const&, base::BasicStringPiece<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, net::NetLogWithSource const&, base::Callback<void (int), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&) + 94
27 Snapchat 0x011ed191 net::HttpStreamFactoryImpl::Job::DoInitConnectionImpl() + 734
28 Snapchat 0x011ec4d7 net::HttpStreamFactoryImpl::Job::DoInitConnection() + 20
29 Snapchat 0x011ec131 net::HttpStreamFactoryImpl::Job::DoLoop(int) + 114
30 Snapchat 0x011eb175 net::HttpStreamFactoryImpl::Job::RunLoop(int) + 56
31 Snapchat 0x011eb087 net::HttpStreamFactoryImpl::Job::StartInternal() + 80
32 Snapchat 0x011ee755 net::HttpStreamFactoryImpl::JobController::CreateJobs(net::HttpRequestInfo const&, net::RequestPriority, net::SSLConfig const&, net::SSLConfig const&, net::HttpStreamRequest::Delegate*, net::HttpStreamRequest::StreamType) + 398
33 Snapchat 0x011ee5bb net::HttpStreamFactoryImpl::JobController::Start(net::HttpRequestInfo const&, net::HttpStreamRequest::Delegate*, net::WebSocketHandshakeStreamBase::CreateHelper*, net::NetLogWithSource const&, net::HttpStreamRequest::StreamType, net::RequestPriority, net::SSLConfig const&, net::SSLConfig const&) + 130
34 Snapchat 0x011e9975 net::HttpStreamFactoryImpl::RequestStreamInternal(net::HttpRequestInfo const&, net::RequestPriority, net::SSLConfig const&, net::SSLConfig const&, net::HttpStreamRequest::Delegate*, net::WebSocketHandshakeStreamBase::CreateHelper*, net::HttpStreamRequest::StreamType, bool, bool, net::NetLogWithSource const&) + 106
35 Snapchat 0x011e9905 net::HttpStreamFactoryImpl::RequestStream(net::HttpRequestInfo const&, net::RequestPriority, net::SSLConfig const&, net::SSLConfig const&, net::HttpStreamRequest::Delegate*, bool, bool, net::NetLogWithSource const&) + 42
36 Snapchat 0x011e0bad net::HttpNetworkTransaction::DoCreateStream() + 192
37 Snapchat 0x011dfa51 net::HttpNetworkTransaction::DoLoop(int) + 550
38 Snapchat 0x011df80d net::HttpNetworkTransaction::Start(net::HttpRequestInfo const*, base::Callback<void (int), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&, net::NetLogWithSource const&) + 96
39 Snapchat 0x01284ed7 net::URLRequestHttpJob::StartTransactionInternal() + 454
40 Snapchat 0x01284cfd net::URLRequestHttpJob::MaybeStartTransactionInternal(int) + 376
41 Snapchat 0x01284b69 net::URLRequestHttpJob::StartTransaction() + 172
42 Snapchat 0x0128544b net::URLRequestHttpJob::SetCookieHeaderAndStart(std::__1::vector<net::CanonicalCookie, std::__1::allocator<net::CanonicalCookie> > const&) + 114
43 Snapchat 0x012ff7f3 net::CookieStoreIOS::GetCookieListWithOptionsAsync(GURL const&, net::CookieOptions const&, base::Callback<void (std::__1::vector<net::CanonicalCookie, std::__1::allocator<net::CanonicalCookie> > const&), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&) + 184
44 Snapchat 0x01284451 net::URLRequestHttpJob::AddCookieHeaderAndStart() + 238
45 Snapchat 0x01284071 net::URLRequestHttpJob::Start() + 368
46 Snapchat 0x01280ba7 net::URLRequest::StartJob(net::URLRequestJob*) + 440
47 Snapchat 0x01280849 net::URLRequest::Start() + 260
48 Snapchat 0x01305833 net::HttpProtocolHandlerCore::Start(id<CRNNetworkClientProtocol>) + 1148
49 Snapchat 0x0111c12b base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) + 188
50 Snapchat 0x01129161 base::MessageLoop::RunTask(base::PendingTask*) + 310
51 Snapchat 0x011293c7 base::MessageLoop::DeferOrRunPendingTask(base::PendingTask) + 148
52 Snapchat 0x011294f3 base::MessageLoop::DoWork() + 252
53 Snapchat 0x0115a64d base::MessagePumpCFRunLoopBase::RunWork() + 106
54 Snapchat 0x01159ea3 base::MessagePumpCFRunLoopBase::RunWorkSource(void*) + 62
55 CoreFoundation 0x1d32ffdd __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 10
56 CoreFoundation 0x1d32fb05 __CFRunLoopDoSources0 + 422
57 CoreFoundation 0x1d32df51 __CFRunLoopRun + 1158
58 CoreFoundation 0x1d2811af CFRunLoopRunSpecific + 468
59 CoreFoundation 0x1d280fd1 CFRunLoopRunInMode + 102
60 Foundation 0x1dbd5ab5 -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 256
61 Snapchat 0x0115af55 base::MessagePumpNSRunLoop::DoRun(base::MessagePump::Delegate*) + 118
62 Snapchat 0x01159717 base::MessagePumpCFRunLoopBase::Run(base::MessagePump::Delegate*) + 84
63 Snapchat 0x01135737 base::RunLoop::Run() + 34
64 Snapchat 0x0114a0e7 base::Thread::ThreadMain() + 208
65 Snapchat 0x01146f5b base::(anonymous namespace)::ThreadFunc(void*) + 60
66 libsystem_pthread.dylib 0x1cbf893b _pthread_body + 214
67 libsystem_pthread.dylib 0x1cbf885d _pthread_start + 232
68 libsystem_pthread.dylib 0x1cbf6468 thread_start + 6
I believe we were using Cronet m61 at that time. A similar crash is now the top iOS crash for our app. Do we have any clue about this crash? Could it be related to our use of NSURLConnection (which is deprecated) in some places in our app? Thanks.
,
Nov 20
I've created a prototype CL, https://chromium-review.googlesource.com/c/chromium/src/+/1344290, that uses change_fdguard_np to detect unexpected socket closings. Unfortunately, running the Cronet tests with this CL didn't detect any issues. There is some discussion about performance implications, but in theory we should be able to use it in release builds.
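The guard idea behind that CL can be pictured with a small user-space model. This sketch is not the Darwin API: guarded_close_np and change_fdguard_np are enforced by the kernel, and, as I understand it, a plain close() on a guarded descriptor faults at the offending call site. The model only mimics the bookkeeping, where a descriptor registered with a guard value can be closed only by a caller presenting the same value. FdGuardTable and its methods are hypothetical names, and the model returns EPERM where the real mechanism would crash.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <unistd.h>
#include <unordered_map>

// Hypothetical user-space model of guarded descriptors; not the Darwin API.
class FdGuardTable {
 public:
  // Records that fd may only be closed by a caller presenting `guard`.
  void Guard(int fd, uint64_t guard) {
    assert(fd >= 0);
    guards_[fd] = guard;
  }

  // Closes fd if it is unguarded or `guard` matches. Returns 0 on success,
  // EPERM on a guard mismatch (fd stays open), or the errno from close().
  int Close(int fd, uint64_t guard) {
    auto it = guards_.find(fd);
    if (it != guards_.end()) {
      if (it->second != guard) {
        return EPERM;  // The real mechanism would fault here instead.
      }
      guards_.erase(it);
    }
    return close(fd) == 0 ? 0 : errno;
  }

 private:
  std::unordered_map<int, uint64_t> guards_;
};
```

The point of the real mechanism is the same as in this model: the failure is attributed to the caller that used the wrong guard, not to the legitimate owner that later finds its descriptor gone.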
,
Nov 20
Thanks Misha! Ryan and I discussed this issue a little bit, and we believe the root cause is not in the QUIC code but in some other buggy code. The short-term fix was to land code that guards our socket so that the crash surfaces on the correct stack. We will also need the Snapchat folks to tell us where the crash moves so that we can fix the issue completely.
,
Dec 26
Hi Misha & Zhongyi, thanks for helping us check the crash! https://bugs.chromium.org/p/chromium/issues/detail?id=878429 Unfortunately, the crash is still happening after we patched your fix into our latest app. Also, it only crashes on iOS 12. Here is the stack trace, thanks!
0 Snapchat logging::LogMessage::~LogMessage() + 4311904744
1 Snapchat logging::LogMessage::~LogMessage() + 4311904224
2 Snapchat logging::ErrnoLogMessage::~ErrnoLogMessage() + 4311905516
3 Snapchat net::UDPSocketPosix::Close() + 4314508320
4 Snapchat net::QuicChromiumClientSession::OnConnectionClosed(quic::QuicErrorCode, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, quic::ConnectionCloseSource) + 4313665840
5 Snapchat quic::QuicConnection::TearDownLocalConnectionState(quic::QuicErrorCode, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, quic::ConnectionCloseSource) + 4314134604
6 Snapchat quic::QuicConnection::CheckForTimeout() + 4314153112
7 Snapchat base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) + 4311834020
8 Snapchat base::MessageLoop::RunTask(base::PendingTask*) + 4311945212
9 Snapchat base::MessageLoop::DoDelayedWork(base::TimeTicks*) + 4311946868
10 Snapchat base::MessagePumpCFRunLoopBase::RunWork() + 4312490860
11 Snapchat base::MessagePumpCFRunLoopBase::RunWorkSource(void*) + 4312489380
12 CoreFoundation __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24
13 CoreFoundation __CFRunLoopDoSource0 + 88
14 CoreFoundation __CFRunLoopDoSources0 + 176
15 CoreFoundation __CFRunLoopRun + 1040
16 CoreFoundation CFRunLoopRunSpecific + 436
17 Foundation -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 300
18 Snapchat base::MessagePumpNSRunLoop::DoRun(base::MessagePump::Delegate*) + 4312493020
19 Snapchat base::MessagePumpCFRunLoopBase::Run(base::MessagePump::Delegate*) + 4312488440
20 Snapchat base::RunLoop::Run() + 4312080268
21 Snapchat base::Thread::ThreadMain() + 4312277068
22 Snapchat base::(anonymous namespace)::ThreadFunc(void*) + 4312478164
23 libsystem_pthread.dylib _pthread_body + 128
24 libsystem_pthread.dylib _pthread_start + 48
25 libsystem_pthread.dylib thread_start + 4
,
Jan 2
Comment 1 by dtapu...@chromium.org
, Aug 28