Issue 878429 link

Starred by 2 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: iOS
Pri: 2
Type: Bug




Crash at net::UDPSocketPosix::Close()

Reported by j...@snapchat.com, Aug 28

Issue description

Steps to reproduce the problem:
We are using the Cronet library in our iOS app and saw the following crash in production:

Thread 19 Crashed:
0   Snapchat                             0x00000001004f4624 base::debug::BreakDebugger() + 20
1   Snapchat                             0x000000010050db74 logging::LogMessage::~LogMessage() + 1408
2   Snapchat                             0x000000010050dea8 logging::ErrnoLogMessage::~ErrnoLogMessage() + 140
3   Snapchat                             0x0000000100756ca8 net::UDPSocketPosix::Close() + 204
4   Snapchat                             0x00000001006db8d8 net::QuicChromiumClientSession::OnConnectionClosed(net::QuicErrorCode, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, net::ConnectionCloseSource) + 1344
5   Snapchat                             0x000000010070af8c net::QuicConnection::TearDownLocalConnectionState(net::QuicErrorCode, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, net::ConnectionCloseSource) + 76
6   Snapchat                             0x000000010070cc28 net::QuicConnection::OnWriteError(int) + 152
7   Snapchat                             0x000000010070dc54 net::QuicConnection::WritePacket(net::SerializedPacket*) + 824
8   Snapchat                             0x000000010070e1d8 net::QuicConnection::SendOrQueuePacket(net::SerializedPacket*) + 68
9   Snapchat                             0x000000010071c358 net::QuicPacketCreator::OnSerializedPacket() + 84
10  Snapchat                             0x000000010071c0b0 net::QuicPacketCreator::ReserializeAllFrames(net::QuicPendingRetransmission const&, char*, unsigned long) + 444
11  Snapchat                             0x000000010070d7cc net::QuicConnection::WritePendingRetransmissions() + 128
12  Snapchat                             0x000000010070d600 net::QuicConnection::OnCanWrite() + 44
13  Snapchat                             0x000000010070e56c net::QuicConnection::OnRetransmissionTimeout() + 160
14  Snapchat                             0x00000001004f5cd0 base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) + 156
15  Snapchat                             0x0000000100516a28 base::MessageLoop::RunTask(base::PendingTask*) + 280
16  Snapchat                             0x0000000100517070 base::MessageLoop::DoDelayedWork(base::TimeTicks*) + 308
17  Snapchat                             0x000000010059aa64 base::MessagePumpCFRunLoopBase::RunWork() + 92
18  Snapchat                             0x000000010059a494 base::MessagePumpCFRunLoopBase::RunWorkSource(void*) + 64
19  CoreFoundation                       0x000000018522697c __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 20
20  CoreFoundation                       0x00000001852268fc __CFRunLoopDoSource0 + 84
21  CoreFoundation                       0x0000000185226184 __CFRunLoopDoSources0 + 200
22  CoreFoundation                       0x0000000185223d5c __CFRunLoopRun + 1044
23  CoreFoundation                       0x0000000185143e58 CFRunLoopRunSpecific + 432
24  Foundation                           0x0000000185b79594 -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 300
25  Snapchat                             0x000000010059b2d8 base::MessagePumpNSRunLoop::DoRun(base::MessagePump::Delegate*) + 132
26  Snapchat                             0x000000010059a0e8 base::MessagePumpCFRunLoopBase::Run(base::MessagePump::Delegate*) + 104
27  Snapchat                             0x000000010053754c base::RunLoop::Run() + 84
28  Snapchat                             0x0000000100568ab0 base::Thread::ThreadMain() + 384
29  Snapchat                             0x00000001005615c4 base::(anonymous namespace)::ThreadFunc(void*) + 96
30  libsystem_pthread.dylib              0x0000000184ea42b4 _pthread_body + 304
31  libsystem_pthread.dylib              0x0000000184ea4180 _pthread_start + 308
32  libsystem_pthread.dylib              0x0000000184ea2b74 thread_start + 0

We are currently unable to reproduce this locally, but it is now one of our most frequent crashes in production.

I searched the existing bugs and found that this crash might be related to https://groups.google.com/a/chromium.org/forum/#!topic/chromium-dev/W7YNqPabRm4/discussion. Are they the same issue, and is there any workaround?

What is the expected behavior?

What went wrong?
The app crashes when a QUIC connection closes.

Crashed report ID: 

How much crashed? Just one tab

Is it a problem with a plugin? N/A 

Did this work before? N/A 

Chrome version: 65.0.3325.152  Channel: stable
OS Version: iOS10/11/12
Flash Version:
 
Components: Internals>Network>Library
Components: Internals>Network>QUIC
Thanks for reporting the crash! One thing of note is that the current stable version on iOS is 68.0.3440.83, so 65.0.3325.152 is pretty far behind.


Looking at UDPSocketPosix::Close(), it seems that the crash is coming from PCHECK(IGNORE_EINTR(guarded_close_np(socket_, &kSocketFdGuard)) == 0);

I'm not familiar with guarded_close_np, but I'm not entirely surprised that the socket close() is failing, given the previously reported write error (see net::QuicConnection::OnWriteError()).

I wonder whether PCHECK is warranted in this situation.
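
To make the question concrete: the PCHECK aborts the process whenever the close call fails for any reason other than EINTR. The following is a minimal sketch, not Chromium code, of what reporting the error instead of aborting could look like. It assumes plain POSIX close() rather than guarded_close_np, and the helper name CloseAndReport is hypothetical.

```cpp
#include <cerrno>
#include <unistd.h>

// Hypothetical helper, not part of Chromium: closes |fd| and returns 0 on
// success or the errno value on failure, instead of aborting like PCHECK.
// EINTR is treated as success, mirroring what IGNORE_EINTR does for close().
int CloseAndReport(int fd) {
  if (close(fd) == 0)
    return 0;
  const int saved_errno = errno;
  return saved_errno == EINTR ? 0 : saved_errno;
}
```

With plain close(), calling CloseAndReport on a descriptor that is not open (for example -1, or one that was already closed) returns EBADF. A caller could log that value and continue, at the cost of hiding whatever invalidated the descriptor.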
Yes, I am upgrading to 68.0.3440.70 and will see if it resolves this crash. Although 68.0.3440.83 is shown as the latest stable version, I cannot switch to it even after running 'gclient sync --with_branch_heads --with_tags'.
Owner: mef@chromium.org
Status: Assigned (was: Unconfirmed)
I've upgraded Cronet to version 69.0.3497.91. I will wait and see if that resolves the issue here.
Re#4

Unfortunately, bumping the Cronet version to the latest doesn't mitigate this crash. We are still seeing it in production with exactly the same stack. Can we bump up the priority of this one?
Probably not related, but we've seen crashes like this in the past when other parts of apps accidentally double-closed file descriptors (e.g. Issue 640281 and b/113174967) which inadvertently closed Cronet's file descriptors.
Re #9:
I am not able to see the ticket you mentioned. Would you mind sharing the bug description or stack trace here? And is there any possible way to identify and fix this? Thanks!
One other bug was identified using file-descriptor-sanitizer:
https://android.googlesource.com/platform/bionic/+/master/docs/fdsan.md
It found a double-close outside Cronet (in their app) that inadvertently closed Cronet's file descriptors and, I believe, caused a crash similar to this one.

Looks like OSX offers something similar to fdsan with guarded_close_np.
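
The double-close scenario described above can be reproduced in isolation. This is a minimal sketch under stated assumptions: it uses plain POSIX calls rather than Cronet or guarded descriptors, runs single-threaded, and the names DoubleCloseResult and DemonstrateDoubleClose are hypothetical. "App" code closes a descriptor but keeps the stale number, "library" code then opens a UDP socket that reuses that number, and the app's second close takes the library's socket with it.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

struct DoubleCloseResult {
  bool setup_ok;        // both descriptors could be created
  bool number_reused;   // the socket got the app's old descriptor number
  int lib_close_errno;  // errno from the library's own close(), 0 on success
};

DoubleCloseResult DemonstrateDoubleClose() {
  DoubleCloseResult result = {false, false, 0};

  // "App" code opens a file and closes it, but keeps the stale number around.
  const int app_fd = open("/dev/null", O_RDONLY);
  if (app_fd < 0) return result;
  close(app_fd);

  // "Library" code opens a UDP socket. POSIX hands out the lowest free
  // descriptor number, so in a single-threaded run this reuses app_fd.
  const int lib_fd = socket(AF_INET, SOCK_DGRAM, 0);
  if (lib_fd < 0) return result;
  result.setup_ok = true;
  result.number_reused = (lib_fd == app_fd);

  // The bug: the app closes its stale number a second time. If the number
  // was reused, this silently closes the library's socket.
  close(app_fd);

  // The library now closes what it believes is still its socket.
  if (close(lib_fd) != 0) result.lib_close_errno = errno;
  return result;
}
```

When the number is reused, the library's own close() fails with EBADF even though the library did nothing wrong, so the failure shows up on the library's stack rather than at the buggy call site.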
I also realized this is a long-standing issue. We saw similar crashes more than a year ago, with a similar crash stack:

Thread 14 Crashed:
0   Snapchat                             0x0111bb8e base::debug::BreakDebugger() + 20
1   Snapchat                             0x01125e61 logging::LogMessage::~LogMessage() + 1506
2   Snapchat                             0x0112604d logging::ErrnoLogMessage::~ErrnoLogMessage() + 86
3   Snapchat                             0x0125a381 net::UDPSocketPosix::Close() + 176
4   Snapchat                             0x0120e5d9 net::QuicChromiumClientSession::OnConnectionClosed(net::QuicErrorCode, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, net::ConnectionCloseSource) + 1200
5   Snapchat                             0x0122fb55 net::QuicConnection::TearDownLocalConnectionState(net::QuicErrorCode, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, net::ConnectionCloseSource) + 44
6   Snapchat                             0x01231001 net::QuicConnection::OnWriteError(int) + 110
7   Snapchat                             0x01231c0b net::QuicConnection::WritePacket(net::SerializedPacket*) + 678
8   Snapchat                             0x0123203f net::QuicConnection::SendOrQueuePacket(net::SerializedPacket*) + 30
9   Snapchat                             0x0123ae8f net::QuicPacketCreator::OnSerializedPacket() + 30
10  Snapchat                             0x0123af93 net::QuicPacketCreator::Flush() + 76
11  Snapchat                             0x0123bb87 net::QuicPacketGenerator::ConsumeData(unsigned int, net::QuicIOVector, unsigned long long, net::StreamSendingState, net::QuicReferenceCountedPointer<net::QuicAckListenerInterface>) + 580
12  Snapchat                             0x012310cb net::QuicConnection::SendStreamData(unsigned int, net::QuicIOVector, unsigned long long, net::StreamSendingState, net::QuicReferenceCountedPointer<net::QuicAckListenerInterface>) + 190
13  Snapchat                             0x0123f81b net::QuicSession::WritevData(net::QuicStream*, unsigned int, net::QuicIOVector, unsigned long long, net::StreamSendingState, net::QuicReferenceCountedPointer<net::QuicAckListenerInterface>) + 240
14  Snapchat                             0x012466f1 net::QuicStream::WritevDataInner(net::QuicIOVector, unsigned long long, bool, net::QuicReferenceCountedPointer<net::QuicAckListenerInterface>) + 86
15  Snapchat                             0x0124633d net::QuicStream::WritevData(iovec const*, int, bool, net::QuicReferenceCountedPointer<net::QuicAckListenerInterface>) + 258
16  Snapchat                             0x012461a7 net::QuicStream::WriteOrBufferData(base::BasicStringPiece<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, bool, net::QuicReferenceCountedPointer<net::QuicAckListenerInterface>) + 210
17  Snapchat                             0x01234dfb net::QuicCryptoStream::SendHandshakeMessage(net::CryptoHandshakeMessage const&) + 60
18  Snapchat                             0x012340eb net::QuicCryptoClientStream::DoSendCHLO(net::QuicCryptoClientConfig::CachedState*) + 868
19  Snapchat                             0x0123395f net::QuicCryptoClientStream::DoHandshakeLoop(net::CryptoHandshakeMessage const*) + 280
20  Snapchat                             0x01233ce7 net::QuicCryptoClientStream::CryptoConnect() + 18
21  Snapchat                             0x0120d885 net::QuicChromiumClientSession::CryptoConnect(base::Callback<void (int), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&) + 42
22  Snapchat                             0x0121681f net::QuicStreamFactory::Job::DoConnect() + 288
23  Snapchat                             0x0121655b net::QuicStreamFactory::Job::DoLoop(int) + 200
24  Snapchat                             0x01216473 net::QuicStreamFactory::Job::Run(base::Callback<void (int), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&) + 12
25  Snapchat                             0x012174b7 net::QuicStreamFactory::Create(net::QuicServerId const&, net::HostPortPair const&, int, GURL const&, base::BasicStringPiece<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, net::NetLogWithSource const&, net::QuicStreamRequest*) + 724
26  Snapchat                             0x012171b3 net::QuicStreamRequest::Request(net::HostPortPair const&, net::PrivacyMode, int, GURL const&, base::BasicStringPiece<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, net::NetLogWithSource const&, base::Callback<void (int), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&) + 94
27  Snapchat                             0x011ed191 net::HttpStreamFactoryImpl::Job::DoInitConnectionImpl() + 734
28  Snapchat                             0x011ec4d7 net::HttpStreamFactoryImpl::Job::DoInitConnection() + 20
29  Snapchat                             0x011ec131 net::HttpStreamFactoryImpl::Job::DoLoop(int) + 114
30  Snapchat                             0x011eb175 net::HttpStreamFactoryImpl::Job::RunLoop(int) + 56
31  Snapchat                             0x011eb087 net::HttpStreamFactoryImpl::Job::StartInternal() + 80
32  Snapchat                             0x011ee755 net::HttpStreamFactoryImpl::JobController::CreateJobs(net::HttpRequestInfo const&, net::RequestPriority, net::SSLConfig const&, net::SSLConfig const&, net::HttpStreamRequest::Delegate*, net::HttpStreamRequest::StreamType) + 398
33  Snapchat                             0x011ee5bb net::HttpStreamFactoryImpl::JobController::Start(net::HttpRequestInfo const&, net::HttpStreamRequest::Delegate*, net::WebSocketHandshakeStreamBase::CreateHelper*, net::NetLogWithSource const&, net::HttpStreamRequest::StreamType, net::RequestPriority, net::SSLConfig const&, net::SSLConfig const&) + 130
34  Snapchat                             0x011e9975 net::HttpStreamFactoryImpl::RequestStreamInternal(net::HttpRequestInfo const&, net::RequestPriority, net::SSLConfig const&, net::SSLConfig const&, net::HttpStreamRequest::Delegate*, net::WebSocketHandshakeStreamBase::CreateHelper*, net::HttpStreamRequest::StreamType, bool, bool, net::NetLogWithSource const&) + 106
35  Snapchat                             0x011e9905 net::HttpStreamFactoryImpl::RequestStream(net::HttpRequestInfo const&, net::RequestPriority, net::SSLConfig const&, net::SSLConfig const&, net::HttpStreamRequest::Delegate*, bool, bool, net::NetLogWithSource const&) + 42
36  Snapchat                             0x011e0bad net::HttpNetworkTransaction::DoCreateStream() + 192
37  Snapchat                             0x011dfa51 net::HttpNetworkTransaction::DoLoop(int) + 550
38  Snapchat                             0x011df80d net::HttpNetworkTransaction::Start(net::HttpRequestInfo const*, base::Callback<void (int), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&, net::NetLogWithSource const&) + 96
39  Snapchat                             0x01284ed7 net::URLRequestHttpJob::StartTransactionInternal() + 454
40  Snapchat                             0x01284cfd net::URLRequestHttpJob::MaybeStartTransactionInternal(int) + 376
41  Snapchat                             0x01284b69 net::URLRequestHttpJob::StartTransaction() + 172
42  Snapchat                             0x0128544b net::URLRequestHttpJob::SetCookieHeaderAndStart(std::__1::vector<net::CanonicalCookie, std::__1::allocator<net::CanonicalCookie> > const&) + 114
43  Snapchat                             0x012ff7f3 net::CookieStoreIOS::GetCookieListWithOptionsAsync(GURL const&, net::CookieOptions const&, base::Callback<void (std::__1::vector<net::CanonicalCookie, std::__1::allocator<net::CanonicalCookie> > const&), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&) + 184
44  Snapchat                             0x01284451 net::URLRequestHttpJob::AddCookieHeaderAndStart() + 238
45  Snapchat                             0x01284071 net::URLRequestHttpJob::Start() + 368
46  Snapchat                             0x01280ba7 net::URLRequest::StartJob(net::URLRequestJob*) + 440
47  Snapchat                             0x01280849 net::URLRequest::Start() + 260
48  Snapchat                             0x01305833 net::HttpProtocolHandlerCore::Start(id<CRNNetworkClientProtocol>) + 1148
49  Snapchat                             0x0111c12b base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) + 188
50  Snapchat                             0x01129161 base::MessageLoop::RunTask(base::PendingTask*) + 310
51  Snapchat                             0x011293c7 base::MessageLoop::DeferOrRunPendingTask(base::PendingTask) + 148
52  Snapchat                             0x011294f3 base::MessageLoop::DoWork() + 252
53  Snapchat                             0x0115a64d base::MessagePumpCFRunLoopBase::RunWork() + 106
54  Snapchat                             0x01159ea3 base::MessagePumpCFRunLoopBase::RunWorkSource(void*) + 62
55  CoreFoundation                       0x1d32ffdd __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 10
56  CoreFoundation                       0x1d32fb05 __CFRunLoopDoSources0 + 422
57  CoreFoundation                       0x1d32df51 __CFRunLoopRun + 1158
58  CoreFoundation                       0x1d2811af CFRunLoopRunSpecific + 468
59  CoreFoundation                       0x1d280fd1 CFRunLoopRunInMode + 102
60  Foundation                           0x1dbd5ab5 -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 256
61  Snapchat                             0x0115af55 base::MessagePumpNSRunLoop::DoRun(base::MessagePump::Delegate*) + 118
62  Snapchat                             0x01159717 base::MessagePumpCFRunLoopBase::Run(base::MessagePump::Delegate*) + 84
63  Snapchat                             0x01135737 base::RunLoop::Run() + 34
64  Snapchat                             0x0114a0e7 base::Thread::ThreadMain() + 208
65  Snapchat                             0x01146f5b base::(anonymous namespace)::ThreadFunc(void*) + 60
66  libsystem_pthread.dylib              0x1cbf893b _pthread_body + 214
67  libsystem_pthread.dylib              0x1cbf885d _pthread_start + 232
68  libsystem_pthread.dylib              0x1cbf6468 thread_start + 6

I believe we were using Cronet m61 at that time. A similar crash is now the top iOS crash for our app. Do we have any clues about this crash? Could it be related to our use of NSURLConnection (which is deprecated) in some places in our app? Thanks.
I've created a prototype CL https://chromium-review.googlesource.com/c/chromium/src/+/1344290 that uses change_fdguard_np to detect unexpected socket closings.

Unfortunately, running the Cronet tests with this CL didn't detect any issues.

There is some discussion about performance implications, but in theory we should be able to use it in release builds.
Thanks Misha! Ryan and I discussed this issue a little bit, and we believe the root cause is not in the QUIC code but in some other buggy code. The short-term fix was to land code that guards our socket so that the crash surfaces on the correct stack.
We will also need the Snapchat folks to give us feedback on where the crash moves so that we can fix the issue completely.
Hi Misha & Zhongyi,

Thanks for helping us check the crash! https://bugs.chromium.org/p/chromium/issues/detail?id=878429

Unfortunately, the crash is still happening after we patched your fix into our latest app. Also, it only crashes on iOS 12.

Here is the stack trace, thanks!


0   Snapchat                             logging::LogMessage::~LogMessage() + 4311904744
1   Snapchat                             logging::LogMessage::~LogMessage() + 4311904224
2   Snapchat                             logging::ErrnoLogMessage::~ErrnoLogMessage() + 4311905516
3   Snapchat                             net::UDPSocketPosix::Close() + 4314508320
4   Snapchat                             net::QuicChromiumClientSession::OnConnectionClosed(quic::QuicErrorCode, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, quic::ConnectionCloseSource) + 4313665840
5   Snapchat                             quic::QuicConnection::TearDownLocalConnectionState(quic::QuicErrorCode, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, quic::ConnectionCloseSource) + 4314134604
6   Snapchat                             quic::QuicConnection::CheckForTimeout() + 4314153112
7   Snapchat                             base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) + 4311834020
8   Snapchat                             base::MessageLoop::RunTask(base::PendingTask*) + 4311945212
9   Snapchat                             base::MessageLoop::DoDelayedWork(base::TimeTicks*) + 4311946868
10  Snapchat                             base::MessagePumpCFRunLoopBase::RunWork() + 4312490860
11  Snapchat                             base::MessagePumpCFRunLoopBase::RunWorkSource(void*) + 4312489380
12  CoreFoundation                       __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24
13  CoreFoundation                       __CFRunLoopDoSource0 + 88
14  CoreFoundation                       __CFRunLoopDoSources0 + 176
15  CoreFoundation                       __CFRunLoopRun + 1040
16  CoreFoundation                       CFRunLoopRunSpecific + 436
17  Foundation                           -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 300
18  Snapchat                             base::MessagePumpNSRunLoop::DoRun(base::MessagePump::Delegate*) + 4312493020
19  Snapchat                             base::MessagePumpCFRunLoopBase::Run(base::MessagePump::Delegate*) + 4312488440
20  Snapchat                             base::RunLoop::Run() + 4312080268
21  Snapchat                             base::Thread::ThreadMain() + 4312277068
22  Snapchat                             base::(anonymous namespace)::ThreadFunc(void*) + 4312478164
23  libsystem_pthread.dylib              _pthread_body + 128
24  libsystem_pthread.dylib              _pthread_start + 48
25  libsystem_pthread.dylib              thread_start + 4

Summary: Crash at net::UDPSocketPosix::Close() (was: Crash at base::debug::BreakDebugger())
