Issue 739886 link

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: Sep 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android , All
Pri: 1
Type: Bug




Crash in rtc::FatalMessage::~FatalMessage

Project Member Reported by ClusterFuzz, Jul 6 2017

Issue description

Detailed report: https://clusterfuzz.com/testcase?key=6244586442981376

Fuzzer: inferno_layout_test_unmodified
Job Type: android_asan_chrome_latest
Platform Id: android:hammerhead:l

Crash Type: UNKNOWN
Crash Address: 
Crash State:
  rtc::FatalMessage::~FatalMessage
  rtc::PlatformThread::Start
  webrtc::internal::Call::Call
  
Sanitizer: address (ASAN)

Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=6244586442981376


Issue filed automatically.

See https://dev.chromium.org/Home/chromium-security/bugs/reproducing-clusterfuzz-bugs for more information.
 
Cc: msrchandra@chromium.org henrike@webrtc.org
Components: Blink>WebRTC
Labels: Test-Predator-Correct-CLs
Assigning to concern owner from Predator results --
Regression information is not available. The result is the blame information. 

Author: henrike@webrtc.org
Project: chromium-webrtc
Changelist: https://chromium.googlesource.com/external/webrtc/trunk/webrtc.git/+/47be73b8629244d6bb63a28198f97f040ce53d21
Time: Tue May 13 18:00:26 2014
The CL last changed line 110 of file checks.cc, which is stack frame 5. 

Author: pbos
Project: chromium-webrtc
Changelist: https://chromium.googlesource.com/external/webrtc/trunk/webrtc.git/+/9410e011265bda08a673be1e2f99b9afd02614c1
Time: Mon Nov 23 22:47:56 2015
The CL last changed line 162 of file platform_thread.cc, which is stack frame 6. 

Author: nisse
Project: chromium-webrtc
Changelist: https://chromium.googlesource.com/external/webrtc/trunk/webrtc.git/+/7d5b2665b5c6a37ebad6c6bcbf5f4a9ee8408c65
Time: Thu Jan 19 13:41:25 2017
The CL last changed line 448 of file call.cc, which is stack frame 7. 

Author: zstein
Project: chromium-webrtc
Changelist: https://chromium.googlesource.com/external/webrtc/trunk/webrtc.git/+/29a1f8c3a67d890be8fa7c602b15f03485f4f1cc
Time: Mon May 08 18:52:38 2017
The CL last changed line 378 of file call.cc, which is stack frame 8. 

Author: zhihuang
Project: chromium-webrtc
Changelist: https://chromium.googlesource.com/external/webrtc/trunk/webrtc.git/+/f91805c46d7a5067e3b4ce2bd066ca504dc167f3
Time: Thu Jun 15 19:52:32 2017
The CL last changed line 349 of file peerconnectionfactory.cc, which is stack frame 9. 

Author: Henrik Kjellander
Project: chromium-webrtc
Changelist: https://chromium.googlesource.com/external/webrtc/trunk/webrtc.git/+/88b2dd4d05208b2dec968a88c5fcc5d8f8152b7f
Time: Thu Jun 29 05:52:50 2017
The CL last changed line 164 of file bind.h, which is stack frame 10. 

Author: Henrik Kjellander
Project: chromium-webrtc
Changelist: https://chromium.googlesource.com/external/webrtc/trunk/webrtc.git/+/88b2dd4d05208b2dec968a88c5fcc5d8f8152b7f
Time: Thu Jun 29 05:52:50 2017
The CL last changed line 155 of file bind.h, which is stack frame 11.

@henrike -- Could you please look into this issue? Kindly re-assign it if it is not related to your changes.
Thank You.

Comment 3 by guidou@chromium.org, Jul 31 2017

Owner: nisse@chromium.org
Status: Assigned (was: Untriaged)
A bisect of the WebRTC error range shows that the crash starts with this CL:

https://chromium.googlesource.com/external/webrtc/trunk/webrtc.git/+/c0ff88b15124baa1dbcd671f4b7f8ffeba5b7144


Assigning to nisse@, who authored it.

Comment 4 by guidou@chromium.org, Jul 31 2017

Cc: nisse@webrtc.org

Comment 5 by guidou@chromium.org, Jul 31 2017

Labels: -OS-Android OS-All

Comment 6 by guidou@chromium.org, Jul 31 2017

Cc: guidou@chromium.org
Project Member

Comment 7 by ClusterFuzz, Aug 20 2017

Labels: OS-Android
Project Member

Comment 8 by ClusterFuzz, Sep 5 2017

Status: WontFix (was: Assigned)
ClusterFuzz testcase 6244586442981376 is flaky and no longer crashes, so closing issue.

If this is incorrect, please add ClusterFuzz-Wrong label and re-open the issue.
Status: Assigned (was: WontFix)
nisse@: Can you take a look at this? Apparently it's flaky, but I could reproduce reliably and tracked it to the CL in #3 a few weeks ago.
Issue 762366 has been merged into this issue.
I'll investigate.
Log messages just prior to the crash:

[1:18:0908/113418.666550:ERROR:platform_thread_posix.cc(123)] pthread_create: Resource temporarily unavailable (11)
[1:18:0908/113418.667014:ERROR:thread.cc(117)] failed to create thread
[1:18:0908/113418.667369:FATAL:task_queue.cc(104)] Check failed: result. 
#0 0x7f4375c7869d base::debug::StackTrace::StackTrace()
#1 0x7f4375c76a6c base::debug::StackTrace::StackTrace()
#2 0x7f4375d0762a logging::LogMessage::~LogMessage()
#3 0x7f4370446d07 rtc::TaskQueue::TaskQueue()
#4 0x7f437162126f webrtc::internal::Call::Call()
#5 0x7f43716205a3 webrtc::Call::Create()
#6 0x7f437163a679 webrtc::CallFactory::CreateCall()
#7 0x7f4371dfd93b webrtc::PeerConnectionFactory::CreateCall_w()

My interpretation is as follows: The ClusterFuzz script creates lots of peer connections (I could add some logging to the PeerConnection constructor and destructor to confirm this).

These peer connections consume a lot of threads. Eventually, creating a new thread fails, and then a CHECK in Chrome's TaskQueue constructor crashes.
Improving error handling in this case seems difficult.

It could also be that we have a thread leak when a peer connection is destroyed.
I've added logging of the number of peer connections and the number of threads in the PeerConnection constructor and destructor. The test creates 10402 peer connections before crashing and never destroys any. Log excerpt:

[133474:133525:0908/134015.685199:WARNING:peerconnection.cc(420)] PC, pid 133474: created 1, destroyed 0, #threads 26
...
[133474:133525:0908/135021.434895:WARNING:peerconnection.cc(420)] PC, pid 133474: created 10402, destroyed 0, #threads 31246
[133474:133526:0908/135021.443370:WARNING:rtc_event_log.cc(833)] Denied creation of additional WebRTC event logs. 5 logs open already.
[133474:133526:0908/135021.443974:ERROR:platform_thread_posix.cc(123)] pthread_create: Resource temporarily unavailable (11)
[133474:133526:0908/135021.444316:ERROR:thread.cc(117)] failed to create thread
[133474:133526:0908/135021.444500:FATAL:task_queue.cc(104)] Check failed: result. 

So it seems the process can't spawn more than approximately 32000 threads (tested on my Linux workstation).

There's little we can do in WebRTC about exhaustion of OS threads. It would be good if someone on the Chrome team could take over the issue.

Maybe Chromium could enforce some arbitrary limit on the number of peer connections per render process, to fail in a nicer manner? Or else, ClusterFuzz needs to stop creating 10000 peer connections.
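
To make the idea of a per-process limit concrete, here is a minimal sketch of a counter with a cap. The class name PeerConnectionLimiter, its methods, and the cap value are all hypothetical; nothing like this is confirmed to exist in Chromium or WebRTC, and where such a check would be wired in is left open.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical per-process limiter. The name and interface are assumptions
// made for illustration; this is not Chromium or WebRTC code.
class PeerConnectionLimiter {
 public:
  explicit PeerConnectionLimiter(std::size_t max_live) : max_live_(max_live) {}

  // Reserves a slot. Returns false once the cap is reached, so the caller
  // can reject the request instead of running the process out of threads.
  bool TryAcquire() {
    std::size_t current = live_.load(std::memory_order_relaxed);
    while (current < max_live_) {
      if (live_.compare_exchange_weak(current, current + 1,
                                      std::memory_order_relaxed)) {
        return true;
      }
      // On failure compare_exchange_weak reloads `current`; loop and retry.
    }
    return false;
  }

  // Gives the slot back when a peer connection is destroyed.
  void Release() { live_.fetch_sub(1, std::memory_order_relaxed); }

  // Number of slots currently held.
  std::size_t live() const { return live_.load(std::memory_order_relaxed); }

 private:
  const std::size_t max_live_;
  std::atomic<std::size_t> live_{0};
};
```

A caller would invoke TryAcquire() before constructing a peer connection and Release() from its destructor. With the roughly three threads per peer connection seen in the log above, a cap in the low hundreds would stay far below the observed thread limit, but the right value is a policy decision.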

BTW: This function returns the number of threads on Linux (requires --no-sandbox when running Chrome):


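#include <sys/stat.h>

// Returns the number of threads in the current process, or 0 on failure.
int GetThreadCount() {
  struct stat st;
  if (stat("/proc/self/task/", &st) < 0)
    return 0;
  // The link count is 2 (for "." and "..") plus one per thread subdirectory.
  return static_cast<int>(st.st_nlink) - 2;
}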
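
As a cross-check for the link-count approach, the same directory can be listed and its numeric entries counted. This is only a sketch: the function name CountThreadsByListing is made up for illustration, and it assumes a Linux /proc filesystem that the process is allowed to read.

```cpp
#include <dirent.h>
#include <cctype>

// Hypothetical alternative to GetThreadCount(): counts the entries of
// /proc/self/task whose names start with a digit, one per thread.
// Returns -1 if the directory cannot be opened (for example, inside a
// sandbox or on a system without /proc).
int CountThreadsByListing() {
  DIR* dir = opendir("/proc/self/task");
  if (dir == nullptr)
    return -1;
  int count = 0;
  while (struct dirent* entry = readdir(dir)) {
    // Thread subdirectories are named by thread ID; this skips "." and "..".
    if (std::isdigit(static_cast<unsigned char>(entry->d_name[0])))
      ++count;
  }
  closedir(dir);
  return count;
}
```

Listing costs one readdir call per thread, so the stat-based version is cheaper when there are tens of thousands of threads; the listing variant is mainly useful for confirming that the link count matches.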

Comment 14 by nisse@chromium.org, Sep 12 2017

Labels: -Restrict-View-Google
Status: WontFix (was: Assigned)
I've filed bug https://bugs.chromium.org/p/chromium/issues/detail?id=764265 about adding a limit on the number of peer connections.

Closing as WontFix, since we don't try to recover nicely from resource exhaustion.
Project Member

Comment 15 by ClusterFuzz, Sep 20 2017

Labels: Needs-Feedback
ClusterFuzz testcase 5386304924942336 is still reproducing on tip-of-tree build (trunk).

If this testcase was not reproducible locally or unworkable, ignore this notification and we will file another bug soon with hopefully a better and workable testcase.

Otherwise, if this is not intended to be fixed (e.g. this is an intentional crash), please add ClusterFuzz-Ignore label to prevent future bug filing with similar crash stacktrace.
