New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 820830 link

Starred by 2 users

Issue metadata

Status: Verified
Owner:
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 1
Type: Bug-Security



Sign in to add a comment

Bad-cast to webrtc::VideoStreamEncoder from invalid vptr in rtc::ClosureTask<webrtc::VideoStreamEncoder::OnEncodedImage

Project Member Reported by ClusterFuzz, Mar 11 2018

Issue description

Detailed report: https://clusterfuzz.com/testcase?key=5981177547325440

Fuzzer: phoglund_webrtc_peerconnection
Job Type: linux_ubsan_vptr_chrome
Platform Id: linux

Crash Type: Bad-cast
Crash Address: 0x2a7609edb100
Crash State:
  Bad-cast to webrtc::VideoStreamEncoder from invalid vptr
  rtc::ClosureTask<webrtc::VideoStreamEncoder::OnEncodedImage
  rtc::TaskQueue::Impl::RunTask
  
Sanitizer: undefined (UBSAN)

Recommended Security Severity: High

Regressed: https://clusterfuzz.com/revisions?job=linux_ubsan_vptr_chrome&range=542245:542267

Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=5981177547325440

Additional requirements: Requires HTTP

Issue filed automatically.

See https://github.com/google/clusterfuzz-tools for more information.
 
Project Member

Comment 1 by ClusterFuzz, Mar 11 2018

Components: Blink>WebRTC Internals>Core
Labels: Test-Predator-Auto-Components
Automatically applying components based on crash stacktrace and information from OWNERS files.

If this is incorrect, please apply the Test-Predator-Wrong-Components label.
Project Member

Comment 2 by ClusterFuzz, Mar 11 2018

Labels: Test-Predator-Auto-Owner
Owner: tommi@chromium.org
Status: Assigned (was: Untriaged)
Automatically assigning owner based on suspected regression changelist https://chromium.googlesource.com/chromium/src/+/5773863a93f2b00e2a40fa134ad32f1da48ae588 (Make WebRTC task queues run on chromium's SequencedTaskRunner.).

If this is incorrect, please let us know why and apply the Test-Predator-Wrong-CLs label. If you aren't the correct owner for this issue, please unassign yourself as soon as possible so it can be re-triaged.

Comment 3 by tommi@chromium.org, Mar 11 2018

Owner: nisse@chromium.org
This third one also seems related to VideoStreamEncoder::OnEncodedImage.

Comment 4 by nisse@chromium.org, Mar 12 2018

Cc: perkj@chromium.org
Logs for the first case shows the crash is just after RemoveSendStream. Do you think this is a use-after-free, or some more subtle corruption?

Let me think aloud.

For use after-free, one could suspect either the new tq code, and its handling of the is_active flag. I've checked out revision 542374 and run gclient sync to get the right version of webrtc. Line number in the stack trace points here:

  encoder_queue_.PostTask(
      [this, timestamp, time_sent_us, qp, capture_time_us, encode_duration_us] {
-->     RTC_DCHECK_RUN_ON(&encoder_queue_);
        overuse_detector_->FrameSent(timestamp, time_sent_us, capture_time_us,
                                     encode_duration_us);
        if (quality_scaler_ && qp >= 0)
          quality_scaler_->ReportQP(qp);
      });

So one hypothesis is that we have this sequence of events:

1. On EncodedImage is called on a valid VSE. It posts a task  to update the overuse_frame_detector_ and quality_scaler_.

2. VSE is destroyed.

3. The task is nevertheless run, accessing the destroyed VSE.

Alternatively, it could be the VSE's own destruction handling that is broken, with OnEncodedImage called during or after destruction. It looks like this is the responsibility of VSE::Stop(), which calls 

    video_sender_.RegisterExternalEncoder(nullptr, settings_.payload_type,
                                          false);

It's not entirely clear to me how that cancels encoding of in-flight frames, but it seems it calls VCMEncoderDataBase::DeleteEncoder() which destroys the underlying VCMGenericEncoder, and then VSE::Stop() blocks waiting for completion of that task.

If VSE was destroyed without Stop being called first, would we see that? There's a DCHECK, are those enabled in fuzzer builds. If use-after-free here it is considered a security problem, that DCHECK should be upgraded to a CHECK. (And as far as I understand, the reason we have a Stop method at all is that it's too late to synchronize in the destructor, due to the way the vptr is handled during destruction).

It could also be a destruction order thing; the destructor starts running before the taskqueue (a member variable) is destroyed, then tasks scheduled to run during that short interval will see a half-destructed object. I think this is a problem we've seen in other places too. To fix, the tq would need to be destroyed by VSE::Stop().

And I'd wish C++ destructors provided more control, to let part of the destructor run with a valid object, doing any needed synchronization, and then explicitly say at which point inside the destructor the object should be considered invalid and vptr rewritten.

Comment 5 by nisse@chromium.org, Mar 12 2018

Had a quick chat with perkj. As far as we see, the task about have no reason to access the vptr at all.

Are fuzzers running with some sanitizer trying to detect use of objects after the destructor has started running (which I would guess is formally undefined behavior)? Then it's not enough to ensure that the TaskQueue is destroyed before all other member variables.
Project Member

Comment 6 by sheriffbot@chromium.org, Mar 12 2018

Labels: M-66
Project Member

Comment 7 by sheriffbot@chromium.org, Mar 12 2018

Labels: ReleaseBlock-Stable
This is a serious security regression. If you are not able to fix this quickly, please revert the change that introduced it.

If this doesn't affect a release branch, or has not been properly classified for severity, please update the Security_Impact or Security_Severity labels, and remove the ReleaseBlock label. To disable this altogether, apply ReleaseBlock-NA.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Comment 8 by nisse@chromium.org, Mar 12 2018

Cc: tommi@chromium.org
Project Member

Comment 9 by sheriffbot@chromium.org, Mar 12 2018

Labels: Pri-1
Project Member

Comment 10 by bugdroid1@chromium.org, Mar 12 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/be53eac959ea4146e4fa7d9db046f5ec4c861307

commit be53eac959ea4146e4fa7d9db046f5ec4c861307
Author: Tommi <tommi@chromium.org>
Date: Mon Mar 12 15:30:03 2018

Add call to TaskQueue::Impl::Stop() in TQ's dtor.

This call got removed in my last CL by mistake:
https://chromium-review.googlesource.com/c/chromium/src/+/890738

Bug:  808801 , 820936,  820769 ,  820830 ,  820827 
Change-Id: I6c6efb3d40d43bc564e18b013ddf5d5d3b09e1eb
Reviewed-on: https://chromium-review.googlesource.com/958218
Commit-Queue: Tommi <tommi@chromium.org>
Reviewed-by: Guido Urdaneta <guidou@chromium.org>
Cr-Commit-Position: refs/heads/master@{#542486}
[modify] https://crrev.com/be53eac959ea4146e4fa7d9db046f5ec4c861307/third_party/webrtc_overrides/rtc_base/task_queue.cc

Comment 11 by tommi@chromium.org, Mar 12 2018

Cc: -tommi@chromium.org nisse@chromium.org
Owner: tommi@chromium.org

Comment 12 by nisse@chromium.org, Mar 13 2018

Cc: holmer@chromium.org
Project Member

Comment 13 by ClusterFuzz, Mar 13 2018

ClusterFuzz has detected this issue as fixed in range 542485:542486.

Detailed report: https://clusterfuzz.com/testcase?key=5981177547325440

Fuzzer: phoglund_webrtc_peerconnection
Job Type: linux_ubsan_vptr_chrome
Platform Id: linux

Crash Type: Bad-cast
Crash Address: 0x2a7609edb100
Crash State:
  Bad-cast to webrtc::VideoStreamEncoder from invalid vptr
  rtc::ClosureTask<webrtc::VideoStreamEncoder::OnEncodedImage
  rtc::TaskQueue::Impl::RunTask
  
Sanitizer: undefined (UBSAN)

Recommended Security Severity: High

Regressed: https://clusterfuzz.com/revisions?job=linux_ubsan_vptr_chrome&range=542245:542267
Fixed: https://clusterfuzz.com/revisions?job=linux_ubsan_vptr_chrome&range=542485:542486

Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=5981177547325440

Additional requirements: Requires HTTP

See https://github.com/google/clusterfuzz-tools for more information.

If you suspect that the result above is incorrect, try re-doing that job on the test case report page.

Comment 14 by tommi@chromium.org, Mar 13 2018

Status: Fixed (was: Assigned)
Project Member

Comment 15 by sheriffbot@chromium.org, Mar 13 2018

Labels: -Restrict-View-SecurityTeam Restrict-View-SecurityNotify
Project Member

Comment 16 by ClusterFuzz, Mar 15 2018

Labels: ClusterFuzz-Verified
Status: Verified (was: Fixed)
ClusterFuzz testcase 5981177547325440 is verified as fixed, so closing issue as verified.

If this is incorrect, please add ClusterFuzz-Wrong label and re-open the issue.
Project Member

Comment 17 by sheriffbot@chromium.org, Mar 27 2018

Labels: -Security_Impact-Head Security_Impact-Beta
Labels: -ReleaseBlock-Stable -M-66 M-67
Project Member

Comment 19 by sheriffbot@chromium.org, Apr 27 2018

Labels: Merge-Request-67
Project Member

Comment 20 by sheriffbot@chromium.org, Apr 27 2018

Labels: -Merge-Request-67 Merge-Review-67 Hotlist-Merge-Review
This bug requires manual review: M67 has already been promoted to the beta branch, so this requires manual review
Please contact the milestone owner if you have questions.
Owners: cmasso@(Android), cmasso@(iOS), kbleicher@(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Cc: awhalley@chromium.org
+awhalley@ for M67 merge review. 
Already in M67, no merge needed.
Labels: -Hotlist-Merge-Review -Merge-Review-67 Merge-Rejected-67
Project Member

Comment 24 by sheriffbot@chromium.org, Jun 19 2018

Labels: -Restrict-View-SecurityNotify allpublic
This bug has been closed for more than 14 weeks. Removing security view restrictions.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Sign in to add a comment