New issue
Advanced search Search tips

Issue 629954 link

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: Jul 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Chrome
Pri: 3
Type: Bug-Security



Sign in to add a comment

Heap-use-after-free in media::midi::MidiManagerAlsa::EventLoop

Project Member Reported by ClusterFuzz, Jul 20 2016

Issue description

Detailed report: https://cluster-fuzz.appspot.com/testcase?key=6381635469836288

Fuzzer: ipc_fuzzer_gen
Job Type: linux_asan_chrome_ipc
Platform Id: linux

Crash Type: Heap-use-after-free READ 8
Crash Address: 0x618000007b10
Crash State:
  media::midi::MidiManagerAlsa::EventLoop
  base::debug::TaskAnnotator::RunTask
  base::MessageLoop::RunTask
  
Recommended Security Severity: High


Unminimized Testcase: https://cluster-fuzz.appspot.com/download/AMIfv95IOiuVH-2t2O_EAF38elfOHd-asV3Y3-nY17w02hShasX5u2y07OkKEWOqEmuGpUmv85abcrjgb-040iFSDnBXuQZuypLR3TLfdGhoyTzUOrj2IBk4VBPK1Dfa1K_4Lj2RWPFVLOF_ZhNKhnCCJrLfMWpC4ZMj0IbH3Hb1f5L2ikI3kpc?testcase_id=6381635469836288


Filer: inferno

See https://dev.chromium.org/Home/chromium-security/bugs/reproducing-clusterfuzz-bugs for more information.
 
Owner: toyoshim@chromium.org
Status: Assigned (was: Available)
Components: Blink>Media
Cc: agoode@chromium.org
Components: -Blink>Media Blink>WebMIDI
Cc: toyoshim@chromium.org
Labels: OS-Chrome
Labels: -ClusterFuzz Clusterfuzz
==11706==ERROR: AddressSanitizer: heap-use-after-free on address 0x618000007b10 at pc 0x7fd8fd1049f9 bp 0x7fd737a52570 sp 0x7fd737a52568
READ of size 8 at 0x618000007b10 thread T26 (MidiEventThread)
SCARINESS: 51 (8-byte-read-heap-use-after-free)
    #0 0x7fd8fd1049f8 in message_loop base/threading/thread.h:162:46
    #1 0x7fd8fd1049f8 in ScheduleEventLoop media/midi/midi_manager_alsa.cc:886
    #2 0x7fd8fd1049f8 in media::midi::MidiManagerAlsa::EventLoop() media/midi/midi_manager_alsa.cc:971

So, suspicious objects we tough here on the event thread is one of |this|, |event_thread_|, and |event_thread_.message_loop()|.
These are expected to outlive. So this looks a race on the thread termination or we may forget to use mutex in right places.
I could not reproduce it on local machine as expected by Unreproducible label.

Adam, could you double-check the termination steps at Finalize() and dtor?
freed by thread T0 (chrome) here:
    #0 0x7fd8e941889b in operator delete(void*)
    #1 0x7fd8f6ea60fc in operator() buildtools/third_party/libc++/trunk/include/memory:2529:13
    #2 0x7fd8f6ea60fc in reset buildtools/third_party/libc++/trunk/include/memory:2735
    #3 0x7fd8f6ea60fc in ~unique_ptr buildtools/third_party/libc++/trunk/include/memory:2703
    #4 0x7fd8f6ea60fc in content::BrowserMainLoop::~BrowserMainLoop() content/browser/browser_main_loop.cc:430
    #5 0x7fd8f6ea739a in content::BrowserMainLoop::~BrowserMainLoop() content/browser/browser_main_loop.cc:426:37
    #6 0x7fd8f64fdfe6 in operator() buildtools/third_party/libc++/trunk/include/memory:2529:13
    #7 0x7fd8f64fdfe6 in reset buildtools/third_party/libc++/trunk/include/memory:2735
    #8 0x7fd8f64fdfe6 in content::BrowserMainRunnerImpl::Shutdown() content/browser/browser_main_runner.cc:224
    #9 0x7fd8f64fc3e5 in content::BrowserMain(content::MainFunctionParams const&) content/browser/browser_main.cc:48:16
    #10 0x7fd8ea502c75 in content::ContentMainRunnerImpl::Run() content/app/content_main_runner.cc:785:12
    #11 0x7fd8ea4fd22d in content::ContentMain(content::ContentMainParams const&) content/app/content_main.cc:20:28
    #12 0x7fd8e941a765 in ChromeMain chrome/app/chrome_main.cc:85:12
    #13 0x7fd8de802f44 in __libc_start_main /build/eglibc-oGUzwX/eglibc-2.19/csu/libc-start.c:287

MidiManager and BrowserMainLoop is destructed at the point Chrome crashes. This is super curious.
BrowserMainLoop calls MidiManager::Shutdown(), then stops and joins IOThread before it exits.

MidiManager::Shutdown() runs on the main thread at this point, and it post a task (MidiManager::Finalize) to the IOThread.
MidiManager expects the posted task runs on the IOThread before the IOThread stops (1).

MidiManager::Finalize() stops and joins the EventThread. So, the expectation (1) is correct, the EventThread should not run after BrowserMainLoop joins the IOThread. But it happens.
Cc: yhirano@chromium.org
Let me add CHECK()s to ensure related threads stop certainly  before MidiManager is destructed.

cc: yhirano for code review.
Hum... I notice that ALSA fails initialization if msan is enabled.
This is what I can see when Web MIDI is called on msan enabed chromium on Linux.

> ALSA lib conf.c:3314:(snd_config_hooks_call) Cannot open shared library libasound_module_conf_pulse.so
> ALSA lib seq.c:935:(snd_seq_open_noupdate) Unknown SEQ hw

I will check what happens on shutdown if MidiManagerAlsa fails to initialize ALSA.
Checked.

The first snd_seq_open() for SND_SEQ_OPEN_INPUT fails in StartInitialization().
After this failure, it returns with an error code immediately.

On shutdown, Finalize() and dtor do nothing correctly.

Actually, in this case, EventThread does not start. So, this isn't the case of Clusterfuzz, but it's just happening only on my local machine, maybe.
Note: this is a picture how each thread runs.

UIThread              IOThread
    |                     |
new BrowserMainLoop       |
    |                     |
new MidiManager           |
    |         => post MidiManagerAlsa::StartSession()
    |                     |
    |                  - event_thread_.Start() => start EventThread
    |                     |                                 |
    :                     :                                 :
    :                     :                                 :
    :                     :                                 :
(running)             (running)                         (running)
    :                     :                                 :
    :                     :                                 :
    :                     :                                 :
MidiManager::Shutdown()   |                                 |
    |         => post MidiManagerAlsa::Finalize()       MidiManagerAlsa::EventLoop()
delete IOThread           |                                 |
    |                  - out_client_.reset() => unblock  - case SND_SEQ_EVENT_CLIENT_EXIT
    |                     |                                 |
    |                  - event_thread_.Stop()               |
    |                     |                     => post ThreadQuitHelper()
    |                     |                                 |
 - IOThread::Stop()       |                     <= join QuitWhenIdle()
    |         => post ThreadQuitHelper()
    |                     |
    |         <= join QuitWhenIdle()
delete MidiManager
    |
delete BrowserMainLoop
    |
    :

IOThread::Stop() blocks until posted Finalize() runs and finishes, and Finalize() blocks until EventThread stops. But in the case of fuzz, EventThread still runs after BrowserMainLoop dies.

My CL to add CHECK()s will make Chrome abort in ~MidiManagerAlsa() instead of use-after-free when EventThread still runs there. It will make the problem clearer.

If Adam can not find any additional point, probably this is the best effort we can do now. So, I'd close this as WontFix with that CL as reproducing-clusterfuzz-bugs document suggests for Unreproducible bug.

Project Member

Comment 12 by sheriffbot@chromium.org, Jul 21 2016

Labels: Pri-1
I also see the IOThread still running at the time of crash Thread T15 (Chrome_IOThread). This is all after main_loop_->ShutdownThreadsAndCleanUp() should have completed.

So maybe this fuzz detected a problem that corrupts the shutdown of threads.

There is a note in the code that stops the IO thread: https://cs.chromium.org/chromium/src/content/browser/browser_main_loop.cc?rcl=1469099736&l=1108

It points to a potential problem and  issue 608049 .
Labels: Security_Impact-Head

Comment 15 by aarya@google.com, Jul 21 2016

Labels: -Security_Impact-Head Security_Impact-Stable
Does not look like recent regression.
Project Member

Comment 16 by sheriffbot@chromium.org, Jul 22 2016

Labels: M-52
Adam: How did you know if the IOThread was still running at the time of crash? Actually, I wanted to investigate it, but could not find out how to.

What I can read from the stacktrace are

 1. 8-byte-read-heap-use-after-free happened around ScheduleEventLoop() on thread T26 (MidiEventThread)
 2. The data we were discussing was freed at content::BrowserMainLoop::~BrowserMainLoop() on the thread T0 (chrome)
 3. The data was allocated at media::midi::MidiManager::Create() on the thread T0 (chrome)
 4. T26 (MidiEventThread) was created by T15 (Chrome_IOThread)
 5. T15 (Chrome_IOThread) was created by T0 (chrome)

1-3 said the data was the pointer to MidiManager instance, and it was freed at ~BrowserMainLoop() there Chrome_IOThread and MidiEventThread should be stopped.
4-5 just said these threads were created as we expected.
Project Member

Comment 18 by bugdroid1@chromium.org, Jul 25 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8fb95e706d70e82d0f9c41fba01b7919d16b8ab3

commit 8fb95e706d70e82d0f9c41fba01b7919d16b8ab3
Author: toyoshim <toyoshim@chromium.org>
Date: Mon Jul 25 07:42:41 2016

Web MIDI: Add CHECK()s to ensure MidiManager owned threads are terminated

All MidiManager owned threads are stopped and joined before MidiManager
is destructed. But let me add CHECK()s so that it aborts when these threads
are still running, just in case.

BUG= 629954 

Review-Url: https://codereview.chromium.org/2170463003
Cr-Commit-Position: refs/heads/master@{#407424}

[modify] https://crrev.com/8fb95e706d70e82d0f9c41fba01b7919d16b8ab3/media/midi/midi_manager.cc
[modify] https://crrev.com/8fb95e706d70e82d0f9c41fba01b7919d16b8ab3/media/midi/midi_manager_alsa.cc

Labels: -Pri-1 -M-52 -Security_Impact-Stable Pri-3
Status: Started (was: Assigned)
kerrnel@, aarya@
I do not think this is the same root issue with 608049.
AudioManager has a different logic to destruct the instance lazily on the audio thread that outlives the IOThread, and the crash happens only on Windows platforms, but in productions.

On the other hand, we are seeing our issue only on fuzz, and on Linux.


agoode@
Can you elaborate the note you found in the browser_main_loop.cc?


From Common problems at https://dev.chromium.org/Home/chromium-security/bugs/reproducing-clusterfuzz-bugs

> Sometimes, a crash just isn't reproducible. If a crash is marked as unreproducible in ClusterFuzz and you're seeing similar results, don't spend too long re-running it. If the information in the report isn't enough to work with, don't be afraid to mark the bug as WontFix.


If no one object, I'll close this as WontFix with the CL of c#18 that makes things clearer at the next catch.
But before that, I want to wait for answers from agoode@ that may give insights.
Ah, it is possible I was misreading the stack trace. If the IOThread is definitely stopped by here, then I don't have any other insights.
Status: WontFix (was: Started)
ok. probably we can not ensure that the IOThread stops correctly unless we can reproduce the crash.
Let me close this, and see what happens.

Comment 22 by aarya@google.com, Jul 26 2016

Status: Fixed (was: WontFix)
Marking Fixed based on c#18, we might want to take that mitigation for merge/
Status: WontFix (was: Fixed)
Not fixed.

c#18 was a patch to add additional checks for further investigation in the future.

merge isn't needed because we haven't seen this issue in the production.

Comment 24 by aarya@google.com, Jul 27 2016

Adding CHECKs could be preventing the security bug. I don't think we should merge a speculative fix.
Project Member

Comment 25 by sheriffbot@chromium.org, Nov 1 2016

Labels: -Restrict-View-SecurityTeam allpublic
This bug has been closed for more than 14 weeks. Removing security view restrictions.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Sign in to add a comment