New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 608049 link

Starred by 18 users

Issue metadata

Status: Verified
Owner:
Last visit > 30 days ago
Closed: Feb 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows , Chrome , Mac
Pri: 1
Type: Bug

Blocked on:
issue 628192
issue 636377
issue 639193

Blocking:
issue 646647
issue 665076



Sign in to add a comment

Chrome: Crash Report - [Assert] media::AudioManagerBase::~AudioManagerBase

Project Member Reported by mummare...@chromium.org, Apr 29 2016

Issue description

Product name: Chrome
Magic Signature: [Assert] media::AudioManagerBase::~AudioManagerBase

Current link:
https://crash.corp.google.com/browse?q=product.name%3D'Chrome'%20AND%20product.version%3D'52.0.2716.0'%20AND%20custom_data.ChromeCrashProto.ptype%3D'browser'%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D'%5BAssert%5D%20media%3A%3AAudioManagerBase%3A%3A~AudioManagerBase'&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#reports


Search properties:
product.name: Chrome
product.version: 52.0.2716.0
custom_data.chromecrashproto.ptype: browser

Stack trace
============
Thread 6 CRASHED [EXCEPTION_BREAKPOINT @ 0x56c695f0 ] MAGIC SIGNATURE THREAD
0x56c695f0	(chrome.dll -debugger_win.cc:21 )	base::debug::BreakDebugger()
0x568d720d	(chrome.dll -audio_manager_base.cc:112 )	media::AudioManagerBase::~AudioManagerBase()
0x568d7093	(chrome.dll + 0x001e7093 )	media::AudioManagerWin::`scalar deleting destructor'(unsigned int)
0x568d7086	(chrome.dll -sequenced_task_runner_helpers.h:40 )	base::DeleteHelper<media::AudioManager>::DoDelete(void const *)
0x567d7412	(chrome.dll -bind_internal.h:372 )	base::internal::Invoker<base::IndexSequence<0>,base::internal::BindState<base::internal::RunnableAdapter<void (*)(void const *)>,void ,void const * &>,base::internal::InvokeHelper<0,void,base::internal::RunnableAdapter<void (*)(void const *)> >,void >::Run(base::internal::BindStateBase *)
0x567713d0	(chrome.dll -task_annotator.cc:51 )	base::debug::TaskAnnotator::RunTask(char const *,base::PendingTask const &)
0x56771193	(chrome.dll -message_loop.cc:479 )	base::MessageLoop::RunTask(base::PendingTask const &)
0x56770b18	(chrome.dll -message_loop.cc:600 )	base::MessageLoop::DoWork()
0x568186f2	(chrome.dll -message_pump_default.cc:33 )	base::MessagePumpDefault::Run(base::MessagePump::Delegate *)
0x56770471	(chrome.dll -run_loop.cc:35 )	base::RunLoop::Run()
0x5676ed8a	(chrome.dll -thread.cc:254 )	base::Thread::ThreadMain()
0x5676eb57	(chrome.dll -platform_thread_win.cc:84 )	base::`anonymous namespace'::ThreadFunc
0x76d61173	(kernel32.dll + 0x00051173 )	BaseThreadInitThunk
0x7780b3f4	(ntdll.dll + 0x0005b3f4 )	__RtlUserThreadStart
0x7780b3c7	(ntdll.dll + 0x0005b3c7 )	_RtlUserThreadStart

Link to the list of builds 
===========================
https://crash.corp.google.com/browse?q=product.name%3D%27Chrome%27%20AND%20custom_data.ChromeCrashProto.ptype%3D%27browser%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27%5BAssert%5D%20media%3A%3AAudioManagerBase%3A%3A~AudioManagerBase%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#samplereports:5,productversion:1000

1.This crash is first seen on build M33 (33.0.1750.154) and seeing 86 crash instances on latest dev 52.0.2716.0 on windows.
2.This is top#2 browser crash in latest dev 52.0.2716.0 on Windows - 86 instances with 5.27%.

Suspecting CL is  https://chromium.googlesource.com/chromium/src.git/+/1f4996786dd7ab2e5d7f51c72f9309ee50aeb176
alokp@, Could you please take a look and reassign if it is not your related changes.
Thank you.

 
Showing comments 108 - 207 of 207 Older

Comment 108 by ajha@chromium.org, Oct 21 2016

Could anyone please update the thread if this is being worked on? This has been consistently been #1 browser crash on every canary build of Mac. 
Cc: grunell@chromium.org olka@chromium.org
this bug is still widely observed in all platformes.
Labels: Stability-Sheriff-Desktop
Looping 'Stability-Sheriff-Desktop' just to draw any further conclusion since this is a long standing issue.

Thank you!

Comment 113 by wfh@chromium.org, Oct 31 2016

alokp@chromium.org - please give an update on the status of this bug.
henrika/grunell/olka: CHECK_EQ(0, num_input_streams_) is still failing on mac. AFAICT we are explicitly closing all input streams in AudioManagerMac destructor. Do you guys have any other ideas?

wfh@: I do not have an update since comment #95 which closed all input streams on mac. I am not sure why the CHECK is still failing. That said, this is a pretty old bug that gets triggered during shutdown in cases that we do not understand yet. The only reason it is now more visible because we recently replaced DCHECKs with CHECKs.
Sorry alokp@ but I don't have any great ideas. I have tried to assist when possible but I lack understanding of the complete shut-down sequence on Mac.
The low-latency input and high-latency input streams should all self-destruct if the destruction works as intended. That leaves only fake input streams as possible streams that could still be alive (but they don't have unique implementation for Mac). There are also VirtualAudioInputStream streams and WebContentsAudioInputStream streams but they seem to be constructed independently of the audio manager (don't know any details).

Would it be possible to add some stream type to all input streams and avoid updating num_input_streams_ for e.g. Fake streams to see if those code be the issue? Or add self-destruct in Fake's as well?

What would happen if you simply skipped the CHECK(num_input_stream_) and leaked any remaining instances?
Just wondering: the bug actually refers to 2 signatures
[Assert] media::AudioManagerBase::~AudioManagerBase
and 
media::AudioManagerBase::~AudioManagerBase
the stacktrace is the same except the first signature's reports have "base::debug::BreakDebugger()" top frame.
Both signatures started about the same date (07/05 and 07/08 respectively) and happen similar variety of OS versions.

What I could notice that the [Assert] version of the signature is seen a lot more (~182K vs ~34K reports), but it is 98% of these reports are on stable, vs the second signature that is more equally spread across other channels. 

[Assert] media::AudioManagerBase::~AudioManagerBase
https://crash.corp.google.com/browse?q=product.name%3D%27Chrome_Mac%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27%5BAssert%5D%20media%3A%3AAudioManagerBase%3A%3A~AudioManagerBase%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#-samplereports:5,productversion:1000,-processtype,channel,-component,-filepath,-author,-changelist,-magicsignature2,-clientid,day:1000

media::AudioManagerBase::~AudioManagerBase
https://crash.corp.google.com/browse?q=product.name%3D%27Chrome_Mac%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27media%3A%3AAudioManagerBase%3A%3A~AudioManagerBase%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#-samplereports:5,+productversion:1000,-processtype,channel,-component,-filepath,-author,-changelist,-magicsignature2,-clientid,day:1000
I don't have any ideas off the top of my head.

* Is this now only a problem on Mac? (Win is also marked on OS.)

* The two different signatures in comment #117 seems to be input and output streams CHECKs.

* Are there any report of how to repro this? I suppose it's a matter of timing.
I sent quite some time looking into this last week, but no bright ideas as well.

Some desperate wild guesses:
I would try to count fake streams to make sure that #99 is always true and we are not missing something.
And I would check if |num_input_streams_| is some meaningful value above zero, and not some rubbish or a negative value.
Labels: ReleaseBlock-Stable
Just to update, this is currently top # 1 browser crash on Mac Stable with statistics as 24.43% and 225 crash instances from 211 client Ids.

Added Releaseblock-stable, please modify if not appropriate.

Link to list of builds where crashes are seen:
https://crash.corp.google.com/browse?q=product.name%3D%27Chrome_Mac%27%20AND%20custom_data.ChromeCrashProto.ptype%3D%27browser%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27media%3A%3AAudioManagerBase%3A%3A~AudioManagerBase%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#samplereports:5,productversion:1000

56.0.2916.0	0.01%	6	--canary
56.0.2914.3	0.03%	14	--dev
55.0.2883.44	0.06%	32	--beta
54.0.2840.98	0.45%	225	--Stable
alokp@ can you pick this up and try some of the suggestions mentioned? The easiest approach will be to just move the stream closing code out of AudioManagerMac and into AudioManagerBase, that way all possible streams are accounted for, then call that function from the AMMac destructor.
Labels: OS-Chrome
This issue is NOT Mac specific, from go/crash it's also happening on other platforms. For example, it's happening on ChromeOS on M55 [1]. It's also happening on M56 on Windows [2].

I am not familiar with this code. By looking at BrowserMainLoop, it seems |audio_manager_| [3] will be destructed before other browser threads [4] and the |main_message_loop_| [5]. Is it possible that some task that will close an audio input/output stream is still queued in the |main_message_loop_| when we get to ~BrowserMainLoop? This could be possible if the "close" is triggered by IPC, where we do have a IO->UI thread hop. Ignore me if this doesn't make any sense :)

[1] https://crash.corp.google.com/browse?q=custom_data.ChromeCrashProto.magic_signature_1.name%3D%27media%3A%3AAudioManagerBase%3A%3A~AudioManagerBase%27%20AND%20product.name%3D%27Chrome_ChromeOS%27%20AND%20product.Version%3D%2755.0.2883.35%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D

[2] https://crash.corp.google.com/browse?q=custom_data.ChromeCrashProto.magic_signature_1.name%3D%27media%3A%3AAudioManagerBase%3A%3A~AudioManagerBase%27%20AND%20product.name%3D%27Chrome%27%20AND%20product.Version%3D%2756.0.2906.0%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D

[3] https://cs.chromium.org/chromium/src/content/browser/browser_main_loop.h?rcl=0&l=291

[4] https://cs.chromium.org/chromium/src/content/browser/browser_main_loop.h?rcl=0&l=275

[5] https://cs.chromium.org/chromium/src/content/browser/browser_main_loop.h?rcl=0&l=219
Dale: Sorry I have been swamped with launch-related work lately. I will look into it this week.
Re comment #122: Xiaohan, thanks for the pointers.

[1] and [2]: The crashes on other platforms are mostly about output streams, where as mac ones are about input streams. IIUC they are also not as severe as on mac, although the underlying cause for both of them may be the same.

[3] and [4]: The browser threads are actually deleted in BrowserMainLoop::ShutdownThreadsAndCleanUp (not in BrowserMainLoop destructor).

[5] Your analysis about tasks being queued up in main_message_loop_ is correct but only on mac where AudioManager runs on the main thread. On other platforms, it has a dedicated thread which gets flushed in BrowserMainLoop destructor.
Re comment #115: henrika@, Thanks for pointing out WebContentsAudioInputStream. I think you are correct that those streams are not being counted in AudioManagerBase.

The concern about letting input stream instances leak was that it might lead to use-after-free bugs where the leaked streams will hold reference to deleted AudioManager. But this is not an issue on mac where BML::main_task_runner_ drops all queued tasks.
Cc: xhw...@chromium.org
Labels: -Stability-Sheriff-Desktop
It sounds like progress is being made on understanding this, so removing from desktop sheriff queue.  Feel free to add us back if there is something we can help with.

Labels: Stability-Sheriff-Desktop
My shift is wrapping up, re-adding for the next sheriff to continue followup.
Labels: Stability-Crash-Chronic
Blocking: 665076
Project Member

Comment 134 by bugdroid1@chromium.org, Nov 17 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4b2ff4ac1890f5f48eb144198796b3853ed1ac44

commit 4b2ff4ac1890f5f48eb144198796b3853ed1ac44
Author: alokp <alokp@chromium.org>
Date: Thu Nov 17 06:43:02 2016

Tracks all open input streams in AudioManagerBase.

Instead of just tracking the open stream count, this patch tracks stream handles so that they can be explicitly closed on shutdown.

BUG= 608049 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2503693002
Cr-Commit-Position: refs/heads/master@{#432786}

[modify] https://crrev.com/4b2ff4ac1890f5f48eb144198796b3853ed1ac44/media/audio/audio_manager_base.cc
[modify] https://crrev.com/4b2ff4ac1890f5f48eb144198796b3853ed1ac44/media/audio/audio_manager_base.h
[modify] https://crrev.com/4b2ff4ac1890f5f48eb144198796b3853ed1ac44/media/audio/mac/audio_manager_mac.cc

This crash is now the Top#1 browser crash on latest Chrome beta and been the same for quite sometime now, alokp@chromium.org want to check if the CL in Comment#134 is safe to get merged to M55 given the crash rate on this and making the M55 promotion decision little bit tricky. 


alokp@ - will your change fix this crash, or is it adding diagnostics?

It is supposed to fix the crash. But I am not certain that it will because the patch is based on the assumption that the crashes are because of fake streams.

I also do not recommend merge because this is an old bug and the crash only happens during shutdown.
Is it your recommendation that this not be marked RBS then (because users won't notice it happening on shutdown)?

Yes - this should not be RBS
Labels: -M-56 -ReleaseBlock-Stable M-57
FWIW alokp's CL (r432786) actually made it into M56 branch 2924 (r433059).
Labels: -Stability-Sheriff-Desktop
I couldn't find this crash in M55/56 so I'm taking this off stability sheriff queue. Please add it back if you see elevated crash rates in canary.

Project Member

Comment 143 by sheriffbot@chromium.org, Nov 20 2016

Labels: FoundIn-M-57
Users experienced this crash on the following builds:

Win Dev 56.0.2922.1 -  0.57 CPM, 32 reports, 29 clients (signature media::AudioManagerBase::~AudioManagerBase)
Mac Canary 57.0.2925.0 -  8.95 CPM, 16 reports, 15 clients (signature media::AudioManagerBase::~AudioManagerBase)

If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates.

- Go/Fracas
Crashes are still happening due to input streams. I double-checked my patch in comment #134, and it seems to be correct.

Could it be that MakeAudioInputStream is getting called between AMB::Shutdown and AMB::~AMB? It is possible if MakeAudioInputStream is used on a thread other than AM::task_runner_. We do enforce the thread restrictions on all of those functions but they are just DCHECKs.
Hmm, is the ui loop still running at ~BML? Seems it can't be, so your hypothesis must be correct. Can you add a CHECK(belongsToThread) in that function?
Keep in mind that a thread checker might always return true in official
builds.
No - the UI loop is not running. Even it were running, it is not possible to schedule a task between AMB::Shutdown and AMB::~AMB. Both run inside AMMac destructor call.

Added a few CHECKs here: https://codereview.chromium.org/2521583003/
Project Member

Comment 149 by bugdroid1@chromium.org, Nov 22 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e222406a7ee930c64e5401a3e5099acc137c743e

commit e222406a7ee930c64e5401a3e5099acc137c743e
Author: alokp <alokp@chromium.org>
Date: Tue Nov 22 03:05:06 2016

Adds speculative CHECKs to diagnose AudioManager crashes.

Adds CHECKs to verify that input/output streams are created/released on
the audio thread.

BUG= 608049 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2521583003
Cr-Commit-Position: refs/heads/master@{#433762}

[modify] https://crrev.com/e222406a7ee930c64e5401a3e5099acc137c743e/media/audio/audio_manager_base.cc

Just to update the latest behavior, Still crashes observed on latest stable, beta, dev and canary channels. Below information provides the comparison between previous and latest channels including total number of instances.
 
--------------------------------------------------+      
 Latest Channel       |    Previous Channel       |
--------------------------------------------------+
54.0.2840.98   16876  |	54.0.2840.87	5087	  |--> Stable
55.0.2883.59   41     | 55.0.2883.52	230	  |--> Beta
56.0.2924.3    0      |	56.0.2922.1	30	  |--> Dev
57.0.2926.0    100    | 57.0.2925.0     29        |--> Canary
--------------------------------------------------+
	
Link to the list of the builds getting crash:
---------------------------------------------
https://crash.corp.google.com/browse?q=product.name%3D%27Chrome_Mac%27%20AND%20custom_data.ChromeCrashProto.ptype%3D%27browser%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27media%3A%3AAudioManagerBase%3A%3A~AudioManagerBase%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#samplereports:5,productversion:1000
Just to update, this crash is currently served as top # 4 crasher on Mac 54.0.2840.98 compared previous Stable 54.0.2840.87 and previous M53 Stable 53.0.2785.143.
There are 25228 crash instances from 16116 unique client ids on Mac stable.
https://crash.corp.google.com/browse?q=product.name%3D%27Chrome_Mac%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27media%3A%3AAudioManagerBase%3A%3A~AudioManagerBase%27%20AND%20product.Version%3D%2754.0.2840.98%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D
still the top crash on 54.0.2840.99, has 57k+ reports.
We have not merged any fix to stable/beta/dev branches, so I do not expect the crash rates to change in those branches.

The speculative CHECK in comment 149 has made it into the current mac canary 57.0.2935.0. If my hypothesis in comment 145 is correct, we should start seeing crashes in AudioManagerBase::MakeAudioInputStream.
Cc: rsesek@chromium.org
Seems no crashes from that call site! It's odd that many of the recent crashes are not actually CHECK() failures, they're instead:

CRASHED [EXC_BAD_INSTRUCTION / EXC_I386_INVOP @ 0x000000010214b007 ]

I'm surprised to see this for what should be a crash failure. +rsesek@ are there any known issues around CHECK() manifesting as bad instruction crashes?

If not it's possible this crash is unrelated to steam counts...
Actually, nevermind, I guess that's just how __builtin_trap() shows up on OSX:

http://stackoverflow.com/questions/10734893/exc-bad-instruction-code-exc-i386-invop-subcode-0x0

So stream counts still seem likely.
The callstack does look a bit weird though. Earlier it clearly pointed to the CHECK statement for stream count. Now it does not. Anyways, I do not have any new ideas at present.
Thanks rsesek@.

I think copying the empty CHECK() into the Shutdown() code adjacent to the for loop closing the streams would be interesting. It'll help us figure out if that loop is actually releasing all streams (it should be) and if some are being added later, or if Close() is failing for some unknown reason.

It'll also help determine if we're being bit by  issue 664209  and the problem is actually the output streams.
On M56, we expect to hit that CHECK.

I added a few more diagnostic CHECKs that landed in mac canary 57.0.2935.0. We are not hitting the new CHECKs, but since then the original CHECK about input_streams_ count has moved to the end of AudioManagerBase destructor. I am not confident if the callstack is correct.

https://crash.corp.google.com/browse?q=product.name%3D%27Chrome_Mac%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27media%3A%3AAudioManagerBase%3A%3A~AudioManagerBase%27%20AND%20product.Version%3D%2757.0.2935.0%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D&stbtiq=&reportid=0631af2300000000&index=0#0
Project Member

Comment 160 by bugdroid1@chromium.org, Dec 2 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/5d7b0d43758d7ae12dc893296ae775c3b888d13f

commit 5d7b0d43758d7ae12dc893296ae775c3b888d13f
Author: alokp <alokp@chromium.org>
Date: Fri Dec 02 04:43:05 2016

Adds CHECK to ensure all input streams are closed as expected.

BUG= 608049 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2542953003
Cr-Commit-Position: refs/heads/master@{#435859}

[modify] https://crrev.com/5d7b0d43758d7ae12dc893296ae775c3b888d13f/media/audio/audio_manager_base.cc

This is a top #1 browser crasher on Mac stable 55.0.2883.75 having 634 crash instances from 524 Unique client ids.
Any chance some of the speculative changes to track down this bug are related to Issue 671167?

Shouldn't be, this is a browser shutdown crash.
Cc: durga.behera@chromium.org
 Issue 670915  has been merged into this issue.
Latest crash reports show we're not crashing due to input_streams_ not being empty. Stack trace points to "}" -- Yet none of the AudioManager's have anything all that complex to destruct...

https://cs.chromium.org/chromium/src/media/audio/audio_manager.h?l=264
https://cs.chromium.org/chromium/src/media/audio/audio_manager_base.h?l=144
https://cs.chromium.org/chromium/src/media/audio/mac/audio_manager_mac.h?l=162

The most obvious suspects seem like:
AudioDeviceListenerMac
AudioPowerObserver
base::ObserverList<AudioDeviceListener>

Looking over them a bit now.
Actually, this is just CHECK() failure, I guess we always use a null-deref to indicate that on Windows now. So we have output streams still alive by this point.

> dt this
Local var @ edi Type media::AudioManagerBase*
   +0x000 __VFN_table : 0x11005eec 
   +0x004 task_runner_     : scoped_refptr<base::SingleThreadTaskRunner>
   +0x008 worker_task_runner_ : scoped_refptr<base::SingleThreadTaskRunner>
   +0x00c max_num_output_streams_ : 0n50
   +0x010 max_num_input_streams_ : 0n16
   +0x014 num_output_streams_ : 0n1
   +0x018 num_input_streams_ : 0n0
   +0x01c output_listeners_ : base::ObserverList<media::AudioManager::AudioDeviceListener,0>
   +0x034 output_dispatchers_ : ScopedVector<media::AudioManagerBase::DispatcherParams>
   +0x040 audio_log_factory_ : 0x11646958 media::AudioLogFactory

So there's still an output stream alive somehow on Windows every now and then.
Dale: Thanks for looking into it.

The report you linked to in comment 169 is on Mac. Anyways, shouldn't all output streams be closed when output dispatchers are destroyed during AMB::Shutdown?
https://cs.chromium.org/chromium/src/media/audio/audio_manager_base.cc?sq=package:chromium&l=311

Should we add a CHECK for num_output_streams_ right after output dispatchers are destroyed?
#171, yeah the link doesn't show what I was looking at, sorry, here's the correct one:
https://crash.corp.google.com/browse?stbtiq=0eb6c7bf00000000

Note: A lot of the Windows ones have the Cast extension present; so I wonder if this is tab casting related. 

I don't think the check would help here, we the minidump is clear that we have an open stream and the closure path for idle streams is also pretty clear.

Shutdown() on the dispatchers only closes all idle streams. Hosts are expected to close through the OnChannelClosing() path, so I wonder if there's still a case where ARH is lingering.

Random thoughts:
- I'm thinking about adding some sort of AM observer interface which will inject crashes into stream owners to try and debug this.
- Going to try and switch linux to the mac thread model and try some debugging locally on a non-osx type to see what I can reproduce.
- I really dislike that the renderer hosts and controllers are ref-counted, it makes it very hard to reason about the lifetime. Hopefully as part of the mojo work we can switch these to be unique_ptr.

This Windows crash shows we definitely still have AudioOutputControllers alive at the time of shutdown somehow, https://crash.corp.google.com/browse?stbtiq=ce02d7bf00000000

Which (likely) means the RendererProcessHostImpl has not shutdown the channel for whatever reason. Alok, did you ever determine if we definitely should not have RPHIs alive by the time ~BML starts?

If that's not the case, maybe we should just stop relying on this signal to ARH and instead plumb a ~AM notification to all stream owners (Audio(Input|Output)Controllers). With such a notification we could safely nuke the streams from the controller side prior to destruction.
Similar one for OSX, https://crash.corp.google.com/browse?stbtiq=46573dbf00000000

Note: This crash points at the "}", but it's highly likely it's a CHECK failure on the number of output streams; so I think our crash reporting tool is having some issues. Possibly due to  issue 664209  which prevents multiple CHECKs from showing up accurately.

Comment 176 by olka@chromium.org, Dec 12 2016

Cc: maxmorin@chromium.org
+maxmorin@ who is refactoring AudioOutputController/ARH.
This is the top#1 browser crash on mac latest dev 57.0.2946.0

Latest crash rates on all channels are as below.
57.0.2949.0	0.00%	5
57.0.2946.0	0.06%	70
55.0.2883.87	0.45%	518	
56.0.2924.21	0.08%	88

Link to the list of builds
==========================
https://crash.corp.google.com/browse?q=product.name%3D%27Chrome_Mac%27%20AND%20custom_data.ChromeCrashProto.ptype%3D%27browser%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27media%3A%3AAudioManagerBase%3A%3A~AudioManagerBase%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#samplereports:5,productversion:1000
The management of AudioOutputStreams sure is tricky :). I notice that the CHECK that AudioOutputDispatcherImpl.{idle_proxies_,proxy_to_physical_map_,idle_streams_} is empty is in ~AudioOutputDispatcherImpl, but if there are proxies left, the destructor won't run (since AudioOutputProxy keeps a reference back to it). Thus, the CHECKs doesn't really ensure that it's properly cleaned up at exit. This probably isn't the (only) issue though, since such an error still wouldn't explain why there's a sync reader running during audio manager shutdown.

When Audio(Input)RendererHost is replaced with a mojo service(Q1-Q2), the shutdown procedure will probably be altered slightly, so that might fix this or make it worse. Making the replacements (and eventually also controllers) not refcounted is planned.
#173, Ah - thanks for reminding me that AudioOutputDispatcher::Shutdown only closes idle streams.

#174, Last time I looked into RPHI, we determined that service workers were preventing RPHI channel from getting closed (see comment #65). horo@ and falken@ fixed some issues there. We can re-enable the CHECKS for RPHI::channel_ to see if those have been indeed fixed.

Ensuring that channel for all RPHI instances is closed before ~BML starts will fix this issue on all platforms except Mac where we have a race between IO thread (posting tasks to close streams on the main thread) and main thread (~BML). I am not sure how to fix this race even if we can fix outstanding RPHI instances.
#178, Is there a reason why AudioOutputProxy keeps a strong reference back to AudioOutputDispatcher? Thats bad.
@179: I audited the crashes from when falken@ landed his patch until the CHECKs() were reverted and there were no failures of the workers-check crash. I forgot to checkout the opposite though, that there were no ~AM check failures on Windows. Will do so today.

It seems in the general case then that the current stream shutdown is insufficient due to the race you describe then. We can either add more stream tracking for this case or do something different:

1. Add AudioManager::(Add|Remove)ShutdownObserver() or change the MakeAudio(Input|Output)Stream calls take that same observer when creating a stream. The latter gets complicated due to the AudioOutputProxy/Dispatcher being the owner of the streams "sometimes."
2. Have at least Audio(Input|Output)Controller register for this notification and release their streams.

If (2) is insufficient to prevent further usage, register renderer hosts as well and drop their AudioManager pointers to prevent further stream creation attempts.

This allows us to switch from a potentially flaky path which is hard to monitor (channel closing) to something more explicit that we can keep an eye on.

WDYT?
#180: Ownership is a confusing concept in this code, and I don't know it very well. If the AudioOutputDispatcher owns the AudioOutputProxy, there shouldn't be any need for the owning pointer back to the dispatcher and (I think) no need for refcounting AudioOutputDispatcher at all. I'll leave it for someone who know this code better than me.
The proxy_to_physical_map_ doesn't take a ref on the AOP, so the AOD doesn't own the AOP. Really nothing owns the AOD but AM though, so we can likely drop the ref-counting. 
@181: I think your idea about shutdown observer should work. Part of Audio(Input|Output)Controller is executed on the audio thread where it should be able to get the shutdown notification and synchronously release their streams.

I will try to implement it this week.
As an extension of the above work, I will also look into removing the ref-counting from AOP -> AOD.
Project Member

Comment 186 by bugdroid1@chromium.org, Dec 13 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1873f1437ee75eeca154753660b33ff2336509f0

commit 1873f1437ee75eeca154753660b33ff2336509f0
Author: alokp <alokp@chromium.org>
Date: Tue Dec 13 21:38:46 2016

Makes AudioOutputDispatcher non-ref-counted.

AudioManagerBase owns AudioOutputDispatcher instances. AudioOutputProxy
no longer holds a strong reference to AudioOutputDispatcher. This means
dispatcher instances will be deleted when AudioManager shuts down and
fire CHECKs for any open proxy streams.

BUG= 608049 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2570923002
Cr-Commit-Position: refs/heads/master@{#438285}

[modify] https://crrev.com/1873f1437ee75eeca154753660b33ff2336509f0/media/audio/audio_manager_base.cc
[modify] https://crrev.com/1873f1437ee75eeca154753660b33ff2336509f0/media/audio/audio_output_dispatcher.h
[modify] https://crrev.com/1873f1437ee75eeca154753660b33ff2336509f0/media/audio/audio_output_dispatcher_impl.cc
[modify] https://crrev.com/1873f1437ee75eeca154753660b33ff2336509f0/media/audio/audio_output_dispatcher_impl.h
[modify] https://crrev.com/1873f1437ee75eeca154753660b33ff2336509f0/media/audio/audio_output_proxy.h
[modify] https://crrev.com/1873f1437ee75eeca154753660b33ff2336509f0/media/audio/audio_output_proxy_unittest.cc
[modify] https://crrev.com/1873f1437ee75eeca154753660b33ff2336509f0/media/audio/audio_output_resampler.cc
[modify] https://crrev.com/1873f1437ee75eeca154753660b33ff2336509f0/media/audio/audio_output_resampler.h

Cc: ajha@chromium.org
Issue 674100 has been merged into this issue.
We finally know that these crashes are triggered by active AudioOutputProxy instances:

https://crash.corp.google.com/browse?q=custom_data.ChromeCrashProto.ptype%3D%27browser%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27media%3A%3AAudioOutputDispatcherImpl%3A%3A~AudioOutputDispatcherImpl%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#samplereports

Note that this problem is more acute on mac due to the race described in comment 179. On other platforms where we have a dedicated audio thread, this happens due to RPHI leaks.

I briefly looked at adding a ShutdownObserver for AudioManager and it seems a bit hairy and probably not worth it for a single platform. At this point I am inclined to simply close all physical streams in ~AudioOutputDispatcherImpl on mac (like we close other streams during AMB::shutdown). Fixing leaky RPHI instances should fix other platforms.

Comments?
Thanks for working so hard on finding the root cause here. I would recommend an as simple solution as possible initially to see if that reduces the frequency of the problem. Don't know the details well enough to compare different approaches.
alokp@ I made the proposal I did above since relying on OnChannelClosing() seems flaky for all platforms. Constantly tracking down leaky RPHIs seems like a goose chase without some explicit CHECKs which fail closer to the source. It's also possible the failures on other platforms are related to things like speech which create their own output streams IIRC.

I proposed the shutdown observers for all platforms since it's the most explicit and safest way to close the streams and not leave dangling use-after-free opportunities; I.e. if each input/output controller registers as an observer for its stream we ensure that shutdown will drop its pointer to the stream we are closing.

The alternative is that instead of closing streams, you only Stop() then on Shutdown() and we remove the stream count checks. This avoids freeing pointers other threads may still be using.

All that said, I'm fine starting with just force closing streams on OSX since we know nothing else will run on the audio thread at this point (thus closing is safe). I worry more that relying on RPHI tear down is too implicit.
Another reason I am hesitant adding a AM ShutdownObserver because I am not sure if we will need it when we switch mojo audio service.

In the mojo world, the current AudioOutputProxy will roughly map to InterfacePtr<AudioOutputStream>, clients for which will need to handle connection error when the audio service goes away. We can trigger a similar "connection error" today on all active AudioOutputProxy instances in ~AudioOutputDispatcherImpl, which will allow AudioOutputProxy to clear the raw reference to AOD and also inform its client via AudioSourceCallback::OnError.

This should be pretty easy to implement and should fix all cases including RPHI leaks. WDYT?
The user /u/Coinkidinks approached me on reddit. It seems that they keeps experiencing this particular crash. His messages to me were:


For the past few weeks (perhaps months), on my Macbook Pro Retina 2015 13 inch mode (macOS Sierra)l, I've been getting a lot of "Aw, Snap!" pages when I open up new tabs, and this happens somewhat randomly.
I don't really have extensions downloaded; the only thing I've really installed was a Theme from the Chrome shop.
My second problem is that when the tabs I open do work (no Aw Snap), when I go to the address bar and type in something (say googling or a website), nothing happens when I hit "return"/"enter". The bar receives the input, but doesn't go to the page. It's only after I hit enter/return 3+ more times that it does go to the desired page(s) I want.
What insight can you provide me on this? Thank you!

---

This is what showed when I opened crashes after the 2nd problem (clicking on a link multiple times but chrome not directing there)
Crash ID fa8c4b72-ec6c-4671-aed5-750f7d6a2037
Crash report captured on Tuesday, September 6, 2016 at 12:07:51 AM was not uploaded
I have now reproduced the new tab crash (Aw, snap), and when I went to crashes, the only one showing is still the aforementioned crash.
Thank you very much!

---

Hi,
In addition to my message from earlier today, I checked crashes again and got this:
Crash ID fa8c4b72-ec6c-4671-aed5-750f7d6a2037 (Server ID: 2fb7c1a080000000)
Re:192, that seems fine to me for now. So long as we're also planning to track down the other platform's edge cases.
Project Member

Comment 195 by bugdroid1@chromium.org, Jan 11 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/28b0c339305fde2065b3d51514d29c1e6280216f

commit 28b0c339305fde2065b3d51514d29c1e6280216f
Author: alokp <alokp@chromium.org>
Date: Wed Jan 11 01:46:11 2017

Makes AudioOutputProxy -> AudioOutputDispatcher reference weak.

This patch changes the AudioOutputProxy -> AudioOutputDispatcher reference from
raw pointer to a weak pointer to allow the dispatcher to be deleted before
AudioOutputProxy instances can be closed.

BUG= 608049 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2621993002
Cr-Commit-Position: refs/heads/master@{#442765}

[modify] https://crrev.com/28b0c339305fde2065b3d51514d29c1e6280216f/media/audio/audio_manager_base.cc
[modify] https://crrev.com/28b0c339305fde2065b3d51514d29c1e6280216f/media/audio/audio_output_dispatcher.h
[modify] https://crrev.com/28b0c339305fde2065b3d51514d29c1e6280216f/media/audio/audio_output_dispatcher_impl.cc
[modify] https://crrev.com/28b0c339305fde2065b3d51514d29c1e6280216f/media/audio/audio_output_dispatcher_impl.h
[modify] https://crrev.com/28b0c339305fde2065b3d51514d29c1e6280216f/media/audio/audio_output_proxy.cc
[modify] https://crrev.com/28b0c339305fde2065b3d51514d29c1e6280216f/media/audio/audio_output_proxy.h
[modify] https://crrev.com/28b0c339305fde2065b3d51514d29c1e6280216f/media/audio/audio_output_proxy_unittest.cc
[modify] https://crrev.com/28b0c339305fde2065b3d51514d29c1e6280216f/media/audio/audio_output_resampler.cc
[modify] https://crrev.com/28b0c339305fde2065b3d51514d29c1e6280216f/media/audio/audio_output_resampler.h

Issue 680326 has been merged into this issue.
This seems fixed. I do not see any crash reports for AudioManagerBase or AudioOutputDispatcher anymore.

Dale: Any issues with closing this?
sgtm! Thanks for being persistent on this Alok!
Status: Fixed (was: Assigned)
> sgtm! Thanks for being persistent on this Alok!

Plus one! Great!
Great work!
Thanks everyone for your help and suggestions.
Issue 698874 has been merged into this issue.
Labels: -Restrict-View-Google
This shipped; it's safe to open it up.
Showing comments 108 - 207 of 207 Older

Sign in to add a comment