New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 701000 link

Starred by 3 users

Issue metadata

Status: Assigned
Owner:
OOO Dec 22 - Jan 8
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 2
Type: Bug

Blocking:
issue 564276



Sign in to add a comment

Reference time calculation in WebAudioMediaStreamSource::DeliverRebufferedAudio() assumes 10 ms intervals between calls

Project Member Reported by olka@chromium.org, Mar 13 2017

Issue description

WebAudio MediaStream Destination rendering works like this:

AudioDestination::render() render buffers of output device buffer size. To do so, it pulls 128-frame buffers out of WebAudio graph as many times as required to fill the output buffer. The 128 frame buffers it pulls out of the graph are also pushed into WebAudioMediaStreamSource, where they are rebuffered to 10 ms buffers and then pushed to the media stream consumer.


Reversed call stack (media::AudioDeviceThread::ThreadMain() drives the execution):

------- 10 ms buffer (may be e.g. 160)
content::PepperMediaStreamAudioTrackHost::AudioSink::OnData(media::AudioBus const&, base::TimeTicks)
content::MediaStreamAudioDeliverer<content::MediaStreamAudioSink>::OnData(media::AudioBus const&, base::TimeTicks)
5content::MediaStreamAudioTrack::OnData(media::AudioBus const&, base::TimeTicks)
content::MediaStreamAudioDeliverer<content::MediaStreamAudioTrack>::OnData(media::AudioBus const&, base::TimeTicks)
content::MediaStreamAudioSource::DeliverDataToTracks(media::AudioBus const&, base::TimeTicks)
content::WebAudioMediaStreamSource::DeliverRebufferedAudio(media::AudioBus const&, int)
------- 128 frames buffer
media::AudioPushFifo::Push(media::AudioBus const&)
content::WebAudioMediaStreamSource::consumeAudio(blink::WebVector<float const*> const&, unsigned long)
blink::ConsumerWrapper::consumeAudio(blink::AudioBus*, unsigned long)
blink::MediaStreamSource::consumeAudio(blink::AudioBus*, unsigned long)
blink::MediaStreamAudioDestinationHandler::process(unsigned long)
blink::AudioHandler::processIfNecessary(unsigned long)
blink::DeferredTaskHandler::processAutomaticPullNodes(unsigned long)
blink::AudioDestinationHandler::render(blink::AudioBus*, blink::AudioBus*, unsigned long, blink::AudioIOPosition const&)
------- Output device buffer (may be e.g. 512)
blink::AudioDestination::render(blink::WebVector<float*> const&, unsigned long, double, double, unsigned long) 
content::RendererWebAudioDeviceImpl::Render(base::TimeDelta, base::TimeTicks, int, media::AudioBus*) + 688
media::AudioOutputDevice::AudioThreadCallback::Process(unsigned int)
media::AudioDeviceThread::ThreadMain()


The problem: 

Reference time calculation in WebAudioMediaStreamSource::DeliverRebufferedAudio() assumes that DeliverRebufferedAudio() is called with approximately 10 ms intervals: frame delay offset is calculated from |current_reference_time_| obtained on the last push.

But it is not, if output buffer size is larger than 10 ms: in such a case DeliverRebufferedAudio() is called side by side multiple times with a larger interval between the bursts of calls. The difference between two adjacent |current_reference_time_| values varies from several microseconds to tens of milliseconds.

This results in non-increasing sequence of reference time stamps which makes the media stream consumers unhappy.

Overall, the consumers of WebAudio as MediaStream are not guaranteed to receive audio buffers at 10 ms intervals. In most cases it does not really show up though, because output buffer size is usually close to or smaller than 10 ms at common sample rate
 
Project Member

Comment 1 by bugdroid1@chromium.org, Mar 13 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f85f4d03170997f6d8fbf843e647e58a3923fd9e

commit f85f4d03170997f6d8fbf843e647e58a3923fd9e
Author: olka <olka@chromium.org>
Date: Mon Mar 13 20:27:29 2017

CL https://codereview.chromium.org/2692203003/ makes the code to use UnavailableDeviceParameters in case there are no real output devices available.

WebAudio is unhappy when 100 ms buffer size is used for audio sink.
(CL got reverted because of that). Switching to 10 ms.

BUG= 672468 ,701000

Review-Url: https://codereview.chromium.org/2738403002
Cr-Commit-Position: refs/heads/master@{#456464}

[modify] https://crrev.com/f85f4d03170997f6d8fbf843e647e58a3923fd9e/media/base/audio_parameters.cc

Comment 2 by olka@chromium.org, Mar 14 2017

Blocking: 564276

Comment 3 by m...@chromium.org, Apr 11 2017

Relevant e-mail discussion on this issue, between me and olka@... My copy-and-paste from Gmail didn't preserve the "who said what" formatting; but it's pretty easy to follow along, regardless.

--------------------------------------------

Sounds like we're on the same page, except for whether the reference timestamps are so inaccurate that we would consider the current impl broken. CIL...

--------------------------------------------

As I mentioned above, the call stack is driven by AudioOutputDevice -> WebAudioDeviceImpl -> AudioDestination call which requests a buffer of platform output buffer size.
While pulling the graph to fill that buffer, AudioDestination also pushes 128 frame buffers into WebAudioMediaStreamSource which rebuffers the audio to 10 ms using this fifo: https://cs.chromium.org/chromium/src/content/renderer/media/webaudio_media_stream_source.h?l=67. Audio format is not propagated further that this fifo.
 
The audio format *is* propagated after this point: MediaStreamSource/Track/Sink provide this information in GetAudioParameters(), so all downstream consumers have it: https://cs.chromium.org/chromium/src/content/renderer/media/media_stream_audio_source.h?rcl=82d4b66fd8df31637f3a9cdbb4ada8a00b6b3781&l=90

Did you mean to imply that consumers should be aware that the audio has been rebuffered from 128 to 441 (or 480 or whatever 10 ms is)?

I mean that the audio has been rebuffered twice: from audio device buffer size, to 128 and then to 10 ms. And WebAudioMediaStreamSource does not take it into account while doing reference time stamp calculation here https://cs.chromium.org/chromium/src/content/renderer/media/webaudio_media_stream_source.cc?type=cs&q=WebAudioMediaStreamSource::consumeAudio&l=104.

Well, sort of "yes" and sort of "no." The main problem here is my flawed "pragmatic heuristic." We are snapshotting the reference clock at the wrong place in the code. That's not taking into account the first of the two rebufferrings.

However, the code on L104 *does* take into account the 128-->441 rebufferring. It works backwards, under the assumption that |current_reference_time_| is associated with one sample past the audio being pushed into the FIFO. It uses |frame_delay| to determine what the reference time should have been for the first sample coming out of the FIFO (into the 10 ms buffer).

That said, at best, this is all an approximation; and should be fixed to achieve better accuracy (hence, my TODO comment).

-------------------------------------------- 

BTW--I'm thinking we can just remove this rebuffering FIFO. It was there for legacy reasons: The MediaStream code used to have a hard-coded 10 ms assumption everywhere. Since my refactoring, that's no longer a requirement since the WebRTC adapter sink will rebuffer audio to 10 ms if needed.

-------------------------------------------- 

So the buffers are 10 ms, but they do not necessary come at uniform 10 ms intervals (though their average is still 10 ms). 

If two calls to WebAudioMediaStreamSource::DeliverRebufferedAudio() are issued with 1 ms interval, we may get a reference time stamp which precedes the previously calculated one, because frame delay is milliseconds is subtracted from the time the call is issued at, which is only 1 ms later than the previous call.

Yes, I agree that's possible and probably happens often in some situations (my graph from an earlier e-mail showed this). My TODO comment in the code (where TimeTicks::Now() is called) is basically saying, "sampling the clock here is a big source of inaccuracy." However, this shouldn't break end-to-end functionality: Reference timestamps are approximations. They are flawed in the short-term, but average-out to an "A/V synchronization goal" in the longer term. As I've said before, if downstream consumers need precise, exact-to-the-frame, monotonically increasing timestamps, they should be tracking the number of samples consumed and dividing by sampling rate to get a precise timestamp. It's only when the reference timestamp skews too far from the "media timestamp" that a consumer might need to take action (e.g., detecting gaps/skips in the media, and/or stretching the signal to compensate and re-sync to video).

That said, I *do* agree that the reference timestamps can and should be made more accurate. It seems the reference timestamps should be generated upstream of WebAudioMediaStreamSource, inside the Blink WebAudio impl, as that's where the realtime thread lives that paces and generates the audio signal.

However, I don't agree that the current impl is totally broken. It's just inaccurate, and may or may not be harming performance downstream (e.g., false detection of gaps/skips, and/or premature signal stretching causing a "wah" effect at the listener). It's also worth mentioning that the problem may be worse that you think: On Windows systems on hardware older than 6 years, the resolution of the TimeTicks clock could sometimes be as poor as 15.6 ms! ;-)

So, I think we should be treating this as a performance issue, not a broken impl issue. And, there is an obvious improvement to be made: basically, address all my TimeTicks::Now() TODO's in content/renderer/media. We should be able to make the reference timestamps monotonically increasing everywhere. Does this sound right to you?

I haven't said it's *totally* broken. It does not work correctly in certain situations which I described :)
Non-increasing reference time stamps do not look like a performance problem to me: it's a contract violation, isn't it?

If we were talking about the media timestamps, it would definitely be a contract violation to go backwards.

For reference timestamps, I'm not sure whether it's a contract violation. I would say that, ideally, they should be monotonic. A good answer to this question would be found in my RTP book (it's a very thorough academic-style summary of the RTP protocol, which introduces all the "multiple clock" and "synchronization" concepts I've been talking about)...but somebody borrowed it and I can't find it right now. ;-)

BTW--Your team should have a copy, if you don't already: https://www.amazon.com/RTP-Audio-Internet-Colin-Perkins/dp/0672322498/ref=mt_hardcover?_encoding=UTF8&me= There might be a newer edition...

Comment 4 by m...@chromium.org, Apr 11 2017

Cc: m...@chromium.org
Labels: M-58
Owner: ----
Status: Available (was: Assigned)
Marking available...
Project Member

Comment 5 by sheriffbot@chromium.org, Apr 12 2018

Labels: Hotlist-Recharge-Cold
Status: Untriaged (was: Available)
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue.

Sorry for the inconvenience if the bug really should have been left as Available.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Comment 6 by guidou@chromium.org, Apr 13 2018

Components: -Blink>MediaStream Blink>WebRTC>Audio
[Re-triage] olka@ and miu@, is this still relevant? Can you close, assign, or set back to available please?
Owner: olka@chromium.org
Status: Assigned (was: Untriaged)

Sign in to add a comment