New issue
Advanced search Search tips

Issue 666813 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Nov 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: Bug



Sign in to add a comment

MSE + playbackRate > 1 + 192kHz audio => stutters and lag.

Project Member Reported by dalecur...@chromium.org, Nov 18 2016

Issue description

Happens with both software and hardware decoding; does not happen with src= clips. 96kHz audio has some blips in it that 48kHz doesn't. 

http://storage.googleapis.com/dalecurtis/mse_clank/mse_mp4.html is the page I'm using to test with, open console and set playbackRate > 1 to repro.

I've only tested on M54 stable w/ Windows. I don't have devices to test on other platforms currently.

Marking as M56 RBS since it's too late to fix for M55.
 
I've attached a trace, but I don't see culprits standing out. Just a lot of extremely (seconds) delayed rendering calls.
trace_trace.json.gz
3.6 MB Download
It looks like this is just a performance issue with the AudioRendererAlgorithm exacerbated by how we handle audio configuration changes with MSE when running on systems with high channel counts and sample rates.

We upsample and upmix the signal for MSE before running it through WSOLA, which means that stereo 44.1kHz signal is now a 7.1 192kHz signal on some devices! In such circumstances the WSOLA converter can take upwards of 35ms per 20ms chunk of audio (as seen by my 4GHz i7).

There's definitely some low hanging WSOLA performance fruit, but the ideal fix is similar to what we do with AudioConverter: be smarter about our conversions. I.e. if we can downmix or downsample before running wsola... do it. Definitely do not upmix or upsample prior to the wsola step; instead do it after.

Implementing this is hard though since WSOLA expects to be working with uniform buffers (same sample and channel counts) and things like the AudioConverter expect a fixed initial configuration (again incoming buffers may have different channel counts or sample rates). It also doesn't solve the case of someone playing actual 192kHz 7.1 content at rates > 1, which is admittedly rare though.

I'll give it some more thought next week and look at optimizing the WSOLA code.
Put together https://codereview.chromium.org/2522673003 as a temporary workaround. It just prevents us from trying to run WSOLA on muted output which is a ~74% cost reduction for the stereo -> 7.1 case.
Project Member

Comment 4 by bugdroid1@chromium.org, Nov 22 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/21d17ca06e742cbfdcaaa54e0f4964248e3fbfff

commit 21d17ca06e742cbfdcaaa54e0f4964248e3fbfff
Author: dalecurtis <dalecurtis@chromium.org>
Date: Tue Nov 22 03:54:40 2016

Don't run WSOLA over always muted channels.

WSOLA can be quite expensive to run; we should not be running it on
channels which we know are always muted due to an upmixed signal.

Not doing this results in an ~388% speedup when adjusting the
playback rate for 8 channel sources. Taking the run time for 10
seconds of audio at 2.0 rate from 11536ms (> 10 is not real time)
down to 2975ms.

This is accomplished by introducing a channel mask to the render
algorithm which can be used to generate AudioBus objects with
lower channel counts for use with the WSOLA algorithm. Masked
channels are always muted.

BUG= 666813 
TEST=new unittests

Review-Url: https://codereview.chromium.org/2522673003
Cr-Commit-Position: refs/heads/master@{#433782}

[modify] https://crrev.com/21d17ca06e742cbfdcaaa54e0f4964248e3fbfff/media/cast/test/fake_media_source.cc
[modify] https://crrev.com/21d17ca06e742cbfdcaaa54e0f4964248e3fbfff/media/filters/audio_renderer_algorithm.cc
[modify] https://crrev.com/21d17ca06e742cbfdcaaa54e0f4964248e3fbfff/media/filters/audio_renderer_algorithm.h
[modify] https://crrev.com/21d17ca06e742cbfdcaaa54e0f4964248e3fbfff/media/filters/audio_renderer_algorithm_unittest.cc
[modify] https://crrev.com/21d17ca06e742cbfdcaaa54e0f4964248e3fbfff/media/renderers/audio_renderer_impl.cc
[modify] https://crrev.com/21d17ca06e742cbfdcaaa54e0f4964248e3fbfff/media/renderers/audio_renderer_impl.h
[modify] https://crrev.com/21d17ca06e742cbfdcaaa54e0f4964248e3fbfff/media/renderers/audio_renderer_impl_unittest.cc

Project Member

Comment 5 by bugdroid1@chromium.org, Nov 23 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/77fd2aecaa7e1db995e158fb095a72be385e333c

commit 77fd2aecaa7e1db995e158fb095a72be385e333c
Author: dalecurtis <dalecurtis@chromium.org>
Date: Wed Nov 23 00:08:14 2016

Add SSE and NEON intrinsics for WSOLA's MultiChannelDotProduct().

When adapting 10 seconds of 192kHz stereo audio at a 2.0 playback
rate, this provides a ~9x speedup on ARM and a ~3x speedup on X86.

NEON: 24986ms -> 2695ms
SSE: 3014ms -> 894ms

No tests added since the existing tests verify this functionality
correctly and it continues to pass as before.

BUG= 666813 
TEST=n5x for ARM, z620 for X86.

Review-Url: https://codereview.chromium.org/2527533002
Cr-Commit-Position: refs/heads/master@{#434036}

[modify] https://crrev.com/77fd2aecaa7e1db995e158fb095a72be385e333c/media/filters/wsola_internals.cc

Will test these in the next canary release on my home system, but we should now be comfortably under real time for 192kHz 8 channel audio. Will merge request once verified.

I played around with further optimizations in wsola_internals, but none yielded much. Here were my experiments:
- Force channel count to 8  such that MultiChannelSimilarityMeasure() could just be two pairs of unrolled float operations. Zero effect on x86, didn't test on ARM.
- Optimize MultiChannelMovingBlockEnergies() to use SIMD, zero effect on X86; it's only called once per ~60ms of samples - so that's not unexpected.
- Removed memmove() operations from DecimatedSearch() in favor of 3 variables, again no effect on x86.
- Optimized FullSearch() to exit early if a perfect match is found (similarity == channels) again no effect.
- Removed DecimatedSearch entirely in favor of only FullSearch, this is about 10x as expensive, so the benefits of DecimatedSearch are still worthwhile.

Unexplored optimizations:
- Can decimation factor be increased?
- Are there more efficient algorithms?
Project Member

Comment 7 by bugdroid1@chromium.org, Nov 23 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c9f01b2a8a5f2df7198f4c62faf9e6750451033a

commit c9f01b2a8a5f2df7198f4c62faf9e6750451033a
Author: dalecurtis <dalecurtis@chromium.org>
Date: Wed Nov 23 22:06:22 2016

Lazily allocate WSOLA structures; saves 56kB to 765kB of memory.

These structures aren't needed until we actually attempt a playback
rate adaptation, so don't bother allocating them until we do. This
saves a few tens to hundreds of kilobytes of memory based on the
channel layout.

Stereo 48kHz: 57592 bytes
Stereo 192kHz: 230392 bytes
7.1 48kHz: 195808 bytes
7.1 192kHz: 783328 bytes

BUG= 666813 
TEST=none

Review-Url: https://codereview.chromium.org/2523893002
Cr-Commit-Position: refs/heads/master@{#434265}

[modify] https://crrev.com/c9f01b2a8a5f2df7198f4c62faf9e6750451033a/media/filters/audio_renderer_algorithm.cc
[modify] https://crrev.com/c9f01b2a8a5f2df7198f4c62faf9e6750451033a/media/filters/audio_renderer_algorithm_unittest.cc

Labels: Merge-Request-56
Verified on my home system, smooth playback with no popping. Tested some 5.1 clips without issue as well.

Comment 9 by dimu@chromium.org, Nov 24 2016

Labels: -Merge-Request-56 Merge-Approved-56 Hotlist-Merge-Approved
Your change meets the bar and is auto-approved for M56 (branch: 2924)
Just realized this likely regressed implicit channel count changes, will fix on Monday.
Project Member

Comment 11 by bugdroid1@chromium.org, Nov 29 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c242c3a3b9242f63aa079ec10442488d1a6916cb

commit c242c3a3b9242f63aa079ec10442488d1a6916cb
Author: dalecurtis <dalecurtis@chromium.org>
Date: Tue Nov 29 03:28:32 2016

Fix implicit channel layout configurations at non-1.0 playback rates.

Change http://crrev.com/21d17ca06e742cbfdcaaa54e0f4964248e3fbfff
incorrectly assumed the input channel layout never changes. This
patch fixes that issue by allowing the channel mask to change as
layouts change.

BUG= 666813 
TEST=updated unittests
TEST=http://storage.googleapis.com/chcunningham-chrome-shared/534301/aac_test.html

Review-Url: https://codereview.chromium.org/2536013002
Cr-Commit-Position: refs/heads/master@{#434868}

[modify] https://crrev.com/c242c3a3b9242f63aa079ec10442488d1a6916cb/media/cast/test/fake_media_source.cc
[modify] https://crrev.com/c242c3a3b9242f63aa079ec10442488d1a6916cb/media/filters/audio_renderer_algorithm.cc
[modify] https://crrev.com/c242c3a3b9242f63aa079ec10442488d1a6916cb/media/filters/audio_renderer_algorithm.h
[modify] https://crrev.com/c242c3a3b9242f63aa079ec10442488d1a6916cb/media/filters/audio_renderer_algorithm_unittest.cc
[modify] https://crrev.com/c242c3a3b9242f63aa079ec10442488d1a6916cb/media/renderers/audio_renderer_impl.cc
[modify] https://crrev.com/c242c3a3b9242f63aa079ec10442488d1a6916cb/media/renderers/audio_renderer_impl.h
[modify] https://crrev.com/c242c3a3b9242f63aa079ec10442488d1a6916cb/media/renderers/audio_renderer_impl_unittest.cc

Can someone verify the fix. If all looks good please merge your change to M56 (branch: 2924) ASAP so that we could take it for next Dev Release.
Will merge tomorrow after fix has been verified on canary.
Project Member

Comment 14 by sheriffbot@chromium.org, Nov 30 2016

This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible!

If all merges have been completed, please remove any remaining Merge-Approved labels from this issue.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Project Member

Comment 15 by bugdroid1@chromium.org, Nov 30 2016

Labels: -merge-approved-56 merge-merged-2924
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/ce3455666136ca3e39c66aab13e53da72249b26d

commit ce3455666136ca3e39c66aab13e53da72249b26d
Author: Dale Curtis <dalecurtis@chromium.org>
Date: Wed Nov 30 23:24:40 2016

Roll up of playback rate efficiency improvements for M56.

532a5dc Fix implicit channel layout configurations at non-1.0 playback rates.
fe0d379 Lazily allocate WSOLA structures; saves 56kB to 765kB of memory.
d99d0fa Add SSE and NEON intrinsics for WSOLA's MultiChannelDotProduct().
2c82957 Don't run WSOLA over always muted channels

BUG= 666813 

(cherry picked from commit a2cf799e795b7125776e6d58f1bb384c465c4c18)

Review URL: https://codereview.chromium.org/2541873005 .

Cr-Commit-Position: refs/branch-heads/2924@{#226}
Cr-Branched-From: 3a87aecc31cd1ffe751dd72c04e5a96a1fc8108a-refs/heads/master@{#433059}

[modify] https://crrev.com/ce3455666136ca3e39c66aab13e53da72249b26d/media/filters/audio_renderer_algorithm.cc
[modify] https://crrev.com/ce3455666136ca3e39c66aab13e53da72249b26d/media/filters/audio_renderer_algorithm.h
[modify] https://crrev.com/ce3455666136ca3e39c66aab13e53da72249b26d/media/filters/audio_renderer_algorithm_unittest.cc
[modify] https://crrev.com/ce3455666136ca3e39c66aab13e53da72249b26d/media/filters/wsola_internals.cc
[modify] https://crrev.com/ce3455666136ca3e39c66aab13e53da72249b26d/media/renderers/audio_renderer_impl.cc
[modify] https://crrev.com/ce3455666136ca3e39c66aab13e53da72249b26d/media/renderers/audio_renderer_impl.h
[modify] https://crrev.com/ce3455666136ca3e39c66aab13e53da72249b26d/media/renderers/audio_renderer_impl_unittest.cc

Status: Fixed (was: Assigned)

Sign in to add a comment