New issue
Advanced search Search tips

Issue 813504 link

Starred by 2 users

Issue metadata

Status: Verified
Owner:
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug



Sign in to add a comment

Performance drop using AudioParam.setTargetAtTime over implicit dezippering

Reported by andrew.macpherson@soundtrap.com, Feb 19 2018

Issue description

UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36

Steps to reproduce the problem:
1. Turn volume down and go to https://jsfiddle.net/8vo0gbav/1/
2. Check CPU use (~30% on my Macbook Pro)
3. Change the "var dezipper = false;" line in the JS to true.
4. Run again and check CPU use (now ~100% here, with audio dropouts).
5. The number of nodes can be increased if there are no dropouts at the current value.

What is the expected behavior?
After the removal of implicit dezippering in Chrome 66 users need to use setTargetAtTime on AudioParams where they want dezippering to occur, however the performance of using setTargetAtTime appears to be worse than when using implicit dezippering.

What went wrong?
High CPU use and audio dropouts when trying to dezipper AudioParam values.

Did this work before? N/A 

Does this work in other browsers? N/A

Chrome version: 64.0.3282.167  Channel: stable
OS Version: OS X 10.13.2
Flash Version:
 
Labels: Needs-Triage-M64
Cc: vamshi.kommuri@chromium.org
Labels: Triaged-ET Needs-Feedback
Thanks for filing the issue!

Checked the issue on reported chrome version 64.0.3282.167 using MacBook Pro 10.13.1 with the below mentioned steps.
1. Launched Chrome
2. Turned the volume down and Navigated to https://jsfiddle.net/8vo0gbav/1/
3. Checked the CPU usage
4. Changed the "var dezipper = true" and hit Run
5. Observed the CPU Usage.
In first and second case we got a CPU usage as ~30 and ~100 respectively as said in comment#0. Attaching the screen shots of the same.
Note: We observed audio dropouts in "var dezipper = True" case.

@Reporter: Could you please clarify us on the issue as ET team is unclear about the expected and actual behaviour i.e., we couldn't get "setTargetAtTime on AudioParams" and occurrence of dezippering. It would be highly helpful if mentioned whether usage of cpu ~100 and audio dropouts is the issue. 
var dezipper = false.png
620 KB View Download
var dezipper = true.png
633 KB View Download
@vamshi.kommuri: As per  https://crbug.com/496282  dezippering of AudioParams is being removed in Chrome 66. In Chrome 64 I believe that dezippering of AudioParams happens automatically when using the .value setter, but in Chrome 66 this functionality has been removed, so a user needs to use setTargetAtTime() in order to manually dezipper the values.

This isn't a pressing problem in Chrome 64 as we can continue to use the .value setter there, but this will become a problem in Chrome 66 once the .value setter no longer does dezippering internally and we have to switch to setTargetAtTime. The issue is both the high CPU load as well as the dropouts (though I assume the dropouts are just caused by the high CPU load).

Thanks for the help!
Project Member

Comment 4 by sheriffbot@chromium.org, Feb 20 2018

Labels: -Needs-Feedback
Thank you for providing more feedback. Adding the requester to the cc list.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Comment 5 by rtoy@chromium.org, Feb 20 2018

Status: Available (was: Unconfirmed)
I can reproduce this as well with Chrome 65.0.3325.73, Linux. I also hear some glitches, but that might be because the linux audio system is not nearly as consistent as macos.

I did a perf using a repro case (slightly modified) and there are the results, using ToT chromium from today:

Samples: 15K of event 'cycles', Event count (approx.): 9624544187
  Children      Self  Command          Shared Object                                  Symbol                                                   ◆
+   60.12%     0.71%  AudioOutputDevi  libblink_modules.so                            [.] blink::AudioNodeOutput::Pull                         ▒
+   59.99%     0.78%  AudioOutputDevi  libblink_modules.so                            [.] blink::AudioHandler::ProcessIfNecessary              ▒
+   32.94%    32.83%  AudioOutputDevi  libblink_platform.so                           [.] blink::Biquad::Process                               ▒
+   21.35%     0.46%  AudioOutputDevi  libblink_modules.so                            [.] blink::BiquadProcessor::Process                      ▒
+   14.31%    14.24%  AudioOutputDevi  libm-2.24.so                                   [.] __ieee754_pow_sse2                                   ▒
+   13.90%     0.16%  AudioOutputDevi  libblink_modules.so                            [.] blink::BiquadDSPKernel::Process                      ▒
+   13.63%     0.15%  AudioOutputDevi  libblink_modules.so                            [.] blink::BiquadDSPKernel::UpdateCoefficientsIfNecessary▒
+   11.96%    11.93%  AudioOutputDevi  libm-2.24.so                                   [.] __exp1                                               ▒
+    9.46%     1.60%  AudioOutputDevi  libblink_modules.so                            [.] blink::BiquadDSPKernel::UpdateCoefficients           ▒
+    7.71%     0.26%  AudioOutputDevi  libblink_modules.so                            [.] blink::AudioNodeInput::Pull                          ▒
+    7.62%     0.02%  AudioOutputDevi  libmedia.so                                    [.] media::AudioDeviceThread::ThreadMain                 ▒
+    7.59%     0.00%  AudioOutputDevi  libbase.so                                     [.] base::(anonymous namespace)::ThreadFunc              ▒
+    7.52%     0.02%  AudioOutputDevi  libblink_platform.so                           [.] blink::AudioDestination::RequestRender               ▒
+    7.50%     0.00%  AudioOutputDevi  libpthread-2.24.so                             [.] start_thread                                         ▒
+    7.44%     0.00%  AudioOutputDevi  libblink_modules.so                            [.] blink::AudioDestinationHandler::Render               ▒
+    7.42%     0.01%  AudioOutputDevi  libcontent.so                                  [.] content::RendererWebAudioDeviceImpl::Render          ▒
+    7.41%     0.00%  AudioOutputDevi  libblink_platform.so                           [.] blink::AudioDestination::Render                      ▒
+    7.32%     0.01%  AudioOutputDevi  libmedia.so                                    [.] media::SilentSinkSuspender::Render                   ▒
+    7.31%     0.00%  AudioOutputDevi  libmedia.so                                    [.] media::AudioOutputDevice::AudioThreadCallback::Proces▒
+    5.88%     0.56%  AudioOutputDevi  libblink_modules.so                            [.] blink::BiquadProcessor::CheckForDirtyCoefficients    ▒
+    4.98%     4.98%  AudioOutputDevi  libpthread-2.24.so                             [.] pthread_mutex_trylock                                ▒
+    4.43%     0.00%  AudioOutputDevi  [unknown]                                      [.] 0000000000000000                                     ▒
+    4.25%     4.23%  AudioOutputDevi  libm-2.24.so                                   [.] __cos_avx                                            ▒
+    3.56%     0.26%  AudioOutputDevi  libblink_modules.so                            [.] blink::AudioParamHandler::CalculateFinalValues       ▒
+    3.52%     3.51%  AudioOutputDevi  libm-2.24.so                                   [.] __sin_avx                                            ▒
+    3.38%     3.37%  AudioOutputDevi  libpthread-2.24.so                             [.] __pthread_mutex_unlock_usercnt                       ▒
+    3.26%     3.26%  AudioOutputDevi  libblink_platform.so                           [.] blink::Biquad::SetLowpassParams                      ▒
+    2.80%     0.48%  AudioOutputDevi  libblink_modules.so                            [.] blink::AudioParamHandler::Smooth                     ▒
+    2.26%     2.25%  AudioOutputDevi  libm-2.24.so                                   [.] do_cos.isra.0                                        ▒
+    1.79%     1.26%  AudioOutputDevi  libblink_modules.so                            [.] blink::AudioParamTimeline::ValuesForFrameRange       ▒

The interesting part is that ValuesForFrameRange is only 1.26% of the whole processing.  Most of the time is spent in Biquad::Process (32%), which isn't surprising because this is the guts of the filter (of which there are 512).

And surprisingly __ieee754_pow_sse2 and __exp1 represent 26% of the total.

Comment 6 by rtoy@chromium.org, Feb 20 2018

Oops.  The profile result in c#5 is for when dezipper = false.  That is, use the value setter, which in chrome canary is setValueAtTime.  For dezipper = true, we explicitly call setTargetAtTime, which is where the slowdown is.

Here is the perf result for that case:

+   38.02%     0.17%  AudioOutputDevi  libblink_modules.so                             [.] blink::AudioNodeOutput::Pull                        ▒
+   37.95%     0.23%  AudioOutputDevi  libblink_modules.so                             [.] blink::AudioHandler::ProcessIfNecessary             ▒
+   26.63%    26.55%  AudioOutputDevi  libm-2.24.so                                    [.] __ieee754_pow_sse2                                  ▒
+   24.68%     0.06%  AudioOutputDevi  libblink_modules.so                             [.] blink::BiquadProcessor::Process                     ▒
+   24.20%     0.03%  AudioOutputDevi  libblink_modules.so                             [.] blink::BiquadDSPKernel::Process                     ▒
+   24.12%     0.03%  AudioOutputDevi  libblink_modules.so                             [.] blink::BiquadDSPKernel::UpdateCoefficientsIfNecessar▒
+   20.62%    20.58%  AudioOutputDevi  libm-2.24.so                                    [.] __exp1                                              ▒
+   16.63%     2.69%  AudioOutputDevi  libblink_modules.so                             [.] blink::BiquadDSPKernel::UpdateCoefficients          ▒
+   13.01%     1.04%  AudioOutputDevi  libblink_modules.so                             [.] blink::AudioParamHandler::CalculateFinalValues      ▒
+    7.69%     7.67%  AudioOutputDevi  libm-2.24.so                                    [.] __cos_avx                                           ▒
+    6.99%     3.83%  AudioOutputDevi  libblink_modules.so                             [.] blink::AudioParamTimeline::ValuesForFrameRange      ▒
+    6.74%     0.01%  AudioOutputDevi  libblink_modules.so                             [.] blink::BiquadProcessor::ProcessOnlyAudioParams      ▒
+    6.19%     6.18%  AudioOutputDevi  libm-2.24.so                                    [.] __sin_avx                                           ▒
+    5.73%     5.72%  AudioOutputDevi  libblink_platform.so                            [.] blink::Biquad::SetLowpassParams                     ▒
+    4.93%     4.92%  AudioOutputDevi  libblink_platform.so                            [.] blink::Biquad::Process                              ▒
+    4.65%     0.08%  AudioOutputDevi  libblink_modules.so                             [.] blink::AudioNodeInput::Pull                         ▒
+    4.54%     0.00%  AudioOutputDevi  libblink_modules.so                             [.] blink::AudioDestinationHandler::Render              ▒
+    4.54%     0.00%  AudioOutputDevi  libblink_platform.so                            [.] blink::AudioDestination::RequestRender              ▒
+    4.48%     0.00%  AudioOutputDevi  libblink_platform.so                            [.] blink::AudioDestination::Render                     ▒
+    4.47%     0.00%  AudioOutputDevi  libcontent.so                                   [.] content::RendererWebAudioDeviceImpl::Render         ▒
+    4.41%     0.00%  AudioOutputDevi  libmedia.so                                     [.] media::SilentSinkSuspender::Render                  ▒
+    4.38%     0.00%  AudioOutputDevi  libmedia.so                                     [.] media::AudioDeviceThread::ThreadMain                ▒
+    4.38%     0.00%  AudioOutputDevi  libmedia.so                                     [.] media::AudioOutputDevice::AudioThreadCallback::Proce▒
+    4.37%     0.00%  AudioOutputDevi  libbase.so                                      [.] base::(anonymous namespace)::ThreadFunc             ▒
+    4.32%     0.00%  AudioOutputDevi  libpthread-2.24.so                              [.] start_thread                                        ▒
+    3.75%     3.74%  AudioOutputDevi  libm-2.24.so                                    [.] do_cos.isra.0                                       ▒
+    3.03%     1.14%  AudioOutputDevi  libblink_modules.so                             [.] blink::AudioParamTimeline::ValuesForFrameRangeImpl  ▒
+    2.38%     0.64%  AudioOutputDevi  libblink_platform.so                            [.] blink::AudioBus::Create                             ▒
+    1.97%     1.97%  AudioOutputDevi  libm-2.24.so                                    [.] __pow                                               ▒
+    1.70%     0.80%  AudioOutputDevi  libblink_platform.so                            [.] blink::AudioBus::AudioBus                           ▒
+    1.59%     1.48%  AudioOutputDevi  libblink_modules.so                             [.] blink::AudioParamTimeline::ProcessSetTarget         

ProcessSetTarget is the code that handles the automation for setTarget.  It's just 1.48%.

And again, __ieee754_pow_sse2 and __exp1 represent a whopping 47% of the processing.  Simplifying or removing these calls would help a lot, but I'm not sure how much we can do.  

Because you're automating the Q value of the biquad filter, the coefficients of the biquad need to be recomputed every frame, and the formulas to convert the biquad parameters to filter coefficients are quite complex.
Thanks for looking into it rtoy! What I had assumed was that in Chrome 64 the performance of using the .value setter would be similar to using .setTargetAtTime(), since both should result in dezippering. I would then have expected in Chrome 66 for the .value setter to get cheaper since it became a .setValueAtTime() call. So I'm mostly wondering why using .setTargetAtTime is so expensive compared to the .value setter in Chrome 64 since they should both be dezippering the values.

Comment 8 by rtoy@chromium.org, Feb 20 2018

Yeah, there's something going on here that I don't understand. I expected setTargetAtTime to be a little bit more expensive, but not 3-4 times more expensive.

And some of the perf results don't really make sense to me.  Maybe I don't know how to read perf reports.

More study needed....
I built version 64.0.3282.186 here and ran it through Instruments on OSX for both the .value setter and setTargetAtTime() versions. Here are the last few frames of the "Heaviest Stack Trace" from Instruments on a 20 second session.

With the .value setter:

   8 Chromium Framework 927.0  blink::AudioHandler::ProcessIfNecessary(unsigned long) ../../third_party/WebKit/Source/modules/webaudio/AudioNode.cpp:325
   7 Chromium Framework 927.0  blink::AudioNodeInput::Pull(blink::AudioBus*, unsigned long) ../../third_party/WebKit/Source/modules/webaudio/AudioNodeInput.cpp:207
   6 Chromium Framework 927.0  blink::AudioNodeOutput::Pull(blink::AudioBus*, unsigned long) ../../third_party/WebKit/Source/modules/webaudio/AudioNodeOutput.cpp:140
   5 Chromium Framework 927.0  blink::AudioHandler::ProcessIfNecessary(unsigned long) ../../third_party/WebKit/Source/modules/webaudio/AudioNode.cpp:346
   4 Chromium Framework 714.0  blink::BiquadProcessor::Process(blink::AudioBus const*, blink::AudioBus*, unsigned long) ../../third_party/WebKit/Source/modules/webaudio/BiquadProcessor.cpp:117
   3 Chromium Framework 584.0  blink::BiquadDSPKernel::Process(float const*, float*, unsigned long) ../../third_party/WebKit/Source/modules/webaudio/BiquadDSPKernel.cpp:157
   2 Chromium Framework 423.0  blink::Biquad::Process(float const*, float*, unsigned long) ../../third_party/WebKit/Source/platform/audio/Biquad.cpp:135
   1 Chromium Framework 423.0  blink::Biquad::ProcessFast(float const*, float*, unsigned long) ../../third_party/WebKit/Source/platform/audio/Biquad.cpp:233

And with setTargetAtTime:

  11 Chromium Framework 5100.0  blink::AudioHandler::ProcessIfNecessary(unsigned long) ../../third_party/WebKit/Source/modules/webaudio/AudioNode.cpp:325
  10 Chromium Framework 4796.0  blink::AudioNodeInput::Pull(blink::AudioBus*, unsigned long) ../../third_party/WebKit/Source/modules/webaudio/AudioNodeInput.cpp:207
   9 Chromium Framework 4796.0  blink::AudioNodeOutput::Pull(blink::AudioBus*, unsigned long) ../../third_party/WebKit/Source/modules/webaudio/AudioNodeOutput.cpp:140
   8 Chromium Framework 4796.0  blink::AudioHandler::ProcessIfNecessary(unsigned long) ../../third_party/WebKit/Source/modules/webaudio/AudioNode.cpp:346
   7 Chromium Framework 3773.0  blink::BiquadProcessor::Process(blink::AudioBus const*, blink::AudioBus*, unsigned long) ../../third_party/WebKit/Source/modules/webaudio/BiquadProcessor.cpp:117
   6 Chromium Framework 3773.0  blink::BiquadDSPKernel::Process(float const*, float*, unsigned long) ../../third_party/WebKit/Source/modules/webaudio/BiquadDSPKernel.cpp:154
   5 Chromium Framework 3773.0  blink::BiquadDSPKernel::UpdateCoefficientsIfNecessary(int) ../../third_party/WebKit/Source/modules/webaudio/BiquadDSPKernel.cpp:34
   4 Chromium Framework 2733.0  blink::AudioParamHandler::CalculateSampleAccurateValues(float*, unsigned int) ../../third_party/WebKit/Source/modules/webaudio/AudioParam.cpp:152
   3 Chromium Framework 2733.0  blink::AudioParamHandler::CalculateFinalValues(float*, unsigned int, bool) ../../third_party/WebKit/Source/modules/webaudio/AudioParam.cpp:169
   2 Chromium Framework 1732.0  blink::AudioParamHandler::CalculateTimelineValues(float*, unsigned int) ../../third_party/WebKit/Source/modules/webaudio/AudioParam.cpp:215
   1 Chromium Framework 1694.0  blink::AudioParamTimeline::ValuesForFrameRange(unsigned long, unsigned long, float, float*, unsigned int, double, double, float, float) ../../third_party/WebKit/Source/modules/webaudio/AudioParamTimeline.cpp:114

Number value is cumulative time in milliseconds. It looks like there is much more time being spent in processing with setTargetAtTime and digging down a bit further it looks like BiquadDSPKernel::UpdateCoefficientsIfNecessary is a cause. There's a check in there for "if (GetBiquadProcessor()->HasSampleAccurateValues())" which then maybe calls CalculateSampleAccurateValues, would it be that setTargetAtTime uses this code path and the .value setter doesn't?

Comment 10 by rtoy@chromium.org, Feb 26 2018

Thanks for the profile results. That the value setter doesn't seem to be calling HasSampleAccurateValues in Chrome 64 would certainly explain the difference.  However, that would be a bug in Chrome 64 because the biquad filter is supposed to be a-rate so the dezippering in the value setter should call CalculateSampleAccurateValues.  Perhaps that was missed when we implemented a-rate parameters for the BiquadFilterNode. (This a-rate change was done in M50.)
Thanks for the update rtoy! Does this mean that BiquadFilterNode parameters via the .value setter were never being dezippered? Our biggest concern is around maintaining current performance without degrading sound quality due to any dezippering changes. Firefox shows roughly the same performance on the jsfiddle (~30% cpu use) whether using setTargetAtTime or not, which also led to wondering if there was a bug here.

Comment 12 by rtoy@chromium.org, Feb 26 2018

I'm pretty sure they're being dezippered.  And as a I check, I (un)applied the CL from https://chromium-review.googlesource.com/c/chromium/src/+/612104 so I can work on ToT chromium instead chrome 64, and CalculateSampleAccurateValues() is getting called with the old dezippering code.

So I'm not really sure where the slowdown is coming from.
I just pulled ToT (66.0.3356.0) and reverted the patch you linked and updated the fiddle to start via a button (autoplay stuff in Chrome 66):

https://jsfiddle.net/8vo0gbav/31/

Results are similar to what was seen in Chrome 64 so I did a bit of debugging and you're right that both cases have GetBiquadProcessor()->HasSampleAccurateValues() == true in UpdateCoefficientsIfNecessary, however it's the first check in that method (for GetBiquadProcessor()->FilterCoefficientsDirty()) which is true on every call in the setTargetAtTime case, whereas it's only true ~1/10 times with the .value setter. So it's not that they're using different code paths but that it's just updating the coefficients much more frequently with setTargetAtTime. The coefficients are being marked dirty in here on every call in the setTargetAtTime case:

https://cs.chromium.org/chromium/src/third_party/WebKit/Source/modules/webaudio/BiquadProcessor.cpp?l=70


Any thoughts?
In fact it looks like setTargetAtTime doesn't need to be called regularly, just calling it once per parameter seems to be enough to cause the CPU to jump to ~100% and stay there:

https://jsfiddle.net/8vo0gbav/52/

Comment 15 by rtoy@chromium.org, Feb 27 2018

Yeah, I noticed yesterday that the value setter wasn't always doing the sample-accurate path.  I'll need to do more digging to figure out why that's happening, because that is unexpected, especially since you're setting new values every 100 ms or so.

And about c#14, that's a bug somewhere.  setTargetAtTime should eventually stop using the timeline when the actual value is close enough to the target value, about 10 time constants.
Thanks rtoy. It would be pretty important from our side to ensure that we get a fix in for this before it lands in Chrome Stable, we haven't been using setTargetAtTime much before but we'll be relying on it for dezippering now. Would it help for me to look into this further and would any potential fix need to be in place before the Chrome 66 branch (which is coming very soon)?

Comment 17 by rtoy@chromium.org, Feb 27 2018

Owner: rtoy@chromium.org
Status: Assigned (was: Available)
Ok, there's a round-off problem for c#14.  In one test case, the target value is -2.19524 and the current value is -2.19518.  With a time constant of 0.05 sec, the increment to be added is 2.72047e-8 so the new value remains basically the same.

The thresholds for convergence are too tight.  Also, there a test the event has lasted for more than 5 time constants (which is true in this case), but we're requiring convergence too.

This can probably be fixed in time for M66.  Still investigating the other parts, though.
From what I've seen I'm suspecting that c#14 may be the whole problem, I just hadn't isolated it enough before. Unless some of the earlier results look suspicious to you.

Comment 19 by rtoy@chromium.org, Feb 27 2018

I'm not sure about c#14 being the whole problem.  In the original repro case, you're calling setTargetAtTime every 100 ms.  This will never converge in 5 time constants (5*50 ms) unless the random values happen to be very very close to each other.

But the fix mentioned in c#17 will be fairly straightforward to land in time for 66.
Ah ok, I see. The 50ms time constant in this case came from a comment you made around the values being used internally (50ms for Biquad):

https://bugs.chromium.org/p/chromium/issues/detail?id=496282#c16

We're mainly interested in making the transition to manual dezippering as invisible as possible to end users. Thanks again!

Comment 21 by rtoy@chromium.org, Feb 27 2018

Re c#13. One other thing that I've seen is that once in a while AudioParamTimeline::ValueForContextTime() can't get the events_lock (probably because some new event is being inserted), so it gives up and returns the default value, fooling the code into thinking the code that sample-accurate values are needed.

I don't think, however, this happens often enough to cause the setter to be much faster.  There's something else at play, that I haven't tracked down yet.
Project Member

Comment 22 by bugdroid1@chromium.org, Feb 28 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a3c8a81f604a306174a02e8f2d574fced6a11256

commit a3c8a81f604a306174a02e8f2d574fced6a11256
Author: Raymond Toy <rtoy@chromium.org>
Date: Wed Feb 28 18:38:17 2018

Refine condition when setTargetAtTime has converged

The function HasSetTargetConverged was too restrictive and would
continue to say the setTargetAtTime event was not converged even
though the increment was so small as make no change in the value.

Thus, adjust the criteria so that if the increment is too small to
affect the value, consider the event has having converged.

There's also a bug in HandleAllEventsInThePast where setTargetAtTime
has converged and we've updated the default value but the timeline
would return the old default value instead of the new converged
default value.

A couple of tests needed to be updated due to the change in the
convergence criterion.

Ran the test https://jsfiddle.net/8vo0gbav/52/ and the CPU now no
longer goes to 100% as it did without this change.

Bug:  813504 
Change-Id: I506b31289b5b40380147d231d8b2ea41785d6600
Reviewed-on: https://chromium-review.googlesource.com/940273
Reviewed-by: Hongchan Choi <hongchan@chromium.org>
Commit-Queue: Raymond Toy <rtoy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#539900}
[modify] https://crrev.com/a3c8a81f604a306174a02e8f2d574fced6a11256/third_party/WebKit/LayoutTests/webaudio/AudioParam/audioparam-setTargetAtTime-limit.html
[modify] https://crrev.com/a3c8a81f604a306174a02e8f2d574fced6a11256/third_party/WebKit/LayoutTests/webaudio/AudioParam/audioparam-setTargetAtTime.html
[modify] https://crrev.com/a3c8a81f604a306174a02e8f2d574fced6a11256/third_party/WebKit/Source/modules/webaudio/AudioParamTimeline.cpp
[modify] https://crrev.com/a3c8a81f604a306174a02e8f2d574fced6a11256/third_party/WebKit/Source/modules/webaudio/AudioParamTimeline.h

Comment 23 by rtoy@chromium.org, Feb 28 2018

Andrew, your demo uses biquad filters to illustrate the problem.  Is this the typical case you're seeing?  I'm guessing the other nodes that had dezippering removed aren't as important to your use case.

I ask because the a-rate biquad filters are quite expensive in computing the coefficients.

I'm still investigating this, and have some general optimizations that help a lot, but not enough to go from 100% back to 30-40% CPU.  

Comment 24 by rtoy@chromium.org, Feb 28 2018

One other thing that is probably making analysis harder.  In ToT, the value setter was also changed to be the identical to setValueAtTime.  In the implementation, we actually call setValueAtTime.  This causes the timeline code to run.

Perhaps we should change the implementation so as not to call setValueAtTime?

Comment 25 by rtoy@chromium.org, Mar 1 2018

Ok. I think I know why Chrome 64 is so much faster.  We screwed up implementing a-rate biquads and the a-rate doesn't actually happen with the value setter.  The value setter dezippering does happen, but the filter coefficients are updated just once per render quantum, and each render gets the dezippered value.

Hence, we never do the sample accurate case.

One really gross solution:  Instead of calling setTargetAtTime(v, t, tau), call setValueAtTime(v[n], t + n*128/context.sampleRate, tau) and v[n] is a manually dezippered value.  Every time you would call setTargetAtTime (or any other automation function), you have to remember to cancelScheduledValues (or cancelAndHold) to remove the effect of the setValueAtTime in the future.

I have not tried this out yet but I think the performance would be close to the value setter version.

Other alternatives:  

1. Don't do a-rate biquads (which are required by the spec, but FF and others probably don't do yet.).
2. Don't remove dezippering (FF never dezippered biquads).
3. Change the spec so that AudioParams can be a-rate or k-rate as desired by the user.

I pulled from ToT last night and the problem from c#14 is fixed, thanks for that! As you said though there is still high CPU use in the other case where setTargetAtTime is being called every 100ms.

The use case in question is functionality in Soundtrap that we added after the performance regression due to the changed threading model for AudioWorklet. We're basically adding audio tracks to a project, using a pre-defined effects chain which uses most available node types, and seeing how many tracks we can add before audio starts glitching. It's very rough and high-level but has been effective so far at pointing out regressions, and mirrors real user experience. This test also shows an improvement now but still not equal to pre-setTargetAtTime performance on OSX.

On some Chromebooks were were seeing a 50% drop in "performance" in this test using setTargetAtTime over implicit dezippering, so a machine that could run with 10 tracks before glitching could now only handle 5. I should add that this test was done in Chrome 65 though, so if any optimizations were added in Chrome 66 when dezippering was officially removed I missed those (I'll need to re-run the test). I'm using Biquad here because when looking into the performance regression this was the first node type I identified as worse, it's possible that others have also regressed but since Biquad is heavy it was the easiest to see.

If it were only up to me I would definitely lean towards postponing the removal of implicit dezippering. I'd be happy to provide additional testing (the cancelScheduledValues solution or anything else) and maybe the fix you've already done for setTargetAtTime convergence will have a big effect, but knowing that we have a fallback in case it's not enough would be really great. Again, the biggest concern here is ChromeOS, where Chrome Stable is the only available browser.

Any thoughts?

Comment 27 by rtoy@chromium.org, Mar 1 2018

My gross workaround doesn't actually work because there's still a timeline. It would require some additional code to make it work fast, but it would make for a pretty terrible developer experience.

We could revert the removal of dezippering for the Biquad. Not sure I could get it landed before the M66 branch, though.  Might require additional review because we already did an intent to remove a few months ago, so I might need another review for an intent to unremove.

We could propose adding a automationRate attribute to AudioParams in the spec.  I think it's a relatively simple change, but might be too late to do in version 1.  Implementation would be simple too.  However, that definitely won't happen for M66 but could for M67.

Lastly, the ultimate fix:  Use an AudioWorklet to fix the biquad filter dezippering issue.  A bit of work but I was going to implement a Biquad in a worklet anyway, as a side project for fun.

Comment 28 by rtoy@chromium.org, Mar 1 2018

FWIW, I hacked the code so that automations for Biquads are always k-rate.  For the repro case, the value setter was using about 33% (+/- 2-3) CPU.  With k-rate setValueAtTime, CPU is about 37%.  I think that's reasonable.

We can also do some optimizations to make sample accurate automation run faster, but I think it won't be nearly fast enough to bring CPU for >= 100% to 35-40%, and it would only apply if the automations are changing relatively slowly over a render quantum.
Project Member

Comment 29 by bugdroid1@chromium.org, Mar 1 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f5087a275ca258d91d475266584e15d81d20b143

commit f5087a275ca258d91d475266584e15d81d20b143
Author: Raymond Toy <rtoy@chromium.org>
Date: Thu Mar 01 19:00:16 2018

Replace pow with exp/exp2 in BiquadFilter routines

The general pow function is pretty expensive when computing 10^x and
2^x.  Replace it with the more specific exp and exp2 functions.

This is only done for the routines for the BiquadFilterNode.

Using https://jsfiddle.net/8vo0gbav/1/ (with dezipper = true to use
setTargetAtTime) as a test, perf top on my Z840 machine said:

+   29.01%    28.98%  __ieee754_pow_sse2
+   22.65%    22.61%  __exp1

With this CL, the same test reports

+   17.15%    17.12%  __expf_finite

This isn't enough to make the test case use less than 100% CPU, but
audio with glitches is now heard.

Two tests needed to have the thresholds modified.  Apparently there's
a difference in round-off between exp and pow for this change to
change the results slightly.

Bug:  813504 
Change-Id: I7c517c558bbb54c344f10dc3c057d21404679995
Reviewed-on: https://chromium-review.googlesource.com/942091
Reviewed-by: Hongchan Choi <hongchan@chromium.org>
Commit-Queue: Raymond Toy <rtoy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#540235}
[modify] https://crrev.com/f5087a275ca258d91d475266584e15d81d20b143/third_party/WebKit/LayoutTests/webaudio/BiquadFilter/biquad-getFrequencyResponse.html
[modify] https://crrev.com/f5087a275ca258d91d475266584e15d81d20b143/third_party/WebKit/LayoutTests/webaudio/BiquadFilter/biquad-highshelf.html
[modify] https://crrev.com/f5087a275ca258d91d475266584e15d81d20b143/third_party/WebKit/LayoutTests/webaudio/BiquadFilter/no-dezippering.html
[modify] https://crrev.com/f5087a275ca258d91d475266584e15d81d20b143/third_party/WebKit/Source/modules/webaudio/BiquadDSPKernel.cpp
[modify] https://crrev.com/f5087a275ca258d91d475266584e15d81d20b143/third_party/WebKit/Source/platform/audio/Biquad.cpp

Thanks rtoy. From this side I think we're open to any possible solution that will avoid killing performance on low-end devices in Chrome 66. If this means temporarily switching from a-rate to k-rate for Biquads, or making a/k-rate configurable, or temporarily reverting dezippering until a/k-rate changes can be approved in the spec, any of those would be fine as long as it results in reasonable performance.

I'd still feel safest if it were possible to revert the dezippering for now since I would need to do low-end device testing with the k-rate change to see its impact. Looking back at the intent to remove for dezippering it looks like there is no version specified, would it be possible to simply postpone its removal until M67 or M68 pending the addition and testing of the k-rate change?

WDYT?

Comment 31 by rtoy@chromium.org, Mar 2 2018

The M66 branch has been made.  The removal of dezippering was supposed to have happened in 64 (or 65?), but we were slow in removing it.

I don't want to revert all the removals, but maybe biquads would be enough.  I also won't be putting back the console messages either.

It seems likely that the spec will allow AudioParams with selectable a-rate or k-rate.  But that probably won't happen until 67 (even though I do have a working prototype now.)

I'll see what I can do; it's generally difficult to get changes into the stable branch unless they're security issues.

Comment 32 by rtoy@chromium.org, Mar 2 2018

One last experiment:  https://jsfiddle.net/8vo0gbav/80/ uses setValueAtTime instead of setTargetAtTime.  (Ignore the stuff about automationRate).

CPU usage stays about the same for me at roughly 35%.  The output sound is pretty the same as using the value setter.

Is this a viable alternative?  At least until we update the spec to allow selectable automation rates?
Thanks rtoy. The sound output may be the same with setValueAtTime in this repro scenario but a real scenario would involve changing more than just the Q param every 100ms so I'm not sure how the same sound could be achieved by avoiding the use of setTargetAtTime.

My concern here is still about rolling this out to users when we know that there is at least one performance regression if dezippering is required. I'm also concerned that even if we're forced to disable dezippering in M66 due to this issue that we may encounter further issues due to the changed behaviour of the .value setter, which now also causes timeline processing and may also be heavier than expected due to the a-rate Biquad params. Does that make sense?
Project Member

Comment 34 by bugdroid1@chromium.org, Mar 6 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4d80299ac53b23961a5b4dc3c2a247debdd187f0

commit 4d80299ac53b23961a5b4dc3c2a247debdd187f0
Author: Raymond Toy <rtoy@chromium.org>
Date: Tue Mar 06 23:04:48 2018

Shortcut Biquad computations if values are constant

Check to see if the Biquad AudioParam values are all constant for the
duration of the render.  If they are, we can skip computing the filter
coefficients for the render and just compute the one constant value.

Using https://jsfiddle.net/8vo0gbav/87/ as a test, CPU usage is
reduced from 35-38% down to 25-28% CPU.  The bulk of the change is
from calling BiquadDSPKernel::UpdateCoefficients less often. It was
accounting for 17.8% but is now just 2%. (According to perf top on
linux.)

Bug:  813504 
Change-Id: I1e58dc433f7c03d4e0daa13a195b90328e03312a
Reviewed-on: https://chromium-review.googlesource.com/951628
Reviewed-by: Hongchan Choi <hongchan@chromium.org>
Commit-Queue: Raymond Toy <rtoy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#541227}
[modify] https://crrev.com/4d80299ac53b23961a5b4dc3c2a247debdd187f0/third_party/WebKit/Source/modules/webaudio/BiquadDSPKernel.cpp

Comment 35 by rtoy@chromium.org, Mar 7 2018

Andrew mentioned in a separate message that https://chromium-review.googlesource.com/940273 fixed most of the performance regression in their application.  And using setValueAtTime instead of setTargetAtTime is probably good enough for now, so performance is roughly equivalent.

It would be great, Andrew, if you could confirm my statements above.

The last item in c#34 should help improve performance for the setValueAtTime case.


Comment 36 by rtoy@chromium.org, Mar 7 2018

Labels: Merge-Request-66
Requesting merge to M66 of the small CL in c#34 (https://chromium-review.googlesource.com/c/chromium/src/+/951628).  This simple change improves the test for about 35% CPU usage to 25% CPU usage.

All the other CLs mentioned in this bug landed before the M66 branch.
Just confirming what rtoy mentioned in c#35, we spoke offline and the main source of the performance regression was fixed by the CL in that comment. Performance is still a bit lower than before in the specific case of dezippering a Biquad param but with the additional optimization mentioned in c#36 and with a potential update to the Web Audio spec around optionally making AudioParams k-rate that's good enough for us.

Thanks for the help here rtoy!
Project Member

Comment 38 by sheriffbot@chromium.org, Mar 8 2018

Labels: -Merge-Request-66 Merge-Approved-66 Hotlist-Merge-Approved
Your change meets the bar and is auto-approved for M66. Please go ahead and merge the CL to branch 3359 manually. Please contact milestone owner if you have questions.
Owners: cmasso@(Android), cmasso@(iOS), josafat@(ChromeOS), abdulsyed@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Project Member

Comment 39 by bugdroid1@chromium.org, Mar 8 2018

Labels: -merge-approved-66 merge-merged-3359
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8cd23d620edef7eab5c3c04c3bc7d0b284328740

commit 8cd23d620edef7eab5c3c04c3bc7d0b284328740
Author: Raymond Toy <rtoy@chromium.org>
Date: Thu Mar 08 18:44:17 2018

Shortcut Biquad computations if values are constant

Check to see if the Biquad AudioParam values are all constant for the
duration of the render.  If they are, we can skip computing the filter
coefficients for the render and just compute the one constant value.

Using https://jsfiddle.net/8vo0gbav/87/ as a test, CPU usage is
reduced from 35-38% down to 25-28% CPU.  The bulk of the change is
from calling BiquadDSPKernel::UpdateCoefficients less often. It was
accounting for 17.8% but is now just 2%. (According to perf top on
linux.)

Bug:  813504 
Change-Id: I1e58dc433f7c03d4e0daa13a195b90328e03312a
Reviewed-on: https://chromium-review.googlesource.com/951628
Reviewed-by: Hongchan Choi <hongchan@chromium.org>
Commit-Queue: Raymond Toy <rtoy@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#541227}(cherry picked from commit 4d80299ac53b23961a5b4dc3c2a247debdd187f0)
Reviewed-on: https://chromium-review.googlesource.com/955678
Reviewed-by: Raymond Toy <rtoy@chromium.org>
Cr-Commit-Position: refs/branch-heads/3359@{#104}
Cr-Branched-From: 66afc5e5d10127546cc4b98b9117aff588b5e66b-refs/heads/master@{#540276}
[modify] https://crrev.com/8cd23d620edef7eab5c3c04c3bc7d0b284328740/third_party/WebKit/Source/modules/webaudio/BiquadDSPKernel.cpp

Labels: Needs-Feedback
As we are unable to reproduce this consistently as per the test in Comment#2

andrew.macpherson@: Could you please help us in verifying this on the latest canary.
You can download the same from here
https://www.google.com/chrome/browser/canary.html

Thanks!
Hi vamshi.kommuri@, I can confirm that this is fixed as discussed with rtoy in the latest Canary 67.0.3366.0. Thanks!

Comment 42 by rtoy@chromium.org, Mar 9 2018

Status: Verified (was: Assigned)
The merge has also landed (and stuck), and Andrew has confirmed this fixes the immediate issue.  (There are still some others but they require spec changes.)

Closing.

Sign in to add a comment