AudioWorklet Glitches: Linux/PulseAudio
Reported by
mossblaser@gmail.com,
Mar 26 2018
Issue description
Chrome Version: 67.0.3377.1 (Official Build) dev (64-bit)
OS Version: Ubuntu 16.04 (LTS) x86_64
Other browsers tested: n/a (Worklets not available elsewhere)
What steps will reproduce the problem?
1. Open the attached page (under Linux/PulseAudio)
2. Upon opening the page a single "click" should be heard as the audio worklet in the page sends a DC signal to the output (i.e. a constant stream of '1.0' values).
What is the expected result?
Silence (after the initial click)
What happens instead of that?
Every now and then (around once a minute or so) a 'click' will occur as samples from the AudioWorklet are dropped.
Please provide any additional information below. Attach a screenshot if possible.
PulseAudio version 1:8.0-0ubuntu3.8
Sound card: Realtek ALC3235
The same page under OS X does not produce clicks/glitches. Changing the worklet code to produce a sine wave produces the same effect, though it is more annoying to listen to than the silence produced by the attached example.
UserAgentString: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3377.1 Safari/537.36
,
Apr 9 2018
Probably the same as issue 817314.
,
Apr 10 2018
> Probably the same as issue 817314.
Is this issue currently private or am I doing something silly trying to view it? (I get "User is not allowed to view this issue"). If this issue is private, would it be possible to give a brief description? (Thanks!)
,
Apr 10 2018
Oh right, that talks about an internal Linux distribution, so it's private, but there's speculation that it's due to slow fork() (see issue 819228), which probably impacts this bug as well.
,
Apr 10 2018
Though... now that I say that, you've mentioned that this happens relatively immediately and not after some time, so this may be a different issue then. What do you have as the value for this histogram? chrome://histograms/Media.LinuxAudioIO
,
Apr 11 2018
Histogram: Media.LinuxAudioIO recorded 1 samples, mean = 0.0 (flags = 0x41)
0 ------------------------------------------------------------------------O (1 = 100.0%)
1 ...
,
Apr 11 2018
Thanks, it's definitely using PulseAudio then. +appropriate folks and labels.
,
Apr 11 2018
The build 67.0.3377.1 does not require the experimental flag. The AudioWorklet is enabled by default. A trace from `chrome://tracing` would help. (record categories: audio, blink, blink_gc, webaudio)
,
Apr 11 2018
mossblaser@ could you do a trace recording [1] for "audio", "webaudio", "top" categories in a continuous mode (please stop it right after a glitch is heard). Thanks! [1] https://www.chromium.org/developers/how-tos/trace-event-profiling-tool/recording-tracing-runs
,
Apr 11 2018
Passing to hongchan@
,
Apr 12 2018
Please find the trace attached. Thanks for looking into this! NB: Following hongchan@'s advice I've restarted chrome without the audio worklet flag.
,
Apr 13 2018
The trace shows that the audio clock from the infra (AudioOutputDevice) is quite irregular. The task in the WebAudio thread is being finished way earlier than the next clock. I blame PulseAudio's unstable clock and I don't think this is a WebAudio issue. olka@ WDYT?
,
Apr 16 2018
The AudioOutputDevice callback irregularity looks a bit weird, agreed. Another reason for glitches could be a FIFO underflow, which does not seem to be traced (worth adding that tracing?)
mossblaser@ could you reproduce the problem, then close the tab, open chrome://histograms/ and look up
WebAudio.PushPullFIFO.UnderflowPercentage
WebAudio.PushPullFIFO.UnderflowGlitches
there - what are the values? Thanks!
,
Apr 16 2018
A debug recording (chrome://webrtc-internals > Create Dump > Enable diagnostic audio recordings) would help figuring out if the issue is before or after the data reaches pulseaudio.
,
Apr 16 2018
(Taking another look at the traces, an underflow seems unlikely.) mossblaser@ - is other audio in Chrome working well for you? (like media element playback) And what default sample rate the device is configured with?
,
Apr 17 2018
> mossblaser@ could you reproduce the problem, then close the tab, open
> chrome://histograms/ and look up
> WebAudio.PushPullFIFO.UnderflowPercentage
> WebAudio.PushPullFIFO.UnderflowGlitches
> there - what are the values?
Histogram: WebAudio.PushPullFIFO.UnderflowPercentage recorded 5 samples, mean = 0.0 (flags = 0x41)
0 ------------------------------------------------------------------------O (5 = 100.0%)
1 ...
Histogram: WebAudio.PushPullFIFO.UnderflowGlitches recorded 5 samples, mean = 1.0 (flags = 0x41)
0 O (0 = 0.0%)
1 ------------------------------------------------------------------------O (5 = 100.0%) {0.0%}
2 O (0 = 0.0%) {100.0%}
> A debug recording (chrome://webrtc-internals > Create Dump > Enable diagnostic
> audio recordings) would help figuring out if the issue is before or after the
> data reaches pulseaudio.
See attached gzipped file recorded using the above tool. The hexdump below seems to show the glitch occurring towards the end of the file:
$ hexdump glitch.wav
0000000 4952 4646 2824 0439 4157 4556 6d66 2074
0000010 0010 0000 0001 0002 ac44 0000 b110 0002
0000020 0004 0010 6164 6174 2800 0439 7fff 7fff
0000030 7fff 7fff 7fff 7fff 7fff 7fff 7fff 7fff
*
4182820 7fff 7fff 7fff 7fff 7fff 7fff 0000 0000
4182830 0000 0000 0000 0000 0000 0000 0000 0000
*
4183020 0000 0000 0000 0000 0000 0000 7fff 7fff
4183030 7fff 7fff 7fff 7fff 7fff 7fff 7fff 7fff
*
439282c
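As a rough cross-check of the dump above, the length of the zeroed run can be worked out from the row offsets. This is only a sketch: it assumes the default `hexdump` layout (hexadecimal offsets, 16-bit little-endian words) and takes the format fields straight from the header bytes shown.

```python
# WAV format fields read from the header bytes in the dump above.
# hexdump prints 16-bit little-endian words, so "ac44 0000" is 0x0000ac44.
SAMPLE_RATE = 0xAC44          # 44100 Hz
CHANNELS = 0x0002             # stereo
BITS_PER_SAMPLE = 0x0010      # 16-bit PCM
BYTES_PER_FRAME = CHANNELS * BITS_PER_SAMPLE // 8

WORD = 2  # bytes per hexdump word

# Row 0x4182820 holds six 0x7fff words before the zeroes begin;
# row 0x4183020 holds six zero words before 0x7fff resumes.
zeros_start = 0x4182820 + 6 * WORD
zeros_end = 0x4183020 + 6 * WORD

gap_bytes = zeros_end - zeros_start
gap_frames = gap_bytes // BYTES_PER_FRAME
gap_ms = 1000 * gap_frames / SAMPLE_RATE

print(gap_bytes, gap_frames, round(gap_ms, 2))
```

Under those assumptions the gap comes to 2048 bytes, i.e. 512 stereo 16-bit frames, or roughly 11.6 ms of silence.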
> mossblaser@ - is other audio in Chrome working well for you? (like media
> element playback)
It seems to be; I've not noticed any other problems while running other web audio demos which don't use the audio worklet function.
As an additional test I've tried playing a one hour 44.1KHz WAV file of the same 'DC' (always 1.0) signal with an <audio> tag on an otherwise empty page. I've not heard this glitch.
> And what default sample rate the device is configured with?
According to `pacmd list-sinks` (the 'sample spec' section): 48KHz
The web audio context (and recorded wav) are both 44.1KHz sampled. According to `pacmd list-sink-inputs`, Chrome is sending 44.1KHz audio to Pulse Audio.
,
Apr 17 2018
Thanks mossblaser@! Yes, the glitch is clearly heard on the recording => it's not a PulseAudio problem. WebAudio.PushPullFIFO.UnderflowGlitches registered underflows. The recording shows 512 frames (2048 bytes) of zeroes (as maxmorin noticed), and the buffer is zeroed on underflow. Though it's not clear from the traces what could have caused an underflow.
,
Apr 19 2018
re #20:
> WebAudio.PushPullFIFO.UnderflowGlitches registered underflows.
Our metric had some issues and the collected data was incorrect. The issue is fixed and waits for the M67 merge.
,
Apr 23 2018
The NextAction date has arrived: 2018-04-23
,
Apr 23 2018
Like I mentioned above, the histogram was incorrect and I still think this is an audio infra issue. Somehow it is triggered by the activation of AudioWorklet, but WebAudio can't do anything about the irregular audio callback. Perhaps thread contention could be the root cause? Using AudioWorklet adds one more thread on top of the AudioDeviceThread. Somehow this makes the PulseAudio callback timing unstable?
,
Apr 23 2018
There is no thread contention seen in the trace recording. Based on the diagnostic debug recording, the glitch happens before audio reaches the PulseAudio layer.
,
May 4 2018
The NextAction date has arrived: 2018-05-04
,
Jul 16
OK, so I've been doing a little more digging into this issue and it seems that the problem is related to PulseAudio and, depending on your point of view, might be either a Chrome issue or a Pulse issue.
Callbacks from PulseAudio do not arrive at especially regular intervals (as the equivalent callbacks do on macOS, for instance). The following histograms show inter-callback time measurements for both platforms executing a WebAudio application.
* linux-interrupts.png
* mac-interrupts.png
Timings were extracted from AudioDestination::Render trace events. The vertical bar shows the notional 'correct' inter-callback time for the sample rate/buffer size used on each platform: 512 frames for Linux, 256 on macOS.
This jittery timing has the undesired effect of causing buffer underruns when two callbacks arrive in quick succession, as illustrated in the attached figure:
* underrun.png
Obviously this doesn't occur if your audio pipeline is doing very little (since the processing will be quick enough to still never be caught out) but in this example the quantity of processing is fairly reasonable (about 50% utilisation of the realtime deadline).
I'm not a PulseAudio expert but judging by CreateInputStream() in media/audio/pulse/pulse_util.cc it doesn't seem that PulseAudio provides an API which guarantees interrupt timing (and what has been achieved appears to have been arrived at experimentally).
It seems that maybe:
* PulseAudio is broken for not providing well-timed callbacks, or
* Chrome should attempt to behave better in the presence of jittery callbacks.
Any thoughts?
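The underrun mechanism described in this comment can be illustrated with a deliberately simplified model. It is a sketch, not Chrome's PushPullFIFO logic: the function names are hypothetical, the 44.1 kHz rate is taken from the earlier comments, and the only rule modelled is "a callback that arrives before the previous render has finished finds nothing ready".

```python
def nominal_period_ms(frames, sample_rate):
    """Ideal time between device callbacks for a given buffer size."""
    return 1000.0 * frames / sample_rate


def count_underruns(intervals_ms, render_ms):
    """Count callbacks that arrive before the previous render has finished.

    Simplified model: each callback hands over the buffer rendered since the
    previous callback and kicks off the next render on another thread. If the
    following callback arrives sooner than `render_ms`, nothing is ready.
    """
    return sum(1 for gap in intervals_ms if gap < render_ms)


period = nominal_period_ms(512, 44100)   # Linux buffer size from the comment
render = 0.5 * period                    # "about 50% utilisation"

steady = [period] * 6
# Same average rate, but one late callback followed by an early one.
jittery = [period, period, 1.8 * period, 0.2 * period, period, period]

print(round(period, 2), "ms nominal period")
print(count_underruns(steady, render), "underruns with steady callbacks")
print(count_underruns(jittery, render), "underruns with jittery callbacks")
```

Both sequences deliver callbacks at the same average rate; only the jittery one contains a gap shorter than the render time, which is the situation the underrun figure describes.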
,
Jul 16
Thanks for the nice histograms!
PulseAudio has always behaved this way for me on Linux. The callbacks have never been very regular and this makes it difficult for WebAudio to work well. Historically this was managed by making WebAudio run 8 renders (of 128 frames each) at a time so that there's a bit more averaging possible. At some time this was changed to 4 renders and it's worked well. A long time ago, I used to have a custom .asoundrc file to control ALSA and this worked really well and I could use buffer sizes of 256 (2 renders), the same as on Mac.
Perhaps you could create your context with latencyHint = 'playback' or maybe latencyHint = 1024 / fs, where fs is the sample rate of your hardware. (To do this, you might have to create a junk AudioContext to get the sampleRate of the hardware and use that to create a new AudioContext that will be used. The junk one can be discarded, preferably by calling context.close() first.)
This is just a workaround. It would be really nice if we could use something else besides PulseAudio.
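For reference, the arithmetic behind the suggested latencyHint = 1024 / fs looks like this. The helper name is hypothetical, the two sample rates are simply the ones mentioned earlier in the thread, and whether the browser actually ends up with a 1024-frame buffer for a given hint is implementation-dependent.

```python
RENDER_QUANTUM = 128  # frames per WebAudio render, as described above


def latency_hint_seconds(buffer_frames, sample_rate):
    """Value (in seconds) to pass as latencyHint for the desired buffer size."""
    return buffer_frames / sample_rate


for fs in (44100, 48000):
    hint = latency_hint_seconds(1024, fs)
    renders = 1024 // RENDER_QUANTUM
    print(fs, "Hz:", round(hint * 1000, 2), "ms,", renders, "renders per callback")
```

A 1024-frame buffer corresponds to the older behaviour of 8 renders per callback, at the cost of a little over 20 ms of added latency.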
,
Jul 17
Hello there,
Thanks for your workaround suggestion; it certainly does help a great deal but sadly PulseAudio seems to have another trick to play which scuppers it: in this mode it will even emit callbacks with almost no pause between them (see trace below).
* back-to-back-pa-calls.png
As the histogram in the previous comment hints, this also happened with the shorter buffer sizes, though not so often.
I put together a small test program using the rather easy-to-use PortAudio library (which internally will default to using PulseAudio on Linux) to see if I could reproduce the problem on a smaller scale. Sure enough the same thing happens. Interestingly, in this test near-zero delays between callbacks were much more common (see yet another histogram below):
* portaudio.png
My first impression is that PA's callbacks are subtly different from what Chrome is expecting. In Chrome, the "pa_stream_set_write_callback" callbacks seem to be treated as a hard-realtime callback (see media/audio/pulse/pulse_util.cc again) while in the PA documentation (https://freedesktop.org/software/pulseaudio/doxygen/stream_8h.html#a2dcc985c65964da290a0c2e1bf103175) this callback is worded as:
> Set the callback function that is called when new data may be written
> to the stream.
I.e., the callback is produced when buffer space is available rather than when it is strictly necessary to fill it. Indeed, PA's pa_stream_write API (https://freedesktop.org/software/pulseaudio/doxygen/stream_8h.html#a4fc69dec0cc202fcc174125dc88dada7) is asynchronous and need not be called from the callback. PA is fairly liberal with its internal buffering and light in its guarantees, meaning that it probably isn't expected or intended that buffer-availability callbacks would be a reliable isochronous callback source.
Incidentally, the same issue (PA issuing back-to-back callbacks) doesn't cause problems for code which doesn't use worklets. For example, see:
* non-glitch.png
This shows back-to-back callbacks occurring on a WebAudio pipeline using only native nodes. As you can see, the later PA callback is essentially stalled while the previous one executes and no samples end up being dropped. I presume the glitches are an artefact of the audio processing taking place in a separate thread when AudioWorklets are in use (rather than blocking whichever thread is handling the PA callbacks). Perhaps the same blocking behaviour could be introduced here?
,
Jul 17
I went digging through our PA code and found we use PA_STREAM_ADJUST_LATENCY, which may be the cause of the uneven callbacks. The code for setting up the stream is at https://cs.chromium.org/chromium/src/media/audio/pulse/pulse_util.cc?l=478, and hasn't been touched for 5 years as far as I can see. The flags are defined at https://freedesktop.org/software/pulseaudio/doxygen/def_8h.html#a6966d809483170bc6d2e6c16188850fc. There's a PA_STREAM_EARLY_REQUESTS flag which may be of interest to us? If someone wants to make changes to the Pulse code, we would first need stats for the number of underruns reported by Pulse (see issue 864463) to make sure the changes don't negatively impact normal playback.
,
Jul 17
> There's a PA_STREAM_EARLY_REQUESTS flag which may be of interest to us?
Sounds hopeful! I've tried this in a simple PulseAudio hello-world type program and it does seem to give nice regular callbacks!
This said, is this necessarily the best approach? In particular, allowing Pulse to control latency for non-latency-critical applications (e.g. simple playback) would potentially enable greater battery efficiency.
Also, is it correct for Chrome to 'take control' by dropping samples itself in underrun situations? When worklets are not used, AudioDestination::Render (third_party/blink/renderer/platform/audio/audio_destination.cc) will block for as long as it takes for the audio processing pipeline to run. When audio worklets are used, this function returns immediately and underruns in the internal PushPullFIFO will result in silence being sent to the audio subsystem if callbacks arrive too soon.
Would it not be better (at least in principle) to be consistent and let the external audio library decide what to do about underruns? In this case, would it be "safe" to allow the AudioIO thread to block on the user-supplied JavaScript code?
,
Jul 17
mossblaser@ Thanks so much for looking into this issue. I wrote most of the AudioWorklet code in WebAudio and I think you're pointing in the right direction. I do have a few questions and perhaps you and maxmorin@ can help find answers:
> Would it not be better (at least in principle) to be consistent and let the external audio library decide what to do about underruns?
How is this possible? By delivering a prematurely rendered block, the audio subsystem handles it properly? What if this "handling" policy is different across the platforms?
> In this case, would it be "safe" to allow the AudioIO thread to block on the user-supplied JavaScript code?
AudioWorkletThread is deliberately detached from the audio device thread for this reason. 1) V8 requires an async task runner which the audio thread doesn't provide and 2) the user code MUST not run on the high priority thread. Here's the relevant discussion: https://groups.google.com/a/chromium.org/d/msg/platform-architecture-dev/EnlQMTRwyrw/nmdihrCSAwAJ
,
Jul 17
> > Would it not be better (at least in principle) to be consistent and let the external audio library decide what to do about underruns?
>
> How is this possible? By delivering a prematurely rendered block, the audio subsystem handles it properly? What if this "handling" policy is different across the platforms?
For non-worklet audio processing, the callback from the underlying audio library doesn't return until the processing completes. If the processing takes too long, the underlying audio API has to decide what to do about this underrun and behave accordingly.
In the case of PulseAudio, specifically, the callbacks provided to Chrome have slightly unconventional semantics. Rather than saying "I need the next N bytes of audio, NOW!" this callback instead simply indicates "I now have at least N bytes of buffer space available". Since PulseAudio's normal mode of operation involves rather large buffers, this enables considerable flexibility in when the audio is actually written to the buffers. Indeed, the process of writing to PulseAudio's buffers happens asynchronously (i.e. it doesn't have to be done during the callback).
> > In this case, would it be "safe" to allow the AudioIO thread to block on the user-supplied JavaScript code?
>
> AudioWorkletThread is deliberately detached from the audio device thread for this reason. 1) V8 requires an async task runner which the audio thread doesn't provide and 2) the user code MUST not run on the high priority thread.
That sounds very reasonable, though I was thinking along slightly different lines. Rather than running the audio processing in the audio thread, run it elsewhere but put the audio thread to sleep until the rendering process has completed. Of course, there could be potential for priority inversion, but if this were avoided, all this would allow unruly JavaScript to do would be to block the real-time audio thread, not abuse its priority. With that thread stalled, we'd leave it up to the underlying audio library to handle underruns, as happens at present if you build a heavier-duty native-node-only audio pipeline than the client can handle.
From skimming the discussion it sounds like a major concern was preventing JavaScript from eating CPU in a high-priority thread? If so, and if priority inversion can be prevented (and the audio thread being blocked indefinitely doesn't cause other problems), this approach feels to me like it should work.
As an aside, this approach could accidentally benefit other platforms where very occasional and small underruns (e.g. due to GC) might just get absorbed by the platform's own buffering (Android, I hear, typically has rather a lot of audio buffering!).
,
Jul 17
> Rather than running the audio processing in the Audio thread, run it elsewhere but put the audio thread to sleep until the rendering process has completed.
In the worklet rendering mode, the audio processing (WebAudio rendering) happens on the worklet thread instead of the actual audio device thread. I think that's what you meant, but just wanted to make sure we're on the same page.
> (and the audio thread being blocked indefinitely doesn't cause other problems)
This is already a problem on macOS. IIUC, a stalling AudioOutputDevice thread may cause glitches in the entire audio mixer. In other words, the audio from other tabs might suffer because of a tab with bad AudioWorklet code. I think maxmorin@ can offer a better answer on this front.
> As an aside, this approach could accidentally benefit other platforms where very occasional and small underruns (e.g. due to GC) might just get absorbed by the platform's own buffering (Android, I hear, typically has rather a lot of audio buffering!).
I do not know how the stalling/sleeping thread idea will pan out, but I am interested in the ideas that we can get out of this discussion. For the real-time priority thread issue, we have issue 813825.
,
Jul 18
> In the worklet rendering mode, the audio processing (WebAudio rendering) happens on the worklet thread instead of the actual audio device thread. I think that's what you meant, but just wanted to make sure we're on the same page.
That is correct. My idea was to use some mechanism to allow the worklet thread to cause the audio device thread to block until the worklet thread has completed its work (in lieu of the work actually happening in the audio device thread).
> In other words, the audio from other tabs might suffer because of a tab with bad AudioWorklet code.
That doesn't sound ideal... Though I suppose the same would currently also be true of tabs running very large audio processing pipelines (e.g. HRTF-panning many nodes). With that said, arguably if one tab is producing glitching audio, the glitch will interrupt the listening experience of everything else on the system regardless since it will make a popping noise.
> For the real-time priority thread issue, we have issue 813825.
Indeed, that along with issue 836306 remain as hard problems to iron out. That said, I must emphasise how much I appreciate the fantastic work which you've done on Audio Worklets: even in this early state it is very clear this is another watershed feature for the web.
,
Jul 19
As a quick experiment I've rebuilt Chrome with the attached patch which uses a WaitableEvent to make the realtime audio thread block until the worklet thread finishes processing.
My informal tests under Linux seem to indicate this eradicates the spurious glitches when the audio processing runs well within the deadline. If audio processing systematically takes too long, glitches are heard as expected and SyncReader timeouts like the one below are logged:
[10244:10705:0719/151629.073191:WARNING:sync_reader.cc(188)] SyncReader::Read timed out, audio glitch count=80
But things don't seem to crash or crawl to a halt.
Further, if I put an infinite loop in the AudioWorklet's process callback (e.g. while (true) {}) similar messages are printed followed by:
[10244:10962:0719/152324.934017:WARNING:sync_reader.cc(170)] ASR: No room in socket buffer.: Resource temporarily unavailable (11)
Meanwhile aside from a CPU being pegged the browser remains responsive. Closing the tab and loading up another one with a normally functioning worklet results in audio playing successfully again.
I believe (but I am too far out of my comfort zone to be sure) that priority inversion will not occur with this solution. My understanding is that WaitableEvents cannot cause a priority inversion since the OS has no way to know which thread will eventually send the event.
What are folks' thoughts?
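The shape of the experiment described above (the device thread blocking on an event until the worker thread has rendered) can be sketched in Python with `threading.Event`. This is only an analogy, not the attached patch: the function names are hypothetical, and the timeout here stands in for the browser-side deadline that produces the SyncReader warnings.

```python
import threading
import time

FRAMES = 512


def render_on_worker(render_seconds, out, done):
    """Stand-in for the worklet thread: fill `out`, then signal completion."""
    time.sleep(render_seconds)      # pretend to run the process() callbacks
    out[:] = [1.0] * FRAMES         # the DC test signal from the report
    done.set()


def audio_callback(render_seconds, deadline_seconds):
    """Stand-in for the device thread: block until rendered or the deadline."""
    out = [0.0] * FRAMES
    done = threading.Event()
    worker = threading.Thread(
        target=render_on_worker, args=(render_seconds, out, done), daemon=True
    )
    worker.start()
    in_time = done.wait(timeout=deadline_seconds)
    # On a missed deadline hand back silence, which is heard as a glitch.
    return in_time, (list(out) if in_time else [0.0] * FRAMES)


ok, buf = audio_callback(render_seconds=0.0, deadline_seconds=1.0)
late_ok, late_buf = audio_callback(render_seconds=0.5, deadline_seconds=0.01)
print(ok, buf[0], late_ok, late_buf[0])
```

In this sketch a render that finishes inside the deadline is delivered intact even if the callback arrived early, while a render that overruns still produces a glitch rather than blocking forever.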
,
Jul 19
Re 35: I certainly agree we should be careful about where we solve this problem (code interacting with PA or WebAudio), especially since clients other than WebAudio handle the situation well.
Regarding blocking, the situation is a bit complicated in Chrome. For security reasons, JS and access to hardware cannot be in the same process, so the audio data is pulled from the renderer (where WebAudio is) to the browser (which in turn hands the audio over to Pulse). The browser blocks up to a certain time which depends on the buffer size chosen and the platform (https://cs.chromium.org/chromium/src/services/audio/sync_reader.cc?l=56). Most platform APIs come with disclaimers like "never ever do anything blocking in the data callback", so Chrome is already in deep water. Underrunning and letting the platform handle it could have unexpected consequences, e.g. on Mac, having one stream blocked will block all the other streams using the same device as well (sometimes at least?), and on Chrome OS underruns lead to garbage being played (which sounds worse than playing 0).
> With that said, arguably if one tab is producing glitching audio, the glitch
> will interrupt the listening experience of everything else on the system
> regardless since it will make a popping noise.
An AudioContext used for something like UI sounds can disrupt WebRTC audio processing. In such a case, the consequences of blocking are much more severe than a small glitch, so we really want to make sure we don't block indefinitely. It would be fine for WebAudio to block until the deadline given by the browser though (like in comment 40).
Regarding thread priorities, I re-read the discussion, and it seems like people would be fine with allowing RT priority as long as we have a mechanism to limit the percentage of cycles that can be consumed by RT threads. Most platforms have some sort of limit on what thread priorities a normal user can set though (stock Linux kernel means no elevated priority at all without extra setup IIRC).
,
Nov 14
+ Oscar who started to look into WebAudio.