decodeAudioData() cuts up to 55 ms of mp3-file's sound
Reported by
bup...@gmail.com,
Mar 4 2016
Issue description
UserAgent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36
Example URL:
Steps to reproduce the problem:
1. Take any mp3 file
2. Decode it with decodeAudioData()
3. Compare the duration of the original mp3 with the decoded PCM
What is the expected behavior?
The original mp3 file and the decoded PCM should have the same duration.
What went wrong?
The attached picture shows the difference between decoding wav and mp3. For a wav file, decodeAudioData() preserves the duration of the sound, but for most mp3 files it cuts the first 10-50 ms.
Did this work before? N/A
Is it a problem with Flash or HTML5? N/A
Does this work in other browsers? N/A
Chrome version: 48.0.2564.116 Channel: stable
OS Version: 10.0
Flash Version: Shockwave Flash 20.0 r0
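Step 3 of the repro can be sketched as a small comparison helper. This is only an illustration, not code from the report: the `PcmInfo` interface and `missingMs` function are hypothetical names, and the sample counts in the usage line are made-up values chosen to give a round result. In a browser, the `decoded` argument could be the AudioBuffer returned by decodeAudioData(), which exposes `length` and `sampleRate`.

```typescript
// Minimal shape shared by a Web Audio AudioBuffer and a description of the
// reference PCM (hypothetical interface, for illustration only).
interface PcmInfo {
  length: number;     // sample-frames per channel
  sampleRate: number; // Hz
}

// Returns how many milliseconds shorter `decoded` is than `reference`.
// The result is negative if the decoded audio is longer.
function missingMs(reference: PcmInfo, decoded: PcmInfo): number {
  const referenceSeconds = reference.length / reference.sampleRate;
  const decodedSeconds = decoded.length / decoded.sampleRate;
  return (referenceSeconds - decodedSeconds) * 1000;
}

// Made-up example: 441 samples missing at 44100 Hz.
const exampleGapMs = missingMs(
  { length: 44100, sampleRate: 44100 },
  { length: 43659, sampleRate: 44100 },
);
console.log(exampleGapMs.toFixed(1)); // prints "10.0"
```

Comparing sample counts rather than reported durations avoids relying on tools that only estimate an mp3 file's duration.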
,
Mar 4 2016
Please provide the original pcm sample. A shorter example would be nice too. Finally, what is the context sampleRate?
,
Mar 4 2016
chcunningham@ since it may be mp3 issues.
,
Mar 10 2016
bupaev@, can you upload a short repro mp3 file?
,
Mar 11 2016
OK, I've attached shorter examples with variable bitrate.* The sample rate is always 44100.
*The mp3 file was created with Sound Forge 11.
,
Mar 11 2016
Thanks for the examples. I loaded both of these with decodeAudioData on Chrome 49 beta on linux. The resulting buffers have exactly the same length (46080). The mp3 files seem to be missing the first 12 samples or so. Otherwise they seem identical (from a quick inspection of a few samples). avprobe says the wav file is 1.04 sec long and the mp3 file is 1.07 sec. I think avprobe only produces an estimate of the duration for the mp3 file. Please describe exactly what you see.
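One way to put a number on "missing the first 12 samples or so" is to slide the decoded mp3 against the decoded wav and keep the offset with the smallest average difference. The sketch below is an assumption about how such a check could be done, not what was actually run here; `estimateMissingLead` is a hypothetical helper, and the inputs would be channel data such as the Float32Array returned by AudioBuffer.getChannelData(0). Because mp3 is lossy, the best offset on real files will have a small but nonzero error.

```typescript
// Estimates how many leading samples `candidate` is missing relative to
// `reference` by trying every lag in [0, maxLag] and returning the lag with
// the smallest mean absolute difference over the overlapping samples.
function estimateMissingLead(
  reference: Float32Array,
  candidate: Float32Array,
  maxLag: number,
): number {
  let bestLag = 0;
  let bestError = Infinity;
  for (let lag = 0; lag <= maxLag; lag++) {
    const overlap = Math.min(reference.length - lag, candidate.length);
    if (overlap <= 0) break;
    let error = 0;
    for (let i = 0; i < overlap; i++) {
      error += Math.abs(reference[lag + i] - candidate[i]);
    }
    error /= overlap;
    if (error < bestError) {
      bestError = error;
      bestLag = lag;
    }
  }
  return bestLag;
}

// Synthetic check: drop the first 12 samples of a test tone and recover the lag.
const tone = new Float32Array(2000);
for (let i = 0; i < tone.length; i++) tone[i] = Math.sin(i * 0.37);
console.log(estimateMissingLead(tone, tone.subarray(12), 100)); // prints 12
```

This brute-force search costs roughly maxLag times the buffer length, which is fine for one-second clips like the attached examples.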
,
Mar 14 2016
Same result with decodeAudioData on Windows: 46080 samples after decoding. But Audacity shows that the original mp3 contains 47232 samples. I've also compared the mp3 and wav in Garage Band, and the result is different: the duration of both files is the same, but the decoded sound is shifted. It looks like the sound was cut at the start.
,
Mar 14 2016
The difference in lengths is 1152, which is about the size of the encoding delay that mp3 encoders introduce at the beginning of the file. (See http://lame.sourceforge.net/tech-FAQ.txt). Recent (last couple of years?) versions of ffmpeg will remove this delay under certain conditions such as if the metadata says it's there or if the encoder was LAME. I think. mp3 encoders also append a bunch of zeroes; ffmpeg doesn't currently remove those, I think. There are other issues with decodeAudioData, but in this case I think decodeAudioData is doing what it's intended to do.
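The arithmetic behind that observation can be written out as follows. This is just a worked example using the sample counts quoted in this thread (47232 from Audacity, 46080 from decodeAudioData, 44100 Hz); `describeGap` is a hypothetical helper name. The constant 1152 is the number of samples in one MPEG-1 Layer III frame.

```typescript
// Samples per MPEG-1 Layer III frame.
const SAMPLES_PER_FRAME = 1152;

// Expresses the difference between an expected and a decoded sample count
// in samples, in mp3 frames, and in milliseconds.
function describeGap(expectedSamples: number, decodedSamples: number, sampleRate: number) {
  const samples = expectedSamples - decodedSamples;
  return {
    samples,
    frames: samples / SAMPLES_PER_FRAME,
    ms: (samples / sampleRate) * 1000,
  };
}

// Numbers reported earlier in this thread.
const threadGap = describeGap(47232, 46080, 44100);
console.log(threadGap.samples, threadGap.frames, threadGap.ms.toFixed(2)); // prints 1152 1 26.12
```

So the missing audio in this example is exactly one frame, about 26 ms, which falls inside the 10-50 ms range from the original report.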
,
Mar 14 2016
dale, can you take a look?
,
Mar 14 2016
=> wolenetz to look at during next ffmpeg roll.
,
Mar 14 2016
,
Mar 15 2016
OK, but how can I predict the behavior of decodeAudioData for various kinds of files? For some files decodeAudioData cuts the beginning, for others it doesn't. For my task, preserving sample accuracy is critical.
,
Mar 21 2016
Renaming Blink>Audio to Blink>Media>Audio for better characterization
,
Mar 25 2016
@bupaev: I think fundamentally you can't predict unless you carefully control how the files are encoded and the browser you're running on. For compressed audio, ogg files come closest to consistency that I've seen, but not all browsers support Ogg. AFAIK, the only consistent and portable format is (uncompressed) WAV.
,
Aug 9 2016
,
Nov 2 2016
,
Feb 7 2017
Possibly related issue: https://bugs.chromium.org/p/chromium/issues/detail?id=689334
,
Oct 23 2017
Hmm. This one's been around for quite a while. Dale, can you take a look during your M-64 FFmpeg roll?
,
Oct 23 2017
->rtoy@ to make the judgement call here. I think both this and issue 689334 should be marked as WontFix. We've long since decided that if the track has a Xing/Info header we will strip encoder delays. There should be no complaints about this: the encoder delay is not present in the source audio; it's injected by the encoder to deal with DCT warmup. Both MP3/AAC transforms depend on previous input, so the injected silence ensures the decoded result is always the same. As an experiment, try not feeding the first two packets to any decoder and observe that you get different results than with the silence injected.
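For anyone who wants to guess in advance whether a given file falls under this rule, a rough check is to look for the ASCII tag "Xing" or "Info" near the start of the file. The sketch below is an assumption-laden heuristic, not how Chromium or ffmpeg detect the header: `hasXingOrInfoTag` and its `searchLimit` default are hypothetical, a large ID3v2 tag can push the first frame past the search window, and tag text that happens to contain "Info" would produce a false positive.

```typescript
// Heuristic only: reports whether the ASCII bytes "Xing" or "Info" appear
// within the first `searchLimit` bytes of the file. A real parser would
// skip any ID3v2 tag and read the tag at its fixed offset in the first frame.
function hasXingOrInfoTag(bytes: Uint8Array, searchLimit = 4096): boolean {
  const tags = ["Xing", "Info"].map((tag) =>
    Array.from(tag, (ch) => ch.charCodeAt(0)),
  );
  const lastStart = Math.min(bytes.length, searchLimit) - 4;
  for (let i = 0; i <= lastStart; i++) {
    for (const tag of tags) {
      if (tag.every((code, j) => bytes[i + j] === code)) return true;
    }
  }
  return false;
}

// Usage with fetched data (browser or Node 18+), shown as a comment because
// the URL is a placeholder:
//   const data = new Uint8Array(await (await fetch("sample.mp3")).arrayBuffer());
//   console.log(hasXingOrInfoTag(data));
```

A positive result only suggests that the encoder delay may be stripped; whether it actually is stripped still depends on the browser and its ffmpeg version, as discussed above.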
,
Oct 23 2017
,
Oct 25 2017
Closing as WontFix (WAI). Now that we decode the entire file without using the estimated duration, we do not want to add any further heuristics on what to do. Heuristics will surely be wrong for someone's example. We want to use whatever ffmpeg returns. If you find another example that you think is wrong, please file a new bug.
Comment 1 by b...@chromium.org
, Mar 4 2016