Issue 689334

Starred by 1 user

Issue metadata

Status: WontFix
Owner: ----
Closed: Oct 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 3
Type: Bug



WebAudio not respecting audio duration

Reported by stoyanov...@gmail.com, Feb 7 2017

Issue description

Chrome Version       : 56.0.2924.87
OS Version: 10.0
URLs (if applicable) :
Other browsers tested:
  Add OK or FAIL after other browsers where you have tested this issue:
     Safari 5: 
  Firefox 4.x: Sorta, buggy though.
IE 7/8/9/Edge: OK

What steps will reproduce the problem?
1. Download attachment
2. Extract files to a testing environment
3. Open index.html in your web browser
4. Open developer tools and set breakpoint on line 33 "playAudio(buffer)"
5. Press "Load Track1"
6. Examine buffer.duration (take note of duration and length)
7. Reload page
8. Press "Load Track2"
9. Examine buffer.duration (take note of duration and length)
10. Download + Open Audacity
11. Review duration of both track1.mp3 and track2.mp3

What is the expected result?
Chrome should respect each track's actual duration

What happens instead of that?
Chrome somehow determines that these two tracks are exactly the same length. One track's bitrate is VBR 208; the other is 320.

Please provide any additional information below. Attach a screenshot if
possible.

UserAgentString: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36

Chrome should respect the track's bitrate. The duration of these files is important in most audio applications, for example when starting audio from a region marked in an external application, and audio data should not be inconsistent.

 
webaudiotest.zip
16.9 MB Download

Comment 1 Deleted

Comment 2 by rtoy@chromium.org, Feb 7 2017

Components: Blink>WebAudio
Status: Available (was: Unconfirmed)
I think it's not the bitrate but the estimated duration.  WebAudio uses the estimated duration to allocate space for the decoded audio.  For whatever reason the estimated duration (from ffmpeg) is the same.

avprobe shows the duration as 4:40.81 and 4:40.84.
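
For scale, the two avprobe figures can be converted to sample frames. This is a minimal sketch, not part of the original report: it assumes a 44100 Hz sample rate (not stated in this comment, but consistent with the frame count reported later in the thread), and `parseDuration` and `durationToFrames` are hypothetical helper names.

```typescript
// Assumed sample rate of the test files; this is an assumption, not a
// value taken from the avprobe output quoted above.
const ASSUMED_SAMPLE_RATE = 44100;

// Convert an "m:ss.ff" duration string (as quoted from avprobe) to seconds.
function parseDuration(text: string): number {
  const [minutes, seconds] = text.split(":");
  return Number(minutes) * 60 + Number(seconds);
}

// Convert a duration string to a whole number of sample frames.
function durationToFrames(
  text: string,
  sampleRate: number = ASSUMED_SAMPLE_RATE
): number {
  return Math.round(parseDuration(text) * sampleRate);
}

const framesA = durationToFrames("4:40.81");
const framesB = durationToFrames("4:40.84");
const frameGap = framesB - framesA;
console.log(framesA, framesB, frameGap);
```

Under that assumption the two estimates differ by 1323 frames, which is the 0.03 second gap expressed in samples.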
I'm finding articles and previous discussions about the ffmpeg group choosing to remove this padding; however, this now causes inconsistencies with external applications that follow the same standard.

Two options need to be presented:
1) Ability to set a flag via webaudio API to get the correct duration and frames (no trimming)
2) Ability to switch out ffmpeg for libav as avconv does not strip this padding.

I cannot stress enough how important it is to remain consistent with audio. You cannot expect large, popular, industry-standard applications to adapt to ffmpeg's executive decision to remove this padding.

Comment 4 by rtoy@chromium.org, Feb 8 2017

Cc: dalecur...@chromium.org
+dalecurtis

Option 1 requires a change in the spec.  File an issue against the spec.
Option 2 I find that highly unlikely, but I defer to dalecurtis@ on that.

I don't understand what "remain consistent with audio" means.  What is "audio"?

It is unlikely that Chrome's implementation of WebAudio will do anything other than return whatever ffmpeg returns for the decoded audio.
Option 1) Where to file against the spec?
Option 2) Yes, very unlikely but still an option.

The reason I am filing the issue with WebAudio is mainly because I am not sure how Chrome implements this estimated duration. The same implementation works with Firefox and Internet Explorer, why does Chrome aim to not be consistent with other browsers?


Re: Consistency
Chrome (or ffmpeg, not sure who or where this issue should fall) and WebAudio do not render the same information as Pro-Audio applications (for example):
- Adobe Audition
- Avid Pro Tools
- Sony Sound Forge

Though these applications don't encode, the common libraries used by them do, thus pro-audio editing applications render the correct lengths.

For what reason should Chrome not use the same duration?

Comment 6 by rtoy@chromium.org, Feb 8 2017

For spec issues, see https://github.com/WebAudio/web-audio-api/issues

One issue with Chrome's implementation was that it was using an estimated duration (obtained from ffmpeg) and using that as the actual size.  In some cases, the estimated duration is actually shorter than the real duration.  Chrome wouldn't update the decoding to include the true length so that the decoded data was less than what might be expected.  

This particular issue is being worked on ( issue 673782 ).  However, I tested the new implementation against your test case and the decoded audio buffers have exactly the same length.  I do not know why.

Should I push an issue for spec or do we believe this is an issue with decoded audio buffers or intended ffmpeg functionality of removing padding?

Comment 8 by rtoy@chromium.org, Feb 8 2017

The spec issue is up to you. My guess is that you'll get resistance because it's pretty hard to specify.  For example, Chrome used to use Android's MediaCodec framework.  It always removed the pre-roll and trailing frames which resulted in audio much shorter on Android than on desktop. And there's no way to control that. (This is now fixed by using ffmpeg on Android and desktop.)

Certainly some of the issues are with Chrome's decoding.  

As for ffmpeg, I don't know.  We did ask quite a while ago for ffmpeg to remove these padding frames, and at the time they said that under certain conditions[1] they could remove the pre-roll, but not the trailing frames. I don't know what has happened since then.

[1] Something like if the information were in the headers or if the encoder could be identified (LAME).
Isn't this the same technical issue you're fixing right now rtoy@? Essentially that we truncate based on duration?

WebAudio should never care about metadata duration since it's a synchronous API, instead it should return an exact duration based on the sample count post decodeAudioData().

Is your complaint that ffmpeg is removing the padding without your say so?

Comment 10 by rtoy@chromium.org, Feb 8 2017

Yes, it's about truncation, but with the in-progress CL the repro case still produces exactly the same duration for both files.  Using avconv to convert each of the test files to WAV gives different lengths, in line with the estimated duration.  I have not figured out why WebAudio thinks they're the same length.
The second file is encoded with LAME. This is a common encoder, and ffmpeg might be removing the padding, thus making the duration shorter by <1 second. That would explain the difference between ffmpeg and avconv.

Comment 12 by rtoy@chromium.org, Feb 27 2017

The fix for  issue 673782  has landed and is available in Chrome canary.  With this fix, the estimated duration is not used; the entire file is read and decoded.  With Chrome canary, both test files have exactly the same reported duration: 280.81632653061223 sec or 12384000 frames
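
The two figures above describe the same quantity: an AudioBuffer's duration is its length in frames divided by its sample rate. The sketch below checks that relationship, assuming a 44100 Hz sample rate (inferred from the two numbers, not stated in the comment). `BufferInfo` and `exactDuration` are hypothetical names used only for illustration.

```typescript
// Minimal shape of the AudioBuffer fields needed here; a real AudioBuffer
// returned by decodeAudioData() exposes both properties.
interface BufferInfo {
  length: number;      // number of sample frames
  sampleRate: number;  // frames per second
}

// Duration in seconds derived purely from the decoded sample count.
function exactDuration(buffer: BufferInfo): number {
  return buffer.length / buffer.sampleRate;
}

// Frame count reported for both test files; 44100 Hz is an assumption.
const reportedBuffer: BufferInfo = { length: 12384000, sampleRate: 44100 };
const reportedSeconds = exactDuration(reportedBuffer);
console.log(reportedSeconds.toFixed(3)); // about 280.816 seconds
```

Comparing `length` rather than `duration` avoids floating-point noise when checking whether two decoded files are really the same size.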
rtoy,

Thanks for the update.

Just to confirm, the expected result should be the shorter duration? This causes major inconsistencies with other browsers, as they do not remove the padding at the beginning of the tracks. Neither IE nor Firefox removes the added silence, and this causes different experiences for users.

Generally a web developer will not know what the LAME encoder is doing, how these files are encoded, or that FFmpeg intentionally removes the padding.

Is the expected result here to have inconsistent audio experiences because of an executive decision from Chrome?
FWIW, this is the first request we've had to _not_ remove the silence -- which used to be our default. Every request up to this point was, "why isn't silence being stripped?" and "silence is causing gaps in my tracks".
I can see how this would be frustrating to those who expected the silence to be stripped, though I believe any rendering of this audio shouldn't be modified outside of the professional composing/editing software.

If a user wanted to detect silence, couldn't they check the PCM data at the beginning of an audiobuffer? 
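
A minimal sketch of such a check, assuming the PCM data for one channel is available as a `Float32Array` (for example from `AudioBuffer.getChannelData(0)`). The helper name `countLeadingQuietFrames` and the default threshold are hypothetical choices, and a threshold scan only finds near-zero samples, so it would not identify padding that contains actual signal.

```typescript
// Count how many frames at the start of a channel stay at or below
// `threshold` in absolute value. The default threshold is an arbitrary
// illustrative choice, not a value from the Web Audio specification.
function countLeadingQuietFrames(
  samples: Float32Array,
  threshold: number = 1e-4
): number {
  const firstLoud = samples.findIndex((s) => Math.abs(s) > threshold);
  // findIndex returns -1 when every sample is quiet.
  return firstLoud === -1 ? samples.length : firstLoud;
}

// Synthetic data: three quiet frames followed by signal.
const demoChannel = new Float32Array([0, 0, 0.00005, 0.2, 0]);
const quietFrames = countLeadingQuietFrames(demoChannel);
console.log(quietFrames);
```

An application that needs a fixed start point could skip that many frames before scheduling playback.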

Comment 16 by rtoy@chromium.org, Feb 27 2017

This is not an executive decision by Chrome.  It's what ffmpeg does and I, for one, am very reluctant to do anything different from whatever ffmpeg does.

IIRC, the beginning is not truly silent.  There is a ramp up in the pre-roll portion of the file, so detecting silence isn't going to work.

I wrote this simple demo long ago:  rtoy.github.com/webaudio-hacks/codec-tests/plot-audio.html.  Run this in various browsers (and be sure your audio hardware is set to 44100 Hz since the files assume that).  The original source was a square wave with exactly 44100 samples.

With Chrome canary, the mp3 128kbps (FFmpeg) file is exactly 44100 frames long.  That's nice.  FF 51 gives 44399 frames.  Don't know why it's 299 frames longer than the source; I was expecting something longer than that.

Now look at mp3 vbr 128kbps (iTunes).  Chrome has 47232 frames and didn't appear to remove any pre-roll.  FF only produces 45551 frames and it appears as if it did remove some pre-roll frames (based on comparison between Chrome and FF waveforms). I'm surprised by that.

I think this is a big mess and I do wish it were more consistent. I just don't really know how to achieve that everywhere.

Sorry about the delay, rtoy.

With this becoming more clearly defined, do we have any ideas for guidance on how to collectively come together and find a standard?

With Chrome, is ffmpeg compiled with LAME enabled? Does the sample size change with LAME disabled?

Comment 18 by rtoy@chromium.org, May 10 2017

dalecurtis: Do you know if LAME is enabled?
No, we use ffmpeg's built-in decoder, not the liblame variant.

Comment 20 by rtoy@chromium.org, Jul 19 2017

Just wanted to update this thread with a little more information.  A little while ago, we updated Chrome's decodeAudioData to decode the entire file instead of using an estimated duration which was sometimes short and therefore lopped off things from the end.

The demo itself was also updated to use an offline context at 44100 Hz so it's independent of the sample rate for the online context, so no interpolation is done on the AudioBuffer.
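
The sample rate matters because decodeAudioData resamples the decoded audio to the sample rate of the context it is called on, so the frame count changes with that rate. A rough sketch of the effect, assuming simple proportional scaling (a real resampler may differ by a few frames); `approxFramesAfterResample` is a hypothetical helper, not a Web Audio API call.

```typescript
// Approximate frame count after resampling from sourceRate to contextRate.
// Proportional scaling is an assumption; actual resamplers can add or drop
// a few frames at the edges.
function approxFramesAfterResample(
  sourceFrames: number,
  sourceRate: number,
  contextRate: number
): number {
  return Math.round((sourceFrames * contextRate) / sourceRate);
}

// A 44100-frame source decoded in a 44100 Hz context keeps its length,
// while a 48000 Hz context would stretch it to roughly 48000 frames.
const sameRateFrames = approxFramesAfterResample(44100, 44100, 44100);
const resampledFrames = approxFramesAfterResample(44100, 44100, 48000);
console.log(sameRateFrames, resampledFrames);
```

Fixing the context at the files' own rate removes this variable, so any remaining length difference comes from the decoder itself.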

Rerunning the demo from c#16, we can see that the only encoded files that decode with a length of 44100 are the WAV-encoded files, the FFmpeg-encoded files, and the FLAC-encoded files.

Everything else has a length exceeding 44100.

Comment 21 by rtoy@chromium.org, Oct 25 2017

Status: WontFix (was: Available)
dalecurtis@ has an important point in  issue 591960 : https://bugs.chromium.org/p/chromium/issues/detail?id=591960#c19, and my follow-up comment is in https://bugs.chromium.org/p/chromium/issues/detail?id=591960#c21.

Since the duration that ffmpeg returns is an estimate, and since it is sometimes too short (and occasionally very much shorter!), we do not use the estimated duration any more.  We decode the entire file.

Closing as WontFix (WAI).

Having said that, I do wish it were more consistent, but it's not really clear what the right answer would be for all the test cases I have and for the examples you provide.
