
Issue 457068

Starred by 22 users

Issue metadata

Status: Untriaged
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 2
Type: Bug



Web Speech API recognition stutters

Project Member Reported by jdimm@google.com, Feb 10 2015

Issue description

Chrome Version       :  40.0.2214.109
URLs (if applicable) : http://www.google.com/intl/en/chrome/demos/speech.html
Other browsers tested:
Add OK or FAIL after other browsers where you have tested this issue:
Safari 6:
Firefox 20:
IE 7/8/9/10:

What steps will reproduce the problem?
1. Go to the Google demo page for speech recognition
2. Click the mic icon, then accept the permission prompt from Chrome
3. Say "okay, can you hear me".

What is the expected result?

ASR output should be:
  Okay can you hear me

What happens instead?

The ASR output is:
  Okayokay canokay can youokay can you hearokay can you hear meokay can you hear meokay can you hear me


Please provide any additional information below. Attach a screenshot if
possible.

Here is a Google drive link of a video showing the problem, shared with everyone at Google:

https://drive.google.com/a/google.com/file/d/0ByMcm9qszOWVdnVFS1g1WE5yWWc/view?usp=sharing

 
Labels: Hotlist-Google
This is an automated update generated by script.

Comment 2 by tkent@chromium.org, Feb 10 2015

Labels: Cr-Blink-Speech
Labels: OS-Android
Labels: Cr-Platform-Apps-Demo Documentation-SampleCode
Status: Available
I can confirm this is a bug in the demo page that only happens for Chrome on Android (verified with a Nexus 6). On desktop the intermediate results are overwriting each other, while on Android they append to the previous text.

Cc: gshires@chromium.org tommi@chromium.org tnakamura@chromium.org
gshires@ - do you know anyone who could look into this?
Cc: mbrophy@google.com
This recently fixed bug in GSA might be related: http://b/19373901
Mark, can you take a look?

It's possible this breakage was due to a change in GSA or a change in Chrome. (Someone could determine which by reinstalling older versions of each.)


The problem is that the partial results now appear as isFinal in the JavaScript Web Speech API for Chrome on Android (whereas they continue to work fine for Chrome on Desktop).
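To make the symptom concrete, here is a minimal sketch of how a page typically assembles its transcript from onresult events: results flagged isFinal are appended permanently, while interim results are redrawn on every event. The names collectTranscripts, replay and mockResult are hypothetical (this is not the demo page's actual code), and the mock events only imitate the shape of the desktop and Android behaviour described in this issue. The assumption that resultIndex points at the newest entry on Android is mine.

```javascript
// Build a mock SpeechRecognitionResult: an array-like of alternatives plus isFinal.
function mockResult(transcript, isFinal) {
  const result = [{ transcript, confidence: 0 }];
  result.isFinal = isFinal;
  return result;
}

// Split one event's new results into committed (final) and interim text.
function collectTranscripts(event, committedSoFar) {
  let committed = committedSoFar;
  let interim = "";
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    const text = result[0].transcript;
    if (result.isFinal) {
      committed += text; // appended once and never revisited
    } else {
      interim += text; // discarded and rebuilt on the next event
    }
  }
  return { committed, interim };
}

// Feed a sequence of events through the handler and return what the page would show.
function replay(events) {
  let committed = "";
  let interim = "";
  for (const event of events) {
    ({ committed, interim } = collectTranscripts(event, committed));
  }
  return committed + interim;
}

// Desktop-like: a single result is rewritten in place and is final only at the end.
const desktopEvents = [
  { resultIndex: 0, results: [mockResult("say", false)] },
  { resultIndex: 0, results: [mockResult("say a few", false)] },
  { resultIndex: 0, results: [mockResult("say a few things", true)] },
];

// Android-like: every partial arrives as a new result flagged isFinal.
// (resultIndex pointing at the newest entry is an assumption.)
const partials = ["say", "say a few", "say a few things"];
const androidEvents = partials.map((_, i) => ({
  resultIndex: i,
  results: partials.slice(0, i + 1).map((t) => mockResult(t, true)),
}));

console.log(replay(desktopEvents)); // say a few things
console.log(replay(androidEvents)); // saysay a fewsay a few things
```

In a browser the same handler would be attached to reco.onresult on a webkitSpeechRecognition object with continuous and interimResults set to true. The second output has the same "stutter" shape as the ASR output in the issue description.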


The same symptoms can be observed when running the W3C Conformance Tests at
https://dvcs.w3.org/hg/speech-api/raw-file/tip/conformance/index.html
1. Click the link "onresult"
2. Check the box "continuous" and check the box "interim"
3. Click the button "click and speak"


The problem is easily seen with this test:
http://src.chromium.org/viewvc/chrome/trunk/src/chrome/test/data/speech/web_speech_api_test.html
1. Check the box "continuous" and check the box "interimResults"
2. Click the button "start()"
3. Say something.
4. Click the button "stop()"

Example of proper output from Chrome on Desktop. (Note that only the last item indicates isFinal.)
3:7:56.661 created reco object
3:7:59.742 start()
3:7:59.742 reco.continuous = true
3:7:59.742 reco.interimResults = true
3:7:59.744 onstart
3:7:59.856 onaudiostart
3:8:1.276 onsoundstart
3:8:1.276 onspeechstart
3:8:2.48 onresult  0:{object,[set (0.009999999776482582)]}
3:8:2.118 onresult  0:{object,[say (0.009999999776482582)]}
3:8:2.505 onresult  0:{object,[say (0.8999999761581421)]}
3:8:2.915 onresult  0:{object,[say (0.8999999761581421)]}
          1:{object,[ I (0.009999999776482582)]}
3:8:3.125 onresult  0:{object,[say (0.8999999761581421)]}
3:8:3.168 onresult  0:{object,[say (0.8999999761581421)]}
          1:{object,[ a (0.009999999776482582)]}
3:8:3.335 onresult  0:{object,[say (0.8999999761581421)]}
3:8:3.863 onresult  0:{object,[say (0.8999999761581421)]}
          1:{object,[ a few (0.009999999776482582)]}
3:8:4.115 onresult  0:{object,[se fue (0.009999999776482582)]}
3:8:4.144 onresult  0:{object,[say a few things (0.009999999776482582)]}
3:8:4.449 onresult  0:{object,[say a few things (0.8999999761581421)]}
3:8:4.972 onresult  0:{object,(isFinal) [say a few things (0.973739504814148),
                 save a few things (0),
                 say a few things I (0),
                 save a few things I (0)]}
3:8:5.974 stop()
3:8:5.976 onspeechend
3:8:5.976 onsoundend
3:8:5.976 onaudioend
3:8:6.217 onend


Example of improper output from Chrome on Android. (Note that all items indicate isFinal.)
3:7:19.228 created reco object
3:7:29.452 start()
3:7:29.455 reco.continuous = true
3:7:29.455 reco.interimResults = true
3:7:29.474 onstart
3:7:30.107 onaudiostart
3:7:30.842 onsoundstart
3:7:30.843 onspeechstart
3:7:32.487 onresult  0:{object,(isFinal) [say (0)]}
3:7:32.557 onresult  0:{object,(isFinal) [say (0)]}
         1:{object,(isFinal) [say (0)]}
3:7:32.593 onresult  0:{object,(isFinal) [say (0)]}
         1:{object,(isFinal) [say (0)]}
         2:{object,(isFinal) [say (0)]}
3:7:32.615 onresult  0:{object,(isFinal) [say (0)]}
         1:{object,(isFinal) [say (0)]}
         2:{object,(isFinal) [say (0)]}
         3:{object,(isFinal) [say (0)]}
3:7:32.735 onresult  0:{object,(isFinal) [say (0)]}
         1:{object,(isFinal) [say (0)]}
         2:{object,(isFinal) [say (0)]}
         3:{object,(isFinal) [say (0)]}
         4:{object,(isFinal) [say (0)]}
3:7:33.222 onresult  0:{object,(isFinal) [say (0)]}
         1:{object,(isFinal) [say (0)]}
         2:{object,(isFinal) [say (0)]}
         3:{object,(isFinal) [say (0)]}
         4:{object,(isFinal) [say (0)]}
         5:{object,(isFinal) [say (0)]}
3:7:33.381 onresult  0:{object,(isFinal) [say (0)]}
         1:{object,(isFinal) [say (0)]}
         2:{object,(isFinal) [say (0)]}
         3:{object,(isFinal) [say (0)]}
         4:{object,(isFinal) [say (0)]}
         5:{object,(isFinal) [say (0)]}
         6:{object,(isFinal) [say (0)]}
3:7:33.633 onresult  0:{object,(isFinal) [say (0)]}
         1:{object,(isFinal) [say (0)]}
         2:{object,(isFinal) [say (0)]}
         3:{object,(isFinal) [say (0)]}
         4:{object,(isFinal) [say (0)]}
         5:{object,(isFinal) [say (0)]}
         6:{object,(isFinal) [say (0)]}
         7:{object,(isFinal) [say a few (0)]}
3:7:33.881 onresult  0:{object,(isFinal) [say (0)]}
         1:{object,(isFinal) [say (0)]}
         2:{object,(isFinal) [say (0)]}
         3:{object,(isFinal) [say (0)]}
         4:{object,(isFinal) [say (0)]}
         5:{object,(isFinal) [say (0)]}
         6:{object,(isFinal) [say (0)]}
         7:{object,(isFinal) [say a few (0)]}
         8:{object,(isFinal) [say a few things (0)]}
3:7:34.869 onresult  0:{object,(isFinal) [say (0)]}
         1:{object,(isFinal) [say (0)]}
         2:{object,(isFinal) [say (0)]}
         3:{object,(isFinal) [say (0)]}
         4:{object,(isFinal) [say (0)]}
         5:{object,(isFinal) [say (0)]}
         6:{object,(isFinal) [say (0)]}
         7:{object,(isFinal) [say a few (0)]}
         8:{object,(isFinal) [say a few things (0)]}
         9:{object,(isFinal) [say a few things (0.9876290559768677),
                say a few things I (0)]}
3:7:37.369 stop()
3:7:37.946 onresult  0:{object,(isFinal) [say (0)]}
         1:{object,(isFinal) [say (0)]}
         2:{object,(isFinal) [say (0)]}
         3:{object,(isFinal) [say (0)]}
         4:{object,(isFinal) [say (0)]}
         5:{object,(isFinal) [say (0)]}
         6:{object,(isFinal) [say (0)]}
         7:{object,(isFinal) [say a few (0)]}
         8:{object,(isFinal) [say a few things (0)]}
         9:{object,(isFinal) [say a few things (0.9876290559768677),
                say a few things I (0)]}
         10:{object,(isFinal) [say a few things (0.9876290559768677),
                say a few things I (0)]}
3:7:37.948 onspeechend
3:7:37.948 onsoundend
3:7:37.949 onaudioend
3:7:37.968 onend

Comment 7 by mbrophy@google.com, Feb 22 2015

This doesn't look related to b/19373901
Cc: bringert@google.com
Perhaps Bjorn knows who could look into this.

Comment 9 by mbrophy@google.com, Feb 23 2015

If this is broken in a recent version of GSA, please let us know.

Comment 10 by gshires@google.com, Feb 24 2015

Yes, this is broken in a recent version of GSA.
(Although it may be that there is a bug in Chrome that also needs to be fixed.)

On a MotoX2013 running chrome 40.0.2214.109, I reproduced the problem on gsa 4.3.0.85542248.arm. Then I reverted to the factory-version gsa 3.4.16.1149292.arm and it worked as it did before (that is, it doesn't show any interim (aka partial) results).

More specifically, non-continuous mode (aka S3RecognizerInfo.dictation = false) works properly with both 3.4.16.1149292 and 4.3.0.85542248.

However, continuous mode (aka S3RecognizerInfo.dictation = true) fails with both 3.4.16.1149292 and 4.3.0.85542248, although 4.3.0.85542248 is "worse", or at least more noticeable.
Specifically,
3.4.16.1149292 doesn't provide interim results in Continuous mode.
4.3.0.85542248 does provide interim results in Continuous mode, but erroneously marks them as isFinal.
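Until this is fixed, a page could try to work around the mislabelling. One possible heuristic, based only on the logs in this issue: the results that are really partials carry a confidence of 0, while the genuine final results carry a non-zero confidence. The sketch below treats a result as final only when isFinal is set and its top alternative has a confidence above 0. The names isTrulyFinal, finalTranscript and mockResult are hypothetical, and the heuristic is an assumption drawn from these logs, not documented behaviour; it may break on other devices or GSA versions.

```javascript
// Build a mock SpeechRecognitionResult: an array-like of alternatives plus isFinal.
function mockResult(transcript, confidence, isFinal) {
  const result = [{ transcript, confidence }];
  result.isFinal = isFinal;
  return result;
}

// Heuristic (assumption): a mislabelled partial has confidence 0.
function isTrulyFinal(result) {
  return result.isFinal && result[0].confidence > 0;
}

// Concatenate only the results that pass the heuristic.
function finalTranscript(results) {
  let text = "";
  for (let i = 0; i < results.length; i++) {
    if (isTrulyFinal(results[i])) {
      text += results[i][0].transcript;
    }
  }
  return text;
}

// Mock data shaped like a continuous-mode session on GSA 4.3.0.85542248,
// where every entry is flagged isFinal.
const results = [
  mockResult("save a few", 0, true),
  mockResult("save a few things", 0, true),
  mockResult("say a few things", 0.5713640451431274, true),
  mockResult(" and a few", 0, true),
  mockResult(" and a few more", 0, true),
  mockResult(" and a few more", 0.9730430841445923, true),
];

console.log(finalTranscript(results)); // say a few things and a few more
```

Note that the result delivered after stop() can repeat or combine earlier text, so a real workaround would probably also need to de-duplicate against what has already been committed.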

All of the following examples are on Chrome 40.0.2214.109 on Android.
First two are non-continuous mode (correct behavior).
Last two are continuous mode (erroneous behavior).

These are from: http://src.chromium.org/viewvc/chrome/trunk/src/chrome/test/data/speech/web_speech_api_test.html 

Example: gsa 3.4.16.1149292, non-continuous.  (This is correct.)
22:56:32.963 created reco object
22:56:39.85 start()
22:56:39.86 reco.continuous = false
22:56:39.86 reco.interimResults = true
22:56:39.94 onstart
22:56:39.552 onaudiostart
22:56:40.635 onsoundstart
22:56:40.636 onspeechstart
22:56:41.221 onresult  0:{object,[ (0)]}
22:56:41.436 onresult  0:{object,[ (0)]}
22:56:41.624 onresult  0:{object,[ (0)]}
22:56:41.853 onresult  0:{object,[ (0)]}
22:56:42.79 onresult  0:{object,[say a few (0)]}
22:56:42.348 onresult  0:{object,[say a few things (0)]}
22:56:42.589 onspeechend
22:56:42.589 onsoundend
22:56:42.590 onaudioend
22:56:42.730 onresult  0:{object,(isFinal) [say a few things (0.9876290559768677),
                say a few things I (0)]}
22:56:42.738 onend


Example: gsa 4.3.0.85542248, non-continuous.  (This is correct.)
0:5:45.631 created reco object
0:5:54.935 start()
0:5:54.937 reco.continuous = false
0:5:54.937 reco.interimResults = true
0:5:55.13 onstart
0:5:55.492 onaudiostart
0:5:57.238 onsoundstart
0:5:57.239 onspeechstart
0:5:58.298 onresult  0:{object,[ (0)]}
0:5:58.763 onresult  0:{object,[ (0)]}
0:5:58.957 onresult  0:{object,[say (0)]}
0:5:59.9 onresult  0:{object,[say (0)]}
0:5:59.69 onresult  0:{object,[say (0)]}
0:5:59.222 onresult  0:{object,[say (0)]}
0:5:59.442 onresult  0:{object,[say a few (0)]}
0:5:59.727 onresult  0:{object,[say a few things (0)]}
0:5:59.823 onspeechend
0:5:59.824 onsoundend
0:5:59.827 onaudioend
0:6:0.40 onresult  0:{object,(isFinal) [say a few things (0.9876290559768677),
                say a few things i (0),
                say a few things in (0),
                say a few things I (0),
                say a few things to (0)]}
0:6:0.45 onend


Example: gsa 3.4.16.1149292, continuous, 2 phrases. No partial (interim) results are returned, only final results, and they are properly marked as isFinal.
22:58:3.123 created reco object
22:58:9.901 start()
22:58:9.904 reco.continuous = true
22:58:9.905 reco.interimResults = true
22:58:9.944 onstart
22:58:10.339 onaudiostart
22:58:11.165 onsoundstart
22:58:11.171 onspeechstart
22:58:13.571 onresult  0:{object,(isFinal) [say a few things (0.9876290559768677),
                say a few things I (0)]}
22:58:18.441 onresult  0:{object,(isFinal) [say a few things (0.9876290559768677),
                say a few things I (0)]}
         1:{object,(isFinal) [ and a few more (0.9305474758148193),
                and if you more (0),
                add a few more (0),
                in a few more (0),
                and if u more (0)]}
22:58:20.511 stop()
22:58:20.718 onresult  0:{object,(isFinal) [say a few things (0.9876290559768677),
                say a few things I (0)]}
         1:{object,(isFinal) [ and a few more (0.9305474758148193),
                and if you more (0),
                add a few more (0),
                in a few more (0),
                and if u more (0)]}
         2:{object,(isFinal) [say a few things and a few more (0.9590882658958435),
                say a few things and if you more (0),
                say a few things add a few more (0),
                say a few things in a few more (0),
                say a few things and if u more (0)]}
22:58:20.720 onspeechend
22:58:20.721 onsoundend
22:58:20.722 onaudioend
22:58:20.727 onend


Example: gsa 4.3.0.85542248, continuous, 2 phrases. isFinal is erroneously returned on partial (interim) results.
0:7:41.16 created reco object
0:7:47.770 start()
0:7:47.771 reco.continuous = true
0:7:47.772 reco.interimResults = true
0:7:47.786 onstart
0:7:48.341 onaudiostart
0:7:49.443 onsoundstart
0:7:49.443 onspeechstart
0:7:51.511 onresult  0:{object,(isFinal) [save a few (0)]}
0:7:51.794 onresult  0:{object,(isFinal) [save a few (0)]}
         1:{object,(isFinal) [save a few things (0)]}
0:7:52.545 onresult  0:{object,(isFinal) [save a few (0)]}
         1:{object,(isFinal) [save a few things (0)]}
         2:{object,(isFinal) [say a few things (0.5713640451431274),
                save a few things (0),
                save a few things: (0),
                save a few things I (0),
                say a few things I (0)]}
0:7:56.355 onresult  0:{object,(isFinal) [save a few (0)]}
         1:{object,(isFinal) [save a few things (0)]}
         2:{object,(isFinal) [say a few things (0.5713640451431274),
                save a few things (0),
                save a few things: (0),
                save a few things I (0),
                say a few things I (0)]}
         3:{object,(isFinal) [ and a few (0)]}
0:7:56.817 onresult  0:{object,(isFinal) [save a few (0)]}
         1:{object,(isFinal) [save a few things (0)]}
         2:{object,(isFinal) [say a few things (0.5713640451431274),
                save a few things (0),
                save a few things: (0),
                save a few things I (0),
                say a few things I (0)]}
         3:{object,(isFinal) [ and a few (0)]}
         4:{object,(isFinal) [ and a few more (0)]}
0:7:57.364 onresult  0:{object,(isFinal) [save a few (0)]}
         1:{object,(isFinal) [save a few things (0)]}
         2:{object,(isFinal) [say a few things (0.5713640451431274),
                save a few things (0),
                save a few things: (0),
                save a few things I (0),
                say a few things I (0)]}
         3:{object,(isFinal) [ and a few (0)]}
         4:{object,(isFinal) [ and a few more (0)]}
         5:{object,(isFinal) [ and a few more (0.9730430841445923),
                and a few mor (0),
                and a few more I (0),
                and if you more (0),
                and F you more (0)]}
0:7:59.941 stop()
0:8:0.122 onresult  0:{object,(isFinal) [save a few (0)]}
         1:{object,(isFinal) [save a few things (0)]}
         2:{object,(isFinal) [say a few things (0.5713640451431274),
                save a few things (0),
                save a few things: (0),
                save a few things I (0),
                say a few things I (0)]}
         3:{object,(isFinal) [ and a few (0)]}
         4:{object,(isFinal) [ and a few more (0)]}
         5:{object,(isFinal) [ and a few more (0.9730430841445923),
                and a few mor (0),
                and a few more I (0),
                and if you more (0),
                and F you more (0)]}
         6:{object,(isFinal) [say a few things and a few more (0),
                save a few things and a few more (0),
                say a few things and a few mor (0),
                save a few things and a few mor (0),
                save a few things: and a few more (0)]}
0:8:0.129 onspeechend
0:8:0.129 onsoundend
0:8:0.129 onaudioend
0:8:0.130 onend

I also filed a bug against GSA: http://b/19496412

Comment 12 by Deleted ...@, Mar 18 2015

I have a question about #10.

1) I see some dump objects in #10, like '{object,(isFinal) [say a few things (0.9876290559768677)]', and I think they are created on the Chrome side, right? On the GSA side, we don't use a field called 'isFinal' to indicate whether a result is final.

2) GSA returns partial results through RecognitionService.Callback#partialResults(Bundle partialResults), using SpeechRecognizer#RESULTS_RECOGNITION as the key to retrieve them from the bundle object. Can anyone show me the code on the Chrome side that fetches this partial result and constructs the objects in 1)?

I remember the recognition API working fine 2 months ago when I tried to fix another issue in it, and we haven't changed it since then. So I suspect this could be an issue on the Chrome side, especially in the part that constructs the result objects.
#12
1) The Web Speech API is defined by W3C and therefore is not an exact match to the GSA APIs.
It includes isFinal, which denotes that a result is not a partialResult. You can see the definition of isFinal here: https://dvcs.w3.org/hg/speech-api/raw-file/tip/webspeechapi.html#speechreco-section

3) As I wrote in #10, testing on the same version of Chrome, this works with earlier versions of GSA  and fails with more recent versions. Also, there have been no recent changes in Chrome in this area, so I believe the problem is with GSA.

(That's not to say the Chrome implementation is, or ever was, perfect. As I wrote in #10, 3.4.16.1149292 doesn't provide interim results in Continuous mode.)
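On question 1) of #12: the '{object,(isFinal) [...]}' dumps appear to be page-side formatting of the SpeechRecognitionResultList that the Web Speech API hands to onresult, not something GSA emits. A hypothetical formatter producing similar lines might look like the sketch below. The names formatResults and mockResult are illustrative, and this is not the actual code of web_speech_api_test.html.

```javascript
// Build a mock SpeechRecognitionResult from [transcript, confidence] pairs.
function mockResult(alternatives, isFinal) {
  const result = alternatives.map(([transcript, confidence]) => ({
    transcript,
    confidence,
  }));
  result.isFinal = isFinal;
  return result;
}

// Render each result as "index:{object,(isFinal) [transcript (confidence), ...]}".
function formatResults(results) {
  const lines = [];
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    const alternatives = [];
    for (let j = 0; j < result.length; j++) {
      alternatives.push(`${result[j].transcript} (${result[j].confidence})`);
    }
    const finalTag = result.isFinal ? "(isFinal) " : "";
    lines.push(`${i}:{object,${finalTag}[${alternatives.join(", ")}]}`);
  }
  return lines.join("\n");
}

const sample = [
  mockResult([["say", 0]], false),
  mockResult([["say a few things", 0.5], ["say a few things I", 0]], true),
];

console.log(formatResults(sample));
// 0:{object,[say (0)]}
// 1:{object,(isFinal) [say a few things (0.5), say a few things I (0)]}
```

The isFinal flag printed here is read straight from the result object, so whatever value the browser's Android speech backend assigns is what ends up in the dump.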
 Issue 468031  has been merged into this issue.
This is still present in Chrome 43.0.2357.93 on Android (5.0.2), but works fine on Chrome on OS X (46.0.2457.0 canary) and Windows (43.0.2357.134).
This is still a problem in Chrome 48.0.2564.95, both on Android 5.0.2 (Samsung Galaxy Tab A) and on Android 6.0.1 (Nexus 5X). Works correctly on Chromium 45.0.2454.85 on Linux.
Project Member

Comment 17 by sheriffbot@chromium.org, Feb 10 2017

Labels: Hotlist-Recharge-Cold
Status: Untriaged (was: Available)
This issue has been available for more than 365 days, and should be re-evaluated. Please re-triage this issue.
The Hotlist-Recharge-Cold label is applied for tracking purposes, and should not be removed after re-triaging the issue.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
 Issue 608070  has been merged into this issue.
 Issue 645539  has been merged into this issue.
Cc: tedc...@chromium.org
 Issue 709446  has been merged into this issue.
Cc: ligim...@chromium.org maxmorin@chromium.org olka@chromium.org sandeepkumars@chromium.org rbasuvula@chromium.org nyerramilli@chromium.org msrchandra@chromium.org
 Issue 773241  has been merged into this issue.

Comment 22 by bulle...@gmail.com, Oct 30 2017

Hi, is the Google team finally working on this? Thanks!
Status: Available (was: Untriaged)
There is no one at Google working on the speech recognition API :(.

This bug appears to have been known when this API was initially implemented: https://cs.chromium.org/chromium/src/content/public/android/java/src/org/chromium/content/browser/SpeechRecognition.java?l=157.
The internal bug referenced in comment 11 suggests checking if "android.speech.extra.UNSTABLE_TEXT" is present in the results; this would indicate that the results are provisional.
Project Member

Comment 24 by sheriffbot@chromium.org, Oct 30

Status: Untriaged (was: Available)
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue.

Sorry for the inconvenience if the bug really should have been left as Available.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
