Issue description
Chrome Version: 65.0.3325.51
OS Version: OS X 10.13.3
URLs (if applicable): https://davidben.scripts.mit.edu/speech-test.html
What steps will reproduce the problem?
1. Visit https://davidben.scripts.mit.edu/speech-test.html
2. Get very annoyed
What is the expected result?
Speech should not play without user gesture.
What happens instead of that?
Speech plays without user gesture. I'm not sure what the current state of audio autoplay is, but this suggests that, at least on Android, we don't allow audio autoplay without user gesture: https://cs.chromium.org/chromium/src/media/base/media_switches.cc?rcl=b0e6afca6430cd3aa67504b4453ec10e15e37dd9&l=355
One way or another, this API ultimately dumps the data into a system service, so everything we do to control sound must be reimplemented to account for it, if it is left in its current form.
Please provide any additional information below. Attach a screenshot if possible.
I came across a frame-busting ad that was abusing this this morning.
UserAgentString: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.51 Safari/537.36
Feb 16 2018,
dmazzoni@, would it make sense to block the Speech Synthesis API when autoplay isn't allowed on the page? Are there ways to notify the callers that the API can't be used?
Feb 26 2018,
Issue 807538 has been merged into this issue.
Feb 26 2018,
My understanding is that an <audio> or <video> element cannot autoplay, but we don't prevent a page from playing audio via JavaScript, for example via the Web Audio API. But maybe I'm wrong. Basically I think speech should behave the same as the Web Audio API. If we have a mechanism to
Feb 26 2018,
@rtoy, can you answer about the Web Audio API? If there is a mechanism to prevent autoplay, can you point me to it so we can reuse it for speech? If we don't suppress the Web Audio API then I think we should close this as WontFix.
Feb 26 2018,
We do restrict autoplay for Web Audio, see https://developers.google.com/web/updates/2017/09/autoplay-policy-changes#webaudio
Feb 26 2018,
OK, sounds good. Do you happen to know where this is triggered in the code? Do we have a common place to check if there's been an appropriate user gesture?
Feb 26 2018,
Also we'll need a way to bypass this for Chrome extensions - besides our internal stuff there are some other extensions in the web store that generate speech in the background page.
Feb 26 2018,
Extensions and Chrome Apps should be fine with the exception of WebView. You should be able to check if you are allowed to play by doing this:
```
#include "core/html/media/AutoplayPolicy.h"

// `document` is assumed to be the Document that owns the speech synthesis object.
bool IsAllowedToPlay() {
  // Policies other than "document user activation required" do not block here.
  if (AutoplayPolicy::GetAutoplayPolicyForDocument(*document) !=
      AutoplayPolicy::Type::kDocumentUserActivationRequired) {
    return true;
  }
  return AutoplayPolicy::IsDocumentAllowedToPlay(*document);
}
```
Mar 5 2018,
Adding Hotlist-Abusive since the API is being actively abused.
Mar 9 2018,
Issue 814129 has been merged into this issue.
Apr 17 2018,
Any news on this? This is being used by abusive ads and already falls under our existing autoplay behavior, so we really should plug this hole.
May 24 2018,
dmazzoni: Ping?
May 24 2018,
jkarlin, csharrison: Interested in picking this up?
May 24 2018,
Yeah, I'd be happy to implement this if feature-owners are on board. I don't have much context in the accessibility space to know how much breakage to expect though.
May 25 2018,
related to 517317
Jun 14 2018,
I'm supportive of making this change. Chrome also has a TTS extension API that predates the web speech synthesis API, so extension authors already have another option. Still, perhaps the safest route would be to whitelist extensions for now and just try to tackle the problem with web pages. What sort of metrics could we collect? I'm wondering if there's any way we could determine how many instances of bad speech were blocked without violating privacy. Assigning back to @csharrison, who volunteered to implement this, but please loop in me and katie@chromium.org (katydek@google.com) and we can try to help. I don't know how autoplay is implemented but I can help answer any questions about how speech synthesis works now, and Katie is the one who has done the most work on this recently.
Jun 15 2018,
Thanks dmazzoni. I can volunteer some cycles to take a stab. First thing I was thinking of implementing is:
1. UseCounter for speechSynthesis.speak
2. UseCounter for speechSynthesis.speak that would be blocked with autoplay policy.
The existing UseCounter for SpeechSynthesis (V8Window_SpeechSynthesis_AttributeGetter) shows hits on ~5% of pages. This is way higher than I expected, so hopefully implementing (1) and (2) can help measure some risk. I don't think we can measure "good" vs. "bad" speech easily. Maybe if the API was per-frame we could correlate speech with quick tab closing as a proxy for abuse. As-is we don't have an easy way to do that though.
Jun 15 2018, Project Member
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/2a2ef3018c7f45844706bec3d3f529ff8d6fa18e
commit 2a2ef3018c7f45844706bec3d3f529ff8d6fa18e
Author: Charlie Harrison <csharrison@chromium.org>
Date: Fri Jun 15 20:01:26 2018
Log use counters for evaluating SpeechSynthesis Autoplay intervention
Bug: 812767
Change-Id: I2e3d2653153fa66b47cc533cdaeb97c4786a2478
Reviewed-on: https://chromium-review.googlesource.com/1102480
Reviewed-by: Rick Byers <rbyers@chromium.org>
Reviewed-by: Dominic Mazzoni <dmazzoni@chromium.org>
Reviewed-by: Mounir Lamouri <mlamouri@chromium.org>
Commit-Queue: Charlie Harrison <csharrison@chromium.org>
Cr-Commit-Position: refs/heads/master@{#567772}
[modify] https://crrev.com/2a2ef3018c7f45844706bec3d3f529ff8d6fa18e/third_party/blink/public/platform/web_feature.mojom
[modify] https://crrev.com/2a2ef3018c7f45844706bec3d3f529ff8d6fa18e/third_party/blink/renderer/modules/speech/speech_synthesis.cc
[modify] https://crrev.com/2a2ef3018c7f45844706bec3d3f529ff8d6fa18e/third_party/blink/renderer/modules/speech/speech_synthesis.h
[modify] https://crrev.com/2a2ef3018c7f45844706bec3d3f529ff8d6fa18e/tools/metrics/histograms/enums.xml
Jun 26 2018, Project Member
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/dcdefe4c3bc6c6d890c46079fb1c4b02fc7846cc
commit dcdefe4c3bc6c6d890c46079fb1c4b02fc7846cc
Author: Charlie Harrison <csharrison@chromium.org>
Date: Tue Jun 26 14:10:22 2018
Add TTS UseCounters to ukm_features
These counters are logged in < .1% of pages.
See blink-dev intent to deprecate: https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/XpkevOngqUs
Bug: 812767
Change-Id: I75e262fb04230da4a2ebb47ecac37ebaf602462f
Reviewed-on: https://chromium-review.googlesource.com/1113659
Reviewed-by: Robert Kaplow <rkaplow@chromium.org>
Commit-Queue: Charlie Harrison <csharrison@chromium.org>
Cr-Commit-Position: refs/heads/master@{#570397}
[modify] https://crrev.com/dcdefe4c3bc6c6d890c46079fb1c4b02fc7846cc/chrome/browser/page_load_metrics/observers/use_counter/ukm_features.cc
Aug 3, Project Member
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/76fa3b9b03cc2794038568845c57ae7336836f51
commit 76fa3b9b03cc2794038568845c57ae7336836f51
Author: Charlie Harrison <csharrison@chromium.org>
Date: Fri Aug 03 15:16:14 2018
Deprecate speechSynthesis.speak() without user activation
See intent to deprecate: https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/XpkevOngqUs
The deprecation will target M71.
Bug: 812767
Change-Id: Id4448a91047def16194a47efdacc152070dace82
Reviewed-on: https://chromium-review.googlesource.com/1157231
Reviewed-by: Dominic Mazzoni <dmazzoni@chromium.org>
Reviewed-by: Philip Jägenstedt <foolip@chromium.org>
Commit-Queue: Charlie Harrison <csharrison@chromium.org>
Cr-Commit-Position: refs/heads/master@{#580550}
[modify] https://crrev.com/76fa3b9b03cc2794038568845c57ae7336836f51/third_party/blink/renderer/core/frame/deprecation.cc
[modify] https://crrev.com/76fa3b9b03cc2794038568845c57ae7336836f51/third_party/blink/renderer/modules/speech/speech_synthesis.cc
Aug 26,
Can anyone explain how this would work? Right now I have an app that makes heavy use of TTS to provide audible cues to people. The TTS is initiated by the user, so this change would not have to have any effects on the app, but then the app makes use of a timer to timely notify the user about things. These subsequent notifications are not user initiated, so I guess they would break?
If that's the case, this change will simply make my app unusable, and I would highly prefer this to be handled with a permission (much like any other thing: notifications, bluetooth, etc) than a user interaction...
Aug 26,
Hi mvolmaro, thanks for reaching out. Can you provide a link to your site? Are you seeing cases where the devtools deprecation warning is firing? If I understand you correctly, I do not think you will be affected. The change requires the user to interact with the page at least _once_ before subsequent speaking will succeed. After some interaction, multiple calls to speak can be made on that same page without failing.
Aug 27,
@csharrison: Right now I'm not seeing anything as I'm not using m70. I just read the upcoming changes in https://www.chromestatus.com/features and this caught my attention. If, after the first user interaction, TTS works normally, that would work for me. Question: I'm initiating the TTS on user interaction (on click) but not directly in the event handler, but on the result of a promise fired by the event handler... So: Click event handler > Promise.then > play utterance. Will that work the same as if the utterance is being played right in the event handler?
Aug 27,
mvolmaro: Yes, speaking sometime after the user initiation but not on the actual event handler should work fine. Please do reach out if you see anything unexpected on Chrome 70.
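The rule described in the replies above (one interaction unlocks later calls, including calls made from a promise callback or from a timer) can be illustrated with a toy model. This is only a sketch under stated assumptions: `PageModel`, `userGesture`, `onClick`, and `notifyLater` are hypothetical names, and the model mirrors the behaviour as described in this thread, not Chrome's actual implementation.

```typescript
// Toy model of the behaviour described in this thread; NOT Chrome's code.
// All names here (PageModel, userGesture, onClick, notifyLater) are hypothetical.
class PageModel {
  private activated = false; // set on the first interaction and never cleared
  readonly spoken: string[] = []; // utterances that were allowed to play

  // Simulates any user interaction with the page (click, key press, ...).
  userGesture(): void {
    this.activated = true;
  }

  // Returns true if the utterance would be allowed, false if it would be blocked.
  speak(text: string): boolean {
    if (!this.activated) return false;
    this.spoken.push(text);
    return true;
  }
}

const demoPage = new PageModel();

// 1. Speaking on page load, before any interaction, is blocked.
const beforeGesture = demoPage.speak("welcome");

// 2. Click handler > Promise.then > speak: the call runs after the gesture,
//    not inside the handler itself, and is still allowed in this model.
function onClick(p: PageModel): Promise<boolean> {
  p.userGesture();
  return Promise.resolve().then(() => p.speak("clicked"));
}

// 3. A later timer-driven notification on the same page is also allowed.
function notifyLater(p: PageModel): boolean {
  return p.speak("reminder");
}
```

In this model the only thing that matters is whether a gesture has happened at some earlier point on the page, which matches the "at least once" wording in the reply above.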
Sep 28, Project Member
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b469a6e8b042ebeb028c7f601f7c98990981b9a7
commit b469a6e8b042ebeb028c7f601f7c98990981b9a7
Author: Charlie Harrison <csharrison@chromium.org>
Date: Fri Sep 28 22:42:09 2018
Disallow speechSynthesis.speak autoplaying
See intent to remove: https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/WsnBm53M4Pc
Even though speech API does not work properly on content shell, this change can be tested in layout tests because it fails immediately without calling into any synthesis code. To force autoplay, tests need to use the new unified autoplay flag: --autoplay-policy=document-user-activation-required
Bug: 812767
Change-Id: I41bee6e37ab46ff2013d096c714b5124bd0ccc2c
Reviewed-on: https://chromium-review.googlesource.com/1225650
Commit-Queue: Charlie Harrison <csharrison@chromium.org>
Reviewed-by: Dominic Mazzoni <dmazzoni@chromium.org>
Reviewed-by: Philip Jägenstedt <foolip@chromium.org>
Cr-Commit-Position: refs/heads/master@{#595238}
[modify] https://crrev.com/b469a6e8b042ebeb028c7f601f7c98990981b9a7/third_party/WebKit/LayoutTests/NeverFixTests
[modify] https://crrev.com/b469a6e8b042ebeb028c7f601f7c98990981b9a7/third_party/WebKit/LayoutTests/TestExpectations
[modify] https://crrev.com/b469a6e8b042ebeb028c7f601f7c98990981b9a7/third_party/WebKit/LayoutTests/VirtualTestSuites
[add] https://crrev.com/b469a6e8b042ebeb028c7f601f7c98990981b9a7/third_party/WebKit/LayoutTests/virtual/speech-with-unified-autoplay/external/wpt/speech-api/README.txt
[modify] https://crrev.com/b469a6e8b042ebeb028c7f601f7c98990981b9a7/third_party/blink/renderer/core/frame/deprecation.cc
[modify] https://crrev.com/b469a6e8b042ebeb028c7f601f7c98990981b9a7/third_party/blink/renderer/modules/speech/speech_synthesis.cc
Sep 28,
Oct 9,
What about when "SpeechRecognition" results are used to trigger a "speechSynthesis.speak" event? Will that continue to work, or is that going to be broken by these changes? This is going to affect voice only web applications that recognize speech and read text back to users. There are so many use-cases where sounds should play without direct user interaction, and SpeechRecognition result events firing speechSynthesis.speak events should continue to work without the user clicking on something directly since voice events are calling it. It would be nice if Google quit inventing standards that should NOT exist. There are perfectly valid instances when sounds should play without direct user interaction, and cases when they should NOT. Deciding to take the blacklist approach and blocking sounds from playing without direct user interaction is just ridiculous and is not a very well thought out approach to handling this problem (if there even is one - since what you consider annoying may NOT be annoying to me). Why change something that wasn't broken and was the web standard up until now just because someone is annoyed? This is the internet. If you don't like something, quit visiting the page. Content creators should be able to decide when sounds play (with or without direct user interaction). If you don't like it, get your panties out of a bunch and cry elsewhere.
Nov 6,
I also think that there are many use-cases where speechSynthesis.speak should work without user interaction! It is (or better, it was?) a very important feature for human/machine interaction. For example, in conjunction with the speechRecognition API, to basically enable an audio dialog between the user and the machine. Sure, you have to ban abusive implementations of this feature, but the planned solution is not a good solution for most web developers out there! Also think of use-cases where handicapped (blind) users want to take advantage of this feature.
And in future they have to first click a button to get an audio response?! *lol*
Nov 6,
I agree it is unfortunate that we need interactions for speech synthesis to work on the web. However, browsers have a responsibility to protect their users, and in this case we made a trade-off. In this case the majority of usage of this API was for abuse. Note that this doesn't change the ability for extensions to use the chrome.tts API to enable speech synthesis without an interaction.
Nov 14,
Hi! Just got a deprecation warning for speechsynthesis.`speak`. Can I suggest disabling it but letting the user turn it on in Chrome settings for specific domain(s)? There are use cases when this is useful and improves UX. So, for my site I could prompt users to turn such a Chrome setting on (for my domain) and I'm sure those who are interested in my web-app will do it.
Nov 14,
Hey tgarifulin! We are currently developing a solution which allows sites whitelisted in chrome://settings/content/sound to autoplay content. This will include autoplay speech synthesis. mlamouri: Is this expected to land in M71?
Nov 15,
I would strongly discourage websites from suggesting that users alter their settings. If your website doesn't work with the autoplay restrictions, you would spend more time explaining to your users how to enable autoplay than it would take to make your website adapt to no autoplay. The spirit of this setting is for advanced users that visit old unmaintained websites. It is meant to help with backward compatibility.
Nov 15,
csharrison, sounds great. mlamouri, I see your point. I partially agree it is not ideal. But I don't see other options. Sure, I'll have to adapt to `no autoplay`, but as I said in my above comment it will degrade user experience for some of my web-app functionality: it is something like autoplaying slides with animations and on each such slide a piece of text is spoken. How would you deal with this without autoplay?
Nov 15,
Just read the whole thread and now I think I don't understand the exact scenario. Could you guys shed more light on the upcoming behavior? Consider csharrison's comment:
>> The change requires the user to interact with the page at least _once_ before subsequent speaking will succeed. After some interaction, multiple calls to speak can be called on that same page without failing.
According to the comment my case should not be affected? I got the deprecation warning "[Deprecation] speechSynthesis.speak() without user activation is deprecated and will be removed in M71...." as soon as I open a view in my web-app and it autoplays some text with a delay using setTimeout. Any subsequent calls to speechsynthesis.speak() from the subsequent setTimeout handlers do not incur the deprecation warning. FYI, opening the view with the autoplay functionality doesn't trigger an http request, just history.pushState(). Thanks, Timur Garifulin.
Nov 15,
Hey Timur, You can take a look at [1] for an explanation of the autoplay policies. The one difference for speech synthesis is that we don't support "muted" speech synthesis on the platform, since SSML can change the volume out from under us. The deprecation messages should indicate when speech is disallowed, but I recommend you try out your app in Chrome Beta (M71) to see how it behaves with the policies enabled.
[1]: https://developers.google.com/web/updates/2017/09/autoplay-policy-changes
Dec 3,
Hey ikolosov, You should just call speak() again once you have user activation if the first call failed.
Dec 6,
I was afraid this would happen: https://bugs.chromium.org/p/chromium/issues/detail?id=912429
Dec 14,
This is not a bug at all. But a wrong/bad mindset. Why should the user pay for your decisions? Sound is already "harder" than visuals, but you try to make it even harder with confirmations/additional visual requests. Consider the possibility of UI speech helper more than some shhty site needs in ads.
I believe that the blocker is better than the confirmer. Firefox has some sound off icon in the tabs, for example. Let the sites compete for the user, browsers are already fine. Bug submitter, huh.
Dec 20,
This is causing havoc on our virtual call center. As soon as the version is updated to Chrome v71 the agent can't hear a caller anymore but the caller can hear them. We are currently in the process of downgrading to Chrome v70 until there is a fix to auto allow the Speech Synthesis API for a site like you can with sound and microphone. Most agents have switched to Firefox because of this. Thanks for the test script David, works like a charm, please keep it online so we can test future releases.
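The advice given earlier in the thread, to call speak() again once you have user activation if the first call failed, might look roughly like the sketch below. It is written against a small hypothetical `Speaker` interface rather than the real `speechSynthesis` object so that it stays self-contained; `RetryingSpeaker`, `FakeSpeaker`, `say`, and `onUserGesture` are illustrative names. Reporting a blocked call as a boolean return value is also an assumption made for the sketch; in a browser the failure may instead surface through the utterance's `error` event.

```typescript
// Hypothetical minimal interface; a real page would wrap window.speechSynthesis.
interface Speaker {
  // Returns false when the call was blocked for lack of user activation.
  speak(text: string): boolean;
}

// Remembers utterances that were blocked and replays them on the next gesture.
class RetryingSpeaker {
  private blocked: string[] = [];

  constructor(private readonly speaker: Speaker) {}

  // Try to speak now; if blocked, keep the text for a later retry.
  say(text: string): void {
    if (!this.speaker.speak(text)) {
      this.blocked.push(text);
    }
  }

  // Call this from a click/keydown handler: retries everything that was blocked.
  onUserGesture(): void {
    const retry = this.blocked;
    this.blocked = [];
    for (const text of retry) {
      this.say(text); // goes back on the list if it is still blocked
    }
  }

  // Number of utterances still waiting for a successful retry.
  pendingCount(): number {
    return this.blocked.length;
  }
}

// Fake speaker for demonstration: blocks every call until `activated` is set.
class FakeSpeaker implements Speaker {
  activated = false;
  readonly played: string[] = [];

  speak(text: string): boolean {
    if (!this.activated) return false;
    this.played.push(text);
    return true;
  }
}
```

In a page, `onUserGesture` would typically be wired to a `click` or `keydown` listener, so that narration requested on load starts as soon as the user first interacts.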
Comment 1 by krajshree@chromium.org, Feb 16 2018