Issue 811929


Issue metadata

Status: Fixed
Owner: ----
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 3
Type: Bug




Chrome binds to TTS when "window.speechSynthesis" is accessed

Project Member Reported by dskiba@chromium.org, Feb 13 2018

Issue description

This was brought up by the Android Go team in b/73009067: Chrome binds to the TTS service when playing YouTube videos.

It turned out that the YouTube page simply accesses "window.speechSynthesis" and doesn't try to synthesize anything. But blink::PlatformSpeechSynthesizer::Create() calls InitializeVoiceList():

#0  blink::PlatformSpeechSynthesizer::InitializeVoiceList () at ../../third_party/WebKit/Source/platform/speech/PlatformSpeechSynthesizer.cpp:84
#1  0xd6095ef8 in blink::PlatformSpeechSynthesizer::Create () at ../../third_party/WebKit/Source/platform/speech/PlatformSpeechSynthesizer.cpp:42
#2  0xd7699384 in blink::SpeechSynthesis::SpeechSynthesis () at ../../third_party/WebKit/Source/modules/speech/SpeechSynthesis.cpp:41
#3  0xd7696300 in blink::DOMWindowSpeechSynthesis::speechSynthesis () at ../../third_party/WebKit/Source/modules/speech/DOMWindowSpeechSynthesis.cpp:71
#4  0xd750db14 in blink::DOMWindowPartialV8Internal::speechSynthesisAttributeGetter () at gen/blink/bindings/modules/v8/V8WindowPartial.cpp:634
#5  blink::V8WindowPartial::speechSynthesisAttributeGetterCallback () at gen/blink/bindings/modules/v8/V8WindowPartial.cpp:2564
#6  0xd3c5f268 in v8::internal::FunctionCallbackArguments::Call () at ../../v8/src/api-arguments.cc:26
#7  0xd3c849e4 in v8::internal::(anonymous namespace)::HandleApiCallHelper<false> () at ../../v8/src/builtins/builtins-api.cc:112
#8  0xd3c5e3dc in v8::internal::Builtins::InvokeApiFunction () at ../../v8/src/builtins/builtins-api.cc:220
#9  0xd3becd14 in v8::internal::Object::GetPropertyWithAccessor () at ../../v8/src/objects.cc:1643
#10 0xd3c7fb24 in v8::internal::LoadIC::Load () at ../../v8/src/ic/ic.cc:467
#11 0xd5b90778 in v8::internal::__RT_impl_Runtime_LoadIC_Miss () at ../../v8/src/ic/ic.cc:2089

InitializeVoiceList() goes through several layers and results in a TtsHostMsg_InitializeVoiceList message, which creates TtsPlatformImpl on the browser side:

#0  TtsPlatformImpl::GetInstance () at ../../chrome/browser/speech/tts_android.cc:21
#1  0xd7c39738 in TtsControllerImpl::GetPlatformImpl () at ../../chrome/browser/speech/tts_controller_impl.cc:439
#2  0xd7c39f34 in TtsControllerImpl::GetVoices () at ../../chrome/browser/speech/tts_controller_impl.cc:375
#3  0xd7c3a554 in TtsMessageFilter::OnInitializeVoiceList () at ../../chrome/browser/speech/tts_message_filter.cc:98
#4  0xd41191c4 in base::DispatchToMethod<android_webview::AwRenderFrameExt*, void (android_webview::AwRenderFrameExt::*)(), std::__ndk1::tuple<> >(android_webview::AwRenderFrameExt* const&, void (android_webview::AwRenderFrameExt::*)(), std::__ndk1::tuple<>&&) () at ../../base/tuple.h:60
#5  IPC::DispatchToMethod<android_webview::AwRenderFrameExt, void (android_webview::AwRenderFrameExt::*)(), void, std::__ndk1::tuple<> >(android_webview::AwRenderFrameExt*, void (android_webview::AwRenderFrameExt::*)(), void*, std::__ndk1::tuple<>&&) () at ../../ipc/ipc_message_templates.h:51
#6  0xd7c3a468 in IPC::MessageT<TtsHostMsg_InitializeVoiceList_Meta, std::__ndk1::tuple<>, void>::Dispatch<TtsMessageFilter, TtsMessageFilter, void, void (TtsMessageFilter::*)()>(IPC::Message const*, TtsMessageFilter*, TtsMessageFilter*, void*, void (TtsMessageFilter::*)()) () at ../../ipc/ipc_message_templates.h:146
#7  0xd7c3a324 in TtsMessageFilter::OnMessageReceived () at ../../chrome/browser/speech/tts_message_filter.cc:55
...

That creates TtsPlatformImpl on the Java side, which creates android.speech.tts.TextToSpeech and binds to the TTS service:

[1] android.app.ContextImpl.bindService (ContextImpl.java:1,557)
[2] android.content.ContextWrapper.bindService (ContextWrapper.java:684)
[3] android.speech.tts.TextToSpeech.connectToEngine (TextToSpeech.java:810)
[4] android.speech.tts.TextToSpeech.initTts (TextToSpeech.java:780)
[5] android.speech.tts.TextToSpeech.<init> (TextToSpeech.java:733)
[6] android.speech.tts.TextToSpeech.<init> (TextToSpeech.java:712)
[7] android.speech.tts.TextToSpeech.<init> (TextToSpeech.java:696)
[8] org.chromium.chrome.browser.TtsPlatformImpl.<init> (TtsPlatformImpl.java:6)
[9] org.chromium.chrome.browser.LollipopTtsPlatformImpl.<init> (LollipopTtsPlatformImpl.java:1)
[10] org.chromium.chrome.browser.TtsPlatformImpl.create (TtsPlatformImpl.java:9)
...


The obvious problem here is that Chrome needlessly binds to TTS when the SpeechSynthesis JS object is created.

The less obvious problem is that, since the native TtsPlatformImpl is a singleton, TTS stays bound for the lifetime of the browser process.
 

Comment 1 by dskiba@chromium.org, Feb 13 2018

The core of the problem is that we retrieve the list of voices when SpeechSynthesis is created. However, since retrieval is asynchronous, getVoices() can still return an empty list if called too early. So there is a 'voiceschanged' event, which is fired when the list of voices arrives. Callers of getVoices() are encouraged to listen for that event; see for example https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesis/getVoices

There are Stack Overflow questions about getVoices() returning an empty list, for example: https://stackoverflow.com/questions/21513706/getting-the-list-of-voices-in-speechsynthesis-of-chrome-web-speech-api

That question mentions the following errata:

Section 5.2.2 getVoices method: append at the end of the definition: If there are no voices available, or if the list of available voices is not yet known (for example: server-side synthesis where the list is determined asynchronously), then this method MUST return a SpeechSynthesisVoiceList of length zero.

New "Section 5.2.2.1 SpeechSynthesis Events" is created and contains: voiceschanged: Fired when the contents of the SpeechSynthesisVoiceList, that the getVoices method will return, have changed. Examples include: server-side synthesis where the list is determined asynchronously, or when client-side voices are installed/uninstalled.
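As a minimal sketch of the event-based pattern described above: the helper below reports voices once getVoices() returns a non-empty list, waiting for 'voiceschanged' if necessary. The helper name whenVoicesReady is hypothetical, and `synth` is assumed to be an object shaped like window.speechSynthesis.

```javascript
// Hypothetical helper: calls onReady exactly once with a non-empty voice list.
// `synth` is assumed to expose getVoices(), addEventListener() and
// removeEventListener(), as window.speechSynthesis does.
function whenVoicesReady(synth, onReady) {
  const voices = synth.getVoices();
  if (voices.length > 0) {
    // Voices were already known; no need to wait for the event.
    onReady(voices);
    return;
  }
  const handler = () => {
    const updated = synth.getVoices();
    if (updated.length === 0) {
      return; // List still empty; keep listening.
    }
    synth.removeEventListener("voiceschanged", handler);
    onReady(updated);
  };
  synth.addEventListener("voiceschanged", handler);
}
```

In a page this could be called as `whenVoicesReady(window.speechSynthesis, (voices) => console.log(voices.length))`.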

Some solutions also include calling getVoices() repeatedly via a timer.
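The timer-based workaround can be sketched roughly as follows. This is only an illustration: the function name pollForVoices, the default interval and attempt limit, and the injectable `schedule` parameter are all assumptions made for this example.

```javascript
// Hypothetical helper: polls synth.getVoices() until it returns a non-empty
// list or maxAttempts is reached, then calls onReady with the last result
// (which may still be empty if the limit was hit).
function pollForVoices(synth, onReady, options = {}) {
  const { intervalMs = 250, maxAttempts = 20, schedule = setTimeout } = options;
  let attempts = 0;
  const tick = () => {
    const voices = synth.getVoices();
    attempts += 1;
    if (voices.length > 0 || attempts >= maxAttempts) {
      onReady(voices);
      return;
    }
    // Try again after the configured delay.
    schedule(tick, intervalMs);
  };
  tick();
}
```

Listening for 'voiceschanged' avoids the repeated calls, so polling is mainly a fallback.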

Comment 2 by dskiba@chromium.org, Feb 13 2018

Ideally we would like to avoid getting the list of voices until there is an indication that the client is interested in the SpeechSynthesis object, i.e. until it calls a method on it:

* Accessing 'window.speechSynthesis' doesn't query for voices, avoiding binding to TTS.

* Calling getVoices() for the first time returns an empty list and starts an async fetch of voices (binding to TTS in the process).

* Calling any other method starts fetching the list of voices too.

* Once the list of voices arrives, 'voiceschanged' is fired.

This change is relatively straightforward to implement, and it shouldn't break existing code.
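The proposed behavior can be modeled with a small JavaScript toy. This is not the Blink C++ change itself; the class name LazySynthesisModel, the fetchVoices callback, and ensureFetchStarted are hypothetical names used only to illustrate the four points above.

```javascript
// Toy model of the proposed lazy behavior; not actual Blink code.
class LazySynthesisModel {
  // fetchVoices(done) is a hypothetical stand-in for the platform query that
  // binds to TTS and later calls done(voices) asynchronously.
  constructor(fetchVoices) {
    this.fetchVoices = fetchVoices;
    this.fetchStarted = false; // Construction alone does not query voices.
    this.voices = [];
    this.onvoiceschanged = null;
  }

  // Starts the voice fetch at most once.
  ensureFetchStarted() {
    if (this.fetchStarted) {
      return;
    }
    this.fetchStarted = true;
    this.fetchVoices((voices) => {
      this.voices = voices;
      if (this.onvoiceschanged) {
        this.onvoiceschanged(); // Models firing 'voiceschanged'.
      }
    });
  }

  // First call returns an empty list and kicks off the fetch.
  getVoices() {
    this.ensureFetchStarted();
    return this.voices;
  }

  // Any other method also triggers the fetch; synthesis itself is omitted.
  speak(utterance) {
    this.ensureFetchStarted();
  }
}
```

Code that already listens for 'voiceschanged' sees the same sequence as before; only the moment the fetch begins moves from object creation to the first method call.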

Comment 3 by dskiba@chromium.org, Feb 13 2018

The other option is to unbind from TTS once we have the initial list of voices. That is, in the Android implementation of TtsPlatformImpl, we would do the following:

* After initializing the list of voices in TtsPlatformImpl.initialize(), we would call TextToSpeech.shutdown().

* We would then lazily re-initialize TextToSpeech when needed.


There are two downsides to this, though:

1. We would still pressure the system with TTS. Since TTS is pretty heavy (40MiB PSS), it could still contribute to slowing down the phone or OOM-killing other processes, especially on 512MiB devices.

2. TextToSpeech is initialized asynchronously, so if, for example, speak() is called while TextToSpeech is not available, we would have to block until it is initialized.

Comment 4 by dskiba@chromium.org, Feb 13 2018

Cc: dmazz...@chromium.org
Components: Internals>SpeechSynthesis
Status: Available (was: Untriaged)
The fix proposed in comment #2 sounds good to me. The change should be entirely in Blink. There are some existing tests that could be modified slightly to reflect this new behavior.

Reminds me of bug 616636.


Comment 7 by bugdroid1@chromium.org, Feb 14 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d10a5836167d1fa92b23ee0f2b74ddce95b0b872

commit d10a5836167d1fa92b23ee0f2b74ddce95b0b872
Author: Dmitry Skiba <dskiba@chromium.org>
Date: Wed Feb 14 23:45:41 2018

Lazily initialize TTS voices on low-end Android devices.

Merely accessing 'window.speechSynthesis' creates SpeechSynthesis object,
which fetches list of TTS voices. On Android process of fetching voices
involves creating and binding TTS service. TTS service is quite big and
causes memory pressure on low-end Android devices, which can result in
bad UI experience and even cause Chrome to be OOM-killed.

Since voices are fetched asynchronously, SpeechSynthesis.getVoices() can
sometimes return an empty list. Client code is supposed to be prepared for
that and needs to listen to 'voiceschanged' event to know when voices are
available.

This CL changes behavior on low-end Android devices, and defers fetching
TTS voices until any SpeechSynthesis method (including getVoices()) is
called. The change should be transparent to client code, which should be
listening to 'voiceschanged' event anyway.

Bug:  811929 
Change-Id: Ib26d78aaf9e847fa0bda86c9bc800a7054b2487b
Reviewed-on: https://chromium-review.googlesource.com/917230
Reviewed-by: Kentaro Hara <haraken@chromium.org>
Reviewed-by: Dominic Mazzoni <dmazzoni@chromium.org>
Commit-Queue: Dmitry Skiba <dskiba@chromium.org>
Cr-Commit-Position: refs/heads/master@{#536890}
[modify] https://crrev.com/d10a5836167d1fa92b23ee0f2b74ddce95b0b872/third_party/WebKit/Source/modules/speech/SpeechSynthesis.cpp
[modify] https://crrev.com/d10a5836167d1fa92b23ee0f2b74ddce95b0b872/third_party/WebKit/Source/modules/speech/testing/PlatformSpeechSynthesizerMock.cpp
[modify] https://crrev.com/d10a5836167d1fa92b23ee0f2b74ddce95b0b872/third_party/WebKit/Source/platform/speech/PlatformSpeechSynthesizer.cpp
[modify] https://crrev.com/d10a5836167d1fa92b23ee0f2b74ddce95b0b872/third_party/WebKit/Source/platform/speech/PlatformSpeechSynthesizer.h


Comment 8 by bugdroid1@chromium.org, Feb 15 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4738c24147dc2c690f9df305da4fe9d695b4e663

commit 4738c24147dc2c690f9df305da4fe9d695b4e663
Author: Dmitry Skiba <dskiba@chromium.org>
Date: Thu Feb 15 18:23:02 2018

Lazily initialize TTS voices on all Android devices.

This is a follow up to crrev.com/c/917230, which enabled lazy initialization
of TTS voices on low-end Android devices.

While it's critical to not bind to TTS on low-end devices, it's nice to also
avoid doing that on mid- and high-end devices.

This CL removes 'is low-end' check so that TTS voices are always lazily
initialized on Android.

If there are complaints we have an option to revert this CL, but not the
previous one.

Bug:  811929 
Change-Id: I9eaeea75e7879780934f62d453c6b70fce68ee7c
Reviewed-on: https://chromium-review.googlesource.com/919069
Commit-Queue: Dmitry Skiba <dskiba@chromium.org>
Reviewed-by: Kentaro Hara <haraken@chromium.org>
Reviewed-by: Dominic Mazzoni <dmazzoni@chromium.org>
Cr-Commit-Position: refs/heads/master@{#537074}
[modify] https://crrev.com/4738c24147dc2c690f9df305da4fe9d695b4e663/third_party/WebKit/Source/platform/speech/PlatformSpeechSynthesizer.cpp

Status: Fixed (was: Available)
