New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 718921 link

Starred by 4 users

Issue metadata

Status: Fixed
Owner:
Last visit > 30 days ago
Closed: May 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 3
Type: Bug

Blocking:
issue 592556



Sign in to add a comment

[JNI crashes] Properly throw Java Exceptions from RunJavaScriptDialog.

Project Member Reported by gsennton@chromium.org, May 5 2017

Issue description

This is part of the Android WebView JNI crash propagation effort.

AwContentsClientBridge::RunJavaScriptDialog calls into Java in the following calls:

Java_AwContentsClientBridge_handleJsAlert
Java_AwContentsClientBridge_handleJsConfirm
Java_AwContentsClientBridge_handleJsPrompt

which correspond to
WebChromeClient.onJsAlert
WebChromeClient.onJsConfirm
WebChromeClient.onJsPrompt


The native call stack seems to represent an IPC call received in the Browser process, calling from RenderProcessHostImpl::OnMesssageReceived down into AwContentsClientBridge::RunJavaScriptDialog.


Crash URL for RunJavaScriptDialog (excluding most crashes before 55):
https://crash.corp.google.com/browse?q=custom_data.ChromeCrashProto.magic_signature_1.name%20CONTAINS%20%27%5BAndroid%20Java%20Exception%5D%27%20AND%20product.name%3D%27AndroidWebView%27%20AND%20NOT%20product.Version%20CONTAINS%20%2754.0.2%27%20AND%20NOT%20product.Version%20CONTAINS%20%2753.0.2%27%20AND%20NOT%20product.Version%20CONTAINS%20%2752.0.2%27%20AND%20NOT%20product.Version%20CONTAINS%20%2751.0.2%27%20AND%20special_protos.user_feedback.mobile_data.crash_data.exception_class_name%3D%27Native%20crash%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27%5BAndroid%20Java%20Exception%5D%20android_webview%3A%3AAwContentsClientBridge%3A%3ARunJavaScriptDialog%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D
 
This seems a bit fiddly because it calls into Java twice (first as part of the Mojo stack, and secondly into RunJavaScriptDialog), and needs to bail out of some parts of Mojo code.

Comment 2 by torne@chromium.org, May 8 2017

Maybe we should chat with the Mojo people about how to make this generally possible in Mojo land.
Cc: roc...@chromium.org qsr@chromium.org
Yup, sounds like a plan.

cc'ing rockot@ and qsr@
rockot@/qsr@ who do we contact regarding making it possible to bail out of a call / message in Mojo (Android/Java) - without processing any other calls/messages?

Background: at some point in the call stack we call into Java, and throw a Java exception. We want to let this exception propagate through the native stack back to Android Framework code to cause a Feedback-prompt (instead of crashing in native with a native stack just after returning from the JNI call). When a Java exception has been thrown we are not allowed to call any other Java code from native - so we need a way of bailing out of the messageloop/task runner we are in when making the failing JNI call.
Aha, I misunderstood the crashes on 58 - in there the  mojo::Watcher::onHandleReady call is the first non-general call in the call stack - and I mixed that up with mojo::WatcherImpl::onHandleReady (because mojo::Watcher::onHandleReady was removed in 59, so code-search would only return results for mojo::WatcherImpl::onHandleReady and some other classes - I incorrectly assumed WatcherImpl sub-classed Watcher). I've probably said several things offline to rockot@ that just don't make sense because of this ;) (apologies for the confusion).

Here is a stack from 59:

logging.cc:759  	logging::LogMessage::~LogMessage()
jni_android.cc:243  	base::android::CheckException(_JNIEnv*)
jni_generator_helper.h:42  	android_webview::AwContentsClientBridge::RunJavaScriptDialog()
aw_javascript_dialog_manager.cc:33  	android_webview::AwJavaScriptDialogManager::RunJavaScriptDialog()
web_contents_impl.cc:4273  	content::WebContentsImpl::RunJavaScriptDialog()
render_frame_host_impl.cc:1787  	content::RenderFrameHostImpl::OnRunJavaScriptDialog()
tuple.h:117  	bool IPC::MessageT<FrameHostMsg_RunJavaScriptDialog_Meta, ...>
render_frame_host_impl.cc:764  	content::RenderFrameHostImpl::OnMessageReceived(IPC::Message const&)
render_process_host_impl.cc:2072  	content::RenderProcessHostImpl::OnMessageReceived(IPC::Message const&)
ipc_channel_proxy.cc:329  	IPC::ChannelProxy::Context::OnDispatchMessage(IPC::Message const&)
ipc_sync_channel.cc:214  	IPC::SyncChannel::ReceivedSyncMsgQueue::DispatchMessages(IPC::SyncChannel::SyncContext*)
callback.h:80  	base::AsyncCallbackHelper(base::Flag*, base::Callback<void (base::WaitableEvent*), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&, base::WaitableEvent*)
callback.h:91  	base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*)
message_loop.cc:423  	base::MessageLoop::RunTask(base::PendingTask*)
message_loop.cc:434  	base::MessageLoop::DeferOrRunPendingTask(base::PendingTask)
message_loop.cc:527  	base::MessageLoop::DoWork()
message_pump_android.cc:44  	Java_org_chromium_base_SystemMessageHandler_nativeDoRunLoopOnce

So we never touch Mojo, and AFAICT the only place we really need to bail out of (i.e. not just add a comment for), is ReceivedSyncMsgQueue::DispatchMessages which can handle several messages in the same loop.

Do we know whether a JS dialog IPC call is always (and will always be) synchronous? I.e. will it always call into IPC::SyncChannel::ReceivedSyncMsgQueue? 


Example report from 59:
https://crash.corp.google.com/browse?q=custom_data.ChromeCrashProto.magic_signature_1.name%20CONTAINS%20%27%5BAndroid%20Java%20Exception%5D%27%20AND%20product.name%3D%27AndroidWebView%27%20AND%20product.Version%20CONTAINS%20%2759.0%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27%5BAndroid%20Java%20Exception%5D%20android_webview%3A%3AAwContentsClientBridge%3A%3ARunJavaScriptDialog%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D&stbtiq=&reportid=78451d60d0000000&index=0#0
Note that AFAICT the call stack for RunBeforeUnloadDialog is similar to the one for RunJavaScriptDialog, so if we fix this one we get RunBeforeUnloadDialog for free (though that's less than 1% of our crashes on 58).
Cc: -roc...@chromium.org -qsr@chromium.org
Cc: roc...@chromium.org
Actually, adding rockot@ back since he is an owner of ipc/

Comment 8 by roc...@chromium.org, May 12 2017

A message is either sync or async, and if it's sync it always goes through this code path. I'm not sure exactly what you're proposing we change about this though. It does not seem like IPC is the right layer to be addressing this issue.

Comment 9 by torne@chromium.org, May 12 2017

It's not about addressing the issue in the IPC layer; it's that there is literally only one way to throw the Java exception and it requires that we back out of the entire C++ call stack safely, which means that every point in the C++ call stack in question needs to be able to bail in a way that guarantees it cannot possibly call any other Java function or cause a native crash.

So, if this is the only call stack that leads to OnRunJavaScriptDialog then this is the only call stack we need to look into; if there's other possibilities we need to handle all of them.
Couldn't we just have RunJavaScriptDialog always post a new task to make its JNI call? Then we don't have to care who's calling it?
Yepp, so the next step is to find a decent way to abort the infinite loop in ReceivedSyncMsgQueue::DispatchMessages whenever we have a pending Java exception.
I don't see how DispatchMessages can loop infinitely unless it's DoSed with
new incoming sync messages?
(argh, in #11 I was responding to comment #9)

If we are unsure about whether the stack trace will be the same we could post a new task - aren't there performance considerations? (and also the added complexity of returning the result from the new task?)
Sorry, ignore the "infinite" - as long as the loop continues after the OnRunJavaScriptDialog is called we could potentially call into Java again - which is unsafe.
I see what you meant now. In any case, deferring the JNI to a new task
avoids that issue entirely.

I don't think there are any significant performance implications of adding
a small amount of latency to modal dialog dispatch.
Alright, just to clarify what we intend to do here (and why it doesn't break anything).

RunJavaScriptDialog is run as a synchronous IPC call from renderer -> browser.
The browser finishes the call by calling RenderFrameHostImpl::SendJavaScriptDialogReply().

So, as long as we don't need to return anything from RunJavaScriptDialog we should be fine.
There is a
IPC::Message* reply_msg
parameter being passed into RenderFrameHostImpl::OnRunJavaScriptDialog(), can we ignore that? (given that we will respond using RenderFrameHostImpl::SendJavaScriptDialogReply() anyway).


Is ThreadTaskRunnerHandle the class to use for posting a task to the current thread in native?
Actually, if we can post the JNI task from native, why not post the task from the Java side instead? (to avoid calling JNI at all within the stack where we can actually cause an exception).
Cc: a...@chromium.org
+avi@ who knows things about dialogs

Comment 19 by a...@chromium.org, May 15 2017

> There is a IPC::Message* reply_msg parameter being passed into RenderFrameHostImpl::OnRunJavaScriptDialog(), can we ignore that? (given that we will respond using RenderFrameHostImpl::SendJavaScriptDialogReply() anyway).

You kinda need to pass that IPC::Message* back to RenderFrameHostImpl::SendJavaScriptDialogReply() for it to work.

I'm really confused about what's being proposed here. There is an interface to the JavaScript dialogs called JavaScriptDialogManager. You're proposing to hook into a lower level?
Does that pointer stay alive across different tasks? (i.e. if we post a new task to later call SendJavaScriptDialogReply(), and pass that IPC::Message pointer, would that be OK?)

Yeah, we are using JavaScriptDialogManager through AwJavaScriptDialogManager - that then calls into AwContentsClientBridgeBase::RunJavaScriptDialog() which in turn call into Java through JNI.

Our goal is to be able to return from that JNI call without calling into Java again, or alternatively, we might be able to post the Java method call so that it is handled by a Java-handler, and thus can't return into JNI when throwing an exception.
Either way - what we are suggesting is to post either the JNI-call, or a call on the Java-side, back to the current thread, so that the call is made from a very clean call stack. So the question is - is it ok to implement JavaScriptDialogManager::RunJavaScriptDialog() as an asynchronous method? (i.e. by posting the implementation to the same thread to be executed later).

Comment 21 by a...@chromium.org, May 15 2017

> Does that pointer stay alive across different tasks?

AFAIK, that should be fine. In normal use, that pointer is bound in a base::Callback and the UI loop is run a lot, and then when the user finally clicks a button that message is dealt with.

> is it ok to implement JavaScriptDialogManager::RunJavaScriptDialog() as an asynchronous method?

It should be, but your discussion was talking about the RenderFrameHostImpl layer, which is much below the JavaScriptDialogManager layer. That's what I was confused about.

If your question is: Can we take the callback given by JavaScriptDialogManager::RunJavaScriptDialog(), and wait an arbitrary amount of time, and then much later on in some other spin of the runloop do the callback, yes! Please by all means do so. That's why a callback is used, to give you the freedom to do exactly that. The new dialogs that I've just deployed to the desktop do that.

OTOH let me know if that's not your question.
Cool, it sounds like we can do just what we want to do ;)

Comment 23 by torne@chromium.org, May 16 2017

Gustav, it seems possible we can use this same technique for some of the other callbacks, if they aren't time sensitive and don't need a synchronous response.

Comment 24 by a...@chromium.org, May 16 2017

Note that this doesn't change the fact that, from the render process's point of view, this is a synchronous call, and it's going to stay blocked until that callback is made. Make sure to that that into account in your designs.
Project Member

Comment 25 by bugdroid1@chromium.org, May 18 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b5d138816ecd20145552720460906269c7f3a0fd

commit b5d138816ecd20145552720460906269c7f3a0fd
Author: gsennton <gsennton@chromium.org>
Date: Thu May 18 16:06:29 2017

[Android WebView] Propagate Java exceptions from JS dialog callbacks

WebView applications can throw run-time exceptions within
WebView-callbacks. When such an exception is thrown from either
onJsAlert, onJsConfirm, or onJsPrompt, it is not correctly propagated to
Android's feedback mechanism. This is because those callbacks are called
through JNI, and the default way of handling Java exceptions thrown on
the Java-side of JNI calls is to print the stack trace for the current
exception to the logcat when the JNI call returns to native, and then
intentionally crash in native.

With this CL we avoid the problem of propagating a Java exception back
through JNI and the native stack by posting the WebView-callback (which
can cause the Java exception) as a new task to the current Java Handler.
In that way any Java exception thrown inside the WebView-callback can be
propagated directly up to the Android framework's
UncaughtExceptionHandler which then properly reports the Exception
through the framework's crash reporting mechanism.

BUG= 718921 

Review-Url: https://codereview.chromium.org/2886963003
Cr-Commit-Position: refs/heads/master@{#472830}

[modify] https://crrev.com/b5d138816ecd20145552720460906269c7f3a0fd/android_webview/java/src/org/chromium/android_webview/AwContentsClientBridge.java

Status: Fixed (was: Assigned)

Comment 27 by ctzsm@chromium.org, Oct 20 2017

 Issue 652754  has been merged into this issue.

Sign in to add a comment