[JNI crashes] Properly throw Java Exceptions from RunJavaScriptDialog. |
||||||
Issue descriptionThis is part of the Android WebView JNI crash propagation effort. AwContentsClientBridge::RunJavaScriptDialog calls into Java in the following calls: Java_AwContentsClientBridge_handleJsAlert Java_AwContentsClientBridge_handleJsConfirm Java_AwContentsClientBridge_handleJsPrompt which correspond to WebChromeClient.onJsAlert WebChromeClient.onJsConfirm WebChromeClient.onJsPrompt The native call stack seems to represent an IPC call received in the Browser process, calling from RenderProcessHostImpl::OnMesssageReceived down into AwContentsClientBridge::RunJavaScriptDialog. Crash URL for RunJavaScriptDialog (excluding most crashes before 55): https://crash.corp.google.com/browse?q=custom_data.ChromeCrashProto.magic_signature_1.name%20CONTAINS%20%27%5BAndroid%20Java%20Exception%5D%27%20AND%20product.name%3D%27AndroidWebView%27%20AND%20NOT%20product.Version%20CONTAINS%20%2754.0.2%27%20AND%20NOT%20product.Version%20CONTAINS%20%2753.0.2%27%20AND%20NOT%20product.Version%20CONTAINS%20%2752.0.2%27%20AND%20NOT%20product.Version%20CONTAINS%20%2751.0.2%27%20AND%20special_protos.user_feedback.mobile_data.crash_data.exception_class_name%3D%27Native%20crash%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27%5BAndroid%20Java%20Exception%5D%20android_webview%3A%3AAwContentsClientBridge%3A%3ARunJavaScriptDialog%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D
,
May 8 2017
Maybe we should chat with the Mojo people about how to make this generally possible in Mojo land.
,
May 8 2017
Yup, sounds like a plan. cc'ing rockot@ and qsr@ rockot@/qsr@ who do we contact regarding making it possible to bail out of a call / message in Mojo (Android/Java) - without processing any other calls/messages? Background: at some point in the call stack we call into Java, and throw a Java exception. We want to let this exception propagate through the native stack back to Android Framework code to cause a Feedback-prompt (instead of crashing in native with a native stack just after returning from the JNI call). When a Java exception has been thrown we are not allowed to call any other Java code from native - so we need a way of bailing out of the messageloop/task runner we are in when making the failing JNI call.
,
May 11 2017
Aha, I misunderstood the crashes on 58 - in there the mojo::Watcher::onHandleReady call is the first non-general call in the call stack - and I mixed that up with mojo::WatcherImpl::onHandleReady (because mojo::Watcher::onHandleReady was removed in 59, so code-search would only return results for mojo::WatcherImpl::onHandleReady and some other classes - I incorrectly assumed WatcherImpl sub-classed Watcher). I've probably said several things offline to rockot@ that just don't make sense because of this ;) (apologies for the confusion). Here is a stack from 59: logging.cc:759 logging::LogMessage::~LogMessage() jni_android.cc:243 base::android::CheckException(_JNIEnv*) jni_generator_helper.h:42 android_webview::AwContentsClientBridge::RunJavaScriptDialog() aw_javascript_dialog_manager.cc:33 android_webview::AwJavaScriptDialogManager::RunJavaScriptDialog() web_contents_impl.cc:4273 content::WebContentsImpl::RunJavaScriptDialog() render_frame_host_impl.cc:1787 content::RenderFrameHostImpl::OnRunJavaScriptDialog() tuple.h:117 bool IPC::MessageT<FrameHostMsg_RunJavaScriptDialog_Meta, ...> render_frame_host_impl.cc:764 content::RenderFrameHostImpl::OnMessageReceived(IPC::Message const&) render_process_host_impl.cc:2072 content::RenderProcessHostImpl::OnMessageReceived(IPC::Message const&) ipc_channel_proxy.cc:329 IPC::ChannelProxy::Context::OnDispatchMessage(IPC::Message const&) ipc_sync_channel.cc:214 IPC::SyncChannel::ReceivedSyncMsgQueue::DispatchMessages(IPC::SyncChannel::SyncContext*) callback.h:80 base::AsyncCallbackHelper(base::Flag*, base::Callback<void (base::WaitableEvent*), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&, base::WaitableEvent*) callback.h:91 base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) message_loop.cc:423 base::MessageLoop::RunTask(base::PendingTask*) message_loop.cc:434 base::MessageLoop::DeferOrRunPendingTask(base::PendingTask) message_loop.cc:527 base::MessageLoop::DoWork() message_pump_android.cc:44 Java_org_chromium_base_SystemMessageHandler_nativeDoRunLoopOnce So we never touch Mojo, and AFAICT the only place we really need to bail out of (i.e. not just add a comment for), is ReceivedSyncMsgQueue::DispatchMessages which can handle several messages in the same loop. Do we know whether a JS dialog IPC call is always (and will always be) synchronous? I.e. will it always call into IPC::SyncChannel::ReceivedSyncMsgQueue? Example report from 59: https://crash.corp.google.com/browse?q=custom_data.ChromeCrashProto.magic_signature_1.name%20CONTAINS%20%27%5BAndroid%20Java%20Exception%5D%27%20AND%20product.name%3D%27AndroidWebView%27%20AND%20product.Version%20CONTAINS%20%2759.0%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27%5BAndroid%20Java%20Exception%5D%20android_webview%3A%3AAwContentsClientBridge%3A%3ARunJavaScriptDialog%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D&stbtiq=&reportid=78451d60d0000000&index=0#0
,
May 11 2017
Note that AFAICT the call stack for RunBeforeUnloadDialog is similar to the one for RunJavaScriptDialog, so if we fix this one we get RunBeforeUnloadDialog for free (though that's less than 1% of our crashes on 58).
,
May 12 2017
,
May 12 2017
Actually, adding rockot@ back since he is an owner of ipc/
,
May 12 2017
A message is either sync or async, and if it's sync it always goes through this code path. I'm not sure exactly what you're proposing we change about this though. It does not seem like IPC is the right layer to be addressing this issue.
,
May 12 2017
It's not about addressing the issue in the IPC layer; it's that there is literally only one way to throw the Java exception and it requires that we back out of the entire C++ call stack safely, which means that every point in the C++ call stack in question needs to be able to bail in a way that guarantees it cannot possibly call any other Java function or cause a native crash. So, if this is the only call stack that leads to OnRunJavaScriptDialog then this is the only call stack we need to look into; if there's other possibilities we need to handle all of them.
,
May 12 2017
Couldn't we just have RunJavaScriptDialog always post a new task to make its JNI call? Then we don't have to care who's calling it?
,
May 12 2017
Yepp, so the next step is to find a decent way to abort the infinite loop in ReceivedSyncMsgQueue::DispatchMessages whenever we have a pending Java exception.
,
May 12 2017
I don't see how DispatchMessages can loop infinitely unless it's DoSed with new incoming sync messages?
,
May 12 2017
(argh, in #11 I was responding to comment #9) If we are unsure about whether the stack trace will be the same we could post a new task - aren't there performance considerations? (and also the added complexity of returning the result from the new task?)
,
May 12 2017
Sorry, ignore the "infinite" - as long as the loop continues after the OnRunJavaScriptDialog is called we could potentially call into Java again - which is unsafe.
,
May 12 2017
I see what you meant now. In any case, deferring the JNI to a new task avoids that issue entirely. I don't think there are any significant performance implications of adding a small amount of latency to modal dialog dispatch.
,
May 15 2017
Alright, just to clarify what we intend to do here (and why it doesn't break anything). RunJavaScriptDialog is run as a synchronous IPC call from renderer -> browser. The browser finishes the call by calling RenderFrameHostImpl::SendJavaScriptDialogReply(). So, as long as we don't need to return anything from RunJavaScriptDialog we should be fine. There is a IPC::Message* reply_msg parameter being passed into RenderFrameHostImpl::OnRunJavaScriptDialog(), can we ignore that? (given that we will respond using RenderFrameHostImpl::SendJavaScriptDialogReply() anyway). Is ThreadTaskRunnerHandle the class to use for posting a task to the current thread in native?
,
May 15 2017
Actually, if we can post the JNI task from native, why not post the task from the Java side instead? (to avoid calling JNI at all within the stack where we can actually cause an exception).
,
May 15 2017
+avi@ who knows things about dialogs
,
May 15 2017
> There is a IPC::Message* reply_msg parameter being passed into RenderFrameHostImpl::OnRunJavaScriptDialog(), can we ignore that? (given that we will respond using RenderFrameHostImpl::SendJavaScriptDialogReply() anyway). You kinda need to pass that IPC::Message* back to RenderFrameHostImpl::SendJavaScriptDialogReply() for it to work. I'm really confused about what's being proposed here. There is an interface to the JavaScript dialogs called JavaScriptDialogManager. You're proposing to hook into a lower level?
,
May 15 2017
Does that pointer stay alive across different tasks? (i.e. if we post a new task to later call SendJavaScriptDialogReply(), and pass that IPC::Message pointer, would that be OK?) Yeah, we are using JavaScriptDialogManager through AwJavaScriptDialogManager - that then calls into AwContentsClientBridgeBase::RunJavaScriptDialog() which in turn call into Java through JNI. Our goal is to be able to return from that JNI call without calling into Java again, or alternatively, we might be able to post the Java method call so that it is handled by a Java-handler, and thus can't return into JNI when throwing an exception. Either way - what we are suggesting is to post either the JNI-call, or a call on the Java-side, back to the current thread, so that the call is made from a very clean call stack. So the question is - is it ok to implement JavaScriptDialogManager::RunJavaScriptDialog() as an asynchronous method? (i.e. by posting the implementation to the same thread to be executed later).
,
May 15 2017
> Does that pointer stay alive across different tasks? AFAIK, that should be fine. In normal use, that pointer is bound in a base::Callback and the UI loop is run a lot, and then when the user finally clicks a button that message is dealt with. > is it ok to implement JavaScriptDialogManager::RunJavaScriptDialog() as an asynchronous method? It should be, but your discussion was talking about the RenderFrameHostImpl layer, which is much below the JavaScriptDialogManager layer. That's what I was confused about. If your question is: Can we take the callback given by JavaScriptDialogManager::RunJavaScriptDialog(), and wait an arbitrary amount of time, and then much later on in some other spin of the runloop do the callback, yes! Please by all means do so. That's why a callback is used, to give you the freedom to do exactly that. The new dialogs that I've just deployed to the desktop do that. OTOH let me know if that's not your question.
,
May 16 2017
Cool, it sounds like we can do just what we want to do ;)
,
May 16 2017
Gustav, it seems possible we can use this same technique for some of the other callbacks, if they aren't time sensitive and don't need a synchronous response.
,
May 16 2017
Note that this doesn't change the fact that, from the render process's point of view, this is a synchronous call, and it's going to stay blocked until that callback is made. Make sure to that that into account in your designs.
,
May 18 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b5d138816ecd20145552720460906269c7f3a0fd commit b5d138816ecd20145552720460906269c7f3a0fd Author: gsennton <gsennton@chromium.org> Date: Thu May 18 16:06:29 2017 [Android WebView] Propagate Java exceptions from JS dialog callbacks WebView applications can throw run-time exceptions within WebView-callbacks. When such an exception is thrown from either onJsAlert, onJsConfirm, or onJsPrompt, it is not correctly propagated to Android's feedback mechanism. This is because those callbacks are called through JNI, and the default way of handling Java exceptions thrown on the Java-side of JNI calls is to print the stack trace for the current exception to the logcat when the JNI call returns to native, and then intentionally crash in native. With this CL we avoid the problem of propagating a Java exception back through JNI and the native stack by posting the WebView-callback (which can cause the Java exception) as a new task to the current Java Handler. In that way any Java exception thrown inside the WebView-callback can be propagated directly up to the Android framework's UncaughtExceptionHandler which then properly reports the Exception through the framework's crash reporting mechanism. BUG= 718921 Review-Url: https://codereview.chromium.org/2886963003 Cr-Commit-Position: refs/heads/master@{#472830} [modify] https://crrev.com/b5d138816ecd20145552720460906269c7f3a0fd/android_webview/java/src/org/chromium/android_webview/AwContentsClientBridge.java
,
May 18 2017
,
Oct 20 2017
Issue 652754 has been merged into this issue. |
||||||
►
Sign in to add a comment |
||||||
Comment 1 by gsennton@chromium.org
, May 8 2017