Issue metadata
Sign in to add a comment
|
[Remoting Android] Client Randomly Crashes due to DCHECK failure |
||||||||||||||||||||||
Issue descriptionVersion: Android Client ToT OS: Android 6 What steps will reproduce the problem? (1) Open the client. (2) Keep clicking on a host and canceling on the enter PIN dialog. What is the expected output? The app should not crash. What do you see instead? The app crashes at some point. The DCHECK's in the dtor of ChromotingJniInstance failed. There are probably two problems going on: 1. ChromotingJniInstance::Disconnect() posts a task to the network thread to clean up its members, and the d'tor may be called on any thread when the refcount drops to 0. This sometimes causes the d'tor being executed before Disconnect() is called. 2. ChromotingJniInstance also has a weak ptr factory. The c'tor and d'tor may not be called on the same thread.
,
May 14 2016
This can be fixed by implementing a refcount object deleter ("Traits"?) for JniInstance that posts delete task to the network thread.
,
May 14 2016
Disconnect() will not be called after destructor because the Disconnect() task holds a reference to the object, i.e. ref-count will not go to 0 while Disconnect() task is pending. To fix this bug should be enough to add weak_factory_.InvalidateWeakPtrs() in Disconnect().
,
May 14 2016
Looks like it does fix the bug. I have no idea why this bug exists at the first place... ConnectToHostOnNetworkThread() is called after Disconnect() is called? Then how does calling weak_factory_.InvalidateWeakPtrs() fix this problem?
,
May 15 2016
Weak pointers are supposed to be invalidated on the same thread on which they are dereferenced (which is the network thread in this case). ChromotingJniInstance can be destroyed on any thread and so WeakPtrFactory sometimes would invalidate weak ptrs on the UI thread (when the factory is destroyed), but that's supposed to happen only on network thread, hence the DCHECK. Calling InvalidateWeakPtrs() on network thread ensures that the pointer is invalidated on the network thread.
,
May 16 2016
I log the pointer address in ctor/dtor/connect/disconnect functions and got interesting logs when the app crashes... 05-16 11:03:53.875: I/chromium(5966): [0516/110353:INFO:chromoting_jni_instance.cc(68)] CTOR: 0x9d7bff00 05-16 11:03:53.876: I/chromium(5966): [0516/110353:INFO:chromoting_jni_instance.cc(392)] CONNECT: 0x9d7bff00 05-16 11:03:53.883: I/chromium(5966): [0516/110353:INFO:chromoting_jni_instance.cc(96)] DTOR: 0x9d7bff00 CTOR runs on UI thread and posts a connect task on network thread. disconnect posts itself to the network thread so DTOR should not be called before the actual disconnect function is called on the network thread. This log doesn't make sense at all... Typical logs when it doesn't crash looks like this: 05-16 11:03:51.850: I/chromium(5966): [0516/110351:INFO:chromoting_jni_instance.cc(68)] CTOR: 0x9c7f6700 05-16 11:03:51.851: I/chromium(5966): [0516/110351:INFO:chromoting_jni_instance.cc(392)] CONNECT: 0x9c7f6700 05-16 11:03:52.994: I/chromium(5966): [0516/110352:INFO:chromoting_jni_instance.cc(113)] DISCONNECT: 0x9c7f6700 05-16 11:03:52.996: I/chromium(5966): [0516/110352:INFO:chromoting_jni_instance.cc(96)] DTOR: 0x9c7f6700 I suspect it is an Android bug... I can't repro this bug after installing a system update this morning...
,
May 19 2016
Okay... It may be some race conditions less likely to happen after Android update and new CL's. Since I can't reproduce this any more. I will just mark this as WontFix until somebody is bothered by this problem.
,
May 20 2016
https://codereview.chromium.org/1998353002/ Just wrote experiment code to continuously connect to host and cancel on the PIN dialog. It crashes since the weakptr is invalidated on the wrong thread, however, DCHECK's (!view_, !client_context_, etc.) inside the destructor passed and the logs look reasonable: 05-20 13:24:36.777: I/chromium(16671): [0520/132436:INFO:chromoting_jni_instance.cc(66)] CTOR: 0xef2f6600 05-20 13:24:36.778: I/chromium(16671): [0520/132436:INFO:chromoting_jni_instance.cc(404)] Run Connect: 0xef2f6600 05-20 13:24:37.421: I/chromium(16671): [0520/132437:INFO:chromoting_jni_instance.cc(105)] Post Disconnect: 0xef2f6600 05-20 13:24:37.421: I/chromium(16671): [0520/132437:INFO:chromoting_jni_instance.cc(111)] Run Disconnect: 0xef2f6600 05-20 13:24:37.455: I/chromium(16671): [0520/132437:INFO:chromoting_jni_instance.cc(91)] DTOR: 0xef2f6600 This is something Comment#3 can explain and may be a potential bug. I'll reopen this bug and fix it by invalidating the weakptr in Disconnect().
,
May 20 2016
After taking sergeyu@'s suggestion and invalidating weakptr in disconnect, the still crashes and the mysterious logging sequence still appears after (programatically :P) connecting and canceling for 68 times. Mysterious log: 05-20 15:30:26.434: D/cr_Chromoting(23721): [ConnectAndCancelTest.java:52] ConnectAndCancelTest connection count: 68 05-20 15:30:26.588: W/chromium(23721): [0520/153026:WARNING:chromoting_jni_instance.cc(66)] CTOR: 0xea274c00 05-20 15:30:26.588: W/chromium(23721): [0520/153026:WARNING:chromoting_jni_instance.cc(406)] Run Connect: 0xea274c00 05-20 15:30:26.589: W/chromium(23721): [0520/153026:WARNING:chromoting_jni_instance.cc(91)] DTOR: 0xea274c00 05-20 15:30:26.628: A/chromium(23721): [0520/153026:FATAL:chromoting_jni_instance.cc(96)] Check failed: !view_. 05-20 15:30:26.628: A/chromium(23721): #00 0xdfba0b47 /data/app/org.chromium.chromoting-1/lib/arm/libremoting_client_jni.so+0x00b26b47 ... Whereas a successful log sequence is: 05-20 15:30:20.600: D/cr_Chromoting(23721): [ConnectAndCancelTest.java:52] ConnectAndCancelTest connection count: 67 05-20 15:30:20.722: W/chromium(23721): [0520/153020:WARNING:chromoting_jni_instance.cc(66)] CTOR: 0xdaf25f00 05-20 15:30:20.722: W/chromium(23721): [0520/153020:WARNING:chromoting_jni_instance.cc(406)] Run Connect: 0xdaf25f00 05-20 15:30:21.427: W/chromium(23721): [0520/153021:WARNING:chromoting_jni_runtime.cc(269)] Called: ChromotingJniRuntime::DisconnectFromHost 05-20 15:30:21.427: W/chromium(23721): [0520/153021:WARNING:chromoting_jni_runtime.cc(271)] DisconnectFromHost: session != null 05-20 15:30:21.428: W/chromium(23721): [0520/153021:WARNING:chromoting_jni_instance.cc(105)] Post Disconnect: 0xdaf25f00 05-20 15:30:21.429: W/chromium(23721): [0520/153021:WARNING:chromoting_jni_instance.cc(111)] Run Disconnect: 0xdaf25f00 05-20 15:30:21.431: W/chromium(23721): [0520/153021:WARNING:chromoting_jni_instance.cc(91)] DTOR: 0xdaf25f00 Basically in some rare circumstances DisconnectFromHost is not called before the d'tor is called...
,
May 23 2016
Just to clarify that invalidating WeakPtr on the network thread can fix WeakPtr's validation DCHECK's although it doesn't fix DCHECK(!view_), so I will still commit the change for the WeakPtr.
,
May 23 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/3f62203016905be841dfa1142238f338916802c8 commit 3f62203016905be841dfa1142238f338916802c8 Author: yuweih <yuweih@chromium.org> Date: Mon May 23 17:21:46 2016 Invalidate ChromotingJniInstance's WeakPtr on network thread ChromotingJniInstance implements refcounted but also has a weak pointer factory. The implementation allows the d'tor to be called on any thread but the weak pointer can only be invalidated on the network thread. In some rare circumstances, The instance is destructed on a non-network thread and fails weakptr's DCHECK. This CL makes the instance invalidate the weak pointer when it disconnects on the network thread. BUG= 611893 Review-Url: https://codereview.chromium.org/2001813002 Cr-Commit-Position: refs/heads/master@{#395347} [modify] https://crrev.com/3f62203016905be841dfa1142238f338916802c8/remoting/client/jni/chromoting_jni_instance.cc [modify] https://crrev.com/3f62203016905be841dfa1142238f338916802c8/remoting/client/jni/chromoting_jni_instance.h
,
May 25 2016
|
|||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||
Comment 1 by yuweih@chromium.org
, May 13 2016