New issue
Advanced search Search tips

Issue 611893 link

Starred by 1 user

Issue metadata

Status: Duplicate
Merged: issue 614242
Owner:
Closed: May 2016
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug



Sign in to add a comment

[Remoting Android] Client Randomly Crashes due to DCHECK failure

Project Member Reported by yuweih@chromium.org, May 13 2016

Issue description

Version: Android Client ToT
OS: Android 6

What steps will reproduce the problem?
(1) Open the client.
(2) Keep clicking on a host and canceling on the enter PIN dialog.

What is the expected output?

The app should not crash.

What do you see instead?

The app crashes at some point. The DCHECK's in the dtor of ChromotingJniInstance failed. There are probably two problems going on:

1. ChromotingJniInstance::Disconnect() posts a task to the network thread to clean up its members, and the d'tor may be called on any thread when the refcount drops to 0. This sometimes causes the d'tor being executed before Disconnect() is called.
2. ChromotingJniInstance also has a weak ptr factory. The c'tor and d'tor may not be called on the same thread.
 

Comment 1 by yuweih@chromium.org, May 13 2016

Just to mention that this only happens on dev builds where DCHECK is turned on.

Comment 2 by yuweih@chromium.org, May 14 2016

Owner: yuweih@chromium.org
Status: Assigned (was: Untriaged)
This can be fixed by implementing a refcount object deleter ("Traits"?) for JniInstance that posts delete task to the network thread.

Comment 3 by sergeyu@google.com, May 14 2016

Disconnect() will not be called after destructor because the Disconnect() task holds a reference to the object, i.e. ref-count will not go to 0 while Disconnect() task is pending.

To fix this bug should be enough to add weak_factory_.InvalidateWeakPtrs() in Disconnect(). 

Comment 4 by yuweih@chromium.org, May 14 2016

Looks like it does fix the bug. I have no idea why this bug exists at the first place... ConnectToHostOnNetworkThread() is called after Disconnect() is called? Then how does calling weak_factory_.InvalidateWeakPtrs() fix this problem?

Comment 5 by sergeyu@google.com, May 15 2016

Weak pointers are supposed to be invalidated on the same thread on which they are dereferenced (which is the network thread in this case). ChromotingJniInstance can be destroyed on any thread and so WeakPtrFactory sometimes would invalidate weak ptrs on the UI thread (when the factory is destroyed), but that's supposed to happen only on network thread, hence the DCHECK. Calling InvalidateWeakPtrs() on network thread ensures that the pointer is invalidated on the network thread.

Comment 6 by yuweih@chromium.org, May 16 2016

I log the pointer address in ctor/dtor/connect/disconnect functions and got interesting logs when the app crashes...

05-16 11:03:53.875: I/chromium(5966): [0516/110353:INFO:chromoting_jni_instance.cc(68)] CTOR: 0x9d7bff00
05-16 11:03:53.876: I/chromium(5966): [0516/110353:INFO:chromoting_jni_instance.cc(392)] CONNECT: 0x9d7bff00
05-16 11:03:53.883: I/chromium(5966): [0516/110353:INFO:chromoting_jni_instance.cc(96)] DTOR: 0x9d7bff00

CTOR runs on UI thread and posts a connect task on network thread. disconnect posts itself to the network thread so DTOR should not be called before the actual disconnect function is called on the network thread. This log doesn't make sense at all...

Typical logs when it doesn't crash looks like this:

05-16 11:03:51.850: I/chromium(5966): [0516/110351:INFO:chromoting_jni_instance.cc(68)] CTOR: 0x9c7f6700
05-16 11:03:51.851: I/chromium(5966): [0516/110351:INFO:chromoting_jni_instance.cc(392)] CONNECT: 0x9c7f6700
05-16 11:03:52.994: I/chromium(5966): [0516/110352:INFO:chromoting_jni_instance.cc(113)] DISCONNECT: 0x9c7f6700
05-16 11:03:52.996: I/chromium(5966): [0516/110352:INFO:chromoting_jni_instance.cc(96)] DTOR: 0x9c7f6700

I suspect it is an Android bug... I can't repro this bug after installing a system update this morning...

Comment 7 by yuweih@chromium.org, May 19 2016

Status: WontFix (was: Assigned)
Okay... It may be some race conditions less likely to happen after Android update and new CL's. Since I can't reproduce this any more. I will just mark this as WontFix until somebody is bothered by this problem.

Comment 8 by yuweih@chromium.org, May 20 2016

Status: Started (was: WontFix)
https://codereview.chromium.org/1998353002/

Just wrote experiment code to continuously connect to host and cancel on the PIN dialog. It crashes since the weakptr is invalidated on the wrong thread, however, DCHECK's (!view_, !client_context_, etc.) inside the destructor passed and the logs look reasonable:

05-20 13:24:36.777: I/chromium(16671): [0520/132436:INFO:chromoting_jni_instance.cc(66)] CTOR: 0xef2f6600
05-20 13:24:36.778: I/chromium(16671): [0520/132436:INFO:chromoting_jni_instance.cc(404)] Run Connect: 0xef2f6600
05-20 13:24:37.421: I/chromium(16671): [0520/132437:INFO:chromoting_jni_instance.cc(105)] Post Disconnect: 0xef2f6600
05-20 13:24:37.421: I/chromium(16671): [0520/132437:INFO:chromoting_jni_instance.cc(111)] Run Disconnect: 0xef2f6600
05-20 13:24:37.455: I/chromium(16671): [0520/132437:INFO:chromoting_jni_instance.cc(91)] DTOR: 0xef2f6600

This is something Comment#3 can explain and may be a potential bug.

I'll reopen this bug and fix it by invalidating the weakptr in Disconnect().

Comment 9 by yuweih@chromium.org, May 20 2016

After taking sergeyu@'s suggestion and invalidating weakptr in disconnect, the still crashes and the mysterious logging sequence still appears after (programatically :P) connecting and canceling for 68 times.

Mysterious log:

05-20 15:30:26.434: D/cr_Chromoting(23721): [ConnectAndCancelTest.java:52] ConnectAndCancelTest connection count: 68
05-20 15:30:26.588: W/chromium(23721): [0520/153026:WARNING:chromoting_jni_instance.cc(66)] CTOR: 0xea274c00
05-20 15:30:26.588: W/chromium(23721): [0520/153026:WARNING:chromoting_jni_instance.cc(406)] Run Connect: 0xea274c00
05-20 15:30:26.589: W/chromium(23721): [0520/153026:WARNING:chromoting_jni_instance.cc(91)] DTOR: 0xea274c00
05-20 15:30:26.628: A/chromium(23721): [0520/153026:FATAL:chromoting_jni_instance.cc(96)] Check failed: !view_. 
05-20 15:30:26.628: A/chromium(23721): #00 0xdfba0b47 /data/app/org.chromium.chromoting-1/lib/arm/libremoting_client_jni.so+0x00b26b47
...

Whereas a successful log sequence is:

05-20 15:30:20.600: D/cr_Chromoting(23721): [ConnectAndCancelTest.java:52] ConnectAndCancelTest connection count: 67
05-20 15:30:20.722: W/chromium(23721): [0520/153020:WARNING:chromoting_jni_instance.cc(66)] CTOR: 0xdaf25f00
05-20 15:30:20.722: W/chromium(23721): [0520/153020:WARNING:chromoting_jni_instance.cc(406)] Run Connect: 0xdaf25f00
05-20 15:30:21.427: W/chromium(23721): [0520/153021:WARNING:chromoting_jni_runtime.cc(269)] Called: ChromotingJniRuntime::DisconnectFromHost
05-20 15:30:21.427: W/chromium(23721): [0520/153021:WARNING:chromoting_jni_runtime.cc(271)] DisconnectFromHost: session != null
05-20 15:30:21.428: W/chromium(23721): [0520/153021:WARNING:chromoting_jni_instance.cc(105)] Post Disconnect: 0xdaf25f00
05-20 15:30:21.429: W/chromium(23721): [0520/153021:WARNING:chromoting_jni_instance.cc(111)] Run Disconnect: 0xdaf25f00
05-20 15:30:21.431: W/chromium(23721): [0520/153021:WARNING:chromoting_jni_instance.cc(91)] DTOR: 0xdaf25f00

Basically in some rare circumstances DisconnectFromHost is not called before the d'tor is called...
Just to clarify that invalidating WeakPtr on the network thread can fix WeakPtr's validation DCHECK's although it doesn't fix DCHECK(!view_), so I will still commit the change for the WeakPtr.
Project Member

Comment 11 by bugdroid1@chromium.org, May 23 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3f62203016905be841dfa1142238f338916802c8

commit 3f62203016905be841dfa1142238f338916802c8
Author: yuweih <yuweih@chromium.org>
Date: Mon May 23 17:21:46 2016

Invalidate ChromotingJniInstance's WeakPtr on network thread

ChromotingJniInstance implements refcounted but also has a weak pointer
factory. The implementation allows the d'tor to be called on any thread
but the weak pointer can only be invalidated on the network thread. In
some rare circumstances, The instance is destructed on a non-network
thread and fails weakptr's DCHECK. This CL makes the instance invalidate
the weak pointer when it disconnects on the network thread.

BUG= 611893 

Review-Url: https://codereview.chromium.org/2001813002
Cr-Commit-Position: refs/heads/master@{#395347}

[modify] https://crrev.com/3f62203016905be841dfa1142238f338916802c8/remoting/client/jni/chromoting_jni_instance.cc
[modify] https://crrev.com/3f62203016905be841dfa1142238f338916802c8/remoting/client/jni/chromoting_jni_instance.h

Mergedinto: 614242
Status: Duplicate (was: Started)
Should be fixed after  issue 614242  is done.

Sign in to add a comment