Issue 689229 link

Issue metadata

Status: Verified
Owner:
Closed: Feb 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Windows , Chrome , Mac
Pri: 1
Type: Bug-Regression

Blocking:
issue 687298



Chrome Remote Desktop keeps disconnecting with "A required component has stopped working"

Project Member Reported by w...@chromium.org, Feb 6 2017

Issue description

Chrome Version: 57.0.2987.19 (Official Build) dev (64-bit)
Chrome Remote Desktop version: 56.0.2924.51
OS: ChromeOS

What steps will reproduce the problem?
(1) Run Chrome Remote Desktop.
(2) Connect to a host.
(3) Switch to some other window, e.g. crbug.com, and use that for a while ;)

What is the expected result?

Session stays connected indefinitely.

What happens instead?

Within a few tens of seconds, Chrome Remote Desktop disconnects with "A required component has stopped working".

Looking at the JavaScript console, the CRD web-app is seeing the PNaCl plugin crashing. 

Note that this only started happening today; there was a ChromeOS update this morning - not sure if CRD happens to also have had an update.
 

Comment 1 by w...@chromium.org, Feb 6 2017

Cc: jamiewa...@chromium.org

Comment 2 by w...@chromium.org, Feb 6 2017

I'm unable to repro this on Flip (CrOS 57.0.2987.19) or Pixel 1 (CrOS 58.0.3001.0) with the latest web-app. Noticeable on the Panther device, on which I previously repro'd this, is that the Chrome Remote Desktop log keeps logging entries of the form:

Received data for unknown socket 7
(anonymous) @ tcp_socket.js:27
EventImpl.dispatchToListener @ extensions::event_bindings:388
publicClassPrototype.(anonymous function) @ extensions::utils:149
EventImpl.dispatch_ @ extensions::event_bindings:372
dispatchArgs @ extensions::event_bindings:244
dispatchEvent @ extensions::event_bindings:253

It is logging these every few tens of seconds at times.

Comment 3 by w...@chromium.org, Feb 6 2017

Logs from the plugin crashing look like:

NativeClient: NaCl module crashed
console_wrapper.js:115 Connection dropped: ERROR_NACL_PLUGIN_CRASHED client_session.js:663
remoting.ConsoleWrapper.recordAndLog_ @ console_wrapper.js:115
remoting.ClientSession.notifyStateChanges_ @ client_session.js:663
remoting.ClientSession.setState_ @ client_session.js:575
remoting.ClientSession.onConnectionStatusUpdate @ client_session.js:501
remoting.ClientPluginImpl.onPluginCrashed_ @ client_plugin_impl.js:178

NaCl Module crashed.  client_plugin_impl.js:185
remoting.ConsoleWrapper.recordAndLog_ @ console_wrapper.js:115
remoting.ClientPluginImpl.onPluginCrashed_ @ client_plugin_impl.js:185
tcp_socket.js:27 Received data for unknown socket 11
(anonymous) @ tcp_socket.js:27
EventImpl.dispatchToListener @ extensions::event_bindings:388
publicClassPrototype.(anonymous function) @ extensions::utils:149
EventImpl.dispatch_ @ extensions::event_bindings:372
dispatchArgs @ extensions::event_bindings:244
dispatchEvent @ extensions::event_bindings:253

I've verified that these don't seem to correspond to e.g. online/offline transitions.
Owner: joedow@chromium.org
Status: Assigned (was: Untriaged)
Investigating, since this is a possible regression in the M56 CRD release. We should understand it and try to get a fix into M57 if possible.

Comment 5 Deleted

Based on telemetry, I see an uptick in NaCl crashes across all client platforms.

Comment 7 Deleted

Comment 8 Deleted

Comment 9 Deleted

Comment 10 Deleted

Comment 11 by w...@chromium.org, Feb 7 2017

Cc: sergeyu@chromium.org
Labels: -OS-Chrome OS-All
Neither the CRD client nor host logs contain anything helpful-looking, but the crash/disconnect events do always seem to be accompanied by Chrome log output of the form:

[6136:2328:0207/125547.454856:ERROR:vaapi_video_decode_accelerator.cc(637)] Error decoding stream
[6136:6136:0207/125547.456684:ERROR:vaapi_video_decode_accelerator.cc(288)] Notifying of error 4
[6136:6136:0207/125547.459512:ERROR:vaapi_video_decode_accelerator.cc(762)] Decode request from client in invalid state: 0
[6136:6136:0207/125547.459635:ERROR:vaapi_video_decode_accelerator.cc(288)] Notifying of error 4
[6136:6136:0207/125547.459749:ERROR:vaapi_video_decode_accelerator.cc(762)] Decode request from client in invalid state: 0
[6136:6136:0207/125547.459799:ERROR:vaapi_video_decode_accelerator.cc(288)] Notifying of error 4
[6136:2653:0207/130127.162337:ERROR:vaapi_video_decode_accelerator.cc(637)] Error decoding stream
[6136:6136:0207/130127.162877:ERROR:vaapi_video_decode_accelerator.cc(288)] Notifying of error 4

FYI, I have started an internal tracking bug as well: b/35103700

Comment 13 by w...@chromium.org, Feb 7 2017

FWIW tried two experiments to see if this is a decoder-specific issue:
- Disabled hardware video decode.
- Reverted from using VP9 in CRD to using VP8.

The client plugin still crashes with software-only decode and with VP8, so this is not a decoder-specific issue.
This version of the webapp has been rolled back while I investigate. We will bisect to find the change that is causing the problem.
Blocking: 687298
Status: Started (was: Assigned)
It looks like the change that is causing the issue was checked in last June:
https://codereview.chromium.org/2096643003/

Wez has confirmed that a webapp built one CL prior to this change works fine, and that a webapp built with the CL exhibits the NaCl crash.
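
For readers unfamiliar with bisecting, the search above can be sketched as a binary search between a known-good and a known-bad revision. This is only an illustration: `FindFirstBadRevision`, the integer revision numbers, and the `is_bad` predicate are hypothetical and are not part of the CRD tooling.

```cpp
#include <cassert>
#include <functional>

// Returns the first bad revision in the range (good, bad].
// Assumptions (hypothetical model): `good` is known to work, `bad` is known
// to crash, and once a revision is bad every later revision is bad too.
int FindFirstBadRevision(int good, int bad,
                         const std::function<bool(int)>& is_bad) {
  assert(good < bad);
  while (bad - good > 1) {
    int mid = good + (bad - good) / 2;
    if (is_bad(mid))
      bad = mid;   // The culprit is at or before mid.
    else
      good = mid;  // The culprit is after mid.
  }
  return bad;
}
```

For an intermittent crash like this one, the predicate would need to run each build long enough to be reasonably sure a "good" result is not just a lucky run.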

Project Member

Comment 17 by bugdroid1@chromium.org, Feb 13 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/838ec64ba5d7602699ec7c02182c5b17fa06075d

commit 838ec64ba5d7602699ec7c02182c5b17fa06075d
Author: joedow <joedow@chromium.org>
Date: Mon Feb 13 22:52:25 2017

CRD Webapp intermittently crashes on some machines

This issue appeared during our M56 release and affected a subset of
machines.  Some users would hit this problem after a few seconds and
others could spend several hours using the app without problems.

I was able to track the problem down to a checkin from last year:
https://codereview.chromium.org/2096643003/

After a bit of additional debugging, I believe I know what the problem
is.  The actual problem is caused by a call to OnPictureReady() in the
PepperVideoRenderer3D class that occurs before we have a decoded frame
ready.  This leads to a cascade of issues where we try to splice an
element from the empty decoded_frames_ list which causes the size of the
list to underflow to 2^32 and inserts a null FrameTracker into the
next_picture_frames_ list.  This null FrameTracker is eventually
dereferenced which causes a crash.

Why does this only happen on certain machines?  I believe the problem is
in the call to GetNextPicture().  This method is called in two places,
once when we finish decoding a frame and again after we have retrieved a
decoded frame.  The machine I have been debugging the crash on is a
dual-core Celeron, which is quite slow.  All of the test machines have at
least a Core i5 in them.  My theory is that on the faster machines, the
decoder is fast enough to complete its work before the extra call to
GetNextPicture() results in the PictureReady callback being signalled.
On the slow machines, we set up our callback, which ends up triggering
before the decoding completes.

My fix is to prevent setting up the OnPictureReady callback if we do not
have any decoded frames.  A simple fix for a difficult to debug problem.

BUG= 689229 

Review-Url: https://codereview.chromium.org/2692703002
Cr-Commit-Position: refs/heads/master@{#450132}

[modify] https://crrev.com/838ec64ba5d7602699ec7c02182c5b17fa06075d/remoting/client/plugin/pepper_video_renderer_3d.cc
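
The guard described in the commit message above can be sketched roughly as follows. This is a simplified model, not the code from pepper_video_renderer_3d.cc: the names `decoded_frames_`, `next_picture_frames_`, `FrameTracker`, `GetNextPicture` and `OnPictureReady` come from the commit message, but their shapes here are assumptions, and `RendererSketch`, `OnDecodeDone`, `pending_pictures` and `decoded_frames` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <utility>

// Hypothetical stand-in for the renderer's per-frame bookkeeping object.
struct FrameTracker {
  int frame_id;
};

// Simplified, hypothetical model of the two queues named in the commit
// message. This is not the real PepperVideoRenderer3D class.
class RendererSketch {
 public:
  // Called when the decoder finishes a frame.
  void OnDecodeDone(int frame_id) {
    decoded_frames_.push_back(
        std::make_unique<FrameTracker>(FrameTracker{frame_id}));
  }

  // Moves the oldest decoded frame to the "picture requested" queue.
  // Returns false, and requests nothing, when no decoded frame exists.
  bool GetNextPicture() {
    if (decoded_frames_.empty())
      return false;  // The guard: never splice from an empty list.
    next_picture_frames_.splice(next_picture_frames_.end(), decoded_frames_,
                                decoded_frames_.begin());
    return true;
  }

  // Stand-in for the picture-ready callback: consumes one queued tracker.
  int OnPictureReady() {
    assert(!next_picture_frames_.empty());
    std::unique_ptr<FrameTracker> tracker =
        std::move(next_picture_frames_.front());
    next_picture_frames_.pop_front();
    return tracker->frame_id;
  }

  std::size_t pending_pictures() const { return next_picture_frames_.size(); }
  std::size_t decoded_frames() const { return decoded_frames_.size(); }

 private:
  std::list<std::unique_ptr<FrameTracker>> decoded_frames_;
  std::list<std::unique_ptr<FrameTracker>> next_picture_frames_;
};
```

In this model, calling `GetNextPicture()` with no decoded frames is a no-op, so no callback is set up and no null tracker can reach `OnPictureReady()`. Without the `empty()` check, `splice` would be handed the end iterator of an empty list, which is undefined behavior and is consistent with the corrupted list size the commit message describes.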

Labels: Merge-Request-57
This issue affected our M56 release and we *just* found and corrected it.  Can I get approval to merge the fix into the M57 branch?  This change does not affect the browser, only the CRD webapp.

Thank you!
Note: The merge request is only for commit 838ec64ba5d7602699ec7c02182c5b17fa06075d. The second CL was a method rename made in response to post-submission feedback; it is functionally equivalent, so there is no need to merge it.
Labels: -OS-All OS-Chrome OS-Linux OS-Mac OS-Windows
Labels: -Merge-Request-57 Merge-Approved-57
Approving merge to M57 Chrome OS.
Project Member

Comment 23 by bugdroid1@chromium.org, Feb 14 2017

Labels: -merge-approved-57 merge-merged-2987
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/dd3c640c1140c46d321b41d3434b660a1cff4860

commit dd3c640c1140c46d321b41d3434b660a1cff4860
Author: Joe Downing <joedow@google.com>
Date: Tue Feb 14 23:51:27 2017

CRD Webapp intermittently crashes on some machines

BUG= 689229 

Review-Url: https://codereview.chromium.org/2692703002
Cr-Commit-Position: refs/heads/master@{#450132}
(cherry picked from commit 838ec64ba5d7602699ec7c02182c5b17fa06075d)

Review-Url: https://codereview.chromium.org/2692383002 .
Cr-Commit-Position: refs/branch-heads/2987@{#514}
Cr-Branched-From: ad51088c0e8776e8dcd963dbe752c4035ba6dab6-refs/heads/master@{#444943}

[modify] https://crrev.com/dd3c640c1140c46d321b41d3434b660a1cff4860/remoting/client/plugin/pepper_video_renderer_3d.cc

Status: Verified (was: Started)
Issue has been resolved in both M57 and M58.  Wez verified the fix in M57 this morning.
