
Issue 821130


Issue metadata

Status: Fixed
Owner:
Closed: Jun 2018
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 3
Type: Bug




[remoting host] Python IOError in child when notifying parent logger.

Project Member Reported by lambroslambrou@chromium.org, Mar 12 2018

Issue description

When a new host process is started via start-host, the CRD host can sometimes take a long time to start (about 3 minutes in the log below). In this situation, the session can be terminated even though the host reports a successful start.

Python sets up a parent/child process pair, connected via a pipe.
When the CRD host is ready, it sends SIGUSR1 to the child. The child handles this and sends "READY" to the parent via the pipe.
However, if the parent process has exited (perhaps due to start-host timeout), the child fails to write data to the pipe, with an uncaught IOError exception.
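
The failure mode can be reproduced in isolation. The following is a minimal sketch, not the chrome-remote-desktop code: the function name is hypothetical, and it closes the read end of a pipe to stand in for a parent that has exited. It uses os.write rather than a buffered file object, so the error surfaces at the write instead of at flush().

```python
import errno
import os


def write_ready_to_closed_pipe():
    """Return the errno raised when writing to a pipe with no reader.

    Hypothetical helper for illustration only.
    """
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # Stands in for the parent process having exited.
    try:
        os.write(write_fd, b"READY\n")
    except OSError as e:  # IOError is an alias of OSError on Python 3.
        return e.errno
    finally:
        os.close(write_fd)
    return None


if __name__ == "__main__":
    # Python ignores SIGPIPE by default, so the write raises an
    # exception instead of killing the process.
    print(write_ready_to_closed_pipe() == errno.EPIPE)
```

Errno 32 in the log below is EPIPE on Linux, the same error this sketch returns.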

I think we should trap the IOError exception at that point. Even if the parent process has gone away, it's still better to allow the child to continue running.
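
A rough sketch of what trapping the exception could look like. This is an assumption about the shape of the fix, not the actual change: notify_parent_ready is a hypothetical name, and the real script's _release_parent method may be structured differently.

```python
import os


def notify_parent_ready(write_file):
    """Send "READY" to the parent; return False if the parent has gone.

    Hypothetical helper: write_file is a text-mode file object wrapping
    the write end of the pipe to the parent.
    """
    try:
        write_file.write("READY\n")
        write_file.flush()  # A broken pipe typically surfaces here.
    except IOError:
        # The parent has probably exited (for example after a timeout).
        # Swallow the error so the child keeps running.
        return False
    return True


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # Simulate a parent that already exited.
    pipe_file = os.fdopen(write_fd, "w")
    print(notify_parent_ready(pipe_file))
    try:
        pipe_file.close()  # close() re-flushes and may raise again.
    except IOError:
        pass
```

Returning a boolean rather than raising lets the caller log the condition and carry on, which matches the goal of keeping the child alive.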

Example log of a failed session, which took 3 minutes to start successfully but was then killed because of the unhandled exception:
[0310/105554.051208:INFO:signaling_connector.cc(73)] Signaling connected. New JID: [...]
[0310/105554.334353:ERROR:heartbeat_sender.cc(271)] Received error: Host ID not found
[0310/105604.477453:ERROR:heartbeat_sender.cc(271)] Received error: Host ID not found
[0310/105854.833419:INFO:remoting_me2me_host.cc(935)] Host ready to receive connections.
2018-03-10 10:58:54,833:INFO:Host ready to receive connections.
Traceback (most recent call last):
  File "/usr/lib/python2.7/logging/__init__.py", line 885, in emit
    self.flush()
  File "/usr/lib/python2.7/logging/__init__.py", line 845, in flush
    self.stream.flush()
IOError: [Errno 32] Broken pipe
Logged from file chrome-remote-desktop, line 741
Traceback (most recent call last):
  File "/opt/google/chrome-remote-desktop/chrome-remote-desktop", line 1772, in <module>
    sys.exit(main())
  File "/opt/google/chrome-remote-desktop/chrome-remote-desktop", line 1696, in main
    pid, status = waitpid_handle_exceptions(-1, deadline)
  File "/opt/google/chrome-remote-desktop/chrome-remote-desktop", line 1305, in waitpid_handle_exceptions
    pid_result, status = os.waitpid(pid, 0)
  File "/opt/google/chrome-remote-desktop/chrome-remote-desktop", line 743, in sigusr1_handler
    ParentProcessLogger.release_parent_if_connected(True)
  File "/opt/google/chrome-remote-desktop/chrome-remote-desktop", line 1041, in release_parent_if_connected
    instance._release_parent(success)
  File "/opt/google/chrome-remote-desktop/chrome-remote-desktop", line 1014, in _release_parent
    self._write_file.flush()
IOError: [Errno 32] Broken pipe
2018-03-10 10:58:54,862:INFO:Cleanup.
2018-03-10 10:58:54,863:INFO:Terminating X server
Gdk-Message: chrome-remote-desktop-host: Fatal IO error 11 (Resource temporarily unavailable) on X server :20.

2018-03-10 10:58:54,872:INFO:Terminating session
2018-03-10 10:58:54,902:INFO:Terminating host
[0310/105854.911995:WARNING:remoting_user_session.cc(464)] Child exited with status 1

 
Owner: lambroslambrou@chromium.org
Status: Assigned (was: Untriaged)
Status: Started (was: Assigned)
Project Member

Comment 3 by bugdroid1@chromium.org, May 23 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/09bcd258802cd250d96c43fa4f814c35867a4dce

commit 09bcd258802cd250d96c43fa4f814c35867a4dce
Author: Lambros Lambrou <lambroslambrou@chromium.org>
Date: Wed May 23 18:33:58 2018

[remoting] Catch IOError writing READY message to pipe

When the host starts successfully, the chrome-remote-desktop Python
script tries to notify the user-session process by sending a "READY"
message through a pipe. But sometimes the host can take longer to start
than user-session is prepared to wait for it (it times out after 120s).
Since the user-session process has exited, the "READY" message fails
with IOError (broken pipe). If it is not caught, the Python process
exits, taking down the host process with it.

This CL traps the IOError, and allows the host to continue running in
this situation.

Test: Change "alarm(120)" to "alarm(5)" in remoting_user_session.cc
and try to start the host:
out/Default/remoting/chrome-remote-desktop --start
After this CL, the host service continues running normally, even though
user-session reports "Timeout waiting for session to start..."

Bug: 821130
Change-Id: I7ba1b7fcd6c48890677310551ae734a8bc905bbe
Reviewed-on: https://chromium-review.googlesource.com/994586
Reviewed-by: Jamie Walch <jamiewalch@chromium.org>
Commit-Queue: Lambros Lambrou <lambroslambrou@chromium.org>
Cr-Commit-Position: refs/heads/master@{#561170}
[modify] https://crrev.com/09bcd258802cd250d96c43fa4f814c35867a4dce/remoting/host/linux/linux_me2me_host.py

Status: Fixed (was: Started)
