
Issue 740201

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Jul 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Fuchsia
Pri: 2
Type: Bug




test-runner.py hangs, and fails to respond to Ctrl-C, if test(s) crash

Project Member Reported by w...@chromium.org, Jul 7 2017

Issue description

What steps will reproduce the problem?
(1) Build ipc_tests.
(2) Run a crashy IPC test, e.g. run_ipc_tests --gtest_filter=*SharedMemory*

What is the expected result?

Expect that the test runs, and the test-runner exits.

What happens instead?

The test-runner continues running and cannot be quit via Ctrl-C, nor does it ever appear to time out and exit.

 

Comment 1 by w...@chromium.org, Jul 8 2017

Failure to respond to signals (or, at least, to shutdown *cleanly* in response to Ctrl-C) is a porting TODO (see https://cs.chromium.org/chromium/src/base/test/launcher/test_launcher.cc?sq=package:chromium&l=119) - in future this is something that the Fuchsia shell should really handle, by terminating the test process.

The timeout issue appears to be that, since so many tests currently hang due to partial/missing implementation, we end up with all the shard sub-processes stuck waiting for hung tests until they time out - with 10 tests per shard batch, the timeout is 450s by default. Ideally we'd have the test launcher watch the progress of the sub-process and only allow 45s per actual test, rather than a blanket timeout, but I suspect that's tricky to arrange.
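
To illustrate the difference between the blanket timeout and a per-test allowance, here is a minimal sketch. It is not the test launcher's code: the ProgressWatchdog class and its method names are hypothetical, and only the 45s and 10-tests-per-batch figures come from the comment above.

```python
# Hypothetical sketch: blanket batch timeout vs. a per-test progress watchdog.
PER_TEST_TIMEOUT_S = 45  # per-test allowance mentioned above


def blanket_timeout(batch_size, per_test_timeout_s=PER_TEST_TIMEOUT_S):
    """Timeout applied to a whole batch when it is scaled by the batch size."""
    return batch_size * per_test_timeout_s


class ProgressWatchdog:
    """Tracks time since the last observed test start (hypothetical helper)."""

    def __init__(self, per_test_timeout_s, now):
        self._limit = per_test_timeout_s
        self._last_progress = now

    def on_test_started(self, now):
        # Seeing a new test start counts as progress and resets the clock.
        self._last_progress = now

    def expired(self, now):
        # True once a single test has run longer than its allowance.
        return now - self._last_progress > self._limit
```

With 10 tests per batch, blanket_timeout(10) gives the 450s figure, whereas the watchdog would flag a hung test as soon as 45s pass without a new test starting.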

To make these hangs easier to diagnose locally, we can at least add pass-throughs for the batch & job counts to the test-runner script.
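
A rough sketch of what such pass-throughs could look like in a Python runner script. The function name is hypothetical, and the switch names --test-launcher-batch-limit and --test-launcher-jobs are assumed to match the test launcher's own switches; check them against the launcher before relying on this.

```python
import argparse


def build_launcher_args(argv):
    """Forward batch/job settings to the test binary (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    # Assumed switch names; the runner accepts them and re-emits them.
    parser.add_argument("--test-launcher-batch-limit", type=int, default=None)
    parser.add_argument("--test-launcher-jobs", type=int, default=None)
    args, passthrough = parser.parse_known_args(argv)

    # Unrecognised arguments (e.g. --gtest_filter) are passed through as-is.
    child_args = list(passthrough)
    if args.test_launcher_batch_limit is not None:
        child_args.append(
            "--test-launcher-batch-limit=%d" % args.test_launcher_batch_limit)
    if args.test_launcher_jobs is not None:
        child_args.append("--test-launcher-jobs=%d" % args.test_launcher_jobs)
    return child_args
```

Running with a batch limit of 1 and a single job would make it much easier to see which individual test is hanging.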

Comment 2 by w...@chromium.org, Jul 11 2017

Owner: kmarshall@chromium.org
Kevin, as noted, there's not much to do about Ctrl-C right now, but the terminal cruft you've noticed, and the timeouts issue, would be good to fix - WDYT?

Comment 3 by w...@chromium.org, Jul 12 2017

... and then Kevin inspired me to realise that we just need to redirect QEMU's stdin to allow the test-runner script to catch Ctrl-C: https://chromium-review.googlesource.com/c/567037

Comment 4 by bugdroid1@chromium.org, Jul 13 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/7913acc222c37941fb90f7e8813221c8cd83075c

commit 7913acc222c37941fb90f7e8813221c8cd83075c
Author: Kevin Marshall <kmarshall@chromium.org>
Date: Thu Jul 13 19:01:47 2017

Stability, usability improvements for Fuchsia test runner.

* Fix test flakiness that seemed to be related to qemu sharing the
  stdin stream with Python. The OS would then tear down the scripts
  prematurely causing Python to panic.
  Specifying a subprocess PIPE for stdin fixes that.
* Disable the QEMU interactive monitor. It was overriding the user's
  terminal settings and blocking important keypresses like ^C, making
  it difficult to terminate tests early.

R: wez@chromium.org,thakis@chromium.org
BUG: 741194, 740201
Change-Id: Ibda6ac59d44b07c1b75902a7dc13db086841baa0
Reviewed-on: https://chromium-review.googlesource.com/568739
Commit-Queue: Kevin Marshall <kmarshall@chromium.org>
Reviewed-by: Nico Weber <thakis@chromium.org>
Reviewed-by: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#486451}
[modify] https://crrev.com/7913acc222c37941fb90f7e8813221c8cd83075c/build/fuchsia/test_runner.py
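
For readers unfamiliar with the stdin part of that change, a minimal sketch of giving a child process its own stdin pipe instead of letting it inherit the script's terminal. This is not the code from the CL: the helper name is hypothetical, and a Python child stands in for QEMU.

```python
import subprocess
import sys


def launch_with_private_stdin(command, **popen_kwargs):
    """Start a child whose stdin is a pipe, not the parent's terminal."""
    # With stdin=PIPE the child cannot read from or reconfigure the user's
    # terminal, so keypresses such as ^C stay with the parent script.
    return subprocess.Popen(command, stdin=subprocess.PIPE, **popen_kwargs)


if __name__ == "__main__":
    # Stand-in child: reports how many characters it could read from stdin.
    child = [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"]
    proc = launch_with_private_stdin(child, stdout=subprocess.PIPE, text=True)
    out, _ = proc.communicate()  # closes the pipe, so the child sees EOF
    print(out.strip())
```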

Status: Fixed (was: Assigned)
Cc: scottmg@chromium.org
Ctrl-A X was the way to "Ctrl-C" before, but I guess Ctrl-C is more natural. The spam from python on Ctrl-C is annoying though, compared with just quietly exiting as it did when the qemu monitor was exited.

(I also used the shell to run local commands so having Ctrl-C go to the guest instead made more sense. Perhaps we should have a "run_shell" helper to get to a shell without running a test binary for investigating the vm setup.)

Comment 7 by w...@chromium.org, Jul 24 2017

+1 to having an interactive shell runner script, as well as test-runner.

Comment 8 by bugdroid1@chromium.org, Jul 24 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9d6ed67929770fd6541ac6bd82904ae1610fe745

commit 9d6ed67929770fd6541ac6bd82904ae1610fe745
Author: Kevin Marshall <kmarshall@chromium.org>
Date: Mon Jul 24 18:48:33 2017

Handle ctrl-C and timeouts for Fuchsia test symbolization.

Currently the symbolization worker processes fight over who gets
to receive SIGINT when the user presses ctrl-C during symbolization.

This change makes the parent process, the one that spawned the
children, solely responsible for receiving the signal and killing
the child processes.

R: sergeyu@chromium.org
Bug:  740201 
Change-Id: I76721d18dda35a8d5b28ace24687c07ea0f68b87
Reviewed-on: https://chromium-review.googlesource.com/578747
Commit-Queue: Kevin Marshall <kmarshall@chromium.org>
Reviewed-by: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#489035}
[modify] https://crrev.com/9d6ed67929770fd6541ac6bd82904ae1610fe745/build/fuchsia/test_runner.py
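
A sketch of the "parent owns the signal" pattern that commit describes. The CL itself may do this differently. Here, as an assumption, workers are plain subprocesses started in their own session so the terminal's Ctrl-C reaches only the parent, which then terminates them. All function names are hypothetical.

```python
import subprocess
import sys


def start_workers(commands):
    """Start each worker in its own session so terminal SIGINT skips it."""
    # start_new_session is POSIX-only behaviour.
    return [subprocess.Popen(cmd, start_new_session=True) for cmd in commands]


def stop_workers(workers):
    """Terminate any workers still running, then reap all of them."""
    for worker in workers:
        if worker.poll() is None:
            worker.terminate()
    for worker in workers:
        worker.wait()


def run_workers(commands):
    """Return True if all workers finished, False if interrupted."""
    workers = start_workers(commands)
    try:
        for worker in workers:
            worker.wait()
    except KeyboardInterrupt:
        # Only the parent sees Ctrl-C; it is responsible for cleanup.
        stop_workers(workers)
        return False
    return True
```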

Scott, can you resync and see if the python log spew issue is acceptable to you? The CL in comment #8 should improve things substantially.
I think I might have done the except KeyboardInterrupt at the top level (as the python backtrace isn't interesting even if it's not symbolizing), but that's fine too. Thanks!
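
For reference, a minimal sketch of the top-level variant mentioned above. The wrapper name is hypothetical, and the 130 exit status is the common shell convention of 128 plus the SIGINT number, not something taken from the runner script.

```python
import sys


def run_quietly(main):
    """Run main() and turn Ctrl-C into a quiet exit code, not a backtrace."""
    try:
        return main()
    except KeyboardInterrupt:
        print("Interrupted.", file=sys.stderr)
        return 130  # 128 + SIGINT (2), by shell convention


def example_main():
    # Placeholder for the script's real entry point.
    return 0


if __name__ == "__main__":
    sys.exit(run_quietly(example_main))
```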
