New issue
Advanced search Search tips

Issue 778389 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Sep 25
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug


Show other hotlists

Hotlists containing this issue:
Hotlist-Tast


Sign in to add a comment

tast should be more robust against DUT hangs

Project Member Reported by derat@chromium.org, Oct 25 2017

Issue description

Right now, the tast executable is dependent on the local_tests executable (running on the DUT) to enforce tests' timeouts. If local_tests itself is hanging (e.g. because the DUT goes down), tast will just wait forever. This is easy to repro by unplugging the DUT's Ethernet cable midway through the test.

This isn't a huge problem in TastVMTestStage, which enforces its own timeout on the tast executable, but it'd be nice to handle it better, both to provide better output from the test stage and to improve the experience for developers who are running tast directly.

The tast executable receives serialized testing.Test objects now ( issue 773819 ), so it probably makes sense to include the per-test timeouts in that struct and use them to determine the maximum time between control.TestStart and TestEnd messages (e.g. if a test has a one-minute timeout, something's probably broken if tast hasn't received a report that it's passed or failed by the ~1:10 mark).

tast would also need to check that RunStart and RunEnd messages are received in a timely fashion, and that local_tests doesn't hang between tests.

There possibly ought to be timeouts on other SSH-based communication like pushing the local_tests binary, getting the list of required data files, pushing data files, and receiving results.

I don't think this is an issue for remote tests, since the remote_tests binary runs on the same machine as tast, and SSH communication between it and the DUT is already subject to per-test timeouts.
 

Comment 1 by derat@chromium.org, Oct 25 2017

Status: Started (was: Assigned)
Project Member

Comment 2 by bugdroid1@chromium.org, Oct 28 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/tast-tests/+/79649c65a6a2d2dd6d2bc38a172f665aa4ad6a77

commit 79649c65a6a2d2dd6d2bc38a172f665aa4ad6a77
Author: Daniel Erat <derat@chromium.org>
Date: Sat Oct 28 07:11:31 2017

tast-tests: Handle runner.RunConfig timeout rename.

Update local_tests and remote_tests to handle
runner.RunConfig's TestTimeout field being renamed to
DefaultTestTimeout.

BUG= chromium:778389 
TEST=tests still pass
CQ-DEPEND=Ide796df20da960debc00524a7d9b5247cfb3edcb

Change-Id: I9d302e703889a6836e8e813f6727fd767d1422be
Reviewed-on: https://chromium-review.googlesource.com/738809
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Jason Clinton <jclinton@chromium.org>

[modify] https://crrev.com/79649c65a6a2d2dd6d2bc38a172f665aa4ad6a77/src/chromiumos/tast/remote/main.go
[modify] https://crrev.com/79649c65a6a2d2dd6d2bc38a172f665aa4ad6a77/src/chromiumos/tast/local/main.go

Project Member

Comment 3 by bugdroid1@chromium.org, Oct 28 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/tast/+/445353427d0847817b0896f462845e4fea981c35

commit 445353427d0847817b0896f462845e4fea981c35
Author: Daniel Erat <derat@chromium.org>
Date: Sat Oct 28 07:11:31 2017

tast: Add testing.Test.Timeout field.

Add a Timeout field to testing.Test that can be used to
specify a custom timeout for a test. This field will later
be used by the tast executable to give up on tests that have
far exceeded their timeouts (e.g. because the DUT is
hanging).

BUG= chromium:778389 
TEST=added unit tests
CQ-DEPEND=I9d302e703889a6836e8e813f6727fd767d1422be

Change-Id: Ide796df20da960debc00524a7d9b5247cfb3edcb
Reviewed-on: https://chromium-review.googlesource.com/738645
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Jason Clinton <jclinton@chromium.org>

[modify] https://crrev.com/445353427d0847817b0896f462845e4fea981c35/src/chromiumos/tast/common/runner/runner_test.go
[modify] https://crrev.com/445353427d0847817b0896f462845e4fea981c35/src/chromiumos/tast/common/testing/test.go
[modify] https://crrev.com/445353427d0847817b0896f462845e4fea981c35/src/chromiumos/tast/common/runner/runner.go

Project Member

Comment 4 by bugdroid1@chromium.org, Nov 7 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/tast/+/3d57dfedbefb9f4846a6bd29e6e31f3030fc5541

commit 3d57dfedbefb9f4846a6bd29e6e31f3030fc5541
Author: Daniel Erat <derat@chromium.org>
Date: Tue Nov 07 10:32:36 2017

tast: Make run.readTestOutput take Config struct.

Pass a Config struct to run's readTestOutput function
instead of passing the logger and results dir as individual
arguments. No functional changes, but I'm doing this so I
can pass an additional message timeout via the Config struct
in a later change.

BUG= chromium:778389 
TEST=unit tests pass; local and remote tests still work

Change-Id: I1589f40bebb8fb10f7708b52452cbb719c754d99
Reviewed-on: https://chromium-review.googlesource.com/754274
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Jason Clinton <jclinton@chromium.org>

[modify] https://crrev.com/3d57dfedbefb9f4846a6bd29e6e31f3030fc5541/src/chromiumos/tast/cmd/run/results_test.go
[modify] https://crrev.com/3d57dfedbefb9f4846a6bd29e6e31f3030fc5541/src/chromiumos/tast/cmd/run/local.go
[modify] https://crrev.com/3d57dfedbefb9f4846a6bd29e6e31f3030fc5541/src/chromiumos/tast/cmd/run/results.go
[modify] https://crrev.com/3d57dfedbefb9f4846a6bd29e6e31f3030fc5541/src/chromiumos/tast/cmd/run/remote.go

Project Member

Comment 5 by bugdroid1@chromium.org, Nov 8 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/tast/+/113ea40182f6db495835e4be695a62619262f7e9

commit 113ea40182f6db495835e4be695a62619262f7e9
Author: Daniel Erat <derat@chromium.org>
Date: Wed Nov 08 01:52:20 2017

tast: Enforce timeouts when reading control messages.

Make the tast executable enforce timeouts when reading
control messages sent by the local_tests (or remote_tests,
although it runs on the same machine) executable.

By default, a one-minute timeout is used, but this is
lengthened based on per-test timeouts while tests are
running.

BUG= chromium:778389 
TEST=added unit tests; also pulled the plug on a device
     mid-test and checked that the tast command timed out
     after two minutes

Change-Id: Ia60c06d408692e3d5f9bfabec7059002d581fdfd
Reviewed-on: https://chromium-review.googlesource.com/757163
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Jason Clinton <jclinton@chromium.org>

[modify] https://crrev.com/113ea40182f6db495835e4be695a62619262f7e9/src/chromiumos/tast/cmd/run/results_test.go
[modify] https://crrev.com/113ea40182f6db495835e4be695a62619262f7e9/src/chromiumos/tast/cmd/run/results.go
[modify] https://crrev.com/113ea40182f6db495835e4be695a62619262f7e9/src/chromiumos/tast/cmd/run/config.go

Project Member

Comment 6 by bugdroid1@chromium.org, Nov 15 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/tast/+/847704e41648fe500f1c874c99c7401d001fffe1

commit 847704e41648fe500f1c874c99c7401d001fffe1
Author: Daniel Erat <derat@chromium.org>
Date: Wed Nov 15 00:03:22 2017

tast: Validate control message ordering.

Make the tast command validate the ordering of control
messages that it receives from the test executable.

BUG= chromium:778389 
TEST=added unit tests

Change-Id: Ie225738896b0784d8803f742feb7fe8c0d9f918f
Reviewed-on: https://chromium-review.googlesource.com/769107
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Jason Clinton <jclinton@chromium.org>

[modify] https://crrev.com/847704e41648fe500f1c874c99c7401d001fffe1/src/chromiumos/tast/cmd/run/results_test.go
[modify] https://crrev.com/847704e41648fe500f1c874c99c7401d001fffe1/src/chromiumos/tast/cmd/run/results.go

Comment 7 by ovanieva@google.com, Jan 19 2018

Labels: Build-Toolchain

Comment 8 by derat@chromium.org, Jan 19 2018

Cc: ovanieva@chromium.org
Nope, this is unrelated to the toolchain.

This will get the Tests>Tast component once someone creates it (see  issue 801909 ).
Labels: -Build-Toolchain

Comment 10 by derat@chromium.org, Jan 23 2018

Components: Tests>Tast
Project Member

Comment 11 by bugdroid1@chromium.org, Jul 11

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/tast/+/267277d689aeda7d35fdfff2050d26cc9ff79e5f

commit 267277d689aeda7d35fdfff2050d26cc9ff79e5f
Author: Daniel Erat <derat@chromium.org>
Date: Wed Jul 11 19:13:28 2018

tast: Make test SSH server threadsafe and improve errors.

I'm cleaning up this class a bit in anticipation of making
it support simulating server hangs in a later change.

BUG= chromium:778389 
TEST=unit tests pass

Change-Id: I1f198523c6528b1d2ff38015ae5d656fb13ab7e4
Reviewed-on: https://chromium-review.googlesource.com/1132349
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Jason Clinton <jclinton@chromium.org>

[modify] https://crrev.com/267277d689aeda7d35fdfff2050d26cc9ff79e5f/src/chromiumos/tast/host/test/ssh_server.go

Project Member

Comment 12 by bugdroid1@chromium.org, Jul 12

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/tast/+/4a3ecf33066c480a1bdc5ab593b7baa0f60a5dc3

commit 4a3ecf33066c480a1bdc5ab593b7baa0f60a5dc3
Author: Daniel Erat <derat@chromium.org>
Date: Thu Jul 12 09:33:25 2018

tast: Make host package honor context deadlines.

Update the SSH type's Start, PutTree, and GetFile methods
and SSHCommandHandle's Close and Wait methods to honor
context deadlines.

This makes it easier to guard against DUTs that hang
mid-test; for instance, the tast process can detect the
issue and exit with a friendlier error message instead of
hanging indefinitely.

BUG= chromium:778389 
TEST=added unit tests

Change-Id: Icd4656fffa31e80f43c8a34c95ac742a6964a120
Reviewed-on: https://chromium-review.googlesource.com/1134440
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Jason Clinton <jclinton@chromium.org>

[modify] https://crrev.com/4a3ecf33066c480a1bdc5ab593b7baa0f60a5dc3/src/chromiumos/tast/host/ssh_test.go
[modify] https://crrev.com/4a3ecf33066c480a1bdc5ab593b7baa0f60a5dc3/src/chromiumos/tast/host/test/ssh_server.go
[modify] https://crrev.com/4a3ecf33066c480a1bdc5ab593b7baa0f60a5dc3/src/chromiumos/tast/host/ssh.go

Project Member

Comment 13 by bugdroid1@chromium.org, Jul 13

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/tast/+/875a8d83245141f6445e09e6dac7bf11bd116256

commit 875a8d83245141f6445e09e6dac7bf11bd116256
Author: Daniel Erat <derat@chromium.org>
Date: Fri Jul 13 09:22:24 2018

tast: Add -timeout flag to run subcommand.

Add a -timeout flag to the tast executable's "run"
subcommand that can be used to limit the total running time.
The tast Autotest test can set this to make it more likely
that the tast process will detect an impending timeout and
have time to write its results to disk before it's killed.

Also watch the context's deadline while reading control
messages.

Finally, check the SSH connection at completion and log an
error if it's no longer active.

BUG= chromium:778389 
TEST=added and updated unit tests; also manually tested with
     various timeouts

Change-Id: I3fe99324bcc6ed41b19c83d6ef23b71b712ca308
Reviewed-on: https://chromium-review.googlesource.com/1135445
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Jason Clinton <jclinton@chromium.org>

[modify] https://crrev.com/875a8d83245141f6445e09e6dac7bf11bd116256/src/chromiumos/cmd/tast/run_cmd_test.go
[modify] https://crrev.com/875a8d83245141f6445e09e6dac7bf11bd116256/src/chromiumos/cmd/tast/run/results_test.go
[modify] https://crrev.com/875a8d83245141f6445e09e6dac7bf11bd116256/src/chromiumos/cmd/tast/run_wrapper_test.go
[modify] https://crrev.com/875a8d83245141f6445e09e6dac7bf11bd116256/src/chromiumos/cmd/tast/run_wrapper.go
[modify] https://crrev.com/875a8d83245141f6445e09e6dac7bf11bd116256/src/chromiumos/cmd/tast/run_cmd.go
[modify] https://crrev.com/875a8d83245141f6445e09e6dac7bf11bd116256/src/chromiumos/cmd/tast/run/local.go
[modify] https://crrev.com/875a8d83245141f6445e09e6dac7bf11bd116256/src/chromiumos/cmd/tast/run/results.go

Project Member

Comment 14 by bugdroid1@chromium.org, Jul 15

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/7be2ee643b1d912d01f20c42ebcb141eacc614db

commit 7be2ee643b1d912d01f20c42ebcb141eacc614db
Author: Daniel Erat <derat@chromium.org>
Date: Sun Jul 15 01:11:45 2018

autotest: Update tast.py to pass -timeout flag.

Make the "tast" server test tell the "tast run" process how
long it has to run. This makes it possible for the process
to reserve time to write full results and collect system
information from the DUT before it gets killed.

BUG= chromium:778389 
TEST=ran tast.must_pass and checked that expected timeout
     is passed to "tast run" command
CQ-DEPEND=CL:1135445

Change-Id: I2c23daddc91fc86320c4c5962d906602d2ca977b
Reviewed-on: https://chromium-review.googlesource.com/1135835
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>

[modify] https://crrev.com/7be2ee643b1d912d01f20c42ebcb141eacc614db/server/site_tests/tast/testdata/fake_tast.py
[modify] https://crrev.com/7be2ee643b1d912d01f20c42ebcb141eacc614db/server/site_tests/tast/tast.py

Project Member

Comment 15 by bugdroid1@chromium.org, Jul 25

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/e7431ef6fcc67cd95e57f29f4c8ea65f780d4045

commit e7431ef6fcc67cd95e57f29f4c8ea65f780d4045
Author: Daniel Erat <derat@chromium.org>
Date: Wed Jul 25 07:14:57 2018

autotest: Improve tast.py error reporting for failed run.

Update the tast.py server test to report the tast process's
last line of output rather than an unhelpful "Results file
... not found" message when the tast process fails before
starting any tests.

BUG= chromium:778389 
TEST=added unit test

Change-Id: I3dd0f87e0857389fa7b8a9427bca26738cf02c42
Reviewed-on: https://chromium-review.googlesource.com/1146322
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>

[modify] https://crrev.com/e7431ef6fcc67cd95e57f29f4c8ea65f780d4045/server/site_tests/tast/tast_unittest.py
[modify] https://crrev.com/e7431ef6fcc67cd95e57f29f4c8ea65f780d4045/server/site_tests/tast/tast.py

This code in the runLocalRunner function in cmd/tast/run/local.go still has an issue where it can hang if network connectivity to the DUT is lost:

487         results, rerr = readTestOutput(ctx, cfg, handle.Stdout(), crf)
488     }
489 
490     // Check that the runner exits successfully first so that we don't give a useless error
491     // about incorrectly-formed output instead of e.g. an error about the runner being missing.
492     if err := handle.Wait(ctx); err != nil {
493         return results, stderrReader.appendToError(err, stderrTimeout)
494     }

readTestOutput pays attention to test timeouts and returns an error if the test runner fails to send it a control message before a test's deadline is reached. However, the Wait call (which is attempting to produce a better error message) will then hang indefinitely if ctx doesn't have a deadline (which is the case in manual runs, but not when running under tast.py).

I was previously going to add a short timeout (~5 seconds) for the Wait call but incorrectly convinced myself that it wasn't necessary. I should probably add it back to handle this.
From the banon-release run at http://stainless/browse/chromeos-autotest-results/221204724-chromeos-test/, error reporting could also be improved when tast.py's initial "tast list" execution fails due to SSH connection issues. This failure in debug/autoserv.INFO:

...
07/27 07:44:26.281 INFO |             utils:0286| [stdout] tast version tast-cmd-0.0.1-r130
07/27 07:44:26.282 INFO |              tast:0247| Getting list of tests that will be run
07/27 07:44:26.282 INFO |              tast:0221| Running '/usr/local/tast/tast' -verbose=true -logtime=false list -build=false '-remotebundledir=/usr/local/tast/bundles/remote' -remotedatadir= '-remoterunner=/usr/local/tast/remote_test_runner' -json=true 'chromeos6-row1-rack18-host13:22' '(!informational && !disabled && ("dep:chrome" || "dep:chrome_login") && !"dep:android")'
07/27 07:44:36.322 ERROR|             utils:0286| [stderr] 2018/07/27 07:44:26 Connecting to chromeos6-row1-rack18-host13:22
07/27 07:44:36.323 ERROR|             utils:0286| [stderr] 2018/07/27 07:44:36 Failed to connect to chromeos6-row1-rack18-host13:22: dial tcp 100.115.130.62:22: i/o timeout
...

was reported like this in the TKO database:

Failed to run tast (last line: ): Command <'/usr/local/tast/tast' -verbose=true -logtime=false list -build=false '-remotebundledir=/usr/local/tast/bundles/remote' -remotedatadir= '-remoterunner=/usr/local/tast/remote_test_runner' -json=true 'chromeos6-row1-rack18-host13:22' '(!informational && !disabled && ("dep:chrome" || "dep:chrome_login") && !"dep:android")'> failed, rc=1, Command returned non-zero exit status

It'd be better to pin the blame on the DUT here instead of giving a generic "Failed to run tast" error.
Project Member

Comment 18 by bugdroid1@chromium.org, Jul 31

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/tast/+/cffedf4839f4dd4f262a0f6e3184c923c7d238eb

commit cffedf4839f4dd4f262a0f6e3184c923c7d238eb
Author: Daniel Erat <derat@chromium.org>
Date: Tue Jul 31 23:36:58 2018

tast: Remove 'lg' logging.Logger global.

Remove the tast command's global 'lg' logging.Logger
variable. Instead, attach a logger to the context that is
passed to subcommands using added NewContext and FromContext
functions in the logging package.

The logger unfortunately can't be easily assigned directly
to fields in subcommand structs, as subcommands must be
created before parsing flags, while the logger can only be
created after doing so.

I'm doing this to make it possible for subcommand tests to
verify log messages in a later change. This change also adds
a logging.NewDiscard convenience function for tests that
don't care about logging.

BUG= chromium:778389 
TEST=unit tests pass and tast command logs output as before

Change-Id: If9ea532755711d230e18d1019974ee5fce9e2c3a
Reviewed-on: https://chromium-review.googlesource.com/1154315
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Shuhei Takahashi <nya@chromium.org>

[modify] https://crrev.com/cffedf4839f4dd4f262a0f6e3184c923c7d238eb/src/chromiumos/cmd/tast/run_cmd_test.go
[modify] https://crrev.com/cffedf4839f4dd4f262a0f6e3184c923c7d238eb/src/chromiumos/cmd/tast/logging/simple.go
[modify] https://crrev.com/cffedf4839f4dd4f262a0f6e3184c923c7d238eb/src/chromiumos/cmd/tast/main.go
[modify] https://crrev.com/cffedf4839f4dd4f262a0f6e3184c923c7d238eb/src/chromiumos/cmd/tast/run_cmd.go
[modify] https://crrev.com/cffedf4839f4dd4f262a0f6e3184c923c7d238eb/src/chromiumos/cmd/tast/build_cmd.go
[modify] https://crrev.com/cffedf4839f4dd4f262a0f6e3184c923c7d238eb/src/chromiumos/cmd/tast/list_cmd.go
[modify] https://crrev.com/cffedf4839f4dd4f262a0f6e3184c923c7d238eb/src/chromiumos/cmd/tast/logging/logger.go
[modify] https://crrev.com/cffedf4839f4dd4f262a0f6e3184c923c7d238eb/src/chromiumos/cmd/tast/list_cmd_test.go
[modify] https://crrev.com/cffedf4839f4dd4f262a0f6e3184c923c7d238eb/src/chromiumos/cmd/tast/symbolize_cmd.go

Project Member

Comment 19 by bugdroid1@chromium.org, Jul 31

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/tast/+/c94d1be1ee30e90ec6be0e9ed6ba9a6e4b4a31e6

commit c94d1be1ee30e90ec6be0e9ed6ba9a6e4b4a31e6
Author: Daniel Erat <derat@chromium.org>
Date: Tue Jul 31 23:36:57 2018

tast: Update test.SSHServer to expose delays for fake cmds.

Update the SSHServer type's FakeCmd method to take an
exported CmdResult struct that contains optional StartDelay
and EndDelay fields. These are equivalent to the delays that
can be set via the existing ExecDelays method, but allow
delays to be configured for individual commands. The delays
configurable by ExecDelays are updated to actually-executed
commands.

BUG= chromium:778389 
TEST=unit tests still pass

Change-Id: I861b1d4e97c18bd7126a2f7f7b9dcb4d9166879c
Reviewed-on: https://chromium-review.googlesource.com/1154492
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Shuhei Takahashi <nya@chromium.org>

[modify] https://crrev.com/c94d1be1ee30e90ec6be0e9ed6ba9a6e4b4a31e6/src/chromiumos/tast/bundle/remote_test.go
[modify] https://crrev.com/c94d1be1ee30e90ec6be0e9ed6ba9a6e4b4a31e6/src/chromiumos/tast/host/test/ssh_server.go
[modify] https://crrev.com/c94d1be1ee30e90ec6be0e9ed6ba9a6e4b4a31e6/src/chromiumos/cmd/tast/run/local_test.go

Project Member

Comment 20 by bugdroid1@chromium.org, Jul 31

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/tast/+/f8fa4d43812ee1fb3c8bad1bb93133867a76430a

commit f8fa4d43812ee1fb3c8bad1bb93133867a76430a
Author: Daniel Erat <derat@chromium.org>
Date: Tue Jul 31 23:36:58 2018

tast: Add timeout for waiting for local_test_runner.

Make "tast run" time out if local_test_runner hasn't exited
after 10 seconds. The tast process already times out while
reading control messages, but this additional timeout is
necessary to ensure that it doesn't hang later while waiting
for local_test_runner to exit.

BUG= chromium:778389 
TEST=added unit test; also manually tested by sending
     SIGSTOP to local_test_runner on the DUT mid-test and
     verifying that "tast run" eventually times out

Change-Id: Ibd40413bca6a1322366760a061038cc548e6889e
Reviewed-on: https://chromium-review.googlesource.com/1154493
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Shuhei Takahashi <nya@chromium.org>

[modify] https://crrev.com/f8fa4d43812ee1fb3c8bad1bb93133867a76430a/src/chromiumos/cmd/tast/run/config.go
[modify] https://crrev.com/f8fa4d43812ee1fb3c8bad1bb93133867a76430a/src/chromiumos/cmd/tast/run/local.go
[modify] https://crrev.com/f8fa4d43812ee1fb3c8bad1bb93133867a76430a/src/chromiumos/cmd/tast/run/local_test.go

Project Member

Comment 21 by bugdroid1@chromium.org, Jul 31

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/tast/+/383be95bd893e0ee2b124ce102f23c35659e316f

commit 383be95bd893e0ee2b124ce102f23c35659e316f
Author: Daniel Erat <derat@chromium.org>
Date: Tue Jul 31 23:36:59 2018

tast: Improve error reporting from "tast run" command.

Update the "tast run" command to always print the main error
message as its last line of output. This makes it possible
for the tast.py Autotest server test to display more-useful
error messages.

BUG= chromium:778389 
TEST=added unit tests; also manually verified by making
     local_test_runner time out via SIGSTOP and checking
     that the timeout error is included in the last line of
     output

Change-Id: I4ef3a4787587d41620989616f969361bc58dc753
Reviewed-on: https://chromium-review.googlesource.com/1154494
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Shuhei Takahashi <nya@chromium.org>

[modify] https://crrev.com/383be95bd893e0ee2b124ce102f23c35659e316f/src/chromiumos/cmd/tast/run_cmd_test.go
[modify] https://crrev.com/383be95bd893e0ee2b124ce102f23c35659e316f/src/chromiumos/cmd/tast/run/remote.go
[modify] https://crrev.com/383be95bd893e0ee2b124ce102f23c35659e316f/src/chromiumos/cmd/tast/run_wrapper_test.go
[modify] https://crrev.com/383be95bd893e0ee2b124ce102f23c35659e316f/src/chromiumos/cmd/tast/run_wrapper.go
[modify] https://crrev.com/383be95bd893e0ee2b124ce102f23c35659e316f/src/chromiumos/cmd/tast/run_cmd.go
[modify] https://crrev.com/383be95bd893e0ee2b124ce102f23c35659e316f/src/chromiumos/cmd/tast/run/local.go
[modify] https://crrev.com/383be95bd893e0ee2b124ce102f23c35659e316f/src/chromiumos/cmd/tast/run/local_test.go
[modify] https://crrev.com/383be95bd893e0ee2b124ce102f23c35659e316f/src/chromiumos/cmd/tast/list_cmd.go
[modify] https://crrev.com/383be95bd893e0ee2b124ce102f23c35659e316f/src/chromiumos/cmd/tast/run/remote_test.go
[modify] https://crrev.com/383be95bd893e0ee2b124ce102f23c35659e316f/src/chromiumos/cmd/tast/run/run.go

Project Member

Comment 22 by bugdroid1@chromium.org, Aug 14

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/dd34fda301bf9ad82fe90ddef178f5e5d6d9b778

commit dd34fda301bf9ad82fe90ddef178f5e5d6d9b778
Author: Daniel Erat <derat@chromium.org>
Date: Tue Aug 14 00:48:59 2018

autotest: Improve tast.py error logging.

Make the tast.py server test log the tast process's first
line of stderr on failure if stdout is empty. The "list"
subcommand logs to stderr instead of stdout (since it writes
tests to stdout), so this produces a better failure message
if the DUT is initially unreachable.

BUG= chromium:778389 
TEST=updated tests

Change-Id: I4f7244191303d2405c39ce76902b820d7306739b
Reviewed-on: https://chromium-review.googlesource.com/1172058
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Shuhei Takahashi <nya@chromium.org>

[modify] https://crrev.com/dd34fda301bf9ad82fe90ddef178f5e5d6d9b778/server/site_tests/tast/tast_unittest.py
[modify] https://crrev.com/dd34fda301bf9ad82fe90ddef178f5e5d6d9b778/server/site_tests/tast/tast.py

Status: Fixed (was: Started)

Sign in to add a comment