Issue 677307


Issue metadata

Status: Verified
Owner:
Closed: Jan 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug

difficult to explain timeouts for tests on chromeos4-row11-rack7-host1

Reported by semenzato@chromium.org (Project Member), Dec 28 2016

Issue description

reks-release build 706 failed with a timeout.  The AFE reports that all tests succeeded except for logging_CrashSender, which was aborted.

https://uberchromegw.corp.google.com/i/chromeos/builders/reks-release/builds/706

See autoserv.DEBUG here:

https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/93088570-chromeos-test/hostless/debug/

Tests were scheduled at 5:35 and ran between 5:39 and 6:23.  The logging_CrashSender test is shown as scheduled, but there is no record of it executing.  The AFE points to an empty folder:

https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/93088723-chromeos-test
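
(Illustrative only, not from the report: one way to double-check that the folder really is empty, assuming the Pantheon path above maps to the GCS bucket chromeos-autotest-results and that google-cloud-storage credentials are available.)

  # Sketch: count objects under the results prefix for job 93088723.
  # Assumes the Pantheon URL corresponds to the GCS bucket "chromeos-autotest-results".
  from google.cloud import storage
  blobs = storage.Client().list_blobs("chromeos-autotest-results",
                                      prefix="93088723-chromeos-test/")
  print(sum(1 for _ in blobs))  # 0 would match the empty folder the AFE points to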

The hostless status.log ends with this:

INFO	----	----	Job aborted by autotest_system on 2016-12-28 08:38:37

The AFE suggests that the test itself has a timeout of 180 minutes.  Isn't that too long a timeout?  Does it include scheduling?
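
A rough sanity check on that figure (my own arithmetic, assuming the 5:35 scheduling time and the status.log abort time are in the same timezone): the abort lands roughly 183 minutes after scheduling, which would fit a 180-minute timeout that starts counting at scheduling time.

  # Back-of-the-envelope check, not taken from the logs themselves.
  from datetime import datetime
  scheduled = datetime(2016, 12, 28, 5, 35, 12)  # scheduling time (see the stdio below)
  aborted = datetime(2016, 12, 28, 8, 38, 37)    # "Job aborted by autotest_system"
  print((aborted - scheduled).total_seconds() / 60)  # ~183.4 min, close to --timeout_mins 180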

Also, the stdio for bvt-inline suggests that the build was aborted at 7:59.  Why is the last timestamp in status.log 39 minutes later?

https://uberchromegw.corp.google.com/i/chromeos/builders/reks-release/builds/706/steps/HWTest%20%5Bbvt-inline%5D/logs/stdio

05:35:12: INFO: RunCommand: /b/cbuild/internal_master/chromite/third_party/swarming.client/swarming.py run --swarming chromeos-proxy.appspot.com --task-summary-json /tmp/cbuildbot-tmp0AQ0CJ/tmpIy2YOi/temp_summary.json --raw-cmd --task-name reks-release/R57-9129.0.0-bvt-inline --dimension os Ubuntu-14.04 --dimension pool default --print-status-updates --timeout 14400 --io-timeout 14400 --hard-timeout 14400 --expiration 1200 '--tags=priority:Build' '--tags=suite:bvt-inline' '--tags=build:reks-release/R57-9129.0.0' '--tags=task_name:reks-release/R57-9129.0.0-bvt-inline' '--tags=board:reks' -- /usr/local/autotest/site_utils/run_suite.py --build reks-release/R57-9129.0.0 --board reks --suite_name bvt-inline --pool bvt --num 6 --file_bugs True --priority Build --timeout_mins 180 --retry True --max_retries 10 --minimum_duts 4 --suite_min_duts 6 --offload_failures_only False -m 93088570
07:59:13: WARNING: Killing tasks: [<_BackgroundTask(_BackgroundTask-7:7:4, started)>]
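
For reference, my reading of those flags (not stated in the logs): the swarming --timeout, --io-timeout and --hard-timeout values appear to be in seconds, so 14400 is 240 minutes, while run_suite.py itself gets --timeout_mins 180.  The gap between the 07:59:13 kill and the last status.log timestamp works out to just over 39 minutes:

  # Unit check only; timestamps copied from the stdio and status.log quoted above.
  from datetime import datetime
  print(14400 / 60)  # 240.0 -- swarming hard/io timeout in minutes, vs. --timeout_mins 180
  killed = datetime(2016, 12, 28, 7, 59, 13)    # "Killing tasks" in the stdio
  last_log = datetime(2016, 12, 28, 8, 38, 37)  # last status.log entry
  print((last_log - killed).total_seconds() / 60)  # ~39.4 min -- the gap asked about above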

According to the AFE, the test was supposed to run on chromeos4-row11-rack7-host1.  I logged onto this DUT and it's been up for a while:

 11:07:42 up 1 day,  2:48,  0 users,  load average: 0.00, 0.01, 0.05

So what is going on?

So many questions!

Summary: difficult to explain timeouts for tests on chromeos4-row11-rack7-host1 (was: difficult to explain timeout in logging_CrashSender)
I took a look at build 705, which failed similarly.  This time the AFE shows that the aborted test is login_OwnershipRetaken, but the DUT is the same: chromeos4-row11-rack7-host1.

Same thing with build 704: same DUT, but the test this time is platform_DMVerityCorruption.

It would seem that no test on this DUT ends well.  Yet the DUT looks fine.

I'll swap that DUT out to the suites pool; the last couple of tests have resulted in aborts on this DUT.
Owner: nxia@chromium.org
+ nxia - please file a bug to have the lab team investigate this DUT

Comment 4 by nxia@chromium.org, Jan 5 2017

fix ticket filed at b/34093381

Comment 5 by nxia@chromium.org, Jan 9 2017

Status: Fixed (was: Untriaged)
reks-release has been working well since then, and a fix ticket has been filed for the DUT. Closing the bug.

Comment 6 by dchan@google.com, Mar 4 2017

Labels: VerifyIn-58

Comment 7 by dchan@google.com, Apr 17 2017

Labels: VerifyIn-59

Comment 8 by dchan@google.com, May 30 2017

Labels: VerifyIn-60
