Issue 835063

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Jun 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



skylab_swarming_worker: fail a task if a test within the task fails

Project Member Reported by pprabhu@chromium.org, Apr 20 2018

Issue description

Currently, skylab_swarming_worker only fails a task if lucifer returns a non-zero exit code.

We want to fail the task in two further situations:
 - autoserv exits with a non-zero exit code.
 - A test fails.
 
Lucifer isn't correctly sending "TestFailed test" when tests in a job fail, e.g.: https://chrome-swarming.appspot.com/task?id=3cf9468f867eb610&refresh=10&show_raw=1&wide_logs=true
autoserv's pidfile only contains the number of failed tests if autoserv is launched with --parse_job <tag>, and lucifer doesn't launch it that way.

So, this number will always be 0.

We must get the value from tko/parse instead.
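
To illustrate why the count from the pidfile can't be trusted, here is a minimal sketch of reading one. It assumes the pidfile is three whitespace-separated integers (PID, exit status, failed-test count); that layout is an assumption, not something stated in this bug, and `PidfileContents` and `parse_pidfile` are hypothetical names.

```python
from typing import NamedTuple, Optional


class PidfileContents(NamedTuple):
    """Hypothetical container for the fields of an autoserv pidfile."""
    pid: int
    exit_status: Optional[int]
    tests_failed: Optional[int]


def parse_pidfile(text: str) -> PidfileContents:
    """Parse a pidfile, assuming the layout: PID, exit status, failed count.

    Fields that have not been written yet come back as None.
    """
    fields = text.split()
    if not fields:
        raise ValueError("empty pidfile")
    pid = int(fields[0])
    exit_status = int(fields[1]) if len(fields) > 1 else None
    tests_failed = int(fields[2]) if len(fields) > 2 else None
    return PidfileContents(pid, exit_status, tests_failed)
```

Under that assumption, a finished run reports a count of 0 whether or not tests failed, so a caller reading `tests_failed` from here cannot tell a clean run from a failing one.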

Comment 3 by bugdroid1@chromium.org, Apr 24 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/infra/lucifer/+/67da309c0d00cf24e644b2f36edf29be5b12ca05

commit 67da309c0d00cf24e644b2f36edf29be5b12ca05
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Tue Apr 24 01:17:48 2018

lucifer_run_job: Forward -task-name to autoserv

CL:1015231 broke -task-name flag of lucifer_run_job. This name is
simply forwarded to autoserv execution for the test, so save it inside
autotest args directly.

BUG= chromium:835063 
TEST=make check; manual on skylab-drone

Change-Id: Ia36234352836f4676713dba3a710ed8cdf53fe34
Reviewed-on: https://chromium-review.googlesource.com/1025018
Commit-Ready: Prathmesh Prabhu <pprabhu@chromium.org>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>

[modify] https://crrev.com/67da309c0d00cf24e644b2f36edf29be5b12ca05/src/lucifer/cmd/lucifer_run_job/flags.go
[modify] https://crrev.com/67da309c0d00cf24e644b2f36edf29be5b12ca05/src/lucifer/cmd/lucifer_run_job/types.go

Comment 4 by bugdroid1@chromium.org, Apr 24 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/infra/lucifer/+/ee73489f51b6213f536579c4bf01c4c5a2389600

commit ee73489f51b6213f536579c4bf01c4c5a2389600
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Tue Apr 24 01:17:48 2018

lucifer_run_job: Extract sendHostStatusEvents

BUG= chromium:835063 
TEST=make check

Change-Id: I4eb4823b57984225abeb9c147ae6c6a6a72aa5ce
Reviewed-on: https://chromium-review.googlesource.com/1025036
Commit-Ready: Prathmesh Prabhu <pprabhu@chromium.org>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>

[modify] https://crrev.com/ee73489f51b6213f536579c4bf01c4c5a2389600/src/lucifer/cmd/lucifer_run_job/main.go

Comment 5 by bugdroid1@chromium.org, Apr 24 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/b1241d1f422e720487f81cafcac5ff2d8f6478ab

commit b1241d1f422e720487f81cafcac5ff2d8f6478ab
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Tue Apr 24 23:38:30 2018

autoserv: Report number of failed tests in .parser_execute

BUG= chromium:835063 
TEST=manual, on skylab-drone.

Change-Id: I8a00ebde97432dd0685683b0002bf7d93a5f0bc3
Reviewed-on: https://chromium-review.googlesource.com/1020663
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Xixuan Wu <xixuan@chromium.org>

[modify] https://crrev.com/b1241d1f422e720487f81cafcac5ff2d8f6478ab/tko/parse.py

Status: WontFix (was: Started)
run_skylab_suite is no longer going to depend on swarming task result, instead using TKO (same as run_suite).
So this is not urgent, and it's probably better to keep the AutotestInfra behaviour of only failing tasks when an infra error happens.

Punt for now.

Cc: xixuan@chromium.org
Status: Assigned (was: WontFix)
Looks like xixuan@'s skylab_run_suite depends on the Swarming task failing in this scenario, so we need to do this after all.

Owner: ayatane@chromium.org
Allen will do this this week.

Re #2: actually, the flag that controls it is --execution-tag or -P.

We DO pass -P from Lucifer to autoserv, or at least we should if the execution tag is passed to Lucifer, which we currently don't do from Swarming. I'm going to pick the best way to fix this.
results dir = /usr/local/autotest/results/<execution tag>

execution tag = id-owner/execution_subdir
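
As a rough sketch of the two definitions above: the helper names `execution_tag` and `results_dir` are hypothetical, and the sample values in the usage note are made up for illustration.

```python
import posixpath

# Results root as given in the comment above.
RESULTS_ROOT = "/usr/local/autotest/results"


def execution_tag(job_id: int, owner: str, execution_subdir: str) -> str:
    """Build a tag of the form id-owner/execution_subdir."""
    return f"{job_id}-{owner}/{execution_subdir}"


def results_dir(tag: str) -> str:
    """Return the results directory for an execution tag."""
    return posixpath.join(RESULTS_ROOT, tag)
```

For example, `results_dir(execution_tag(42, "someuser", "host1"))` gives `/usr/local/autotest/results/42-someuser/host1`.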

Status: Started (was: Assigned)

Issue 849830 has been merged into this issue.

Comment 14 by bugdroid1@chromium.org, Jun 18 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/infra/lucifer/+/271054d65cbd1ce17113c0885db3f361bb1050a6

commit 271054d65cbd1ce17113c0885db3f361bb1050a6
Author: Allen Li <ayatane@google.com>
Date: Mon Jun 18 21:40:28 2018

lucifer_run_job: Emit test failure events from tko/parse

Previously we were parsing the results from autoserv to get how many
tests failed.  autoserv does not actually provide this information; it
is always zero.  Thus, get it from tko/parse instead.

BUG= chromium:835063 
TEST=None

Change-Id: I40c86ba39aafd920aaba80fff583f7ac42e63fe4
Reviewed-on: https://chromium-review.googlesource.com/1098393
Commit-Ready: Allen Li <ayatane@chromium.org>
Tested-by: Allen Li <ayatane@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>

[modify] https://crrev.com/271054d65cbd1ce17113c0885db3f361bb1050a6/src/lucifer/cmd/lucifer_run_job/autotest.go
[modify] https://crrev.com/271054d65cbd1ce17113c0885db3f361bb1050a6/src/lucifer/cmd/lucifer_run_job/main.go

Status: Fixed (was: Started)

Comment 16 by bugdroid1@chromium.org, Jun 20 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/infra/lucifer/+/6a5f602ce235312120c6fba88f0dbd0213f294f4

commit 6a5f602ce235312120c6fba88f0dbd0213f294f4
Author: Allen Li <ayatane@google.com>
Date: Wed Jun 20 02:46:58 2018

skylab_swarming_worker: Exit non-zero if tests failed

BUG= chromium:835063 
TEST=None

Change-Id: Ifff1c8a0dd7fcb86704b7cc4fa3d4d1c807c6be3
Reviewed-on: https://chromium-review.googlesource.com/1107268
Commit-Ready: Allen Li <ayatane@chromium.org>
Tested-by: Allen Li <ayatane@chromium.org>
Reviewed-by: Xixuan Wu <xixuan@chromium.org>

[modify] https://crrev.com/6a5f602ce235312120c6fba88f0dbd0213f294f4/src/lucifer/cmd/skylab_swarming_worker/lucifer.go
[modify] https://crrev.com/6a5f602ce235312120c6fba88f0dbd0213f294f4/src/lucifer/cmd/skylab_swarming_worker/main.go
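
For readers following along, the rule requested in the issue description can be sketched as below. This is not the code from the commit above (that change is in Go and is not reproduced here); `task_exit_code` and its parameters are hypothetical, and the choice of 1 as the failure code is an assumption.

```python
def task_exit_code(lucifer_exit: int, autoserv_exit: int, tests_failed: int) -> int:
    """Decide the worker's exit code from the three failure signals.

    The task fails if lucifer exits non-zero, if autoserv exits non-zero,
    or if at least one test failed.
    """
    if lucifer_exit != 0:
        # Preserve lucifer's own exit code (the pre-existing behaviour).
        return lucifer_exit
    if autoserv_exit != 0 or tests_failed > 0:
        # Assumed failure code; the real value is not given in this bug.
        return 1
    return 0
```

With this rule, a run where lucifer and autoserv both exit 0 but three tests fail still exits non-zero, so the Swarming task is marked as failed.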
