Issue 860027

Issue metadata

Status: Fixed
Owner:
Closed: Jul 31
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: ----

Make run_suite_skylab re-runable

Project Member Reported by xixuan@chromium.org, Jul 3

Issue description

Buildbot has a silence check: if a command produces no output for a fixed period (longer than 1.5 hours), it fails immediately.

To avoid this, HWTestStage currently calls run_suite.py every 1.5 hours. run_suite_skylab does not support this, so every call to run_suite_skylab creates a new suite for the same build.
 
Project Member

Comment 1 by bugdroid1@chromium.org, Jul 11

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/f2da195fcbc92729cae890f821e632400927a97e

commit f2da195fcbc92729cae890f821e632400927a97e
Author: Xixuan Wu <xixuan@chromium.org>
Date: Wed Jul 11 20:20:52 2018

skylab_hwtest: Use --create_and_return to skip suite waiting

BUG= chromium:860027 
TEST=Ran locally.

Change-Id: Ifed630e792280ebb4c266699b62b8bc909695071
Reviewed-on: https://chromium-review.googlesource.com/1131831
Commit-Queue: Xixuan Wu <xixuan@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>

[modify] https://crrev.com/f2da195fcbc92729cae890f821e632400927a97e/venv/skylab_suite/cros_suite.py
[modify] https://crrev.com/f2da195fcbc92729cae890f821e632400927a97e/venv/skylab_suite/cmd/run_suite_skylab.py

Project Member

Comment 2 by bugdroid1@chromium.org, Jul 12

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/c7430719f181703c4266b2c765ac91c6785868c6

commit c7430719f181703c4266b2c765ac91c6785868c6
Author: Xixuan Wu <xixuan@chromium.org>
Date: Thu Jul 12 21:22:35 2018

autotest: Make run_suite_skylab accept a suite_id to monitor the suite

The builder has a silence check for each command it triggers. If a command
doesn't return any response within a fixed period, the builder fails.

run_suite_skylab is one of the commands triggered by buildbot (the builder).
However, run_suite_skylab can easily hang for more than 1.5 hours, since it
keeps waiting for a suite and its child tasks to finish.

To avoid that, the builder first kicks off run_suite_skylab
--create_and_return, which returns immediately after scheduling the child
tasks. The builder then keeps kicking off run_suite_skylab --suite_id to
resume the suite and monitor its progress. Since 'run_suite_skylab --suite_id'
is idempotent, it can be called several times without side effects.

BUG= chromium:860027 
TEST=Ran "bin/run_suite_skylab --build nyan_blaze-release/R69-10763.0.0
--board nyan_blaze --suite_name bvt-inline --pool cq --priority 70
--timeout_mins 90 --test_retry --max_retries 5 --parent_suite_id
3e1f46708ec5b211",
"bin/run_suite_skylab --build nyan_blaze-release/R69-10763.0.0 --board
reef --model reef --suite_name provision --pool cq --priority 70
--timeout_mins 90 --test_retry --max_retries 5 --suite_args
"{u'num_required': 1}" --parent_suite_id 3ea24dbf27043f11",
"bin/run_suite_skylab --build nyan_blaze-release/R69-10763.0.0 --board
nyan_blaze --suite_name bvt-inline --pool cq --priority 70
--timeout_mins 90 --test_retry --max_retries 5 --parent_suite_id
3e98e82272e3c211", locally.

Change-Id: I925f0b5b722757ae0fe52e377b302f80256a965e
Reviewed-on: https://chromium-review.googlesource.com/1132470
Commit-Queue: Xixuan Wu <xixuan@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>

[modify] https://crrev.com/c7430719f181703c4266b2c765ac91c6785868c6/venv/skylab_suite/suite_runner.py
[modify] https://crrev.com/c7430719f181703c4266b2c765ac91c6785868c6/venv/skylab_suite/suite_parser.py
[modify] https://crrev.com/c7430719f181703c4266b2c765ac91c6785868c6/venv/skylab_suite/cros_suite.py
[modify] https://crrev.com/c7430719f181703c4266b2c765ac91c6785868c6/venv/skylab_suite/cmd/run_suite_skylab.py
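
The flow described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the chromite or autotest code: run_suite_in_slices, create, and resume_once are hypothetical names, and both the way --create_and_return reports the suite ID and the way each resume call is bounded in time are assumptions, since the thread does not specify them.

```python
from typing import Callable, List, Optional

RUN_SUITE = "bin/run_suite_skylab"


def create_cmd(base_args: List[str]) -> List[str]:
    """Command that schedules the suite and returns without waiting."""
    return [RUN_SUITE] + base_args + ["--create_and_return"]


def resume_cmd(base_args: List[str], suite_id: str) -> List[str]:
    """Command that re-attaches to an existing suite instead of creating one."""
    return [RUN_SUITE] + base_args + ["--suite_id", suite_id]


def run_suite_in_slices(
    base_args: List[str],
    create: Callable[[List[str]], str],
    resume_once: Callable[[List[str]], Optional[int]],
    max_slices: int,
) -> Optional[int]:
    """Create the suite once, then resume it in bounded slices.

    `create` runs the command and returns the suite ID (assumption: the
    thread does not say how the ID is reported). `resume_once` runs one
    bounded wait and returns the exit code, or None if the slice ended
    before the suite finished (also an assumption).
    """
    suite_id = create(create_cmd(base_args))
    for _ in range(max_slices):
        code = resume_once(resume_cmd(base_args, suite_id))
        if code is not None:
            return code
    # The suite was still running after the last slice.
    return None
```

Each slice is a separate command from the builder's point of view, so no single command stays silent for the whole suite. Because resuming with --suite_id is described as idempotent, repeating the loop body does not schedule new child tasks.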

Project Member

Comment 3 by bugdroid1@chromium.org, Jul 13

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/58bbb64c2870d1819a4387067cfc421d627c984c

commit 58bbb64c2870d1819a4387067cfc421d627c984c
Author: Xixuan Wu <xixuan@chromium.org>
Date: Fri Jul 13 18:45:58 2018

autotest: Check tags instead of bot_id when resuming provision suite

When we resume a suite, a child task may still be pending. In that case,
the JSON info of a retrieved task looks like:
{
    "created_ts": "2018-07-12T21:07:01.156070",
    "current_task_slice": "1",
    "name": "dummy_Pass",
    "server_versions": [
      "3675-21ffa58"
    ],
    "state": "PENDING",
    "tags": [
      "build:nyan_blaze-paladin-tryjob/R69-10869.0.0-b2743183",
      "id:chromeos-skylab-bot-140e9f86-ffef-49ea-bb07-40494e0b0481",
      ...
    ],
    "task_id": "3ea9231703a70010",
    "user": "xixuan@google.com"
}

It doesn't include 'bot_id'; it only includes 'name' and 'tags'. This CL uses
tags to match the bot_id of each child task of a provision suite.

BUG= chromium:860027 
TEST=Ran "./bin/run_suite_skylab --build
nyan_blaze-paladin-tryjob/R69-10869.0.0-b2743183 --board nyan_blaze
--suite_name provision --pool suites --priority 50 --timeout_mins 180
--test_retry --max_retries 5 --suite_args "{u'num_required': 1}"
--suite_id 3ea922fe44959f11" locally.

Change-Id: Ibe49be87b47f621355256324b53cc032dd165197
Reviewed-on: https://chromium-review.googlesource.com/1135790
Reviewed-by: Xixuan Wu <xixuan@chromium.org>
Commit-Queue: Xixuan Wu <xixuan@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>

[modify] https://crrev.com/58bbb64c2870d1819a4387067cfc421d627c984c/venv/skylab_suite/suite_runner.py
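
A minimal sketch of reading a bot ID from a pending task's tags, assuming the ID is carried in the tag with the "id:" prefix as in the JSON above. bot_id_from_tags is a hypothetical helper for illustration, not the function changed in suite_runner.py.

```python
from typing import Iterable, Optional


def bot_id_from_tags(tags: Iterable[str], prefix: str = "id:") -> Optional[str]:
    """Return the value of the first tag starting with `prefix`, or None.

    Assumption: the "id:<bot>" tag names the bot the child task targets.
    """
    for tag in tags:
        if tag.startswith(prefix):
            return tag[len(prefix):]
    return None


# A trimmed copy of the pending task shown in the commit message.
pending_task = {
    "name": "dummy_Pass",
    "state": "PENDING",
    "tags": [
        "build:nyan_blaze-paladin-tryjob/R69-10869.0.0-b2743183",
        "id:chromeos-skylab-bot-140e9f86-ffef-49ea-bb07-40494e0b0481",
    ],
    "task_id": "3ea9231703a70010",
}

# 'bot_id' is absent while the task is pending, so fall back to the tags.
bot_id = pending_task.get("bot_id") or bot_id_from_tags(pending_task["tags"])
```

The fallback keeps working once the task is picked up and 'bot_id' appears, because the direct field is tried first.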

Project Member

Comment 4 by bugdroid1@chromium.org, Jul 17

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/a25c991e2a231f66cec037a27e9455dd8e40762d

commit a25c991e2a231f66cec037a27e9455dd8e40762d
Author: Xixuan Wu <xixuan@chromium.org>
Date: Tue Jul 17 19:00:32 2018

autotest: Skip result parsing if create_and_return is specified

Parsing the child tasks of the suite is unnecessary if create_and_return
is specified. This CL skips this step and returns OK immediately.

BUG= chromium:860027 
TEST=Run "bin/run_suite_skylab --pool=suites --board=nyan_blaze
--suite_name=sanity --build=nyan_blaze-release/R69-10763.0.0 --priority
65 --timeout_mins 30 --test_retry --max_retries 5 --create_and_return"
locally.

Change-Id: I8ef14e10906354ddc16782f41d5ff4da2fba96ff
Reviewed-on: https://chromium-review.googlesource.com/1137333
Reviewed-by: Xixuan Wu <xixuan@chromium.org>
Commit-Queue: Xixuan Wu <xixuan@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>

[modify] https://crrev.com/a25c991e2a231f66cec037a27e9455dd8e40762d/venv/skylab_suite/cmd/run_suite_skylab.py

Project Member

Comment 5 by bugdroid1@chromium.org, Jul 19

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/e46566856cc86d932a47601b5715ea4322afc771

commit e46566856cc86d932a47601b5715ea4322afc771
Author: Xixuan Wu <xixuan@chromium.org>
Date: Thu Jul 19 04:59:02 2018

SkylabSuite: Separate create and wait for buildbot silency check.

BUG= chromium:860027 
TEST=Ran tryjob.

Change-Id: Ia0dd3bc1a2794b79a5c1cd59040d368ed718b511
Reviewed-on: https://chromium-review.googlesource.com/1133970
Commit-Ready: Xixuan Wu <xixuan@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>

[modify] https://crrev.com/e46566856cc86d932a47601b5715ea4322afc771/cbuildbot/commands.py
[modify] https://crrev.com/e46566856cc86d932a47601b5715ea4322afc771/cbuildbot/commands_unittest.py

Project Member

Comment 6 by bugdroid1@chromium.org, Jul 20

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/52e396e586f325433a7e341936f1c0f04c5c0d04

commit 52e396e586f325433a7e341936f1c0f04c5c0d04
Author: Xixuan Wu <xixuan@chromium.org>
Date: Fri Jul 20 19:47:50 2018

autotest: change the default value of abort_limit.

If abort_limit is not specified, just abort whatever abort_suite_skylab
filters based on passed-in parameters.

BUG= chromium:860027 
TEST=None

Change-Id: Ifdd41feb84fc4c212a6f294195432797c48ab32b
Reviewed-on: https://chromium-review.googlesource.com/1144475
Tested-by: Xixuan Wu <xixuan@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>
Commit-Queue: Xixuan Wu <xixuan@chromium.org>

[modify] https://crrev.com/52e396e586f325433a7e341936f1c0f04c5c0d04/venv/skylab_suite/suite_parser.py

Project Member

Comment 7 by bugdroid1@chromium.org, Jul 21

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/7e5f806a635636124482e18a99f55ed31268d68d

commit 7e5f806a635636124482e18a99f55ed31268d68d
Author: Xixuan Wu <xixuan@chromium.org>
Date: Sat Jul 21 01:34:06 2018

cbuildbot: Remove the abort_limit setting.

Since SkylabHWTestStage may currently kick off multiple commands to wait
for a suite to finish, we cannot decide abort_limit. Remove this setting
and let abort_suite_skylab abort all found tasks.

BUG= chromium:860027 
TEST=None

Change-Id: I9f1cc95aecabdd743cb3d58e8ece765e5895f7c7
Reviewed-on: https://chromium-review.googlesource.com/1144473
Commit-Ready: Xixuan Wu <xixuan@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>

[modify] https://crrev.com/7e5f806a635636124482e18a99f55ed31268d68d/cbuildbot/commands.py

Status: Fixed (was: Assigned)