Make run_suite_skylab re-runnable
Issue description

Buildbot has a silence check: if a command generates no output for a fixed period (longer than 1.5 hours), it fails immediately. To avoid this, HWTestStage currently calls run_suite.py every 1.5 hours. run_suite_skylab doesn't support this, so every call of run_suite_skylab creates a new suite for the same build.
Jul 12
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/autotest/+/c7430719f181703c4266b2c765ac91c6785868c6

commit c7430719f181703c4266b2c765ac91c6785868c6
Author: Xixuan Wu <xixuan@chromium.org>
Date: Thu Jul 12 21:22:35 2018

autotest: Make run_suite_skylab accept a suite_id to monitor the suite

Builder has a silence check for each command it triggers. If the command doesn't return any response within a fixed period, the builder fails. run_suite_skylab is one of the commands triggered by buildbot (builder). However, it's easy for run_suite_skylab to hang for more than 1.5 hours, since it keeps waiting for a suite and its child tasks to finish.

To avoid that, builder will first kick off run_suite_skylab --create_and_return to make it return immediately after scheduling child tasks. Then builder keeps kicking off run_suite_skylab --suite_id to resume this suite and monitor its progress. Since 'run_suite_skylab --suite_id' is idempotent, it can be called several times without side effects.

BUG=chromium:860027
TEST=Ran
"bin/run_suite_skylab --build nyan_blaze-release/R69-10763.0.0 --board nyan_blaze --suite_name bvt-inline --pool cq --priority 70 --timeout_mins 90 --test_retry --max_retries 5 --parent_suite_id 3e1f46708ec5b211",
"bin/run_suite_skylab --build nyan_blaze-release/R69-10763.0.0 --board reef --model reef --suite_name provision --pool cq --priority 70 --timeout_mins 90 --test_retry --max_retries 5 --suite_args "{u'num_required': 1}" --parent_suite_id 3ea24dbf27043f11",
"bin/run_suite_skylab --build nyan_blaze-release/R69-10763.0.0 --board nyan_blaze --suite_name bvt-inline --pool cq --priority 70 --timeout_mins 90 --test_retry --max_retries 5 --parent_suite_id 3e98e82272e3c211",
locally.

Change-Id: I925f0b5b722757ae0fe52e377b302f80256a965e
Reviewed-on: https://chromium-review.googlesource.com/1132470
Commit-Queue: Xixuan Wu <xixuan@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>

[modify] https://crrev.com/c7430719f181703c4266b2c765ac91c6785868c6/venv/skylab_suite/suite_runner.py
[modify] https://crrev.com/c7430719f181703c4266b2c765ac91c6785868c6/venv/skylab_suite/suite_parser.py
[modify] https://crrev.com/c7430719f181703c4266b2c765ac91c6785868c6/venv/skylab_suite/cros_suite.py
[modify] https://crrev.com/c7430719f181703c4266b2c765ac91c6785868c6/venv/skylab_suite/cmd/run_suite_skylab.py
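The create-then-resume flow described in this commit can be sketched roughly as follows. This is only an illustration, not the chromite or autotest code: `run_suite_in_slices`, `create_suite`, and `poll_suite` are hypothetical names, and how the real tool reports the suite ID or its completion state is an assumption here.

```python
import time
from typing import Callable, Optional


def run_suite_in_slices(
    create_suite: Callable[[], str],
    poll_suite: Callable[[str], Optional[bool]],
    max_polls: int,
    pause_secs: float = 0.0,
) -> bool:
    """Create a suite once, then re-attach to it until it finishes.

    create_suite stands in for `run_suite_skylab --create_and_return`;
    it is assumed (hypothetically) to return the new suite's ID.
    poll_suite stands in for one bounded `run_suite_skylab --suite_id <id>`
    call; it is assumed to return True or False once the suite has passed
    or failed, and None if the call ended before the suite finished.
    """
    suite_id = create_suite()  # the only step that schedules new work
    for _ in range(max_polls):
        result = poll_suite(suite_id)  # resuming must not create a new suite
        if result is not None:
            return result
        time.sleep(pause_secs)
    raise TimeoutError(
        "suite %s did not finish after %d polls" % (suite_id, max_polls))
```

Because each poll returns within a bounded time, the caller emits output often enough to stay under the silence check, and because resuming by suite ID creates nothing new, repeating the poll is safe.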
Jul 13
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/autotest/+/58bbb64c2870d1819a4387067cfc421d627c984c

commit 58bbb64c2870d1819a4387067cfc421d627c984c
Author: Xixuan Wu <xixuan@chromium.org>
Date: Fri Jul 13 18:45:58 2018

autotest: Check tags instead of bot_id when resuming provision suite

It's possible that when we resume a suite, its child task is still pending. In this case, the JSON info of a retrieved task looks like:

    {
      "created_ts": "2018-07-12T21:07:01.156070",
      "current_task_slice": "1",
      "name": "dummy_Pass",
      "server_versions": [
        "3675-21ffa58"
      ],
      "state": "PENDING",
      "tags": [
        "build:nyan_blaze-paladin-tryjob/R69-10869.0.0-b2743183",
        "id:chromeos-skylab-bot-140e9f86-ffef-49ea-bb07-40494e0b0481",
        ...
      ],
      "task_id": "3ea9231703a70010",
      "user": "xixuan@google.com"
    }

It doesn't include 'bot_id'; it only includes 'name' and 'tags'. This CL uses tags to match the bot_id of each child task of a provision suite.

BUG=chromium:860027
TEST=Ran "./bin/run_suite_skylab --build nyan_blaze-paladin-tryjob/R69-10869.0.0-b2743183 --board nyan_blaze --suite_name provision --pool suites --priority 50 --timeout_mins 180 --test_retry --max_retries 5 --suite_args "{u'num_required': 1}" --suite_id 3ea922fe44959f11" locally.

Change-Id: Ibe49be87b47f621355256324b53cc032dd165197
Reviewed-on: https://chromium-review.googlesource.com/1135790
Reviewed-by: Xixuan Wu <xixuan@chromium.org>
Commit-Queue: Xixuan Wu <xixuan@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>

[modify] https://crrev.com/58bbb64c2870d1819a4387067cfc421d627c984c/venv/skylab_suite/suite_runner.py
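To illustrate the tag-based matching, here is a minimal sketch that reads a bot ID out of a task's tags when no 'bot_id' field is present. It assumes the bot ID is carried by the tag starting with "id:", as the sample above suggests; `bot_id_from_tags` is a hypothetical helper, not the function in suite_runner.py.

```python
from typing import Any, Dict, Optional

# Assumption: the tag carrying the bot ID uses this prefix, as in the
# sample task JSON ("id:chromeos-skylab-bot-...").
BOT_ID_TAG_PREFIX = "id:"


def bot_id_from_tags(task: Dict[str, Any]) -> Optional[str]:
    """Return the bot ID recorded in a task's tags, or None if absent."""
    for tag in task.get("tags", []):
        if tag.startswith(BOT_ID_TAG_PREFIX):
            return tag[len(BOT_ID_TAG_PREFIX):]
    return None


# A pending task as retrieved on resume: no 'bot_id' key, only tags.
pending_task = {
    "name": "dummy_Pass",
    "state": "PENDING",
    "tags": [
        "build:nyan_blaze-paladin-tryjob/R69-10869.0.0-b2743183",
        "id:chromeos-skylab-bot-140e9f86-ffef-49ea-bb07-40494e0b0481",
    ],
    "task_id": "3ea9231703a70010",
}
print(bot_id_from_tags(pending_task))
```

Reading from tags works for both pending and running tasks, whereas a lookup on 'bot_id' fails for any task that has not yet been picked up.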
Jul 17
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/autotest/+/a25c991e2a231f66cec037a27e9455dd8e40762d

commit a25c991e2a231f66cec037a27e9455dd8e40762d
Author: Xixuan Wu <xixuan@chromium.org>
Date: Tue Jul 17 19:00:32 2018

autotest: Skip result parsing if create_and_return is specified

Parsing the child tasks of the suite is unnecessary if create_and_return is specified. This CL skips this step and returns OK immediately.

BUG=chromium:860027
TEST=Run "bin/run_suite_skylab --pool=suites --board=nyan_blaze --suite_name=sanity --build=nyan_blaze-release/R69-10763.0.0 --priority 65 --timeout_mins 30 --test_retry --max_retries 5 --create_and_return" locally.

Change-Id: I8ef14e10906354ddc16782f41d5ff4da2fba96ff
Reviewed-on: https://chromium-review.googlesource.com/1137333
Reviewed-by: Xixuan Wu <xixuan@chromium.org>
Commit-Queue: Xixuan Wu <xixuan@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>

[modify] https://crrev.com/a25c991e2a231f66cec037a27e9455dd8e40762d/venv/skylab_suite/cmd/run_suite_skylab.py
Jul 19
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/chromite/+/e46566856cc86d932a47601b5715ea4322afc771

commit e46566856cc86d932a47601b5715ea4322afc771
Author: Xixuan Wu <xixuan@chromium.org>
Date: Thu Jul 19 04:59:02 2018

SkylabSuite: Separate create and wait for buildbot silency check.

BUG=chromium:860027
TEST=Ran tryjob.

Change-Id: Ia0dd3bc1a2794b79a5c1cd59040d368ed718b511
Reviewed-on: https://chromium-review.googlesource.com/1133970
Commit-Ready: Xixuan Wu <xixuan@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>

[modify] https://crrev.com/e46566856cc86d932a47601b5715ea4322afc771/cbuildbot/commands.py
[modify] https://crrev.com/e46566856cc86d932a47601b5715ea4322afc771/cbuildbot/commands_unittest.py
Jul 20
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/autotest/+/52e396e586f325433a7e341936f1c0f04c5c0d04

commit 52e396e586f325433a7e341936f1c0f04c5c0d04
Author: Xixuan Wu <xixuan@chromium.org>
Date: Fri Jul 20 19:47:50 2018

autotest: change the default value of abort_limit.

If abort_limit is not specified, just abort whatever abort_suite_skylab filters based on passed-in parameters.

BUG=chromium:860027
TEST=None

Change-Id: Ifdd41feb84fc4c212a6f294195432797c48ab32b
Reviewed-on: https://chromium-review.googlesource.com/1144475
Tested-by: Xixuan Wu <xixuan@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>
Commit-Queue: Xixuan Wu <xixuan@chromium.org>

[modify] https://crrev.com/52e396e586f325433a7e341936f1c0f04c5c0d04/venv/skylab_suite/suite_parser.py
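One way to picture the new default is an optional limit that, when left unset, places no cap on the tasks to abort. The sketch below is an assumption-laden illustration: the `--abort_limit` flag spelling, `make_parser`, and `select_tasks_to_abort` are hypothetical and are not taken from suite_parser.py.

```python
import argparse
from typing import List, Optional, Sequence


def make_parser() -> argparse.ArgumentParser:
    """Build a parser with an optional abort limit (hypothetical flag name)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--abort_limit", type=int, default=None,
        help="Maximum number of tasks to abort; unset means no cap.")
    return parser


def select_tasks_to_abort(
    task_ids: Sequence[str], abort_limit: Optional[int]
) -> List[str]:
    """Return the already-filtered tasks to abort, capped only if a limit is set."""
    if abort_limit is None:
        return list(task_ids)  # no limit given: abort everything that matched
    return list(task_ids[:abort_limit])
```

Using None rather than a numeric default keeps "no limit" distinct from any real count, so a caller that cannot know how many tasks exist can simply omit the setting.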
Jul 21
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/chromite/+/7e5f806a635636124482e18a99f55ed31268d68d

commit 7e5f806a635636124482e18a99f55ed31268d68d
Author: Xixuan Wu <xixuan@chromium.org>
Date: Sat Jul 21 01:34:06 2018

cbuildbot: Remove the abort_limit setting.

Since SkylabHWTestStage may currently kick off multiple commands to wait for a suite to finish, we cannot decide abort_limit. Remove this setting and let abort_suite_skylab abort all found tasks.

BUG=chromium:860027
TEST=None

Change-Id: I9f1cc95aecabdd743cb3d58e8ece765e5895f7c7
Reviewed-on: https://chromium-review.googlesource.com/1144473
Commit-Ready: Xixuan Wu <xixuan@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>

[modify] https://crrev.com/7e5f806a635636124482e18a99f55ed31268d68d/cbuildbot/commands.py
Jul 31
Comment 1 by bugdroid1@chromium.org, Jul 11