Issue 863624

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Jul 19
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: ----




Abort_suite_skylab raises unexpected errors

Project Member Reported by xixuan@chromium.org, Jul 13

Issue description

https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8941092765424144544

Errors like:
  2018-07-13 11:38:38,283 INFO | RunCommand: /usr/local/google/home/chromeos-test/chromiumos/chromite/third_party/swarming.client/swarming.py query --auth-service-account-json /creds/skylab_swarming_bot/skylab_bot_service_account.json --swarming chrome-swarming.appspot.com 'tasks/list?tags=suite%3Aprovision&tags=board%3Anyan_blaze&tags=pool%3AChromeOSSkylab-suite&tags=build%3Anyan_blaze-paladin-tryjob%2FR69-10870.0.0-b2744017'
  2018-07-13 11:38:38,994 INFO | Aborting suite task 3ea99a26ac170f10
  2018-07-13 11:38:38,995 INFO | RunCommand: /usr/local/google/home/chromeos-test/chromiumos/chromite/third_party/swarming.client/swarming.py cancel --auth-service-account-json /creds/skylab_swarming_bot/skylab_bot_service_account.json --swarming chrome-swarming.appspot.com --kill-running 3ea99a26ac170f10
  2018-07-13 11:38:39,624 INFO | (stdout):
  Deleting 3ea99a26ac170f10 failed. Probably already gone
  
  2018-07-13 11:38:39,625 INFO | (stderr):
  97916 2018-07-13 18:38:39.601 E: Request to https://chrome-swarming.appspot.com/api/swarming/v1/task/3ea99a26ac170f10/cancel failed with HTTP status code 403: 403 Client Error: Forbidden for url: https://chrome-swarming.appspot.com/api/swarming/v1/task/3ea99a26ac170f10/cancel - 3ea99a26ac170f10 is not accessible.
  97916 2018-07-13 18:38:39.602 E: Use auth.py to login if haven't done so already:
      python auth.py login --service=https://chrome-swarming.appspot.com
  
  2018-07-13 11:38:39,625 ERROR| Task 3ea99a26ac170f10 probably already gone, skip canceling it.
  []
  1
  Traceback (most recent call last):
    File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
      "__main__", fname, loader, pkg_name)
    File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
      exec code in run_globals
    File "/usr/local/autotest/venv/skylab_suite/cmd/abort_suite_skylab.py", line 102, in <module>
      sys.exit(main())
    File "/usr/local/autotest/venv/skylab_suite/cmd/abort_suite_skylab.py", line 97, in main
      _abort_suite(options)
    File "/usr/local/autotest/venv/skylab_suite/cmd/abort_suite_skylab.py", line 73, in _abort_suite
      _abort_suite_tasks(parent_tasks, options.abort_limit)
    File "/usr/local/autotest/venv/skylab_suite/cmd/abort_suite_skylab.py", line 34, in _abort_suite_tasks
      for ct in pt['children_task_ids']:
  KeyError: 'children_task_ids'


11:38:40: INFO: RunCommand: /b/swarming/w/ir/cache/cbuild/repository/chromite/third_party/swarming.client/swarming.py run --swarming chrome-swarming.appspot.com --task-summary-json /b/swarming/w/ir/tmp/t/cbuildbot-tmpyMQulF/cbuildbot-tmpTFVDiD/tmpkBaZ1B/temp_summary.json --print-status-updates --raw-cmd --task-name abort-nyan_blaze-paladin-tryjob/R69-10870.0.0-b2744017-bvt-inline --priority 80 --dimension os Ubuntu-14.04 --dimension pool ChromeOSSkylab-suite --expiration 1200 -- /usr/local/autotest/bin/abort_suite_skylab --board nyan_blaze --suite_name bvt-inline --build nyan_blaze-paladin-tryjob/R69-10870.0.0-b2744017 --abort_limit 1 --pool suites
Triggered task: abort-nyan_blaze-paladin-tryjob/R69-10870.0.0-b2744017-bvt-inline
cros-skylab-suite-server1-60: 3eadc1a2d89d0810 1
  2018-07-13 11:38:43,170 INFO | RunCommand: /usr/local/google/home/chromeos-test/chromiumos/chromite/third_party/swarming.client/swarming.py query --auth-service-account-json /creds/skylab_swarming_bot/skylab_bot_service_account.json --swarming chrome-swarming.appspot.com 'tasks/list?tags=suite%3Abvt-inline&tags=board%3Anyan_blaze&tags=pool%3AChromeOSSkylab-suite&tags=build%3Anyan_blaze-paladin-tryjob%2FR69-10870.0.0-b2744017'
  []
  1
  Traceback (most recent call last):
    File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
      "__main__", fname, loader, pkg_name)
    File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
      exec code in run_globals
    File "/usr/local/autotest/venv/skylab_suite/cmd/abort_suite_skylab.py", line 102, in <module>
      sys.exit(main())
    File "/usr/local/autotest/venv/skylab_suite/cmd/abort_suite_skylab.py", line 97, in main
      _abort_suite(options)
    File "/usr/local/autotest/venv/skylab_suite/cmd/abort_suite_skylab.py", line 71, in _abort_suite
      parent_tasks = _get_suite_tasks_by_specs(suite_specs)
    File "/usr/local/autotest/venv/skylab_suite/cmd/abort_suite_skylab.py", line 58, in _get_suite_tasks_by_specs
      return swarming_lib.query_task_by_tags(tags)
    File "/usr/local/autotest/venv/skylab_suite/swarming_lib.py", line 270, in query_task_by_tags
      return json.loads(result.output)['items']
  KeyError: 'items'
11:38:44: WARNING: AbortHWTests failed
 
Components: -Infra Infra>Client>ChromeOS>Test
Labels: -Build Hotlist-Skylab
Owner: xixuan@chromium.org
Status: Assigned (was: Untriaged)
Labels: -Pri-2 Pri-1
This seems to be needed in the current phase (marking the skylab-based paladin as important).
Status: Fixed (was: Assigned)
Project Member

Comment 4 by bugdroid1@chromium.org, Jul 19

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/cc86e56c11210482bfd4a909b68a040f4b860460

commit cc86e56c11210482bfd4a909b68a040f4b860460
Author: Xixuan Wu <xixuan@chromium.org>
Date: Thu Jul 19 22:41:56 2018

autotest: Skip aborting the child tasks of a suite if there's none of them

It's possible that a task with given board, build and suite_name doesn't
have any child tasks, e.g. builder kicks off such a task to wait for a
previous suite to finish. In this case, no error should be raised.

BUG= chromium:863624 
TEST=Ran "bin/abort_suite_skylab --board nyan_blaze --suite_name
provision --build nyan_blaze-paladin-tryjob/R69-10870.0.0-b2744017
--abort_limit 2 --pool suites" locally.

Change-Id: I2d375d9ea36017681e3179d636545434b71da6dc
Reviewed-on: https://chromium-review.googlesource.com/1137352
Tested-by: Xixuan Wu <xixuan@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>
Commit-Queue: Xixuan Wu <xixuan@chromium.org>

[modify] https://crrev.com/cc86e56c11210482bfd4a909b68a040f4b860460/venv/skylab_suite/cmd/abort_suite_skylab.py
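
For illustration only, the behavior described in this commit message could be sketched as below. This is not the actual change to abort_suite_skylab.py: the function name `abort_suite_tasks`, the `cancel_task` callback, the `task_id` key, the abort order, and the way `abort_limit` is applied are all assumptions made for the example. Only the `children_task_ids` key comes from the traceback above.

```python
def abort_suite_tasks(parent_tasks, abort_limit, cancel_task):
    """Abort up to abort_limit suite tasks plus any child tasks they have.

    parent_tasks: list of dicts describing suite tasks (hypothetical shape).
    cancel_task: callable taking a task id (hypothetical stand-in for the
        real cancel call).
    Returns the list of task ids passed to cancel_task, in order.
    """
    aborted = []
    for pt in parent_tasks[:abort_limit]:
        # A suite task may have no children, e.g. when it only waits for a
        # previous suite to finish. Use .get() so a missing key is not an error.
        for ct in pt.get('children_task_ids', []):
            cancel_task(ct)
            aborted.append(ct)
        cancel_task(pt['task_id'])
        aborted.append(pt['task_id'])
    return aborted
```

With a parent task that lacks `children_task_ids`, the loop over children simply does nothing instead of raising `KeyError`.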

Project Member

Comment 5 by bugdroid1@chromium.org, Jul 19

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/9dbf06b227af181e636a266bbf9e5a338486fa4f

commit 9dbf06b227af181e636a266bbf9e5a338486fa4f
Author: Xixuan Wu <xixuan@chromium.org>
Date: Thu Jul 19 22:42:19 2018

autotest: Return empty list if no suite tasks found

Builder will kick off abort_suite_skylab for every suite listed in
config_dump.json of a given build_type. However, it's not guaranteed
that all of these suites are successfully kicked off. So it's possible
that abort_suite_skylab.py cannot find a suite to abort. This CL handles
this case.

BUG= chromium:863624 
TEST=Ran './bin/abort_suite_skylab --board nyan_blaze --suite_name
bvt-arc --build nyan_blaze-paladin-tryjob/R69-10870.0.0-b2744017
--abort_limit 1 --pool suites' locally.

Change-Id: Id753f4e5f693da9a2b8022e5c939e86ec7ab1263
Reviewed-on: https://chromium-review.googlesource.com/1137359
Commit-Queue: Xixuan Wu <xixuan@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>

[modify] https://crrev.com/9dbf06b227af181e636a266bbf9e5a338486fa4f/venv/skylab_suite/swarming_lib.py
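
As a rough sketch of the case this commit handles (not the actual code in swarming_lib.py), a query response without an `items` key can be treated as "no suite tasks found". The helper name `parse_task_list` is hypothetical; the only detail taken from the traceback above is that the original code indexed `json.loads(result.output)['items']` directly.

```python
import json


def parse_task_list(output):
    """Return the task list from a JSON query response, or [] if none.

    output: JSON text of the query response (assumed to be a JSON object).
    """
    response = json.loads(output)
    # When no task matches the tags, the response may lack 'items' entirely,
    # so fall back to an empty list instead of raising KeyError.
    return response.get('items', [])
```

A caller can then iterate over the result without special-casing suites that were never kicked off.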
