Issue 912675

Issue metadata

Status: Assigned
Owner:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug

Blocked on:
issue 912776

Blocking:
issue 898324



launch a skylab suite against quota scheduler

Project Member Reported by akes...@chromium.org, Dec 6

Issue description

The staging instance of qscheduler is up and running; run_skylab_suite needs a bit of feature work before it can run a suite against it.
 
Blocking: 898324
Project Member

Comment 2 by bugdroid1@chromium.org, Dec 6

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/9ea7db01a2e5de7db2267614bd78b242c1b2384a

commit 9ea7db01a2e5de7db2267614bd78b242c1b2384a
Author: Aviv Keshet <akeshet@chromium.org>
Date: Thu Dec 06 23:29:47 2018

autotest: add quota-metered pool to skylab_suite pool map

Also, fix the run_skylab_suite helptext.

BUG=chromium:912675
TEST=None

Change-Id: I618f28f3811e180a7f5a8fee58d1476b3c0551ef
Reviewed-on: https://chromium-review.googlesource.com/1366337
Commit-Ready: Aviv Keshet <akeshet@chromium.org>
Tested-by: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Xixuan Wu <xixuan@chromium.org>

[modify] https://crrev.com/9ea7db01a2e5de7db2267614bd78b242c1b2384a/venv/skylab_suite/swarming_lib.py
[modify] https://crrev.com/9ea7db01a2e5de7db2267614bd78b242c1b2384a/venv/skylab_suite/suite_parser.py
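
A minimal sketch of what a pool map of this kind might look like. The names POOL_LABEL_MAP and to_pool_label, and the label values, are hypothetical illustrations and are not copied from swarming_lib.py. In this sketch an unmapped pool name simply yields None.

```python
# Hypothetical mapping from the --pool argument to a Swarming label-pool value.
# Neither the names nor the values are taken from swarming_lib.py.
POOL_LABEL_MAP = {
    'suites': 'DUT_POOL_SUITES',
    'quota-metered': 'DUT_POOL_QUOTA_METERED',
}


def to_pool_label(pool):
    """Return the label for a pool name, or None if the pool is not mapped."""
    return POOL_LABEL_MAP.get(pool)
```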

For posterity, here is the command I am attempting to run. I think I need to wait for ^ to go through a staging run before this will work.

~/chops/chrome_infra/infra/luci/client$ ./swarming.py run --swarming chromium-swarm-dev.appspot.com --print-status-updates --timeout 9000 --raw-cmd --task-name akeshet/qstest --priority 50 --dimension os Ubuntu-14.04 --dimension pool ChromeOSSkylab-suite --io-timeout 9000 --hard-timeout 9000 --expiration 1200 '--tags=skylab:run_suite' -- /usr/local/autotest/bin/run_suite_skylab --build nyan_blaze-paladin/R73-11357.0.0-rc2 --board nyan_blaze --suite_name dummy --pool quota-metered
Triggered task: akeshet/qstest
cros-skylab-staging-2-15: 419eb0659782cb10 3
  2018-12-06 15:33:56,861 INFO | Kicked off suite dummy
  2018-12-06 15:33:56,991 INFO | Getting devservers for host: None
  2018-12-06 15:33:57,262 INFO | Staging artifacts on devserver http://100.108.133.197:8082: build=nyan_blaze-paladin/R73-11357.0.0-rc2, artifacts=['test_suites'], files=, archive_url=gs://chromeos-image-archive/nyan_blaze-paladin/R73-11357.0.0-rc2
  2018-12-06 15:33:57,882 INFO | Finished staging artifacts: build=nyan_blaze-paladin/R73-11357.0.0-rc2, artifacts=['test_suites'], files=, archive_url=gs://chromeos-image-archive/nyan_blaze-paladin/R73-11357.0.0-rc2
  2018-12-06 15:33:58,217 INFO | RunCommand: /usr/local/google/home/chromeos-test/chromiumos/chromite/third_party/swarming.client/swarming.py query --auth-service-account-json /creds/skylab_swarming_bot/skylab_bot_service_account.json --swarming https://chromium-swarm-dev.appspot.com 'bots/list?dimensions=label-board%3Anyan_blaze&dimensions=pool%3AChromeOSSkylab&dimensions=label-pool%3ANone'
  2018-12-06 15:33:58,770 ERROR| Infra failure in setting up suite job
  Traceback (most recent call last):
    File "/usr/local/autotest/venv/skylab_suite/cmd/run_suite_skylab.py", line 78, in _run_suite
      suite_job.prepare()
    File "/usr/local/autotest/venv/skylab_suite/cros_suite.py", line 346, in prepare
      self.minimum_duts)
  NoAvailableDUTsError: The available number of DUTs for board nyan_blaze and pool quota-metered is 0 ,which is less than 1, the required number.
  2018-12-06 15:33:58,771 INFO | Will return from run_suite_skylab.py with status: INFRA_FAILURE
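
The failure above can be approximated with the sketch below: it builds a bots/list query from three dimension filters and then rejects the suite when too few DUTs match. The function names, the exception class defined here, and the way the bot count is passed in are assumptions for illustration; the real logic lives in cros_suite.py and is not reproduced here.

```python
from urllib.parse import urlencode


class NoAvailableDUTsError(Exception):
    """Stand-in for the exception named in the traceback (hypothetical)."""


def bots_list_query(board, pool_label, swarming_pool='ChromeOSSkylab'):
    """Build a 'bots/list' query path from three dimension filters."""
    dimensions = [
        ('dimensions', 'label-board:%s' % board),
        ('dimensions', 'pool:%s' % swarming_pool),
        ('dimensions', 'label-pool:%s' % pool_label),
    ]
    # urlencode percent-encodes ':' as %3A and repeats the 'dimensions' key.
    return 'bots/list?' + urlencode(dimensions)


def check_available_duts(available, minimum_duts, board, pool):
    """Raise if fewer DUTs are available than the suite requires."""
    if available < minimum_duts:
        raise NoAvailableDUTsError(
            'The available number of DUTs for board %s and pool %s is %d, '
            'which is less than %d, the required number.'
            % (board, pool, available, minimum_duts))
```

With a pool label of None, bots_list_query('nyan_blaze', None) produces the same query string as the RunCommand line in the log, ending in label-pool%3ANone.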

Blockedon: 912776
^ works, but hit a new bug at Issue 912776.
./swarming.py run --swarming chromium-swarm-dev.appspot.com --print-status-updates --timeout 9000 --raw-cmd --task-name akeshet/qstest --priority 50 --dimension os Ubuntu-14.04 --dimension pool ChromeOSSkylab-suite --io-timeout 9000 --hard-timeout 9000 --expiration 1200 '--tags=skylab:run_suite' -- /usr/local/autotest/bin/run_suite_skylab --build nyan_blaze-paladin/R73-11357.0.0-rc2 --board nyan_blaze --suite_name dummy --pool quota-metered

^ worked and resulted in a suite that ran to completion, including, for instance, this task: https://chromium-swarm-dev.appspot.com/task?id=41b8ce4a3123b910&refresh=10

Note that task slice 0 is the one that ran, even though the bot didn't have the right provisionable label. This is filed as Issue 914187.
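
For context, a minimal sketch of the slice selection one would expect, assuming each task slice carries a set of required dimensions and a bot should only run the first slice whose dimensions it fully satisfies. The function names and the 'provisionable-label' key are hypothetical, and the sketch ignores slice expiration, which the real scheduler also takes into account.

```python
def bot_matches(slice_dimensions, bot_dimensions):
    """True if every key/value the slice requires is present on the bot."""
    return all(value in bot_dimensions.get(key, ())
               for key, value in slice_dimensions.items())


def first_runnable_slice(slices, bot_dimensions):
    """Return the index of the first slice the bot satisfies, or None."""
    for index, slice_dimensions in enumerate(slices):
        if bot_matches(slice_dimensions, bot_dimensions):
            return index
    return None


# Slice 0 wants a bot that already carries the build label; slice 1 does not.
slices = [
    {'pool': 'ChromeOSSkylab', 'provisionable-label': 'build-A'},
    {'pool': 'ChromeOSSkylab'},
]
bot = {'pool': ['ChromeOSSkylab'], 'provisionable-label': ['build-B']}
selected = first_runnable_slice(slices, bot)
```

Under these assumptions the bot lacks the label slice 0 asks for, so the sketch selects slice 1 rather than slice 0.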

Project Member

Comment 6 by bugdroid1@chromium.org, Dec 17

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/f8f1055ef422c25ea1ef9a9da0b102d59ea4881e

commit f8f1055ef422c25ea1ef9a9da0b102d59ea4881e
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Mon Dec 17 20:49:21 2018

Roll infra/go/src/go.chromium.org/luci/ dda5a9b55..1d67f2270 (9 commits)

https://chromium.googlesource.com/infra/luci/luci-go/+log/dda5a9b55a03..1d67f22707e0

$ git log dda5a9b55..1d67f2270 --date=short --no-merges --format='%ad %ae %s'
2018-12-15 vadimsh [cipd] Handle an edge case in EnsureFileGone.
2018-12-15 vadimsh [cipd] Add 'deployment-check' and 'deployment-repair' advanced commands.
2018-12-14 hinoka [milo] Rename Step component type to StepLegacy
2018-12-14 smut [GCE] Reset hostname if instance creation fails
2018-12-14 hinoka [milo] Regenerate
2018-12-13 akeshet svcdec: add period to the end of DO NOT EDIT
2018-12-13 hinoka [milo] Only show timeline for LUCI builds.
2018-12-13 dnj [httpmitm] fix shallow copy bug
2018-12-12 smut [GCE] Support service accounts

Created with:
  roll-dep infra/go/src/go.chromium.org/luci

This is needed for the svcdec change.

R=akeshet@chromium.org, iannucci@chromium.org

Bug: 912675
Change-Id: I8db4d17bfbece18f1348b4699dcfc9f7e0e87e9e
Reviewed-on: https://chromium-review.googlesource.com/c/1380895
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Commit-Queue: Robbie Iannucci <iannucci@chromium.org>
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>
Cr-Commit-Position: refs/heads/master@{#19608}
[modify] https://crrev.com/f8f1055ef422c25ea1ef9a9da0b102d59ea4881e/DEPS

Project Member

Comment 7 by bugdroid1@chromium.org, Dec 19

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96

commit a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Wed Dec 19 03:29:52 2018

Roll infra/luci/ c3b934388..3c2b86813 (68 commits); update swarming Go

https://chromium.googlesource.com/infra/luci/luci-py/+log/c3b934388a68..3c2b868130fd

$ git log c3b934388..3c2b86813 --date=short --no-merges --format='%ad %ae %s'
2018-12-17 maruel [swarming] Rewrite TaskState enum
2018-12-17 maruel [swarming] Fine tune Bot proto.
2018-12-17 maruel [swarming] stream BotEvent to BigQuery
2018-12-17 maruel [swarming] Split config and api protos; rename api proto package to swarming.v1
2018-12-17 maruel [config] Remove make rollback
2018-12-14 maruel Revert "[isolate] Temporarily disable send_to_bq cron job"
2018-12-14 maruel [isolate] Temporarily disable send_to_bq cron job
2018-12-14 maruel [isolate] Tweak isolated.proto
2018-12-14 maruel [swarming] Implement BotAPI.Events
2018-12-14 maruel [swarming] Add BotEvent message and first pRPC RPC
2018-12-13 maruel [prpc] log the exception trace on decoding error
2018-12-13 vadimsh [swarming] Start accepting X-Luci-Gce-Vm-Token headers.
2018-12-12 maruel [isolate] Minor cleanups
2018-12-12 maruel [swarming] Add base for pRPC server
2018-12-12 akeshet swarming: ensure the external scheduler's slice is active
2018-12-11 nodir [endpoints_webapp2] Allow User-Agent in CORS
2018-12-11 maruel [isolate] call out that I know the API needs to be completed
2018-12-11 maruel [swarming] enable cron job to delete stale bot
2018-12-10 maruel [isolate] Fix a spurious line in setup_bigquery.sh; enable BQ API
2018-12-10 maruel [isolate] Enable the BQ cron job.
2018-12-10 maruel [isolate] Remove another log entry.
2018-12-10 akeshet swarming: add cron entry for external scheduler cancellations
2018-12-10 maruel [isolate] Fix incorrect logging
2018-12-10 maruel [isolate] Add bigquery exporter
2018-12-10 maruel [isolate] Internal refactoring in preparation to bigquery
2018-12-10 maruel [stats_framework] Add schema graph, disable cache, add span
2018-12-07 vadimsh [swarming] Add require_gce_vm_token field to BotAuth proto.
2018-12-07 maruel [swarming] Work around inconsistent index
2018-12-07 maruel [swarming] Add cron job for both bot and task monitoring
2018-12-07 akeshet swarming: don't use external scheduler credentials in local dev mode
2018-12-06 maruel [swarming] Cleanup handlers_backend.py
2018-12-06 maruel [swarming] group public code together in lease_management.py
2018-12-06 maruel [swarming] Further refactor lease_management.py
2018-12-06 maruel [swarming] Fine tune the logging added in d2a77d1006ed33d803e22
2018-12-06 akeshet swarming: use service_account_credentials to call external scheduler
2018-12-05 maruel [swarming] Mark many functions in lease_management as private
2018-12-05 maruel [swarming] Be tolerant to inconsistent index
2018-12-05 maruel [swarming] Delete old bot; Log timestamp for entities being deleted
2018-12-05 maruel [components] Fix regression in 7b83a13ada570fdbed158f1cfdf9416805e20290
2018-12-05 maruel [isolate] Add new prpc service to retrieve statistics
2018-12-04 maruel [client] Add tarring code but keep it disabled
2018-12-04 maruel [client] lazy hash and improve symlink processing
2018-12-04 maruel [client] Remove timestamp from file metadata
2018-12-04 maruel [client] Move expand_directories_and_symlinks()
2018-12-04 maruel [client] Small improvements
2018-12-04 nodir [config] Update proto/Makefile
2018-12-03 akeshet swarming: add a taskqueue handler for cancelling task on bot
2018-12-03 maruel [swarming] Lower task and BotEvent retention
2018-12-03 akeshet swarming: add a partial stub for external scheduler cancellations
2018-12-03 maruel [tools] make proto work better.
2018-12-03 maruel [swarming] Remove TaskOutput removal cron job
2018-11-30 akeshet swarming: extract find-then-cancel logic into cancel_task_with_id
2018-11-30 benjaminwagner Add aliases for Intel Graphics 655 and Galaxy S9.
2018-11-30 maruel [client] Add unit test for symlink bug
2018-11-30 maruel [client] Fix a crash when no entry was processed
2018-11-30 iannucci [swarming] Update details API to work with pool-provided isolate defaults.
2018-11-30 akeshet swarming: add bot_id argument to cancel_task
2018-11-30 maruel [swarming] Fix failing tests on macOS
2018-11-29 maruel [client] Fix symlink bug
2018-11-29 maruel [swarming] fix breakage by 35b19ce7634469a5af1012b
2018-11-29 tikuta [client] do not convert json format from new api to old api
2018-11-28 maruel [swarming] Disable in-process cache for many user initiated queries
2018-11-28 benjaminwagner swarming: Roll py-adb to 017007413e7400438c39623b92ce4288488a8272.
2018-11-28 maruel [client] Further use generator during enumeration
2018-11-28 maruel [client] Make TaskChannel usable as a generator
2018-11-28 maruel [client] Convert the upload to be pipelined; various performance tuning
2018-11-28 maruel [client] reduce the duplication of logic in isolateserver.py
2018-11-28 maruel [swarming] enable cron jobs to trim TaskRequest and TaskOutputChunk

Created with:
  roll-dep infra/luci

and then modified go/src/infra/swarming/ to update the protos with the
new package name "swarming_v1".

R=akeshet@chromium.org

Bug: 912675
Change-Id: Ib7ac8ab6f282664dd1c55dc6ac70d7dab6a48548
Reviewed-on: https://chromium-review.googlesource.com/c/1380893
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Cr-Commit-Position: refs/heads/master@{#19666}
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/appengine/qscheduler-swarming/app/frontend/frontend.go
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/swarming/plugin.pb.go
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/swarming/externalschedulerserver_dec.go
[add] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/swarming/swarming.pb.go
[delete] https://crrev.com/cb5a4be36f5aea16c69612453bc967379c466dab/go/src/infra/swarming/README
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/swarming/pb.discovery.go
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/appengine/qscheduler-swarming/app/cron/cron.go
[delete] https://crrev.com/cb5a4be36f5aea16c69612453bc967379c466dab/go/src/infra/swarming/tasks.pb.go
[add] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/swarming/README.md
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/swarming/gen.go
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/DEPS
[delete] https://crrev.com/cb5a4be36f5aea16c69612453bc967379c466dab/go/src/infra/swarming/config.pb.go
[delete] https://crrev.com/cb5a4be36f5aea16c69612453bc967379c466dab/go/src/infra/swarming/bots.pb.go
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/appengine/qscheduler-swarming/app/frontend/qscheduler.go
[delete] https://crrev.com/cb5a4be36f5aea16c69612453bc967379c466dab/go/src/infra/swarming/pools.pb.go
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/appengine/qscheduler-swarming/app/frontend/qscheduler_test.go
