launch a skylab suite against quota scheduler

Issue description

Staging instance of qscheduler is up and running; needs a bit of feature work in run_skylab_suite to be able to run a suite.
Dec 6
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/9ea7db01a2e5de7db2267614bd78b242c1b2384a

commit 9ea7db01a2e5de7db2267614bd78b242c1b2384a
Author: Aviv Keshet <akeshet@chromium.org>
Date: Thu Dec 06 23:29:47 2018

autotest: add quota-metered pool to skylab_suite pool map

Also, fix the run_skylab_suite helptext.

BUG=chromium:912675
TEST=None

Change-Id: I618f28f3811e180a7f5a8fee58d1476b3c0551ef
Reviewed-on: https://chromium-review.googlesource.com/1366337
Commit-Ready: Aviv Keshet <akeshet@chromium.org>
Tested-by: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Xixuan Wu <xixuan@chromium.org>

[modify] https://crrev.com/9ea7db01a2e5de7db2267614bd78b242c1b2384a/venv/skylab_suite/swarming_lib.py
[modify] https://crrev.com/9ea7db01a2e5de7db2267614bd78b242c1b2384a/venv/skylab_suite/suite_parser.py
Dec 6
For posterity, here is the command I am attempting to run. I think I need to wait for ^ to go through a staging run before this will work.

~/chops/chrome_infra/infra/luci/client$ ./swarming.py run --swarming chromium-swarm-dev.appspot.com --print-status-updates --timeout 9000 --raw-cmd --task-name akeshet/qstest --priority 50 --dimension os Ubuntu-14.04 --dimension pool ChromeOSSkylab-suite --io-timeout 9000 --hard-timeout 9000 --expiration 1200 '--tags=skylab:run_suite' -- /usr/local/autotest/bin/run_suite_skylab --build nyan_blaze-paladin/R73-11357.0.0-rc2 --board nyan_blaze --suite_name dummy --pool quota-metered
Triggered task: akeshet/qstest
cros-skylab-staging-2-15: 419eb0659782cb10 3
2018-12-06 15:33:56,861 INFO | Kicked off suite dummy
2018-12-06 15:33:56,991 INFO | Getting devservers for host: None
2018-12-06 15:33:57,262 INFO | Staging artifacts on devserver http://100.108.133.197:8082: build=nyan_blaze-paladin/R73-11357.0.0-rc2, artifacts=['test_suites'], files=, archive_url=gs://chromeos-image-archive/nyan_blaze-paladin/R73-11357.0.0-rc2
2018-12-06 15:33:57,882 INFO | Finished staging artifacts: build=nyan_blaze-paladin/R73-11357.0.0-rc2, artifacts=['test_suites'], files=, archive_url=gs://chromeos-image-archive/nyan_blaze-paladin/R73-11357.0.0-rc2
2018-12-06 15:33:58,217 INFO | RunCommand: /usr/local/google/home/chromeos-test/chromiumos/chromite/third_party/swarming.client/swarming.py query --auth-service-account-json /creds/skylab_swarming_bot/skylab_bot_service_account.json --swarming https://chromium-swarm-dev.appspot.com 'bots/list?dimensions=label-board%3Anyan_blaze&dimensions=pool%3AChromeOSSkylab&dimensions=label-pool%3ANone'
2018-12-06 15:33:58,770 ERROR| Infra failure in setting up suite job
Traceback (most recent call last):
  File "/usr/local/autotest/venv/skylab_suite/cmd/run_suite_skylab.py", line 78, in _run_suite
    suite_job.prepare()
  File "/usr/local/autotest/venv/skylab_suite/cros_suite.py", line 346, in prepare
    self.minimum_duts)
NoAvailableDUTsError: The available number of DUTs for board nyan_blaze and pool quota-metered is 0 ,which is less than 1, the required number.
2018-12-06 15:33:58,771 INFO | Will return from run_suite_skylab.py with status: INFRA_FAILURE
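One plausible reading of the log above is that the pool name had no entry in the pool map yet, so the bot query was built with label-pool:None and matched zero DUTs. The sketch below only illustrates that reading. POOL_LABELS, its label value, and bots_list_query are hypothetical names, not the contents of swarming_lib.py, and the code is written for Python 3.

```python
from urllib.parse import urlencode

# Hypothetical pool map. 'quota-metered' is deliberately absent, as it
# would be before the pool-map change reaches staging. The label value
# is an illustrative assumption.
POOL_LABELS = {'suites': 'DUT_POOL_SUITES'}


def bots_list_query(board, pool, pool_labels=POOL_LABELS):
    """Build a swarming 'bots/list' query shaped like the one in the log."""
    dimensions = [
        ('dimensions', 'label-board:%s' % board),
        ('dimensions', 'pool:ChromeOSSkylab'),
        # An unmapped pool gives None, which formats as the string 'None'.
        ('dimensions', 'label-pool:%s' % pool_labels.get(pool)),
    ]
    return 'bots/list?' + urlencode(dimensions)


print(bots_list_query('nyan_blaze', 'quota-metered'))
```

Under these assumptions the printed query carries label-pool%3ANone, the same dimension that appears in the logged RunCommand line. No bot has that label, which would explain the count of 0.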
Dec 7
Dec 12
./swarming.py run --swarming chromium-swarm-dev.appspot.com --print-status-updates --timeout 9000 --raw-cmd --task-name akeshet/qstest --priority 50 --dimension os Ubuntu-14.04 --dimension pool ChromeOSSkylab-suite --io-timeout 9000 --hard-timeout 9000 --expiration 1200 '--tags=skylab:run_suite' -- /usr/local/autotest/bin/run_suite_skylab --build nyan_blaze-paladin/R73-11357.0.0-rc2 --board nyan_blaze --suite_name dummy --pool quota-metered

^ worked, and resulted in this suite, which ran to completion, including for instance this task: https://chromium-swarm-dev.appspot.com/task?id=41b8ce4a3123b910&refresh=10

Note that task slice 0 is the one that ran, even though the bot didn't have the right provisionable label. This is filed as Issue 914187.
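For anyone re-running this, the invocation can be kept as an argument list instead of one long shell line. This is a minimal sketch that uses only the flags and values from the command quoted in this thread. The helper name swarming_run_argv and its parameters are hypothetical, and shlex.join needs Python 3.8 or later.

```python
import shlex


def swarming_run_argv(task_name, build, board, suite, pool,
                      timeout_s=9000, expiration_s=1200):
    """Return the swarming.py argv used in this thread as a list."""
    timeout = str(timeout_s)
    return [
        './swarming.py', 'run',
        '--swarming', 'chromium-swarm-dev.appspot.com',
        '--print-status-updates',
        '--timeout', timeout,
        '--raw-cmd',
        '--task-name', task_name,
        '--priority', '50',
        '--dimension', 'os', 'Ubuntu-14.04',
        '--dimension', 'pool', 'ChromeOSSkylab-suite',
        '--io-timeout', timeout,
        '--hard-timeout', timeout,
        '--expiration', str(expiration_s),
        '--tags=skylab:run_suite',
        # Everything after '--' is the raw command run on the bot.
        '--',
        '/usr/local/autotest/bin/run_suite_skylab',
        '--build', build,
        '--board', board,
        '--suite_name', suite,
        '--pool', pool,
    ]


argv = swarming_run_argv('akeshet/qstest',
                         'nyan_blaze-paladin/R73-11357.0.0-rc2',
                         'nyan_blaze', 'dummy', 'quota-metered')
print(shlex.join(argv))
```

Passing the list to subprocess from ~/chops/chrome_infra/infra/luci/client avoids shell quoting issues around the --tags argument.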
Dec 17
The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/f8f1055ef422c25ea1ef9a9da0b102d59ea4881e

commit f8f1055ef422c25ea1ef9a9da0b102d59ea4881e
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Mon Dec 17 20:49:21 2018

Roll infra/go/src/go.chromium.org/luci/ dda5a9b55..1d67f2270 (9 commits)

https://chromium.googlesource.com/infra/luci/luci-go/+log/dda5a9b55a03..1d67f22707e0

$ git log dda5a9b55..1d67f2270 --date=short --no-merges --format='%ad %ae %s'
2018-12-15 vadimsh [cipd] Handle an edge case in EnsureFileGone.
2018-12-15 vadimsh [cipd] Add 'deployment-check' and 'deployment-repair' advanced commands.
2018-12-14 hinoka [milo] Rename Step component type to StepLegacy
2018-12-14 smut [GCE] Reset hostname if instance creation fails
2018-12-14 hinoka [milo] Regenerate
2018-12-13 akeshet svcdec: add period to the end of DO NOT EDIT
2018-12-13 hinoka [milo] Only show timeline for LUCI builds.
2018-12-13 dnj [httpmitm] fix shallow copy bug
2018-12-12 smut [GCE] Support service accounts

Created with:
  roll-dep infra/go/src/go.chromium.org/luci

This is needed for the svcdec change.

R=akeshet@chromium.org, iannucci@chromium.org
Bug: 912675
Change-Id: I8db4d17bfbece18f1348b4699dcfc9f7e0e87e9e
Reviewed-on: https://chromium-review.googlesource.com/c/1380895
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Commit-Queue: Robbie Iannucci <iannucci@chromium.org>
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>
Cr-Commit-Position: refs/heads/master@{#19608}

[modify] https://crrev.com/f8f1055ef422c25ea1ef9a9da0b102d59ea4881e/DEPS
Dec 19
The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96

commit a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Wed Dec 19 03:29:52 2018

Roll infra/luci/ c3b934388..3c2b86813 (68 commits); update swarming Go

https://chromium.googlesource.com/infra/luci/luci-py/+log/c3b934388a68..3c2b868130fd

$ git log c3b934388..3c2b86813 --date=short --no-merges --format='%ad %ae %s'
2018-12-17 maruel [swarming] Rewrite TaskState enum
2018-12-17 maruel [swarming] Fine tune Bot proto.
2018-12-17 maruel [swarming] stream BotEvent to BigQuery
2018-12-17 maruel [swarming] Split config and api protos; rename api proto package to swarming.v1
2018-12-17 maruel [config] Remove make rollback
2018-12-14 maruel Revert "[isolate] Temporarily disable send_to_bq cron job"
2018-12-14 maruel [isolate] Temporarily disable send_to_bq cron job
2018-12-14 maruel [isolate] Tweak isolated.proto
2018-12-14 maruel [swarming] Implement BotAPI.Events
2018-12-14 maruel [swarming] Add BotEvent message and first pRPC RPC
2018-12-13 maruel [prpc] log the exception trace on decoding error
2018-12-13 vadimsh [swarming] Start accepting X-Luci-Gce-Vm-Token headers.
2018-12-12 maruel [isolate] Minor cleanups
2018-12-12 maruel [swarming] Add base for pRPC server
2018-12-12 akeshet swarming: ensure the external scheduler's slice is active
2018-12-11 nodir [endpoints_webapp2] Allow User-Agent in CORS
2018-12-11 maruel [isolate] call out that I know the API needs to be completed
2018-12-11 maruel [swarming] enable cron job to delete stale bot
2018-12-10 maruel [isolate] Fix a spurious line in setup_bigquery.sh; enable BQ API
2018-12-10 maruel [isolate] Enable the BQ cron job.
2018-12-10 maruel [isolate] Remove another log entry.
2018-12-10 akeshet swarming: add cron entry for external scheduler cancellations
2018-12-10 maruel [isolate] Fix incorrect logging
2018-12-10 maruel [isolate] Add bigquery exporter
2018-12-10 maruel [isolate] Internal refactoring in preparation to bigquery
2018-12-10 maruel [stats_framework] Add schema graph, disable cache, add span
2018-12-07 vadimsh [swarming] Add require_gce_vm_token field to BotAuth proto.
2018-12-07 maruel [swarming] Work around inconsistent index
2018-12-07 maruel [swarming] Add cron job for both bot and task monitoring
2018-12-07 akeshet swarming: don't use external scheduler credentials in local dev mode
2018-12-06 maruel [swarming] Cleanup handlers_backend.py
2018-12-06 maruel [swarming] group public code together in lease_management.py
2018-12-06 maruel [swarming] Further refactor lease_management.py
2018-12-06 maruel [swarming] Fine tune the logging added in d2a77d1006ed33d803e22
2018-12-06 akeshet swarming: use service_account_credentials to call external scheduler
2018-12-05 maruel [swarming] Mark many functions in lease_management as private
2018-12-05 maruel [swarming] Be tolerant to inconsistent index
2018-12-05 maruel [swarming] Delete old bot; Log timestamp for entities being deleted
2018-12-05 maruel [components] Fix regression in 7b83a13ada570fdbed158f1cfdf9416805e20290
2018-12-05 maruel [isolate] Add new prpc service to retrieve statistics
2018-12-04 maruel [client] Add tarring code but keep it disabled
2018-12-04 maruel [client] lazy hash and improve symlink processing
2018-12-04 maruel [client] Remove timestamp from file metadata
2018-12-04 maruel [client] Move expand_directories_and_symlinks()
2018-12-04 maruel [client] Small improvements
2018-12-04 nodir [config] Update proto/Makefile
2018-12-03 akeshet swarming: add a taskqueue handler for cancelling task on bot
2018-12-03 maruel [swarming] Lower task and BotEvent retention
2018-12-03 akeshet swarming: add a partial stub for external scheduler cancellations
2018-12-03 maruel [tools] make proto work better.
2018-12-03 maruel [swarming] Remove TaskOutput removal cron job
2018-11-30 akeshet swarming: extract find-then-cancel logic into cancel_task_with_id
2018-11-30 benjaminwagner Add aliases for Intel Graphics 655 and Galaxy S9.
2018-11-30 maruel [client] Add unit test for symlink bug
2018-11-30 maruel [client] Fix a crash when no entry was processed
2018-11-30 iannucci [swarming] Update details API to work with pool-provided isolate defaults.
2018-11-30 akeshet swarming: add bot_id argument to cancel_task
2018-11-30 maruel [swarming] Fix failing tests on macOS
2018-11-29 maruel [client] Fix symlink bug
2018-11-29 maruel [swarming] fix breakage by 35b19ce7634469a5af1012b
2018-11-29 tikuta [client] do not convert json format from new api to old api
2018-11-28 maruel [swarming] Disable in-process cache for many user initiated queries
2018-11-28 benjaminwagner swarming: Roll py-adb to 017007413e7400438c39623b92ce4288488a8272.
2018-11-28 maruel [client] Further use generator during enumeration
2018-11-28 maruel [client] Make TaskChannel usable as a generator
2018-11-28 maruel [client] Convert the upload to be pipelined; various performance tuning
2018-11-28 maruel [client] reduce the duplication of logic in isolateserver.py
2018-11-28 maruel [swarming] enable cron jobs to trim TaskRequest and TaskOutputChunk

Created with:
  roll-dep infra/luci

and then modified go/src/infra/swarming/ to update the protos with the new package name "swarming_v1".

R=akeshet@chromium.org
Bug: 912675
Change-Id: Ib7ac8ab6f282664dd1c55dc6ac70d7dab6a48548
Reviewed-on: https://chromium-review.googlesource.com/c/1380893
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Cr-Commit-Position: refs/heads/master@{#19666}

[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/appengine/qscheduler-swarming/app/frontend/frontend.go
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/swarming/plugin.pb.go
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/swarming/externalschedulerserver_dec.go
[add] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/swarming/swarming.pb.go
[delete] https://crrev.com/cb5a4be36f5aea16c69612453bc967379c466dab/go/src/infra/swarming/README
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/swarming/pb.discovery.go
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/appengine/qscheduler-swarming/app/cron/cron.go
[delete] https://crrev.com/cb5a4be36f5aea16c69612453bc967379c466dab/go/src/infra/swarming/tasks.pb.go
[add] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/swarming/README.md
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/swarming/gen.go
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/DEPS
[delete] https://crrev.com/cb5a4be36f5aea16c69612453bc967379c466dab/go/src/infra/swarming/config.pb.go
[delete] https://crrev.com/cb5a4be36f5aea16c69612453bc967379c466dab/go/src/infra/swarming/bots.pb.go
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/appengine/qscheduler-swarming/app/frontend/qscheduler.go
[delete] https://crrev.com/cb5a4be36f5aea16c69612453bc967379c466dab/go/src/infra/swarming/pools.pb.go
[modify] https://crrev.com/a6490ac3d4e23b59c2d07c1d371c17c97fa4dc96/go/src/infra/appengine/qscheduler-swarming/app/frontend/qscheduler_test.go
Comment 1 by akes...@chromium.org, Dec 6