Issue 898322 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Dec 6
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug

Blocked on:
issue 870537
issue 899306

Blocking:
issue 898324


Hotlists containing this issue:
CrOSParallelCQ


create initial implementation of appengine quotascheduler app

Project Member Reported by akes...@chromium.org, Oct 23

Issue description

Create an appengine app that implements the swarming scheduler plugin API and is backed by the quotascheduler algorithm as implemented in issue 870537. The plugin API is described here: https://docs.google.com/document/d/1cm1IsTGGistGqkRXV82X5jPCv3SmRdhUtMo4d9N-1bg/edit?disco=AAAACOZTUwE&ts=5bc49b48
 
Blockedon: 870537
Blocking: 898324
Project Member

Comment 3 by bugdroid1@chromium.org, Oct 25

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/39c58de33dfbe1a48fa22d9d183ec43800d98402

commit 39c58de33dfbe1a48fa22d9d183ec43800d98402
Author: Aviv Keshet <akeshet@chromium.org>
Date: Thu Oct 25 17:56:23 2018

qscheduler: add context.Context to the reconciler and scheduler calls

This will be used soon to add appengine-compatible logging and metrics
to qscheduler.

BUG= chromium:898322 
TEST=existing unit tests pass

Change-Id: I9aaa1963c8fbeb0ebe06d1e3c12e43b065b53125
Reviewed-on: https://chromium-review.googlesource.com/c/1297707
Auto-Submit: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>
Commit-Queue: Aviv Keshet <akeshet@chromium.org>
Cr-Commit-Position: refs/heads/master@{#18578}
[modify] https://crrev.com/39c58de33dfbe1a48fa22d9d183ec43800d98402/go/src/infra/qscheduler/qslib/reconciler/doc_test.go
[modify] https://crrev.com/39c58de33dfbe1a48fa22d9d183ec43800d98402/go/src/infra/qscheduler/qslib/scheduler/scheduler_test.go
[modify] https://crrev.com/39c58de33dfbe1a48fa22d9d183ec43800d98402/go/src/infra/qscheduler/qslib/reconciler/reconciler_test.go
[modify] https://crrev.com/39c58de33dfbe1a48fa22d9d183ec43800d98402/go/src/infra/qscheduler/qslib/scheduler/doc_test.go
[modify] https://crrev.com/39c58de33dfbe1a48fa22d9d183ec43800d98402/go/src/infra/qscheduler/qslib/reconciler/reconciler.go
[modify] https://crrev.com/39c58de33dfbe1a48fa22d9d183ec43800d98402/go/src/infra/qscheduler/qslib/scheduler/prioritize_test.go
[modify] https://crrev.com/39c58de33dfbe1a48fa22d9d183ec43800d98402/go/src/infra/qscheduler/qslib/scheduler/scheduler.go

Project Member

Comment 4 by bugdroid1@chromium.org, Oct 25

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/0a4b4f070ef4e9365371b20958d7d848270ec893

commit 0a4b4f070ef4e9365371b20958d7d848270ec893
Author: Aviv Keshet <akeshet@chromium.org>
Date: Thu Oct 25 23:51:06 2018

qscheduler: turn reconciler State and its members into a proto

This is necessary because we're going to be serializing the reconciler
state in the forthcoming appengine app.

BUG= chromium:898322 
TEST=unit tests

Change-Id: Ic34a74341eabc4a8c79623a871e79f4c45438aa3
Reviewed-on: https://chromium-review.googlesource.com/c/1299954
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>
Commit-Queue: Aviv Keshet <akeshet@chromium.org>
Cr-Commit-Position: refs/heads/master@{#18585}
[modify] https://crrev.com/0a4b4f070ef4e9365371b20958d7d848270ec893/go/src/infra/qscheduler/qslib/reconciler/reconciler.go
[modify] https://crrev.com/0a4b4f070ef4e9365371b20958d7d848270ec893/go/src/infra/qscheduler/qslib/reconciler/reconciler.proto
[modify] https://crrev.com/0a4b4f070ef4e9365371b20958d7d848270ec893/go/src/infra/qscheduler/qslib/reconciler/reconciler.pb.go

Blockedon: 899306
Labels: quotascheduler
Project Member

Comment 8 by bugdroid1@chromium.org, Oct 29

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/e4d580fb448a2149adc0230a00ee7830f22e9831

commit e4d580fb448a2149adc0230a00ee7830f22e9831
Author: Aviv Keshet <akeshet@chromium.org>
Date: Mon Oct 29 18:30:38 2018

qscheduler: add an ensureMaps helper to Scheduler

When serializing/deserializing a protobuf that includes empty maps, they
get turned into nil maps. This can cause various sites that expect
initialized maps to crash.

Add an ensureMaps helper that initializes any nil maps, and add it at
the entry of all public API calls for the Scheduler.

BUG= chromium:898322 
TEST=existing unit tests pass

Change-Id: I70f73bf6b49b716f40949632ac5e4c5b38c96e29
Reviewed-on: https://chromium-review.googlesource.com/c/1299957
Commit-Queue: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>
Cr-Commit-Position: refs/heads/master@{#18617}
[modify] https://crrev.com/e4d580fb448a2149adc0230a00ee7830f22e9831/go/src/infra/qscheduler/qslib/scheduler/scheduler.go
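
For readers following along, here is a minimal sketch of the ensureMaps pattern described in the commit above. The `Scheduler` struct, its field names, and the `AddRequest` method are hypothetical stand-ins, not the real qslib types. A zero-value struct is used to mimic a message whose empty maps came back as nil after deserialization.

```go
package main

import "fmt"

// Scheduler is a hypothetical stand-in for the qslib scheduler state; the
// field names are assumptions made for this sketch.
type Scheduler struct {
	Balances map[string]float64 // account ID -> quota balance
	Queued   map[string]string  // request ID -> account ID
}

// ensureMaps initializes any nil maps so that later writes cannot panic.
func (s *Scheduler) ensureMaps() {
	if s.Balances == nil {
		s.Balances = make(map[string]float64)
	}
	if s.Queued == nil {
		s.Queued = make(map[string]string)
	}
}

// AddRequest is a public entry point, so it calls ensureMaps first.
func (s *Scheduler) AddRequest(requestID, accountID string) {
	s.ensureMaps()
	s.Queued[requestID] = accountID
}

func main() {
	// A zero-value Scheduler mimics a freshly deserialized message whose
	// empty maps were turned into nil maps.
	s := &Scheduler{}
	s.AddRequest("r1", "a1") // would panic on the nil map without ensureMaps
	fmt.Println(len(s.Queued), s.Balances != nil)
}
```

Calling the helper at the top of every public method keeps the nil check in one place, so internal helpers can assume initialized maps.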

Project Member

Comment 10 by bugdroid1@chromium.org, Nov 2

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/b3bb84d0c0c6290d4500a9465c8ab5c126869eaa

commit b3bb84d0c0c6290d4500a9465c8ab5c126869eaa
Author: Aviv Keshet <akeshet@chromium.org>
Date: Fri Nov 02 23:03:58 2018

qscheduler: define an admin API for qscheduler-swarming

This API is for creating or administering scheduler pools and their
accounts. It is implemented in a forthcoming appengine app.

BUG= chromium:898322 
TEST=None

Change-Id: I204ea13a0c62a05af9a171b108721befa442484b
Reviewed-on: https://chromium-review.googlesource.com/c/1309353
Commit-Queue: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>
Cr-Commit-Position: refs/heads/master@{#18768}
[add] https://crrev.com/b3bb84d0c0c6290d4500a9465c8ab5c126869eaa/go/src/infra/appengine/qscheduler-swarming/Makefile
[add] https://crrev.com/b3bb84d0c0c6290d4500a9465c8ab5c126869eaa/go/src/infra/appengine/qscheduler-swarming/api/qscheduler/v1/admin.proto
[add] https://crrev.com/b3bb84d0c0c6290d4500a9465c8ab5c126869eaa/go/src/infra/appengine/qscheduler-swarming/api/qscheduler/v1/qscheduleradminserver_dec.go
[add] https://crrev.com/b3bb84d0c0c6290d4500a9465c8ab5c126869eaa/go/src/infra/appengine/qscheduler-swarming/api/qscheduler/v1/admin.pb.go
[add] https://crrev.com/b3bb84d0c0c6290d4500a9465c8ab5c126869eaa/go/src/infra/appengine/qscheduler-swarming/api/qscheduler/v1/v1.infra_testing
[add] https://crrev.com/b3bb84d0c0c6290d4500a9465c8ab5c126869eaa/go/src/infra/appengine/qscheduler-swarming/api/qscheduler/v1/gen.go
[add] https://crrev.com/b3bb84d0c0c6290d4500a9465c8ab5c126869eaa/go/src/infra/appengine/qscheduler-swarming/api/qscheduler/v1/pb.discovery.go

Project Member

Comment 12 by bugdroid1@chromium.org, Nov 6

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/7039af01c1b9ffdf18c0cbb8bca69ea579b06f49

commit 7039af01c1b9ffdf18c0cbb8bca69ea579b06f49
Author: Aviv Keshet <akeshet@chromium.org>
Date: Tue Nov 06 01:48:25 2018

Project Member

Comment 13 by bugdroid1@chromium.org, Nov 7

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/f0af380e154c32b02c687befcb858b34c967b928

commit f0af380e154c32b02c687befcb858b34c967b928
Author: Aviv Keshet <akeshet@chromium.org>
Date: Wed Nov 07 18:44:52 2018

qscheduler: initial appengine app for qscheduler-swarming

This CL adds an initial implementation of the qscheduler-swarming
appengine app.

In this implementation, the state of a scheduler is read from and
rewritten to datastore on every client call, inside a transaction. This
is a simple implementation to get us off the ground, but will not scale
to > ~1 QPS due to datastore write contention. Eventually, this approach
will be replaced with some combination of:
 - an in-memory copy of the scheduler state, periodically checkpointed
 into datastore.
 - batching of multiple AssignTasks or NotifyRequest calls for a pool
 into a single transaction.

BUG= chromium:898322 
TEST=local `prpc call` tests against dev instance; new basic unit tests
added

Change-Id: I6dadc521ab102bfbb3cc93c97b94601872e63369
Reviewed-on: https://chromium-review.googlesource.com/c/1297520
Commit-Queue: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>
Cr-Commit-Position: refs/heads/master@{#18839}
[add] https://crrev.com/f0af380e154c32b02c687befcb858b34c967b928/go/src/infra/appengine/qscheduler-swarming/app/frontend/frontend.go
[add] https://crrev.com/f0af380e154c32b02c687befcb858b34c967b928/go/src/infra/appengine/qscheduler-swarming/app/frontend/qscheduler_admin_test.go
[add] https://crrev.com/f0af380e154c32b02c687befcb858b34c967b928/go/src/infra/appengine/qscheduler-swarming/app/appengine/app.yaml
[add] https://crrev.com/f0af380e154c32b02c687befcb858b34c967b928/go/src/infra/appengine/qscheduler-swarming/app/appengine/devcfg/services/dev/config.cfg
[add] https://crrev.com/f0af380e154c32b02c687befcb858b34c967b928/go/src/infra/appengine/qscheduler-swarming/app/appengine/handlers.go
[modify] https://crrev.com/f0af380e154c32b02c687befcb858b34c967b928/go/src/infra/appengine/qscheduler-swarming/Makefile
[add] https://crrev.com/f0af380e154c32b02c687befcb858b34c967b928/go/src/infra/appengine/qscheduler-swarming/app/app.infra_testing
[add] https://crrev.com/f0af380e154c32b02c687befcb858b34c967b928/go/src/infra/appengine/qscheduler-swarming/app/appengine/appengine.infra_testing
[add] https://crrev.com/f0af380e154c32b02c687befcb858b34c967b928/go/src/infra/appengine/qscheduler-swarming/app/frontend/qscheduler_admin.go
[add] https://crrev.com/f0af380e154c32b02c687befcb858b34c967b928/go/src/infra/appengine/qscheduler-swarming/app/frontend/frontend.infra_testing
[add] https://crrev.com/f0af380e154c32b02c687befcb858b34c967b928/go/src/infra/appengine/qscheduler-swarming/app/appengine/devcfg/README.md
[add] https://crrev.com/f0af380e154c32b02c687befcb858b34c967b928/go/src/infra/appengine/qscheduler-swarming/app/frontend/qscheduler_test.go
[add] https://crrev.com/f0af380e154c32b02c687befcb858b34c967b928/go/src/infra/appengine/qscheduler-swarming/app/frontend/qscheduler.go
[add] https://crrev.com/f0af380e154c32b02c687befcb858b34c967b928/go/src/infra/appengine/qscheduler-swarming/app/frontend/entities.go
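
To illustrate the scaling concern in the commit above, here is a rough sketch contrasting one transaction per client call with a single batched transaction. The `store` type is a hypothetical in-memory stand-in for datastore, and `notifyEach` and `notifyBatch` are names invented for this example; none of this is the app's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical in-memory stand-in for datastore. It holds one
// blob of scheduler state (simplified to a list of request IDs) and counts
// how many transactions were committed.
type store struct {
	mu     sync.Mutex
	queue  []string
	writes int
}

// runInTransaction reads the state, applies fn, and writes the result back,
// mirroring the read-modify-write done on every client call.
func (s *store) runInTransaction(fn func(queue []string) []string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.queue = fn(s.queue)
	s.writes++
}

// notifyEach commits one transaction per request (the initial approach).
func notifyEach(s *store, ids []string) {
	for _, id := range ids {
		id := id // per-iteration copy for the closure
		s.runInTransaction(func(q []string) []string { return append(q, id) })
	}
}

// notifyBatch folds all requests for a pool into a single transaction.
func notifyBatch(s *store, ids []string) {
	s.runInTransaction(func(q []string) []string { return append(q, ids...) })
}

func main() {
	ids := []string{"r1", "r2", "r3"}
	perCall, batched := &store{}, &store{}
	notifyEach(perCall, ids)
	notifyBatch(batched, ids)
	fmt.Println(perCall.writes, batched.writes) // prints "3 1"
}
```

Both variants end with the same state; the batched one simply pays for one write instead of three, which is the property that matters when writes to a single entity are the bottleneck.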

Project Member

Comment 14 by bugdroid1@chromium.org, Nov 7

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/b81759eac588a1db881d833c393419a7bb8fe9a4

commit b81759eac588a1db881d833c393419a7bb8fe9a4
Author: Aviv Keshet <akeshet@chromium.org>
Date: Wed Nov 07 21:14:13 2018

qscheduler: add cron config

BUG= chromium:898322 
TEST=None

Change-Id: I065c7052e8d52bcb5d8d73651ea663556168dd6e
Reviewed-on: https://chromium-review.googlesource.com/c/1324429
Reviewed-by: Vadim Shtayura <vadimsh@chromium.org>
Commit-Queue: Aviv Keshet <akeshet@chromium.org>
Cr-Commit-Position: refs/heads/master@{#18840}
[add] https://crrev.com/b81759eac588a1db881d833c393419a7bb8fe9a4/go/src/infra/appengine/qscheduler-swarming/app/appengine/cron.yaml

Project Member

Comment 16 by bugdroid1@chromium.org, Nov 29

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/81d2e21406612999dcde78906fdb83829ec8c466

commit 81d2e21406612999dcde78906fdb83829ec8c466
Author: Aviv Keshet <akeshet@chromium.org>
Date: Thu Nov 29 22:33:54 2018

qscheduler: add an AbortRequest call to the qslib.Scheduler API

We need a mechanism to inform the scheduler that requests should be
stopped without immediately reenqueueing them. This is satisfied by
a new AbortRequest endpoint, which is a companion to NotifyRequest.

BUG= chromium:898322 
TEST=New unit tests

Change-Id: I9d8f02cb436519286f03b8ba2b8cfe7575ea9946
Reviewed-on: https://chromium-review.googlesource.com/c/1343598
Commit-Queue: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Andrii Shyshkalov <tandrii@chromium.org>
Cr-Commit-Position: refs/heads/master@{#19256}
[modify] https://crrev.com/81d2e21406612999dcde78906fdb83829ec8c466/go/src/infra/qscheduler/qslib/scheduler/state_test.go
[modify] https://crrev.com/81d2e21406612999dcde78906fdb83829ec8c466/go/src/infra/qscheduler/qslib/reconciler/reconciler.go
[modify] https://crrev.com/81d2e21406612999dcde78906fdb83829ec8c466/go/src/infra/qscheduler/qslib/scheduler/state.go
[modify] https://crrev.com/81d2e21406612999dcde78906fdb83829ec8c466/go/src/infra/qscheduler/qslib/scheduler/scheduler.go
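
A minimal sketch of how AbortRequest differs from NotifyRequest, under simplifying assumptions. The `Scheduler` struct below and both method signatures are hypothetical and much smaller than the real qslib API; the point is only that an aborted request is dropped rather than returned to the queue.

```go
package main

import "fmt"

// Scheduler is a hypothetical, heavily simplified scheduler state.
type Scheduler struct {
	queued  map[string]bool   // request IDs waiting for a worker
	running map[string]string // request ID -> worker ID
}

// NewScheduler returns a Scheduler with initialized maps.
func NewScheduler() *Scheduler {
	return &Scheduler{queued: map[string]bool{}, running: map[string]string{}}
}

// NotifyRequest records the observed state of a request: running on
// workerID, or waiting in the queue when workerID is empty.
func (s *Scheduler) NotifyRequest(requestID, workerID string) {
	delete(s.queued, requestID)
	delete(s.running, requestID)
	if workerID == "" {
		s.queued[requestID] = true
		return
	}
	s.running[requestID] = workerID
}

// AbortRequest forgets the request entirely, so it is neither running nor
// placed back in the queue.
func (s *Scheduler) AbortRequest(requestID string) {
	delete(s.queued, requestID)
	delete(s.running, requestID)
}

func main() {
	s := NewScheduler()
	s.NotifyRequest("r1", "w1") // r1 is running on w1
	s.NotifyRequest("r1", "")   // r1 lost its worker and is queued again
	s.AbortRequest("r1")        // r1 is dropped instead of re-enqueued
	fmt.Println(len(s.queued), len(s.running)) // prints "0 0"
}
```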

Project Member

Comment 17 by bugdroid1@chromium.org, Nov 29

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/4ee12c8b84ad6a6d394567b6e9f56f4a5014b234

commit 4ee12c8b84ad6a6d394567b6e9f56f4a5014b234
Author: Aviv Keshet <akeshet@chromium.org>
Date: Thu Nov 29 22:43:15 2018

qscheduler: add ABORTED task update type to reconciler

This task update is to inform qscheduler of tasks that are aborted.

BUG= chromium:898322 
TEST=None

Change-Id: Ib63cc631bda3ec5bd2de32ec5341856ac490f7fe
Reviewed-on: https://chromium-review.googlesource.com/c/1343600
Commit-Queue: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Andrii Shyshkalov <tandrii@chromium.org>
Cr-Commit-Position: refs/heads/master@{#19257}
[modify] https://crrev.com/4ee12c8b84ad6a6d394567b6e9f56f4a5014b234/go/src/infra/qscheduler/qslib/reconciler/reconciler.go
[modify] https://crrev.com/4ee12c8b84ad6a6d394567b6e9f56f4a5014b234/go/src/infra/qscheduler/qslib/reconciler/reconciler.proto
[modify] https://crrev.com/4ee12c8b84ad6a6d394567b6e9f56f4a5014b234/go/src/infra/qscheduler/qslib/reconciler/reconciler.pb.go

Project Member

Comment 18 by bugdroid1@chromium.org, Nov 29

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/1db3aba741bdf43b6f9defb0b7bde804a4ed8fcc

commit 1db3aba741bdf43b6f9defb0b7bde804a4ed8fcc
Author: Aviv Keshet <akeshet@chromium.org>
Date: Thu Nov 29 22:56:35 2018

qscheduler: handle all known task states

Add a helper method for converting swarming task states to reconciler
notifications. Also, add a logging method.

BUG= chromium:898322 
TEST=None

Change-Id: I342c81feaeb5a0d60da4d760fe53372545874e53
Reviewed-on: https://chromium-review.googlesource.com/c/1343601
Commit-Queue: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Andrii Shyshkalov <tandrii@chromium.org>
Cr-Commit-Position: refs/heads/master@{#19260}
[modify] https://crrev.com/1db3aba741bdf43b6f9defb0b7bde804a4ed8fcc/go/src/infra/appengine/qscheduler-swarming/app/frontend/qscheduler.go
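
A sketch of what a state-conversion helper like the one in the commit above might look like. The `TaskState` enum is a local stand-in for the swarming task states, and the `UpdateType` values and the exact mapping are assumptions for illustration; the real helper in qscheduler.go may group states differently.

```go
package main

import "fmt"

// TaskState is a local stand-in for the swarming task states.
type TaskState int

const (
	Pending TaskState = iota
	Running
	Completed
	Canceled
	Killed
	TimedOut
	Expired
	BotDied
	NoResource
)

// UpdateType is a hypothetical reconciler notification type.
type UpdateType int

const (
	UpdateNew UpdateType = iota
	UpdateAssigned
	UpdateAborted
)

// toUpdateType maps a task state to a notification type. ok is false for
// states this sketch does not know about, so callers can log and skip them.
func toUpdateType(s TaskState) (u UpdateType, ok bool) {
	switch s {
	case Pending:
		return UpdateNew, true
	case Running:
		return UpdateAssigned, true
	case Completed, Canceled, Killed, TimedOut, Expired, BotDied, NoResource:
		// Every terminal state means the task no longer needs scheduling.
		return UpdateAborted, true
	default:
		return 0, false
	}
}

func main() {
	u, ok := toUpdateType(BotDied)
	fmt.Println(u == UpdateAborted, ok) // prints "true true"
}
```

Returning an explicit `ok` flag rather than a default value makes it harder for a newly added task state to be silently treated as one of the existing ones.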

Project Member

Comment 19 by bugdroid1@chromium.org, Nov 29

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/19e87e37b504358fe774745b085efaa20a779e83

commit 19e87e37b504358fe774745b085efaa20a779e83
Author: Aviv Keshet <akeshet@chromium.org>
Date: Thu Nov 29 23:03:24 2018

qscheduler: compute provisionable labels based on task slices

BUG= chromium:898322 
TEST=None

Change-Id: Id7ad6c11dac57f2b68c0ec2df74806c9532f2eab
Reviewed-on: https://chromium-review.googlesource.com/c/1351839
Commit-Queue: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Andrii Shyshkalov <tandrii@chromium.org>
Cr-Commit-Position: refs/heads/master@{#19261}
[modify] https://crrev.com/19e87e37b504358fe774745b085efaa20a779e83/go/src/infra/appengine/qscheduler-swarming/app/frontend/qscheduler.go
[modify] https://crrev.com/19e87e37b504358fe774745b085efaa20a779e83/go/src/infra/appengine/qscheduler-swarming/app/frontend/qscheduler_test.go

Project Member

Comment 21 by bugdroid1@chromium.org, Dec 3

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/af4ca940a30ec739c77d28472696e25a314553e2

commit af4ca940a30ec739c77d28472696e25a314553e2
Author: Aviv Keshet <akeshet@chromium.org>
Date: Mon Dec 03 22:09:37 2018

qscheduler: add TaskError call to reconciler

The new TaskError reconciler method is used to inform reconciler that a
task had an error prior to being enqueued, and that it needs to be
cancelled. Reconciler will return this task via GetCancellations until
reconciler is informed that the task was aborted.

BUG= chromium:898322 
TEST=unit test added

Change-Id: I8ddbb4c02eda5beadf9043669d96d83b6c485c36
Reviewed-on: https://chromium-review.googlesource.com/c/1357360
Reviewed-by: Andrii Shyshkalov <tandrii@chromium.org>
Commit-Queue: Aviv Keshet <akeshet@chromium.org>
Cr-Commit-Position: refs/heads/master@{#19302}
[modify] https://crrev.com/af4ca940a30ec739c77d28472696e25a314553e2/go/src/infra/qscheduler/qslib/reconciler/reconciler_test.go
[modify] https://crrev.com/af4ca940a30ec739c77d28472696e25a314553e2/go/src/infra/qscheduler/qslib/reconciler/reconciler.go
[modify] https://crrev.com/af4ca940a30ec739c77d28472696e25a314553e2/go/src/infra/qscheduler/qslib/reconciler/reconciler.proto
[modify] https://crrev.com/af4ca940a30ec739c77d28472696e25a314553e2/go/src/infra/qscheduler/qslib/reconciler/reconciler.pb.go
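
The TaskError and GetCancellations interplay described above can be sketched roughly as follows. The `Reconciler` struct, the method signatures, and the `NotifyAborted` name are hypothetical simplifications; the real reconciler works with proto messages and richer types.

```go
package main

import (
	"fmt"
	"sort"
)

// Reconciler is a hypothetical, simplified reconciler that only tracks
// tasks that errored before being enqueued.
type Reconciler struct {
	taskErrors map[string]string // task ID -> error description
}

// TaskError records that a task had an error and needs to be cancelled.
func (r *Reconciler) TaskError(taskID, reason string) {
	if r.taskErrors == nil {
		r.taskErrors = make(map[string]string)
	}
	r.taskErrors[taskID] = reason
}

// GetCancellations returns the IDs of all tasks still awaiting
// cancellation, sorted for deterministic output.
func (r *Reconciler) GetCancellations() []string {
	ids := make([]string, 0, len(r.taskErrors))
	for id := range r.taskErrors {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

// NotifyAborted tells the reconciler the task was aborted, so it stops
// being returned from GetCancellations.
func (r *Reconciler) NotifyAborted(taskID string) {
	delete(r.taskErrors, taskID)
}

func main() {
	r := &Reconciler{}
	r.TaskError("t1", "missing account")
	r.TaskError("t2", "bad labels")
	fmt.Println(r.GetCancellations()) // prints "[t1 t2]"
	r.NotifyAborted("t1")
	fmt.Println(r.GetCancellations()) // prints "[t2]"
}
```

Keeping the task in the cancellation set until an abort is confirmed means a lost or failed cancel attempt is simply retried on the next poll.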

Project Member

Comment 22 by bugdroid1@chromium.org, Dec 3

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/5326ba93ef43530f9618105bacce53419b48cbf4

commit 5326ba93ef43530f9618105bacce53419b48cbf4
Author: Aviv Keshet <akeshet@chromium.org>
Date: Mon Dec 03 22:18:07 2018

qscheduler: add getAccountId helper to extract account info from tags

BUG= chromium:898322 
TEST=new unit test

Change-Id: I1c63506bec6f05d050967d5e9bf835fc9b4b2913
Reviewed-on: https://chromium-review.googlesource.com/c/1357361
Commit-Queue: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Andrii Shyshkalov <tandrii@chromium.org>
Cr-Commit-Position: refs/heads/master@{#19304}
[modify] https://crrev.com/5326ba93ef43530f9618105bacce53419b48cbf4/go/src/infra/appengine/qscheduler-swarming/app/frontend/qscheduler.go
[modify] https://crrev.com/5326ba93ef43530f9618105bacce53419b48cbf4/go/src/infra/appengine/qscheduler-swarming/app/frontend/qscheduler_test.go
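
A sketch of a tag-parsing helper in the spirit of getAccountId. The `qs_account:` tag prefix, the function signature, and the behaviour for zero or multiple matching tags are all assumptions made for this example; check qscheduler.go for the actual rules.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// accountTagPrefix is an assumed tag prefix; tags are "key:value" strings.
const accountTagPrefix = "qs_account:"

// getAccountID returns the account named by the task's tags. In this
// sketch, no account tag yields an empty ID and more than one is an error.
func getAccountID(tags []string) (string, error) {
	var found []string
	for _, t := range tags {
		if strings.HasPrefix(t, accountTagPrefix) {
			found = append(found, strings.TrimPrefix(t, accountTagPrefix))
		}
	}
	switch len(found) {
	case 0:
		return "", nil
	case 1:
		return found[0], nil
	default:
		return "", errors.New("too many account tags")
	}
}

func main() {
	id, err := getAccountID([]string{"pool:cq", "qs_account:team-a"})
	fmt.Println(id, err) // prints "team-a <nil>"
}
```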

Project Member

Comment 23 by bugdroid1@chromium.org, Dec 3

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/472d4048be541e9967b21b8fd2efc9a7426fb3a6

commit 472d4048be541e9967b21b8fd2efc9a7426fb3a6
Author: Aviv Keshet <akeshet@chromium.org>
Date: Mon Dec 03 23:00:25 2018

qscheduler: handle notification errors by cancelling task

If tasks have erroneous values at the time they are notified to
qscheduler, then they should be immediately cancelled by adding them to
the GetCancellations set.

This is done instead of simply returning an error from NotifyTasks,
because NotifyTasks is being called asynchronously from swarming, so its
return value does not get propagated back to the task state.

BUG= chromium:898322 
TEST=local dev instance test

Change-Id: I46da8cfd4e7125a4c229478acac46ecaf7860115
Reviewed-on: https://chromium-review.googlesource.com/c/1357362
Commit-Queue: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Andrii Shyshkalov <tandrii@chromium.org>
Cr-Commit-Position: refs/heads/master@{#19306}
[modify] https://crrev.com/472d4048be541e9967b21b8fd2efc9a7426fb3a6/go/src/infra/appengine/qscheduler-swarming/app/frontend/qscheduler.go

Status: Fixed (was: Assigned)
