
Issue 839173

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Dec 11
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Feature

Blocked on:
issue 781021
issue 843655




Swarming: deny with NO_RESOURCE if no bot can service a task

Project Member Reported by mar...@chromium.org, May 2 2018

Issue description

This is needed to speed up task fallback in issue 781021.

Use case:
- A user creates a task with a named cache as a dimension with an expiration delay of 5 minutes, and a fallback with a cold cache
- There's no warm bot yet.

Expected:
- The cold cache fallback is immediately enqueued

Actual:
- The warm cache slice is enqueued, and the cold cache slice is enqueued only after the warm cache slice expires.


AI:
- If there is no bot that can service the TaskSlice TaskProperties dimensions, the TaskToRun should not be enqueued and the task should be denied.
- If there are fallbacks, the scheduler should fall back to them immediately.
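
A minimal sketch of the two action items above, assuming a task is an ordered list of slices and that some has_capacity() predicate exists. Slice, pick_slice() and fleet_has_capacity() are hypothetical names for illustration, not the actual Swarming code:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Slice:
    """Hypothetical stand-in for a TaskSlice: dimensions plus an expiration."""
    dimensions: Dict[str, str]
    expiration_secs: int


def pick_slice(slices: List[Slice],
               has_capacity: Callable[[Dict[str, str]], bool]) -> Optional[int]:
    """Returns the index of the first slice a bot can service, or None.

    None means no slice can be serviced, so the task should be denied
    instead of being enqueued.
    """
    for index, task_slice in enumerate(slices):
        if has_capacity(task_slice.dimensions):
            return index
    return None


def fleet_has_capacity(dimensions: Dict[str, str]) -> bool:
    """Hypothetical fleet in which no bot has the named cache warm yet."""
    return "caches" not in dimensions


warm = Slice({"pool": "default", "caches": "builder"}, expiration_secs=300)
cold = Slice({"pool": "default"}, expiration_secs=3600)
print(pick_slice([warm, cold], fleet_has_capacity))
print(pick_slice([warm], fleet_has_capacity))
```

With the warm slice unserviceable, the cold slice is selected right away instead of after the 5 minute expiration; with no fallback at all, the result is None and the task would be denied.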

 
Project Member

Comment 1 by bugdroid1@chromium.org, May 3 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/34b0ff4da5502f4e09a491daaefb609e331ef87d

commit 34b0ff4da5502f4e09a491daaefb609e331ef87d
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Thu May 03 22:37:48 2018

[swarming] refactor schedule_request()

Extract code and merge the two transaction calls into one. It will be useful
when adding NO_RESOURCE state.

Bug:  839173 
Change-Id: I9d737df19c1846dcdcd96b8bded390e1098aaded
Reviewed-on: https://chromium-review.googlesource.com/1042878
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>

[modify] https://crrev.com/34b0ff4da5502f4e09a491daaefb609e331ef87d/appengine/swarming/server/task_scheduler.py

Project Member

Comment 2 by bugdroid1@chromium.org, May 4 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/642a634beaeb987873f7745edc3a2dd3acb45ad4

commit 642a634beaeb987873f7745edc3a2dd3acb45ad4
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Fri May 04 15:22:08 2018

[swarming] fix bug in scheduling task in 34b0ff4da5502f4e

This happens because to_run is None.

- Fix the key generation.
- Extract the key generation function out of schedule_request().
- Add two unit tests that confirm the case works correctly now.

Bug:  839173 
Change-Id: Iddd28fca36650bc6d4e0444f0dd22aec4112c8a9
Reviewed-on: https://chromium-review.googlesource.com/1043468
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>

[modify] https://crrev.com/642a634beaeb987873f7745edc3a2dd3acb45ad4/appengine/swarming/server/task_scheduler.py
[modify] https://crrev.com/642a634beaeb987873f7745edc3a2dd3acb45ad4/appengine/swarming/server/task_scheduler_test.py

Project Member

Comment 3 by bugdroid1@chromium.org, May 5 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/221435bab36f7fc34cac5ab62bac5dabbe14917b

commit 221435bab36f7fc34cac5ab62bac5dabbe14917b
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Sat May 05 00:33:09 2018

[swarming] Add new task result state NO_RESOURCE

- It is returned when there is no resource (no bot) to service the task.

This is not based on capacity estimation; it only happens when there are
effectively zero bots alive (not dead nor quarantined, maintenance is OK) that
could service the task with the requested dimensions.

The actual function to decide if capacity is available is scaffolded; it will be
implemented later™️. Unit tests are included.

Bug:  839173 
Change-Id: I681f101a0763dc0445528c4dd194d5aa67a67c28
Reviewed-on: https://chromium-review.googlesource.com/1042907
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>

[modify] https://crrev.com/221435bab36f7fc34cac5ab62bac5dabbe14917b/appengine/swarming/handlers_endpoints_test.py
[modify] https://crrev.com/221435bab36f7fc34cac5ab62bac5dabbe14917b/appengine/swarming/server/task_request.py
[modify] https://crrev.com/221435bab36f7fc34cac5ab62bac5dabbe14917b/appengine/swarming/server/task_request_test.py
[modify] https://crrev.com/221435bab36f7fc34cac5ab62bac5dabbe14917b/appengine/swarming/server/task_result.py
[modify] https://crrev.com/221435bab36f7fc34cac5ab62bac5dabbe14917b/appengine/swarming/server/task_scheduler.py
[modify] https://crrev.com/221435bab36f7fc34cac5ab62bac5dabbe14917b/appengine/swarming/server/task_scheduler_test.py
[modify] https://crrev.com/221435bab36f7fc34cac5ab62bac5dabbe14917b/appengine/swarming/swarming_rpcs.py
[modify] https://crrev.com/221435bab36f7fc34cac5ab62bac5dabbe14917b/client/swarming.py
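
For illustration, the "alive" rule in the commit message above could look roughly like the following. Bot, can_service() and any_bot_can_service() are hypothetical names, and the dimension matching is simplified to exact key/value membership; the real check lives in the server code and may differ:

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Bot:
    """Hypothetical, simplified view of a bot's state."""
    dimensions: Dict[str, List[str]]
    is_dead: bool = False
    quarantined: bool = False
    in_maintenance: bool = False


def can_service(bot: Bot, requested: Dict[str, str]) -> bool:
    """True if the bot is alive and exposes every requested dimension."""
    # Dead and quarantined bots do not count; a bot in maintenance still does.
    if bot.is_dead or bot.quarantined:
        return False
    return all(value in bot.dimensions.get(key, [])
               for key, value in requested.items())


def any_bot_can_service(bots: Iterable[Bot], requested: Dict[str, str]) -> bool:
    """False here is the condition that would lead to NO_RESOURCE."""
    return any(can_service(bot, requested) for bot in bots)
```

Note that this counts bots that exist, not bots that are idle: a busy bot still counts as capacity.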

Project Member

Comment 4 by bugdroid1@chromium.org, May 8 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/41936a205e3cd420b2525b492541867d5e8bc416

commit 41936a205e3cd420b2525b492541867d5e8bc416
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Tue May 08 01:00:20 2018

[swarming] remove dependency from task_queues on bot_management

I need to reverse the direction of the dependency, so remove the only existing
one. This is done by passing an ndb.Key to the BotRoot entity instead of passing
the bot id.

Bug:  839173 
Change-Id: I55d1980680c0ea69b489dd24b1fb2918cf166b65
Reviewed-on: https://chromium-review.googlesource.com/1048067
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/41936a205e3cd420b2525b492541867d5e8bc416/appengine/swarming/handlers_bot.py
[modify] https://crrev.com/41936a205e3cd420b2525b492541867d5e8bc416/appengine/swarming/handlers_endpoints.py
[modify] https://crrev.com/41936a205e3cd420b2525b492541867d5e8bc416/appengine/swarming/server/lease_management.py
[modify] https://crrev.com/41936a205e3cd420b2525b492541867d5e8bc416/appengine/swarming/server/task_queues.py
[modify] https://crrev.com/41936a205e3cd420b2525b492541867d5e8bc416/appengine/swarming/server/task_queues_test.py
[modify] https://crrev.com/41936a205e3cd420b2525b492541867d5e8bc416/appengine/swarming/server/task_scheduler_test.py
[modify] https://crrev.com/41936a205e3cd420b2525b492541867d5e8bc416/appengine/swarming/server/task_to_run.py
[modify] https://crrev.com/41936a205e3cd420b2525b492541867d5e8bc416/appengine/swarming/server/task_to_run_test.py

Project Member

Comment 6 by bugdroid1@chromium.org, May 15 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/08bca8e3316b4dca6c8f873b8e6812a3f2dd163f

commit 08bca8e3316b4dca6c8f873b8e6812a3f2dd163f
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Tue May 15 20:43:05 2018

[swarming] Add capacity logic but still return True all the time

This extensively leverages memcache to determine if a task queue is valid or
not. My little pinky tells me there's a high chance this will blow up, so make
it always return True for now, and log if it thinks it would return False.

This will permit analysing the logs on prod, since staging is unlikely to be
useful for this, to determine if the logic is working well under load.

There are two different has_capacity functions, one in task_queues which checks
the task queue based cache, and the other in bot_management which does a BotInfo
DB query.

Bug:  839173 
Change-Id: Iacaf13ac017fb433c27f477324b468018b7d6a38
Reviewed-on: https://chromium-review.googlesource.com/1050201
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>

[modify] https://crrev.com/08bca8e3316b4dca6c8f873b8e6812a3f2dd163f/appengine/swarming/handlers_bot.py
[modify] https://crrev.com/08bca8e3316b4dca6c8f873b8e6812a3f2dd163f/appengine/swarming/handlers_endpoints.py
[modify] https://crrev.com/08bca8e3316b4dca6c8f873b8e6812a3f2dd163f/appengine/swarming/server/bot_management.py
[modify] https://crrev.com/08bca8e3316b4dca6c8f873b8e6812a3f2dd163f/appengine/swarming/server/bot_management_test.py
[modify] https://crrev.com/08bca8e3316b4dca6c8f873b8e6812a3f2dd163f/appengine/swarming/server/task_queues.py
[modify] https://crrev.com/08bca8e3316b4dca6c8f873b8e6812a3f2dd163f/appengine/swarming/server/task_queues_test.py
[modify] https://crrev.com/08bca8e3316b4dca6c8f873b8e6812a3f2dd163f/appengine/swarming/server/task_scheduler.py
[modify] https://crrev.com/08bca8e3316b4dca6c8f873b8e6812a3f2dd163f/appengine/swarming/server/task_scheduler_test.py
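
A rough sketch of the "compute the real answer, log it, but always return True" pattern described above. The in-process set stands in for memcache, and real_has_capacity(), query_bots and the log message are hypothetical; this is not the actual task_queues or bot_management code:

```python
import logging

# Stand-in for memcache: remembers dimension sets known to have a live queue.
_KNOWN_QUEUES = set()


def _key(dimensions):
    """Builds a hashable, order-independent key from a dimensions dict."""
    return tuple(sorted(dimensions.items()))


def real_has_capacity(dimensions, query_bots):
    """Checks the cheap cache first, then falls back to the slower bot query."""
    key = _key(dimensions)
    if key in _KNOWN_QUEUES:
        return True
    if query_bots(dimensions):
        _KNOWN_QUEUES.add(key)
        return True
    return False


def has_capacity(dimensions, query_bots):
    """Alert-only mode: always True, but logs when the real answer is False."""
    if not real_has_capacity(dimensions, query_bots):
        logging.warning('NO CAPACITY (alert only): %s', dimensions)
    return True
```

Callers keep their existing behavior while the logs show how often the check would have refused a task.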

Project Member

Comment 7 by bugdroid1@chromium.org, May 15 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/cf4f6a4ecbd1f50d37a2f8edd8441b840adc141d

commit cf4f6a4ecbd1f50d37a2f8edd8441b840adc141d
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Tue May 15 21:40:25 2018

[swarming] update tests in preparation for NO_RESOURCE.

The assumption 'trigger then register a bot' doesn't hold anymore, so many tests
need to be updated to always register a bot first.

Doing this in a test-only CL so the actual change is clearer in the follow up.

Bug:  839173 
Change-Id: Iaea63f44b5b8d626db685a21156a3d8041f9a0c9
Reviewed-on: https://chromium-review.googlesource.com/1056112
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>

[modify] https://crrev.com/cf4f6a4ecbd1f50d37a2f8edd8441b840adc141d/appengine/swarming/handlers_bot_test.py
[modify] https://crrev.com/cf4f6a4ecbd1f50d37a2f8edd8441b840adc141d/appengine/swarming/handlers_endpoints_test.py
[modify] https://crrev.com/cf4f6a4ecbd1f50d37a2f8edd8441b840adc141d/appengine/swarming/local_smoke_test.py
[modify] https://crrev.com/cf4f6a4ecbd1f50d37a2f8edd8441b840adc141d/appengine/swarming/server/task_scheduler_test.py
[modify] https://crrev.com/cf4f6a4ecbd1f50d37a2f8edd8441b840adc141d/appengine/swarming/test_env_handlers.py

Project Member

Comment 8 by bugdroid1@chromium.org, May 16 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/38dfd9ef37630bacba9b4526566c52cfeef1b2d1

commit 38dfd9ef37630bacba9b4526566c52cfeef1b2d1
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Wed May 16 13:13:07 2018

[swarming] fix regression ea7806ac22b5e2016f.

Add a unit test to test this one-off code path for deleted bots. It didn't
happen on staging, and only occasionally on prod.

TBR=qyearsley@chromium.org

Bug:  839173 
Change-Id: I59cbfdfdd59c5b3b3141b82cc2656960e3aff1bf
Reviewed-on: https://chromium-review.googlesource.com/1061594
Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/38dfd9ef37630bacba9b4526566c52cfeef1b2d1/appengine/swarming/handlers_endpoints.py
[modify] https://crrev.com/38dfd9ef37630bacba9b4526566c52cfeef1b2d1/appengine/swarming/handlers_endpoints_test.py

Comment 9 by mar...@chromium.org, May 16 2018

Blockedon: 843655

> Use case:
> - A user create a task with a named cache as a dimension with an expiration
> delay of 5 minutes, and a fallback with a cold cache
> - There's no warm bot yet.
>
> Expected:
> - The cold cache fallback is immediately enqueued

Note that this is not always desired, so it should be configurable. There might be a bot in the fleet that is already running a task that would produce the necessary hot cache, but the bot will only be updated with it once the job completes.

Cc: pprabhu@chromium.org

Ok, will add a bool flag to make this configurable per TaskSlice.

Project Member

Comment 13 by bugdroid1@chromium.org, May 23 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/fa24dd6cd5eab0be1dcb5e78fb68bf2da9847b09

commit fa24dd6cd5eab0be1dcb5e78fb68bf2da9847b09
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Wed May 23 19:49:45 2018

[swarming] remove capacity query inside expiration transaction

Run the queries first before the transaction.

Running an ancestor-less query (like a count()) inside a transaction prints out
the warning: 'Only ancestor queries are allowed inside transactions'.

I'm not sure why it's a warning instead of an obvious error. Good thing I caught
it while debugging an unrelated thing before it started to be used in prod.

Add asserts to make sure has_capacity() is not called inside a transaction by
accident in the future.

Bug:  839173 
Change-Id: I4c650326799dd686d3940c02156ecd025de1fcc8
Reviewed-on: https://chromium-review.googlesource.com/1070508
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/fa24dd6cd5eab0be1dcb5e78fb68bf2da9847b09/appengine/swarming/server/bot_management.py
[modify] https://crrev.com/fa24dd6cd5eab0be1dcb5e78fb68bf2da9847b09/appengine/swarming/server/task_queues.py
[modify] https://crrev.com/fa24dd6cd5eab0be1dcb5e78fb68bf2da9847b09/appengine/swarming/server/task_scheduler.py
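
The "query first, then enter the transaction" ordering and the guard assert can be sketched as below. The transaction() context manager and the _in_transaction flag are hypothetical stand-ins for the datastore library's own transaction tracking, and expire_slice() is an illustrative name, not the real function:

```python
import contextlib

# Hypothetical stand-in for the datastore library's transaction state.
_in_transaction = False


@contextlib.contextmanager
def transaction():
    """Marks the enclosed block as running inside a (fake) transaction."""
    global _in_transaction
    previous = _in_transaction
    _in_transaction = True
    try:
        yield
    finally:
        _in_transaction = previous


def has_capacity(dimensions, count_bots):
    """Runs an ancestor-less count, so it must never run in a transaction."""
    assert not _in_transaction, 'has_capacity() called inside a transaction'
    return count_bots(dimensions) > 0


def expire_slice(next_dimensions, count_bots):
    """Decides what happens when the current slice expires."""
    # Run the capacity query before opening the transaction.
    capacity = has_capacity(next_dimensions, count_bots)
    with transaction():
        # Only the precomputed result is used inside the transaction.
        return 'enqueue_next' if capacity else 'NO_RESOURCE'
```

The assert turns a silent warning into a loud failure in tests if someone later moves the call inside the transaction.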

Project Member

Comment 14 by bugdroid1@chromium.org, May 24 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/e1f2952e12f70ff6b42eb19b449a875a08856f40

commit e1f2952e12f70ff6b42eb19b449a875a08856f40
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Thu May 24 00:11:41 2018

[swarming] capacity: move hardcoded True to _FAKE_CAPACITY global

This permits mocking the value at the test suite level, which enables testing
that is closer to the final expected result. It's now close enough O(days) to
being made live that testing the expectation instead of reality is now
preferable.

While updating test cases, I found a bug in bot_reap_task(), which I noted in
a comment. I decided not to fix the bug to keep this CL focused; it will be
fixed in a relatively simple follow up.

Bug:  839173 
Change-Id: I8c05dc6dea578e26d9006725bc555378615424bc
Reviewed-on: https://chromium-review.googlesource.com/1070516
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/e1f2952e12f70ff6b42eb19b449a875a08856f40/appengine/swarming/server/bot_management.py
[modify] https://crrev.com/e1f2952e12f70ff6b42eb19b449a875a08856f40/appengine/swarming/server/bot_management_test.py
[modify] https://crrev.com/e1f2952e12f70ff6b42eb19b449a875a08856f40/appengine/swarming/server/task_scheduler.py
[modify] https://crrev.com/e1f2952e12f70ff6b42eb19b449a875a08856f40/appengine/swarming/server/task_scheduler_test.py
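
One way a module-level switch like this can be overridden from tests, as a sketch only. The assumption here is that _FAKE_CAPACITY forces a True result while it is set; override_fake_capacity() is a hypothetical helper built on unittest.mock, and count_bots is a placeholder for the real query:

```python
from unittest import mock

# While True, has_capacity() reports capacity no matter what the query says.
_FAKE_CAPACITY = True


def has_capacity(dimensions, count_bots):
    """Returns the real answer only once _FAKE_CAPACITY is turned off."""
    real = count_bots(dimensions) > 0
    if _FAKE_CAPACITY:
        return True
    return real


def override_fake_capacity(value):
    """Test helper: temporarily replaces the global, restoring it on exit."""
    return mock.patch.dict(has_capacity.__globals__, {'_FAKE_CAPACITY': value})


def no_bots(dimensions):
    """Placeholder query that never finds a bot."""
    return 0


print(has_capacity({'pool': 'default'}, no_bots))
with override_fake_capacity(False):
    print(has_capacity({'pool': 'default'}, no_bots))
```

Tests can then exercise the final behavior while production keeps the safe default until the switch is flipped.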

Project Member

Comment 15 by bugdroid1@chromium.org, May 25 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/8b74e161cf76ab6c129f0815dd1b7508e0b6218f

commit 8b74e161cf76ab6c129f0815dd1b7508e0b6218f
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Fri May 25 13:35:12 2018

[swarming] add wait_for_capacity to have task sit idle

There is a use case for having a task wait even if there is no known capacity.
Make NO_RESOURCE the default: people who want their task to stay pending when
there's no capacity have to set the flag, but we do not want to encourage
people to do this.

The flag wait_for_capacity is a member of TaskSlice, not TaskRequest, thus each
individual task slice can hang or not, based on the client preference.

Note that NO_RESOURCE is still in alert-only mode, so even if this code path
checks for capacity, in practice the task will *still* be PENDING instead of
aborted with NO_RESOURCE until the bit is flipped.

Bug:  839173 
Change-Id: Ie7fc456c5b8945adc4f813537ccf1ba35fd306d7
Reviewed-on: https://chromium-review.googlesource.com/1070657
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>

[modify] https://crrev.com/8b74e161cf76ab6c129f0815dd1b7508e0b6218f/appengine/swarming/handlers_endpoints.py
[modify] https://crrev.com/8b74e161cf76ab6c129f0815dd1b7508e0b6218f/appengine/swarming/handlers_endpoints_test.py
[modify] https://crrev.com/8b74e161cf76ab6c129f0815dd1b7508e0b6218f/appengine/swarming/server/lease_management.py
[modify] https://crrev.com/8b74e161cf76ab6c129f0815dd1b7508e0b6218f/appengine/swarming/server/task_request.py
[modify] https://crrev.com/8b74e161cf76ab6c129f0815dd1b7508e0b6218f/appengine/swarming/server/task_request_test.py
[modify] https://crrev.com/8b74e161cf76ab6c129f0815dd1b7508e0b6218f/appengine/swarming/server/task_scheduler.py
[modify] https://crrev.com/8b74e161cf76ab6c129f0815dd1b7508e0b6218f/appengine/swarming/server/task_scheduler_test.py
[modify] https://crrev.com/8b74e161cf76ab6c129f0815dd1b7508e0b6218f/appengine/swarming/swarming_rpcs.py
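
A sketch of how a per-slice wait_for_capacity flag could interact with the capacity check once the check is live. Slice and initial_state() are hypothetical names, and the returned tuple is only an illustration of the outcome, not the real data model:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Slice:
    """Hypothetical stand-in for a TaskSlice."""
    dimensions: Dict[str, str]
    wait_for_capacity: bool = False


def initial_state(
    slices: List[Slice],
    has_capacity: Callable[[Dict[str, str]], bool],
) -> Tuple[str, Optional[int]]:
    """Returns (state, slice index) for a newly created task."""
    for index, task_slice in enumerate(slices):
        # A slice that opts in waits even when no bot is known to match.
        if task_slice.wait_for_capacity or has_capacity(task_slice.dimensions):
            return ('PENDING', index)
    # Default behavior: no slice can be serviced and none asked to wait.
    return ('NO_RESOURCE', None)
```

Because the flag is per slice, a request can skip an unserviceable first slice yet still wait on its last one.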

Project Member

Comment 16 by bugdroid1@chromium.org, May 25 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/3e6132add7d19f2919d479f30472f7faec3ad6ec

commit 3e6132add7d19f2919d479f30472f7faec3ad6ec
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Fri May 25 16:35:32 2018

[client] Add --wait-for-capacity

Also include refactoring in preparation for a TaskRequest with multiple
TaskSlice, for  issue 839467 .

Small test-only change in appengine/swarming to force the CQ to run tests there.

Bug:  839173 
Change-Id: I2d15c04dfb7a1e5acc90451f23cdc967e3f755db
Reviewed-on: https://chromium-review.googlesource.com/1073190
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>

[modify] https://crrev.com/3e6132add7d19f2919d479f30472f7faec3ad6ec/appengine/swarming/local_smoke_test.py
[modify] https://crrev.com/3e6132add7d19f2919d479f30472f7faec3ad6ec/client/swarming.py
[modify] https://crrev.com/3e6132add7d19f2919d479f30472f7faec3ad6ec/client/tests/swarming_test.py
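
As a sketch of how a client flag like this can map onto a per-slice field: the parser below uses argparse for brevity, and build_request(), the help text and the request layout are assumptions for illustration rather than the real client/swarming.py code:

```python
import argparse


def build_request(argv):
    """Parses illustrative trigger flags into a single-slice request dict."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--dimension', nargs=2, action='append', default=[],
        metavar=('KEY', 'VALUE'), help='dimension to filter bots on')
    parser.add_argument(
        '--wait-for-capacity', action='store_true',
        help='stay pending even if no bot can currently service the task')
    args = parser.parse_args(argv)
    return {
        'task_slices': [{
            'properties': {
                'dimensions': [
                    {'key': key, 'value': value}
                    for key, value in args.dimension
                ],
            },
            # Off by default, so NO_RESOURCE remains the default outcome.
            'wait_for_capacity': args.wait_for_capacity,
        }],
    }


print(build_request(['--dimension', 'pool', 'default', '--wait-for-capacity']))
```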

Project Member

Comment 17 by bugdroid1@chromium.org, Jun 21 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/66cc650463e2cc9c8354bea04374831fd02e9a33

commit 66cc650463e2cc9c8354bea04374831fd02e9a33
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Thu Jun 21 17:47:06 2018

[swarming] Document TaskState and StateField in swarming_rpcs.py

swarming_rpcs.py defines the messages in the APIs, as the service currently lacks
proper protos that would eventually act as the user documentation.

Users shouldn't have to go deeper in the DB code to figure out what each
constant means.

No functional change.

Bug:  839173 
Change-Id: Idc4e43b18ddac42c1d202e5a675a6acdf3bdc3e3
Reviewed-on: https://chromium-review.googlesource.com/1110145
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>

[modify] https://crrev.com/66cc650463e2cc9c8354bea04374831fd02e9a33/appengine/swarming/swarming_rpcs.py

Project Member

Comment 18 by bugdroid1@chromium.org, Jun 21 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/2574493d1c014502b9410203f89607b647a4898e

commit 2574493d1c014502b9410203f89607b647a4898e
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Thu Jun 21 18:52:18 2018

[buildbucket] Support Swarming task state NO_RESOURCE

This will soon start happening.

Bug:  839173 
Change-Id: I88a364f753b77f3b0675f8920a204dd37d7873f2
Reviewed-on: https://chromium-review.googlesource.com/1110383
Reviewed-by: Nodir Turakulov <nodir@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/2574493d1c014502b9410203f89607b647a4898e/appengine/cr-buildbucket/swarming/swarming.py
[modify] https://crrev.com/2574493d1c014502b9410203f89607b647a4898e/appengine/cr-buildbucket/swarming/test/swarming_test.py

Project Member

Comment 19 by bugdroid1@chromium.org, Jun 22 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/47a6d1213b669bf9757211edb605e17fcc958aae

commit 47a6d1213b669bf9757211edb605e17fcc958aae
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Fri Jun 22 18:08:37 2018

[swarming] more task state documentation

Follow up from https://chromium-review.googlesource.com/1110145.

Rename two enums in swarming_rpcs, they were misnamed:
- TaskState -> TaskStateQuery
- StateField -> TaskState

Reduce the number of constant lists. This reduces the risk of forgetting
something when adding more values.

No functional change.

Bug:  839173 
Change-Id: Ia4eb7baef7d6a9ab15154c89d549ac74c284d878
Reviewed-on: https://chromium-review.googlesource.com/1111615
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/47a6d1213b669bf9757211edb605e17fcc958aae/appengine/swarming/handlers_endpoints.py
[modify] https://crrev.com/47a6d1213b669bf9757211edb605e17fcc958aae/appengine/swarming/handlers_endpoints_test.py
[modify] https://crrev.com/47a6d1213b669bf9757211edb605e17fcc958aae/appengine/swarming/message_conversion.py
[modify] https://crrev.com/47a6d1213b669bf9757211edb605e17fcc958aae/appengine/swarming/server/task_result.py
[modify] https://crrev.com/47a6d1213b669bf9757211edb605e17fcc958aae/appengine/swarming/server/task_result_test.py
[modify] https://crrev.com/47a6d1213b669bf9757211edb605e17fcc958aae/appengine/swarming/server/task_scheduler.py
[modify] https://crrev.com/47a6d1213b669bf9757211edb605e17fcc958aae/appengine/swarming/swarming_rpcs.py
[modify] https://crrev.com/47a6d1213b669bf9757211edb605e17fcc958aae/client/swarming.py
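
The "fewer constant lists" idea can be sketched as follows: keep one hand-maintained set and derive the rest from the enum itself. The enum below is an illustrative subset with placeholder values, and STATES_ACTIVE, STATES_DONE and is_done() are hypothetical names:

```python
import enum


class TaskState(enum.Enum):
    """Illustrative subset of task states; the values are placeholders."""
    RUNNING = enum.auto()
    PENDING = enum.auto()
    COMPLETED = enum.auto()
    KILLED = enum.auto()
    NO_RESOURCE = enum.auto()


# The only hand-maintained set: states in which a task is still active.
STATES_ACTIVE = frozenset({TaskState.RUNNING, TaskState.PENDING})

# Derived rather than listed, so a newly added state cannot be forgotten.
STATES_DONE = frozenset(TaskState) - STATES_ACTIVE


def is_done(state: TaskState) -> bool:
    """True if the task has reached a final state."""
    return state in STATES_DONE
```

Adding a state such as NO_RESOURCE then only touches the enum; it lands in STATES_DONE automatically.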

Project Member

Comment 21 by bugdroid1@chromium.org, Jun 27 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/395ab1a32a12666eb5e95c3e7e733cb3f6e175f2

commit 395ab1a32a12666eb5e95c3e7e733cb3f6e175f2
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Wed Jun 27 18:52:40 2018

[swarming] Implement has_capacity returning False

This enables the NO_RESOURCE state when no resource is available at task
creation.

Include smoke tests for both refusal and immediate fallback.

Bug:  839173 
Change-Id: I6730c521ddc3a76aabc5ddbb68cc8e1bb6a4efee
Reviewed-on: https://chromium-review.googlesource.com/1055754
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>

[modify] https://crrev.com/395ab1a32a12666eb5e95c3e7e733cb3f6e175f2/appengine/swarming/local_smoke_test.py
[modify] https://crrev.com/395ab1a32a12666eb5e95c3e7e733cb3f6e175f2/appengine/swarming/server/bot_management.py
[modify] https://crrev.com/395ab1a32a12666eb5e95c3e7e733cb3f6e175f2/appengine/swarming/server/bot_management_test.py
[modify] https://crrev.com/395ab1a32a12666eb5e95c3e7e733cb3f6e175f2/appengine/swarming/server/task_scheduler_test.py

Project Member

Comment 22 by bugdroid1@chromium.org, Jun 28 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/caad9970845b982fecbf70cc4725a1f81329c4a5

commit caad9970845b982fecbf70cc4725a1f81329c4a5
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Thu Jun 28 15:07:54 2018

[recipes] add KILLED and NO_RESOURCE to state.

In practice they are not really used.

R=nodir@chromium.org

Bug:  839173 
Change-Id: I6453a9f12eee91716b7bc5676d6d978027b46a15
Reviewed-on: https://chromium-review.googlesource.com/1118358
Reviewed-by: Nodir Turakulov <nodir@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/caad9970845b982fecbf70cc4725a1f81329c4a5/scripts/slave/recipe_modules/swarming/state.py

Project Member

Comment 23 by bugdroid1@chromium.org, Jul 3 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/84bc9e14f5038525bf84caaca49b3e75ac100bc5

commit 84bc9e14f5038525bf84caaca49b3e75ac100bc5
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Tue Jul 03 14:00:38 2018

[swarming] lower log for NO CAPACITY to warning

Error logs generate an ereporter2 error entry.

R=qyearsley@chromium.org

Bug:  839173 
Change-Id: Ib7b2a713227b1cb66dc8dabd31cc9d0e5bff8412
Reviewed-on: https://chromium-review.googlesource.com/1121517
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/84bc9e14f5038525bf84caaca49b3e75ac100bc5/appengine/swarming/server/bot_management.py

Status: Fixed (was: Assigned)
