Deny swarming task triggering if no bot can fulfill the task
Issue description: schedule_request() should do a quick query to check whether any bot can fulfill the task, and bot presence could be stored in memcache. https://github.com/luci/luci-py/blob/master/appengine/swarming/server/task_scheduler.py#L476 Maybe the cache from SwarmingBotsService.count could be reused: https://github.com/luci/luci-py/blob/master/appengine/swarming/handlers_endpoints.py#L792
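For illustration, the "quick query plus cached presence" idea could look roughly like the sketch below. This is not the luci-py implementation: `PresenceCache` is a hypothetical in-memory stand-in for memcache, and `dimensions_key`, `any_bot_matches`, `has_capacity` and the `list_bots` callback are all made-up names. It assumes each bot reports a dict mapping a dimension key to a list of values.

```python
import time


class PresenceCache:
    """Hypothetical in-memory stand-in for memcache with per-entry expiry."""

    def __init__(self, ttl_seconds=60.0, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it so the caller re-runs the query.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)


def dimensions_key(dimensions):
    """Builds a stable cache key from the requested dimensions."""
    return "|".join("%s:%s" % (k, v) for k, v in sorted(dimensions.items()))


def any_bot_matches(dimensions, bots):
    """Returns True if at least one bot advertises every requested dimension."""
    for bot_dims in bots:
        if all(v in bot_dims.get(k, ()) for k, v in dimensions.items()):
            return True
    return False


def has_capacity(dimensions, list_bots, cache):
    """Checks the cache first; only queries the bot list on a miss."""
    key = dimensions_key(dimensions)
    cached = cache.get(key)
    if cached is not None:
        return cached
    result = any_bot_matches(dimensions, list_bots())
    cache.set(key, result)
    return result
```

Caching a negative result means a newly connected bot would not be noticed until the entry expires, so the TTL here trades freshness against query load.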
,
Mar 29 2017
Context: this is needed for the speed waterfall to distinguish between bots dying and tasks expiring due to lack of capacity. The reason is the device affinity requirement in the perf waterfall: every task requires a specific bot. For a usual swarming client, if one bot dies, the task can keep waiting for other bots, whereas with perf, that means the task should never be triggered in the first place.
,
Apr 7 2017
,
Apr 7 2017
,
Apr 18 2017
A couple of questions on desired behavior:
1) Do we want this feature to be controlled by a setting, or do we want to do this for all swarming instances? This would affect swarming clients that might want to trigger tasks even though no bot can fulfill them yet. For example, bots that dynamically populate a dimension (e.g. app_version of the system-under-test), where tasks are required to run against app_version N+1.
2) Do we still want to disallow the task request if there's a quarantined bot that could otherwise have fulfilled it?
2-B) What if there are bots that can fulfill the task, but none of them are currently idle?
,
Apr 18 2017
Answering the questions in #5:
1) We should probably at least start doing this with an option, so that we can selectively roll it out. Ideally this would be an option in the task request, like "fail_if_no_capacity: true" or something.
2) Yes.
2-B) If there are bots that could fulfill it, just let it sit. This will happen for chromium.perf at least; we trigger ~80 tasks for a single bot, and the bot goes through and executes all of them. The bot will not be idle most of the time, since it'll be executing tasks almost continuously.
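The answers above can be sketched as a small decision function. This is only an illustration under stated assumptions, not swarming code: the `Bot` class, `bot_can_fulfill` and `should_deny` are hypothetical names, and `fail_if_no_capacity` is just the placeholder flag name floated in this comment, treated here as opt-in.

```python
from dataclasses import dataclass


@dataclass
class Bot:
    """Hypothetical bot record: dimensions maps a key to a list of values."""
    dimensions: dict
    quarantined: bool = False
    busy: bool = False


def bot_can_fulfill(bot, task_dimensions):
    """A quarantined bot never counts as capacity (answer 2)."""
    if bot.quarantined:
        return False
    # Note that bot.busy is deliberately ignored (answer 2-B): a busy bot
    # will eventually pick the task up, so the task can just sit.
    return all(
        value in bot.dimensions.get(key, ())
        for key, value in task_dimensions.items()
    )


def should_deny(task_dimensions, bots, fail_if_no_capacity=False):
    """Denies the request only when the caller opted in (answer 1)
    and no eligible bot matches the requested dimensions."""
    if not fail_if_no_capacity:
        return False
    return not any(bot_can_fulfill(bot, task_dimensions) for bot in bots)
```

With a device-affinity setup like chromium.perf, `task_dimensions` would typically pin a single bot id, so the request is denied exactly when that bot is missing or quarantined.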
,
Apr 18 2017
(Answering as one intended user, but you probably should get some answers from someone more familiar with the way chromium uses swarming for main tests)
,
Apr 18 2017
That behavior seems reasonable. You would definitely want things to be configurable. I couldn't say whether this would make more sense as a dimension, or as a separate flag to the request.
,
Apr 18 2017
I'd go with a new flag in the NewTaskRequest. I'd say "default to verify", i.e. make the check opt-out. I agree that having the flag enables the relevant use cases.
,
Jun 5 2017
Issue 616267 has been merged into this issue.
,
Jun 5 2017
,
Aug 4 2017
,
Sep 21 2017
,
Jan 12 2018
Comment 1 by nedngu...@google.com
, Mar 29 2017