
Issue 905744

Starred by 2 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




builder has pending builds and idle tasks

Project Member Reported by martiniss@chromium.org, Nov 15

Issue description

Today at around 10:45 AM PST, https://ci.chromium.org/p/chromium/builders/luci.chromium.try/android-marshmallow-arm64-rel had, according to Milo, 25 idle builders and 15 pending builds.

After watching the builder for a bit, I noticed that a few of the idle machines got deleted, probably by Machine Provider. John and I hypothesized that the leases for these machines were close to expiring, so no tasks were being scheduled on them. This information isn't shown in any of the UIs that display information about the bots, though, so it's very confusing for the user.

Could this be displayed in Milo, or Swarming?
 
Note that the MP lease expiration is shown on the Swarming bot page; see "Machine Provider Lease Expires".
Ah, true. I didn't notice that. But even so, I would (maybe naively) think that tasks would be scheduled up until that point. To me, a bot being live on Swarming means it can run tasks, so seeing a bot that is live on Swarming but not running tasks looks like a bug.
Cc: s...@google.com
It's a function of the hard_timeout. The longer the hard_timeout, the larger the "dead zone" at the end of the bot's lifetime.
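To make the "dead zone" concrete, here is a minimal sketch of the relationship described above. It is not Swarming's actual scheduling code; the function names and the exact comparison are assumptions for illustration only. It assumes a task is only handed to a bot when the task's hard_timeout fits entirely before the Machine Provider lease expires.

```python
from datetime import datetime, timedelta


def dead_zone_start(lease_expires: datetime, hard_timeout: timedelta) -> datetime:
    """Earliest moment after which a task with this hard_timeout no longer fits.

    Hypothetical helper: assumes the dead zone is exactly hard_timeout long.
    """
    return lease_expires - hard_timeout


def can_schedule(now: datetime, lease_expires: datetime, hard_timeout: timedelta) -> bool:
    """True if a task started now could run for its full hard_timeout
    before the lease expires (assumed rule, not the real implementation)."""
    return now + hard_timeout <= lease_expires


# Example values only; the date is arbitrary.
lease_expires = datetime(2000, 1, 1, 12, 0)
one_hour = timedelta(hours=1)

# With a 1 hour hard_timeout, the last hour of the lease is the dead zone.
early = datetime(2000, 1, 1, 10, 45)
late = datetime(2000, 1, 1, 11, 30)
fits_early = can_schedule(early, lease_expires, one_hour)
fits_late = can_schedule(late, lease_expires, one_hour)
```

Under these assumptions, a bot at 11:30 with a lease ending at 12:00 still looks live but cannot take a task with a 1 hour hard_timeout, while a task with a 15 minute hard_timeout would still fit. That matches the observation of idle bots alongside pending builds.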
Status: Available (was: Unconfirmed)
There's issue 894201 about this, but when considering things like preemptible VMs, which cannot extend their lease, we may have to decide how to handle this in a non-surprising way.
The new design at go/remove-mp would avoid this issue entirely. The GCE service would decide when to reclaim a VM. From Swarming's perspective, it would always schedule the task with no concern for how long the task could run or how long is left in the VM's lifetime.
