Issue 598384 link

Issue metadata

Status: Duplicate
Owner:
Closed: Jun 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug

Blocked on:
issue 601499

Blocking:
issue 595787



Baremetal slaves don't reconnect to masters after being disassociated/re-associated.

Project Member Reported by d...@chromium.org, Mar 28 2016

Issue description

A baremetal slave's startup kicks the BuildBot slave process exactly once. At this point the slave asks the question, "do I belong to a master?". If the answer is "no", it quits, leaving the slave sitting idle.

If the slave is added to a master in the future, the slave doesn't re-poll its state, forcing a trooper to reboot or kick the slave process manually.

This leads to unnecessary maintenance, especially on the CrOS waterfalls, where slaves are routinely detached from and re-attached to masters as part of typical builder layout adjustments. Troopers often forget the "kick new slaves" step, leaving parts of the waterfall offline for extended periods of time.

GCE builders have a monitor process wrapping BuildBot that kicks it periodically. It would be really great to have this on baremetal/VM systems too. Maybe BuildBot itself could run as a service under service manager?
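
For illustration only, here is a minimal Python sketch of the kind of monitor loop described above: rather than asking "do I belong to a master?" once at startup, it asks again on every pass. This is not the actual GCE monitor. The names `monitor_once`, `monitor`, `is_assigned`, `start`, and the 300-second default are all assumptions made for the example.

```python
import subprocess
import time
from typing import Callable, Optional


def monitor_once(
    is_assigned: Callable[[], bool],
    start: Callable[[], subprocess.Popen],
    proc: Optional[subprocess.Popen],
) -> Optional[subprocess.Popen]:
    """Run one polling pass and return the (possibly new) slave process.

    `is_assigned` and `start` are hypothetical hooks: the first answers
    "does a master currently claim this slave?", the second launches the
    BuildBot slave process.
    """
    # A process that has exited is treated the same as one never started.
    if proc is not None and proc.poll() is not None:
        proc = None
    # Only start the slave when a master claims it and nothing is running.
    if proc is None and is_assigned():
        proc = start()
    return proc


def monitor(
    is_assigned: Callable[[], bool],
    start: Callable[[], subprocess.Popen],
    poll_secs: float = 300.0,  # assumed interval, not taken from the issue
) -> None:
    """Re-check master assignment forever instead of once at startup."""
    proc: Optional[subprocess.Popen] = None
    while True:
        proc = monitor_once(is_assigned, start, proc)
        time.sleep(poll_secs)
```

With a loop like this, a slave that is attached to a master after boot would be picked up on the next pass without a trooper having to kick it.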
 
Blocking: 595787
Status: Assigned (was: Untriaged)
Cc: dsansome@chromium.org
+dsansome fyi
Yes, I'd love to run BuildBot under service manager. We'd probably need to add a per-service config option to control how often a service is restarted.
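
As a rough sketch of what such a per-service option might look like (purely hypothetical: `ServiceConfig`, `min_restart_interval_secs`, and `may_restart` are names invented here, not part of service manager's actual config format):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceConfig:
    """Hypothetical per-service settings; field names are assumptions."""
    name: str
    min_restart_interval_secs: float  # minimum gap between two starts


def may_restart(config: ServiceConfig, last_start: float, now: float) -> bool:
    """Return True once the configured interval has passed since last start.

    `last_start` and `now` are timestamps in seconds (e.g. from time.time()).
    """
    return now - last_start >= config.min_restart_interval_secs
```

A service that exits immediately (such as a slave with no master) would then be retried at the configured interval rather than in a tight loop.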
Blockedon: 601499
Mergedinto: 601499
Status: Duplicate (was: Assigned)