Buildbot buildslaves don't always reconnect.
Issue description
From time to time I find buildbot slaves that stay offline until rebooted, and I've been rebooting them manually to get them working again. These are usually easiest to find on the tryserver waterfall by looking for slaves that have been offline for more than a few minutes. https://uberchromegw.corp.google.com/i/chromiumos.tryserver/buildslaves We currently have a single slave in this state, and I plan to leave it that way for investigation, since having it down isn't causing a substantial burden. https://uberchromegw.corp.google.com/i/chromiumos.tryserver/buildslaves/cros-standard36-c2
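For anyone who wants to automate the "offline for more than a few minutes" check, here is a rough sketch. It assumes you already have, from whatever source, a mapping of slave name to a connected flag and a last-seen timestamp; the field names `connected` and `last_seen`, the 10 minute threshold, and the slave names in the example are all hypothetical and not taken from buildbot itself.

```python
from datetime import datetime, timedelta


def stale_offline_slaves(slaves, now, threshold=timedelta(minutes=10)):
    """Return sorted names of slaves that are disconnected and were
    last seen more than `threshold` before `now`.

    `slaves` maps a slave name to a dict with two hypothetical keys:
      "connected": bool, whether the master currently sees the slave
      "last_seen": datetime of the last known contact
    """
    stale = []
    for name, info in slaves.items():
        if info["connected"]:
            continue  # still attached to the master, nothing to do
        if now - info["last_seen"] > threshold:
            stale.append(name)
    return sorted(stale)


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    now = datetime(2018, 3, 8, 12, 0)
    sample = {
        "slave-a": {"connected": True, "last_seen": now},
        "slave-b": {"connected": False, "last_seen": now - timedelta(hours=2)},
        "slave-c": {"connected": False, "last_seen": now - timedelta(minutes=3)},
    }
    for name in stale_offline_slaves(sample, now):
        print(name)
```

The threshold is there so that slaves which are merely restarting between builds are not flagged; only ones that stay disconnected are reported as reboot candidates.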
,
Mar 8 2018
Not currently on my plate.
,
Mar 9 2018
,
Mar 12 2018
https://uberchromegw.corp.google.com/i/chromeos/builders/eve-release cros-beefy59-c2 is offline in buildbot but online in GCP. Will reboot and see.
,
Mar 12 2018
,
Mar 13 2018
,
Apr 30 2018
I don't currently have a builder in this state (just rebooted them all), but it is an ongoing issue.
,
May 1 2018
dgarrett: are you expecting some amount of trooper intervention here?
,
May 1 2018
Yes. We don't normally own or interact with the buildbot client. I consider this lower priority than it was because we have fewer buildbot builders than we used to, but it does still cause build failures.
,
May 1 2018
If you want to hand it back until we have another stuck builder, that would be very reasonable.
,
May 2 2018
If you don't mind, I'll assign the bug to you, Don; otherwise we'll keep trying to triage the issue.
,
May 15 2018
We currently have two examples of this. https://uberchromegw.corp.google.com/i/chromeos/buildslaves/build173-m2 https://uberchromegw.corp.google.com/i/chromeos/buildslaves/cros-beefy71-c2 The first is associated with the CQ, which has 2 redundant slaves; the second is causing an outage of veyron_jaq-release, so I'm going to capture logs and reboot it.
,
May 15 2018
Actually, those symptoms appear to be different: I can't ssh into either machine.
,
Jun 15 2018
As the number of buildbot builders goes down, this has been less of an issue.
Comment 1 by pprabhu@chromium.org
, Mar 8 2018