Issue 697435

Issue metadata

Status: WontFix
Owner:
Closed: Aug 17
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug



Exec slaves.cfgs on slave-side in a subprocess

Project Member Reported by serg...@chromium.org, Mar 1 2017

Issue description

Labels: cit-pm-18 Type-Bug
Labels: -Pri-2 Pri-1

Comment 3 by efoo@google.com, Mar 1 2017

Components: -Infra Infra>Platform>Buildbot
Owner: estaab@chromium.org
Status: Assigned (was: Untriaged)
Erik, can you help find an owner for this? It's a P1 from a PM.

Comment 5 by efoo@chromium.org, May 24 2017

Erik? Who would be the best owner for this? Please update. Thanks!

Comment 6 by estaab@chromium.org, May 24 2017

Labels: -Pri-1 Pri-2
Owner: serg...@chromium.org
I'd like more info on what would need to be done here and why it prevents the issue in the postmortem.
Cc: machenb...@chromium.org
Owner: estaab@chromium.org
The issue is that slaves.cfg files are executed on masters to extract some configuration (probably the master->builder mapping; machenbach@ will know more). As a result, they can affect master processes and take them down, as happened in the outage described in the postmortem. If we executed slaves.cfg files in a subprocess, they would not be able to affect the global Python state of a master, which would prevent outages of this kind.

Comment 8 by estaab@chromium.org, May 24 2017

Ok, so we still want to execute them master-side but in a subprocess, right?
Yes.
The other way around. The slave-side execution should be done in a subprocess.

slaves.cfg is a master configuration file and it is standard buildbot logic to execute the slaves.cfg on the master.

Chrome has extra code that also makes slaves execute the slaves.cfg on startup to figure out which master they belong to. This execution might go wrong and prevent the slave from starting up properly, which happened in the underlying bug.
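
For illustration, here is a minimal sketch of what running a slaves.cfg in a subprocess could look like. It is not the actual Chrome infra code. The names `load_slaves_cfg`, `ConfigError`, and `_CHILD` are hypothetical, and it assumes the config file defines a top-level `slaves` list of JSON-serializable dicts. The child process executes the file and prints the list as JSON, so a broken config can only crash or hang the child, not the process that asked for the data.

```python
import json
import subprocess
import sys

# Script run in the child process. It executes the config file in a fresh
# namespace and writes the `slaves` list (an assumed variable name) as JSON.
# Anything the config prints is redirected to stderr so stdout stays clean.
_CHILD = r"""
import contextlib, json, sys
path = sys.argv[1]
namespace = {}
with open(path) as f:
    source = f.read()
with contextlib.redirect_stdout(sys.stderr):
    exec(compile(source, path, 'exec'), namespace)
json.dump(namespace.get('slaves', []), sys.stdout)
"""


class ConfigError(Exception):
    """Raised when the config cannot be evaluated in the child process."""


def load_slaves_cfg(path, timeout=30):
    """Evaluate a slaves.cfg-style file in a subprocess and return its list.

    Hypothetical helper: errors, hangs, or global-state changes caused by
    the config stay confined to the child process.
    """
    try:
        proc = subprocess.run(
            [sys.executable, '-c', _CHILD, path],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired as e:
        raise ConfigError('%s timed out after %ss' % (path, timeout)) from e
    if proc.returncode != 0:
        raise ConfigError('%s failed: %s' % (path, proc.stderr.strip()))
    try:
        return json.loads(proc.stdout)
    except ValueError as e:
        raise ConfigError('%s produced unparsable output' % path) from e
```

With something like this, the caller could catch `ConfigError` and skip or report the faulty file instead of failing to start.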

Comment 11 by efoo@chromium.org, Sep 7 2017

This is blocking closing of cit-pm-18. Any updates on this? 
No updates. We're pushing to get the LUCI migration done, so changes to buildbot will continue to be lower priority.
Status: WontFix (was: Assigned)
