Issue 650507

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Oct 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 2
Type: Bug



Android Builder purple half the time

Project Member Reported by martiniss@chromium.org, Sep 27 2016

Issue description

https://build.chromium.org/p/chromium.fyi/builders/Android%20Builder%20%28dbg%29

Looks like something is restarting buildbot in the middle of a build? ddoman@, this looks like your change.
 
In particular, the machine has both /etc/init.d/buildbot and /etc/infra-services/buildbot.json.

It looks like init.d starts buildbot first; later Puppet runs and executes stop_init.d_buildbot, and then service_manager starts buildbot again itself.

OR, the process is always started by service_manager, but "stop_init.d_buildbot" kills it anyway because it happens to find twisted.pid :)
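To tell the two scenarios apart on a bot, it helps to see which PID twisted.pid points at and whether that process is still running. A minimal diagnostic sketch, assuming twisted.pid holds a single integer PID (as twistd pid files normally do) and lives at /b/build/slave/twisted.pid; that path and the helper names are my assumptions, not something confirmed on the bot:

```python
import os


def read_pid(pid_path):
    """Return the integer PID stored in pid_path, or None if missing or invalid."""
    try:
        with open(pid_path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def is_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # Signal 0 only checks for existence; nothing is delivered.
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # The process exists but belongs to another user.
    return True


if __name__ == "__main__":
    # Assumed location of the pid file; adjust for the bot being inspected.
    pid = read_pid("/b/build/slave/twisted.pid")
    print("twisted.pid ->", pid, "alive" if is_alive(pid) else "not running")
```

If the PID printed here matches the one service_manager tracks, then stop_init.d_buildbot is killing a process that init.d never started.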

Also, /b/build/slave/twisted.log is no longer produced :( Where is the buildbot slave log supposed to be now that it runs under service_manager?

Puppet log:
$ tail  /var/log/puppet/run_puppet.log
Info: Retrieving plugin
Info: Loading facts
Could not retrieve fact='package_provider', resolution='<anonymous>': uninitialized constant Gem
Info: Caching catalog for vm823-m1.golo.chromium.org
Info: Applying configuration version '1474935517'
Notice: /Stage[main]/Chrome_infra::Buildbot::Debian/Exec[stop_init.d_buildbot]/returns: executed successfully
Notice: /Stage[main]/Chrome_infra::Setup::Mach_info/Chrome_infra::Win_file[/b/build/slave/info/host]/File[/b/build/slave/info/host]/content: content changed '{md5}37067ef6cc36952f43bf344585532a6d' to '{md5}5ca2e09337f009de882272ab10fd2cb1'
Notice: Finished catalog run in 10.90 seconds
Finshed: Mon Sep 26 17:31:09 PDT 2016
--------------------------------------------------


"/Stage[main]/Chrome_infra::Buildbot::Debian/Exec[stop_init.d_buildbot]/returns: executed successfully" is repeated on every Puppet run (I assume this means it kills buildbot on each run).
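To check the "on each run" claim against the whole log rather than the tail, something like the following could be used. It is only a sketch: the function name is made up, and it assumes every completed run logs a "Finished catalog run" notice, as the excerpt above does.

```python
STOP_MARKER = "Exec[stop_init.d_buildbot]/returns: executed successfully"
RUN_MARKER = "Notice: Finished catalog run"


def summarize_puppet_log(log_text):
    """Return (stop_count, run_count) for the given Puppet log text."""
    stops = 0
    runs = 0
    for line in log_text.splitlines():
        if STOP_MARKER in line:
            stops += 1
        if line.startswith(RUN_MARKER):
            runs += 1
    return stops, runs


if __name__ == "__main__":
    with open("/var/log/puppet/run_puppet.log") as f:
        stops, runs = summarize_puppet_log(f.read())
    print("stop_init.d_buildbot executed in %d of %d completed runs" % (stops, runs))
```

If the two counts match, the exec fires on every completed run, which would line up with the builder going purple about as often as Puppet runs mid-build.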

Comment 2 by ddoman@google.com, Sep 27 2016

https://chromereviews.googleplex.com/510307013/

vadmish@, you are right. That's what's happening on the Linux bots, since /etc/init.d/buildbot just runs "make stop" in /b/build/slave.

This is a CL to remove chrome_infra::buildbot from all of the Linux buildbots for master.chromium.fyi.

This issue does not happen on the Mac bots because launchctl doesn't interfere with service_manager.


Beware that, at least on vm823-m1, the buildbot slave was indeed running under service_manager (as seen by examining /var/run/infra-services/buildbot and the PID there).

IIRC, service_manager will kill the Buildbot process when the service is removed: https://chromium.googlesource.com/infra/infra/+/master/infra/services/service_manager/config_watcher.py#269

Please ensure all buildbot slaves on chromium.fyi are online once the Puppet CL hits them.

Comment 4 by bugdroid1@chromium.org, Sep 27 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infra/puppet/+/112459819ab810cfcbd8bb1aef212b143d9c0bd1

commit 112459819ab810cfcbd8bb1aef212b143d9c0bd1
Author: Scott Lee <ddoman@chromium.org>
Date: Tue Sep 27 06:09:42 2016

Status: Fixed (was: Assigned)
This issue seems to be fixed - the bot was green for quite a while. Closing.