Daisy_paladin & guado_moblab-paladin don't start
Issue description
Master-paladin (build170-m2): https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/10691
daisy-paladin: https://uberchromegw.corp.google.com/i/chromeos/builders/daisy-paladin
moblab-paladin: https://uberchromegw.corp.google.com/i/chromeos/waterfall?builder=guado_moblab-paladin
Not sure whether it's a big deal. Feel free to change labels/priority.
,
Apr 4 2016
build139-m2, build116-m2, build143-m2 are offline.
,
Apr 4 2016
trooper@
,
Apr 4 2016
The chromeos master isn't even loading for me right now. Investigating.
,
Apr 4 2016
It's in the middle of booting up. Someone must've restarted it manually. We'll just have to wait until it's done.
,
Apr 4 2016
This is actually really bad. The master is in the middle of its daily cycle and we're trying to debug it. Any idea who decided to restart it? That sort of thing needs to be coordinated with the Infra trooper.
,
Apr 4 2016
Issue 600526 has been merged into this issue.
,
Apr 4 2016
The master is continuously loading builds into memory. Either someone keeps restarting it manually, or something is constantly asking the master to load all these pages.
,
Apr 4 2016
We're going to try reverting https://chromereviews.googleplex.com/387507013 and see if that helps. It might be a case where the floating builder algorithm goes wonky if one of the slaves is offline.
,
Apr 4 2016
Looks like the revert fixed it. The master has been responding well ever since.
,
Apr 4 2016
Alright, I'm going to reopen the tree then.
,
Apr 4 2016
The waterfall still claims that several important builders are offline (updating here soon). Closing the tree.
,
Apr 4 2016
We're working on getting those up:
https://bugs.chromium.org/p/chromium/issues/detail?id=600557
https://bugs.chromium.org/p/chromium/issues/detail?id=600559
https://bugs.chromium.org/p/chromium/issues/detail?id=600561
https://bugs.chromium.org/p/chromium/issues/detail?id=600562
https://bugs.chromium.org/p/chromium/issues/detail?id=600563
,
Apr 4 2016
Builders that are offline and their corresponding config names, according to https://uberchromegw.corp.google.com/i/chromeos/buildslaves :
build107-m2 daisy-chromium-pfq
build111-m2 [paladin float]
build116-m2 guado_moblab-paladin
build125-m2 x86-alex-chrome-pfq
build139-m2 daisy-paladin
build143-m2 x86-mario-paladin
build149-m2 x86-generic-chromium-pfq
build158-m2 amd64-generic-chromium-pfq
build183-m2 arm-generic_freon-chromium-pfq
build243-m2 lakitu-paladin
build259-m2 daisy_skate-chrome-pfq
build294-m2 peach_pit-chrome-pfq
,
Apr 4 2016
Hmm. Now a different set of build slaves are offline. Is this just some sort of rolling slave reboot?
,
Apr 4 2016
Sorry, I was just watching the master and seeing if it would restart the slaves. Currently, they all look idle to me. Is that not the case?
,
Apr 4 2016
On the waterfall all the CQ slaves do seem to list "idle", but on the buildslave list https://uberchromegw.corp.google.com/i/chromeos/buildslaves I still see multiple Not Connected buildslaves.
,
Apr 4 2016
Ah, I see. Another question: For example on guado_moblab-paladin, I see that there are 2 build slaves. One is connected and the other is offline. Are both required to be connected for the CQ to run?
,
Apr 5 2016
No, only 1 is required, but if one is offline that often indicates a problem (and we have a limited number of backup floats, which are shared, so if two primaries are offline I think we are already hosed).
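To make the rule above concrete, here is a minimal Python sketch. It is not the actual master.chromiumos floating-builder code; the function name `assign_slaves` and the simple first-come float allocation are assumptions made purely for illustration. The slave and config names are taken from the offline list earlier in this thread, and treating each as that builder's single primary is also an assumption.

```python
def assign_slaves(builders, floats, connected):
    """Pick one slave per builder (hypothetical model, not the real algorithm).

    builders:  dict mapping builder name -> list of its primary slave hostnames
    floats:    list of shared backup ("float") slave hostnames
    connected: set of hostnames currently connected to the master
    Returns a dict mapping builder name -> chosen hostname, or None if the
    builder has nothing to run on.
    """
    # Only connected floats can stand in for an offline primary.
    free_floats = [host for host in floats if host in connected]
    assignment = {}
    for name, primaries in builders.items():
        online = [host for host in primaries if host in connected]
        if online:
            # One connected slave is enough for the builder to run.
            assignment[name] = online[0]
        elif free_floats:
            # Primary is offline: borrow a shared float, which is then used up.
            assignment[name] = free_floats.pop(0)
        else:
            # No primary and no float left: this builder cannot start.
            assignment[name] = None
    return assignment


# Assumed layout: one primary per builder, one shared paladin float.
builders = {
    "daisy-paladin": ["build139-m2"],
    "guado_moblab-paladin": ["build116-m2"],
    "x86-mario-paladin": ["build143-m2"],
}
floats = ["build111-m2"]
connected = {"build116-m2", "build111-m2"}  # two primaries offline

result = assign_slaves(builders, floats, connected)
for builder, host in result.items():
    print(builder, "->", host or "cannot start")
```

With two primaries offline and a single float, the first affected builder takes the float and the second has nothing left, which is the "already hosed" case described above.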
,
Apr 5 2016
The waterfall has been restarted, the 6 dead slaves have been replaced, and it is back online.
,
Apr 5 2016
There are 2 other dead slaves shown on the public waterfall: build85-m2 and build91-m2. Could those be brought back up or replaced as well?
,
Apr 5 2016
build85-m2 is up and running.
build91-m2 is being looked at: https://bugs.chromium.org/p/chromium/issues/detail?id=600599
,
Apr 5 2016
Great, thanks!
,
Apr 5 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/tools/build.git/+/2bad7ba7bb69dbd9d6acbb4104d809ff8e2c1515

commit 2bad7ba7bb69dbd9d6acbb4104d809ff8e2c1515
Author: dnj@chromium.org <dnj@chromium.org>
Date: Tue Apr 05 01:08:31 2016

CrOS: Replace broken slaves on ChromiumOS.

NOPRESUBMIT=true
TBR=bpastene@chromium.org
BUG= chromium:600479
TEST=None

Review URL: https://codereview.chromium.org/1862513002

git-svn-id: svn://svn.chromium.org/chrome/trunk/tools/build@299690 0039d316-1c4b-4281-b951-d872f2087c98

[modify] https://crrev.com/2bad7ba7bb69dbd9d6acbb4104d809ff8e2c1515/masters/master.chromiumos/slave_pool.json
[modify] https://crrev.com/2bad7ba7bb69dbd9d6acbb4104d809ff8e2c1515/masters/master.chromiumos/slaves.cfg
,
Apr 5 2016
Just finished, everything should be good to go now.
,
Apr 5 2016
Alright, reopening tree now then.
,
May 23 2016
Bulk verified
Comment 1 by akes...@chromium.org
, Apr 4 2016