No machine available in moblab cq pool
Issue description
+sheriff / Infradeputy
From log in https://uberchromegw.corp.google.com/i/chromeos/builders/guado_moblab-paladin
1 machine is locked and another 3 are in Repair Failed state.
Attempting to display pool info: cq
host: chromeos2-row1-rack8-host1, status: Ready, locked: True diagnosis: Unused
host: chromeos2-row1-rack8-host3, status: Repair Failed, locked: False diagnosis: Failed repair
host: chromeos2-row2-rack8-host1, status: Repair Failed, locked: False diagnosis: Failed repair
host: chromeos2-row2-rack8-host5, status: Repair Failed, locked: False diagnosis: Failed repair
Reason: Some test(s) was aborted before running, suite must have timed out.
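For anyone triaging a similar failure, the pool info in the builder log can be checked mechanically. The following is a minimal sketch, not part of any lab tooling: the regular expression is inferred only from the four host entries quoted above, and `parse_pool_info` and `usable_hosts` are hypothetical helper names. It treats a host as usable only when its status is Ready and it is not locked, which is an assumption about why the suite could not be scheduled.

```python
import re

# Pool info exactly as quoted from the builder log above.
POOL_INFO = (
    "host: chromeos2-row1-rack8-host1, status: Ready, locked: True diagnosis: Unused "
    "host: chromeos2-row1-rack8-host3, status: Repair Failed, locked: False diagnosis: Failed repair "
    "host: chromeos2-row2-rack8-host1, status: Repair Failed, locked: False diagnosis: Failed repair "
    "host: chromeos2-row2-rack8-host5, status: Repair Failed, locked: False diagnosis: Failed repair "
    "Reason: Some test(s) was aborted before running, suite must have timed out."
)

# Assumed entry layout, inferred from the log excerpt only.
HOST_RE = re.compile(
    r"host: (?P<host>\S+), status: (?P<status>[^,]+), "
    r"locked: (?P<locked>True|False)\s+diagnosis: (?P<diagnosis>.+?)"
    r"(?=\s+host: |\s+Reason: |$)",
    re.S,
)


def parse_pool_info(text):
    """Return one dict per host entry found in the pool info text."""
    hosts = []
    for match in HOST_RE.finditer(text):
        entry = match.groupdict()
        entry["locked"] = entry["locked"] == "True"
        hosts.append(entry)
    return hosts


def usable_hosts(hosts):
    """Hosts that are Ready and unlocked (assumed scheduling criterion)."""
    return [h["host"] for h in hosts if h["status"] == "Ready" and not h["locked"]]


hosts = parse_pool_info(POOL_INFO)
print(len(hosts), "hosts in pool,", len(usable_hosts(hosts)), "usable")
```

On the log above, the only Ready host is locked, so the list of usable hosts comes back empty, which matches the issue title.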
,
Dec 5 2016
From the deputy e-mail earlier today, I see that both the CQ and BVT pools are in a similar state:
Status for pool:bvt, by board:
Board         Bad  Idle  Good  Total
guado_moblab    3     1     0      4
link            1     0     7      8
Status for pool:cq, by board:
Board         Bad  Idle  Good  Total
guado_moblab    3     1     0      4
whirlwind       1     0     7      8
,
Dec 5 2016
Issue 671287 has been merged into this issue.
,
Dec 5 2016
Looking at the CQ pool, all four Moblab hosts are offline (no answer to ping). Looking at the repair logs, this isn't showing up. So, we seem to have two problems:
* The Moblab instances are broken in various ways.
* Repair isn't properly reporting the problems.
Holding just this one bug (for now) while I sort out what's really going on.
,
Dec 5 2016
All four Moblab instances went offline in sequence after provisioning.
It looks like there may be a bad build. All four BVT instances have
a problem, too, so the problem could be ToT (not a bad CL).
$ dut-status -b guado_moblab -p cq -g
chromeos2-row1-rack8-host1
2016-12-05 11:17:46 NO http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row1-rack8-host1/170275-repair/
2016-12-05 11:13:10 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row1-rack8-host1/170272-provision/
2016-11-08 15:42:21 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/84729938-chromeos-test/
2016-11-08 15:30:29 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row1-rack8-host1/116355-provision/
chromeos2-row1-rack8-host3
2016-12-03 20:24:30 NO http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row1-rack8-host3/166260-repair/
2016-12-03 20:19:52 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row1-rack8-host3/166246-provision/
2016-12-02 03:44:07 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row1-rack8-host3/162908-verify/
chromeos2-row2-rack8-host5
2016-12-03 17:32:19 NO http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row2-rack8-host5/166116-repair/
2016-12-03 17:27:41 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row2-rack8-host5/166114-provision/
2016-12-02 03:44:07 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row2-rack8-host5/162909-verify/
chromeos2-row2-rack8-host1
2016-12-03 10:18:23 NO http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row2-rack8-host1/165561-repair/
2016-12-03 10:13:57 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row2-rack8-host1/165558-provision/
2016-12-03 04:30:55 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/88725551-chromeos-test/
2016-12-03 04:16:44 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row2-rack8-host1/165080-provision/
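To compare hosts at a glance, the history above can be reduced to the last known-good time per host. This is a rough sketch under stated assumptions: it parses only the text layout shown in this comment (a host name line followed by timestamp, status, and log URL lines), and it reads `OK` as working, `NO` as broken, and `--` as undetermined, which is an interpretation of the output rather than something stated in this thread. `parse_dut_status` and `last_ok` are hypothetical helper names, and only two of the four hosts are included to keep the sample short.

```python
import re
from datetime import datetime

# Two host histories copied from the dut-status output above.
SAMPLE = """\
chromeos2-row1-rack8-host1
2016-12-05 11:17:46 NO http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row1-rack8-host1/170275-repair/
2016-12-05 11:13:10 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row1-rack8-host1/170272-provision/
2016-11-08 15:42:21 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/84729938-chromeos-test/
2016-11-08 15:30:29 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row1-rack8-host1/116355-provision/
chromeos2-row1-rack8-host3
2016-12-03 20:24:30 NO http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row1-rack8-host3/166260-repair/
2016-12-03 20:19:52 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row1-rack8-host3/166246-provision/
2016-12-02 03:44:07 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row1-rack8-host3/162908-verify/
"""

# One history line: timestamp, status marker, log URL.
EVENT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(OK|NO|--)\s+(\S+)$"
)


def parse_dut_status(text):
    """Map each host name to a list of (time, status, url) tuples."""
    history = {}
    host = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("$"):
            continue  # skip blanks and the echoed command line
        match = EVENT_RE.match(line)
        if match is None:
            host = line  # any non-event line is taken as a host name
            history[host] = []
        elif host is not None:
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            history[host].append((when, match.group(2), match.group(3)))
    return history


def last_ok(history):
    """Most recent OK timestamp per host, or None if there is none."""
    result = {}
    for name, events in history.items():
        oks = [when for when, status, _ in events if status == "OK"]
        result[name] = max(oks) if oks else None
    return result


history = parse_dut_status(SAMPLE)
for name, when in last_ok(history).items():
    print(name, "last OK:", when)
```

In every history shown above, the newest event is a failed repair that directly follows a provision job, which is the pattern described at the top of this comment.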
,
Dec 5 2016
Looking at the logs, it seems all moblab instances were offline at the start of provisioning. So, something happened in the lab (or at least on the DUTs) in between testing.
,
Dec 5 2016
I've filed ticket b/33346512 to request repair and diagnosis of all eight moblab instances.
,
Dec 5 2016
CQ is blocked by this.
,
Dec 5 2016
Most recent moblab paladin passed. This was probably fixed by b/33346512.
,
Dec 6 2016
Only one build has succeeded. Builds 4432, 4431, 4430, and 4429 failed: https://uberchromegw.corp.google.com/i/chromeos/builders/guado_moblab-paladin
,
Dec 6 2016
success on 4433
,
Mar 4 2017
,
Apr 17 2017
,
May 30 2017
,
Aug 1 2017
,
Oct 14 2017
Comment 1 by jrbarnette@chromium.org, Dec 5 2016
Owner: jrbarnette@chromium.org
Status: Assigned (was: Available)