Issue metadata

Status: Archived
Closed: Jan 2017
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 0
Type: Bug

kip shard (chromeos-server42.cbf) is down

Project Member Reported by, Jan 12 2017

Issue description

The shard serving board:kip is down.  The problem caused the
daily lab inventory to fail to send e-mail.  Attempts to get
status of kip boards (with dut-status) fail similarly.

Test login to the server times out:
    $ become chromeos-test@chromeos-server42.cbf
    ssh: connect to host port 22: Connection timed out

I haven't checked the waterfall status, but this is bound
to be affecting the CQ and the kip canary and release builders.

The CQ has been failing because the kip-paladin can't test.
This has been going on since this build last night:

Please remember to add sheriffs to these bugs!

How rude of me... and thank you for opening it!

By the way, how would the sheriff tell that the shard is down?  The logs show mostly timeouts.  Are we missing an opportunity for a clearer error message?
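
One way to get a clearer message might be a direct TCP probe of the shard's ssh port before any lab tooling runs. The sketch below is only an illustration using the Python standard library: the function name `describe_shard_reachability`, the default port and timeout, and the wording of the messages are all assumptions, not part of any existing lab tool.

```python
import socket


def describe_shard_reachability(host, port=22, timeout=5.0):
    """Return a one-line status for a TCP probe of host:port.

    Hypothetical helper: the name and the messages are illustrative only.
    """
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return f"{host}:{port} reachable"
    except socket.timeout:
        # No answer at all within the timeout: the host itself is likely down.
        return (f"{host}:{port} unreachable: connection timed out "
                f"after {timeout}s (host may be down)")
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on the port.
        return (f"{host}:{port} unreachable: connection refused "
                f"(host is up, service is not listening)")
    except OSError as err:
        # Anything else, e.g. a DNS failure or no route to the host.
        return f"{host}:{port} unreachable: {err}"
```

Calling `describe_shard_reachability("chromeos-server42.cbf")` from a machine inside the lab network would then distinguish "the shard is not answering at all" from "the shard is up but a service is broken", which a bare timeout in the logs does not.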


That's an excellent question.

This is my first duty shift in which this has been a problem.

Comment 6 by, Jan 12 2017

Labels: Hotlist-TreeCloser
Status: Fixed
I don't understand WHY it was locked up, but the machine wasn't responding to ping or anything else.

I used "cham --off <host>", "cham --on <host>" to reset it, and it seems to be recovering. Reverifying the duts now.

I'm running reverify against all kip duts, and have seen some of them succeed. I'm calling this fixed.
Status: Started
It's up and running, but appears to be really slow.

"balance-pool cq kip" timed out once, and finished the second time but took multiple minutes to run.

Also, we are still seeing CQ failures because test suites on kip are timing out.
Status: Fixed
Re-marking as fixed after investigation.

Comment 11 by, Mar 4 2017

Labels: VerifyIn-58

Comment 12 by, Apr 17 2017

Labels: VerifyIn-59

Comment 13 by, May 30 2017

Labels: VerifyIn-60
Labels: VerifyIn-61
Status: Archived
