winky-paladin failed at HWTest stage due to SSHConnectionError
Issue description
winky-paladin failed at HWTest Stage due to SSHConnectionError two times in a row:
https://uberchromegw.corp.google.com/i/chromeos/builders/winky-paladin/builds/766
https://uberchromegw.corp.google.com/i/chromeos/builders/winky-paladin/builds/765
In build 766: chromeos4-row3-rack12-host7 failed to run provision_AutoUpdate.double test due to Provisioning failure:
DevServerException: CrOS auto-update failed for host chromeos4-row3-rack12-host7: SSHConnectionError: ssh: connect to host chromeos4-row3-rack12-host7 port 22: Connection timed out
In build 765 chromeos4-row3-rack12-host15 failed to run login_RemoteOwnership test also due to SSHConnectionError: ssh: connect to host chromeos4-row3-rack12-host15 port 22: Connection timed out
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 804, in _call_test_function
    return func(*args, **dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 461, in execute
    dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 347, in _call_run_once_with_retry
    postprocess_profiled_run, args, dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 376, in _call_run_once
    self.run_once(*args, **dargs)
  File "/usr/local/autotest/server/site_tests/provision_AutoUpdate/provision_AutoUpdate.py", line 113, in run_once
    force_full_update=force)
  File "/usr/local/autotest/server/afe_utils.py", line 232, in machine_install_and_update_labels
    *args, **dargs)
  File "/usr/local/autotest/server/hosts/cros_host.py", line 728, in machine_install_by_devserver
    full_update=force_full_update)
  File "/usr/local/autotest/client/common_lib/cros/dev_server.py", line 2013, in auto_update
    raise DevServerException(error_msg % (host_name, error_list[0]))
DevServerException: CrOS auto-update failed for host chromeos4-row3-rack12-host15: SSHConnectionError: ssh: connect to host chromeos4-row3-rack12-host15 port 22: Connection timed out
Why the winky-paladin DUTs always failed to SSH into during test? Is this a network flaky? However, it happened two times in a row.
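For anyone trying to tell a network problem from a DUT that is down, a quick TCP probe of port 22 can help. The following is only a minimal sketch: the helper name ssh_port_reachable and the 5 second default timeout are assumptions for illustration and are not part of autotest.

```python
import socket


def ssh_port_reachable(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it is not part of autotest.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Timeouts, refused connections and DNS failures all end up here.
        return False
```

Run from a machine on the lab network, a call such as ssh_port_reachable("chromeos4-row3-rack12-host7") only says whether something is listening on port 22. It cannot tell a crashed DUT from a dead ethernet link, so it complements the repair logs rather than replacing them.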
,
Feb 18 2017
jrbarnette@, is this the same cause?
,
Feb 18 2017
Bug 692342 is hardware specific, and can only occur on kevin. In this case, something caused the DUT to crash and stay down during provisioning. One possible cause would be a bug in the Chrome OS image we installed, but we'd need more detail about when the DUT crashed to guess at what happened.
,
Feb 18 2017
> Why the winky-paladin DUTs always failed to SSH into during test?
> Is this a network flaky? However, it happened two times in a row.
I checked provision jobs on CQ DUTs for the past 24 hours. There were
two provision failures in two different builds:
chromeos4-row3-rack12-host7
2017-02-17 13:47:50 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row3-rack12-host7/307906-provision/
chromeos4-row3-rack12-host15
2017-02-17 10:38:40 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row3-rack12-host15/307693-provision/
Repair showed both DUTs had the same symptoms:
* Offline at the outset
* Toggling AC power and keyboard sysrq didn't get the device's
attention.
* The devices rebooted cleanly after servo reset.
The symptoms are somewhat consistent with bug 677572, but the winky
devices seem to be using smsc75xx devices for ethernet, and that bug
complains about a problem with smsc95xx devices.
,
Feb 18 2017
For reference, here are the two repair jobs:
2017-02-17 14:22:48 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row3-rack12-host7/308035-repair/
2017-02-17 11:14:10 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row3-rack12-host15/307777-repair/
,
Feb 18 2017
Eventlog from chromeos4-row3-rack12-host15, truncated around the failure event:
133 | 2017-02-16 10:12:38 | System Reset
134 | 2017-02-16 11:07:36 | Kernel Event | Clean Shutdown
135 | 2017-02-16 11:07:37 | System boot | 10399
136 | 2017-02-16 11:07:37 | System Reset
137 | 2017-02-16 11:12:06 | Kernel Event | Clean Shutdown
138 | 2017-02-16 11:12:06 | System boot | 10400
139 | 2017-02-16 11:12:06 | System Reset
140 | 2017-02-16 11:13:08 | Kernel Event | Clean Shutdown
141 | 2017-02-16 11:13:08 | System boot | 10401
142 | 2017-02-16 11:13:08 | System Reset
143 | 2017-02-16 13:58:30 | Kernel Event | Clean Shutdown
Same information relative to chromeos4-row3-rack12-host7:
99 | 2017-02-16 13:58:21 | System Reset
100 | 2017-02-16 14:02:41 | Kernel Event | Clean Shutdown
101 | 2017-02-16 14:02:42 | System boot | 7423
102 | 2017-02-16 14:02:42 | System Reset
103 | 2017-02-16 14:03:36 | Kernel Event | Clean Shutdown
104 | 2017-02-16 14:03:37 | System boot | 7424
105 | 2017-02-16 14:03:37 | System Reset
106 | 2017-02-16 16:33:44 | Kernel Event | Clean Shutdown
I suspect there's nothing much to see here...
,
Feb 18 2017
The information for chromeos4-row3-rack12-host7 was too truncated. Here's what it should have been:
96 | 2017-02-16 11:12:45 | System Reset
97 | 2017-02-16 13:58:21 | Kernel Event | Clean Shutdown
98 | 2017-02-16 13:58:21 | System boot | 7422
99 | 2017-02-16 13:58:21 | System Reset
100 | 2017-02-16 14:02:41 | Kernel Event | Clean Shutdown
101 | 2017-02-16 14:02:42 | System boot | 7423
102 | 2017-02-16 14:02:42 | System Reset
103 | 2017-02-16 14:03:36 | Kernel Event | Clean Shutdown
104 | 2017-02-16 14:03:37 | System boot | 7424
105 | 2017-02-16 14:03:37 | System Reset
106 | 2017-02-16 16:33:44 | Kernel Event | Clean Shutdown
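To compare event timing across DUTs without reading timestamps by eye, the event log entries above can be parsed and the gaps between consecutive events computed. This is a rough sketch that assumes the pipe-separated layout shown in this thread (index, timestamp, event, optional detail). The names Event, parse_eventlog and gaps_in_seconds are made up for illustration, and SAMPLE copies the last four host7 entries.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical record type for one event log line.
Event = namedtuple("Event", "index when description")


def parse_eventlog(text):
    """Parse 'index | timestamp | event [| detail]' lines into Event tuples."""
    events = []
    for line in text.strip().splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Skip anything that does not look like a numbered event line.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        when = datetime.strptime(fields[1], "%Y-%m-%d %H:%M:%S")
        events.append(Event(int(fields[0]), when, " | ".join(fields[2:])))
    return events


def gaps_in_seconds(events):
    """Return the time between each pair of consecutive events, in seconds."""
    return [
        int((later.when - earlier.when).total_seconds())
        for earlier, later in zip(events, events[1:])
    ]


# Last four entries quoted for chromeos4-row3-rack12-host7.
SAMPLE = """
103 | 2017-02-16 14:03:36 | Kernel Event | Clean Shutdown
104 | 2017-02-16 14:03:37 | System boot | 7424
105 | 2017-02-16 14:03:37 | System Reset
106 | 2017-02-16 16:33:44 | Kernel Event | Clean Shutdown
"""

events = parse_eventlog(SAMPLE)
gaps = gaps_in_seconds(events)
```

A long gap followed by a clean shutdown only shows how long the DUT stayed up before its next recorded shutdown. It does not explain why the device later stopped answering SSH.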
,
Feb 28 2017
I think this is the same problem: https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/13772
,
Mar 19 2018
Comment 1 by xixuan@chromium.org, Feb 17 2017