
Issue 656238

Starred by 2 users

Issue metadata

Status: WontFix
Owner:
Closed: Dec 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug




provision failure: 'ssh: connect to host <hostname> port 22: Connection refused'

Project Member Reported by kevcheng@chromium.org, Oct 15 2016

Issue description

A rash of failures on a veyron_mighty-paladin run: https://uberchromegw.corp.google.com/i/chromeos/builders/veyron_mighty-paladin/builds/3332

One of the failures from that build: https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/80122053-chromeos-test/chromeos4-row6-rack10-host17/sysinfo/

2016/10/10 01:04:49.544 DEBUG|      auto_updater:0903| Start post check for rootfs update...
2016/10/10 01:04:49.546 DEBUG|    cros_build_lib:0565| RunCommand: ssh -p 22 '-oConnectionAttempts=4' '-oUserKnownHostsFile=/dev/null' '-oProtocol=2' '-oConnectTimeout=30' '-oServerAliveCountMax=3' '-oStrictHostKeyChecking=no' '-oServerAliveInterval=10' '-oNumberOfPasswordPrompts=0' '-oIdentitiesOnly=yes' -i /tmp/ssh-tmpIhLbBB/testing_rsa root@chromeos4-row6-rack10-host17 -- rootdev -s
2016/10/10 01:04:49.891 DEBUG|      auto_updater:0307| Current root device is /dev/mmcblk0p3
2016/10/10 01:04:49.893 DEBUG|    cros_build_lib:0565| RunCommand: ssh -p 22 '-oConnectionAttempts=4' '-oUserKnownHostsFile=/dev/null' '-oProtocol=2' '-oConnectTimeout=30' '-oServerAliveCountMax=3' '-oStrictHostKeyChecking=no' '-oServerAliveInterval=10' '-oNumberOfPasswordPrompts=0' '-oIdentitiesOnly=yes' -i /tmp/ssh-tmpIhLbBB/testing_rsa root@chromeos4-row6-rack10-host17 -- cgpt show -n -i 4 -P '$(rootdev -s -d)'
2016/10/10 01:04:50.258 DEBUG|    cros_build_lib:0614| (stdout):
2

2016/10/10 01:04:50.258 DEBUG|    cros_build_lib:0616| (stderr):
Warning: Permanently added 'chromeos4-row6-rack10-host17,100.115.197.113' (RSA) to the list of known hosts.

2016/10/10 01:04:50.259 DEBUG|    cros_build_lib:0565| RunCommand: ssh -p 22 '-oConnectionAttempts=4' '-oUserKnownHostsFile=/dev/null' '-oProtocol=2' '-oConnectTimeout=30' '-oServerAliveCountMax=3' '-oStrictHostKeyChecking=no' '-oServerAliveInterval=10' '-oNumberOfPasswordPrompts=0' '-oIdentitiesOnly=yes' -i /tmp/ssh-tmpIhLbBB/testing_rsa root@chromeos4-row6-rack10-host17 -- cgpt show -n -i 2 -P '$(rootdev -s -d)'
2016/10/10 01:04:50.616 DEBUG|    cros_build_lib:0614| (stdout):
1

2016/10/10 01:04:50.616 DEBUG|    cros_build_lib:0616| (stderr):
Warning: Permanently added 'chromeos4-row6-rack10-host17,100.115.197.113' (RSA) to the list of known hosts.

...
...
...

2016/10/10 01:12:57.037 INFO |     remote_access:0371| Cannot connect to device; reboot in progress.
2016/10/10 01:12:57.037 ERROR|    cros_build_lib:0660| Reboot has not completed after 480 seconds; giving up.
2016/10/10 01:12:57.038 DEBUG|    cros_build_lib:0565| RunCommand: ssh -p 22 '-oConnectionAttempts=4' '-oUserKnownHostsFile=/dev/null' '-oProtocol=2' '-oConnectTimeout=30' '-oServerAliveCountMax=3' '-oStrictHostKeyChecking=no' '-oServerAliveInterval=10' '-oNumberOfPasswordPrompts=0' '-oIdentitiesOnly=yes' -i /tmp/ssh-tmpIhLbBB/testing_rsa root@chromeos4-row6-rack10-host17 -- rm -rf /mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.dWjRcwmYjB
2016/10/10 01:13:00.111 ERROR|     remote_access:0832| Error connecting to device chromeos4-row6-rack10-host17
2016/10/10 01:13:00.112 DEBUG|       cros_update:0224| Error happens in CrOS auto-update: SSHConnectionError('ssh: connect to host chromeos4-row6-rack10-host17 port 22: Connection refused\r\n',)
 
Hmm... looks like it hit all of the hwtests in this master run: https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/12578
Summary: provision failure: 'ssh: connect to host <hostname> port 22: Connection refused' (was: provision failure: 'ssh: connect to host <hostname> port 22: Connection refused\r\n')

Comment 6 by xixuan@chromium.org, Oct 17 2016

Looks like there was some flakiness between devservers and DUTs during the first several hours of Oct 10.

Holding this bug for tracking.
Owner: xixuan@chromium.org
Status: Assigned (was: Untriaged)
Assigning to Xixuan for now (reassign to someone else if needed).

Comment 8 by xixuan@chromium.org, Oct 19 2016

Actually I'm not sure whether it happens often. 

Maybe we can retry some specific commands in auto-update to ride out this kind of network flakiness. For now, we can hold this bug and see whether more failures caused by network issues follow.
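
A minimal sketch of the retry idea, not the actual auto-update code. `run_with_retry`, `TransientSSHError`, and the attempt and delay defaults are hypothetical names and values chosen for illustration. The wrapper retries a callable only when it raises the connection-level error, and waits with exponential backoff between attempts.

```python
import time


class TransientSSHError(Exception):
    """Hypothetical stand-in for a connection-level SSH failure."""


def run_with_retry(cmd_fn, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call cmd_fn(), retrying on TransientSSHError with exponential backoff.

    cmd_fn: zero-argument callable that runs one remote command.
    max_attempts: total number of tries, including the first one.
    base_delay: seconds to wait after the first failure; doubles each retry.
    sleep: injectable so callers (and tests) can avoid real waiting.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return cmd_fn()
        except TransientSSHError:
            # Out of attempts: surface the original error to the caller.
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

Only idempotent commands (for example `rootdev -s` or the `cgpt show` queries in the log above) would be safe to wrap this way. Retrying a command that changes device state would need more care.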

Comment 9 by nxia@chromium.org, Oct 19 2016

Cc: nxia@chromium.org
Status: WontFix (was: Assigned)
Looks like we are no longer seeing failures on the scale we hit on Oct 15. I'm closing this bug for now. Feel free to re-open it.

It seems that 'connection refused' probably comes from network flakiness or authorization issues, not from the code.
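
For triage, a rough sketch of how the two suspected causes could be told apart from ssh's stderr. `classify_ssh_failure` is a hypothetical helper, and the marker strings are common OpenSSH messages rather than an exhaustive list, so treat the result as a hint only.

```python
# Substrings of common OpenSSH error messages, lowercased. Not exhaustive.
NETWORK_MARKERS = ("connection refused", "connection timed out", "no route to host")
AUTH_MARKERS = ("permission denied", "host key verification failed")


def classify_ssh_failure(stderr):
    """Return 'network', 'auth', or 'unknown' for an ssh stderr string."""
    text = stderr.lower()
    if any(marker in text for marker in AUTH_MARKERS):
        return "auth"
    if any(marker in text for marker in NETWORK_MARKERS):
        return "network"
    return "unknown"
```

Applied to the error in this report, the 'Connection refused' text falls in the network bucket.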
