Tests failed or aborted because of network connection issues (ssh connection failures, rsync failures, remote command failures)
Issue description

https://luci-milo.appspot.com/buildbot/chromeos/peppy-paladin/14929
http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=113662642
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/113662642-chromeos-test/chromeos4-row6-rack13-host19/

04/21 07:56:45.177 DEBUG| abstract_ssh:0357| Using Rsync.
04/21 07:56:45.178 DEBUG| base_utils:0185| Running 'rsync -l --timeout=1800 --rsh='/usr/bin/ssh -a -x -o ControlPath=/tmp/_autotmp_iXMfJQssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22' -az --no-o --no-g root@chromeos4-row6-rack13-host19:"/usr/local/autotest/results/default/" "/usr/local/autotest/results/113662642-chromeos-test/chromeos4-row6-rack13-host19"'
04/21 07:56:55.051 WARNI| abstract_ssh:0387| trying scp, rsync failed: Command <rsync -l --timeout=1800 --rsh='/usr/bin/ssh -a -x -o ControlPath=/tmp/_autotmp_iXMfJQssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22' -az --no-o --no-g root@chromeos4-row6-rack13-host19:"/usr/local/autotest/results/default/" "/usr/local/autotest/results/113662642-chromeos-test/chromeos4-row6-rack13-host19"> failed, rc=23, Command returned non-zero exit status
* Command: rsync -l --timeout=1800 --rsh='/usr/bin/ssh -a -x -o ControlPath=/tmp/_autotmp_iXMfJQssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22' -az --no-o --no-g root@chromeos4-row6-rack13-host19:"/usr/local/autotest/results/default/" "/usr/local/autotest/results/113662642-chromeos-test/chromeos4-row6-rack13-host19"
Exit status: 23
Duration: 9.76802301407
stderr:
rsync: change_dir "/usr/local/autotest/results/default" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [Receiver=3.1.0]
rsync: [Receiver] write error: Broken pipe (32) (23)
04/21 07:56:55.053 DEBUG| abstract_ssh:0390| Trying scp.
04/21 07:56:55.054 DEBUG| ssh_host:0284| Running (ssh) 'ls "/usr/local/autotest/results/default/"*'
04/21 07:57:09.410 DEBUG| base_utils:0280| [stderr] ls: cannot access /usr/local/autotest/results/default/*: No such file or directory
04/21 07:57:16.276 DEBUG| ssh_host:0284| Running (ssh) 'ls "/usr/local/autotest/results/default/".[!.]*'
04/21 07:57:24.399 DEBUG| base_utils:0280| [stderr] ls: cannot access /usr/local/autotest/results/default/.[!.]*: No such file or directory
04/21 07:57:24.403 DEBUG| server_job:1371| Client state file /usr/local/autotest/results/113662642-chromeos-test/chromeos4-row6-rack13-host19/control.autoserv.state not found
04/21 07:57:24.405 DEBUG| base_job:0392| Persistent state client.* deleted
04/21 07:57:24.407 DEBUG| autotest:0966| Autotest job finishes.
04/21 07:57:24.408 ERROR| server_job:0809| Exception escaped control file, job aborting:
Traceback (most recent call last):
  File "/usr/local/autotest/server/server_job.py", line 801, in run
    self._execute_code(server_control_file, namespace)
  File "/usr/local/autotest/server/server_job.py", line 1301, in _execute_code
    execfile(code_file, namespace, namespace)
  File "/usr/local/autotest/results/113662642-chromeos-test/chromeos4-row6-rack13-host19/control.srv", line 10, in <module>
    job.parallel_simple(run_client, machines)
  File "/usr/local/autotest/server/server_job.py", line 625, in parallel_simple
    return_results=return_results)
  File "/usr/local/autotest/server/subcommand.py", line 93, in parallel_simple
    function(arg)
  File "/usr/local/autotest/results/113662642-chromeos-test/chromeos4-row6-rack13-host19/control.srv", line 7, in run_client
    at.run(control, host=host, use_packaging=use_packaging)
  File "/usr/local/autotest/server/autotest.py", line 381, in run
    client_disconnect_timeout, use_packaging=use_packaging)
  File "/usr/local/autotest/server/autotest.py", line 464, in _do_run
    client_disconnect_timeout=client_disconnect_timeout)
  File "/usr/local/autotest/server/autotest.py", line 896, in execute_control
    boot_id = self.host.get_boot_id()
  File "/usr/local/autotest/client/common_lib/hosts/base_classes.py", line 238, in get_boot_id
    boot_id = self.run(cmd, timeout=timeout).stdout.strip()
  File "/usr/local/autotest/server/hosts/ssh_host.py", line 300, in run
    raise error.AutoservRunError(timeout_message, cmderr.args[1])
AutoservRunError: Timeout encountered: /usr/bin/ssh -a -x -o ControlPath=/tmp/_autotmp_ryEZZ8ssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos4-row6-rack13-host19 "export LIBC_FATAL_STDERR_=1; if type \"logger\" > /dev/null 2>&1; then logger -tag \"autotest\" \"server[stack::_do_run|execute_control|get_boot_id] -> ssh_run(if [ -f '/proc/sys/kernel/random/boot_id' ]; then cat '/proc/sys/kernel/random/boot_id'; else echo 'no boot_id available'; fi)\";fi; if [ -f '/proc/sys/kernel/random/boot_id' ]; then cat '/proc/sys/kernel/random/boot_id'; else echo 'no boot_id available'; fi"
* Command: /usr/bin/ssh -a -x -o ControlPath=/tmp/_autotmp_ryEZZ8ssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos4-row6-rack13-host19 "export LIBC_FATAL_STDERR_=1; if type \"logger\" > /dev/null 2>&1; then logger -tag \"autotest\" \"server[stack::_do_run|execute_control|get_boot_id] -> ssh_run(if [ -f '/proc/sys/kernel/random/boot_id' ]; then cat '/proc/sys/kernel/random/boot_id'; else echo 'no boot_id available'; fi)\";fi; if [ -f '/proc/sys/kernel/random/boot_id' ]; then cat '/proc/sys/kernel/random/boot_id'; else echo 'no boot_id available'; fi"
Exit status: 255
Duration: 61.5624740124
stdout: 4c362b13-4584-4c9a-86dd-59720bc9b38e
Apr 21 2017
Apr 22 2017
https://luci-milo.appspot.com/buildbot/chromeos/x86-zgb-paladin/9662
Apr 22 2017
I'm going to merge several related bugs into this one. They were all caused by network connectivity issues but showed different symptoms: ssh failures, remote command failures, rsync failures, etc.
Apr 22 2017
Issue 713535 has been merged into this issue.
Apr 22 2017
Apr 22 2017
Issue 713011 has been merged into this issue.
Apr 22 2017
Issue 714275 has been merged into this issue.
Apr 22 2017
Apr 24 2017
Issue 714286 has been merged into this issue.
Apr 25 2017
May 16 2017
May 16 2017
This is believed to be fixed. Please see the postmortem for more details: http://shortn/_6ecl5t5CQ7
Jun 20 2017
Issue 714252 has been merged into this issue.
Aug 1 2017
Jan 22 2018
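For anyone triaging similar reports: the two failures in the log above carry different exit codes, and those codes can help sort the merged symptoms apart. Per the ssh(1) man page, ssh exits with 255 when ssh itself hits an error, and per rsync(1), exit code 23 means "Partial transfer due to error". Below is a minimal sketch in Python of sorting a failed command into a coarse bucket. The function name `classify_failure` and the bucket labels are hypothetical, made up for illustration; they are not part of autotest.

```python
# Exit-code meanings as documented in the ssh(1) and rsync(1) man pages.
SSH_ERROR = 255              # ssh itself reported an error
RSYNC_PARTIAL_TRANSFER = 23  # "Partial transfer due to error"


def classify_failure(command, exit_status, stderr=""):
    """Return a coarse triage label for one finished command.

    The labels are hypothetical names chosen for this sketch.
    """
    words = command.strip().split()
    # Take the basename of the executable, so "/usr/bin/ssh" becomes "ssh".
    tool = words[0].rsplit("/", 1)[-1] if words else ""

    if exit_status == 0:
        return "ok"
    if tool == "ssh" and exit_status == SSH_ERROR:
        # Caveat: a remote command that itself exits 255 looks the same.
        return "ssh-connection-failure"
    if tool == "rsync" and exit_status == RSYNC_PARTIAL_TRANSFER:
        if "No such file or directory" in stderr:
            return "rsync-missing-source"
        return "rsync-partial-transfer"
    return "remote-command-failure"


if __name__ == "__main__":
    print(classify_failure("/usr/bin/ssh -a -x -l root host cmd", 255))
    print(classify_failure(
        "rsync -l --timeout=1800 src dst", 23,
        'rsync: change_dir "/usr/local/autotest/results/default" failed: '
        "No such file or directory (2)"))
```

Note that the rsync failure in this report would land in the missing-source bucket rather than a pure network bucket, so a classifier like this is only a first pass; the surrounding log lines still need a human look.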
Comment 1 by nxia@chromium.org, Apr 21 2017