https://uberchromegw.corp.google.com/i/chromeos/builders/swanky-release/builds/725/steps/AUTest%20%5Bau%5D/logs/stdio
autoserv.DEBUG:
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/93972045-chromeos-test/chromeos4-row8-rack11-host1/debug/
We see this:
01/02 05:54:43.988 DEBUG| ssh_host:0177| Running (ssh) 'ls "/tmp/autoserv-pNJ1es" > /dev/null'
01/02 05:54:44.868 DEBUG| ssh_host:0177| Running (ssh) 'find "/tmp/autoserv-pNJ1es" -mindepth 1 -maxdepth 1 -print0 | xargs -0 rm -rf'
01/02 06:09:45.712 DEBUG| base_utils:0280| [stderr] Write failed: Broken pipe
01/02 06:09:45.713 DEBUG| abstract_ssh:0443| send_file. source: /tmp/tmprp0ZEw, dest: /tmp/autoserv-pNJ1es/sysinfo.pickle, delete_dest: False,preserve_symlinks:False
01/02 06:09:45.713 DEBUG| abstract_ssh:0460| Using Rsync.
01/02 06:09:45.714 DEBUG| base_utils:0185| Running 'rsync -L --timeout=1800 --rsh='/usr/bin/ssh -a -x -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22' -az --no-o --no-g "/tmp/tmprp0ZEw" "root@chromeos4-row8-rack11-host1:"/tmp/autoserv-pNJ1es/sysinfo.pickle""'
01/02 06:09:46.862 DEBUG| ssh_host:0177| Running (ssh) 'true'
01/02 06:09:47.738 DEBUG| abstract_ssh:0581| Host chromeos4-row8-rack11-host1 is now up
The "find | xargs rm" starts and 0554 and aborts (broken pipe) at 0609, likely without completing the removal. (Probably without even starting it.)
Then a few seconds later:
01/02 06:09:57.892 DEBUG| ssh_host:0177| Running (ssh) 'nohup /tmp/sysinfo/autoserv-KL6bCK/bin/autotestd /tmp/autoserv-QGV7VE -H autoserv --verbose --hostname=chromeos4-row8-rack11-host1 --user=chromeos-test /tmp/sysinfo/autoserv-KL6bCK/control.autoserv >/dev/null 2>/dev/null &'
01/02 06:09:58.767 DEBUG| ssh_host:0177| Running (ssh) '/tmp/sysinfo/autoserv-KL6bCK/bin/autotestd_monitor /tmp/autoserv-QGV7VE 0 0'
01/02 06:09:59.738 DEBUG| site_autotest:0194| AUTOTEST_STATUS::START ---- ---- timestamp=1483366198 localtime=Jan 02 06:09:58
01/02 06:09:59.739 INFO | server_job:0153| START ---- ---- timestamp=1483366198 localtime=Jan 02 06:09:58
01/02 06:09:59.739 DEBUG| site_autotest:0194| AUTOTEST_STATUS::END ABORT ---- ---- timestamp=1483366198 localtime=Jan 02 06:09:58 Unhandled OSError: [Errno 17] File exists: '/tmp/autoserv-pNJ1es/results'
01/02 06:09:59.740 INFO | server_job:0153| END ABORT ---- ---- timestamp=1483366198 localtime=Jan 02 06:09:58 Unhandled OSError: [Errno 17] File exists: '/tmp/autoserv-pNJ1es/results'
01/02 06:09:59.740 DEBUG| site_autotest:0194| AUTOTEST_STATUS:: Traceback (most recent call last):
01/02 06:09:59.740 DEBUG| site_autotest:0194| AUTOTEST_STATUS:: File "/tmp/sysinfo/autoserv-KL6bCK/bin/job.py", line 1013, in step_engine
01/02 06:09:59.740 DEBUG| site_autotest:0194| AUTOTEST_STATUS:: execfile(self.control, global_control_vars, global_control_vars)
01/02 06:09:59.740 DEBUG| site_autotest:0194| AUTOTEST_STATUS:: File "/tmp/sysinfo/autoserv-KL6bCK/control.autoserv", line 5, in <module>
01/02 06:09:59.740 DEBUG| site_autotest:0194| AUTOTEST_STATUS:: mytest = test.test(job, '', '/tmp/autoserv-pNJ1es')
01/02 06:09:59.740 DEBUG| site_autotest:0194| AUTOTEST_STATUS:: File "/tmp/sysinfo/autoserv-KL6bCK/common_lib/test.py", line 39, in __init__
01/02 06:09:59.741 DEBUG| site_autotest:0194| AUTOTEST_STATUS:: os.mkdir(self.resultsdir)
01/02 06:09:59.741 DEBUG| site_autotest:0194| AUTOTEST_STATUS:: OSError: [Errno 17] File exists: '/tmp/autoserv-pNJ1es/results'
01/02 06:09:59.741 DEBUG| site_autotest:0194| Entered autotestd_monitor.
Two problems:
1. likely there should be a much shorter timeout around the "find | xarg rm" command, and a retry
2. when that fails, the test should fail, instead of producing a secondary failure which is harder to debug.
Comment 1 by autumn@chromium.org
, Jan 4 2017