Repair via servoreset fails if crashcollect fails
Reported by jrbarnette@chromium.org, Aug 18 2017
Issue description
Consider this repair task:
http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row4-rack1-host4/343595-repair/
The status.log file shows this failure:
START ---- repair.servoreset timestamp=1503070109 localtime=Aug 18 08:28:29
FAIL ---- repair.servoreset timestamp=1503070134 localtime=Aug 18 08:28:54 __init__() takes exactly 3 arguments (2 given)
END FAIL ---- repair.servoreset timestamp=1503070134 localtime=Aug 18 08:28:54
That message can be traced to this exception from debug/autoserv.DEBUG:
08/18 08:28:54.343 DEBUG| abstract_ssh:0470| send_file. source: /usr/local/autotest/client/bin/result_tools, dest: /usr/local/autotest, delete_dest: False,preserve_symlinks:False
08/18 08:28:54.353 DEBUG| ssh_host:0296| Running (ssh) 'rsync --version' from 'send_file|use_rsync|check_rsync|run|wrapper|run_very_slowly'
08/18 08:28:54.473 WARNI| abstract_ssh:0132| rsync not available on remote host chromeos2-row4-rack1-host4 -- disabled
08/18 08:28:54.473 DEBUG| abstract_ssh:0499| Trying scp.
08/18 08:28:54.474 ERROR| repair:0449| Repair failed: Reset the DUT via servo
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/hosts/repair.py", line 447, in _repair_host
    self.repair(host)
  File "/usr/local/autotest/server/hosts/cros_repair.py", line 402, in repair
    host.collect_logs('/var/log', local_log_dir, ignore_errors=True)
  File "/usr/local/autotest/server/hosts/abstract_ssh.py", line 848, in collect_logs
    result_tools_runner.run_on_client(self, remote_src_dir)
  File "/usr/local/autotest/client/bin/result_tools/runner.py", line 86, in run_on_client
    _deploy_result_tools(host)
  File "/usr/local/autotest/client/bin/result_tools/runner.py", line 60, in _deploy_result_tools
    excludes = _EXCLUDES)
  File "/usr/local/autotest/server/hosts/abstract_ssh.py", line 503, in send_file
    'excludes: %s' % excludes)
TypeError: __init__() takes exactly 3 arguments (2 given)
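One plausible reading of this traceback, offered as an assumption rather than a confirmed diagnosis, is that send_file tried to construct an error object whose constructor needs two arguments but passed only the message, so the TypeError from the constructor replaced the error that was meant to be reported. The sketch below reproduces that pattern in isolation. HypotheticalRunError, send_file_sketch and the '*.pyc' exclude value are invented for illustration and are not taken from autotest; Python 3 also words the message differently from the Python 2 text in the log.

```python
class HypotheticalRunError(Exception):
    """Stand-in for an error type whose constructor needs two arguments."""

    def __init__(self, description, result_obj):
        super().__init__(description)
        self.result_obj = result_obj


def send_file_sketch(excludes):
    # Meant to report a copy failure, but result_obj is missing, so building
    # the exception object itself raises TypeError before anything is raised.
    raise HypotheticalRunError('excludes: %s' % excludes)


masked = None
try:
    send_file_sketch(['*.pyc'])  # hypothetical exclude list
except TypeError as e:
    masked = e

# The caller sees a TypeError about __init__, not the intended error.
print(type(masked).__name__)
```

If this reading is right, the constructor call at abstract_ssh.py line 503 would need fixing separately from the repair-path issue described here.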
This is the relevant code in the repair function:
    if host.wait_up(host.BOOT_TIMEOUT):
        # Collect logs once we regain ssh access before clobbering them.
        local_log_dir = crashcollect.get_crashinfo_dir(host, 'after_reset')
        host.collect_logs('/var/log', local_log_dir, ignore_errors=True)
        # Collect crash info.
        crashcollect.get_crashinfo(host, None)
        return
Basically, the repair succeeded (the host was up), but follow-up work
to gather logs failed. That should not be a repair failure: the DUT
was working. Exceptions in this code path should be ignored.
,
Sep 20 2017
Comment 1 by bugdroid1@chromium.org, Sep 15 2017