
Issue 756880


Issue metadata

Status: Fixed
Owner:
Closed: Sep 2017
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug




Repair via servoreset fails if crashcollect fails

Reported by jrbarnette@chromium.org, Aug 18 2017

Issue description

Consider this repair task:
    http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row4-rack1-host4/343595-repair/

The status.log file shows this failure:
	START	----	repair.servoreset	timestamp=1503070109	localtime=Aug 18 08:28:29	
		FAIL	----	repair.servoreset	timestamp=1503070134	localtime=Aug 18 08:28:54	__init__() takes exactly 3 arguments (2 given)
	END FAIL	----	repair.servoreset	timestamp=1503070134	localtime=Aug 18 08:28:54	


That message can be traced to this exception from debug/autoserv.DEBUG:
08/18 08:28:54.343 DEBUG|      abstract_ssh:0470| send_file. source: /usr/local/autotest/client/bin/result_tools, dest: /usr/local/autotest, delete_dest: False,preserve_symlinks:False
08/18 08:28:54.353 DEBUG|          ssh_host:0296| Running (ssh) 'rsync --version' from 'send_file|use_rsync|check_rsync|run|wrapper|run_very_slowly'
08/18 08:28:54.473 WARNI|      abstract_ssh:0132| rsync not available on remote host chromeos2-row4-rack1-host4 -- disabled
08/18 08:28:54.473 DEBUG|      abstract_ssh:0499| Trying scp.
08/18 08:28:54.474 ERROR|            repair:0449| Repair failed: Reset the DUT via servo
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/hosts/repair.py", line 447, in _repair_host
    self.repair(host)
  File "/usr/local/autotest/server/hosts/cros_repair.py", line 402, in repair
    host.collect_logs('/var/log', local_log_dir, ignore_errors=True)
  File "/usr/local/autotest/server/hosts/abstract_ssh.py", line 848, in collect_logs
    result_tools_runner.run_on_client(self, remote_src_dir)
  File "/usr/local/autotest/client/bin/result_tools/runner.py", line 86, in run_on_client
    _deploy_result_tools(host)
  File "/usr/local/autotest/client/bin/result_tools/runner.py", line 60, in _deploy_result_tools
    excludes = _EXCLUDES)
  File "/usr/local/autotest/server/hosts/abstract_ssh.py", line 503, in send_file
    'excludes: %s' % excludes)
TypeError: __init__() takes exactly 3 arguments (2 given)
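
One plausible reading of this TypeError, offered as an assumption because the traceback shows only the tail of the statement at line 503: the code was building an exception whose constructor requires two arguments besides `self`, but passed only a message string, so the constructor itself raised before the intended error could. The sketch below reproduces that pattern with a hypothetical `TwoArgError` class and hypothetical helper functions; none of these are autotest names. Python 3 words the resulting message differently from the Python 2 text in the log.

```python
class TwoArgError(Exception):
    """Hypothetical exception whose constructor needs two arguments."""

    def __init__(self, description, result):
        super(TwoArgError, self).__init__(description, result)
        self.description = description
        self.result = result


def send_file_sketch(excludes):
    # Intended to raise TwoArgError, but only one argument is supplied,
    # so constructing the exception raises TypeError instead.
    raise TwoArgError('excludes: %s' % excludes)


def caught_type(excludes):
    """Return the name of the exception type that actually escapes."""
    try:
        send_file_sketch(excludes)
    except Exception as e:
        return type(e).__name__
```

If this reading is right, a caller that catches only the intended error type would not catch the TypeError, which may be why it propagated up to the repair framework.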

This is the relevant code in the repair function:
        if host.wait_up(host.BOOT_TIMEOUT):
            # Collect logs once we regain ssh access before clobbering them.
            local_log_dir = crashcollect.get_crashinfo_dir(host, 'after_reset')
            host.collect_logs('/var/log', local_log_dir, ignore_errors=True)
            # Collect crash info.
            crashcollect.get_crashinfo(host, None)
            return

Basically, the repair succeeded (the host was up), but follow-up work
to gather logs failed.  That should not be a repair failure:  the DUT
was working.  Exceptions in this code path should be ignored.
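
A minimal sketch of the behavior requested above, not the change that actually landed: log collection is wrapped so that any exception is logged and swallowed once the host is up. `FakeHost`, `collect_after_reset`, and `repair_via_reset` are hypothetical names for illustration, and the timeout value is a placeholder. Only `wait_up`, `collect_logs`, and `BOOT_TIMEOUT` come from the snippet above.

```python
import logging


class FakeHost(object):
    """Hypothetical stand-in for an autotest host; not the real class."""

    BOOT_TIMEOUT = 150  # placeholder value, not taken from autotest

    def wait_up(self, timeout):
        # Pretend the DUT came back after the reset.
        return True

    def collect_logs(self, remote_dir, local_dir, ignore_errors=True):
        # Simulate the failure seen in the traceback.
        raise TypeError('__init__() takes exactly 3 arguments (2 given)')


def collect_after_reset(host, local_log_dir):
    """Best-effort log collection; returns True if it succeeded."""
    try:
        host.collect_logs('/var/log', local_log_dir, ignore_errors=True)
        return True
    except Exception:
        # The DUT is already up, so a collection failure is only logged.
        logging.exception('Log collection failed after reset; ignoring.')
        return False


def repair_via_reset(host, local_log_dir):
    """Returns True when the host is up, whether or not logs were gathered."""
    if host.wait_up(host.BOOT_TIMEOUT):
        collect_after_reset(host, local_log_dir)
        return True
    return False
```

With this shape, `repair_via_reset(FakeHost(), '/tmp/after_reset')` reports success even though `collect_logs` raises, and the failure is still recorded in the log for later debugging.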

 

Comment 1 by bugdroid1@chromium.org, Sep 15 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/79b1dbcea03c6a59ca67077d3c1b5e82b42e3df1

commit 79b1dbcea03c6a59ca67077d3c1b5e82b42e3df1
Author: Richard Barnette <jrbarnette@chromium.org>
Date: Fri Sep 15 03:14:15 2017

[autotest] Don't let crash collection fail repair operations.

The ServoResetRepair and ServoSysRqRepair repair operations both
attempt to gather logs from a DUT if it comes up.  If log collection
failed, repair would fail.  This fixes the operations so that the
failure to collect logs isn't a repair failure.

BUG= chromium:756880 
TEST=invoke repair from the Python CLI.

Change-Id: Ie99ec7dd267493438587003b441864b61eb6509b
Reviewed-on: https://chromium-review.googlesource.com/621389
Commit-Ready: Richard Barnette <jrbarnette@chromium.org>
Tested-by: Richard Barnette <jrbarnette@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>

[modify] https://crrev.com/79b1dbcea03c6a59ca67077d3c1b5e82b42e3df1/server/hosts/cros_repair.py

Status: Fixed (was: Started)
