
Issue 849391

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: Jun 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug




guado_moblab-paladin fails moblab_RunSuite: FAIL: Unhandled AutoservRunError: command execution error

Project Member Reported by davidri...@chromium.org, Jun 4 2018

Issue description

https://luci-milo.appspot.com/buildbot/chromeos/guado_moblab-paladin/9598

Potentially interesting logs from https://storage.cloud.google.com/chromeos-autotest-results/205522606-chromeos-test/chromeos2-row1-rack8-host7/debug/autoserv.DEBUG:

06/04 12:18:41.139 INFO |        server_job:0218| 	FAIL	moblab_RunSuite	moblab_RunSuite	timestamp=1528139921	localtime=Jun 04 12:18:41	Unhandled AutoservRunError: command execution error
  * Command: 
      /usr/bin/ssh -a -x   -o Protocol=2 -o StrictHostKeyChecking=no -o
      UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o
      ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4
      -l root -p 22 chromeos2-row1-rack8-host7 "export LIBC_FATAL_STDERR_=1; if
      type \"logger\" > /dev/null 2>&1; then logger -tag \"autotest\"
      \"server[stack::run_once|run_as_moblab|run] -> ssh_run(su - moblab -c
      '/usr/local/autotest/site_utils/run_suite.py --pool='' --board=cyan
      --build=cyan-release/R66-10452.74.0 --suite_name=dummy_server --retry=True
      --max_retries=1')\";fi; su - moblab -c
      '/usr/local/autotest/site_utils/run_suite.py --pool='' --board=cyan
      --build=cyan-release/R66-10452.74.0 --suite_name=dummy_server --retry=True
      --max_retries=1'"
  Exit status: 4
  Duration: 1121.69987607


  dummy_PassServer_nossp   [ FAILED ]
  dummy_PassServer_nossp     ABORT: Timed out, did not run.


   06-04-2018 [12:17:49] Attempting to display pool info: 
  No hosts found for board:cyan in pool:
  Reason: Tests were aborted before running; suite must have timed out.

 
One of the sub-DUTs failed to return from reboot after provisioning.

From sysinfo.tgz /sysinfo/mnt/moblab/results/3-moblab/192.168.231.101/status.log

START	----	provision	timestamp=1528138767	localtime=Jun 04 11:59:27	
	START	provision_AutoUpdate	provision_AutoUpdate	timestamp=1528138768	localtime=Jun 04 11:59:28	
		START	----	----	timestamp=1528138774	localtime=Jun 04 11:59:34	
			GOOD	----	sysinfo.before	timestamp=1528138774	localtime=Jun 04 11:59:34	
		END GOOD	----	----	timestamp=1528138774	localtime=Jun 04 11:59:34	
		START	----	reboot	timestamp=1528138843	localtime=Jun 04 12:00:43	
			GOOD	----	reboot.start	timestamp=1528138843	localtime=Jun 04 12:00:43	
			ABORT	----	reboot.verify	timestamp=1528139567	localtime=Jun 04 12:12:47	Host did not return from reboot
		END FAIL	----	reboot	timestamp=1528139567	localtime=Jun 04 12:12:47	Host did not return from reboot
  Traceback (most recent call last):
    File "/usr/local/autotest/server/server_job.py", line 952, in run_op
      op_func()
    File "/usr/local/autotest/server/hosts/remote.py", line 160, in reboot
      **dargs)
    File "/usr/local/autotest/server/hosts/remote.py", line 229, in wait_for_restart
      self.log_op(self.OP_REBOOT, op_func)
    File "/usr/local/autotest/client/common_lib/hosts/base_classes.py", line 566, in log_op
      op_func()
    File "/usr/local/autotest/server/hosts/remote.py", line 228, in op_func
      super(RemoteHost, self).wait_for_restart(timeout=timeout, **dargs)
    File "/usr/local/autotest/client/common_lib/hosts/base_classes.py", line 310, in wait_for_restart
      raise error.AutoservRebootError("Host did not return from reboot")
  AutoservRebootError: Host did not return from reboot
		FAIL	provision_AutoUpdate	provision_AutoUpdate	timestamp=1528139668	localtime=Jun 04 12:14:28	Unhandled AutoservRebootError: Host did not return from reboot
  Traceback (most recent call last):
    File "/usr/local/autotest/client/common_lib/test.py", line 831, in _call_test_function
      return func(*args, **dargs)
    File "/usr/local/autotest/client/common_lib/test.py", line 495, in execute
      dargs)
    File "/usr/local/autotest/client/common_lib/test.py", line 362, in _call_run_once_with_retry
      postprocess_profiled_run, args, dargs)
    File "/usr/local/autotest/client/common_lib/test.py", line 400, in _call_run_once
      self.run_once(*args, **dargs)
    File "/usr/local/autotest/server/site_tests/provision_AutoUpdate/provision_AutoUpdate.py", line 126, in run_once
      with_cheets=with_cheets)
    File "/usr/local/autotest/server/afe_utils.py", line 126, in machine_install_and_update_labels
      image_name, host_attributes = host.machine_install(update_url)
    File "/usr/local/autotest/server/hosts/cros_host.py", line 743, in machine_install
      return updater.run_update()
    File "/usr/local/autotest/server/cros/autoupdater.py", line 716, in run_update
      self.host.reboot(timeout=self.host.REBOOT_TIMEOUT)
    File "/usr/local/autotest/server/hosts/cros_host.py", line 1255, in reboot
      super(CrosHost, self).reboot(**dargs)
    File "/usr/local/autotest/server/hosts/remote.py", line 164, in reboot
      self.log_op(self.OP_REBOOT, reboot)
    File "/usr/local/autotest/client/common_lib/hosts/base_classes.py", line 562, in log_op
      self.job.run_op(op, op_func, self.get_kernel_ver)
    File "/usr/local/autotest/server/server_job.py", line 952, in run_op
      op_func()
    File "/usr/local/autotest/server/hosts/remote.py", line 160, in reboot
      **dargs)
    File "/usr/local/autotest/server/hosts/remote.py", line 229, in wait_for_restart
      self.log_op(self.OP_REBOOT, op_func)
    File "/usr/local/autotest/client/common_lib/hosts/base_classes.py", line 566, in log_op
      op_func()
    File "/usr/local/autotest/server/hosts/remote.py", line 228, in op_func
      super(RemoteHost, self).wait_for_restart(timeout=timeout, **dargs)
    File "/usr/local/autotest/client/common_lib/hosts/base_classes.py", line 310, in wait_for_restart
      raise error.AutoservRebootError("Host did not return from reboot")
  AutoservRebootError: Host did not return from reboot
	END FAIL	provision_AutoUpdate	provision_AutoUpdate	timestamp=1528139668	localtime=Jun 04 12:14:28	
END FAIL	----	provision	timestamp=1528139668	localtime=Jun 04 12:14:28	
INFO	----	----	timestamp=1528139668	job_abort_reason=	localtime=Jun 04 12:14:28	
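
For anyone reading a status.log like the one above, a minimal sketch of how the non-GOOD entries and the reboot wait could be pulled out automatically. It assumes the fields are tab-separated and that nesting is expressed by leading tabs, as in the excerpt. `StatusEntry`, `parse_status_lines`, `is_failure`, and `elapsed_between` are hypothetical helper names for illustration, not part of Autotest.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple, Optional

# Status words treated as problems in this sketch (an assumption, not Autotest's list).
BAD_STATUSES = {"FAIL", "ABORT", "ERROR"}
KEYVAL = re.compile(r"^(\w+)=(.*)$")


class StatusEntry(NamedTuple):
    depth: int                # number of leading tabs (nesting level)
    status: str               # e.g. "START", "GOOD", "ABORT", "END FAIL"
    subdir: str
    operation: str
    timestamp: Optional[int]  # value of the timestamp= field, if present
    reason: str               # trailing free-text field, if any


def parse_status_lines(lines: Iterable[str]) -> Iterator[StatusEntry]:
    """Yield one StatusEntry per status line; traceback lines are skipped."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        body = line.lstrip("\t")
        fields = body.split("\t")
        if len(fields) < 3:
            continue  # traceback text and blank lines have no tab-separated fields
        timestamp = None
        reason = ""
        for field in fields[3:]:
            match = KEYVAL.match(field)
            if match:
                if match.group(1) == "timestamp" and match.group(2).isdigit():
                    timestamp = int(match.group(2))
            elif field:
                reason = field
        yield StatusEntry(len(line) - len(body), fields[0], fields[1],
                          fields[2], timestamp, reason)


def is_failure(entry: StatusEntry) -> bool:
    """True for FAIL/ABORT/ERROR, including the 'END FAIL' style closers."""
    words = entry.status.split()
    return bool(words) and words[-1] in BAD_STATUSES


def elapsed_between(entries: List[StatusEntry], start_op: str,
                    end_op: str) -> Optional[int]:
    """Seconds from the first start_op entry to the next end_op entry."""
    start = None
    for entry in entries:
        if entry.timestamp is None:
            continue
        if start is None and entry.operation == start_op:
            start = entry.timestamp
        elif start is not None and entry.operation == end_op:
            return entry.timestamp - start
    return None


# Four status lines and one traceback line copied from the excerpt above.
SAMPLE = (
    "\t\tSTART\t----\treboot\ttimestamp=1528138843\tlocaltime=Jun 04 12:00:43\t\n"
    "\t\t\tGOOD\t----\treboot.start\ttimestamp=1528138843\tlocaltime=Jun 04 12:00:43\t\n"
    "\t\t\tABORT\t----\treboot.verify\ttimestamp=1528139567\tlocaltime=Jun 04 12:12:47\tHost did not return from reboot\n"
    "\t\tEND FAIL\t----\treboot\ttimestamp=1528139567\tlocaltime=Jun 04 12:12:47\tHost did not return from reboot\n"
    "  Traceback (most recent call last):\n"
)

if __name__ == "__main__":
    parsed = list(parse_status_lines(SAMPLE.splitlines()))
    for entry in parsed:
        if is_failure(entry):
            print(entry.status, entry.operation, entry.reason)
    print(elapsed_between(parsed, "reboot.start", "reboot.verify"))  # 724
```

The 724 seconds between reboot.start and reboot.verify matches the 12:00:43 to 12:12:47 gap in the log, which is the window in which the host was expected to come back.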

I will go check the device in the lab, but it seems to be up at the moment, so perhaps it was just very slow to reboot.
Can we make it more obvious where people should look? Even just adding something to the yaqs entry at https://yaqs.googleplex.com/eng/q/6532316467036160 would help.

https://bugs.chromium.org/p/chromium/issues/detail?id=747056 is related to this, but I honestly would never have known to go and look in this sysinfo file.

On the cyan front, there might be light at the end of the tunnel for reboot issues: crbug.com/639301

Status: WontFix (was: Untriaged)
I am closing this as a flake: the sub-DUT failed to provision, which happens occasionally. I added some documentation to the YAQS entry about how I found the logs.
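
For reference, a rough sketch of locating the sub-DUT status.log files inside a downloaded sysinfo.tgz, as was done by hand in this issue. It assumes the archive is a gzip-compressed tar whose members include paths ending in status.log, like the one quoted earlier. `list_status_logs` and `read_member_text` are hypothetical names for illustration; only the standard-library `tarfile` and `io` modules are used.

```python
import io
import posixpath
import sys
import tarfile
from typing import List


def list_status_logs(archive: tarfile.TarFile) -> List[str]:
    """Return the names of all regular files called status.log in the archive."""
    return [
        member.name
        for member in archive.getmembers()
        if member.isfile() and posixpath.basename(member.name) == "status.log"
    ]


def read_member_text(archive: tarfile.TarFile, name: str) -> str:
    """Read one archive member as text, replacing undecodable bytes."""
    handle = archive.extractfile(name)
    if handle is None:
        raise ValueError("not a regular file: %s" % name)
    with io.TextIOWrapper(handle, encoding="utf-8", errors="replace") as text:
        return text.read()


if __name__ == "__main__":
    # Usage: python find_status_logs.py path/to/sysinfo.tgz
    if len(sys.argv) > 1:
        with tarfile.open(sys.argv[1], mode="r:gz") as tgz:
            for log_name in list_status_logs(tgz):
                print(log_name)
    else:
        print("usage: find_status_logs.py SYSINFO_TGZ")
```

Each listed path can then be read with `read_member_text` and searched for FAIL or ABORT lines to see which sub-DUT job failed.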
