
Issue 680596

Starred by 2 users

Issue metadata

Status: Archived
Owner:
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug




chromeos6-row2-rack14-host8 continually fails provision

Project Member Reported by kevcheng@chromium.org, Jan 12 2017

Issue description

This DUT keeps failing provision the same way (timing out while copying the stateful payload to the device), and repair just keeps bringing it back to life. Locking the DUT for now so that suites don't fail because of it.

https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/hosts/chromeos6-row2-rack14-host8/571597-provision/20171101224740/

START	----	provision	timestamp=1484236812	localtime=Jan 12 08:00:12	
	GOOD	----	verify.ssh	timestamp=1484236815	localtime=Jan 12 08:00:15	
	GOOD	----	verify.update	timestamp=1484236823	localtime=Jan 12 08:00:23	
	GOOD	----	verify.brd_config	timestamp=1484236824	localtime=Jan 12 08:00:24	
	GOOD	----	verify.ser_config	timestamp=1484236824	localtime=Jan 12 08:00:24	
	GOOD	----	verify.job	timestamp=1484236824	localtime=Jan 12 08:00:24	
	GOOD	----	verify.servod	timestamp=1484236827	localtime=Jan 12 08:00:27	
	GOOD	----	verify.pwr_button	timestamp=1484236827	localtime=Jan 12 08:00:27	
	GOOD	----	verify.lid_open	timestamp=1484236828	localtime=Jan 12 08:00:28	
	GOOD	----	verify.PASS	timestamp=1484236828	localtime=Jan 12 08:00:28	
	START	provision_AutoUpdate	provision_AutoUpdate	timestamp=1484236828	localtime=Jan 12 08:00:28	
		START	----	----	timestamp=1484238517	localtime=Jan 12 08:28:37	
			GOOD	----	sysinfo.before	timestamp=1484238520	localtime=Jan 12 08:28:40	
		END GOOD	----	----	timestamp=1484238520	localtime=Jan 12 08:28:40	
		FAIL	provision_AutoUpdate	provision_AutoUpdate	timestamp=1484242229	localtime=Jan 12 09:30:29	Unhandled DevServerException: CrOS auto-update failed for host chromeos6-row2-rack14-host8:  Timeout occurred- waited 1800 seconds.. The CrOS auto-update process is timed out, thus will be terminated
  Traceback (most recent call last):
    File "/usr/local/autotest/client/common_lib/test.py", line 804, in _call_test_function
      return func(*args, **dargs)
    File "/usr/local/autotest/client/common_lib/test.py", line 461, in execute
      dargs)
    File "/usr/local/autotest/client/common_lib/test.py", line 347, in _call_run_once_with_retry
      postprocess_profiled_run, args, dargs)
    File "/usr/local/autotest/client/common_lib/test.py", line 376, in _call_run_once
      self.run_once(*args, **dargs)
    File "/usr/local/autotest/server/site_tests/provision_AutoUpdate/provision_AutoUpdate.py", line 111, in run_once
      force_full_update=force)
    File "/usr/local/autotest/server/afe_utils.py", line 270, in machine_install_and_update_labels
      *args, **dargs)
    File "/usr/local/autotest/server/hosts/cros_host.py", line 748, in machine_install_by_devserver
      full_update=force_full_update)
    File "/usr/local/autotest/client/common_lib/cros/dev_server.py", line 1947, in auto_update
      raise DevServerException(error_msg % (host_name, error_list[0]))
  DevServerException: CrOS auto-update failed for host chromeos6-row2-rack14-host8:  Timeout occurred- waited 1800 seconds.. The CrOS auto-update process is timed out, thus will be terminated
	END FAIL	provision_AutoUpdate	provision_AutoUpdate	timestamp=1484242230	localtime=Jan 12 09:30:30	
END FAIL	----	provision	timestamp=1484242230	localtime=Jan 12 09:30:30	
INFO	----	----	timestamp=1484242230	job_abort_reason=	localtime=Jan 12 09:30:30	
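
For reading the status log above, here is a minimal sketch (not part of autotest) that pulls the `timestamp=` fields out of two log lines and reports the time between them. The helper name `elapsed_seconds` is hypothetical. The two sample lines are the provision_AutoUpdate START and FAIL entries from the log, with the error text omitted.

```python
import re

# Matches the timestamp=<epoch seconds> field used in the status log.
TIMESTAMP_RE = re.compile(r"timestamp=(\d+)")


def elapsed_seconds(start_line, end_line):
    """Return the seconds between two status-log lines, from their timestamp fields."""
    start = int(TIMESTAMP_RE.search(start_line).group(1))
    end = int(TIMESTAMP_RE.search(end_line).group(1))
    return end - start


start = ("START\tprovision_AutoUpdate\tprovision_AutoUpdate\t"
         "timestamp=1484236828\tlocaltime=Jan 12 08:00:28")
fail = ("FAIL\tprovision_AutoUpdate\tprovision_AutoUpdate\t"
        "timestamp=1484242229\tlocaltime=Jan 12 09:30:29")

secs = elapsed_seconds(start, fail)
print(secs, divmod(secs, 60))  # 5401 (90, 1)
```

That is 5401 seconds, or about 90 minutes, between the start of provision_AutoUpdate and the reported failure.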
 
Status: Started (was: Untriaged)
First responder.

Comment 2 by xixuan@chromium.org, Jan 12 2017

Checked the auto_update logs. The DUT is online, but partway through it cannot rsync the big files (update.gz or stateful.tgz), while it can still rsync small files (the stateful_update script), which is weird.

Is anyone familiar with this pattern? Should we send it to the lab for manual repair?
Could the DUT be out of space on its stateful partition?
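
One way to test the out-of-space theory would be to compare the free space on the stateful partition with the payload size before copying. A rough sketch, meant to run on the DUT itself: the helper `has_room` is hypothetical, and the mount point `/mnt/stateful_partition` in the usage comment is an assumption, not something taken from the logs.

```python
import shutil


def has_room(path, payload_bytes, margin_bytes=0):
    """Return True if the filesystem holding `path` can fit the payload plus a margin."""
    free = shutil.disk_usage(path).free
    return free >= payload_bytes + margin_bytes


# Example use on the DUT (path and sizes are assumptions):
# has_room("/mnt/stateful_partition", payload_bytes=500 * 1024 * 1024)
```

If this returns False for stateful.tgz on the second provision round, something is filling the partition between rounds.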

Comment 4 by xixuan@chromium.org, Jan 12 2017

That may be the case.

First provision (no provision files present yet):
- successfully rsync small files
- successfully rsync stateful.tgz
- cannot rsync update.gz

Second provision (the tmp folder that stored stateful.tgz during the first provision round has been removed, so there should be enough space at least for 'rsync stateful.tgz'?):
- successfully rsync small files
- cannot rsync stateful.tgz now...

Does something occupy the stateful partition after one round of provision?
FYI, in the auto_update logs you'll also see this error:

DEVSERVER Import module android_build failed with error: No module named apiclient

This is a red herring (issue 651520).
Owner: pprabhu@chromium.org
Is this still an issue?
Status: Archived (was: Started)
Bulk closing Infra>Client>ChromeOS issues untouched in over a year.
