
Issue 878164


Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: ----

Blocking:
issue 878419




Skylab: reef DUTs fail TPM verification in repair tasks

Project Member Reported by xixuan@chromium.org, Aug 27

Issue description

Case 1:

After a successful CQ job:
https://chrome-swarming.appspot.com/task?id=3f95f0bd9ef77710&refresh=10

The next CQ job fails reset:
https://chrome-swarming.appspot.com/task?id=3f95f0bf72707210&refresh=10

It happened several times:
https://chrome-swarming.appspot.com/task?id=3f95f0b95cd92510&refresh=10 (the successful one)
https://chrome-swarming.appspot.com/task?id=3f95f0c18dc38d10&refresh=10 (the following failed one)

It can also happen after the previous test failed:
https://chrome-swarming.appspot.com/task?id=3f95055b89cc7910&refresh=10 (failed one)
https://chrome-swarming.appspot.com/task?id=3f950578fb35bd10&refresh=10 (the following failed one)

The Reset failure log looks like this:

START	----	reset	timestamp=1535390129	localtime=Aug 27 10:15:29	
	INFO	----	----	timestamp=1535390142	localtime=Aug 27 10:15:42	Beginning verify for host chromeos6-row3-rack12-host15 board reef model 
	GOOD	----	verify.ssh	timestamp=1535390142	localtime=Aug 27 10:15:42	
	GOOD	----	verify.devmode	timestamp=1535390142	localtime=Aug 27 10:15:42	
	GOOD	----	verify.hwid	timestamp=1535390144	localtime=Aug 27 10:15:44	
	GOOD	----	verify.power	timestamp=1535390144	localtime=Aug 27 10:15:44	
	GOOD	----	verify.ext4	timestamp=1535390144	localtime=Aug 27 10:15:44	
	GOOD	----	verify.writable	timestamp=1535390145	localtime=Aug 27 10:15:45	
	FAIL	----	verify.tpm	timestamp=1535390146	localtime=Aug 27 10:15:46	TPM is not enabled -- Hardware is not working.
	GOOD	----	verify.good_au	timestamp=1535390146	localtime=Aug 27 10:15:46	
	GOOD	----	verify.fwstatus	timestamp=1535390146	localtime=Aug 27 10:15:46	
	GOOD	----	verify.rwfw	timestamp=1535390146	localtime=Aug 27 10:15:46	
	FAIL	----	verify.python	timestamp=1535390147	localtime=Aug 27 10:15:47	Python is missing; may be caused by powerwash
	GOOD	----	verify.cros	timestamp=1535390151	localtime=Aug 27 10:15:51	
END FAIL	----	reset	timestamp=1535390151	localtime=Aug 27 10:15:51
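
To compare many of these logs across tasks, it may help to pull out just the failing verifiers. The following is a minimal sketch, assuming the log is tab-separated in the order status, subdir, operation, timestamp/localtime fields, then a free-text message, as it appears above. The function name `failed_steps` and the `SAMPLE` constant are hypothetical and not part of Autotest or Skylab.

```python
def failed_steps(log_text):
    """Return (operation, message) pairs for each FAIL line in a status log.

    Assumes tab-separated fields: status, subdir, operation, then
    timestamp=/localtime= fields, then an optional message.
    """
    failures = []
    for line in log_text.splitlines():
        fields = line.strip().split("\t")
        # Skip START/INFO/GOOD lines and the "END FAIL" summary line.
        if len(fields) < 3 or fields[0] != "FAIL":
            continue
        operation = fields[2]
        # Whatever is left after dropping the time fields is the message.
        message = " ".join(
            f for f in fields[3:]
            if not f.startswith(("timestamp=", "localtime="))
        )
        failures.append((operation, message))
    return failures


# Three lines copied from the log above, plus the END line.
SAMPLE = "\n".join([
    "\tGOOD\t----\tverify.writable\ttimestamp=1535390145\tlocaltime=Aug 27 10:15:45\t",
    "\tFAIL\t----\tverify.tpm\ttimestamp=1535390146\tlocaltime=Aug 27 10:15:46\t"
    "TPM is not enabled -- Hardware is not working.",
    "\tFAIL\t----\tverify.python\ttimestamp=1535390147\tlocaltime=Aug 27 10:15:47\t"
    "Python is missing; may be caused by powerwash",
    "END FAIL\t----\treset\ttimestamp=1535390151\tlocaltime=Aug 27 10:15:51",
])

for operation, message in failed_steps(SAMPLE):
    print(operation, "->", message)
```

Run against the log above, this would report both `verify.tpm` and `verify.python` as failing, which makes it easier to check whether the two failures always occur together.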

 
Cc: jrbarnette@chromium.org
cc deputy
Blocking: 878419
Summary: Skylab: reef DUTs fail TPM verification in repair tasks (was: Reef-paladin failure caused by bad state of DUT after test)
This has also been seen intermittently in admin repair tasks, without any intervening tests.

I suspect that the TPM check on reef DUTs is flaky. This affects Skylab more than it did Autotest because Skylab runs far more repair tasks (no DUT is idle for >20 minutes) than Autotest did.
Status: Available (was: Untriaged)
