Skylab: reef DUTs fail TPM verification in repair tasks
Issue description

Case 1: After a successful CQ job:
https://chrome-swarming.appspot.com/task?id=3f95f0bd9ef77710&refresh=10
the next CQ job fails reset:
https://chrome-swarming.appspot.com/task?id=3f95f0bf72707210&refresh=10

It happened several times:
https://chrome-swarming.appspot.com/task?id=3f95f0b95cd92510&refresh=10 (the one that succeeded)
https://chrome-swarming.appspot.com/task?id=3f95f0c18dc38d10&refresh=10 (the failed one that followed)

It can also happen when the previous test failed:
https://chrome-swarming.appspot.com/task?id=3f95055b89cc7910&refresh=10 (the failed one)
https://chrome-swarming.appspot.com/task?id=3f950578fb35bd10&refresh=10 (the failed one that followed)

The Reset failure log looks like this:

START ---- reset timestamp=1535390129 localtime=Aug 27 10:15:29
INFO ---- ---- timestamp=1535390142 localtime=Aug 27 10:15:42 Beginning verify for host chromeos6-row3-rack12-host15 board reef model
GOOD ---- verify.ssh timestamp=1535390142 localtime=Aug 27 10:15:42
GOOD ---- verify.devmode timestamp=1535390142 localtime=Aug 27 10:15:42
GOOD ---- verify.hwid timestamp=1535390144 localtime=Aug 27 10:15:44
GOOD ---- verify.power timestamp=1535390144 localtime=Aug 27 10:15:44
GOOD ---- verify.ext4 timestamp=1535390144 localtime=Aug 27 10:15:44
GOOD ---- verify.writable timestamp=1535390145 localtime=Aug 27 10:15:45
FAIL ---- verify.tpm timestamp=1535390146 localtime=Aug 27 10:15:46 TPM is not enabled -- Hardware is not working.
GOOD ---- verify.good_au timestamp=1535390146 localtime=Aug 27 10:15:46
GOOD ---- verify.fwstatus timestamp=1535390146 localtime=Aug 27 10:15:46
GOOD ---- verify.rwfw timestamp=1535390146 localtime=Aug 27 10:15:46
FAIL ---- verify.python timestamp=1535390147 localtime=Aug 27 10:15:47 Python is missing; may be caused by powerwash
GOOD ---- verify.cros timestamp=1535390151 localtime=Aug 27 10:15:51
END FAIL ---- reset timestamp=1535390151 localtime=Aug 27 10:15:51
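
For anyone comparing several of these reset logs, a small script can list which verifiers failed in each. This is only a rough sketch and not part of Autotest or Skylab: the function name `failed_verifiers` and the regular expression are assumptions derived solely from the log lines quoted in this issue, so other status-log variants may need a different pattern.

```python
import re

# Assumed line shape, inferred only from the log quoted in this issue:
#   STATUS ---- NAME timestamp=N localtime=Mon DD HH:MM:SS [message]
STATUS_LINE = re.compile(
    r"^(?P<status>START|INFO|GOOD|FAIL|END FAIL|END GOOD)\s+----\s+"
    r"(?P<name>\S+)\s+timestamp=(?P<ts>\d+)\s+"
    r"localtime=(?P<local>\w+ \d+ \d+:\d+:\d+)\s*(?P<msg>.*)$"
)


def failed_verifiers(log_text):
    """Return (verifier_name, message) pairs for each failed verify.* step."""
    failures = []
    for line in log_text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match is None:
            continue  # skip blank lines or lines in an unexpected format
        # "END FAIL" marks the whole reset task, not a single verifier,
        # so only plain FAIL entries for verify.* steps are collected.
        if match.group("status") == "FAIL" and match.group("name").startswith("verify."):
            failures.append((match.group("name"), match.group("msg").strip()))
    return failures


if __name__ == "__main__":
    sample = "\n".join([
        "GOOD ---- verify.writable timestamp=1535390145 localtime=Aug 27 10:15:45",
        "FAIL ---- verify.tpm timestamp=1535390146 localtime=Aug 27 10:15:46 "
        "TPM is not enabled -- Hardware is not working.",
        "END FAIL ---- reset timestamp=1535390151 localtime=Aug 27 10:15:51",
    ])
    for name, message in failed_verifiers(sample):
        print(name, "->", message)
```

Run against the log above, this would report `verify.tpm` and `verify.python`, which makes it easier to check whether the two failures always appear together across tasks.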
,
Aug 28
,
Sep 4
This has also been seen intermittently in admin repair tasks with no intervening tests. I suspect that the TPM check on reef DUTs is flaky. This affects Skylab more than it did Autotest because Skylab runs far more repair tasks (no DUT is idle for >20 minutes) than Autotest did.
,
Sep 10
Comment 1 by xixuan@chromium.org, Aug 27