Device becomes unusable after many attempts to convert from normal mode to developer mode while under FrE
Issue description
Chrome Version: M62.0.3202.82 stable eve, bob
OS: ChromeOS 9901.66.0
What steps will reproduce the problem?
(1) On an already enrolled device with Forced Enrollment policy, press Esc+Refresh+Power to convert from Normal mode to Developer mode. The attempt fails and the device reverts to Normal mode.
(2) Proceed through OOBE to the enrollment screen.
(3) Repeat from step 1; actual enrollment is not necessary.
What is the expected result?
The device should respond consistently and reliably to the user's actions.
What happens instead?
After a few repetitions, the startup screen shows "Chrome OS is missing or damaged." Sometimes rebooting the device recovers it. At other times the device can no longer start up normally, and an attempt to restore it with a recovery image ends in a black screen: after initiating recovery by pressing Esc+Refresh+Power and inserting the USB stick, the screen went black, the USB LED blinked for a few seconds and stopped, and the device remained on a black screen indefinitely.
This issue is intermittent and not always reproducible. It was last seen on one Bob and one Eve during enrollment tests.
,
Nov 3 2017
,
Nov 3 2017
I see the following output on the Cr50/Shell console when I turn on the Bob unit:
> 1534051.017556 deferred_tpm_rst_isr
[1534051.019260 tpm_reset_request(0, 0)]
[1534051.020988 tpm_reset_request: already scheduled]
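For anyone triaging similar captures, here is a minimal sketch that scans saved Cr50 console text for the "already scheduled" message quoted above. The bracketed "[timestamp message]" line format is inferred from this one sample, the function name is hypothetical, and whether this message alone indicates a stuck TPM reset is an assumption.

```python
import re

# Pattern for the bracketed console lines quoted above: "[<timestamp> <message>]".
# The format is inferred from a single sample and may not cover every Cr50 build.
LINE_RE = re.compile(r"\[(\d+\.\d+) ([^\]]+)\]")


def find_already_scheduled(console_text):
    """Return timestamps of 'tpm_reset_request: already scheduled' messages."""
    hits = []
    for timestamp, message in LINE_RE.findall(console_text):
        if message.strip() == "tpm_reset_request: already scheduled":
            hits.append(float(timestamp))
    return hits


sample = (
    "> 1534051.017556 deferred_tpm_rst_isr "
    "[1534051.019260 tpm_reset_request(0, 0)] "
    "[1534051.020988 tpm_reset_request: already scheduled]"
)
print(find_already_scheduled(sample))
```

A non-empty result only flags the capture for a closer look; it does not by itself confirm the failure.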
,
Nov 3 2017
Current open bugs related to this issue: https://buganizer.corp.google.com/issues/68671278 https://buganizer.corp.google.com/issues/67784510
,
Nov 3 2017
I could consistently reproduce this issue on a particular Eve, #G1072916.
Video: https://drive.google.com/open?id=0B66sLSSfR23oZmJ5TE0wRzZLWFk
Steps:
1. Start with an enrolled device that has the Force Re-enrollment policy enabled and is already in Normal mode (since FrE forbids Developer mode).
2. Proceed through the OOBE steps per instructions to the Enrollment screen.
3. Press Esc+Refresh+Power to trigger conversion to Developer mode.
4. Press Ctrl-D when shown the "Please insert a recovery USB stick" message.
5. Press Enter when shown the "To turn OS verification OFF, press ENTER..." message.
6. Wait for the screen to re-appear after a blackout.
The time taken through these steps may be a factor in reproducing the issue. While trying to capture the video, I moved more slowly through each step, and twice I did not see the issue; once I went back to my regular pace, the issue reproduced again.
,
Nov 8 2017
Also seen this issue during a FAFT run.
,
Dec 7 2017
Observed this today with ChromeOS Version: 10176.7.0 on Bob
Steps
1) Recover device with 10176.7.0
2) Reboot device and enter normal mode
3) Wait for device to complete transition to normal mode
4) Once in normal mode, press Esc+Refresh+Pwr to get to recovery screen
5) Start the transition back to Dev mode with Ctrl+D
6) Device reboots with message: "ChromeOS is missing or damaged."
Recovery reason: 0x2b / 0x2b Secure NVRAM (TPM) Initialization error
What is the expected result?
Device should start the transition into developer mode
What happens instead?
Device boots with "ChromeOS is damaged" message. Trying to recover the device at this point also fails, just sitting at a black screen forever.
*Notes
- Tried this on two Bob devices and observed the same issue both times
- Unable to successfully get these devices to recover after they are in this state
- Also tried the workaround in b/62425133 c#6 with no success
I've uploaded logs here from the recovery attempt after the device got into this state:
https://pantheon.corp.google.com/storage/browser/chromiumos-test-logs/bugfiles/cr/780974
,
Dec 8 2017
,
Dec 8 2017
1) What does pressing Tab on the "ChromeOS is damaged" screen reveal?
2) I see "TPM: communication error" on the 2nd screenshot in the bug. This seems to be a variation of http://b/68729265 or maybe http://b/67923075: a TPM reset in the middle of a long command (we are going through OOBE) or just an i2c transaction. There were a number of related fixes recently. vbendeb@ can comment on the exact status.
3) It could also be a DA lockout, especially since Esc-Refresh-Power leads to a disorderly shutdown, but that is unlikely. In any case, it will be seen in the Tab output.
,
Dec 8 2017
recovery.log from comment #7 contains
+ tpmc def 0x20000004 1 0x1
command "def" failed with code 0x1f TPM_IOERROR An IO error occurred transmitting information to the TPM
+ dlog error clobbering lockbox space: 31
but that's fine - lockbox is probably already defined, and can't be re-defined. All other tpm commands issued by the recovery script succeed. Recovery also succeeds based on the logs.
Does the device from comment #7 work after recovery, or does it still stay in the 0x2b/0x2b state?
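As a small aid for reading such logs, the sketch below pulls the failing command and its code out of the quoted tpmc line. It shows that 0x1f is 31 in decimal, which appears to be the same value reported in the "lockbox space: 31" line. The function name is hypothetical and the regular expression is based only on the excerpt above.

```python
import re

# Matches the failure line as it appears in the quoted recovery.log excerpt.
FAIL_RE = re.compile(r'command "(\w+)" failed with code (0x[0-9a-fA-F]+)')


def parse_tpmc_failure(log_text):
    """Return (command, code) for the first tpmc failure, or None if absent."""
    match = FAIL_RE.search(log_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2), 16)


log = (
    '+ tpmc def 0x20000004 1 0x1\n'
    'command "def" failed with code 0x1f TPM_IOERROR\n'
    '+ dlog error clobbering lockbox space: 31\n'
)
command, code = parse_tpmc_failure(log)
print(command, code)  # def 31
```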
,
Dec 8 2017
Also re #7: the issue addressable by the procedure from http://b/62425133#comment6 is specific to TPM 1.2 chips. It can't happen on bob/eve, which have H1. So, that workaround does nothing in our case.
,
Dec 8 2017
Yes, while debugging some SPI interface issues on a different platform, a few bugs were discovered in both Cr50 and AP firmware. Assigning to philipchen@ to coordinate testing and release of the new firmware.
,
Dec 8 2017
re #7:
Pressing tab at the ChromeOS is damaged screen shows:
Recovery reason: 0x2b / 0x2b Secure NVRAM (TPM) Initialization error
,
Dec 8 2017
re #10: Both my bob devices are still in this state. I thought perhaps the recovery process was happening (after connecting the recovery usb drive) when it was displaying a black screen so I left it for over 2 hours yesterday. The device never completed the recovery process.
,
Dec 8 2017
,
Dec 8 2017
Matthew, how can I get one of these Bobs?
,
Dec 8 2017
Matthew gave me two Bobs. One of them (tag C112147) is actually booting up happily, but its battery was fully exhausted and it required a power adapter. The other one was still stuck; examining the Cr50 console showed that the issue was the 'stuck tpm reset', fixed by https://chromium-review.googlesource.com/756914. The first one was probably in the same condition, which cleared when the battery fully drained.
Until the fix is released, to get a device out of this state without waiting for the battery to drain, take it through the "battery cut off procedure": plug in the AC charger, press Power, Esc, and Refresh together, and while holding them pull out the AC charger.
As mentioned in #12, Bob needs a firmware update.
,
Dec 20 2017
Do we have all the fixes in the firmware branch (firmware-gru-8785.B)?
,
Dec 20 2017
Actually these two patches are still missing:
https://chromium-review.googlesource.com/#/c/chromiumos/third_party/coreboot/+/836451 UPSTREAM: spi/tpm: claim locality just once during boot
https://chromium-review.googlesource.com/#/c/chromiumos/third_party/coreboot/+/836452 UPSTREAM: spi/tpm.c do not waste time on wake pulses unless necessary
Also, this needs to be tested to see that nothing else slipped through the cracks.
,
Dec 21 2017
Vadim, what test are you referring to? If it's a Cr50-specific stress test, can you or anyone from the Cr50 team check whether the new firmware (8785.252.0) is fine? If it's a general test, partners can help run FAFT, and our test team can run the regular firmware qual flow. OTOH, is it an RO change?
,
Dec 21 2017
Philip, I am referring to tests a new firmware image is supposed to pass before it can be released. This is a standard procedure in Chrome OS.
,
Jan 4 2018
This is an edge-case workflow and not an M64 regression. Removing the blocking status, especially as this appears to require a firmware change.
,
Jan 18 2018
Is this a RO FW change?
,
Jan 23 2018
(ping) Please advise whether this is a RO FW change or just a RW one?
,
Jan 24 2018
Yes, this change only affects RO firmware.
,
Jan 26 2018
Asus have passed FAFT with the new firmware (8785.256.0). I've submitted FW qual request.
,
Jan 30 2018
We are still seeing this issue. We saw it while running the faft_ec -> firmware_ECWriteProtect.dev and faft_bios -> firmware_TryFwB.dev tests.
DUT configuration
--------------------------
Type of hardware : bob PVT SKU4
Chrome OS Version : 10353.0.0 (Official Build) dev-channel bob test
BIOS Version : Google_Bob.8785.256.0 / Google_Bob.8785.256.0
EC Version : bob_v1.10.231-78edffd / bob_v1.10.231-78edffd
cr50 Version : 0.0.10 / 0.0.24
CPU arch : aarch64
CPU model : ARMv8 Processor rev 4 (v8l)
CPU speed : 1512.0000
Total Memory : 3904912 kB
Memory Type : 1-44: Micron Technology | 00000000 | MT52L512M32D2PF-107WT:B
MMC Model : DF4032
MMC Firmware : 0x3130393039392020
,
Feb 1 2018
Looks like this issue is not solved with the 8785.256.0 FW, where the patches in #19 are included. Vadim, Julius, thoughts?
,
Feb 1 2018
Have all the relevant AP firmware changes been made to the gru branch? Cr50 is running version 24, which also means there are some fixes missing. We are about to release an MP image for Cr50 with all the fixes; let's try this again then (provided AP firmware is also updated).
,
Feb 5 2018
Vadim, can you point out the missing Cr50-related fixes on Bob's firmware branch (firmware-gru-8785.B)?
,
Feb 5 2018
Philip, I have not been following firmware updates closely, and I am not sure what shape the gru firmware branch is in. Do you have a setup where this problem can be reproduced? I can give you a new Cr50 image to try.
,
Mar 21 2018
FW QA team confirmed this issue is fixed in 8785.262.0.
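To check whether a given Bob already carries the fixed AP firmware, the version string can be compared against 8785.262.0. The sketch below is one way to do that; the helper names are hypothetical, and it assumes that dotted version numbers on the same branch increase with each build.

```python
def parse_fw_version(version):
    """Turn '8785.256.0' or 'Google_Bob.8785.256.0' into a tuple of ints."""
    parts = version.split(".")
    # Drop a leading board prefix such as 'Google_Bob' if present.
    if not parts[0].isdigit():
        parts = parts[1:]
    return tuple(int(p) for p in parts)


# Version reported as fixed by the FW QA team in this thread.
FIXED_IN = parse_fw_version("8785.262.0")


def has_fix(version):
    """True if the version is on the same branch and at or past the fix."""
    parsed = parse_fw_version(version)
    return parsed[0] == FIXED_IN[0] and parsed >= FIXED_IN


print(has_fix("Google_Bob.8785.256.0"))  # False
print(has_fix("Google_Bob.8785.262.0"))  # True
```

The branch check keeps the comparison from being applied to firmware from a different branch, where the numbering says nothing about this fix.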
Comment 1 by venkatar...@chromium.org
, Nov 2 2017