New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 873099 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Last visit > 30 days ago
Closed: Oct 24
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug


Show other hotlists

Hotlists containing this issue:
Cros-Hwsec-Ready


Sign in to add a comment

PCR0 extended again after sleep

Project Member Reported by igorcov@chromium.org, Aug 10

Issue description

Chrome Version: 70.0.3511.0 (official build)
OS: ChromeOS
Firmware version: Google_Eve.9584.160.0

1. Put the device in devmode.
2. Login on the device and go to chrome://settings
3. In Screen lock section, set "Show lock screen when waking from sleep" to enabled.
4. Close the device, sending it to sleep and keep it closed for 5 seconds.
5. Open the device and go to console (ctrl+alt+refresh) and run
trunks_client --read_pcr --index 0

Expected result:
PCR Value: 
23E14DD9BB51A50E16911F7E11DF1E1AAF0B17134DC739C5653607A1EC8DD37A

Actual result:
PCR Value: B3060DF3B6E193F953A63AA8AADF4E93ABA48C41DF17BDC6C92CEAE5436DF6D8

As Mattias noticed, the value is obtained after extending the PCR
again with the same value.

Another observation, if the device is closed only for half a second
and open back, the PCR0 value remains unchanged.
 
Components: OS>Systems
Labels: -Pri-2 Pri-1
Regarding the values:

The value we extend PCR0 with for dev_mode=1,recovery_mode=0,firmware_type=dev is SHA1(010002) = c42ac1c46f1d4e211c735cc7dfad4ff8391110e9, so the value we extend with is c42ac1c46f1d4e211c735cc7dfad4ff8391110e9000000000000000000000000.

extend(0000000000000000000000000000000000000000000000000000000000000000, c42ac1c46f1d4e211c735cc7dfad4ff8391110e9000000000000000000000000) = 23e14dd9bb51a50e16911f7e11df1e1aaf0b17134dc739c5653607a1ec8dd37a

extend(23e14dd9bb51a50e16911f7e11df1e1aaf0b17134dc739c5653607a1ec8dd37a, c42ac1c46f1d4e211c735cc7dfad4ff8391110e9000000000000000000000000) = b3060df3b6e193f953a63aa8aadf4e93aba48c41df17bdc6c92ceae5436df6d8

This issue will cause anything that relies on PCR0 binding to fail. I don't have a comprehensive list of affected features. The one feature that immediately comes to mind is remote attestation, i.e. device would no longer appear to be in verified mode to the PCA after going through suspend.

Note that this shouldn't have security implications because the changed PCR0 value will only lead to loss of access to secrets, not gain them (plus userspace could just extend PCR0 at any time on its own, so no security features can assume that PCR0 remains unchanged).
Cc: jwer...@chromium.org vbendeb@chromium.org dlaurie@chromium.org
Components: OS>Firmware>BIOS
Haven't tried to repro yet (my eve is dead as I found out). But, looks like the firmware decided to extend the PCR again on resume. Even though the PCR values were preserved through Shutdown(STATE) - Startup(STATE).


1) Iiuc, in verstage_main() coreboot indeed always calls extend_pcrs() before locking the TPM: https://chromium.googlesource.com/chromiumos/third_party/coreboot/+/firmware-eve-9584.B/src/vboot/vboot_logic.c#405. Seemingly regardless of VB2_CONTEXT_S3_RESUME flag. We only treat this flag differently when deciding if we want a Startup(STATE) or Startup(CLEAR) here: https://chromium.googlesource.com/chromiumos/third_party/coreboot/+/firmware-eve-9584.B/src/vboot/secdata_tpm.c#449

Is this full path actually used on S3 resume? 


2) Also, we also only set VB2_CONTEXT_S3_RESUME flag, and thus do Startup(STATE) instead of Startup(CLEAR) if CONFIG_RESUME_PATH_SAME_AS_BOOT and CONFIG_VBOOT_STARTS_IN_BOOTBLOCK are both set during build, and vboot_platform_is_resuming() says so.

Otherwise, it's Startup(CLEAR). And Startup(CLEAR) after Startup(STATE) clears PCRs (and re-enables PH). Could it be that config options changed at some point, and where are they even set? Or vboot_platform_is_resuming() started working properly.


3) So, if verstage_main() is indeed called and VB2_CONTEXT_S3_RESUME is set on resume, was it always like that?  I don't see any recent changes to this logic or config options. But I can easily be missing something.

And how could it go unnoticed? I guess, PCR0 matters mostly for attestation, which normally starts right away on the 1st boot before the device has a chance to go to suspend. And people who fail to get certs (after sleep on 1st boot or later) can easily tend to reboot first if something goes wrong, but still very interesting...


4) 

> Another observation, if the device is closed only for half a second
and open back, the PCR0 value remains unchanged.

That likely means we haven't gone to sleep yet in that 0.5 second. So, no suspend, no resume, coreboot is not involved, no changes.
And btw, TPM 1.2 also preserves PCRs if Startup(TPM_ST_STATE) on resume follows a SaveState on suspend. So, if coreboot on those boards follows a similar path on resume, we should also be double-extending the PCRs there.
The full path is used on s3 resume. 

these paths haven't changed in a very long time. i suspect it's been a bug for a while?

If this is a bug what is the proposed solution? Conditionally extend pcrs? And when?

Some files with details:

sysfirm.log - is the log from /sys/firmware/log on the device after boot
sysfirm2.log - is the log from /sys/firmware/log on the device went to sleep and wake up changing the PCR0.
sysfirm.log
58.5 KB View Download
sysfirm2.log
103 KB View Download
From the 2nd log:

coreboot-a6f20956aa Sat Sep  9 02:20:48 UTC 2017 verstage starting...
LPSS I2C bus 1 at 0xfe041000 (400 KHz)
tpm_vendor_probe: ValidSts bit set(1) in TPM_ACCESS register after 102 ms
cr50 TPM 2.0 (i2c 1:0x50 id 0x28)
src/lib/tpm2_tlcl.c:208 index 0x1007 return code 0
Phase 1
FMAP: Found "FLASH" version 1.1 at c10000.
FMAP: base = ff000000 size = 1000000 #areas = 32
FMAP: area GBB found @ c11000 (978944 bytes)
VB2:vb2_check_recovery() Recovery reason from previous boot: 0x0 / 0x0
Phase 2
Phase 3
FMAP: area GBB found @ c11000 (978944 bytes)
FMAP: area VBLOCK_B found @ 5e8000 (65536 bytes)
FMAP: area VBLOCK_B found @ 5e8000 (65536 bytes)
VB2:vb2_verify_keyblock() Checking key block signature...
FMAP: area VBLOCK_B found @ 5e8000 (65536 bytes)
FMAP: area VBLOCK_B found @ 5e8000 (65536 bytes)
VB2:vb2_verify_fw_preamble() Verifying preamble.
Phase 4
FMAP: area FW_MAIN_B found @ 5f8000 (4030400 bytes)
VB2:vb2api_init_hash() HW crypto for hash_alg 3 not supported, using SW
Platform is resuming.
Saving vboot hash.
tlcl_extend: response is 0
tlcl_extend: response is 0
tlcl_lock_nv_write: response is 0
tlcl_lock_nv_write: response is 0
Slot B is selected
CBFS: 'VBOOT' located CBFS at [5f8000:707ac0)
CBFS: Locating 'fallback/romstage'
CBFS: Found @ offset 0 size 12944

We are resuming as indicated by "Platform is resuming" above.
Debug logs from the device after the case was reproduced:
debug-logs_20180810-184833.tgz
2.8 MB Download
Re #4:

>  If this is a bug what is the proposed solution? Conditionally extend pcrs? And when?

Yes, we could do the following:
 + if (!(ctx.flags & VB2_CONTEXT_S3_RESUME)) {
	rv = extend_pcrs(&ctx);
	if (rv) {
		printk(BIOS_WARNING, "Failed to extend TPM PCRs (%#x)\n", rv);
		vb2api_fail(&ctx, VB2_RECOVERY_RO_TPM_U_ERROR, rv);
		save_if_needed(&ctx);
		vboot_reboot();
	}
 + }

Because we only do Startup(STATE) if that flag is set.
Or inside extend_pcrs() check if the PCRs are already set to desired values before extending.
Same thing about disabling platform hierarchy before jumping to OS from depthcharge, btw. But there we call TlclLockPhysicalPresence(), which checks if PH is already disabled and returns success immediately, if so.

Unfortunately verstage is in RO so we can't fix this on existing platforms..

The resume path doesn't go through depthcharge so it shouldn't be touching the platform hierarchy there.
Ok, the main currently affected use case is this then:

1) Boot the OS for the first time: OOBE/after powerwash.
2) Let it sit on the very first screen, don't click through OOBE dialogues. Let it go to S3 (but don't let it go to S5).
3) Wake up (now PCR0 has changed).
4) Click through OOBE, own the device as usual. That'll trigger PrepareForEnrollment with the current PCR0. PCR0 quote will be captured for future use. 
5) If you later Enroll you'll send the incorrect PCR0 quote, and then use that obtained AIK cert to request other signed certificates, you machine will be treated as running in dev mode - so, some certs may be rejected, or a 'dev mode' indicated on them, with all the resulting restrictions on using them.

Alternatively, on steps 1-4, you can start going through OOBE, but then put the device to sleep (e.g. on owner login screen) after that initialization has started, but before PrepareForEnrollment is reached and captures PCR0.

If you get past point 4, the only solution currently is powerwash & go through OOBE w/o sleep. Otherwise, (a) your AIKcert will stay, and (b) even if you deleted it and asked to Enroll again, your PCR0 quotes obtained during PrepareForEnrollment step, will stay in the attestation db.

As a partial mitigation, we can:

(0) don't go to sleep while TPM initialization is ongoing before PrepareForEnrollment completes. If the lid is closed before TPM initialization has started during OOBE, go straight to shutdown instead of sleeping.

(1) on boot, in attestationd, see if the current PCR0 is 'normal mode', but the PCR quotes in attestation db are for a different PCR value. And if so, re-request PCR0 quotes [need to check that we have all necessary auths to do that with owner password lost], and delete old AIKcert. At least, next time you request a cert, Enroll will be triggered again, you get AIKcert for the right quotes etc. 

(2) in addition, in attestationd, refuse to Enroll with non-standard PCR0 values (and optionally suggest to reboot first).

(3) additionally, on PCA side we can recognize double-extend and triple-extend PCR0 values for normal mode to cover probably the most likely go-to-sleep-only-once-or-twice-during-OOBE cases.

However, that means on existing devices we can't introduce new features that add PCR0/PCR1-binding to keys that should continue be loadable after suspend-resume. Like cryptohome key.

And we need an autotest that verifies state after suspend-resume for future boards...
Or, we could change cr50 to detect double-extend situations and leave PCR0 and PCR1 as-is in those cases.
That's a small change and will magically solve our issues (and if we indeed at some point move away from tpm commands for vboot purposed along the lines of http://b/112099050#comment14, wouldn't matter in a grand scheme of things), but I'm not sure how I feel about such a hack. wdyt?
Cc: derat@chromium.org
> 2) Let it sit on the very first screen, don't click through OOBE dialogues. Let it go to S3 (but don't let it go to S5).

I'm pretty sure you cannot go to S3 from the OOBE screen. powerd should always shut down instead when no user is logged in (+derat to confirm). So I think we get lucky there.

> And we need an autotest that verifies state after suspend-resume for future boards...

There's already a firmware_TPMExtend test. Unfortunately, it doesn't try to suspend/resume. We should extend that.

> Or, we could change cr50 to detect double-extend situations and leave PCR0 and PCR1 as-is in those cases.

I think(?) verstage resume has been in there since GLaDOS, so not all affected devices would be cr50. It also generally doesn't sound like a good idea to keep accumulating all these magic hacks that change TPM behavior in cr50. At some point nobody will be able to predict what a certain TPM command will really do anymore if there are all these special cases.
> I'm pretty sure you cannot go to S3 from the OOBE screen.
> powerd should always shut down instead when no user is logged in

It's a little bit more complicated than that; see ash/system/power/power_prefs.cc for the default power management policy sent by Chrome. At the login screen, Chrome tells powerd to idle-suspend while on AC but shut down for idle while on battery or when the lid is closed. However, powerd has additional logic (see StateController::UpdateState) to ignore the idle action entirely if /home/chronos/.oobe_completed doesn't exist. So I think we're still lucky. :-)
> I'm pretty sure you cannot go to S3 from the OOBE screen. powerd should always shut down instead when no user is logged in (+derat to confirm). So I think we get lucky there.

There is still a smaller and much less likely, but still a window of opportunity, on first boot to quickly go through OOBE and login, then close the lid before PrepareForEnrollment finishes. TPM initialization (that culminates in PrepareForEnrollment) runs in parallel with login logic, so it's possible to be logged in, but not initialized yet. Still, if we go straight to shutdown from OOBE that significantly reduces the probability.

> It also generally doesn't sound like a good idea to keep accumulating all these magic hacks that change TPM behavior in cr50.

Yes, I also don't like it too much (or at all). But:

1) That fix is comparable in terms of introducing special cases to magic hacks with recognizing double-extend situations in Chrome OS / on PCA side, and is actually a smaller and more traceable change. We can even make cr50 do it only on 'old' boards, for which we can't change coreboot RO; and don't do the cr50 hack, but rely on coreboot changes, for future boards. This way, no special cases in the future... Can we rely on Board ID for that?

2) At least, that'd allow to implement new features that use PCR0 on eve, pyro and other similar already manufactured devices with H1. Though, yes, won't help in 1.2 case. To the best of my knowledge, though we were already thinking about using PCR0 in some new cases, we don't *have* to do that. So, if suspend-at-OOBE doesn't happen, we can decide to just live with this PCR0 KI.
Ok, just to capture the decision on what to do next with this bug. Please speak up if you have objections.

1) Fix ToT & upstream coreboot as described in comment #8. For the future.

2) Since we can't avoid dealing with existing boards with TPM1.2 that have this issue, no point in fixing it in cr50 now. Will do the workaround from comment #12 when & if we need it.

3) Add UMA: gather data on how often we see wrong PCR0 that doesn't match any normal extended value when running PrepareForEnrollment (that's when we obtain PCR quotes that are later used in enrollment).

4) (a) In Enroll: if a non-standard PCR0 quote is read from attestation db, re-request the quote & save in attestation db. [Looks like we don't need any authorization for that or for loading AIK - to be confirmed] If it's still wrong, refuse to Enroll. 
(b) Properly indicate the condition to upper layers. Create or reuse some error code that indicates to AttestationFlow that enrollment is not possible until reboot. May want to add some form of indication to the user, especially if UMA metrics show that it happens often enough.
Just curious, but if we determine that we have the proper auth to request PCR quotes after we throw the TPM owner password away, could we lazily request the PCR quotes at the time we actually try to enroll, and save those in the DB after a successful enrollment?
Re #17:
> could we lazily request the PCR quotes at the time we actually try to enroll

This will actually make this issue worse.

PrepareForEnrollment, where these quotes are requested now, is run right after taking ownership is completed. Which starts as soon as the user starts oobe, and I'd assume normally completes before the user closes the lid and suspends the device(plus, iiuc, at least for part of oobe we disable suspend and reboot instead of suspend-resume if the lid is closed at that stage).

If we wait till the actual Enroll, who knows how many times the device goes through suspend by the time it is called.

So, we better do it as early as possible, i.e. during PrepareForEnrollment. We *probably* can't do it earlier either (since need AIK to sign, so need to create it, so SRK for sure, probably owner password as well - but can double-check). In any case, moving AIK creation to an earlier time may not be a great idea for other reasons even if we can do it earlier, say, at Pre-Ownership stage.

But we can check during PrepareForEnrollment and/or Enroll and don't go ahead with the PCR0 quote if the value is not from a known set. Then, if the TPM is already owned and PrepareForEnrollment is done, we can do PCR0 at cryptohomed start next time after reboot.

Andrey, do you want me or someone to make the coreboot change for #8?
Re #19:

> do you want me or someone to make the coreboot change for #8?

Go ahead if you have time for it now, I'm not working on it yet.
Since we are not universally fixing it anyways, the priority of item #1 from #16 (== #8) is not high (arguably, the most effective short-term 'fix' is item #4 from #16). For coreboot, even if we don't make the fix for the next released board, that'd only mean a couple months on top of ~5 years we need to live with this double-extension anyways. So, I'm treating that part as P2.
Note that asbestos already recognizes only the "standard" set of PCR0 values (generated from dev-mode-byte = 0 or 1, recovery-mode-byte = 0 or 1, fw-type-byte = 0, 1, or 2). In all other cases, it returns "Invalid PCR quote" in response to Enroll. So, if cryptohomed attempted to Enroll with double-extended value (captured at PrepareForEnrollment time), such Enroll request already would fail.
Andrey, question to help my understanding: The comments above are considering the consequence of capturing the wrong PCR0 value in PrepareForEnrollment(). I was expecting that even in the case we obtain the correct PCR values in PrepareForEnrollment(), subsequent enrollment with the PCA should fail if the PCRs contain the wrong values at that point. Can you clarify whether this is the case or not?
Cc: kitching@chromium.org
Joel, would you like to send a CL out to fix the issue in #8? I filed another bug in this area as well to fix some things up. They can go hand-in-hand.
Cc: allenwebb@chromium.org
Components: OS>Systems>Security
Owner: ----
Status: Available (was: Assigned)
Owner: kitching@chromium.org
https://review.coreboot.org/c/coreboot/+/28750
Project Member

Comment 28 by bugdroid1@chromium.org, Oct 18

Labels: merge-merged-chromeos-2016.05
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/coreboot/+/7a4b45cd994aff302e6e3ec332f73f48457825bb

commit 7a4b45cd994aff302e6e3ec332f73f48457825bb
Author: Joel Kitching <kitching@google.com>
Date: Thu Oct 18 13:16:10 2018

UPSTREAM: vboot: do not extend PCRs on resume from S3

BUG=b:114018226, chromium:873099 
TEST=compile coreboot

Change-Id: If6eac1dcd78f75b21cba8ae4f43180d48469e88e
Signed-off-by: Patrick Georgi <pgeorgi@google.com>
Original-Commit-Id: 6d88a5d5886d4e66bd16b4f59f9ebbfbd1758740
Original-Change-Id: I6840c45604535089fa8410f03c69702bec91218f
Original-Signed-off-by: Joel Kitching <kitching@google.com>
Original-Reviewed-on: https://review.coreboot.org/28750
Original-Tested-by: build bot (Jenkins) <no-reply@coreboot.org>
Original-Reviewed-by: Aaron Durbin <adurbin@chromium.org>
Original-Reviewed-by: Furquan Shaikh <furquan@google.com>
Reviewed-on: https://chromium-review.googlesource.com/1285959
Commit-Ready: Patrick Georgi <pgeorgi@chromium.org>
Tested-by: Patrick Georgi <pgeorgi@chromium.org>
Reviewed-by: Patrick Georgi <pgeorgi@chromium.org>

[modify] https://crrev.com/7a4b45cd994aff302e6e3ec332f73f48457825bb/src/security/vboot/vboot_logic.c

Project Member

Comment 29 by bugdroid1@chromium.org, Oct 23

Labels: merge-merged-firmware-grunt-11031.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/coreboot/+/f9c4d398b2b62c5938b88dce728826050eccd14b

commit f9c4d398b2b62c5938b88dce728826050eccd14b
Author: Joel Kitching <kitching@google.com>
Date: Tue Oct 23 21:43:40 2018

UPSTREAM: vboot: do not extend PCRs on resume from S3

BUG=b:114018226, chromium:873099 
TEST=compile coreboot

Change-Id: If6eac1dcd78f75b21cba8ae4f43180d48469e88e
Signed-off-by: Patrick Georgi <pgeorgi@google.com>
Original-Commit-Id: 6d88a5d5886d4e66bd16b4f59f9ebbfbd1758740
Original-Change-Id: I6840c45604535089fa8410f03c69702bec91218f
Original-Signed-off-by: Joel Kitching <kitching@google.com>
Original-Reviewed-on: https://review.coreboot.org/28750
Original-Tested-by: build bot (Jenkins) <no-reply@coreboot.org>
Original-Reviewed-by: Aaron Durbin <adurbin@chromium.org>
Original-Reviewed-by: Furquan Shaikh <furquan@google.com>
Reviewed-on: https://chromium-review.googlesource.com/1285959
Commit-Ready: Patrick Georgi <pgeorgi@chromium.org>
Tested-by: Patrick Georgi <pgeorgi@chromium.org>
Reviewed-by: Patrick Georgi <pgeorgi@chromium.org>
(cherry picked from commit 7a4b45cd994aff302e6e3ec332f73f48457825bb)
Reviewed-on: https://chromium-review.googlesource.com/c/1296923
Reviewed-by: Martin Roth <martinroth@chromium.org>
Tested-by: Martin Roth <martinroth@chromium.org>

[modify] https://crrev.com/f9c4d398b2b62c5938b88dce728826050eccd14b/src/security/vboot/vboot_logic.c

Status: Fixed (was: Available)

Sign in to add a comment