mount-encrypted: encstateful key is recreated on tpm transient errors |
|||||
Issue descriptionIf a transient error happens when mount-encrypted reads the key for encstateful from TPM NVRAM, it'd be treated as the absence of the key data in nvram. A new encrypted.key would be created and old enctstateful contents would be lost.
,
Mar 8 2018
,
Mar 16 2018
,
Mar 26 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform2/+/6d1535430224f3ffdb586e40408946fdbcd4b011 commit 6d1535430224f3ffdb586e40408946fdbcd4b011 Author: Andrey Pronin <apronin@chromium.org> Date: Mon Mar 26 21:41:40 2018 cryptohome: fix lint errors for mount-encrypted Fixes errors reported by cros lint for mount_encrypted.cc: - sorts headers (and removed linux/fs.h conflicting with sys/mount.h); - provides dirnames for local includes; - converts style casts to C++ style; - fixed curly braces in multi-line if-else statements; - disables lint recommendations for getpwnam and getgrnam. This is a preparatory work for fixing error handling in mount-encrypted. BUG=chromium:702724 TEST=cros lint mount_encrypted.cc; build Change-Id: Icdbd06d253f236e7a3ba009b1eff656fa714aa5e Reviewed-on: https://chromium-review.googlesource.com/974782 Commit-Ready: Andrey Pronin <apronin@chromium.org> Tested-by: Andrey Pronin <apronin@chromium.org> Reviewed-by: Sarthak Kukreti <sarthakkukreti@chromium.org> Reviewed-by: Mattias Nissler <mnissler@chromium.org> [modify] https://crrev.com/6d1535430224f3ffdb586e40408946fdbcd4b011/cryptohome/mount_encrypted.cc
,
Mar 27 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/vboot_reference/+/d3ec8f8116b167f58d76f0c798ae1b4d14aec389 commit d3ec8f8116b167f58d76f0c798ae1b4d14aec389 Author: Andrey Pronin <apronin@chromium.org> Date: Tue Mar 27 03:16:46 2018 firmware: move TPM_E constants to tss_constants.h The constants from the list defined in tss_constants.h should be the same values regardless of TPM 1.2 vs 2.0 spec version since AP firmware checks for those exact values in certain cases. Stop defining them separately for TPM 1.2 and 2.0 and move to the common tss_constants.h. Before the change, even though TPM_E constants were defined in TPM spec dependent files, they were defined identically. So, no changes to the behavior are caused by this CL. This is a preparatoryy change to fixing error handling for Tlcl and mount-encrypted. BUG=chromium:702724 BRANCH=none TEST=emerge vboot_reference Change-Id: Ib7a5f41ca55579d053ba63ce07f4bed1394e7ae9 Signed-off-by: Andrey Pronin <apronin@chromium.org> Reviewed-on: https://chromium-review.googlesource.com/976871 Reviewed-by: Vadim Bendebury <vbendeb@chromium.org> [modify] https://crrev.com/d3ec8f8116b167f58d76f0c798ae1b4d14aec389/firmware/include/tpm1_tss_constants.h [modify] https://crrev.com/d3ec8f8116b167f58d76f0c798ae1b4d14aec389/firmware/include/tpm2_tss_constants.h [modify] https://crrev.com/d3ec8f8116b167f58d76f0c798ae1b4d14aec389/firmware/include/tss_constants.h
,
Mar 27 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/vboot_reference/+/3ee5c7d8adf5d78d6864c73bf4f33b95f04a7979 commit 3ee5c7d8adf5d78d6864c73bf4f33b95f04a7979 Author: Andrey Pronin <apronin@chromium.org> Date: Tue Mar 27 21:58:10 2018 firmware: tpm2_lite: propagate actual errors Propagate the actual error - a non-successful response code from the tpm or communication/serializing failure - to the caller of the Tlcl functions in TPM 2.0 case. Currently, the callers only have special processing for the error codes from TCG TPM 1.2 range, which are never returned in case of communication or serialization failures or from the actual TPM 2.0. (The only case of mapping TPM 2.0 error codes to TPM_E_BADINDEX is preserved in this CL.) Thus, changing the actual values returned from the functions won't change any current behavior in the calling layers. This CL is a preparatory work for adding special processing for communication errors in mount-encrypted. BUG=chromium:702724 BRANCH=none TEST=build; test that tpmc getvf, tpmc read still work. Change-Id: I96b20e7285e83f0038abc01e4b7175c938867e7d Signed-off-by: Andrey Pronin <apronin@chromium.org> Reviewed-on: https://chromium-review.googlesource.com/977225 Reviewed-by: Vadim Bendebury <vbendeb@chromium.org> [modify] https://crrev.com/3ee5c7d8adf5d78d6864c73bf4f33b95f04a7979/firmware/lib/tpm2_lite/tlcl.c [modify] https://crrev.com/3ee5c7d8adf5d78d6864c73bf4f33b95f04a7979/firmware/include/tpm2_marshaling.h [modify] https://crrev.com/3ee5c7d8adf5d78d6864c73bf4f33b95f04a7979/firmware/lib/tpm2_lite/marshaling.c
,
Mar 30 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform2/+/a912e7d6ef01d0bda36f1670c8ae60c209684aad commit a912e7d6ef01d0bda36f1670c8ae60c209684aad Author: Andrey Pronin <apronin@chromium.org> Date: Fri Mar 30 02:51:38 2018 cryptohome: refactor mount-encrypted error reporting Switch from booleans/ints for returning status from internal functions to an enum of result codes to allow future extension of the set of distinct error types. This unifies the style of reporting errors across mount-encrypted: before the change some functions were using 0 for success, amnd some weer using 1. The new result codes are also returned from mount-encrypted executable as exit codes to allow the callers implement different reactions to different types of errors. This CL doesn't change any functionality or add new types of errors. In one case, an impossible branch - a check if the result was success after we have already checked that it was an error - is eliminated to slightly simplify the code. BUG=chromium:702724 TEST=build, boot Change-Id: If465fa65c99c61f99e3b1b0bbf9d90af6b1507fa Reviewed-on: https://chromium-review.googlesource.com/972340 Commit-Ready: Andrey Pronin <apronin@chromium.org> Tested-by: Andrey Pronin <apronin@chromium.org> Reviewed-by: Mattias Nissler <mnissler@chromium.org> [modify] https://crrev.com/a912e7d6ef01d0bda36f1670c8ae60c209684aad/cryptohome/mount_encrypted.cc
,
Apr 7 2018
Based on discussion with dwmclary@, the next steps are: 1) In case of seeing a tpm comm error in mount-encrypted, assume that it is transient and may go after reboot, so reboot once. If after one reboot (a flag file to be created to track that), a tpm error is seen again, fall back to the current behavior (i.e. recreate encstateful). 2) Add support for reporting UMA metrics to mount-encrypted. Start reporting these tpm-comm-based reboots, and cases when it is not healed after a single reboot. 3) If there exist (a significant number of) cases when the tpm is not healed by a single reboot, consider propagating it to UI instead of recreating encstateful.
,
Apr 9 2018
Not understanding the goal of recreating encstateful *ever* in response to TPM errors, since that's not likely to make the TPM start working again. Is the idea to allow the device to continue running even with a non-functional TPM? Because I'm not sure that's really what we want to have happen here.
,
Apr 9 2018
Re #9: Recreating encstateful is more the result of over-simplification of mount-encrypted error-handling that some deliberate reaction to TPM errors. Currently, it treats two types of errors the same way: (1) there is actually no encstateful key seed in tpm nvram, and (2) there's an issue communicating to the tpm. mount-encrypted first needs to distinguish these two cases, and the CLs above in this bug are about that. Once it does, then in case (2) it indeed just shouldn't continue booting w/o a functional TPM. The discussion was what to do exactly then: silently reboot, show an error, etc. Step 1 is just a way to handle the known issues (where a single reboot saves the day) w/o introducing new UI elements. Step 2 is about checking if there are other cases not covered by that, and how frequent they are. Step 3 is about handling those cases. But the prio of steps 1/2 is higher than of step 3. In case (1), unless there is the encstateful key still on disk, we have to re-create encstateful, we can't decrypt and use the old one. It's not about making TPM start working again (it is working), it's about fixing the unexpected state it is in - no encstateful key in nvram for whatever reason. (There's a separate bug devoted to eliminating this 'still-on-disk' possibility - issue 758552) A separate question for case (1) is if we should do more and wipe the entire local state, or request full recovery, as having the key neither in tpm nor on disk shouldn't happen w/o tampering or filesystem issues. Re-creating just encstateful is a more cautious (but not necessarily the best) self-repair option here - at least the local user data is preserved after going through OOBE screens and 'adding' the same user(s) again.
,
Apr 9 2018
Just to add to the usefulness of (2). We'd like to understand if what measurable correlations there are with case (1) and other events.
,
May 24 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/vboot_reference/+/41c585ed7482da8ccd898b4118d1414028fe749f commit 41c585ed7482da8ccd898b4118d1414028fe749f Author: Andrey Pronin <apronin@chromium.org> Date: Thu May 24 22:44:34 2018 tpm_lite: stub: retry in case of TPM comm error This CL retries reads and writes from/to TPM device if an error is returned by read()/write(), up to 3 total attempts. This is useful case of transient TPM communication errors that go away after a single retry. Without this CL, after such errors the encstateful key might be regenerated and encstateful data wiped. BRANCH=none BUG=chromium:702724 TEST=1) normal boot still works; 2) simulate a single error, verify that it retries. Change-Id: I259882209df0aad66cd083729f746ea45909922b Reviewed-on: https://chromium-review.googlesource.com/1067939 Commit-Ready: Andrey Pronin <apronin@chromium.org> Tested-by: Andrey Pronin <apronin@chromium.org> Reviewed-by: Andrey Pronin <apronin@chromium.org> [modify] https://crrev.com/41c585ed7482da8ccd898b4118d1414028fe749f/firmware/stub/tpm_lite_stub.c
,
Oct 12
|
|||||
►
Sign in to add a comment |
|||||
Comment 1 by apronin@chromium.org
, Mar 6 2018