Dropping shill privileges: Chaps internal error when invoked by shill user |
|||
Issue descriptionIn the existing code, the strongswan starter process (/usr/libexec/ipsec/starter) runs as root and uses Chaps for pkcs11 implementation backed by the TPM. Internal chaps error is encountered when calling from non-root user. Failure: https://cs.corp.google.com/chromeos_public/src/platform2/chaps/chaps_service.cc?rcl=2dfc026f850825e25d48fd85368d8d279aef51f7&l=202 --> https://cs.corp.google.com/chromeos_public/src/platform2/chaps/slot_manager_impl.cc?rcl=2dfc026f850825e25d48fd85368d8d279aef51f7&l=448 For reference: https://www.chromium.org/developers/design-documents/tpm-usage , https://www.chromium.org/developers/design-documents/chaps-technical-design
,
Feb 5 2018
Giving a bit more context. Strongswan invokes Chaps through libchaps.so, which translates pkcs11 API calls into DBUS calls to the Chaps daemon. DBUS config isnt the issue (no errors in log, watched communication with dbus-monitor), but rather a failure in Chaps itself. While Chaps is servicing the DBUS call for Strongswan, it encounters an error (code pointers above) regarding using the "slot" of the current "session" to access cryptographic material (in the TPM?). Strace/auditing Strongswan isnt useful since there are no system-level errors there. Although a trace of whats going on in Chapsd could be -- I'll provide results of a Chapsd strace/system call audit run if it turns up anything useful. Net log errors: INFO charon[2924]: 00[LIB] building CRED_PRIVATE_KEY - ANY failed, tried 4 builders ... INFO charon[2924]: 05[LIB] building CRED_CERTIFICATE - X509 failed, tried 3 builders INFO charon[2924]: 05[CFG] loading certificate from '%smartcard1@crypto_module:8712' failed Chaps log errors: INFO chapsd[2898]: Starting PKCS #11 services. INFO chapsd[2898]: Starting D-Bus dispatcher. INFO chapsd[2898]: Adding slot: 0 INFO chapsd[2898]: Adding slot: 1 INFO chapsd[2898]: Starting asynchronous initialization. INFO chapsd[2898]: Opening database in: /var/lib/chaps INFO chapsd[2898]: Slot 0 ready for token at /var/lib/chaps INFO chapsd[2898]: Master key is ready for token at /var/lib/chaps ERR chapsd[2898]: GetSessionInfo - CKR_SESSION_HANDLE_INVALID ERR chapsd[2898]: CloseSession - CKR_SESSION_HANDLE_INVALID ERR chapsd[2898]: GetSessionInfo - CKR_SESSION_HANDLE_INVALID
,
Feb 5 2018
* as suspected, no interesting permission/access errors at the syscall level for Chaps, exit condition above regarding "slots" and "sessions" seems to be the cause.
,
Feb 6 2018
Actually, I think the Chaps errors listed above are potentially misleading, as they happen long after autotest failure. Current hypothesis is that the following lines are the root cause of the issue (cryptohome is involved with authentication for TPM-protected keys in the 'user' slot, which is what is being used here, as described in 'Overview': https://www.chromium.org/developers/design-documents/chaps-technical-design): 2018-02-05T17:40:54.423906-08:00 ERR cryptohomed[932]: ConnectContextAsOwner requires an owner password 2018-02-05T17:40:54.423932-08:00 ERR cryptohomed[932]: GetEndorsementPublicKey: Could not connect to the TPM. 2018-02-05T17:40:54.423956-08:00 ERR cryptohomed[932]: Attestation: Failed to get EK public key.
,
Feb 6 2018
Does crosh still have the chaps_debug command? If so that will get a bunch of detailed logs. The CKR_SESSION_HANDLE_INVALID errors may be legit if a stale session is used. PKCS #11 clients should be resilient to these and start a new session. The cryptohomed errors may also be expected depending on the state of the device, I don't think they should affect the enabling of slot 1 when a user logs in. +apronin who may be able to help.
,
Feb 6 2018
I believe these are expected: ERR chapsd[2898]: GetSessionInfo - CKR_SESSION_HANDLE_INVALID ERR chapsd[2898]: CloseSession - CKR_SESSION_HANDLE_INVALID ERR chapsd[2898]: GetSessionInfo - CKR_SESSION_HANDLE_INVALID They happen during CloseAllSessions, which we call to force all chaps users to re-read the database.
,
Feb 6 2018
This 2018-02-05T17:40:54.423906-08:00 ERR cryptohomed[932]: ConnectContextAsOwner requires an owner password 2018-02-05T17:40:54.423932-08:00 ERR cryptohomed[932]: GetEndorsementPublicKey: Could not connect to the TPM. 2018-02-05T17:40:54.423956-08:00 ERR cryptohomed[932]: Attestation: Failed to get EK public key. just indicate that we attempt to call OwnerReadPubek when the owner password is no longer available (typically, after the 1st boot; can also be before the tpm is owned). Somehow we attempt to do PrepareForEnrollment not on the first boot - I wouldn't be too surprised to see on debug devices, where the tpm owner is regularly cleared w/o wiping the disk or something like that. Shouldn't affect user logins (and enabling chaps slot 1 for them), though.
,
Feb 6 2018
Thanks for the info. I've decided the way i was debugging it (running shill in minijail and invoking l2tpipsec_vpn / strongswan under limited capabilities -- complete with patches and compiler flags to avoid strongswan further messing with capabilities) was too complicated/fragile and too many scattered logs and places to fail. I've backed up to just trying to get l2tpipsec_vpn (and children) to run in a sandbox when shill runs as root, since it turns out shill does some of the TPM initialization stuff. Hopefully taking this on in a more methodical way will be more tractable.
,
Feb 6 2018
> Does crosh still have the chaps_debug command? Yes, crosh still has it. It'd be good to enable chaps debug and then gather and post full logs (/var/log/messages) here. It's possible that something else is going on given that PrepareForEnrollment is failing. Possibly, the device is in some bad state and needs powerwashing before tpm-related daemons start working correctly on it again.
,
Jun 29 2018
IIRC this was solved just by adjusting DBUS perms for shill: https://chromium-review.googlesource.com/c/chromiumos/platform2/+/1007486 |
|||
►
Sign in to add a comment |
|||
Comment 1 by mnissler@chromium.org
, Feb 5 2018Components: Security
Labels: OS-Chrome