New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 809096 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Jun 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug



Sign in to add a comment

Dropping shill privileges: Chaps internal error when invoked by shill user

Project Member Reported by mortonm@chromium.org, Feb 5 2018

Issue description

Cc: dkrahn@chromium.org
Components: Security
Labels: OS-Chrome
Doesn't ring a bell for me after cursory inspection, but dkrahn@ will likely have an idea what's going on here. Any chance to get strace output? Some other tracing to see where it goes off rail?
Giving a bit more context. Strongswan invokes Chaps through libchaps.so, which translates pkcs11 API calls into DBUS calls to the Chaps daemon. DBUS config isnt the issue (no errors in log, watched communication with dbus-monitor), but rather a failure in Chaps itself. While Chaps is servicing the DBUS call for Strongswan, it encounters an error (code pointers above) regarding using the "slot" of the current "session" to access cryptographic material (in the TPM?). Strace/auditing Strongswan isnt useful since there are no system-level errors there. Although a trace of whats going on in Chapsd could be -- I'll provide results of a Chapsd strace/system call audit run if it turns up anything useful.

Net log errors:
INFO charon[2924]: 00[LIB] building CRED_PRIVATE_KEY - ANY failed, tried 4 builders
...
INFO charon[2924]: 05[LIB] building CRED_CERTIFICATE - X509 failed, tried 3 builders
INFO charon[2924]: 05[CFG]   loading certificate from '%smartcard1@crypto_module:8712' failed

Chaps log errors:
INFO chapsd[2898]: Starting PKCS #11 services.
INFO chapsd[2898]: Starting D-Bus dispatcher.
INFO chapsd[2898]: Adding slot: 0
INFO chapsd[2898]: Adding slot: 1
INFO chapsd[2898]: Starting asynchronous initialization.
INFO chapsd[2898]: Opening database in: /var/lib/chaps
INFO chapsd[2898]: Slot 0 ready for token at /var/lib/chaps
INFO chapsd[2898]: Master key is ready for token at /var/lib/chaps
ERR chapsd[2898]: GetSessionInfo - CKR_SESSION_HANDLE_INVALID
ERR chapsd[2898]: CloseSession - CKR_SESSION_HANDLE_INVALID
ERR chapsd[2898]: GetSessionInfo - CKR_SESSION_HANDLE_INVALID
* as suspected, no interesting permission/access errors at the syscall level for Chaps, exit condition above regarding "slots" and "sessions" seems to be the cause.
Actually, I think the Chaps errors listed above are potentially misleading, as they happen long after autotest failure. Current hypothesis is that the following lines are the root cause of the issue (cryptohome is involved with authentication for TPM-protected keys in the 'user' slot, which is what is being used here, as described in 'Overview': https://www.chromium.org/developers/design-documents/chaps-technical-design):

2018-02-05T17:40:54.423906-08:00 ERR cryptohomed[932]: ConnectContextAsOwner requires an owner password
2018-02-05T17:40:54.423932-08:00 ERR cryptohomed[932]: GetEndorsementPublicKey: Could not connect to the TPM.
2018-02-05T17:40:54.423956-08:00 ERR cryptohomed[932]: Attestation: Failed to get EK public key.
Cc: apronin@chromium.org
Does crosh still have the chaps_debug command? If so that will get a bunch of detailed logs. The CKR_SESSION_HANDLE_INVALID errors may be legit if a stale session is used. PKCS #11 clients should be resilient to these and start a new session. The cryptohomed errors may also be expected depending on the state of the device, I don't think they should affect the enabling of slot 1 when a user logs in.

+apronin who may be able to help.
I believe these are expected:
ERR chapsd[2898]: GetSessionInfo - CKR_SESSION_HANDLE_INVALID
ERR chapsd[2898]: CloseSession - CKR_SESSION_HANDLE_INVALID
ERR chapsd[2898]: GetSessionInfo - CKR_SESSION_HANDLE_INVALID

They happen during CloseAllSessions, which we call to force all chaps users to re-read the database.
This
2018-02-05T17:40:54.423906-08:00 ERR cryptohomed[932]: ConnectContextAsOwner requires an owner password
2018-02-05T17:40:54.423932-08:00 ERR cryptohomed[932]: GetEndorsementPublicKey: Could not connect to the TPM.
2018-02-05T17:40:54.423956-08:00 ERR cryptohomed[932]: Attestation: Failed to get EK public key.
just indicate that we attempt to call OwnerReadPubek when the owner password is no longer available (typically, after the 1st boot; can also be before the tpm is owned). Somehow we attempt to do PrepareForEnrollment not on the first boot - I wouldn't be too surprised to see on debug devices, where the tpm owner is regularly cleared w/o wiping the disk or something like that. Shouldn't affect user logins (and enabling chaps slot 1 for them), though.
Thanks for the info. I've decided the way i was debugging it (running shill in minijail and invoking l2tpipsec_vpn / strongswan under limited capabilities -- complete with patches and compiler flags to avoid strongswan further messing with capabilities) was too complicated/fragile and too many scattered logs and places to fail. I've backed up to just trying to get l2tpipsec_vpn (and children) to run in a sandbox when shill runs as root, since it turns out shill does some of the TPM initialization stuff. Hopefully taking this on in a more methodical way will be more tractable.
> Does crosh still have the chaps_debug command? 

Yes, crosh still has it.

It'd be good to enable chaps debug and then gather and post full logs (/var/log/messages) here. It's possible that something else is going on given that PrepareForEnrollment is failing. Possibly, the device is in some bad state and needs powerwashing before tpm-related daemons start working correctly on it again.
Status: Fixed (was: Untriaged)
IIRC this was solved just by adjusting DBUS perms for shill: https://chromium-review.googlesource.com/c/chromiumos/platform2/+/1007486

Sign in to add a comment