tpm2: perform long ops for attestation and tpm_manager asynchronously to other requests |
|||||||||||
Issue descriptiontpm_managerd currently performs TakeOwnershp operation on the same message loop that handles all other incoming requests. That blocks other requests while this operation is in progress. Same is true for PrepareForEnrollment for attestationd - it is done by the same worker thread that handles other requests. Either break these long operations into smaller sub-tasks (still other operations are blocked until for the duration of the long tpm commands sent through trunksd - see issue 772187) or move them to a separate thread. ⛆ |
|
|
,
Oct 12
,
Oct 17
,
Oct 23
,
Oct 25
I dug into TakeOwnership a bit. Creating a new thread and moving the task to the new thread may cause problems. TakeOwnership checks and updates ownership status, reads and writes local store, and updates the 3 passwords in TPM. There are also other tasks that may refresh the ownership status (e.g., ReadSpace, DefineSpace. GetTpmStatus) or updating local store (e.g., SavePolicyRecord, DeletePolicyRecord). Parallelizing it with the other thread seems to be dangerous. Although we might be able to avoid that now, we still need to be very careful when updating tpm_managerd in the future. On the other hand, breaking TakeOwnership into multiple small sub-tasks is do-able, however, the benefit we can get might be small. TakeOwnership can be divided into the following 5 steps: 1. Check ownership status. If TPM is already owned, done. 2. Read passwords from local store. 3. If there's no password found in step 2, generate 3 passwords (endorsement, lockout, owner). 4. Store the 3 passwords into local store. 5. Call trunksd to update the 3 passwords in TPM. I did some performance tests on my soraka. Here is the result of each step: 1. status check: 100~150ms 2. read local store: near 0ms 3. passwords generation: 100~150ms 4. write local store: ~80ms 5. call trunk::TpmUtilityImpl::TakeOwnership(): ~1800ms Step 1 definitely can be separated from the other 4 steps, however, I am not sure if dividing the other 4 in any way is safe. First, TPM initialization reads/writes the passwords part in local store, whereas SavePolicyRecord and DeletePolicyRecord reads/writes the nvram policy part. Currently local store is read and written as a whole, i.e., we cannot just write the passwords part only. If a policy record update task is processed between step 2 and 4 above, the nvram policy update will be lost, although I don't know if policy update could possibly happen during TPM initialization. Even we can divide step 2 to 5 into two tasks (2-3 and 4-5), the bottleneck is on the second task, and it looks like we don't gain much benefit from the task division. I am wondering what makes step 5 so slow. Does it involve asymmetric key computation? Can step 5 itself be divided or sped up (using ECC for example)? I haven't looked into PrepareForEnrollment in attestationd yet. Will do it later.
,
Oct 25
Is step 5 slow because it does RSA salt encryption when starting an unbound session? Can we generate the salt and encrypt it at an earlier stage (e.g., when pre-owning TPM) and use it when later calling TpmUtilityImpl::TakeOwnership()?
,
Oct 25
Yes, 1) Running in parallel would require additional synchronization. Plus, we'd have two threads talking to trunks, which either requires two trunks clients, or both threads using the same underlying trunks-communication-thread and synchronizing there. 2) For breaking into sub-tasks: we'd need to break TakeOwnership() into sub-steps to take advantage. TakeOwnership includes several calls to TPM (via trunksd) and this is where we should get extra savings. So, either takeOwnership in TpmUtility should be re-written to consist of several steps, or approach (1) should be taken with the whole take ownership operation being executed on a separate thread, and that thread synchronizing with the rest naturally when it needs to send commands to TPM via trunksd. 3) TakeOwnership contains two long ops - Create for salting key and CreatePrimary for SRK. See https://crbug.com/772187#c28 for an example. There is little that can be done on trunksd/tpm_managerd side to reduce their duration, short of switching to ECC keys. 4) Yes, as discussed offline, we won't be able to avoid blocking other operations on waiting until CreatePrimary/Create are processed before they can access the TPM themselves. But, at least, we can decrease the latency for other operations to this unavoidable single-command delay, which is a partial win already. 5) Note that for TakeOwnership we already try to move these operations to an earlier stage - pre-owning the TPM in the hopes that pre-owning is done by the time user reaches login screen. And that's true in most cases, but is not guaranteed. And pre-owning task deserves the same break down into smaller stages in case other TPM-related activity starts while it is still going, to reduce latency for those other tasks.
,
Oct 26
In TpmUtility::TakeOwnership(), StartUnboundSession() takes a significant part of time, and the following three SetHierarchyAuthorization() calls are bound to the session. I am not sure if they can be divided. Maybe we need virtual session? I need to read the chapter Authorizations and Sessions in the TPM 2.0 book to know more about sessions. However, I found the first step in TakeOwnership(), the CreateStorageAndSaltingKeys() call, can be avoided if the TPM is already pre-owned. That step takes ~300ms. The be able to safely skip the step, we need to make sure both SRK and salting key already exist and handle all possible cases correctly. For example, machine may crash during pre-owning TPM, and we want to make sure the pre-owning process is run again after reboot. Currently TpmUtility::PrepareForOwnership() only checks if owner password is set before skipping CreateStorageAndSaltingKeys(). That needs to be fixed so that we can safely skip CreateStorageAndSaltingKeys() in TakeOwnership().
,
Oct 26
> StartUnboundSession takes a significant part of time What's the duration of it? I'd guess, ~800ms? This is the time it takes to start a salted session. And in this case, I'm afraid, we want a salted session to avoid sending owner, enrollment and lockout passwords plaintext over dbus and spi/i2c. We reuse the same session 3 times, so that's good. > three SetHierarchyAuthorization() calls are bound to the session. I am not sure if they can be divided. Maybe we need virtual session? Technically, even if they are a part of the same method, nothing is guaranteed. If there are multiple other callers (other daemons, for example) that talk to the TPM and create sessions at the same, they can close this session if all session handles are taken between StartUnboundSession and one of SetHierarchyAuthorization (http://cs/chromeos_public/src/platform2/trunks/resource_manager.cc?l=460). And virtual sessions would help to prevent that. It's just that in practice that doesn't happen. Yes, stretching TakeOwnership over several calls potentially increases the time between StartUnboundSession and subsequent SetHierarchyAuthorization. So, having virtual sessions becomes slightly more important. But as the first step we can just make sure that the session object is preserved between these separate parts and is destroyed only after all hierarchy authorizations are set. And in practice it will work. So, shouldn't necessarily depend on virtual session handles. > Currently TpmUtility::PrepareForOwnership() only checks if owner password is set before skipping CreateStorageAndSaltingKeys(). Note that CreateStorageAndSaltingKeys internally checks if the keys were already created and doesn't do that again (it does send commands to check that the keys exist, but those are relatively short compared to long Crete/CreatePrimary commands). So, not skipping it is not a big issue. See http://cs/chromeos_public/src/platform2/trunks/tpm_utility_impl.cc?l=1951 http://cs/chromeos_public/src/platform2/trunks/tpm_utility_impl.cc?l=2038
,
Oct 26
Capturing chat discussion on this subj: We want to either (a) break the TakeOwnership workflow (and other long operations) into smaller tasks (regardless of what functions are called below) running on the same thread, or (b) run TakeOwnership on a separate thread and post commands sent by it back to the original thread as individual tasks. Option (b) essentially achieves the same goal - having smaller tasks. One of the ways to deal with it is to have multiple threads: (1) a pool of threads that handle requests posted to trunksd, with multiple threads that can run simultaneously. so that if request B comes while request A is still being handled it doesn't have to wait for A to finish before it starts. and then we have two options: A] let each thread send commands to trunksd on its own - those requests will be synchronized on trunksd side anyways, but we need to create several trunksd client objects - one for each thread or B] have a single thread that handles sending TPM commands to trunksd, and all threads from pool (1) post commands there when they need to send them to the TPM. Regardless of [A] or [B], it likely makes sense to switch from synchronous SendAndWait approach (send command to trunksd and keep the task & thread idle until we get the response) to SendAndNotify approach - specify what task needs to be posted back on the thread/pool when a response for our command is received and end the current task and release the thread as soon as the command is sent. We'll have to deal with more careful synchronization compared to the single-thread-that-synchronously-handles-everything approach, but we need to have some form of this change to improve latency.
,
Oct 26
Another way to put it: All that is a part of the bigger refactoring. Not only tpm_managerd needs to switch to such approach, but also other daemons that talk to trunksd: cryptohomed, attestationd, etc. Thus we likely want to make support for this functionality a part of some common library like libtrunks, so that we don't need to re-implement it for every daemon separately. Rather than solving the individual TakeOwnership problem, let's create a generic mechanism that allows such parallelization, since it is needed in more than one place, and use it for TakeOwnership benefit.
,
Nov 2
,
Nov 28
,
Dec 12
,
Dec 14
|
||||||||
►
Sign in to add a comment |
|||||||||||
Comment 1 by apronin@chromium.org
, Jan 13 2018