When we encounter a TPM error in firmware, we store the TPM error code in the recovery subcode field for easier debugging in the field. Except that we don't for vb2ex_tpm_clear_owner(), which always returns VB2_ERROR_EX_TPM_CLEAR_OWNER regardless of what non-zero return value it got from the Tlcl function it called. Other paths that contain TPM accesses (like setup_tpm()) can't get their response code up to vboot at all.
Even if this did work as intended, it would still be kinda iffy because we don't do a good job of differentiating protocol errors returned by the TPM from transfer errors that prevented us from even talking to it. For transfer errors we always return VB2_ERROR_UNKNOWN from tpm_send_receive(), even if the underlying TIS implementation could have provided a more detailed code. Truncated down to 8 bits (in the recovery subcode field), this looks the same as the TPM_AUTHFAIL protocol error code.
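To make the collision concrete, here is a minimal sketch of the truncation. The numeric values are assumptions recalled from the vboot and TPM 1.2 headers (VB2_ERROR_UNKNOWN as 0x10000001, TPM_AUTHFAIL as 1) and should be checked against the real definitions; the EXAMPLE_ names and to_recovery_subcode() are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Assumed values, not copied from the real headers: verify before relying on them. */
#define EXAMPLE_VB2_ERROR_UNKNOWN 0x10000001u /* assumed: generic vboot error */
#define EXAMPLE_TPM_AUTHFAIL      0x00000001u /* assumed: TPM 1.2 TPM_BASE + 1 */

/* Hypothetical helper: keep only the low byte, as the one-byte subcode field does. */
static uint8_t to_recovery_subcode(uint32_t code)
{
	return (uint8_t)(code & 0xff);
}
```

Under those assumed values, both codes reduce to the same subcode byte, so the recovery reason alone can't tell a failed transfer from an authentication failure reported by the TPM.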
The Tlcl library in coreboot has a bunch of "vboot local" error codes (like TPM_E_COMMUNICATION_ERROR) that seem to be defined for this, but they're mostly not used anymore. They're also not super helpful since they start at 0x5000, which gets masked away when truncating to 8 bits. The TPM 1.2 spec's fatal errors (the most common type we'd expect to see) range from 1 to 99, so we could easily split the one-byte subcode space we have into TPM protocol errors (0x00-0x7f) and internal error codes (0x80-0xff).
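One way the proposed split could look is sketched below. Everything here is hypothetical: tpm_error_to_subcode(), the SUBCODE_ and LOCAL_ERROR_BASE names, and the choice of 0xff as a catch-all are illustration only, not existing vboot or coreboot code. The only inputs taken from this report are the 0x5000 base of the local codes and the 0x00-0x7f / 0x80-0xff ranges.

```c
#include <stdint.h>

#define SUBCODE_INTERNAL_FLAG    0x80   /* hypothetical: top bit marks an internal error */
#define SUBCODE_INTERNAL_UNKNOWN 0xff   /* hypothetical: catch-all for anything unmappable */
#define LOCAL_ERROR_BASE         0x5000 /* base of the "vboot local" codes per this report */

/*
 * Hypothetical mapping from a 32-bit error code to the one-byte subcode:
 *   0x00-0x7f  TPM protocol error, passed through unchanged
 *   0x80-0xfe  internal (local) error, offset from LOCAL_ERROR_BASE
 *   0xff       anything that fits neither range
 */
static uint8_t tpm_error_to_subcode(uint32_t err)
{
	if (err <= 0x7f)
		return (uint8_t)err;

	if (err >= LOCAL_ERROR_BASE && err - LOCAL_ERROR_BASE < 0x7f)
		return (uint8_t)(SUBCODE_INTERNAL_FLAG | (err - LOCAL_ERROR_BASE));

	return SUBCODE_INTERNAL_UNKNOWN;
}
```

With this mapping a protocol error such as 1 stays 1, while a local code at offset 5 from the base becomes 0x85, so the two kinds of failure no longer alias after truncation. Codes outside both ranges collapse to 0xff instead of silently landing in the protocol range.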
If someone could clean this all up we might have a much easier time debugging random 1-out-of-10000-tests TPM errors in the lab like chrome-os-partner:55764. (Oh, and while you're at it, please add the recovery subcode to the Chrome OS Recovery Mode event in the eventlog as an additional byte if it's non-zero.)
Comment 1 by apronin@chromium.org, Aug 16 2016