
Issue 743537

Starred by 1 user

Issue metadata

Status: Archived
Owner:
Closed: May 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug




cr50: Allow skipping silent updates on boot time

Project Member Reported by hungte@chromium.org, Jul 15 2017

Issue description

This is reported from ODM partners.

In the factory, if the initially flashed image carries a newer Cr50 firmware, cr50-*.conf will try to update the firmware and keep the screen black for a very long time (>30s). Operators may think "this device has problems" and try to reboot, which would make things even worse and could corrupt the Cr50 firmware.

If updating Cr50 via AU/trunksd is stable now, can we keep the boot-time update only for particular boards and skip it on newer systems? Or is there a way to configure the system so that it does not run cr50-update at boot time (we can do it later using trunks_send)?

Moreover, I wonder whether setting the board ID could also be isolated into a standalone script, because partners may want to skip the update and still set up the board ID.
 

Comment 1 by vbendeb@google.com, Jul 17 2017

Cr50 updates should never take more than ten or so seconds. On what systems are you seeing delays of 30+ seconds?

An update using trunks_send happens when a new version of Chrome OS is installed, but I think we want to use the latest and greatest Cr50 image as soon as possible.

Note that Cr50 firmware will never be corrupted, as there are A and B sections and one of the two is always valid, so from this point of view interrupted updates are not a problem.
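
To illustrate the A/B argument above, here is a minimal sketch of a generic two-slot update. It is only a toy model: the class and method names are hypothetical, and the rule "write only to the inactive slot, switch after the write completes" is an assumption about how such schemes usually work, not a description of the actual Cr50 updater.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    version: str
    valid: bool = True


class TwoSlotFirmware:
    """Hypothetical model of an A/B firmware layout (not Cr50 code)."""

    def __init__(self, version):
        self.slots = {"A": Slot(version), "B": Slot(version)}
        self.active = "A"

    def inactive(self):
        return "B" if self.active == "A" else "A"

    def update(self, new_version, interrupted=False):
        target = self.inactive()
        # Only the inactive slot is overwritten; it is invalid while being written.
        self.slots[target] = Slot(new_version, valid=False)
        if interrupted:
            # Power loss or reboot: the active slot was never touched.
            return False
        self.slots[target].valid = True
        self.active = target
        return True

    def boot_version(self):
        slot = self.slots[self.active]
        assert slot.valid, "the active slot must always be valid"
        return slot.version


fw = TwoSlotFirmware("0.0.13")
fw.update("0.0.22", interrupted=True)   # operator reboots mid-update
print(fw.boot_version())                # still boots the old image
fw.update("0.0.22")                     # the update is retried on a later boot
print(fw.boot_version())
```

In this model an interrupted update costs a retry but never leaves the device without a bootable image.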

Also, I think there is a lot of flexibility in what can be done in the factory image: you can modify the *.conf scripts as you see fit, as long as the prod Chrome OS image is intact.

Comment 2 by hungte@chromium.org, Jul 17 2017

Ting, can you collect the partner's feedback and provide it here?
Or maybe create specific issues on partner tracker.

> Also, I think there is a lot of flexibility of what can be done in the factory image, you can modify the *.conf scripts as you see fit

Not really. We are using the standard test image and try to keep rootfs verification enabled. As a result, we have to keep the conf files unmodified and figure out a way to work around most of their problems. We tried hard, but it often breaks due to many unexpected changes.

Some details:
1. They were trying to update from 0.0.13 and saw error messages like "Tpm read error in rewritable firmware".
2. The firmware is not corrupted; it just failed to update to the newer version.

The failure rate is quite low (31/6773), so we suspect that an operator (OP) did something wrong and caused this failure.
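
For scale, the reported figure can be turned into a percentage with a couple of lines. The two counts come from the comment above; the variable names are just for illustration.

```python
failed = 31      # units that failed to update, as reported above
total = 6773     # units processed, as reported above

rate = failed / total
print(f"failure rate: {rate:.2%}")                # 0.46%
print(f"about 1 in {round(total / failed)} units")  # about 1 in 218 units
```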

They are trying to disable cr50-*.conf at boot time and invoke them manually in a factory test. Looks good so far.

Attachment: download_20170718_120556.jpg (43.7 KB)

Comment 4 by hungte@chromium.org, Jul 18 2017

My concern is that if we don't have a standard way to do this, different ODMs will try weird things to disable a few tests or jobs and things will get out of control. It will also be very difficult for us to reproduce their issues, since they may have local modifications we're not aware of.

And each project with Cr50 may end up filing the same issue to get a stable update.

Ting, I remember you mentioned 30s, was that still true or was that caused by something else?

I don't think they actually measured the time. Just ignore this.

Comment 6 by hungte@chromium.org, Jul 18 2017

Can I interpret that as "10s may still be long enough for some OP to try to reboot and cause problems"?

I think we need a process that either (1) won't keep the screen black for 10s or (2) even if interrupted during AU, won't cause Cr50 to fail to update on future boots.

When the .conf scripts update Cr50 the screen is not black; it is showing OOBE, or the login dialog if the device already has a user.

The long dark screen on reef clones is the time when the AP firmware is training DRAM for the first time. After that it saves the training data in flash and its hash in the TPM.

Unfortunately this needs to happen again after Cr50 is updated from 0.0.13: when this transition happens the TPM NVMEM is wiped out (it changes from plain to encrypted storage), so the hash is lost and training is repeated.

Maybe this is what they see.
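
The retraining behaviour described above can be sketched as a small cache check. This is a simplified model, not AP firmware code: the `Board` class, its fields, and the use of SHA-256 are all assumptions made for illustration (the thread only says that a hash of the training data is stored in the TPM).

```python
import hashlib


class Board:
    """Hypothetical model of DRAM training data cached in flash with a hash in TPM NVMEM."""

    def __init__(self):
        self.flash_training_data = None  # training data saved in flash
        self.tpm_nv_hash = None          # hash of that data saved in the TPM
        self.full_trainings = 0

    def _train_dram(self):
        # Stand-in for the slow first-time DRAM training.
        self.full_trainings += 1
        return b"placeholder-training-data"

    def boot(self):
        data = self.flash_training_data
        if data is not None and self.tpm_nv_hash == hashlib.sha256(data).digest():
            return "fast"  # cached data is trusted, no retraining
        data = self._train_dram()
        self.flash_training_data = data
        self.tpm_nv_hash = hashlib.sha256(data).digest()
        return "slow"      # long dark screen while training runs

    def wipe_tpm_nvmem(self):
        # Models the NVMEM wipe when moving from plain to encrypted storage.
        self.tpm_nv_hash = None


board = Board()
print(board.boot())     # first boot: slow
print(board.boot())     # later boots: fast
board.wipe_tpm_nvmem()  # Cr50 updated from 0.0.13
print(board.boot())     # hash is gone, so training repeats: slow
```

In this model the flash copy survives the Cr50 update, but without the matching hash it cannot be trusted, which is why the slow path runs once more.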
Status: Archived (was: Untriaged)
The long delay on some x86 platforms when moving from Cr50 version 0.0.13 to 0.0.22 and later is due to MRC re-training DRAM. Also, the factory image no longer updates Cr50 automatically.
