Issue 691815

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Feb 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



reef/snappy/pyro : Rev Repair firmwares and images on Apollo Lake boards to something more recent

Project Member Reported by bleung@chromium.org, Feb 14 2017

Issue description

Here's what atest says are the stable versions for Apollo Lake boards today:
reef                   | R57-9170.0.0
reef/rwfw              | Google_Reef.9042.14.0

snappy/rwfw            | Google_Snappy.8969.0.0
snappy                 | R56-9000.76.0

pyro/rwfw              | Google_Pyro.9042.23.0
pyro                   | R57-9202.0.0


For Reef and Snappy especially, the images are too old. All Apollo Lake boards will FSI on M-57, so repairing snappy to an R56-9000 image, or other boards to pre-branch R57 images, is not productive.
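As a rough illustration of the staleness check described above, the following Python sketch parses the atest listing and flags any OS image that is below milestone 57 or looks like a pre-branch build. The names `parse_listing`, `stale_images`, and `TARGET_MILESTONE` are hypothetical, and the rule that a zero branch number (the middle field of `9170.0.0`) marks a pre-branch build is an assumption made for this example, not something atest itself reports.

```python
import re

TARGET_MILESTONE = 57  # assumption: taken from "will FSI on M-57" above

# Listing copied from the issue description.
STABLE_VERSIONS = """\
reef                   | R57-9170.0.0
reef/rwfw              | Google_Reef.9042.14.0
snappy/rwfw            | Google_Snappy.8969.0.0
snappy                 | R56-9000.76.0
pyro/rwfw              | Google_Pyro.9042.23.0
pyro                   | R57-9202.0.0
"""

# Matches OS image versions such as R57-9170.0.0 (not firmware names).
IMAGE_RE = re.compile(r"^R(\d+)-(\d+)\.(\d+)\.(\d+)$")


def parse_listing(text):
    """Return {key: version} from 'key | version' lines."""
    versions = {}
    for line in text.splitlines():
        if "|" not in line:
            continue
        key, version = (part.strip() for part in line.split("|", 1))
        versions[key] = version
    return versions


def stale_images(versions, target=TARGET_MILESTONE):
    """Return {board: reason} for OS images that look too old."""
    stale = {}
    for key, version in versions.items():
        match = IMAGE_RE.match(version)
        if match is None:
            continue  # firmware entries such as reef/rwfw are skipped
        milestone, _build, branch, _patch = map(int, match.groups())
        if milestone < target:
            stale[key] = "milestone R%d is older than R%d" % (milestone, target)
        elif branch == 0:
            # Assumption: branch number 0 means the build predates the branch.
            stale[key] = "pre-branch build"
    return stale


if __name__ == "__main__":
    flagged = stale_images(parse_listing(STABLE_VERSIONS))
    for board, reason in sorted(flagged.items()):
        print("%s: %s" % (board, reason))
```

The check only looks at OS images; whether a firmware version is recent enough still has to be judged by hand.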

In particular, we've tracked an issue where reef lab machines would exhibit rollback during AU testing and provisioning: issue 690286.

The situation improved when Reef firmware was bumped to 9042.14, but the repair firmware needs to be as close to final firmware as possible. We are very close to finalizing firmware for Reef.

I would suggest that we immediately set the stable build for reef to
R57-9202.24.0
firmware: Google_Reef.9042.43.0

For the other boards, we should use the same R57 image as above, but 
Google_Pyro.9042.41.0 for firmware.

I'll check what Snappy needs as well.


 
Owner: jrbarnette@chromium.org
> For the other boards, we should use the same R57 image as above, but 
> Google_Pyro.9042.41.0 for firmware.

For clarity:  There is no real choice in the version of firmware;
the firmware installed _must_ be the firmware delivered in the
shellball of the assigned repair firmware.

So, for the given build, you get these firmware assignments:
reef-release/R57-9202.24.0               Google_Reef.9042.43.0
pyro-release/R57-9202.24.0               Google_Pyro.9042.41.0
snappy-release/R57-9202.24.0             Google_Snappy.9042.33.0

Comment 3 by bleung@google.com, Feb 14 2017

Sounds good to me. Those firmwares are all better starting points than what was there previously and would give me much better confidence. 
Status: Fixed (was: Untriaged)
$ atest stable_version modify -b pyro -i R57-9202.24.0
Stable version for board pyro is changed from R57-9202.0.0 to R57-9202.24.0.
$ atest stable_version modify -b pyro/rwfw -i Google_Pyro.9042.41.0
Stable version for board pyro/rwfw is changed from Google_Pyro.9042.23.0 to Google_Pyro.9042.41.0.
$ atest stable_version modify -b snappy -i R57-9202.24.0
Stable version for board snappy is changed from R56-9000.76.0 to R57-9202.24.0.
$ atest stable_version modify -b snappy/rwfw -i Google_Snappy.9042.33.0
Stable version for board snappy/rwfw is changed from Google_Snappy.8969.0.0 to Google_Snappy.9042.33.0.
$ atest stable_version modify -b reef -i R57-9202.24.0
Stable version for board reef is changed from R57-9170.0.0 to R57-9202.24.0.
$ atest stable_version modify -b reef/rwfw -i Google_Reef.9042.43.0
Stable version for board reef/rwfw is changed from Google_Reef.9042.14.0 to Google_Reef.9042.43.0.
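For reference, the six commands above follow one pattern: an image assignment for the board, then a firmware assignment for `<board>/rwfw`. The Python sketch below generates them from a single table so the image and firmware versions stay paired. The helper name `stable_version_commands` is hypothetical; the command form is copied from the transcript above, and the script only prints the commands, it does not run atest.

```python
REPAIR_BUILD = "R57-9202.24.0"

# Firmware delivered in the shellball of the repair build, per board
# (values copied from the assignments listed earlier in this issue).
FIRMWARE = {
    "reef": "Google_Reef.9042.43.0",
    "pyro": "Google_Pyro.9042.41.0",
    "snappy": "Google_Snappy.9042.33.0",
}


def stable_version_commands(build, firmware_by_board):
    """Return argv lists: one image and one firmware command per board."""
    commands = []
    for board, firmware in firmware_by_board.items():
        commands.append(
            ["atest", "stable_version", "modify", "-b", board, "-i", build])
        commands.append(
            ["atest", "stable_version", "modify",
             "-b", board + "/rwfw", "-i", firmware])
    return commands


if __name__ == "__main__":
    for argv in stable_version_commands(REPAIR_BUILD, FIRMWARE):
        print("$ " + " ".join(argv))
```

Keeping the firmware in the same table as the build makes it harder to assign a firmware version that does not match the shellball of the chosen image.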

Status: Assigned (was: Fixed)
At least one snappy seems to be failing repair:
    http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row4-rack2-host2/59970227-repair/

This is the relevant symptom:
		FAIL	----	repair.powerwash	timestamp=1487036499	localtime=Feb 13 17:41:39	CrOS auto-update failed for host chromeos2-row4-rack2-host2: RootfsUpdateError: Build snappy-release/R57-9202.24.0 failed to boot on chromeos2-row4-rack2-host2; system rolled back to previous build
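To look for the same symptom across many repair logs, a pattern match on the error text is usually enough. The Python sketch below pulls the host and build out of a status line like the one above. The function name `find_rollback` is hypothetical, and the regular expression is derived from this single sample line, so it assumes other rollback failures use the same wording.

```python
import re

# Pattern derived from the one sample line in this issue (assumption:
# other rollback failures are reported with the same message text).
ROLLBACK_RE = re.compile(
    r"RootfsUpdateError: Build (?P<build>\S+) failed to boot on "
    r"(?P<host>[^\s;]+); system rolled back to previous build"
)

SAMPLE = (
    "\t\tFAIL\t----\trepair.powerwash\ttimestamp=1487036499\t"
    "localtime=Feb 13 17:41:39\tCrOS auto-update failed for host "
    "chromeos2-row4-rack2-host2: RootfsUpdateError: Build "
    "snappy-release/R57-9202.24.0 failed to boot on "
    "chromeos2-row4-rack2-host2; system rolled back to previous build"
)


def find_rollback(line):
    """Return (host, build) if the line reports a rollback, else None."""
    match = ROLLBACK_RE.search(line)
    if match is None:
        return None
    return match.group("host"), match.group("build")


if __name__ == "__main__":
    print(find_rollback(SAMPLE))
```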

However, it seems many DUTs already have the Google_Snappy.9042.33.0
firmware build:
    http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row4-rack2-host1/59970222-reset/

This is similar to the situation seen on reef late last weekend.  I don't
know exactly how that happened.

I'm going to hold this bug open, until we can see whether there's an
impending outage, or just a few problem children.

Status: Fixed (was: Assigned)
We have five problem children:
hostname                       S   last checked         URL
chromeos2-row4-rack2-host8     NO  2017-02-14 06:27:03  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row4-rack2-host8/59981867-repair/
chromeos2-row4-rack2-host7     NO  2017-02-14 03:37:32  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row4-rack2-host7/59979530-repair/
chromeos2-row4-rack2-host10    NO  2017-02-14 08:08:53  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row4-rack2-host10/59983177-repair/
chromeos2-row4-rack2-host2     NO  2017-02-14 04:18:10  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row4-rack2-host2/59980260-repair/
chromeos2-row4-rack1-host16    NO  2017-02-13 19:13:30  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row4-rack1-host16/59972902-repair/

The symptom in all cases is that the DUT fails to boot the
repair image from USB.  That _could_ be a servo problem,
but given the other problems we're seeing with R57-9202.24.0
on snappy, I don't think we want to keep that build around.

I'm closing this, and I'll open a new bug for our snappy snafu.
