
Issue 591366

Starred by 5 users

Issue metadata

Status: Verified
Owner:
Closed: Jun 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 0
Type: Bug-Regression




Recovery fails for latest canary image.

Project Member Reported by tnagel@chromium.org, Mar 2 2016

Issue description

Version: 7995.0.0 canary

What steps will reproduce the problem?
1. Use canary image to recover device.
2. Recovery media is verified and recovery begins.

What is the expected output?
Recovery should finish successfully.

What do you see instead?
Recovery fails.

I've observed this on skate and falco.  (Haven't tried any other boards yet.)

Excerpt from recovery.log:

[...]
ChromeosChrootPostinst(7995.0.0)
Set boot target to /dev/sda3: Partition 3, Slot A
SetImage
KERNEL_CONFIG: console= loglevel=7 init=/sbin/init cros_secure oops=panic panic=-1 root=/dev/dm-0 rootwait ro dm_verity.error_behavior=3 dm_verity.max_bios=-1 dm_verity.dev_wait=1 dm="1 vroot none ro 1,0 2506752 verity payload=PARTUUID=%U/PARTNROFF=1 hashtree=PARTUUID=%U/PARTNROFF=1 hashstart=2506752 alg=sha1 root_hexdigest=7a53d2193c4843f0e7aabd3a833a6c0e3b7dc93a salt=31ae39ac675a590403ce8b1cc351aeefe07b6a6b26c8b6a4374857380d044eea" noinitrd vt.global_cursor_default=0 kern_guid=%U add_efi_memmap boot=local noresume noswap i915.modeset=1 tpm_tis.force=1 tpm_tis.interrupts=0 nmi_watchdog=panic,lapic iTCO_vendor_support.vendorsupport=3  
Setting up verity.
Finished after 9 seconds.
Clearing network driver boot cache: /var/lib/preload-network-drivers.
Syncing filesystems before changing boot order...
Finished after 0 seconds.
Updating Partition Table Attributes using CgptManager...
Updated kernel 2 with Successful = 1 and NumTriesLeft = 6
Checking /mnt/stateful_partition/unencrypted permission.
RemovePackFiles Failed
Touch(/mnt/stateful_partition/.install_completed) FAILED
Starting firmware updater (/tmp/install-mount-point/usr/sbin/chromeos-firmwareupdate --mode=recovery)
Command: /tmp/install-mount-point/usr/sbin/chromeos-firmwareupdate --mode=recovery
Starting Google_Falco firmware updater v4 (recovery)...
 - Updater package: [Google_Falco.4389.92.0 / EC:falco_v1.5.132-c77d95f]
 - Current system:  [RO:Google_Falco.4389.92.0 , ACT:Google_Falco.4389.92.0]
 - Write protection: Hardware: ON, Software: Main=off EC=off
One-time RO+RW update from unstable EC firmware.
Try to update with recovery mode...
mode_recovery: update RO+RW
 Execution failed (1): flashrom -p host -r _vpd_temp.bin
 Messages:
flashrom v0.9.4  : 02c368a : Mar 02 2016 00:49:24 UTC on Linux 3.8.11 (x86_64), built with libpci 3.1.10, GCC 4.9.x-google 20150123 (prerelease), little endian
Cannot stat /var/run/lockCould not acquire lock.
ERROR: Failed to read current main firmware.
ERROR: Execution failed: ./updater4.sh (error code = 1)
Finished after 0 seconds.
Failed Command: /tmp/install-mount-point/usr/sbin/chromeos-firmwareupdate --mode=recovery - Exit Code 1
Firmware update failed (error code: 1).
Rolling back update due to failure installing required firmware.
Successfully updated GPT with all settings to rollback.
PostInstall Failed
Running a hw diagnostics test -- this might take a couple minutes.
[...]
 
Attachment: recovery.log (97.2 KB)
Cc: rspangler@chromium.org
Components: OS>Hardware>Firmware OS>Firmware OS>Firmware>EC
Randall, may I ask who would be a good owner for this (and what is the correct component)?
Labels: -Type-Bug Type-Bug-Regression
Concretely, the failing images were:
chromeos_7995.0.0_daisy-skate_recovery_canary-channel_skate-mp.bin
chromeos_7995.0.0_falco_recovery_canary-channel_mp-v2.bin
Same problem with clapper:
chromeos_7995.0.0_clapper_recovery_canary-channel_mp.bin
Owner: dhend...@chromium.org
Seems like a flashrom problem?

"Cannot stat /var/run/lock
Could not acquire lock."

I found that flashrom had some recent changes to its lock mechanism:
https://chromium-review.googlesource.com/#/c/327407/

so I think the problem is that the environment inside initramfs (recovery) was not updated to support the new lock.

Meanwhile, this may cause a problem when auto-updating from very old devices...
Labels: -ReleaseBlock-Stable ReleaseBlock-Dev
> Meanwhile, this may cause a problem when auto-updating from very old devices...

Setting ReleaseBlock-Dev to prevent broken versions from being pushed to any of the channels.
Looking at the implementation of the file lock inside flashrom, I think we should fix this by either
 (1) always creating the lock folder if it does not exist (inside flashrom & mosys & ecutil), or
 (2) preparing the folder for it inside initramfs.

Well, we can do both...
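
For illustration, option (1) could look roughly like the sketch below. flashrom, mosys and ecutil are C programs, so this shell helper only shows the idea from a wrapper script's point of view; the name ensure_lock_dir and its argument handling are hypothetical and not taken from any of those tools.

```shell
#!/bin/sh
# Hypothetical helper, not code from flashrom/mosys/ecutil.

# Create the lock directory (and any missing parents) if it does not exist.
# Returns non-zero if the path exists but is not a directory.
ensure_lock_dir() {
  lock_dir="${1:-/var/run/lock}"
  if [ -e "${lock_dir}" ] && [ ! -d "${lock_dir}" ]; then
    echo "ensure_lock_dir: ${lock_dir} is not a directory" >&2
    return 1
  fi
  mkdir -p "${lock_dir}"
}
```

A caller would run something like `ensure_lock_dir /var/run/lock || exit 1` before invoking a tool that takes the lock. Calling it repeatedly is harmless because `mkdir -p` succeeds when the directory already exists.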
Some more info: In the current initramfs (src/platform/initramfs/common/fs-layout.txt), /var/run/lock is supposed to be available. Not sure why flashrom can't see it (stat failure).

We probably need to dig further into the recovery program (src/platform/initramfs/recovery) to see whether it replaces /var during its rebind process (for postinst execution)...
Ok I think we should probably change src/platform/initramfs/recovery/init 
 BASE_MOUNTS="/sys /proc /dev"
to
 BASE_MOUNTS="/sys /proc /dev /run /var"

Haven't tried, just my two cents. See what David thinks.
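
To illustrate the intent of that change, here is a minimal sketch of how a list like BASE_MOUNTS might be bind-mounted into the install root. The function name bind_base_mounts, the MOUNT_CMD override and the loop itself are assumptions made for illustration; the actual code in src/platform/initramfs/recovery/init may look quite different.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual recovery init code.
BASE_MOUNTS="/sys /proc /dev /run /var"

# MOUNT_CMD is an assumption added so the loop can be dry-run with "echo".
: "${MOUNT_CMD:=mount}"

# Bind each base mount from the initramfs into the new root so that
# programs run from the new root see the same /run and /var.
bind_base_mounts() {
  new_root="$1"
  for mnt in ${BASE_MOUNTS}; do
    # The bind target must exist before it can be mounted onto.
    mkdir -p "${new_root}${mnt}" || return 1
    ${MOUNT_CMD} --bind "${mnt}" "${new_root}${mnt}" || return 1
  done
}
```

Setting `MOUNT_CMD=echo` before calling `bind_base_mounts /some/root` prints the mount commands instead of running them, which is a cheap way to check the list without root privileges.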

> Ok I think we should probably change src/platform/initramfs/recovery/init 
>  BASE_MOUNTS="/sys /proc /dev"
> to
>  BASE_MOUNTS="/sys /proc /dev /run /var"

This sounds like the right answer in this case.  I think saying
"initramfs will provide a /run like any other linux system does"
is reasonable, possibly even a good idea, so I'd vote for
changing initramfs.

However, for completeness, I'll note we have another hook
available.  Recovery doesn't call "chromeos-install" directly; it
calls a script called "chromeos-recovery".  That script can be
changed to establish environment requirements not provided
by the initramfs setup.  It's there for exactly that purpose, so
we shouldn't be afraid to use it that way, if necessary.

Cc: ka...@chromium.org
Cc: sontis@chromium.org shrawan@chromium.org dchan@chromium.org helenzhang@chromium.org
Re #10: Sounds good. FWIW lmt-req and crash_sender normally put lockfiles there as well, not sure if they get run in recovery mode though.

Another approach would be for flashrom and other tools which use this particular lock to fall back to /tmp, but I think Richard's suggestion is better.
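
For reference, the /tmp fallback mentioned above (not the approach preferred in this thread) could be sketched as follows. The function name pick_lock_dir and the candidate order are hypothetical; the real tools are written in C and do not necessarily choose their lock directory this way.

```shell
#!/bin/sh
# Hypothetical sketch of a lock-directory fallback.

# Print the first candidate that is an existing, writable directory.
# Returns non-zero if none of the candidates is usable.
pick_lock_dir() {
  for dir in "$@"; do
    if [ -d "${dir}" ] && [ -w "${dir}" ]; then
      printf '%s\n' "${dir}"
      return 0
    fi
  done
  return 1
}
```

A caller might use `lock_dir="$(pick_lock_dir /run/lock /var/run/lock /tmp)"`. One drawback of this design is that two tools could end up locking in different directories if they see different filesystems, which is part of why providing a proper /run is the cleaner fix.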
While a fix is in progress, I would ask for a test that can effectively block such recovery regressions from being checked in. Issue 588592 also surfaced recently.

There is an old server-side autotest, platform_InstallRecoveryImage. Is there any way it could be refreshed/modified and included in BVT?

Re comment #14: Automatically testing for these regressions is
somewhat expensive.  I'm not sure that the time and effort to do
it would be well rewarded.

I did some digging and indeed we should be using /run: http://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s15.html

But as Richard pointed out in #10 we still don't mount that. So we need two things:
1. Update utilities to use /run/lock instead of /var/run/lock (CL:329986, CL:329996, CL:330123).
2. Make sure /run gets mounted: https://chromium-review.googlesource.com/330004
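
As a rough sketch of step 2, preparing /run could look something like the following. This is not the content of CL 330004; the function name prepare_run, the MOUNT_CMD override and the tmpfs mode are assumptions, and `mountpoint` is assumed to be available in the environment.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual recovery_init.sh change.

# MOUNT_CMD is an assumption added so the function can be dry-run with "echo".
: "${MOUNT_CMD:=mount}"

# Make sure a tmpfs is mounted on the run directory and that it contains
# a lock subdirectory for tools that now expect /run/lock.
prepare_run() {
  run_dir="${1:-/run}"
  mkdir -p "${run_dir}" || return 1
  # Mount a tmpfs only if nothing is mounted there yet.
  if ! mountpoint -q "${run_dir}"; then
    ${MOUNT_CMD} -t tmpfs -o mode=0755 tmpfs "${run_dir}" || return 1
  fi
  mkdir -p "${run_dir}/lock"
}
```

Because /run is a tmpfs, it does not depend on /var being mounted first, which matches the rationale given for moving the lock files.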
Project Member

Comment 17 by bugdroid1@chromium.org, Mar 3 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/mosys/+/3c5a22baa2ce96c234a5e7064cba8465a1fb6486

commit 3c5a22baa2ce96c234a5e7064cba8465a1fb6486
Author: David Hendricks <dhendrix@chromium.org>
Date: Wed Mar 02 20:27:52 2016

locks: Update lockfile dir to be FHS 3.0 compliant

The Filesystem Hierarchy Standard version 3.0* specifies that /run
should be used for runtime variables such as locks.

The rationale for switching to use /run instead of /var/run was
because /var might not be available at early boot. Since /run is
implemented as a tmpfs and doesn't require /var to be mounted first
it can be made available earlier.

*http://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s15.html

BUG= chromium:591366 
BRANCH=none
TEST=none

Change-Id: I519c280b386a4e035e2b2ac5f7406162f336cbf1
Signed-off-by: David Hendricks <dhendrix@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/329986
Reviewed-by: Shawn N <shawnn@chromium.org>

[modify] https://crrev.com/3c5a22baa2ce96c234a5e7064cba8465a1fb6486/include/mosys/locks.h

Project Member

Comment 18 by bugdroid1@chromium.org, Mar 3 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/flashrom/+/5f84cc774717097016958303f375d872fd6f13a2

commit 5f84cc774717097016958303f375d872fd6f13a2
Author: David Hendricks <dhendrix@chromium.org>
Date: Wed Mar 02 20:39:13 2016

locks: Update lockfile dir to be FHS 3.0 compliant

The Filesystem Hierarchy Standard version 3.0* specifies that /run
should be used for runtime variables such as locks.

The rationale for switching to use /run instead of /var/run was
because /var might not be available at early boot. Since /run is
implemented as a tmpfs and doesn't require /var to be mounted first
it can be made available earlier.

*http://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s15.html

BUG= chromium:591366 
BRANCH=none
TEST=none

Change-Id: I36ca2185cf98c22c6780b3b0f88d7b54954d6baa
Signed-off-by: David Hendricks <dhendrix@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/329996
Reviewed-by: Shawn N <shawnn@chromium.org>

[modify] https://crrev.com/5f84cc774717097016958303f375d872fd6f13a2/locks.h

Project Member

Comment 19 by bugdroid1@chromium.org, Mar 3 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/ec/+/ad7d6516b5dc041f9d2b1947dd550a592db09e0c

commit ad7d6516b5dc041f9d2b1947dd550a592db09e0c
Author: David Hendricks <dhendrix@chromium.org>
Date: Wed Mar 02 20:39:44 2016

locks: Update lockfile dir to be FHS 3.0 compliant

The Filesystem Hierarchy Standard version 3.0* specifies that /run
should be used for runtime variables such as locks.

The rationale for switching to use /run instead of /var/run was
because /var might not be available at early boot. Since /run is
implemented as a tmpfs and doesn't require /var to be mounted first
it can be made available earlier.

*http://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s15.html

BUG= chromium:591366 
BRANCH=none
TEST=none

Change-Id: Ic0b5ff336c1c258db8891c0a17c836497d9793c5
Signed-off-by: David Hendricks <dhendrix@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/330123
Reviewed-by: Shawn N <shawnn@chromium.org>

[modify] https://crrev.com/ad7d6516b5dc041f9d2b1947dd550a592db09e0c/util/lock/locks.h

Comment 20 Deleted

I've updated the utilities and Hung-Te has uploaded a patch to update the recovery image: https://chromium-review.googlesource.com/#/c/330004
Project Member

Comment 22 by bugdroid1@chromium.org, Mar 3 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/initramfs/+/e602124bbfbf676062bbed30ac142a22e0077a33

commit e602124bbfbf676062bbed30ac142a22e0077a33
Author: David Hendricks <dhendrix@chromium.org>
Date: Wed Mar 02 20:46:44 2016

recovery_init: Prepare /run for installer execution.

We should have a directory available for runtime data created by
various programs, e.g. lockfiles. According to FHS 3.0 /run should
be used for this purpose.

http://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s15.html

BUG= chromium:591366 
BRANCH=none
TEST=./build_image --board link;
     ./mod_image_for_recovery.sh --board link;

     [vboot_reference/scripts/image_signing]/
     ./tag_image.sh --update_firmware=1 --from \
       ~/trunk/src/build/images/link/latest/recovery_image.bin
     ./sign_official_build.sh recovery \
       ~/trunk/src/build/images/link/latest/recovery_image.bin \
       ../../tests/devkeys ~/Downloads/new_recovery_image.bin

     Flash the new_recovery_image.bin to USB stick and start recovery.
     Firmware updated without problem.

Change-Id: Ibce2fad02f43ff92dce9525617926e0e60a1c7ef
Signed-off-by: David Hendricks <dhendrix@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/330004
Tested-by: Hung-Te Lin <hungte@chromium.org>
Reviewed-by: Hung-Te Lin <hungte@chromium.org>
Commit-Queue: Hung-Te Lin <hungte@chromium.org>

[modify] https://crrev.com/e602124bbfbf676062bbed30ac142a22e0077a33/recovery/recovery_init.sh

Status: Fixed (was: Started)
fix chumped in.
Status: Verified (was: Fixed)
Project Member

Comment 25 by bugdroid1@chromium.org, Apr 8 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/factory_installer/+/312f06df6f921915f15f976f936eea335c51b236

commit 312f06df6f921915f15f976f936eea335c51b236
Author: Hung-Te Lin <hungte@chromium.org>
Date: Wed Apr 06 10:50:29 2016

netboot: Fix missing /run/lock after chroot invokcation.

After flashrom and mosys have been changed to LFH we have to make sure
/run and /run/lock are both preserved (or re-created) after chroot
calls.

BUG=chrome-os-partner:51705,chrome-os-partner:52038, chromium:591366 
TEST=./build_packages --board=${BOARD}
     ./make_netboot.sh --board=${BOARD} --image_dir=/tmp
     # use /tmp/netboot/vmlinux.bin for netboot kernel.

Change-Id: I32aafc94c5558828c34457d688710051adf0e16d
Reviewed-on: https://chromium-review.googlesource.com/337490
Commit-Ready: Hung-Te Lin <hungte@chromium.org>
Tested-by: Liu Bo <boliu@yifangdigital.com>
Reviewed-by: Shun-Hsing Ou <shunhsingou@chromium.org>

[modify] https://crrev.com/312f06df6f921915f15f976f936eea335c51b236/netboot_postinst.sh

Project Member

Comment 26 by bugdroid1@chromium.org, Apr 8 2016

Labels: merge-merged-factory-strago-7458.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/factory_installer/+/711bdf397466154426305de2fa160ce409b032fa

commit 711bdf397466154426305de2fa160ce409b032fa
Author: Hung-Te Lin <hungte@chromium.org>
Date: Wed Apr 06 10:50:29 2016

netboot: Fix missing /run/lock after chroot invokcation.

After flashrom and mosys have been changed to LFH we have to make sure
/run and /run/lock are both preserved (or re-created) after chroot
calls.

BUG=chrome-os-partner:51705,chrome-os-partner:52038, chromium:591366 
TEST=./build_packages --board=${BOARD}
     ./make_netboot.sh --board=${BOARD} --image_dir=/tmp
     # use /tmp/netboot/vmlinux.bin for netboot kernel.

Change-Id: I32aafc94c5558828c34457d688710051adf0e16d
Reviewed-on: https://chromium-review.googlesource.com/337651
Reviewed-by: Chen Peng <chenpeng@cncoptronics.cn>
Commit-Queue: Chen Peng <chenpeng@cncoptronics.cn>
Tested-by: Chen Peng <chenpeng@cncoptronics.cn>
Reviewed-by: Hung-Te Lin <hungte@chromium.org>

[modify] https://crrev.com/711bdf397466154426305de2fa160ce409b032fa/netboot_postinst.sh

Components: -Internals>Install Internals>Installer
Fixing component typo.
Project Member

Comment 28 by bugdroid1@chromium.org, May 4 2016

Labels: merge-merged-factory-glados-7828.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/flashrom/+/e0b042d8090c2cf2734a0422dc6fa432267a42c5

commit e0b042d8090c2cf2734a0422dc6fa432267a42c5
Author: David Hendricks <dhendrix@chromium.org>
Date: Wed Mar 02 20:39:13 2016

locks: Update lockfile dir to be FHS 3.0 compliant

The Filesystem Hierarchy Standard version 3.0* specifies that /run
should be used for runtime variables such as locks.

The rationale for switching to use /run instead of /var/run was
because /var might not be available at early boot. Since /run is
implemented as a tmpfs and doesn't require /var to be mounted first
it can be made available earlier.

*http://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s15.html

BUG= chromium:591366 
BRANCH=none
TEST=none

Change-Id: I36ca2185cf98c22c6780b3b0f88d7b54954d6baa
Signed-off-by: David Hendricks <dhendrix@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/329996
Reviewed-by: Shawn N <shawnn@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/341150
Tested-by: Chia-Hsiu Chang <chia-hsiu.chang@quantatw.com>
Commit-Queue: Chia-Hsiu Chang <chia-hsiu.chang@quantatw.com>

[modify] https://crrev.com/e0b042d8090c2cf2734a0422dc6fa432267a42c5/locks.h

Project Member

Comment 29 by bugdroid1@chromium.org, May 12 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/factory_installer/+/78ce628946acf4d96269db8f12a707a050df820a

commit 78ce628946acf4d96269db8f12a707a050df820a
Author: Hung-Te Lin <hungte@chromium.org>
Date: Wed Apr 06 10:50:29 2016

netboot: Fix missing /run/lock after chroot invokcation.

After flashrom and mosys have been changed to LFH we have to make sure
/run and /run/lock are both preserved (or re-created) after chroot
calls.

BUG=chrome-os-partner:51705,chrome-os-partner:52038, chromium:591366 
TEST=./build_packages --board=${BOARD}
     ./make_netboot.sh --board=${BOARD} --image_dir=/tmp
     # use /tmp/netboot/vmlinux.bin for netboot kernel.

Change-Id: I32aafc94c5558828c34457d688710051adf0e16d
Reviewed-on: https://chromium-review.googlesource.com/337490
Commit-Ready: Hung-Te Lin <hungte@chromium.org>
Tested-by: Liu Bo <boliu@yifangdigital.com>
Reviewed-by: Shun-Hsing Ou <shunhsingou@chromium.org>
(cherry picked from commit 312f06df6f921915f15f976f936eea335c51b236)
Reviewed-on: https://chromium-review.googlesource.com/337658
Reviewed-by: Nicole Li <nicole.li@intel.com>
Reviewed-by: Hung-Te Lin <hungte@chromium.org>
Tested-by: Chia-Hsiu Chang <chia-hsiu.chang@quantatw.com>
Commit-Queue: Kaiyen Chang <kaiyen.chang@intel.com>

[modify] https://crrev.com/78ce628946acf4d96269db8f12a707a050df820a/netboot_postinst.sh

Labels: -Pri-0 Pri-1
We received a report recently that Falco is having difficulty updating from 4920.76.0 to 8172.39.0 due to this issue.

Log available here: https://paste.googleplex.com/5613374582816768

If somebody on the test team has a few cycles to try this out again, it would be greatly appreciated.
Status: Assigned (was: Verified)
Setting status to Assigned to prevent the issue from falling through the cracks.
Status: Fixed (was: Assigned)
Recovery is working fine.

The issue reported in #30 is with AU (auto-update), not recovery.
AU is different from the recovery image, so you may want to create a separate issue instead of keeping this one assigned.
Cc: dhend...@chromium.org
Labels: -Pri-1 Pri-0
+dhendrix

Hung-Te suggests creating a separate issue.  (Restoring original priority.)
Right, because the components will be different (AU is update_engine, not installer), and the labels here would need to be changed (this has already been merged).
Recovery and AU are different, but they both run the postinstall command or installer script (with different parameters). The AU bug in #30 is about running a new installer script in an old environment. This bug is about running it from the (new) recovery environment, so the two are somewhat related.

Comment 37 by son...@google.com, Jun 17 2016

Status: Verified (was: Fixed)
Verified on build 8172.56.0
