Security test platform_EncryptedStateful fails on Elm board |
|||||||||||
Issue descriptionThe platform_EncryptedStateful test always consistently fails on the Elm board. Please either fix the test, or set it up so that it does not run when the security suite is run on Elm.
,
Sep 26 2016
It turns out the 8th loop device gets eaten by the test itself: https://cs.corp.google.com/chromeos_public/src/third_party/autotest/files/client/site_tests/platform_EncryptedStateful/platform_EncryptedStateful.py?rcl=9590ee35249853d4f12353dca69d2349de9a8326&l=58 No surprise the test is failing, I'll look into doubling the number of available devices.
,
Sep 26 2016
Taking ownership. Proposed fix is here: https://chromium-review.googlesource.com/#/c/389153/
,
Sep 28 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform2/+/8af4e06fe5d6b915f174e170996e174cb1b95ba5 commit 8af4e06fe5d6b915f174e170996e174cb1b95ba5 Author: Mattias Nissler <mnissler@chromium.org> Date: Mon Sep 26 14:38:07 2016 cryptohome: Rework loop device handling. Switch the code to use LOOP_CTL_GET_FREE instead of iterating existing devices. This makes sure we get a new loop device allocated even when all existing devices are used. While at it: * Clean up detach logic to read sysfs to get the list of existing block devices instead of looping over /dev/loop%d. * Handle loop device allocation races more gracefully. Note that I haven't seen this in practice, but allocation race is trivial to reproduce in a small test program. BUG= chromium:649737 TEST=platform_EncryptedStateful passes again. Change-Id: I78d0cac06f4fc000efdd1589c93958e017e6d379 Reviewed-on: https://chromium-review.googlesource.com/389153 Commit-Ready: Mattias Nissler <mnissler@chromium.org> Tested-by: Mattias Nissler <mnissler@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Mike Frysinger <vapier@chromium.org> [modify] https://crrev.com/8af4e06fe5d6b915f174e170996e174cb1b95ba5/cryptohome/mount-helpers.c
,
Sep 28 2016
This should be fixed now, but I'll keep this bug open until I'm seeing the test turn green.
,
Sep 28 2016
elm and veyron_jaq turned green on platform_EncryptedStateful meanwhile. Let's call this fixed.
,
Oct 7 2016
,
Nov 19 2016
,
Jan 21 2017
,
Mar 4 2017
,
Apr 17 2017
,
May 30 2017
,
Aug 1 2017
,
Aug 3 2017
Closing. Please reopen it if its not fixed. Thanks! |
|||||||||||
►
Sign in to add a comment |
|||||||||||
Comment 1 by mnissler@chromium.org
, Sep 26 2016Here's the relevant section from the logs (see inline comments): 09/25 13:28:08.810 DEBUG| base_utils:0185| Running 'mv /dev/tpm0 /dev/tpm0.off' 09/25 13:28:08.833 DEBUG| base_utils:0185| Running 'MOUNT_ENCRYPTED_ROOT=/mnt/stateful_partition/.test-enc-stateful-hK9q2Y mount-encrypted 2>&1 | tee /mnt/stateful_partition/.test-enc-stateful-hK9q2Y/mnt/stateful_partition/mount.log' 09/25 13:28:08.850 DEBUG| base_utils:0280| [stdout] [pid:26868] Starting. 09/25 13:28:08.851 DEBUG| base_utils:0280| [stdout] [pid:26868] VFS mount state sanity check ok. 09/25 13:28:08.852 DEBUG| base_utils:0280| [stdout] ERROR: TPM: Cannot open TPM device /dev/tpm0: No such file or directory 09/25 13:28:08.852 DEBUG| base_utils:0280| [stdout] [pid:26868] TPM not available 09/25 13:28:08.853 DEBUG| base_utils:0280| [stdout] [pid:26868] Could not read TPM Permanent Flags: error 0x5006. 09/25 13:28:08.853 DEBUG| base_utils:0280| [stdout] [pid:26868] Using NVRAM as system key; finalization needed. 09/25 13:28:08.854 DEBUG| base_utils:0280| [stdout] [pid:26868] No usable system key found. 09/25 13:28:08.854 DEBUG| base_utils:0280| [stdout] [pid:26868] /mnt/stateful_partition/.test-enc-stateful-hK9q2Y/mnt/stateful_partition/encrypted.needs-finalization does not exist. 09/25 13:28:08.854 DEBUG| base_utils:0280| [stdout] [pid:26868] Generating new encryption key. 09/25 13:28:08.855 DEBUG| base_utils:0280| [stdout] [pid:26868] Creating sparse backing file with size 3130396672. 09/25 13:28:08.856 DEBUG| base_utils:0280| [stdout] [pid:26868] Loopback attaching /mnt/stateful_partition/.test-enc-stateful-hK9q2Y/mnt/stateful_partition/encrypted.block (named encstateful_f8969101506b16e94). 09/25 13:28:08.856 DEBUG| base_utils:0280| [stdout] ERROR[pid:26868] loop_locate (../../../cryptohome-0.0.1/platform2/cryptohome/mount-helpers.c, 264): Ran out of loopback devices 09/25 13:28:08.856 DEBUG| base_utils:0280| [stdout] ERROR[pid:26868] setup_encrypted (../../../cryptohome-0.0.1/platform2/cryptohome/mount-encrypted.c, 838): loop_attach failed This seems the point where the trouble starts, i.e. the mount attempt fails due to all loopback devices being busy. 09/25 13:28:08.857 DEBUG| base_utils:0280| [stdout] [pid:26868] Done. 09/25 13:28:08.857 DEBUG| base_utils:0185| Running 'mv /dev/tpm0.off /dev/tpm0' 09/25 13:28:08.870 INFO |platform_Encrypted:0020| ok: /mnt/stateful_partition/.test-enc-stateful-hK9q2Y/mnt/stateful_partition/encrypted.block exists 09/25 13:28:08.871 ERROR|platform_Encrypted:0023| FAIL: could not satisfy '/mnt/stateful_partition/.test-enc-stateful-hK9q2Y/mnt/stateful_partition/encrypted.needs-finalization exists' Note that the absence of the needs-finalization file is already detected here. 09/25 13:28:08.877 INFO |platform_Encrypted:0020| ok: /mnt/stateful_partition/.test-enc-stateful-hK9q2Y/mnt/stateful_partition/encrypted.key does not exist 09/25 13:28:08.881 DEBUG| base_utils:0185| Running 'logger "autotest finished iteration /usr/local/autotest/results/default/platform_EncryptedStateful/sysinfo/iteration.1"' 09/25 13:28:08.896 DEBUG| base_utils:0185| Running 'MOUNT_ENCRYPTED_ROOT=/mnt/stateful_partition/.test-enc-stateful-hK9q2Y mount-encrypted umount' 09/25 13:28:08.908 DEBUG| base_utils:0280| [stdout] [pid:26874] Starting. 09/25 13:28:08.910 DEBUG| base_utils:0280| [stdout] [pid:26874] Unmounting /mnt/stateful_partition/.test-enc-stateful-hK9q2Y/var. 09/25 13:28:08.910 DEBUG| base_utils:0280| [stdout] [pid:26874] Unmounting /mnt/stateful_partition/.test-enc-stateful-hK9q2Y/home/chronos. 09/25 13:28:08.911 DEBUG| base_utils:0280| [stdout] [pid:26874] Unmounting /mnt/stateful_partition/.test-enc-stateful-hK9q2Y/mnt/stateful_partition/encrypted. 09/25 13:28:08.911 ERROR| base_utils:0280| [stderr] ERROR[pid:26874] shutdown (../../../cryptohome-0.0.1/platform2/cryptohome/mount-encrypted.c, 1078): umount(/mnt/stateful_partition/.test-enc-stateful-hK9q2Y/mnt/stateful_partition/encrypted): No such file or directory 09/25 13:28:08.922 ERROR| test:0810| Exception autotest_lib.client.common_lib.error.CmdError: CmdError('MOUNT_ENCRYPTED_ROOT=/mnt/stateful_partition/.test-enc-stateful-hK9q2Y mount-encrypted umount', * Command: 09/25 13:28:08.933 ERROR| test:0810| MOUNT_ENCRYPTED_ROOT=/mnt/stateful_partition/.test-enc-stateful-hK9q2Y 09/25 13:28:08.943 ERROR| test:0810| mount-encrypted umount 09/25 13:28:08.950 ERROR| test:0810| Exit status: 1 09/25 13:28:08.957 ERROR| test:0810| Duration: 0.00556206703186 09/25 13:28:08.965 ERROR| test:0810| 09/25 13:28:08.971 ERROR| test:0810| stdout: 09/25 13:28:08.979 ERROR| test:0810| [pid:26874] Starting. 09/25 13:28:08.987 ERROR| test:0810| [pid:26874] Unmounting /mnt/stateful_partition/.test-enc-stateful-hK9q2Y/var. 09/25 13:28:08.997 ERROR| test:0810| [pid:26874] Unmounting /mnt/stateful_partition/.test-enc-stateful-hK9q2Y/home/chronos. 09/25 13:28:09.005 ERROR| test:0810| [pid:26874] Unmounting /mnt/stateful_partition/.test-enc-stateful-hK9q2Y/mnt/stateful_partition/encrypted. 09/25 13:28:09.014 ERROR| test:0810| stderr: 09/25 13:28:09.021 ERROR| test:0810| ERROR[pid:26874] shutdown (../../../cryptohome-0.0.1/platform2/cryptohome/mount-encrypted.c, 1078): umount(/mnt/stateful_partition/.test-enc-stateful-hK9q2Y/mnt/stateful_partition/encrypted): No such file or directory, 'Command returned non-zero exit status') in <bound method EncryptedStateful.__del__ of <platform_EncryptedStateful.EncryptedStateful object at 0xf67a4c30>> ignored 09/25 13:28:09.030 WARNI| test:0606| Autotest caught exception when running test: Traceback (most recent call last): File "/usr/local/autotest/common_lib/test.py", line 600, in _exec _call_test_function(self.execute, *p_args, **p_dargs) File "/usr/local/autotest/common_lib/test.py", line 810, in _call_test_function raise error.UnhandledTestFail(e) UnhandledTestFail: Unhandled OSError: [Errno 2] No such file or directory: '/mnt/stateful_partition/.test-enc-stateful-hK9q2Y/mnt/stateful_partition/encrypted.needs-finalization' Traceback (most recent call last): File "/usr/local/autotest/common_lib/test.py", line 804, in _call_test_function return func(*args, **dargs) File "/usr/local/autotest/common_lib/test.py", line 461, in execute dargs) File "/usr/local/autotest/common_lib/test.py", line 347, in _call_run_once_with_retry postprocess_profiled_run, args, dargs) File "/usr/local/autotest/common_lib/test.py", line 376, in _call_run_once self.run_once(*args, **dargs) File "/usr/local/autotest/tests/platform_EncryptedStateful/platform_EncryptedStateful.py", line 298, in run_once self.no_tpm() File "/usr/local/autotest/tests/platform_EncryptedStateful/platform_EncryptedStateful.py", line 291, in no_tpm encstate.check_sizes(finalized=False) File "/usr/local/autotest/tests/platform_EncryptedStateful/platform_EncryptedStateful.py", line 133, in check_sizes info = os.stat(keyfile) I logged into the DUT to get an idea of how the loopback devices are used (we've probably rebooted since the test, but getting an idea of how things look in steady state still makes sense). This is what I see: localhost ~ # losetup -a /dev/loop0: [45825]:11 (/mnt/stateful_partition/encrypted.block) /dev/loop1: [65024]:88222 (/opt/google/containers/android/system.raw.img) /dev/loop2: [65024]:88203 (/opt/google/containers/android/vendor.raw.img) /dev/loop3: [65024]:24212 (/usr/share/mount-passthrough/rootfs.squashfs) /dev/loop4: [65024]:24212 (/usr/share/mount-passthrough/rootfs.squashfs) /dev/loop5: [65024]:88252 (/opt/google/containers/arc-sdcard/rootfs.squashfs) /dev/loop6: [65024]:88258 (/opt/google/containers/arc-obb-mounter/rootfs.squashfs) As one can see, ARC++ eats 6 loop devices, and we use one for the stateful partition. The maximum is still at 8, so we're probably just running out of loop devices... I'll check against a local test device as well and might bump the number of loop devices in response.