security_SandboxedServices failing daisydog checks in VMTest |
||||||||||||||||
Issue descriptionFirst instance: https://uberchromegw.corp.google.com/i/chromeos/builders/x86-mario-paladin/builds/11919 Then we had a few green runs, followed by: https://uberchromegw.corp.google.com/i/chromeos/builders/x86-mario-paladin/builds/11923 After which it has been failing consistently. The relevant error log seems to be: 11/18 20:29:20.062 DEBUG| base_utils:0185| Running 'scanelf -qF'%s#F' -gs __asan_init `which debugd`' 11/18 20:29:20.076 DEBUG| asan:0026| running_on_asan(): symbol: '', _ASAN_SYMBOL: '__asan_init' 11/18 20:29:20.077 ERROR|security_Sandboxed:0270| daisydog: bad user: wanted "watchdog" but got "root" 11/18 20:29:20.081 WARNI|security_Sandboxed:0308| Stale baselines: set(['thermal.sh', 'attestationd', 'brcm_patchram_p', 'cromo', 'easy_unlock', 'sslh-fork', 'timberslide', 'wimax-manager', 'esif_ufd', 'arc-networkd', 'X', 'tpm_managerd', 'tcsd', 'arc-obb-mounter', 'lid_touchpad_he']) 11/18 20:29:20.083 WARNI|security_Sandboxed:0311| New services: set(['avahi-daemon', 'nacl_helper_non']) 11/18 20:29:20.085 ERROR|security_Sandboxed:0326| Failed sandboxing: ['daisydog'] 11/18 20:29:20.088 DEBUG| base_utils:0185| Running 'logger "autotest finished iteration /usr/local/autotest/results/default/security_SandboxedServices/sysinfo/iteration.1"' 11/18 20:29:20.091 WARNI| test:0606| Autotest caught exception when running test: Traceback (most recent call last): File "/usr/local/autotest/common_lib/test.py", line 600, in _exec _call_test_function(self.execute, *p_args, **p_dargs) File "/usr/local/autotest/common_lib/test.py", line 804, in _call_test_function return func(*args, **dargs) File "/usr/local/autotest/common_lib/test.py", line 461, in execute dargs) File "/usr/local/autotest/common_lib/test.py", line 347, in _call_run_once_with_retry postprocess_profiled_run, args, dargs) File "/usr/local/autotest/common_lib/test.py", line 376, in _call_run_once self.run_once(*args, **dargs) File "/usr/local/autotest/tests/security_SandboxedServices/security_SandboxedServices.py", line 327, in run_once raise error.TestFail("One or more processes failed sandboxing")
,
Nov 19 2016
vapier knows about daisydog.
,
Nov 20 2016
/var/log/messages indicates the watchdog device doesn't actually exist on mario 2016-11-19T01:54:38.298342+00:00 WARNING kernel: [ 4.297259] init: daisydog main process (475) terminated with status 1 2016-11-19T01:54:38.298367+00:00 WARNING kernel: [ 4.297297] init: daisydog main process ended, respawning 2016-11-19T01:54:38.894185+00:00 WARNING kernel: [ 4.893339] init: daisydog main process (574) terminated with status 1 2016-11-19T01:54:38.894202+00:00 WARNING kernel: [ 4.893414] init: daisydog main process ended, respawning 2016-11-19T01:54:39.738149+00:00 WARNING kernel: [ 5.714259] init: daisydog main process (631) terminated with status 1 2016-11-19T01:54:39.738347+00:00 WARNING kernel: [ 5.714299] init: daisydog main process ended, respawning 2016-11-19T01:54:40.527066+00:00 WARNING kernel: [ 6.526282] init: daisydog main process (738) terminated with status 1 2016-11-19T01:54:40.530057+00:00 WARNING kernel: [ 6.529907] init: daisydog main process ended, respawning i don't see platform_HWwatchdog being run. is that test properly configured ?
,
Nov 20 2016
Release builder is also starting to fail: https://uberchromegw.corp.google.com/i/chromeos/builders/x86-mario-release/builds/5438 https://crosland.corp.google.com/log/9004.0.0..9005.0.0, I see one interesting minijail change: https://chromium-review.googlesource.com/#/c/412420/ (daisydog runs in minijail) That change was included in the first failure: https://uberchromegw.corp.google.com/i/chromeos/builders/x86-mario-paladin/builds/11919 Then the CL was committed as part of this master-paladin run: https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/12926, where, for some reason, the test passed on x86-mario-paladin: https://uberchromegw.corp.google.com/i/chromeos/builders/x86-mario-paladin/builds/11922 Later builds starting failing frequently.
,
Nov 20 2016
,
Nov 20 2016
Revert here: https://chromium-review.googlesource.com/#/c/413160/
,
Nov 20 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/21d455364ac7091e72bcce63f4f137daeeb765ef commit 21d455364ac7091e72bcce63f4f137daeeb765ef Author: Nicolas Boichat <drinkcat@chromium.org> Date: Sun Nov 20 10:24:42 2016 Revert "minijail: bring in network namespace loopback support" This reverts commit e92f13937b1f096c490784b492a5f83c1d4ae99f, which appears to be causing failures in x86-mario-paladin. BUG= chromium:665649 BUG= chromium:667075 TEST=x86-mario-paladin VMTest security_SandboxedServices passes. Change-Id: Icfc559ef58dbe07843df1153425007a3c811deeb Reviewed-on: https://chromium-review.googlesource.com/413160 Commit-Queue: Nicolas Boichat <drinkcat@chromium.org> Commit-Queue: Sean Kau <skau@chromium.org> Tested-by: Nicolas Boichat <drinkcat@chromium.org> Trybot-Ready: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Sean Kau <skau@chromium.org> [rename] https://crrev.com/21d455364ac7091e72bcce63f4f137daeeb765ef/chromeos-base/chromeos-minijail/chromeos-minijail-0.0.1-r1470.ebuild
,
Nov 20 2016
Last 2 builds after revert seem fine: https://uberchromegw.corp.google.com/i/chromeos/builders/x86-mario-paladin/builds/11934 https://uberchromegw.corp.google.com/i/chromeos/builders/x86-mario-paladin/builds/11935 Closing, let's follow-up in Issue 665649 .
,
Nov 22 2016
Mike, I don't think platform_HWwatchdog test is routinely run all platforms. Even though I wrote that test, I wasn't able to get it working on all possible platforms. My goal was to validate new platforms would have a working watchdog...but not all platforms use /dev/watchdog. :/
,
Nov 29 2016
i'm not sure how this is passing today short of sheer luck. running daisydog in any VM doesn't make sense because we don't enable any virtual/pseudo watchdog drivers which means daisydog will always fail to start. the reason this test is failing is that we run it after we expect the system to be in a quiescent state. but daisydog is in a restart loop here. and when we launch via minijail, there's a short window where the daisydog process is execed as root because the minijail preload hasn't yet executed the chuid/chgid yet which means the ps output is root:root. what i'm not seeing is why daisydog is in a restart loop in the first place. i would have expected upstart to start it, fail, respawn a few times, and then give up. then this test wouldn't have failed because daisydog would have stopped spawning by the time it ran.
,
Nov 29 2016
ok, grabbing the image and playing around, the issue is that the use of namespaces (plus running under a VM env which is naturally a little slower than hardware) is slowing the startup just enough to keep it below the threshold of the respawn cap. the default cap is 10 respawns within 5 seconds, but we're taking almost 1 second to startup+fail which means we stay below the threshold indefinitely (1 fail/sec is less than the required 2/fail per sec to be halted). i imagine my change to minijail where we bring up the loopback device slows things down just enough to go from breaking the threshold to staying under it. then the security_SandboxedServices autotest runs and gets upset. so the root cause is still that we shouldn't be spawning daisydog in the first place on VMs. or we should be providing a pseudo watchdog device.
,
Dec 6 2016
,
Dec 12 2016
,
Dec 13 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/daisydog/+/ed22845616b31b7e0116a354f7183572c7f912e6 commit ed22845616b31b7e0116a354f7183572c7f912e6 Author: Mike Frysinger <vapier@chromium.org> Date: Mon Dec 12 22:41:24 2016 init: fix startup check logic In my zeal to kill off pointless processes, I missed that this code needs to wait around for daisydog to return when checking for the previous boot status. Rework the env vars so that we don't pass -i during the check phase. BUG= chromium:667075 TEST=added some debugging and saw that the pre-script check phase processed output Change-Id: I005abffd4de489f8d0136491cb11b8515e60824a Reviewed-on: https://chromium-review.googlesource.com/419195 Commit-Ready: Mike Frysinger <vapier@chromium.org> Tested-by: Mike Frysinger <vapier@chromium.org> Reviewed-by: Mike Frysinger <vapier@chromium.org> [modify] https://crrev.com/ed22845616b31b7e0116a354f7183572c7f912e6/daisydog.conf
,
Dec 14 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/daisydog/+/8a756a4d5b30069c5ae5013648785c3b808b57eb commit 8a756a4d5b30069c5ae5013648785c3b808b57eb Author: Mike Frysinger <vapier@chromium.org> Date: Mon Dec 12 22:45:52 2016 init: halt service if initial check fails If the pre-startup phase fails, then there's a problem with the watchdog. Stop upstart from trying to respawn us all the time in that case. This fixes the restart storm/spam seen when we build images for real devices, but then boot them in a VM where there is no watchdog service. BUG= chromium:667075 TEST=startup still works on a real device TEST=startup halts in a VM Change-Id: Icbd22f04dd434ad1e1fe8ae42a2385232b419be0 Reviewed-on: https://chromium-review.googlesource.com/419215 Commit-Ready: Mike Frysinger <vapier@chromium.org> Tested-by: Mike Frysinger <vapier@chromium.org> Reviewed-by: Mattias Nissler <mnissler@chromium.org> [modify] https://crrev.com/8a756a4d5b30069c5ae5013648785c3b808b57eb/daisydog.conf
,
Dec 15 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/0a157aaf3b45f79d914bf19980cb97ffce674245 commit 0a157aaf3b45f79d914bf19980cb97ffce674245 Author: Mike Frysinger <vapier@chromium.org> Date: Thu Nov 17 19:47:55 2016 minijail: bring in network namespace loopback support [reland] We've updated daisydog's init process to deal with failing in VMs, so the issue raised in chromium:667075 should be fixed now. Changes: 7559dfe9ed16 minijail: bring up loopback interface in new net namespaces 531ec3747e71 build: pass down CC from make env BUG= chromium:665649 BUG= chromium:667075 TEST=`sudo ./minijail0 -r -v -p -e /sbin/ip a s` shows loopback as up Change-Id: If8960e31b110f3a38ab2ef45810c42866ae1bfe7 Reviewed-on: https://chromium-review.googlesource.com/420383 Commit-Ready: Mike Frysinger <vapier@chromium.org> Tested-by: Mike Frysinger <vapier@chromium.org> Reviewed-by: Dylan Reid <dgreid@chromium.org> Reviewed-by: Nicolas Boichat <drinkcat@chromium.org> [rename] https://crrev.com/0a157aaf3b45f79d914bf19980cb97ffce674245/chromeos-base/chromeos-minijail/chromeos-minijail-0.0.1-r1471.ebuild
,
Dec 15 2016
should be all set now
,
Mar 4 2017
,
Apr 17 2017
,
May 30 2017
,
Aug 1 2017
,
Oct 14 2017
,
Jun 21 2018
|
||||||||||||||||
►
Sign in to add a comment |
||||||||||||||||
Comment 1 by akes...@chromium.org
, Nov 19 2016