Minijail misconfiguration of anomaly-collector
Issue description

Hi Mike,

The following anomaly-collector behaviors have problems with the current minijail configuration.

1) anomaly-collector needs to write to /var/lib/metrics/uma-events

    2018-10-02T08:27:03.122016+00:00 ERR anomaly_collector[2178]: message repeated 10 times: [ /var/lib/metrics/uma-events: cannot open: No such file or directory]

2) crash_reporter launched by anomaly-collector cannot write crashes.

    2018-10-02T08:27:03.130117+00:00 INFO crash_reporter[8664]: libminijail[3670]: mount '/dev/log' -> '/dev/log' type '' flags 0x1001
    2018-10-02T08:27:03.144772+00:00 INFO crash_reporter[8664]: Processing selinux violation: always collect from developer builds
    2018-10-02T08:27:03.145111+00:00 INFO crash_reporter[8664]: Accessing crash dir '/var/spool/crash' via symlinked handle '/proc/self/fd/9'
    2018-10-02T08:27:03.145246+00:00 WARNING crash_reporter[8664]: Failed to write audit message to /proc/self/fd/9/selinux_violation.20181002.082703.0.log: Read-only file system

If you're reverting to a version without minijail in the short term instead of fixing this immediately, please also revert crrev.com/c/1256303 (or prevent it from landing if it has not landed by the time you see this).
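For triaging logs like the ones above, a small parser can help separate the real failures (ERR, WARNING, CRIT) from informational noise. The following is a minimal sketch that assumes every line has the shape `<timestamp> <SEVERITY> <process>[<pid>]: <message>`, as in the excerpts quoted in this report; the names `SyslogEntry`, `parse_syslog_line`, and `is_failure` are hypothetical and not part of any Chrome OS tool.

```python
import re
from typing import NamedTuple, Optional


class SyslogEntry(NamedTuple):
    timestamp: str
    severity: str
    process: str
    pid: int
    message: str


# Assumed layout: "<timestamp> <SEVERITY> <process>[<pid>]: <message>".
_LINE_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+"
    r"(?P<severity>[A-Z]+)\s+"
    r"(?P<process>[^\[\s]+)\[(?P<pid>\d+)\]:\s"
    r"(?P<message>.*)$"
)

# Severities treated as failures in this sketch (an editorial choice).
_FAILURE_SEVERITIES = {"WARNING", "ERR", "CRIT"}


def parse_syslog_line(line: str) -> Optional[SyslogEntry]:
    """Parse one syslog line; return None if it does not match the layout."""
    match = _LINE_RE.match(line.strip())
    if match is None:
        return None
    return SyslogEntry(
        timestamp=match["timestamp"],
        severity=match["severity"],
        process=match["process"],
        pid=int(match["pid"]),
        message=match["message"],
    )


def is_failure(entry: SyslogEntry) -> bool:
    """True when the entry's severity is WARNING, ERR, or CRIT."""
    return entry.severity in _FAILURE_SEVERITIES
```

Feeding it the crash_reporter lines above would flag only the final "Read-only file system" entry, which is the one that points at the missing writable mount.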
,
Oct 3
Oh wow... anomaly-collector is completely broken right now. I was looking into some canary failures, and this popped up in the logs:

    2018-10-03T00:45:22.003849+00:00 ERR anomaly_collector[5859]: Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
    2018-10-03T00:45:22.004582+00:00 CRIT anomaly_collector[5859]: Check failed: dbus->Connect(). Failed to connect to D-Bus#012

I don't think it was causing the crashes I was seeing, but anomaly-collector can't even talk to DBus. This needs to be fixed or reverted ASAP: https://chromium-review.googlesource.com/1214636

Dunno why we don't have any automated tests screaming about this...
,
Oct 3
Hacking in a '-b /run' arg works to fix that for me ('-b /run/dbus' doesn't, because minimalistic-mounts doesn't have a /run directory on which to mount)...
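One way to picture why '-b /run' can succeed where '-b /run/dbus' fails is to model the mount point as being created with a single, non-recursive mkdir inside the new root. That is an assumption made for illustration, not a confirmed description of how minijail sets up mount destinations. The sketch below uses a temporary directory as a stand-in for the new root; `try_create_mount_point` and `demo` are hypothetical names.

```python
import os
import tempfile
from typing import Tuple


def try_create_mount_point(new_root: str, target: str) -> bool:
    """Create `target` under `new_root` with one non-recursive mkdir.

    Returns False when the parent directory is missing, which is the
    situation described for '/run/dbus' when '/run' does not exist.
    """
    path = os.path.join(new_root, target.lstrip("/"))
    try:
        os.mkdir(path)
    except FileNotFoundError:
        # The parent directory does not exist, so there is nowhere to mount.
        return False
    except FileExistsError:
        # Already present; usable as a mount point only if it is a directory.
        return os.path.isdir(path)
    return True


def demo() -> Tuple[bool, bool, bool]:
    """Try /run/dbus, then /run, then /run/dbus again in an empty root."""
    with tempfile.TemporaryDirectory() as new_root:
        first = try_create_mount_point(new_root, "/run/dbus")   # no /run yet
        second = try_create_mount_point(new_root, "/run")       # parent is the root
        third = try_create_mount_point(new_root, "/run/dbus")   # /run now exists
        return first, second, third
```

Under this model the first attempt fails and the later two succeed, which matches the observation that binding the parent directory works while binding only the subdirectory does not.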
,
Oct 3
,
Oct 3
I think my CL from #4 is in a bit of a Catch 22 right now -- because no crash reporting is working at all, we pass tests today (apparently we ignore timeouts when waiting for anomaly-collector...) and now with my CL, we fail with an SELinux warning:

    2018-10-03T08:16:06.238590+00:00 NOTICE crash_sender.sh[16885]: Sending crash:
    2018-10-03T08:16:06.241771+00:00 NOTICE crash_sender.sh[16886]: Metadata: /var/spool/crash/selinux_violation.20181003.081605.0.meta (log)
    2018-10-03T08:16:06.244824+00:00 NOTICE crash_sender.sh[16887]: Payload: /var/spool/crash/selinux_violation.20181003.081605.0.log
    2018-10-03T08:16:06.248806+00:00 NOTICE crash_sender.sh[16888]: Version: 11122.0.2018_10_02_2318
    2018-10-03T08:16:06.252921+00:00 NOTICE crash_sender.sh[16889]: Product: ChromeOS
    2018-10-03T08:16:06.255681+00:00 NOTICE crash_sender.sh[16890]: URL: https://clients2.google.com/cr/report
    2018-10-03T08:16:06.258935+00:00 NOTICE crash_sender.sh[16891]: Board: betty
    2018-10-03T08:16:06.262406+00:00 NOTICE crash_sender.sh[16892]: HWClass: undefined
    2018-10-03T08:16:06.266483+00:00 NOTICE crash_sender.sh[16893]: write_payload_size: 209
    2018-10-03T08:16:06.270771+00:00 NOTICE crash_sender.sh[16894]: send_payload_size: 209
    2018-10-03T08:16:06.274790+00:00 NOTICE crash_sender.sh[16895]: sig: 2da11821-selinux-granted-u:r:chromeos:s0-u:object_r:cros_crash_reporter_exec:s0-execute-crashreporter-

I'm not that familiar with SELinux -- is this something I just have to whitelist, since it looks harmless?
,
Oct 3
ignore selinux-violation crashes atm. they're advisory for now and the selinux guys are watching them.
,
Oct 3
Can you add a pointer to the full log? The line with "selinux" in it doesn't seem like an error or a warning. Obviously something generated a crash report, but there isn't enough information here.
,
Oct 3
@6: Unfortunately, I can't just "ignore" them, since the pre-cq (logging_UserCrash) doesn't :) https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8933712471968062912
,
Oct 3
I'll just disable the selinux collector if it's blocking tests until it can be sorted out
,
Oct 3
IIUC, this *really* needs to get sorted out now. If we can't disable the selinux collector and land this stuff soon, we need to back out either sandboxing or dbus usage.

I think CQ managed to hit this: https://luci-milo.appspot.com/buildbot/chromeos/nyan_big-paladin/6165

    10/03 01:40:30.050 WARNI|security_Sandboxed:0382| New services: set(['btdispatch', 'crash_reporter', 'cros-disks'])
    10/03 01:40:30.055 ERROR|security_Sandboxed:0395| Failed sandboxing: crash_reporter
    10/03 01:40:30.063 DEBUG| test:0381| Test failed due to One or more processes failed sandboxing: defaultdict(<type 'list'>, {'crash_reporter': ['missing euser']}). Exception log follows the after_iteration_hooks.

... and syslog:

    2018-10-03T08:40:29.959262+00:00 ERR anomaly_collector[6788]: Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
    2018-10-03T08:40:29.960359+00:00 CRIT anomaly_collector[6788]: Check failed: dbus->Connect(). Failed to connect to D-Bus#012
    2018-10-03T08:40:29.984664+00:00 INFO crash_reporter[6795]: libminijail[6795]: mount '/dev/log' -> '/dev/log' type '' flags 0x1001
    2018-10-03T08:40:30.042951+00:00 NOTICE autotest[6800]: 01:40:30.037 WARNI|security_Sandboxed:0379| Stale baselines: defaultdict(<type 'list'>, {'vold': ['unused'], 'anomaly_collect': ['potentially missing flags: pidns,mntns,caps'], 'attestationd': ['unused'], 'debuggerd': ['unused'], 'timberslide': ['unused'], 'app_process': ['unused'], 'surfaceflinger': ['unused'], 'cromo': ['unused'], 'arc-obb-mounter': ['unused'], 'midis': ['unused'], 'thermal.sh': ['unused'], 'arc-oemcrypto': ['unused'], 'cros_camera_service': ['unused'], 'cras': ['potentially missing flags: filter'], 'tlsdated': ['potentially missing flags: pidns,mntns'], 'easy_unlock': ['unused'], 'trunksd': ['unused'], 'debuggerd:sig': ['unused'], 'wimax-manager': ['unused'], 'esif_ufd': ['unused'], 'logd': ['unused'], 'servicemanager': ['unused'], 'tpm_managerd': ['unused'], 'cros_camera_algo': ['unused'], 'sslh-fork': ['unused'], 'tlsdated-setter': ['potentially missing flags: pidns,mntns'], 'boot_latch': ['unused'], 'dhcpcd': ['potentially missing flags: nonewprivs'], 'netfilter-queue': ['potentially mi
    2018-10-03T08:40:30.054554+00:00 NOTICE autotest[6801]: 01:40:30.050 WARNI|security_Sandboxed:0382| New services: set(['btdispatch', 'crash_reporter', 'cros-disks'])
    2018-10-03T08:40:30.060655+00:00 NOTICE autotest[6802]: 01:40:30.055 ERROR|security_Sandboxed:0395| Failed sandboxing: crash_reporter

Basically, anomaly_collector is crash-looping, which launches crash_sender -- and our security tests don't expect crash_sender to be running. I guess maybe we could add a 'root' entry for crash_sender too, in the unlikely case that we're in the middle of crash handling during this test? I'll file a separate bug.
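The "New services" and "unused" entries in the test output above can be read as two set differences between the processes seen running and the services listed in a baseline. The sketch below is a simplified model of that reading only; it is not the security_SandboxedServices implementation, it does not cover the "missing euser" or "potentially missing flags" checks, and `compare_with_baseline` is a hypothetical name.

```python
from typing import Iterable, Set, Tuple


def compare_with_baseline(
    running: Iterable[str], baseline: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (new_services, unused_baselines).

    new_services:     running processes with no baseline entry.
    unused_baselines: baseline entries with no matching running process.
    """
    running_set = set(running)
    baseline_set = set(baseline)
    new_services = running_set - baseline_set
    unused_baselines = baseline_set - running_set
    return new_services, unused_baselines
```

In this model, a crash_reporter process that happens to be alive while the test samples the process list shows up as a new service simply because the baseline has no entry for it.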
,
Oct 3
Bug 891745 for security_SandboxedServices
,
Oct 3
,
Oct 3
mmm, crash_sender is not kicked off by anything, it's on the same schedule everywhere regardless. but crash_sender!=crash_reporter which is what you highlighted there.
,
Oct 3
,
Oct 3
,
Oct 3
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/platform2/+/881812bab12fe462752b347ced064cc008681b38

commit 881812bab12fe462752b347ced064cc008681b38
Author: Mike Frysinger <vapier@chromium.org>
Date: Wed Oct 03 20:59:27 2018

    crash: anomaly-collector: enable writing to spool & metrics dirs

    Since we need to write to these paths, make sure we mount them.

    BUG= chromium:891203
    TEST=booted on a system and checked reports were collected

    Change-Id: I810b9555eac4172c0587a6d6d5f9b6109e4d7b45
    Reviewed-on: https://chromium-review.googlesource.com/1255649
    Commit-Ready: Mike Frysinger <vapier@chromium.org>
    Tested-by: Mike Frysinger <vapier@chromium.org>
    Reviewed-by: Brian Norris <briannorris@chromium.org>

[modify] https://crrev.com/881812bab12fe462752b347ced064cc008681b38/crash-reporter/init/anomaly-collector.conf
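To illustrate what mounting these paths writable might look like on the minijail0 command line, the sketch below builds '-b' arguments. It assumes the bind-mount syntax `-b <src>[,[dest][,<writeable>]]` as recalled from the minijail0 manual; the actual contents of the change to anomaly-collector.conf are not shown in this thread, so the helper names and the chosen paths (taken from the error messages in the issue description) are illustrative only.

```python
from typing import Iterable, List


def bind_mount_args(src: str, dest: str = "", writable: bool = False) -> List[str]:
    """Build one '-b' argument pair for a bind mount.

    Assumed syntax: -b <src>[,[dest][,<writeable>]], where an empty dest
    means "same path as src" and a trailing 1 requests a writable mount.
    """
    spec = src
    if dest or writable:
        spec += "," + dest
    if writable:
        spec += ",1"
    return ["-b", spec]


def sandbox_bind_args(
    readonly_paths: Iterable[str], writable_paths: Iterable[str]
) -> List[str]:
    """Concatenate '-b' arguments for read-only, then writable, paths."""
    args: List[str] = []
    for path in readonly_paths:
        args += bind_mount_args(path)
    for path in writable_paths:
        args += bind_mount_args(path, writable=True)
    return args
```

For example, `sandbox_bind_args(["/run"], ["/var/spool/crash", "/var/lib/metrics"])` yields `-b /run -b /var/spool/crash,,1 -b /var/lib/metrics,,1`. Keeping the writable set explicit makes it easier to see which paths the sandbox is allowed to modify.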
,
Oct 3
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/platform2/+/a43ede8c3b22d9c8b2b49d97f5b56d5aae27d067

commit a43ede8c3b22d9c8b2b49d97f5b56d5aae27d067
Author: Brian Norris <briannorris@chromium.org>
Date: Wed Oct 03 21:59:07 2018

    crash: anomaly-collector: bind mount /run/dbus in minijail

    We need to talk to dbus -- otherwise we'll see things like this in
    syslog:

    ERR anomaly_collector[5859]: Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
    CRIT anomaly_collector[5859]: Check failed: dbus->Connect(). Failed to connect to D-Bus#012

    and anomaly-collector will keep crashing.

    Note CL:1224637, which recently added dbus dependencies.

    BUG= chromium:891203
    TEST=see anomaly-collector start without dying
    CQ-DEPEND=CL:1259303

    Change-Id: I09c88d2dc8981ad1db194cea972a6ccbd3a3fd20
    Signed-off-by: Brian Norris <briannorris@chromium.org>
    Reviewed-on: https://chromium-review.googlesource.com/c/1258409
    Trybot-Ready: Mike Frysinger <vapier@chromium.org>
    Reviewed-by: Dan Erat <derat@chromium.org>
    Reviewed-by: Luigi Semenzato <semenzato@chromium.org>
    Reviewed-by: Ben Chan <benchan@chromium.org>

[modify] https://crrev.com/a43ede8c3b22d9c8b2b49d97f5b56d5aae27d067/crash-reporter/init/anomaly-collector.conf
,
Oct 3
Comment 1 by f...@chromium.org
, Oct 2