minijail_enter crash in various chromebox daemons (mimo-monitor, cecservice, atrusd, huddly-monitor)
Issue description

https://crash.corp.google.com/browse?q=product_name%3D%27ChromeOS%27+AND+product.Version+LIKE+%27108%25.0.0%27+AND+EXISTS+%28SELECT+1+FROM+UNNEST%28CrashedStackTrace.StackFrame%29+WHERE+FunctionName%3D%27minijail_enter%27%29+AND+NOT+EXISTS+%28SELECT+1+FROM+UNNEST%28productdata%29+WHERE+Key%3D%27exec_name%27+AND+Value+IN+%28%27apk-cache-cleaner%27%2C+%27memd%27%29%29#-property-selector,-productname:1000,stablesignature:50,hardwareclass:50,magicsignaturesorted:50

Sample crash: https://crash.corp.google.com/e497235e7fa5d3f9

0x00007fe18e155dd2 (libc-2.23.so -raise.c:54 ) raise
0x00007fe18e157bf5 (libc-2.23.so -abort.c:89 ) abort
0x00007fe18e7f2501 (libminijailpreload.so -libminijail.c ) minijail_enter
0x00007fe18e7efb42 (libminijailpreload.so -libminijailpreload.c:81 ) fake_main
0x00007fe18e142735 (libc-2.23.so -libc-start.c:289 ) __libc_start_main
0x00007fe18e80d188 (mimo-monitor + 0x00002188 ) _start
0x00007ffcdbf32037
0x00007fe18e80d15f (mimo-monitor + 0x0000215f ) _init

This is only happening on zako and panther, both of which use v3.8 kernels. By inspecting some of the service_failure logs[1], I was able to determine that the crash occurs because libminijail tries to enter a new cgroup namespace, a feature that has not been backported to v3.8 kernels (only as far back as 3.14, AFAIK):

2018-07-08T10:14:51.689106-05:00 ERR cecservice[1311]: libminijail[2]: unshare(CLONE_NEWCGROUP) failed: Invalid argument

The solution is to pass the -N flag to minijail conditionally, only if the kernel version is >= 3.14.

1: https://crash.corp.google.com/browse?q=product_name%3D%27ChromeOS%27+AND+EXISTS+%28SELECT+1+FROM+UNNEST%28productdata%29+WHERE+Key%3D%27exec_name%27+AND+Value%3D%27service-failure%27%29+AND+product.Version+LIKE+%27108%25.0.0%27+AND+stable_signature+LIKE+%27%25cecservice%25%27&stbtiq=&reportid=&index=0#2
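The kernel-version gate proposed in the description could look roughly like the sketch below. It is illustrative only: the function names are hypothetical, and the (3, 14) threshold is simply the "AFAIK" backport boundary quoted in this report, not a value confirmed against any kernel tree. The only minijail0 flag used is -N, the one discussed here.

```python
import platform
import re

# Assumption taken from this report: Chrome OS kernels 3.14 and newer carry
# cgroup namespace support, while older ones (e.g. 3.8) do not.
MIN_CGROUP_NS_KERNEL = (3, 14)


def kernel_version(release):
    """Extract (major, minor) from a `uname -r` style string such as '3.8.11'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def supports_cgroup_namespace(release):
    """True if the release is at or above the assumed backport boundary."""
    return kernel_version(release) >= MIN_CGROUP_NS_KERNEL


def cgroup_namespace_flags(release):
    """Return ['-N'] for minijail0 only where the kernel is new enough."""
    return ["-N"] if supports_cgroup_namespace(release) else []


if __name__ == "__main__":
    # Show what the current machine would get.
    print(cgroup_namespace_flags(platform.release()))
```

A job script would append the returned flags to its minijail0 command line, so a 3.8 device gets no -N and avoids the failing unshare(CLONE_NEWCGROUP) call.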
,
Jul 10
Oh almost forgot, another alternative (probably the cleanest) is to make all these services raise the SIGSTOP signal and change the stanza from "expect fork" to "expect stop": http://upstart.ubuntu.com/cookbook/#expect-stop
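For reference, the "expect stop" handshake works by having the job raise SIGSTOP on itself; the supervisor observes the stop and resumes the job with SIGCONT. The sketch below imitates that exchange with a forked child. The parent is only a stand-in for Upstart, the function name is hypothetical, and the code assumes a POSIX system.

```python
import os
import signal


def run_stop_handshake():
    """Fork a child that raises SIGSTOP, then resume it as a supervisor would.

    Returns (child_was_stopped, child_exited_cleanly).
    """
    pid = os.fork()
    if pid == 0:
        # Child: stand-in for a daemon signalling that it is ready.
        os.kill(os.getpid(), signal.SIGSTOP)
        # Execution continues here only after the parent sends SIGCONT.
        os._exit(0)

    # Parent: stand-in for the supervisor. WUNTRACED makes waitpid return
    # when the child stops, not only when it exits.
    _, status = os.waitpid(pid, os.WUNTRACED)
    stopped = os.WIFSTOPPED(status) and os.WSTOPSIG(status) == signal.SIGSTOP

    # Resume the child and collect its exit status.
    os.kill(pid, signal.SIGCONT)
    _, status = os.waitpid(pid, 0)
    exited_cleanly = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
    return stopped, exited_cleanly


if __name__ == "__main__":
    print(run_stop_handshake())
```

The sketch only covers the signalling itself; it says nothing about how a supervisor treats children that the job forks afterwards.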
,
Jul 10
for now, i would just drop the -N flag from any daemon that expects to be run on 3.8/3.10 devices. it's what we've been doing elsewhere. we opted to not change minijail to silently ignore -N on older systems. does SIGSTOP allow for any forking of children ? if not, that won't be as useful :(.
,
Jul 10
Ugh, I misinterpreted the documentation to mean that 'expect stop' would both tell upstart what the PID to be tracked is AND the moment in which it has finished initializing, but that's not the case[1]. Dropping the -N flag seems straightforward, so https://chromium-review.googlesource.com/q/topic:%22fizz_cgroup_namespaces%22+(status:open%20OR%20status:merged) should fix all the crashes. Mostly for my own reference, tryjob posted here: https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8941375536239901072 1: https://bugs.launchpad.net/upstart-cookbook/+bug/1394744
,
Jul 11
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/platform/cfm-device-monitor/+/8e6c28caa1ab93e3efea4fe81ae03c29cd37b870

commit 8e6c28caa1ab93e3efea4fe81ae03c29cd37b870
Author: Luis Hector Chavez <lhchavez@google.com>
Date: Wed Jul 11 01:47:29 2018

Stop requesting entering a cgroup namespace

This change avoids passing the -N flag to minijail0 since both huddly-monitor and mimo-monitor run on kernels that don't support cgroup namespaces. This avoids a crash on such older devices.

BUG= chromium:861994
TEST=fizz tryjob

Change-Id: If99ccdf2cd14b4effa8a4cf06ea3db387ac11468
Reviewed-on: https://chromium-review.googlesource.com/1131535
Commit-Ready: Luis Hector Chavez <lhchavez@chromium.org>
Tested-by: Luis Hector Chavez <lhchavez@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/8e6c28caa1ab93e3efea4fe81ae03c29cd37b870/init/huddly-monitor.conf
[modify] https://crrev.com/8e6c28caa1ab93e3efea4fe81ae03c29cd37b870/init/mimo-monitor.conf
,
Jul 11
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/platform2/+/8c0535ca8583848f2be8a3cef554f9eba728ea1c

commit 8c0535ca8583848f2be8a3cef554f9eba728ea1c
Author: Luis Hector Chavez <lhchavez@google.com>
Date: Wed Jul 11 19:13:10 2018

cecservice: Stop requesting entering a cgroup namespace

This change avoids passing the -N flag to minijail0 since cecservice run on kernels that don't support cgroup namespaces. This avoids a crash on such older devices.

BUG= chromium:861994
TEST=fizz tryjob

Change-Id: I669dc3b0186e8b8c3e63ea22745d089f1e98f84b
Reviewed-on: https://chromium-review.googlesource.com/1131540
Commit-Ready: Luis Hector Chavez <lhchavez@chromium.org>
Tested-by: Luis Hector Chavez <lhchavez@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/8c0535ca8583848f2be8a3cef554f9eba728ea1c/cecservice/share/cecservice.conf
,
Jul 12
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/atrusctl/+/f2f9d8df9f307aea2f0c269c81ab7f104b8a4a20

commit f2f9d8df9f307aea2f0c269c81ab7f104b8a4a20
Author: Luis Hector Chavez <lhchavez@google.com>
Date: Thu Jul 12 18:35:40 2018

Clean up the minijail0 invocation

This change uses /var/empty as the chroot to match the way we invoke the rest of the services. It also stops creating/deleting the chroot directory.

BUG=b:65450844
BUG= chromium:849455
BUG= chromium:861994
TEST=fizz tryjob

Change-Id: I6a76cc92d93bdb8f7edf2990cb0cf219ac20f4ff
Reviewed-on: https://chromium-review.googlesource.com/1087681
Commit-Ready: Luis Hector Chavez <lhchavez@chromium.org>
Tested-by: Luis Hector Chavez <lhchavez@chromium.org>
Reviewed-by: Emil Lundmark <lndmrk@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/f2f9d8df9f307aea2f0c269c81ab7f104b8a4a20/init/atrusd.conf
,
Jul 12
Comment 1 by lhchavez@chromium.org, Jul 10
Owner: egemih@chromium.org