cheets_StartAndroid.stress failed on cyan-chrome-pfq with arc_setup SIGABRT
Issue description
HWTest [bvt-arc] failed on cyan-chrome-pfq in cheets_StartAndroid.stress: http://cros-goldeneye/chromeos/healthmonitoring/buildDetails?buildbucketId=8946478051673404944

INFO ---- ---- kernel=3.18.0-17579-gfb99dee6e87a localtime=May 15 03:26:26 timestamp=1526379986
START ---- ---- timestamp=1526380008 localtime=May 15 03:26:48
START cheets_StartAndroid.stress cheets_StartAndroid.stress timestamp=1526380008 localtime=May 15 03:26:48
FAIL cheets_StartAndroid.stress cheets_StartAndroid.stress timestamp=1526380520 localtime=May 15 03:35:20 Android did not boot!
END FAIL cheets_StartAndroid.stress cheets_StartAndroid.stress timestamp=1526380520 localtime=May 15 03:35:20
END GOOD ---- ---- timestamp=1526380523 localtime=May 15 03:35:23
INFO ---- ---- timestamp=1526380566 localtime=May 15 03:36:06 Start crashcollection record
INFO ---- New Crash Dump timestamp=1526380566 localtime=May 15 03:36:06 /usr/local/autotest/results/200265399-chromeos-test/chromeos4-row11-rack10-host19/cheets_StartAndroid.stress/sysinfo/iteration.1/var/spool/crash/arc_setup.20180515.033319.15336.dmp
INFO ---- ---- timestamp=1526380566 localtime=May 15 03:36:06 End crashcollection record

I see SELinux violations like this:

type=1400 avc: denied { use } comm="init" path="/var/log/android/android-run_oci.20180515-032553" dev="dm-1" ino=322 scontext=u:r:init:s0 tcontext=u:r:chromeos:s0 tclass=fd permissive=0

and this arc_setup crash:

Operating system: Linux 0.0.0 Linux 3.18.0-17579-gfb99dee6e87a #1 SMP PREEMPT Mon May 14 13:47:52 PDT 2018 x86_64
CPU: amd64 family 6 model 76 stepping 3 2 CPUs
GPU: UNKNOWN
Crash reason: SIGABRT
Crash address: 0x0
Process uptime: not available

Thread 0 (crashed)
 0  libc-2.23.so!raise [raise.c : 54 + 0x10]
    rax = 0x0000000000000000 rdx = 0x0000000000000006 rcx = 0xffffffffffffffff rbx = 0x0000000000000000
    rsi = 0x0000000000003be8 rdi = 0x0000000000003be8 rbp = 0x00007ffdc1cc24c0 rsp = 0x00007ffdc1cc2398
    r8 = 0x00007ffdc1cc23e8 r9 = 0x00007ffdc1cc23de r10 = 0x0000000000000008 r11 = 0x0000000000000206
    r12 = 0x00007ffdc1cc2980 r13 = 0x00007ffdc1cc24f8 r14 = 0x00007ffdc1cc2990 r15 = 0x00007ffdc1cc2988
    rip = 0x00007349b7f9ddd2
    Found by: given as instruction pointer in context
 1  libc-2.23.so!abort [abort.c : 89 + 0xa]
    rbx = 0x0000000000000000 rbp = 0x00007ffdc1cc24c0 rsp = 0x00007ffdc1cc23a0 r12 = 0x00007ffdc1cc2980
    r13 = 0x00007ffdc1cc24f8 r14 = 0x00007ffdc1cc2990 r15 = 0x00007ffdc1cc2988 rip = 0x00007349b7f9fbf6
    Found by: call frame info
 2  libbase-core-395517.so!base::debug::BreakDebugger() [debugger_posix.cc : 219 + 0x5]
    rbx = 0x0000000000000000 rbp = 0x00007ffdc1cc24d0 rsp = 0x00007ffdc1cc24d0 r12 = 0x00007ffdc1cc2980
    r13 = 0x00007ffdc1cc24f8 r14 = 0x00007ffdc1cc2990 r15 = 0x00007ffdc1cc2988 rip = 0x00007349b87dee41
    Found by: call frame info
 3  libbase-core-395517.so!logging::LogMessage::~LogMessage() [logging.cc : 755 + 0x5]
    rbx = 0x0000000000000000 rbp = 0x00007ffdc1cc2940 rsp = 0x00007ffdc1cc24e0 r12 = 0x00007ffdc1cc2980
    r13 = 0x00007ffdc1cc24f8 r14 = 0x00007ffdc1cc2990 r15 = 0x00007ffdc1cc2988 rip = 0x00007349b8802754
    Found by: call frame info
 4  arc-setup!arc::ArcSetup::ContinueContainerBoot(arc::ArcBootType, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) [arc_setup.cc : 1648 + 0x33]
    rbx = 0x00007ffdc1cc2980 rbp = 0x00007ffdc1cc2be0 rsp = 0x00007ffdc1cc2950 r12 = 0x0000000000000021
    r13 = 0x0000000000000210 r14 = 0x0000000000000018 r15 = 0x0000605b362dc720 rip = 0x0000605b355cefd3
    Found by: call frame info
 5  arc-setup!arc::ArcSetup::OnBootContinue() [arc_setup.cc : 1856 + 0xe]
    rbx = 0x00007ffdc1cc2c20 rbp = 0x00007ffdc1cc2d90 rsp = 0x00007ffdc1cc2bf0 r12 = 0x0000000000000001
    r13 = 0x00000000c1cc2e01 r14 = 0x00007ffdc1cc2ef8 r15 = 0x00007ffdc1cc2f00 rip = 0x0000605b355d08f2
    Found by: call frame info
 6  arc-setup!arc::ArcSetup::Run() [arc_setup.cc : 1921 + 0x8]
    rbx = 0x00007ffdc1cc2ef8 rbp = 0x00007ffdc1cc2ee0 rsp = 0x00007ffdc1cc2da0 r12 = 0x0000605b355e2f20
    r13 = 0x00007ffdc1cc30b0 r14 = 0x00007ffdc1cc30b8 r15 = 0x00007ffdc1cc2f30 rip = 0x0000605b355d0d15
    Found by: call frame info
 7  arc-setup!main [main.cc : 24 + 0x8]
    rbx = 0x00007ffdc1cc2ef8 rbp = 0x00007ffdc1cc2fc0 rsp = 0x00007ffdc1cc2ef0 r12 = 0x0000605b355e2f20
    r13 = 0x00007ffdc1cc30b0 r14 = 0x00007ffdc1cc30b8 r15 = 0x00007ffdc1cc2f30 rip = 0x0000605b355bff69
    Found by: call frame info
 8  libc-2.23.so!__libc_start_main [libc-start.c : 289 + 0x1a]
    rbx = 0x0000000000000000 rbp = 0x00007ffdc1cc3090 rsp = 0x00007ffdc1cc2fd0 r12 = 0x0000605b355e2f20
    r13 = 0x00007ffdc1cc30b0 r14 = 0x0000000000000000 r15 = 0x0000000000000000 rip = 0x00007349b7f8a736
    Found by: call frame info
 9  arc-setup!_start + 0x29
    rbx = 0x0000000000000000 rbp = 0x0000000000000000 rsp = 0x00007ffdc1cc30a0 r12 = 0x0000605b355bfda0
    r13 = 0x00007ffdc1cc30b0 r14 = 0x0000000000000000 r15 = 0x0000000000000000 rip = 0x0000605b355bfdc9
    Found by: call frame info
10  0x7ffdc1cc30a8
    rbx = 0x0000000000000000 rbp = 0x0000000000000000 rsp = 0x00007ffdc1cc30a8 r12 = 0x0000605b355bfda0
    r13 = 0x00007ffdc1cc30b0 r14 = 0x0000000000000000 r15 = 0x0000000000000000 rip = 0x00007ffdc1cc30a8
    Found by: call frame info

Loaded modules:
0x605b355ba000 - 0x605b355ebfff arc-setup ??? (main)
0x7349b6e70000 - 0x7349b6e85fff libgcc_s.so.1 ???
0x7349b708b000 - 0x7349b708efff libattr.so.1.1.0 ???
0x7349b7091000 - 0x7349b70d4fff libdbus-1.so.3.14.8 ???
0x7349b70d8000 - 0x7349b711efff libprotobuf-lite.so.13.0.0 ???
0x7349b7123000 - 0x7349b7129fff libinstallattributes-395517.so ???
0x7349b712d000 - 0x7349b7142fff libz.so.1.2.11 ???
0x7349b7146000 - 0x7349b7148fff libplds4.so ???
0x7349b714b000 - 0x7349b714efff libplc4.so ???
0x7349b7151000 - 0x7349b7152fff libdl-2.23.so ???
0x7349b7355000 - 0x7349b73d5fff libpcre.so.1.2.8 ???
0x7349b73d8000 - 0x7349b7438fff libbase-dbus-395517.so ???
0x7349b743e000 - 0x7349b74a4fff libbrillo-core-395517.so ???
0x7349b74aa000 - 0x7349b757afff libpolicy-395517.so ???
0x7349b7587000 - 0x7349b773dfff libcrypto.so.1.0.0 ???
0x7349b7768000 - 0x7349b77b8fff libssl.so.1.0.0 ???
0x7349b77c5000 - 0x7349b77f4fff libnspr4.so ???
0x7349b77fb000 - 0x7349b781afff libnssutil3.so ???
0x7349b7823000 - 0x7349b792dfff libnss3.so ???
0x7349b7938000 - 0x7349b7a3cfff libm-2.23.so ???
0x7349b7c3e000 - 0x7349b7c61fff libevent_core-2.1.so.6.0.2 ???
0x7349b7c64000 - 0x7349b7d5efff libglib-2.0.so.0.5200.3 ???
0x7349b7d62000 - 0x7349b7d67fff librt-2.23.so ???
0x7349b7f6a000 - 0x7349b810afff libc-2.23.so ???
0x7349b8315000 - 0x7349b8358fff libc++abi.so.1.0 ???
0x7349b835c000 - 0x7349b841dfff libc++.so.1.0 ???
0x7349b8429000 - 0x7349b843ffff libpthread-2.23.so ???
0x7349b8646000 - 0x7349b8668fff libselinux.so.1 ???
0x7349b866e000 - 0x7349b86affff libmetrics-395517.so ???
0x7349b86b4000 - 0x7349b86d8fff ld-2.23.so ???
0x7349b86de000 - 0x7349b86e1fff libcap.so.2.24 ???
0x7349b86e5000 - 0x7349b86ebfff libfdt-1.4.4.so ???
0x7349b86f3000 - 0x7349b8704fff libminijail.so ???
0x7349b870e000 - 0x7349b871dfff libcros_config.so ???
0x7349b8721000 - 0x7349b8746fff libbase-crypto-395517.so ???
0x7349b874b000 - 0x7349b88bdfff libbase-core-395517.so ???
0x7ffdc1cca000 - 0x7ffdc1ccbfff linux-gate.so ???
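For reference, the timing in the status log can be worked out with a small sketch like the one below. It only uses values copied from the log and the dump file name in the description. The function names are hypothetical, and it assumes the timestamp embedded in the dump file name uses the same local time as the log's localtime field, which the report does not confirm.

```python
from datetime import datetime

# Values copied from the status log and the crash dump file name in the report.
TEST_START_TS = 1526380008            # START cheets_StartAndroid.stress
TEST_FAIL_TS = 1526380520             # FAIL ... "Android did not boot!"
LOG_LOCALTIME_AT_START = "May 15 03:26:48"
DUMP_NAME = "arc_setup.20180515.033319.15336.dmp"


def seconds_until_failure(start_ts, fail_ts):
    """Seconds between the test START and FAIL records."""
    return fail_ts - start_ts


def dump_offset_seconds(dump_name, start_localtime, year=2018):
    """Seconds from test start to the crash dump time stamp.

    Assumption: the dump name is <program>.<YYYYMMDD>.<HHMMSS>.<pid>.dmp and
    its time stamp is in the same local time zone as the log's localtime.
    """
    _program, date_part, time_part, _pid, _ext = dump_name.split(".")
    dump_time = datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S")
    start_time = datetime.strptime(f"{year} {start_localtime}", "%Y %b %d %H:%M:%S")
    return int((dump_time - start_time).total_seconds())


if __name__ == "__main__":
    print(seconds_until_failure(TEST_START_TS, TEST_FAIL_TS))       # 512
    print(dump_offset_seconds(DUMP_NAME, LOG_LOCALTIME_AT_START))   # 391
```

Under that assumption, arc_setup aborted about 391 seconds after the test started, and the test reported "Android did not boot!" at 512 seconds.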
,
May 16 2018
,
May 16 2018
I haven't made further progress, but I also couldn't find a similar failure pattern in the last 2 weeks of logs: https://stainless.corp.google.com/search?view=matrix&row=build&col=model&first_date=2018-05-02&last_date=2018-05-16&test=cheets_StartAndroid%5C.&build=R68&status=GOOD&status=WARN&status=FAIL&status=ERROR&exclude_cts=false&exclude_not_run=false&exclude_non_release=false&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=false (Most of the red cells were crbug.com/842939 or b/78436647.) It might have been a very rare race.
,
May 16 2018
I checked the crash/ logs, and it looks like about 0.03% of container startup attempts failed this way. I'll add some more code to arc-setup so that crash/ can tell us why it failed. I'll also make sure Chrome automatically restarts the container when this happens. Reproducing this locally might be difficult.
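To put the 0.03% figure in perspective, here is a rough sketch of how many container startups a local repro would need. It assumes every startup attempt fails independently with the same probability, which is a simplification (a race may well depend on the device or its load), and the function names are made up for illustration.

```python
import math

# 0.03% of container startup attempts, as read from the crash/ logs.
FAILURE_RATE = 0.0003


def prob_at_least_one_failure(attempts, rate=FAILURE_RATE):
    """Chance of seeing the failure at least once in `attempts` startups.

    Assumes independent attempts with a constant failure rate.
    """
    return 1.0 - (1.0 - rate) ** attempts


def attempts_for_probability(target, rate=FAILURE_RATE):
    """Smallest number of startups giving at least `target` chance of a hit."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - rate))


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, round(prob_at_least_one_failure(n), 4))
    print(attempts_for_probability(0.5))
```

With these assumptions, a hundred startups has only about a 3% chance of hitting the failure, and a 50% chance takes more than two thousand startups.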
,
May 19 2018
The nsenter code is here: https://github.com/karelzak/util-linux/blob/master/sys-utils/nsenter.c#L425 It's a setns() failure.
,
May 21 2018
FYI: Re #4
> Chrome will automatically restart the container when this happens.
At the moment, I don't think Chrome restarts the container in this case, because it restarts ARC only if ARC has already established the ABS Mojo channel once.
,
May 21 2018
#7 Argh, I see. Maybe we should, at least when the user clicks one of the app icons?
,
May 23 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform2/+/6063fd7f91acac5b339fd0ccf93f67801d3039ab

commit 6063fd7f91acac5b339fd0ccf93f67801d3039ab
Author: yusukes <yusukes@google.com>
Date: Wed May 23 01:45:41 2018

arc-setup: tell crash/ the reason of container upgrade failure

The frequency is very low, but crash/ says there are cases where the 'nsenter arcbootcontinue' command fails. This CL tries to tell crash/ whether or not the crash is from nsenter, and not from arcbootcontinue.

BUG= chromium:843140
TEST=Add 'kill -9 $CONTAINER_PID &' to arc-boot-continue.conf, confirm the new check catches the container shutdown.

Change-Id: I76ec337ebefb2986c1b52af4c4e087b406fa2bdf
Reviewed-on: https://chromium-review.googlesource.com/1066287
Commit-Ready: Yusuke Sato <yusukes@chromium.org>
Tested-by: Yusuke Sato <yusukes@chromium.org>
Reviewed-by: Yusuke Sato <yusukes@chromium.org>
Reviewed-by: Luis Hector Chavez <lhchavez@chromium.org>

[modify] https://crrev.com/6063fd7f91acac5b339fd0ccf93f67801d3039ab/arc/setup/arc_setup_util.h
[modify] https://crrev.com/6063fd7f91acac5b339fd0ccf93f67801d3039ab/arc/setup/arc_setup.cc
[modify] https://crrev.com/6063fd7f91acac5b339fd0ccf93f67801d3039ab/arc/setup/arc_setup_util.cc
[modify] https://crrev.com/6063fd7f91acac5b339fd0ccf93f67801d3039ab/arc/setup/arc_setup_util_unittest.cc
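As a rough illustration of the kind of check the CL describes (this is not the CL's actual C++ code, and every name below is hypothetical), a failed boot-continue step could be classified by probing whether the container's init process still exists. The sketch assumes a POSIX system, where sending signal 0 checks for a process without disturbing it.

```python
import os


def is_process_alive(pid):
    """Return True if a process with this PID exists.

    Signal 0 performs the existence and permission checks only; nothing is
    actually delivered to the target process.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def classify_boot_continue_failure(exit_code, container_pid):
    """Hypothetical classifier for a failed 'nsenter ... arcbootcontinue' run.

    "container-gone" means the container's init process has disappeared, so
    entering its namespaces could not have worked. "command-failed" means the
    container is still up and the command itself returned an error.
    """
    if exit_code == 0:
        return "ok"
    if not is_process_alive(container_pid):
        return "container-gone"
    return "command-failed"


if __name__ == "__main__":
    # Our own PID stands in for a live container init process.
    print(classify_boot_continue_failure(1, os.getpid()))
```

The test step in the commit message (killing $CONTAINER_PID during boot continue) corresponds to the "container-gone" branch of this sketch.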
,
May 30 2018
I'll close this for now, since the crash frequency is low (and I cannot reproduce it at all). Next time this happens, the log will have better information.
Comment 1 by kinaba@chromium.org, May 16 2018