peppy-chrome-pfq failing login_VMSanity |
|||||||
Issue descriptionhttps://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8931951705679116720 peppy seems to be failing at login_VMSanity due to the DUT rebooting during the test run. Is this related to Issue 896888, Issue 897278 ?
,
Oct 22
Seems unlikely. Those bugs are about Chrome crashes or login issues, while this is about the whole DUT rebooting. In cases like this, I'd recommend looking at the DUT's logs at http://stainless/browse/chromeos-autotest-results/250959230-chromeos-test/ to try to figure out what happened. autoserv.INFO has the following: 10/22 11:39:47.926 INFO | server_job:0217| START ---- ---- timestamp=1540233587 localtime=Oct 22 11:39:47 10/22 11:39:47.989 INFO | server_job:0217| START login_VMSanity login_VMSanity timestamp=1540233587 localtime=Oct 22 11:39:47 10/22 11:55:06.250 INFO | ssh_multiplex:0079| Master ssh connection to chromeos4-row6-rack13-host7 is down. 10/22 11:55:06.251 INFO | ssh_multiplex:0096| Starting master ssh connection '/usr/bin/ssh -a -x -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_aDnp31ssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos4-row6-rack13-host7' 10/22 11:55:08.508 INFO | server_job:0217| FAIL ---- ---- timestamp=1540234508 localtime=Oct 22 11:55:08 Autotest client terminated unexpectedly: DUT rebooted during the test run. 10/22 11:55:08.508 INFO | server_job:0217| END FAIL ---- ---- timestamp=1540234508 localtime=Oct 22 11:55:08 10/22 11:55:08.509 INFO | server_job:0217| END GOOD ---- ---- timestamp=1540234508 localtime=Oct 22 11:55:08 The messages file has the following: 2018-10-22T18:32:52.660099+00:00 INFO session_manager[4262]: [INFO:session_manager_service.cc(243)] Browser is 4279 2018-10-22T18:32:52.699048+00:00 INFO debugd[4304]: libminijail[4304]: mount '/' -> '/' type '' flags 0x1001 2018-10-22T18:32:52.699061+00:00 INFO debugd[4304]: libminijail[4304]: mount 'none' -> '/proc' type 'proc' flags 0xe 2018-10-22T18:32:52.699067+00:00 INFO debugd[4304]: libminijail[4304]: mount '/var' -> '/var' type '' flags 0x1000 2018-10-22T18:32:52.699071+00:00 INFO debugd[4304]: libminijail[4304]: mount 'tmpfs' -> '/mnt' type 'tmpfs' flags 0xe 2018-10-22T18:32:52.699080+00:00 INFO debugd[4304]: libminijail[4304]: mount '/mnt/stateful_partition/unencrypted/cache/vpd' -> '/mnt/stateful_partition/unencrypted/cache/vpd' type '' flags 0x1000 2018-10-22T18:32:52.699086+00:00 INFO debugd[4304]: libminijail[4304]: mount 'tmpfs' -> '/run' type 'tmpfs' flags 0xe 2018-10-22T18:32:52.699090+00:00 INFO debugd[4304]: libminijail[4304]: mount '/run/dbus' -> '/run/dbus' type '' flags 0x1001 2018-10-22T18:32:52.699098+00:00 INFO debugd[4304]: libminijail[4304]: mount '/run/cups' -> '/run/cups' type '' flags 0x1001 2018-10-22T18:32:52.699104+00:00 INFO debugd[4304]: libminijail[4304]: mount '/run/ippusb' -> '/run/ippusb' type '' flags 0x1001 2018-10-22T18:32:52.699111+00:00 INFO debugd[4304]: libminijail[4304]: mount '/run/shill' -> '/run/shill' type '' flags 0x1001 2018-10-22T18:32:52.699118+00:00 INFO debugd[4304]: libminijail[4304]: mount '/run/arc/bugreport' -> '/run/arc/bugreport' type '' flags 0x1001 2018-10-22T18:32:52.699123+00:00 INFO debugd[4304]: libminijail[4304]: mount '/run/containers' -> '/run/containers' type '' flags 0x1001 2018-10-22T18:32:52.699128+00:00 INFO debugd[4304]: libminijail[4304]: mount '/dev' -> '/dev' type 'bind' flags 0x5000 2018-10-22T18:32:52.699132+00:00 INFO debugd[4304]: libminijail[4304]: mount '/sys' -> '/sys' type 'bind' flags 0x5000 2018-10-22T18:32:52.699138+00:00 INFO debugd[4304]: libminijail[4304]: mount '/run/tcsd' -> '/run/tcsd' type '' flags 0x1001 2018-10-22T18:32:53.115657+00:00 NOTICE ag[4319]: autotest server[stack::<lambda>|get_chrome_session_ident|run] -> ssh_run(bootstat_get_last login-prompt-visible) <many lines of garbage bytes> INFO kernel: [ 0.000000] Initializing cgroup subsys cpu 2018-10-22T11:40:08.168741-07:00 NOTICE kernel: [ 0.000000] Linux version 3.8.11 (chrome-bot@swarm-cros-420) (gcc version 4.9.x 20150123 (prerelease) (4.9.2_cos_gg_4.9.2-r197-ac6128e0a17a52f011797f33ac3e7d6273a9368d_4.9.2-r197) ) #1 SMP Sun Oct 21 03:01:42 PDT 2018 2018-10-22T11:40:08.168746-07:00 INFO kernel: [ 0.000000] Command line: cros_secure console= loglevel=7 init=/sbin/init cros_secure oops=panic panic=-1 root=/dev/dm-0 rootwait ro dm_verity.error_behavior=3 dm_verity.max_bios=-1 dm_verity.dev_wait=1 dm="1 vroot none ro 1,0 2539520 verity payload=PARTUUID=dc327b91-b967-b447-a1cf-73b5843ea66f/PARTNROFF=1 hashtree=PARTUUID=dc327b91-b967-b447-a1cf-73b5843ea66f/PARTNROFF=1 hashstart=2539520 alg=sha1 root_hexdigest=5a1cc062a82deb06fc676ae2e83b5d4eb93c10bb salt=fe05a1eac44a65fd31509007a18f3e94d51ce74f57b55fe420f6a9465ef01bac" noinitrd cros_debug vt.global_cursor_default=0 kern_guid=dc327b91-b967-b447-a1cf-73b5843ea66f add_efi_memmap boot=local noresume noswap i915.modeset=1 tpm_tis.force=1 tpm_tis.interrupts=0 nmi_watchdog=panic,lapic iTCO_vendor_support.vendorsupport=3 eventlog suggests that there was a kernel panic: 215 | 2018-10-22 11:22:49 | System boot | 55349 216 | 2018-10-22 11:22:49 | System Reset 217 | 2018-10-22 11:40:05 | Kernel Event | Oops 218 | 2018-10-22 11:40:05 | Kernel Event | Panic 219 | 2018-10-22 11:40:06 | System boot | 55350 220 | 2018-10-22 11:40:06 | System Reset And then console-ramoops contains the following: [ 1035.080018] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 1035.080053] IP: [<ffffffff979a9e96>] list_del_init+0x8/0x1b [ 1035.080087] PGD 0 [ 1035.080104] Oops: 0002 [#1] SMP [ 1035.082503] gsmi: Log Shutdown Reason 0x03 [ 1035.082523] Modules linked in: i2c_dev cmac rfcomm uinput zram(C) zsmalloc(C) aesni_intel xts aes_x86_64 lrw ablk_helper cryptd memconsole ath9k_btcoex ath9k_common_btcoex ath9k_hw_btcoex ath mac80211 snd_hda_codec_realtek smsc75xx snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc fuse cfg80211 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer smsc95xx usbnet ath3k btusb btrtl btbcm btintel bluetooth uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core joydev [ 1035.082881] CPU 3 [ 1035.082895] Pid: 10367, comm: TaskSchedulerFo Tainted: G WC 3.8.11 #1 [ 1035.082919] RIP: 0010:[<ffffffff979a9e96>] [<ffffffff979a9e96>] list_del_init+0x8/0x1b [ 1035.082949] RSP: 0018:ffff88013a9c5c28 EFLAGS: 00010212 [ 1035.082965] RAX: ffff88013a9c5c38 RBX: ffff88016c031540 RCX: ffff88017cb91540 [ 1035.082984] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff88016c031540 [ 1035.083003] RBP: ffff88013a9c5c70 R08: 0000000000000000 R09: ffffffff980b7ba8 [ 1035.083021] R10: ffffffff980b7bb0 R11: 0000000000000008 R12: ffff88013a9c5c38 [ 1035.083040] R13: ffff88016c031680 R14: ffff880177c080d8 R15: ffff880135f1b738 [ 1035.083060] FS: 00007fde3f67b700(0000) GS:ffff88017cb80000(0000) knlGS:0000000000000000 [ 1035.083080] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1035.083097] CR2: 0000000000000008 CR3: 000000001800c000 CR4: 00000000001607e0 [ 1035.083115] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1035.083134] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1035.083154] Process TaskSchedulerFo (pid: 10367, threadinfo ffff88013a9c4000, task ffff880135f1b480) [ 1035.083174] Stack: [ 1035.083184] ffff88013a9c5c70 ffffffff979aa637 ffff88016c031540 ffff88016c031680 [ 1035.083221] 00000000fd5ad90a ffff88013a9c5c80 ffff8801767ef8a0 ffff88013a9c5c80 [ 1035.083257] ffff88013a9c5d58 ffff88013a9c5cb0 ffffffff979ac959 ffff88016b4d6640 [ 1035.083294] Call Trace: [ 1035.083314] [<ffffffff979aa637>] ? namespace_unlock+0x98/0x10e [ 1035.083334] [<ffffffff979ac959>] put_mnt_ns+0x19d/0x1c4 [ 1035.083355] [<ffffffff97909cce>] free_nsproxy+0x1d/0x75 [ 1035.083374] [<ffffffff97909eda>] switch_task_namespaces+0x47/0x4e [ 1035.083395] [<ffffffff97909ef1>] exit_task_namespaces+0x10/0x12 [ 1035.083417] [<ffffffff978eefcc>] do_exit+0x7da/0xc75 [ 1035.083436] [<ffffffff978ef5cc>] do_group_exit+0x42/0xb0 [ 1035.083458] [<ffffffff978fb614>] get_signal_to_deliver+0x554/0x573 [ 1035.083480] [<ffffffff978b84c5>] do_signal+0x43/0x53a [ 1035.083501] [<ffffffff97800551>] do_notify_resume+0x29/0x5b [ 1035.083522] [<ffffffff97d55749>] int_signal+0x12/0x17 [ 1035.083537] Code: f1 be 00 00 40 00 48 89 e5 e8 fc fe ff ff 48 3d 00 f0 ff ff 77 0b 48 c7 80 b0 00 00 00 ea ff ff ff 5d c3 48 8b 17 48 8b 47 08 55 <48> 89 42 08 48 89 e5 48 89 10 48 89 3f 48 89 7f 08 5d c3 0f 1f [ 1035.083894] RIP [<ffffffff979a9e96>] list_del_init+0x8/0x1b [ 1035.083918] RSP <ffff88013a9c5c28> [ 1035.083931] CR2: 0000000000000008 [ 1035.083972] ---[ end trace faaf644dd0cc59fe ]--- [ 1035.085086] Kernel panic - not syncing: Fatal exception [ 1035.085115] Kernel Offset: 0x16800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 1035.085239] gsmi: Log Shutdown Reason 0x02 [ 1035.086353] ACPI MEMORY or I/O RESET_REG. So, kernel bug? Maybe rope in the sheriff to see if they can assign it to someone for further investigation.
,
Oct 22
Thanks Daniel! +Chrome OS Sheriffs
,
Oct 22
,
Oct 22
The NULL pointer deref is mount namespace related, in fs/namespace.c. There were a few fs patches landed in v3.8 recently. Zubin, could you help take a quick look and see if the error is related? thanks.
static void namespace_unlock(void)
{
struct mount *mnt;
LIST_HEAD(head);
if (likely(list_empty(&unmounted))) {
up_write(&namespace_sem);
return;
}
list_splice_init(&unmounted, &head);
up_write(&namespace_sem);
while (!list_empty(&head)) {
mnt = list_first_entry(&head, struct mount, mnt_hash);
list_del_init(&mnt->mnt_hash); <==============
src/third_party/kernel/v3.8$ git log --format="%h %cI %ae %s"
94bffd44240e 2018-10-10T17:07:55-07:00 zsm@chromium.org CHROMIUM: ext4: Disable inline data in inode
e8b54f180c2d 2018-10-07T11:55:04-07:00 vovoy@chromium.org CHROMIUM: low_mem: exclude totalreserve_pages from available memory
5fc5a6883b64 2018-10-01T11:31:02-07:00 ernesto.mnd.fernandez@gmail.com UPSTREAM: hfsplus: fix NULL dereference in hfsplus_lookup()
2e98414e6f02 2018-09-28T13:34:37-07:00 tytso@mit.edu UPSTREAM: ext4: clear i_data in ext4_inode_info when removing inline data
c3243734ce73 2018-09-28T10:11:00-07:00 sds@tycho.nsa.gov UPSTREAM: selinux: fix off-by-one in setprocattr
c8acbe16ba96 2018-09-27T19:44:28-07:00 briannorris@chromium.org CHROMIUM: config: make CONFIG_PPP built in
6c0b2d05724e 2018-09-01T18:33:49-07:00 dbehr@chromium.org CHROMIUM: mali: update kbase to r26p0
b8a07e3d18f8 2018-08-30T01:28:03-07:00 ndesaulniers@google.com BACKPORT: ANDROID: fs: ext4: disable support for fallocate FALLOC_FL_PUNCH_HOLE
,
Oct 22
Hi Ben, I took a look at this, the stacktrace does not look related/familiar to me.
,
Oct 22
Thanks Zubin. Just found out this is http://crbug.com/871915 |
|||||||
►
Sign in to add a comment |
|||||||
Comment 1 by malaykeshav@chromium.org
, Oct 22