
Issue 795682


Issue metadata

Status: Fixed
Owner:
Closed: Apr 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug




Kernel built with kasan would crash on running "btmgmt find"

Project Member Reported by josephsih@chromium.org, Dec 18 2017

Issue description

Chrome Version: eve-release/R64-10128.0.0

What steps will reproduce the problem?
(1) Build an Eve kernel with kasan using the following command, then update the kernel on an Eve.
    (cr) USE="kasan" FEATURES="noclean" cros_workon_make --board=eve --install chromeos-kernel-4_4
(2) Start discovery on the Eve. Discovery stops automatically after about 10 seconds.
    $ btmgmt find
(3) Observe that the system reboots by itself in 15 seconds. This happens almost every time.

If we use bluetoothctl to run scan on/off instead, the system does not crash.

The system crash only occurs when both of the following conditions are met:
(1) It is a kernel built with kasan.
(2) Discovery is started and stopped through "btmgmt find".

The system crash is not observed if the kernel is not built with kasan or if "btmgmt find" is not run.
 

Comment 1 by r...@chromium.org, Dec 18 2017

Labels: -Pri-3 M-61 Pri-1
Status: Available (was: Untriaged)
The kernel crashing, even with kasan, needs to be at least investigated.

Comment 2 by r...@chromium.org, Dec 18 2017

Labels: -M-61 M-65
Hi Rahul, you are right. I am worried about it too. It will be my highest priority after I come back from OOO. Hopefully someone can take a look earlier.

Comment 4 by mcchou@chromium.org, Apr 19 2018

Labels: -M-65 M-68

Comment 5 by mcchou@chromium.org, Apr 20 2018

Owner: lepton@chromium.org

Comment 6 by lepton@chromium.org, Apr 25 2018

I can't reproduce it with the current kernel, but I can reproduce it with a kernel from around the same time as R64-10128.0.0.

To reproduce it:
cd src/third_party/kernel/v4.4
 git checkout 3f018c593fa9e9a2d4a6a
(got commit here: https://crosland.corp.google.com/log/10127.0.0..10128.0.0)

Then build the kernel this way (clang won't build this old kernel unless you cherry-pick some CLs):

USE="kasan -clang" FEATURES="noclean" cros_workon_make --board=eve --install chromeos-kernel-4_4

But so far I can't get the panic info under /dev/pstore. (See https://groups.google.com/a/google.com/d/msg/chromeos-kernel/yHEFKXIbIwE/oqMFitc8AAAJ)

I need to get the panic info to move on.

Comment 7 by lepton@chromium.org, Apr 25 2018

OK. This is actually a bug that has already been fixed, by this CL:

https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/844256


I just flashed my Chrome OS device to the latest stable version, 10452.74.0, and used the same kernel built in #6. I can reproduce the crash. This is the panic log, which confirms the deadlock path described in that CL:

[   82.956999] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 11s! [btmgmt:4949]
[   82.957006] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 11s! [kworker/u9:2:338]
[   82.957006] Modules linked in: ip6t_REJECT nf_reject_ipv6[   82.957013] Modules linked in: ip6t_REJECT nf_reject_ipv6 veth cmac rfcomm uinput snd_soc_skl_ssp_clk snd_soc_dmic snd_soc_kbl_rt5663_rt5514_max98927 snd_soc_hdac_hdmi snd_soc_skl snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_sst_match snd_hda_ext_core snd_hda_core btusb uvcvideo btrtl btbcm btintel videobuf2_vmalloc bluetooth videobuf2_memops videobuf2_v4l2 videobuf2_core snd_soc_rt5514 snd_soc_rt5663 snd_soc_rt5514_spi snd_soc_rl6231 snd_soc_max98927 xt_nat zram bridge stp llc ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat xt_mark fuse snd_seq_dummy snd_seq snd_seq_device iio_trig_sysfs cros_ec_light_prox cros_ec_sensors_ring cros_ec_sensors cros_ec_sensors_core industrialio_triggered_buffer kfifo_buf industrialio ip6table_filter iwlmvm r8152 iwl7000_mac80211 mii iwlwifi cfg80211 joydev
[   82.957097] CPU: 3 PID: 4949 Comm: btmgmt Tainted: G    B           4.4.96 #8
[   82.957099] Hardware name: Google Eve/Eve, BIOS Google_Eve.9584.107.0 11/07/2017
[   82.957103] task: ffff88034800ec80 ti: ffff88039c068000 task.ti: ffff88039c068000
[   82.957105] RIP: 0010:[<ffffffffab6d8fa6>]  [<ffffffffab6d8fa6>] queued_write_lock_slowpath+0x7d/0xa1
[   82.957114] RSP: 0018:ffff88039c06fd18  EFLAGS: 00000206
[   82.957117] RAX: 0000000000000101 RBX: ffffffffc074e2c8 RCX: ffffffffab6d8fae
[   82.957120] RDX: fffffbfff80e9c59 RSI: dffffc0000000000 RDI: ffffffffc074e2c8
[   82.957123] RBP: ffff88039c06fd30 R08: 0000000000000000 R09: fffffbfff80e9c5b
[   82.957126] R10: 00000000000003c0 R11: 00000000ffffffff R12: ffffffffc074e2cc
[   82.957128] R13: 00000000000000ff R14: ffff88034d10d2cc R15: ffff88034d10cfa8
[   82.957132] FS:  00007a08f31da700(0000) GS:ffff8803eef80000(0000) knlGS:0000000000000000
[   82.957135] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   82.957137] CR2: 000002af94482000 CR3: 000000034aba8000 CR4: 00000000003606e0
[   82.957140] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   82.957142] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   82.957144] Stack:
[   82.957145]  ffffffffc074e2c8 ffffffffc074e2d8 ffffffffc074e2d4 ffff88039c06fd60
[   82.957152]  ffffffffab7a1466 ffff88034d10cf38 ffffffffc074e2c8 0000000000000004
[   82.957158]  ffff88034d10d2cc ffff88039c06fd70 ffffffffac18ea96 ffff88039c06fda8
[   82.957164] Call Trace:
[   82.957169]  [<ffffffffab7a1466>] do_raw_write_lock+0x9c/0xce
[   82.957174]  [<ffffffffac18ea96>] _raw_write_lock+0x15/0x17
[   82.957196]  [<ffffffffc06f0221>] bt_sock_unlink+0x25/0xb1 [bluetooth]
[   82.957218]  [<ffffffffc071a443>] hci_sock_release+0xe0/0x1ce [bluetooth]
[   82.957223]  [<ffffffffabfc3692>] sock_release+0x4e/0xd6
[   82.957227]  [<ffffffffabfc372c>] sock_close+0x12/0x16
[   82.957231]  [<ffffffffab93731d>] __fput+0x199/0x2c8
[   82.957236]  [<ffffffffab83a96f>] ____fput+0xe/0x10
[   82.957240]  [<ffffffffab784f5c>] task_work_run+0x99/0xc5
[   82.957245]  [<ffffffffab6813a9>] prepare_exit_to_usermode+0xb8/0xd4
[   82.957250]  [<ffffffffab68150b>] syscall_return_slowpath+0x146/0x151
[   82.957253]  [<ffffffffac18eec0>] int_ret_from_sys_call+0x25/0x93
[   82.957255] Code: 90 48 89 df e8 62 10 12 00 8a 03 84 c0 75 f0 f0 44 0f b0 2b 84 c0 75 e7 41 bd ff 00 00 00 eb 0b f0 44 0f b1 2b ff c8 74 13 f3 90 <48> 89 df e8 a3 0e 12 00 8b 03 83 f8 01 75 ef eb e4 4c 89 e7 e8 
[   82.957313] Kernel panic - not syncing: softlockup: hung tasks
[   82.957316] CPU: 3 PID: 4949 Comm: btmgmt Tainted: G    B        L  4.4.96 #8
[   82.957317] Hardware name: Google Eve/Eve, BIOS Google_Eve.9584.107.0 11/07/2017
[   82.957319]  ffff88039c06fc68 7574430b6b716bd9 ffff8803eef87dd0 ffffffffaba4d933
[   82.957324]  ffffffffac3b3587 ffff8803eef87e68 ffff8803eef87e58 ffffffffab7ba0ea
[   82.957330]  ffff880300000008 ffff8803eef87e68 ffff8803eef87e00 7574430b6b716bd9
[   82.957335] Call Trace:
[   82.957336]  <IRQ>  [<ffffffffaba4d933>] dump_stack+0x4d/0x63
[   82.957344]  [<ffffffffab7ba0ea>] panic+0xf3/0x231
[   82.957347]  [<ffffffffab7a1339>] ? do_raw_spin_unlock+0xc7/0xd1
[   82.957351]  [<ffffffffab737080>] watchdog_timer_fn+0x1e0/0x203
[   82.957354]  [<ffffffffab736ea0>] ? watchdog_cleanup+0x10/0x10
[   82.957357]  [<ffffffffab7acf0b>] hrtimer_interrupt+0x381/0x7bc
[   82.957361]  [<ffffffffab6837f3>] local_apic_timer_interrupt+0xa6/0xad
[   82.957365]  [<ffffffffac191682>] smp_apic_timer_interrupt+0x5d/0x6f
[   82.957368]  [<ffffffffac18fb85>] apic_timer_interrupt+0x95/0xa0
[   82.957369]  <EOI>  [<ffffffffab6d8fae>] ? queued_write_lock_slowpath+0x85/0xa1
[   82.957375]  [<ffffffffab6d8fa6>] ? queued_write_lock_slowpath+0x7d/0xa1
[   82.957378]  [<ffffffffab6d8fae>] ? queued_write_lock_slowpath+0x85/0xa1
[   82.957381]  [<ffffffffab7a1466>] do_raw_write_lock+0x9c/0xce
[   82.957383]  [<ffffffffac18ea96>] _raw_write_lock+0x15/0x17
[   82.957404]  [<ffffffffc06f0221>] bt_sock_unlink+0x25/0xb1 [bluetooth]
[   82.957427]  [<ffffffffc071a443>] hci_sock_release+0xe0/0x1ce [bluetooth]
[   82.957431]  [<ffffffffabfc3692>] sock_release+0x4e/0xd6
[   82.957434]  [<ffffffffabfc372c>] sock_close+0x12/0x16
[   82.957437]  [<ffffffffab93731d>] __fput+0x199/0x2c8
[   82.957440]  [<ffffffffab83a96f>] ____fput+0xe/0x10
[   82.957443]  [<ffffffffab784f5c>] task_work_run+0x99/0xc5
[   82.957447]  [<ffffffffab6813a9>] prepare_exit_to_usermode+0xb8/0xd4
[   82.957451]  [<ffffffffab68150b>] syscall_return_slowpath+0x146/0x151
[   82.957454]  [<ffffffffac18eec0>] int_ret_from_sys_call+0x25/0x93

[   82.957509]  veth cmac rfcomm uinput snd_soc_skl_ssp_clk snd_soc_dmic snd_soc_kbl_rt5663_rt5514_max98927 snd_soc_hdac_hdmi snd_soc_skl snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_sst_match snd_hda_ext_core snd_hda_core btusb uvcvideo btrtl btbcm btintel videobuf2_vmalloc bluetooth videobuf2_memops videobuf2_v4l2 videobuf2_core snd_soc_rt5514 snd_soc_rt5663 snd_soc_rt5514_spi snd_soc_rl6231 snd_soc_max98927 xt_nat zram bridge stp llc ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat xt_mark fuse snd_seq_dummy snd_seq snd_seq_device iio_trig_sysfs cros_ec_light_prox cros_ec_sensors_ring cros_ec_sensors cros_ec_sensors_core industrialio_triggered_buffer kfifo_buf industrialio ip6table_filter iwlmvm r8152 iwl7000_mac80211 mii iwlwifi cfg80211 joydev
[   82.957604] CPU: 0 PID: 338 Comm: kworker/u9:2 Tainted: G    B        L  4.4.96 #8
[   82.957607] Hardware name: Google Eve/Eve, BIOS Google_Eve.9584.107.0 11/07/2017
[   82.957632] Workqueue: hci0 le_scan_disable_work [bluetooth]
[   82.957637] task: ffff880075a66c80 ti: ffff880075ae8000 task.ti: ffff880075ae8000
[   82.957640] RIP: 0010:[<ffffffffab6d76b3>]  [<ffffffffab6d76b3>] queued_spin_lock_slowpath+0x75/0x203
[   82.957649] RSP: 0018:ffff880075aefba8  EFLAGS: 00000202
[   82.957652] RAX: ffff7fffffffffff RBX: ffffffffc074e2cc RCX: ffffffffab6d76aa
[   82.957656] RDX: fffffbfff80e9c59 RSI: 0000000000000101 RDI: ffffffffc074e2cc
[   82.957660] RBP: ffff880075aefbd8 R08: ffffed00728d9484 R09: ffffed00777a7571
[   82.957663] R10: ffffed00728d9484 R11: ffff8803946ca41f R12: ffffffffc074e2cc
[   82.957666] R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000
[   82.957670] FS:  0000000000000000(0000) GS:ffff8803eee00000(0000) knlGS:0000000000000000
[   82.957674] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   82.957677] CR2: 00005b1976509020 CR3: 000000002c614000 CR4: 00000000003606f0
[   82.957680] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   82.957684] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   82.957686] Stack:
[   82.957689]  0000000000000001 ffffffffc074e2c8 ffffffffc074e2cc 0000000000000002
[   82.957697]  0000000000000000 0000000000000000 ffff880075aefbf8 ffffffffab6d901d
[   82.957704]  ffffffffc074e2c8 0000000000000002 ffff880075aefc10 ffffffffab7a138d
[   82.957712] Call Trace:
[   82.957717]  [<ffffffffab6d901d>] queued_read_lock_slowpath+0x53/0x7f
[   82.957722]  [<ffffffffab7a138d>] do_raw_read_lock+0x4a/0x4d
[   82.957727]  [<ffffffffac18e994>] _raw_read_lock+0x15/0x17
[   82.957750]  [<ffffffffc0718ca4>] hci_send_to_channel+0x2d/0xe5 [bluetooth]
[   82.957774]  [<ffffffffc071a845>] hci_send_monitor_ctrl_event+0x1e1/0x21f [bluetooth]
[   82.957798]  [<ffffffffc073b80b>] mgmt_send_event+0x153/0x16d [bluetooth]
[   82.957821]  [<ffffffffc07087bc>] mgmt_event+0x22/0x24 [bluetooth]
[   82.957844]  [<ffffffffc0717467>] mgmt_discovering+0x6b/0x86 [bluetooth]
[   82.957866]  [<ffffffffc06f6619>] hci_discovery_set_state+0x80/0x89 [bluetooth]
[   82.957889]  [<ffffffffc0737555>] le_scan_disable_work+0x137/0x15a [bluetooth]
[   82.957894]  [<ffffffffab784725>] worker_thread+0xa85/0xdd9
[   82.957899]  [<ffffffffab783ca0>] ? queue_work_on+0x24/0x24
[   82.957904]  [<ffffffffab6ad12b>] kthread+0x186/0x196
[   82.957908]  [<ffffffffab6acfa5>] ? kthread_stop+0x1cc/0x1cc
[   82.957913]  [<ffffffffac18f0ef>] ret_from_fork+0x3f/0x70
[   82.957917]  [<ffffffffab6acfa5>] ? kthread_stop+0x1cc/0x1cc
[   82.957920] Code: ca 89 f0 0f 44 d7 f0 0f b1 13 39 f0 74 04 89 c6 eb e2 ff ca 0f 84 93 01 00 00 48 89 df e8 a7 27 12 00 8b 33 40 84 f6 74 04 f3 90 <eb> ed 48 89 df e8 02 28 12 00 66 c7 03 01 00 e9 6e 01 00 00 49 
[   84.037287] Shutting down cpus with NMI
[   84.037297] Kernel Offset: 0x2a600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   84.039209] gsmi: Log Shutdown Reason 0x02
[   84.051626] ACPI MEMORY or I/O RESET_REG.
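
One way to read the two traces above, offered as a hedged sketch rather than the analysis from the CL itself: both CPUs appear to be spinning on the same reader-writer lock (ffffffffc074e2c8 shows up in the CPU 3 registers and on the CPU 0 stack). btmgmt wants the write side in bt_sock_unlink, and the kworker wants the read side in hci_send_to_channel. If we assume that hci_send_monitor_ctrl_event already holds the read side of that lock when it calls hci_send_to_channel (the trace does not show this, so it is an assumption), then a fair queued lock that parks new readers behind a waiting writer can never make progress. The toy Python model below illustrates that ordering. QueuedRWLockModel and both scenario functions are hypothetical names, and the model is a simplification, not the kernel's qrwlock.

```python
class QueuedRWLockModel:
    """Toy fair reader-writer lock: new readers queue behind a waiting writer.

    Assumption: this only mimics the ordering rule, not the real kernel code.
    """

    def __init__(self):
        self.readers = []   # tasks currently holding the read side
        self.writer = None  # task currently holding the write side
        self.waiting = []   # FIFO of (task, mode) that could not acquire

    def read_lock(self, task):
        # A reader gets in only if no writer holds the lock and nobody is queued.
        if self.writer is None and not self.waiting:
            self.readers.append(task)
            return True
        self.waiting.append((task, "read"))
        return False

    def write_lock(self, task):
        # A writer needs the lock completely free and an empty queue.
        if self.writer is None and not self.readers and not self.waiting:
            self.writer = task
            return True
        self.waiting.append((task, "write"))
        return False

    def read_unlock(self, task):
        self.readers.remove(task)
        self._wake()

    def _wake(self):
        # Hand the lock to queued tasks in FIFO order while they are compatible.
        while self.waiting:
            task, mode = self.waiting[0]
            if mode == "write":
                if self.writer is None and not self.readers:
                    self.writer = task
                    self.waiting.pop(0)
                break
            if self.writer is not None:
                break
            self.readers.append(task)
            self.waiting.pop(0)

    def deadlocked(self):
        # Every holder is itself queued on this lock, so nobody can release it.
        holders = set(self.readers)
        if self.writer is not None:
            holders.add(self.writer)
        waiters = {task for task, _ in self.waiting}
        return bool(holders) and holders <= waiters


def nested_read_scenario():
    """Assumed ordering: outer read, writer arrives, same task reads again."""
    lock = QueuedRWLockModel()
    steps = [
        ("kworker outer read_lock", lock.read_lock("kworker")),
        ("btmgmt write_lock", lock.write_lock("btmgmt")),
        ("kworker nested read_lock", lock.read_lock("kworker")),
    ]
    return steps, lock.deadlocked()


def single_read_scenario():
    """Same tasks, but the reader takes the lock once and then releases it."""
    lock = QueuedRWLockModel()
    lock.read_lock("kworker")
    lock.write_lock("btmgmt")
    lock.read_unlock("kworker")
    return lock.writer, lock.deadlocked()
```

In this model the nested read lock is what closes the cycle: the kworker holds the read side while queued behind btmgmt, and btmgmt waits for the kworker to release. Taking the read side only once lets the writer proceed as soon as the reader unlocks.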


Comment 8 by lepton@chromium.org, Apr 25 2018

Status: Fixed (was: Available)

Comment 9 by lepton@chromium.org, Apr 25 2018

FYI, I tested the kernel at commit 3f018c593fa9e9a2d4a6a with 21100b9 cherry-picked, on R64-10128.0.0, and there is no more crash.
