
Issue 778145


Issue metadata

Status: Verified
Owner: ----
Closed: Oct 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




WARNING: kernel/kthread.c: kthread_worker_fn+0x31/0x1a6

Project Member Reported by drinkcat@chromium.org, Oct 25 2017

Issue description

Booting soraka with kernel ToT:

420fde10575f (HEAD, m/master, cros/chromeos-4.4) CHROMIUM: Bluetooth: Remove the assumption on the Adv state in connection event

[   13.006885] ------------[ cut here ]------------
[   13.006897] WARNING: CPU: 2 PID: 1915 at /mnt/host/source/src/third_party/kernel/v4.4/kernel/kthread.c:573 kthread_worker_fn+0x31/0x1a6()
[   13.006900] Modules linked in: snd_soc_hdac_hdmi snd_soc_skl snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_sst_match snd_hda_ext_core snd_hda_core ipu3_cio2 intel_acpi_camera snd_soc_rt5663(+) snd_soc_max98927 snd_soc_rl6231 xt_nat ov5670 cmac ov13858 at24 dw9714 acpi_als ipu3_imgu ipu3_mmu ipu3_dmamap videobuf2_dma_sg bridge videobuf2_memops zram stp llc videobuf2_v4l2 videobuf2_core rfcomm ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat xt_mark fuse iio_trig_sysfs cros_ec_sensors_ring cros_ec_sensors cros_ec_sensors_core industrialio_triggered_buffer kfifo_buf industrialio ip6table_filter iwlmvm iwl7000_mac80211 iwlwifi cfg80211 hid_google_hammer btusb btrtl btbcm btintel bluetooth usb_serial_simple ax88179_178a usbnet mii joydev
[   13.007006] CPU: 2 PID: 1915 Comm: 0000:00:1f.3 Tainted: G     U          4.4.92 #16
[   13.007010] Hardware name: HP Soraka/Soraka, BIOS Google_Soraka.9971.0.0 09/22/2017
[   13.007014]  0000000000000286 c2ae425eb31f0090 ffff880058847e10 ffffffff8a4948d7
[   13.007022]  0000000000000000 0000000000000009 ffff880058847e48 ffffffff8a269888
[   13.007029]  ffffffff8a283d3e ffff880059ac2108 ffff880059ac2108 ffffffff8a283d0d
[   13.007037] Call Trace:
[   13.007046]  [<ffffffff8a4948d7>] dump_stack+0x4d/0x63
[   13.007053]  [<ffffffff8a269888>] warn_slowpath_common+0x9f/0xb8
[   13.007058]  [<ffffffff8a283d3e>] ? kthread_worker_fn+0x31/0x1a6
[   13.007062]  [<ffffffff8a283d0d>] ? kthread_park+0x52/0x52
[   13.007067]  [<ffffffff8a26999a>] warn_slowpath_null+0x1a/0x1c
[   13.007072]  [<ffffffff8a283d3e>] kthread_worker_fn+0x31/0x1a6
[   13.007077]  [<ffffffff8a283d0d>] ? kthread_park+0x52/0x52
[   13.007081]  [<ffffffff8a283d0d>] ? kthread_park+0x52/0x52
[   13.007086]  [<ffffffff8a283c91>] kthread+0x12c/0x134
[   13.007091]  [<ffffffff8a283b65>] ? kthread_parkme+0x24/0x24
[   13.007098]  [<ffffffff8a916f2f>] ret_from_fork+0x3f/0x70
[   13.007102]  [<ffffffff8a283b65>] ? kthread_parkme+0x24/0x24
[   13.007106] ---[ end trace 71244f0127c87ab4 ]---

 

Comment 1 by groeck@chromium.org, Oct 25 2017

This is seen with all 4.4 images. If I remember correctly, it was introduced by the recent drm patch series, though I don't remember the details.

Comment 2 Deleted

I found bisecting (especially over such a wide range as I wasn't sure when this started to happen) quite difficult, as many (all?) of the commits within the stable merge(s) do not actually boot on soraka.

Anyway, I still managed, and the result is as follows:

6a1466f773fe54069da27e2939f09bbaf65c2f5a is the first bad commit
commit 6a1466f773fe54069da27e2939f09bbaf65c2f5a
Author: Subhransu S. Prusty <subhransu.s.prusty@intel.com>
Date:   Tue Aug 22 16:45:50 2017 +0530

    FROMLIST: ASoC: Intel: Skylake: Fix to free dsp resource on ipc_init failure
    
    For some dsp init error path, irq and few more resources are not freed.
    This results in oops. So, fix it by freeing up the resources on ipc_init
    failure.
    
    BUG=b:64204809
    TEST=no oops during reboot testing on eve.
    
    Signed-off-by: Todd Broch <tbroch@chromium.org>
    (am from http://www.spinics.net/lists/alsa-devel/msg66471.html)
    
    Signed-off-by: Subhransu S. Prusty <subhransu.s.prusty@intel.com>
    Acked-By: Vinod Koul <vinod.koul@intel.com>
    Signed-off-by: Mark Brown <broonie@kernel.org>
    (cherry picked from commit 3b3011adada3bba47c56c205634e1b32512e0c7c)
    
    Change-Id: Ib794a202fb2a1f7a90ae9133ff4f0babf988d352
    Reviewed-on: https://chromium-review.googlesource.com/630510
    Commit-Ready: HARSHAPRIYA N <harshapriya.n@intel.com>
    Tested-by: HARSHAPRIYA N <harshapriya.n@intel.com>
    Tested-by: Todd Broch <tbroch@chromium.org>
    Reviewed-by: Hsinyu Chao <hychao@chromium.org>

Reverting the CL on ToT fixes the issue.
Labels: M-61
The patch was backported to M-61: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/664068
Cc: bleung@chromium.org

Comment 6 by groeck@chromium.org, Oct 25 2017

#3: You would normally use "git bisect skip" to skip over stable merges.

#6: Right, I ended up figuring that out ,-(

Comment 8 by tbroch@chromium.org, Oct 25 2017

I see the same failure on eve on ToT as of today. ToT from around 10/23 did not reproduce it in brief testing (several reboots), though. Sorry, no SHA for that.

It looks like the call to skl_ipc_init in skl_sst_ctx_init now succeeds, so the one in skl_sst_dsp_init is redundant, which leads to the WARN_ON. A revert seems like the way to go.

Harshapriya can you have a look?
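
The redundant-call explanation above can be modeled in user space. This is only a sketch: it assumes the check that fired is the `WARN_ON(worker->task)` at the top of kthread_worker_fn, and every name in it (`model_worker`, `model_task`, `model_worker_fn_enter`, `model_ipc_init`) is a hypothetical stand-in, not a kernel type or function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; these are not kernel types. */
struct model_task {
	const char *comm;           /* thread name, like "Comm:" in the log */
};

struct model_worker {
	const struct model_task *task; /* thread bound to this worker, or NULL */
	int warnings;                  /* number of times the sanity check fired */
};

/*
 * Models the entry of a worker thread function: it expects that no thread
 * is bound to the worker yet, warns if one is, then binds itself.
 */
static void model_worker_fn_enter(struct model_worker *w,
				  const struct model_task *self)
{
	assert(w != NULL && self != NULL);

	if (w->task != NULL) {  /* plays the role of WARN_ON(worker->task) */
		w->warnings++;
		fprintf(stderr, "WARNING: worker already bound to %s\n",
			w->task->comm);
	}
	w->task = self;
}

/* Models an init routine that starts a worker thread for the worker. */
static void model_ipc_init(struct model_worker *w,
			   const struct model_task *thread)
{
	model_worker_fn_enter(w, thread);
}
```

Calling `model_ipc_init()` once on a zero-initialized `model_worker` is silent; calling it a second time on the same worker trips the check, which mirrors the effect of running the init from two places.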


You are right. But the cleanup of the dsp resources still needs to happen if the call fails. I think applying the patch on top of the latest upstream code caused skl_ipc_init to be called twice, because the location of the call has changed in the latest upstream code. We would need to call skl_dsp_free(sst) if skl_sst_ctx_init() fails. I will submit a second version of this patch. Without this cleanup we might see crashes.
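
The cleanup described above follows the usual kernel pattern of releasing resources on an error path. A minimal user-space sketch of that pattern, not the actual driver code: `model_dsp`, `model_dsp_acquire`, `model_ctx_init`, `model_dsp_free` and `model_dsp_init` are hypothetical names that only loosely mirror skl_sst_ctx_init() and skl_dsp_free(), and the failure is injected by a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the DSP state; not the driver's real structure. */
struct model_dsp {
	bool irq_held;   /* resource taken early in init (for example an IRQ) */
	bool ipc_ready;  /* set once the context/IPC init step succeeds */
};

/* Takes the early resources. Always succeeds in this model. */
static int model_dsp_acquire(struct model_dsp *dsp)
{
	dsp->irq_held = true;
	return 0;
}

/* Releases what model_dsp_acquire() took. */
static void model_dsp_free(struct model_dsp *dsp)
{
	dsp->irq_held = false;
}

/* Context init step; fails on request so the error path can be exercised. */
static int model_ctx_init(struct model_dsp *dsp, bool inject_failure)
{
	if (inject_failure)
		return -1;
	dsp->ipc_ready = true;
	return 0;
}

/* Init with cleanup: if the context init fails, free the early resources. */
static int model_dsp_init(struct model_dsp *dsp, bool inject_failure)
{
	int ret;

	assert(dsp != NULL);

	ret = model_dsp_acquire(dsp);
	if (ret < 0)
		return ret;

	ret = model_ctx_init(dsp, inject_failure);
	if (ret < 0)
		goto err_free;

	return 0;

err_free:
	model_dsp_free(dsp);
	return ret;
}
```

With the failure injected, `model_dsp_init()` returns a negative value and leaves no resource held, which is the property the proposed second version of the patch is meant to guarantee.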
Cc: chintan....@intel.com
Revert is here, if this is what we want to do:
https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/738010

In https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/630510, Chintan suggests:

"""
Not sure if it is going to fix this, but I think this patch should have been brought with this series too:

2eed1b024a11 2017-08-02 ASoC: Intel: Skylake: Move platform specific init to platform dsp_init() [Guneshwor Singh 2017-08-03 11:07:26 +0100].
"""
As a data point, this always happens on Eve as well.

I did a quick test with the patch I suggested above, and it seems to fix this issue.

Backport: https://chromium-review.googlesource.com/#/c/chromiumos/third_party/kernel/+/739922/

Status: Verified (was: Available)
