3.14 kernel shows several debug splats on Samus using "lockdebug" |
|||||
Issue descriptionR61 see crbug.com/736668 after patching in these two CLs https://chromium-review.googlesource.com/c/547566/ https://chromium-review.googlesource.com/c/547567/ and building with USE="lockdebug" I'm still seeing several splats on 3.14 one for perf_event_ctx_lock [ 1.343105] BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/v3.14/kernel/locking/mutex.c:616 [ 1.343119] in_atomic(): 1, irqs_disabled(): 0, pid: 147, name: sysctl [ 1.343126] 2 locks held by sysctl/147: [ 1.343130] #0: (sb_writers#4){.+.+.+}, at: [<ffffffff8f15d389>] file_start_write+0x27/0x29 [ 1.343151] #1: (watchdog_proc_mutex){+.+.+.}, at: [<ffffffff8f0e5ddb>] proc_dowatchdog+0x34/0x125 [ 1.343169] CPU: 1 PID: 147 Comm: sysctl Not tainted 3.14.0 #4 [ 1.343175] Hardware name: GOOGLE Samus, BIOS Google_Samus.6300.174.0 04/02/2015 [ 1.343182] 0000000000000000 00000000b3c5dbf9 ffff8802733cdcb8 ffffffff8f66f27d [ 1.343195] 0000000000000093 ffff8802733cdce0 ffffffff8f088cc4 ffff88027ec16838 [ 1.343206] 0000000000000000 ffff8802734ac800 ffff8802733cdd80 ffffffff8f6769d1 [ 1.343216] Call Trace: [ 1.343222] [<ffffffff8f66f27d>] dump_stack+0x4d/0x6f [ 1.343228] [<ffffffff8f088cc4>] __might_sleep+0x1bd/0x1c4 [ 1.343234] [<ffffffff8f6769d1>] mutex_lock_nested+0x4d/0x432 [ 1.343241] [<ffffffff8f0a511e>] ? lock_acquire+0x116/0x140 [ 1.343247] [<ffffffff8f109a4a>] ? perf_adjust_period+0x14e/0x14e [ 1.343254] [<ffffffff8f10ab93>] perf_event_ctx_lock+0x42/0x67 [ 1.343259] [<ffffffff8f10ab93>] ? perf_event_ctx_lock+0x42/0x67 [ 1.343265] [<ffffffff8f10abc7>] perf_event_disable+0xf/0x27 [ 1.343272] [<ffffffff8f0e590e>] update_timers_all_cpus+0x9e/0xf5 [ 1.343278] [<ffffffff8f0e5965>] ? update_timers_all_cpus+0xf5/0xf5 [ 1.343284] [<ffffffff8f0e5e87>] proc_dowatchdog+0xe0/0x125 [ 1.343290] [<ffffffff8f0e5e87>] ? proc_dowatchdog+0xe0/0x125 [ 1.343296] [<ffffffff8f1b8bbe>] proc_sys_call_handler+0xb0/0xea [ 1.343302] [<ffffffff8f1b8c0c>] proc_sys_write+0x14/0x16 [ 1.343308] [<ffffffff8f15daaa>] vfs_write+0xac/0xd6 [ 1.343313] [<ffffffff8f15e18c>] SyS_write+0x5f/0xa3 [ 1.343321] [<ffffffff8f6798dc>] system_call_fastpath+0x20/0x25 several hits for xhci_alloc_command() including this one [ 3.870052] BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/v3.14/mm/slub.c:965 [ 3.870065] in_atomic(): 1, irqs_disabled(): 1, pid: 38, name: kworker/1:1 [ 3.870072] 3 locks held by kworker/1:1/38: [ 3.870077] #0: ("pm"){.+.+.+}, at: [<ffffffff8f078d6a>] process_one_work+0x183/0x422 [ 3.870096] #1: ((&dev->power.work)){+.+.+.}, at: [<ffffffff8f078d6a>] process_one_work+0x183/0x422 [ 3.870110] #2: (&(&xhci->lock)->rlock){-.-...}, at: [<ffffffff8f46caf6>] xhci_stop_device.constprop.7+0x66/0x14f [ 3.870128] irq event stamp: 10408 [ 3.870132] hardirqs last enabled at (10407): [<ffffffff8f0a3d0c>] debug_check_no_locks_freed+0x91/0x135 [ 3.870144] hardirqs last disabled at (10408): [<ffffffff8f6782ac>] _raw_spin_lock_irqsave+0x18/0x4f [ 3.870154] softirqs last enabled at (10344): [<ffffffff8f064270>] __do_softirq+0x2ef/0x350 [ 3.870165] softirqs last disabled at (10323): [<ffffffff8f0644d3>] irq_exit+0x44/0x9a [ 3.870176] CPU: 1 PID: 38 Comm: kworker/1:1 Not tainted 3.14.0 #4 [ 3.870182] Hardware name: GOOGLE Samus, BIOS Google_Samus.6300.174.0 04/02/2015 [ 3.870192] Workqueue: pm pm_runtime_work [ 3.870198] 0000000000000000 000000000662516c ffff88027308b840 ffffffff8f66f27d [ 3.870209] 0000000000000026 ffff88027308b868 ffffffff8f088cc4 0000000000008010 [ 3.870220] 0000000000000030 ffffffff8f465ac9 ffff88027308b880 ffffffff8f14ad4e [ 3.870232] Call Trace: [ 3.870238] [<ffffffff8f66f27d>] dump_stack+0x4d/0x6f [ 3.870246] [<ffffffff8f088cc4>] __might_sleep+0x1bd/0x1c4 [ 3.870255] [<ffffffff8f465ac9>] ? kzalloc+0xf/0x11 [ 3.870263] [<ffffffff8f14ad4e>] slab_pre_alloc_hook.isra.40+0x2c/0x31 [ 3.870271] [<ffffffff8f14d2b1>] __kmalloc+0x62/0x11f [ 3.870278] [<ffffffff8f465ac9>] kzalloc+0xf/0x11 [ 3.870286] [<ffffffff8f466cb9>] xhci_alloc_command+0x2a/0xca [ 3.870293] [<ffffffff8f46cb1b>] xhci_stop_device.constprop.7+0x8b/0x14f [ 3.870301] [<ffffffff8f46d6bc>] xhci_hub_control+0x8f8/0xd8c [ 3.870308] [<ffffffff8f088bd4>] ? __might_sleep+0xcd/0x1c4 [ 3.870316] [<ffffffff8f14ad4e>] ? slab_pre_alloc_hook.isra.40+0x2c/0x31 [ 3.870325] [<ffffffff8f43e6b8>] usb_hcd_submit_urb+0x5a7/0x78f [ 3.870333] [<ffffffff8f0a3d0c>] ? debug_check_no_locks_freed+0x91/0x135 [ 3.870341] [<ffffffff8f0a3a9f>] ? trace_hardirqs_on_caller+0x180/0x19c [ 3.870347] [<ffffffff8f0a3ac8>] ? trace_hardirqs_on+0xd/0xf [ 3.870353] [<ffffffff8f43f6a7>] usb_submit_urb+0x443/0x458 [ 3.870359] [<ffffffff8f44061c>] usb_start_wait_urb+0x6d/0xf7 [ 3.870365] [<ffffffff8f088bd4>] ? __might_sleep+0xcd/0x1c4 [ 3.870371] [<ffffffff8f440787>] usb_control_msg+0xe1/0x11d [ 3.870377] [<ffffffff8f4368a7>] set_port_feature+0x35/0x37 [ 3.870383] [<ffffffff8f43952f>] usb_port_suspend+0x16c/0x237 [ 3.870390] [<ffffffff8f44a30f>] generic_suspend+0x26/0x2c [ 3.870395] [<ffffffff8f4425f6>] usb_suspend_both+0x14b/0x1a0 [ 3.870402] [<ffffffff8f4435eb>] usb_runtime_suspend+0x33/0x66 [ 3.870408] [<ffffffff8f4435b8>] ? usb_probe_interface+0x222/0x222 [ 3.870414] [<ffffffff8f3e66d5>] __rpm_callback+0x39/0x60 [ 3.870420] [<ffffffff8f3e6770>] rpm_callback+0x74/0x8a [ 3.870426] [<ffffffff8f3e6f69>] rpm_suspend+0x331/0x4ab [ 3.870432] [<ffffffff8f3e7e8b>] pm_runtime_work+0x87/0x94 [ 3.870437] [<ffffffff8f078e45>] process_one_work+0x25e/0x422 [ 3.870443] [<ffffffff8f07a4b3>] worker_thread+0x143/0x202 [ 3.870449] [<ffffffff8f07a370>] ? rescuer_thread+0x2d6/0x2d6 [ 3.870455] [<ffffffff8f080245>] kthread+0xdb/0xe3 [ 3.870461] [<ffffffff8f08016a>] ? __kthread_parkme+0x67/0x67 [ 3.870467] [<ffffffff8f67982c>] ret_from_fork+0x7c/0xb0 [ 3.870473] [<ffffffff8f08016a>] ? __kthread_parkme+0x67/0x67 one for regmap_lock_mutex() [ 6.092869] ============================================= [ 6.092874] [ INFO: possible recursive locking detected ] [ 6.092879] 3.14.0 #4 Not tainted [ 6.092882] --------------------------------------------- [ 6.092886] udevd/2055 is trying to acquire lock: [ 6.092891] (&map->mutex){+.+...}, at: [<ffffffff8f3eac85>] regmap_lock_mutex+0x10/0x12 [ 6.092904] but task is already holding lock: [ 6.092909] (&map->mutex){+.+...}, at: [<ffffffff8f3eac85>] regmap_lock_mutex+0x10/0x12 [ 6.092918] other info that might help us debug this: [ 6.092924] Possible unsafe locking scenario: [ 6.092928] CPU0 [ 6.092931] ---- [ 6.092934] lock(&map->mutex); [ 6.092938] lock(&map->mutex); [ 6.092942] *** DEADLOCK *** [ 6.092947] May be due to missing lock nesting notation [ 6.092953] 3 locks held by udevd/2055: [ 6.092957] #0: (&__lockdep_no_validate__){......}, at: [<ffffffff8f3dcbbb>] __driver_attach+0x34/0x83 [ 6.092969] #1: (&__lockdep_no_validate__){......}, at: [<ffffffff8f3dcbd3>] __driver_attach+0x4c/0x83 [ 6.092980] #2: (&map->mutex){+.+...}, at: [<ffffffff8f3eac85>] regmap_lock_mutex+0x10/0x12 [ 6.092990] stack backtrace: [ 6.092996] CPU: 2 PID: 2055 Comm: udevd Not tainted 3.14.0 #4 [ 6.093001] Hardware name: GOOGLE Samus, BIOS Google_Samus.6300.174.0 04/02/2015 [ 6.093006] 0000000000000000 000000004c1a3847 ffff88024d9df920 ffffffff8f66f27d [ 6.093016] ffffffff902ad060 ffff88024d9df9f0 ffffffff8f0a4b74 000000024d95cf20 [ 6.093024] ffff88024d95c710 ffffffff902ad060 ffffffff8f3eac00 00005e82f4b2c596 [ 6.093033] Call Trace: [ 6.093039] [<ffffffff8f66f27d>] dump_stack+0x4d/0x6f [ 6.093045] [<ffffffff8f0a4b74>] __lock_acquire+0xdc4/0xf8b [ 6.093052] [<ffffffff8f3eac00>] ? regmap_parse_16_be_inplace+0xb/0xf [ 6.093059] [<ffffffff8f00f893>] ? save_stack_trace+0x2b/0x46 [ 6.093065] [<ffffffff8f0a1254>] ? save_trace+0x3b/0x91 [ 6.093071] [<ffffffff8f0a511e>] lock_acquire+0x116/0x140 [ 6.093077] [<ffffffff8f0a511e>] ? lock_acquire+0x116/0x140 [ 6.093083] [<ffffffff8f3eac85>] ? regmap_lock_mutex+0x10/0x12 [ 6.093089] [<ffffffff8f676a0d>] mutex_lock_nested+0x89/0x432 [ 6.093095] [<ffffffff8f3eac85>] ? regmap_lock_mutex+0x10/0x12 [ 6.093100] [<ffffffff8f3eac85>] ? regmap_lock_mutex+0x10/0x12 [ 6.093107] [<ffffffff8f676d0d>] ? mutex_lock_nested+0x389/0x432 [ 6.093112] [<ffffffff8f3eac85>] regmap_lock_mutex+0x10/0x12 [ 6.093118] [<ffffffff8f3eac85>] ? regmap_lock_mutex+0x10/0x12 [ 6.093124] [<ffffffff8f3ecb8b>] regmap_read+0x37/0x5f [ 6.093132] [<ffffffffc03114e6>] rt5677_sel_asrc_clk_src+0x220d/0x270c [snd_soc_rt5677] [ 6.093139] [<ffffffff8f3eca9a>] _regmap_read+0xa5/0x15f [ 6.093145] [<ffffffff8f3ecb99>] regmap_read+0x45/0x5f [ 6.093151] [<ffffffffc0310af0>] rt5677_sel_asrc_clk_src+0x1817/0x270c [snd_soc_rt5677] [ 6.093160] [<ffffffff8f2d9a5f>] ? acpi_dev_pm_attach+0xa9/0xc4 [ 6.093168] [<ffffffffc03107b7>] ? rt5677_sel_asrc_clk_src+0x14de/0x270c [snd_soc_rt5677] [ 6.093178] [<ffffffff8f4a320c>] i2c_device_probe+0xae/0xd2 [ 6.093186] [<ffffffff8f3dc9d3>] driver_probe_device+0xae/0x1fc [ 6.093193] [<ffffffff8f3dcbe8>] __driver_attach+0x61/0x83 [ 6.093200] [<ffffffff8f3dcb87>] ? __device_attach_driver+0x66/0x66 [ 6.093207] [<ffffffff8f3dbb06>] bus_for_each_dev+0x8f/0xb2 [ 6.093214] [<ffffffff8f3dc448>] driver_attach+0x1e/0x20 [ 6.093221] [<ffffffff8f3dc165>] bus_add_driver+0xee/0x1e6 [ 6.093228] [<ffffffff8f3dd30f>] driver_register+0x8e/0xca [ 6.093236] [<ffffffff8f4a2205>] i2c_register_driver+0x44/0xa3 [ 6.093245] [<ffffffffc011c000>] ? 0xffffffffc011bfff [ 6.093253] [<ffffffffc011c017>] init_module+0x17/0x1000 [snd_soc_rt5677] [ 6.093261] [<ffffffff8f0002c0>] do_one_initcall+0x93/0x132 [ 6.093270] [<ffffffff8f036fbc>] ? set_memory_nx+0x37/0x3a [ 6.093279] [<ffffffff8f0ca9f0>] load_module+0x15fa/0x1ba7 [ 6.093287] [<ffffffff8f0cb115>] SyS_finit_module+0x7d/0xa2 [ 6.093294] [<ffffffff8f0cb115>] ? SyS_finit_module+0x7d/0xa2 [ 6.093304] [<ffffffff8f6798dc>] system_call_fastpath+0x20/0x25 Since 3.18 and 4.4 don't seem to hit these problems when enabling USE="lockdebug" I'm guessing there's some upstream solutions to these issues that we can backport
,
Jun 25 2017
The xhci one looks like a real regression which was caused by backporting xhci: Use command structures when queuing commands on the command ring the regmap one is not a real problem (I think) but probably needs: 3cfe7a74d42b regmap: Use different lockdep class for each regmap init call which was authored by drinkcat@ :-)
,
Jun 26 2017
Ok, the regmap one were clean after two additional cherry-picks watchdog one needs: bde92cf455a0 kernel/watchdog.c: remove preemption restrictions when restarting lockup detector full series to fix all the boottime splat on samus is up here: https://chromium-review.googlesource.com/547687
,
Jun 27 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/64ac4d16bae7204c39217c6949b69fd93f50c89d commit 64ac4d16bae7204c39217c6949b69fd93f50c89d Author: Boris BREZILLON <boris.brezillon@free-electrons.com> Date: Tue Jun 27 22:51:59 2017 UPSTREAM: regmap: i2c: fallback to SMBus if the adapter does not support standard I2C Some I2C adapters are only compatible with the SMBus protocol and do not support standard I2C transfers. Fallback to SMBus transfers if we encounter such kind of adapters. The transfer type is chosen according to the val_bits field in the regmap config. BUG= chromium:736671 TEST=build/boot samus with USE="lockdebug" Signed-off-by: Boris BREZILLON <boris.brezillon@free-electrons.com> Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from commit b42261078a91db8a8307db42cad41a619077d1df) Signed-off-by: Sonny Rao <sonnyrao@chromium.org> Change-Id: I369cc3ff37d616848c821b1aedb823d7ff5a6684 Reviewed-on: https://chromium-review.googlesource.com/547651 Commit-Ready: Sonny Rao <sonnyrao@chromium.org> Tested-by: Sonny Rao <sonnyrao@chromium.org> Reviewed-by: Nicolas Boichat <drinkcat@chromium.org> [modify] https://crrev.com/64ac4d16bae7204c39217c6949b69fd93f50c89d/drivers/base/regmap/regmap-i2c.c
,
Jun 27 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/64ac4d16bae7204c39217c6949b69fd93f50c89d commit 64ac4d16bae7204c39217c6949b69fd93f50c89d Author: Boris BREZILLON <boris.brezillon@free-electrons.com> Date: Tue Jun 27 22:51:59 2017 UPSTREAM: regmap: i2c: fallback to SMBus if the adapter does not support standard I2C Some I2C adapters are only compatible with the SMBus protocol and do not support standard I2C transfers. Fallback to SMBus transfers if we encounter such kind of adapters. The transfer type is chosen according to the val_bits field in the regmap config. BUG= chromium:736671 TEST=build/boot samus with USE="lockdebug" Signed-off-by: Boris BREZILLON <boris.brezillon@free-electrons.com> Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from commit b42261078a91db8a8307db42cad41a619077d1df) Signed-off-by: Sonny Rao <sonnyrao@chromium.org> Change-Id: I369cc3ff37d616848c821b1aedb823d7ff5a6684 Reviewed-on: https://chromium-review.googlesource.com/547651 Commit-Ready: Sonny Rao <sonnyrao@chromium.org> Tested-by: Sonny Rao <sonnyrao@chromium.org> Reviewed-by: Nicolas Boichat <drinkcat@chromium.org> [modify] https://crrev.com/64ac4d16bae7204c39217c6949b69fd93f50c89d/drivers/base/regmap/regmap-i2c.c
,
Jun 27 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/eb26f6445a0ab39aca73e2f81eb5e2cffa0719da commit eb26f6445a0ab39aca73e2f81eb5e2cffa0719da Author: Mathias Nyman <mathias.nyman@linux.intel.com> Date: Tue Jun 27 22:51:58 2017 UPSTREAM: xhci: Fix sleeping with IRQs disabled in xhci_stop_device() xhci_stop_device() allocates and issues stop commands for each active endpoint. This is done with spinlock held and interrupt disabled so we can't sleep during memory allocation. Use GFP_NOWAIT instead Regression from commit ddba5cd0aeff5bbed92ebdf4b1223300b0541e78 "xhci: Use command structures when queuing commands on the command ring" for 3.16-rc1 BUG= chromium:736671 TEST=build/boot samus with USE="lockdebug" Fixes: ddba5cd0aeff ("xhci: Use command structures when queuing commands") Reported-by: Dan Williams <dan.j.williams@intel.com> Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit be3de32107091c266b00a48265fe7e06233af4f0) Signed-off-by: Sonny Rao <sonnyrao@chromium.org> Change-Id: Ic1ab945e2d8ae1e3f6a4dce9e4c2849dae13e143 Reviewed-on: https://chromium-review.googlesource.com/547738 Commit-Ready: Sonny Rao <sonnyrao@chromium.org> Tested-by: Sonny Rao <sonnyrao@chromium.org> Reviewed-by: Guenter Roeck <groeck@chromium.org> Reviewed-by: Vincent Palatin <vpalatin@chromium.org> [modify] https://crrev.com/eb26f6445a0ab39aca73e2f81eb5e2cffa0719da/drivers/usb/host/xhci-hub.c
,
Jun 27 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/77820775dc1feb161e667deae098d043f7cde29a commit 77820775dc1feb161e667deae098d043f7cde29a Author: Eric Dumazet <edumazet@google.com> Date: Tue Jun 27 22:52:05 2017 UPSTREAM: ipv6: tcp: add rcu locking in tcp_v6_send_synack() When first SYNACK is sent, we already hold rcu_read_lock(), but this is not true if a SYNACK is retransmitted, as a timer (soft) interrupt does not hold rcu_read_lock() BUG= chromium:736671 TEST=build/boot samus with USE="lockdebug" Fixes: 45f6fad84cc30 ("ipv6: add complete rcu protection around np->opt") Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 3e4006f0b86a5ae5eb0e8215f9a9e1db24506977) Signed-off-by: Sonny Rao <sonnyrao@chromium.org> Change-Id: Ibb2a4b6116084e4ce73855d2f49b0c38b0263550 Reviewed-on: https://chromium-review.googlesource.com/548974 Commit-Ready: Sonny Rao <sonnyrao@chromium.org> Tested-by: Sonny Rao <sonnyrao@chromium.org> Reviewed-by: Guenter Roeck <groeck@chromium.org> [modify] https://crrev.com/77820775dc1feb161e667deae098d043f7cde29a/net/ipv6/tcp_ipv6.c
,
Jun 28 2017
,
Jan 22 2018
|
|||||
►
Sign in to add a comment |
|||||
Comment 1 by sonnyrao@chromium.org
, Jun 25 2017