New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 736671 link

Starred by 1 user

Issue metadata

Status: Archived
Owner:
Closed: Jun 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug



Sign in to add a comment

3.14 kernel shows several debug splats on Samus using "lockdebug"

Project Member Reported by sonnyrao@chromium.org, Jun 25 2017

Issue description

R61 

see crbug.com/736668

after patching in these two CLs  
https://chromium-review.googlesource.com/c/547566/
https://chromium-review.googlesource.com/c/547567/

and building with USE="lockdebug" I'm still seeing several splats on 3.14

one for perf_event_ctx_lock

[    1.343105] BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/v3.14/kernel/locking/mutex.c:616
[    1.343119] in_atomic(): 1, irqs_disabled(): 0, pid: 147, name: sysctl
[    1.343126] 2 locks held by sysctl/147:
[    1.343130]  #0:  (sb_writers#4){.+.+.+}, at: [<ffffffff8f15d389>] file_start_write+0x27/0x29
[    1.343151]  #1:  (watchdog_proc_mutex){+.+.+.}, at: [<ffffffff8f0e5ddb>] proc_dowatchdog+0x34/0x125
[    1.343169] CPU: 1 PID: 147 Comm: sysctl Not tainted 3.14.0 #4
[    1.343175] Hardware name: GOOGLE Samus, BIOS Google_Samus.6300.174.0 04/02/2015
[    1.343182]  0000000000000000 00000000b3c5dbf9 ffff8802733cdcb8 ffffffff8f66f27d
[    1.343195]  0000000000000093 ffff8802733cdce0 ffffffff8f088cc4 ffff88027ec16838
[    1.343206]  0000000000000000 ffff8802734ac800 ffff8802733cdd80 ffffffff8f6769d1
[    1.343216] Call Trace:
[    1.343222]  [<ffffffff8f66f27d>] dump_stack+0x4d/0x6f
[    1.343228]  [<ffffffff8f088cc4>] __might_sleep+0x1bd/0x1c4
[    1.343234]  [<ffffffff8f6769d1>] mutex_lock_nested+0x4d/0x432
[    1.343241]  [<ffffffff8f0a511e>] ? lock_acquire+0x116/0x140
[    1.343247]  [<ffffffff8f109a4a>] ? perf_adjust_period+0x14e/0x14e
[    1.343254]  [<ffffffff8f10ab93>] perf_event_ctx_lock+0x42/0x67
[    1.343259]  [<ffffffff8f10ab93>] ? perf_event_ctx_lock+0x42/0x67
[    1.343265]  [<ffffffff8f10abc7>] perf_event_disable+0xf/0x27
[    1.343272]  [<ffffffff8f0e590e>] update_timers_all_cpus+0x9e/0xf5
[    1.343278]  [<ffffffff8f0e5965>] ? update_timers_all_cpus+0xf5/0xf5
[    1.343284]  [<ffffffff8f0e5e87>] proc_dowatchdog+0xe0/0x125
[    1.343290]  [<ffffffff8f0e5e87>] ? proc_dowatchdog+0xe0/0x125
[    1.343296]  [<ffffffff8f1b8bbe>] proc_sys_call_handler+0xb0/0xea
[    1.343302]  [<ffffffff8f1b8c0c>] proc_sys_write+0x14/0x16
[    1.343308]  [<ffffffff8f15daaa>] vfs_write+0xac/0xd6
[    1.343313]  [<ffffffff8f15e18c>] SyS_write+0x5f/0xa3
[    1.343321]  [<ffffffff8f6798dc>] system_call_fastpath+0x20/0x25


several hits for xhci_alloc_command() including this one


[    3.870052] BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/v3.14/mm/slub.c:965
[    3.870065] in_atomic(): 1, irqs_disabled(): 1, pid: 38, name: kworker/1:1
[    3.870072] 3 locks held by kworker/1:1/38:
[    3.870077]  #0:  ("pm"){.+.+.+}, at: [<ffffffff8f078d6a>] process_one_work+0x183/0x422
[    3.870096]  #1:  ((&dev->power.work)){+.+.+.}, at: [<ffffffff8f078d6a>] process_one_work+0x183/0x422
[    3.870110]  #2:  (&(&xhci->lock)->rlock){-.-...}, at: [<ffffffff8f46caf6>] xhci_stop_device.constprop.7+0x66/0x14f
[    3.870128] irq event stamp: 10408
[    3.870132] hardirqs last  enabled at (10407): [<ffffffff8f0a3d0c>] debug_check_no_locks_freed+0x91/0x135
[    3.870144] hardirqs last disabled at (10408): [<ffffffff8f6782ac>] _raw_spin_lock_irqsave+0x18/0x4f
[    3.870154] softirqs last  enabled at (10344): [<ffffffff8f064270>] __do_softirq+0x2ef/0x350
[    3.870165] softirqs last disabled at (10323): [<ffffffff8f0644d3>] irq_exit+0x44/0x9a
[    3.870176] CPU: 1 PID: 38 Comm: kworker/1:1 Not tainted 3.14.0 #4
[    3.870182] Hardware name: GOOGLE Samus, BIOS Google_Samus.6300.174.0 04/02/2015
[    3.870192] Workqueue: pm pm_runtime_work
[    3.870198]  0000000000000000 000000000662516c ffff88027308b840 ffffffff8f66f27d
[    3.870209]  0000000000000026 ffff88027308b868 ffffffff8f088cc4 0000000000008010
[    3.870220]  0000000000000030 ffffffff8f465ac9 ffff88027308b880 ffffffff8f14ad4e
[    3.870232] Call Trace:
[    3.870238]  [<ffffffff8f66f27d>] dump_stack+0x4d/0x6f
[    3.870246]  [<ffffffff8f088cc4>] __might_sleep+0x1bd/0x1c4
[    3.870255]  [<ffffffff8f465ac9>] ? kzalloc+0xf/0x11
[    3.870263]  [<ffffffff8f14ad4e>] slab_pre_alloc_hook.isra.40+0x2c/0x31
[    3.870271]  [<ffffffff8f14d2b1>] __kmalloc+0x62/0x11f
[    3.870278]  [<ffffffff8f465ac9>] kzalloc+0xf/0x11
[    3.870286]  [<ffffffff8f466cb9>] xhci_alloc_command+0x2a/0xca
[    3.870293]  [<ffffffff8f46cb1b>] xhci_stop_device.constprop.7+0x8b/0x14f
[    3.870301]  [<ffffffff8f46d6bc>] xhci_hub_control+0x8f8/0xd8c
[    3.870308]  [<ffffffff8f088bd4>] ? __might_sleep+0xcd/0x1c4
[    3.870316]  [<ffffffff8f14ad4e>] ? slab_pre_alloc_hook.isra.40+0x2c/0x31
[    3.870325]  [<ffffffff8f43e6b8>] usb_hcd_submit_urb+0x5a7/0x78f
[    3.870333]  [<ffffffff8f0a3d0c>] ? debug_check_no_locks_freed+0x91/0x135
[    3.870341]  [<ffffffff8f0a3a9f>] ? trace_hardirqs_on_caller+0x180/0x19c
[    3.870347]  [<ffffffff8f0a3ac8>] ? trace_hardirqs_on+0xd/0xf
[    3.870353]  [<ffffffff8f43f6a7>] usb_submit_urb+0x443/0x458
[    3.870359]  [<ffffffff8f44061c>] usb_start_wait_urb+0x6d/0xf7
[    3.870365]  [<ffffffff8f088bd4>] ? __might_sleep+0xcd/0x1c4
[    3.870371]  [<ffffffff8f440787>] usb_control_msg+0xe1/0x11d
[    3.870377]  [<ffffffff8f4368a7>] set_port_feature+0x35/0x37
[    3.870383]  [<ffffffff8f43952f>] usb_port_suspend+0x16c/0x237
[    3.870390]  [<ffffffff8f44a30f>] generic_suspend+0x26/0x2c
[    3.870395]  [<ffffffff8f4425f6>] usb_suspend_both+0x14b/0x1a0
[    3.870402]  [<ffffffff8f4435eb>] usb_runtime_suspend+0x33/0x66
[    3.870408]  [<ffffffff8f4435b8>] ? usb_probe_interface+0x222/0x222
[    3.870414]  [<ffffffff8f3e66d5>] __rpm_callback+0x39/0x60
[    3.870420]  [<ffffffff8f3e6770>] rpm_callback+0x74/0x8a
[    3.870426]  [<ffffffff8f3e6f69>] rpm_suspend+0x331/0x4ab
[    3.870432]  [<ffffffff8f3e7e8b>] pm_runtime_work+0x87/0x94
[    3.870437]  [<ffffffff8f078e45>] process_one_work+0x25e/0x422
[    3.870443]  [<ffffffff8f07a4b3>] worker_thread+0x143/0x202
[    3.870449]  [<ffffffff8f07a370>] ? rescuer_thread+0x2d6/0x2d6
[    3.870455]  [<ffffffff8f080245>] kthread+0xdb/0xe3
[    3.870461]  [<ffffffff8f08016a>] ? __kthread_parkme+0x67/0x67
[    3.870467]  [<ffffffff8f67982c>] ret_from_fork+0x7c/0xb0
[    3.870473]  [<ffffffff8f08016a>] ? __kthread_parkme+0x67/0x67


one for regmap_lock_mutex()

[    6.092869] =============================================
[    6.092874] [ INFO: possible recursive locking detected ]
[    6.092879] 3.14.0 #4 Not tainted
[    6.092882] ---------------------------------------------
[    6.092886] udevd/2055 is trying to acquire lock:
[    6.092891]  (&map->mutex){+.+...}, at: [<ffffffff8f3eac85>] regmap_lock_mutex+0x10/0x12
[    6.092904] 
               but task is already holding lock:
[    6.092909]  (&map->mutex){+.+...}, at: [<ffffffff8f3eac85>] regmap_lock_mutex+0x10/0x12
[    6.092918] 
               other info that might help us debug this:
[    6.092924]  Possible unsafe locking scenario:

[    6.092928]        CPU0
[    6.092931]        ----
[    6.092934]   lock(&map->mutex);
[    6.092938]   lock(&map->mutex);
[    6.092942] 
                *** DEADLOCK ***

[    6.092947]  May be due to missing lock nesting notation

[    6.092953] 3 locks held by udevd/2055:
[    6.092957]  #0:  (&__lockdep_no_validate__){......}, at: [<ffffffff8f3dcbbb>] __driver_attach+0x34/0x83
[    6.092969]  #1:  (&__lockdep_no_validate__){......}, at: [<ffffffff8f3dcbd3>] __driver_attach+0x4c/0x83
[    6.092980]  #2:  (&map->mutex){+.+...}, at: [<ffffffff8f3eac85>] regmap_lock_mutex+0x10/0x12
[    6.092990] 
               stack backtrace:
[    6.092996] CPU: 2 PID: 2055 Comm: udevd Not tainted 3.14.0 #4
[    6.093001] Hardware name: GOOGLE Samus, BIOS Google_Samus.6300.174.0 04/02/2015
[    6.093006]  0000000000000000 000000004c1a3847 ffff88024d9df920 ffffffff8f66f27d
[    6.093016]  ffffffff902ad060 ffff88024d9df9f0 ffffffff8f0a4b74 000000024d95cf20
[    6.093024]  ffff88024d95c710 ffffffff902ad060 ffffffff8f3eac00 00005e82f4b2c596
[    6.093033] Call Trace:
[    6.093039]  [<ffffffff8f66f27d>] dump_stack+0x4d/0x6f
[    6.093045]  [<ffffffff8f0a4b74>] __lock_acquire+0xdc4/0xf8b
[    6.093052]  [<ffffffff8f3eac00>] ? regmap_parse_16_be_inplace+0xb/0xf
[    6.093059]  [<ffffffff8f00f893>] ? save_stack_trace+0x2b/0x46
[    6.093065]  [<ffffffff8f0a1254>] ? save_trace+0x3b/0x91
[    6.093071]  [<ffffffff8f0a511e>] lock_acquire+0x116/0x140
[    6.093077]  [<ffffffff8f0a511e>] ? lock_acquire+0x116/0x140
[    6.093083]  [<ffffffff8f3eac85>] ? regmap_lock_mutex+0x10/0x12
[    6.093089]  [<ffffffff8f676a0d>] mutex_lock_nested+0x89/0x432
[    6.093095]  [<ffffffff8f3eac85>] ? regmap_lock_mutex+0x10/0x12
[    6.093100]  [<ffffffff8f3eac85>] ? regmap_lock_mutex+0x10/0x12
[    6.093107]  [<ffffffff8f676d0d>] ? mutex_lock_nested+0x389/0x432
[    6.093112]  [<ffffffff8f3eac85>] regmap_lock_mutex+0x10/0x12
[    6.093118]  [<ffffffff8f3eac85>] ? regmap_lock_mutex+0x10/0x12
[    6.093124]  [<ffffffff8f3ecb8b>] regmap_read+0x37/0x5f
[    6.093132]  [<ffffffffc03114e6>] rt5677_sel_asrc_clk_src+0x220d/0x270c [snd_soc_rt5677]
[    6.093139]  [<ffffffff8f3eca9a>] _regmap_read+0xa5/0x15f
[    6.093145]  [<ffffffff8f3ecb99>] regmap_read+0x45/0x5f
[    6.093151]  [<ffffffffc0310af0>] rt5677_sel_asrc_clk_src+0x1817/0x270c [snd_soc_rt5677]
[    6.093160]  [<ffffffff8f2d9a5f>] ? acpi_dev_pm_attach+0xa9/0xc4
[    6.093168]  [<ffffffffc03107b7>] ? rt5677_sel_asrc_clk_src+0x14de/0x270c [snd_soc_rt5677]
[    6.093178]  [<ffffffff8f4a320c>] i2c_device_probe+0xae/0xd2
[    6.093186]  [<ffffffff8f3dc9d3>] driver_probe_device+0xae/0x1fc
[    6.093193]  [<ffffffff8f3dcbe8>] __driver_attach+0x61/0x83
[    6.093200]  [<ffffffff8f3dcb87>] ? __device_attach_driver+0x66/0x66
[    6.093207]  [<ffffffff8f3dbb06>] bus_for_each_dev+0x8f/0xb2
[    6.093214]  [<ffffffff8f3dc448>] driver_attach+0x1e/0x20
[    6.093221]  [<ffffffff8f3dc165>] bus_add_driver+0xee/0x1e6
[    6.093228]  [<ffffffff8f3dd30f>] driver_register+0x8e/0xca
[    6.093236]  [<ffffffff8f4a2205>] i2c_register_driver+0x44/0xa3
[    6.093245]  [<ffffffffc011c000>] ? 0xffffffffc011bfff
[    6.093253]  [<ffffffffc011c017>] init_module+0x17/0x1000 [snd_soc_rt5677]
[    6.093261]  [<ffffffff8f0002c0>] do_one_initcall+0x93/0x132
[    6.093270]  [<ffffffff8f036fbc>] ? set_memory_nx+0x37/0x3a
[    6.093279]  [<ffffffff8f0ca9f0>] load_module+0x15fa/0x1ba7
[    6.093287]  [<ffffffff8f0cb115>] SyS_finit_module+0x7d/0xa2
[    6.093294]  [<ffffffff8f0cb115>] ? SyS_finit_module+0x7d/0xa2
[    6.093304]  [<ffffffff8f6798dc>] system_call_fastpath+0x20/0x25


Since 3.18 and 4.4 don't seem to hit these problems when enabling USE="lockdebug" I'm guessing there's some upstream solutions to these issues that we can backport


 
for the xhci splat it looks like we need:
be3de3210709 xhci: Fix sleeping with IRQs disabled in xhci_stop_device()

Cc: drinkcat@chromium.org vpalatin@chromium.org
The xhci one looks like a real regression which was caused by backporting
xhci: Use command structures when queuing commands on the command ring

the regmap one is not a real problem (I think) but probably needs:
3cfe7a74d42b regmap: Use different lockdep class for each regmap init call

which was authored by drinkcat@ :-)
Ok, the regmap one were clean after two additional cherry-picks

watchdog one needs:
bde92cf455a0 kernel/watchdog.c: remove preemption restrictions when restarting lockup detector

full series to fix all the boottime splat on samus is up here:
https://chromium-review.googlesource.com/547687
Project Member

Comment 4 by bugdroid1@chromium.org, Jun 27 2017

Labels: merge-merged-chromeos-3.14
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/64ac4d16bae7204c39217c6949b69fd93f50c89d

commit 64ac4d16bae7204c39217c6949b69fd93f50c89d
Author: Boris BREZILLON <boris.brezillon@free-electrons.com>
Date: Tue Jun 27 22:51:59 2017

UPSTREAM: regmap: i2c: fallback to SMBus if the adapter does not support standard I2C

Some I2C adapters are only compatible with the SMBus protocol and do not
support standard I2C transfers.

Fallback to SMBus transfers if we encounter such kind of adapters.
The transfer type is chosen according to the val_bits field in the regmap
config.

BUG= chromium:736671 
TEST=build/boot samus with USE="lockdebug"

Signed-off-by: Boris BREZILLON <boris.brezillon@free-electrons.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from commit b42261078a91db8a8307db42cad41a619077d1df)
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>

Change-Id: I369cc3ff37d616848c821b1aedb823d7ff5a6684
Reviewed-on: https://chromium-review.googlesource.com/547651
Commit-Ready: Sonny Rao <sonnyrao@chromium.org>
Tested-by: Sonny Rao <sonnyrao@chromium.org>
Reviewed-by: Nicolas Boichat <drinkcat@chromium.org>

[modify] https://crrev.com/64ac4d16bae7204c39217c6949b69fd93f50c89d/drivers/base/regmap/regmap-i2c.c

Project Member

Comment 5 by bugdroid1@chromium.org, Jun 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/64ac4d16bae7204c39217c6949b69fd93f50c89d

commit 64ac4d16bae7204c39217c6949b69fd93f50c89d
Author: Boris BREZILLON <boris.brezillon@free-electrons.com>
Date: Tue Jun 27 22:51:59 2017

UPSTREAM: regmap: i2c: fallback to SMBus if the adapter does not support standard I2C

Some I2C adapters are only compatible with the SMBus protocol and do not
support standard I2C transfers.

Fallback to SMBus transfers if we encounter such kind of adapters.
The transfer type is chosen according to the val_bits field in the regmap
config.

BUG= chromium:736671 
TEST=build/boot samus with USE="lockdebug"

Signed-off-by: Boris BREZILLON <boris.brezillon@free-electrons.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from commit b42261078a91db8a8307db42cad41a619077d1df)
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>

Change-Id: I369cc3ff37d616848c821b1aedb823d7ff5a6684
Reviewed-on: https://chromium-review.googlesource.com/547651
Commit-Ready: Sonny Rao <sonnyrao@chromium.org>
Tested-by: Sonny Rao <sonnyrao@chromium.org>
Reviewed-by: Nicolas Boichat <drinkcat@chromium.org>

[modify] https://crrev.com/64ac4d16bae7204c39217c6949b69fd93f50c89d/drivers/base/regmap/regmap-i2c.c

Project Member

Comment 6 by bugdroid1@chromium.org, Jun 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/eb26f6445a0ab39aca73e2f81eb5e2cffa0719da

commit eb26f6445a0ab39aca73e2f81eb5e2cffa0719da
Author: Mathias Nyman <mathias.nyman@linux.intel.com>
Date: Tue Jun 27 22:51:58 2017

UPSTREAM: xhci: Fix sleeping with IRQs disabled in xhci_stop_device()

xhci_stop_device() allocates and issues stop commands for each active endpoint.
This is done with spinlock held and interrupt disabled so we can't sleep during
memory allocation. Use GFP_NOWAIT instead

Regression from commit ddba5cd0aeff5bbed92ebdf4b1223300b0541e78
"xhci: Use command structures when queuing commands on the command ring"
for 3.16-rc1

BUG= chromium:736671 
TEST=build/boot samus with USE="lockdebug"

Fixes: ddba5cd0aeff ("xhci: Use command structures when queuing commands")
Reported-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit be3de32107091c266b00a48265fe7e06233af4f0)
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>

Change-Id: Ic1ab945e2d8ae1e3f6a4dce9e4c2849dae13e143
Reviewed-on: https://chromium-review.googlesource.com/547738
Commit-Ready: Sonny Rao <sonnyrao@chromium.org>
Tested-by: Sonny Rao <sonnyrao@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Vincent Palatin <vpalatin@chromium.org>

[modify] https://crrev.com/eb26f6445a0ab39aca73e2f81eb5e2cffa0719da/drivers/usb/host/xhci-hub.c

Project Member

Comment 7 by bugdroid1@chromium.org, Jun 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/77820775dc1feb161e667deae098d043f7cde29a

commit 77820775dc1feb161e667deae098d043f7cde29a
Author: Eric Dumazet <edumazet@google.com>
Date: Tue Jun 27 22:52:05 2017

UPSTREAM: ipv6: tcp: add rcu locking in tcp_v6_send_synack()

When first SYNACK is sent, we already hold rcu_read_lock(), but this
is not true if a SYNACK is retransmitted, as a timer (soft) interrupt
does not hold rcu_read_lock()

BUG= chromium:736671 
TEST=build/boot samus with USE="lockdebug"

Fixes: 45f6fad84cc30 ("ipv6: add complete rcu protection around np->opt")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3e4006f0b86a5ae5eb0e8215f9a9e1db24506977)
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>

Change-Id: Ibb2a4b6116084e4ce73855d2f49b0c38b0263550
Reviewed-on: https://chromium-review.googlesource.com/548974
Commit-Ready: Sonny Rao <sonnyrao@chromium.org>
Tested-by: Sonny Rao <sonnyrao@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>

[modify] https://crrev.com/77820775dc1feb161e667deae098d043f7cde29a/net/ipv6/tcp_ipv6.c

Status: Fixed (was: Unconfirmed)

Comment 9 by dchan@chromium.org, Jan 22 2018

Status: Archived (was: Fixed)

Sign in to add a comment