kevin: psr deadlock |
|||
Issue description
Shelley reports she has encountered this in her dmesg:
Looks like a deadlock between psr flush handler and psr worker.
[ 220.118778] BUG: sleeping function called from invalid context at ../../../../7
[ 220.136132] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/2
[ 220.136143] 1 lock held by swapper/2/0:
[ 220.136192] #0: (((&psr->flush_timer))){+.-...}, at: [<ffffffc000298a98>] ca4
[ 220.136208] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.4.14 #83
[ 220.136214] Hardware name: Google Kevin (DT)
[ 220.136221] Call trace:
[ 220.136239] [<ffffffc0002088b0>] dump_backtrace+0x0/0x160
[ 220.136254] [<ffffffc000208a30>] show_stack+0x20/0x28
[ 220.136269] [<ffffffc0004fe7ec>] dump_stack+0xb4/0xf0
[ 220.136284] [<ffffffc00024e7d8>] ___might_sleep+0x1f4/0x204
[ 220.136297] [<ffffffc00024e868>] __might_sleep+0x80/0x90
[ 220.136314] [<ffffffc000978830>] mutex_lock_nested+0x50/0x3f4
[ 220.136329] [<ffffffc0005fe78c>] psr_set_state+0x30/0x90
[ 220.136342] [<ffffffc0005fe81c>] psr_flush_handler+0x30/0x3c
[ 220.136354] [<ffffffc000298c44>] call_timer_fn+0x1ac/0x424
[ 220.136366] [<ffffffc0002991c4>] run_timer_softirq+0x2dc/0x38c
[ 220.136378] [<ffffffc000200a0c>] __do_softirq+0x25c/0x55c
[ 220.136391] [<ffffffc000226584>] irq_exit+0x78/0xd8
[ 220.136406] [<ffffffc000283838>] __handle_domain_irq+0x9c/0xbc
[ 220.136417] [<ffffffc0002006f4>] gic_handle_irq+0xcc/0x188
[ 220.136426] Exception stack(0xffffffc0ee04bde0 to 0xffffffc0ee04bf00)
[ 220.136441] bde0: ffffffc00142efe8 ffffffc00142e000 ffffffc0ee04bf40 ffffffc008
[ 220.136453] be00: 0000000060000145 0000000000000000 000000000239d000 ffffffc00c
[ 220.136466] be20: ffffffc0ee03ab00 0000000000000004 0000000000000000 0000000000
[ 220.136478] be40: 0100000000000000 0000000000000028 0000000000000000 ffffffc008
[ 220.136491] be60: ffffffc0ee03bcb0 ffffffc0ee04beb0 0000000000001150 0000000000
[ 220.136503] be80: 00000000000000f8 00000000fffd0c80 00000000000d3c05 0000000000
[ 220.136516] bea0: ffffffc000225450 0000000000000000 0000000000000000 ffffffc008
[ 220.136529] bec0: ffffffc00142e000 0000000000000002 ffffffc001349820 0000000000
[ 220.136541] bee0: 0000000000000000 000000000239d000 00000000023a0000 ffffffc000
[ 220.136554] [<ffffffc0002035ac>] el1_irq+0x6c/0xe0
[ 220.136571] [<ffffffc00026b7a8>] default_idle_call+0x30/0x38
[ 220.136585] [<ffffffc00026baa8>] cpu_startup_entry+0x284/0x418
[ 220.136599] [<ffffffc00020df3c>] secondary_start_kernel+0x154/0x160
[ 220.136609] [<0000000000400e7c>] 0x400e7c
[ 220.136680]
[ 220.136686] =================================
[ 220.136690] [ INFO: inconsistent lock state ]
[ 220.136698] 4.4.14 #83 Not tainted
[ 220.136702] ---------------------------------
[ 220.136709] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 220.136721] swapper/2/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
[ 220.136750] (&psr->state_mutex){+.?...}, at: [<ffffffc0005fe78c>] psr_set_sta0
[ 220.136755] {SOFTIRQ-ON-W} state was registered at:
[ 220.136774] [<ffffffc000273bfc>] mark_lock+0x2ac/0x644
[ 220.136791] [<ffffffc000274d40>] __lock_acquire+0x880/0x1958
[ 220.136807] [<ffffffc00027665c>] lock_acquire+0x1b4/0x278
[ 220.136825] [<ffffffc00097886c>] mutex_lock_nested+0x8c/0x3f4
[ 220.136841] [<ffffffc0005fe78c>] psr_set_state+0x30/0x90
[ 220.136857] [<ffffffc0005fe898>] rockchip_drm_psr_disable+0x30/0x40
[ 220.136873] [<ffffffc0005fef1c>] vop_psr_work+0x30/0x3c
[ 220.136893] [<ffffffc00023d7a8>] process_one_work+0x360/0x6ec
[ 220.136910] [<ffffffc00023edd0>] worker_thread+0x300/0x428
[ 220.136926] [<ffffffc0002450a0>] kthread+0x108/0x110
[ 220.136943] [<ffffffc000203c50>] ret_from_fork+0x10/0x40
[ 220.136949] irq event stamp: 729128
[ 220.136966] hardirqs last enabled at (729128): [<ffffffc0002035cc>] el1_irq+00
[ 220.136983] hardirqs last disabled at (729127): [<ffffffc00020359c>] el1_irq+00
[ 220.137000] softirqs last enabled at (729102): [<ffffffc000226124>] _local_bh0
[ 220.137016] softirqs last disabled at (729103): [<ffffffc000226584>] irq_exit+8
[ 220.137020]
[ 220.137020] other info that might help us debug this:
[ 220.137025] Possible unsafe locking scenario:
[ 220.137025]
[ 220.137029] CPU0
[ 220.137032] ----
[ 220.137043] lock(&psr->state_mutex);
[ 220.137046] <Interrupt>
[ 220.137057] lock(&psr->state_mutex);
[ 220.137060]
[ 220.137060] *** DEADLOCK ***
[ 220.137060]
[ 220.137068] 1 lock held by swapper/2/0:
[ 220.137097] #0: (((&psr->flush_timer))){+.-...}, at: [<ffffffc000298a98>] ca4
[ 220.137102]
[ 220.137102] stack backtrace:
[ 220.137113] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.4.14 #83
[ 220.137119] Hardware name: Google Kevin (DT)
[ 220.137123] Call trace:
[ 220.137138] [<ffffffc0002088b0>] dump_backtrace+0x0/0x160
[ 220.137151] [<ffffffc000208a30>] show_stack+0x20/0x28
[ 220.137164] [<ffffffc0004fe7ec>] dump_stack+0xb4/0xf0
[ 220.137179] [<ffffffc000317fb0>] print_usage_bug.part.34+0x264/0x27c
[ 220.137191] [<ffffffc000273d64>] mark_lock+0x414/0x644
[ 220.137204] [<ffffffc000274d04>] __lock_acquire+0x844/0x1958
[ 220.137217] [<ffffffc00027665c>] lock_acquire+0x1b4/0x278
[ 220.137230] [<ffffffc00097886c>] mutex_lock_nested+0x8c/0x3f4
[ 220.137243] [<ffffffc0005fe78c>] psr_set_state+0x30/0x90
[ 220.137255] [<ffffffc0005fe81c>] psr_flush_handler+0x30/0x3c
[ 220.137267] [<ffffffc000298c44>] call_timer_fn+0x1ac/0x424
[ 220.137279] [<ffffffc0002991c4>] run_timer_softirq+0x2dc/0x38c
[ 220.137290] [<ffffffc000200a0c>] __do_softirq+0x25c/0x55c
[ 220.137301] [<ffffffc000226584>] irq_exit+0x78/0xd8
[ 220.137313] [<ffffffc000283838>] __handle_domain_irq+0x9c/0xbc
[ 220.137325] [<ffffffc0002006f4>] gic_handle_irq+0xcc/0x188
[ 220.137333] Exception stack(0xffffffc0ee04bde0 to 0xffffffc0ee04bf00)
[ 220.137347] bde0: ffffffc00142efe8 ffffffc00142e000 ffffffc0ee04bf40 ffffffc008
[ 220.137360] be00: 0000000060000145 0000000000000000 000000000239d000 ffffffc00c
[ 220.137372] be20: ffffffc0ee03ab00 0000000000000004 0000000000000000 0000000000
[ 220.137384] be40: 0100000000000000 0000000000000028 0000000000000000 ffffffc008
[ 220.137397] be60: ffffffc0ee03bcb0 ffffffc0ee04beb0 0000000000001150 0000000000
[ 220.137409] be80: 00000000000000f8 00000000fffd0c80 00000000000d3c05 0000000000
[ 220.137422] bea0: ffffffc000225450 0000000000000000 0000000000000000 ffffffc008
[ 220.137434] bec0: ffffffc00142e000 0000000000000002 ffffffc001349820 0000000000
[ 220.137446] bee0: 0000000000000000 000000000239d000 00000000023a0000 ffffffc000
[ 220.137458] [<ffffffc0002035ac>] el1_irq+0x6c/0xe0
[ 220.137474] [<ffffffc00026b7a8>] default_idle_call+0x30/0x38
[ 220.137489] [<ffffffc00026baa8>] cpu_startup_entry+0x284/0x418
[ 220.137501] [<ffffffc00020df3c>] secondary_start_kernel+0x154/0x160
[ 220.137509] [<0000000000400e7c>] 0x400e7c
[ 241.886897] usb 7-1.1: USB disconnect, device number 3
[ 243.391296] reboot: Restarting system
,
Jan 8 2018
Sean do you have time to look at this one?
,
Jan 8 2018
ah fun. i'll take a look
,
Jan 23 2018
Given that this code has had significant improvements/changes since the initial report, I'm going to mark this obsolete. Without a repro or current trace, this will be a wild goose chase. |
|||
►
Sign in to add a comment |
|||
Comment 1 by dtapu...@chromium.org
, Jan 8 2018