
Issue 698283


Issue metadata

Status: Fixed
Owner:
Closed: May 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




Repeated bind/unbind of usb platform driver can result in kernel loop and hangup

Project Member Reported by groeck@chromium.org, Mar 3 2017

Issue description

The script

while true
do
    for i in /sys/bus/platform/drivers/rockchip-dwc3/usb*; do
        basename "$i" > "$(dirname "$i")/unbind"
        basename "$i" > "$(dirname "$i")/bind"
    done
    sleep 1
done

running on kevin with an Ethernet dongle on one Type-C port and an Apple dongle on the other results in the following error log. Note that "sleep 1" is essential to reproduce the problem.
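For longer stress runs it can help to bound the loop and pass the driver directory as a parameter. The following is a minimal sketch of such a variant, not the script used in the report; the function name rebind_loop and its three arguments (driver directory, iteration count, delay in seconds) are illustrative assumptions.

```shell
# Bounded variant of the bind/unbind loop (illustrative sketch).
# Usage: rebind_loop DRIVER_DIR COUNT DELAY
rebind_loop() {
    driver_dir=$1
    count=$2
    delay=$3
    n=0
    while [ "$n" -lt "$count" ]; do
        for dev in "$driver_dir"/usb*; do
            [ -e "$dev" ] || continue      # glob matched nothing
            name=$(basename "$dev")
            # Same writes as the original script: unbind, then bind.
            printf '%s\n' "$name" > "$driver_dir/unbind"
            printf '%s\n' "$name" > "$driver_dir/bind"
        done
        sleep "$delay"
        n=$((n + 1))
    done
}
```

On the device this would be called as "rebind_loop /sys/bus/platform/drivers/rockchip-dwc3 100 1"; the delay of 1 mirrors the "sleep 1" that the report describes as essential.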

[   38.368406] hub 5-0:1.0: USB hub found
[   38.368568] hub 5-0:1.0: 1 port detected
[   38.372317] xhci-hcd xhci-hcd.2.auto: xHCI Host Controller
[   38.374331] xhci-hcd xhci-hcd.2.auto: new USB bus registered, assigned bus number 6
[   38.374734] usb usb6: We don't know the algorithms for LPM for this host, disabling LPM.
[   38.375526] usb usb6: New USB device found, idVendor=1d6b, idProduct=0003
[   38.375536] usb usb6: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   38.375545] usb usb6: Product: xHCI Host Controller
[   38.375555] usb usb6: Manufacturer: Linux 4.4.52 xhci-hcd
[   38.375563] usb usb6: SerialNumber: xhci-hcd.2.auto
[   38.382576] hub 6-0:1.0: USB hub found
[   38.382742] hub 6-0:1.0: 1 port detected
[   38.386817] rockchip-dwc3 usb@fe800000: USB HOST connected
[   38.674057] usb 5-1: new high-speed USB device number 2 using xhci-hcd
[   38.843068] xhci-hcd xhci-hcd.3.auto: xHCI Host Controller
[   38.851004] xhci-hcd xhci-hcd.3.auto: new USB bus registered, assigned bus number 7
[   38.861667] xhci-hcd xhci-hcd.3.auto: hcc params 0x0220fe64 hci version 0x110 quirks 0x02030010
[   38.870529] xhci-hcd xhci-hcd.3.auto: irq 229, io mem 0xfe900000
[   38.877588] usb usb7: New USB device found, idVendor=1d6b, idProduct=0002
[   38.884448] usb usb7: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   38.892301] usb usb7: Product: xHCI Host Controller
[   38.897230] usb usb7: Manufacturer: Linux 4.4.52 xhci-hcd
[   38.902699] usb usb7: SerialNumber: xhci-hcd.3.auto
[   38.914943] hub 7-0:1.0: USB hub found
[   38.918925] hub 7-0:1.0: 1 port detected
[   38.927500] xhci-hcd xhci-hcd.3.auto: xHCI Host Controller
[   38.935761] xhci-hcd xhci-hcd.3.auto: new USB bus registered, assigned bus number 8
[   38.944251] usb usb8: We don't know the algorithms for LPM for this host, disabling LPM.
[   38.953559] usb usb8: New USB device found, idVendor=1d6b, idProduct=0003
[   38.960430] usb usb8: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   38.967746] usb usb8: Product: xHCI Host Controller
[   38.972767] usb usb8: Manufacturer: Linux 4.4.52 xhci-hcd
[   38.978242] usb usb8: SerialNumber: xhci-hcd.3.auto
[   38.990764] hub 8-0:1.0: USB hub found
[   38.995332] hub 8-0:1.0: 1 port detected
[   39.003733] rockchip-dwc3 usb@fe900000: USB HOST connected
[   39.229094] usb 7-1: new high-speed USB device number 2 using xhci-hcd
[   39.354214] usb 7-1: New USB device found, idVendor=05ac, idProduct=100f
[   39.360986] usb 7-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[   39.368188] usb 7-1: Product: USB2.0 Hub
[   39.372174] usb 7-1: Manufacturer: Apple Inc.
[   39.396276] hub 7-1:1.0: USB hub found
[   39.401648] hub 7-1:1.0: 2 ports detected
[   39.457358] usb 8-1: new SuperSpeed USB device number 2 using xhci-hcd
[   39.478276] usb 8-1: New USB device found, idVendor=05ac, idProduct=100e
[   39.485027] usb 8-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[   39.492403] usb 6-1: new SuperSpeed USB device number 2 using xhci-hcd
[   39.498993] usb 8-1: Product: USB3.0 Hub
[   39.503554] usb 8-1: Manufacturer: Apple Inc.
[   39.512387] usb 6-1: New USB device found, idVendor=0bda, idProduct=8153
[   39.519153] usb 6-1: New USB device strings: Mfr=1, Product=2, SerialNumber=6
[   39.527075] usb 6-1: Product: USB 10/100/1000 LAN
[   39.535486] usb 6-1: Manufacturer: Realtek
[   39.539946] usb 6-1: SerialNumber: 000001000000
[   39.548097] hub 8-1:1.0: USB hub found
[   39.554955] hub 8-1:1.0: 1 port detected
[   39.557938] xhci-hcd xhci-hcd.2.auto: remove, state 1
[   39.558015] usb usb6: USB disconnect, device number 1
[   39.592175] usb 6-1: USB disconnect, device number 2
[   39.603365] xhci-hcd xhci-hcd.2.auto: USB bus 6 deregistered
[   39.610368] xhci-hcd xhci-hcd.2.auto: remove, state 4
[   39.615544] usb usb5: USB disconnect, device number 1
[   39.623007] xhci-hcd xhci-hcd.2.auto: USB bus 5 deregistered
[   39.699059] xhci-hcd xhci-hcd.3.auto: remove, state 1
[   39.704452] usb usb8: USB disconnect, device number 1
[   39.710088] usb 8-1: USB disconnect, device number 2
[   39.730918] xhci-hcd xhci-hcd.3.auto: USB bus 8 deregistered
[   39.737260] xhci-hcd xhci-hcd.3.auto: remove, state 1
[   39.741112] usb 7-1.2: new low-speed USB device number 3 using xhci-hcd
[   39.741127] usb 7-1.2: hub failed to enable device, error -108
[   39.741210] usb 7-1-port2: cannot disable (err = -22)
[   39.741479] usb 7-1-port2: couldn't allocate usb_device
[   39.741525] usb 7-1-port2: cannot disable (err = -22)
[   39.741729] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.741780] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.741802] hub 7-1:1.0: activate --> -22
[   39.741919] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.741998] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.742009] hub 7-1:1.0: activate --> -22
[   39.742118] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.742167] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.742181] hub 7-1:1.0: activate --> -22

[   39.742286] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.742337] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.742348] hub 7-1:1.0: activate --> -22
[   39.742451] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.742501] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.742514] hub 7-1:1.0: activate --> -22
[   39.742625] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.742673] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.742686] hub 7-1:1.0: activate --> -22
[   39.742791] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.742845] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.742855] hub 7-1:1.0: activate --> -22
[   39.742983] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.743033] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.743043] hub 7-1:1.0: activate --> -22
[   39.743147] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
[   39.743195] hub 7-1:1.0: hub_ext_port_status failed (err = -22)
** 57 printk messages dropped ** [   39.746522] hub 7-1:1.0: activate --> -22
** 82 printk messages dropped ** [   39.751272] hub 7-1:1.0: hub_ext_port_status failed (err = -22)

[ and so on ]
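To get a rough idea of how long the loop ran in a saved log, the repeating messages and the dropped-message counters can be tallied. This is a small sketch that reads a log on standard input; the function name loop_stats and the output format are assumptions made for illustration.

```shell
# Tally the looping hub messages in a kernel log read from stdin.
loop_stats() {
    awk '
        /hub_ext_port_status failed/ { status++ }
        /activate --> -/             { activate++ }
        /printk messages dropped/ {
            # Lines look like "** 57 printk messages dropped ** ..."
            for (i = 1; i < NF; i++)
                if ($i == "**" && $(i + 2) == "printk")
                    dropped += $(i + 1)
        }
        END {
            printf "status=%d activate=%d dropped=%d\n",
                   status, activate, dropped
        }
    '
}
```

It can be fed from a file, for example "loop_stats < console-ramoops".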


 
The same script also resulted in a kernel lock-up.

[  360.654183] INFO: task kworker/3:0:25 blocked for more than 120 seconds.
[  360.660939]       Not tainted 4.4.52 #410
[  360.665025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.672879] kworker/3:0     D ffffffc000205d60     0    25      2 0x00000000
[  360.680075] Workqueue: usb_hub_wq hub_event
[  360.684317] Call trace:
[  360.686860] [<ffffffc000205d60>] __switch_to+0xf4/0x108
[  360.692206] [<ffffffc000c0021c>] __schedule+0x6a8/0xa94
[  360.697488] [<ffffffc000c006f4>] schedule+0xec/0x11c
[  360.702502] [<ffffffc000c00bd0>] schedule_preempt_disabled+0x34/0x5c
[  360.708875] [<ffffffc000c02fcc>] __mutex_lock_slowpath+0x170/0x270
[  360.715064] [<ffffffc000c03118>] mutex_lock+0x4c/0x78
[  360.720137] [<ffffffc0007fea8c>] hub_event+0x7c/0x12a4
[  360.725301] [<ffffffc000249bb8>] process_one_work+0x390/0x6b8
[  360.731068] [<ffffffc00024abcc>] worker_thread+0x480/0x610
[  360.736562] [<ffffffc000251a80>] kthread+0x164/0x178
[  360.741661] [<ffffffc0002045d0>] ret_from_fork+0x10/0x40
[  360.746985]   task                        PC stack   pid father
[  360.752945] kworker/3:0     D ffffffc000205d60     0    25      2 0x00000000
[  360.760611] Workqueue: usb_hub_wq hub_event
[  360.764838] Call trace:
[  360.767317] [<ffffffc000205d60>] __switch_to+0xf4/0x108
[  360.772571] [<ffffffc000c0021c>] __schedule+0x6a8/0xa94
[  360.777807] [<ffffffc000c006f4>] schedule+0xec/0x11c
[  360.782796] [<ffffffc000c00bd0>] schedule_preempt_disabled+0x34/0x5c
[  360.789159] [<ffffffc000c02fcc>] __mutex_lock_slowpath+0x170/0x270
[  360.795358] [<ffffffc000c03118>] mutex_lock+0x4c/0x78
[  360.800422] [<ffffffc0007fea8c>] hub_event+0x7c/0x12a4
[  360.805586] [<ffffffc000249bb8>] process_one_work+0x390/0x6b8
[  360.811345] [<ffffffc00024abcc>] worker_thread+0x480/0x610
[  360.816852] [<ffffffc000251a80>] kthread+0x164/0x178
[  360.821829] [<ffffffc0002045d0>] ret_from_fork+0x10/0x40
[  360.827380] kworker/2:3     D ffffffc000205d60     0  4734      2 0x00000000
[  360.834497] Workqueue: events driver_set_config_work
[  360.839498] Call trace:
[  360.841972] [<ffffffc000205d60>] __switch_to+0xf4/0x108
[  360.847222] [<ffffffc000c0021c>] __schedule+0x6a8/0xa94
[  360.852456] [<ffffffc000c006f4>] schedule+0xec/0x11c
[  360.857444] [<ffffffc000c0489c>] schedule_timeout+0x4c/0x32c
[  360.863559] [<ffffffc000c01744>] wait_for_common+0x18c/0x250
[  360.869248] [<ffffffc000c01830>] wait_for_completion+0x28/0x34
[  360.875094] [<ffffffc000847fd4>] xhci_discover_or_reset_device+0x1a4/0x3f8
[  360.881995] [<ffffffc0007fa9a8>] hub_port_reset+0x42c/0x6cc
[  360.887578] [<ffffffc0007fad3c>] hub_port_init+0xf4/0xd10
[  360.893294] [<ffffffc0007fbab4>] usb_reset_and_verify_device+0x15c/0x82c
[  360.900030] [<ffffffc0007fc268>] usb_reset_device+0xe4/0x298
[  360.905744] [<ffffffbffc0e3fcc>] rtl8152_probe+0x84/0x9b0 [r8152]
[  360.911853] [<ffffffc00080c734>] usb_probe_interface+0x244/0x2f8
[  360.917888] [<ffffffc0007746ac>] driver_probe_device+0x180/0x3b4
[  360.923904] [<ffffffc000774ad0>] __device_attach_driver+0xb4/0xe0
[  360.930022] [<ffffffc000771df0>] bus_for_each_drv+0xb4/0xe4
[  360.935615] [<ffffffc000774474>] __device_attach+0xd0/0x158
[  360.941211] [<ffffffc000774d08>] device_initial_probe+0x24/0x30
[  360.947140] [<ffffffc00077365c>] bus_probe_device+0x50/0xe4
[  360.952732] [<ffffffc000770858>] device_add+0x414/0x738
[  360.957966] [<ffffffc000809c90>] usb_set_configuration+0x89c/0x914
[  360.965068] [<ffffffc000809dc8>] driver_set_config_work+0xc0/0xf0
[  360.971186] [<ffffffc000249bb8>] process_one_work+0x390/0x6b8
[  360.976953] [<ffffffc00024abcc>] worker_thread+0x480/0x610
[  360.982448] [<ffffffc000251a80>] kthread+0x164/0x178
[  360.987433] [<ffffffc0002045d0>] ret_from_fork+0x10/0x40
[  360.992779] udevd           D ffffffc000205d60     0 10351    198 0x00400001
...
[  361.426746] basename        D ffffffc000205d60     0 18166   3271 0x00400009
[  361.433854] Call trace:
[  361.436325] [<ffffffc000205d60>] __switch_to+0xf4/0x108
[  361.441559] [<ffffffc000c0021c>] __schedule+0x6a8/0xa94
[  361.446800] [<ffffffc000c006f4>] schedule+0xec/0x11c
[  361.451781] [<ffffffc000c00bd0>] schedule_preempt_disabled+0x34/0x5c
[  361.458153] [<ffffffc000c02fcc>] __mutex_lock_slowpath+0x170/0x270
[  361.464342] [<ffffffc000c03118>] mutex_lock+0x4c/0x78
[  361.469416] [<ffffffc0007fd0c8>] usb_disconnect+0x74/0x28c
[  361.474917] [<ffffffc0007fd120>] usb_disconnect+0xcc/0x28c
[  361.481146] [<ffffffc0008032bc>] usb_remove_hcd+0x10c/0x2a8
[  361.486750] [<ffffffc00085aeb8>] xhci_plat_remove+0xa8/0xf0
[  361.492353] [<ffffffc000777328>] platform_drv_remove+0x48/0x6c
[  361.498199] [<ffffffc000774c08>] __device_release_driver+0x10c/0x1a8
[  361.504568] [<ffffffc000774cd0>] device_release_driver+0x2c/0x40
[  361.510592] [<ffffffc0007738c8>] bus_remove_device+0x1d8/0x200
[  361.516437] [<ffffffc00076f98c>] device_del+0x218/0x2d0
[  361.521670] [<ffffffc000777110>] platform_device_del+0x2c/0xd4
[  361.527516] [<ffffffc0007771d8>] platform_device_unregister+0x20/0x34
[  361.533961] [<ffffffc000820b34>] dwc3_host_exit+0xbc/0xd0
[  361.539386] [<ffffffc00081ca7c>] dwc3_remove+0x90/0xe4
[  361.544536] [<ffffffc000777328>] platform_drv_remove+0x48/0x6c
[  361.550383] [<ffffffc000774c08>] __device_release_driver+0x10c/0x1a8
[  361.556741] [<ffffffc000774cd0>] device_release_driver+0x2c/0x40
[  361.562760] [<ffffffc0007738c8>] bus_remove_device+0x1d8/0x200
[  361.568597] [<ffffffc00076f98c>] device_del+0x218/0x2d0
[  361.573837] [<ffffffc000777110>] platform_device_del+0x2c/0xd4
[  361.579686] [<ffffffc0007771d8>] platform_device_unregister+0x20/0x34
[  361.586581] [<ffffffc000997124>] of_platform_device_destroy+0x8c/0xf4
[  361.593037] [<ffffffc00076e910>] device_for_each_child+0x88/0xbc
[  361.599059] [<ffffffc000997070>] of_platform_depopulate+0x54/0x7c
[  361.605175] [<ffffffc000821c00>] dwc3_rockchip_remove+0x94/0x158

This suggests that insertion and removal might interfere with each other.
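To see at a glance which tasks the hung-task dump shows in uninterruptible sleep (state D), the task header lines can be filtered out of the log. The sketch below assumes the seven-column header layout seen in this dump (name, state, PC, stack, pid, parent pid, flags); the function name blocked_tasks is hypothetical.

```shell
# Print "name pid" for every D-state task header in a log on stdin.
blocked_tasks() {
    awk '
        { sub(/^\[ *[0-9.]+\] */, "") }         # drop the printk timestamp
        $2 == "D" && NF == 7 { print $1, $5 }   # task name and pid
    ' | LC_ALL=C sort -u
}
```

Duplicated headers, such as the repeated kworker/3:0 entry, collapse into a single line.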

Complete log attached.

console-ramoops (255 KB)
Call traces for looping code:

[   26.619417] Call trace:
[   26.619422] [<ffffffc0007f89b4>] hub_ext_port_status+0x138/0x23c
[   26.619427] [<ffffffc0007f8af8>] hub_port_status+0x40/0x4c
[   26.619431] [<ffffffc0007fc8e0>] hub_activate+0x368/0x7b8
[   26.619436] [<ffffffc0007fceb4>] hub_resume+0x2c/0x3c
[   26.619441] [<ffffffc00080b3b8>] usb_resume_interface.isra.6+0x128/0x158
[   26.619445] [<ffffffc00080b5d0>] usb_suspend_both+0x1e8/0x288
[   26.619450] [<ffffffc00080c9c4>] usb_runtime_suspend+0x3c/0x98
[   26.619456] [<ffffffc0007820a0>] __rpm_callback+0x48/0x7c
[   26.619461] [<ffffffc00078217c>] rpm_callback+0xa8/0xd4
[   26.619466] [<ffffffc000782acc>] rpm_suspend+0x35c/0x684
[   26.619471] [<ffffffc000784124>] __pm_runtime_suspend+0x60/0xac
[   26.619475] [<ffffffc00080ca80>] usb_runtime_idle+0x30/0x40
[   26.619480] [<ffffffc0007820a0>] __rpm_callback+0x48/0x7c
[   26.619484] [<ffffffc0007830b0>] rpm_idle+0x1e8/0x498
[   26.619489] [<ffffffc00078429c>] pm_runtime_work+0x88/0xcc
[   26.619498] [<ffffffc000249bb8>] process_one_work+0x390/0x6b8
[   26.619503] [<ffffffc00024abcc>] worker_thread+0x480/0x610
[   26.619507] [<ffffffc000251a80>] kthread+0x164/0x178
[   26.619512] [<ffffffc0002045d0>] ret_from_fork+0x10/0x40
[   26.619536] hub 7-1:1.0: hub_ext_port_status failed (err = -22)

[   37.020885] Call trace:
[   37.020896] [<ffffffc0007fcc48>] hub_activate+0x6d0/0x7b8
[   37.020905] [<ffffffc0007fceb4>] hub_resume+0x2c/0x3c
[   37.020915] [<ffffffc00080b3b8>] usb_resume_interface.isra.6+0x128/0x158
[   37.020924] [<ffffffc00080b5d0>] usb_suspend_both+0x1e8/0x288
[   37.020933] [<ffffffc00080c9c4>] usb_runtime_suspend+0x3c/0x98
[   37.020944] [<ffffffc0007820a0>] __rpm_callback+0x48/0x7c
[   37.020955] [<ffffffc00078217c>] rpm_callback+0xa8/0xd4
[   37.020966] [<ffffffc000782acc>] rpm_suspend+0x35c/0x684
[   37.020977] [<ffffffc000784124>] __pm_runtime_suspend+0x60/0xac
[   37.020987] [<ffffffc00080ca80>] usb_runtime_idle+0x30/0x40
[   37.020998] [<ffffffc0007820a0>] __rpm_callback+0x48/0x7c
[   37.021008] [<ffffffc0007830b0>] rpm_idle+0x1e8/0x498
[   37.021020] [<ffffffc00078429c>] pm_runtime_work+0x88/0xcc
[   37.021031] [<ffffffc000249bb8>] process_one_work+0x390/0x6b8
[   37.021042] [<ffffffc00024abcc>] worker_thread+0x480/0x610
[   37.021051] [<ffffffc000251a80>] kthread+0x164/0x178
[   37.021061] [<ffffffc0002045d0>] ret_from_fork+0x10/0x40
[   37.021089] hub 7-1:1.0: activate --> -22

Effectively, suspend fails because the device has been disconnected, and the subsequent resume fails for the same reason. With the right timing, this continues forever.

Additional information: Console output has to be active for the problem to be observed. If console output is enabled, the kernel has time to handle the actual disconnect event, and the loop stops at that point.

Call trace for initial failure:

[   28.726991] Call trace:
[   28.729465] [<ffffffc0007fb21c>] hub_port_init+0x538/0xd38
[   28.734966] [<ffffffc0007ff65c>] hub_event+0xb04/0x134c
[   28.740215] [<ffffffc000249bb8>] process_one_work+0x390/0x6b8
[   28.745977] [<ffffffc00024abcc>] worker_thread+0x480/0x610
[   28.751479] [<ffffffc000251a80>] kthread+0x164/0x178
[   28.756457] [<ffffffc0002045d0>] ret_from_fork+0x10/0x40
[   28.762271] usb 7-1.2: hub failed to enable device, error -108 
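The numeric error codes in these logs are negated Linux errno values. In the generic Linux numbering, 19 is ENODEV, 22 is EINVAL and 108 is ESHUTDOWN. A small lookup helper, sketched here with the hypothetical name errno_name and covering only the codes that appear in this report, makes the logs easier to read.

```shell
# Map the errno values seen in this report to their symbolic names.
# Accepts either "-108" or "108".
errno_name() {
    code=${1#-}
    case $code in
        19)  echo ENODEV ;;
        22)  echo EINVAL ;;
        108) echo ESHUTDOWN ;;
        *)   echo "unknown($code)" ;;
    esac
}
```

With this, "error -108" from hub_port_init reads as ESHUTDOWN, and the repeating "err = -22" reads as EINVAL.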

Status: Started (was: Assigned)
Updated traceback:

[<ffffffc0007fc8e0>] hub_activate+0x368/0x7b8
[<ffffffc0007fceb4>] hub_resume+0x2c/0x3c
[<ffffffc00080b3b8>] usb_resume_interface.isra.6+0x128/0x158
[<ffffffc00080b5d0>] usb_suspend_both+0x1e8/0x288
[<ffffffc00080c9c4>] usb_runtime_suspend+0x3c/0x98
[<ffffffc0007820a0>] __rpm_callback+0x48/0x7c
[<ffffffc00078217c>] rpm_callback+0xa8/0xd4
[<ffffffc000786234>] rpm_suspend+0x84/0x758
[<ffffffc000786ca4>] rpm_idle+0x2c8/0x498
[<ffffffc000786ed4>] __pm_runtime_idle+0x60/0xac
[<ffffffc00080eba8>] usb_autopm_put_interface+0x6c/0x7c
[<ffffffc000803798>] hub_event+0x10ac/0x12ac
[<ffffffc000249bb8>] process_one_work+0x390/0x6b8
[<ffffffc00024abcc>] worker_thread+0x480/0x610
[<ffffffc000251a80>] kthread+0x164/0x178
[<ffffffc0002045d0>] ret_from_fork+0x10/0x40

In the loop condition, hub_activate() schedules another hub event. This repeats forever.

Project Member

Comment 6 by bugdroid1@chromium.org, Mar 10 2017

Labels: merge-merged-chromeos-4.4
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/3bd28855622210d0b7fb591b65009517a31f8548

commit 3bd28855622210d0b7fb591b65009517a31f8548
Author: Al Cooper <alcooperx@gmail.com>
Date: Fri Mar 10 06:18:34 2017

UPSTREAM: usb: Add connected retry on resume for non SS devices

Currently usb_port_resume waits for up to 2 seconds for CONNECT
status for SS devices only. This change will do the same thing for
non-SS devices even though the reason is a little different. This
will fix an issue where VBUS is turned off during system wide
"suspend to ram" and some 2.0 devices take greater than the current
max of 100ms to show connected after VBUS is enabled. This is most
commonly seen on hard drive based devices and USB3.0 devices plugged
into a 2.0 only port.

BUG= chromium:698283 
TEST=bind/unbind USB devices in loop

Change-Id: If61e71012073b7cd9943cc46a5fa0b962ad7d914
Signed-off-by: Al Cooper <alcooperx@gmail.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit 6b82b1223e3b14afd89d167795671a9f4f77b2f0)
Reviewed-on: https://chromium-review.googlesource.com/450737
Reviewed-by: Douglas Anderson <dianders@chromium.org>

[modify] https://crrev.com/3bd28855622210d0b7fb591b65009517a31f8548/drivers/usb/core/hub.c

Project Member

Comment 7 by bugdroid1@chromium.org, Mar 10 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/6711e5b6fd7ed685e61722ed3aba7761c2c95d89

commit 6711e5b6fd7ed685e61722ed3aba7761c2c95d89
Author: Mathias Nyman <mathias.nyman@linux.intel.com>
Date: Fri Mar 10 06:18:35 2017

UPSTREAM: xhci: don't try to reset the host if it is unaccessible

There is no point in trying to reset the host controller by writing
to its registers if host is removed and registers just return 0xffffffff

bail out and return -ENODEV instead

BUG= chromium:698283 
TEST=bind/unbind USB devices in loop

Change-Id: Icef47b5714d117db2f093383a0cff285fa4fdea3
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit c11ae038d62bf07231be7b813435e5067c978ddc)
Reviewed-on: https://chromium-review.googlesource.com/450738
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Brian Norris <briannorris@chromium.org>

[modify] https://crrev.com/6711e5b6fd7ed685e61722ed3aba7761c2c95d89/drivers/usb/host/xhci.c

Project Member

Comment 8 by bugdroid1@chromium.org, Mar 10 2017

Labels: merge-merged-chromeos-4.4
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/b48ee25f172e4c854567c2c30c2f88132cd9df03

commit b48ee25f172e4c854567c2c30c2f88132cd9df03
Author: Mathias Nyman <mathias.nyman@linux.intel.com>
Date: Fri Mar 10 06:18:38 2017

UPSTREAM: xhci: rename EP_HALT_PENDING to EP_STOP_CMD_PENDING

We don't want to confuse halted and stalled endpoint states with
a flag indicating we are waiting for a stop endpoint command to
finish or timeout

BUG= chromium:698283 
TEST=bind/unbind USB devices in loop

Change-Id: I00a273e6239cc1ee4650e935a0a9bf6f297c551e
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit 9983a5fc39bfce7581db49f884aa782f24149d93)
Reviewed-on: https://chromium-review.googlesource.com/450739
Reviewed-by: Douglas Anderson <dianders@chromium.org>

[modify] https://crrev.com/b48ee25f172e4c854567c2c30c2f88132cd9df03/drivers/usb/host/xhci.c
[modify] https://crrev.com/b48ee25f172e4c854567c2c30c2f88132cd9df03/drivers/usb/host/xhci-ring.c
[modify] https://crrev.com/b48ee25f172e4c854567c2c30c2f88132cd9df03/drivers/usb/host/xhci.h

Project Member

Comment 9 by bugdroid1@chromium.org, Mar 10 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/e640e02daae2274c62471d028e0fb186b784f678

commit e640e02daae2274c62471d028e0fb186b784f678
Author: Mathias Nyman <mathias.nyman@linux.intel.com>
Date: Fri Mar 10 06:18:39 2017

UPSTREAM: xhci: detect stop endpoint race using pending timer instead of counter.

A counter was used to find out if the stop endpoint completion raced with
the stop endpoint timeout timer. This was needed in case the stop ep
completion failed to delete the timer as it was running on another cpu.

The EP_STOP_CMD_PENDING flag was not enough as a new stop endpoint command
may be queued between the command completion and timeout function, which
would set the flag back.

Instead of the separate counter that was used we can detect the race by
checking both the STOP_EP_PENDING flag and timer_pending in the timeout
function.

BUG= chromium:698283 
TEST=bind/unbind USB devices in loop

Change-Id: Ibecd50054aaa997f7e4f4c11e2c269832ef23b0b
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit f99265965b3203baf5266994578db14851fbf7fa)
Reviewed-on: https://chromium-review.googlesource.com/450740
Commit-Ready: Dean Brown <lbrown6445@gmail.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>

[modify] https://crrev.com/e640e02daae2274c62471d028e0fb186b784f678/drivers/usb/host/xhci.c
[modify] https://crrev.com/e640e02daae2274c62471d028e0fb186b784f678/drivers/usb/host/xhci-ring.c
[modify] https://crrev.com/e640e02daae2274c62471d028e0fb186b784f678/drivers/usb/host/xhci.h

Project Member

Comment 10 by bugdroid1@chromium.org, Mar 10 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/d77579449347011155f9d7ac0a87523fe749077a

commit d77579449347011155f9d7ac0a87523fe749077a
Author: Mathias Nyman <mathias.nyman@linux.intel.com>
Date: Fri Mar 10 06:18:36 2017

UPSTREAM: xhci: simplify if statement to make it more readable

No functional change, De Morgan !(A && B) = (!A || !B)

BUG= chromium:698283 
TEST=bind/unbind USB devices in loop

Change-Id: I794b18265fcbd133e852444c5a70c0cf1c92f75a
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit 505f581c48bc)
Reviewed-on: https://chromium-review.googlesource.com/452757
Reviewed-by: Douglas Anderson <dianders@chromium.org>

[modify] https://crrev.com/d77579449347011155f9d7ac0a87523fe749077a/drivers/usb/host/xhci-ring.c

Project Member

Comment 11 by bugdroid1@chromium.org, Mar 16 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/4374846f01cce61f013a5908711d706fb19b87ad

commit 4374846f01cce61f013a5908711d706fb19b87ad
Author: Guenter Roeck <linux@roeck-us.net>
Date: Thu Mar 16 23:15:10 2017

BACKPORT: usb: host: xhci-plat: Fix timeout on removal of hot pluggable xhci controllers

Upstream commit 98d74f9ceaef ("xhci: fix 10 second timeout on removal of
PCI hotpluggable xhci controllers") fixes a problem with hot pluggable PCI
xhci controllers which can result in excessive timeouts, to the point where
the system reports a deadlock.

The same problem is seen with hot pluggable xhci controllers using the
xhci-plat driver, such as the driver used for Type-C ports on rk3399.
Similar to hot-pluggable PCI controllers, the driver for this chip
removes the xhci controller from the system when the Type-C cable is
disconnected.

The solution for PCI devices works just as well for non-PCI devices
and avoids the problem.

BUG= chromium:698283 
TEST=Run bind/unbind in tight loop for extended period of time

Change-Id: Idb319ac0140c27dd11dd750d0ada9a86ed9db754
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[backport: context changes (locally added pm runtime support)]
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit dcc7620cad5ad1326a78f4031a7bf4f0e5b42984)
Reviewed-on: https://chromium-review.googlesource.com/450741
Reviewed-by: Brian Norris <briannorris@chromium.org>

[modify] https://crrev.com/4374846f01cce61f013a5908711d706fb19b87ad/drivers/usb/host/xhci-plat.c

Status: Fixed (was: Started)
Status: Started (was: Fixed)
Set to fixed too early.

Project Member

Comment 15 by bugdroid1@chromium.org, Mar 21 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/c96738bce1c8c6fd287c67f64d11f7504693f7cb

commit c96738bce1c8c6fd287c67f64d11f7504693f7cb
Author: Guenter Roeck <linux@roeck-us.net>
Date: Tue Mar 21 02:30:10 2017

FROMLIST: usb: hub: Fix error loop seen after hub communication errors

While stress testing a usb controller using a bind/unbind loop, the
following error loop was observed.

usb 7-1.2: new low-speed USB device number 3 using xhci-hcd
usb 7-1.2: hub failed to enable device, error -108
usb 7-1-port2: cannot disable (err = -22)
usb 7-1-port2: couldn't allocate usb_device
usb 7-1-port2: cannot disable (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
** 57 printk messages dropped ** hub 7-1:1.0: activate --> -22
** 82 printk messages dropped ** hub 7-1:1.0: hub_ext_port_status failed (err = -22)

This continues forever. After adding tracebacks into the code,
the call sequence leading to this is found to be as follows.

[<ffffffc0007fc8e0>] hub_activate+0x368/0x7b8
[<ffffffc0007fceb4>] hub_resume+0x2c/0x3c
[<ffffffc00080b3b8>] usb_resume_interface.isra.6+0x128/0x158
[<ffffffc00080b5d0>] usb_suspend_both+0x1e8/0x288
[<ffffffc00080c9c4>] usb_runtime_suspend+0x3c/0x98
[<ffffffc0007820a0>] __rpm_callback+0x48/0x7c
[<ffffffc00078217c>] rpm_callback+0xa8/0xd4
[<ffffffc000786234>] rpm_suspend+0x84/0x758
[<ffffffc000786ca4>] rpm_idle+0x2c8/0x498
[<ffffffc000786ed4>] __pm_runtime_idle+0x60/0xac
[<ffffffc00080eba8>] usb_autopm_put_interface+0x6c/0x7c
[<ffffffc000803798>] hub_event+0x10ac/0x12ac
[<ffffffc000249bb8>] process_one_work+0x390/0x6b8
[<ffffffc00024abcc>] worker_thread+0x480/0x610
[<ffffffc000251a80>] kthread+0x164/0x178
[<ffffffc0002045d0>] ret_from_fork+0x10/0x40

kick_hub_wq() is called from hub_activate() even after failures to
communicate with the hub. This results in an endless sequence of
hub event -> hub activate -> wq trigger -> hub event -> ...

Provide two solutions for the problem.

- Only trigger the hub event queue if communication with the hub
  was successful.
- After a suspend failure, only resume already suspended interfaces
  if the communication with the device is still possible.

Each of the changes fixes the observed problem. Use both to improve
robustness.

BUG= chromium:698283 
TEST=Run bind/unbind in tight loop for extended period of time

Change-Id: Ie5f886b9edccb0cf729dcf38e4ea00b39f9683a1
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(am from https://patchwork.kernel.org/patch/9634949/)
Reviewed-on: https://chromium-review.googlesource.com/450965
Reviewed-by: Douglas Anderson <dianders@chromium.org>

[modify] https://crrev.com/c96738bce1c8c6fd287c67f64d11f7504693f7cb/drivers/usb/core/driver.c
[modify] https://crrev.com/c96738bce1c8c6fd287c67f64d11f7504693f7cb/drivers/usb/core/hub.c
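The event loop described in the commit message above, and the effect of the first fix, can be illustrated with a toy model. This is not kernel code: the function simulate_hub_events, its arguments and the event cap are assumptions introduced only to show why an unconditional re-kick never terminates once the hub stops responding.

```shell
# Toy model of the hub event loop; not kernel code.
# Usage: simulate_hub_events REACHABLE GUARD MAX
#   REACHABLE: 1 if the hub still answers requests, 0 otherwise
#   GUARD:     1 to re-kick the work queue only after a successful
#              status read (the first fix), 0 to always re-kick
#   MAX:       artificial cap so the unfixed case terminates
# Prints the number of hub events processed.
simulate_hub_events() {
    reachable=$1
    guard=$2
    max=$3
    pending=1
    events=0
    while [ "$pending" -eq 1 ] && [ "$events" -lt "$max" ]; do
        pending=0
        events=$((events + 1))
        if [ "$reachable" -eq 1 ]; then
            continue    # suspend succeeds, nothing left to do
        fi
        # Suspend failed, so the interface is resumed and the hub is
        # activated again; the port status read fails (err = -22).
        if [ "$guard" -eq 0 ]; then
            pending=1   # unfixed: kick the hub work queue regardless
        fi
    done
    echo "$events"
}
```

In this model, "simulate_hub_events 0 0 1000" runs into the cap and prints 1000, while "simulate_hub_events 0 1 1000" stops after a single event.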

Status: Fixed (was: Started)
