servod: don't busy-loop when USB device disappears |
|||
Issue description
Running servo v4 + servod + CCD on Scarlet
I see that a device I wasn't using heavily has had its servod instance (local, at my desk) spinning at 100% CPU for 5 days -- from 'top':
164348 brianno+ 20 0 1075004 28844 4488 S 135.0 0.0 7623:54 python
strace shows that servod must be spinning around looking for the USB device:
select(10, [9], [], [], {tv_sec=0, tv_usec=500000}) = 0 (Timeout)
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x560128eaab80, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x560128eaab80, FUTEX_WAKE_PRIVATE, 1) = 1
select(10, [9], [], [], {tv_sec=0, tv_usec=500000}) = 0 (Timeout)
futex(0x560128eaab80, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAKE_PRIVATE, 1) = 1
select(10, [9], [], [], {tv_sec=0, tv_usec=500000}) = 0 (Timeout)
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAKE_PRIVATE, 1) = 1
select(10, [9], [], [], {tv_sec=0, tv_usec=500000}) = 0 (Timeout)
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAKE_PRIVATE, 1) = 1
select(10, [9], [], [], {tv_sec=0, tv_usec=500000}) = 0 (Timeout)
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
futex(0x560128eaab80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff) = 0
$ ls -l /proc/164348/fd/10
lrwx------ 1 briannorris primarygroup 64 Jun 14 15:27 /proc/164348/fd/10 -> /dev/bus/usb/003/016
My kernel logs show this device dropped off the bus 5 days ago, which matches the timeline:
Jun 12 15:37:45 censored.corp.google.com kernel: [1229796.787032] usb 3-3.5: USB disconnect, device number 93
Jun 12 15:37:45 censored.corp.google.com kernel: [1229796.787035] usb 3-3.5.1: USB disconnect, device number 94
Jun 12 15:37:45 censored.corp.google.com kernel: [1229796.787233] google ttyUSB0: google converter now disconnected from ttyUSB0
Jun 12 15:37:45 censored.corp.google.com kernel: [1229796.787246] usb_serial_simple 3-3.5.1:1.0: device disconnected
Jun 12 15:37:45 censored.corp.google.com kernel: [1229796.787404] google ttyUSB1: google converter now disconnected from ttyUSB1
Jun 12 15:37:45 censored.corp.google.com kernel: [1229796.787414] usb_serial_simple 3-3.5.1:1.3: device disconnected
Jun 12 15:37:45 censored.corp.google.com kernel: [1229796.787502] google ttyUSB2: google converter now disconnected from ttyUSB2
Jun 12 15:37:45 censored.corp.google.com kernel: [1229796.787512] usb_serial_simple 3-3.5.1:1.4: device disconnected
Jun 12 15:37:45 censored.corp.google.com kernel: [1229796.787747] usb 3-3.5.2: USB disconnect, device number 95
Jun 12 15:37:45 censored.corp.google.com kernel: [1229796.787855] google ttyUSB3: google converter now disconnected from ttyUSB3
Jun 12 15:37:45 censored.corp.google.com kernel: [1229796.787864] usb_serial_simple 3-3.5.2:1.0: device disconnected
Jun 12 15:37:45 censored.corp.google.com kernel: [1229796.787953] google ttyUSB4: google converter now disconnected from ttyUSB4
Jun 12 15:37:45 censored.corp.google.com kernel: [1229796.787962] usb_serial_simple 3-3.5.2:1.1: device disconnected
Jun 12 15:37:45 censored.corp.google.com kernel: [1229796.788058] google ttyUSB5: google converter now disconnected from ttyUSB5
Jun 12 15:37:45 censored.corp.google.com kernel: [1229796.788067] usb_serial_simple 3-3.5.2:1.2: device disconnected
Jun 12 15:37:46 censored.corp.google.com kernel: [1229797.021175] usb 3-3.5: new high-speed USB device number 96 using xhci_hcd
Jun 12 15:37:46 censored.corp.google.com kernel: [1229797.517180] usb 3-3.5: device descriptor read/64, error -71
Jun 12 15:37:46 censored.corp.google.com kernel: [1229797.730860] usb 3-3.5: New USB device found, idVendor=04b4, idProduct=6572
Jun 12 15:37:46 censored.corp.google.com kernel: [1229797.730863] usb 3-3.5: New USB device strings: Mfr=0, Product=1, SerialNumber=0
Jun 12 15:37:46 censored.corp.google.com kernel: [1229797.730865] usb 3-3.5: Product: USB2.0 Hub
Jun 12 15:37:46 censored.corp.google.com kernel: [1229797.731428] hub 3-3.5:1.0: USB hub found
Jun 12 15:37:46 censored.corp.google.com kernel: [1229797.731791] hub 3-3.5:1.0: 4 ports detected
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.025188] usb 3-3.5.1: new full-speed USB device number 97 using xhci_hcd
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.126031] usb 3-3.5.1: config 1 has 5 interfaces, different from the descriptor's value: 6
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.126034] usb 3-3.5.1: config 1 has no interface number 1
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.126528] usb 3-3.5.1: New USB device found, idVendor=18d1, idProduct=501b
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.126531] usb 3-3.5.1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.126534] usb 3-3.5.1: Product: Servo V4
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.126536] usb 3-3.5.1: Manufacturer: Google Inc.
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.126538] usb 3-3.5.1: SerialNumber: C1706311813
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.127316] usb_serial_simple 3-3.5.1:1.0: google converter detected
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.127478] usb 3-3.5.1: google converter now attached to ttyUSB0
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.128010] usb_serial_simple 3-3.5.1:1.3: google converter detected
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.128120] usb 3-3.5.1: google converter now attached to ttyUSB1
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.128385] usb_serial_simple 3-3.5.1:1.4: google converter detected
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.128535] usb 3-3.5.1: google converter now attached to ttyUSB2
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.209208] usb 3-3.5.2: new full-speed USB device number 98 using xhci_hcd
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.315727] usb 3-3.5.2: New USB device found, idVendor=18d1, idProduct=5014
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.315732] usb 3-3.5.2: New USB device strings: Mfr=1, Product=2, SerialNumber=11
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.315735] usb 3-3.5.2: Product: Cr50
Jun 12 15:37:47 censored.corp.google.com kernel: [1229798.315738] usb 3-3.5.2: Manufacturer: Google Inc.
...
I know there are some similar servod bugs out there about handling failures like this, but I wanted to log this just in case, since I'm probably one of few that's leaving Scarlet attached to servo4+CCD for long periods of time, with full logs. (I can probably retrieve EC logs from about that time too, if needed.)
,
Jun 18 2018
Ah, feel free to close this one then if you'd like. I knew there was a bug somewhere.
,
Aug 3
This bug has an owner, thus, it's been triaged. Changing status to "assigned".
,
Sep 25
fixed by coconutruben. |
|||
►
Sign in to add a comment |
|||
Comment 1 by nsanders@chromium.org
, Jun 18 2018