guado usb "Not enough host controller resources for new device state." for labstation
Issue description
This is semi-related to crbug.com/613669
Vincent, if you're not the right person to look at this, do you know who might?
We've set up a labstation with 4 servo v4/uServo and we're hitting a new issue. We can initialize 3 servo v4/uServos just fine (with servod), but when we initialize the 4th one, we get this message in vlm:
2016-10-03T11:29:07.283834-07:00 NOTICE servod[13473]: Launching servod for orco on port 9996 using servo serial ND00006
2016-10-03T11:29:08.031802-07:00 WARNING kernel: [ 1252.726190] usb 1-3.3: Not enough host controller resources for new device state.
2016-10-03T11:29:08.063806-07:00 WARNING kernel: [ 1252.757853] init: servod (9996) main process (13468) terminated with status 1
2016-10-03T11:29:08.063819-07:00 WARNING kernel: [ 1252.757881] init: servod (9996) main process ended, respawning
2016-10-03T11:29:08.076090-07:00 NOTICE servod[13508]: Launching servod for orco on port 9996 using servo serial ND00006
2016-10-03T11:29:08.793806-07:00 WARNING kernel: [ 1253.488271] usb 1-3.3: Not enough host controller resources for new device state.
2016-10-03T11:29:08.825804-07:00 WARNING kernel: [ 1253.520385] init: servod (9996) main process (13503) terminated with status 1
2016-10-03T11:29:08.825817-07:00 WARNING kernel: [ 1253.520415] init: servod (9996) main process ended, respawning
2016-10-03T11:29:08.838122-07:00 NOTICE servod[13543]: Launching servod for orco on port 9996 using servo serial ND00006
2016-10-03T11:29:09.608802-07:00 WARNING kernel: [ 1254.303865] usb 1-3.3: Not enough host controller resources for new device state.
2016-10-03T11:29:09.642805-07:00 WARNING kernel: [ 1254.338184] init: servod (9996) main process (13538) terminated with status 1
2016-10-03T11:29:09.642818-07:00 WARNING kernel: [ 1254.338215] init: servod (9996) main process ended, respawning
2016-10-03T11:29:09.656159-07:00 NOTICE servod[13578]: Launching servod for orco on port 9996 using servo serial ND00006
2016-10-03T11:29:10.428807-07:00 WARNING kernel: [ 1255.124803] usb 1-3.3: Not enough host controller resources for new device state.
2016-10-03T11:29:10.461804-07:00 WARNING kernel: [ 1255.157107] init: servod (9996) main process (13573) terminated with status 1
2016-10-03T11:29:10.461817-07:00 WARNING kernel: [ 1255.157137] init: servod (9996) respawning too fast, stopped
And from dmesg:
[Oct 3 11:47] usb 1-3.3: Not enough host controller resources for new device state.
[ +0.027269] init: servod (9996) main process (13675) terminated with status 1
[ +0.000028] init: servod (9996) main process ended, respawning
[ +0.767946] usb 1-3.3: Not enough host controller resources for new device state.
[ +0.031660] init: servod (9996) main process (13710) terminated with status 1
[ +0.000031] init: servod (9996) main process ended, respawning
[ +0.792410] usb 1-3.3: Not enough host controller resources for new device state.
[ +0.031742] init: servod (9996) main process (13745) terminated with status 1
[ +0.000030] init: servod (9996) main process ended, respawning
[ +0.775943] usb 1-3.3: Not enough host controller resources for new device state.
[ +0.031195] init: servod (9996) main process (13780) terminated with status 1
[ +0.000029] init: servod (9996) respawning too fast, stopped
Here's lsusb:
# lsusb
Bus 001 Device 027: ID 18d1:501a Google Inc.
Bus 001 Device 019: ID 18d1:501b Google Inc.
Bus 001 Device 028: ID 18d1:501a Google Inc.
Bus 001 Device 021: ID 18d1:501b Google Inc.
Bus 001 Device 029: ID 18d1:501a Google Inc.
Bus 001 Device 025: ID 18d1:501b Google Inc.
Bus 001 Device 026: ID 18d1:501a Google Inc.
Bus 001 Device 023: ID 18d1:501b Google Inc.
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 018: ID 04b4:6572 Cypress Semiconductor Corp.
Bus 001 Device 020: ID 04b4:6572 Cypress Semiconductor Corp.
Bus 001 Device 002: ID 8087:07dc Intel Corp.
Bus 001 Device 024: ID 04b4:6572 Cypress Semiconductor Corp.
Bus 001 Device 022: ID 04b4:6572 Cypress Semiconductor Corp.
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
localhost ~ # uname -a
Linux localhost 3.14.0 #1 SMP PREEMPT Mon Oct 3 03:05:32 PDT 2016 x86_64 Intel(R) Core(TM) i3-5010U CPU @ 2.10GHz GenuineIntel GNU/Linux
localhost ~ # cat /etc/lsb-release
CHROMEOS_RELEASE_APPID={41D57E57-2150-BB76-2730-EC8AFD1D835D}
CHROMEOS_BOARD_APPID={41D57E57-2150-BB76-2730-EC8AFD1D835D}
CHROMEOS_CANARY_APPID={90F229CE-83E2-4FAF-8479-E368A34938B1}
DEVICETYPE=CHROMEBOX
CHROMEOS_RELEASE_BUILDER_PATH=guado_labstation-release/R55-8859.0.0
GOOGLE_RELEASE=8859.0.0
CHROMEOS_DEVSERVER=
CHROMEOS_RELEASE_BOARD=guado_labstation
CHROMEOS_RELEASE_BUILD_NUMBER=8859
CHROMEOS_RELEASE_BRANCH_NUMBER=0
CHROMEOS_RELEASE_CHROME_MILESTONE=55
CHROMEOS_RELEASE_PATCH_NUMBER=0
CHROMEOS_RELEASE_TRACK=testimage-channel
CHROMEOS_RELEASE_DESCRIPTION=8859.0.0 (Official Build) dev-channel guado_labstation test
CHROMEOS_RELEASE_BUILD_TYPE=Official Build
CHROMEOS_RELEASE_NAME=Chrome OS
CHROMEOS_RELEASE_VERSION=8859.0.0
CHROMEOS_AUSERVER=https://tools.google.com/service/update2
Googling around shows that people have gotten around this issue by disabling xhci and using ehci:
http://superuser.com/questions/731751/not-enough-host-controller-resources-for-new-device-state
Perhaps we can revisit the re-enabling of the ehci controller?
https://bugs.chromium.org/p/chromium/issues/detail?id=613669#c8
,
Oct 3 2016
This is a USB issue: the controller complains that it cannot accept yet another device. Can we try to spread the devices (servos) across both bus 1 and bus 2?
,
Oct 3 2016
Right now all servos are spread across the 4 USB ports on the guado. Should I be putting some behind a USB hub on a specific port?
,
Oct 3 2016
+Duncan, in case he can advise on ehci.
,
Oct 4 2016
,
Oct 4 2016
Our Chromebook and Chromebox systems that ship with an xhci-based USB host in the kernel may never have been tested with ehci instead of xhci, and it would be nontrivial to even flip it on: you're talking about a coreboot firmware update at a minimum to disable xhci, plus whatever fallout may ensue. This seems to be a well-known issue that affects Intel XHCI, including on other operating systems: http://plugable.com/2015/09/08/not-enough-usb-controller-resources/
Could we get a better idea of what devices you are plugging in and in what configuration? We could also engage with Intel to see if there is anything we can do to increase the limits, or at least demystify what these seemingly arbitrary limits are. On various Intel forums (such as https://communities.intel.com/thread/52417) folks are not entirely clear on what causes the various messages to pop up on Windows or Linux (~35 devices versus 32 on Linux), and some suspect that the real issue is the 96-endpoint limit, which is quite low.
,
Oct 4 2016
I see that Vincent's actually already done some sleuthing here and found the xhci controller device limit: https://bugs.chromium.org/p/chromium/issues/detail?id=613669#c6
Kind of disappointing that the number of devices supported is so low.
,
Oct 4 2016
I have dumped the descriptors from this guado lab machine (see the attached lsusb -v output).
Indeed, we have a lot of endpoints.
Without counting the control endpoints (and we probably should):
$ grep "Endpoint Descriptor" guado_lsusb_verbose.txt | wc -l
129
$ grep -e "Bus 00" -e bNumInterfaces -e bNumEndpoints guado_lsusb_verbose.txt
Bus 001 Device 027: ID 18d1:501a Google Inc.
bNumInterfaces 7
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
Bus 001 Device 019: ID 18d1:501b Google Inc.
bNumInterfaces 6
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
Bus 001 Device 028: ID 18d1:501a Google Inc.
bNumInterfaces 7
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
Bus 001 Device 021: ID 18d1:501b Google Inc.
bNumInterfaces 6
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
Bus 001 Device 029: ID 18d1:501a Google Inc.
bNumInterfaces 7
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
Bus 001 Device 025: ID 18d1:501b Google Inc.
bNumInterfaces 6
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
Bus 001 Device 026: ID 18d1:501a Google Inc.
bNumInterfaces 7
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
Bus 001 Device 023: ID 18d1:501b Google Inc.
bNumInterfaces 6
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
bNumInterfaces 1
bNumEndpoints 1
Bus 001 Device 018: ID 04b4:6572 Cypress Semiconductor Corp.
bNumInterfaces 1
bNumEndpoints 1
bNumEndpoints 1
Bus 001 Device 020: ID 04b4:6572 Cypress Semiconductor Corp.
bNumInterfaces 1
bNumEndpoints 1
bNumEndpoints 1
Bus 001 Device 002: ID 8087:07dc Intel Corp.
bNumInterfaces 2
bNumEndpoints 3
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
bNumEndpoints 2
Bus 001 Device 024: ID 04b4:6572 Cypress Semiconductor Corp.
bNumInterfaces 1
bNumEndpoints 1
bNumEndpoints 1
Bus 001 Device 022: ID 04b4:6572 Cypress Semiconductor Corp.
bNumInterfaces 1
bNumEndpoints 1
bNumEndpoints 1
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
bNumInterfaces 1
bNumEndpoints 1
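The same tally can be broken down per device, which makes it easier to see where the endpoints go. The following is a minimal sketch, not part of servod or hdctools: `count_endpoints` and `SAMPLE` are hypothetical names, and the sample text is a shortened stand-in for the attached `lsusb -v` dump. Like the grep above, it counts every endpoint descriptor listed (including alternate settings); with `include_control=True` it adds one per device for the default control endpoint.

```python
import re
from collections import OrderedDict

# Matches the per-device header line printed by lsusb, e.g.
# "Bus 001 Device 027: ID 18d1:501a Google Inc."
DEVICE_RE = re.compile(r"^Bus (\d+) Device (\d+): ID ([0-9a-f]{4}:[0-9a-f]{4})")


def count_endpoints(lsusb_verbose_text, include_control=False):
    """Return an ordered mapping of device label -> endpoint descriptor count."""
    counts = OrderedDict()
    current = None
    for raw_line in lsusb_verbose_text.splitlines():
        line = raw_line.strip()
        match = DEVICE_RE.match(line)
        if match:
            current = "bus {} dev {} ({})".format(*match.groups())
            # Optionally account for the default control endpoint (EP0).
            counts[current] = 1 if include_control else 0
        elif current is not None and line.startswith("Endpoint Descriptor"):
            counts[current] += 1
    return counts


# Shortened, made-up excerpt in the shape of `lsusb -v` output.
SAMPLE = """\
Bus 001 Device 019: ID 18d1:501b Google Inc.
  bNumInterfaces          1
      bNumEndpoints           2
      Endpoint Descriptor:
        bEndpointAddress     0x81  EP 1 IN
      Endpoint Descriptor:
        bEndpointAddress     0x01  EP 1 OUT
Bus 001 Device 018: ID 04b4:6572 Cypress Semiconductor Corp.
  bNumInterfaces          1
      bNumEndpoints           1
      Endpoint Descriptor:
        bEndpointAddress     0x81  EP 1 IN
"""

for device, total in count_endpoints(SAMPLE, include_control=True).items():
    print(device, total)
```

To run it against the real dump, read the file (for example `open("guado_lsusb_verbose.txt").read()`) and pass the text in place of `SAMPLE`.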
,
Oct 5 2016
We're pursuing a few options right now:
1. Duncan said he (or someone on the fw team) could whip up an ehci-enabled fw for guado for us to play around with: https://code.google.com/p/chrome-os-partner/issues/detail?id=58156
2. Benson mentioned that the chromebit (veyron_mickey) uses ehci and suggested giving that a shot. I'm trying this out right now (so far hitting a pyusb/libusb-servod issue when initializing the 2nd servo v4/uServo combo).
3. Try out a fievel (rockchip chipset device).
4. Aseda is looking into reducing the # of endpoints used by servo v4/uServo. Nick has mentioned this is not trivial (it needs a full rewrite and redesign of the case closed debug code and servod drivers), although one slight reduction could be the gpio endpoint on the v4.
,
Oct 6 2016
> 2. Benson mentioned the chromebit (veyron_mickey) uses ehci and to give that shot.
As already mentioned on Hangout yesterday, this does not seem like a good idea: the USB support on Veyron is terribly fragile (Doug spent at least 9 months trying to stabilize everything so that 3 devices plugged in together survive ...). Hammering on it like crazy is unlikely to give good results. Please also note that the whole issue is totally unrelated to using EHCI (on Guado, enabling EHCI is just a way of getting a 2nd USB controller).
> 3. Try out a fievel (rockchip chipset device)
Veyron derivative, same remark as above.
,
Oct 6 2016
> 4. Aseda is looking into reducing the # of endpoints used by servo v4/uServo. Nick has mentioned this is not trivial (need a full rewrite and redesign of the case closed debug code and servod drivers) although one slight reduction could be the gpio endpoint on the v4.
Yup, I was able to hack together something that I believe works and removes the GPIO USB endpoints for servo v4. That brings it down to 10 endpoints + 14 for the servo micro. Is that sufficient to support 4 servo_v4+servo_micro on guado? I have another idea, but due to my lack of USB knowledge, I'll need to read the specs to determine if it's possible and allowed.
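As a rough check of that question, the descriptor totals can be worked out from the figures in this thread. This is only a sketch under stated assumptions: the 96-endpoint ceiling is the number quoted from forum discussions earlier, not a confirmed controller limit; the mapping of 18d1:501b to servo v4 and 18d1:501a to servo micro is inferred from the lsusb dump; and the constant and function names are made up for illustration. Control endpoints, hubs, and the on-board Intel device are left out.

```python
# Figures taken from this thread. The 96-endpoint ceiling is the number quoted
# from forum discussions, not a confirmed xHCI limit (assumption).
QUOTED_ENDPOINT_LIMIT = 96

SERVO_V4_ENDPOINTS_BEFORE = 12  # assumed 18d1:501b: 6 interfaces x 2 endpoints
SERVO_V4_ENDPOINTS_AFTER = 10   # after dropping the GPIO endpoints
SERVO_MICRO_ENDPOINTS = 14      # assumed 18d1:501a: 7 interfaces x 2 endpoints
PAIRS = 4                       # servo_v4 + servo_micro combos on one guado


def descriptor_endpoints(v4_endpoints, pairs=PAIRS):
    """Non-control endpoints listed in the descriptors of all servo pairs."""
    return pairs * (v4_endpoints + SERVO_MICRO_ENDPOINTS)


before = descriptor_endpoints(SERVO_V4_ENDPOINTS_BEFORE)
after = descriptor_endpoints(SERVO_V4_ENDPOINTS_AFTER)

print("before GPIO removal:", before)  # 4 * (12 + 14) = 104
print("after GPIO removal:", after)    # 4 * (10 + 14) = 96
print("headroom vs quoted limit:", QUOTED_ENDPOINT_LIMIT - after)
```

On these numbers the servo descriptors alone would already reach the quoted figure, so whether the change is sufficient depends on how the controller actually counts endpoints.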
,
Oct 6 2016
I think what matters here is the number of endpoints used/opened by the host rather than the total number of them in the device descriptors.
,
Oct 6 2016
Ah okay, then I guess I can leave all my modifications in servod.
,
Oct 6 2016
I tried to make changes to the BIOS on Samus (not quite guado, but Broadwell-based). While I think I can get it to leave the devices routed to EHCI, you can't boot a kernel and have it stay that way, due to code like this: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/host/pci-quirks.c#n867 which forcibly routes all USB ports to the XHCI controller instead of EHCI. In theory, disabling the XHCI controller entirely would prevent this, but then you are still stuck with just one controller. In theory, with a combination of BIOS and kernel hacks, we could have two ports routed to XHCI and two to EHCI, but Intel does not support running both controllers at once on Broadwell (maybe they share some resources?), so this may not be something we could actually do long-term. It also doesn't future-proof this design, as the EHCI controller is completely removed in platforms >= Skylake.
,
Oct 6 2016
I've uploaded my servod changes to gerrit here: https://chromium-review.googlesource.com/#/c/394746/ You can try applying these changes and see if that will get you to support 4 servo_v4/servo_micro on guado.
,
Oct 6 2016
Vincent/Duncan/Benson: One alternative idea I had: can we find a low-cost, small-form-factor PC which, judging by the specs, would:
1) Boot the amd64-generic overlay.
2) Not have the USB limitations we have with our current batch of devices. (Would a device that only has USB 2.0 be better?)
If someone could identify such a product on newegg or amazon, we could rush order one and play with it here onsite. The benefit is that it frees us from the HW limitations of our current set of CrOS devices, and if we can load the generic overlay then we can use the same tooling to manage a fleet of these. Lastly, if we can get a higher fan-out of v4s per host, we save costs deploying devices in the lab. Let me know if this idea is viable or not. I don't have enough context to pick such a device myself.
Links to help out:
http://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=100019096%20600337533%20600373488%20600337532%20600337530%204814%20600014654&IsNodeId=1
https://www.amazon.com/b/ref=lp_13896617011_ln_0_1?node=13896591011&ie=UTF8&qid=1475778310
,
Oct 7 2016
All the Intel Core -Y and -U parts likely have the same limitation as our current devices (the 32-device and 96-endpoint limits come from the Intel chipset, which is similar across all those SKUs). Maybe Core -H or Xeon E3/E5 have higher limits, but you are less likely to find a high-TDP CPU in a small form factor (and they might have bugs on amd64-generic, since we never test such a setup). Maybe you can try the Intel NUC NUC6i7KYK? (I don't have access to any i7 6700HQ or 6700K to have a look.)
,
Oct 10 2016
,
Oct 12 2016
Duncan's fw is working great and we have up to 8 devices connected. Closing this bug out as further work to iron out the kinks will be on the partner bug. https://code.google.com/p/chrome-os-partner/issues/detail?id=58156
,
Jun 2 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/hdctools/+/66e0469cf974dec38341206679bacc679ec2e21e
commit 66e0469cf974dec38341206679bacc679ec2e21e
Author: Nick Sanders <nsanders@chromium.org>
Date: Sat Jun 02 21:43:55 2018
servod: add more descriptive error for out-of-endpoints
See: crbug.com/652373 , intel systems have a limited number of usb endpoints that can be opened, and the lab often overuns this. Return an error with a reference to the bug so that the problem can be diagnosed easily.
BUG= chromium:652373
TEST=adding a raise USBError prints the message.
Change-Id: Ie6823e4a5e8dc74b00e03a0d2da8d7c45e1802c1
Signed-off-by: Nick Sanders <nsanders@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1081269
Reviewed-by: Aseda Aboagye <aaboagye@chromium.org>
[modify] https://crrev.com/66e0469cf974dec38341206679bacc679ec2e21e/servo/stm32usb.py
Comment 1 by sbasi@chromium.org
, Oct 3 2016