Issue 652373

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Oct 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 0
Type: Bug




guado usb "Not enough host controller resources for new device state." for labstation

Project Member Reported by kevcheng@chromium.org, Oct 3 2016

Issue description

This is semi-related to crbug.com/613669

Vincent, if you're not the right person to look at this, do you know who might?

We've set up a labstation with 4 servo v4/uServo pairs and we're hitting a new issue.  We can initialize 3 servo v4/uServo pairs just fine (with servod), but when we initialize the 4th one, we get this message in vlm:

2016-10-03T11:29:07.283834-07:00 NOTICE servod[13473]: Launching servod for orco on port 9996 using servo serial ND00006
2016-10-03T11:29:08.031802-07:00 WARNING kernel: [ 1252.726190] usb 1-3.3: Not enough host controller resources for new device state.
2016-10-03T11:29:08.063806-07:00 WARNING kernel: [ 1252.757853] init: servod (9996) main process (13468) terminated with status 1
2016-10-03T11:29:08.063819-07:00 WARNING kernel: [ 1252.757881] init: servod (9996) main process ended, respawning
2016-10-03T11:29:08.076090-07:00 NOTICE servod[13508]: Launching servod for orco on port 9996 using servo serial ND00006
2016-10-03T11:29:08.793806-07:00 WARNING kernel: [ 1253.488271] usb 1-3.3: Not enough host controller resources for new device state.
2016-10-03T11:29:08.825804-07:00 WARNING kernel: [ 1253.520385] init: servod (9996) main process (13503) terminated with status 1
2016-10-03T11:29:08.825817-07:00 WARNING kernel: [ 1253.520415] init: servod (9996) main process ended, respawning
2016-10-03T11:29:08.838122-07:00 NOTICE servod[13543]: Launching servod for orco on port 9996 using servo serial ND00006
2016-10-03T11:29:09.608802-07:00 WARNING kernel: [ 1254.303865] usb 1-3.3: Not enough host controller resources for new device state.
2016-10-03T11:29:09.642805-07:00 WARNING kernel: [ 1254.338184] init: servod (9996) main process (13538) terminated with status 1
2016-10-03T11:29:09.642818-07:00 WARNING kernel: [ 1254.338215] init: servod (9996) main process ended, respawning
2016-10-03T11:29:09.656159-07:00 NOTICE servod[13578]: Launching servod for orco on port 9996 using servo serial ND00006
2016-10-03T11:29:10.428807-07:00 WARNING kernel: [ 1255.124803] usb 1-3.3: Not enough host controller resources for new device state.
2016-10-03T11:29:10.461804-07:00 WARNING kernel: [ 1255.157107] init: servod (9996) main process (13573) terminated with status 1
2016-10-03T11:29:10.461817-07:00 WARNING kernel: [ 1255.157137] init: servod (9996) respawning too fast, stopped
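
To see which USB port keeps hitting the limit without reading the log by eye, the kernel lines above can be tallied with a short script. This is only a sketch: the function name `count_resource_errors` is made up for illustration, and the regular expression assumes the message text exactly as it appears in the log above.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "... kernel: [ 1252.726190] usb 1-3.3: Not enough host controller resources ..."
# The port path (e.g. "1-3.3") is made of digits, dots and dashes.
RESOURCE_ERROR = re.compile(
    r"usb (?P<port>[\d.-]+): Not enough host controller resources"
)


def count_resource_errors(lines):
    """Return a Counter mapping USB port path to the number of errors seen."""
    hits = Counter()
    for line in lines:
        match = RESOURCE_ERROR.search(line)
        if match:
            hits[match.group("port")] += 1
    return hits
```

Feeding it the lines of the messages file (for example `count_resource_errors(open(path))`, with `path` pointing at wherever the log lives on the host) gives one count per failing port.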


And from dmesg:
[Oct 3 11:47] usb 1-3.3: Not enough host controller resources for new device state.
[  +0.027269] init: servod (9996) main process (13675) terminated with status 1
[  +0.000028] init: servod (9996) main process ended, respawning
[  +0.767946] usb 1-3.3: Not enough host controller resources for new device state.
[  +0.031660] init: servod (9996) main process (13710) terminated with status 1
[  +0.000031] init: servod (9996) main process ended, respawning
[  +0.792410] usb 1-3.3: Not enough host controller resources for new device state.
[  +0.031742] init: servod (9996) main process (13745) terminated with status 1
[  +0.000030] init: servod (9996) main process ended, respawning
[  +0.775943] usb 1-3.3: Not enough host controller resources for new device state.
[  +0.031195] init: servod (9996) main process (13780) terminated with status 1
[  +0.000029] init: servod (9996) respawning too fast, stopped


Here's lsusb:
# lsusb
Bus 001 Device 027: ID 18d1:501a Google Inc. 
Bus 001 Device 019: ID 18d1:501b Google Inc. 
Bus 001 Device 028: ID 18d1:501a Google Inc. 
Bus 001 Device 021: ID 18d1:501b Google Inc. 
Bus 001 Device 029: ID 18d1:501a Google Inc. 
Bus 001 Device 025: ID 18d1:501b Google Inc. 
Bus 001 Device 026: ID 18d1:501a Google Inc. 
Bus 001 Device 023: ID 18d1:501b Google Inc. 
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 018: ID 04b4:6572 Cypress Semiconductor Corp. 
Bus 001 Device 020: ID 04b4:6572 Cypress Semiconductor Corp. 
Bus 001 Device 002: ID 8087:07dc Intel Corp. 
Bus 001 Device 024: ID 04b4:6572 Cypress Semiconductor Corp. 
Bus 001 Device 022: ID 04b4:6572 Cypress Semiconductor Corp. 
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub


localhost ~ # uname -a
Linux localhost 3.14.0 #1 SMP PREEMPT Mon Oct 3 03:05:32 PDT 2016 x86_64 Intel(R) Core(TM) i3-5010U CPU @ 2.10GHz GenuineIntel GNU/Linux
localhost ~ # cat /etc/lsb-release 
CHROMEOS_RELEASE_APPID={41D57E57-2150-BB76-2730-EC8AFD1D835D}
CHROMEOS_BOARD_APPID={41D57E57-2150-BB76-2730-EC8AFD1D835D}
CHROMEOS_CANARY_APPID={90F229CE-83E2-4FAF-8479-E368A34938B1}
DEVICETYPE=CHROMEBOX
CHROMEOS_RELEASE_BUILDER_PATH=guado_labstation-release/R55-8859.0.0
GOOGLE_RELEASE=8859.0.0
CHROMEOS_DEVSERVER=
CHROMEOS_RELEASE_BOARD=guado_labstation
CHROMEOS_RELEASE_BUILD_NUMBER=8859
CHROMEOS_RELEASE_BRANCH_NUMBER=0
CHROMEOS_RELEASE_CHROME_MILESTONE=55
CHROMEOS_RELEASE_PATCH_NUMBER=0
CHROMEOS_RELEASE_TRACK=testimage-channel
CHROMEOS_RELEASE_DESCRIPTION=8859.0.0 (Official Build) dev-channel guado_labstation test
CHROMEOS_RELEASE_BUILD_TYPE=Official Build
CHROMEOS_RELEASE_NAME=Chrome OS
CHROMEOS_RELEASE_VERSION=8859.0.0
CHROMEOS_AUSERVER=https://tools.google.com/service/update2


Googling around shows that people have gotten around this issue by disabling xhci and using ehci: http://superuser.com/questions/731751/not-enough-host-controller-resources-for-new-device-state

Perhaps we can revisit the re-enabling of the ehci controller? https://bugs.chromium.org/p/chromium/issues/detail?id=613669#c8
 
Attachments: dmesg (89.8 KB), messages (488 KB)

Comment 1 by sbasi@chromium.org, Oct 3 2016

Cc: gwendal@chromium.org snanda@chromium.org
Sameer, any idea who may be able to assist/advise with this?

We have our own overlay, so if there is a kernel flag we can set to help with "Not enough host controller resources for new device state.", that would be great.
This is a USB issue: the controller complains it cannot accept yet another device.
Can we try to spread the devices (servos) across both bus 1 and bus 2?


Right now all servos are spread across the 4 USB ports on the guado.  Should I be putting some behind a USB hub on a specific port?

Comment 4 by sbasi@chromium.org, Oct 3 2016

Cc: dlaurie@chromium.org
+Duncan if he can advise on ehci.
Cc: groeck@chromium.org bleung@chromium.org
Our Chromebook and Chromebox systems that ship with an xhci-based USB host in the kernel may never have been tested with ehci instead, and it would be nontrivial to even flip it on: you're talking about a coreboot firmware update at a minimum to disable xhci, plus whatever fallout may ensue.

It seems like this is a well-known issue that affects Intel XHCI, including on other operating systems: http://plugable.com/2015/09/08/not-enough-usb-controller-resources/

Could we get a better idea of what devices you are plugging in and in what configuration?

We could also engage with Intel to see if there is anything we can do to increase the limits, or at least demystify what these seemingly arbitrary limits are. On various Intel forums (such as https://communities.intel.com/thread/52417) folks are not entirely clear on what causes the various messages to pop up on Windows or Linux (~35 devices versus 32 on Linux), and some think the real issue is the 96-endpoint limit, which is quite low.
I see that Vincent has actually already done some sleuthing here and found the xhci controller device limit: https://bugs.chromium.org/p/chromium/issues/detail?id=613669#c6

Kind of disappointing the number of devices supported is so low.
I have dumped the descriptors from this guado lab machine (see lsusb -v attached).
Indeed we have a lot of endpoints.

Without counting the control endpoints (and we probably should):

$ grep "Endpoint Descriptor" guado_lsusb_verbose.txt  | wc -l
129

$ grep -e "Bus 00" -e bNumInterfaces -e bNumEndpoints  guado_lsusb_verbose.txt  
Bus 001 Device 027: ID 18d1:501a Google Inc. 
    bNumInterfaces          7
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
Bus 001 Device 019: ID 18d1:501b Google Inc. 
    bNumInterfaces          6
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
Bus 001 Device 028: ID 18d1:501a Google Inc. 
    bNumInterfaces          7
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
Bus 001 Device 021: ID 18d1:501b Google Inc. 
    bNumInterfaces          6
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
Bus 001 Device 029: ID 18d1:501a Google Inc. 
    bNumInterfaces          7
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
Bus 001 Device 025: ID 18d1:501b Google Inc. 
    bNumInterfaces          6
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
Bus 001 Device 026: ID 18d1:501a Google Inc. 
    bNumInterfaces          7
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
Bus 001 Device 023: ID 18d1:501b Google Inc. 
    bNumInterfaces          6
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
    bNumInterfaces          1
      bNumEndpoints           1
Bus 001 Device 018: ID 04b4:6572 Cypress Semiconductor Corp. 
    bNumInterfaces          1
      bNumEndpoints           1
      bNumEndpoints           1
Bus 001 Device 020: ID 04b4:6572 Cypress Semiconductor Corp. 
    bNumInterfaces          1
      bNumEndpoints           1
      bNumEndpoints           1
Bus 001 Device 002: ID 8087:07dc Intel Corp. 
    bNumInterfaces          2
      bNumEndpoints           3
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
      bNumEndpoints           2
Bus 001 Device 024: ID 04b4:6572 Cypress Semiconductor Corp. 
    bNumInterfaces          1
      bNumEndpoints           1
      bNumEndpoints           1
Bus 001 Device 022: ID 04b4:6572 Cypress Semiconductor Corp. 
    bNumInterfaces          1
      bNumEndpoints           1
      bNumEndpoints           1
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
    bNumInterfaces          1
      bNumEndpoints           1
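
The per-device totals can also be pulled out of the `lsusb -v` dump directly instead of being counted by hand. A rough sketch follows; the function name `endpoints_per_device` is invented for illustration. It sums every `bNumEndpoints` line under each `Bus ... Device ...` header, so, like the grep above, it ignores the control endpoints and counts all listed interface settings.

```python
import re

# Header line that starts each device in `lsusb -v` output.
DEVICE_LINE = re.compile(
    r"^Bus (\d{3}) Device (\d{3}): ID ([0-9a-f]{4}:[0-9a-f]{4})"
)
# Per-interface endpoint count line.
ENDPOINTS_LINE = re.compile(r"^\s*bNumEndpoints\s+(\d+)")


def endpoints_per_device(lsusb_verbose_text):
    """Sum the bNumEndpoints values listed under each device header."""
    totals = {}
    current = None
    for line in lsusb_verbose_text.splitlines():
        device = DEVICE_LINE.match(line)
        if device:
            current = "bus {} dev {} ({})".format(*device.groups())
            totals[current] = 0
            continue
        endpoints = ENDPOINTS_LINE.match(line)
        if endpoints and current is not None:
            totals[current] += int(endpoints.group(1))
    return totals
```

Calling `sum(endpoints_per_device(text).values())` on the attached dump should give the same kind of total as the `grep ... | wc -l` count, assuming every endpoint descriptor is covered by a `bNumEndpoints` line.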




Attachment: guado_lsusb_verbose.txt (88.8 KB)
We're pursuing a couple of options right now:

1. Duncan said he (or someone on the fw team) could whip up an ehci-enabled fw for guado for us to play around with https://code.google.com/p/chrome-os-partner/issues/detail?id=58156

2. Benson mentioned the chromebit (veyron_mickey) uses ehci and to give that a shot.  I'm trying this out right now (so far hitting a pyusb/libusb-servod issue with initializing the 2nd servo v4/uServo combo).

3. Try out a fievel (rockchip chipset device)

4. Aseda is looking into reducing the # of endpoints used by servo v4/uServo.  Nick has mentioned this is not trivial (need a full rewrite and redesign of the case closed debug code and servod drivers) although one slight reduction could be the gpio endpoint on the v4.
> 2. Benson mentioned the chromebit (veyron_mickey) uses ehci and to give that shot.

As already mentioned on Hangout yesterday, this does not seem like a good idea, as the USB support on Veyron is terribly fragile (Doug spent at least 9 months trying to stabilize everything so that 3 devices plugged in together survive ...)
Hammering like crazy on it is unlikely to give good results.
Please also note that the whole issue is unrelated to using EHCI as such
(on Guado, enabling EHCI is just a way of getting a 2nd USB controller)

> 3. Try out a fievel (rockchip chipset device)

Veyron-derivative, same remark as above.
Cc: aaboagye@chromium.org
> 4. Aseda is looking into reducing the # of endpoints used by servo v4/uServo.  Nick has mentioned this is not trivial (need a full rewrite and redesign of the case closed debug code and servod drivers) although one slight reduction could be the gpio endpoint on the v4.

Yup, I was able to hack together something that I believe works and removes the GPIO USB endpoints for servo v4. So that brings it down to 10 endpoints + 14 for the servo micro. Is that sufficient to support 4 servo_v4+servo_micro on guado?
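
As a rough sanity check on that question, here is the arithmetic using only numbers from this thread. Assumptions: the 96-endpoint figure quoted earlier is the relevant limit, the 12-endpoint "before" value for servo v4 is inferred from the 6 x 2 entries in the lsusb listing above, and control endpoints plus every other device on the bus are ignored.

```python
# Figure quoted earlier in this thread; treated here as an assumption.
ENDPOINT_LIMIT = 96


def total_for_pairs(pairs, servo_v4, servo_micro):
    """Declared (non-control) endpoints for N servo v4 + servo micro pairs."""
    return pairs * (servo_v4 + servo_micro)


# 12 for servo v4 is inferred from the lsusb listing (6 interfaces x 2).
before = total_for_pairs(4, servo_v4=12, servo_micro=14)  # 104
# 10 + 14 are the numbers given in the comment above.
after = total_for_pairs(4, servo_v4=10, servo_micro=14)   # 96

print(before, after, ENDPOINT_LIMIT)  # 104 96 96
```

So on declared endpoints alone, four pairs would sit exactly at 96 after the change, before counting the hubs, the Intel device, or any control endpoints. Whether that matters depends on how the controller actually accounts for endpoints.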

I have another idea, but due to my lack of USB knowledge, I'll need to read the specs to determine if it's possible and allowed.
I think what matters here is the number of endpoints used/opened by the host rather than the total number of them in the device descriptors.
Ah okay, then I guess I can leave all my modifications in servod.
I tried to make changes to the BIOS on Samus (not quite guado, but broadwell based) and while I think I can get it to leave the devices routed to EHCI, you can't boot a kernel and have it stay that way due to code like this:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/host/pci-quirks.c#n867

Which will forcibly route all USB ports to the XHCI controller instead of EHCI.  In theory, disabling the XHCI controller entirely would prevent this, but then you are still stuck with just one controller.

In theory, with a combination of BIOS and kernel hacks we could have two ports routed to XHCI and two to EHCI, but Intel does not support running both controllers at once on broadwell (maybe they share some resources?), so this may not be something we could actually do long-term.

It also doesn't future-proof this design, as the EHCI controller is completely removed in platforms >= skylake.
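
For anyone experimenting with firmware or kernel changes like these, one way to check where the buses actually ended up is to read the controller driver behind each USB root hub from sysfs. This is a sketch, not something used in this bug: the function name `usb_bus_drivers` is made up, and it assumes the usual Linux sysfs layout in which each `usbN` root hub sits directly under its host controller device, which carries a `driver` symlink.

```python
import os


def usb_bus_drivers(sysfs_usb_devices="/sys/bus/usb/devices"):
    """Map each USB root hub (usb1, usb2, ...) to its controller driver name."""
    drivers = {}
    for entry in sorted(os.listdir(sysfs_usb_devices)):
        # Root hubs are named usb<N>; skip ports and interfaces such as "1-3".
        if not (entry.startswith("usb") and entry[3:].isdigit()):
            continue
        root_hub = os.path.realpath(os.path.join(sysfs_usb_devices, entry))
        # The parent directory is the host controller device.
        driver_link = os.path.join(os.path.dirname(root_hub), "driver")
        if os.path.islink(driver_link):
            drivers[entry] = os.path.basename(os.readlink(driver_link))
    return drivers
```

On a machine where everything is routed to XHCI, each bus would be expected to report an xhci driver; a bus reporting an ehci driver would indicate that the second controller is in use.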
I've uploaded my servod changes to gerrit here: https://chromium-review.googlesource.com/#/c/394746/

You can try applying these changes and see if that will get you to support 4 servo_v4/servo_micro on guado.
Vincent/Duncan/Benson:

One alternative idea: could we find a low-cost, small-form-factor PC that, judging from its specs, would:
1) Boot the amd64-generic overlay.
2) Not have the USB limitations we have with our current batch of devices. (Would a device that only has USB 2.0 be better?)

If someone could identify such a product on Newegg or Amazon, we could rush-order one and play with it here onsite. The benefit is that it frees us from the HW limitations of our current set of CrOS devices, and if we can load the generic overlay, we can use the same tooling to manage a fleet of these. Lastly, if we can get a higher fan-out of host to v4s, we save costs deploying devices in the lab.

Let me know if this idea is viable or not. I don't have enough context to pick such a device myself.

Links to help out:
http://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=100019096%20600337533%20600373488%20600337532%20600337530%204814%20600014654&IsNodeId=1
https://www.amazon.com/b/ref=lp_13896617011_ln_0_1?node=13896591011&ie=UTF8&qid=1475778310
All the Intel Core -Y and -U parts likely have the same limitation as our current devices (the 32-device and 96-endpoint limits come from the Intel chipset, which is similar in all those SKUs).
Maybe Core -H or Xeon E3/E5 have higher limits, but you are less likely to find high-TDP CPUs in small form factors (and they might have bugs on amd64-generic, since we never test such a setup).
Maybe you can try the Intel NUC NUC6i7KYK? (I don't have access to any i7 6700HQ or 6700K to take a look.)
Cc: akes...@chromium.org
Status: Fixed (was: Assigned)
Duncan's fw is working great and we have up to 8 devices connected.  Closing this bug out as further work to iron out the kinks will be on the partner bug. 

https://code.google.com/p/chrome-os-partner/issues/detail?id=58156

Comment 20 by bugdroid1@chromium.org, Jun 2 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/hdctools/+/66e0469cf974dec38341206679bacc679ec2e21e

commit 66e0469cf974dec38341206679bacc679ec2e21e
Author: Nick Sanders <nsanders@chromium.org>
Date: Sat Jun 02 21:43:55 2018

servod: add more descriptive error for out-of-endpoints

See:  crbug.com/652373 , intel systems have a limited number of
usb endpoints that can be opened, and the lab often overuns this.

Return an error with a reference to the bug so that the problem
can be diagnosed easily.

BUG= chromium:652373 
TEST=adding a raise USBError prints the message.

Change-Id: Ie6823e4a5e8dc74b00e03a0d2da8d7c45e1802c1
Signed-off-by: Nick Sanders <nsanders@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1081269
Reviewed-by: Aseda Aboagye <aaboagye@chromium.org>

[modify] https://crrev.com/66e0469cf974dec38341206679bacc679ec2e21e/servo/stm32usb.py
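
The idea in that commit, turning a bare USB failure into an error that points at this bug, could look roughly like the sketch below. It is not the code from servo/stm32usb.py: `UsbOpenError`, `OutOfEndpointsError`, and `open_with_hint` are hypothetical names, and a stand-in exception class is used so the example runs without a USB library installed.

```python
class UsbOpenError(Exception):
    """Hypothetical stand-in for the USB library's own error type."""


class OutOfEndpointsError(Exception):
    """Hypothetical error whose message points at the tracking bug."""


BUG_REFERENCE = "crbug.com/652373"


def open_with_hint(open_device):
    """Call open_device(); on failure, re-raise with a pointer to the bug."""
    try:
        return open_device()
    except UsbOpenError as err:
        raise OutOfEndpointsError(
            "USB device could not be opened ({}). The host controller may be "
            "out of endpoints; see {}.".format(err, BUG_REFERENCE)
        ) from err
```

The original exception is chained with `from err`, so the underlying USB failure still shows up in the traceback next to the hint.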
