
Issue 766217

Starred by 3 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




kevin: Linksys USB Ethernet is really unhappy when plugged directly

Project Member Reported by diand...@chromium.org, Sep 18 2017

Issue description

While testing USB Ethernet adapters on kevin I ended up finding the Linksys USB3 Gigabit Ethernet adapter.

When I plugged this into a USB3 hub it worked OK (though maybe not at peak performance).  ...but when I plugged it directly in then things were terrible.

The adapter enumerates like this:

[  242.218283] usb 6-1: New USB device found, idVendor=13b1, idProduct=0041
[  242.225095] usb 6-1: New USB device strings: Mfr=1, Product=2, SerialNumber=6
[  242.232275] usb 6-1: Product: Linksys USB3GIGV1
[  242.236861] usb 6-1: Manufacturer: Linksys
[  242.241096] usb 6-1: SerialNumber: 000001000000

===

The adapter can be used with the CDC Ethernet driver or, with a patch like the one below, with the r8152 driver:

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 49abd25ef2ec..bf0270b1da0a 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -506,6 +506,7 @@ enum rtl8152_flags {
 #define VENDOR_ID_REALTEK              0x0bda
 #define VENDOR_ID_SAMSUNG              0x04e8
 #define VENDOR_ID_LENOVO               0x17ef
+#define VENDOR_ID_LINKSYS              0x13b1
 #define VENDOR_ID_NVIDIA               0x0955
 
 #define MCU_TYPE_PLA                   0x0100
@@ -4364,6 +4365,7 @@ static struct usb_device_id rtl8152_table[] = {
        {REALTEK_USB_DEVICE(VENDOR_ID_LENOVO,  0x7205)},
        {REALTEK_USB_DEVICE(VENDOR_ID_LENOVO,  0x720c)},
        {REALTEK_USB_DEVICE(VENDOR_ID_LENOVO,  0x7214)},
+       {REALTEK_USB_DEVICE(VENDOR_ID_LINKSYS, 0x0041)},
        {REALTEK_USB_DEVICE(VENDOR_ID_NVIDIA,  0x09ff)},
        {}
 };
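
As a quick sanity check, the IDs in the enumeration log can be compared against the entry the patch adds. This is only a sketch: the helper name `parse_usb_ids` and the constant `LINKSYS_PRODUCT_ID` are made up for illustration (the patch itself uses the literal 0x0041).

```python
import re

# Enumeration line copied from the dmesg output in this report.
LINE = "[  242.218283] usb 6-1: New USB device found, idVendor=13b1, idProduct=0041"

# Values taken from the r8152 patch above.
VENDOR_ID_LINKSYS = 0x13B1
LINKSYS_PRODUCT_ID = 0x0041  # hypothetical name; the patch uses the literal


def parse_usb_ids(dmesg_line):
    """Return (vendor, product) as ints, or None if the line carries no IDs."""
    m = re.search(
        r"idVendor=([0-9a-fA-F]{4}), idProduct=([0-9a-fA-F]{4})", dmesg_line
    )
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2), 16)


if __name__ == "__main__":
    ids = parse_usb_ids(LINE)
    print("matches patch entry:", ids == (VENDOR_ID_LINKSYS, LINKSYS_PRODUCT_ID))
```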

===

No matter which driver is used, things are pretty bad right now.  Depending on the exact circumstances, driver and tests I'm running, I often see some of the following messages:

* xhci-hcd xhci-hcd.2.auto: WARN: HC couldn't access mem fast enough
* r8152 6-1:1.0 eth0: intr status -63
* r8152 6-1:1.0 eth0: Rx status -71


Sometimes the adapter can keep working after this but sometimes it can't and I need to unplug it and plug it back in again.
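
For reference, the negative status numbers in these messages are Linux errno values. The sketch below decodes them with a small hard-coded table; the numbers are the generic Linux ones, and the one-line meanings are my paraphrase of the kernel's USB error-code documentation, so treat them as an assumption rather than a quote. The function name `decode_status` is made up for illustration.

```python
import re

# Linux (asm-generic) errno numbers seen in this report, with a paraphrased
# meaning for URB status. The wording is an assumption, not a kernel quote.
USB_STATUS = {
    63: ("ENOSR", "host controller could not access system memory fast enough"),
    71: ("EPROTO", "bitstuff error, no response in time, or unknown USB error"),
    110: ("ETIMEDOUT", "timeout expired before the transfer completed"),
}


def decode_status(message):
    """Return the errno name for a 'status -N' message, or None if absent."""
    m = re.search(r"status -(\d+)", message)
    if m is None:
        return None
    code = int(m.group(1))
    name, _meaning = USB_STATUS.get(code, ("unknown errno %d" % code, ""))
    return name


if __name__ == "__main__":
    for msg in (
        "r8152 6-1:1.0 eth0: intr status -63",
        "r8152 6-1:1.0 eth0: Rx status -71",
    ):
        print(msg, "->", decode_status(msg))
```

Note that -63 (ENOSR) lines up with the "HC couldn't access mem fast enough" warning, so the first two messages are likely the same event seen from two layers.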

===

I've tried increasing aclk_usb3 from 300 MHz to 600 MHz and it didn't help.  I've also tried bumping up the CPU and DDR frequencies and that didn't help.

===

At the moment William at Rockchip is trying to obtain one of these adapters to debug further.

 

Comment 1 by w...@rock-chips.com, Sep 27 2017

Hi Doug,
    Did you test both the Type-C0 (main board) and Type-C1 (sub board) ports?

    I have reproduced this issue with the same USB3 Ethernet adapter on a Kevin board and have done some experiments. The test results suggest that this may be a Type-C USB3 PHY problem.

    Here are my test cases.

    1. Test on four different boards with the CDC Ethernet driver.
       1). rk3399 Kevin board
       2). rk3399 EVB board-A (has one Type-C0 and one Type-A USB3 port on board)
       3). rk3399 EVB board-C (has two Type-C ports on board)
       4). rk3328 EVB board (has the same USB3 controller as 3399, but a different USB3 PHY)
-----------------------------------------------------------
                      Type-C0    Type-C1    Type-A USB3
-----------------------------------------------------------
3399 Kevin Board       Pass       Fail          X
3399 EVB Board-A       Fail        X           Pass
3399 EVB Board-C       Fail       Fail
3328 EVB Board          X          X           Pass
-----------------------------------------------------------

    2. When connected through a USB3.0 hub or a USB2.0 hub, the USB3 Ethernet adapter works fine on all of the above boards and Type-C ports.

    3. With the following patch applied on the 3399 EVB Board-C (this patch has already been merged into the Chrome OS 4.4 kernel), the USB3 Ethernet adapter works fine on the Type-C1 port (and the error log "WARN: HC couldn't access mem fast enough" no longer appears), but the Type-C0 port still fails.

    diff --git a/drivers/phy/phy-rockchip-typec.c b/drivers/phy/phy-rockchip-typec.c
index 2bad800..d0b6075 100644
--- a/drivers/phy/phy-rockchip-typec.c
+++ b/drivers/phy/phy-rockchip-typec.c
@@ -390,6 +390,8 @@ static void tcphy_tx_usb3_cfg_lane(struct rockchip_typec_phy *tcphy, u32 lane)
        writel(0x5098, tcphy->base + TX_PSC_A3(lane));
        writel(0, tcphy->base + TX_TXCC_MGNFS_MULT_000(lane));
        writel(0xbf, tcphy->base + XCVR_DIAG_BIDI_CTRL(lane));
+       writel(0x700, tcphy->base + TX_DIAG_TX_DRV(lane));
+       writel(0x13c, tcphy->base + TX_TXCC_CAL_SCLR_MULT(lane));
 }   
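
To make the pattern in the test case 1 matrix easier to see, here is a rough tally of direct-connection results by connector type. The data is transcribed from the matrix above; "X" cells and the one empty cell are treated as "not tested", which is my assumption. The function name `tally_by_connector` is made up.

```python
# Direct-connection results transcribed from the matrix in test case 1.
# None means the board has no such port ("X") or the cell was left empty.
RESULTS = {
    "3399 Kevin Board": {"Type-C0": "Pass", "Type-C1": "Fail", "Type-A USB3": None},
    "3399 EVB Board-A": {"Type-C0": "Fail", "Type-C1": None, "Type-A USB3": "Pass"},
    "3399 EVB Board-C": {"Type-C0": "Fail", "Type-C1": "Fail", "Type-A USB3": None},
    "3328 EVB Board": {"Type-C0": None, "Type-C1": None, "Type-A USB3": "Pass"},
}


def tally_by_connector(results):
    """Return {"Type-C": (passes, tested), "Type-A": (passes, tested)}."""
    counts = {"Type-C": [0, 0], "Type-A": [0, 0]}
    for ports in results.values():
        for port, outcome in ports.items():
            if outcome is None:
                continue  # port not present or not tested
            kind = "Type-C" if port.startswith("Type-C") else "Type-A"
            counts[kind][1] += 1
            if outcome == "Pass":
                counts[kind][0] += 1
    return {kind: tuple(pair) for kind, pair in counts.items()}


if __name__ == "__main__":
    for kind, (passed, tested) in tally_by_connector(RESULTS).items():
        print("%s: %d of %d passed" % (kind, passed, tested))
```

With this transcription, only one of five Type-C ports passes while both Type-A ports pass, which is the split that points at the Type-C PHY.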

     Based on the above tests, I also enabled xHCI debug. When the error happens, the log is:
[    9.542757] xhci-hcd xhci-hcd.9.auto: WARN: HC couldn't access mem fast enough
[    9.542797] xhci-hcd xhci-hcd.9.auto: Giveback URB ffffffc0f1a2f6c0, len = 0, expected = 16, status = -63
[    9.552304] xhci-hcd xhci-hcd.9.auto: ep 0x81 - asked for 1514 bytes, 1422 bytes untransferred
[    9.574764] xhci-hcd xhci-hcd.9.auto: ep 0x83 - asked for 16 bytes, 8 bytes untransferred
[    9.622762] xhci-hcd xhci-hcd.9.auto: WARN: HC couldn't access mem fast enough
[    9.622824] xhci-hcd xhci-hcd.9.auto: Giveback URB ffffffc0f1a2f6c0, len = 0, expected = 16, status = -63
[    9.654771] xhci-hcd xhci-hcd.9.auto: ep 0x83 - asked for 16 bytes, 8 bytes untransferred
[    9.686769] xhci-hcd xhci-hcd.9.auto: WARN: HC couldn't access mem fast enough
[    9.686829] xhci-hcd xhci-hcd.9.auto: Giveback URB ffffffc0f1a2f6c0, len = 0, expected = 16, status = -63
[    9.702760] xhci-hcd xhci-hcd.9.auto: WARN: HC couldn't access mem fast enough
[    9.702814] xhci-hcd xhci-hcd.9.auto: Giveback URB ffffffc0f1a2f6c0, len = 0, expected = 16, status = -63
[    9.718771] xhci-hcd xhci-hcd.9.auto: WARN: HC couldn't access mem fast enough
[    9.718840] xhci-hcd xhci-hcd.9.auto: Giveback URB ffffffc0f1a2f6c0, len = 0, expected = 16, status = -63
[    9.734766] xhci-hcd xhci-hcd.9.auto: WARN: HC couldn't access mem fast enough
[    9.734827] xhci-hcd xhci-hcd.9.auto: Giveback URB ffffffc0f1a2f6c0, len = 0, expected = 16, status = -63

     According to the verbose log, the URBs involved are very small when the error happens, so in theory memory should easily be fast enough to transfer the data.
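
     A small sketch that pulls the transfer sizes out of the "asked for ... untransferred" lines in the log above, to show how little data is in flight. The sample lines are copied from that log; the function name `short_transfers` is made up.

```python
import re

# Short-transfer lines copied from the xHCI debug log above.
LOG = """\
[    9.552304] xhci-hcd xhci-hcd.9.auto: ep 0x81 - asked for 1514 bytes, 1422 bytes untransferred
[    9.574764] xhci-hcd xhci-hcd.9.auto: ep 0x83 - asked for 16 bytes, 8 bytes untransferred
[    9.654771] xhci-hcd xhci-hcd.9.auto: ep 0x83 - asked for 16 bytes, 8 bytes untransferred
"""

SHORT_RE = re.compile(
    r"ep (0x[0-9a-f]+) - asked for (\d+) bytes, (\d+) bytes untransferred"
)


def short_transfers(log):
    """Return a list of (endpoint, bytes asked, bytes actually transferred)."""
    rows = []
    for m in SHORT_RE.finditer(log):
        asked = int(m.group(2))
        untransferred = int(m.group(3))
        rows.append((m.group(1), asked, asked - untransferred))
    return rows


if __name__ == "__main__":
    for ep, asked, done in short_transfers(LOG):
        print("ep %s: asked %d bytes, got %d" % (ep, asked, done))
```

     The largest request here is a single 1514-byte Ethernet frame and the others are 16-byte interrupt transfers.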

     So my guess is that the Type-C USB3 PHY compatibility is not good enough. In addition, the transmission loss on the Kevin board is somewhat high, and we also need to consider the impedance of the Type-C to Type-A cable. Together these factors make the USB3 Ethernet adapter unreliable.
    
Generally I've been testing on C1 on kevin and I can confirm that I have the kernel patch you point to above.  I can try some on C0, but oddly my device suddenly started behaving less badly.  I could still get the "couldn't access mem fast enough" sometimes, but not consistently.  :(

Do you have any ideas on how to make this pass more reliably?  Is there additional tuning we can do?

Comment 3 by w...@rock-chips.com, Sep 28 2017

Hi Doug,
    I don't have any good ideas for additional Type-C PHY tuning for the time being, but I will give it a try.

    Do you agree with my thoughts? I just want to confirm that my analysis is heading in the right direction.


wulf:

Your data does suggest that this could be some sort of PHY configuration problem.  The one worry I have is whether you've done enough testing to truly confirm that the data you have above is consistent.

I know that I was able to reproduce the problem reliably before, then suddenly it became inconsistent and I could go many times in a row without reproducing the problem.  ...then it would come back.  If I wasn't being _very_ careful I might have assumed that whatever I was doing when the problem went away "fixed" it, but I don't think it did.  I think it was just luck.

---

It's my top priority today to poke at this problem some more.  Let me see if I can find some more patterns to the failure and see if I can find anything else suspicious.  I'll post again when it gets close to the end of the day here.
Cc: bleung@chromium.org
OK, I did some of my own experiments to help try to confirm.  I tried to keep track of all variables in testing with several type C to type A adapters in both orientations and on both ports.


Summary:

* Even though you wouldn't think so given the "HC couldn't access mem fast enough" error, I agree that it does seem we're facing a signalling issue.

* The problem is definitely affected by the specific type C adapter, the Ethernet adapter, the orientation, and the port.  However there is no single component that is obviously bad.  Failures were seen with lots of different combinations.

* It seems like we really want to figure out how to get more margin here.  As for all things like this, it's possible that the Ethernet adapter's signalling is marginal and it's possible that the type C to type A's signalling is marginal.  ...but the customer will blame the laptop anyway.  We want customers to be happy so we want to provide as much margin on our end as possible.

---

Here are all the tests I ran:


  Host    Type C Adapter Linksys Adapter    Port   Polarity Try Problems At Boot?  Problems At Stress?
--------- -------------- --------------- --------- -------- --- ------------------ -------------------
kevin/DA1       1               1        right (1)   CC2     1  yes (HC...enough)  n/a
kevin/DA1       1               1        right (1)   CC2     2  yes (enum er -71)  n/a
kevin/DA1       1               1        right (1)   CC1     1  yes (went to usb2) no (at usb2 speeds)
kevin/DA1       1               1        right (1)   CC1     2  yes (HC...enough)  n/a
kevin/DA1       1               1        left  (0)   CC2     1  yes (cdc er -110)  n/a
kevin/DA1       1               1        left  (0)   CC2     2  yes (HC...enough)  n/a
kevin/DA1       1               1        left  (0)   CC1     1  yes (HC...enough)  n/a
kevin/DA1       1               1        left  (0)   CC1     2  yes (HC...enough)  n/a

kevin/DA1       1               2        right (1)   CC2     1  yes (HC...enough)  n/a
kevin/DA1       1               2        right (1)   CC1     1  yes (enum er -71)  n/a
kevin/DA1       1               2        left  (0)   CC2     1  no (sorta*)        no
kevin/DA1       1               2        left  (0)   CC2     2  no (sorta*)        no
kevin/DA1       1               2        left  (0)   CC1     1  yes (HC...enough)  n/a
kevin/DA1       1               2        left  (0)   CC1     2  yes (HC...enough)  n/a


kevin/DA1       2               1        right (1)   CC2     1  no                 yes (transfer stopped ~9 seconds in)
kevin/DA1       2               1        right (1)   CC2     2  no                 yes (glitch ~4 seconds in, stopped ~9)
kevin/DA1       2               1        right (1)   CC2     3  no                 yes (glitch ~2 seconds in, stopped ~10)
kevin/DA1       2               1        right (1)   CC1     1  no                 no
kevin/DA1       2               1        right (1)   CC1     2  no                 no
kevin/DA1       2               1        left  (0)   CC2     1  no                 no
kevin/DA1       2               1        left  (0)   CC2     2  no                 no
kevin/DA1       2               1        left  (0)   CC1     1  no                 no
kevin/DA1       2               1        left  (0)   CC1     2  no                 no

kevin/DA1       2               2        right (1)   CC2     1  no                 no
kevin/DA1       2               2        right (1)   CC1     1  no                 no
kevin/DA1       2               2        left  (0)   CC2     1  no                 no
kevin/DA1       2               2        left  (0)   CC1     1  no                 no


kevin/DA1       3               1        right (1)   CC2     1  no                 no
kevin/DA1       3               1        right (1)   CC1     1  no                 no
kevin/DA1       3               1        left  (0)   CC2     1  no                 no
kevin/DA1       3               1        left  (0)   CC1     1  no                 no


kevin/DA1       4               1        right (1)   CC2     1  yes (HC...enough)  n/a
kevin/DA1       4               1        right (1)   CC1     1  no                 yes (glitch ~2 seconds in, stopped ~10)
kevin/DA1       4               1        left  (0)   CC2     1  no                 no
kevin/DA1       4               1        left  (0)   CC1     1  no                 no


kevin/DA1       5               1        right (1)   CC2     1  no                 yes (glitch ~2 seconds in, stopped ~10)
kevin/DA1       5               1        right (1)   CC1     1  no                 no
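
To back up the summary, here is a rough tally of the Linksys table above. Each data line is type C adapter, Linksys adapter, port, polarity, boot problem (y/n), stress problem (y/n, or - for n/a). Counting "no (sorta*)" as "no" is a simplification on my part, and the function name `tally` is made up.

```python
from collections import defaultdict

# One line per row of the Linksys table above:
# type_c linksys port polarity boot stress
DATA = """
1 1 right CC2 y -
1 1 right CC2 y -
1 1 right CC1 y n
1 1 right CC1 y -
1 1 left CC2 y -
1 1 left CC2 y -
1 1 left CC1 y -
1 1 left CC1 y -
1 2 right CC2 y -
1 2 right CC1 y -
1 2 left CC2 n n
1 2 left CC2 n n
1 2 left CC1 y -
1 2 left CC1 y -
2 1 right CC2 n y
2 1 right CC2 n y
2 1 right CC2 n y
2 1 right CC1 n n
2 1 right CC1 n n
2 1 left CC2 n n
2 1 left CC2 n n
2 1 left CC1 n n
2 1 left CC1 n n
2 2 right CC2 n n
2 2 right CC1 n n
2 2 left CC2 n n
2 2 left CC1 n n
3 1 right CC2 n n
3 1 right CC1 n n
3 1 left CC2 n n
3 1 left CC1 n n
4 1 right CC2 y -
4 1 right CC1 n y
4 1 left CC2 n n
4 1 left CC1 n n
5 1 right CC2 n y
5 1 right CC1 n n
"""

COLUMNS = ["type_c", "linksys", "port", "polarity", "boot", "stress"]


def tally(data, column):
    """Return {column value: (rows with any problem, total rows)}."""
    idx = COLUMNS.index(column)
    counts = defaultdict(lambda: [0, 0])
    for line in data.split("\n"):
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue  # skip blank lines
        failed = fields[4] == "y" or fields[5] == "y"
        counts[fields[idx]][1] += 1
        if failed:
            counts[fields[idx]][0] += 1
    return {key: tuple(pair) for key, pair in counts.items()}


if __name__ == "__main__":
    for column in ("type_c", "port"):
        print(column, tally(DATA, column))
```

By this count type C adapter 1 fails in 12 of 14 runs and adapter 3 in 0 of 4, but adapters 2, 4 and 5 also fail sometimes, and both ports see failures.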

---

I also tried a few other adapters I could find.  It turns out that the "Samsung" one I found behaved similarly to the Linksys one.  All others seemed fine in the limited test cases I ran.  Note that many of these cables were natively type C so I put "n/a" for the Type C Adapter.


  Host    Type C Adapter Adapter            Port   Polarity Try Problems At Boot?  Problems At Stress?
--------- -------------- --------------- --------- -------- --- ------------------ -------------------
kevin/DA1       1        Monoprice-1     right (1)   CC2     1  no                 no
kevin/DA1       1        Monoprice-1     right (1)   CC1     1  no                 no

kevin/DA1       1        Monoprice-2     right (1)   CC2     1  no                 no

kevin/DA1       1        Samsung         right (1)   CC2     1  yes (r8152 er -71) n/a
kevin/DA1       1        Samsung         right (1)   CC2     2  yes (r8152 er -71) n/a
kevin/DA1       1        Samsung         right (1)   CC1     1  yes (r8152 er -71) n/a
kevin/DA1       1        Samsung         left  (0)   CC1     1  yes (r8152 er -71) n/a

kevin/DA1       3        Samsung         right (1)   CC2     1  no                 no

kevin/DA1      n/a       CableCreatio-1  right (1)   CC2     1  no                 no
kevin/DA1      n/a       CableCreatio-2  right (1)   CC2     1  no                 no
kevin/DA1      n/a       Anker           right (1)   CC2     1  no                 no
kevin/DA1      n/a       CableMatters    right (1)   CC2     1  no                 no
kevin/DA1      n/a       Belkin          right (1)   CC2     1  no                 no

---

Notes about errors:

* Error "HC...enough" is WARN: HC couldn't access mem fast enough

* Error "enum er -71" means that the USB port reported error -71

* Error "cdc er -110" means that the CDC Ethernet driver reported error -110.

* In the "no (sorta*)" case the system seemed to be functioning OK but still occasionally (every ~10 seconds) reported "HC couldn't access mem fast enough" to the console.

---

Other notes:

* In all cases I first plugged things in and then typed "reboot" on the serial console, then looked at how things looked at the next bootup.

* Sometimes unplugging and plugging an adapter after bootup would "fix" it, but I don't know quite why.  I haven't quantified that.

* For whatever reason, sometimes the "HC couldn't access mem fast enough" spewed faster than other times.  I'm not sure why.

* To stress, I attempted to use: 
  for i in $(seq 3); do iperf3 -c ${IPERF_SERVER} -u -b 0; done

* I did notice "rockchip-vop ff900000.vop: dmc_notify: Line flag interrupt did not arrive" sometimes when running my stress test.  I don't think that's related and I'll track it down in a separate issue.

---

I guess next steps here are to figure out if there's anything we can do to improve the margin?
The Samsung USB3 Type A dongle [1] is USB 3.0 compliant. I reviewed the test results: this dongle was thoroughly tested with the Exynos 5250 chipset and never exhibited this sort of issue.

The original issue is that this failure is causing problems in our test lab, which (as Doug observed) can be worked around by inserting a different adapter cable or a USB hub between kevin and the dongle. I can't predict what the customer impact will be, but whatever it is, it will not be good.


[1] Specifically the RTL8153 based part. Google _may_ still have some prototypes floating around with AX88179 which normally work well but have an issue with jitter on one of the voltage rails.
Cc: haoweiw@chromium.org englab-sys-cros@google.com

Comment 9 by w...@rock-chips.com, Oct 13 2017

Based on Doug's detailed experimental results and my own experiments, I'm now more confident that this is a USB3 Tx/Rx signal issue and that the USB2 PHY signal is good.

I remember that the original USB3 PHY Tx/Rx signal on the Kevin board was very bad on both the Type-C0 and Type-C1 ports. Last year we tried modifying the hardware design and reconfiguring the USB3 PHY parameters, and with Cadence's help we finally got a relatively reasonable Type-C USB PHY configuration, but the signal margin is still not enough.

I'm trying to tune the Tx/Rx PHY without Cadence's help, but it's difficult for me, so I haven't yet found a PHY configuration that fixes this issue.
@9: Any progress?

One thing to note is that when I was testing I certainly got different behaviors in some cases.  For instance, if I unplugged and re-plugged a device it would often behave better.  Perhaps we could look at the tuning results in those cases and see why one tuning was better than the other?  I'm happy to dump registers if you can give me some ideas of which ones to dump.  I don't believe I have many of the relevant PHY docs (though I wouldn't object to getting them), so it's very hard for me to do this myself.

Given that EVB is also having these problems, it seems likely that this problem is common to many rk3399 boards and really needs to be root caused.
Status: Assigned (was: Untriaged)
Components: Internals>Network>Connectivity
Components: -Internals>Network>Connectivity OS>Systems>Network
