Issue 610728 link

Starred by 2 users

Issue metadata

Status: Verified
Owner:
Closed: Jun 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




[chameleon_audio] Auron_paine at DUT android1758-audiobox4-host1 is failing Basic USB audio tests with Unhandled Exception: RPC error: usb.plug. Needs restart to pass tests. Starts failing again.

Project Member Reported by ka...@chromium.org, May 10 2016

Issue description

Dashboard view: https://wmatrix.googleplex.com/platform/unfiltered?suites=chameleon_audio_perbuild&tests=audio_AudioBasicUSB%2A&days_back=7&platforms=daisy

Screenshot: https://screenshot.googleplex.com/GZWRfWeMwcq

Green results are all from chromeos1-row1-rack4-host3

The red results are all from chromeos1-row5-rack5-host2 and failure is Unhandled Exception: RPC error: usb.plug

Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 600, in _exec
    _call_test_function(self.execute, *p_args, **p_dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 810, in _call_test_function
    raise error.UnhandledTestFail(e)
UnhandledTestFail: Unhandled Exception: RPC error: usb.plug
Traceback (most recent call last):
  File "./multimedia_xmlrpc_server.py", line 69, in _dispatch
    return func(*params)
  File "/usr/local/autotest/cros/multimedia/usb_facade_native.py", line 86, in plug
    self._wait_for_nodes_changed()
  File "/usr/local/autotest/cros/multimedia/usb_facade_native.py", line 119, in _wait_for_nodes_changed
    timeout=self._TIMEOUT_CRAS_NODES_CHANGE_SECS)
  File "/usr/local/autotest/bin/site_utils.py", line 244, in poll_for_condition
    raise TimeoutError, desc
TimeoutError: Timed out waiting for condition: Find USB node
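
The traceback above ends in a polling helper that gives up after a timeout. Below is a minimal sketch of that pattern, not the actual autotest `site_utils.poll_for_condition` code: the exception name `PollTimeoutError`, the default values, and the fake node check are assumptions made for illustration.

```python
import time


class PollTimeoutError(Exception):
    """Raised when the polled condition never becomes true (hypothetical name)."""


def poll_for_condition(condition, timeout=10.0, sleep_interval=0.1, desc=None):
    """Call condition() until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise PollTimeoutError(
                'Timed out waiting for condition: %s' % desc)
        time.sleep(sleep_interval)


def make_fake_usb_node_check(appears_after_calls):
    """Build a stand-in for 'is the USB node listed yet?' (illustration only)."""
    state = {'calls': 0}

    def check():
        state['calls'] += 1
        return state['calls'] >= appears_after_calls

    return check
```

With a check that never succeeds, as when the DUT never enumerates the USB audio device, the helper raises with the same kind of message seen in the failure: "Timed out waiting for condition: Find USB node".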


After I restart chameleon and rerun the tests, they pass, e.g.
https://ubercautotest.corp.google.com/afe/#tab_id=view_job&object_id=62800183
https://ubercautotest.corp.google.com/afe/#tab_id=view_job&object_id=62800187

This reminds me of issue 606573. 

I'll replace the chameleon board with a new one and see how the test outcome changes.
 

Comment 1 by ka...@chromium.org, May 10 2016

Cc: waihong@chromium.org
Labels: Chameleon
After the restart only one test failed - audio_InternalCardNodes on R52-8303.0.0
Screenshot: https://screenshot.googleplex.com/Pt3bCNf4MwM

Comment 2 by ka...@chromium.org, May 12 2016


The "Unhandled Exception: RPC error: usb.plug" failure is observed on chell, minnie-cheets, auron_paine, and daisy.
Owner: cychiang@chromium.org
Status: Started (was: Untriaged)
Hi Kalin,
I just ran the USB playback test on chromeos1-row5-rack5-host2 and it passed.
Did you reboot chameleon after the latest failure?
Thanks!
 

Comment 5 by ka...@chromium.org, May 13 2016

Yes, I rebooted it because I needed to run tests against the M50 release candidate build. The tests passed. I guess chameleon was still "fresh" for your test run.
I see. Maybe we can leave one chameleon in the bad state for debugging when it happens again. Thanks!

Comment 7 by ka...@chromium.org, May 16 2016

I believe both failures below currently showing on chromeos1-row5-rack5-host2 are caused by the chameleon USB connection
- Unhandled Exception: RPC error: usb.plug
- client failed to resume from sleep after 60 seconds
https://wmatrix.googleplex.com/unfiltered?suites=chameleon_audio_perbuild,chameleon_hdmi_perbuild&platforms=daisy

I am not going to restart chameleon. Feel free to debug. Thanks.
Thank you Kalin. I will check.
I tested by calling Plug(8) and Unplug(8) on the chameleon server.
The DUT could not detect any USB event at all.
Then I rebooted the DUT.
After that it behaved normally, and the USB playback and record tests passed.
This suggests that the DUT, rather than chameleon, is in a bad state.
This contradicts the previous result.

As for chell at chromeos1-row5-rack7-host1, rebooting the DUT does not help.

Its lsusb -t output:

localhost ~ # lsusb -t
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/12p, 480M
    |__ Port 3: Dev 2, If 0, Class=Wireless, Driver=btusb, 12M
    |__ Port 3: Dev 2, If 1, Class=Wireless, Driver=btusb, 12M
    |__ Port 5: Dev 3, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 3: Dev 5, If 0, Class=Vendor Specific Class, Driver=r8152, 480M
        |__ Port 4: Dev 6, If 0, Class=Hub, Driver=hub/5p, 480M
            |__ Port 1: Dev 7, If 0, Class=Vendor Specific Class, Driver=smsc95xx, 480M
    |__ Port 7: Dev 4, If 0, Class=Video, Driver=uvcvideo, 480M
    |__ Port 7: Dev 4, If 1, Class=Video, Driver=uvcvideo, 480M

I am wondering: there are two hubs. Are they both needed?
I tested locally on my chell with only one hub, and the USB device is detected:
localhost ~ # lsusb -t
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M
    |__ Port 3: Dev 2, If 0, Class=Hub, Driver=hub/4p, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/12p, 480M
    |__ Port 3: Dev 2, If 0, Class=Wireless, Driver=btusb, 12M
    |__ Port 3: Dev 2, If 1, Class=Wireless, Driver=btusb, 12M
    |__ Port 5: Dev 4, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 3: Dev 5, If 0, Class=Vendor Specific Class, Driver=asix, 480M
        |__ Port 4: Dev 7, If 0, Class=Audio, Driver=snd-usb-audio, 480M
        |__ Port 4: Dev 7, If 1, Class=Audio, Driver=snd-usb-audio, 480M
        |__ Port 4: Dev 7, If 2, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 7: Dev 3, If 0, Class=Video, Driver=uvcvideo, 480M
    |__ Port 7: Dev 3, If 1, Class=Video, Driver=uvcvideo, 480M
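
The difference between the two `lsusb -t` listings above (no snd-usb-audio entries on the lab chell, three on the local one) can be checked mechanically. The following is a rough sketch that parses the tree lines in the format shown here; the function names and the dictionary keys are made up for illustration, and other `lsusb` versions may print a different layout.

```python
import re

# Matches device lines such as:
#     |__ Port 4: Dev 7, If 0, Class=Audio, Driver=snd-usb-audio, 480M
_DEVICE_LINE = re.compile(
    r'^(\s*)\|__ Port (\d+): Dev (\d+), If (\d+), '
    r'Class=(.*?), Driver=(.*?), (\S+)$')


def parse_lsusb_tree(text):
    """Return one dict per device/interface line of `lsusb -t` output."""
    entries = []
    for line in text.splitlines():
        m = _DEVICE_LINE.match(line.rstrip())
        if not m:
            continue  # root hub lines ("/:  Bus ...") and anything else
        indent, port, dev, iface, cls, driver, speed = m.groups()
        entries.append({
            'depth': len(indent) // 4,  # 4 spaces per level in this output
            'port': int(port),
            'dev': int(dev),
            'iface': int(iface),
            'cls': cls,
            'driver': driver,
            'speed': speed,
        })
    return entries


def has_usb_audio(text):
    """True if any interface is bound to the snd-usb-audio driver."""
    return any(e['driver'] == 'snd-usb-audio' for e in parse_lsusb_tree(text))
```

Run against the first listing, `has_usb_audio` would return False; against the second it would return True.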


I also tried to reboot the chameleon of chromeos1-row5-rack7-host1.
After I rebooted it, I could no longer ping or ssh to it.
I guess there is some other serious issue on that chameleon.

Hi Kalin, after several minutes I still cannot access chromeos1-row5-rack7-host1-chameleon.cros. Could you please check what happened to that chameleon?
Thanks!

Comment 11 by ka...@chromium.org, May 18 2016

Hi Jimmy,
1) Regarding chromeos1-row5-rack7-host1: I found that the chameleon USB cable was unplugged. It may have been forgotten when the DUT was being reset a week ago. In any case, chameleon does not connect for me either. When I take it to my desk it connects OK, but in the box with all cables connected it does not. It might be the issue where u-boot does not proceed to boot. I have seen this before on some boards, but after a few restarts it recovers. I'll continue to investigate.

2) Regarding daisy: the current failures show the issue as originally observed.
http://cautotest/afe/#tab_id=view_job&object_id=63724198 - Unhandled Exception: RPC error: usb.plug
http://cautotest/afe/#tab_id=view_job&object_id=63725736 - TEST CASE: PLUG > SUSPEND > PLUG > PLUG > RESUME - client failed to resume from sleep after 60 seconds
I think this daisy chameleon (or DUT) at chromeos1-row5-rack5-host2 is in the bad state we want to investigate.
Hi Kalin, thanks for checking chell device.
As for daisy, the test now passes after rebooting daisy (without rebooting chameleon).
So it is hard to tell whether this issue is on the chameleon or the DUT.

The current plan I have in mind is:
1. Wait for it to happen again.
2. Reboot the DUT when it happens and see if the test passes.
3. If step 2 passes, add a reboot before and after the USB test as a workaround to bring daisy back to a good state.

Thanks!


Comment 13 by ka...@chromium.org, May 19 2016

Thanks Jimmy,
I'll follow these steps when it happens again. 
I'll change the test.

Comment 14 by ka...@chromium.org, May 20 2016

Since we moved daisy to the new audio box in b1758 all seems good - https://screenshot.googleplex.com/fACBOrGdC5r

Will wait for the issue to come back.

Comment 15 by ka...@chromium.org, May 25 2016

Well, the issue continues. 
I filed issue 614390 to reflect this failure happening again at android 1758- location for daisy.
Labels: USBconnect

Comment 17 by son...@google.com, Jun 9 2016

Facing this issue on Paine: android1758-audiobox4-host1

job page: https://ubercautotest.corp.google.com/afe/#tab_id=view_job&object_id=66144911

Test failed after rebooting the device: https://ubercautotest.corp.google.com/afe/#tab_id=view_job&object_id=66149244

Again test failed after rebooting chameleon: https://ubercautotest.corp.google.com/afe/#tab_id=view_job&object_id=66151244

I'll re-plug the chameleon connection.

Comment 18 by ka...@chromium.org, Jun 10 2016

I went to the lab and, with the device/chameleon setup present, re-tested numerous times.

I unplugged and re-plugged the USB cable at both the chameleon and the DUT ends, and restarted chameleon too.
I tried connecting the three USB cables (from the Eth switch, servo, and chameleon) through four different USB hubs. (Only the right-side USB port is usable, as the left one is quite close to the audio jack port.)

The observations show that:
- Sometimes the USB node is present, somewhere between widget binding and the USB plug action, but mostly it is not.
- At or after the USB plug action, the DUT loses its Ethernet connection for 4-5 seconds (once or twice in a row), and if the USB node was present, it then disappears.
- I could get the USB node to reappear, and the test to PASS, by disconnecting and reconnecting the USB hub port with the hub's on/off port switch at a specific moment after the USB plug action (the other three hubs did not have such a switch).

So, somehow the chameleon USB connection disrupts connectivity no matter which USB hub is used.

- If I plug the chameleon board into the left USB port and leave Eth and servo on the right one through the hub, I get better results: the Eth connection is not interrupted, and I could get the test to PASS.

I still have to do a few more tests, but the left port is so close to the audio jack that we'll need a 90-degree USB extension that stays close to the DUT, something like the following but shorter:
https://www.amazon.com/QIBOX-Female-Adapter-Extension-Degree/dp/B00FQ7IJT6

I suspect that whatever is causing the Eth connection disruption is also interfering with other devices, causing suspend/resume failures, timeouts, etc.
Hi Kalin, the Ethernet disconnection is expected, since we unbind/bind the driver for the USB controller to re-enumerate USB devices. This action is not needed if we use kernel 4.2.

I would like to try kernel 4.2 on android1758-audiobox4-host1 to see if it solves the detection problem. But I am not sure if USB is still connected.

If kernel 4.2 cannot solve the detection problem, we will have to resort to the 90-degree extension.

Thanks!
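
For reference, a sketch of what unbinding and rebinding a USB controller driver can look like through the standard Linux sysfs interface. This is an assumption about the mechanism, not the code the test actually uses; the function names are hypothetical, and the PCI address 0000:00:14.0 used in the note below is the xhci_hcd controller address that appears in the kernel messages later in this thread.

```python
import posixpath

PCI_DRIVERS_DIR = '/sys/bus/pci/drivers'


def rebind_steps(pci_address, driver='xhci_hcd'):
    """Return the (sysfs_path, value) writes that unbind then rebind a device."""
    driver_dir = posixpath.join(PCI_DRIVERS_DIR, driver)
    return [
        (posixpath.join(driver_dir, 'unbind'), pci_address),
        (posixpath.join(driver_dir, 'bind'), pci_address),
    ]


def rebind(pci_address, driver='xhci_hcd', write=None):
    """Perform the writes; pass a custom write(path, value) for a dry run."""
    if write is None:
        # Real writes need root and drop every device on the controller.
        def write(path, value):
            with open(path, 'w') as f:
                f.write(value)
    for path, value in rebind_steps(pci_address, driver):
        write(path, value)
```

Calling `rebind('0000:00:14.0', write=print)` only prints the two writes. Doing it for real disconnects everything behind that controller, including a USB Ethernet adapter, which is consistent with the few seconds of Ethernet loss described above.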

Comment 20 by ka...@chromium.org, Jun 13 2016

Yes, the USB is still connected as usual - on the DUT's right side USB port through USB hub for all three - Eth, servo, and chameleon. 

Comment 21 by ka...@chromium.org, Jun 13 2016

In addition to the observations from c#18, the 'RPC error: usb.plug' failure came back on the left USB port, so I could not really get to a stable passing state.
So, I hope kernel 4.2 really does better and gets us on the right track here.

Thank you for making this progress.
Thank you Kalin. I installed kernel 4.2 on android1758-audiobox4-host1. Let's see how it works.
Summary: [chameleon_audio] Auron_paine at DUT android1758-audiobox4-host1 is failing Basic USB audio tests with Unhandled Exception: RPC error: usb.plug. Needs restart to pass tests. Starts failing again. (was: [chameleon_audio] Daisy at DUT chromeos1-row5-rack5-host2 is failing Basic USB audio tests with Unhandled Exception: RPC error: usb.plug. Needs restart to pass tests. Starts failing again.)
I found that this issue happens 80% of the time on android1758-audiobox4-host1, even after I installed kernel 4.2 on chameleon.

However, I cannot reproduce it locally using an auron_yuna.
I am using a Transcend TS-HUB3K USB hub connected to the Cros USB port on the right.

I found that, after the USB gadget driver is removed, auron_yuna detects the removal correctly. I tested with the latest R53-8447.0.0 and with R52-8350.26.0.
In contrast, R52-8350.26.0 on android1758-audiobox4-host1 tried to enumerate a low-speed USB device on usb 1-2.4 in an infinite loop after the USB gadget driver was removed on chameleon, and failed to do so (error messages attached, starting from line 17723).
This is strange behavior.

I am wondering if there is something wrong with the setup at android1758-audiobox4-host1, such that the Cros device still thinks there is a USB device to enumerate even after the USB gadget driver is removed on the chameleon side.

Hi Kalin, could you please try to reproduce the issue with the same setup I used locally, as shown in the attached picture?

We can remove servo first to reduce the complexity.
Thanks!

messages (2.1 MB)
auron_paine.jpg (2.1 MB)
Cc: bhthompson@chromium.org
Hi Bernie, could you please loop in the contact person for auron-soc to check the error messages in #23? I am not sure why the Cros device tried to enumerate a low-speed USB device in a loop like this:

2016-06-13T02:10:20.920648-07:00 INFO kernel: [  257.576408] usb 1-2.4: USB disconnect, device number 14
2016-06-13T02:10:21.864301-07:00 INFO kernel: [  258.521234] usb 1-2.4: new low-speed USB device number 15 using xhci_hcd
2016-06-13T02:10:21.938319-07:00 ERR kernel: [  258.595156] usb 1-2.4: device descriptor read/64, error -32
2016-06-13T02:10:22.039298-07:00 ERR kernel: [  258.696045] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for slot 14.
2016-06-13T02:10:22.112277-07:00 INFO kernel: [  258.769138] usb 1-2.4: new low-speed USB device number 16 using xhci_hcd
2016-06-13T02:10:22.376317-07:00 INFO kernel: [  259.033047] usb 1-2.4: new low-speed USB device number 17 using xhci_hcd
2016-06-13T02:10:22.450317-07:00 ERR kernel: [  259.107045] usb 1-2.4: device descriptor read/64, error -32
2016-06-13T02:10:22.551320-07:00 ERR kernel: [  259.207811] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for slot 16.
2016-06-13T02:10:25.244288-07:00 INFO kernel: [  261.900011] usb 1-2.4: new low-speed USB device number 21 using xhci_hcd
2016-06-13T02:10:25.318318-07:00 ERR kernel: [  261.974013] usb 1-2.4: device descriptor read/64, error -32
2016-06-13T02:10:25.419318-07:00 ERR kernel: [  262.074862] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for slot 20.
2016-06-13T02:10:25.492289-07:00 INFO kernel: [  262.147850] usb 1-2.4: new low-speed USB device number 22 using xhci_hcd
2016-06-13T02:10:25.577317-07:00 ERR kernel: [  262.232840] usb 1-2.4: device descriptor read/64, error -32
2016-06-13T02:10:25.678302-07:00 ERR kernel: [  262.333842] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for slot 21.
2016-06-13T02:10:26.728319-07:00 INFO kernel: [  263.383483] usb 1-2.4: new low-speed USB device number 24 using xhci_hcd
2016-06-13T02:10:26.802315-07:00 ERR kernel: [  263.457475] usb 1-2.4: device descriptor read/64, error -32
2016-06-13T02:10:26.903320-07:00 ERR kernel: [  263.558248] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for slot 23

Thanks!
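
A quick way to spot this pattern in a large messages file is to count, per USB port, how many "new ... USB device" attempts and descriptor read errors occur. The sketch below assumes the log line format quoted above; the function names and the `min_attempts=3` threshold are arbitrary choices for illustration.

```python
import re
from collections import Counter

_NEW_DEVICE = re.compile(r'usb (\S+): new (\S+) USB device number (\d+)')
_DESC_ERROR = re.compile(r'usb (\S+): device descriptor read/64, error (-?\d+)')


def summarize_enumeration(log_text):
    """Count enumeration attempts and descriptor read errors per USB port."""
    attempts = Counter()
    errors = Counter()
    for line in log_text.splitlines():
        m = _NEW_DEVICE.search(line)
        if m:
            attempts[m.group(1)] += 1
            continue
        m = _DESC_ERROR.search(line)
        if m:
            errors[m.group(1)] += 1
    return attempts, errors


def ports_stuck_enumerating(log_text, min_attempts=3):
    """Ports with repeated enumeration attempts and at least one read error."""
    attempts, errors = summarize_enumeration(log_text)
    return sorted(port for port, n in attempts.items()
                  if n >= min_attempts and errors[port] > 0)
```

On the excerpt above this would flag port 1-2.4, since it shows repeated "new low-speed USB device" lines interleaved with "device descriptor read/64, error -32".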

Comment 25 by ka...@chromium.org, Jun 13 2016

Thanks Jimmy, I'll remove servo at android1758-audiobox4-host1 and re-test with the same USB hub as you did with yuna, and update status.

Comment 26 by ka...@chromium.org, Jun 13 2016

First I hit a connectivity problem and reset the DUT while transitioning to Dev mode and then back to Normal mode.
Then, without servo (even the flex cable removed), I reran the USB playback and record tests several times. The issue still persists intermittently.

I guess I should replace the DUT and try again.
Another workaround would be to not test paine and use yuna as the main board-family representative, but I am not sure how tests are going to run on it.
Thank you Kalin. Replacing the paine DUT might be worth a try.
Sorry, I still cannot find an auron_paine here.
I found that the test is very stable on auron_yuna. I can pass the USB playback/record test 20 times in a row.
I think I might be able to get an auron_paine from Chromestop. I will check and update.
Thanks!

Comment 29 by ka...@chromium.org, Jun 14 2016

Hi Jimmy,
I swapped auron_paine with auron_yuna. So we'll now have more results (with audio tests) for yuna, in the audio box at android1758-audiobox4-host1. And paine at chromeos1-row1-rack3-host4 will run just display tests. We'll add an audio board so we get some audio results for paine too.

Let's see how these boards run for the next day or two.
Thank you Kalin!
I have tested auron_paine locally too. The test result (with kernel 4.2) is very stable: 40/40 USB playback/record tests passed.

I deployed the 4.2 kernel to chromeos1-row1-rack3-host4 as well.
So, we will get test results for the 4.2 kernel with both paine and yuna.

Thanks!


Comment 31 by ka...@chromium.org, Jun 14 2016

Now the 'RPC error: usb.plug' failure happens for auron_yuna in audio_AudioBasicUSBPlayback and audio_AudioBasicUSBRecord on build R53-8451.0.0.

Some other failures are observed on audio tests.
So, I am pretty much convinced that the chameleon (FPGA) is behind the issues observed.
I'll replace the chameleon board at the audio box.

Comment 32 by ka...@chromium.org, Jun 15 2016

I replaced only the FPGA board, and left auron_yuna to run six jobs through the previously failing USB audio tests.
http://cautotest/afe/#tab_id=view_host&object_id=4928

Let's see if this fixes things.

Comment 33 by ka...@chromium.org, Jun 15 2016

Sadly it did not.
Two out of the four custom jobs I rescheduled for USB audio tests FAILED with "RPC error: usb.plug".
Screenshot: https://screenshot.googleplex.com/hQ88Bkiz3kV

Since then three jobs were scheduled and they all passed - https://screenshot.googleplex.com/A6Z8WYOiiJ8
Let's give it a few more days, though I am not confident this issue is gone.

Hi Kalin, is it using the same setup as in #23 (same USB hub, no servo)? Or maybe there is a bad USB cable?

I should probably find another chameleon board and do more tests locally, and ship one chameleon to you if it is stable.
Thanks!

Comment 35 by ka...@chromium.org, Jun 15 2016

Yes, it connects to the right USB port through a Transcend USB hub for the three connections: servo, Ethernet, and chameleon.

Let me check the cable and do more testing.

Comment 36 by ka...@chromium.org, Jun 16 2016

Sridhar replaced the USB cable between the DUT and chameleon yesterday, and so far three M53 builds are passing - https://screenshot.googleplex.com/12gZkHypFC6

This might be the real issue.
I get the USB cables (Type-A to mini-USB) from the chameleon FPGA bundles, and I guess we got a bad cable in the pack.
This is great! Let's monitor it for a few days before closing this issue.
Thank you Sridhar and Kalin!

Comment 38 by ka...@chromium.org, Jun 17 2016

Status: Verified (was: Started)
The last failures on all channels were on June 14th. Since then (after the cable replacement), no failure has been observed.
