New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 850222 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Jul 19
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug-Regression

Blocking:
issue 800782



Sign in to add a comment

veyron hardware does not work with some USB touch devices like ELO2002 (dwc2 bug)

Project Member Reported by malaykeshav@chromium.org, Jun 6 2018

Issue description

Device: AOPEN chromebox mini
Peripheral: ELO Touchscreen 2002

Was working in m63
Fails in testing in 64, 65, 66 and 67 (dev)

- Issue is tricky to reproduce in dev mode but common to repro in verified mode which may indicate a timing issue in firmware or kernel (since dev mode is often slower to pass on to kernel after manual CTRL+D or 30 seconds).


 
5475723567521897320.jpg
515 KB View Download
Blocking: 800782
Thanks for creating the separate bug. Is it possible for me to share the build you tested with the customer? It would be good to have them test in their environment. 
The fix has not yet landed so I am not sure how to share the build. Also I am not aware on how the kernel updates work and what release cycle they follow to know when they will be available to the customer.

We might be able to flash a USB drive with the build to share with the customer.
We also have the Fievel device(@jayhlee) with the patch loaded.


Cc: alexpau@chromium.org w...@rock-chips.com zhen...@rock-chips.com
Summary: veyron hardware does not work with some USB touch devices like ELO2002 (dwc2 bug) (was: Kernel v3.14 does not work with some touch devices like ELO2002)

Comment 6 by eryen@chromium.org, Jun 11 2018

@malaykeshav: The customer is familiar with flashing USB drives with a .bin image - can we share an image in this format with the customer?

@dtor: Can we merge the proposed change into a stable version that is coming soon? The customer is not happy that they are already 4 major versions behind.
Cc: ovanieva@chromium.org
I think dtor@ is trying to get something that's upstream happy and needed help testing some variants of his patch since he doesn't have access to the device himself...
eryen@ can the partner test when the fix gets to Dev or canary?

Comment 10 by jayhlee@google.com, Jun 11 2018

dianders: I have the ELO screen and setup now and am glad to do any necessary testing.

Comment 11 by jayhlee@google.com, Jun 11 2018

ovanieva: yes, customer can test in canary / dev but if it's going to take awhile to land in the tree then we'd prefer to get them something unofficial to confirm the problem is where we think it is.

Comment 12 by dtor@google.com, Jun 11 2018

Unfortunately my proposed change does break the devices that we were trying to fix to begin with, so we can't ship it yet.

Comment 13 by eryen@google.com, Jun 12 2018

Thanks. I will let the customer know that we are still working on testing the proposed change and will ask them to test in canary / dev once ready.

Comment 14 by jayhlee@google.com, Jun 13 2018

Thanks dtor, please do keep us updated and let me know if you need any testing with the ELO 2002L unit and a veyron-fievel.

Comment 15 by jayhlee@google.com, Jun 18 2018

dtor@ installed a new kernel he built on the NYC fievel and ELO 2002L setup we have and I was unable to reproduce the touch issue after that. It's hard to 100% confirm fix because repro is much harder in dev mode but I think it's worth proceeding with getting CL in canary and we can test further there in verified mode.

Comment 16 by dtor@google.com, Jun 18 2018

We are landing https://crrev.com/c/1103455 that should fix the issue.
Project Member

Comment 17 by bugdroid1@chromium.org, Jun 20 2018

Labels: merge-merged-chromeos-3.14
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/e230ca2310f50000123ae047622d54a9fe56b030

commit e230ca2310f50000123ae047622d54a9fe56b030
Author: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Date: Wed Jun 20 02:11:13 2018

FROMLIST: usb: dwc2: host: do not delay retries for CONTROL IN transfers

When handling split transactions we will try to delay retry after
getting a NAK from the device. This works well for BULK transfers that
can be polled for essentially forever. Unfortunately, on slower systems
at boot time, when the kernel is busy enumerating all the devices (USB
or not), we issue a bunch of control requests (reading device
descriptors, etc). If we get a NAK for the IN part of the control
request and delay retry for too long (because the system is busy), we
may confuse the device when we finally get to reissue SSPLIT/CSPLIT IN
and the device will respond with STALL. As a result we end up with
failure to get device descriptor and will fail to enumerate the device:

[    3.428801] usb 2-1.2.1: new full-speed USB device number 9 using dwc2
[    3.508576] usb 2-1.2.1: device descriptor read/8, error -32
[    3.699150] usb 2-1.2.1: device descriptor read/8, error -32
[    3.891653] usb 2-1.2.1: new full-speed USB device number 10 using dwc2
[    3.968859] usb 2-1.2.1: device descriptor read/8, error -32
...

Let's not delay retries of split CONTROL IN transfers, as this allows us
to reliably enumerate devices at boot time.

Fixes 38d2b5fb75c1 ("usb: dwc2: host: Don't retry NAKed transactions right away")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
(am from https://patchwork.kernel.org/patch/10467735/)

BUG= chromium:850222 
TEST=Boot fievel with external touchscreen

Change-Id: I21bdd3e17f80c89ea4f591cba44ebf3a7548da85
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1103455
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>

[modify] https://crrev.com/e230ca2310f50000123ae047622d54a9fe56b030/drivers/usb/dwc2/hcd_intr.c

Labels: M-68 Merge-Request-68
Landed on M-69.  Presumably we'll want a merge for M-68.
Project Member

Comment 19 by sheriffbot@chromium.org, Jun 20 2018

Labels: -Merge-Request-68 Hotlist-Merge-Review Merge-Review-68
This bug requires manual review: M68 has already been promoted to the beta branch, so this requires manual review
Please contact the milestone owner if you have questions.
Owners: cmasso@(Android), kariahda@(iOS), bhthompson@(ChromeOS), abdulsyed@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
The customer is reporting that they have tested the fix and it does not resolve the issue for them. 

"I updated one of our Chromebox Mini devices in the lab to 69.0.3464.0 and connected it to an Elo 2002L.  It is still exhibiting the exact same behavior where I cannot use the touchscreen upon initial bootup until I reconnect the USB cable or power cycle the monitor."

Comment 21 by jayhlee@google.com, Jun 20 2018

pnevin: customer needs to be on 10801.0.0 or newer to get this recent fix. That version is not on fievel canary channel but should be in next day or two. They need to test then.
Labels: -Hotlist-Merge-Review -Merge-Review-68 Merge-Approved-68
I'm planning to wait on actually merging until I see for sure that customer has confirmed the fix.
Project Member

Comment 24 by sheriffbot@chromium.org, Jun 25 2018

Cc: bhthompson@google.com diand...@chromium.org
This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible!

If all merges have been completed, please remove any remaining Merge-Approved labels from this issue.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
@24: Someone needs to use AI to make the bot read my comment from @23.

...speaking of which, do we have confirmation?  I suppose we can merge proactively anyway but I'd rather have confirmation...
I just got cofirmation from the customer that this issue appears to be resolved for them. Please move forward with the marge. Thank you!

"69.0.3464.0 Platform version 10801.0.0 fixes the issue!  I was able to complete my testing in the lab today on the AOpen Fievel Chromebox Mini and the Elo 2002L that we are using for our Kiosk."
Project Member

Comment 27 by bugdroid1@chromium.org, Jun 25 2018

Labels: merge-merged-release-R68-10718.B-chromeos-3.14
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/3e287c1d3048d1814d6f246cfd7b2d9fe0539b50

commit 3e287c1d3048d1814d6f246cfd7b2d9fe0539b50
Author: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Date: Mon Jun 25 19:56:09 2018

FROMLIST: usb: dwc2: host: do not delay retries for CONTROL IN transfers

When handling split transactions we will try to delay retry after
getting a NAK from the device. This works well for BULK transfers that
can be polled for essentially forever. Unfortunately, on slower systems
at boot time, when the kernel is busy enumerating all the devices (USB
or not), we issue a bunch of control requests (reading device
descriptors, etc). If we get a NAK for the IN part of the control
request and delay retry for too long (because the system is busy), we
may confuse the device when we finally get to reissue SSPLIT/CSPLIT IN
and the device will respond with STALL. As a result we end up with
failure to get device descriptor and will fail to enumerate the device:

[    3.428801] usb 2-1.2.1: new full-speed USB device number 9 using dwc2
[    3.508576] usb 2-1.2.1: device descriptor read/8, error -32
[    3.699150] usb 2-1.2.1: device descriptor read/8, error -32
[    3.891653] usb 2-1.2.1: new full-speed USB device number 10 using dwc2
[    3.968859] usb 2-1.2.1: device descriptor read/8, error -32
...

Let's not delay retries of split CONTROL IN transfers, as this allows us
to reliably enumerate devices at boot time.

Fixes 38d2b5fb75c1 ("usb: dwc2: host: Don't retry NAKed transactions right away")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
(am from https://patchwork.kernel.org/patch/10467735/)

BUG= chromium:850222 
TEST=Boot fievel with external touchscreen

Change-Id: I21bdd3e17f80c89ea4f591cba44ebf3a7548da85
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1103455
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
(cherry picked from commit e230ca2310f50000123ae047622d54a9fe56b030)
Reviewed-on: https://chromium-review.googlesource.com/1113758

[modify] https://crrev.com/3e287c1d3048d1814d6f246cfd7b2d9fe0539b50/drivers/usb/dwc2/hcd_intr.c

Labels: -Merge-Approved-68 Merge-Merged
Status: Fixed (was: Assigned)
Unfortunately I got a follow-up email from my customer today that the issue has *not* been fixed for them. 

The testing he did yesterday was with a device in developer mode (canary channel). Today the customer did additional testing and started the device in verified boot mode. When that happens 100% of the time the touchscreen device fails to work on boot up. Only unplugging and plugging the touchscreen back in will get it to work.

Logs for customer repro are available on Google Drive. Request access at https://drive.google.com/corp/drive/folders/1Sf2ZYWMdi9ZV74JPgpfIKSJJaV9pT8Bi
Status: Assigned (was: Fixed)

Comment 32 by jayhlee@google.com, Jun 27 2018

I'm testing with ELO 2002L and Fievel and can reproduce the same issues on canary 10821.0.0 verified mode. The fix does not seem to solve the issue here.

dtor@ at this point do you want me to see about shipping the ELO 2002L to you?

Comment 33 by dtor@google.com, Jun 28 2018

Might as well, but I will be OOO starting next week so you might want to see if there is someone else who could take a look at this.
(have not shipped ELO screen yet since not clear someone in MTV can use until dtor return)

From the logs on canary 10833.0.0:

[    2.138591] usb 2-1.1.1: new full-speed USB device number 4 using dwc2
[    2.318397] usb 2-1.1.1: device descriptor read/all, error -32
[    2.398778] usb 2-1.1.1: new full-speed USB device number 5 using dwc2
[    2.578349] usb 2-1.1.1: device descriptor read/all, error -32
[    2.658611] usb 2-1.1.1: new full-speed USB device number 6 using dwc2
[    2.738329] usb 2-1.1.1: device descriptor read/8, error -32
[    2.928364] usb 2-1.1.1: device descriptor read/8, error -32
[    3.118611] usb 2-1.1.1: new full-speed USB device number 7 using dwc2
[    3.198364] usb 2-1.1.1: device descriptor read/8, error -32
[    3.388373] usb 2-1.1.1: device descriptor read/8, error -32
[    3.498597] hub 2-1.1:1.0: unable to enumerate USB device on port 1

still doing some testing but issue seems not resolved on current canary.
Cc: dtor@chromium.org
Owner: puneetster@chromium.org
Puneet,

  Assigning this to you as dtor@ is OoO until 18th and this is very high priority and needs immediate attention, please ping me for full details if needed.

  Can we get another eng resource assigned to this ASAP? I can ship the touchscreen to the resource once assigned.
Cc: puneetster@chromium.org
Owner: diand...@chromium.org
Doug seems to have context around this bug so will have to check with him. Assigning to him temporarily until we know more. Unfortunately Doug is OOO today.  I realize this is a high priority item but given that its been open for a while, let's wait for Doug to come back first.
As an update I've shipped the ELO 2002L unit to Doug who will take shipment of it.
Here's a patch that works for me:

https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/1103456

I took something that was part of Dmitry's original series and fixed a use-after-free bug with it.
Since Dmitry will be back tomorrow, will wait for him to give it a +2 rather than getting another person up to speed.  Then we can work on getting this verified (again) and backported to whatever stable branches are relevant at the time.
Project Member

Comment 40 by bugdroid1@chromium.org, Jul 19

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/c3b0ed1cc55da63d8314c2b8c9bce8a24d152427

commit c3b0ed1cc55da63d8314c2b8c9bce8a24d152427
Author: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Date: Thu Jul 19 09:52:03 2018

FROMLIST: usb: dwc2: host: do not delay retries after successful transfer

When handling split transactions we should not be delaying retrying
SSPLIT/CSPLIT after we successfully communicate with the device, so
let's reset dtd->num_naks counter when handling XFERCOMPL.

Fixes: 38d2b5fb75c1 ("usb: dwc2: host: Don't retry NAKed transactions right away")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
(am from https://patchwork.kernel.org/patch/10530517/)

BUG= chromium:850222 
TEST=Boot fievel with external touchscreen

Change-Id: I4313b19aa245552588ff47b5d366f35ef22cdfe7
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1103456
Reviewed-by: Dmitry Torokhov <dtor@chromium.org>

[modify] https://crrev.com/c3b0ed1cc55da63d8314c2b8c9bce8a24d152427/drivers/usb/dwc2/hcd_intr.c

Labels: -M-68 M-69
The above looks like it (barely) made it into 10891.0.0.  For I can try out that image today and confirm that it works.  For completeness: jayhlee can you also confirm that your customer is happy?

Other notes:

* I personally didn't see any need to switch to "normal" mode to see the bug.  For me with the one patch we had before things sometimes worked and sometimes didn't but it was pretty easy to see it fail.

* It seems unlikely at this point that we'd merge this to M-68.  It looks like M-68 is right around the "stable" milestone and pushing a USB-related patch in so late seems like it's asking for disaster.

Status: Fixed (was: Assigned)
OK, I put 10891.0.0 on an everything looks peachy-keen.  Declaring this bug as Fixed at M-69.
customer's initial reports are good that this issue is solved.

Sign in to add a comment