Issue metadata
Sign in to add a comment
|
veyron hardware does not work with some USB touch devices like ELO2002 (dwc2 bug) |
||||||||||||||||||||||
Issue descriptionDevice: AOPEN chromebox mini Peripheral: ELO Touchscreen 2002 Was working in m63 Fails in testing in 64, 65, 66 and 67 (dev) - Issue is tricky to reproduce in dev mode but common to repro in verified mode which may indicate a timing issue in firmware or kernel (since dev mode is often slower to pass on to kernel after manual CTRL+D or 30 seconds).
,
Jun 6 2018
Thanks for creating the separate bug. Is it possible for me to share the build you tested with the customer? It would be good to have them test in their environment.
,
Jun 6 2018
The fix has not yet landed so I am not sure how to share the build. Also I am not aware on how the kernel updates work and what release cycle they follow to know when they will be available to the customer. We might be able to flash a USB drive with the build to share with the customer. We also have the Fievel device(@jayhlee) with the patch loaded.
,
Jun 6 2018
,
Jun 6 2018
Currently proposed fix at <https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/1087508>.
,
Jun 11 2018
@malaykeshav: The customer is familiar with flashing USB drives with a .bin image - can we share an image in this format with the customer? @dtor: Can we merge the proposed change into a stable version that is coming soon? The customer is not happy that they are already 4 major versions behind.
,
Jun 11 2018
,
Jun 11 2018
I think dtor@ is trying to get something that's upstream happy and needed help testing some variants of his patch since he doesn't have access to the device himself...
,
Jun 11 2018
eryen@ can the partner test when the fix gets to Dev or canary?
,
Jun 11 2018
dianders: I have the ELO screen and setup now and am glad to do any necessary testing.
,
Jun 11 2018
ovanieva: yes, customer can test in canary / dev but if it's going to take awhile to land in the tree then we'd prefer to get them something unofficial to confirm the problem is where we think it is.
,
Jun 11 2018
Unfortunately my proposed change does break the devices that we were trying to fix to begin with, so we can't ship it yet.
,
Jun 12 2018
Thanks. I will let the customer know that we are still working on testing the proposed change and will ask them to test in canary / dev once ready.
,
Jun 13 2018
Thanks dtor, please do keep us updated and let me know if you need any testing with the ELO 2002L unit and a veyron-fievel.
,
Jun 18 2018
dtor@ installed a new kernel he built on the NYC fievel and ELO 2002L setup we have and I was unable to reproduce the touch issue after that. It's hard to 100% confirm fix because repro is much harder in dev mode but I think it's worth proceeding with getting CL in canary and we can test further there in verified mode.
,
Jun 18 2018
We are landing https://crrev.com/c/1103455 that should fix the issue.
,
Jun 20 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/e230ca2310f50000123ae047622d54a9fe56b030 commit e230ca2310f50000123ae047622d54a9fe56b030 Author: Dmitry Torokhov <dmitry.torokhov@gmail.com> Date: Wed Jun 20 02:11:13 2018 FROMLIST: usb: dwc2: host: do not delay retries for CONTROL IN transfers When handling split transactions we will try to delay retry after getting a NAK from the device. This works well for BULK transfers that can be polled for essentially forever. Unfortunately, on slower systems at boot time, when the kernel is busy enumerating all the devices (USB or not), we issue a bunch of control requests (reading device descriptors, etc). If we get a NAK for the IN part of the control request and delay retry for too long (because the system is busy), we may confuse the device when we finally get to reissue SSPLIT/CSPLIT IN and the device will respond with STALL. As a result we end up with failure to get device descriptor and will fail to enumerate the device: [ 3.428801] usb 2-1.2.1: new full-speed USB device number 9 using dwc2 [ 3.508576] usb 2-1.2.1: device descriptor read/8, error -32 [ 3.699150] usb 2-1.2.1: device descriptor read/8, error -32 [ 3.891653] usb 2-1.2.1: new full-speed USB device number 10 using dwc2 [ 3.968859] usb 2-1.2.1: device descriptor read/8, error -32 ... Let's not delay retries of split CONTROL IN transfers, as this allows us to reliably enumerate devices at boot time. Fixes 38d2b5fb75c1 ("usb: dwc2: host: Don't retry NAKed transactions right away") Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> (am from https://patchwork.kernel.org/patch/10467735/) BUG= chromium:850222 TEST=Boot fievel with external touchscreen Change-Id: I21bdd3e17f80c89ea4f591cba44ebf3a7548da85 Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-on: https://chromium-review.googlesource.com/1103455 Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com> [modify] https://crrev.com/e230ca2310f50000123ae047622d54a9fe56b030/drivers/usb/dwc2/hcd_intr.c
,
Jun 20 2018
Landed on M-69. Presumably we'll want a merge for M-68.
,
Jun 20 2018
This bug requires manual review: M68 has already been promoted to the beta branch, so this requires manual review Please contact the milestone owner if you have questions. Owners: cmasso@(Android), kariahda@(iOS), bhthompson@(ChromeOS), abdulsyed@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jun 20 2018
The customer is reporting that they have tested the fix and it does not resolve the issue for them. "I updated one of our Chromebox Mini devices in the lab to 69.0.3464.0 and connected it to an Elo 2002L. It is still exhibiting the exact same behavior where I cannot use the touchscreen upon initial bootup until I reconnect the USB cable or power cycle the monitor."
,
Jun 20 2018
pnevin: customer needs to be on 10801.0.0 or newer to get this recent fix. That version is not on fievel canary channel but should be in next day or two. They need to test then.
,
Jun 20 2018
,
Jun 20 2018
I'm planning to wait on actually merging until I see for sure that customer has confirmed the fix.
,
Jun 25 2018
This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible! If all merges have been completed, please remove any remaining Merge-Approved labels from this issue. Thanks for your time! To disable nags, add the Disable-Nags label. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jun 25 2018
@24: Someone needs to use AI to make the bot read my comment from @23. ...speaking of which, do we have confirmation? I suppose we can merge proactively anyway but I'd rather have confirmation...
,
Jun 25 2018
I just got cofirmation from the customer that this issue appears to be resolved for them. Please move forward with the marge. Thank you! "69.0.3464.0 Platform version 10801.0.0 fixes the issue! I was able to complete my testing in the lab today on the AOpen Fievel Chromebox Mini and the Elo 2002L that we are using for our Kiosk."
,
Jun 25 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/3e287c1d3048d1814d6f246cfd7b2d9fe0539b50 commit 3e287c1d3048d1814d6f246cfd7b2d9fe0539b50 Author: Dmitry Torokhov <dmitry.torokhov@gmail.com> Date: Mon Jun 25 19:56:09 2018 FROMLIST: usb: dwc2: host: do not delay retries for CONTROL IN transfers When handling split transactions we will try to delay retry after getting a NAK from the device. This works well for BULK transfers that can be polled for essentially forever. Unfortunately, on slower systems at boot time, when the kernel is busy enumerating all the devices (USB or not), we issue a bunch of control requests (reading device descriptors, etc). If we get a NAK for the IN part of the control request and delay retry for too long (because the system is busy), we may confuse the device when we finally get to reissue SSPLIT/CSPLIT IN and the device will respond with STALL. As a result we end up with failure to get device descriptor and will fail to enumerate the device: [ 3.428801] usb 2-1.2.1: new full-speed USB device number 9 using dwc2 [ 3.508576] usb 2-1.2.1: device descriptor read/8, error -32 [ 3.699150] usb 2-1.2.1: device descriptor read/8, error -32 [ 3.891653] usb 2-1.2.1: new full-speed USB device number 10 using dwc2 [ 3.968859] usb 2-1.2.1: device descriptor read/8, error -32 ... Let's not delay retries of split CONTROL IN transfers, as this allows us to reliably enumerate devices at boot time. Fixes 38d2b5fb75c1 ("usb: dwc2: host: Don't retry NAKed transactions right away") Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> (am from https://patchwork.kernel.org/patch/10467735/) BUG= chromium:850222 TEST=Boot fievel with external touchscreen Change-Id: I21bdd3e17f80c89ea4f591cba44ebf3a7548da85 Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-on: https://chromium-review.googlesource.com/1103455 Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com> (cherry picked from commit e230ca2310f50000123ae047622d54a9fe56b030) Reviewed-on: https://chromium-review.googlesource.com/1113758 [modify] https://crrev.com/3e287c1d3048d1814d6f246cfd7b2d9fe0539b50/drivers/usb/dwc2/hcd_intr.c
,
Jun 25 2018
,
Jun 26 2018
Unfortunately I got a follow-up email from my customer today that the issue has *not* been fixed for them. The testing he did yesterday was with a device in developer mode (canary channel). Today the customer did additional testing and started the device in verified boot mode. When that happens 100% of the time the touchscreen device fails to work on boot up. Only unplugging and plugging the touchscreen back in will get it to work.
,
Jun 26 2018
Logs for customer repro are available on Google Drive. Request access at https://drive.google.com/corp/drive/folders/1Sf2ZYWMdi9ZV74JPgpfIKSJJaV9pT8Bi
,
Jun 26 2018
,
Jun 27 2018
I'm testing with ELO 2002L and Fievel and can reproduce the same issues on canary 10821.0.0 verified mode. The fix does not seem to solve the issue here. dtor@ at this point do you want me to see about shipping the ELO 2002L to you?
,
Jun 28 2018
Might as well, but I will be OOO starting next week so you might want to see if there is someone else who could take a look at this.
,
Jul 2
(have not shipped ELO screen yet since not clear someone in MTV can use until dtor return) From the logs on canary 10833.0.0: [ 2.138591] usb 2-1.1.1: new full-speed USB device number 4 using dwc2 [ 2.318397] usb 2-1.1.1: device descriptor read/all, error -32 [ 2.398778] usb 2-1.1.1: new full-speed USB device number 5 using dwc2 [ 2.578349] usb 2-1.1.1: device descriptor read/all, error -32 [ 2.658611] usb 2-1.1.1: new full-speed USB device number 6 using dwc2 [ 2.738329] usb 2-1.1.1: device descriptor read/8, error -32 [ 2.928364] usb 2-1.1.1: device descriptor read/8, error -32 [ 3.118611] usb 2-1.1.1: new full-speed USB device number 7 using dwc2 [ 3.198364] usb 2-1.1.1: device descriptor read/8, error -32 [ 3.388373] usb 2-1.1.1: device descriptor read/8, error -32 [ 3.498597] hub 2-1.1:1.0: unable to enumerate USB device on port 1 still doing some testing but issue seems not resolved on current canary.
,
Jul 9
Puneet, Assigning this to you as dtor@ is OoO until 18th and this is very high priority and needs immediate attention, please ping me for full details if needed. Can we get another eng resource assigned to this ASAP? I can ship the touchscreen to the resource once assigned.
,
Jul 9
Doug seems to have context around this bug so will have to check with him. Assigning to him temporarily until we know more. Unfortunately Doug is OOO today. I realize this is a high priority item but given that its been open for a while, let's wait for Doug to come back first.
,
Jul 10
As an update I've shipped the ELO 2002L unit to Doug who will take shipment of it.
,
Jul 17
Here's a patch that works for me: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/1103456 I took something that was part of Dmitry's original series and fixed a use-after-free bug with it.
,
Jul 17
Since Dmitry will be back tomorrow, will wait for him to give it a +2 rather than getting another person up to speed. Then we can work on getting this verified (again) and backported to whatever stable branches are relevant at the time.
,
Jul 19
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/c3b0ed1cc55da63d8314c2b8c9bce8a24d152427 commit c3b0ed1cc55da63d8314c2b8c9bce8a24d152427 Author: Dmitry Torokhov <dmitry.torokhov@gmail.com> Date: Thu Jul 19 09:52:03 2018 FROMLIST: usb: dwc2: host: do not delay retries after successful transfer When handling split transactions we should not be delaying retrying SSPLIT/CSPLIT after we successfully communicate with the device, so let's reset dtd->num_naks counter when handling XFERCOMPL. Fixes: 38d2b5fb75c1 ("usb: dwc2: host: Don't retry NAKed transactions right away") Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> (am from https://patchwork.kernel.org/patch/10530517/) BUG= chromium:850222 TEST=Boot fievel with external touchscreen Change-Id: I4313b19aa245552588ff47b5d366f35ef22cdfe7 Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-on: https://chromium-review.googlesource.com/1103456 Reviewed-by: Dmitry Torokhov <dtor@chromium.org> [modify] https://crrev.com/c3b0ed1cc55da63d8314c2b8c9bce8a24d152427/drivers/usb/dwc2/hcd_intr.c
,
Jul 19
The above looks like it (barely) made it into 10891.0.0. For I can try out that image today and confirm that it works. For completeness: jayhlee can you also confirm that your customer is happy? Other notes: * I personally didn't see any need to switch to "normal" mode to see the bug. For me with the one patch we had before things sometimes worked and sometimes didn't but it was pretty easy to see it fail. * It seems unlikely at this point that we'd merge this to M-68. It looks like M-68 is right around the "stable" milestone and pushing a USB-related patch in so late seems like it's asking for disaster.
,
Jul 19
OK, I put 10891.0.0 on an everything looks peachy-keen. Declaring this bug as Fixed at M-69.
,
Jul 24
customer's initial reports are good that this issue is solved. |
|||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||
Comment 1 by malaykeshav@chromium.org
, Jun 6 2018