New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 723145 link

Starred by 2 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 2
Type: Bug

Blocked on:
issue 722063

Blocking:
issue 537368



Sign in to add a comment

libusb built w/clang fails w/NULL pointer access

Project Member Reported by yueherngl@chromium.org, May 17 2017

Issue description

Starting from last week, I have problem connecting with servo, the error message was,

~/trunk/src/scripts $ sudo servod -b asuka
2017-05-16 15:09:30,922 - servod - INFO - Start
Traceback (most recent call last):
  File "/usr/lib/python-exec/python2.7/servod", line 9, in <module>
    load_entry_point('servo===0.0.1-027193a', 'console_scripts', 'servod')()
  File "/usr/lib64/python2.7/site-packages/servo/servod.py", line 529, in main
    main_function()
  File "/usr/lib64/python2.7/site-packages/servo/servod.py", line 446, in main_function
    multiservo.parse_rc(logger, options.rcfile))
  File "/usr/lib64/python2.7/site-packages/servo/servod.py", line 326, in discover_servo
    all_servos.extend(usb_find(vid, pid, serialname))
  File "/usr/lib64/python2.7/site-packages/servo/servod.py", line 167, in usb_find
    for bus in usb.busses():
  File "/usr/lib64/python2.7/site-packages/usb/legacy.py", line 343, in <genexpr>
    return (Bus(g) for k, g in _interop._groupby(
  File "/usr/lib64/python2.7/site-packages/usb/legacy.py", line 339, in __init__
    self.devices = [Device(d) for d in devices]
  File "/usr/lib64/python2.7/site-packages/usb/legacy.py", line 324, in __init__
    self.configurations = [Configuration(c) for c in dev]
  File "/usr/lib64/python2.7/site-packages/usb/legacy.py", line 127, in __init__
    [Interface(i) for i in cfg],
  File "/usr/lib64/python2.7/site-packages/usb/legacy.py", line 113, in __init__
    self.endpoints = [Endpoint(e) for e in intf]
  File "/usr/lib64/python2.7/site-packages/usb/core.py", line 376, in __iter__
    self.configuration
  File "/usr/lib64/python2.7/site-packages/usb/core.py", line 258, in __init__
    configuration
  File "/usr/lib64/python2.7/site-packages/usb/backend/libusb1.py", line 729, in get_endpoint_descriptor
    return _WrapDescriptor(i.endpoint[ep], i)
ValueError: NULL pointer access

After bisected various builds, I found the problem happened starting from 9535.0.0. Among the changes, this (https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/95789da27d19d74d7032b36d1c1872860ce77ffd%5E%21/) looks to be the suspicious. After I reverted the change, the problem is gone.

Just wondering if any change in SDK and/or toolchain causing this problem?

Note that our partner reported the same issue as well.

Thanks.
 
Cc: vapier@chromium.org josa...@chromium.org tbroch@chromium.org
Summary: servo connection problem / libusb failure (was: servo connection problem)
+cc a few more people

From investigating on YH's system, it seems like libusb or pyusb is messed up somehow. This failure seems to be a null pointer access by libusb (?) while iterating through endpoints. This is also reproducible with platform/ec/extras/usb_serial/console.py given a suzyq or servo micro.

Comment 2 by vapier@chromium.org, May 17 2017

before debugging too much, the easiest thing would be to just try some upgrades.  looks like there are newer versions of both pyusb and libusb1 available.

Comment 3 Deleted

Labels: -Pri-3 Pri-0
Apparently a problem for everyone, partners are complaining, and partners are blocked. I'm going to revert to the old sdk to unblock unless anyone had a better approach to get this fixed today.
Owner: vapier@chromium.org

Comment 6 by vapier@chromium.org, May 17 2017

err, hold up.  don't go reverting SDKs.  that has a number of ramifications and will simply get reverted as soon as the chromiumos-sdk bot passes again (which it normally does at least once a day).

Comment 7 by vapier@chromium.org, May 17 2017

also, that sdk version is only used as the base sdk version when installing a fresh chroot ... it would have no impact on existing users, and anyone running update_chroot would update past it.
The existing sdk build has apparently been broken for like a week, so no worries about a revert getting clobbered. Also, YH has confirmed that reverting the sdk does in fact fix the problem.

Comment 9 by vapier@chromium.org, May 17 2017

that isn't terribly precise.  i could revert that CL locally, run update_chroot, and i'm sure it'll continue to fail.  you'd have to revert the CL, download a completely new chroot, and then not update things inside it.

still seems like the easiest thing to try out would be to grab some package updates.

note: i've never used servod, nor do i know how, so i don't think i'm the best owner here.
Labels: OS-All
Unfortunately it seems like you are the only one who actually knows how the SDK works, unless you can suggest someone better.

easy repro steps:

cros_sdk --delete
cros_sdk --create
cros_sdk --no-ns-pid
$ sudo servod
2017-05-17 13:40:52,759 - servod - INFO - Start
2017-05-17 13:40:52,812 - servod - ERROR - No servos found

Passing. 

cros_sdk --delete
cros_sdk --create
cros_sdk --no-ns-pid
$ sudo servod
...
ValueError: NULL pointer access

Failing.

No knowledge of servod or actual servo is necessary. Do you think you can fix today or prefer to back out to the old sdk and get a fix in later?


you should insert `./update_chroot` before running servod and see if it still fails.

again, "backing out the sdk" isn't going to work the way you think it is.  that sdk version is simply the base tarball that gets downloaded.  then update_chroot runs and updates all the sdk packages and gets things back to ToT behavior.  and any attempts to revert or otherwise modify that file will be automatically clobbered by our bots.
update chroot doesn't change anything. Please try it yourself.
Labels: -Pri-0 Pri-2
it will eventually.  all sdks are "eventually consistent".  so whatever broken package is in the new sdk will make it out to all developers when it is upgraded or rebuilt.

so to reiterate: reverting the SDK version (like the referenced one) would not help.  as you can see in that commit, it was created by a bot.

so you can either try an upgrade of these packages as i suggested, or take a working chroot and merge in the changes from the failing chroot to try and narrow down where it went wrong.

pretty sure if you take your first example and just run:
$ sudo emerge -G libusb:1

it'll start failing too.  rebuild it w/gcc and it'll work fine again.

the SDK environment is owned by everyone.  we are all responsible for keeping it healthy.
Owner: cmt...@chromium.org
Status: Available (was: Untriaged)
Summary: libusb built w/clang fails w/NULL pointer access (was: servo connection problem / libusb failure)
Labels: -Pri-2 Pri-0
Labels: -Pri-0 Pri-2
this isn't a pri-0 as you have a trivial workaround:
  https://chromium-review.googlesource.com/507988
Labels: -Pri-2 Pri-0
It's not a fix unless you submit it. So either commit or leave p0
RE: #16

Just wondering why https://chromium-review.googlesource.com/507988 is being marked as HACK? If libusb does not work when built by clang and building it with gcc addresses the problem, then the CL should be considered a proper fix?

Thanks.

Cc: -vapier@chromium.org
The proper fix is probably something in clang so it aligns some obscure behavior more with what gcc does, or possibly something in libusb to have it not break on some obscure compiler behavior.

Hack or not, using gcc while we figure out why clang built libusb is borked seems reasonable. 

With the CL in comment 16 I can see the issue fixed:

0. Start with old chroot that works ok.
1. sudo emerge libusb
2. sudo servod #Bad error :(
3. cherry-pick in the patch...
4. sudo emerge libusb
5. sudo servod #Good error (no servo detected) :)

So with this CL, any partners whom have been impacted should be instructed to run the following commands:

cd ~/trunk/src/third_party/chromiumos-overlay
git fetch https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay refs/changes/88/507988/1 && git cherry-pick FETCH_HEAD
sudo emerge libusb

In answer to comment 18, no that's not a real fix because the long-term plan is to pretty much stop supporting GCC and have everything building with clang.  So the correct 'real' fix would be to figure out what llvm/clang is doing that is causing the package to break, and to fix that. 

In the short term, building your package with GCC is an OK workaround.

This repro's for me outside of servod,

python -c 'import usb;usb.busses().next()'
RE: #21

That also means all partners need to go through this workaround if I understand it correctly. That can be lots of head-scratching, IMHO.

In this case, why not supplying the gcc-built libusb (until we fix the real issue) to save their hassles?

My 2 cents.
There is a CL in the process that will build libusb with GCC until the real issue is fixed.
After this comment 16 CL lands, and the sdk gets updated (pending https://bugs.chromium.org/p/chromium/issues/detail?id=722063 ) we will be supplying the gcc-built libusb, at which point any newly created chroot will be fine.

If partners (or impacted Googlers for that matter) want it to be working now, they can run the three commands listed in comment 20 in their chroot. 

This is probably PSA worthy if we believe many engineers are having servo trouble (perhaps silently).
Labels: -Pri-0 Pri-2
Lowering priority as TOT should still function while root cause is being investigated.
Labels: Build-Toolchain
Blockedon: 722063
I believe this needs to be marked as blocked on 722063
RE: #24

What is the CL being referred, as https://chromium-review.googlesource.com/507988 has been abandoned?
Thanks.
It's now https://chromium-review.googlesource.com/#/c/508253/.
Thanks Bernie.
Project Member

Comment 31 by bugdroid1@chromium.org, May 19 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/2840f9a2d1e9941ef7bf9b92940997ad40fad037

commit 2840f9a2d1e9941ef7bf9b92940997ad40fad037
Author: Nick Sanders <nsanders@chromium.org>
Date: Fri May 19 20:57:47 2017

[HACK] libusb: build w/gcc

Clang build causes libusb to break, add a build exception
for libusb to use gcc. TODO(cmtrice): fix libusb under clang
and revert this change.

BUG=chromium:723145
TEST=rebuild libusb in sdk and `sudo servod` no longer crashes

Change-Id: I1d3f4c1d863721fbcff009dde0429043c85b27ea
Reviewed-on: https://chromium-review.googlesource.com/508253
Commit-Ready: YH Lin <yueherngl@chromium.org>
Tested-by: Nick Sanders <nsanders@chromium.org>
Tested-by: YH Lin <yueherngl@chromium.org>
Tested-by: Bernie Thompson <bhthompson@chromium.org>
Reviewed-by: Caroline Tice <cmtice@chromium.org>
Reviewed-by: YH Lin <yueherngl@chromium.org>
Reviewed-by: Bernie Thompson <bhthompson@chromium.org>

[add] https://crrev.com/2840f9a2d1e9941ef7bf9b92940997ad40fad037/chromeos/config/env/dev-libs/libusb

With ToT this morning the problem is gone. Thanks.
it is good we have a workaround. But this is only a workaround.

But can anyone try to figure out the problem with LLVM. This is likely to point to a real problem in the libusb. 

Vapier suggested you tried newer versions of libusb. Can you try that? 
Cc: gkihumba@chromium.org
I tried the newer libusb (1.0.21) and found similar problem.
if you have a CL to update packages, please post it so we can get it merged.  we want to update regardless of this bug.
Oh sure, please have a look, https://chromium-review.googlesource.com/#/c/513502/

Hopefully I'm updating it the right way...

Thanks.
Ok, thanks. It will be good to get this updated.

Does any of you have any expertisse with this package so that you can do more triaging? Otherwise, we will just have to do automatic bisection of object files to triage the issue. 
Project Member

Comment 39 by bugdroid1@chromium.org, May 26 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/portage-stable/+/bd16cba511f10d99cc1b3149ea1f7349fed2372b

commit bd16cba511f10d99cc1b3149ea1f7349fed2372b
Author: YH Lin <yueherngl@google.com>
Date: Fri May 26 00:15:59 2017

libusb: upgraded package to upstream

Upgraded dev-libs/libusb to version 1.0.21 on amd64

Updating libusb from 1.0.20 to 1.0.21 (amd64), even though
similar problem can still be found (gcc build works but
not llvm build).

BUG=chromium:723145
TEST=sudo emerge libusb

Change-Id: I80187b39f3ab5dbaecce87588b1c0a6c8a193a83
Reviewed-on: https://chromium-review.googlesource.com/513502
Commit-Ready: YH Lin <yueherngl@chromium.org>
Tested-by: YH Lin <yueherngl@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/bd16cba511f10d99cc1b3149ea1f7349fed2372b/dev-libs/libusb/Manifest
[delete] https://crrev.com/1d1d6042659c26a8fef0fddb516e4e97930672cd/metadata/md5-cache/dev-libs/libusb-1.0.19
[rename] https://crrev.com/bd16cba511f10d99cc1b3149ea1f7349fed2372b/dev-libs/libusb/libusb-1.0.21.ebuild
[modify] https://crrev.com/bd16cba511f10d99cc1b3149ea1f7349fed2372b/dev-libs/libusb/metadata.xml
[add] https://crrev.com/bd16cba511f10d99cc1b3149ea1f7349fed2372b/metadata/md5-cache/dev-libs/libusb-1.0.21

Components: Tools>ChromeOS-Toolchain
Blocking: 537368
Owner: llozano@chromium.org
Status: Assigned (was: Available)

Sign in to add a comment