
Issue 887667 link

Starred by 4 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




tatl paladin failed libbrillo unittest

Project Member Reported by bhthompson@google.com, Sep 20

Issue description

https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8934839043041472736

https://luci-logdog.appspot.com/v/?s=chromiumos/bb/chromiumos/tatl-paladin/4485/+/recipes/steps/UnitTest/0/stdout
...
libbrillo-0.0.1-r1379: >>> Source compiled.
libbrillo-0.0.1-r1379: >>> Test phase: chromeos-base/libbrillo-0.0.1-r1379
libbrillo-0.0.1-r1379: /build/tatl/tmp/portage/chromeos-base/libbrillo-0.0.1-r1379/work/libbrillo-0.0.1/common-mk/platform2_test.py --action=pre_test --sysroot=/build/tatl -- 
libbrillo-0.0.1-r1379: ERROR: ld.so: object 'libsandbox.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
libbrillo-0.0.1-r1379: /build/tatl/tmp/portage/chromeos-base/libbrillo-0.0.1-r1379/work/libbrillo-0.0.1/common-mk/platform2_test.py --action=run --sysroot=/build/tatl -- /build/tatl/var/cache/portage/chromeos-base/libbrillo/out/Default/libbrillo-395517_unittests
libbrillo-0.0.1-r1379: ERROR: ld.so: object 'libsandbox.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
libbrillo-0.0.1-r1379: chroot: /build/tatl
libbrillo-0.0.1-r1379: cwd: /tmp/portage/chromeos-base/libbrillo-0.0.1-r1379/work/libbrillo-0.0.1/libbrillo
libbrillo-0.0.1-r1379: cmd: {/var/cache/portage/chromeos-base/libbrillo/out/Default/libbrillo-395517_unittests} '/var/cache/portage/chromeos-base/libbrillo/out/Default/libbrillo-395517_unittests'
libbrillo-0.0.1-r1379: [==========] Running 397 tests from 48 test cases.
...

 
here's the actual failure:
libbrillo-0.0.1-r1379: [ RUN      ] DevmapperTableTest.CreateTableFromBlobTest
libbrillo-0.0.1-r1379: terminating with uncaught exception of type std::length_error: vector
libbrillo-0.0.1-r1379: Error: /var/cache/portage/chromeos-base/libbrillo/out/Default/libbrillo-395517_unittests: failed with signal SIGIOT|SIGABRT(6)
Key error:
libbrillo-0.0.1-r1379: [----------] 6 tests from DevmapperTableTest
libbrillo-0.0.1-r1379: [ RUN      ] DevmapperTableTest.CreateTableFromBlobTest
libbrillo-0.0.1-r1379: terminating with uncaught exception of type std::length_error: vector
libbrillo-0.0.1-r1379: Error: /var/cache/portage/chromeos-base/libbrillo/out/Default/libbrillo-395517_unittests: failed with signal SIGIOT|SIGABRT(6)
libbrillo-0.0.1-r1379:  * ERROR: chromeos-base/libbrillo-0.0.1-r1379::chromiumos failed (test phase):
libbrillo-0.0.1-r1379:  *   (no error message)

(This was just merged in the last CQ run)

The only thing I can see that might trigger a std::length_error is: https://chromium.googlesource.com/chromiumos/platform2/libbrillo/+/9b137e0f4b92b948c701db41951da60a0515c004/brillo/blkdev_utils/device_mapper.cc#45 but I'm not sure why that would only fail on tatl-paladin.
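For reference, a minimal sketch of one common way a std::vector ends up throwing std::length_error: a size computed with unsigned arithmetic wraps around and exceeds max_size(). This is not the code at the linked line (the thread does not quote it); TryReserve and UnderflowedSize are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not libbrillo code: tries to reserve `requested` bytes
// and reports whether the vector accepted the request.
bool TryReserve(std::size_t requested) {
  std::vector<char> buffer;
  try {
    // reserve() throws std::length_error when requested > buffer.max_size().
    buffer.reserve(requested);
  } catch (const std::length_error&) {
    return false;
  }
  return true;
}

// Simulates a length computed as (end - begin) when end < begin.
// Unsigned subtraction wraps to a value close to SIZE_MAX.
std::size_t UnderflowedSize(std::size_t begin, std::size_t end) {
  return end - begin;
}
```

If the exception is not caught, as in the unit test log above, the runtime calls std::terminate and the process aborts with SIGABRT. Whether a given input produces such a size can depend on how the binary was built, which would be consistent with a board-specific failure, but that is an assumption here.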
Owner: sarthakkukreti@chromium.org
Status: Assigned (was: Untriaged)
Assigning to author of CLs that added blkdev_utils.
Thanks Derek.

I tried to recreate the failure locally by running FEATURES=test emerge-tatl libbrillo:

- With cros_workon-tatl stopped for libbrillo, the unit test fails with the above message.
- With cros_workon-tatl started for libbrillo, the unit test passes.
- Passes with both for eve-arcnext (and looking at the build waterfall, for other boards as well).

This definitely looks odd to me, as I don't think there is any meaningful difference between the two ebuilds either:
$ diff libbrillo-0.0.1-r1379.ebuild libbrillo-9999.ebuild 
6,7d5
< CROS_WORKON_COMMIT="dd82cdb1ca9b8bbb97397a919567a227c406bce4"
< CROS_WORKON_TREE=("db103a9dd2c79eed8075b58d7c1c4484354a1683" "c3ce65a6f9f13c1d88ad567b5b2e29fa140fb405")
24c22
< KEYWORDS="*"
---
> KEYWORDS="~*"

Does anyone have any further ideas on why this might be happening?
Could there be some CL that didn't make it into the prebuilt that you have locally?
No, because the unit tests pass for other boards with 'cros_workon-${BOARD} stop libbrillo'. Also, this CL is self-contained.

I also started a tryjob which passed the unittest: https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8934829111840673392 and libbrillo unittests pass for the same libbrillo version. Maybe this is a flake?
Components: OS>Packages OS>Systems>Containers
tael/tatl run with different sets of USE flags than the rest of the CrOS packages, so having them fail just on these boards isn't outside the realm of possibility

you'll need to do a local build for this specific board and run unittests against it
Hi Mike, I did try a set of local build tests (ref: c#4), which failed under the same conditions:
-- Checking out master for platform2 (which contains libbrillo), running cros_workon start for libbrillo on tatl, and rebuilding only libbrillo allows the unit test to pass.
-- The tatl tryjob with ToT in c#6 (run after the first failure was reported) passed the libbrillo unittests, which is ... odd.

And now, after a repo sync about half an hour ago, the unit test failure suddenly stopped reproducing in any of the above conditions. That's why I mentioned that it might have been a flake, but the recent tatl-paladin build seems to have failed again. I'll look into it.
Labels: -Pri-3 Pri-0
Just got another failure, so please revert the changes on ToT since it's blocking the CQ.

Mike, do you know the USE flag differences that Sarthak can look into with his changes?
Bumping to Pri-0 until CQ is unblocked.
Reverted and the next run on tatl-paladin passes: https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8934814095390480464

FTR, the unit test in question was introduced by the blkdevutils CL, so that was expected.
Labels: -Pri-0 Pri-2
Bumping priority back down now that this isn't causing CQ failures. I'll leave it up to Sarthak on whether to keep this open for tracking the issue with his CLs.
Cc: gwendal@chromium.org
Thanks Derek, I'll keep this open.

It looks like libbrillo had different USE flags on the failing tatl-paladin runs:

>> For the passing case (all other boards, the trybot runs, and the tatl-paladin run that passed during the CQ run: https://luci-logdog.appspot.com/v/?s=chromiumos/bb/chromiumos/tatl-paladin/4484/+/recipes/steps/UnitTest/0/stdout)

15:29:36: INFO: RunCommand: sudo 'PARALLEL_EMERGE_STATUS_FILE=/tmp/tmpCpydaB' 'FEATURES=test' 'PKGDIR=/build/tatl/test-packages' -- /mnt/host/source/chromite/bin/parallel_emerge '--sysroot=/build/tatl' '--jobs=10' sys-apps/rootdev chromeos-base/libbrillo dev-libs/modp_b64 media-libs/minigbm chromeos-base/minijail chromeos-base/vm_guest_tools
Starting fast-emerge.
 Building package sys-apps/rootdev chromeos-base/libbrillo dev-libs/modp_b64 media-libs/minigbm chromeos-base/minijail chromeos-base/vm_guest_tools on /build/tatl
Calculating deps...
Deps calculated in 0m2.7s
[ebuild   R    ] sys-apps/rootdev-0.0.1-r33 to /build/tatl/
[ebuild   R    ] chromeos-base/minijail-6-r22 to /build/tatl/ USE="{test*}" 
[ebuild   R    ] media-libs/minigbm-0.0.1-r237 to /build/tatl/
[ebuild   R    ] dev-libs/modp_b64-0.0.1-r3 to /build/tatl/
[ebuild   R    ] chromeos-base/libbrillo-0.0.1-r1379 to /build/tatl/ USE="{test*}" 
[ebuild   R    ] chromeos-base/vm_guest_tools-0.0.1-r169 to /build/tatl/ USE="{test*}" 
...

>> For the failing builds: https://luci-logdog.appspot.com/v/?s=chromiumos/bb/chromiumos/tatl-paladin/4485/+/recipes/steps/UnitTest/0/stdout 

12:36:23: INFO: RunCommand: sudo 'PARALLEL_EMERGE_STATUS_FILE=/tmp/tmpTQqahM' 'FEATURES=test' 'PKGDIR=/build/tatl/test-packages' -- /mnt/host/source/chromite/bin/parallel_emerge '--sysroot=/build/tatl' '--jobs=10' sys-apps/rootdev chromeos-base/libbrillo dev-libs/modp_b64 media-libs/minigbm chromeos-base/minijail chromeos-base/vm_guest_tools
Starting fast-emerge.
 Building package sys-apps/rootdev chromeos-base/libbrillo dev-libs/modp_b64 media-libs/minigbm chromeos-base/minijail chromeos-base/vm_guest_tools on /build/tatl
Calculating deps...
Deps calculated in 0m2.3s
[ebuild   R    ] sys-apps/rootdev-0.0.1-r33 to /build/tatl/
[ebuild   R    ] chromeos-base/minijail-6-r22 to /build/tatl/
[ebuild   R    ] media-libs/minigbm-0.0.1-r237 to /build/tatl/
[ebuild   R    ] dev-libs/modp_b64-0.0.1-r3 to /build/tatl/
[ebuild   R    ] chromeos-base/libbrillo-0.0.1-r1379 to /build/tatl/
[ebuild   R    ] chromeos-base/vm_guest_tools-0.0.1-r169 to /build/tatl/
...
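
To compare the two logs mechanically rather than by eye, a small sketch like the following could pull the USE="..." value out of each `[ebuild ...]` line. ExtractUse is a hypothetical helper written for this comment, not part of chromite or portage, and it assumes the line format shown in the logs above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the text inside USE="..." on an emerge
// "[ebuild ...]" line, or an empty string if the line has no USE field.
std::string ExtractUse(const std::string& line) {
  const std::string key = "USE=\"";
  const std::size_t start = line.find(key);
  if (start == std::string::npos) {
    return "";
  }
  const std::size_t value_begin = start + key.size();
  const std::size_t value_end = line.find('"', value_begin);
  if (value_end == std::string::npos) {
    return "";  // Unterminated quote: treat as no USE field.
  }
  return line.substr(value_begin, value_end - value_begin);
}
```

Applied to the libbrillo lines above, the 4484 line yields `{test*}` and the 4485 line yields an empty string, which is the difference being discussed.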

Here's a later tatl-paladin run that already has the reverts and doesn't include the above USE flag: https://logs.chromium.org/v/?s=chromiumos%2Fbb%2Fchromiumos%2Ftatl-paladin%2F4491%2F%2B%2Frecipes%2Fsteps%2FUnitTest%2F0%2Fstdout 

Mike is right: the libbrillo unittest probably doesn't play well with a project-termina USE flag. But I'm still not sure what changed the USE flag/dependency for two builds on the same builder in the first place.

I'm looking into the following two things now:
- What changed in the dependency calculation that dropped the USE flags?
- Which USE flag triggers the std::length_error in libbrillo and how can I fix this?
It's unlikely that termina is going to need lvm2 support.  Would it be easier to add it to libbrillo behind a USE flag and just turn that off for termina?  It'll also help keep our component size from growing out of control.
we prob should look at making the lvm/dm stuff optional.  i don't think we want to block ourselves from the entire libbrillo API as it has a number of useful things in there.
Sure Chirantan, Mike, that sounds good to me. The libcontainer ebuild already has a device-mapper USE flag for similar reasons (https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/master/chromeos-base/libcontainer/libcontainer-0.0.1-r1335.ebuild) so I can reuse that.
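
A rough sketch of what gating the device-mapper code behind a USE flag could look like on the C++ side. It assumes the build turns the flag into a USE_DEVICE_MAPPER preprocessor define; that macro name and the DeviceMapperSupported helper are hypothetical and not taken from libbrillo.

```cpp
#include <string>

// Hypothetical build-time switch. A real build would set this from the
// ebuild's USE flags; it defaults to off here so the sketch compiles alone.
#ifndef USE_DEVICE_MAPPER
#define USE_DEVICE_MAPPER 0
#endif

// Hypothetical helper: reports whether device-mapper support was compiled in.
bool DeviceMapperSupported() {
#if USE_DEVICE_MAPPER
  return true;
#else
  return false;
#endif
}

// Callers (and tests) can branch on the helper instead of assuming the
// lvm/dm code is always present.
std::string DescribeBuild() {
  return DeviceMapperSupported() ? "device-mapper enabled"
                                 : "device-mapper disabled";
}
```

With the flag off for termina, the blkdev_utils sources and their unit tests would be left out of that build, while the rest of the libbrillo API stays available.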
Cc: -kitching@google.com