
Issue 762283


Issue metadata

Status: Fixed
Owner:
Closed: Jan 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug




dev-libs/nss build problems on sdk builder

Project Member Reported by bmgordon@chromium.org, Sep 5 2017

Issue description

dev-libs/nss-3.30.2-r1 fails on its first build attempt on the chromiumos-sdk builders and has to be built a second time.  The second build consistently succeeds, but the retry wastes around 1-2 minutes of the roughly 33 minutes spent in the InitSDK stage.

Example output:
http://uberchromegw/i/chromiumos/builders/chromiumos-sdk/builds/8121/steps/InitSDK/logs/stdio

Probably needs a dependency corrected.
 

Comment 1 by vapier@chromium.org, Oct 31 2017

Cc: vapier@chromium.org
Labels: Build-Toolchain
might be related to a parallel build issue, or maybe mixing of toolchains.  focusing on just certhtml.c.

the compile line is here:
x86_64-pc-linux-gnu-clang -o Linux2.6_x86_64_clang-5.0_glibc_PTH_64_OPT.OBJ/certhtml.o -c ...

the static lib line is here:
x86_64-pc-linux-gnu-ar rc ... Linux2.6_x86_64_clang-5.0_glibc_PTH_64_OPT.OBJ/certhtml.o ...

the shared lib line is here:
x86_64-pc-linux-gnu-clang -shared ... ../certhigh/Linux3.13_x86_gcc.real_glibc_PTH_DBG.OBJ/certhtml.o ...
clang-5.0: error: no such file or directory: '../certhigh/Linux3.13_x86_gcc.real_glibc_PTH_DBG.OBJ/certhtml.o'

notice how the object dir in the first two lines has "clang-5.0" but the last one has "gcc.real".  something has gone wrong in there.
Attachment: nss.log (370 KB)
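
One way to spot this kind of mismatch in a build log is to collect every object directory name the log mentions and check whether more than one shows up. The following is a minimal sketch, assuming the directories always end in ".OBJ" as they do in the lines quoted above; the function names are made up for illustration and are not part of nss or portage.

```python
import re
from collections import Counter

# Assumption: object directory names look like
# Linux2.6_x86_64_clang-5.0_glibc_PTH_64_OPT.OBJ (no slashes, ".OBJ" suffix).
OBJ_DIR_RE = re.compile(r"[A-Za-z0-9_.\-]+\.OBJ")


def object_dirs(log_text):
    """Count how often each distinct *.OBJ directory name appears in the log."""
    return Counter(OBJ_DIR_RE.findall(log_text))


def has_mixed_object_dirs(log_text):
    """True if the log refers to more than one object directory name."""
    return len(object_dirs(log_text)) > 1


if __name__ == "__main__":
    # Sample lines taken from the comment above (arguments elided as in the source).
    sample = (
        "x86_64-pc-linux-gnu-clang -o Linux2.6_x86_64_clang-5.0_glibc_PTH_64_OPT.OBJ/certhtml.o -c ...\n"
        "x86_64-pc-linux-gnu-clang -shared ... "
        "../certhigh/Linux3.13_x86_gcc.real_glibc_PTH_DBG.OBJ/certhtml.o ...\n"
    )
    for name, count in object_dirs(sample).items():
        print(count, name)
    print("mixed:", has_mixed_object_dirs(sample))
```

A build that legitimately uses several object directories would also be flagged, so this is only a first filter before reading the log by hand.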

Comment 2

I ran into a similar problem recently while bisecting nss. nss puts the compiler name in its object directory names (why?).

This particular build failure means the compiler was changed from gcc to clang while nss was being built, and nss somehow detected the change and mixed up its object paths.

Comment 3

I've been poking at this a bit.  The problem doesn't seem to be a bad dependency after all.  It seems to be that sys-apps/sandbox gets upgraded from 2.6 to 2.11 while dev-libs/nss is being compiled.  I've tried a couple of different ways of making sure sandbox isn't built at the same time as nss, and they all fix the problem.

Since the sandbox is fairly fundamental to portage, it seems like it might be a good idea to make sure it gets upgraded separately from the rest of the build.  Any ideas on how to do that?  I was considering having make_chroot emerge it upfront like a few of the other tools.

Comment 4 by vapier@chromium.org, Nov 23 2017

the way sandbox is implemented, upgrading it while in use shouldn't be a problem.  its "API" is env vars, so it should hand off seamlessly.  once the library is loaded into memory, it should be stable.

how are you testing things ?  just launching sdk bots ?  or some set of local emerges ?

if you run `sudo emerge --jobs 32 sandbox nss` over and over locally, does it fail consistently ?

Comment 5

Re-emerging 2.11 repeatedly doesn't seem to cause any problems, but going from 2.6 (from the stage3) to 2.11 consistently breaks it.  I've reproduced it on a local build with --bootstrap as well as a bunch of chromiumos-sdk tryjobs.  This also consistently reproduces in a fully set up chroot:

1. extract /usr/bin/sandbox and /usr/lib64/libsandbox.so from the bootstrap tarball into the chroot (since we don't have a 2.6 package to downgrade to).
2. sudo emerge --jobs 32 sandbox dev-libs/nss (fails with the same errors from the buildbots).
3. sudo emerge --jobs 32 sandbox dev-libs/nss (succeeds because sandbox just reinstalls 2.11).

One difference I see between the two versions:

sandbox-2.6:

configured with these options:
  --prefix=/usr
  --build=x86_64-pc-linux-gnu
  --host=x86_64-pc-linux-gnu 
  --mandir=/usr/share/man 
  --infodir=/usr/share/info 
  --datadir=/usr/share 
  --sysconfdir=/etc 
  --localstatedir=/var/lib 
  --libdir=/usr/lib64 
  build_alias=x86_64-pc-linux-gnu 
  host_alias=x86_64-pc-linux-gnu 
  'CFLAGS=-O2 -pipe' 
  'LDFLAGS=-Wl,-O1 -Wl,--as-needed'
  CPPFLAGS=

sandbox-2.11:

configured with these options:
  --prefix=/usr
  --build=x86_64-pc-linux-gnu
  --host=x86_64-pc-linux-gnu
  --mandir=/usr/share/man
  --infodir=/usr/share/info
  --datadir=/usr/share
  --sysconfdir=/etc
  --localstatedir=/var/lib
  --libdir=/usr/lib64
  --disable-silent-rules
  --disable-dependency-tracking
  build_alias=x86_64-pc-linux-gnu
  host_alias=x86_64-pc-linux-gnu
  CC=x86_64-pc-linux-gnu-clang
  'CFLAGS=-O2 -pipe'
  'LDFLAGS=-Wl,-O2 -Wl,--as-needed -Wl,-O2 -Wl,--as-needed'
  'CPPFLAGS=-D_CLANG_FORTIFY_DISABLE -D_CLANG_FORTIFY_DISABLE -D_CLANG_FORTIFY_DISABLE -D_CLANG_FORTIFY_DISABLE'

Notice the new CC flag.  Do these get passed into the sandbox at all?  My limited testing suggests that they don't, and I don't see any other changes between the package versions that look like they should make a difference, but it seems pretty clear that it's happening :)
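
To make the comparison above less error-prone than reading two lists side by side, the option lists can be diffed mechanically. This is a rough sketch, assuming each option sits on its own line and may be wrapped in single quotes as in the listings above; `parse_options` and `diff_options` are hypothetical helper names, and the sample data is only a subset of the real lists.

```python
def parse_options(lines):
    """Map option name -> value; bare flags such as --disable-foo map to None."""
    opts = {}
    for raw in lines:
        item = raw.strip().strip("'")
        if not item:
            continue
        name, sep, value = item.partition("=")
        opts[name] = value if sep else None
    return opts


def diff_options(old, new):
    """Return (added, removed, changed) between two parsed option dicts."""
    added = {k: new[k] for k in new if k not in old}
    removed = {k: old[k] for k in old if k not in new}
    changed = {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}
    return added, removed, changed


if __name__ == "__main__":
    # Subset of the sandbox-2.6 and sandbox-2.11 options listed above.
    old = parse_options([
        "--libdir=/usr/lib64",
        "'CFLAGS=-O2 -pipe'",
        "'LDFLAGS=-Wl,-O1 -Wl,--as-needed'",
        "CPPFLAGS=",
    ])
    new = parse_options([
        "--libdir=/usr/lib64",
        "--disable-silent-rules",
        "--disable-dependency-tracking",
        "CC=x86_64-pc-linux-gnu-clang",
        "'CFLAGS=-O2 -pipe'",
        "'LDFLAGS=-Wl,-O2 -Wl,--as-needed -Wl,-O2 -Wl,--as-needed'",
    ])
    added, removed, changed = diff_options(old, new)
    print("added:", added)
    print("removed:", removed)
    print("changed:", changed)
```

Run against the full lists, a diff like this would also surface the LDFLAGS (-O1 to -O2) and CPPFLAGS changes, not just the new CC setting. It only compares recorded configure options and says nothing about whether any of them affect sandbox behavior at runtime.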

I've also seen similar failures in other packages, e.g. freetype fails if automake gets replaced while it's building.

Bigger proposal: Would it make sense to add an additional pass that upgrades all the ambient tools from the bootstrap image before building virtual/target-sdk?  Presumably that would prevent all these implicit dependencies from changing mid-build.  Alternatively, can we upgrade to a newer stage3 so that fewer of these upgrades are needed?

Comment 6 by vapier@chromium.org, Nov 29 2017

sandbox isn't supposed to work that way, but i don't have time atm to investigate that reliability aspect (and it might be a matter of bugs fixed in newer versions)

we've generally tried to make upgrades atomic, but yeah, autotools wouldn't work out in that regard as it has too many source m4 files it reads at runtime

i'm not sure splitting the passes would address this issue.  sandbox/autotools are in the existing system, but so are packages that would use them.  fundamentally, updating build tools non-atomically in parallel is going to always be racy :/.  even updating a single package in place is slightly racy -- when you upgrade glibc, there's a window as we actually move all the files over the old version.  portage tries to mitigate this by moving all the files at once as fast as it can, but w/out FS support, or some sort of big kernel lock, we'll always have this problem.
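
The per-file versus per-package distinction can be illustrated with a small sketch. This is not portage's code; it only assumes POSIX rename semantics, where replacing one file via a temp file in the same directory is atomic, while a loop over several files is not atomic as a whole. The function names are invented for the example, and the temp file does not preserve the original file's permissions.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace one file so a reader sees the old or the new content, never a partial write."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file in the same directory, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # atomic on POSIX within one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def install_files(files):
    """Install several files. Each swap is atomic, but the set as a whole is not:
    a concurrent reader can run between two iterations and see a mix of old and
    new files, which is the window described above."""
    for path, data in files.items():
        atomic_write(path, data)


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    install_files({
        os.path.join(demo_dir, "tool"): b"new binary",
        os.path.join(demo_dir, "tool.m4"): b"new macros",
    })
    print(sorted(os.listdir(demo_dir)))
```

Moving the files as quickly as possible shrinks the window between iterations but cannot close it.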

if there's a subset of packages that cause more disproportionate pain, we could update the make_chroot.sh file to special case them together.  like adding sandbox to the line that bootstraps portage explicitly.

Comment 7 by bugdroid1@chromium.org, Jan 5 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/crosutils/+/9413b10aed3b10df15bcc4899a91a60a26daa146

commit 9413b10aed3b10df15bcc4899a91a60a26daa146
Author: Benjamin Gordon <bmgordon@google.com>
Date: Fri Jan 05 05:52:07 2018

make_chroot: Emerge sandbox/automake/patch early

dev-libs/nss, dev-embedded/coreboot-sdk, and media-libs/freetype fail
to build when certain implicit dependencies are emerged during their
build.  They rebuild successfully on the second try, but this makes the
sdk builder take about 30 minutes longer than it needs to.  emerge a few
specific packages early to eliminate this concurrency:

* sys-apps/sandbox breaks dev-libs/nss
* sys-devel/patch breaks dev-embedded/coreboot-sdk
* sys-devel/automake sometimes breaks media-libs/freetype and the
  old version slows down coreboot-sdk build times by 10 minutes.
  This one isn't strictly necessary on every build, but seems worth
  doing to avoid freetype build flakiness.

gcc is still being rebuilt during the host toolchain update, but that
needs to be fixed separately.

BUG=chromium:762283
TEST=chromiumos-sdk completes without retrying packages in the main
     build phase.

Change-Id: I0af1e0c3d2b4918b76bf07758d857d7a71c99166
Reviewed-on: https://chromium-review.googlesource.com/801130
Commit-Ready: Benjamin Gordon <bmgordon@chromium.org>
Tested-by: Benjamin Gordon <bmgordon@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/9413b10aed3b10df15bcc4899a91a60a26daa146/sdk_lib/make_chroot.sh
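
The ordering idea in the commit message can be sketched as a two-phase plan: first emerge the tools that must not change underneath other builds, then run the main parallel build. This is an illustrative model only, not the actual make_chroot.sh change; the `--jobs` flag is taken from the commands quoted earlier in this thread, virtual/target-sdk is the target named in an earlier comment, and the function names and everything else are assumptions.

```python
# Packages the commit message says are emerged early.
EARLY_PACKAGES = ["sys-apps/sandbox", "sys-devel/patch", "sys-devel/automake"]


def plan_phases(early, main_target, jobs):
    """Return emerge argv lists: the early tools first, then the main target."""
    phases = []
    if early:
        phases.append(["emerge", "--jobs", str(jobs), *early])
    phases.append(["emerge", "--jobs", str(jobs), main_target])
    return phases


def run_phases(phases, runner):
    """Run each phase to completion before starting the next one.

    `runner` is any callable taking an argv list; it is injected so the plan
    can be inspected or dry-run without invoking emerge.
    """
    for argv in phases:
        runner(argv)


if __name__ == "__main__":
    plan = plan_phases(EARLY_PACKAGES, "virtual/target-sdk", jobs=32)
    # Dry run: print the commands instead of executing them.
    run_phases(plan, lambda argv: print(" ".join(argv)))
```

To execute the plan for real, the runner could be something like `lambda argv: subprocess.run(argv, check=True)`. The point of the separate phase is that the early packages are fully installed before anything that implicitly depends on them starts compiling.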

Comment 8

Status: Fixed (was: Untriaged)
The next chromiumos-sdk run didn't rebuild dev-libs/nss or coreboot-sdk in InitSDK: https://luci-milo.appspot.com/buildbot/chromiumos/chromiumos-sdk/8373
