Issue 676617

Starred by 3 users

Issue metadata

Status: Fixed
Owner: ----
Closed: Dec 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 1
Type: Bug-Regression



WebRTC Android trybots fail compilation after checkout restructuring

Project Member Reported by kjellander@chromium.org, Dec 22 2016

Issue description

After I landed https://codereview.webrtc.org/1414343008/, WebRTC no longer checks out a Chromium copy that is used for the build. Instead, all dependencies are pulled down using Git mirrors specified in the DEPS file.

After this happened, all the Android trybots in https://build.chromium.org/p/tryserver.webrtc/waterfall started failing with compile errors like this:

FAILED: obj/base/base/task_traits.o 
/b/c/cipd/goma/gomacc ../../third_party/android_tools/ndk/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi-g++ -MMD -MF obj/base/base/task_traits.o.d -DV8_DEPRECATION_WARNINGS -DDCHECK_ALWAYS_ON=1 -DUSE_OPENSSL_CERTS=1 -DNO_TCMALLOC -DUSE_EXTERNAL_POPUP_MENU=1 -DDISABLE_NACL -DSAFE_BROWSING_DB_REMOTE -DCHROMIUM_BUILD -DENABLE_MEDIA_ROUTER=1 -DENABLE_WEBVR -DFIELDTRIAL_TESTING_ENABLED -D_FILE_OFFSET_BITS=64 -DANDROID -DHAVE_SYS_UIO_H -DANDROID_NDK_VERSION=r12b -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D_FORTIFY_SOURCE=2 -D__GNU_SOURCE=1 -DNDEBUG -DNVALGRIND -DDYNAMIC_ANNOTATIONS_ENABLED=0 -DBASE_IMPLEMENTATION -I../.. -Igen -Igen/base/base_jni_headers -Igen/base/base_jni_headers/base -Igen/android_runtime_jni_headers/base -I../../third_party/android_tools/ndk/sources/android/cpufeatures -fno-strict-aliasing --param=ssp-buffer-size=4 -fstack-protector -funwind-tables -fPIC -pipe -ffunction-sections -fno-short-enums -finline-limit=64 -march=armv7-a -mfloat-abi=softfp -mtune=generic-armv7-a -fno-tree-sra -fno-caller-saves -mfpu=neon -mthumb -mthumb-interwork -Wall -Werror -Wno-psabi -Wno-unused-local-typedefs -Wno-maybe-uninitialized -Wno-missing-field-initializers -Wno-unused-parameter -fomit-frame-pointer -gdwarf-3 -g1 --sysroot=../../third_party/android_tools/ndk/platforms/android-16/arch-arm -fvisibility=hidden -O2 -fno-ident -fdata-sections -ffunction-sections -fno-threadsafe-statics -fvisibility-inlines-hidden -std=gnu++11 -Wno-narrowing -fno-rtti -isystem../../third_party/android_tools/ndk/sources/cxx-stl/llvm-libc++/libcxx/include -isystem../../third_party/android_tools/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/include -isystem../../third_party/android_tools/ndk/sources/android/support/include -fno-exceptions -c ../../base/task_scheduler/task_traits.cc -o obj/base/base/task_traits.o
In file included from ../../third_party/android_tools/ndk/sources/android/support/include/stdint.h:31:0,
                 from ../../base/task_scheduler/task_traits.h:8,
                 from ../../base/task_scheduler/task_traits.cc:5:
../../third_party/android_tools/ndk/platforms/android-16/arch-arm/usr/include/stdint.h:31:20: fatal error: stddef.h: No such file or directory
 #include <stddef.h>
                    ^
compilation terminated.


I tried submitting a landmine CL (https://codereview.webrtc.org/2601473002/) because a clobber build I ran on a trybot went green. However, this doesn't seem to help in other cases.

 
Labels: Infra-Troopers
Adding Troopers since I'm unable to figure out what's causing these problems. I've logged into a machine to investigate, but some process is altering the /b/c mounts, so it seems hard to debug locally.
By "other cases" I meant that most builds fail, but the one I clobbered went green. There are some other green builds too, though, which is hard to explain.

Our main bots that are used in the CQ are:
https://build.chromium.org/p/tryserver.webrtc/builders/android_arm64_rel
https://build.chromium.org/p/tryserver.webrtc/builders/android_rel
https://build.chromium.org/p/tryserver.webrtc/builders/android_dbg
https://build.chromium.org/p/tryserver.webrtc/builders/android_compile_x64_dbg
https://build.chromium.org/p/tryserver.webrtc/builders/android_compile_x86_dbg
and they're all affected by this.

Interestingly, the Android Clang trybot always succeeds: 
https://build.chromium.org/p/tryserver.webrtc/builders/android_clang_dbg
which suggests this has something to do with the Android default toolchain (GCC).
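
One way to follow up on the GCC suspicion: with GCC, stddef.h is normally supplied by the compiler's own internal include directory rather than by the NDK platform sysroot, so a quick check is whether the toolchain checkout on a failing slave contains that header at all. The sketch below is only a possible diagnostic, not a confirmed root cause; the helper name find_header is made up for illustration, and the toolchain path is copied from the failing command line above.

```python
import os


def find_header(root, header_name):
    """Return the sorted paths of every file named header_name under root."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if header_name in filenames:
            matches.append(os.path.join(dirpath, header_name))
    return sorted(matches)


if __name__ == "__main__":
    # Toolchain directory from the failing command line, relative to src/.
    toolchain = os.path.join(
        "third_party", "android_tools", "ndk", "toolchains",
        "arm-linux-androideabi-4.9", "prebuilt", "linux-x86_64")
    hits = find_header(toolchain, "stddef.h")
    if hits:
        for path in hits:
            print("found:", path)
    else:
        # Also printed when the toolchain directory itself is missing.
        print("no stddef.h under", toolchain)
```

Run from the src/ directory of a failing checkout and of a green one; a difference in the output would point at an incomplete toolchain checkout rather than at stale build output.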


Hmm, could it be that clobbering is different when done from a landmine than when triggered from the web UI?

These are trybot builds with a clobber triggered by my landmine (it happens in the runhooks step):
https://build.chromium.org/p/tryserver.webrtc/builders/android_compile_x86_rel/builds/4500 (which happens to be green)
https://uberchromegw.corp.google.com/i/tryserver.webrtc/builders/android_compile_x86_rel/builds/4501 (fails)

but this build was clobbered from the UI:
https://build.chromium.org/p/tryserver.webrtc/builders/android_compile_x64_dbg/builds/10038 (green)

I just triggered two new clobbers from the web UI to gather more data:
https://build.chromium.org/p/tryserver.webrtc/builders/android_compile_x64_dbg/builds/10051
https://build.chromium.org/p/tryserver.webrtc/builders/android_compile_x86_dbg/builds/9953




After looking into what https://cs.chromium.org/chromium/src/build/clobber.py does compared to the Web UI clobber (which wipes all of out/Debug or out/Release), the former seems to leave args.gn in place and write a build.ninja file.
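
The difference between the two clobber styles can be sketched roughly as follows. This is an illustration of the behavior described in this comment, not the actual code of clobber.py or of the web UI; the function names are invented, and the build.ninja content is a placeholder because the real file's content is not shown in this thread.

```python
import os
import shutil


def wipe_output_dir(out_dir):
    """Web-UI-style clobber as described above: remove the whole directory."""
    shutil.rmtree(out_dir, ignore_errors=True)


def clobber_keeping_args(out_dir):
    """Landmine-style clobber as described above: args.gn survives and a
    build.ninja file is left behind in the otherwise empty directory."""
    args_path = os.path.join(out_dir, "args.gn")
    saved_args = None
    if os.path.exists(args_path):
        with open(args_path) as f:
            saved_args = f.read()

    shutil.rmtree(out_dir, ignore_errors=True)
    os.makedirs(out_dir)

    if saved_args is not None:
        with open(args_path, "w") as f:
            f.write(saved_args)
    # Placeholder content (assumption): stands in for whatever the real
    # script writes so that the next build regenerates its ninja files.
    with open(os.path.join(out_dir, "build.ninja"), "w") as f:
        f.write("# placeholder build.ninja (hypothetical content)\n")
```

After wipe_output_dir the next build starts from nothing, whereas after clobber_keeping_args it starts from the previous args.gn. If stale build arguments were part of the problem, only the first style would remove them.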

To be sure, I'm going to "nuke it from orbit", so I'm cleaning out all the out/ dirs on these slaves using:
for s in slave719-c4 slave720-c4 slave721-c4 slave722-c4 slave723-c4 slave724-c4 slave725-c4 slave837-c4 slave838-c4 slave839-c4 slave840-c4 slave1210-c4 slave1211-c4 slave1212-c4 slave1213-c4 slave1214-c4 slave1215-c4 slave1216-c4 slave1217-c4 slave1218-c4 slave1219-c4; do ssh $s rm -rf /b/c/b/android_*_dbg/src/out; done;

Let's see if that greens up the next builds.
Labels: -Restrict-View-Google
Oops, I'm doing another cleaning using rm -rf /b/c/b/android_*_rel/src/out to cover the remaining builders.
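
For reference, the same cleanup could be scripted with a dry-run step, so the exact commands can be reviewed before anything is deleted. This is a minimal sketch, assuming passwordless ssh to each slave; the function names and the "/src/out" safety check are illustrative additions, not part of any existing infra tool. Passing the command as an argument list means no local shell is involved, so the glob is expanded by the remote shell.

```python
import subprocess


def build_cleanup_commands(hosts, remote_glob):
    """Build one ssh argv per host that removes remote_glob on that host."""
    # Illustrative guard (assumption): only allow paths ending in /src/out.
    if not remote_glob.endswith("/src/out"):
        raise ValueError("refusing to remove %r" % remote_glob)
    return [["ssh", host, "rm -rf " + remote_glob] for host in hosts]


def run_all(commands, dry_run=True):
    """Print each command and run it unless dry_run is set.

    Returns the hosts whose command exited with a non-zero status.
    """
    failed = []
    for argv in commands:
        print(" ".join(argv))
        if not dry_run and subprocess.run(argv).returncode != 0:
            failed.append(argv[1])
    return failed


if __name__ == "__main__":
    hosts = ["slave719-c4", "slave720-c4"]  # subset of the list above
    for pattern in ("/b/c/b/android_*_dbg/src/out",
                    "/b/c/b/android_*_rel/src/out"):
        run_all(build_cleanup_commands(hosts, pattern), dry_run=True)
```

Covering both the _dbg and _rel patterns in one loop avoids the second pass that was needed here.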
Every one of those builds succeeded. Does this still need to be in the trooper queue?
Labels: -Infra-Troopers
Owner: kjellander@chromium.org
Status: Assigned (was: Untriaged)
I fired them to be sure I wasn't just lucky in earlier attempts. The problem is that we have about 25 VMs that share a pool and power ~10 builders. That means that even if I clobber many times, I cannot be sure each local checkout has picked up the clobbering.
Only a landmine could guarantee that, but apparently it doesn't clean up whatever is causing this (and I still don't know exactly what that is).
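
To put a rough number on why manual clobbers can't be relied on here: if each clobber build for a given builder landed on one of the 25 VMs uniformly at random (an assumption for illustration only; the real scheduler may behave differently), covering every VM is the classic coupon collector problem, with an expected n * (1 + 1/2 + ... + 1/n) builds. The function name below is made up.

```python
from fractions import Fraction


def expected_builds_to_cover(num_vms):
    """Expected number of uniformly random builds until every VM was hit."""
    # Coupon collector: n * H_n, where H_n is the n-th harmonic number.
    harmonic = sum(Fraction(1, k) for k in range(1, num_vms + 1))
    return float(num_vms * harmonic)


if __name__ == "__main__":
    # Roughly 95 clobber builds for a 25-VM pool, and that is per builder.
    print(round(expected_builds_to_cover(25), 1))
```

Even then it is only an expectation, not a guarantee, which is why a landmine or a direct wipe on every machine is the more dependable route.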

Since this is a one-time thing, I'm now trying a more brutal approach: wipe all the android checkouts on these machines and let them be re-created on the next build (the Git cache should make this reasonably fast).
I'll put it back to the trooper queue in case that doesn't work.
Cc: seanmccullough@chromium.org
+seanmccullough@chromium.org so you can see my reply in #9.
Cc: jbudorick@chromium.org
Labels: Infra-Troopers
Owner: ----
Status: Untriaged (was: Assigned)
So, I wiped /b/c/b/android* on all these VMs and triggered new builds. All passed except the ones for android_rel and android_dbg:
https://uberchromegw.corp.google.com/i/tryserver.webrtc/builders/android_rel/builds/19607
https://uberchromegw.corp.google.com/i/tryserver.webrtc/builders/android_dbg/builds/19769
There's nothing special about those configs, so apparently that fix doesn't work reliably either. Could it be necessary to wipe the Git cache as well? Maybe there's something weird cached for third_party/android_tools or similar (although that shouldn't really have changed after our restructuring).

+jbudorick, who has detailed insight into this toolchain.


Status: Fixed (was: Untriaged)
I had another look today and now all bots look healthy. I'm closing as fixed and will reopen if I see another failure. The autoroller bot will keep triggering builds during the holidays.
