whirlwind-pre-cq BuildPackages/unittest flake |
|||||||||||
Issue descriptionwhirlwind builds are flaky due to missing DEPENDs Example in Unittests stage: actions requires a retry until ap-infra is merged: https://uberchromegw.corp.google.com/i/chromeos/builders/whirlwind-paladin/builds/11096 Example failure in BuildPackages stage: https://uberchromegw.corp.google.com/i/chromeos/builders/whirlwind-paladin/builds/11107
,
Apr 4 2018
As far as I can tell, this breakage is causing every pre-cq that runs on whirlwind-no-vmtest-pre-cq to fail since yesterday. Can we get this out of the pre-cq if its not a quick fix?
,
Apr 4 2018
Agreed, I'm trying to do that now.
,
Apr 4 2018
blocking the pre-cq
,
Apr 4 2018
One option is to remove it from the default set: https://chromium-review.googlesource.com/#/c/chromiumos/chromite/+/996640 Note that everybody who is affected will need to rebase their CLs after this lands (and the pre-cq-launcher restart) in order for this to take effect. +jclinton as this was also a failure of alerting.
,
Apr 4 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/bca1fd053debef0c43ec2b2955fd62a33190905f commit bca1fd053debef0c43ec2b2955fd62a33190905f Author: Aviv Keshet <akeshet@chromium.org> Date: Wed Apr 04 23:11:27 2018 remove whirlwind-no-vmtest-pre-cq from default set BUG= chromium:829055 TEST=None Change-Id: Id9d8305099150a181c622f6fcab146af9730e52c Reviewed-on: https://chromium-review.googlesource.com/996640 Tested-by: Aviv Keshet <akeshet@chromium.org> Trybot-Ready: Aviv Keshet <akeshet@chromium.org> Reviewed-by: Stephen Barber <smbarber@chromium.org> [modify] https://crrev.com/bca1fd053debef0c43ec2b2955fd62a33190905f/lib/constants.py
,
Apr 4 2018
,
Apr 5 2018
The following revision refers to this bug: https://chrome-internal.googlesource.com/chromeos/overlays/project-jetstream-private/+/d057e4b65bb09c9bb4d7791547baa971e6fc823e commit d057e4b65bb09c9bb4d7791547baa971e6fc823e Author: Stephen Barber <smbarber@chromium.org> Date: Thu Apr 05 01:51:58 2018
,
Apr 5 2018
It looks like this wasn't enough: https://luci-milo.appspot.com/buildbot/chromeos/whirlwind-paladin/11110
,
Apr 5 2018
From jintao@: Stephen, I don't think the following CL is related to the dbus generation error. Generating dbus-proxies.h should not require the DEPEND definition in ebuild file. https://chrome-internal-review.googlesource.com/c/chromeos/overlays/project-jetstream-private/+/602180 mxt@ from my team identified a recent CL that very likely have caused this issue. https://chromium-review.googlesource.com/c/chromiumos/platform2/+/959844 After reverting this CL, dbus proxies and adaptors can be properly generated. This is also consistent with our observation that this dbus header generation error started occurring last week. The build log shows that the build process failed to detect the updated dbus definition and failed to re-generate the dbus proxies. I think we need to include sources in the dbus generator inputs in order to establish the dependencies
,
Apr 5 2018
Missed adding briannorris@ from the thread Agreed that DEPEND isn't helping here. Sometimes the build_packages or unittest stage would succeed on retry, which is often a sign of missing DEPENDs entries. We can revert my CL. sammc@'s change seems to imply that if we just revert that change, with the newer version of GYP we're using now the dbus generator gypi files shouldn't work at all. Is there something else we're missing?
,
Apr 5 2018
,
Apr 5 2018
+ non-PDT sheriff
,
Apr 5 2018
The gyp has underspecified dependencies: see also https://gyp.gsrc.io/docs/InputFormatReference.md#linking-dependencies. Essentially, if you export a generated header by including it in a header in a particular target, that target needs to set 'hard_dependency': 1. Otherwise, ninja is free to reorder the generate step and the build step. In this case, the dependency chain is libvorlonclient-test -> libvorlonclient -> metric-reporter -> ap-monlog-dbus-proxies. However, only ap-monlog-dbus-proxies sets hard_dependency, so the only build order dependency is metric-reporter -> ap-monlog-dbus-proxies. libvorlonclient-test and libvorlonclient have no build order constraints with respect to generating the headers so their .cc files can be compiled before the generation occurs. libvorlonclient-test and libvorlonclient need a hard_dependency chain to ap-monlog-dbus-proxies, either by depending on it directly (it already has hard_dependency set) or having all intermediate steps set hard_dependency.
,
Apr 5 2018
Sam, thanks a lot for the explanation. We will try the hard_dependency solution and will upload CL once it's verified
,
Apr 5 2018
Here's a completely untested CL (I don't even know whether it compiles) that tries to break the header dependencies so the hard_dependency change wouldn't be necessary: https://chrome-internal-review.googlesource.com/#/c/chromeos/ap-daemons/+/602528 I'll check whether it fixes things.
,
Apr 5 2018
The pre-cq now at least got a green build_packages step, so that's a good sign. I'm still trying to validate locally (I've seen the bug once when rebuilding my whirlwind sysroot) to gain confidence that the fix actually works.
,
Apr 5 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/ec1cd17614f7aff652dc7983eb4d14f8a5b40257 commit ec1cd17614f7aff652dc7983eb4d14f8a5b40257 Author: chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com> Date: Thu Apr 05 13:31:20 2018 Roll src/third_party/chromite/ ec8d7a766..8c177b33b (6 commits) https://chromium.googlesource.com/chromiumos/chromite.git/+log/ec8d7a766a7e..8c177b33bdde $ git log ec8d7a766..8c177b33b --date=short --no-merges --format='%ad %ae %s' 2018-04-05 chrome-bot Update config settings by config-updater. 2018-04-05 smbarber Revert "cbuildbot_launch: Clean distfiles cache when too old" 2018-04-04 dgarrett chromeos_config: Move fuzzer template to 'informational'. 2018-04-03 pprabhu cbuildbot_launch: Clean distfiles cache when too old 2018-04-04 chrome-bot Update config settings by config-updater. 2018-04-04 akeshet remove whirlwind-no-vmtest-pre-cq from default set Created with: roll-dep src/third_party/chromite BUG=chromium:None,chromium:814989,chromium:829055 The AutoRoll server is located here: https://chromite-chromium-roll.skia.org Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary. TBR=chrome-os-gardeners@chromium.org Change-Id: Ic84d9c712043bce2cfaa2452eb101b70ec29843e Reviewed-on: https://chromium-review.googlesource.com/997754 Reviewed-by: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com> Commit-Queue: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com> Cr-Commit-Position: refs/heads/master@{#548406} [modify] https://crrev.com/ec1cd17614f7aff652dc7983eb4d14f8a5b40257/DEPS
,
Apr 5 2018
My pre-cq run with the proposed fix runs into apparently unrelated unit-test failure: https://luci-logdog.appspot.com/v/?s=chromeos/buildbucket/cr-buildbucket.appspot.com/8950083994102712528/+/steps/UnitTest/0/stdout Maybe the compilation bug has masked another bug that got introduced meanwhile?
,
Apr 5 2018
It looks like it was relying on a transitive include of brillo/bind_lambda.h. https://cs.corp.google.com/chromeos_public/src/aosp/external/libbrillo/brillo/bind_lambda.h?l=53 seems to be the missing template specialization.
,
Apr 5 2018
More bad news: In an attempt to validate the fix locally, I've replaced generate-chromeos-dbus-bindings with this script:
#!/bin/sh
sleep 10
echo "$0" "$@" >> /tmp/generator_log
exec /usr/bin/generate-chromeos-dbus-bindings.real "$@"
It turns out there is at least one more instance of the missing hard dependency problem:
FAILED: platform/ap-daemons/common/libipv6mtumonitor.ipv6_upstream_mtu_monitor.o
armv7a-cros-linux-gnueabi-clang++ -MMD -MF platform/ap-daemons/common/libipv6mtumonitor.ipv6_upstream_mtu_monitor.o.d -DGWIFI_CHROMEOS -DUSE_TLSDATED -DUSE_METRICS_UPLOADER=1 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -Iplatform/ap-daemons/libipv6mtumonitor.gen/include -Igen/include -I/build/whirlwind/tmp/portage/chromeos-base/ap-daemons-9999/work/ap-daemons-9999/platform2 -I/build/whirlwind/tmp/portage/chromeos-base/ap-daemons-9999/work/ap-daemons-9999/platform -I/build/whirlwind/usr/include -Igen -Wall -Wno-psabi -Wunused -Wno-unused-parameter -ggdb3 -fstack-protector-strong -Wformat=2 -fvisibility=internal -Wa,--noexecstack -Werror --sysroot=/build/whirlwind -DUSE_RTTI_FOR_TYPE_TAGS -std=c++11 -pthread -Wno-c++11-extensions -Wno-unused
-local-typedefs -DBASE_VER=395517 -pthread -I/build/whirlwind/usr/include/shill-client -I/build/whirlwind/usr/include/chromeos -I/build/whirlwind/usr/include/base-395517 -I/build/whirlwind/usr/include/glib-2.0 -I/build/whirlwind/usr/lib/glib-2.0/include -I/build/whirlwind/usr/include/nss -I/build/whirlwind/usr/include/nspr -I/build/whirlwind/usr/include/dbus-1.0 -I/build/whirlwind/usr/lib/dbus-1.0/include -I/build/whirlwind/usr/include/power_manager-client -fPIE -std=gnu++14 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -O2 -O2 -O2 -pipe -march=armv7-a -mfpu=vfpv3-d16 -mfloat-abi=hard -g -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -clang-syntax -clang-syntax -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -c ../../../../../../../tmp/portage/chromeos-base/ap-daemons-9999/work/ap-daemons-9999/platform/ap-daemons/common/ipv6_upstream_mtu_monitor.cc -o p
latform/ap-daemons/common/libipv6mtumonitor.ipv6_upstream_mtu_monitor.oIn file included from ../../../../../../../tmp/portage/chromeos-base/ap-daemons-9999/work/ap-daemons-9999/platform/ap-daemons/common/ipv6_upstream_mtu_monitor.cc:5:
In file included from /build/whirlwind/tmp/portage/chromeos-base/ap-daemons-9999/work/ap-daemons-9999/platform/ap-daemons/common/ipv6_upstream_mtu_monitor.h:16:
In file included from /build/whirlwind/tmp/portage/chromeos-base/ap-daemons-9999/work/ap-daemons-9999/platform/ap/common/wan_configuration.h:8:
/build/whirlwind/tmp/portage/chromeos-base/ap-daemons-9999/work/ap-daemons-9999/platform/ap/common/configuration_monitor.h:13:10: fatal error: 'ap-daemons/controller/dbus-proxies.h' file not found
#include "ap-daemons/controller/dbus-proxies.h"
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
I'm inclined to just put hard_dependency: 1 into target_defaults to get things building again while we figure out a longer-term solution. This CL should do the trick: https://chrome-internal-review.googlesource.com/#/c/chromeos/ap-daemons/+/602571
,
Apr 5 2018
The following revision refers to this bug: https://chrome-internal.googlesource.com/chromeos/ap-daemons/+/bb62b5cc2018e19e94b566562be967ab311d9a4b commit bb62b5cc2018e19e94b566562be967ab311d9a4b Author: Mattias Nissler <mnissler@chromium.org> Date: Thu Apr 05 16:04:47 2018
,
Apr 5 2018
Is PreCQ supposed to be fine? https://chromium-review.googlesource.com/c/chromiumos/platform2/+/760486 just failed (20 mins ago) on whirlwind BuildPackages.
,
Apr 5 2018
PreCQ has passed on the above CL.
,
Apr 5 2018
It looks like whirlwind-no-vmtest-pre-cq is green again: https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?id=2447085
,
Apr 5 2018
I've encountered a case locally that has issue with even direct dependencies after the fix in comment#21: Local change deletes a dbus method in ap-monlog (package ap-infra), successfully ran build_packages, switch back to master where the deleted method still exists (both ap-infra and ap-daemons) and run build_packages, it would complain the deleted method does not exist for ap-daemons, looking into ap-daemons/out/Default/gen/include/ap/monlog reveals the dbus-proxies have an older timestamp and is not updated in the most recent run of build_packages. The error message complains about libapmonitor which directly depends on ap-monlog-dbus-proxies, and ap-monlog-dbus-proxies is set with hard_dependency: x -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -c ../../../../../../../tmp/portage/chromeos-base/ap-daemons-9999/work/ap-daemons-9999/platform/ap-daemons/monitor/monitor_daemon.cc -o platform/ap-daemons/monitor/libapmonitor.monitor_daemon.o ap-daemons-9999: ../../../../../../../tmp/portage/chromeos-base/ap-daemons-9999/work/ap-daemons-9999/platform/ap-daemons/monitor/monitor_daemon.cc:597:31: error: no member named 'RecordStructuredDataByRequestUid' in 'org::chromium::ap::Monlog::ManagerProxyInterface' ap-daemons-9999: if (!monlog_manager_proxy_->RecordStructuredDataByRequestUid( ap-daemons-9999: ~~~~~~~~~~~~~~~~~~~~~ ^ ap-daemons-9999: 1 error generated. ap-daemons-9999: ninja: build stopped: subcommand failed. ap-daemons-9999: ap-daemons-9999: cmd=['ninja', '-C', '/build/gale/var/cache/portage/chromeos-base/ap-daemons/out/Default', '-j', '12', 'all'] Attempt to set hard_dependency in target_defaults for ap-infra.gyp and monlog.gypi doesn't solve this problem still
,
Apr 5 2018
#c26 a Portage-level problem with invalidation of reverse deps. https://b.corp.google.com/issues/75396373 is tracking fixing that when using build_packages.
,
Apr 5 2018
re #c27, adding ap-infra to RDEPEND in ap-daemons does not resolve the issue either, is there a separate bug tracking portage-level problem with outdated deps?
,
Apr 5 2018
RDEPEND is for *runtime* dependencies, not reverse dependencies so is not expected to affect reverse dependencies.
,
Apr 5 2018
whirlwind-no-vmtest-pre-cq has been green consecutively, pls put whirlwind back to the pre_cq_default_config.
,
Apr 5 2018
Before we put this back into a place where it can block all ChromeOS development, I think we need to address the ownership question that allowed it to remain broken for so long. https://b.corp.google.com/issues/37222958
,
Apr 5 2018
In this particular case, whirlwind was a victim. The breakage is noticed 4/4 and immediate actions were taken by jetstream team https://buganizer.corp.google.com/issues/77601144
,
Apr 9 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/12a067168f2523c71ef27f1e3e48a6d47692f7c6 commit 12a067168f2523c71ef27f1e3e48a6d47692f7c6 Author: Mu Lin <mxt@chromium.org> Date: Mon Apr 09 22:18:57 2018 Revert "remove whirlwind-no-vmtest-pre-cq from default set" This reverts commit bca1fd053debef0c43ec2b2955fd62a33190905f. Reason for revert: https://bugs.chromium.org/p/chromium/issues/detail?id=829055#c32 Original change's description: > remove whirlwind-no-vmtest-pre-cq from default set > > BUG= chromium:829055 > TEST=None > > Change-Id: Id9d8305099150a181c622f6fcab146af9730e52c > Reviewed-on: https://chromium-review.googlesource.com/996640 > Tested-by: Aviv Keshet <akeshet@chromium.org> > Trybot-Ready: Aviv Keshet <akeshet@chromium.org> > Reviewed-by: Stephen Barber <smbarber@chromium.org> Bug: chromium:829055 Change-Id: I81956a4f70f2a896aad09a02ff372cbd166d3abe Reviewed-on: https://chromium-review.googlesource.com/998393 Commit-Ready: Mu Lin <mxt@google.com> Tested-by: Jason Clinton <jclinton@chromium.org> Reviewed-by: Jason Clinton <jclinton@chromium.org> Reviewed-by: Aviv Keshet <akeshet@chromium.org> Reviewed-by: Stephen Barber <smbarber@chromium.org> [modify] https://crrev.com/12a067168f2523c71ef27f1e3e48a6d47692f7c6/lib/constants.py
,
Apr 10 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/4d1bbb3fd11879ae03b79f844719703b9d4b959c commit 4d1bbb3fd11879ae03b79f844719703b9d4b959c Author: chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com> Date: Tue Apr 10 04:36:17 2018 Roll src/third_party/chromite/ 02ef76b01..871e7778e (6 commits) https://chromium.googlesource.com/chromiumos/chromite.git/+log/02ef76b018ac..871e7778ef8b $ git log 02ef76b01..871e7778e --date=short --no-merges --format='%ad %ae %s' 2018-03-28 pprabhu gerrit: Add 'private' subcommand. 2018-04-09 dgarrett cros tryjob: Remove --waterfall option. 2018-04-09 cmtice [clang-tidy builder] Change board from 'kahlee' to 'grunt'. 2018-04-05 mxt Revert "remove whirlwind-no-vmtest-pre-cq from default set" 2018-04-06 xzhou Update tss group test to add bootlockboxd to tss. 2018-04-06 dgarrett cbuildbot_launch: RunCbuildbot -> Cbuildbot. Created with: roll-dep src/third_party/chromite BUG=chromium:None,chromium:752342,chromium:830691,chromium:829055,chromium:None The AutoRoll server is located here: https://chromite-chromium-roll.skia.org Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary. TBR=chrome-os-gardeners@chromium.org Change-Id: Iad8d29ecd5f47e06ce4867453d3ba4a4606d835c Reviewed-on: https://chromium-review.googlesource.com/1004003 Commit-Queue: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com> Reviewed-by: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com> Cr-Commit-Position: refs/heads/master@{#549423} [modify] https://crrev.com/4d1bbb3fd11879ae03b79f844719703b9d4b959c/DEPS
,
Apr 11 2018
I believe that this is fixed. mxt@ - can you please confirm?
,
Apr 19 2018
,
May 11 2018
,
Jul 7
The following revision refers to this bug: https://chrome-internal.googlesource.com/chromeos/overlays/project-jetstream-private/+/576ed49488193fdb715c36c55f78c940eb0c0fa2 commit 576ed49488193fdb715c36c55f78c940eb0c0fa2 Author: Stephen Barber <smbarber@google.com> Date: Sat Jul 07 01:48:38 2018 |
|||||||||||
►
Sign in to add a comment |
|||||||||||
Comment 1 by jkardatzke@chromium.org
, Apr 4 2018