Swarming builds don't have a buildbot build_number. |
||||
Issue descriptionThis representative build: https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8957835531229180704 Has this failure: ************************************************************ ** Start Stage BuildStart - Tue, 09 Jan 2018 15:16:34 -0800 (PST) ** ** The first stage to run. ** ** This stage writes a few basic metadata values that are known at the start of ** build, and inserts the build into the database, if appropriate. ************************************************************ Preconditions for the stage successfully met. Beginning to execute stage... 15:16:34: INFO: Running cidb query on pid 6145, repr(query) starts with <sqlalchemy.sql.expression.Insert object at 0x7fc2eaef4e10> [1;31m15:16:34: ERROR: Error: (IntegrityError) (1062, "Duplicate entry '1-lumpy-compile-only-pre-cq--0' for key 'buildbot_generation'") 'INSERT INTO `buildTable` (master_build_id, buildbot_generation, builder_name, waterfall, build_number, build_config, bot_hostname, start_time, important, buildbucket_id) VALUES (%s, %s, %s, %s, %s, %s, %s, CURRENT_TIMESTAMP, %s, %s)' (None, 1, u'lumpy-compile-only-pre-cq', '', 0, u'lumpy-compile-only-pre-cq', 'swarm-cros-0.c.chromeos-bot.internal', True, '8957835531229180704') If the buildbucket_id to insert is duplicated to the buildbucket_id of an old build and the old build was canceled because of a waterfall master restart, please ignore this error. Else, the error needs more investigation. More context: crbug.com/679974 and crbug.com/685889 [0m [1;31m15:16:34: ERROR: <class 'sqlalchemy.exc.IntegrityError'>: (IntegrityError) (1062, "Duplicate entry '1-lumpy-compile-only-pre-cq--0' for key 'buildbot_generation'") 'INSERT INTO `buildTable` (master_build_id, buildbot_generation, builder_name, waterfall, build_number, build_config, bot_hostname, start_time, important, buildbucket_id) VALUES (%s, %s, %s, %s, %s, %s, %s, CURRENT_TIMESTAMP, %s, %s)' (None, 1, u'lumpy-compile-only-pre-cq', '', 0, u'lumpy-compile-only-pre-cq', 'swarm-cros-0.c.chromeos-bot.internal', True, '8957835531229180704') Traceback (most recent call last): File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/failures_lib.py", line 229, in wrapped_functor return functor(*args, **kwargs) File "/b/swarming/w/ir/cache/cbuild/repository/chromite/cbuildbot/stages/report_stages.py", line 284, in PerformStage raise e IntegrityError: (IntegrityError) (1062, "Duplicate entry '1-lumpy-compile-only-pre-cq--0' for key 'buildbot_generation'") 'INSERT INTO `buildTable` (master_build_id, buildbot_generation, builder_name, waterfall, build_number, build_config, bot_hostname, start_time, important, buildbucket_id) VALUES (%s, %s, %s, %s, %s, %s, %s, CURRENT_TIMESTAMP, %s, %s)' (None, 1, u'lumpy-compile-only-pre-cq', '', 0, u'lumpy-compile-only-pre-cq', 'swarm-cros-0.c.chromeos-bot.internal', True, '8957835531229180704') [0m 15:16:34: INFO: Translating result <class 'sqlalchemy.exc.IntegrityError'>: (IntegrityError) (1062, "Duplicate entry '1-lumpy-compile-only-pre-cq--0' for key 'buildbot_generation'") 'INSERT INTO `buildTable` (master_build_id, buildbot_generation, builder_name, waterfall, build_number, build_config, bot_hostname, start_time, important, buildbucket_id) VALUES (%s, %s, %s, %s, %s, %s, %s, CURRENT_TIMESTAMP, %s, %s)' (None, 1, u'lumpy-compile-only-pre-cq', '', 0, u'lumpy-compile-only-pre-cq', 'swarm-cros-0.c.chromeos-bot.internal', True, '8957835531229180704') Traceback (most recent call last): File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/failures_lib.py", line 229, in wrapped_functor return functor(*args, **kwargs) File "/b/swarming/w/ir/cache/cbuild/repository/chromite/cbuildbot/stages/report_stages.py", line 284, in PerformStage raise e IntegrityError: (IntegrityError) (1062, "Duplicate entry '1-lumpy-compile-only-pre-cq--0' for key 'buildbot_generation'") 'INSERT INTO `buildTable` (master_build_id, buildbot_generation, builder_name, waterfall, build_number, build_config, bot_hostname, start_time, important, buildbucket_id) VALUES (%s, %s, %s, %s, %s, %s, %s, CURRENT_TIMESTAMP, %s, %s)' (None, 1, u'lumpy-compile-only-pre-cq', '', 0, u'lumpy-compile-only-pre-cq', 'swarm-cros-0.c.chromeos-bot.internal', True, '8957835531229180704') to fail.
,
Jan 10 2018
I'm going to guess that: 1-lumpy-compile-only-pre-cq--0 Is a combination of buildbot-generation, builder_name, waterfall, build_number Which means this is breaking because we populate the buildbot build_number with a hard coded value of 0.
,
Jan 10 2018
What assumptions would be broken by using the buildbucket id there?
,
Jan 10 2018
Or.... adjust the index in question to include buildbucket id.
,
Jan 10 2018
The index enforces that (builder_name, build_number) is unique. Sounds like something here caused build_number (which was a buildbot concept) to either be reset to 0, or to simply stop incrementing at all. If it's the former and we can't un-reset it, then we can bump the "buildbot-generation" number by 1, it's hard coded in chromite. This was a release valve I put in place for precisely this situation. If it's the latter and sequential build_number is not going to be a meaningful concept at all, then it needs some more thought on how we want to migrate away from using build_number.
,
Jan 10 2018
Currently, "build_number" is ALWAYS 0 for swarming builds. There is no similar concept, except for the buildbucket id, or swarming id. My belief is that this is an issue for CIDB interaction, version strings, and (indirectly) artifact URLs. For swarming builds, they could also be replaced with CIDB id's and/or with buildbucket ids. CIDB id's are ordered which is nice when looking at build artifacts in GS. But they create a problem creating the build table entry, since you need to create an entry to allocate a CIDB id. buildbucket id's create an issue because they are non-sequential as well as not formatted as integers (and are really long strings). I'm also fairly certain that the stats scripts use timestamps for ordering, but I'm not 100%. My thinking is to modify the UNIQUE KEY first, then work through the other issues. The need to do so is called out in my roadmap docs.
,
Jan 12 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/452339af383f081bbf48a0c0cb8d5660460ba5c3 commit 452339af383f081bbf48a0c0cb8d5660460ba5c3 Author: Don Garrett <dgarrett@google.com> Date: Fri Jan 12 03:37:37 2018 cidb: Add buildbucket_id to build_table index. Swarming builds don't have a buildbot build_number. So... they don't appear to be unique in the the build table unique index. Add buildbucket_id to that index. BUG= chromium:800562 TEST=cidb_integration_test Change-Id: I7ffdae1f3d46510d624fb5af86c3113f43bb702f Reviewed-on: https://chromium-review.googlesource.com/858579 Commit-Ready: Don Garrett <dgarrett@chromium.org> Tested-by: Don Garrett <dgarrett@chromium.org> Reviewed-by: Aviv Keshet <akeshet@chromium.org> Reviewed-by: Don Garrett <dgarrett@chromium.org> [modify] https://crrev.com/452339af383f081bbf48a0c0cb8d5660460ba5c3/cidb/schema.dump [add] https://crrev.com/452339af383f081bbf48a0c0cb8d5660460ba5c3/cidb/migrations/00064_add_buildbucket_to_build_table_index.sql
,
Jan 12 2018
That schema migration ran in substantially < 1 second on the debug db (~40k rows). If the cost is linear with the database size, this means that prod (~2.2 million rows) will take < 1 minute.
,
Jan 13 2018
I created a clone of prod CIDB and it finshed in < 14 seconds. Still, it's Friday before a 3 day weekend. Waiting until Monday to apply in prod.
,
Jan 16 2018
Builds are now working, but the unique IDs used for builds in a variety of places aren't always unique. https://ci.chromium.org/p/chromeos/builds/b8957224010844851056 This may affect CIDB behavior (needs investigation), may affect build version strings, and certainly affect build artifact URLs.
,
Jan 18 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/1c6f6d536bbaad829a610332f1f8dbf7a576eb8e commit 1c6f6d536bbaad829a610332f1f8dbf7a576eb8e Author: Don Garrett <dgarrett@google.com> Date: Thu Jan 18 22:44:46 2018 cidb: Remove start/ending_build_number. build_number is based on a buildbot concept that doesn't exist for swarming builds. Remove the use of it for build history searches. BUG= chromium:800562 TEST=run_tests Change-Id: Ib5e43dabe066f20bd4425ebcbe5512b243815386 Reviewed-on: https://chromium-review.googlesource.com/869193 Commit-Ready: Don Garrett <dgarrett@chromium.org> Tested-by: Don Garrett <dgarrett@chromium.org> Reviewed-by: Aviv Keshet <akeshet@chromium.org> [modify] https://crrev.com/1c6f6d536bbaad829a610332f1f8dbf7a576eb8e/scripts/summarize_build_stats.py [modify] https://crrev.com/1c6f6d536bbaad829a610332f1f8dbf7a576eb8e/lib/fake_cidb.py [modify] https://crrev.com/1c6f6d536bbaad829a610332f1f8dbf7a576eb8e/lib/cidb.py [modify] https://crrev.com/1c6f6d536bbaad829a610332f1f8dbf7a576eb8e/scripts/som_alerts_dispatcher.py
,
Jan 19 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/8cd2a1c6b37480959bd001f932904b74b1f1d2b2 commit 8cd2a1c6b37480959bd001f932904b74b1f1d2b2 Author: Don Garrett <dgarrett@google.com> Date: Fri Jan 19 08:29:46 2018 cbuildbot_run: Adjust GetVersion to use CIDB Id. build_number is a buildbot concept which doesn't exist for swarming builds (and defaults to 0 there). Switch to CIDB ids for the same purpose. This needs unittests and more thought, but uploading now to allow tryjobs. BUG= chromium:800562 TEST=run_tests Change-Id: I4850d1d2438786c6e688db90926f66a047eed6d4 Reviewed-on: https://chromium-review.googlesource.com/869450 Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com> Tested-by: Don Garrett <dgarrett@chromium.org> Reviewed-by: Ningning Xia <nxia@chromium.org> [modify] https://crrev.com/8cd2a1c6b37480959bd001f932904b74b1f1d2b2/cbuildbot/cbuildbot_run.py [modify] https://crrev.com/8cd2a1c6b37480959bd001f932904b74b1f1d2b2/cbuildbot/cbuildbot_run_unittest.py
,
Jan 19 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/10ace85c7a3e6c42cbc0805da3379fe70be70d68 commit 10ace85c7a3e6c42cbc0805da3379fe70be70d68 Author: Dan Erat <derat@chromium.org> Date: Fri Jan 19 19:46:57 2018 Revert "cbuildbot_run: Adjust GetVersion to use CIDB Id." This reverts commit 8cd2a1c6b37480959bd001f932904b74b1f1d2b2. Reason for revert: Apparently breaking provisioning on informational Chrome PFQ builders: https://crbug.com/803854 Original change's description: > cbuildbot_run: Adjust GetVersion to use CIDB Id. > > build_number is a buildbot concept which doesn't exist for swarming > builds (and defaults to 0 there). Switch to CIDB ids for the same > purpose. > > This needs unittests and more thought, but uploading now to allow > tryjobs. > > BUG= chromium:800562 > TEST=run_tests > > Change-Id: I4850d1d2438786c6e688db90926f66a047eed6d4 > Reviewed-on: https://chromium-review.googlesource.com/869450 > Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com> > Tested-by: Don Garrett <dgarrett@chromium.org> > Reviewed-by: Ningning Xia <nxia@chromium.org> Bug: chromium:800562 Change-Id: If865e002f1f0e17ce47707655f6df6d38dd05dfe Reviewed-on: https://chromium-review.googlesource.com/876243 Commit-Queue: Dan Erat <derat@chromium.org> Tested-by: Dan Erat <derat@chromium.org> Trybot-Ready: Dan Erat <derat@chromium.org> Reviewed-by: Lann Martin <lannm@chromium.org> [modify] https://crrev.com/10ace85c7a3e6c42cbc0805da3379fe70be70d68/cbuildbot/cbuildbot_run.py [modify] https://crrev.com/10ace85c7a3e6c42cbc0805da3379fe70be70d68/cbuildbot/cbuildbot_run_unittest.py
,
Jan 22 2018
To fix the version testing, you'll need to fix this method:
https://cs.corp.google.com/chromeos_internal/src/third_party/autotest/files/server/hosts/cros_host.py?l=1414
The change is along these lines:
* Create a new function in lsbrelease_utils.py that can read
CHROMEOS_RELEASE_BUILDER_PATH.
* Plumb the lsrelease_utils function into a method in CrosHost.
* Drop the current complicated check in favor of a check that
the value of the "cros-version" label is equal to the value
returned by the new method (that is, don't call utils.version_match()).
For reference, a modern /etc/lsb-release with CHROMEOS_RELEASE_BUILDER_PATH
looks like this:
CHROMEOS_RELEASE_APPID={3C46DD14-D42A-9F77-E2CB-7D83422F5B73}
CHROMEOS_BOARD_APPID={3C46DD14-D42A-9F77-E2CB-7D83422F5B73}
CHROMEOS_CANARY_APPID={90F229CE-83E2-4FAF-8479-E368A34938B1}
DEVICETYPE=CHROMEBOX
CHROMEOS_RELEASE_BUILDER_PATH=tricky-tot-chrome-pfq-informational/R65-10321.0.0-c2223837
GOOGLE_RELEASE=10321.0.0
CHROMEOS_DEVSERVER=
CHROMEOS_RELEASE_BOARD=tricky
CHROMEOS_RELEASE_BUILD_NUMBER=10321
CHROMEOS_RELEASE_BRANCH_NUMBER=0
CHROMEOS_RELEASE_CHROME_MILESTONE=65
CHROMEOS_RELEASE_PATCH_NUMBER=0
CHROMEOS_RELEASE_TRACK=testimage-channel
CHROMEOS_RELEASE_DESCRIPTION=10321.0.0 (Official Build) dev-channel tricky test
CHROMEOS_RELEASE_BUILD_TYPE=Official Build
CHROMEOS_RELEASE_NAME=Chrome OS
CHROMEOS_RELEASE_VERSION=10321.0.0
CHROMEOS_AUSERVER=https://tools.google.com/service/update2
,
Feb 13 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/autotest/+/b9f3580ea0c8a766d8fcf42d00abbf51558f49b0 commit b9f3580ea0c8a766d8fcf42d00abbf51558f49b0 Author: Don Garrett <dgarrett@google.com> Date: Tue Feb 13 19:08:43 2018 [autotest] Adjust DUT version labels verification. Compare lsbrelease's CHROMEOS_RELEASE_BUILD_PATH againt the version label to see if the correct version is installed, instead of CHROMEOS_RELEASE_VERSION. A simple string compare makes the lab slightly more robust against changes to the format of that string. This does NOT affect the provision flow and verifications done there. BUG= chromium:800562 TEST=None Change-Id: Ia98c62744d1cbbacc22cb10500ae0e31b6e61755 Reviewed-on: https://chromium-review.googlesource.com/907326 Reviewed-by: Don Garrett <dgarrett@chromium.org> Tested-by: Don Garrett <dgarrett@chromium.org> [modify] https://crrev.com/b9f3580ea0c8a766d8fcf42d00abbf51558f49b0/client/common_lib/lsbrelease_utils_unittest.py [modify] https://crrev.com/b9f3580ea0c8a766d8fcf42d00abbf51558f49b0/server/hosts/cros_host.py [modify] https://crrev.com/b9f3580ea0c8a766d8fcf42d00abbf51558f49b0/client/common_lib/lsbrelease_utils.py
,
Feb 15 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/68134125081027f1ae98be03a8e5b882ac24a39a commit 68134125081027f1ae98be03a8e5b882ac24a39a Author: Don Garrett <dgarrett@google.com> Date: Thu Feb 15 08:10:53 2018 cbuildbot_run: Adjust GetVersion to use CIDB Id. build_number is a buildbot concept which doesn't exist for swarming builds (and defaults to 0 there). Switch to CIDB ids for the same purpose. Builder version strings for assorted unofficial builders (including tryjobs) will change formats from: chell-incremental/R66-10401.0.0-b4390 -> chell-incremental/R66-10401.0.0-c123456 Hopefully, there won't be anyone left trying to explicitly parse those strings. This was previously landed as CL:869450, and reverted as CL:876243. Relanding, now that the lab has been adjusted to accept the new build version string format, and "cros buildresult" is available for the toolchain team. BUG= chromium:800562 TEST=run_tests Change-Id: I7e007f1c660c22efce704233792c9716aca6c3f5 Reviewed-on: https://chromium-review.googlesource.com/917601 Commit-Ready: Don Garrett <dgarrett@chromium.org> Tested-by: Don Garrett <dgarrett@chromium.org> Reviewed-by: Ningning Xia <nxia@chromium.org> [modify] https://crrev.com/68134125081027f1ae98be03a8e5b882ac24a39a/cbuildbot/cbuildbot_run.py [modify] https://crrev.com/68134125081027f1ae98be03a8e5b882ac24a39a/cbuildbot/cbuildbot_run_unittest.py
,
Feb 15 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/ab252592101b21ba15a2372f0998ed56fb1f047c commit ab252592101b21ba15a2372f0998ed56fb1f047c Author: Don Garrett <dgarrett@chromium.org> Date: Thu Feb 15 18:27:22 2018 Revert "cbuildbot_run: Adjust GetVersion to use CIDB Id." This reverts commit 68134125081027f1ae98be03a8e5b882ac24a39a. Reason for revert: This CL works, but breaks toolchain team scripts. A new tool "cros buildresult" was built to let the toolchain scripts be less dependent on how we run builds, but has a significant bug which prevents them from using it yet. Reverting until the new tool works for them. BUG= chromium:807029 Original change's description: > cbuildbot_run: Adjust GetVersion to use CIDB Id. > > build_number is a buildbot concept which doesn't exist for swarming > builds (and defaults to 0 there). Switch to CIDB ids for the same > purpose. > > Builder version strings for assorted unofficial builders (including > tryjobs) will change formats from: > > chell-incremental/R66-10401.0.0-b4390 > -> > chell-incremental/R66-10401.0.0-c123456 > > Hopefully, there won't be anyone left trying to explicitly parse those > strings. > > This was previously landed as CL:869450, and reverted as CL:876243. > > Relanding, now that the lab has been adjusted to accept the new build > version string format, and "cros buildresult" is available for the > toolchain team. > > BUG= chromium:800562 > TEST=run_tests > > Change-Id: I7e007f1c660c22efce704233792c9716aca6c3f5 > Reviewed-on: https://chromium-review.googlesource.com/917601 > Commit-Ready: Don Garrett <dgarrett@chromium.org> > Tested-by: Don Garrett <dgarrett@chromium.org> > Reviewed-by: Ningning Xia <nxia@chromium.org> Bug: chromium:800562 Change-Id: I833e2324439c6eaf750f47005130f02470336d4d Reviewed-on: https://chromium-review.googlesource.com/922521 Reviewed-by: Don Garrett <dgarrett@chromium.org> Commit-Queue: Don Garrett <dgarrett@chromium.org> Tested-by: Don Garrett <dgarrett@chromium.org> [modify] https://crrev.com/ab252592101b21ba15a2372f0998ed56fb1f047c/cbuildbot/cbuildbot_run.py [modify] https://crrev.com/ab252592101b21ba15a2372f0998ed56fb1f047c/cbuildbot/cbuildbot_run_unittest.py
,
Feb 28 2018
,
Mar 10 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/8e3708c8093dd77a78e8425d0d23f79fc8779f1e commit 8e3708c8093dd77a78e8425d0d23f79fc8779f1e Author: Don Garrett <dgarrett@google.com> Date: Sat Mar 10 06:59:24 2018 cbuildbot_run: Adjust GetVersion to use CIDB Id. A similar CL:869450 was submitted and reverted previously because assorted lab code could not handle the change in the ChromeOS build version string from *-bXXX to *-cXXX. I've given up trying to make the lab more flexible, so this CL keeps the *-bXXX, but changes the meaning of XXX from builder run number to CIDB id. This is still expected to cause issues for the toolchain team until chromium:807029 is fixed. BUG= chromium:800562 TEST=run_tests CQ-DEPEND=CL:954052 Change-Id: I1132f8e2618314f2cdcc8705756394d275f9ce51 Reviewed-on: https://chromium-review.googlesource.com/951915 Commit-Ready: Don Garrett <dgarrett@chromium.org> Tested-by: Don Garrett <dgarrett@chromium.org> Reviewed-by: Richard Barnette <jrbarnette@google.com> [modify] https://crrev.com/8e3708c8093dd77a78e8425d0d23f79fc8779f1e/cbuildbot/cbuildbot_run.py [modify] https://crrev.com/8e3708c8093dd77a78e8425d0d23f79fc8779f1e/cbuildbot/cbuildbot_run_unittest.py
,
Mar 12 2018
Existing builders seem to be okay with the new format, and running hwtests: https://uberchromegw.corp.google.com/i/chromeos/builders/chell-incremental/builds/4892 https://uberchromegw.corp.google.com/i/chromeos.chrome/builders/peach_pit-tot-chrome-pfq-informational/builds/9977 I'm running two sample builds here, including hwtests for one of them: http://cros-goldeneye/chromeos/healthmonitoring/buildDetails?buildbucketId=8952232723342486912 http://cros-goldeneye/chromeos/healthmonitoring/buildDetails?buildbucketId=8952231497141416928 http://cros-goldeneye/chromeos/healthmonitoring/buildDetails?buildbucketId=8952231133663472816 They seem to be working so far, waiting for them all to complete.
,
Mar 16 2018
|
||||
►
Sign in to add a comment |
||||
Comment 1 by dgarr...@chromium.org
, Jan 9 2018