buildstart stage failing with IntegrityError
Issue description

Chrome Version: ToT
OS: Chrome OS

All toolchain builders failed with:

@@@STEP_LINK@Builder documentation@http://www.chromium.org/chromium-os/build/builder-overview#TOC-Continuous@@@
10:01:42: INFO: Running cidb query on pid 19869, repr(query) starts with 'SELECT NOW()'
10:01:42: INFO: Running cidb query on pid 19869, repr(query) starts with <sqlalchemy.sql.expression.Insert object at 0x7f846a485590>
10:01:42: ERROR: Error: (IntegrityError) (1062, "Duplicate entry '8968103418167764992' for key 'buildbucket_id_index'") 'INSERT INTO `buildTable` (master_build_id, buildbot_generation, builder_name, waterfall, build_number, build_config, bot_hostname, start_time, deadline, important, buildbucket_id) VALUES (%s, %s, %s, %s, %s, %s, %s, CURRENT_TIMESTAMP, %s, %s, %s)' (1860550, 1, 'amd64-llvm-next-toolchain', 'chromeos', 388, u'amd64-llvm-next-toolchain', 'cros-beefy272-c2.c.chromeos-bot.internal', datetime.datetime(2017, 9, 19, 6, 53, 24), True, '8968103418167764992')
If the buildbucket_id to insert is duplicated to the buildbucket_id of an old build and the old build was canceled because of a waterfall master restart, please ignore this error. Else, the error needs more investigation. More context: crbug.com/679974 and crbug.com/685889

What steps will reproduce the problem?

Here are two examples:
https://chromegw.corp.google.com/i/chromeos/builders/arm64-llvm-next-toolchain/builds/387/steps/BuildStart/logs/stdio
https://chromegw.corp.google.com/i/chromeos/builders/amd64-llvm-next-toolchain/builds/388/steps/BuildStart/logs/stdio

All the toolchain builders failed with this error this morning:
https://chromegw.corp.google.com/i/chromeos/waterfall?builder=master-toolchain&builder=amd64-llvm-next-toolchain&builder=arm-llvm-next-toolchain&builder=arm64-llvm-next-toolchain&titles=off&reload=30

The error message refers to crbug.com/679974, and I can see that the previous iteration of the builders was "interrupted" (purple), so maybe I should be ignoring this error. But ignoring it does not sound right. Yesterday's build was "interrupted", and today's build failed because of this error. I don't think this should be expected behavior: it is just another day of testing that was not done. So, does every interrupt on one day mean a failure on the next day?

Assigning to Sheriff (akeshet) for clarification.
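For readers triaging this, the duplicate-key failure in the log above can be reproduced in miniature. The following is only a sketch under stated assumptions, not cidb's actual code: it uses an in-memory SQLite database instead of the MySQL backend from the log, a cut-down `buildTable` with a hypothetical `status` column, hypothetical status values (`inflight`, `aborted`), and an assumed earlier build number of 387. The helper names `insert_build` and `can_ignore_duplicate` are invented for illustration. It shows the unique index rejecting a reused buildbucket_id, and the triage rule from the error message (ignore only if the old build with that ID was canceled).

```python
import sqlite3

BUILDBUCKET_ID = "8968103418167764992"  # ID taken from the log above


def insert_build(conn, builder_name, build_number, buildbucket_id):
    """Insert a build row (hypothetical helper).

    Raises sqlite3.IntegrityError if buildbucket_id is already present,
    because of the unique index created in demo().
    """
    conn.execute(
        "INSERT INTO buildTable (builder_name, build_number, buildbucket_id, status) "
        "VALUES (?, ?, ?, 'inflight')",
        (builder_name, build_number, buildbucket_id),
    )


def can_ignore_duplicate(conn, buildbucket_id):
    """Triage rule from the error text: ignorable only if the old build was canceled.

    The 'aborted' status value is an assumption made for this sketch.
    """
    row = conn.execute(
        "SELECT status FROM buildTable WHERE buildbucket_id = ?",
        (buildbucket_id,),
    ).fetchone()
    return row is not None and row[0] == "aborted"


def demo():
    conn = sqlite3.connect(":memory:")
    # Cut-down, hypothetical schema; the real table has many more columns.
    conn.execute(
        "CREATE TABLE buildTable ("
        "id INTEGER PRIMARY KEY, builder_name TEXT, build_number INTEGER, "
        "buildbucket_id TEXT, status TEXT)"
    )
    conn.execute(
        "CREATE UNIQUE INDEX buildbucket_id_index ON buildTable (buildbucket_id)"
    )

    # An earlier build (number assumed) registers the ID, then gets interrupted.
    insert_build(conn, "amd64-llvm-next-toolchain", 387, BUILDBUCKET_ID)
    conn.execute("UPDATE buildTable SET status = 'aborted' WHERE build_number = 387")

    # The next build is handed the same buildbucket_id and fails at insert time.
    try:
        insert_build(conn, "amd64-llvm-next-toolchain", 388, BUILDBUCKET_ID)
    except sqlite3.IntegrityError as err:
        return str(err), can_ignore_duplicate(conn, BUILDBUCKET_ID)
    return None, False


if __name__ == "__main__":
    message, ignorable = demo()
    print(message)
    print("ignorable:", ignorable)
```

In this sketch the second insert is rejected by the index regardless of why the ID was reused; the status lookup is what separates the "ignore" case from the "needs more investigation" case.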
,
Sep 18 2017
,
Sep 18 2017
,
Sep 25 2017
We can close this now; it was a corner case.
,
Oct 23 2017
This issue occurred again. All paladins failed or stopped with an exception. The previous master-paladin build failed to clean up (https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/16679), which may be the reason. Here is an example of a failed build: https://uberchromegw.corp.google.com/i/chromeos/builders/beaglebone-paladin/builds/15071 I will keep an eye on it to see whether it is a flaky failure.
,
Oct 23 2017
Re #5, yes it's a flaky failure.
,
Oct 23 2017
This happens (rarely) when buildbot forgets about a previous build and re-uses its buildbot #.
,
Jan 4 2018
Comment 1 by akes...@chromium.org, Sep 18 2017