Starred by 1 user
Status: WontFix
Owner:
Closed: Sep 25
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



buildstart stage failing with IntegrityError
Project Member Reported by llozano@chromium.org, Sep 18
Chrome Version: ToT
OS: Chrome OS

All toolchain builders failed with: 
@@@STEP_LINK@Builder documentation@http://www.chromium.org/chromium-os/build/builder-overview#TOC-Continuous@@@
10:01:42: INFO: Running cidb query on pid 19869, repr(query) starts with 'SELECT NOW()'
10:01:42: INFO: Running cidb query on pid 19869, repr(query) starts with <sqlalchemy.sql.expression.Insert object at 0x7f846a485590>
10:01:42: ERROR: Error: (IntegrityError) (1062, "Duplicate entry '8968103418167764992' for key 'buildbucket_id_index'") 'INSERT INTO `buildTable` (master_build_id, buildbot_generation, builder_name, waterfall, build_number, build_config, bot_hostname, start_time, deadline, important, buildbucket_id) VALUES (%s, %s, %s, %s, %s, %s, %s, CURRENT_TIMESTAMP, %s, %s, %s)' (1860550, 1, 'amd64-llvm-next-toolchain', 'chromeos', 388, u'amd64-llvm-next-toolchain', 'cros-beefy272-c2.c.chromeos-bot.internal', datetime.datetime(2017, 9, 19, 6, 53, 24), True, '8968103418167764992')
 If the buildbucket_id to insert is duplicated to the buildbucket_id of an old build and the old build was canceled because of a waterfall master restart, please ignore this error. Else, the error needs more investigation. More context:  crbug.com/679974  and crbug.com/685889
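
For illustration, the rule in the message above can be sketched as follows. This is a minimal sketch, not the actual cidb code: it uses an in-memory SQLite database as a stand-in for the MySQL-backed cidb, a heavily simplified `buildTable`, and a hypothetical `status` column with hypothetical values `inflight` and `aborted`. The function names `make_db`, `mark_aborted`, and `insert_build` are assumptions made for the example. Only the table name, the `buildbucket_id` column, and the `buildbucket_id_index` name come from the log.

```python
import sqlite3


def make_db():
    """Create a simplified stand-in for buildTable (assumption: not the real schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE buildTable ("
        " id INTEGER PRIMARY KEY,"
        " builder_name TEXT,"
        " build_number INTEGER,"
        " status TEXT,"            # hypothetical column for this sketch
        " buildbucket_id TEXT)"
    )
    # The unique index is what turns a reused buildbucket_id into an IntegrityError.
    conn.execute(
        "CREATE UNIQUE INDEX buildbucket_id_index ON buildTable (buildbucket_id)"
    )
    conn.commit()
    return conn


def mark_aborted(conn, buildbucket_id):
    """Record that an old build was canceled, e.g. by a waterfall master restart."""
    with conn:  # commits on success
        conn.execute(
            "UPDATE buildTable SET status = 'aborted' WHERE buildbucket_id = ?",
            (buildbucket_id,),
        )


def insert_build(conn, builder_name, build_number, buildbucket_id):
    """Insert a build row and classify a duplicate buildbucket_id.

    Returns 'inserted', 'duplicate_of_aborted' (can be ignored per the log
    message), or 'duplicate_needs_investigation'.
    """
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO buildTable"
                " (builder_name, build_number, status, buildbucket_id)"
                " VALUES (?, ?, 'inflight', ?)",
                (builder_name, build_number, buildbucket_id),
            )
        return "inserted"
    except sqlite3.IntegrityError:
        # Look up the row that already owns this buildbucket_id.
        row = conn.execute(
            "SELECT status FROM buildTable WHERE buildbucket_id = ?",
            (buildbucket_id,),
        ).fetchone()
        if row is not None and row[0] == "aborted":
            return "duplicate_of_aborted"
        return "duplicate_needs_investigation"
```

In this sketch, a second insert with the same `buildbucket_id` is classified as ignorable only if the earlier row was marked as canceled. Any other duplicate is flagged for investigation, mirroring the wording of the error message.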


What steps will reproduce the problem?

Here is an example: 
https://chromegw.corp.google.com/i/chromeos/builders/arm64-llvm-next-toolchain/builds/387/steps/BuildStart/logs/stdio

and 

https://chromegw.corp.google.com/i/chromeos/builders/amd64-llvm-next-toolchain/builds/388/steps/BuildStart/logs/stdio

All the toolchain builders failed with this error this morning.

https://chromegw.corp.google.com/i/chromeos/waterfall?builder=master-toolchain&builder=amd64-llvm-next-toolchain&builder=arm-llvm-next-toolchain&builder=arm64-llvm-next-toolchain&titles=off&reload=30

I see the error message refers to this issue:  crbug.com/679974 

I can also see that the previous iteration of the builders was "interrupted" (purple), so maybe I should ignore this error.

But ignoring it does not sound right. Yesterday's build was "interrupted", and today's build failed because of this error. I don't think this should be expected behavior: it means another day of testing was lost. Does every interrupt on one day mean a failure on the next day?

Assigning to Sheriff (akeshet) for clarification.


 
I believe there was a waterfall restart this morning. That may have caused buildbot to "forget" about one of its previous ongoing builds, which could have caused this issue.

If this happens again on the next build, it warrants further investigation. Otherwise, I believe it should resolve on its own.
Labels: -Pri-1 Pri-2
Cc: akes...@chromium.org
Owner: llozano@chromium.org
Status: WontFix
We can close this now; it was a corner case.
Labels: Hotlist-CrOS-Sheriffing
This issue occurred again. All paladins failed or stopped with an exception. The previous master-paladin build failed to clean up (https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/16679). That may be the cause.
 
Here is an example of a failed build:
https://uberchromegw.corp.google.com/i/chromeos/builders/beaglebone-paladin/builds/15071

I will keep an eye on it to see whether it's a flaky failure.
Re #5, yes, it's a flaky failure.
Cc: dgarr...@chromium.org pprabhu@chromium.org
This happens (rarely) when buildbot forgets about a previous build and re-uses its buildbot #. 