
Issue 654481

Starred by 2 users

Issue metadata

Status: Archived
Owner:
Closed: Dec 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug

Blocking:
issue 651581




Retry a build if it failed to start cbuildbot

Project Member Reported by nxia@chromium.org, Oct 10 2016

Issue description

https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/12569

This master failed because amd64-generic-paladin did not start. 


https://build.chromium.org/p/chromiumos/builders/amd64-generic-paladin/builds/27404

amd64-generic-paladin didn't start because it failed at the bot-update stage, before it reached the cbuildbot stage.


One improvement would be to use Buildbucket to retry this build. Ideas?
 
That's a good idea.

Further, we might be able to give better error messages than just "did not start".

First pass list of things it would be useful to know:

1) Build slave is offline.
2) Build started, but failed before cbuildbot started.
3) cbuildbot errored out before reporting status.
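
The three cases above could be told apart with a small classifier. This is only a sketch: the names `NotStartedReason` and `classify_not_started`, and the three boolean inputs, are hypothetical and not taken from chromite. It assumes the master has already found that the slave never reported a status.

```python
from enum import Enum
from typing import Optional


class NotStartedReason(Enum):
    """Hypothetical labels mirroring the three cases listed above."""
    SLAVE_OFFLINE = "build slave is offline"
    FAILED_BEFORE_CBUILDBOT = "build started, but failed before cbuildbot started"
    CBUILDBOT_ERRORED = "cbuildbot errored out before reporting status"


def classify_not_started(slave_online: bool,
                         build_started: bool,
                         cbuildbot_started: bool) -> Optional[NotStartedReason]:
    """Pick a more specific message for a slave that reported no status.

    Returns None when none of the three listed cases applies, for example
    when the slave is online but has not picked up the build yet.
    """
    if not slave_online:
        return NotStartedReason.SLAVE_OFFLINE
    if not build_started:
        return None
    if not cbuildbot_started:
        return NotStartedReason.FAILED_BEFORE_CBUILDBOT
    return NotStartedReason.CBUILDBOT_ERRORED
```

For the failure in the description (bot-update failed before cbuildbot ran), `classify_not_started(True, True, False)` would give `FAILED_BEFORE_CBUILDBOT`.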

Comment 2 by nxia@chromium.org, Oct 11 2016

Blocking: 651581

Comment 3 by bugdroid1@chromium.org, Nov 8 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/30c14d78eada0c9e72911f03a32dca1786ac16a3

commit 30c14d78eada0c9e72911f03a32dca1786ac16a3
Author: Ningning Xia <nxia@chromium.org>
Date: Tue Nov 01 23:30:52 2016

Add unscheduled_slaves metadata.

Add unscheduled slaves to unscheduled_slaves metadata.
1) record unscheduled slaves.
2) can reschedule the slaves once build retries are implemented.

BUG= chromium:654481 
TEST=unit_tests

Change-Id: I95b12b3283f40e7514c0f9ad86cd6ac20e15576d
Reviewed-on: https://chromium-review.googlesource.com/406488
Commit-Ready: Ningning Xia <nxia@chromium.org>
Tested-by: Ningning Xia <nxia@chromium.org>
Reviewed-by: Paul Hobbs <phobbs@google.com>

[modify] https://crrev.com/30c14d78eada0c9e72911f03a32dca1786ac16a3/cbuildbot/stages/sync_stages_unittest.py
[modify] https://crrev.com/30c14d78eada0c9e72911f03a32dca1786ac16a3/cbuildbot/stages/sync_stages.py

Comment 4 by bugdroid1@chromium.org, Nov 8 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/5fe75da17c126a41a894efdf1c539491c63b9743

commit 5fe75da17c126a41a894efdf1c539491c63b9743
Author: Ningning Xia <nxia@chromium.org>
Date: Thu Nov 03 22:57:01 2016

Add RetryBuildRequest in buildbucket_lib.

To support build retry using Buildbucket, add RetryBuildRequest in
buildbucket_lib.

BUG= chromium:654481 
TEST=unit_tests

Change-Id: I1c9c0abc9371982583b5922b3b51d8a5567b4a81
Reviewed-on: https://chromium-review.googlesource.com/407981
Commit-Ready: Ningning Xia <nxia@chromium.org>
Tested-by: Ningning Xia <nxia@chromium.org>
Reviewed-by: Aviv Keshet <akeshet@chromium.org>

[modify] https://crrev.com/5fe75da17c126a41a894efdf1c539491c63b9743/cbuildbot/buildbucket_lib_unittest.py
[modify] https://crrev.com/5fe75da17c126a41a894efdf1c539491c63b9743/cbuildbot/buildbucket_lib.py

Comment 5 by bugdroid1@chromium.org, Dec 3 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/44b887e0c055f66c81ead8c0d61e91b0e8d9b99d

commit 44b887e0c055f66c81ead8c0d61e91b0e8d9b99d
Author: Ningning Xia <nxia@chromium.org>
Date: Tue Nov 29 00:29:11 2016

Pass config and metadata to BuildSpecsManager.

BuildSpecsManager will need to get build statuses from Buildbucket and
retry builds; meanwhile, it needs to update the 'scheduled_slaves'
metadata with the new buildbucket_id and create_ts. Pass config
(config.name determines whether to get slave statuses from Buildbucket)
and metadata to BuildSpecsManager.

BUG= chromium:654481 
TEST=unit_tests

Change-Id: Ied7913a54561a1f869b1f1e1be7add587de338e2
Reviewed-on: https://chromium-review.googlesource.com/414235
Commit-Ready: Ningning Xia <nxia@chromium.org>
Tested-by: Ningning Xia <nxia@chromium.org>
Reviewed-by: Aviv Keshet <akeshet@chromium.org>

[modify] https://crrev.com/44b887e0c055f66c81ead8c0d61e91b0e8d9b99d/lib/metadata_lib.py
[modify] https://crrev.com/44b887e0c055f66c81ead8c0d61e91b0e8d9b99d/cbuildbot/manifest_version_unittest.py
[modify] https://crrev.com/44b887e0c055f66c81ead8c0d61e91b0e8d9b99d/cbuildbot/stages/completion_stages.py
[modify] https://crrev.com/44b887e0c055f66c81ead8c0d61e91b0e8d9b99d/cbuildbot/manifest_version.py
[modify] https://crrev.com/44b887e0c055f66c81ead8c0d61e91b0e8d9b99d/cbuildbot/stages/sync_stages.py
[modify] https://crrev.com/44b887e0c055f66c81ead8c0d61e91b0e8d9b99d/cbuildbot/stages/completion_stages_unittest.py
[modify] https://crrev.com/44b887e0c055f66c81ead8c0d61e91b0e8d9b99d/cbuildbot/buildbucket_lib.py
[modify] https://crrev.com/44b887e0c055f66c81ead8c0d61e91b0e8d9b99d/cbuildbot/buildbucket_lib_unittest.py
[modify] https://crrev.com/44b887e0c055f66c81ead8c0d61e91b0e8d9b99d/lib/constants.py
[modify] https://crrev.com/44b887e0c055f66c81ead8c0d61e91b0e8d9b99d/cbuildbot/lkgm_manager.py
[modify] https://crrev.com/44b887e0c055f66c81ead8c0d61e91b0e8d9b99d/lib/metadata_lib_unittest.py

Comment 6 by bugdroid1@chromium.org, Dec 3 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/4904985f59d4ea9a9ce6a07290474838a293385c

commit 4904985f59d4ea9a9ce6a07290474838a293385c
Author: Ningning Xia <nxia@chromium.org>
Date: Wed Nov 23 00:50:37 2016

GetScheduledBuildDict returns build->build_info map.

Previously GetScheduledBuildDict only returned a build->buildbucket_id map.
To support retrying a build a limited number of times, keep track of the
retry count and return a build->build_info map.

BUG= chromium:654481 
TEST=unit_tests

Change-Id: I4f8a639a7704fb07325b996c8058779b7011edcc
Reviewed-on: https://chromium-review.googlesource.com/414409
Commit-Ready: Ningning Xia <nxia@chromium.org>
Tested-by: Ningning Xia <nxia@chromium.org>
Reviewed-by: Aviv Keshet <akeshet@chromium.org>

[modify] https://crrev.com/4904985f59d4ea9a9ce6a07290474838a293385c/cbuildbot/manifest_version_unittest.py
[modify] https://crrev.com/4904985f59d4ea9a9ce6a07290474838a293385c/cbuildbot/stages/completion_stages.py
[modify] https://crrev.com/4904985f59d4ea9a9ce6a07290474838a293385c/cbuildbot/manifest_version.py
[modify] https://crrev.com/4904985f59d4ea9a9ce6a07290474838a293385c/cbuildbot/stages/completion_stages_unittest.py
[modify] https://crrev.com/4904985f59d4ea9a9ce6a07290474838a293385c/cbuildbot/buildbucket_lib_unittest.py
[modify] https://crrev.com/4904985f59d4ea9a9ce6a07290474838a293385c/cbuildbot/buildbucket_lib.py
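
A rough illustration of the build->build_info idea from this commit message, not the actual chromite code. The `BuildInfo` fields, the function name `scheduled_build_dict`, and the input layout (a list of `(build_config, buildbucket_id, created_ts)` tuples in scheduling order) are all assumptions made for this sketch.

```python
from collections import namedtuple

# Hypothetical record; the real build_info layout may differ.
BuildInfo = namedtuple("BuildInfo", ["buildbucket_id", "retry", "created_ts"])


def scheduled_build_dict(scheduled_slaves):
    """Map each build config to its most recent BuildInfo.

    scheduled_slaves is assumed to be a list of
    (build_config, buildbucket_id, created_ts) tuples in scheduling order.
    A later entry for the same config is treated as a retry, so the retry
    count is the number of earlier entries for that config.
    """
    result = {}
    for build_config, buildbucket_id, created_ts in scheduled_slaves:
        previous = result.get(build_config)
        retry = 0 if previous is None else previous.retry + 1
        result[build_config] = BuildInfo(buildbucket_id, retry, created_ts)
    return result
```

With a map like this, the caller can compare `retry` against a limit before issuing another retry, which a plain build->buildbucket_id map cannot support.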

Comment 7 by bugdroid1@chromium.org, Dec 6 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/f8559ead0dd7f71d003823c760ca190ddd9e7f24

commit f8559ead0dd7f71d003823c760ca190ddd9e7f24
Author: Ningning Xia <nxia@chromium.org>
Date: Thu Dec 01 00:12:54 2016

Retry builds which failed before Cbuildbot step.

For builds which fail before reaching the BuildStartStage, no
corresponding build_ids would be inserted into CIDB. We can safely retry
those failed builds and update the 'scheduled_slaves' in the master
metadata.

BUG= chromium:654481 
TEST=unit_tests

Change-Id: Ida993c7abb90af008c0a7ff3d3e3c894d0c0934e
Reviewed-on: https://chromium-review.googlesource.com/415556
Commit-Ready: Ningning Xia <nxia@chromium.org>
Tested-by: Ningning Xia <nxia@chromium.org>
Reviewed-by: Ningning Xia <nxia@chromium.org>

[modify] https://crrev.com/f8559ead0dd7f71d003823c760ca190ddd9e7f24/lib/constants.py
[modify] https://crrev.com/f8559ead0dd7f71d003823c760ca190ddd9e7f24/cbuildbot/manifest_version.py
[modify] https://crrev.com/f8559ead0dd7f71d003823c760ca190ddd9e7f24/cbuildbot/manifest_version_unittest.py
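
The retry condition described in this commit message can be sketched as follows. The function name, argument shapes, and the default `retry_limit` of 1 are assumptions for illustration; the commit message does not state the limit used in chromite.

```python
def builds_to_retry(retry_counts, failed_in_buildbucket, configs_in_cidb,
                    retry_limit=1):
    """Return the build configs that look safe to retry, sorted by name.

    retry_counts:          dict of build_config -> retries already made
    failed_in_buildbucket: set of build_configs Buildbucket reports as failed
    configs_in_cidb:       set of build_configs that inserted a build_id
                           into CIDB, meaning they reached BuildStartStage
    retry_limit:           assumed cap on retries per build config
    """
    retriable = []
    for build_config in sorted(failed_in_buildbucket):
        if build_config in configs_in_cidb:
            # cbuildbot started for this build, so it is not retried here.
            continue
        if retry_counts.get(build_config, 0) >= retry_limit:
            continue
        retriable.append(build_config)
    return retriable
```

A build that failed in Buildbucket but never showed up in CIDB is the only kind selected, matching the "failed before the Cbuildbot step" case.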

Comment 8 by nxia@chromium.org, Dec 14 2016

Status: Fixed (was: Untriaged)

Comment 9 by nxia@chromium.org, Jan 24 2017

An example of a successful retry.


https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/13446/steps/CommitQueueCompletion/logs/stdio


09:56:41: INFO: Still waiting for the following builds to complete: ['beaglebone-paladin']
09:56:41: INFO: Going to retry build beaglebone-paladin buildbucket_id 8989573633939369200 with retry # 1
09:56:41: INFO: Refreshing due to a 401 (attempt 1/2)
09:56:41: INFO: Refreshing access_token
09:56:42: INFO: Retried build beaglebone-paladin buildbucket_id 8989564123415132752 created_ts 1485280602305290
09:56:42: INFO: 1:55:59.605314 until timeout...
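
To read the created_ts value in the log above, here is a small conversion sketch. It assumes created_ts is microseconds since the Unix epoch; the helper name `created_ts_to_utc` is made up for this example.

```python
from datetime import datetime, timedelta, timezone


def created_ts_to_utc(created_ts_us: int) -> datetime:
    """Convert a microsecond Unix timestamp to an aware UTC datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(microseconds=created_ts_us)


print(created_ts_to_utc(1485280602305290).isoformat())
# 2017-01-24T17:56:42.305290+00:00
```

Under that assumption the retried build was created at 17:56:42 UTC on Jan 24 2017, which lines up with the 09:56:42 log line if the builder logs in a UTC-8 timezone.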

Comment 10 by dchan@google.com, Mar 4 2017

Labels: VerifyIn-58

Comment 11 by dchan@google.com, Apr 17 2017

Labels: VerifyIn-59

Comment 12 by dchan@google.com, May 30 2017

Labels: VerifyIn-60
Labels: VerifyIn-61

Comment 14 by dchan@chromium.org, Oct 14 2017

Status: Archived (was: Fixed)
