Issue 670532

Starred by 0 users

Issue metadata

Status: Archived
Owner:
Closed: Feb 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Feature

Blocked on:
issue 685380




Retry slaves which pass BuildStartStage but not SyncStage.

Project Member Reported by nxia@chromium.org, Dec 2 2016

Issue description

Comment 1 by bugdroid1@chromium.org, Dec 3 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/be2f57e611a11073f5d5520380b5cf529d1e1d55

commit be2f57e611a11073f5d5520380b5cf529d1e1d55
Author: Ningning Xia <nxia@chromium.org>
Date: Fri Dec 02 01:03:20 2016

Move DetectIrrelevantChangesStage after BuildPackagesStage.

DetectIrrelevantChangesStage enables partial validation_pool
submission. It doesn't rely on the result of BuildImage, so move it
before BuildImageStage but after BuildPackagesStage.

BUG= chromium:670532 
TEST=run_tests;try_job

Change-Id: Ifcc4fb2ae8ed1d25ace2798c660484b60e84df0b
Reviewed-on: https://chromium-review.googlesource.com/415995
Commit-Ready: Ningning Xia <nxia@chromium.org>
Tested-by: Ningning Xia <nxia@chromium.org>
Reviewed-by: Aviv Keshet <akeshet@chromium.org>

[modify] https://crrev.com/be2f57e611a11073f5d5520380b5cf529d1e1d55/cbuildbot/builders/simple_builders.py

Comment 2 by bugdroid1@chromium.org, Dec 7 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/10d9055ee399ed14c74dc483436ae0f9f5e0138f

commit 10d9055ee399ed14c74dc483436ae0f9f5e0138f
Author: Ningning Xia <nxia@chromium.org>
Date: Fri Dec 02 22:23:18 2016

Do not fail UploadStatus if status exists for paladin slaves.

We are going to retry paladin slaves with Buildbucket. A failed paladin
slave may have already uploaded its status to GS, and a retried paladin
slave needs to upload a new Inflight status. Set fail_if_exists to
False for paladin slaves so UploadStatus won't fail for retried builds.

BUG= chromium:670532 
TEST=unit_tests

Change-Id: I44f4a3279e8452dac63282160151b0eca83c5e74
Reviewed-on: https://chromium-review.googlesource.com/416209
Commit-Ready: Ningning Xia <nxia@chromium.org>
Tested-by: Ningning Xia <nxia@chromium.org>
Reviewed-by: Chris Ching <chingcodes@chromium.org>

[modify] https://crrev.com/10d9055ee399ed14c74dc483436ae0f9f5e0138f/lib/config_lib_unittest.py
[modify] https://crrev.com/10d9055ee399ed14c74dc483436ae0f9f5e0138f/lib/config_lib.py
[modify] https://crrev.com/10d9055ee399ed14c74dc483436ae0f9f5e0138f/cbuildbot/manifest_version.py
[modify] https://crrev.com/10d9055ee399ed14c74dc483436ae0f9f5e0138f/cbuildbot/stages/sync_stages.py
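
The fail_if_exists behavior described in this commit can be illustrated with a minimal sketch. This is not chromite's code: upload_status, StatusExistsError, and the dict standing in for GS are all hypothetical names chosen for the example.

```python
class StatusExistsError(Exception):
    """Raised when a status already exists and overwriting is not allowed."""


def upload_status(store, build_name, status, fail_if_exists=True):
    """Record `status` for `build_name` in `store`.

    Hypothetical helper: `store` is a plain dict standing in for the
    status files kept in GS.
    """
    if fail_if_exists and build_name in store:
        raise StatusExistsError(build_name)
    store[build_name] = status
    return status


store = {}
# First attempt of the slave uploads a status, then fails.
upload_status(store, "example-paladin", "failed")
# The retried slave must be able to replace it with a new inflight status.
upload_status(store, "example-paladin", "inflight", fail_if_exists=False)
```

With fail_if_exists left at True, the second call would raise instead of replacing the earlier status, which is the failure mode the commit avoids for retried paladin slaves.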

Comment 3 by nxia@chromium.org, Dec 13 2016

Summary: Retry slaves which pass BuildStartStage but not SyncStage. (was: Retry slaves which failed before the DetectIrrelevantChanges stage)

Comment 4 by bugdroid1@chromium.org, Dec 14 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/3e33b83ebe735f7d068c399d29ab0ef28440367a

commit 3e33b83ebe735f7d068c399d29ab0ef28440367a
Author: Ningning Xia <nxia@chromium.org>
Date: Wed Dec 14 01:27:15 2016

Add buildbucket_id to failureView.

We will need to filter failed build stages by a list of buildbucket_ids,
so add the buildbucket_id column to failureView.

BUG= chromium:670532 
TEST=cidb_integration_test

Change-Id: Iff8ccee2bc0fb8e01f7d8793a149937e4cab0f42
Reviewed-on: https://chromium-review.googlesource.com/419665
Reviewed-by: Aviv Keshet <akeshet@chromium.org>
Tested-by: Ningning Xia <nxia@chromium.org>

[modify] https://crrev.com/3e33b83ebe735f7d068c399d29ab0ef28440367a/cidb/schema.dump
[add] https://crrev.com/3e33b83ebe735f7d068c399d29ab0ef28440367a/cidb/migrations/00052_alter_failureView_add_buildbucket_id.sql
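
As a rough illustration of why the extra column helps, the sketch below builds a view that exposes buildbucket_id and filters failures by a list of ids. It uses an in-memory SQLite database and an invented, much smaller schema (build, failure, failure_view); the real CIDB tables and the migration above are not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE build (id INTEGER PRIMARY KEY, buildbucket_id TEXT);
    CREATE TABLE failure (
        id INTEGER PRIMARY KEY, build_id INTEGER, stage_name TEXT);
    -- Hypothetical view: exposes buildbucket_id next to each failure.
    CREATE VIEW failure_view AS
        SELECT f.id AS failure_id, f.stage_name, b.buildbucket_id
        FROM failure f JOIN build b ON f.build_id = b.id;
    INSERT INTO build VALUES (1, 'bb-old'), (2, 'bb-new');
    INSERT INTO failure VALUES (1, 1, 'SyncStage'), (2, 2, 'BuildPackages');
""")


def failures_for(conn, buildbucket_ids):
    """Return (stage_name, buildbucket_id) rows for the given ids only."""
    if not buildbucket_ids:
        return []
    placeholders = ",".join("?" for _ in buildbucket_ids)
    query = (
        "SELECT stage_name, buildbucket_id FROM failure_view "
        f"WHERE buildbucket_id IN ({placeholders}) ORDER BY failure_id"
    )
    return conn.execute(query, list(buildbucket_ids)).fetchall()


current_failures = failures_for(conn, ["bb-new"])
```

Here the failure recorded by the older build ('bb-old') is left out because its id is not in the requested list.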

Comment 5 by bugdroid1@chromium.org, Dec 21 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/8d36e6812aef3a1edc7b0541395fb32e60c33fc6

commit 8d36e6812aef3a1edc7b0541395fb32e60c33fc6
Author: Ningning Xia <nxia@chromium.org>
Date: Tue Dec 13 19:26:48 2016

GetSlaveStatuses in given buildbucket_ids list.

1) For slaves scheduled by Buildbucket, we'll retry slaves which fail
to pass SyncStage. The failed slaves may have already reported 'inflight'
status to CIDB. We should exclude the old build statuses when we fetch
current slave statuses from CIDB. With the current buildbucket_ids
list, we can get build statuses of current slaves.
2) Consolidate code to use constants.METADATA_SCHEDULED_SLAVES.

BUG= chromium:670532 
TEST=unit_tests

Change-Id: I1053b972be489d7e0beb0026e793e8ee4196602d
Reviewed-on: https://chromium-review.googlesource.com/419580
Commit-Ready: Ningning Xia <nxia@chromium.org>
Tested-by: Ningning Xia <nxia@chromium.org>
Reviewed-by: Ningning Xia <nxia@chromium.org>

[modify] https://crrev.com/8d36e6812aef3a1edc7b0541395fb32e60c33fc6/cbuildbot/stages/scheduler_stages.py
[modify] https://crrev.com/8d36e6812aef3a1edc7b0541395fb32e60c33fc6/cbuildbot/stages/generic_stages.py
[modify] https://crrev.com/8d36e6812aef3a1edc7b0541395fb32e60c33fc6/lib/cidb.py
[modify] https://crrev.com/8d36e6812aef3a1edc7b0541395fb32e60c33fc6/cbuildbot/stages/completion_stages.py
[modify] https://crrev.com/8d36e6812aef3a1edc7b0541395fb32e60c33fc6/cbuildbot/build_status_unittest.py
[modify] https://crrev.com/8d36e6812aef3a1edc7b0541395fb32e60c33fc6/cbuildbot/stages/completion_stages_unittest.py
[modify] https://crrev.com/8d36e6812aef3a1edc7b0541395fb32e60c33fc6/lib/fake_cidb.py
[modify] https://crrev.com/8d36e6812aef3a1edc7b0541395fb32e60c33fc6/cbuildbot/build_status.py
[modify] https://crrev.com/8d36e6812aef3a1edc7b0541395fb32e60c33fc6/cbuildbot/stages/generic_stages_unittest.py
[modify] https://crrev.com/8d36e6812aef3a1edc7b0541395fb32e60c33fc6/cbuildbot/buildbucket_lib_unittest.py
[modify] https://crrev.com/8d36e6812aef3a1edc7b0541395fb32e60c33fc6/cbuildbot/buildbucket_lib.py
[modify] https://crrev.com/8d36e6812aef3a1edc7b0541395fb32e60c33fc6/lib/cidb_integration_test.py
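
The idea in point 1 can be sketched as follows, assuming the master keeps a mapping from slave name to the buildbucket_id of its latest scheduled build. The function name, the row layout, and the ids are hypothetical; they are not taken from chromite or CIDB.

```python
def current_slave_statuses(rows, scheduled_slaves):
    """Keep only statuses reported by the currently scheduled builds.

    rows: dicts with 'build_config', 'buildbucket_id' and 'status' keys
        (a hypothetical stand-in for rows fetched from CIDB).
    scheduled_slaves: maps slave name -> buildbucket_id of its latest build.
    """
    current_ids = set(scheduled_slaves.values())
    return {
        row["build_config"]: row["status"]
        for row in rows
        if row["buildbucket_id"] in current_ids
    }


scheduled_slaves = {"slave-a": "bb-1", "slave-b": "bb-2"}
# slave-a failed before passing its sync stage and was retried as bb-3.
scheduled_slaves["slave-a"] = "bb-3"

rows = [
    {"build_config": "slave-a", "buildbucket_id": "bb-3", "status": "pass"},
    {"build_config": "slave-b", "buildbucket_id": "bb-2", "status": "pass"},
    # Stale row left behind by the first, failed attempt of slave-a.
    {"build_config": "slave-a", "buildbucket_id": "bb-1", "status": "inflight"},
]

statuses = current_slave_statuses(rows, scheduled_slaves)
```

Without the id filter, the stale 'inflight' row from the first attempt could be mistaken for the current status of slave-a.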

Comment 6 by bugdroid1@chromium.org, Dec 21 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/5632e0ae3c74d9bf5e7701ec56790ea64e064d51

commit 5632e0ae3c74d9bf5e7701ec56790ea64e064d51
Author: Ningning Xia <nxia@chromium.org>
Date: Tue Dec 13 22:38:37 2016

GetSlaveStages in given buildbucket_ids list.

For slaves scheduled by Buildbucket, we'll retry slaves which fail to
pass SyncStage. The failed slave builds may have already reported their
stages to CIDB, so we should exclude the stages of the old slave builds.
When buildbucket_ids is provided, only return the stages of the builds
whose |buildbucket_id| is in buildbucket_ids.

BUG= chromium:670532 
TEST=unit_tests

Change-Id: I20ebf36a0ee6818a89032793ed18c20e9075017e
Reviewed-on: https://chromium-review.googlesource.com/419818
Commit-Ready: Ningning Xia <nxia@chromium.org>
Tested-by: Ningning Xia <nxia@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>

[modify] https://crrev.com/5632e0ae3c74d9bf5e7701ec56790ea64e064d51/cbuildbot/stages/completion_stages_unittest.py
[modify] https://crrev.com/5632e0ae3c74d9bf5e7701ec56790ea64e064d51/cbuildbot/stages/completion_stages.py
[modify] https://crrev.com/5632e0ae3c74d9bf5e7701ec56790ea64e064d51/lib/cidb.py
[modify] https://crrev.com/5632e0ae3c74d9bf5e7701ec56790ea64e064d51/lib/cidb_integration_test.py

Comment 7 by bugdroid1@chromium.org, Dec 21 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/747d844c82facc680cf9ad4e9fa3bed9c972ca7c

commit 747d844c82facc680cf9ad4e9fa3bed9c972ca7c
Author: Ningning Xia <nxia@chromium.org>
Date: Thu Dec 15 03:05:38 2016

Retry slaves which failed to pass the critical stages.

1) For cq-master, retry slaves which failed to pass the critical stage
(CommitQueueSync stage for cq).
2) For regular master builds using buildbucket scheduler, maintain the
old logic of only retrying slaves which failed to start Cbuildbot.
3) Add unit tests and simplify old unit tests.

BUG= chromium:670532 
TEST=unit_tests

Change-Id: I7b8aa512b29b878d3ce61db6a5913228c63c0858
Reviewed-on: https://chromium-review.googlesource.com/421143
Commit-Ready: Ningning Xia <nxia@chromium.org>
Tested-by: Ningning Xia <nxia@chromium.org>
Reviewed-by: Ningning Xia <nxia@chromium.org>

[modify] https://crrev.com/747d844c82facc680cf9ad4e9fa3bed9c972ca7c/cbuildbot/build_status_unittest.py
[modify] https://crrev.com/747d844c82facc680cf9ad4e9fa3bed9c972ca7c/lib/config_lib.py
[modify] https://crrev.com/747d844c82facc680cf9ad4e9fa3bed9c972ca7c/cbuildbot/build_status.py
[modify] https://crrev.com/747d844c82facc680cf9ad4e9fa3bed9c972ca7c/cbuildbot/manifest_version_unittest.py
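
The two retry rules in this commit message can be sketched as below. The data layout ('failed', 'started', 'passed_stages') and the function name are assumptions made for the example, not the structures used in build_status.py.

```python
def slaves_to_retry(slaves, is_cq_master, critical_stage="CommitQueueSync"):
    """Pick failed slaves that are still safe to retry.

    slaves: maps slave name -> dict with 'failed' (bool), 'started' (bool,
        whether cbuildbot started) and 'passed_stages' (set of stage names).
        This layout is hypothetical.
    """
    retry = []
    for name in sorted(slaves):
        info = slaves[name]
        if not info["failed"]:
            continue
        if is_cq_master:
            # cq-master: retry anything that did not pass the critical stage.
            if critical_stage not in info["passed_stages"]:
                retry.append(name)
        elif not info["started"]:
            # Other masters: only retry slaves that never started cbuildbot.
            retry.append(name)
    return retry


slaves = {
    "slave-a": {"failed": True, "started": True,
                "passed_stages": {"BuildStart"}},
    "slave-b": {"failed": True, "started": True,
                "passed_stages": {"BuildStart", "CommitQueueSync"}},
    "slave-c": {"failed": True, "started": False, "passed_stages": set()},
    "slave-d": {"failed": False, "started": True,
                "passed_stages": {"BuildStart", "CommitQueueSync"}},
}

cq_retries = slaves_to_retry(slaves, is_cq_master=True)
regular_retries = slaves_to_retry(slaves, is_cq_master=False)
```

In this example slave-b is not retried by the cq-master because it already passed the critical stage before failing.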

Comment 8 by nxia@chromium.org, Jan 24 2017

Cc: akes...@chromium.org
master-paladin retried wolf-tot-paladin even though wolf-tot-paladin had already passed its SyncStage. The reason is that wolf-tot-paladin uses a different sync class, MasterSlaveLKGMSync, while I assumed all paladins would use the CommitQueueSync class.

Is there any reason for wolf-tot-paladin to use MasterSlaveLKGMSync? If not, we can convert it to CommitQueueSync; otherwise, the retry logic needs fixes to cover other sync classes.

https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/13431/steps/CommitQueueCompletion/logs/stdio


Comment 9 by nxia@chromium.org, Jan 24 2017

"do_not_apply_cq_patches" is marked as true in wolf-tot-paladin, so MasterSlaveLKGMSyncStage is used by wolf-tot-paladin. Currently wolf-tot-paladin is the only paladin using MasterSlaveLKGMSyncStage.

http://shortn/_FJlTyCbrjP

Will fix the retry to cover this case.
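
One way the retry check could account for this case is to derive the critical sync stage from the slave's config instead of assuming CommitQueueSync. This is only a sketch of that idea, with a plain dict standing in for the build config; it is not the fix that was eventually submitted.

```python
def critical_sync_stage(config):
    """Return the sync stage name a paladin slave is expected to run.

    config: plain dict standing in for a build config (hypothetical).
    """
    if config.get("do_not_apply_cq_patches"):
        return "MasterSlaveLKGMSync"
    return "CommitQueueSync"


def passed_critical_stage(config, passed_stages):
    """True if the slave already passed its own sync stage."""
    return critical_sync_stage(config) in passed_stages


tot_config = {"do_not_apply_cq_patches": True}
regular_config = {}

tot_passed = passed_critical_stage(tot_config, {"MasterSlaveLKGMSync"})
regular_passed = passed_critical_stage(regular_config, {"MasterSlaveLKGMSync"})
```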

Comment 10 by nxia@chromium.org, Jan 25 2017

Blockedon: 685380

Comment 11 by nxia@chromium.org, Feb 3 2017

Status: Fixed (was: Untriaged)

Comment 12 by dchan@google.com, Apr 17 2017

Labels: VerifyIn-59

Comment 13 by dchan@google.com, May 30 2017

Labels: VerifyIn-60
Labels: VerifyIn-61

Comment 15 by dchan@chromium.org, Oct 14 2017

Status: Archived (was: Fixed)
