CancelObsoleteSlaveBuilds needs to only kill master's slaves.
Issue description

CancelObsoleteSlaveBuilds kills all slave builds of the same build type on the same branch. Given two Android PFQ masters on the same branch, each PFQ will kill the other's slaves every time it starts, which is counterproductive.

We need to do a better job of restricting which builds are killed. Two options:

1) Find the buildbucket_id of the previous master build, then kill the slave builds associated with it.
2) Look up the slave config names for the current master, and kill the in-progress builds of all of those slave configs.
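As a rough illustration of how the two options select builds differently, here is a minimal sketch. It is not chromite code: the `Build` record, both helper functions, and the config names and IDs in the sample data are hypothetical stand-ins for whatever buildbucket actually returns.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Build:
    """Hypothetical record of an in-progress build (not a chromite type)."""
    buildbucket_id: str
    config_name: str
    branch: str
    # buildbucket_id of the master that scheduled this build, if any.
    master_buildbucket_id: Optional[str]


def slaves_of_master(builds: Iterable[Build],
                     previous_master_id: str) -> List[Build]:
    """Option 1: select builds scheduled by one specific previous master."""
    return [b for b in builds
            if b.master_buildbucket_id == previous_master_id]


def slaves_by_config(builds: Iterable[Build],
                     slave_configs: Set[str],
                     branch: str) -> List[Build]:
    """Option 2: select builds whose config belongs to the current master."""
    return [b for b in builds
            if b.branch == branch and b.config_name in slave_configs]


# Made-up sample data: two masters ("1" and "2") on the same branch.
IN_PROGRESS = [
    Build("101", "slave-a", "master", "1"),
    Build("102", "slave-b", "master", "1"),
    Build("201", "slave-c", "master", "2"),
]

option_1 = slaves_of_master(IN_PROGRESS, "1")
option_2 = slaves_by_config(IN_PROGRESS, {"slave-c"}, "master")
```

Either filter leaves the other master's slaves alone in this sample, which is the property the current build-type-plus-branch match lacks.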
,
May 9 2017
,
May 10 2017
1) has a potential issue: master_1 gets canceled but its slaves keep running; master_2 starts and fails before it reaches the CleanUpStage, so it never aborts the slaves of master_1; master_3 starts and tries to abort the slaves of master_2, but the slaves actually running are from master_1.
2) Are the slave config names of the two masters different?
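The scenario in 1) can be replayed with a small sketch. Everything here is hypothetical (the function, the slave names, and the assumption that master_2 failed before scheduling any slaves of its own); it only shows why looking back exactly one master can leave orphans.

```python
from typing import Dict, List, Set


def slaves_to_abort(previous_master: str,
                    slaves_by_master: Dict[str, List[str]],
                    running: Set[str]) -> List[str]:
    """Option 1: abort only running slaves scheduled by the previous master."""
    return [s for s in slaves_by_master.get(previous_master, [])
            if s in running]


# master_1 was canceled, but its slaves are still running.
# Assumption: master_2 failed early and scheduled no slaves.
slaves_by_master = {
    "master_1": ["m1-slave-a", "m1-slave-b"],
    "master_2": [],
}
running = {"m1-slave-a", "m1-slave-b"}

# master_3 starts; its previous master is master_2.
aborted = slaves_to_abort("master_2", slaves_by_master, running)
orphans = running - set(aborted)
```

In this replay `aborted` is empty and both of master_1's slaves remain in `orphans`, so a fix along the lines of option 1 would need to look further back than the immediately preceding master.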
,
May 10 2017
1) True.
2) Today slave names are unique between masters. Post-swarming this might not be true. In particular, trybots may be running with the same config names.
,
May 12 2017
This problem will also affect the release waterfall, since all of the builders there have the same build type.
,
May 12 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/chromite/+/52acedd403d15fdc71c90ee1a80bbb8f71245438

commit 52acedd403d15fdc71c90ee1a80bbb8f71245438
Author: Don Garrett <dgarrett@google.com>
Date: Fri May 12 01:11:56 2017

build_stages: Disable slave killing for android pfq.

Our logic to kill old slaves when the master starts up will break
when we have two Android PFQs running in parallel. Disable it until
we have a clean solution in place.

BUG= chromium:719789
TEST=run_tests

Change-Id: If53419ebefdc302a18d8b5acf11140bb41e5fd1b
Reviewed-on: https://chromium-review.googlesource.com/499630
Tested-by: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Ningning Xia <nxia@chromium.org>
Commit-Queue: Don Garrett <dgarrett@chromium.org>

[modify] https://crrev.com/52acedd403d15fdc71c90ee1a80bbb8f71245438/cbuildbot/stages/build_stages.py
,
Jun 8 2017
,
Jun 19 2017
Sweet! How soon will we trust this enough to re-enable the auto-builder cleanup for Android PFQ?
,
Jun 19 2017
As long as the current Android PFQ master builds have passed the CleanUpStage, we can enable the cleanup for Android PFQ masters.
,
Jun 21 2017
The fix has been merged at https://chromium-review.googlesource.com/c/527271/. You can revert your CL at #6.
,
Jun 24 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/chromite/+/d31b035c95f971529d9b36d0fddc8d334965ca13

commit d31b035c95f971529d9b36d0fddc8d334965ca13
Author: Don Garrett <dgarrett@chromium.org>
Date: Sat Jun 24 05:56:49 2017

Revert "build_stages: Disable slave killing for android pfq."

This reverts commit 52acedd403d15fdc71c90ee1a80bbb8f71245438.

Ningning fixed the bug this was working around, so we don't need the
work around any more.

BUG= chromium:719789

Change-Id: Ib8dc9321b71381ba7dda1b65812223b2399bd67a
Reviewed-on: https://chromium-review.googlesource.com/542943
Commit-Ready: Don Garrett <dgarrett@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Ningning Xia <nxia@chromium.org>

[modify] https://crrev.com/d31b035c95f971529d9b36d0fddc8d334965ca13/cbuildbot/stages/build_stages.py
,
Jun 27 2017
,
Jan 22 2018
Comment 1 by dgarr...@chromium.org, May 9 2017