pre-cq-launcher should use buildbucket builder statuses. |
||
Issue descriptionI thought this was filed before, but I can't find it. Currently, the pre-cq-launcher fires off PreCQ builders, and reads the results from CIDB. This is problematic in a number of ways. 1) CLs are not annotated with links to the builds until after all builders get far enough to create CIDB entries. This makes it look like the launcher has not picked up a CL it has. 2) Builds that fail before creating CIDB entries are invisible to the pre-cq-launcher. It doesn't know the builds failed, and instead issues a generic "timeout" error confusing to users. 3) Because we need to use timeouts to know that builds didn't start, we can't distinguish between bad CLs and build infrastructure issues. 4) Because we need to use timeouts, we need to keep them reasonably short. Our build infrastructure could handle a much higher CL load if CLs could be left in the build queue for longer.
,
May 8 2018
Yep, sounds great!
,
May 8 2018
,
May 8 2018
What is the timeline for replacing pre-cq-launcher with a service?
,
May 11 2018
I didn't have an item in my design for a PreCQ launcher replacement. If anything, I figured that the Swarming Scheduler could fire off something that looks like the launcher. Then, once it is running, we could do all the bookkeeping (CL annotations, CIDB udpates) inside of the Recipe instead of inside CBuildBot thereby eliminating the delay. What do you think?
,
May 14 2018
In other words, leave it as a builder-as-a-cron-job. Okay. Though, I don't see how recipe vs cbuildbot will make much difference in performance. The only heavy cbuildbot startup cost is the initial sync. But, since the launcher currently does a full repo sync each time it iterates through it's loop, we have to pay that cost no matter what. Well, unless we can figure out why it does that and avoid the cost.
,
May 18 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/08be0a3a846983e9e38bce8605d77a93853a87d6 commit 08be0a3a846983e9e38bce8605d77a93853a87d6 Author: Shuhei Takahashi <nya@chromium.org> Date: Fri May 18 05:21:14 2018 validation_pool: Avoid spamming code reviews. In pre-CQ, individual pre-CQ bots send failure notifications, in contrast to CQ where a failure notification is sent only by master-paladin. This leads to more spams as we add more builders to pre-CQ. This change modifies the notification logic so that only the first failing pre-CQ bot sends a notification. Also updates the notification message to refer CL status viewer to see the full results. BUG= chromium:842074 BUG=chromium:840933 TEST=unit tests Change-Id: I2cc282a8965a57cb7a583b3198d394e5db988c15 Reviewed-on: https://chromium-review.googlesource.com/1059975 Commit-Ready: Don Garrett <dgarrett@chromium.org> Tested-by: Shuhei Takahashi <nya@chromium.org> Reviewed-by: Gregory Meinke <gmeinke@chromium.org> Reviewed-by: Don Garrett <dgarrett@chromium.org> [modify] https://crrev.com/08be0a3a846983e9e38bce8605d77a93853a87d6/lib/cl_messages.py [modify] https://crrev.com/08be0a3a846983e9e38bce8605d77a93853a87d6/cbuildbot/validation_pool.py [modify] https://crrev.com/08be0a3a846983e9e38bce8605d77a93853a87d6/lib/cl_messages_unittest.py [modify] https://crrev.com/08be0a3a846983e9e38bce8605d77a93853a87d6/cbuildbot/validation_pool_unittest.py
,
May 18 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/e1b14496039c843107300cefb22e8891c5c95aa9 commit e1b14496039c843107300cefb22e8891c5c95aa9 Author: chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com> Date: Fri May 18 07:29:40 2018 Roll src/third_party/chromite/ 87247739f..08be0a3a8 (1 commit) https://chromium.googlesource.com/chromiumos/chromite.git/+log/87247739f384..08be0a3a8469 $ git log 87247739f..08be0a3a8 --date=short --no-merges --format='%ad %ae %s' 2018-05-16 nya validation_pool: Avoid spamming code reviews. Created with: roll-dep src/third_party/chromite BUG= chromium:842074 ,chromium:840933 The AutoRoll server is located here: https://chromite-chromium-roll.skia.org Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary. TBR=chrome-os-gardeners@chromium.org Change-Id: I2a4cf0ee81922f6c47f0c0c148531e16b1b9236e Reviewed-on: https://chromium-review.googlesource.com/1065498 Commit-Queue: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com> Reviewed-by: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com> Cr-Commit-Position: refs/heads/master@{#559826} [modify] https://crrev.com/e1b14496039c843107300cefb22e8891c5c95aa9/DEPS
,
Oct 25
,
Oct 25
The bug merged in is an example of the kinds of failures we are seeing today, that fixing this would avoid. "PreCQ launcher did not notify Gerrit of failure for ~18 hours" |
||
►
Sign in to add a comment |
||
Comment 1 by dgarr...@chromium.org
, May 8 2018