ChromeOS Ozone Tests 1 is WAY behind
Issue description: For a number of reasons the ChromeOS Ozone Tests bot is very VERY far behind. Specifically this bot: https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%20Ozone%20Tests%20(1) . At the time I'm writing this it is behind by 29 hours (~65 pending runs)! I think we have a number of fixes landed on trunk that will help. I'm not sure what the right thing to do is. I would be inclined to drop all pending requests and start over, but I'm not sure how this is typically handled. The other option is more machines.
,
May 4 2017
agable, am I reading this wrong? Clicking on that link ( https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%20Ozone%20Tests%20(1) ) shows a ton of Pending Build requests, the last of which says it has been waiting 25 hours. If I click on the most recent build, https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%20Ozone%20Tests%20%281%29/builds/46579 , it links to patches that landed early yesterday morning. If I go to the main waterfall in collapsed/merged mode, the column for this bot is all gray, because it is way behind. It seems to me this bot is *not* picking up the most recent build, but is instead serially going through the builds produced by the builder. I tried to verify that, but I'm not sure how to tell which build this bot pulls. I see the hash, but I don't know how to correlate that with the actual build.
,
May 4 2017
Ah, I see what happened. I was looking at the luci-logdog version of the page, instead of the native buildbot version of the page. Check this link: https://luci-milo.appspot.com/buildbot/chromium.chromiumos/Linux%20ChromiumOS%20Ozone%20Tests%20%281%29/ It shows that builds have been running recently; much more recently than the builds that you are seeing. It looks to me like the builder is running just fine, but the buildbot page is updating incorrectly.
,
May 4 2017
Oh and never mind again, I see what's going wrong. There are three problems here:
1) luci-milo doesn't correctly show pending builds
2) triggered testers which are given tasks by a builder and don't coalesce those tasks (because coalescing build artifacts doesn't make sense, while coalescing commits kinda does) can get permanently behind
We have a couple options here:
a) hope it catches up over the weekend
b) cancel a bunch of the pending builds
c) allocate more machines to run this builder
,
May 4 2017
I prefer whatever makes this bot catch up the quickest. If it's easy to throw machines at the bot, do that. If that can't be done today, then I think we should cancel all the pending jobs. I suspect we should add more machines to this bot either way, but I'm hoping you have quantitative data that shows whether new machines are really needed.
,
May 4 2017
We should cancel the pending builds and change the builders.pyl config to merge requests on the tester. I'll post a CL for the latter, and once that lands it probably makes sense to just restart the master (which will cancel all of the builds). I don't think we should add new machines.
,
May 4 2017
I'll take over the bug for now.
,
May 4 2017
,
May 4 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/tools/build/+/2c630ad197926ac96d6600472d8082a2439322b6

commit 2c630ad197926ac96d6600472d8082a2439322b6
Author: Dirk Pranke <dpranke@chromium.org>
Date: Thu May 04 22:15:23 2017

Merge build requests on testers on the chromium.chromiumos waterfall.

The testers were configured to take every triggered build from the builders and run the tests; this means that if the tests take longer than the builders, we would become farther and farther behind, which is what was happening. This CL changes things to coalesce (merge) the triggered requests so that we can keep up.

R=agable@chromium.org
BUG=718287

Change-Id: I7409a4f07f04846fb1931b80840e647540d4d7af
Reviewed-on: https://chromium-review.googlesource.com/496726
Reviewed-by: Aaron Gable <agable@chromium.org>
Commit-Queue: Dirk Pranke <dpranke@chromium.org>

[modify] https://crrev.com/2c630ad197926ac96d6600472d8082a2439322b6/masters/master.chromium.chromiumos/builders.pyl
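The effect described in the commit message can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the buildbot implementation: the `simulate` function, its parameters, and the 20-minute build / 30-minute test timings are all made up for illustration. It models one builder that triggers a request at a fixed interval and one tester that either runs every request in order or merges everything pending into a single run.

```python
from collections import deque


def simulate(build_minutes, test_minutes, total_minutes, coalesce):
    """Return how many requests are still pending after total_minutes.

    Hypothetical model: the builder triggers one request every
    build_minutes, and a single tester needs test_minutes per run.
    """
    pending = deque()
    tester_free_at = 0
    for now in range(total_minutes + 1):
        # The builder finishes a build and triggers the tester.
        if now > 0 and now % build_minutes == 0:
            pending.append(now)
        # The tester starts new work only when it is idle.
        if pending and now >= tester_free_at:
            if coalesce:
                # Merge all pending requests into one run of the newest build.
                pending.clear()
            else:
                # Run only the oldest request; the rest keep waiting.
                pending.popleft()
            tester_free_at = now + test_minutes
    return len(pending)


if __name__ == "__main__":
    for hours in (10, 20, 40):
        minutes = hours * 60
        serial = simulate(20, 30, minutes, coalesce=False)
        merged = simulate(20, 30, minutes, coalesce=True)
        print(f"{hours}h: pending without merging={serial}, with merging={merged}")
```

With tests slower than builds, the backlog without merging grows roughly in proportion to elapsed time, while the merging tester never has more than the requests that arrived during its current run.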
,
May 4 2017
Issue 718642 has been merged into this issue.
,
May 4 2017
CL has landed, I'll restart the master now.
,
May 4 2017
The following revision refers to this bug: https://chrome-internal.googlesource.com/infradata/master-manager/+/912ede929457a1e0f7912ecb10620e59459b23d5 commit 912ede929457a1e0f7912ecb10620e59459b23d5 Author: Aaron Gable <agable@google.com> Date: Thu May 04 23:18:29 2017
,
May 5 2017
,
Aug 1 2017
,
Jan 22 2018
Comment 1 by agable@google.com, May 4 2017
Status: Fixed (was: Untriaged)