
Issue 851115

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Jun 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 2
Type: Bug

Blocked on:
issue 842940




linux_layout_tests_slimming_paint_v2: EXPIRED (lack of capacity)

Project Member Reported by pdr@chromium.org, Jun 8 2018

Issue description

This bot keeps failing the webkit_layout_tests step:
https://ci.chromium.org/buildbot/tryserver.chromium.linux/linux_layout_tests_slimming_paint_v2/12503

The stdout says:
+------------------------------------------------------------------------+
| End of shard 9                                                         |
|  Pending: 3653.9s  EXPIRED (lack of capacity)                          |
+------------------------------------------------------------------------+

This is failing fairly consistently. This only started recently.
 
Cc: dpranke@chromium.org jbudorick@chromium.org
Hmmm. That's because that bot's tasks run at priority 40:
https://chromium-swarm.appspot.com/task?id=3dfa7606cec32510

A normal CQ test runs at priority 30:
https://chromium-swarm.appspot.com/task?id=3dfa8288f2a95a10

So linux_layout_tests_slimming_paint_v2's tasks are getting starved out due to higher-pri tasks. Looks like it has a different priority because it mirrors an FYI bot:
https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/chromium_tests/trybots.py?rcl=d5e51f01c25fe15e8d5923276caf54330d918fb0&l=479

I recall this was an issue recently with another (or the same?) optional-CQ bot that mirrored an FYI builder. +jbud/dpranke: What was the resolution there?
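
To make the starvation described above concrete, here is a minimal sketch of a priority-ordered task pool. It is not Swarming's scheduler: the function name `simulate`, the tick-based timing, the group names, and the task counts are all assumptions made for illustration. The only things taken from this thread are that a lower priority number is served first (30 before 40) and that pending tasks expire after waiting too long.

```python
import heapq
from collections import Counter


def simulate(num_bots, ticks, arrivals, expiration_ticks):
    """Toy model of a shared bot pool (hypothetical, not Swarming itself).

    arrivals: list of (group_name, priority, tasks_per_tick).
    Every task occupies one bot for exactly one tick.
    A lower priority number is scheduled first; ties go to the older task.
    Returns a Counter keyed by (group_name, "ran" | "expired").
    """
    pending = []  # heap of (priority, enqueue_tick, seq, group_name)
    results = Counter()
    seq = 0
    for tick in range(ticks):
        # New tasks arrive at the start of each tick.
        for group, priority, count in arrivals:
            for _ in range(count):
                heapq.heappush(pending, (priority, tick, seq, group))
                seq += 1

        # Tasks that have been pending too long expire instead of running.
        kept = []
        for task in pending:
            priority, enqueued, _, group = task
            if tick - enqueued >= expiration_ticks:
                results[(group, "expired")] += 1
            else:
                kept.append(task)
        heapq.heapify(kept)
        pending = kept

        # Each free bot takes the most important pending task.
        for _bot in range(min(num_bots, len(pending))):
            task = heapq.heappop(pending)
            results[(task[3], "ran")] += 1
    return results


if __name__ == "__main__":
    # Pool exactly saturated by pri-30 work: the pri-40 tasks never get a bot.
    load = [("cq", 30, 10), ("optional", 40, 2)]
    print(simulate(num_bots=10, ticks=20, arrivals=load, expiration_ticks=5))
    # Same load with two more bots: nothing expires.
    print(simulate(num_bots=12, ticks=20, arrivals=load, expiration_ticks=5))
```

In this toy model, both remedies that come up later in the thread have an effect: adding bots removes the contention, and giving the optional tasks the same priority as the CQ tasks lets them share the queue in arrival order instead of always losing.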
And this started happening recently because it looks like our main linux pool is overloaded:
http://shortn/_aUy1XefGVb

Guess I'll try to find out where that extra load is coming from.
Cc: thakis@chromium.org
Labels: -Pri-3 Pri-2
thakis@ has been adding an awful lot of new linux tests recently:
https://chromium.googlesource.com/chromium/src/+log/master/testing/buildbot

Could be the source of the increase in load. If that's the case, then the load is probably here to stay. We may need to increase pool size.
Cc: s...@google.com
Owner: bpastene@chromium.org
Status: Assigned (was: Untriaged)
Let's see how the load does Monday. If it's still tight, we should add more capacity.
It's possible, but almost all of these tests are added to bots on the chromium.clang waterfall, and they take 4h to cycle each -- so it should be way less load than adding tests to cq bots.

Good news: I'm almost done adding tests.
Maybe less good news: If at all possible, I want to add a bunch of tests to the asan bot, and that does have a corresponding cq bot. Before adding stuff to cq bots, I'll check with infra re capacity though.
The pool's still pretty busy: http://shortn/_NpgJbUvwJE

Uploaded http://crrev.com/i/639133 to get 200 more bots.
landing that CL now ...

Comment 8 by mek@chromium.org, Jun 11 2018

Is this also the reason that the linux_mojo bot hasn't had a single successful build all day? Those tasks also seem to run at priority 40, so any CL that touches a directory that adds the linux_mojo bot to the list of required CQ bots can't land for now...

Comment 9 by chongz@chromium.org, Jun 11 2018

Blockedon: 842940
Cc: chongz@chromium.org
Re #c8: That's my understanding.
Might be worth linking issue 842940 to share the background.

Comment 10 by bugdroid1@chromium.org, Jun 11 2018

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/f89d3b715e392b5e317f2ec316e94d560f55de01

commit f89d3b715e392b5e317f2ec316e94d560f55de01
Author: Ben Pastene <bpastene@chromium.org>
Date: Mon Jun 11 23:12:27 2018


Comment 11 by bugdroid1@chromium.org, Jun 12 2018

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/83e3ab63423bdd24b94cff0a129f5a18c2cf1b33

commit 83e3ab63423bdd24b94cff0a129f5a18c2cf1b33
Author: Ben Pastene <bpastene@chromium.org>
Date: Tue Jun 12 20:03:51 2018


Comment 12 by bugdroid1@chromium.org, Jun 13 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/796c62860b5156171f3cdf71caaf66510cb957eb

commit 796c62860b5156171f3cdf71caaf66510cb957eb
Author: Ben Pastene <bpastene@chromium.org>
Date: Wed Jun 13 02:40:03 2018

Up the priority of tests on FYI bots that have optional CQ bot mirrors.

By default, all FYI tests run at pri 40, while all non-FYI tests run at pri 30.

Since some optional CQ bots mirror FYI bots, their tests run at a lower pri
than other CQ tests. Consequently, if the pool is in full use, there's a chance
that the optional tests will get starved out and expired. This leads to a blocked
CQ run if the CL in question is required to pass these optional bots via
"Cq-Include-Trybots" commit msg footer.

This increases priority of all known CQ-optional FYI bots to match normal CQ
tests.

Bug:  851115 
Change-Id: I9c7625e498f098a6613729777d3782c2b4b6625c
Reviewed-on: https://chromium-review.googlesource.com/1097600
Reviewed-by: John Budorick <jbudorick@chromium.org>
Reviewed-by: Dirk Pranke <dpranke@chromium.org>
Commit-Queue: Ben Pastene <bpastene@chromium.org>
Cr-Commit-Position: refs/heads/master@{#566696}
[modify] https://crrev.com/796c62860b5156171f3cdf71caaf66510cb957eb/testing/buildbot/chromium.fyi.json
[modify] https://crrev.com/796c62860b5156171f3cdf71caaf66510cb957eb/testing/buildbot/generate_buildbot_json.py
[modify] https://crrev.com/796c62860b5156171f3cdf71caaf66510cb957eb/testing/buildbot/waterfalls.pyl
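
The rule in the commit message above can be sketched roughly as follows. This is not the code from generate_buildbot_json.py: the function `pick_priority`, the builder names, and the dictionary layout are hypothetical and chosen only to illustrate the rule. The priority values 40 and 30 are the ones quoted in the commit message.

```python
FYI_PRIORITY = 40  # default for tests on FYI bots, per the commit message
CQ_PRIORITY = 30   # default for non-FYI (normal CQ) tests


def pick_priority(is_fyi, has_optional_cq_mirror):
    """Return the task priority for a builder's tests (hypothetical helper).

    FYI builders normally run at the less urgent priority, unless an
    optional CQ trybot mirrors them, in which case they match normal CQ tests.
    """
    if is_fyi and not has_optional_cq_mirror:
        return FYI_PRIORITY
    return CQ_PRIORITY


if __name__ == "__main__":
    # Placeholder builders, not real waterfall entries.
    builders = {
        "example-main-builder": {"fyi": False, "cq_mirror": False},
        "example-fyi-builder": {"fyi": True, "cq_mirror": False},
        "example-fyi-builder-with-trybot": {"fyi": True, "cq_mirror": True},
    }
    for name, cfg in builders.items():
        print(name, pick_priority(cfg["fyi"], cfg["cq_mirror"]))
```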

Status: Fixed (was: Assigned)
With the bump in capacity in #10 and the priority increase in #12, the tests on the bot shouldn't be starved out of running by other CQ runs anymore.
Cc: smut@chromium.org
Cc: -s...@google.com
