
Issue 658189 link

Starred by 1 user

Issue metadata

Status: Started
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Feature




Improve strategy for buildbot's slave affinity

Project Member Reported by machenb...@chromium.org, Oct 21 2016

Issue description

This is a continuation of efforts from issue 524581 to improve the current slave-affinity strategy:

Currently, when the system is under medium or higher load, preferred slaves are often occupied, so new builds start taking slaves that prefer other builders. This in turn leaves more slaves unavailable for their preferred builders.

As load increases, the probability that an available slave prefers the next scheduled build drops sharply.

I propose a new strategy that will lead to a fairer distribution:

Let slaves(builder) be the number of available slaves preferring that builder.

For builder A and available slaves S, choose s in S such that (in order of preference):
1. s prefers A, or
2. s prefers nothing (a fall-over pool), or
3. s prefers some B != A and there's no other builder C != B such that slaves(C) > slaves(B)

Even if no fall-over pool exists, rule 3 ensures that the slaves whose preferred builder has the most available slaves are used first.
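The three rules can be sketched in Python as below. This is only an illustration of the proposal, not the code in master_utils.py: the function name choose_slave, the preferred_of mapping (slave name to preferred builder, or None) and the rng parameter are all hypothetical.

```python
import random
from collections import Counter


def choose_slave(builder, available, preferred_of, rng=random):
    """Pick a slave for `builder` from the list `available`.

    `preferred_of` is a hypothetical dict mapping each slave name to the
    builder it prefers, or to None for slaves in the fall-over pool.
    """
    if not available:
        return None

    # Rule 1: slaves that prefer this builder.
    own = [s for s in available if preferred_of.get(s) == builder]
    if own:
        return rng.choice(own)

    # Rule 2: slaves that prefer nothing (the fall-over pool).
    pool = [s for s in available if preferred_of.get(s) is None]
    if pool:
        return rng.choice(pool)

    # Rule 3: every remaining slave prefers some other builder B.
    # Keep only those whose B has the maximal slaves(B), i.e. there is
    # no builder C with slaves(C) > slaves(B).
    counts = Counter(preferred_of[s] for s in available)
    best = max(counts.values())
    candidates = [s for s in available if counts[preferred_of[s]] == best]
    return rng.choice(candidates)
```

Ties within a rule are broken randomly here; that is an assumption of the sketch, since the proposal does not say how to pick among equally good slaves.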

----

Example:

Let's assume the following setup: slaves a1, a2, ... prefer builder A, slaves b1, b2, ... prefer builder B, and so on for builders C and D. Let's assume we call the nextSlave function with builder B and available slaves [a1, a2, a3, c1, d1, d2, d3].

The current method might choose c1 randomly, making it worse for the next build of builder C. The new proposal would only choose from a{1..3} and d{1..3}, because slaves(A) = slaves(D) = 3 while slaves(C) = 1.
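The counts behind this example can be checked with a short tally. The dict below simply encodes the seven available slaves from the example; the variable names are made up for illustration.

```python
from collections import Counter

# Available slaves from the example, mapped to the builder each prefers.
available = {
    "a1": "A", "a2": "A", "a3": "A",
    "c1": "C",
    "d1": "D", "d2": "D", "d3": "D",
}

# slaves(builder): number of available slaves preferring each builder.
counts = Counter(available.values())
best = max(counts.values())

# Slaves eligible under rule 3 when scheduling a build for builder B.
eligible = sorted(s for s, b in available.items() if counts[b] == best)
print(eligible)  # ['a1', 'a2', 'a3', 'd1', 'd2', 'd3']
```

c1 is excluded because it is the only available slave preferring C, so taking it would leave builder C with no preferred slave.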
 
Components: -Infra Infra>Platform
Project Member

Comment 3 by bugdroid1@chromium.org, Oct 24 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build.git/+/f71cca8a090ef8d7608af034b5e2bdcdc7df2bc5

commit f71cca8a090ef8d7608af034b5e2bdcdc7df2bc5
Author: machenbach <machenbach@chromium.org>
Date: Mon Oct 24 09:44:49 2016

Improve slave-preference function

The current slave-preference function has the following
flaw:

When the system is under medium load, preferred slaves
are often occupied, so that each new build starts taking
slaves preferring other builders. This in turn leads to more
slaves not being available for their preferred builders.

The new implementation has the following strategy:

Let slaves(builder) be the number of available slaves
preferring that builder. For builder A and available
slaves S, choose s in S such that (order is preference):
1. s prefers A, or
2. s prefers nothing (an optional fall-over pool), or
3. s prefers some B != A and there's no other builder C != B
such that slaves(C) > slaves(B)

BUG=658189

Review-Url: https://codereview.chromium.org/2427413005

[modify] https://crrev.com/f71cca8a090ef8d7608af034b5e2bdcdc7df2bc5/scripts/master/master_utils.py
[modify] https://crrev.com/f71cca8a090ef8d7608af034b5e2bdcdc7df2bc5/scripts/master/unittests/master_utils_test.py

Project Member

Comment 4 by bugdroid1@chromium.org, Oct 25 2016

Project Member

Comment 5 by bugdroid1@chromium.org, Oct 25 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/master-manager.git/+/1056712122c1b468336077c4a1f2c10bcb8e9ef8

commit 1056712122c1b468336077c4a1f2c10bcb8e9ef8
Author: machenbach <machenbach@google.com>
Date: Tue Oct 25 13:13:31 2016

Project Member

Comment 6 by bugdroid1@chromium.org, Oct 25 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build.git/+/7530f89daaf64818da1c8c6ae924d68b5d759c6b

commit 7530f89daaf64818da1c8c6ae924d68b5d759c6b
Author: machenbach <machenbach@chromium.org>
Date: Tue Oct 25 13:50:32 2016

Revert of V8: Use new slave-preference function on tryserver (patchset #1 id:1 of https://codereview.chromium.org/2448873003/ )

Reason for revert:
Fails on master

Original issue's description:
> V8: Use new slave-preference function on tryserver
>
> BUG=658189
>
> Committed: https://chromium.googlesource.com/chromium/tools/build/+/1978a7f881b4583c3f87362961dbe82cc20ccf52

TBR=tandrii@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=658189

Review-Url: https://codereview.chromium.org/2448033003

[modify] https://crrev.com/7530f89daaf64818da1c8c6ae924d68b5d759c6b/masters/master.tryserver.v8/master.cfg

Project Member

Comment 9 by bugdroid1@chromium.org, Oct 28 2016

Project Member

Comment 10 by bugdroid1@chromium.org, Oct 28 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/master-manager.git/+/63a9d9ed12cf5068b1d18bdfa6442bd94fdcba67

commit 63a9d9ed12cf5068b1d18bdfa6442bd94fdcba67
Author: machenbach <machenbach@google.com>
Date: Fri Oct 28 08:15:57 2016

It is as I feared. The preferred property is not even set on the slaves in production. Output from the V8 logs:

2016-10-28 01:30:32-0700 [-] Assigning slave to v8_linux_chromium_gn_rel. Preferred: [None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None]. Chose <SlaveBuilder builder='v8_linux_chromium_gn_rel' slave='slave404-c4'>.

So the old distribution method didn't work at all either.
Meh, looks like in V8's master.cfg, one line assigning the preferred builders dict was missing. Hangdog :(
Project Member

Comment 14 by bugdroid1@chromium.org, Oct 28 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/master-manager.git/+/559f3513c791559f57a520943f0c47c208c7fc35

commit 559f3513c791559f57a520943f0c47c208c7fc35
Author: machenbach <machenbach@google.com>
Date: Fri Oct 28 09:06:34 2016
