The ClangToT bots take a long time to cycle; we need more builders
Issue description
Many of the bots on https://build.chromium.org/p/chromium.fyi/console?category=clang%20tot take a long time to build. This hurts us because it takes longer to detect upstream compiler problems. Filing this to track getting more builders set up. I'll add specifics below.
,
Sep 15 2017
Cross-posting my comment from the CL (crrev.com/c/667837). I'm assuming the tests are all running under swarming? If so, there's very little reason to keep things split, and the way the builder/tester split is implemented today is quite expensive (we upload the whole build directory to cloud storage and download it again on the other bot). That upload/download step can actually be slower than the time it takes to run the tests :(. If/when we fix crbug.com/754104, the builder/tester split might make more sense again. Thoughts?
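One rough way to frame the trade-off described above is the sketch below. It is a simplified model, not a measurement: it assumes the steps run strictly one after another and ignores queuing and swarming scheduling. The names (`StepTimes`, `merged_latency`, `split_latency`, `transfer_dominates`) and all timings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StepTimes:
    """Per-step durations in seconds. All values used below are made up."""
    build_s: float
    upload_s: float
    download_s: float
    test_s: float


def merged_latency(t: StepTimes) -> float:
    """Time from build start to test results when one bot builds and tests."""
    return t.build_s + t.test_s


def split_latency(t: StepTimes) -> float:
    """Same, when the build directory is uploaded and downloaded again."""
    return t.build_s + t.upload_s + t.download_s + t.test_s


def transfer_dominates(t: StepTimes) -> bool:
    """True when moving the build directory costs more than running tests."""
    return t.upload_s + t.download_s > t.test_s


# Hypothetical timings, chosen only to illustrate the comparison.
example = StepTimes(build_s=3600, upload_s=600, download_s=600, test_s=900)
print(merged_latency(example), split_latency(example), transfer_dominates(example))
```

In this model the split always adds the upload and download time to the end-to-end latency. It only pays off if freeing the builder early matters more than that added delay.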
,
Sep 15 2017
(see also issue 587879)
,
Sep 15 2017
Let's do it!
New sketch:
{
'master': 'ChromiumFYI',
'builder': [
'ClangToTLinux',
'ClangToTLinux tester', <--- Merge with the builder, move to build44-m1
],
'hostname': 'slave3-c1',
'os': 'linux',
'version': 'precise',
'bits': '64',
},
{
'master': 'ChromiumFYI',
'builder': [
'ClangToTLinuxASan',
'ClangToTLinuxASan tester', <--- Merge with builder.
],
'hostname': 'slave4-c1',
'os': 'linux',
'version': 'precise',
'bits': '64',
},
{
'master': 'ChromiumFYI',
'builder': [
'ClangToTLinuxLLD',
'ClangToTLinuxLLD tester', <--- Merge with builder, move to slave3-c1.
'ClangToTAndroid', <--- Move to new builder
'ClangToTAndroid64', <--- Move to new builder
],
'hostname': 'build44-m1', <--- This is beefy; use for ClangToTLinux to make it fast.
'os': 'linux',
'version': 'precise',
'bits': '64',
},
{
'master': 'ChromiumFYI',
'builder': [
'ClangToTLinuxUBSanVptr',
'ClangToTLinuxUBSanVptr tester', <--- Merge with builder.
],
'hostname': 'slave42-c1',
'os': 'linux',
'version': 'precise',
'bits': '64',
},
{
'master': 'ChromiumFYI',
'builder': [
'ClangToTLinux (dbg)',
'ClangToTLinuxMSan', <-- This doesn't exist yet, but I'd like a new builder for it.
],
'hostname': 'slave5-c1',
'os': 'linux',
'version': 'precise',
'bits': '64',
},
{
'master': 'ChromiumFYI',
'builder': [
'CFI Linux ToT',
'ThinLTO Linux ToT', <--- Move to new builder.
],
'hostname': 'slave59-c1',
'os': 'linux',
'version': 'trusty',
'bits': '64',
},
That's 4 new VMs, for:
- ClangToTAndroid
- ClangToTAndroid64
- ClangToTLinuxMSan
- ThinLTO Linux ToT
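The proposal above can be restated as data. This is a minimal sketch, not how the real slave configuration is processed: the helper names (`merge_testers`, `proposed_layout`, `MOVES`, `NEW_VM`) are hypothetical, and only the hostnames and builder names come from the sketch in this comment.

```python
# Host -> builders, copied from the annotated config above.
CURRENT_LAYOUT = {
    'slave3-c1': ['ClangToTLinux', 'ClangToTLinux tester'],
    'slave4-c1': ['ClangToTLinuxASan', 'ClangToTLinuxASan tester'],
    'build44-m1': ['ClangToTLinuxLLD', 'ClangToTLinuxLLD tester',
                   'ClangToTAndroid', 'ClangToTAndroid64'],
    'slave42-c1': ['ClangToTLinuxUBSanVptr', 'ClangToTLinuxUBSanVptr tester'],
    'slave5-c1': ['ClangToTLinux (dbg)', 'ClangToTLinuxMSan'],
    'slave59-c1': ['CFI Linux ToT', 'ThinLTO Linux ToT'],
}

NEW_VM = None  # Marker (assumption): the builder needs a newly allocated VM.

# Builder -> target host, following the "<---" annotations above.
MOVES = {
    'ClangToTLinux': 'build44-m1',
    'ClangToTLinuxLLD': 'slave3-c1',
    'ClangToTAndroid': NEW_VM,
    'ClangToTAndroid64': NEW_VM,
    'ClangToTLinuxMSan': NEW_VM,
    'ThinLTO Linux ToT': NEW_VM,
}


def merge_testers(builders):
    """Drop 'X tester' entries whose builder 'X' is in the same list."""
    names = set(builders)
    suffix = ' tester'
    return [b for b in builders
            if not (b.endswith(suffix) and b[:-len(suffix)] in names)]


def proposed_layout(current):
    """Return (host -> builders on existing hosts, builders needing new VMs)."""
    existing, new_vm = {}, []
    for host, builders in current.items():
        for builder in merge_testers(builders):
            target = MOVES.get(builder, host)  # Unlisted builders stay put.
            if target is NEW_VM:
                new_vm.append(builder)
            else:
                existing.setdefault(target, []).append(builder)
    return existing, new_vm


existing, new_vm = proposed_layout(CURRENT_LAYOUT)
print(len(new_vm), sorted(new_vm))
```

With these moves every existing host ends up with exactly one builder, and four builders are left over for new VMs, which matches the list above.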
Dirk, what do you think?
,
Sep 15 2017
Okay, repurposing this bug for an actual labs request ... Vince, based on what we were talking about earlier today, can we allocate 9 n1-highmem-32 config GCE Linux Trusty bots with 2xSSDs in a single striped RAID-0 volume (so, 32 core, 208 GB RAM, 750 GB disk) to try these out as clang builders? For now, we will assume that we'll do build + test on the same bots. If that turns out to be too slow, we can reevaluate and do builder/tester splits with much smaller test bots. We can then retire all of the existing bots listed above. On a related note, just to confuse things further, I'd like to actually move these bots onto a different master (see crbug.com/765859), though hopefully that'll make the setup easier (and I'll help with that).
,
Sep 22 2017
vhang@, ping?
,
Sep 25 2017
Sorry for the late response. After chatting with the team, I found out that it isn't possible to create a VM or GCE instance with 2xSSDs in a RAID-0 volume. We can create 9 new n1-highmem-32 GCE instances for you. Will they all be in the chromium.fyi master?
,
Sep 25 2017
Can we do either 2xSSDs mounted as separate volumes, or 1 SSD + 1 spinning, so at least /b is on an SSD? I'm not sure that having the other stuff be on SSDs is as important. They'll be on the new master that I'm setting up in bug 765859, master.chromium.clang. As an aside, why can't we do RAID-0? Is that a GCE limitation, or the way we create images, or something else?
,
Sep 25 2017
It's a GCE limitation. It looks like you can get local host SSDs, but it's not clear that these are dedicated solely to the VM. You can't use them for the OS either, so this makes it a non-standard config with respect to bootstrapping the VM. +Ryan may provide more insight. See https://cloud.google.com/compute/docs/disks/local-ssd#create_local_ssd for details. If you really want the fastest thing possible with no resource contention, then it sounds like a bare-metal config is the way to go. I suggest first seeing whether the n1-highmem-32 config with a single 500 GB persistent disk works. There are several in use today (e.g. https://uberchromegw.corp.google.com/i/chromium.fyi/builders/UBSanVptr%20Linux).
,
Sep 25 2017
Okay, I don't want to gate things on the SSD discussion, so yeah, let's go with an n1-highmem-32 + 500 GB persistent spinning disk, and we can figure out an SSD strategy separately.
,
Sep 25 2017
,
Sep 25 2017
The following revision refers to this bug:
https://chrome-internal.googlesource.com/infra/infra_internal/+/4f4fefee9846815b73944fbf55db286552b55042
commit 4f4fefee9846815b73944fbf55db286552b55042
Author: Peter Schmidt <pschmidt@google.com>
Date: Mon Sep 25 21:44:41 2017
,
Sep 25 2017
The new slaves are slave{144,149,{162..168}}-c1
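For readers unfamiliar with the shell brace notation, here is a small sketch that spells out the hostnames it denotes. The function name `new_slave_hostnames` is made up for illustration; the numbers are taken directly from the expression above.

```python
def new_slave_hostnames():
    """Spell out slave{144,149,{162..168}}-c1 as explicit hostnames."""
    # {162..168} is an inclusive range, hence range(162, 169).
    numbers = [144, 149, *range(162, 169)]
    return [f'slave{n}-c1' for n in numbers]


hosts = new_slave_hostnames()
print(len(hosts))
print(hosts)
```

The expression covers nine hosts, the same number as the nine n1-highmem-32 instances requested earlier in the thread.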
,
Sep 25 2017
Thanks! Per conversation w/ estaab@ earlier today, we might actually want to move this "new master" onto LUCI instead, and not set up a new master. I'll be figuring that out over the next day or two.
,
Sep 26 2017
> The new slaves are slave{144,149,{162..168}}-c1
Dirk, can I start using these now with the bots on chromium.fyi, or do we need to wait for the new master? (I'm not super familiar with how this works.)
,
Sep 26 2017
I'd very much prefer if we wouldn't early-adopt anything for the clang bots. I realize that this is an interesting proving ground for infra folks, but we need these bots to work. Tried, true, and on-the-verge-of-deprecated is perfect for them.
,
Sep 27 2017
@thakis - understood. I will keep that in mind.
,
Sep 27 2017
hans@ - please wait for a new master. I will try to get a CL out for it shortly.
,
Sep 29 2017
,
Sep 29 2017
@pschmidt - it looks like slave149-c1 is already in use by the "codesearch-gen-webrtc-linux" builder on chromium.infra.codesearch, so I guess I need another bot for that.
,
Sep 29 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/tools/build/+/35939cb683d6fb2cd93939a4410c9a7345083928
commit 35939cb683d6fb2cd93939a4410c9a7345083928
Author: Dirk Pranke <dpranke@chromium.org>
Date: Fri Sep 29 23:48:39 2017

Add a new master.chromium.clang.

We want to split up chromium.fyi so that it is less heavily loaded, and we also want to spin up a bunch of new, higher-end machines for the Clang builders. This CL adds a new master.chromium.clang that will replace some of the existing FYI builders, and add more new ones. Subsequent CLs will migrate the remaining clang FYI bots over to this.

Reviewer: pschmidt@chromium.org, hans@chromium.org
Bug: 686323, 765420, 765859
Change-Id: I017b13b21d742501c7df9a8bdfca585f2085f601
Reviewed-on: https://chromium-review.googlesource.com/691086
Commit-Queue: Dirk Pranke <dpranke@chromium.org>
Reviewed-by: Nico Weber <thakis@chromium.org>
Reviewed-by: Benjamin Pastene <bpastene@chromium.org>
Reviewed-by: Hans Wennborg <hans@chromium.org>

[modify] https://crrev.com/35939cb683d6fb2cd93939a4410c9a7345083928/tests/masters_test.py
[add] https://crrev.com/35939cb683d6fb2cd93939a4410c9a7345083928/scripts/slave/recipe_modules/chromium_tests/chromium_clang.py
[add] https://crrev.com/35939cb683d6fb2cd93939a4410c9a7345083928/masters/master.chromium.clang/Makefile
[add] https://crrev.com/35939cb683d6fb2cd93939a4410c9a7345083928/masters/master.chromium.clang/master_site_config.py
[add] https://crrev.com/35939cb683d6fb2cd93939a4410c9a7345083928/masters/master.chromium.clang/builders.pyl
[modify] https://crrev.com/35939cb683d6fb2cd93939a4410c9a7345083928/scripts/slave/recipe_modules/chromium_tests/builders.py
[add] https://crrev.com/35939cb683d6fb2cd93939a4410c9a7345083928/masters/master.chromium.clang/master.cfg
,
Sep 29 2017
pschmidt: CL landed, so this is yours now :).
,
Oct 2 2017
Re: slave149-c1. It was incorrectly marked as unused; see https://chrome-internal-review.googlesource.com/409651 for the fix. slave172-c1 will be the replacement.
,
Oct 2 2017
The following revision refers to this bug:
https://chrome-internal.googlesource.com/infra/infra_internal/+/a5c597d18430400279b66cd258e22db6ab957c49
commit a5c597d18430400279b66cd258e22db6ab957c49
Author: Peter Schmidt <pschmidt@google.com>
Date: Mon Oct 02 17:42:08 2017
,
Oct 3 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/tools/build/+/cd6588cb69de9a4798fa7a5ce4b6488360a76d7b
commit cd6588cb69de9a4798fa7a5ce4b6488360a76d7b
Author: Dirk Pranke <dpranke@chromium.org>
Date: Tue Oct 03 22:38:04 2017

Add bot for TotAndroid64 on chromium.clang.

This fills in the bot name that we were missing before.

TBR=pschmidt@chromium.org
BUG=765420
Change-Id: I49f322f3b21b1b9936c79aed4f4288c33443b3fc
Reviewed-on: https://chromium-review.googlesource.com/699415
Reviewed-by: Dirk Pranke <dpranke@chromium.org>
Commit-Queue: Dirk Pranke <dpranke@chromium.org>

[modify] https://crrev.com/cd6588cb69de9a4798fa7a5ce4b6488360a76d7b/scripts/slave/recipe_modules/chromium_tests/chromium_clang.py
[modify] https://crrev.com/cd6588cb69de9a4798fa7a5ce4b6488360a76d7b/masters/master.chromium.clang/builders.pyl
,
Oct 4 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/tools/build/+/e1a01740d14f1a45df5a29ca856f12626b77367b
commit e1a01740d14f1a45df5a29ca856f12626b77367b
Author: Peter Schmidt <pschmidt@google.com>
Date: Wed Oct 04 18:27:30 2017

Add ToTAndroid64 builder and fix Alt web port on master.chromium.clang

Bug: 765420, 765859
Change-Id: Ic9af04775c6f401541f0769f5ccdac33c0eb30f6
Reviewed-on: https://chromium-review.googlesource.com/695725
Commit-Queue: Peter Schmidt <pschmidt@chromium.org>
Reviewed-by: Dirk Pranke <dpranke@chromium.org>

[modify] https://crrev.com/e1a01740d14f1a45df5a29ca856f12626b77367b/masters/master.chromium.clang/master_site_config.py
[modify] https://crrev.com/e1a01740d14f1a45df5a29ca856f12626b77367b/masters/master.chromium.clang/builders.pyl
,
Nov 21 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/tools/build/+/f1b0394eacec8d7e571407522100eb9dd41c444c
commit f1b0394eacec8d7e571407522100eb9dd41c444c
Author: Reid Kleckner <rnk@google.com>
Date: Tue Nov 21 21:59:27 2017

Merge chromium.clang builder and tester bots

These bots were configured as separate builders and testers, but now that we have swarming, the build packaging and uploading steps create unnecessary overhead. Consolidating these bots will also reduce the number of boxes to check in the buildbot web UI.

The only remaining builder-only bot after this change is "CFI Linux CF". It's not clear if this was intentionally configured to not run tests, but in any case, this patch doesn't touch it.

These are the tester VMs that are no longer used:
vm12-m1 vm30-m1 vm190-m1 vm310-m1 vm311-m1 vm312-m1 vm313-m1 vm961-m1 vm962-m1 vm970-m1 vm978-m1

R=dpranke@chromium.org, hans@chromium.org
Bug: 765420, 587879
Change-Id: Ice1dd50194c70fb57d1d2eae5f0a36d7dd3accd6
Reviewed-on: https://chromium-review.googlesource.com/782619
Reviewed-by: Hans Wennborg <hans@chromium.org>
Reviewed-by: Dirk Pranke <dpranke@chromium.org>
Commit-Queue: Reid Kleckner <rnk@chromium.org>

[modify] https://crrev.com/f1b0394eacec8d7e571407522100eb9dd41c444c/scripts/slave/recipe_modules/chromium_tests/chromium_clang.py
[modify] https://crrev.com/f1b0394eacec8d7e571407522100eb9dd41c444c/masters/master.chromium.clang/builders.pyl
,
Dec 5 2017
Comment 1 by h...@chromium.org, Sep 14 2017
Looking at just Linux, because there's no reason that shouldn't be fast:
{
'master': 'ChromiumFYI',
'builder': [
'ClangToTLinux', <--- I'd like this to be on a beefy builder.
'ClangToTLinux tester',
],
'hostname': 'slave3-c1',
'os': 'linux',
'version': 'precise',
'bits': '64',
},
{
'master': 'ChromiumFYI',
'builder': [
'ClangToTLinuxASan',
'ClangToTLinuxASan tester', <--- Split.
],
'hostname': 'slave4-c1',
'os': 'linux',
'version': 'precise',
'bits': '64',
},
{
'master': 'ChromiumFYI',
'builder': [
'ClangToTLinuxLLD',
'ClangToTLinuxLLD tester',
'ClangToTAndroid',
'ClangToTAndroid64',
],
'hostname': 'build44-m1', <--- This is a beefy non-VM. Use it for ClangToTLinux instead.
'os': 'linux',
'version': 'precise',
'bits': '64',
},
{
'master': 'ChromiumFYI',
'builder': [
'ClangToTLinuxUBSanVptr',
'ClangToTLinuxUBSanVptr tester', <--- Split.
],
'hostname': 'slave42-c1',
'os': 'linux',
'version': 'precise',
'bits': '64',
},
{
'master': 'ChromiumFYI',
'builder': [
'ClangToTLinux (dbg)',
'ClangToTLinuxMSan',
'ClangToTLinuxMSan tester', <--- These don't exist yet, but I'd like them to.
],
'hostname': 'slave5-c1',
'os': 'linux',
'version': 'precise',
'bits': '64',
},
{
'master': 'ChromiumFYI',
'builder': [
'CFI Linux ToT',
'ThinLTO Linux ToT', <--- Split.
],
'hostname': 'slave59-c1',
'os': 'linux',
'version': 'trusty',
'bits': '64',
},
To summarize, I'd like:
build44-m1 to be used for ClangToTLinux so that it cycles quickly.
New VMs for testers:
- ClangToTLinuxASan tester
- ClangToTLinuxLLD tester
- ClangToTLinuxUBSanVptr tester
- ClangToTLinuxMSan tester
New VMs for builders:
- ClangToTLinuxLLD
- ClangToTAndroid
- ClangToTAndroid64
- ClangToTLinuxMSan
- ThinLTO Linux ToT
That's 4 VMs for testers and 5 VMs for building. The builders should ideally be beefy, as these are non-goma builds.
Dirk, does this sound reasonable? Should I file separate infra tickets for these, or can I point someone at this one?