lots of offline bots in the win_msvc_cq pool, need to add more
Issue description

In theory the win_msvc_cq pool on tryserver.chromium.win is supposed to have 92 bots in it, but it looks like ~15 of them are permanently offline. We need them (or their replacements) back, and probably should add another ~10-20 or so bots on top of that.

This graph -- http://shortn/_mCK2ZoMPCP -- shows the usage over the past day, with win-msvc-dbg at 25%. I've posted https://crrev.com/c/898350 to bump it to 50%, which should give us a better sense of how many more bots we might need.

The bots that appear to be offline (from https://ci.chromium.org/buildbot/tryserver.chromium.win/win-msvc-dbg/ ):
vm612-m4 vm627-m4 vm632-m4 vm636-m4 vm638-m4 vm715-m4 vm717-m4 vm753-m4 vm755-m4 vm764-m4 vm895-m4 vm896-m4 vm950-m4 vm951-m4 vm952-m4
,
Feb 2 2018
,
Feb 2 2018
Looks like powercycling w/ vmpower brought back vm627-m4, vm632-m4, vm715-m4, vm717-m4, vm753-m4, and vm895-m4, leaving the other nine: vm612-m4 vm636-m4 vm638-m4 vm755-m4 vm764-m4 vm896-m4 vm950-m4 vm951-m4 vm952-m4 I expect Labs probably needs to take it from here.
,
Feb 2 2018
vm95{0..2}-m4 look to be double allocated, as they're 10.9.5 Mac VMs:
$ botmap.py 2>/dev/null | grep vm95'[0-2]'-m4
vm950-m4 win buildbot master.tryserver.chromium.win
vm950-m4 mac swarming chromium-swarm.appspot.com
vm951-m4 win buildbot master.tryserver.chromium.win
vm951-m4 mac swarming chromium-swarm.appspot.com
vm952-m4 win buildbot master.tryserver.chromium.win
vm952-m4 mac swarming chromium-swarm.appspot.com
vm950-m4 os x 10.9.5 (13f1096)
vm951-m4 os x 10.9.5 (13f1096)
vm952-m4 os x 10.9.5 (13f1096)
I'm going to assume this was a mistake to make these swarming vms, so I'll redeploy those.
The other bots start their buildslave process and connect to the master, but then the process exits immediately. I'm going to try re-bootstrapping those.
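For reference, the double-allocation check above can be sketched in a few lines of Python. This is only an illustration, not part of botmap.py: the function name `find_double_allocated` is made up, the row format `<host> <os> <scheduler> <master>` is inferred from the output pasted above, and the `vm000-m4` row is an invented single-allocation example.

```python
from collections import defaultdict


def find_double_allocated(lines):
    """Return {host: [(os, scheduler, master), ...]} for hosts listed more than once.

    Assumes each allocation row has exactly four whitespace-separated fields:
    <host> <os> <scheduler> <master>. Other rows (such as the OS version
    lines) are skipped.
    """
    assignments = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # not an allocation row in the assumed format
        host, os_name, scheduler, master = fields
        assignments[host].append((os_name, scheduler, master))
    # Keep only hosts that appear in more than one allocation row.
    return {host: allocs for host, allocs in assignments.items() if len(allocs) > 1}


sample = """\
vm950-m4 win buildbot master.tryserver.chromium.win
vm950-m4 mac swarming chromium-swarm.appspot.com
vm951-m4 win buildbot master.tryserver.chromium.win
vm951-m4 mac swarming chromium-swarm.appspot.com
vm950-m4 os x 10.9.5 (13f1096)
vm000-m4 win buildbot master.tryserver.chromium.win
""".splitlines()  # vm000-m4 is a hypothetical host, added for contrast

for host, allocs in sorted(find_double_allocated(sample).items()):
    print(host, allocs)
```

A host that shows up under both a buildbot master and a swarming server is flagged; hosts with a single row are not.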
,
Feb 2 2018
Bots in #3 are now all reconnected. Is expanding this pool still desired?
,
Feb 2 2018
The CL to bump up the load hasn't landed yet, so I can't say. I'll get that landed and see, and then report back if so desired. Thanks!
,
Feb 6 2018
vadimsh@ found win{1508..1526}-c4 as unused (see bug 75620 #c72) so that'll help, too.
,
Feb 15 2018
@dba - at a guess, it looks like we'll probably need 15 more bots in addition to the 18 listed in #c7 (which were actually slave{1508..1526}-c4, from bug 756270). Is that doable?
,
Feb 15 2018
looks like 15{09,10,11} are already in use, so maybe 20 more instead ...
,
Feb 15 2018
Actually, looks like the bots in this pool are all currently win7 ESX VMs. Mixing those with win10 GCE VMs can't be a good idea. I think instead I'm going to close this out, and move the win-msvc-dbg bot over to the win10_gce pool, which looks like it has plenty of capacity.
Comment 1 by dpranke@chromium.org, Feb 2 2018
Labels: -Pri-3 Pri-1