Swarming test pending time of win7_chromium_rel_ng builder sometimes exceeds 20 mins
Issue description
The pending time of the win7_chromium_rel_ng builder's swarming tests sometimes exceeds 20 mins at peak time. http://shortn/_UYoRjmdsBs Can we add more capacity to the swarming test pool for win7_chromium_rel_ng? Some other builders also have long pending durations, but win7_chromium_rel_ng has higher priority because it tends to be the slowest builder in the CQ. http://shortn/_p7OZQUVVjf
,
Dec 13
Seeing 20 min pending times in https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/151674 https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/151675 If we cannot add more capacity for Golo VMs, is it possible to run some time-consuming tests (e.g. webkit_layout_tests, browser_tests) only on GCE-based VMs?
,
Dec 13
#1/2 - I'll point out that, while it takes more time to deploy Win7 VMs, we do have the capacity to do this if that turns out to be the right course of action.
,
Dec 14
Dirk, John, what do you think? I think adding capacity to win7_chromium_rel_ng is a reasonable thing to do if we can.
,
Dec 14
Ping? The max pending time of the win7_chromium_rel_ng builder always exceeds 20 mins during MTV business hours. http://shortn/_TB9J3ij0nd Adding capacity seems worth doing.
,
Dec 14
Adding more capacity here seems reasonable, though I'm not immediately sure how many we should add. #2: win7 isn't supported on gce-based VMs. #3: how many win7 VMs can we deploy?
,
Dec 14
> how many win7 VMs can we deploy? How many would you estimate are needed?
,
Dec 17
> How many would you estimate are needed? https://plx.corp.google.com/scripts2/script_5b._720d13_0000_240a_a6f4_001a11c04a1c Considering peak usage of swarming tasks, I expect that adding 100 VMs (496 -> around 600) will improve the situation in most cases.
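The arithmetic behind this estimate can be checked with a short sketch. The function name `pool_growth` is hypothetical and only the two figures quoted in the comment (496 existing bots, 100 added) are used; it says nothing about whether the extra capacity is actually sufficient.

```python
def pool_growth(current_bots: int, added_bots: int) -> tuple[int, float]:
    """Return the new pool size and the relative capacity increase."""
    new_total = current_bots + added_bots
    return new_total, added_bots / current_bots


# Figures taken from the comment: 496 bots today, 100 requested.
new_total, increase = pool_growth(496, 100)
print(f"{new_total} bots, {increase:.1%} more capacity")
# prints "596 bots, 20.2% more capacity"
```

So "around 600" corresponds to roughly a one-fifth increase over the current pool.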
,
Dec 17
Agree w/ #8, I think something in the 50-100 range would be good here if possible.
,
Dec 18
,
Dec 18
I'll verify we have the capacity for the 100 and, if not, what's the max we can deploy this week.
,
Dec 18
We do have the capacity for all 100. It will most likely take more than a week to deliver them all, so we'll give updates as chunks are delivered.
,
Dec 19
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/478d8287debc37a83dfcb9f7c960eee39289bf78
commit 478d8287debc37a83dfcb9f7c960eee39289bf78
Author: Garrett Beaty <gbeaty@chromium.org>
Date: Wed Dec 19 01:14:37 2018
Increase the timeout for win7_chromium_rel_ng to mitigate timeouts.
Bug: 914104
Change-Id: I65d262ad1ee57be59d9e7c79efe95a30c379b0c5
Reviewed-on: https://chromium-review.googlesource.com/c/1383531
Reviewed-by: John Budorick <jbudorick@chromium.org>
Commit-Queue: Garrett Beaty <gbeaty@chromium.org>
Cr-Commit-Position: refs/heads/master@{#617700}
[modify] https://crrev.com/478d8287debc37a83dfcb9f7c960eee39289bf78/infra/config/global/cr-buildbucket.cfg
,
Dec 19
The following revision refers to this bug: https://chrome-internal.googlesource.com/chrome-golo/chrome-golo/+/f81df0f688e4f9aa29a3ef718dfeaba7270be565 commit f81df0f688e4f9aa29a3ef718dfeaba7270be565 Author: Bryce Albritton <dba@google.com> Date: Wed Dec 19 21:35:20 2018
,
Dec 19
The following revision refers to this bug: https://chrome-internal.googlesource.com/infradata/config/+/5014e112f146383905281d0ee56d11f26a82422b commit 5014e112f146383905281d0ee56d11f26a82422b Author: Bryce Albritton <dba@google.com> Date: Wed Dec 19 22:17:07 2018
,
Dec 19
49 bots added to Pool: Chrome (2 of the bots added were already in the config, #14 delivers the remaining 47). Will continue with the remaining 51 to give a total of 100.
,
Dec 19
,
Dec 20
The following revision refers to this bug: https://chrome-internal.googlesource.com/infradata/config/+/18d41718d0e0b5e3719d16141075c2e5e2f03a55 commit 18d41718d0e0b5e3719d16141075c2e5e2f03a55 Author: Bryce Albritton <dba@google.com> Date: Thu Dec 20 22:28:07 2018
,
Dec 20
The following revision refers to this bug: https://chrome-internal.googlesource.com/chrome-golo/chrome-golo/+/bef758817b216c71240af29b679565383c55216d commit bef758817b216c71240af29b679565383c55216d Author: Bryce Albritton <dba@google.com> Date: Thu Dec 20 22:28:18 2018
,
Dec 20
51 more bots are now in Pool: Chrome.
,
Jan 7
Thanks, dba! The 4-week stats show a good improvement in max pending time. http://shortn/_j2KIM2a2wl https://screenshot.googleplex.com/KVKht3gtvLw I will mark this bug fixed if this week has similar stats.
,
Jan 7
,
Jan 10
Hmm, 600 bots may not be sufficient yet. http://shortn/_HZGdC6uchh
,
Jan 10
That's pretty crazy that 600 bots isn't enough.
,
Jan 14
The NextAction date has arrived: 2019-01-14
,
Jan 16
Hmm, adding capacity did not improve pending time much. http://shortn/_KoHGgL6PWI But time-consuming tasks are sent from the 'Win7 Tests (dbg)(1)' bots. It seems more than 50% of the win7 pool is consumed by non-optimized dbg tests. https://plx.corp.google.com/scripts2/script_5c._3cf21a_0000_2bcb_926c_089e0832afdc Can we enable some optimization for the dbg tester? I think one of the biggest roles of the dbg tester is to verify the behavior of the component-build binary rather than that of the non-optimized binary.
,
Jan 16
Sorry, the query in #26 was wrong; updated. browser_tests and webkit_layout_tests in the win7_chromium_rel_ng builder are the most time-consuming tests in the pool.
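For readers without access to the internal query, the kind of ranking described here could be sketched as follows. This is not the actual query: the record layout `(builder, test suite, seconds)`, the function name `rank_by_bot_time`, and the sample numbers are all made-up placeholders chosen only to show the aggregation.

```python
from collections import defaultdict


def rank_by_bot_time(tasks):
    """Sum task duration per (builder, suite) and rank by share of bot time."""
    totals = defaultdict(float)
    for builder, suite, seconds in tasks:
        totals[(builder, suite)] += seconds
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(key, secs, secs / grand_total) for key, secs in ranked]


# Placeholder records, NOT real measurements from the win7 pool.
sample = [
    ("win7_chromium_rel_ng", "browser_tests", 3000.0),
    ("win7_chromium_rel_ng", "webkit_layout_tests", 2000.0),
    ("Win7 Tests (dbg)(1)", "browser_tests", 1000.0),
    ("win7_chromium_rel_ng", "browser_tests", 2000.0),
]

for (builder, suite), secs, share in rank_by_bot_time(sample):
    print(f"{builder} / {suite}: {secs:.0f}s ({share:.1%})")
```

Grouping by both builder and suite, rather than by builder alone, is what separates the rel_ng suites from the dbg tester in the ranking.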
,
Jan 16
I have some ideas for this:
* add more capacity if possible
* move some tests to the win10 gce pool; gpu-related tests already run on win10, and we can resize the gce pool dynamically, so we can somewhat control the cost
* do nothing and put up with the pending time at peak time
,
Jan 16
How hard would it be to move tests to the win10 gce pool?
,
Jan 16
Technically, it can be done like https://chromium-review.googlesource.com/c/chromium/src/+/1404907 We also need to increase win10 pool capacity in this case.
,
Jan 17
> move some test to win10 gce pool as gpu related tests already run on win10, and we can resize gce pool dynamically, so we can somewhat control the cost. I think it would make more sense to move the win10 GPU tests on win7_chromium_rel_ng to win10_chromium_x64_rel_ng. Doesn't decrease load at all, just makes the naming scheme more consistent/less surprising.
,
Jan 17
And I don't think we have deployments currently up in the air, so removing bryce as owner until we have more concrete plans.
,
Jan 17
It'd be fine to move the Win10 GPU tests off of win7_chromium_rel_ng -- the only difference being that they'd be tested in 64-bit rather than 32-bit builds. Is there capacity on the win10_chromium_x64_rel_ng bot to simply switch these tests over? Note that unfortunately this mirroring is currently done in the tools/build workspace: https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/chromium_tests/trybots.py?q=trybots.py&sq=package:chromium&dr so the switchover would have to be done without any tryjobs. Also, the GPU Win Builder is currently a 32-bit builder, so we would need to make it build 64-bit first.
Comment 1 by sergeybe...@chromium.org, Dec 12
Status: Assigned (was: Untriaged)