Switch GPU tests on Mac Minis to run on 10.13
Issue description
Currently the majority of the GPU tests run on the shared Mac Mini pool are invoked on 10.12. However, recently, that pool has mostly been upgraded to 10.13:
https://chromium-swarm.appspot.com/botlist?c=id&c=os&c=task&c=status&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=os%3AMac-10.13&l=100&s=id%3Aasc
compare to:
https://chromium-swarm.appspot.com/botlist?c=id&c=os&c=task&c=status&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=os%3AMac-10.12&l=100&s=id%3Aasc
The GPU tests should be switched over to run on 10.13. This would have caught bugs like Issue 897507.
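The bot-list links in this issue differ only in their dimension filters (`gpu`, `os`, and sometimes `pool`). As a minimal sketch, and assuming the query layout seen in those links stays as it is, a link of this kind could be assembled as follows. The helper name `botlist_url` and its parameters are hypothetical and are not part of any Swarming tooling.

```python
from urllib.parse import quote, urlencode

# Base URL and GPU dimension string copied from the links in this issue.
SWARMING_BOTLIST = "https://chromium-swarm.appspot.com/botlist"
MAC_MINI_GPU = "Intel Haswell Iris Graphics 5100 (8086:0a2e)"


def botlist_url(os_version, pool=None, limit=100):
    """Build a bot-list link filtered by GPU, OS and (optionally) pool.

    Hypothetical helper: it only mirrors the query string layout used by
    the links quoted in this issue.
    """
    params = [
        ("c", "id"), ("c", "os"), ("c", "task"), ("c", "status"),
        ("f", f"gpu:{MAC_MINI_GPU}"),
    ]
    if pool is not None:
        params.append(("f", f"pool:{pool}"))
    params.append(("f", f"os:{os_version}"))
    params += [("l", str(limit)), ("s", "id:asc")]
    # Keep parentheses literal, as in the original links; everything else
    # (spaces, colons) is percent-encoded.
    return SWARMING_BOTLIST + "?" + urlencode(params, safe="()", quote_via=quote)


if __name__ == "__main__":
    print(botlist_url("Mac-10.13"))
    print(botlist_url("Mac-10.12.6", pool="Chrome"))
```

Passing a general version such as `Mac-10.13` matches every point release, while `Mac-10.13.6` narrows the list to that release only, which is the distinction the later comments rely on.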
Oct 25
Correction from jbudorick@ on https://chromium-review.googlesource.com/1298424/: when looking at pool:Chrome and comparing 10.12.6 to 10.13.6 specifically, there are ~200 10.12.6 bots available:
https://chromium-swarm.appspot.com/botlist?c=id&c=os&c=task&c=status&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=pool%3AChrome&f=os%3AMac-10.12.6&l=100&s=id%3Aasc
and only 50 10.13.6 bots:
https://chromium-swarm.appspot.com/botlist?c=id&c=os&c=task&c=status&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=os%3AMac-10.13.6&f=pool%3AChrome&l=100&s=id%3Aasc
In order to proceed with this upgrade, it's necessary to increase the capacity of the 10.13 pool, and migrate the majority of the 10.12 tests to 10.13.
There are ~358 Mac Minis running some version of 10.13 in the various Swarming pools:
https://chromium-swarm.appspot.com/botlist?c=id&c=os&c=task&c=status&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=os%3AMac-10.13&l=100&s=id%3Aasc
and 143 Mac Minis in the Chrome pool running some version of 10.13:
https://chromium-swarm.appspot.com/botlist?c=id&c=os&c=task&c=status&c=pool&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=pool%3AChrome&f=os%3AMac-10.13&l=100&s=id%3Aasc
This leaves ~200 10.13 Mac Minis in other pools; we should see if we can coalesce the pools somehow.
Additionally, we'll need to do this upgrade with the trigger_multiple_dimensions script once Issue 871453 is resolved.
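For reference, the arithmetic behind these figures can be written out as a short sketch. The counts are the approximate numbers quoted in this comment; the variable names are purely illustrative.

```python
# Approximate bot counts quoted in the comment above.
total_1013_all_pools = 358   # Mac Minis on some 10.13 version, all pools
chrome_pool_1013 = 143       # of those, in pool:Chrome
chrome_pool_10126 = 200      # pool:Chrome bots on 10.12.6 specifically
chrome_pool_10136 = 50       # pool:Chrome bots on 10.13.6 specifically

# 10.13 machines sitting outside pool:Chrome (the "~200" in the comment).
other_pools_1013 = total_1013_all_pools - chrome_pool_1013
print(other_pools_1013)  # 215

# Share of the 10.12.6 + 10.13.6 Chrome-pool bots that are already on 10.13.6.
share_10136 = chrome_pool_10136 / (chrome_pool_10126 + chrome_pool_10136)
print(f"{share_10136:.0%}")  # 20%
```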
Oct 31
John: at this point Issue 871453 has been resolved, and there are ~143 Mac Minis in the Chrome pool running 10.13.6: http://shortn/_s2T51e4ybi , compared to ~200 running 10.12.6: http://shortn/_HMOyHO9UNG
I don't know how to figure out which tests are causing most of the load on which configurations. Certainly looking at 10.12 specifically:
https://chromium-swarm.appspot.com/tasklist?c=name&c=state&c=created_ts&c=duration&c=pending_time&c=pool&c=bot&et=1541024040000&f=pool%3AChrome&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=os%3AMac-10.12&l=50&n=true&s=created_ts%3Adesc&st=1540937640000
the only bot that's triggering on the general "10.12" OS dimension is mac10.12-blink-rel. Most of the test suites seem to trigger on 10.12.6 specifically:
https://chromium-swarm.appspot.com/tasklist?c=name&c=state&c=created_ts&c=duration&c=pending_time&c=pool&c=bot&et=1541024040000&f=pool%3AChrome&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=os%3AMac-10.12.6&l=50&n=true&s=created_ts%3Adesc&st=1540937640000
Certainly a significant fraction of these are the GPU tests. Looking at the tests triggered against the 10.13 dimension on this hardware:
https://chromium-swarm.appspot.com/tasklist?c=name&c=state&c=created_ts&c=duration&c=pending_time&c=pool&c=bot&et=1541024040000&f=pool%3AChrome&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=os%3AMac-10.13&l=50&n=true&s=created_ts%3Adesc&st=1540937640000
turns up webkit_layout_tests, telemetry_unittests, telemetry_perf_unittests, and others.
Could you help me understand what percentage of the 10.12 bots we would want to reimage as 10.13 bots in order to move the GPU tests over to the 10.13 bots?
Also, is this really blocked on Issue 871453? We've successfully migrated GPU tests from one OS version to another in the past using the trigger_multiple_dimensions.py script, and it's undergone recent improvements in Issue 886985.
Thanks for your feedback.
Assigning to you for your input. Please assign back to me with next steps.
Dec 13
Issue 913064 (sorry, restricted view) describes some dependencies that need to be resolved, and as a consequence of fixing that bug, the 10.12 and 10.13 pools will likely be merged, and the GPU tests on the Mac Minis migrated to 10.13.
Dec 13
Thanks, kbr@. I will keep an eye on this issue for updates.
Jan 9
In Issue 914413, ~80 Mac Minis were migrated from 10.12.6 to 10.13.6, and it looks like a majority of the Swarming tasks were migrated. Here are usage graphs:
10.12.6: http://shortn/_Ov5GzLZUPI
10.13.6: http://shortn/_GJzHHMjOZd
There are now ~220 10.13.6 Mac Minis in comparison to ~120 10.12.6 machines. However, the 10.13 machines are more heavily loaded than the 10.12 ones.
Is there any way to proceed with migrating the remaining CQ jobs from 10.12 to 10.13, and upgrading a majority of the remaining 10.12 machines to 10.13?
Jan 9
Raising to P1. During peak hours (Pacific time) we are seeing expirations of the GPU tests that are launched on the Mac Minis by the mac_chromium_rel_ng tryserver:
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac_chromium_rel_ng/221232
because of a lack of capacity on the 10.12.6 machines. We would very much like to cut these tests over to 10.13, and migrate most if not all of the remaining 10.12 Mac Minis to 10.13.
Jan 9
Is overall Mac capacity insufficient? 10.13 seems busier than 10.12: https://bugs.chromium.org/p/chromium/issues/detail?id=920434
Jan 17
Issue 923107 has been merged into this issue.
Comment 1 by kbr@chromium.org, Oct 25