
Issue 898684 link

Starred by 5 users

Issue metadata

Status: Started
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 1
Type: Feature

Blocked on:
issue 871453
issue 886985
issue 897507
issue 898967
issue 913064
issue 914413




Switch GPU tests on Mac Minis to run on 10.13

Project Member Reported by kbr@chromium.org, Oct 24

Issue description

Currently the majority of the GPU tests run on the shared Mac Mini pool are invoked on 10.12. However, recently, that pool has mostly been upgraded to 10.13:
https://chromium-swarm.appspot.com/botlist?c=id&c=os&c=task&c=status&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=os%3AMac-10.13&l=100&s=id%3Aasc

compare to:
https://chromium-swarm.appspot.com/botlist?c=id&c=os&c=task&c=status&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=os%3AMac-10.12&l=100&s=id%3Aasc

The GPU tests should be switched over to run on 10.13. This would have caught bugs like Issue 897507.

 
Blockedon: 898967
Blockedon: 871453
Correction from jbudorick@ on https://chromium-review.googlesource.com/1298424/ : when looking at pool:Chrome and comparing 10.12.6 to 10.13.6 specifically, there are ~200 10.12.6 bots available:
https://chromium-swarm.appspot.com/botlist?c=id&c=os&c=task&c=status&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=pool%3AChrome&f=os%3AMac-10.12.6&l=100&s=id%3Aasc

and only 50 10.13.6 bots:
https://chromium-swarm.appspot.com/botlist?c=id&c=os&c=task&c=status&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=os%3AMac-10.13.6&f=pool%3AChrome&l=100&s=id%3Aasc

To proceed with this upgrade, we need to increase the capacity of the 10.13 pool and migrate the majority of the 10.12 tests to 10.13.

There are ~358 Mac Minis running some version of 10.13 in the various Swarming pools:
https://chromium-swarm.appspot.com/botlist?c=id&c=os&c=task&c=status&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=os%3AMac-10.13&l=100&s=id%3Aasc

and 143 Mac Minis in the Chrome pool running some version of 10.13:
https://chromium-swarm.appspot.com/botlist?c=id&c=os&c=task&c=status&c=pool&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=pool%3AChrome&f=os%3AMac-10.13&l=100&s=id%3Aasc

This leaves ~200 10.13 Mac Minis in other pools; we should see if we can coalesce the pools somehow.
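
The bot-list links in this thread differ only in their Swarming dimension filters (gpu, os, pool). As a rough sketch, assuming the query-string layout seen in those links stays stable, such a link could be assembled as follows. The helper name `botlist_url` and its parameters are hypothetical and not part of any Swarming tooling.

```python
from urllib.parse import quote, urlencode

# Base URL and GPU dimension copied from the links in this issue.
BOTLIST = "https://chromium-swarm.appspot.com/botlist"
MAC_MINI_GPU = "Intel Haswell Iris Graphics 5100 (8086:0a2e)"


def botlist_url(dimensions, columns=("id", "os", "task", "status"), limit=100):
    """Build a bot-list link filtered by (key, value) dimension pairs.

    Hypothetical helper: it only mirrors the query layout of the links
    pasted in this issue (c=column, f=key:value filter, l=limit, s=sort).
    """
    params = [("c", col) for col in columns]
    params += [("f", f"{key}:{value}") for key, value in dimensions]
    params += [("l", str(limit)), ("s", "id:asc")]
    # quote (not quote_plus) encodes spaces as %20; parentheses are kept
    # literal to match the links in this issue.
    return BOTLIST + "?" + urlencode(params, quote_via=quote, safe="()")


# All Mac Minis on some version of 10.13, across every pool.
all_10_13 = botlist_url([("gpu", MAC_MINI_GPU), ("os", "Mac-10.13")])
# Only those in the Chrome pool.
chrome_10_13 = botlist_url(
    [("gpu", MAC_MINI_GPU), ("pool", "Chrome"), ("os", "Mac-10.13")]
)
print(all_10_13)
print(chrome_10_13)
```

Subtracting the two counts quoted above (roughly 358 - 143 = 215) gives the "~200" figure for 10.13 Mac Minis outside the Chrome pool.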

Additionally, we'll need to do this upgrade with the trigger_multiple_dimensions script once Issue 871453 is resolved.

Blockedon: 886985
Cc: kbr@chromium.org
Components: Infra>Client>Chrome
Owner: jbudorick@chromium.org
John: at this point Issue 871453 has been resolved, and there are ~143 Mac Minis in the Chrome pool running 10.13.6:

http://shortn/_s2T51e4ybi

compared to ~200 running 10.12.6:

http://shortn/_HMOyHO9UNG

I don't know how to figure out which tests are causing most of the load on which configurations. Looking at 10.12 specifically:

https://chromium-swarm.appspot.com/tasklist?c=name&c=state&c=created_ts&c=duration&c=pending_time&c=pool&c=bot&et=1541024040000&f=pool%3AChrome&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=os%3AMac-10.12&l=50&n=true&s=created_ts%3Adesc&st=1540937640000

the only bot that's triggering on the general "10.12" OS dimension is mac10.12-blink-rel. Most of the test suites seem to trigger on 10.12.6 specifically:

https://chromium-swarm.appspot.com/tasklist?c=name&c=state&c=created_ts&c=duration&c=pending_time&c=pool&c=bot&et=1541024040000&f=pool%3AChrome&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=os%3AMac-10.12.6&l=50&n=true&s=created_ts%3Adesc&st=1540937640000

A significant fraction of these are GPU tests.

Looking at the tests triggered against the 10.13 dimension on this hardware:

https://chromium-swarm.appspot.com/tasklist?c=name&c=state&c=created_ts&c=duration&c=pending_time&c=pool&c=bot&et=1541024040000&f=pool%3AChrome&f=gpu%3AIntel%20Haswell%20Iris%20Graphics%205100%20(8086%3A0a2e)&f=os%3AMac-10.13&l=50&n=true&s=created_ts%3Adesc&st=1540937640000

turns up webkit_layout_tests, telemetry_unittests, telemetry_perf_unittests, and others.

Could you help me understand what percentage of the 10.12 bots we would want to reimage as 10.13 bots in order to move the GPU tests over to the 10.13 bots? Also, is this really blocked on Issue 871453? We've successfully migrated GPU tests from one OS version to another in the past using the trigger_multiple_dimensions.py script, and it's undergone recent improvements in Issue 886985.

Thanks for your feedback. Assigning to you for your input. Please assign back to me with next steps.
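
One way to frame the reimaging question above is as a proportional split. The sketch below assumes (this is an assumption, not something established in this issue) that every bot has equal capacity and that load per bot should end up equal across the two OS versions once the GPU tests move. The function name and the 75% load share are hypothetical placeholders; only the bot counts (~200 and ~143) come from this thread.

```python
def bots_to_reimage(old_bots, new_bots, new_load_share):
    """Estimate how many old-OS bots to reimage onto the new OS.

    Hypothetical model: the new OS should hold the same share of bots as
    its share of total task load after the migration (new_load_share,
    a fraction between 0 and 1).
    """
    if not 0.0 <= new_load_share <= 1.0:
        raise ValueError("new_load_share must be between 0 and 1")
    total = old_bots + new_bots
    target_new = round(total * new_load_share)
    # Never reimage a negative number of bots, nor more than exist.
    return max(0, min(old_bots, target_new - new_bots))


# Bot counts from this issue; the 0.75 load share is a made-up placeholder.
n = bots_to_reimage(old_bots=200, new_bots=143, new_load_share=0.75)
print(f"reimage {n} of 200 bots ({n / 200:.0%})")
```

The real load share would have to come from Swarming task data, which is the part this comment says is hard to attribute per configuration.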

Blockedon: 913064
Cc: yang...@intel.com
Issue 913064 (sorry, restricted view) describes some dependencies that need to be resolved. As a consequence of fixing that bug, the 10.12 and 10.13 pools will likely be merged and the GPU tests on the Mac Minis migrated to 10.13.

Thanks kbr@. I will keep an eye on this issue for notifications.
Blockedon: 914413
Cc: erikc...@chromium.org jmpittman@chromium.org
In Issue 914413, ~80 Mac Minis were migrated from 10.12.6 to 10.13.6, and it looks like a majority of the Swarming tasks were migrated. Here are usage graphs:

10.12.6:
http://shortn/_Ov5GzLZUPI

10.13.6:
http://shortn/_GJzHHMjOZd

There are now ~220 10.13.6 Mac Minis in comparison to ~120 10.12.6 machines. However, the 10.13 machines are more heavily loaded than the 10.12 ones.

Is there any way to proceed with migrating the remaining CQ jobs from 10.12 to 10.13, and upgrading a majority of the remaining 10.12 machines to 10.13?

Cc: -jmpittman@chromium.org jmpitt...@google.com
Labels: -Pri-2 Pri-1
Raising to P1. During peak hours (Pacific time) we are seeing expirations of the GPU tests that are launched on the Mac Minis by the mac_chromium_rel_ng tryserver:
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac_chromium_rel_ng/221232

because of a lack of capacity on the 10.12.6 machines.

We would very much like to cut these tests over to 10.13, and migrate most if not all of the remaining 10.12 Mac Minis to 10.13.

Cc: jmad...@chromium.org
Is overall Mac capacity insufficient?
10.13 seems busier than 10.12:
https://bugs.chromium.org/p/chromium/issues/detail?id=920434

Comment 11 by jbudorick@chromium.org, Jan 17 (5 days ago)

Issue 923107 has been merged into this issue.

Comment 12 by jbudorick@chromium.org, Jan 17 (5 days ago)

Status: Started (was: Assigned)
