Improve ANGLE dEQP bot wait times by using sharding
Issue description
We can try increasing the shard count here for the gles2/gles3 tests, which should process the workload faster. We could also consider increasing the test batch size to reduce the time spent parsing test names and expectations.
,
Sep 15 2017
Tests seem to be running in 10 or so minutes, and the total bot time is about 20 minutes, a big improvement. The batch size seems to be already 400 so no need to adjust it.
,
Sep 16 2017
,
Sep 16 2017
,
Sep 16 2017
Originally I had this issue just for Android dEQP bots but I'll also look at a couple other bots. In this case the dEQP EGL tests are the tent pole in the dEQP bot times for Linux and Windows, so we can improve that by sharding.
,
Sep 16 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/7dd05b8243f89a1af4927912e5ca9750bdc9bdad

commit 7dd05b8243f89a1af4927912e5ca9750bdc9bdad
Author: Jamie Madill <jmadill@chromium.org>
Date: Sat Sep 16 23:35:25 2017

Increase sharding on ANGLE dEQP tests.

Currently the EGL tests take about 20 minutes on one shard. Putting them to four shards should reduce this to 5-6 minutes. Increasing the shard count for the dEQP GLES 3.1 tests should also reduce the test time from 10 minutes to 6 minutes or so.

BUG=765321
Cq-Include-Trybots: master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Change-Id: Ica757be01dd79fb030002bb7b1136b337701218a
Reviewed-on: https://chromium-review.googlesource.com/669706
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Jamie Madill <jmadill@chromium.org>
Cr-Commit-Position: refs/heads/master@{#502517}

[modify] https://crrev.com/7dd05b8243f89a1af4927912e5ca9750bdc9bdad/content/test/gpu/generate_buildbot_json.py
[modify] https://crrev.com/7dd05b8243f89a1af4927912e5ca9750bdc9bdad/testing/buildbot/chromium.gpu.fyi.json
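The "20 minutes on one shard, 5-6 minutes on four" estimate in the commit message above can be reproduced with simple arithmetic. The sketch below is only an illustration: the function name and the one-minute per-shard overhead are assumptions, not values taken from the bots, and it assumes the tests split evenly across shards.

```python
def estimate_shard_minutes(total_minutes, shard_count, per_shard_overhead=0.0):
    """Estimate the wall time of one shard, assuming an even split of tests.

    per_shard_overhead is a hypothetical fixed cost (setup, test name
    parsing, and so on) that every shard pays regardless of its share.
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    return total_minutes / shard_count + per_shard_overhead


# 20 minutes of EGL tests on 4 shards, with an assumed 1 minute of overhead.
egl_estimate = estimate_shard_minutes(20, 4, per_shard_overhead=1.0)
print(egl_estimate)
```

The fixed overhead term is why the speedup is less than linear in the shard count: doubling the shards halves only the first term.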
,
Sep 21 2017
The following revision refers to this bug:
https://chromium.googlesource.com/angle/angle/+/981f0f8f6a34b6144e573692c6fcab1625a5f67c

commit 981f0f8f6a34b6144e573692c6fcab1625a5f67c
Author: Jamie Madill <jmadill@chromium.org>
Date: Thu Sep 21 15:14:14 2017

Add flag to do a fast pass through perf tests.

This flag will only render the first frame of each perf test, regardless of their preferences for how many seconds to run. It will be useful for speeding up the run time of the perf tests on testing infrastructure that only cares about correctness.

BUG=chromium:725308
BUG=chromium:765321
Change-Id: I926f488c42f27ef23ef06a0159902613cff04080
Reviewed-on: https://chromium-review.googlesource.com/677306
Reviewed-by: Geoff Lang <geofflang@chromium.org>
Commit-Queue: Jamie Madill <jmadill@chromium.org>

[modify] https://crrev.com/981f0f8f6a34b6144e573692c6fcab1625a5f67c/src/tests/angle_perftests_main.cpp
[modify] https://crrev.com/981f0f8f6a34b6144e573692c6fcab1625a5f67c/src/tests/perf_tests/ANGLEPerfTest.cpp
[modify] https://crrev.com/981f0f8f6a34b6144e573692c6fcab1625a5f67c/src/tests/perf_tests/ANGLEPerfTest.h
,
Sep 21 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/cb2ec8306dd6854d928eb3bf88840af0a5fdb983

commit cb2ec8306dd6854d928eb3bf88840af0a5fdb983
Author: Frank Henigman <fjhenigman@chromium.org>
Date: Thu Sep 21 19:19:25 2017

Roll ANGLE 47bf2dc..1f9d684

https://chromium.googlesource.com/angle/angle.git/+log/47bf2dc..1f9d684

BUG=chromium:767279, chromium:725308, chromium:765321, chromium:655534
TEST=bots
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel
Change-Id: I794ad67e2a1ef672855c65eef94959800cba9477
Reviewed-on: https://chromium-review.googlesource.com/677428
Reviewed-by: Geoff Lang <geofflang@chromium.org>
Commit-Queue: Frank Henigman <fjhenigman@chromium.org>
Cr-Commit-Position: refs/heads/master@{#503529}

[modify] https://crrev.com/cb2ec8306dd6854d928eb3bf88840af0a5fdb983/DEPS
,
Sep 26 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/9cddc994b8e58a467d36e2fe7233afc351f3ee01

commit 9cddc994b8e58a467d36e2fe7233afc351f3ee01
Author: Jamie Madill <jmadill@chromium.org>
Date: Tue Sep 26 03:19:21 2017

Add angle_perftests fast run to bots.

The command line flag "--one-frame-only" will force the ANGLE perf tests to stop execution after one iteration. This will help test correctness while giving a much faster run time. Currently they take about ten minutes on the bots, this should hopefully give a 10x improvement for correctness-only testing.

Requires http://crrev.com/c/677306 to land first.

BUG=725308, 765321
Cq-Include-Trybots: master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Change-Id: I9bbbe3a2c3100fd4d43759382cd62ec2f45ab33b
Reviewed-on: https://chromium-review.googlesource.com/677307
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Jamie Madill <jmadill@chromium.org>
Cr-Commit-Position: refs/heads/master@{#504271}

[modify] https://crrev.com/9cddc994b8e58a467d36e2fe7233afc351f3ee01/content/test/gpu/generate_buildbot_json.py
[modify] https://crrev.com/9cddc994b8e58a467d36e2fe7233afc351f3ee01/gpu/angle_perftests_main.cc
[modify] https://crrev.com/9cddc994b8e58a467d36e2fe7233afc351f3ee01/testing/buildbot/chromium.gpu.fyi.json
,
Oct 3 2017
Ken, do you think it's feasible to increase the shard counts for the WebGL 1 and 2 tests on the ANGLE CQ? It's not a very high priority issue (the current total shard time is at most about 11 minutes), but I thought I might ask.
,
Oct 3 2017
Yes, sure. Feel free to put up a CL modifying generate_buildbot_json.py. It's OK that the change will apply globally and not just to the ANGLE CQ.
,
Oct 11 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/dd17a1699b1cb628aaa1479f6bab4d61a73d8f3d

commit dd17a1699b1cb628aaa1479f6bab4d61a73d8f3d
Author: Jamie Madill <jmadill@chromium.org>
Date: Wed Oct 11 17:18:26 2017

Increase webgl conformance shard counts.

This increases the shard count for the webgl conformance tests from one to two, which should halve the 10 minute run time. It also increases the shard count for the webgl 2 conformance ANGLE tests from 15 to 20, which should reduce the time spent from about 9-10 minutes to 6-7.

Hold off on increasing the shard count for the Chrome CQ version of the webgl 2 conformance tests until we can confirm the bots are stable with the increased shards count.

Also fixes the webgl2_conformance_gl_tests, which was accidentally testing webgl 1.

BUG=765321
R=kbr@chromium.org
Cq-Include-Trybots: master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Change-Id: I9ac1a0b649556b4797cc6e9d2f885921d55692ca
Reviewed-on: https://chromium-review.googlesource.com/709460
Commit-Queue: Jamie Madill <jmadill@chromium.org>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Cr-Commit-Position: refs/heads/master@{#508012}

[modify] https://crrev.com/dd17a1699b1cb628aaa1479f6bab4d61a73d8f3d/content/test/gpu/generate_buildbot_json.py
[modify] https://crrev.com/dd17a1699b1cb628aaa1479f6bab4d61a73d8f3d/testing/buildbot/chromium.gpu.fyi.json
,
Oct 23 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/778426a57df0f89eaa432f417b6ed8d94ff45448

commit 778426a57df0f89eaa432f417b6ed8d94ff45448
Author: Jamie Madill <jmadill@chromium.org>
Date: Mon Oct 23 01:25:48 2017

Increase shard counts for WebGL tests.

Currently webgl_conformance_tests run on a single shard on desktop and on 6 shards on Android. The WebGL 2 tests run on 15 shards. This updates the shard count on desktop WebGL to 2 which should halve the time, and 20 for the WebGL 2 conformance tests, which should reduce the time from 10-12 minutes to 6-7.

This should reduce cycle time on the GPU optional try servers as well as ANGLE try servers.

Bug: 765321
Cq-Include-Trybots: master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Change-Id: I13f0cbdce0fe1957cfa1e70a684c95c0be9b6c33
Reviewed-on: https://chromium-review.googlesource.com/732619
Commit-Queue: Jamie Madill <jmadill@chromium.org>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Cr-Commit-Position: refs/heads/master@{#510718}

[modify] https://crrev.com/778426a57df0f89eaa432f417b6ed8d94ff45448/content/test/gpu/generate_buildbot_json.py
[modify] https://crrev.com/778426a57df0f89eaa432f417b6ed8d94ff45448/testing/buildbot/chromium.gpu.fyi.json
[modify] https://crrev.com/778426a57df0f89eaa432f417b6ed8d94ff45448/testing/buildbot/chromium.gpu.json
[modify] https://crrev.com/778426a57df0f89eaa432f417b6ed8d94ff45448/testing/buildbot/client.v8.fyi.json
,
Oct 26 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/3da46d55a65137622d50159bb69314a968da6ebc

commit 3da46d55a65137622d50159bb69314a968da6ebc
Author: Jamie Madill <jmadill@chromium.org>
Date: Thu Oct 26 03:09:51 2017

Update WebGL 2 json test timing results.

This should help the tests more accurately divide the tests between the shards to get a balanced runtime, and ensure no single shard dominates the runtime of the entire test suite.

Also update the gather script to run on Windows.

Bug: 765321
Cq-Include-Trybots: master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Change-Id: I368b3b2844e5a001f19e250fef0be05945f09903
Reviewed-on: https://chromium-review.googlesource.com/732728
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Jamie Madill <jmadill@chromium.org>
Cr-Commit-Position: refs/heads/master@{#511718}

[modify] https://crrev.com/3da46d55a65137622d50159bb69314a968da6ebc/content/test/data/gpu/webgl2_conformance_tests_output.json
[modify] https://crrev.com/3da46d55a65137622d50159bb69314a968da6ebc/content/test/gpu/gather_swarming_json_results.py
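To illustrate how recorded timings can be used to balance shards, here is a minimal sketch of a greedy "longest test first" partition. This is not the algorithm in run_gpu_integration_test.py, which the thread does not describe; the function name, the test names, and the timing values are all hypothetical.

```python
import heapq


def partition_by_timing(timings, total_shards):
    """Assign tests to shards, longest first, always to the lightest shard.

    timings: dict mapping test name -> recorded seconds (hypothetical data).
    Returns a list indexed by shard with (total_seconds, [test names]).
    """
    if total_shards < 1:
        raise ValueError("total_shards must be at least 1")
    # Min-heap of (current load, shard index) so the lightest shard pops first.
    heap = [(0.0, index) for index in range(total_shards)]
    heapq.heapify(heap)
    shards = [[] for _ in range(total_shards)]
    totals = [0.0] * total_shards
    # Sort by descending time; break ties by name for a deterministic result.
    ordered = sorted(timings.items(), key=lambda item: (-item[1], item[0]))
    for name, seconds in ordered:
        load, index = heapq.heappop(heap)
        shards[index].append(name)
        totals[index] = load + seconds
        heapq.heappush(heap, (totals[index], index))
    return list(zip(totals, shards))


# Hypothetical recorded timings, in seconds.
example_timings = {"a": 300, "b": 200, "c": 150, "d": 100, "e": 50}
for shard_index, (seconds, tests) in enumerate(partition_by_timing(example_timings, 2)):
    print(shard_index, seconds, tests)
```

A partition like this is only as good as its input: if the recorded timings are stale, the predicted loads balance while the real runtimes do not, which is the motivation for refreshing the timing file.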
,
Oct 26 2017
Previous test timing results were updated with gather_swarming_json_results with the default parameters. By default it pulls from chromium.gpu.fyi, Linux Release (NVIDIA), webgl2_conformance_tests.
,
Oct 27 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/ef4202cb0c7f86b36747c89c9d277f37e42b7763

commit ef4202cb0c7f86b36747c89c9d277f37e42b7763
Author: Jamie Madill <jmadill@chromium.org>
Date: Fri Oct 27 02:32:45 2017

Update WebGL conformance test timings.

The test timings were somewhat out of date, which was causing a time delta between the 1st and 2nd shard to be 3 vs 6-7 minutes. Updating the timings should bring them more in line.

Command line:
python content\test\gpu\gather_swarming_json_results.py --step webgl_conformance_tests --build 51587 --output content/test/data/gpu/webgl_conformance_tests_output.json

TBR=kbr@chromium.org
Bug: 765321
Change-Id: Ic005b40109b5551ba95cc3b1bedbc7c7b4ee962a
Reviewed-on: https://chromium-review.googlesource.com/738845
Reviewed-by: Jamie Madill <jmadill@chromium.org>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Jamie Madill <jmadill@chromium.org>
Cr-Commit-Position: refs/heads/master@{#512064}

[modify] https://crrev.com/ef4202cb0c7f86b36747c89c9d277f37e42b7763/content/test/data/gpu/webgl_conformance_tests_output.json
,
Oct 28 2017
Ken, the WebGL 1 tests seem to still have a delta as large as 4 minutes (3 minute vs 7 minute times between shards 0 and 1 in one run I saw). Can you point me to the debugging/logging method for the test partitioning? I wonder if there's some minor error that's tripping up the method, not sure how it's done.
,
Oct 28 2017
Example of a 3 minute vs. 7 minute run:
Shard 0: https://chromium-swarm.appspot.com/task?id=397c235388a00810&refresh=10&show_raw=1
Shard 1: https://chromium-swarm.appspot.com/task?id=397c2354f4810910&refresh=10&show_raw=1
,
Oct 30 2017
Here are the flags:

./content/test/gpu/run_gpu_integration_test.py webgl_conformance --browser=canary --total-shards=2 --shard-index=0 --read-abbreviated-json-results-from=content/test/data/gpu/webgl_conformance_tests_output.json --debug-shard-distributions

Mainly, you need to specify the abbreviated json results file and --debug-shard-distributions.
,
Nov 16 2017
Thanks. I did some investigation. The algorithm seems to be okay, but possibly the accounting is incorrect, or the actual runtime differs significantly from the recorded runtime, or it depends on test ordering. It doesn't seem to be the most pressing issue to fix, so I will defer further investigation there.
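One way to check the "actual runtime differs from recorded runtime" hypothesis is to sum both sets of timings over the same shard assignment and compare. The sketch below is an assumption about how such a check could look, not a tool from the Chromium tree; the helper names, test names, and numbers are hypothetical and chosen to mirror the 3 minute vs. 7 minute split reported above.

```python
def shard_totals(assignment, timings):
    """Sum seconds per shard; tests missing from timings count as 0."""
    return [sum(timings.get(name, 0) for name in tests) for tests in assignment]


def report_skew(assignment, recorded, actual):
    """Return (shard index, predicted, observed, observed - predicted) rows."""
    predicted = shard_totals(assignment, recorded)
    observed = shard_totals(assignment, actual)
    return [
        (index, p, o, o - p)
        for index, (p, o) in enumerate(zip(predicted, observed))
    ]


# Hypothetical data: the recorded timings predict two balanced 180 s shards,
# but two tests on shard 1 now run much longer than recorded.
assignment = [["a", "b"], ["c", "d"]]
recorded = {"a": 100, "b": 80, "c": 90, "d": 90}
actual = {"a": 100, "b": 80, "c": 200, "d": 220}
for row in report_skew(assignment, recorded, actual):
    print(row)
```

A large positive delta on one shard would point at stale or mis-accounted timings rather than at the partitioning step itself.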
,
Nov 16 2017
,
Nov 16 2017
Comment 1 by bugdroid1@chromium.org
, Sep 14 2017