angle_perftests timing out on chromium.perf/Win 7 ATI GPU Perf
Issue description: angle_perftests failing on chromium.perf/Win 7 ATI GPU Perf
Builders failed on:
- Win 7 ATI GPU Perf: https://build.chromium.org/p/chromium.perf/builders/Win%207%20ATI%20GPU%20Perf
Going to disable the test and kick off a bisect.
,
Nov 15 2017
That's unfortunate. How did this happen? The logs cut off after the first couple of tests.
,
Nov 15 2017
I suspect that the test is getting stuck. Then, when the swarming infra kills it, it tries to flush the output buffer, which is why only some incomplete logs show up.
,
Nov 15 2017
Note that the prior run took 36 minutes to complete: https://chromium-swarm.appspot.com/task?id=39d5be1cb0748610&refresh=10&show_raw=1
And the new run is timing out after 10 minutes. Is this possibly related to https://chromium-review.googlesource.com/761402 ? (Make the tests isolated and deterministic)
,
Nov 15 2017
Note the above CL is in the regression range for the broken build: https://build.chromium.org/p/chromium.perf/builders/Win%207%20ATI%20GPU%20Perf/builds/1454
,
Nov 15 2017
jmadill@, if the test is timing out after 10 minutes, my guess is that it's hitting the I/O timeout rather than the hard timeout. The I/O timeout triggers when swarming hasn't seen any additional output from the test for 10 minutes; swarming then assumes the test is stuck and kills it. You can do one of two things:
1) Make the test issue some heartbeat output while it's still running
2) Add an I/O timeout override to angle_perftests here: https://cs.chromium.org/chromium/src/tools/perf/core/perf_data_generator.py?type=cs&sq=package:chromium&q=package:%5E(chromium)$+file:(/%7C%5E)core/perf_data_generator(%5C.(swig%7Cpy%7Cspt)$%7C/(__init__%5C.(swig%7Cpy%7Cspt))?$)&l=849
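For illustration only, option 1 could look roughly like the wrapper below. This is a sketch, not how angle_perftests or swarming actually does it: `run_with_heartbeat`, its parameters, and the `[heartbeat]` line format are all made up for this example. It runs a command and writes a line at a fixed interval so that something watching for output keeps seeing activity.

```python
import subprocess
import sys
import threading
import time


def run_with_heartbeat(cmd, interval_s=60.0, out=sys.stdout):
    """Run cmd and write a heartbeat line every interval_s seconds.

    Hypothetical helper for illustration; returns the child's exit code.
    """
    done = threading.Event()

    def beat():
        start = time.monotonic()
        # Event.wait() returns False on timeout and True once `done` is set,
        # so the loop ends as soon as the child process has finished.
        while not done.wait(interval_s):
            elapsed = time.monotonic() - start
            out.write("[heartbeat] still running after %.0fs\n" % elapsed)
            out.flush()  # Flush so the line is visible immediately.

    thread = threading.Thread(target=beat, daemon=True)
    thread.start()
    try:
        return subprocess.call(cmd)
    finally:
        done.set()
        thread.join()
```

One caveat with a wrapper like this: it keeps printing even if the child is genuinely hung, so it trades the I/O timeout for the hard timeout. A heartbeat emitted by the test itself, as suggested above, would not have that problem.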
,
Nov 15 2017
Reverting https://chromium-review.googlesource.com/761402 : https://chromium-review.googlesource.com/c/chromium/src/+/772190 It seems very likely that my change is the culprit since my change makes it so that tests don't run in parallel so they will take longer.
,
Nov 15 2017
Before my change: https://chromium-swarm.appspot.com/task?id=39d257a5b44a5d10&refresh=10&show_raw=1 After my change: https://chromium-swarm.appspot.com/task?id=39d7e17e88b43510&refresh=10&show_raw=1 The addition of --single-process-tests is clear.
,
Nov 15 2017
The strange thing is that angle-perf-tests was already using the --single-process-tests flag for some of its subtests... I guess if the revert doesn't fix this then we'll know my change wasn't the culprit.
,
Nov 15 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/e9a70b2c096050415e2d10b711d9552360b57f5f

commit e9a70b2c096050415e2d10b711d9552360b57f5f
Author: Jamie Madill <jmadill@chromium.org>
Date: Wed Nov 15 19:38:51 2017

Run angle_perftests on Windows AMD and Intel.

This will prevent regressions on the perf bots.

BUG= 785291
TBR=kbr@chromium.org

Cq-Include-Trybots: master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Change-Id: I58e435466cef8450fe04f1775577dcabb84523e7
Reviewed-on: https://chromium-review.googlesource.com/771547
Commit-Queue: Jamie Madill <jmadill@chromium.org>
Reviewed-by: Jamie Madill <jmadill@chromium.org>
Cr-Commit-Position: refs/heads/master@{#516787}

[modify] https://crrev.com/e9a70b2c096050415e2d10b711d9552360b57f5f/content/test/gpu/generate_buildbot_json.py
[modify] https://crrev.com/e9a70b2c096050415e2d10b711d9552360b57f5f/testing/buildbot/chromium.gpu.fyi.json
,
Nov 15 2017
=== BISECT JOB RESULTS ===
NO Test failure found

Bisect Details
  Configuration: winx64ati_perf_bisect
  Benchmark    : angle_perftests
  Metric       : BitSetIteratorPerf_run/score

Revision           Exit Code     N
chromium@516439    0 +- N/A      2    good
chromium@516536    0 +- N/A      2    bad

To Run This Test
  .\src\out\Release_x64\angle_perftests.exe --test-launcher-print-test-stdio=always --test-launcher-jobs=1

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8962844784548974896

For feedback, file a bug with component Speed>Bisection
,
Nov 15 2017
So it seems there were timeouts even on successful builds: https://chromium-swarm.appspot.com/task?id=39d5be1cb0748610&refresh=10&show_raw=1

Here 3 tests timed out:
[49/86] InterleavedAttributeDataBenchmark.Run/d3d11_9_3 (TIMED OUT)
[52/86] LinkProgramBenchmark.Run/d3d9 (TIMED OUT)
[58/86] PointSpritesBenchmark.Run/d3d9_10_3px_3vars (TIMED OUT)

However, the tests are retried later in the run and all pass, which is why no error was reported:
[90/90] InterleavedAttributeDataBenchmark.Run/d3d11_9_3 (5068 ms)

My guess is that https://chromium-review.googlesource.com/761402 changed the tests to run in single-process mode, which then meant the timeouts caused the IO to fail entirely.

These tests were timing out as far back as I could see: https://chromium-swarm.appspot.com/task?id=38f0562ab579c510&refresh=10&show_raw=1
I even went back to build 800, where I could see one test timing out while some others were not.

It might be related to some kind of driver bug. The drivers for the AMD perf bots are slightly different from those on the mainline Chromium try bots. I could not repro the timeouts locally, but I can try logging into the bot. I'm also not sure whether the OS version matters; I was trying on Win 10.

Perf bots driver version: 21.19.137.1 (9-16-2016)
GPU bots driver version: 21.19.407.0 (12-23-2016)

Unfortunately, the differences in how we run on our CQ might be why this timeout does not repro there. If the timeouts repro on the bot, I am going to try upgrading the drivers. I won't be able to roll back the driver version without help since I don't have the older version, but I think we can try upgrading and see if it works (the drivers are a bit old by now). If I can't repro even on the bot, I'm unsure what action to take.

Also, I think we should disable automatic retries for failing tests: we want to catch any and all flakiness immediately.
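A quick way to spot this pattern in a task log is to scan for result lines and report the final status of every test that timed out at least once. The sketch below is only an illustration: the regular expression is inferred from the handful of lines quoted above rather than from the test launcher's actual output format, and `summarize_timeouts` is a made-up helper name.

```python
import re

# Matches lines such as "[49/86] Suite.Test/param (TIMED OUT)" or
# "[90/90] Suite.Test/param (5068 ms)". The format is assumed from the
# lines quoted in this comment.
_RESULT_RE = re.compile(r"\[\d+/\d+\]\s+(\S+)\s+\((TIMED OUT|\d+ ms)\)")


def summarize_timeouts(log_text):
    """Return {test_name: last_status} for tests that timed out at least once."""
    timed_out = set()
    last_status = {}
    for name, status in _RESULT_RE.findall(log_text):
        if status == "TIMED OUT":
            timed_out.add(name)
        # Later lines overwrite earlier ones, so a retry result wins.
        last_status[name] = status
    return {name: last_status[name] for name in sorted(timed_out)}


log = """\
[49/86] InterleavedAttributeDataBenchmark.Run/d3d11_9_3 (TIMED OUT)
[52/86] LinkProgramBenchmark.Run/d3d9 (TIMED OUT)
[58/86] PointSpritesBenchmark.Run/d3d9_10_3px_3vars (TIMED OUT)
[90/90] InterleavedAttributeDataBenchmark.Run/d3d11_9_3 (5068 ms)
"""
for name, status in summarize_timeouts(log).items():
    print(name, "->", status)
```

Fed only the four lines quoted here, it reports the InterleavedAttributeDataBenchmark retry as passing and the other two as still timed out, because their retry lines are not included in the excerpt.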
,
Nov 15 2017
+1 to disable automatic retries for failing tests. It's best to deal with flakiness immediately rather than hiding it
,
Nov 15 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/ac9bfc69a284300117f8a718fc05462364e0c867

commit ac9bfc69a284300117f8a718fc05462364e0c867
Author: Charlie Andrews <charliea@chromium.org>
Date: Wed Nov 15 20:27:26 2017

Disable angle_perftests on Win 7 ATI GPU perf

The test is timing out (see bug).

TBR=nednguyen@google.com, jmadill@chromium.org

Bug: 785291
Change-Id: I4665c454b34cd791998402353766d54b3022d75a
Reviewed-on: https://chromium-review.googlesource.com/771971
Commit-Queue: Ned Nguyen <nednguyen@google.com>
Reviewed-by: Charlie Andrews <charliea@chromium.org>
Cr-Commit-Position: refs/heads/master@{#516808}

[modify] https://crrev.com/ac9bfc69a284300117f8a718fc05462364e0c867/testing/buildbot/chromium.perf.json
[modify] https://crrev.com/ac9bfc69a284300117f8a718fc05462364e0c867/tools/perf/core/perf_data_generator.py
,
Nov 15 2017
Seems we can just add the flag --test-launcher-retry-limit=0 to disable retries. We need a good way to test this before submitting it though, since I bet a lot of things will break. +Dirk enabled retries by default for issue 402089.
,
Nov 15 2017
I was thinking of just setting that flag for angle_perftests, rather than everything. Seems lower risk at least.
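As a rough sketch of what "just for angle_perftests" means, the flag could be attached to one test through a per-test table of extra arguments. Everything here apart from the flag names is hypothetical: `EXTRA_ARGS_BY_TEST` and `build_command` are illustrative only, and the real generator scripts are structured differently.

```python
# Hypothetical per-test extra arguments. Only angle_perftests opts out of
# retries; every other test keeps the launcher's default behavior.
EXTRA_ARGS_BY_TEST = {
    "angle_perftests": ["--test-launcher-retry-limit=0"],
}


def build_command(test_name, base_args):
    """Return the argument list for test_name with any per-test extras appended."""
    return [test_name] + list(base_args) + EXTRA_ARGS_BY_TEST.get(test_name, [])


print(build_command("angle_perftests", ["--test-launcher-print-test-stdio=always"]))
print(build_command("gpu_perftests", ["--test-launcher-print-test-stdio=always"]))
```

Scoping the flag this way keeps the change limited to the one suite where hidden flakiness was observed.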
,
Nov 15 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/ee00f335cb479d1bb4f18739369aee3107586320

commit ee00f335cb479d1bb4f18739369aee3107586320
Author: Charlie Andrews <charliea@chromium.org>
Date: Wed Nov 15 22:31:03 2017

Disable gpu_perftests on Android One

It's been failing since at least September 20, and no one can figure out why.

TBR=nednguyen@google.com, reveman@chromium.org

Bug: 785291
Change-Id: I57a1ca702ba52e56c5848216864aba0f794c10f2
Reviewed-on: https://chromium-review.googlesource.com/772033
Commit-Queue: Charlie Andrews <charliea@chromium.org>
Reviewed-by: Charlie Andrews <charliea@chromium.org>
Cr-Commit-Position: refs/heads/master@{#516864}

[modify] https://crrev.com/ee00f335cb479d1bb4f18739369aee3107586320/testing/buildbot/chromium.perf.json
[modify] https://crrev.com/ee00f335cb479d1bb4f18739369aee3107586320/tools/perf/core/perf_data_generator.py
,
Nov 16 2017
issue 785554 filed for disabling retries.
,
Nov 17 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/5097c6d9c94b45f578f7f1d96036336ab0a16b68

commit 5097c6d9c94b45f578f7f1d96036336ab0a16b68
Author: Jamie Madill <jmadill@chromium.org>
Date: Fri Nov 17 04:26:32 2017

Revert "Run angle_perftests on Windows AMD and Intel."

This reverts commit e9a70b2c096050415e2d10b711d9552360b57f5f.

Reason for revert:
Seems to break on non-swarming configs:
https://build.chromium.org/p/chromium.gpu.fyi/builders/Win10%20Release%20%28Intel%20HD%20630%29/builds/1038
https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20Release%20%28AMD%20R7%20240%29/builds/1773

Bug: 785291

Original change's description:
> Run angle_perftests on Windows AMD and Intel.
>
> This will prevent regressions on the perf bots.
>
> BUG= 785291
> TBR=kbr@chromium.org
>
> Cq-Include-Trybots: master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
> Change-Id: I58e435466cef8450fe04f1775577dcabb84523e7
> Reviewed-on: https://chromium-review.googlesource.com/771547
> Commit-Queue: Jamie Madill <jmadill@chromium.org>
> Reviewed-by: Jamie Madill <jmadill@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#516787}

TBR=jmadill@chromium.org,kbr@chromium.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Bug: 785291
Change-Id: I629eb2aced90fe2c9d4a72f003d17679fe267fe3
Cq-Include-Trybots: master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Reviewed-on: https://chromium-review.googlesource.com/775793
Reviewed-by: Jamie Madill <jmadill@chromium.org>
Commit-Queue: Jamie Madill <jmadill@chromium.org>
Cr-Commit-Position: refs/heads/master@{#517297}

[modify] https://crrev.com/5097c6d9c94b45f578f7f1d96036336ab0a16b68/content/test/gpu/generate_buildbot_json.py
[modify] https://crrev.com/5097c6d9c94b45f578f7f1d96036336ab0a16b68/testing/buildbot/chromium.gpu.fyi.json
,
Aug 1
,
Aug 6
Comment 1 by 42576172...@developer.gserviceaccount.com
, Nov 15 2017