gpu_rasterization_tests: GpuRasterization.BlueBox flaky on new Win/AMD hardware
Issue description

Started failing here:
https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20Release%20%28ATI%29/builds/24070

Several other flaky builds:
https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20Debug%20%28ATI%29/builds/6143
https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20Release%20%28ATI%29/builds/24084
https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20Release%20%28ATI%29/builds/24104
https://build.chromium.org/p/tryserver.chromium.angle/builders/win_angle_rel_ng/builds/2427

Unsure if this affects only the new hardware/drivers.

(INFO) 2016-10-06 03:23:30,663 browser.DumpStateUponFailure:350 *************** BROWSER STANDARD OUTPUT ***************
Can't get standard output with --show-stdout
(INFO) 2016-10-06 03:23:30,663 browser.DumpStateUponFailure:352
(INFO) 2016-10-06 03:23:30,663 browser.DumpStateUponFailure:355 *********** END OF BROWSER STANDARD OUTPUT ************
(INFO) 2016-10-06 03:23:30,663 browser.DumpStateUponFailure:357 ********************* BROWSER LOG *********************
(INFO) 2016-10-06 03:23:30,663 browser.DumpStateUponFailure:359 No log file
(INFO) 2016-10-06 03:23:30,663 browser.DumpStateUponFailure:362 ***************** END OF BROWSER LOG ******************
(WARNING) 2016-10-06 03:23:30,663 shared_page_state.DumpStateUponFailure:151 Taking screenshots upon failures disabled.

Traceback (most recent call last):
  File "c:\b\swarm_slave\w\irflsq4f\third_party\catapult\telemetry\telemetry\internal\story_runner.py", line 86, in _RunStoryAndProcessErrorIfNeeded
    state.RunStory(results)
  File "c:\b\swarm_slave\w\irflsq4f\third_party\catapult\common\py_trace_event\py_trace_event\trace_event_impl\decorators.py", line 52, in traced_function
    return func(*args, **kwargs)
  File "c:\b\swarm_slave\w\irflsq4f\content\test\gpu\gpu_tests\gpu_test_base.py", line 111, in RunStory
    RunStoryWithRetries(GpuSharedPageState, self, results)
  File "c:\b\swarm_slave\w\irflsq4f\content\test\gpu\gpu_tests\gpu_test_base.py", line 72, in RunStoryWithRetries
    super(cls, shared_page_state).RunStory(results)
  File "c:\b\swarm_slave\w\irflsq4f\third_party\catapult\common\py_trace_event\py_trace_event\trace_event_impl\decorators.py", line 52, in traced_function
    return func(*args, **kwargs)
  File "c:\b\swarm_slave\w\irflsq4f\third_party\catapult\telemetry\telemetry\page\shared_page_state.py", line 312, in RunStory
    self._current_page, self._current_tab, results)
  File "c:\b\swarm_slave\w\irflsq4f\third_party\catapult\common\py_trace_event\py_trace_event\trace_event_impl\decorators.py", line 52, in traced_function
    return func(*args, **kwargs)
  File "c:\b\swarm_slave\w\irflsq4f\content\test\gpu\gpu_tests\gpu_rasterization.py", line 65, in ValidateAndMeasurePage
    device_pixel_ratio)
  File "c:\b\swarm_slave\w\irflsq4f\third_party\catapult\common\py_trace_event\py_trace_event\trace_event_impl\decorators.py", line 52, in traced_function
    return func(*args, **kwargs)
  File "c:\b\swarm_slave\w\irflsq4f\content\test\gpu\gpu_tests\cloud_storage_test_base.py", line 264, in _ValidateScreenshotSamples
    self.options.test_machine_name)
  File "c:\b\swarm_slave\w\irflsq4f\content\test\gpu\gpu_tests\cloud_storage_test_base.py", line 88, in _CompareScreenshotSamples
    str(actual_color.b) + "]")
Failure: Expected pixel at [5, 5] (actual pixel (5, 5)) to be [0, 128, 0] but got [0, 0, 0]

[ FAILED ] GpuRasterization.BlueBox (18605 ms)
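For readers unfamiliar with this kind of failure, here is a minimal sketch of a per-sample screenshot color check. It is not the code in cloud_storage_test_base.py; the `Color` type, the helper names, and the tolerance of 3 are assumptions made for illustration. Only the expected color [0, 128, 0], the observed color [0, 0, 0], and the sample location [5, 5] come from the log above.

```python
from collections import namedtuple

# Hypothetical RGB sample type; the real harness has its own color class.
Color = namedtuple("Color", ["r", "g", "b"])


def color_matches(actual, expected, tolerance):
    """True when every channel is within `tolerance` of the expected value."""
    return all(abs(a - e) <= tolerance for a, e in zip(actual, expected))


def check_sample(location, actual, expected, tolerance=0):
    """Raise AssertionError when a sampled pixel does not match."""
    if not color_matches(actual, expected, tolerance):
        raise AssertionError(
            "Expected pixel at %s to be %s but got %s"
            % (list(location), list(expected), list(actual)))


expected = Color(0, 128, 0)  # color the test expected at this sample
actual = Color(0, 0, 0)      # black, as read back on the failing bot
try:
    check_sample((5, 5), actual, expected, tolerance=3)  # tolerance is assumed
    message = ""
except AssertionError as e:
    message = str(e)
```

A fully black sample fails the check at any small tolerance, which is consistent with content not being drawn at all rather than being drawn slightly off-color.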
,
Oct 6 2016
Issue 653365 has been merged into this issue.
,
Oct 6 2016
Thanks for finding and suppressing the flaky failure Jamie. Unfortunately it looks like the longstanding D3D driver problem which affects ANGLE's glReadPixels implementation still isn't fixed.
,
Oct 7 2016
Jamie pointed out in a separate thread that these tests were reliable on the older AMD GPUs that were just replaced, so it's unlikely to be the same bug. Additionally, I'd forgotten that these tests read back their pixel results in a completely different way than the WebGL conformance tests. It seems likely there's a bug in GPU rasterization on this AMD hardware. The symptom looks like the CSS 3D-transformed div is occasionally not being rendered. Someone should try to reproduce this locally on similar hardware:

./content/test/gpu/run_gpu_test.py gpu_rasterization --story-filter=GpuRasterization.BlueBox --pageset-repeat=100 --max-failures=1
,
Oct 7 2016
My suppression didn't seem to do the trick, test is still failing: https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20Release%20%28ATI%29/builds/24135 Can someone take a look at this next week? I'll be out for a while.
,
Oct 7 2016
Thanks for trying. I'll take it.
,
Oct 7 2016
I wonder if Issue 546575 could be related.
,
Oct 7 2016
GPU rasterization actually appears really broken with the latest AMD drivers in general. I'm guessing this is causing the flakes (the test flakes for me with this change, but a lot of other things are wrong as well). I'm getting black bars all over Chrome. I wasn't seeing this with previous drivers (had to update). I'm currently on AMD Radeon version 16.9.2 on an AMD E2-7110 APU. Given that I have a machine that repros this, I'm happy to investigate.
,
Oct 8 2016
Wow. That looks pretty bad. That's surely the cause of the test failure. Thanks Eric for agreeing to take this. Let me assign it to you. A revised suppression for the failing test is in the CQ in https://codereview.chromium.org/2400373002 , btw. Please tell me if you'd like me to put you in touch with our contacts at AMD if you triage this to a driver bug (almost surely, given the regression).
,
Oct 8 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/2fc6108a69d4566e51b994d56de6574257330684

commit 2fc6108a69d4566e51b994d56de6574257330684
Author: kbr <kbr@chromium.org>
Date: Sat Oct 08 02:10:23 2016

Fix GpuRasterization.BlueBox expectation.

Strengthen logic and add unit test to prevent accidental string/int confusion.

BUG=653538
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=zmo@chromium.org

Review-Url: https://codereview.chromium.org/2400373002
Cr-Commit-Position: refs/heads/master@{#424045}

[modify] https://crrev.com/2fc6108a69d4566e51b994d56de6574257330684/content/test/gpu/gpu_tests/gpu_rasterization_expectations.py
[modify] https://crrev.com/2fc6108a69d4566e51b994d56de6574257330684/content/test/gpu/gpu_tests/gpu_test_expectations.py
[modify] https://crrev.com/2fc6108a69d4566e51b994d56de6574257330684/content/test/gpu/gpu_tests/gpu_test_expectations_unittest.py
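The commit message does not say which value was confused. As an illustration only, here is a sketch of the kind of type check that guards against it, assuming the confusion was a numeric field (for example a GPU device ID) written as a string so that it never compared equal to the int reported by the machine. `validate_gpu_condition` and the ID 0x1234 are hypothetical and are not taken from gpu_test_expectations.py.

```python
def validate_gpu_condition(condition):
    """Validate a hypothetical (vendor, device_id) expectation condition.

    Rejects a device ID given as a string such as "0x1234", which would
    silently never match an integer device ID at runtime.
    """
    vendor, device_id = condition
    if not isinstance(vendor, str):
        raise TypeError("vendor must be a string, got %r" % (vendor,))
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(device_id, bool) or not isinstance(device_id, int):
        raise TypeError("device_id must be an int, got %r" % (device_id,))
    return vendor.lower(), device_id


ok = validate_gpu_condition(("AMD", 0x1234))  # placeholder ID, not real hardware

try:
    validate_gpu_condition(("amd", "0x1234"))  # string instead of int
    rejected = False
except TypeError:
    rejected = True
```

Failing loudly when the expectation is constructed is what makes a unit test useful here: a condition that merely never matches looks like a suppression that "didn't do the trick".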
,
Oct 10 2016
So, from what I can tell this issue persists pretty far back (checked a Jan. build) with Gpu Raster enabled. Additionally, I tried capturing a trace with Visual Studio, but the issue disappears while capturing frames in VS. This makes me suspect that we're dealing with a synchronization issue.
,
Oct 10 2016
Adding additional glFinish calls around potentially problematic sync points doesn't help, but using the D3D9 ANGLE backend does fix this, so it's likely a D3D11 issue.
,
Oct 11 2016
That's quite disconcerting. Please talk directly with geofflang@ tomorrow about this.
,
Oct 12 2016
Discussed with geofflang@, he will look into this tomorrow. I was able to capture a trace which shows the error. You can find it attached (capture.vsglog). The trace can be played back using DXCap.exe on Windows. If played back with an ATI card on the latest drivers, the final frame will include blue in it (see screenshot_1). If played back with WARP, the screenshot shows the correct pixels (screenshot_2). To play this back locally you can run:

DXCap.exe -p capture.vsglog

or

DXCap.exe -p capture.vsglog -warp

The fact that this DX playback differs between WARP and the driver almost certainly indicates a driver bug. We should forward this trace along to someone at AMD. The capture can also be opened in Visual Studio, which allows you to browse through all the DX commands that generated the frame. Note that when debugging under VS you won't see the error visually.
,
Oct 12 2016
Excellent work Eric. Bonus points for using a cute cat picture in the test case.
,
Oct 12 2016
Fyi - the issue seems to have appeared in the 16.7.3 driver - 16.7.2 doesn't show the bug, but 16.7.3+ does.
,
Oct 12 2016
Dug into this a bit today but not many results. It appears draw calls or texture data are getting corrupted on this driver but it's hard to narrow it down to where the corruption first happens because all the playback tools use software emulation. AMD PerfStudio and RenderDoc also don't seem to be able to capture anything despite being able to connect to Chrome. I think the next action is to forward the capture file to AMD as it is almost certainly a driver bug.
,
Oct 12 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/ef28a21704b6ecfed0b6adb96197b383a91a2fd9

commit ef28a21704b6ecfed0b6adb96197b383a91a2fd9
Author: ericrk <ericrk@chromium.org>
Date: Wed Oct 12 21:26:41 2016

Blacklist problematic AMD drivers for GPU raster

Adding the problematic driver versions to the GPU raster blacklist.

R=jbauman
BUG=653538
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2415663002
Cr-Commit-Position: refs/heads/master@{#424862}

[modify] https://crrev.com/ef28a21704b6ecfed0b6adb96197b383a91a2fd9/gpu/config/software_rendering_list_json.cc
,
Oct 22 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/516005ea9c2ffe59b5d468bf4673a549996010df

commit 516005ea9c2ffe59b5d468bf4673a549996010df
Author: kbr <kbr@chromium.org>
Date: Sat Oct 22 04:33:50 2016

Remove GpuRasterization.BlueBox suppression for Win/AMD.

GPU rasterization has been disabled on this configuration.

BUG=653538
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=zmo@chromium.org

Review-Url: https://chromiumcodereview.appspot.com/2442963002
Cr-Commit-Position: refs/heads/master@{#426972}

[modify] https://crrev.com/516005ea9c2ffe59b5d468bf4673a549996010df/content/test/gpu/gpu_tests/gpu_rasterization_expectations.py
,
Oct 22 2016
Marking ExternalDependency until a driver's been released fixing this issue.
,
Oct 24 2016
If "GPU rasterization has been disabled on this configuration", why does it fail here: https://build.chromium.org/p/tryserver.chromium.angle/builders/win_angle_rel_ng/builds/2567?
,
Oct 24 2016
Do the tests enable GPU rasterization via flags? I believe that still overrides the blacklist - maybe we still need the suppression.
,
Oct 24 2016
Yes, I see "--force-gpu-rasterization" in the log. Will revert https://chromiumcodereview.appspot.com/2442963002 then.
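The interaction described in the last two comments can be sketched as a small decision function. This is a simplification under stated assumptions: `gpu_rasterization_enabled` is a hypothetical name, and the real decision in Chrome takes more inputs than shown. Only the --force-gpu-rasterization flag and the fact that it overrides the blacklist come from this thread.

```python
FORCE_FLAG = "--force-gpu-rasterization"


def gpu_rasterization_enabled(driver_blacklisted, command_line_flags):
    """Hypothetical model: the force flag wins over the driver blacklist."""
    if FORCE_FLAG in command_line_flags:
        return True
    return not driver_blacklisted


# A user on a blacklisted driver gets software raster...
user_case = gpu_rasterization_enabled(True, [])
# ...but a test run that passes the force flag still exercises GPU raster,
# so the test can keep failing until the driver itself is fixed.
bot_case = gpu_rasterization_enabled(True, [FORCE_FLAG])
```

Under this model the blacklist protects users but does nothing for a test that forces the feature on, which is why the expectation-level suppression is still needed.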
,
Oct 24 2016
Oh, you already did that. Thanks!
,
Oct 24 2016
Oops. Sorry about that.
,
Oct 24 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/02503a0d49c9335c434730dc77e2d38c620f2310

commit 02503a0d49c9335c434730dc77e2d38c620f2310
Author: ericrk <ericrk@chromium.org>
Date: Mon Oct 24 21:05:49 2016

Revert of Remove GpuRasterization.BlueBox suppression for Win/AMD. (patchset #1 id:1 of https://chromiumcodereview.appspot.com/2442963002/ )

Reason for revert:
Looks like the tests use --force-gpu-rasterization, which overrides the blacklist. I think we still need this suppression.

See:
https://build.chromium.org/p/tryserver.chromium.angle/builders/win_angle_rel_ng/builds/2567
https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20Debug%20%28ATI%29/builds/6394

Original issue's description:
> Remove GpuRasterization.BlueBox suppression for Win/AMD.
>
> GPU rasterization has been disabled on this configuration.
>
> BUG=653538
> CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel
> TBR=zmo@chromium.org
>
> Committed: https://crrev.com/516005ea9c2ffe59b5d468bf4673a549996010df
> Cr-Commit-Position: refs/heads/master@{#426972}

TBR=zmo@chromium.org,kbr@chromium.org
# Not skipping CQ checks because original CL landed more than 1 days ago.
BUG=653538

Review-Url: https://codereview.chromium.org/2443283002
Cr-Commit-Position: refs/heads/master@{#427149}

[modify] https://crrev.com/02503a0d49c9335c434730dc77e2d38c620f2310/content/test/gpu/gpu_tests/gpu_rasterization_expectations.py
,
Oct 26 2016
Your change meets the bar and is auto-approved for M55 (branch: 2883)
,
Oct 26 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/29fbdaad72fa7dbd57450a1b4a3da212fc332358

commit 29fbdaad72fa7dbd57450a1b4a3da212fc332358
Author: Eric Karl <ericrk@chromium.org>
Date: Wed Oct 26 19:58:58 2016

Blacklist problematic AMD drivers for GPU raster

Adding the problematic driver versions to the GPU raster blacklist.

R=jbauman
BUG=653538
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

(cherry picked from commit ef28a21704b6ecfed0b6adb96197b383a91a2fd9)

Review-Url: https://codereview.chromium.org/2415663002
Cr-Original-Commit-Position: refs/heads/master@{#424862}
Cr-Commit-Position: refs/branch-heads/2883@{#312}
Cr-Branched-From: 614d31daee2f61b0180df403a8ad43f20b9f6dd7-refs/heads/master@{#423768}

[modify] https://crrev.com/29fbdaad72fa7dbd57450a1b4a3da212fc332358/gpu/config/software_rendering_list_json.cc
,
Oct 26 2016
Here https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20Debug%20%28ATI%29/builds/6421 GpuRasterization.BlueBox failed all 3 retries. If this becomes too flaky, we should consider Fail in the expectations instead of Flaky.
,
Oct 27 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/b135cd74b3f4624ab9512dbd25ac07e404eb8c79

commit b135cd74b3f4624ab9512dbd25ac07e404eb8c79
Author: jmadill <jmadill@chromium.org>
Date: Thu Oct 06 16:04:20 2016

Mark GpuRasterization.BlueBox flaky on new Win/AMD drivers.

This test seems to have started flaking since the change to new AMD. It looks as though it had been flaking before we switched the swarming pool to the new cards.

BUG=653538
TBR=kbr@chromium.org,zmo@chromium.org
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2392463008
Cr-Commit-Position: refs/heads/master@{#423550}

[modify] https://crrev.com/b135cd74b3f4624ab9512dbd25ac07e404eb8c79/content/test/gpu/gpu_tests/gpu_rasterization_expectations.py
,
Nov 4 2016
[Automated comment] removing mislabelled merge-merged-2840
,
Nov 30 2016
I see it failed 3 retries here: https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20Debug%20%28AMD%29/builds/173

However, the only retry-related lines I see in the logs are:

FLAKY TEST FAILURE, retrying: GpuRasterization.BlueBox
FLAKY TEST FAILURE, retrying: GpuRasterization.BlueBox

So maybe the harness doesn't really retry flaky tests.
,
Nov 30 2016
It seems possible the flaky handling in the harness isn't working properly.
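For reference when reading those logs, here is a minimal sketch of what a retry loop for a Flaky expectation might look like. It is not the harness code in gpu_test_base.py; `run_with_retries` and its counting convention (one initial attempt plus up to `max_retries` further attempts) are assumptions. Only the log text "FLAKY TEST FAILURE, retrying" comes from the thread.

```python
def run_with_retries(test, max_retries, log):
    """Run `test` once, then retry up to `max_retries` times on failure.

    Returns the number of attempts used; re-raises the last failure once
    the retry budget is exhausted.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            test()
            return attempts
        except AssertionError:
            if attempts > max_retries:
                raise
            log.append("FLAKY TEST FAILURE, retrying")


def always_black():
    # Stand-in for a test that fails on every attempt.
    raise AssertionError("Expected [0, 128, 0] but got [0, 0, 0]")


log = []
try:
    run_with_retries(always_black, max_retries=2, log=log)
    outcome = "passed"
except AssertionError:
    outcome = "failed"
```

Under this particular counting, a budget of two retries gives three attempts in total and two "retrying" lines before the final failure. Whether the real harness counts the same way is not established in this thread.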
,
Dec 1 2016
gpu_rasterization_tests desperately needs to be ported to the gpu_integration_test harness, similarly to how pixel_test was in Issue 352807 . This will fix any issues with broken retries. If you have any time to help with that I'd greatly appreciate it.
,
Dec 12 2016
The new driver version 16.12.1 released 12/8/16 fixes these issues.
,
Dec 12 2016
Thanks accinn1@ for the info. We'll upgrade the drivers on our machines.
,
Dec 19 2016
The blacklist has been updated to allow GPU rasterization on drivers which contain the fix. I've opened a new bug to track updating the gpu_rasterization_tests to the gpu_integration_test_harness (comment #39).
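As an illustration of the version window reported in this thread (bug first seen in 16.7.3, reported fixed in 16.12.1), here is a sketch of a numeric range check. It assumes the Radeon Software release numbers quoted in the comments are the values being compared; the actual entries in gpu/config/software_rendering_list_json.cc are not shown here and may key on a different driver version string. `parse_version` and `gpu_raster_blacklisted` are hypothetical names.

```python
def parse_version(text):
    """Turn "16.12.1" into (16, 12, 1) so components compare numerically."""
    return tuple(int(part) for part in text.split("."))


FIRST_BAD = parse_version("16.7.3")     # first release reported to show the bug
FIRST_FIXED = parse_version("16.12.1")  # release reported to contain the fix


def gpu_raster_blacklisted(driver_version):
    """True for versions in the half-open range [FIRST_BAD, FIRST_FIXED)."""
    return FIRST_BAD <= parse_version(driver_version) < FIRST_FIXED


results = {v: gpu_raster_blacklisted(v)
           for v in ("16.7.2", "16.7.3", "16.9.2", "16.12.1")}
```

Comparing tuples of ints rather than raw strings matters here: as strings, "16.12.1" sorts before "16.7.3", which would put the fixed driver outside any sensible range.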
,
Dec 25 2016
Brand new Radeon RX 480, Radeon Software Version 16.12.2, and I'm still seeing black boxes in Chrome, specifically in Google Sheets when editing cells, and on SoundCloud. It looks like rasterization is enabled, although I don't know how to read the GPU page in Chrome. I've completely reinstalled both the drivers and Chrome.
,
Feb 17 2017
This seems to be fixed on RX 480 with Radeon Software Version 17.2.1. I tested all of my usual tabs, as well as the Pangolin Love Google Doodle, which I assume to be raster dependent to some degree. Not sure if it matters that I'm on Windows 10 with the Anniversary Update. Before testing rasterization in the browser, I first verified that the DXCap capture log above ends like the WARP screenshot, without any blue corruption, since I was worried that glitches might render the flags page unusable.
,
Mar 16 2017
kbr@, do you know if we have plans to upgrade our drivers to the new 17.3.1 driver? I think the only thing left to do here is to re-enable the test once a driver update has happened. I remember we tried to update to 16.12.1, but hit a number of issues.
,
Mar 16 2017
eric: I just filed Issue 702393 about upgrading the sole non-Swarmed Windows AMD bot to the latest driver. If that one's upgraded and it looks good then we can try upgrading the machines in the Swarming pool. It's a fair amount of work to do the driver update completely safely -- see https://www.chromium.org/developers/testing/gpu-testing/gpu-bot-details#TOC-How-to-test-and-deploy-a-driver-update . If you or someone else would like to drive that process, I'll gladly support you. Otherwise we'll take the risk of upgrading the machines in the Swarming pool without first putting that exact hardware and device driver configuration on the waterfall.