canvas_sub_rectangle tests failing on macOS 10.10 with Intel GPUs
Issue description
Some of the new WebGL 2.0 canvas_sub_rectangle conformance tests are failing on the macOS 10.10 bots with Intel GPUs. These tests failed:
* WebglConformance_conformance_textures_canvas_sub_rectangle_tex_2d_rgb_rgb_unsigned_byte
* WebglConformance_conformance_textures_canvas_sub_rectangle_tex_2d_rgb_rgb_unsigned_short_5_6_5
* WebglConformance_conformance_textures_canvas_sub_rectangle_tex_2d_rgba_rgba_unsigned_byte
* WebglConformance_conformance_textures_canvas_sub_rectangle_tex_2d_rgba_rgba_unsigned_short_4_4_4_4
* WebglConformance_conformance_textures_canvas_sub_rectangle_tex_2d_rgba_rgba_unsigned_short_5_5_5_1
* WebglConformance_conformance2_textures_canvas_sub_rectangle_tex_2d_r8_red_unsigned_byte
* WebglConformance_conformance2_textures_canvas_sub_rectangle_tex_2d_r8ui_red_integer_unsigned_byte
* WebglConformance_conformance2_textures_canvas_sub_rectangle_tex_2d_rg8_rg_unsigned_byte
* WebglConformance_conformance2_textures_canvas_sub_rectangle_tex_2d_rg8ui_rg_integer_unsigned_byte
* WebglConformance_conformance2_textures_canvas_sub_rectangle_tex_2d_rgb8_rgb_unsigned_byte
* WebglConformance_conformance2_textures_canvas_sub_rectangle_tex_2d_rgb8ui_rgb_integer_unsigned_byte
* WebglConformance_conformance2_textures_canvas_sub_rectangle_tex_2d_rgba8_rgba_unsigned_byte
* WebglConformance_conformance2_textures_canvas_sub_rectangle_tex_2d_rgba8ui_rgba_integer_unsigned_byte
The same tests pass on AMD GPUs, so I suspect a driver bug -- possibly the same seamless cube map bug that has been observed before. These tests will be disabled on this configuration until they can be diagnosed.
,
Nov 17 2016
,
Nov 17 2016
Some other tests also fail (and still fail with Zhenyao's latest patch at https://codereview.chromium.org/2507863002); these failures may have the same cause. They are:
* all/conformance2/textures/image_data/tex-2d-rg8ui-rg_integer-unsigned_byte.html
* all/conformance2/textures/image_data/tex-2d-rgb8ui-rgb_integer-unsigned_byte.html
* all/conformance2/textures/image_data/tex-2d-rgba8ui-rgba_integer-unsigned_byte.html
We have another crbug for failures in files under conformance2/textures/image_data: https://crbug.com/665197. However, it was marked as fixed.
,
Nov 17 2016
kainino@ just worked around the ImageData test failures in https://github.com/KhronosGroup/WebGL/pull/2154 , incorporating the same workaround for https://github.com/KhronosGroup/WebGL/issues/1819 as for some of the other tests.
,
Nov 18 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6
commit e4acee67ea6e24c3e594719ca33b4beeb46b4bc6
Author: kainino <kainino@chromium.org>
Date: Fri Nov 18 04:17:13 2016
Work around unknown Mac Intel driver bug causing non-deterministic behavior in GPU readback from 2D canvas
BUG=665656
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Review-Url: https://codereview.chromium.org/2503283007
Cr-Commit-Position: refs/heads/master@{#433094}
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/content/browser/gpu/gpu_data_manager_impl_private.cc
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/content/public/common/common_param_traits_macros.h
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/content/public/common/web_preferences.cc
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/content/public/common/web_preferences.h
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/content/renderer/render_view_impl.cc
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/content/test/gpu/gpu_tests/webgl2_conformance_expectations.py
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/gpu/config/gpu_driver_bug_list_json.cc
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/gpu/config/gpu_driver_bug_workaround_type.h
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/third_party/WebKit/Source/core/frame/Settings.in
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/third_party/WebKit/Source/modules/webgl/WebGLRenderingContextBase.cpp
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/third_party/WebKit/Source/web/WebSettingsImpl.cpp
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/third_party/WebKit/Source/web/WebSettingsImpl.h
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/third_party/WebKit/public/web/WebSettings.h
,
Nov 18 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/9309c93173ffe2982a8e9e8af127300fc4cd063e
commit 9309c93173ffe2982a8e9e8af127300fc4cd063e
Author: kainino <kainino@chromium.org>
Date: Fri Nov 18 04:28:49 2016
clarify description for force_software_readback_from_2d_canvas
This is a quick change to avoid any possible confusion for anyone reading about:gpu.
BUG=665656
NOTRY=true
TBR=zmo@chromium.org,esprehn@chromium.org
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Review-Url: https://codereview.chromium.org/2505363004
Cr-Commit-Position: refs/heads/master@{#433103}
[modify] https://crrev.com/9309c93173ffe2982a8e9e8af127300fc4cd063e/gpu/config/gpu_driver_bug_list_json.cc
,
Nov 21 2016
,
Nov 22 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/04a4e99b99f399411837ebe3489e9e2b2bd96685
commit 04a4e99b99f399411837ebe3489e9e2b2bd96685
Author: kainino <kainino@chromium.org>
Date: Mon Nov 21 23:55:28 2016
Revert workaround due to perf regression
Revert "Work around unknown Mac Intel driver bug causing non-deterministic behavior in GPU readback from 2D canvas" ( https://crrev.com/2503283007 )
Revert "clarify description for force_software_readback_from_2d_canvas" ( https://crrev.com/2505363004 )
BUG=667352, 665656
TBR=kbr@chromium.org,esprehn@chromium.org,piman@chromium.org,zmo@chromium.org,tsepez@chromium.org,ben@chromium.org
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel
Review-Url: https://codereview.chromium.org/2515413002
Cr-Commit-Position: refs/heads/master@{#433704}
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/content/browser/gpu/gpu_data_manager_impl_private.cc
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/content/public/common/common_param_traits_macros.h
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/content/public/common/web_preferences.cc
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/content/public/common/web_preferences.h
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/content/renderer/render_view_impl.cc
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/content/test/gpu/gpu_tests/webgl2_conformance_expectations.py
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/gpu/config/gpu_driver_bug_list_json.cc
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/gpu/config/gpu_driver_bug_workaround_type.h
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/third_party/WebKit/Source/core/frame/Settings.in
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/third_party/WebKit/Source/modules/webgl/WebGLRenderingContextBase.cpp
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/third_party/WebKit/Source/web/WebSettingsImpl.cpp
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/third_party/WebKit/Source/web/WebSettingsImpl.h
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/third_party/WebKit/public/web/WebSettings.h
,
Nov 23 2016
For the non-integer failures above, there's some strange nondeterministic flakiness on Mac Intel in these tests. I suspect it could be nondeterministic due to races, but I'm not sure.
For the record, my investigation so far:
* A few pixels of certain full screen quad renders of these textures seem to come out as black.
* The first test that fails USUALLY fails, but they get more flaky down the page, which is why I suspect races + a bug in the driver state tracking.
* We thought that the bad pixels appeared at the edges of the 2 triangles that form the full-screen quad. But this is NOT the case! See attached.
* CONTIGUOUS areas LARGER than 1x1 are bad.
* The attached images are at 1:1 to the texture (192x192).
* Magenta is the background clear color (the magenta pixels are transparent texels).
* Bad areas appear along the diagonals and NEAR (but not on) the edges. (Diagonal is: if flipY=true, TL-to-BR; if flipY=false, BL-to-TR, on the rendered canvas. I _think_ this is the x ~= y diagonal. Not related to the orientation of the triangles being drawn to the screen.)
* They don't seem to appear along the left edge.
* All fragments render to the screen - it's NOT a rasterization problem.
* It's either a sampling or an upload problem: Bad pixels SAMPLE as transparent black (0,0,0,0) - revealed by tweaking blending/mask and playing with the fragment shader.
* texCoord seems to be coming through from the vertex shader fine, so it's either the sampling, or it's the texture itself.
* Performing the quad render multiple times on top of itself doesn't seem to reduce flakiness - so I suspect it's the texture data, not the sampling.
Not sure where to go from here. I have no idea what in the driver could be causing it.
Intel folks, do you have any idea what could be going on here, now that you have more data?
,
Nov 23 2016
Two more things I just remembered which I forgot to mention:
1. It seems as though the errors are all in the (x >= y) half of the texture. I'm not 100% sure about that, though.
2. These tests seem to mostly-or-entirely fail on WebGL texSubImage2D tests. Since the target sub-rectangles are the full texture, we sometimes replace the gpu-service-side driver call to glTexSubImage2D with a call to glTexImage2D. This issue persists regardless of which of the two driver calls I force it to bottom out into.
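The "(x >= y) half" observation could be checked mechanically rather than by eye. The following is only a minimal sketch, not code from the investigation: it assumes you already have the expected and the read-back RGBA8 data as tightly packed byte buffers of the same size, and the function names are hypothetical. Which diagonal "x >= y" corresponds to on screen depends on flipY, as noted above, so the row order of the buffers is also an assumption.

```python
from typing import List, Tuple

TRANSPARENT_BLACK = b"\x00\x00\x00\x00"


def find_bad_texels(expected: bytes, actual: bytes,
                    width: int, height: int) -> List[Tuple[int, int]]:
    """Return (x, y) for texels that differ from the expected data and
    read back as transparent black (0, 0, 0, 0).

    Both buffers are assumed to be tightly packed RGBA8, row 0 first.
    Texels that are legitimately transparent in the source are not
    reported, because they match the expected data.
    """
    if len(expected) != width * height * 4 or len(actual) != len(expected):
        raise ValueError("buffers must be width * height * 4 bytes")
    bad = []
    for y in range(height):
        for x in range(width):
            i = 4 * (y * width + x)
            texel = actual[i:i + 4]
            if texel != expected[i:i + 4] and texel == TRANSPARENT_BLACK:
                bad.append((x, y))
    return bad


def all_in_x_ge_y_half(bad: List[Tuple[int, int]]) -> bool:
    """True if every reported texel satisfies x >= y."""
    return all(x >= y for x, y in bad)
```

Running this over several failing captures would either support or rule out the x >= y hypothesis without relying on visual inspection of the attached images.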
,
Nov 23 2016
@kainino, we will soon investigate this issue on our side. Thanks for all the detailed investigation results! It would be best if you could work with @kbr to file a radar on this, so that I can forward it to our driver team.
,
Nov 23 2016
Thanks, I'll make sure that gets done on Monday after the US holiday.
,
Nov 26 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/7001c082c7bde47d1326caa700e6b5693f3c2d4e
commit 7001c082c7bde47d1326caa700e6b5693f3c2d4e
Author: kainino <kainino@chromium.org>
Date: Sat Nov 26 21:41:45 2016
These tests seem to fail only on old versions of macOS (such as 10.10 Yosemite on the bots)
BUG=665656
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel
Review-Url: https://codereview.chromium.org/2512053002
Cr-Commit-Position: refs/heads/master@{#434577}
[modify] https://crrev.com/7001c082c7bde47d1326caa700e6b5693f3c2d4e/content/test/gpu/gpu_tests/webgl2_conformance_expectations.py
,
Nov 30 2016
Assigning to kbr: not sure if you have already filed the radar for this, but if not, can you do so?
,
Dec 2 2016
I'm sorry I haven't gotten to this yet. I have to fix Issue 666061 urgently and there have been too many interruptions over the past few days. Could someone else please take a shot at writing up the Radar description? Kai, I shared some Google Drive folders with you which contain the Radar text. The process for filing Radars with Apple under Google's account is documented at go/applebugs . Kai, may I ask you to please take point on making sure this Radar gets filed?
,
Dec 2 2016
,
Dec 2 2016
Kai, after you submit the radar, please send us the detailed description, and I will pass it on to our driver team. This is now one of two critical issues for Intel on macOS (the other is rgb9_e5). Our investigation of this issue is still ongoing, but we don't have much to report yet.
,
Dec 5 2016
I got back into this investigation so I could provide a better report about what actual call was likely to be flaky. I discovered that this function is bottoming out differently than I thought; in the broken case, it's doing a GPU-CPU readback of the canvas for upload and the data is already incorrect on the CPU. So the problem appears to be in the readback. I'm looking into what is happening. It's possible that the bug is on our side, relying on undefined behavior.
,
Dec 5 2016
If toDataURL is called on the source canvas before the texSubImage2D call, then the bug disappears. This leads me to think that toDataURL is doing something we're not. However, the two code paths look quite different, and I don't understand the differences because I know nothing about SkImages, etc. See:
WebGL GPU-CPU readback from 2D canvas (broken):
https://cs.chromium.org/chromium/src/third_party/WebKit/Source/modules/webgl/WebGLRenderingContextBase.cpp?sq=package:chromium&dr=Ss&rcl=1480940406&l=5037
https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/graphics/gpu/WebGLImageConversion.cpp?sq=package:chromium&dr=Ss&rcl=1480954684&l=2742
toDataURL/getImageData (working):
https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/html/HTMLCanvasElement.cpp?sq=package:chromium&dr=CSs&rcl=1480940406&l=629
I may need to send this to someone who knows more about the SkImage/etc. code.
,
Dec 6 2016
It's possible that the first glReadPixels from the accelerated 2D canvas gets the garbage results, and that the second one that is done for the WebGL texture upload then gets good results. If someone could do a writeup of this Radar I could help file it. In the instructions, please point to a Chromium continuous build, because otherwise auto-updating may pick up a workaround. Thanks.
,
Dec 6 2016
If I display the results of that first toDataURL to the screen, then it's always valid. I haven't managed to get the bad pixels to appear there. I was putting off the radar until I had some clue what was going on. I still don't know where this is reaching OS code - but I suppose it's probably a readback from an IOSurface. I'll start a radar writeup based on what I know.
,
Dec 6 2016
After further discussion offline, I think it's reasonably likely that this is a bug in Chrome: that we're provoking unspecified behavior, e.g. via a readback loop (copying to and from the same framebuffer). I haven't yet dug into this, but my plan of action is:
* At the beginning and end of texImageImpl, insert some no-op commands into the command stream that will be easy to search for.
* Run a debug build with --enable-gpu-service-logging, piped into a file, on a short-ish test case.
* Search the text file for the no-op commands.
* Manually search that portion of the log for potential problems, such as a readback loop.
I plan to do this in the morning (PST).
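The log-search step of this plan could be scripted along the following lines. This is a rough sketch under stated assumptions: the marker strings and the function name are hypothetical (whatever the inserted no-op commands happen to print), and nothing here depends on the real format of the --enable-gpu-service-logging output beyond it being line-oriented text.

```python
from typing import Iterable, List


def extract_between_markers(lines: Iterable[str],
                            begin_marker: str,
                            end_marker: str) -> List[List[str]]:
    """Collect the log lines that fall between each begin/end marker pair.

    A line counts as a marker if it contains the marker substring.
    Marker lines themselves are not included in the returned spans.
    An unterminated span at the end of the log is dropped.
    """
    spans: List[List[str]] = []
    current = None  # None means "not currently inside a marked region"
    for line in lines:
        if begin_marker in line:
            current = []
        elif end_marker in line and current is not None:
            spans.append(current)
            current = None
        elif current is not None:
            current.append(line)
    return spans


def lines_mentioning(span: List[str], needle: str) -> List[str]:
    """Filter one span down to the lines that mention a given call name."""
    return [line for line in span if needle in line]
```

For example, with the log read via `open(path).read().splitlines()`, each returned span covers one marked region, and `lines_mentioning(span, "ReadPixels")` narrows it to the calls of interest before reading the rest by hand.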
,
Dec 7 2016
bsalomon, fmalita: I've been investigating this bug, but I'm having a lot of trouble navigating the Skia code path that this goes through. There's a glReadPixels somewhere at the bottom, which I'd like to experiment with, but I have no idea where it is. Please see attached my --enable-gpu-service-logging output (from a debug build) with annotations explaining the significance of particular calls (665656_gl_trace_annotated.txt). Also attached is a tiny patch which I used to debug (665656_debug_cmds.diff). As you can see above, we suspected a rendering loop (to/from the same texture), but we can't find any evidence of this in the trace. This bug is only reproducible on Mac Intel, so we suspect either a driver bug or some unspecified behavior on our part. Also potentially useful:
> WebGL GPU-CPU readback from 2D canvas (broken):
> https://cs.chromium.org/chromium/src/third_party/WebKit/Source/modules/webgl/WebGLRenderingContextBase.cpp?sq=package:chromium&dr=Ss&rcl=1480940406&l=5037
> https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/graphics/gpu/WebGLImageConversion.cpp?sq=package:chromium&dr=Ss&rcl=1480954684&l=2742
>
> toDataURL/getImageData (working and causes a subsequent WebGL GPU-CPU readback to pass(!)):
> https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/html/HTMLCanvasElement.cpp?sq=package:chromium&dr=CSs&rcl=1480940406&l=629
Thank you!
,
Dec 7 2016
kainino@, GrGLGpu::onReadPixels in third_party/skia/src/gpu/gl/GrGLGpu.cpp is the only place Skia would call glReadPixels. The call site makes the call through a macro, which may hide it from searches: GL_CALL(ReadPixels(....))
,
Dec 7 2016
bsalomon, thanks, this is helpful. I've determined that both the broken and working cases are bottoming out similarly - with a texture-to-framebuffer blit and ReadPixels. I haven't spotted any suspicious differences in the service side logs. However, as far as I can tell they are both reaching this point through very different code paths in Skia. Do you know of any significant differences between these two paths? *@intel: I'll finally be filing a Radar on this; I'll post a link when it's out.
,
Dec 8 2016
yang.gu, *@intel: Filed as Radar 29563996
,
Dec 8 2016
kainino@, thanks a lot! I already communicated this to our MacOS GPU Driver team.
,
Dec 8 2016
kainino@, I'm not too familiar with this Blink code that calls Skia. A thing that could be different about the two paths is that Skia will sometimes do an intermediate draw to a temporary surface before reading back the pixels in order to perform some kind of conversion (y-flip, unpremultiply, or red/blue swap). It could be that the intermediate draw either avoids or introduces the issue. This occurs in GrContext::readSurfacePixels. Also, the texture used as the intermediate is often larger than the original texture, so in that case it is a partial glReadPixels instead of a full glReadPixels which may or may not be relevant. Another possible difference could be the use of a non-tight "row bytes" parameter to Skia's glReadPixels which controls the number of bytes between consecutive rows in the output buffer.
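To make the "row bytes" point concrete: when the stride between rows is larger than the tightly packed row size, a consumer that assumes tight packing reads the padding as pixels and every later row is shifted. The sketch below is purely illustrative and is not Skia's code; the function name and parameters are hypothetical. It shows the repacking a caller would need to do with a buffer written using a non-tight stride.

```python
def tighten_rows(buf: bytes, width: int, height: int,
                 row_bytes: int, bytes_per_pixel: int = 4) -> bytes:
    """Repack a pixel buffer with a padded row stride into a tight buffer.

    row_bytes is the distance in bytes between the starts of consecutive
    rows in buf; it must be at least width * bytes_per_pixel.
    """
    tight_row = width * bytes_per_pixel
    if row_bytes < tight_row:
        raise ValueError("row_bytes is smaller than one tightly packed row")
    # The last row does not need trailing padding to be present.
    if len(buf) < (height - 1) * row_bytes + tight_row:
        raise ValueError("buffer too small for the given dimensions")
    out = bytearray()
    for y in range(height):
        start = y * row_bytes
        out += buf[start:start + tight_row]
    return bytes(out)
```

If the two readback paths in this bug differ in stride, comparing their output after this kind of repacking would show whether the stride alone accounts for any difference in the data.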
,
Dec 8 2016
The difference bsalomon mentioned is consistent with the workaround https://codereview.chromium.org/2547013002/
,
Feb 14 2017
Fixed in macOS 10.12.4 beta 2 (16E154a).
,
Feb 15 2017
Awesome, thanks! I'm updating now and I'll verify.
,
Jun 8 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/1733d120eb79e2d8d49b18892e60ee1297f646f7
commit 1733d120eb79e2d8d49b18892e60ee1297f646f7
Author: zmo <zmo@chromium.org>
Date: Thu Jun 08 23:24:48 2017
Update WebGL2 conformance test expectations for Mac bots.
BUG=598930, 617290, 618464, 630800, 641149, 643866, 645298, 646182, 654187, 663188, 665197, 665656, 676848, 679682, 679684, 679686, 679687, 679689, 679690, 679691, 680278, 684903
TEST=mac bots on GPU FYI waterfall
TBR=kbr@chromium.org,kainino@chromium.org
NOTRY=true
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Review-Url: https://codereview.chromium.org/2931993002
Cr-Commit-Position: refs/heads/master@{#478121}
[modify] https://crrev.com/1733d120eb79e2d8d49b18892e60ee1297f646f7/content/test/gpu/gpu_tests/webgl2_conformance_expectations.py
Comment 1 by kainino@chromium.org
, Nov 17 2016