
Issue 665656

Starred by 4 users

Issue metadata

Status: Fixed
Owner:
Closed: Feb 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug

Blocked on:
issue 639145
issue 667352




canvas_sub_rectangle tests failing on macOS 10.10 with Intel GPUs

Project Member Reported by kbr@chromium.org, Nov 16 2016

Issue description

Some of the new WebGL 2.0 canvas_sub_rectangle conformance tests are failing on the macOS 10.10 bots with Intel GPUs. These tests failed:

WebglConformance_conformance_textures_canvas_sub_rectangle_tex_2d_rgb_rgb_unsigned_byte
WebglConformance_conformance_textures_canvas_sub_rectangle_tex_2d_rgb_rgb_unsigned_short_5_6_5
WebglConformance_conformance_textures_canvas_sub_rectangle_tex_2d_rgba_rgba_unsigned_byte
WebglConformance_conformance_textures_canvas_sub_rectangle_tex_2d_rgba_rgba_unsigned_short_4_4_4_4
WebglConformance_conformance_textures_canvas_sub_rectangle_tex_2d_rgba_rgba_unsigned_short_5_5_5_1
WebglConformance_conformance2_textures_canvas_sub_rectangle_tex_2d_r8_red_unsigned_byte
WebglConformance_conformance2_textures_canvas_sub_rectangle_tex_2d_r8ui_red_integer_unsigned_byte
WebglConformance_conformance2_textures_canvas_sub_rectangle_tex_2d_rg8_rg_unsigned_byte
WebglConformance_conformance2_textures_canvas_sub_rectangle_tex_2d_rg8ui_rg_integer_unsigned_byte
WebglConformance_conformance2_textures_canvas_sub_rectangle_tex_2d_rgb8_rgb_unsigned_byte
WebglConformance_conformance2_textures_canvas_sub_rectangle_tex_2d_rgb8ui_rgb_integer_unsigned_byte
WebglConformance_conformance2_textures_canvas_sub_rectangle_tex_2d_rgba8_rgba_unsigned_byte
WebglConformance_conformance2_textures_canvas_sub_rectangle_tex_2d_rgba8ui_rgba_integer_unsigned_byte

The same tests pass on AMD GPUs, so I suspect a driver bug -- possibly the same seamless cube map bug that has been observed before. These tests will be disabled on this configuration until they can be diagnosed.

 
Status: Started (was: Available)
The integer test failures seem to be due to https://github.com/KhronosGroup/WebGL/issues/1819 .
Fixing in https://github.com/KhronosGroup/WebGL/pull/2155 .

Some of the other failures come from a different, still unknown problem that causes nondeterministic behavior in the path for GPU readback from 2D canvases. They persist on macOS 10.12.2 beta. Adding a workaround.
Owner: kainino@chromium.org
Cc: yang...@intel.com
Some other tests also fail (and still fail with Zhenyao's latest patch at https://codereview.chromium.org/2507863002); these failures may have the same cause. They are:

all/conformance2/textures/image_data/tex-2d-rg8ui-rg_integer-unsigned_byte.html
all/conformance2/textures/image_data/tex-2d-rgb8ui-rgb_integer-unsigned_byte.html
all/conformance2/textures/image_data/tex-2d-rgba8ui-rgba_integer-unsigned_byte.html

We have another crbug for failures in files under conformance2/textures/image_data, https://crbug.com/665197 , but it was marked as fixed.

Comment 4 by kbr@chromium.org, Nov 17 2016

kainino@ just worked around the ImageData test failures in https://github.com/KhronosGroup/WebGL/pull/2154 , incorporating the same workaround for https://github.com/KhronosGroup/WebGL/issues/1819 as for some of the other tests.

Project Member

Comment 5 by bugdroid1@chromium.org, Nov 18 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6

commit e4acee67ea6e24c3e594719ca33b4beeb46b4bc6
Author: kainino <kainino@chromium.org>
Date: Fri Nov 18 04:17:13 2016

Work around unknown Mac Intel driver bug causing non-deterministic behavior in GPU readback from 2D canvas

BUG= 665656 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2503283007
Cr-Commit-Position: refs/heads/master@{#433094}

[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/content/browser/gpu/gpu_data_manager_impl_private.cc
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/content/public/common/common_param_traits_macros.h
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/content/public/common/web_preferences.cc
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/content/public/common/web_preferences.h
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/content/renderer/render_view_impl.cc
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/content/test/gpu/gpu_tests/webgl2_conformance_expectations.py
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/gpu/config/gpu_driver_bug_list_json.cc
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/gpu/config/gpu_driver_bug_workaround_type.h
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/third_party/WebKit/Source/core/frame/Settings.in
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/third_party/WebKit/Source/modules/webgl/WebGLRenderingContextBase.cpp
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/third_party/WebKit/Source/web/WebSettingsImpl.cpp
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/third_party/WebKit/Source/web/WebSettingsImpl.h
[modify] https://crrev.com/e4acee67ea6e24c3e594719ca33b4beeb46b4bc6/third_party/WebKit/public/web/WebSettings.h

Project Member

Comment 6 by bugdroid1@chromium.org, Nov 18 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9309c93173ffe2982a8e9e8af127300fc4cd063e

commit 9309c93173ffe2982a8e9e8af127300fc4cd063e
Author: kainino <kainino@chromium.org>
Date: Fri Nov 18 04:28:49 2016

clarify description for force_software_readback_from_2d_canvas

This is a quick change to avoid any possible confusion for anyone reading about:gpu.

BUG= 665656 
NOTRY=true
TBR=zmo@chromium.org,esprehn@chromium.org
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2505363004
Cr-Commit-Position: refs/heads/master@{#433103}

[modify] https://crrev.com/9309c93173ffe2982a8e9e8af127300fc4cd063e/gpu/config/gpu_driver_bug_list_json.cc

Comment 7 by kbr@chromium.org, Nov 21 2016

Blockedon: 667352
Project Member

Comment 8 by bugdroid1@chromium.org, Nov 22 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/04a4e99b99f399411837ebe3489e9e2b2bd96685

commit 04a4e99b99f399411837ebe3489e9e2b2bd96685
Author: kainino <kainino@chromium.org>
Date: Mon Nov 21 23:55:28 2016

Revert workaround due to perf regression

Revert "Work around unknown Mac Intel driver bug causing non-deterministic behavior in GPU readback from 2D canvas" ( https://crrev.com/2503283007 )

Revert "clarify description for force_software_readback_from_2d_canvas" ( https://crrev.com/2505363004 )

BUG=667352, 665656 
TBR=kbr@chromium.org,esprehn@chromium.org,piman@chromium.org,zmo@chromium.org,tsepez@chromium.org,ben@chromium.org

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2515413002
Cr-Commit-Position: refs/heads/master@{#433704}

[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/content/browser/gpu/gpu_data_manager_impl_private.cc
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/content/public/common/common_param_traits_macros.h
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/content/public/common/web_preferences.cc
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/content/public/common/web_preferences.h
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/content/renderer/render_view_impl.cc
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/content/test/gpu/gpu_tests/webgl2_conformance_expectations.py
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/gpu/config/gpu_driver_bug_list_json.cc
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/gpu/config/gpu_driver_bug_workaround_type.h
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/third_party/WebKit/Source/core/frame/Settings.in
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/third_party/WebKit/Source/modules/webgl/WebGLRenderingContextBase.cpp
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/third_party/WebKit/Source/web/WebSettingsImpl.cpp
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/third_party/WebKit/Source/web/WebSettingsImpl.h
[modify] https://crrev.com/04a4e99b99f399411837ebe3489e9e2b2bd96685/third_party/WebKit/public/web/WebSettings.h

For the non-integer failures above, there's some strange nondeterministic flakiness on Mac Intel in these tests. I suspect it could be nondeterministic due to races, but I'm not sure.

For the record, my investigation so far:
* A few pixels of certain full screen quad renders of these textures seem to come out as black.
* The first test that fails USUALLY fails, but they get more flaky down the page, which is why I suspect races + a bug in the driver state tracking.
* We thought that the bad pixels appeared at the edges of the 2 triangles that form the full-screen quad. But this is NOT the case! See attached.
  * CONTIGUOUS areas LARGER than 1x1 are bad.
  * The attached images are at 1:1 to the texture (192x192).
    * Magenta is the background clear color (the magenta pixels are transparent texels).
  * Bad areas appear along the diagonals and NEAR (but not on) the edges. (Diagonal is: if flipY=true, TL-to-BR; if flipY=false, BL-to-TR, on the rendered canvas. I _think_ this is the x ~= y diagonal. Not related to the orientation of the triangles being drawn to the screen.)
  * They don't seem to appear along the left edge.
  * All fragments render to the screen - it's NOT a rasterization problem.
  * It's either a sampling or an upload problem: Bad pixels SAMPLE as transparent black (0,0,0,0) - revealed by tweaking blending/mask and playing with the fragment shader.
    * texCoord seems to be coming through from the vertex shader fine, so it's either the sampling, or it's the texture itself.
* Performing the quad render multiple times on top of itself doesn't seem to reduce flakiness - so I suspect it's the texture data, not the sampling.
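
A minimal sketch of how observations like these could be checked mechanically on a read-back image, assuming the pixels are available as rows of RGBA tuples with row 0 taken as y = 0 (the actual orientation is not certain from the notes above). All names here are hypothetical, and the 8x8 data is synthetic rather than taken from the attached images.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int, int]  # (r, g, b, a), each 0-255
TRANSPARENT_BLACK: Pixel = (0, 0, 0, 0)


def find_bad_texels(actual: List[List[Pixel]],
                    expected: List[List[Pixel]]) -> List[Tuple[int, int]]:
    """Return (x, y) for texels that are transparent black but should not be."""
    bad = []
    for y, row in enumerate(actual):
        for x, texel in enumerate(row):
            if texel == TRANSPARENT_BLACK and expected[y][x] != TRANSPARENT_BLACK:
                bad.append((x, y))
    return bad


def summarize(bad: List[Tuple[int, int]], width: int, height: int) -> Dict[str, object]:
    """Summarize where the bad texels sit relative to the edges and the x = y diagonal."""
    return {
        "count": len(bad),
        # Note: all() is True for an empty list, so check "count" first.
        "all_in_x_ge_y_half": all(x >= y for x, y in bad),
        "on_left_edge": sum(1 for x, _ in bad if x == 0),
        "on_any_edge": sum(
            1 for x, y in bad if x in (0, width - 1) or y in (0, height - 1)
        ),
    }


# Synthetic 8x8 example: an opaque green texture with a 2x2 bad block.
GREEN: Pixel = (0, 255, 0, 255)
expected = [[GREEN] * 8 for _ in range(8)]
actual = [list(row) for row in expected]
for x, y in [(5, 2), (6, 2), (5, 3), (6, 3)]:
    actual[y][x] = TRANSPARENT_BLACK

bad = find_bad_texels(actual, expected)
summary = summarize(bad, 8, 8)
print(bad)
print(summary)
```

Running the same summary over many captures would make it easier to tell whether a pattern such as "never on the left edge" holds or is just a small-sample impression.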

Not sure where to go from here. I have no idea what in the driver could be causing it.

Intel folks, do you have any idea what could be going on here, now that you have more data?
actual_texture.png (1.0 KB)
actual_texture2.png (1.2 KB)
actual_texture3.png (1.1 KB)
actual_texture4.png (1.3 KB)
actual_texture5.png (1.1 KB)
actual_texture_flipY=true.png (1.3 KB)
Two more things I forgot to mention:

1. It seems as though the errors are all in the (x >= y) half of the texture. I'm not 100% sure about that, though.

2. The failures seem to be mostly or entirely in the WebGL texSubImage2D tests. Since the target sub-rectangles are the full texture, we sometimes replace the gpu-service-side driver call to glTexSubImage2D with a call to glTexImage2D. The issue persists regardless of which of the two driver calls I force it to bottom out into.
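
To illustrate point 2, here is a simplified sketch of the kind of decision involved: when the sub-rectangle covers the whole texture level, a full respecification is equivalent to a sub-image update. The function names are hypothetical, and the real service-side logic applies this only "sometimes", so it presumably has further conditions that are not modeled here.

```python
def covers_full_texture(xoffset: int, yoffset: int, width: int, height: int,
                        tex_width: int, tex_height: int) -> bool:
    """True when the target sub-rectangle is exactly the whole texture level."""
    return (xoffset == 0 and yoffset == 0
            and width == tex_width and height == tex_height)


def choose_upload_call(xoffset: int, yoffset: int, width: int, height: int,
                       tex_width: int, tex_height: int) -> str:
    """Hypothetical, simplified choice between the two driver entry points."""
    if covers_full_texture(xoffset, yoffset, width, height, tex_width, tex_height):
        return "glTexImage2D"
    return "glTexSubImage2D"


# The textures in these tests are 192x192.
print(choose_upload_call(0, 0, 192, 192, 192, 192))
print(choose_upload_call(0, 0, 96, 192, 192, 192))
```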

Comment 11 by yang...@intel.com, Nov 23 2016

@kainino, we will investigate this issue on our side soon. Thanks for all the detailed investigation results!
It would be best if you could work with @kbr to file a Radar on this, so that I can forward it to our driver team.
Thanks, I'll make sure that gets done on Monday after the US holiday.
Project Member

Comment 13 by bugdroid1@chromium.org, Nov 26 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/7001c082c7bde47d1326caa700e6b5693f3c2d4e

commit 7001c082c7bde47d1326caa700e6b5693f3c2d4e
Author: kainino <kainino@chromium.org>
Date: Sat Nov 26 21:41:45 2016

These tests seem to fail only on old versions of macOS (such as 10.10 Yosemite on the bots)

BUG= 665656 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2512053002
Cr-Commit-Position: refs/heads/master@{#434577}

[modify] https://crrev.com/7001c082c7bde47d1326caa700e6b5693f3c2d4e/content/test/gpu/gpu_tests/webgl2_conformance_expectations.py

Owner: kbr@chromium.org
Assigning to kbr: not sure if you have already filed the radar for this, but if not, can you do so?

Comment 15 by kbr@chromium.org, Dec 2 2016

Owner: kainino@chromium.org
Status: Assigned (was: Started)
I'm sorry I haven't gotten to this yet. I have to fix Issue 666061 urgently and there have been too many interruptions over the past few days.

Could someone else please take a shot at writing up the Radar description? Kai, I shared some Google Drive folders with you which contain the Radar text.

The process for filing Radars with Apple under Google's account is documented at go/applebugs .

Kai, may I ask you to please take point on making sure this Radar gets filed?

Comment 16 by kbr@chromium.org, Dec 2 2016

Status: Started (was: Assigned)

Comment 17 by yang...@intel.com, Dec 2 2016

Cc: jie.a.c...@intel.com
Kai, after you submit the Radar, please give us the detailed description, and I will pass it on to our driver team. This is now one of two critical issues for Intel on macOS (the other is rgb9_e5).
Our investigation of this issue is still ongoing, but we don't have much to report yet.
I got back into this investigation so I could provide a better report about what actual call was likely to be flaky. I discovered that this function is bottoming out differently than I thought; in the broken case, it's doing a GPU-CPU readback of the canvas for upload and the data is already incorrect on the CPU. So the problem appears to be in the readback. I'm looking into what is happening. It's possible that the bug is on our side, relying on undefined behavior.
If toDataURL is called on the source canvas before the texSubImage2D call, then the bug disappears. This leads me to think that toDataURL is doing something we're not. However, the two code paths look really quite different and I don't understand the differences as I know nothing about SkImages, etc. See:

WebGL GPU-CPU readback from 2D canvas (broken):
https://cs.chromium.org/chromium/src/third_party/WebKit/Source/modules/webgl/WebGLRenderingContextBase.cpp?sq=package:chromium&dr=Ss&rcl=1480940406&l=5037
https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/graphics/gpu/WebGLImageConversion.cpp?sq=package:chromium&dr=Ss&rcl=1480954684&l=2742

toDataURL/getImageData (working):
https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/html/HTMLCanvasElement.cpp?sq=package:chromium&dr=CSs&rcl=1480940406&l=629

I may need to send this to someone who knows more about the SkImage/etc code.

Comment 20 by kbr@chromium.org, Dec 6 2016

It's possible that the first glReadPixels from the accelerated 2D canvas gets the garbage results, and that the second one that is done for the WebGL texture upload then gets good results.

If someone could do a writeup of this Radar I could help file it. In the instructions, please point to a Chromium continuous build, because otherwise auto-updating may pick up a workaround. Thanks.

If I display the results of that first toDataURL to the screen, then it's always valid. I haven't managed to get the bad pixels to appear there.

I was putting off the radar until I had some clue what was going on. I still don't know where this is reaching OS code - but I suppose it's probably a readback from an IOSurface. I'll start a radar writeup based on what I know.
After further discussion offline, I think it's reasonably likely that this is a bug in Chrome - that we're provoking unspecified behavior, e.g. via a readback loop (copying to and from the same framebuffer). I haven't yet dug into this but my plan of action is:

* at the beginning and end of texImageImpl, insert some no-op commands into the command stream that will be easy to search for
* run a debug build with --enable-gpu-service-logging, pipe the output into a file, and run a short-ish test case
* search the text file for the no-op commands
* manually search that portion of the log for potential problems, such as a readback loop

I plan to do this in the morning (PST)
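
The search steps in the plan above could be partly automated. The following is only a sketch: the marker strings and the log line format are made up for illustration (the real --enable-gpu-service-logging output will look different), and the helper names are hypothetical.

```python
from typing import Iterable, List, Sequence

# Hypothetical marker text; the real no-op commands would show up differently.
BEGIN_MARKER = "DEBUG_MARKER_BEGIN_texImageImpl"
END_MARKER = "DEBUG_MARKER_END_texImageImpl"


def extract_marked_sections(lines: Iterable[str],
                            begin: str = BEGIN_MARKER,
                            end: str = END_MARKER) -> List[List[str]]:
    """Collect the log lines that fall between each begin/end marker pair."""
    sections: List[List[str]] = []
    current = None
    for line in lines:
        if begin in line:
            current = []
        elif end in line:
            if current is not None:
                sections.append(current)
            current = None
        elif current is not None:
            current.append(line.rstrip("\n"))
    return sections


def calls_of_interest(section: Sequence[str],
                      names: Sequence[str] = ("glReadPixels",
                                              "glBlitFramebuffer",
                                              "glBindFramebuffer")) -> List[str]:
    """Keep only lines mentioning calls worth inspecting by hand."""
    return [line for line in section if any(name in line for name in names)]


# Synthetic log lines, for illustration only.
log = [
    "glBindTexture(GL_TEXTURE_2D, 5)",
    "DEBUG_MARKER_BEGIN_texImageImpl",
    "glBindFramebuffer(GL_FRAMEBUFFER, 3)",
    "glReadPixels(0, 0, 192, 192, GL_RGBA, GL_UNSIGNED_BYTE)",
    "glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 192, 192, GL_RGBA, GL_UNSIGNED_BYTE)",
    "DEBUG_MARKER_END_texImageImpl",
    "glFlush()",
]

sections = extract_marked_sections(log)
interesting = calls_of_interest(sections[0])
print(len(sections), interesting)
```

For a real log, `lines` could be an open file object, so the whole file never needs to be held in memory.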
Cc: bsalomon@chromium.org fmalita@chromium.org
Components: Internals>Skia
Labels: -Pri-3 Pri-2
bsalomon, fmalita: I've been investigating this bug, but I'm having a lot of trouble navigating the Skia code path that this goes through. There's a glReadPixels somewhere at the bottom, which I'd like to experiment with, but I have no idea where it is.

Please see the attached --enable-gpu-service-logging output (from a debug build), with annotations explaining the significance of particular calls (665656_gl_trace_annotated.txt). Also attached is a tiny patch which I used for debugging (665656_debug_cmds.diff).

As you can see above, we were suspecting a rendering loop (to/from the same texture), but we can't find any evidence of this in the trace. This bug is only reproducible on Mac Intel so we suspect either a driver bug or some unspecified behavior on our part.

Also potentially useful:

> WebGL GPU-CPU readback from 2D canvas (broken):
> https://cs.chromium.org/chromium/src/third_party/WebKit/Source/modules/webgl/WebGLRenderingContextBase.cpp?sq=package:chromium&dr=Ss&rcl=1480940406&l=5037
> https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/graphics/gpu/WebGLImageConversion.cpp?sq=package:chromium&dr=Ss&rcl=1480954684&l=2742
> 
> toDataURL/getImageData (working and causes a subsequent WebGL GPU-CPU readback to pass(!)):
> https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/html/HTMLCanvasElement.cpp?sq=package:chromium&dr=CSs&rcl=1480940406&l=629

Thank you!
665656_debug_cmds.diff (1.8 KB)
665656_gl_trace_annotated.txt (963 KB)
kainino@, GrGLGpu::onReadPixels in third_party/skia/src/gpu/gl/GrGLGpu.cpp is the only place Skia would call glReadPixels. The call site makes the call through a macro, GL_CALL(ReadPixels(....)), which may hide it from searches.

bsalomon, thanks, this is helpful. I've determined that both the broken and working cases are bottoming out similarly - with a texture-to-framebuffer blit and ReadPixels. I haven't spotted any suspicious differences in the service side logs. However, as far as I can tell they are both reaching this point through very different code paths in Skia. Do you know of any significant differences between these two paths?

*@intel: I'll finally be filing a Radar on this; I'll post a link when it's out.
yang.gu, *@intel: Filed as Radar 29563996

Comment 27 by yang...@intel.com, Dec 8 2016

kainino@, thanks a lot! I already communicated this to our MacOS GPU Driver team. 
kainino@, I'm not too familiar with this Blink code that calls Skia. A thing that could be different about the two paths is that Skia will sometimes do an intermediate draw to a temporary surface before reading back the pixels in order to perform some kind of conversion (y-flip, unpremultiply, or red/blue swap). It could be that the intermediate draw either avoids or introduces the issue. This occurs in GrContext::readSurfacePixels. Also, the texture used as the intermediate is often larger than the original texture, so in that case it is a partial glReadPixels instead of a full glReadPixels which may or may not be relevant. Another possible difference could be the use of a non-tight "row bytes" parameter to Skia's glReadPixels which controls the number of bytes between consecutive rows in the output buffer.
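
To make the "row bytes" point concrete, here is a small sketch of what a non-tight stride means for a CPU-side pixel buffer. This is not Skia's code; the function name and the toy 2x2 buffer are hypothetical. It only shows how a reader that assumes tightly packed rows picks up padding bytes when the producer used a larger stride.

```python
from typing import List


def read_rows(buffer: bytes, width: int, height: int,
              row_bytes: int, bytes_per_pixel: int = 4) -> List[bytes]:
    """Slice a pixel buffer into rows, honoring the stride between rows."""
    tight = width * bytes_per_pixel
    if row_bytes < tight:
        raise ValueError("row_bytes must be at least width * bytes_per_pixel")
    return [buffer[y * row_bytes: y * row_bytes + tight] for y in range(height)]


# Toy 2x2 RGBA image written with 4 bytes of padding (0xEE) after each row,
# so the stride is 12 bytes while a tight row would be 8 bytes.
row0 = bytes(range(1, 9))
row1 = bytes(range(9, 17))
padding = b"\xee" * 4
buffer = row0 + padding + row1 + padding

correct = read_rows(buffer, width=2, height=2, row_bytes=12)
wrong = read_rows(buffer, width=2, height=2, row_bytes=8)  # assumes tight rows

print(correct[1] == row1)  # stride respected
print(wrong[1] == row1)    # padding leaks into the second row
```

A mismatch like this shifts every row after the first, which is a different failure shape from the scattered bad patches described earlier in this thread, so the stride is only one of several candidate differences between the two paths.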

Comment 29 by zmo@chromium.org, Dec 8 2016

The difference bsalomon mentioned is consistent with the workaround https://codereview.chromium.org/2547013002/
Status: Fixed (was: Started)
Fixed by macOS 10.12.4 beta 2 (16E154a).
Awesome, thanks! I'm updating now and I'll verify.
Project Member

Comment 32 by bugdroid1@chromium.org, Jun 8 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1733d120eb79e2d8d49b18892e60ee1297f646f7

commit 1733d120eb79e2d8d49b18892e60ee1297f646f7
Author: zmo <zmo@chromium.org>
Date: Thu Jun 08 23:24:48 2017

Update WebGL2 conformance test expectations for Mac bots.

BUG= 598930 , 617290 , 618464 ,630800, 641149 , 643866 , 645298 , 646182 , 654187 , 663188 ,665197, 665656 , 676848 , 679682 , 679684 , 679686 , 679687 , 679689 , 679690 , 679691 , 680278 , 684903 
TEST=mac bots on GPU FYI waterfall
TBR=kbr@chromium.org,kainino@chromium.org
NOTRY=true
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2931993002
Cr-Commit-Position: refs/heads/master@{#478121}

[modify] https://crrev.com/1733d120eb79e2d8d49b18892e60ee1297f646f7/content/test/gpu/gpu_tests/webgl2_conformance_expectations.py
