Draw calls using depth testing fail on intermittent frames with Intel Iris Graphics 6100
Reported by
lau...@mapbox.com,
Nov 11 2017
Issue description

UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.89 Safari/537.36

Steps to reproduce the problem:
1. Load a Mapbox GL map on a machine with an Intel Iris Graphics 6100 GPU.
2. Force the map to render new frames (either by interacting with it or by forcing it into an infinite render loop).

What is the expected behavior?

What went wrong?
Occasionally a frame is rendered incorrectly: in these frames, the only draw calls that are rendered have depth testing disabled. Any draw calls with depth testing enabled don't render anything for that single frame.

I've also confirmed that all calls to the WebGL context are identical between consecutive frames that render correctly and incorrectly (and in fact this can be easily reproduced on maps that are not animating in any way, just rerendering in a loop). I can reproduce this on a very simple map with only one layer, although it happens more frequently with more complicated maps.

Here's a simple map with just one layer on an infinite loop, which sporadically flickers on machines with affected GPUs:
https://bl.ocks.org/anonymous/raw/1543d9a7297b8f966fd57fb482f98fbe/

Here's a more complex map that flickers more frequently (zoom/pan to reproduce):
https://api.mapbox.com/styles/v1/mapbox/cj44mfrt20f082snokim4ungi.html?title=true&access_token=pk.eyJ1IjoibGJ1ZCIsImEiOiJjajI3dWE3d2kwMWRqMndvNXZ3ZDA4dTA5In0.KGCx34NWkiRrj3K1-GUTyA#3.96/21.45/-83.04

The most severe example I've seen is in this customer application (zoom/pan to reproduce):
https://hivemapper.com/36.60468504172913/-119.85026876195184/zoom7.6/0/0

This was originally reported at https://github.com/mapbox/mapbox-gl-js/issues/5490. According to our users it sounds like this behavior originated in Chrome 62, and we've only heard of it affecting Intel Iris Graphics 6100 chips so far.

I have tried and thus far failed to reduce this to a pure WebGL test case, but I'm hoping you all might have some hunches as to what might be causing this (particularly if it's something that changed between v61 and 62) and I'd be happy to keep digging.

I'm attaching two videos: one of the flicker on a simple style, and one on a more complex style. In the second video, if you step through the frames you can see that all of the labels remain during the flicker; they are the only layer that does not use depth testing in this map.

Did this work before? Yes v61 (I think)
Does this work in other browsers? Yes
Chrome version: 62.0.3202.89 Channel: stable
OS Version: OS X 10.12.3
Flash Version:
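For readers who want to try step 2 without interacting with the map, here is a minimal sketch of one way to force continuous rendering. It assumes `map` is a Mapbox GL JS `Map` instance and relies on its `triggerRepaint()` method; the helper name `forceContinuousRender` and the `schedule` parameter are hypothetical and are not taken from the linked examples.

```javascript
// Hypothetical helper (not from the linked examples): keeps a map rendering
// new frames continuously. `map` is assumed to be a Mapbox GL JS Map;
// `schedule` defaults to requestAnimationFrame and can be swapped out.
function forceContinuousRender(map, schedule = (cb) => requestAnimationFrame(cb)) {
  let running = true;

  function tick() {
    if (!running) return;
    map.triggerRepaint(); // ask the map to render one more frame
    schedule(tick);       // and queue the next request
  }

  tick();
  // Call the returned function to stop the loop.
  return () => { running = false; };
}
```

Calling `const stop = forceContinuousRender(map);` after the map has loaded should keep it rerendering until `stop()` is called.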
,
Nov 11 2017
The flickering is reproducible by zooming in and out of https://hivemapper.com/36.60468504172913/-119.85026876195184/zoom7.6/0/0 on a MacBook Air with Intel HD 6000 GPU. See about:gpu below.

If it's a regression in Chrome 62 then it might be related to the fix for Issue 769488, though an immediate hack to insert a glFlush() just before giving the WebGL back buffer to the compositor didn't work around the problem. Wasn't reproducible on an NVIDIA GPU on 10.12.6.

Yunchao, can you reproduce this in house? Does it appear to be GPU specific?

Submitter, is there any way you can reduce this test case? Even just a simple WebGL page that draws a bunch of lines or something similar to your rendering path could make this easier to isolate.

Graphics Feature Status
Canvas: Hardware accelerated
CheckerImaging: Disabled
Flash: Hardware accelerated
Flash Stage3D: Hardware accelerated
Flash Stage3D Baseline profile: Hardware accelerated
Compositing: Hardware accelerated
Multiple Raster Threads: Enabled
Native GpuMemoryBuffers: Hardware accelerated
Rasterization: Hardware accelerated
Video Decode: Hardware accelerated
Video Encode: Hardware accelerated
WebGL: Hardware accelerated
WebGL2: Hardware accelerated

Driver Bug Workarounds
add_and_true_to_loop_condition
adjust_src_dst_region_for_blitframebuffer
avoid_stencil_buffers
decode_encode_srgb_for_generatemipmap
disable_framebuffer_cmaa
disable_webgl_rgb_multisampling_usage
emulate_abs_int_function
get_frag_data_info_bug
init_two_cube_map_levels_before_copyteximage
msaa_is_slow
pack_parameters_workaround_with_pack_buffer
rebind_transform_feedback_before_resume
regenerate_struct_names
remove_invariant_and_centroid_for_essl3
reset_teximage2d_base_level
rewrite_texelfetchoffset_to_texelfetch
scalarize_vec_and_mat_constructor_args
set_zero_level_before_generating_mipmap
unfold_short_circuit_as_ternary_operation
unpack_alignment_workaround_with_unpack_buffer
unpack_image_height_workaround_with_unpack_buffer
use_intermediary_for_copy_texture_image
use_unused_standard_shared_blocks

Problems Detected
Unfold short circuit on Mac OS X: 307751
  Applied Workarounds: unfold_short_circuit_as_ternary_operation
Always rewrite vec/mat constructors to be consistent: 398694
  Applied Workarounds: scalarize_vec_and_mat_constructor_args
Mac drivers handle struct scopes incorrectly: 403957
  Applied Workarounds: regenerate_struct_names
On Intel GPUs MSAA performance is not acceptable for GPU rasterization: 527565
  Applied Workarounds: msaa_is_slow
glGenerateMipmap fails if the zero texture level is not set on some Mac drivers: 560499
  Applied Workarounds: set_zero_level_before_generating_mipmap
Pack parameters work incorrectly with pack buffer bound: 563714
  Applied Workarounds: pack_parameters_workaround_with_pack_buffer
Alignment works incorrectly with unpack buffer bound: 563714
  Applied Workarounds: unpack_alignment_workaround_with_unpack_buffer
copyTexImage2D fails when reading from IOSurface on multiple GPU types.: 581777
  Applied Workarounds: use_intermediary_for_copy_texture_image
Multisample renderbuffers with format GL_RGB8 have performance issues on Intel GPUs.: 607130
  Applied Workarounds: disable_webgl_rgb_multisampling_usage
Limited enabling of Chromium GL_INTEL_framebuffer_CMAA: 535198
  Applied Workarounds: disable_framebuffer_cmaa
glGetFragData{Location|Index} works incorrectly on Max: 638340
  Applied Workarounds: get_frag_data_info_bug
glResumeTransformFeedback works incorrectly on Intel GPUs: 638514
  Applied Workarounds: rebind_transform_feedback_before_resume
Result of abs(i) where i is an integer in vertex shader is wrong: 642227
  Applied Workarounds: emulate_abs_int_function
Rewrite texelFetchOffset to texelFetch for Intel Mac: 642605
  Applied Workarounds: rewrite_texelfetchoffset_to_texelfetch
Rewrite condition in for and while loops for Intel Mac: 644669
  Applied Workarounds: add_and_true_to_loop_condition
Decode and encode before generateMipmap for srgb format textures on macosx: 634519
  Applied Workarounds: decode_encode_srgb_for_generatemipmap
Init first two levels before CopyTexImage2D for cube map texture on Intel Mac 10.12: 648197
  Applied Workarounds: init_two_cube_map_levels_before_copyteximage
Insert statements to reference all members in unused std140/shared blocks on Mac: 618464
  Applied Workarounds: use_unused_standard_shared_blocks
Tex(Sub)Image3D performs incorrectly when uploading from unpack buffer with GL_UNPACK_IMAGE_HEIGHT greater than zero on Intel Macs: 654258
  Applied Workarounds: unpack_image_height_workaround_with_unpack_buffer
adjust src/dst region if blitting pixels outside read framebuffer on Mac: 644740
  Applied Workarounds: adjust_src_dst_region_for_blitframebuffer
Mac driver GL 4.1 requires invariant and centroid to match between shaders: 639760, 641129
  Applied Workarounds: remove_invariant_and_centroid_for_essl3
Disable KHR_blend_equation_advanced until cc shaders are updated: 661715
  Applied Workarounds: disable(GL_KHR_blend_equation_advanced), disable(GL_KHR_blend_equation_advanced_coherent)
Certain Apple devices leak stencil buffers: 713854
  Applied Workarounds: avoid_stencil_buffers
On Intel GPUs MSAA performance is not acceptable for GPU rasterization: 527565, 751919
  Applied Workarounds: msaa_is_slow
Reset TexImage2D base level to 0 on Intel Mac 10.12.4: 705865
  Applied Workarounds: reset_teximage2d_base_level
Checker-imaging has been disabled via finch trial or the command line.
  Disabled Features: checker_imaging

Version Information
Data exported: 11/10/2017, 6:00:42 PM
Chrome version: Chrome/62.0.3202.89
Operating system: Mac OS X 10.12.6
Software rendering list version: 13.13
Driver bug list version: 10.30
ANGLE commit id: 842c43ae67ba
2D graphics backend: Skia/62 e74b41c6c84638d5a9ee6d254a715bcd9e17c603-
Command Line: /Applications/Google Chrome.app/Contents/MacOS/Google Chrome --flag-switches-begin --flag-switches-end

Driver Information
Initialization time: 47
In-process GPU: false
Passthrough Command Decoder: false
Supports overlays: false
Sandboxed: true
GPU0: VENDOR = 0x8086, DEVICE= 0x1626 *ACTIVE*
Optimus: false
Optimus: false
AMD switchable: false
Driver vendor:
Driver version: 10.25.17
Driver date:
Pixel shader version: 4.10
Vertex shader version: 4.10
Max. MSAA samples: 8
Machine model name: MacBookAir
Machine model version: 7.2
GL_VENDOR: Intel Inc.
GL_RENDERER: Intel(R) HD Graphics 6000
GL_VERSION: 4.1 INTEL-10.25.17
GL_EXTENSIONS: GL_ARB_blend_func_extended GL_ARB_draw_buffers_blend GL_ARB_draw_indirect GL_ARB_ES2_compatibility GL_ARB_explicit_attrib_location GL_ARB_gpu_shader_fp64 GL_ARB_gpu_shader5 GL_ARB_instanced_arrays GL_ARB_internalformat_query GL_ARB_occlusion_query2 GL_ARB_sample_shading GL_ARB_sampler_objects GL_ARB_separate_shader_objects GL_ARB_shader_bit_encoding GL_ARB_shader_subroutine GL_ARB_shading_language_include GL_ARB_tessellation_shader GL_ARB_texture_buffer_object_rgb32 GL_ARB_texture_cube_map_array GL_ARB_texture_gather GL_ARB_texture_query_lod GL_ARB_texture_rgb10_a2ui GL_ARB_texture_storage GL_ARB_texture_swizzle GL_ARB_timer_query GL_ARB_transform_feedback2 GL_ARB_transform_feedback3 GL_ARB_vertex_attrib_64bit GL_ARB_vertex_type_2_10_10_10_rev GL_ARB_viewport_array GL_EXT_debug_label GL_EXT_debug_marker GL_EXT_framebuffer_multisample_blit_scaled GL_EXT_texture_compression_s3tc GL_EXT_texture_filter_anisotropic GL_EXT_texture_sRGB_decode GL_APPLE_client_storage GL_APPLE_container_object_shareable GL_APPLE_flush_render GL_APPLE_object_purgeable GL_APPLE_rgb_422 GL_APPLE_row_bytes GL_APPLE_texture_range GL_ATI_texture_mirror_once GL_NV_texture_barrier
Disabled Extensions: GL_KHR_blend_equation_advanced GL_KHR_blend_equation_advanced_coherent
Window system binding vendor:
Window system binding version:
Window system binding extensions:
Direct rendering: Yes
Reset notification strategy: 0x0000
GPU process crash count: 0

Compositor Information
Tile Update Mode: Zero-copy
Partial Raster: Enabled

GpuMemoryBuffers Status
ATC: Software only
ATCIA: Software only
DXT1: Software only
DXT5: Software only
ETC1: Software only
R_8: GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
R_16: Software only
RG_88: Software only
BGR_565: Software only
RGBA_4444: Software only
RGBX_8888: Software only
RGBA_8888: GPU_READ, SCANOUT, SCANOUT_CPU_READ_WRITE
BGRX_8888: GPU_READ, SCANOUT, SCANOUT_CPU_READ_WRITE
BGRA_8888: GPU_READ, SCANOUT, SCANOUT_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
RGBA_F16: GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
YVU_420: Software only
YUV_420_BIPLANAR: GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
UYVY_422: GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
,
Nov 14 2017
kbr, I still haven't managed to replay our rendering path in such a way as to reproduce this without our library, but I'm still trying. In the meantime, here are two additional notes in case they surface any hunches:

* In our 3D render path (not used in any of the examples provided above), we render to a texture attached to an offscreen framebuffer, and read and write depth from a depth renderbuffer attached to that framebuffer. We later copy that texture back to the main framebuffer with depth testing disabled. These layers are successfully (and consistently) rendered when the map flickers, so apparently depth testing is working at some points during a frame (specifically, using an offscreen renderbuffer), and the depth test failures are only affecting the depth buffer attached to the main framebuffer.
* I've reproduced this bug many times in testing this past week and 99% of the time the effect is as described above (only fragments from draw calls with depth testing disabled are successfully rendered). *However*, exactly one time I produced a frame where additional fragments render:
  - draw calls with depth testing disabled still render, as in all other cases of this bug
  - *additionally*, fragments from draw calls with depth testing enabled that test against a value explicitly written to the depth buffer in a previous draw call render
  - fragments from those same draw calls that depth test against what should be a cleared depth buffer (we clear the depth buffer to 1 at the beginning of every frame) fail to render

Like I said I've only caught this exact effect happening once in all of my testing, but hopefully that might provide some additional clues. I'm attaching the screen capture of this incident: in an early frame you can see the effect I've just described, and in a frame later in the capture it goes back to the effect I've seen 99% of the time. I'm also attaching static screen grabs of those frames (capture_normal is expected, capture_bug is the bug as I usually experience it, capture_bug_anomaly is this instance of additional fragments rendering).
,
Nov 14 2017
To clarify, in the anomaly screen grab: the blue (water) layer is rendered very early in the frame and writes to the depth buffer, and is the only draw call in this frame that does so. Also, clearing the depth buffer to a non-1 value does not resolve the problem.
,
Nov 16 2017
I still haven't been able to create a reduced test case (sorry), but I've been stripping down various parts of our library and it looks like this only happens when we clear the stencil buffer (which we clear usually a few times per frame, but sometimes many more — I think that Hivemapper map is so severe because we clear the stencil buffer many times; in a sample capture it looks like 44 clears). If we eliminate stencil buffer clearing the flickering is eliminated completely. I don't know too much about implementations but I do know that a lot of WebGL implementations use a shared depth/stencil buffer, so I'm assuming this must be related to that?
,
Nov 23 2017
Thanks for the continued investigation and sorry for lack of communication. We have been overwhelmed this past week with incoming issues. I think you're on to something and unfortunately I'm pretty sure that you're running into a bug in the graphics driver and not something in the WebGL implementation itself. Can you please continue to work on developing some sort of stress test that reproduces the flickering more reliably? Now that you know it's somehow related to clearing just the stencil buffer when both depth and stencil are present, is it more feasible for you to stress those operations in your framework?
,
Nov 28 2017
We mitigated this in the Mapbox library by manually "clearing" the stencil buffer by drawing a full screen of zero values (https://github.com/mapbox/mapbox-gl-js/pull/5704), which has eliminated the flashing completely. I'll definitely continue to try to reduce a test case (coarsely throwing a lot of extra stencil clears + draws with depth reading enabled at it has not yielded results yet) but won't be able to prioritize it as highly now. Thanks for your ongoing support!
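The idea of "clearing" the stencil buffer with a draw call instead of `gl.clear(gl.STENCIL_BUFFER_BIT)` can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the linked pull request: the function name `clearStencilByDrawing` and the `quad` object (an already-linked program, a buffer holding four clip-space corners, and the position attribute location) are hypothetical.

```javascript
// Sketch only; not the code from the Mapbox pull request.
// `gl` is a WebGLRenderingContext created with a stencil buffer.
// `quad` is a hypothetical object: { program, buffer, positionLocation },
// where `buffer` holds the corners (-1,-1), (1,-1), (-1,1), (1,1) as floats.
function clearStencilByDrawing(gl, quad) {
  // Touch only the stencil buffer: no depth test or writes, no color writes.
  gl.disable(gl.DEPTH_TEST);
  gl.depthMask(false);
  gl.colorMask(false, false, false, false);

  // Every fragment passes the stencil test and replaces the stored value with 0.
  gl.enable(gl.STENCIL_TEST);
  gl.stencilMask(0xff);
  gl.stencilFunc(gl.ALWAYS, 0, 0xff);
  gl.stencilOp(gl.REPLACE, gl.REPLACE, gl.REPLACE);

  // Draw a quad that covers the whole viewport.
  gl.useProgram(quad.program);
  gl.bindBuffer(gl.ARRAY_BUFFER, quad.buffer);
  gl.enableVertexAttribArray(quad.positionLocation);
  gl.vertexAttribPointer(quad.positionLocation, 2, gl.FLOAT, false, 0, 0);
  gl.drawArrays(gl.TRIANGLE_STRIP, 0, 4);

  // Re-enable color writes; the caller restores any other state it needs.
  gl.colorMask(true, true, true, true);
}
```

The sketch leaves depth, stencil, and program state changed on return, so a real renderer would restore whatever its next draw call expects. Scissor, culling, and viewport settings also affect which pixels the quad covers.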
,
Dec 1 2017
,
Dec 1 2017
Xinghua, please take a look at this issue.
,
Jan 23 2018
Sorry I haven't had time to revisit this for a while. I've created a reduced test case that I'm attaching here. To reproduce on a machine with an affected GPU, view it in a window small enough that the example overflows vertically, and rapidly scroll up and down (I've only tested this with a trackpad).

The background is cleared to gray, and there are two triangles rendered: the black triangle uses depth testing and intermittently fails; the red triangle does not depth test and always renders. (It's currently rendering at 50ms intervals; you can change it to be faster to reproduce more frequently or slower to produce the effect more obviously but less frequently.)

Some notes:
* I tried programmatically scrolling to random Y positions, and have been unable to reproduce this effect in doing so; in this example it only reproduces with user scroll. (In the Mapbox GL library however this could be seen with no user interaction at all, just with a continuous render loop. Not sure if that's a factor of complexity maybe?)
* When we noticed this in our library code the effect was more pronounced the more times we cleared the stencil buffer. I'm clearing the stencil 5x here, an arbitrary number but one that seems to produce the bug fairly consistently.
* I'm not sure whether it's exacerbated by compositing. I've added a border-radius to the canvas; it's reproducible without this but it seems to happen more frequently with it.

Hopefully you're able to reproduce using this example. If not, let me know and I can try adding back some more GL calls from a trace that may exacerbate it further.
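For readers without access to the attachment, one frame of a test case like the one described above might look roughly like this. It is a sketch under assumptions, not the attached file: the function name `renderReproFrame`, the two draw callbacks, the depth function, and the order of the clears relative to the draws are all guesses.

```javascript
// Sketch of one frame of the repro as described in this comment; this is
// not the attached test case. `drawBlackTriangle` and `drawRedTriangle`
// are hypothetical callbacks that bind their own program and buffers and
// issue the draw call.
function renderReproFrame(gl, drawBlackTriangle, drawRedTriangle, stencilClears = 5) {
  // Clear the background to gray and the depth buffer to 1.
  gl.clearColor(0.5, 0.5, 0.5, 1);
  gl.clearDepth(1);
  gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);

  // Clear only the stencil buffer several times (5 is the arbitrary count
  // mentioned in the comment above).
  for (let i = 0; i < stencilClears; i++) {
    gl.clear(gl.STENCIL_BUFFER_BIT);
  }

  // Black triangle: depth tested; this is the draw that intermittently fails.
  gl.enable(gl.DEPTH_TEST);
  gl.depthFunc(gl.LEQUAL);
  drawBlackTriangle(gl);

  // Red triangle: no depth test; this draw always renders.
  gl.disable(gl.DEPTH_TEST);
  drawRedTriangle(gl);
}
```

The context would need both attachments, for example `canvas.getContext('webgl', { depth: true, stencil: true })`, and the frame could be driven with `setInterval(() => renderReproFrame(gl, drawBlack, drawRed), 50)` to match the 50ms interval mentioned above.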
,
Jan 23 2018
Also,

> If it's a regression in Chrome 62 then it might be related to the fix for Issue 769488 though an immediate hack to insert a glFlush() just before giving the WebGL back buffer to the compositor didn't work around the problem.

I downloaded the snapshots surrounding that commit (http://commondatastorage.googleapis.com/chromium-browser-snapshots/index.html?prefix=Mac/505146/, http://commondatastorage.googleapis.com/chromium-browser-snapshots/index.html?prefix=Mac/505159/) and it does seem like the flickering was introduced there.
,
Jan 23 2018
Assigning kbr for now based on #11.
,
Sep 10
We're having this issue with Cesium. https://groups.google.com/forum/#!searchin/cesium-dev/murray%7Csort:date/cesium-dev/DJIM7QX8k8E/EpjR8YcMBwAJ
,
Sep 10
Thanks for the report. Xinghua, would you still be able to look at this?
,
Sep 10
Sorry I dropped this on the floor. Too many incoming issues.

The test case from https://bugs.chromium.org/p/chromium/issues/detail?id=784030#c10 doesn't flicker when scrolled vertically in 71.0.3548.0 (Official Build) canary (64-bit), but the test case from that Cesium thread:
https://bl.ocks.org/anonymous/raw/1543d9a7297b8f966fd57fb482f98fbe/#6.91/33.458/-117.9
does flicker badly on this hardware.

It turns out that the per-revision bisect result is the same as for Issue 882317.

You are probably looking for a change made after 580212 (known good), but no later than 580213 (first known bad).
CHANGELOG URL:
The script might not always return single CL as suspect as some perf builds might get missing due to failure.
https://chromium.googlesource.com/chromium/src/+log/cf64e143fc871fae4bc0522241294d34b832c33f..989f7a704c225521f08b2a0d7f64c619ba8aea29

So basically, it looks like if Chrome does mid-frame flushes on this hardware, the GPU renders things incorrectly. We're going to be eliminating those in Issue 882317, but in the meantime, Yang, could your team or Intel's Mac OpenGL driver team please try to see what's going wrong with mid-frame flushes and context switches on the Intel HD 6000 family of GPUs?
,
Sep 11
Thanks for the prompt reply guys, much appreciated
,
Sep 11
,
Oct 3
Hi,

Like many of you I am seeing this issue. It is clearly reproducible with the simple app link in the original thread. I must say, it did not happen over the summer so there has been a regression recently.

Model Name: MacBook Pro
Model Identifier: MacBookPro12,1
Processor Name: Intel Core i7
Processor Speed: 3.1 GHz
Number of Processors: 1
Total Number of Cores: 2
L2 Cache (per Core): 256 KB
L3 Cache: 4 MB
Memory: 16 GB

Intel Iris Graphics 6100:
Chipset Model: Intel Iris Graphics 6100
Type: GPU
Bus: Built-In
VRAM (Dynamic, Max): 1536 MB
Vendor: Intel (0x8086)
Device ID: 0x162b
Revision ID: 0x0009
Metal: Supported

Chrome Version 69.0.3497.100 (Official Build) (64-bit)
Safari Version 12.0 (12606.2.11) does not have this issue

Thank You
,
Oct 10
The flickering is reproducible with https://bl.ocks.org/anonymous/raw/1543d9a7297b8f966fd57fb482f98fbe/, by moving the mouse sprite across the address bar back and forth, on my MacBook Air with Intel HD 6000 GPU. I believe it is most likely a driver bug. Ken, I have drafted a radar for this: https://docs.google.com/document/d/1PQIECHWqBc78PAbz8Ujug4z-HLxJGAgsb_qvrM1vr4k/edit.
,
Oct 13
Thanks Jie for creating the Radar template and Lauren and others for the reduced test cases. I attempted to create an automated test based on triangle.html, but wasn't able to, even by mimicking the browser's scrolling behavior. Radar 45242420 has been filed about this, and I've notified colleagues at Apple about it. Hopefully it'll be routed to Intel quickly for investigation. Marking this ExternalDependency as there's no feasible workaround in the browser for this issue.
Comment 1 by kbr@chromium.org
, Nov 11 2017