Accelerated video decode causes GPU process to freeze for 500+ ms
Reported by
nki...@gmail.com,
Aug 26
Issue description

UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3528.4 Safari/537.36

Example URL: imgur.com

Steps to reproduce the problem:
1. Go to imgur.com, and select a post.
2. Move from post to post by using the arrow keys.
3. Note that for video posts (gifv/webm), the entire browser UI freezes for approx 500ms to 1 sec.

What is the expected behavior?
Videos should play immediately, and the browser UI should not freeze.

What went wrong?
Videos take a second or more to start playing, and the browser UI freezes momentarily. Scrolling the page to reveal more gifv/webm embeds causes more freezing.

Did this work before? N/A
Is it a problem with Flash or HTML5? HTML5
Does this work in other browsers? Yes

Chrome version: 70.0.3532.5 Channel: canary
OS Version: 10.0
Flash Version:

Contents of chrome://gpu:

Graphics Feature Status
Canvas: Hardware accelerated
Flash: Hardware accelerated
Flash Stage3D: Hardware accelerated
Flash Stage3D Baseline profile: Hardware accelerated
Compositing: Hardware accelerated
Multiple Raster Threads: Enabled
Native GpuMemoryBuffers: Software only. 
Hardware acceleration disabled Out-of-process Rasterization: Disabled Hardware Protected Video Decode: Unavailable Rasterization: Hardware accelerated Skia Deferred Display List: Disabled Skia Renderer: Disabled Surface Synchronization: Enabled Video Decode: Hardware accelerated Viz Service Display Compositor: Enabled WebGL: Hardware accelerated WebGL2: Hardware accelerated Driver Bug Workarounds clear_uniforms_before_first_program_use decode_encode_srgb_for_generatemipmap disable_delayed_copy_nv12 disable_discard_framebuffer disable_framebuffer_cmaa exit_on_context_lost force_cube_complete scalarize_vec_and_mat_constructor_args disabled_extension_GL_KHR_blend_equation_advanced disabled_extension_GL_KHR_blend_equation_advanced_coherent Problems Detected Protected video decoding with swap chain is for Windows and Intel only Disabled Features: protected_video_decode Some drivers are unable to reset the D3D device in the GPU process sandbox Applied Workarounds: exit_on_context_lost Clear uniforms before first program use on all platforms: 124764, 349137 Applied Workarounds: clear_uniforms_before_first_program_use Always rewrite vec/mat constructors to be consistent: 398694 Applied Workarounds: scalarize_vec_and_mat_constructor_args ANGLE crash on glReadPixels from incomplete cube map texture: 518889 Applied Workarounds: force_cube_complete Framebuffer discarding can hurt performance on non-tilers: 570897 Applied Workarounds: disable_discard_framebuffer Use GL_INTEL_framebuffer_CMAA on ChromeOS: 535198 Applied Workarounds: disable_framebuffer_cmaa Disable KHR_blend_equation_advanced until cc shaders are updated: 661715 Applied Workarounds: disable(GL_KHR_blend_equation_advanced), disable(GL_KHR_blend_equation_advanced_coherent) Decode and Encode before generateMipmap for srgb format textures on Windows: 634519 Applied Workarounds: decode_encode_srgb_for_generatemipmap Delayed copy NV12 displays incorrect colors on NVIDIA drivers.: 728670 Applied Workarounds: 
disable_delayed_copy_nv12 Native GpuMemoryBuffers have been disabled, either via about:flags or command line. Disabled Features: native_gpu_memory_buffers Skia renderer is not used by default. Disabled Features: skia_renderer Skia deferred display list is not used by default. Disabled Features: skia_deferred_display_list Version Information Data exported 2018-08-26T02:41:06.311Z Chrome version Chrome/70.0.3532.5 Operating system Windows NT 10.0.17134 Software rendering list URL https://chromium.googlesource.com/chromium/src/+/308c5038e0e10caa748fd5a11c26c40707263494/gpu/config/software_rendering_list.json Driver bug list URL https://chromium.googlesource.com/chromium/src/+/308c5038e0e10caa748fd5a11c26c40707263494/gpu/config/gpu_driver_bug_list.json ANGLE commit id c40974417610 2D graphics backend Skia/70 4c6514490e966198af427ec4df050470e55653a8- Command Line "C:\Users\Nathaniel\AppData\Local\Google\Chrome SxS\Application\chrome.exe" --flag-switches-begin --flag-switches-end Driver Information Initialization time 694 In-process GPU false Passthrough Command Decoder false Sandboxed true GPU0 VENDOR = 0x10de [Google Inc.], DEVICE= 0x1b06 [ANGLE (NVIDIA GeForce GTX 1080 Ti Direct3D11 vs_5_0 ps_5_0)] *ACTIVE* Optimus false AMD switchable false Desktop compositing Aero Glass Direct Composition true Supports overlays false Overlay capabilities Diagonal Monitor Size of \\.\DISPLAY2 27.8" Diagonal Monitor Size of \\.\DISPLAY1 27.0" Driver D3D12 feature level D3D 12.1 Driver Vulkan API version Vulkan API 1.1.0 Driver vendor NVIDIA Driver version 398.82 Driver date 7-30-2018 GPU CUDA compute capability major version 0 Pixel shader version 5.0 Vertex shader version 5.0 Max. MSAA samples 8 Machine model name Machine model version GL_VENDOR Google Inc. 
GL_RENDERER ANGLE (NVIDIA GeForce GTX 1080 Ti Direct3D11 vs_5_0 ps_5_0) GL_VERSION OpenGL ES 2.0 (ANGLE 2.1.0.c40974417610) GL_EXTENSIONS GL_ANGLE_client_arrays GL_ANGLE_depth_texture GL_ANGLE_explicit_context GL_ANGLE_explicit_context_gles1 GL_ANGLE_framebuffer_blit GL_ANGLE_framebuffer_multisample GL_ANGLE_instanced_arrays GL_ANGLE_lossy_etc_decode GL_ANGLE_pack_reverse_row_order GL_ANGLE_program_cache_control GL_ANGLE_request_extension GL_ANGLE_robust_client_memory GL_ANGLE_texture_compression_dxt3 GL_ANGLE_texture_compression_dxt5 GL_ANGLE_texture_usage GL_ANGLE_translated_shader_source GL_CHROMIUM_bind_generates_resource GL_CHROMIUM_bind_uniform_location GL_CHROMIUM_color_buffer_float_rgb GL_CHROMIUM_color_buffer_float_rgba GL_CHROMIUM_copy_compressed_texture GL_CHROMIUM_copy_texture GL_CHROMIUM_sync_query GL_EXT_blend_minmax GL_EXT_color_buffer_half_float GL_EXT_debug_marker GL_EXT_discard_framebuffer GL_EXT_disjoint_timer_query GL_EXT_draw_buffers GL_EXT_frag_depth GL_EXT_map_buffer_range GL_EXT_occlusion_query_boolean GL_EXT_read_format_bgra GL_EXT_robustness GL_EXT_sRGB GL_EXT_shader_texture_lod GL_EXT_texture_compression_dxt1 GL_EXT_texture_compression_s3tc_srgb GL_EXT_texture_filter_anisotropic GL_EXT_texture_format_BGRA8888 GL_EXT_texture_rg GL_EXT_texture_storage GL_EXT_unpack_subimage GL_KHR_debug GL_KHR_parallel_shader_compile GL_NV_EGL_stream_consumer_external GL_NV_fence GL_NV_pack_subimage GL_NV_pixel_buffer_object GL_OES_EGL_image GL_OES_EGL_image_external GL_OES_compressed_ETC1_RGB8_texture GL_OES_depth32 GL_OES_element_index_uint GL_OES_get_program_binary GL_OES_mapbuffer GL_OES_packed_depth_stencil GL_OES_rgb8_rgba8 GL_OES_standard_derivatives GL_OES_surfaceless_context GL_OES_texture_float GL_OES_texture_float_linear GL_OES_texture_half_float GL_OES_texture_half_float_linear GL_OES_texture_npot GL_OES_vertex_array_object OES_compressed_EAC_R11_signed_texture OES_compressed_EAC_R11_unsigned_texture OES_compressed_EAC_RG11_signed_texture 
OES_compressed_EAC_RG11_unsigned_texture OES_compressed_ETC2_RGB8_texture OES_compressed_ETC2_RGBA8_texture OES_compressed_ETC2_punchthroughA_RGBA8_texture OES_compressed_ETC2_punchthroughA_sRGB8_alpha_texture OES_compressed_ETC2_sRGB8_alpha8_texture OES_compressed_ETC2_sRGB8_texture Disabled Extensions GL_KHR_blend_equation_advanced GL_KHR_blend_equation_advanced_coherent Disabled WebGL Extensions Window system binding vendor Google Inc. (adapter LUID: 00000000000120fd) Window system binding version 1.4 (ANGLE 2.1.0.c40974417610) Window system binding extensions EGL_EXT_create_context_robustness EGL_ANGLE_d3d_share_handle_client_buffer EGL_ANGLE_d3d_texture_client_buffer EGL_ANGLE_surface_d3d_texture_2d_share_handle EGL_ANGLE_query_surface_pointer EGL_ANGLE_window_fixed_size EGL_ANGLE_keyed_mutex EGL_ANGLE_surface_orientation EGL_ANGLE_direct_composition EGL_NV_post_sub_buffer EGL_KHR_create_context EGL_EXT_device_query EGL_KHR_image EGL_KHR_image_base EGL_KHR_gl_texture_2D_image EGL_KHR_gl_texture_cubemap_image EGL_KHR_gl_renderbuffer_image EGL_KHR_get_all_proc_addresses EGL_KHR_stream EGL_KHR_stream_consumer_gltexture EGL_NV_stream_consumer_gltexture_yuv EGL_ANGLE_flexible_surface_compatibility EGL_ANGLE_stream_producer_d3d_texture EGL_ANGLE_create_context_webgl_compatibility EGL_CHROMIUM_create_context_bind_generates_resource EGL_CHROMIUM_sync_control EGL_EXT_pixel_format_float EGL_KHR_surfaceless_context EGL_ANGLE_display_texture_share_group EGL_ANGLE_create_context_client_arrays EGL_ANGLE_program_cache_control EGL_ANGLE_robust_resource_initialization EGL_ANGLE_create_context_extensions_enabled Direct rendering Yes Reset notification strategy 0x8252 GPU process crash count 0 Compositor Information Tile Update Mode One-copy Partial Raster Enabled GpuMemoryBuffers Status ATC Software only ATCIA Software only DXT1 Software only DXT5 Software only ETC1 Software only R_8 Software only R_16 Software only RG_88 Software only BGR_565 Software only RGBA_4444 Software 
only RGBX_8888 GPU_READ, SCANOUT RGBA_8888 GPU_READ, SCANOUT BGRX_8888 Software only BGRX_1010102 Software only RGBX_1010102 Software only BGRA_8888 Software only RGBA_F16 Software only YVU_420 Software only YUV_420_BIPLANAR Software only UYVY_422 Software only Display(s) Information Info Display[2528732444] bounds=[0,0 2560x1440], workarea=[0,0 2560x1359], scale=1, external. Color space information {primaries:BT709, transfer:IEC61966_2_1, matrix:RGB, range:FULL} Bits per color component 8 Bits per pixel 24 Info Display[2779098405] bounds=[2560,0 2561x1440], workarea=[2560,0 2561x1400], scale=1.5, external. Color space information {primaries:BT709, transfer:IEC61966_2_1, matrix:RGB, range:FULL} Bits per color component 8 Bits per pixel 24 Video Acceleration Information Decode h264 baseline up to 4096x2304 pixels Decode h264 baseline up to 2304x4096 pixels Decode h264 main up to 4096x2304 pixels Decode h264 main up to 2304x4096 pixels Decode h264 high up to 4096x2304 pixels Decode h264 high up to 2304x4096 pixels Decode vp8 up to 7680x4320 pixels Decode vp8 up to 4320x7680 pixels Decode vp9 profile0 up to 7680x4320 pixels Decode vp9 profile0 up to 4320x7680 pixels Decode vp9 profile1 up to 7680x4320 pixels Decode vp9 profile1 up to 4320x7680 pixels Decode vp9 profile2 up to 7680x4320 pixels Decode vp9 profile2 up to 4320x7680 pixels Decode vp9 profile3 up to 7680x4320 pixels Decode vp9 profile3 up to 4320x7680 pixels Encode h264 baseline up to 3840x2176 pixels and/or 30.000 fps Encode h264 main up to 3840x2176 pixels and/or 30.000 fps Encode h264 high up to 3840x2176 pixels and/or 30.000 fps Diagnostics ... loading ... Log Messages GpuProcessHostUIShim: The GPU process exited normally. Everything is okay. I used chrome://tracing to confirm that CrGpuMain freezes for hundreds of milliseconds while executing "DXVAVideoDecodeAccelerator::GetSupportedProfiles" and "GetMaxResolutionsForGUIDs". 
Disabling HW decode (chrome://flags/#disable-accelerated-video-decode) caused videos to play instantly, and the freezing is gone. The issue can be replicated in Chrome Dev 70.0.3528.4 and Chrome Canary 70.0.3532.5, but not in Chrome Stable 68.0.3440.106. I'm running an NVIDIA 1080 Ti with very recent drivers. My monitors are a 2560x1440 with no display scaling and a 3840x2160 with 150% scaling.
,
Aug 26
See attached trace from Chrome Dev. GPU pid is 15264.
,
Aug 26
See attached trace from Chrome Canary. Note trace is compressed with 7zip to stay under 10MB limit. GPU pid is 1936.
,
Aug 26
I'm running a Threadripper 1950X with a 1080 Ti. Further investigation reveals that a friend's computer does not show exactly the same behavior. He is running an FX-8350 and a 1060. He sees shorter delays, with the trace showing a wait time in "GetMaxResolutionsForGUIDs" of approx 150ms.
,
Aug 27
GetSupportedProfiles() is being called each time a VdaVideoDecoder is constructed. This information should be gathered once and cached. Actual run time of GetSupportedProfiles() varies between 3.4ms and 1300ms in the first trace, which is surprising. Followup work may be required to determine why this is sometimes slow on the hardware mentioned.
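To illustrate the "gather once and cache" idea from this comment, here is a minimal sketch. It is not the Chromium code: `SupportedProfile`, `QuerySupportedProfilesSlow`, `GetSupportedProfilesCached`, and the returned values are hypothetical stand-ins, and the slow query is simulated. It relies only on the C++11 guarantee that a function-local static is initialized exactly once, even with concurrent callers.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one decoder capability entry.
struct SupportedProfile {
  std::string codec;
  int max_width;
  int max_height;
};

// Counts how many times the slow query actually runs.
int g_query_count = 0;

// Hypothetical stand-in for the expensive driver query.
// The values returned here are illustrative only.
std::vector<SupportedProfile> QuerySupportedProfilesSlow() {
  ++g_query_count;
  return {{"h264", 4096, 2304}, {"vp9", 7680, 4320}};
}

// Returns the cached profiles. The function-local static is
// initialized on the first call only; later calls reuse it.
const std::vector<SupportedProfile>& GetSupportedProfilesCached() {
  static const std::vector<SupportedProfile> profiles =
      QuerySupportedProfilesSlow();
  return profiles;
}
```

With this shape, constructing many decoders costs one slow query in total; every later call returns a reference to the same vector.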
,
Aug 27
Hmm, pretty surprising that call is taking so long; for a modern GPU it should only be calling into the OS ~4 times per method. I think we can probably skip this step in that method: https://cs.chromium.org/chromium/src/media/gpu/windows/dxva_video_decode_accelerator_win.cc?l=304 We already know the decoder GUIDs, so we can just try to create one instead of scanning them if that's the expensive part -- but that loop should bail quickly... We could also maybe skip the last two steps here: https://cs.chromium.org/chromium/src/media/gpu/windows/dxva_video_decode_accelerator_win.cc?l=281 I.e., skip the config query and decoder creation. Both of these steps were probably added to avoid crashes though, so it's possible that by skipping them we could increase the crash rate on older hardware. Adding a couple of UMA metrics that test how often we fail when an earlier step succeeds might be useful.
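The measurement suggested here (how often a later, expensive step fails after an earlier, cheap step succeeds) can be sketched as below. This is only an illustration under assumptions: `ProbeStats`, `ProbeResolution`, and `DisagreementRate` are hypothetical names, the two checks are passed in as callbacks rather than real DXVA calls, and plain counters stand in for UMA histograms.

```cpp
#include <cassert>
#include <functional>

// Plain counters standing in for histogram samples (hypothetical).
struct ProbeStats {
  int cheap_passed = 0;
  int expensive_failed_after_cheap_passed = 0;
};

// Runs the cheap check first and the expensive check only if the cheap
// one passes, recording how often the expensive check disagrees.
// cheap_check is assumed to model something like a config-count query;
// expensive_check is assumed to model actually creating a decoder.
bool ProbeResolution(const std::function<bool()>& cheap_check,
                     const std::function<bool()>& expensive_check,
                     ProbeStats& stats) {
  if (!cheap_check()) return false;
  ++stats.cheap_passed;
  if (!expensive_check()) {
    ++stats.expensive_failed_after_cheap_passed;
    return false;
  }
  return true;
}

// Fraction of cheap passes that the expensive check then rejected.
double DisagreementRate(const ProbeStats& stats) {
  if (stats.cheap_passed == 0) return 0.0;
  return static_cast<double>(stats.expensive_failed_after_cheap_passed) /
         stats.cheap_passed;
}
```

If the disagreement rate turns out to be very low in the field, the expensive check adds little beyond the cheap one, which is the question the proposed metrics are meant to answer.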
,
Aug 29
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/5713bd26e3e36fd3b7329fee8aaf6b665152df3f

commit 5713bd26e3e36fd3b7329fee8aaf6b665152df3f
Author: Dan Sanders <sandersd@chromium.org>
Date: Wed Aug 29 20:52:01 2018

Cache VDA capabilities.

VdaVideoDecoder requests VDA capabilities at each construction. While it would be possible to plumb GPUInfo to VdaVideoDecoder, the expectation is that this information will not be computed at startup in the future. This CL caches the results in GpuVideoDecodeAcceleratorFactory using a static variable.

Bug: 877803
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: I639febd1230763077824006ee6427ba43c5fa529
Reviewed-on: https://chromium-review.googlesource.com/1192138
Reviewed-by: Hirokazu Honda <hiroh@chromium.org>
Commit-Queue: Dan Sanders <sandersd@chromium.org>
Cr-Commit-Position: refs/heads/master@{#587293}

[modify] https://crrev.com/5713bd26e3e36fd3b7329fee8aaf6b665152df3f/media/gpu/gpu_video_decode_accelerator_factory.cc
,
Aug 30
This may be a stupid question, but are there any cases in which the cache should be invalidated during the life of a single GPU process? The main thing I'm thinking of is if GPU drivers are updated on the fly, or perhaps if Chrome is started inside an RDP session, then that user session is connected to the computer's console instead.
,
Aug 30
The short answer is that this information was already being cached (actually for the entire browser session), but VdaVideoDecoder added a new path without the caching. For the larger question, the GPU process will be restarted for those events if the 'exit_on_context_lost' workaround is enabled. This workaround is always enabled on Windows. Finally, getting the information wrong isn't fatal. Either some time is wasted trying to create hardware decoders when they don't work, or hardware decode isn't used when it should be. If you experience those sorts of issues I do recommend filing a bug; we don't always do a good job of supporting context loss.
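The two non-fatal outcomes described in this comment can be made concrete with a small sketch. It is a simplified model, not Chromium's decoder selection logic: `DecoderKind`, `DecoderChoice`, and `ChooseDecoder` are hypothetical, and the "does hardware actually work" flag stands in for the result of attempting to create a hardware decoder.

```cpp
#include <cassert>

enum class DecoderKind { kHardware, kSoftware };

// Result of one selection: which decoder is used, and whether a
// hardware creation attempt was made and failed along the way.
struct DecoderChoice {
  DecoderKind kind;
  bool wasted_attempt;
};

// cached_says_supported: possibly stale cached capability information.
// hardware_actually_works: what happens if creation is attempted now.
DecoderChoice ChooseDecoder(bool cached_says_supported,
                            bool hardware_actually_works) {
  if (cached_says_supported) {
    if (hardware_actually_works) return {DecoderKind::kHardware, false};
    // Stale "supported": time is wasted, then software takes over.
    return {DecoderKind::kSoftware, true};
  }
  // Stale "unsupported": hardware is never tried, even if it would work.
  return {DecoderKind::kSoftware, false};
}
```

In every combination playback still ends up with some decoder; a stale cache costs either a wasted attempt or a missed hardware path, which matches the "isn't fatal" reasoning above.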
,
Aug 30
Thanks for the explanation! I guess this should hit Canary around 3537 or 3538, and I can re-test then.
,
Aug 30
Tried to reproduce the issue on Windows 10 with the reported version 70.0.3532.5, but was unable to reproduce it using the steps below. 1. Launched Chrome with the flag #disable-accelerated-video-decode enabled in chrome://flags. 2. Navigated to http://imgur.com/ and selected a post. 3. Moved from post to post using the arrow keys; no browser freeze was observed and the videos played immediately. Attached are the screencast and chrome://gpu details for reference. sandersd@, could you please check and help us verify the fix on the latest M-70 build? Thanks.
,
Aug 30
Attaching chrome://gpu details as mentioned in comment #14. Thanks..
,
Aug 30
#13: This CL was included in Canary 70.0.3537.0, released today.
,
Aug 31
I can confirm that Canary 70.0.3537.2 is working much better, with no perceptible difference in playing videos with or without hw acceleration.
,
Sep 5
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/c1f2b0c04480ae536077125eb29cf5c3d299d1dc

commit c1f2b0c04480ae536077125eb29cf5c3d299d1dc
Author: Dale Curtis <dalecurtis@chromium.org>
Date: Tue Sep 04 21:39:00 2018

Add UMA to DXVA decoder to determine if we need expensive checks.

We've seen a few user traces where IsResolutionSupportedForDevice() is quite expensive. Let's add some metrics to see if we actually need to be so thorough or if we can just rely on the accuracy of GetVideoDecoderConfigCount() failing or returning zero configs.

BUG= 877803
TEST=none
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: Ifbfea9134c8c4d8e555f328deaa3af3ae115844d
Reviewed-on: https://chromium-review.googlesource.com/1199626
Commit-Queue: Dale Curtis <dalecurtis@chromium.org>
Reviewed-by: Ilya Sherman <isherman@chromium.org>
Reviewed-by: Chrome Cunningham <chcunningham@chromium.org>
Cr-Commit-Position: refs/heads/master@{#588643}

[modify] https://crrev.com/c1f2b0c04480ae536077125eb29cf5c3d299d1dc/media/gpu/windows/dxva_video_decode_accelerator_win.cc
[modify] https://crrev.com/c1f2b0c04480ae536077125eb29cf5c3d299d1dc/tools/metrics/histograms/histograms.xml
,
Sep 20
I think we can probably drop these checks on Windows 10+: the failure rate is ~2% elsewhere, but 0.08% for Windows 10 (caveat: this is beta+ data, not stable). Dan's planning to make them lazily loaded, so we don't have to do this, but if these are really blocking for hundreds of milliseconds on the GPU thread (even the first time through), it might be worth pulling them out.
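One possible reading of "lazily loaded" is that each codec's limits are probed only when first requested, rather than all at once. The sketch below shows that pattern under assumptions: `Resolution` and `LazyResolutionCache` are hypothetical, the probe is an injected callback instead of a DXVA query, and the class is not thread-safe as written.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

struct Resolution {
  int width;
  int height;
};

// Probes a codec's maximum resolution on first request and memoizes
// the result. Hypothetical sketch; callers must provide their own
// synchronization if it is shared between threads.
class LazyResolutionCache {
 public:
  explicit LazyResolutionCache(
      std::function<Resolution(const std::string&)> probe)
      : probe_(std::move(probe)) {}

  const Resolution& Get(const std::string& codec) {
    auto it = cache_.find(codec);
    if (it == cache_.end()) {
      ++probe_calls_;  // The slow path runs once per codec.
      it = cache_.emplace(codec, probe_(codec)).first;
    }
    return it->second;
  }

  int probe_calls() const { return probe_calls_; }

 private:
  std::function<Resolution(const std::string&)> probe_;
  std::map<std::string, Resolution> cache_;
  int probe_calls_ = 0;
};
```

Lazy loading spreads the cost out and avoids probing codecs that are never played, but the first request for each codec still pays the full probe time, which is the concern raised in the comment about blocking "even the first time through".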
,
Dec 19
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/a2b198406a771372f2823d9f5c527bd6ef743d4c

commit a2b198406a771372f2823d9f5c527bd6ef743d4c
Author: Dale Curtis <dalecurtis@chromium.org>
Date: Wed Dec 19 23:51:08 2018

Skip expensive CreateDecoder test for DXVA resolution tests.

These calls can take hundreds of milliseconds to complete while only failing 0.4% of the time. Since playback works fine in even the cases where we get it wrong, go ahead and only use the cheap test.

BUG= 877803
TEST=invalid resolution fallback is seamless.
R=sandersd
Change-Id: I399d453fc72c21148a1e1e24e7e95fbf94f70d32
Reviewed-on: https://chromium-review.googlesource.com/c/1381244
Reviewed-by: Dan Sanders <sandersd@chromium.org>
Reviewed-by: Frank Liberato <liberato@chromium.org>
Reviewed-by: Jesse Doherty <jwd@chromium.org>
Commit-Queue: Dale Curtis <dalecurtis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#618012}

[modify] https://crrev.com/a2b198406a771372f2823d9f5c527bd6ef743d4c/media/gpu/windows/dxva_video_decode_accelerator_win.cc
[modify] https://crrev.com/a2b198406a771372f2823d9f5c527bd6ef743d4c/tools/metrics/histograms/histograms.xml
,
Dec 20
Fixed now Dan?