
Issue 877803

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Dec 20
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 2
Type: Bug




Accelerated video decode causes GPU process to freeze for 500+ ms

Reported by nki...@gmail.com, Aug 26

Issue description

UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3528.4 Safari/537.36

Example URL:
imgur.com

Steps to reproduce the problem:
1. Go to imgur.com, and select a post.
2. Move from post to post by using the arrow keys.
3. Note that for video posts (gifv/webm), the entire browser UI freezes for approx 500ms to 1 sec.

What is the expected behavior?
Videos should play immediately, and the browser UI should not freeze.

What went wrong?
Videos take a second or more to start playing, and the browser UI freezes momentarily. Scrolling the page to reveal more gifv/webm embeds causes more freezing.

Did this work before? N/A 

Is it a problem with Flash or HTML5? HTML5

Does this work in other browsers? Yes

Chrome version: 70.0.3532.5  Channel: canary
OS Version: 10.0
Flash Version: 

Contents of chrome://gpu: 

Graphics Feature Status
Canvas: Hardware accelerated
Flash: Hardware accelerated
Flash Stage3D: Hardware accelerated
Flash Stage3D Baseline profile: Hardware accelerated
Compositing: Hardware accelerated
Multiple Raster Threads: Enabled
Native GpuMemoryBuffers: Software only. Hardware acceleration disabled
Out-of-process Rasterization: Disabled
Hardware Protected Video Decode: Unavailable
Rasterization: Hardware accelerated
Skia Deferred Display List: Disabled
Skia Renderer: Disabled
Surface Synchronization: Enabled
Video Decode: Hardware accelerated
Viz Service Display Compositor: Enabled
WebGL: Hardware accelerated
WebGL2: Hardware accelerated
Driver Bug Workarounds
clear_uniforms_before_first_program_use
decode_encode_srgb_for_generatemipmap
disable_delayed_copy_nv12
disable_discard_framebuffer
disable_framebuffer_cmaa
exit_on_context_lost
force_cube_complete
scalarize_vec_and_mat_constructor_args
disabled_extension_GL_KHR_blend_equation_advanced
disabled_extension_GL_KHR_blend_equation_advanced_coherent
Problems Detected
Protected video decoding with swap chain is for Windows and Intel only
Disabled Features: protected_video_decode
Some drivers are unable to reset the D3D device in the GPU process sandbox
Applied Workarounds: exit_on_context_lost
Clear uniforms before first program use on all platforms: 124764, 349137
Applied Workarounds: clear_uniforms_before_first_program_use
Always rewrite vec/mat constructors to be consistent: 398694
Applied Workarounds: scalarize_vec_and_mat_constructor_args
ANGLE crash on glReadPixels from incomplete cube map texture: 518889
Applied Workarounds: force_cube_complete
Framebuffer discarding can hurt performance on non-tilers: 570897
Applied Workarounds: disable_discard_framebuffer
Use GL_INTEL_framebuffer_CMAA on ChromeOS: 535198
Applied Workarounds: disable_framebuffer_cmaa
Disable KHR_blend_equation_advanced until cc shaders are updated: 661715
Applied Workarounds: disable(GL_KHR_blend_equation_advanced), disable(GL_KHR_blend_equation_advanced_coherent)
Decode and Encode before generateMipmap for srgb format textures on Windows: 634519
Applied Workarounds: decode_encode_srgb_for_generatemipmap
Delayed copy NV12 displays incorrect colors on NVIDIA drivers.: 728670
Applied Workarounds: disable_delayed_copy_nv12
Native GpuMemoryBuffers have been disabled, either via about:flags or command line.
Disabled Features: native_gpu_memory_buffers
Skia renderer is not used by default.
Disabled Features: skia_renderer
Skia deferred display list is not used by default.
Disabled Features: skia_deferred_display_list
Version Information
Data exported	2018-08-26T02:41:06.311Z
Chrome version	Chrome/70.0.3532.5
Operating system	Windows NT 10.0.17134
Software rendering list URL	https://chromium.googlesource.com/chromium/src/+/308c5038e0e10caa748fd5a11c26c40707263494/gpu/config/software_rendering_list.json
Driver bug list URL	https://chromium.googlesource.com/chromium/src/+/308c5038e0e10caa748fd5a11c26c40707263494/gpu/config/gpu_driver_bug_list.json
ANGLE commit id	c40974417610
2D graphics backend	Skia/70 4c6514490e966198af427ec4df050470e55653a8-
Command Line	"C:\Users\Nathaniel\AppData\Local\Google\Chrome SxS\Application\chrome.exe" --flag-switches-begin --flag-switches-end
Driver Information
Initialization time	694
In-process GPU	false
Passthrough Command Decoder	false
Sandboxed	true
GPU0	VENDOR = 0x10de [Google Inc.], DEVICE= 0x1b06 [ANGLE (NVIDIA GeForce GTX 1080 Ti Direct3D11 vs_5_0 ps_5_0)] *ACTIVE*
Optimus	false
AMD switchable	false
Desktop compositing	Aero Glass
Direct Composition	true
Supports overlays	false
Overlay capabilities
Diagonal Monitor Size of \\.\DISPLAY2	27.8"
Diagonal Monitor Size of \\.\DISPLAY1	27.0"
Driver D3D12 feature level	D3D 12.1
Driver Vulkan API version	Vulkan API 1.1.0
Driver vendor	NVIDIA
Driver version	398.82
Driver date	7-30-2018
GPU CUDA compute capability major version	0
Pixel shader version	5.0
Vertex shader version	5.0
Max. MSAA samples	8
Machine model name	
Machine model version	
GL_VENDOR	Google Inc.
GL_RENDERER	ANGLE (NVIDIA GeForce GTX 1080 Ti Direct3D11 vs_5_0 ps_5_0)
GL_VERSION	OpenGL ES 2.0 (ANGLE 2.1.0.c40974417610)
GL_EXTENSIONS	GL_ANGLE_client_arrays GL_ANGLE_depth_texture GL_ANGLE_explicit_context GL_ANGLE_explicit_context_gles1 GL_ANGLE_framebuffer_blit GL_ANGLE_framebuffer_multisample GL_ANGLE_instanced_arrays GL_ANGLE_lossy_etc_decode GL_ANGLE_pack_reverse_row_order GL_ANGLE_program_cache_control GL_ANGLE_request_extension GL_ANGLE_robust_client_memory GL_ANGLE_texture_compression_dxt3 GL_ANGLE_texture_compression_dxt5 GL_ANGLE_texture_usage GL_ANGLE_translated_shader_source GL_CHROMIUM_bind_generates_resource GL_CHROMIUM_bind_uniform_location GL_CHROMIUM_color_buffer_float_rgb GL_CHROMIUM_color_buffer_float_rgba GL_CHROMIUM_copy_compressed_texture GL_CHROMIUM_copy_texture GL_CHROMIUM_sync_query GL_EXT_blend_minmax GL_EXT_color_buffer_half_float GL_EXT_debug_marker GL_EXT_discard_framebuffer GL_EXT_disjoint_timer_query GL_EXT_draw_buffers GL_EXT_frag_depth GL_EXT_map_buffer_range GL_EXT_occlusion_query_boolean GL_EXT_read_format_bgra GL_EXT_robustness GL_EXT_sRGB GL_EXT_shader_texture_lod GL_EXT_texture_compression_dxt1 GL_EXT_texture_compression_s3tc_srgb GL_EXT_texture_filter_anisotropic GL_EXT_texture_format_BGRA8888 GL_EXT_texture_rg GL_EXT_texture_storage GL_EXT_unpack_subimage GL_KHR_debug GL_KHR_parallel_shader_compile GL_NV_EGL_stream_consumer_external GL_NV_fence GL_NV_pack_subimage GL_NV_pixel_buffer_object GL_OES_EGL_image GL_OES_EGL_image_external GL_OES_compressed_ETC1_RGB8_texture GL_OES_depth32 GL_OES_element_index_uint GL_OES_get_program_binary GL_OES_mapbuffer GL_OES_packed_depth_stencil GL_OES_rgb8_rgba8 GL_OES_standard_derivatives GL_OES_surfaceless_context GL_OES_texture_float GL_OES_texture_float_linear GL_OES_texture_half_float GL_OES_texture_half_float_linear GL_OES_texture_npot GL_OES_vertex_array_object OES_compressed_EAC_R11_signed_texture OES_compressed_EAC_R11_unsigned_texture OES_compressed_EAC_RG11_signed_texture OES_compressed_EAC_RG11_unsigned_texture OES_compressed_ETC2_RGB8_texture OES_compressed_ETC2_RGBA8_texture 
OES_compressed_ETC2_punchthroughA_RGBA8_texture OES_compressed_ETC2_punchthroughA_sRGB8_alpha_texture OES_compressed_ETC2_sRGB8_alpha8_texture OES_compressed_ETC2_sRGB8_texture
Disabled Extensions	GL_KHR_blend_equation_advanced GL_KHR_blend_equation_advanced_coherent
Disabled WebGL Extensions	
Window system binding vendor	Google Inc. (adapter LUID: 00000000000120fd)
Window system binding version	1.4 (ANGLE 2.1.0.c40974417610)
Window system binding extensions	EGL_EXT_create_context_robustness EGL_ANGLE_d3d_share_handle_client_buffer EGL_ANGLE_d3d_texture_client_buffer EGL_ANGLE_surface_d3d_texture_2d_share_handle EGL_ANGLE_query_surface_pointer EGL_ANGLE_window_fixed_size EGL_ANGLE_keyed_mutex EGL_ANGLE_surface_orientation EGL_ANGLE_direct_composition EGL_NV_post_sub_buffer EGL_KHR_create_context EGL_EXT_device_query EGL_KHR_image EGL_KHR_image_base EGL_KHR_gl_texture_2D_image EGL_KHR_gl_texture_cubemap_image EGL_KHR_gl_renderbuffer_image EGL_KHR_get_all_proc_addresses EGL_KHR_stream EGL_KHR_stream_consumer_gltexture EGL_NV_stream_consumer_gltexture_yuv EGL_ANGLE_flexible_surface_compatibility EGL_ANGLE_stream_producer_d3d_texture EGL_ANGLE_create_context_webgl_compatibility EGL_CHROMIUM_create_context_bind_generates_resource EGL_CHROMIUM_sync_control EGL_EXT_pixel_format_float EGL_KHR_surfaceless_context EGL_ANGLE_display_texture_share_group EGL_ANGLE_create_context_client_arrays EGL_ANGLE_program_cache_control EGL_ANGLE_robust_resource_initialization EGL_ANGLE_create_context_extensions_enabled
Direct rendering	Yes
Reset notification strategy	0x8252
GPU process crash count	0
Compositor Information
Tile Update Mode	One-copy
Partial Raster	Enabled
GpuMemoryBuffers Status
ATC	Software only
ATCIA	Software only
DXT1	Software only
DXT5	Software only
ETC1	Software only
R_8	Software only
R_16	Software only
RG_88	Software only
BGR_565	Software only
RGBA_4444	Software only
RGBX_8888	GPU_READ, SCANOUT
RGBA_8888	GPU_READ, SCANOUT
BGRX_8888	Software only
BGRX_1010102	Software only
RGBX_1010102	Software only
BGRA_8888	Software only
RGBA_F16	Software only
YVU_420	Software only
YUV_420_BIPLANAR	Software only
UYVY_422	Software only
Display(s) Information
Info	Display[2528732444] bounds=[0,0 2560x1440], workarea=[0,0 2560x1359], scale=1, external.
Color space information	{primaries:BT709, transfer:IEC61966_2_1, matrix:RGB, range:FULL}
Bits per color component	8
Bits per pixel	24
Info	Display[2779098405] bounds=[2560,0 2561x1440], workarea=[2560,0 2561x1400], scale=1.5, external.
Color space information	{primaries:BT709, transfer:IEC61966_2_1, matrix:RGB, range:FULL}
Bits per color component	8
Bits per pixel	24
Video Acceleration Information
Decode h264 baseline	up to 4096x2304 pixels
Decode h264 baseline	up to 2304x4096 pixels
Decode h264 main	up to 4096x2304 pixels
Decode h264 main	up to 2304x4096 pixels
Decode h264 high	up to 4096x2304 pixels
Decode h264 high	up to 2304x4096 pixels
Decode vp8	up to 7680x4320 pixels
Decode vp8	up to 4320x7680 pixels
Decode vp9 profile0	up to 7680x4320 pixels
Decode vp9 profile0	up to 4320x7680 pixels
Decode vp9 profile1	up to 7680x4320 pixels
Decode vp9 profile1	up to 4320x7680 pixels
Decode vp9 profile2	up to 7680x4320 pixels
Decode vp9 profile2	up to 4320x7680 pixels
Decode vp9 profile3	up to 7680x4320 pixels
Decode vp9 profile3	up to 4320x7680 pixels
Encode h264 baseline	up to 3840x2176 pixels and/or 30.000 fps
Encode h264 main	up to 3840x2176 pixels and/or 30.000 fps
Encode h264 high	up to 3840x2176 pixels and/or 30.000 fps
Log Messages
GpuProcessHostUIShim: The GPU process exited normally. Everything is okay.

I used chrome://tracing to confirm that CrGpuMain freezes for hundreds of milliseconds while executing "DXVAVideoDecodeAccelerator::GetSupportedProfiles" and "GetMaxResolutionsForGUIDs".

Disabling HW decode (chrome://flags/#disable-accelerated-video-decode) makes videos play instantly and eliminates the freezing.

The issue can be reproduced in Chrome Dev 70.0.3528.4 and Chrome Canary 70.0.3532.5, but not in Chrome Stable 68.0.3440.106.

I'm running an NVIDIA 1080 Ti with very recent drivers. The monitors are a 2560x1440 with no display scaling and a 3840x2160 with 150% scaling.
 
See attached trace from Chrome Dev. GPU pid is 15264.
trace_trace1.json.gz (8.5 MB)
See attached trace from Chrome Dev. GPU pid is 15264.
trace_trace2.json.gz (7.3 MB)
See attached trace from Chrome Dev. GPU pid is 15264.
trace_trace3.json.gz (3.5 MB)
See attached trace from Chrome Canary. Note trace is compressed with 7zip to stay under 10MB limit. GPU pid is 1936.
trace_trace.canary.7z (9.1 MB)
I'm running a Threadripper 1950X with a 1080 Ti. Further investigation reveals that a friend's computer does not show exactly the same behavior. He is running an FX8350 and a 1060, and he sees shorter delays: tracing shows a wait time in "GetMaxResolutionsForGUIDs" of approx 150ms.
Labels: Needs-Triage-M70
Cc: sande...@chromium.org tmathmeyer@chromium.org liber...@chromium.org
Owner: sande...@chromium.org
Status: Assigned (was: Unconfirmed)
GetSupportedProfiles() is being called each time a VdaVideoDecoder is constructed. This information should be gathered once and cached.

Actual run time of GetSupportedProfiles() varies between 3.4ms and 1300ms in the first trace, which is surprising. Followup work may be required to determine why this is sometimes slow on the hardware mentioned.
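
For illustration only, here is a minimal sketch of the "gather once and cache" idea using a function-local static. Every name in it (Profile, QuerySupportedProfilesSlow, GetCachedSupportedProfiles, g_slow_query_count) is hypothetical and not taken from the Chromium source; the slow query is a stand-in that returns made-up values.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a decoder profile entry; not Chromium's type.
struct Profile {
  std::string codec;
  int max_width;
  int max_height;
};

// Counts how often the expensive query runs. Illustration only.
int g_slow_query_count = 0;

// Hypothetical stand-in for the expensive driver query. The values returned
// here are placeholders, not real capability data.
std::vector<Profile> QuerySupportedProfilesSlow() {
  ++g_slow_query_count;
  return {{"h264", 4096, 2304}, {"vp9", 7680, 4320}};
}

// Returns the cached profile list. The slow query runs on the first call
// only; later calls return a reference to the same vector.
const std::vector<Profile>& GetCachedSupportedProfiles() {
  static const std::vector<Profile> profiles = QuerySupportedProfilesSlow();
  return profiles;
}
```

With this shape, constructing many decoders pays the query cost once per process. The cache lives until the process exits, so it is only appropriate when the answer cannot change during that lifetime.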
Hmm, it's pretty surprising that the call is taking so long. For a modern GPU it should only be calling into the OS ~4 times per method.

I think we can probably skip this step in that method:
https://cs.chromium.org/chromium/src/media/gpu/windows/dxva_video_decode_accelerator_win.cc?l=304

We already know the decoder GUIDs, we can just try to create one instead of scanning them if that's the expensive part -- but that loop should bail quickly... 

We could also skip the last two steps here maybe:
https://cs.chromium.org/chromium/src/media/gpu/windows/dxva_video_decode_accelerator_win.cc?l=281

I.e., skip config query and decoder creation. Both of these steps were probably added to avoid crashes though, so it's possible that by skipping them we could increase the crash rate on older hardware. Adding a couple of UMA metrics that test how often we fail after an earlier step succeeds would probably be useful.
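
A rough sketch of that idea, assuming the checks can be modeled as callables: run the cheap check first, make the expensive one optional, and count how often the expensive check fails after the cheap one passed. IsResolutionSupported and CheckStats are hypothetical names used for illustration; real Chromium code would record the counts through UMA histograms rather than a struct.

```cpp
#include <functional>

// Hypothetical tally of outcomes, standing in for UMA histograms.
struct CheckStats {
  int cheap_passed = 0;
  int expensive_failed_after_cheap_pass = 0;
};

// Runs the cheap check first. The expensive check runs only when the cheap
// one passed and the caller asked for it.
bool IsResolutionSupported(const std::function<bool()>& cheap_check,
                           const std::function<bool()>& expensive_check,
                           bool run_expensive_check,
                           CheckStats& stats) {
  if (!cheap_check())
    return false;  // Bail out early; the expensive step is never reached.
  ++stats.cheap_passed;
  if (!run_expensive_check)
    return true;  // Trust the cheap result.
  if (expensive_check())
    return true;
  // The cheap check was wrong for this configuration; record it.
  ++stats.expensive_failed_after_cheap_pass;
  return false;
}
```

The ratio of expensive_failed_after_cheap_pass to cheap_passed is the number that would tell us whether the expensive step is still earning its cost.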
Project Member

Comment 10 by bugdroid1@chromium.org, Aug 29

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/5713bd26e3e36fd3b7329fee8aaf6b665152df3f

commit 5713bd26e3e36fd3b7329fee8aaf6b665152df3f
Author: Dan Sanders <sandersd@chromium.org>
Date: Wed Aug 29 20:52:01 2018

Cache VDA capabilities.

VdaVideoDecoder requests VDA capabilities at each construction. While it
would be possible to plumb GPUInfo to VdaVideoDecoder, the expectation
is that this information will not be computed at startup in the future.
This CL caches the results in GpuVideoDecodeAcceleratorFactory using a
static variable.

Bug:  877803 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: I639febd1230763077824006ee6427ba43c5fa529
Reviewed-on: https://chromium-review.googlesource.com/1192138
Reviewed-by: Hirokazu Honda <hiroh@chromium.org>
Commit-Queue: Dan Sanders <sandersd@chromium.org>
Cr-Commit-Position: refs/heads/master@{#587293}
[modify] https://crrev.com/5713bd26e3e36fd3b7329fee8aaf6b665152df3f/media/gpu/gpu_video_decode_accelerator_factory.cc

This may be a stupid question, but are there any cases in which the cache should be invalidated during the life of a single GPU process? The main thing I'm thinking of is if GPU drivers are updated on the fly, or perhaps if Chrome is started inside an RDP session, then that user session is connected to the computer's console instead.
The short answer is that this information was already being cached (actually for the entire browser session), but VdaVideoDecoder added a new path without the caching.

For the larger question, the GPU process will be restarted for those events if the 'exit_on_context_lost' workaround is enabled. This workaround is always enabled on Windows.

Finally, getting the information wrong isn't fatal. Either some time is wasted trying to create hardware decoders when they don't work, or hardware decode isn't used when it should be. If you experience those sorts of issues I do recommend filing a bug; we don't always do a good job of supporting context loss.
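
To make the "not fatal" point concrete, here is a small sketch of the two failure modes, assuming decoder selection can be reduced to a cached capability flag plus an attempt to create the hardware decoder. ChooseDecoder and its parameters are hypothetical and do not mirror the actual Chromium decoder selection code.

```cpp
#include <functional>
#include <string>

// Picks a decoder based on a possibly stale cached capability flag.
// Returns "hardware" only if the cache claims support and creation succeeds.
std::string ChooseDecoder(bool cached_says_hw_supported,
                          const std::function<bool()>& try_create_hw_decoder) {
  if (cached_says_hw_supported && try_create_hw_decoder())
    return "hardware";
  // Either the cache said "unsupported" (hardware decode may be skipped
  // when it would have worked) or creation failed (some time was wasted).
  return "software";
}
```

In both stale-cache cases playback still proceeds on the software path; the cost is wasted time or a missed optimization, not a broken video.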
Thanks for the explanation!

I guess this should hit Canary around 3537 or 3538, and I can re-test then.
Labels: Needs-Feedback
Tried to reproduce the issue on Windows 10 with the reported version 70.0.3532.5 and was unable to reproduce it using the steps below.

1. Launched Chrome with the #disable-accelerated-video-decode flag enabled in chrome://flags.
2. Navigated to http://imgur.com/ and selected a post.
3. Moved from post to post using the arrow keys. No browser freeze was observed and the videos played immediately.
Attached are the screencast and chrome://gpu details for reference.

sandersd@: Could you please check and help us verify the fix on the latest M-70 build?

Thanks.
877803.mp4 (1.5 MB)
Attaching chrome://gpu details as mentioned in comment #14.

Thanks.
gpu.html (68.5 KB)
#13: This CL was included in Canary 70.0.3537.0, released today.
I can confirm that Canary 70.0.3537.2 is working much better, with no perceptible difference in playing videos with or without hw acceleration.
Project Member

Comment 18 by bugdroid1@chromium.org, Sep 5

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c1f2b0c04480ae536077125eb29cf5c3d299d1dc

commit c1f2b0c04480ae536077125eb29cf5c3d299d1dc
Author: Dale Curtis <dalecurtis@chromium.org>
Date: Tue Sep 04 21:39:00 2018

Add UMA to DXVA decoder to determine if we need expensive checks.

We've seen a few user traces where IsResolutionSupportedForDevice()
is quite expensive. Lets add some metrics to see if we actually
need to be so thorough or if we can just rely on the accuracy
of GetVideoDecoderConfigCount() failing or returning zero configs.

BUG= 877803 
TEST=none

Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: Ifbfea9134c8c4d8e555f328deaa3af3ae115844d
Reviewed-on: https://chromium-review.googlesource.com/1199626
Commit-Queue: Dale Curtis <dalecurtis@chromium.org>
Reviewed-by: Ilya Sherman <isherman@chromium.org>
Reviewed-by: Chrome Cunningham <chcunningham@chromium.org>
Cr-Commit-Position: refs/heads/master@{#588643}
[modify] https://crrev.com/c1f2b0c04480ae536077125eb29cf5c3d299d1dc/media/gpu/windows/dxva_video_decode_accelerator_win.cc
[modify] https://crrev.com/c1f2b0c04480ae536077125eb29cf5c3d299d1dc/tools/metrics/histograms/histograms.xml

I think we can probably drop these checks on Windows 10+. The failure rate is ~2% elsewhere but 0.08% on Windows 10 (caveat: this is beta+ data, not stable). Dan's planning to make them lazily loaded, so we don't have to do this, but if these are really blocking for hundreds of milliseconds on the GPU thread (even the first time through), it might be worth pulling them out.
Project Member

Comment 21 by bugdroid1@chromium.org, Dec 19

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a2b198406a771372f2823d9f5c527bd6ef743d4c

commit a2b198406a771372f2823d9f5c527bd6ef743d4c
Author: Dale Curtis <dalecurtis@chromium.org>
Date: Wed Dec 19 23:51:08 2018

Skip expensive CreateDecoder test for DXVA resolution tests.

These calls can take hundreds of milliseconds to complete while only
failing 0.4% of the time. Since playback works fine in even the cases
where we get it wrong, go ahead and only use the cheap test.

BUG= 877803 
TEST=invalid resolution fallback is seamless.
R=sandersd

Change-Id: I399d453fc72c21148a1e1e24e7e95fbf94f70d32
Reviewed-on: https://chromium-review.googlesource.com/c/1381244
Reviewed-by: Dan Sanders <sandersd@chromium.org>
Reviewed-by: Frank Liberato <liberato@chromium.org>
Reviewed-by: Jesse Doherty <jwd@chromium.org>
Commit-Queue: Dale Curtis <dalecurtis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#618012}
[modify] https://crrev.com/a2b198406a771372f2823d9f5c527bd6ef743d4c/media/gpu/windows/dxva_video_decode_accelerator_win.cc
[modify] https://crrev.com/a2b198406a771372f2823d9f5c527bd6ef743d4c/tools/metrics/histograms/histograms.xml

Is this fixed now, Dan?
Status: Fixed (was: Assigned)
