
Issue 718215

Starred by 3 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 2
Type: Bug




380 MB memory spike (88 MB physical RAM) during GPU process initialization

Project Member Reported by brucedaw...@chromium.org, May 3 2017

Issue description

Chrome Version: 60.0.3088.0 (Official Build) canary (64-bit) (cohort: 64-Bit)
OS: Windows 10 RTM
Machine: Z840

What steps will reproduce the problem?
(1) Launch Chrome while recording an ETW heap trace
(2) Load the trace and notice the large spike for the GPU process
(3) Note the stack and duration

What is the expected result?
No huge memory spike

What happens instead?
380 MB of transient heap allocations

The memory is not directly allocated by chrome. It is allocated by mfh264enc.dll. The allocations all come from IsResolutionSupported. The three indented functions at the bottom of the call stack below show where the allocation paths diverge. The amounts allocated by the three paths are 172, 161, and 36 MB.

  |    chrome_child.dll!ui::mojom::GpuMainStubDispatch::Accept
  |    chrome_child.dll!content::GpuChildThread::CreateGpuService
  |    chrome_child.dll!ui::GpuService::UpdateGPUInfoFromPreferences
  |    chrome_child.dll!media::GpuVideoEncodeAccelerator::GetSupportedProfiles
  |    chrome_child.dll!media::MediaFoundationVideoEncodeAccelerator::GetSupportedProfiles
  |    chrome_child.dll!media::MediaFoundationVideoEncodeAccelerator::IsResolutionSupported
  |    mfh264enc.dll!CH264MFT::SetOutputType
  |    mfh264enc.dll!CH264MFTInternal::SetOutputType
  |    mfh264enc.dll!CH264MFTInternal::OnOutputTypeChanged
  |    mfh264enc.dll!CAvcEncoder::InitializeEncoder
  |    mfh264enc.dll!CStrmEncoder::InitializeEncoder
  |    mfh264enc.dll!CStrmEncoder::InitCoreEncoder
  |    |- mfh264enc.dll!CFrameQueue::Init
  |    |- mfh264enc.dll!CPool<CRef>::PoolInit
  |    |- mfh264enc.dll!CPicture::Create

The memory is all freed within about 40 ms, so this is not a memory leak. However, the brief spike is big enough that it will eject about 380 MB of data from the disk cache, slowing startup by some ill-defined amount. If we can prevent these allocations then we will probably also save a bit of CPU time.

It's possible the memory is never touched, in which case it won't be faulted into the process and the actual consequences will be minor. Even so, this is at least worth investigating.

chrome:gpu text is here:

Graphics Feature Status
Canvas: Hardware accelerated
Flash: Hardware accelerated
Flash Stage3D: Hardware accelerated
Flash Stage3D Baseline profile: Hardware accelerated
Compositing: Hardware accelerated
Multiple Raster Threads: Enabled
Native GpuMemoryBuffers: Software only. Hardware acceleration disabled
Rasterization: Hardware accelerated
Video Decode: Hardware accelerated
Video Encode: Hardware accelerated
VPx Video Decode: Software only, hardware acceleration unavailable
WebGL: Hardware accelerated
WebGL2: Hardware accelerated
Driver Bug Workarounds
clear_uniforms_before_first_program_use
decode_encode_srgb_for_generatemipmap
disable_discard_framebuffer
disable_dxgi_zero_copy_video
disable_framebuffer_cmaa
exit_on_context_lost
force_cube_complete
scalarize_vec_and_mat_constructor_args
texsubimage_faster_than_teximage
Problems Detected
VPx decoding isn't supported before Windows 10 anniversary update.: 616318
Disabled Features: accelerated_vpx_decode
Some drivers are unable to reset the D3D device in the GPU process sandbox
Applied Workarounds: exit_on_context_lost
TexSubImage is faster for full uploads on ANGLE
Applied Workarounds: texsubimage_faster_than_teximage
Clear uniforms before first program use on all platforms: 124764, 349137
Applied Workarounds: clear_uniforms_before_first_program_use
Always rewrite vec/mat constructors to be consistent: 398694
Applied Workarounds: scalarize_vec_and_mat_constructor_args
ANGLE crash on glReadPixels from incomplete cube map texture: 518889
Applied Workarounds: force_cube_complete
Framebuffer discarding can hurt performance on non-tilers: 570897
Applied Workarounds: disable_discard_framebuffer
Limited enabling of Chromium GL_INTEL_framebuffer_CMAA: 535198
Applied Workarounds: disable_framebuffer_cmaa
Zero-copy NV12 video displays incorrect colors on NVIDIA drivers.: 635319
Applied Workarounds: disable_dxgi_zero_copy_video
Disable KHR_blend_equation_advanced until cc shaders are updated: 661715
Decode and Encode before generateMipmap for srgb format textures on Windows: 634519
Applied Workarounds: decode_encode_srgb_for_generatemipmap
Native GpuMemoryBuffers have been disabled, either via about:flags or command line.
Disabled Features: native_gpu_memory_buffers
Version Information
Data exported	5/3/2017, 3:53:19 PM
Chrome version	Chrome/60.0.3088.0
Operating system	Windows NT 10.0.10240
Software rendering list version	13.4
Driver bug list version	10.5
ANGLE commit id	d262799c3784
2D graphics backend	Skia/60 ad15264f9ad2aa62e1da1ab3105887503ddeb4e9-
Command Line	"C:\Users\brucedawson\AppData\Local\Google\Chrome SxS\Application\chrome.exe" --profile-directory="Profile 1" --enable-heap-profiling=native --flag-switches-begin --trace-export-events-to-etw --flag-switches-end
Driver Information
Initialization time	90
In-process GPU	false
Passthrough Command Decoder	false
Supports overlays	false
Sandboxed	false
GPU0	VENDOR = 0x10de, DEVICE= 0x13ba
Optimus	false
Optimus	false
AMD switchable	false
Desktop compositing	Aero Glass
Driver vendor	NVIDIA
Driver version	21.21.13.7563
Driver date	10-21-2016
Pixel shader version	5.0
Vertex shader version	5.0
Max. MSAA samples	8
Machine model name	
Machine model version	
GL_VENDOR	Google Inc.
GL_RENDERER	ANGLE (NVIDIA Quadro K2200 Direct3D11 vs_5_0 ps_5_0)
GL_VERSION	OpenGL ES 3.0 (ANGLE 2.1.0.d262799c3784)
GL_EXTENSIONS	GL_ANGLE_client_arrays GL_ANGLE_depth_texture GL_ANGLE_framebuffer_blit GL_ANGLE_framebuffer_multisample GL_ANGLE_instanced_arrays GL_ANGLE_lossy_etc_decode GL_ANGLE_pack_reverse_row_order GL_ANGLE_request_extension GL_ANGLE_robust_client_memory GL_ANGLE_robust_resource_initialization GL_ANGLE_texture_compression_dxt3 GL_ANGLE_texture_compression_dxt5 GL_ANGLE_texture_usage GL_ANGLE_translated_shader_source GL_CHROMIUM_bind_generates_resource GL_CHROMIUM_bind_uniform_location GL_CHROMIUM_copy_compressed_texture GL_CHROMIUM_copy_texture GL_CHROMIUM_sync_query GL_EXT_blend_minmax GL_EXT_color_buffer_float GL_EXT_color_buffer_half_float GL_EXT_debug_marker GL_EXT_discard_framebuffer GL_EXT_disjoint_timer_query GL_EXT_draw_buffers GL_EXT_frag_depth GL_EXT_map_buffer_range GL_EXT_occlusion_query_boolean GL_EXT_read_format_bgra GL_EXT_robustness GL_EXT_sRGB GL_EXT_shader_texture_lod GL_EXT_texture_compression_dxt1 GL_EXT_texture_compression_s3tc_srgb GL_EXT_texture_filter_anisotropic GL_EXT_texture_format_BGRA8888 GL_EXT_texture_norm16 GL_EXT_texture_rg GL_EXT_texture_storage GL_EXT_unpack_subimage GL_KHR_debug GL_NV_EGL_stream_consumer_external GL_NV_fence GL_NV_pack_subimage GL_NV_pixel_buffer_object GL_OES_EGL_image GL_OES_EGL_image_external GL_OES_EGL_image_external_essl3 GL_OES_compressed_ETC1_RGB8_texture GL_OES_depth32 GL_OES_element_index_uint GL_OES_get_program_binary GL_OES_mapbuffer GL_OES_packed_depth_stencil GL_OES_rgb8_rgba8 GL_OES_standard_derivatives GL_OES_texture_float GL_OES_texture_float_linear GL_OES_texture_half_float GL_OES_texture_half_float_linear GL_OES_texture_npot GL_OES_vertex_array_object
Disabled Extensions	GL_KHR_blend_equation_advanced GL_KHR_blend_equation_advanced_coherent
Window system binding vendor	Google Inc. (adapter LUID: 0000000000013bb6)
Window system binding version	1.4 (ANGLE 2.1.0.d262799c3784)
Window system binding extensions	EGL_EXT_create_context_robustness EGL_ANGLE_d3d_share_handle_client_buffer EGL_ANGLE_d3d_texture_client_buffer EGL_ANGLE_surface_d3d_texture_2d_share_handle EGL_ANGLE_query_surface_pointer EGL_ANGLE_window_fixed_size EGL_ANGLE_keyed_mutex EGL_ANGLE_surface_orientation EGL_ANGLE_direct_composition EGL_NV_post_sub_buffer EGL_KHR_create_context EGL_EXT_device_query EGL_KHR_image EGL_KHR_image_base EGL_KHR_gl_texture_2D_image EGL_KHR_gl_texture_cubemap_image EGL_KHR_gl_renderbuffer_image EGL_KHR_get_all_proc_addresses EGL_KHR_stream EGL_KHR_stream_consumer_gltexture EGL_NV_stream_consumer_gltexture_yuv EGL_ANGLE_flexible_surface_compatibility EGL_ANGLE_stream_producer_d3d_texture_nv12 EGL_ANGLE_create_context_webgl_compatibility EGL_CHROMIUM_create_context_bind_generates_resource EGL_CHROMIUM_sync_control EGL_EXT_pixel_format_float EGL_ANGLE_display_texture_share_group EGL_ANGLE_create_context_client_arrays EGL_ANGLE_create_context_robust_resource_initialization
Direct rendering	Yes
Reset notification strategy	0x8252
GPU process crash count	0
Compositor Information
Tile Update Mode	One-copy
Partial Raster	Enabled
GpuMemoryBuffers Status
ATC	Software only
ATCIA	Software only
DXT1	Software only
DXT5	Software only
ETC1	Software only
R_8	Software only
RG_88	Software only
BGR_565	Software only
RGBA_4444	Software only
RGBX_8888	Software only
RGBA_8888	Software only
BGRX_8888	Software only
BGRA_8888	Software only
RGBA_F16	Software only
YVU_420	Software only
YUV_420_BIPLANAR	Software only
UYVY_422	Software only
Diagnostics
0
b3DAccelerationEnabled	true
b3DAccelerationExists	true
bAGPEnabled	true
bAGPExistenceValid	true
bAGPExists	false
bCanRenderWindow	true
bDDAccelerationEnabled	true
bDriverBeta	false
bDriverDebug	false
bDriverSigned	false
bDriverSignedValid	false
bNoHardware	true
dwBpp	32
dwDDIVersion	9
dwHeight	1152
dwRefreshRate	32
dwWHQLLevel	0
dwWidth	2048
iAdapter	0
lDriverSize	0
lMiniVddSize	0
szAGPStatusEnglish	Not Available
szAGPStatusLocalized	Not Available
szChipType	
szD3DStatusEnglish	Enabled
szD3DStatusLocalized	Enabled
szDACType	
szDDIVersionEnglish	9Ex
szDDIVersionLocalized	9Ex
szDDStatusEnglish	Not Available
szDDStatusLocalized	Not Available
szDXVAHDEnglish	Supported
szDXVAModes	
szDescription	RDPUDD Chained DD
szDeviceId	0x13BA
szDeviceIdentifier	{D7B71E3E-50FA-11CF-076E-9A3018C2D835}
szDeviceName	\\.\DISPLAY1
szDisplayMemoryEnglish	n/a
szDisplayMemoryLocalized	n/a
szDisplayModeEnglish	2048 x 1152 (32 bit) (32Hz)
szDisplayModeLocalized	2048 x 1152 (32 bit) (32Hz)
szDriverAssemblyVersion	
szDriverAttributes	
szDriverDateEnglish	
szDriverDateLocalized	
szDriverLanguageEnglish	
szDriverLanguageLocalized	
szDriverModelEnglish	
szDriverModelLocalized	
szDriverName	
szDriverNodeStrongName	
szDriverSignDate	
szDriverVersion	
szKeyDeviceID	Enum\ROOT\BASICRENDER
szKeyDeviceKey	\REGISTRY\Machine\System\CurrentControlSet\Services\RDPUDD\Device0
szManufacturer	
szMiniVdd	
szMiniVddDateEnglish	
szMiniVddDateLocalized	
szMonitorMaxRes	
szMonitorName	
szNotesEnglish	No problems found.
szNotesLocalized	No problems found.
szOverlayEnglish	Supported
szRankOfInstalledDriver	
szRegHelpText	
szRevision	
szRevisionId	0x00A2
szSubSysId	0x1097103C
szTestResultD3D7English	Not run
szTestResultD3D7Localized	Not run
szTestResultD3D8English	Not run
szTestResultD3D8Localized	Not run
szTestResultD3D9English	Not run
szTestResultD3D9Localized	Not run
szTestResultDDEnglish	Not run
szTestResultDDLocalized	Not run
szVdd	
szVendorId	0x10DE
Log Messages
GpuProcessHostUIShim: The GPU process exited normally. Everything is okay.
 
There are actually two spikes. The first one is only 126 MB and lasts slightly less time. The call stacks are shown here:

  |    chrome_child.dll!ui::mojom::GpuMainStubDispatch::Accept
  |    chrome_child.dll!content::GpuChildThread::CreateGpuService
  |    chrome_child.dll!ui::GpuService::UpdateGPUInfoFromPreferences
  |    chrome_child.dll!media::GpuVideoEncodeAccelerator::GetSupportedProfiles
  |    chrome_child.dll!media::MediaFoundationVideoEncodeAccelerator::GetSupportedProfiles
  |    chrome_child.dll!media::MediaFoundationVideoEncodeAccelerator::InitializeInputOutputSamples
  |    |- mfh264enc.dll!CH264MFT::SetOutputType
  |    |    mfh264enc.dll!CH264MFTInternal::SetOutputType
  |    |    mfh264enc.dll!CH264MFTInternal::OnOutputTypeChanged
  |    |    |- mfh264enc.dll!CAvcEncoder::InitializeEncoder
  |    |    |    mfh264enc.dll!CStrmEncoder::InitializeEncoder
  |    |    |    mfh264enc.dll!CStrmEncoder::InitCoreEncoder
  |    |    |    |- mfh264enc.dll!CFrameQueue::Init
  |    |    |    |- mfh264enc.dll!CPool<CRef>::PoolInit
  |    |    |    |- mfh264enc.dll!CPicture::Create

I can share the traces and instructions on collecting them (obscure but not too crazy) if needed. Monitoring process memory while debugging should also work.
Cc: emir...@chromium.org
I've attached a WPA screenshot showing the spike. It shows outstanding heap allocations for the four chrome processes existing at that point. You have to squint to see the other three because the GPU process spike is so tall in comparison.

To record this trace I:

Grabbed UIforETW (see go/etw), my open source ETW trace recording tool.
In Settings checked "Chrome developer" and put "chrome.exe" in the "Heap-profiled processes" box
In the main window changed the tracing type from "circular buffer tracing" to "heap tracing to file"
After closing my main browser I clicked Start Tracing, then launched Chrome canary, and when that was finished I clicked Save Trace Buffers.
I then double-clicked the newly recorded trace to open it in WPA. From the Graph Explorer I expanded Memory and dragged Heap Allocations to the graphing area. The spike should be visible.

To view the stacks you need to make sure the Type and Stack columns are enabled and then have Process, Handle, Type, Stack as the columns to the left of the vertical orange bar, get symbols loaded, then drill down on the hot stack. If you zoom in so that you can see the memory go up but not back down then the allocations will be listed under the AIFO (Allocated Inside Freed Outside) type.

gpu-process-memory-spike.PNG
Cc: sande...@chromium.org piman@chromium.org
Yes, this is an expected side effect of creating an MFT HW encode session to query the capabilities of the platform. The current MFT API does not provide a way to query the capabilities of the device, so we create a session and see if it succeeds before reporting VideoEncodeAccelerator::SupportedProfiles. Since these supported profiles are requested at the start of the GPU service, it causes this peak.
I can think of two solutions for this:
- Find the supported profiles once, save locally and read back.
- Lazy evaluate the supported profiles such that this is only run when queried.
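
As a rough illustration of the second option, here is a minimal Python sketch of lazy evaluation with an in-memory cache. It is not Chromium code: `LazySupportedProfiles`, `fake_probe`, and the profile string are hypothetical stand-ins, and the probe only simulates the costly step of creating and tearing down an encode session.

```python
from typing import Callable, List, Optional


class LazySupportedProfiles:
    """Defers an expensive capability probe until the first query."""

    def __init__(self, probe: Callable[[], List[str]]) -> None:
        self._probe = probe  # hypothetical stand-in for the costly encoder session
        self._cached: Optional[List[str]] = None
        self.probe_calls = 0  # instrumentation for this example only

    def get(self) -> List[str]:
        # Run the probe at most once; later calls reuse the in-memory result.
        if self._cached is None:
            self.probe_calls += 1
            self._cached = list(self._probe())
        # Return a copy so callers cannot mutate the cached list.
        return list(self._cached)


def fake_probe() -> List[str]:
    # Placeholder result; a real probe would create and release an encode session.
    return ["H264PROFILE_BASELINE"]


profiles = LazySupportedProfiles(fake_probe)
startup_cost = profiles.probe_calls  # nothing has been probed at startup
first = profiles.get()               # the probe runs here
second = profiles.get()              # served from the cache
```

With this shape the cost moves from process startup to the first caller that actually asks for the profiles, which is the trade-off discussed in the rest of this thread.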

I am cc'ing piman@ to comment on the feasibility of these, or on any other solution he can think of. We also discussed the second option with sandersd@ earlier, applying it to VDA as well, but I ran into some problems because these supported profiles are kept in structs and copied out.

[0] https://cs.chromium.org/chromium/src/media/gpu/media_foundation_video_encode_accelerator_win.cc?rcl=8d0c60dcfe3e5f091be962645a7a41bc32aef0e6&l=99
[1] https://cs.chromium.org/chromium/src/services/ui/gpu/gpu_service.cc?rcl=a16438df21e2fb1b99b8b3f6b80180407c78d3cc&l=137
Cc: chcunningham@chromium.org

Comment 5 by piman@chromium.org, May 5 2017

@#3: I would discourage trying to save supported profiles on disk. We do this on Linux for the GL strings needed by the GPU blacklist: we have to create a GL context to read them, which is too unstable to do in the browser process, so we do it in the GPU process and then save the strings into the user profile. This causes a bunch of problems:
1- behavior changes between first run and subsequent runs causing hard-to-reproduce issues (when workarounds aren't applied on a new profile)
2- resolving the GPU blacklist/workarounds (which needs those strings) is on the critical path for the first pixels on screen, and so depending on the user profile being read (disk operations) is very bad.
3- you have to know when to invalidate the cached info (e.g. driver or hardware changes)
Components: Internals>Media>Hardware
Labels: Performance-Memory
Status: Available (was: Untriaged)
Is it understood why the spike is as large as it is? My 2560x1600 screen probably uses 16,384,000 bytes for its frame buffer so 380 MB is about an order of magnitude more than I would have expected. But, then again, I'm not totally clear on what is being temporarily initialized.
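
The frame buffer estimate above can be checked with a few lines of Python. This is only a worked calculation, assuming 4 bytes per pixel and a single full-screen buffer, and treating the reported "380 MB" as MiB.

```python
# Assumptions: 32-bit color (4 bytes per pixel), one full-screen buffer.
width, height, bytes_per_pixel = 2560, 1600, 4
frame_buffer_bytes = width * height * bytes_per_pixel

# Assumption: the reported "380 MB" is interpreted as MiB.
spike_bytes = 380 * 1024 * 1024
ratio = spike_bytes / frame_buffer_bytes

print(f"frame buffer: {frame_buffer_bytes:,} bytes")
print(f"spike / frame buffer: {ratio:.1f}x")
```

Under those assumptions the spike is a bit over 24 full-screen buffers, so somewhat more than one order of magnitude.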

Also, I wanted to see what the actual physical memory impact of this behavior was so I set a breakpoint in MediaFoundationVideoEncodeAccelerator::IsResolutionSupported and stepped through it. I measured memory using Process Explorer at the start and end of this function, and again after returning from it and executing ReleaseEncoderResources(). The results are:

                   Private Bytes     WS Private
Start of function:  184,844 KiB       34,032 KiB
End of function:    435,556 KiB       96,900 KiB
After releasing:     54,940 KiB        8,268 KiB

The increase in Private Bytes is what I was measuring but it is not necessarily particularly bad because it doesn't necessarily represent 'real' memory.

The WS Private column, however, represents real memory. So we can see that the actual size of the temporary memory spike is at least 62 MB, and probably more like 88 MB (26 MB of data was allocated prior to getting to MediaFoundationVideoEncodeAccelerator::IsResolutionSupported and was freed by ReleaseEncoderResources).

The Working Set column shows the same deltas as the WS Private column.
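
For reference, the WS Private deltas quoted above can be reproduced from the table. The sketch below only redoes the subtraction; the variable names are illustrative.

```python
# WS Private readings from the table above, in KiB.
ws_start, ws_end, ws_after_release = 34_032, 96_900, 8_268

# Working-set growth between the start and end of IsResolutionSupported.
growth_in_function = ws_end - ws_start
# Working-set drop between the end of the function and after ReleaseEncoderResources().
freed_on_release = ws_end - ws_after_release
# Portion that was already resident before the function was entered.
allocated_before = ws_start - ws_after_release

for label, kib in [
    ("growth in function", growth_in_function),
    ("freed on release", freed_on_release),
    ("allocated before", allocated_before),
]:
    print(f"{label}: {kib:,} KiB = {kib / 1024:.1f} MiB")
```

The 62, 88, and 26 MB figures in the text correspond to these KiB values divided by 1000 and rounded up; in MiB they come out slightly lower.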

I hope this provides useful context. The attached image shows three screenshots of Process Explorer. The GPU process is 24196, right in the middle.

Memory spike bug 718215.png
Summary: 380 MB memory spike (88 MB physical RAM) during GPU process initialization (was: 380 MB memory spike during GPU process initialization)
Project Member

Comment 9 by sheriffbot@chromium.org, May 7 2018

Labels: Hotlist-Recharge-Cold
Status: Untriaged (was: Available)
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue.

Sorry for the inconvenience if the bug really should have been left as Available.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Cc: -emir...@chromium.org
Owner: emir...@chromium.org
Status: Assigned (was: Untriaged)
Current status of the Decode version of this: we're still evaluating whether any kind of cache is necessary at all. MojoVideoDecoder does not have one; instead it calls Initialize() for every query, which is faster now that it can stay on the IO thread in the GPU process.

The Mac experiment has provided mixed evidence here. There is evidence that this IPC is more expensive than we had hoped, but the effect size is diminishing as more users are added to the experiment.

Likely we will add a basic blacklist ('VP9 is not HW-accelerated on Mac') to MojoVideoDecoder. If there is still a measurable difference, we will gather the supported profile information on the first Initialize() call and remember it in the renderer (in-memory only).
sandersd@, can you point me to the experiment: results, what you are tracking, etc.? GpuVideoAcceleratorFactoriesImpl::GetVideoDecodeAcceleratorCapabilities() still in use afaict.

On the encode side, things are a little more sensitive to timing, as we rely on supported codec info before choosing the right codec in WebRTC peer connection, media recorder and cast. In my first couple of attempts, I found this to be a blocker. But now that we also have Mojo on VEA (running on the GPU IO thread), this might be worth it. If we can cache the information on the GPU side, such that GPUInfo can be updated for new renderers, we wouldn't need a round trip.
The experiment is VdaVideoDecoderMac, currently in Canary/Dev at 50%. We are tracking all of the performance-related media UMAs, but it's too early to make strong claims about the results.

GetVideoDecodeAcceleratorCapabilities() is still used, but only by GpuVideoDecoder which will be removed after we roll out to all platforms.

Anything that ends up in GPUInfo is no good: it implies that the data is gathered at early startup, which is something we are trying to avoid (especially on Android, where MediaCodec is a source of instability due to some locks being held at that time).

Instead consider using an API to request the data after renderer startup, but not using GPUInfo. It doesn't have to be lazy if latency is critical to your use case.
I noticed in a recent trace that the browser process launches a second short-lived GPU process. The command-line options are almost identical - I think the only significant difference is the addition of --disable-gpu-sandbox. The (truncated) command lines are:

"C:\Users\lfg\AppData\Local\Google\Chrome SxS\Application\chrome.exe" --type=gpu-process --field-trial-handle=544,1745353196404690640,7838101868619046437,131072 --user-data-dir="d:\canary_profile" --gpu-preferences=KAAAAAAAAACAAwBAAQAAAAAAAAAAAGAAAAAAAAAAAAAIAAAAAAAAACgAAAAEAAAAIAAAAAAAAAAoAAAAAAAAADAAAAAAAAAAOAAAAAAAAAAQAAAAAAAAAAAAAAAKAAAAEAAAAAAAAAAAAAAACwAAABAAAAAAAAAAAQAAAAoAAAAQAAAAAAAAAAEAAAALAAAA --user-data-dir="d:\canary_profile" --service-request-channel-token=5808997498415270631 --mojo-platform-


"C:\Users\lfg\AppData\Local\Google\Chrome SxS\Application\chrome.exe" --type=gpu-process --field-trial-handle=544,1745353196404690640,7838101868619046437,131072 --disable-gpu-sandbox --user-data-dir="d:\canary_profile" --gpu-preferences=KAAAAAAAAACAAwBAAQAAAAAAAAAAAGAAAAAAAAAAAAAIAAAAAAAAACgAAAAEAAAAIAAAAAAAAAAoAAAAAAAAADAAAAAAAAAAOAAAAAAAAAAQAAAAAAAAAAAAAAAKAAAAEAAAAAAAAAAAAAAACwAAABAAAAAAAAAAAQAAAAoAAAAQAAAAAAAAAAEAAAALAAAA --user-data-dir="d:\canary_profile" --service-request-channel-token=32674358735118


I mention this because I noticed that the transient ~300+ MB spike happens in both GPU processes.

Cc: zmo@chromium.org magchen@chromium.org
We launch that second transient unsandboxed GPU process on Windows after a small timeout (10s?) to collect data for future experiments (Vulkan / DX12 support). That said, in this particular mode we don't need to initialize everything; in particular, we could probably skip video-related initialization.

We can also launch this second process the first time you go to about:gpu, to collect additional driver info to display on the page. In that case, we still don't exactly need to initialize everything, however we need to be careful when merging the GPUInfo not to clobber things we didn't collect.
Owner: magchen@chromium.org
Maggie, can you take this bug? (if you are overwhelmed with tasks, feel free to assign back to me).

Here is what I suggest:

In the two cases where we launch an unsandboxed GPU process on Windows:
1) we try to collect DxDiagnostics
2) we try to query Vulkan/DX12 support
We could pass in kUseGL=kGLImplementationDisabledName to the GPU process to bypass initializing GL bindings and collect GPU info and compute GpuFeatureInfo.

Now, as mentioned by piman@, we need to be careful because the GPUInfo will no longer contain full data. For scenario 1) above we are OK, but scenario 2) might be problematic. One suggestion is to separate the Vulkan/DX12 support fields into their own structure and, just like DxDiagnostics, send back that data structure instead of the entire GPUInfo.
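
To make the suggestion concrete, here is a minimal Python sketch of returning only the collected sub-structure and merging it without clobbering other fields. It is not the Chromium implementation: `Dx12VulkanInfo`, the reduced `GpuInfo`, their field names, and `merge_dx12_vulkan` are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Dx12VulkanInfo:
    # Hypothetical fields: only what the info-collection process gathers.
    supports_dx12: bool
    supports_vulkan: bool


@dataclass(frozen=True)
class GpuInfo:
    # Hypothetical, heavily reduced stand-in for the real GPUInfo.
    gl_renderer: str
    encode_profiles: Tuple[str, ...]
    dx12_vulkan: Optional[Dx12VulkanInfo] = None


def merge_dx12_vulkan(full: GpuInfo, collected: Dx12VulkanInfo) -> GpuInfo:
    # Only the collected sub-structure is replaced; every other field keeps
    # the value reported by the main, fully initialized GPU process.
    return replace(full, dx12_vulkan=collected)


main_info = GpuInfo(gl_renderer="ANGLE (example)", encode_profiles=("H264",))
merged = merge_dx12_vulkan(main_info, Dx12VulkanInfo(True, False))
```

Because the info-collection process never sends a full GPUInfo in this shape, fields it did not gather (GL strings, codec profiles) cannot be overwritten with empty values.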
Labels: -Pri-3 Pri-2
Let's raise this to P2 because the fix is relatively simple and the gain is not trivial.
I will take a look.
Continuing from the earlier discussion in #13, I will experiment with building a new API for all users of these codec fields in GPUInfo, but that won't be a quick solution. I know that some clients, such as cast, expect this in the very early stages of the pipeline, and the delay might be significant for user-triggered calls. However, if the spike/revert is too critical, I can turn off H264 HW encode on Windows. H264 codec usage on Windows is still low, around 5%.
I don't think I would call the spike critical, especially since most of the memory is not touched and therefore doesn't become physical RAM (300 MB commit, 88 MB physical RAM), so disabling features is probably not necessary, but if we can minimize or avoid the spike in some cases that would have some value - it would be that much more data in the disk cache, for instance.
Cc: -zmo@chromium.org
Labels: OS-Windows
Owner: zmo@chromium.org
magchen is working on optimizing the video stack on Windows, so let me take this bug.
Project Member

Comment 22 by bugdroid1@chromium.org, Sep 26

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/0176dbbf6d9dae1ea01f8a3c96eba95d1c3d1620

commit 0176dbbf6d9dae1ea01f8a3c96eba95d1c3d1620
Author: Zhenyao Mo <zmo@chromium.org>
Date: Wed Sep 26 00:59:45 2018

Don't initialize GL bindings etc when launching unsandboxed GPU process to collect DX12/Vulkan/DxDiagnostics on Windows

BUG=718215
TEST=manual
R=piman@chromium.org,dcheng@chromium.org

Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: Ib24a25af61027814f9c92c5bc53784802c2c954e
Reviewed-on: https://chromium-review.googlesource.com/1244298
Commit-Queue: Zhenyao Mo <zmo@chromium.org>
Reviewed-by: Antoine Labour <piman@chromium.org>
Reviewed-by: Daniel Cheng <dcheng@chromium.org>
Cr-Commit-Position: refs/heads/master@{#594167}
[modify] https://crrev.com/0176dbbf6d9dae1ea01f8a3c96eba95d1c3d1620/components/viz/host/host_gpu_memory_buffer_manager_unittest.cc
[modify] https://crrev.com/0176dbbf6d9dae1ea01f8a3c96eba95d1c3d1620/components/viz/service/gl/gpu_service_impl.cc
[modify] https://crrev.com/0176dbbf6d9dae1ea01f8a3c96eba95d1c3d1620/components/viz/service/gl/gpu_service_impl.h
[modify] https://crrev.com/0176dbbf6d9dae1ea01f8a3c96eba95d1c3d1620/content/browser/gpu/gpu_data_manager_impl.cc
[modify] https://crrev.com/0176dbbf6d9dae1ea01f8a3c96eba95d1c3d1620/content/browser/gpu/gpu_data_manager_impl.h
[modify] https://crrev.com/0176dbbf6d9dae1ea01f8a3c96eba95d1c3d1620/content/browser/gpu/gpu_data_manager_impl_private.cc
[modify] https://crrev.com/0176dbbf6d9dae1ea01f8a3c96eba95d1c3d1620/content/browser/gpu/gpu_data_manager_impl_private.h
[modify] https://crrev.com/0176dbbf6d9dae1ea01f8a3c96eba95d1c3d1620/content/browser/gpu/gpu_process_host.cc
[modify] https://crrev.com/0176dbbf6d9dae1ea01f8a3c96eba95d1c3d1620/content/browser/gpu/gpu_process_host.h
[modify] https://crrev.com/0176dbbf6d9dae1ea01f8a3c96eba95d1c3d1620/gpu/config/dx_diag_node.cc
[modify] https://crrev.com/0176dbbf6d9dae1ea01f8a3c96eba95d1c3d1620/gpu/config/dx_diag_node.h
[modify] https://crrev.com/0176dbbf6d9dae1ea01f8a3c96eba95d1c3d1620/services/viz/privileged/interfaces/gl/gpu_service.mojom

Project Member

Comment 23 by bugdroid1@chromium.org, Sep 27

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/da88bdc0b93eb69c5066305d09e83cb688c70d16

commit da88bdc0b93eb69c5066305d09e83cb688c70d16
Author: Zhenyao Mo <zmo@chromium.org>
Date: Thu Sep 27 01:47:33 2018

For Dx12 & Vulkan info collection on GPU process
only send back collected bits rather than full GPUInfo

BUG=718215
TEST=bots, manual
R=piman@chromium.org,magchen@chromium.org,dcheng@chromium.org

Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: Ib7c6107057bd39579d3bad95400599c5a84ce8b0
Reviewed-on: https://chromium-review.googlesource.com/1246689
Commit-Queue: Zhenyao Mo <zmo@chromium.org>
Reviewed-by: Antoine Labour <piman@chromium.org>
Reviewed-by: Maggie Chen <magchen@chromium.org>
Reviewed-by: Dominick Ng <dominickn@chromium.org>
Cr-Commit-Position: refs/heads/master@{#594561}
[modify] https://crrev.com/da88bdc0b93eb69c5066305d09e83cb688c70d16/components/viz/service/gl/gpu_service_impl.cc
[modify] https://crrev.com/da88bdc0b93eb69c5066305d09e83cb688c70d16/content/browser/devtools/protocol/system_info_handler.cc
[modify] https://crrev.com/da88bdc0b93eb69c5066305d09e83cb688c70d16/content/browser/gpu/gpu_data_manager_impl.cc
[modify] https://crrev.com/da88bdc0b93eb69c5066305d09e83cb688c70d16/content/browser/gpu/gpu_data_manager_impl.h
[modify] https://crrev.com/da88bdc0b93eb69c5066305d09e83cb688c70d16/content/browser/gpu/gpu_data_manager_impl_private.cc
[modify] https://crrev.com/da88bdc0b93eb69c5066305d09e83cb688c70d16/content/browser/gpu/gpu_data_manager_impl_private.h
[modify] https://crrev.com/da88bdc0b93eb69c5066305d09e83cb688c70d16/content/browser/gpu/gpu_internals_ui.cc
[modify] https://crrev.com/da88bdc0b93eb69c5066305d09e83cb688c70d16/gpu/config/gpu_info.cc
[modify] https://crrev.com/da88bdc0b93eb69c5066305d09e83cb688c70d16/gpu/config/gpu_info.h
[modify] https://crrev.com/da88bdc0b93eb69c5066305d09e83cb688c70d16/gpu/config/gpu_info_collector.h
[modify] https://crrev.com/da88bdc0b93eb69c5066305d09e83cb688c70d16/gpu/config/gpu_info_collector_win.cc
[modify] https://crrev.com/da88bdc0b93eb69c5066305d09e83cb688c70d16/gpu/ipc/common/gpu_info.mojom
[modify] https://crrev.com/da88bdc0b93eb69c5066305d09e83cb688c70d16/gpu/ipc/common/gpu_info.typemap
[modify] https://crrev.com/da88bdc0b93eb69c5066305d09e83cb688c70d16/gpu/ipc/common/gpu_info_struct_traits.cc
[modify] https://crrev.com/da88bdc0b93eb69c5066305d09e83cb688c70d16/gpu/ipc/common/gpu_info_struct_traits.h
[modify] https://crrev.com/da88bdc0b93eb69c5066305d09e83cb688c70d16/services/viz/privileged/interfaces/gl/gpu_service.mojom

Cc: zmo@chromium.org
Owner: ----
Status: Available (was: Assigned)
The issue mentioned by piman@ in #15 has been fixed. Let me unassign myself, as this bug covers more than just that issue.
