
Issue 799755

Starred by 5 users

Issue metadata

Status: WontFix
Owner:
Out until 24 Jan
Closed: Feb 2018
Cc:
Components:
EstimatedDays: ----
NextAction: 2018-01-31
OS: Linux , Windows , Chrome , Mac
Pri: 1
Type: Bug

Blocking:
issue 800893




Increased cpu burn with Site Isolation

Project Member Reported by ajwong@chromium.org, Jan 7 2018

Issue description

Chrome Version: 63.0.3239.84 (Official Build) (64-bit)
OS: OSX 10.13.2

What steps will reproduce the problem?
(1) In clean profile, navigate to http://thewoksoflife.com/2017/01/chinese-new-year-recipes-list/.  Page *must* stay visible.
(2) Open Task Manager and wait a few minutes for the page to stabilize.
(3) Go to about:flags and enable Site Isolation. Repeat the above steps.
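
For a repeatable A/B comparison, the same steps can be scripted. This is only a minimal sketch, not how the report was produced. It assumes that the about:flags toggle is equivalent to the `--site-per-process` switch that later comments use, and that a fresh `--user-data-dir` is a fair stand-in for a clean profile. The binary path is the one from the about:gpu dump, and `build_command` and `launch` are hypothetical helper names.

```python
import subprocess
import tempfile

# Path as shown in the "Command Line" entry of the about:gpu dump; adjust per OS.
CHROME = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
URL = "http://thewoksoflife.com/2017/01/chinese-new-year-recipes-list/"


def build_command(chrome_path, url, site_isolation, profile_dir):
    """Return the argv list for one run of the comparison."""
    cmd = [chrome_path, "--user-data-dir=" + profile_dir]
    if site_isolation:
        # Assumption: same effect as enabling Site Isolation in about:flags.
        cmd.append("--site-per-process")
    cmd.append(url)
    return cmd


def launch(site_isolation):
    """Start Chrome with a throwaway profile directory (hypothetical helper)."""
    profile_dir = tempfile.mkdtemp(prefix="chrome-clean-")
    return subprocess.Popen(build_command(CHROME, URL, site_isolation, profile_dir))
```

Calling `launch(False)` and `launch(True)` in separate sessions gives the two runs to compare. The page still has to stay visible in both.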

What is the expected result?
Should see more processes with some increased memory usage, but roughly the same CPU behavior.

What happens instead?
Eyeballing it, the browser process, with Site Isolation enabled, uses a sustained 5-10% more CPU with spiky behavior that gives little periods of sustained 30% burn.

The GPU process sometimes seems to ramp up/down too.

It seems like various ad-related code is possibly causing significant DOM modifications at some regular interval? Unsure.


about:gpu info:
Graphics Feature Status
Canvas: Hardware accelerated
CheckerImaging: Disabled
Flash: Hardware accelerated
Flash Stage3D: Hardware accelerated
Flash Stage3D Baseline profile: Hardware accelerated
Compositing: Hardware accelerated
Multiple Raster Threads: Enabled
Native GpuMemoryBuffers: Hardware accelerated
Rasterization: Software only, hardware acceleration unavailable
Video Decode: Hardware accelerated
Video Encode: Hardware accelerated
WebGL: Hardware accelerated
WebGL2: Hardware accelerated
Driver Bug Workarounds
adjust_src_dst_region_for_blitframebuffer
avoid_stencil_buffers
decode_encode_srgb_for_generatemipmap
depth_stencil_renderbuffer_resize_emulation
disable_framebuffer_cmaa
get_frag_data_info_bug
needs_offscreen_buffer_workaround
pack_parameters_workaround_with_pack_buffer
regenerate_struct_names
remove_invariant_and_centroid_for_essl3
scalarize_vec_and_mat_constructor_args
set_zero_level_before_generating_mipmap
unfold_short_circuit_as_ternary_operation
unpack_alignment_workaround_with_unpack_buffer
use_intermediary_for_copy_texture_image
use_unused_standard_shared_blocks
Problems Detected
Macs with NVidia GPUs experience rendering issues on High Sierra: 773705
Disabled Features: gpu_rasterization
Work around a bug in offscreen buffers on NVIDIA GPUs on Macs: 89557
Applied Workarounds: needs_offscreen_buffer_workaround
Unfold short circuit on Mac OS X: 307751
Applied Workarounds: unfold_short_circuit_as_ternary_operation
Always rewrite vec/mat constructors to be consistent: 398694
Applied Workarounds: scalarize_vec_and_mat_constructor_args
Mac drivers handle struct scopes incorrectly: 403957
Applied Workarounds: regenerate_struct_names
glGenerateMipmap fails if the zero texture level is not set on some Mac drivers: 560499
Applied Workarounds: set_zero_level_before_generating_mipmap
Pack parameters work incorrectly with pack buffer bound: 563714
Applied Workarounds: pack_parameters_workaround_with_pack_buffer
Alignment works incorrectly with unpack buffer bound: 563714
Applied Workarounds: unpack_alignment_workaround_with_unpack_buffer
copyTexImage2D fails when reading from IOSurface on multiple GPU types.: 581777
Applied Workarounds: use_intermediary_for_copy_texture_image
Use GL_INTEL_framebuffer_CMAA on ChromeOS: 535198
Applied Workarounds: disable_framebuffer_cmaa
glGetFragData{Location|Index} works incorrectly on Max: 638340
Applied Workarounds: get_frag_data_info_bug
Decode and encode before generateMipmap for srgb format textures on macosx: 634519
Applied Workarounds: decode_encode_srgb_for_generatemipmap
Insert statements to reference all members in unused std140/shared blocks on Mac: 618464
Applied Workarounds: use_unused_standard_shared_blocks
adjust src/dst region if blitting pixels outside read framebuffer on Mac: 644740
Applied Workarounds: adjust_src_dst_region_for_blitframebuffer
Mac driver GL 4.1 requires invariant and centroid to match between shaders: 639760, 641129
Applied Workarounds: remove_invariant_and_centroid_for_essl3
Disable KHR_blend_equation_advanced until cc shaders are updated: 661715
Applied Workarounds: disable(GL_KHR_blend_equation_advanced), disable(GL_KHR_blend_equation_advanced_coherent)
Certain Apple devices leak stencil buffers: 713854
Applied Workarounds: avoid_stencil_buffers
Depth/stencil renderbuffers can't be resized on NVIDIA on macOS 10.13: 775202
Applied Workarounds: depth_stencil_renderbuffer_resize_emulation
Checker-imaging has been disabled via finch trial or the command line.
Disabled Features: checker_imaging
Version Information
Data exported	1/6/2018, 5:23:41 PM
Chrome version	Chrome/63.0.3239.84
Operating system	Mac OS X 10.13.2
Software rendering list version	13.13
Driver bug list version	10.34
ANGLE commit id	9095f2b44801
2D graphics backend	Skia/63 dbae7001c9805fb0a4b18fd0cbc889941cb39db4-
Command Line	/Applications/Google Chrome.app/Contents/MacOS/Google Chrome --user-data-dir=/Users/awong/Library/Application Support/Google/Chrome Alt --flag-switches-begin --flag-switches-end
Driver Information
Initialization time	42
In-process GPU	false
Passthrough Command Decoder	false
Supports overlays	false
Sandboxed	true
GPU0	VENDOR = 0x10de, DEVICE= 0x0fe9 *ACTIVE*
GPU1	VENDOR = 0x8086, DEVICE= 0x0d26
Optimus	true
AMD switchable	false
Driver vendor	
Driver version	10.28.10 355.11.10.10.20.111
Driver date	
Pixel shader version	4.10
Vertex shader version	4.10
Max. MSAA samples	8
Machine model name	MacBookPro
Machine model version	11.3
GL_VENDOR	NVIDIA Corporation
GL_RENDERER	NVIDIA GeForce GT 750M OpenGL Engine
GL_VERSION	4.1 NVIDIA-10.28.10 355.11.10.10.20.111
GL_EXTENSIONS	GL_ARB_blend_func_extended GL_ARB_draw_buffers_blend GL_ARB_draw_indirect GL_ARB_ES2_compatibility GL_ARB_explicit_attrib_location GL_ARB_gpu_shader_fp64 GL_ARB_gpu_shader5 GL_ARB_instanced_arrays GL_ARB_internalformat_query GL_ARB_occlusion_query2 GL_ARB_sample_shading GL_ARB_sampler_objects GL_ARB_separate_shader_objects GL_ARB_shader_bit_encoding GL_ARB_shader_subroutine GL_ARB_shading_language_include GL_ARB_tessellation_shader GL_ARB_texture_buffer_object_rgb32 GL_ARB_texture_cube_map_array GL_ARB_texture_gather GL_ARB_texture_query_lod GL_ARB_texture_rgb10_a2ui GL_ARB_texture_storage GL_ARB_texture_swizzle GL_ARB_timer_query GL_ARB_transform_feedback2 GL_ARB_transform_feedback3 GL_ARB_vertex_attrib_64bit GL_ARB_vertex_type_2_10_10_10_rev GL_ARB_viewport_array GL_EXT_debug_label GL_EXT_debug_marker GL_EXT_depth_bounds_test GL_EXT_framebuffer_multisample_blit_scaled GL_EXT_texture_compression_s3tc GL_EXT_texture_filter_anisotropic GL_EXT_texture_mirror_clamp GL_EXT_texture_sRGB_decode GL_APPLE_client_storage GL_APPLE_container_object_shareable GL_APPLE_flush_render GL_APPLE_object_purgeable GL_APPLE_rgb_422 GL_APPLE_row_bytes GL_APPLE_texture_range GL_ATI_texture_mirror_once GL_NV_texture_barrier
Disabled Extensions	GL_KHR_blend_equation_advanced GL_KHR_blend_equation_advanced_coherent
Window system binding vendor	
Window system binding version	
Window system binding extensions	
Direct rendering	Yes
Reset notification strategy	0x0000
GPU process crash count	0
Compositor Information
Tile Update Mode	Zero-copy
Partial Raster	Enabled
GpuMemoryBuffers Status
ATC	Software only
ATCIA	Software only
DXT1	Software only
DXT5	Software only
ETC1	Software only
R_8	GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
R_16	Software only
RG_88	Software only
BGR_565	Software only
RGBA_4444	Software only
RGBX_8888	Software only
RGBA_8888	GPU_READ, SCANOUT, SCANOUT_CPU_READ_WRITE
BGRX_8888	GPU_READ, SCANOUT, SCANOUT_CPU_READ_WRITE
BGRA_8888	GPU_READ, SCANOUT, SCANOUT_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
RGBA_F16	GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
YVU_420	Software only
YUV_420_BIPLANAR	GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
UYVY_422	GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
Display(s) Information
Info	Display[69731906] bounds=0,0 1440x900, workarea=0,23 1440x877, scale=2, external
Color space information	{primaries:[[0.4443,0.3794,0.1404,],[0.2248,0.7262,0.0491,],[0.0055,0.0780,0.7415,],], transfer:0.0774*x + 0.0000 if x < 0.0404 else (0.9479*x + 0.0521)**2.4000 + 0.0000, matrix:RGB, range:FULL, icc_profile_id:10}
Bits per color component	8
Bits per pixel	24


 
In my non-clean profile (which I haven't tried doing an A/B test with), I am seeing sustained 30% CPU burn with about 14 tabs open, mostly to gmail, docs, slack, and a couple of random websites.
Completely unscientifically, it _feels_ like scrolling is a little more janky too.
Looking at it a little more, I think it's being caused by display ad JavaScript. If I watch the CPU usage in the subframes, it's usually some ad provider subframe spiking when the browser CPU spikes.  The JS console is also full of warnings and messages.

Comment 4 by creis@chromium.org, Jan 8 2018

Cc: dcheng@chromium.org nasko@chromium.org alex...@chromium.org
Labels: -Pri-3 OS-Chrome OS-Linux OS-Mac OS-Windows Pri-1
[+alexmos, nasko, dcheng, lukasza]

Sounds like we should get a trace to see what extra work is being done.  Without looking at it, the main differences I would expect would be IPCs to replicate state.  My first concern would be replication of very large frame names on ads, since they put JS code blocks into their window.name.  (We dealt with that for unique name in session history, but the full frame name still gets replicated.)
I'll bring a repro in tomorrow so we can look in person if that's helpful.
Attaching 2 traces for ipc/navigation from my clean profile on mac.

It looks like we're getting a ton of postMessage calls, and we're also updating hit-test data a lot more frequently?

I couldn't untangle the minified JS, but it seems to be triggered by osd.js:39 in the minified code, which runs on a timer.

Gonna stop digging unless all y'all tell me otherwise.
trace_woksoflife-mac-no-si.json.gz
11.7 KB Download
trace_woksoflife-mac-with-si.json.gz
55.6 KB Download
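
One way to compare the two attached traces is to count events by name and look for the names that grow the most with Site Isolation. This is a minimal sketch, not the analysis that was actually done. It assumes the files are gzipped JSON in the Chrome trace event format, either a bare list of events or an object with a `traceEvents` list. The function names are hypothetical.

```python
import gzip
import json
from collections import Counter


def load_events(path):
    """Load trace events from a gzipped Chrome trace file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        data = json.load(f)
    # A trace is either a bare list of events or an object with "traceEvents".
    return data["traceEvents"] if isinstance(data, dict) else data


def count_by_name(events):
    """Count events per name, skipping "E" (end) events so that
    begin/end pairs are not counted twice."""
    return Counter(e.get("name", "?") for e in events if e.get("ph") != "E")


def biggest_increases(baseline, experiment, top=10):
    """Return the names whose count grew the most from baseline to experiment."""
    delta = {name: experiment[name] - baseline[name] for name in experiment}
    return sorted(delta.items(), key=lambda kv: kv[1], reverse=True)[:top]
```

Passing the counts for `trace_woksoflife-mac-no-si.json.gz` as `baseline` and those for `trace_woksoflife-mac-with-si.json.gz` as `experiment` would list the event names that became more frequent. The two traces may cover different durations, so raw counts are only a rough guide.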
I was also able to repro on Linux sporadically. It does seem to depend on which banner ads display, and for some reason my Linux machine (which is on a different network from my laptop) seems to render a very different set of ads.
Labels: M-64 Target-65 M-63
Blocking: 800893

Comment 10 by nasko@chromium.org, Jan 13 2018

I also tried looking at this. The banner ad definitely makes a difference in the overall CPU usage of the tab. However, I did not see any difference in CPU usage between default mode and --site-per-process: both variations hover around 30-40% based on the Chrome task manager.

I've also instrumented the window.name propagation to the browser process, and it is definitely not frequent and doesn't transfer big names; most of the time the name is zero length.

Given this, how sure are we that this is a --site-per-process-specific bug vs. just a generally badly behaving web site? ajwong@, can you try to load only that page in a browser instance and see if you can replicate the CPU usage even without --site-per-process? My testing is in a clean profile.

Comment 11 by nasko@chromium.org, Jan 18 2018

After some discussion offline, ajwong@ pointed out that the increased CPU usage is in the browser process, and I was tracking only renderer processes. It is indeed the case that we see increased CPU usage in the browser process when the flag is enabled.

I instrumented the handler for postMessage in the browser process, which is used for routing the messages between out-of-process iframes. The outcome is that I see about 82 postMessage messages routed per second on the site. I'd expect this is the main culprit for increased CPU usage in the browser process, since without --site-per-process postMessage between subframes on a page is handled entirely in the renderer process.

Comment 12 by nasko@chromium.org, Jan 18 2018

I've also instrumented the local, in-process version of postMessage to see what happens without --site-per-process. The situation is the same: there are about 94 postMessage events per second being processed. It looks to me like this site is just spamming postMessage, and I fully expect this to take CPU cycles. It looks like we shift some of the CPU usage to the browser process with --site-per-process, as the messages need to be routed between processes, but this is by design.
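
The instrumentation behind the per-second figures in comments 11 and 12 is not shown in this thread. As a rough sketch of how such a rate could be derived, assume the instrumented handler logs one timestamp (in seconds) per postMessage. The two hypothetical helpers below compute the average rate and a per-second histogram, which also makes bursts visible.

```python
from collections import Counter


def events_per_second(timestamps_s):
    """Average event rate over the span covered by the timestamps."""
    if len(timestamps_s) < 2:
        return 0.0
    span = max(timestamps_s) - min(timestamps_s)
    if span <= 0:
        return 0.0
    # N timestamps delimit N - 1 intervals.
    return (len(timestamps_s) - 1) / span


def per_second_buckets(timestamps_s):
    """Count events in each whole second, relative to the first event."""
    if not timestamps_s:
        return Counter()
    start = min(timestamps_s)
    return Counter(int(t - start) for t in timestamps_s)
```

Comparing the output for a run with and without --site-per-process would show whether the message rate itself changes, or only the process that handles it.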

Comment 13 by creis@chromium.org, Jan 22 2018

Owner: nasko@chromium.org
Status: Assigned (was: Untriaged)
Sounds like Nasko is tracking some upstream work on this, so I'll assign to him to comment on findings so far.
nasko@ and I looked at a couple of ETW traces. There does seem to be a small difference based on eyeballing things, but it didn't seem very large. Most of the samples recorded were in a renderer process, not the browser process.

We did not open the Chrome Task Manager though, as it seems to increase load quite a bit since it dumps memory statistics fairly often.

Comment 15 by nasko@chromium.org, Jan 24 2018

NextAction: 2018-01-31
Initial report had: "Eyeballing it, the browser process, with Site Isolation enabled, uses a sustained 5-10% more CPU with spiky behavior that gives little periods of sustained 30% burn."

Based on all the investigation, it is clear that the sustained CPU usage in the browser process is mostly due to Task Manager running. The delta on top of it is small, depends on the OS, and is mostly caused by the rate of postMessage calls dispatched between cross-site iframes. With Site Isolation, they are routed through the browser process, so it is expected. 
As dcheng@ outlined in comment 14, at least on Windows the CPU cycles to do the routing are a small percentage compared to the rest of the activity in the system. I don't know how to collect data on CPU cycles used on other OS versions, so maybe someone else can help with that.

The periods of sustained 30% burn cited in the report are encountered when subframes are navigating from one document to another, which causes network activity and processing. This is also expected and consistent whether we have --site-per-process or not.

Based on this, it looks to me that everything is working as expected. If anyone can help collect CPU cycles data on Linux and Mac, please let me know. I will keep the bug open for a bit longer to see if we can collect more data and if not will resolve as WontFix in the future.
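
For the Linux half of that request, one possible approach that avoids the Task Manager is to sample the browser process's accumulated CPU time from `/proc/<pid>/stat` twice and divide by the interval. This is a sketch under stated assumptions, not a vetted methodology. It is Linux only (macOS has no `/proc`), the PID of the browser process is assumed to be known already, and the helper names are hypothetical.

```python
import os
import time


def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name (field 2) is parenthesized and may contain spaces,
    # so split on the last ')' before tokenizing the remaining fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime is field 14 and stime is field 15.
    utime, stime = int(rest[11]), int(rest[12])
    return utime + stime


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s):
    """CPU usage over the interval, as a percentage of one core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


def sample_pid(pid, interval_s=5.0):
    """Measure one process over interval_s seconds (Linux only)."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    path = "/proc/%d/stat" % pid
    with open(path) as f:
        before = parse_cpu_ticks(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_cpu_ticks(f.read())
    return cpu_percent(before, after, interval_s, ticks_per_s)
```

Running `sample_pid` on the browser process with and without --site-per-process, with the Task Manager closed, would give numbers that are not inflated by the Task Manager's own polling.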
Nice analysis!

So the polling and data collection methodology of task manager is just doing something weird here that is reporting misleading/incorrectly high cpu usage stats?
The NextAction date has arrived: 2018-01-31
Status: WontFix (was: Assigned)
ajwong@ - on Windows, where we could look at ETW tracing, the CPU usage wasn't as high as what was reported on Linux. I'm not an expert in the area, so I'd assume our polling in the task manager behaves differently on Linux.

Overall, the main contributor we could find to the extra CPU cycles was the postMessage calls made across processes. There wasn't anything else we could find in the traces. Since postMessage must cross processes by design with Site Isolation, I'm going to close this as WontFix, as there isn't anything else actionable we can do here. If other cases are found that lead to higher CPU usage, please file a bug.
