Increased cpu burn with Site Isolation
Issue description

Chrome Version: 63.0.3239.84 (Official Build) (64-bit)
OS: OSX 10.13.2

What steps will reproduce the problem?
(1) In a clean profile, navigate to http://thewoksoflife.com/2017/01/chinese-new-year-recipes-list/. The page *must* stay visible.
(2) Open Task Manager and wait a few minutes for the page to stabilize.
(3) Go to about:flags and enable Site Isolation. Repeat the above steps.

What is the expected result?
Should see more processes with some increased memory usage, but roughly the same CPU behavior.

What happens instead?
Eyeballing it, the browser process, with Site Isolation enabled, uses a sustained 5-10% more CPU, with spiky behavior that gives short periods of sustained 30% burn. The GPU process sometimes seems to ramp up/down too. It seems like various ad-related code is possibly causing significant DOM modifications at some regular interval? Unsure.

about:gpu info:
Graphics Feature Status Canvas: Hardware accelerated CheckerImaging: Disabled Flash: Hardware accelerated Flash Stage3D: Hardware accelerated Flash Stage3D Baseline profile: Hardware accelerated Compositing: Hardware accelerated Multiple Raster Threads: Enabled Native GpuMemoryBuffers: Hardware accelerated Rasterization: Software only, hardware acceleration unavailable Video Decode: Hardware accelerated Video Encode: Hardware accelerated WebGL: Hardware accelerated WebGL2: Hardware accelerated Driver Bug Workarounds adjust_src_dst_region_for_blitframebuffer avoid_stencil_buffers decode_encode_srgb_for_generatemipmap depth_stencil_renderbuffer_resize_emulation disable_framebuffer_cmaa get_frag_data_info_bug needs_offscreen_buffer_workaround pack_parameters_workaround_with_pack_buffer regenerate_struct_names remove_invariant_and_centroid_for_essl3 scalarize_vec_and_mat_constructor_args set_zero_level_before_generating_mipmap unfold_short_circuit_as_ternary_operation 
unpack_alignment_workaround_with_unpack_buffer use_intermediary_for_copy_texture_image use_unused_standard_shared_blocks Problems Detected Macs with NVidia GPUs experience rendering issues on High Sierra: 773705 Disabled Features: gpu_rasterization Work around a bug in offscreen buffers on NVIDIA GPUs on Macs: 89557 Applied Workarounds: needs_offscreen_buffer_workaround Unfold short circuit on Mac OS X: 307751 Applied Workarounds: unfold_short_circuit_as_ternary_operation Always rewrite vec/mat constructors to be consistent: 398694 Applied Workarounds: scalarize_vec_and_mat_constructor_args Mac drivers handle struct scopes incorrectly: 403957 Applied Workarounds: regenerate_struct_names glGenerateMipmap fails if the zero texture level is not set on some Mac drivers: 560499 Applied Workarounds: set_zero_level_before_generating_mipmap Pack parameters work incorrectly with pack buffer bound: 563714 Applied Workarounds: pack_parameters_workaround_with_pack_buffer Alignment works incorrectly with unpack buffer bound: 563714 Applied Workarounds: unpack_alignment_workaround_with_unpack_buffer copyTexImage2D fails when reading from IOSurface on multiple GPU types.: 581777 Applied Workarounds: use_intermediary_for_copy_texture_image Use GL_INTEL_framebuffer_CMAA on ChromeOS: 535198 Applied Workarounds: disable_framebuffer_cmaa glGetFragData{Location|Index} works incorrectly on Max: 638340 Applied Workarounds: get_frag_data_info_bug Decode and encode before generateMipmap for srgb format textures on macosx: 634519 Applied Workarounds: decode_encode_srgb_for_generatemipmap Insert statements to reference all members in unused std140/shared blocks on Mac: 618464 Applied Workarounds: use_unused_standard_shared_blocks adjust src/dst region if blitting pixels outside read framebuffer on Mac: 644740 Applied Workarounds: adjust_src_dst_region_for_blitframebuffer Mac driver GL 4.1 requires invariant and centroid to match between shaders: 639760, 641129 Applied Workarounds: 
remove_invariant_and_centroid_for_essl3 Disable KHR_blend_equation_advanced until cc shaders are updated: 661715 Applied Workarounds: disable(GL_KHR_blend_equation_advanced), disable(GL_KHR_blend_equation_advanced_coherent) Certain Apple devices leak stencil buffers: 713854 Applied Workarounds: avoid_stencil_buffers Depth/stencil renderbuffers can't be resized on NVIDIA on macOS 10.13: 775202 Applied Workarounds: depth_stencil_renderbuffer_resize_emulation Checker-imaging has been disabled via finch trial or the command line. Disabled Features: checker_imaging Version Information Data exported 1/6/2018, 5:23:41 PM Chrome version Chrome/63.0.3239.84 Operating system Mac OS X 10.13.2 Software rendering list version 13.13 Driver bug list version 10.34 ANGLE commit id 9095f2b44801 2D graphics backend Skia/63 dbae7001c9805fb0a4b18fd0cbc889941cb39db4- Command Line /Applications/Google Chrome.app/Contents/MacOS/Google Chrome --user-data-dir=/Users/awong/Library/Application Support/Google/Chrome Alt --flag-switches-begin --flag-switches-end Driver Information Initialization time 42 In-process GPU false Passthrough Command Decoder false Supports overlays false Sandboxed true GPU0 VENDOR = 0x10de, DEVICE= 0x0fe9 *ACTIVE* GPU1 VENDOR = 0x8086, DEVICE= 0x0d26 Optimus true Optimus true AMD switchable false Driver vendor Driver version 10.28.10 355.11.10.10.20.111 Driver date Pixel shader version 4.10 Vertex shader version 4.10 Max. 
MSAA samples 8 Machine model name MacBookPro Machine model version 11.3 GL_VENDOR NVIDIA Corporation GL_RENDERER NVIDIA GeForce GT 750M OpenGL Engine GL_VERSION 4.1 NVIDIA-10.28.10 355.11.10.10.20.111 GL_EXTENSIONS GL_ARB_blend_func_extended GL_ARB_draw_buffers_blend GL_ARB_draw_indirect GL_ARB_ES2_compatibility GL_ARB_explicit_attrib_location GL_ARB_gpu_shader_fp64 GL_ARB_gpu_shader5 GL_ARB_instanced_arrays GL_ARB_internalformat_query GL_ARB_occlusion_query2 GL_ARB_sample_shading GL_ARB_sampler_objects GL_ARB_separate_shader_objects GL_ARB_shader_bit_encoding GL_ARB_shader_subroutine GL_ARB_shading_language_include GL_ARB_tessellation_shader GL_ARB_texture_buffer_object_rgb32 GL_ARB_texture_cube_map_array GL_ARB_texture_gather GL_ARB_texture_query_lod GL_ARB_texture_rgb10_a2ui GL_ARB_texture_storage GL_ARB_texture_swizzle GL_ARB_timer_query GL_ARB_transform_feedback2 GL_ARB_transform_feedback3 GL_ARB_vertex_attrib_64bit GL_ARB_vertex_type_2_10_10_10_rev GL_ARB_viewport_array GL_EXT_debug_label GL_EXT_debug_marker GL_EXT_depth_bounds_test GL_EXT_framebuffer_multisample_blit_scaled GL_EXT_texture_compression_s3tc GL_EXT_texture_filter_anisotropic GL_EXT_texture_mirror_clamp GL_EXT_texture_sRGB_decode GL_APPLE_client_storage GL_APPLE_container_object_shareable GL_APPLE_flush_render GL_APPLE_object_purgeable GL_APPLE_rgb_422 GL_APPLE_row_bytes GL_APPLE_texture_range GL_ATI_texture_mirror_once GL_NV_texture_barrier Disabled Extensions GL_KHR_blend_equation_advanced GL_KHR_blend_equation_advanced_coherent Window system binding vendor Window system binding version Window system binding extensions Direct rendering Yes Reset notification strategy 0x0000 GPU process crash count 0 Compositor Information Tile Update Mode Zero-copy Partial Raster Enabled GpuMemoryBuffers Status ATC Software only ATCIA Software only DXT1 Software only DXT5 Software only ETC1 Software only R_8 GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT R_16 Software only RG_88 Software only 
BGR_565 Software only RGBA_4444 Software only RGBX_8888 Software only RGBA_8888 GPU_READ, SCANOUT, SCANOUT_CPU_READ_WRITE BGRX_8888 GPU_READ, SCANOUT, SCANOUT_CPU_READ_WRITE BGRA_8888 GPU_READ, SCANOUT, SCANOUT_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT RGBA_F16 GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT YVU_420 Software only YUV_420_BIPLANAR GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT UYVY_422 GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT Display(s) Information Info Display[69731906] bounds=0,0 1440x900, workarea=0,23 1440x877, scale=2, external Color space information {primaries:[[0.4443,0.3794,0.1404,],[0.2248,0.7262,0.0491,],[0.0055,0.0780,0.7415,],], transfer:0.0774*x + 0.0000 if x < 0.0404 else (0.9479*x + 0.0521)**2.4000 + 0.0000, matrix:RGB, range:FULL, icc_profile_id:10} Bits per color component 8 Bits per pixel 24
,
Jan 7 2018
Completely unscientifically, it _feels_ like scrolling is a little more janky too.
,
Jan 7 2018
Looking at it a little more, I think it's being caused by display ad JavaScript. If I watch the CPU usage in the subframes, it's usually some ad provider subframe spiking when the browser CPU spikes. The JS console is also full of warnings and messages.
,
Jan 8 2018
[+alexmos, nasko, dcheng, lukasza] Sounds like we should get a trace to see what extra work is being done. Without looking at it, the main differences I would expect would be IPCs to replicate state. My first concern would be replication of very large frame names on ads, since they put JS code blocks into their window.name. (We dealt with that for unique name in session history, but the full frame name still gets replicated.)
,
Jan 8 2018
I'll bring a repro in tomorrow so we can look in person if that's helpful.
,
Jan 8 2018
Attaching 2 traces for ipc/navigation from my clean profile on Mac. It looks like we're getting a ton of post messages, and we also seem to be updating hit-test data a lot more frequently? I couldn't detangle the minified JS, but it seems to be triggered by osd.js:39 in the minified code, which runs on a timer. Gonna stop digging unless all y'all tell me otherwise.
,
Jan 8 2018
I was also able to repro on Linux sporadically. It does seem to depend on which banner ads display, and for some reason my Linux machine (which is on a different network from my laptop) seems to render a very different set of ads.
,
Jan 9 2018
,
Jan 10 2018
,
Jan 13 2018
I also tried looking at this. The banner ad definitely makes a difference in overall CPU usage of the tab. However, I did not see any difference in CPU usage between default mode and --site-per-process: both variations hover at about 30-40% based on the Chrome task manager. I've also instrumented the window.name propagation to the browser process, and it is definitely not frequent and doesn't transfer a big name; most of the time the name is zero length. Given this, how sure are we that this is a --site-per-process-specific bug vs. just a generally badly behaving web site? ajwong@, can you try to load only that page in a browser instance and see if you can replicate the CPU usage even without --site-per-process? My testing is in a clean profile.
,
Jan 18 2018
After some discussion offline, ajwong@ pointed out that the increased CPU usage is in the browser process and I was tracking only renderer processes. It is indeed the case that we see increased CPU usage in browser process when the flag is enabled. I instrumented the handler for postMessage in the browser process, which is used for routing the messages between out-of-process iframes. The outcome is that I see about 82 postMessage messages routed per second on the site. I'd expect this is the main culprit for increased CPU usage in the browser process, since without --site-per-process postMessage between subframes on a page is handled entirely in the renderer process.
,
Jan 18 2018
I've also instrumented the local, in-process version of postMessage to see what happens without --site-per-process. The situation is the same: there are about 94 postMessage events per second being processed. It looks to me like this site is just spamming postMessage, and I fully expect this to be taking CPU cycles. It looks like we shift some of the CPU usage to the browser process with --site-per-process, as the messages need to be routed between processes, but this is by design.
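The counts above came from instrumenting Chromium's own postMessage handling. For a rough page-side approximation, a minimal sketch along the following lines could be pasted into the console of a frame. It is an assumption-laden illustration, not the instrumentation used in this bug: `MessageRateCounter` is a hypothetical helper, and the listener only sees `message` events delivered to the frame it runs in, so it will undercount traffic between other frames.

```typescript
// Hypothetical helper (not part of Chromium): counts events and reports
// the average rate since the counter was created.
class MessageRateCounter {
  private count = 0;
  private readonly startMs: number;

  constructor(nowMs: number) {
    this.startMs = nowMs;
  }

  // Call once per observed event.
  record(): void {
    this.count += 1;
  }

  // Average events per second since construction; 0 if no time has elapsed.
  ratePerSecond(nowMs: number): number {
    const elapsedSec = (nowMs - this.startMs) / 1000;
    return elapsedSec > 0 ? this.count / elapsedSec : 0;
  }
}

// Browser-only hookup. Accessed through `globalThis` so the snippet also
// loads (and does nothing) outside a browser.
const w = (globalThis as any).window;
if (w) {
  const counter = new MessageRateCounter(Date.now());
  // Only counts messages targeted at this frame's window.
  w.addEventListener("message", () => counter.record());
  w.setInterval(() => {
    const rate = counter.ratePerSecond(Date.now());
    w.console.log(`~${rate.toFixed(1)} message events/s in this frame`);
  }, 5000);
}
```

Comparing the logged rate with and without --site-per-process in the same frame would only show whether the page's own message volume changes. It says nothing about where the routing cost lands, which is what the browser-process instrumentation was measuring.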
,
Jan 22 2018
Sounds like Nasko is tracking some upstream work on this, so I'll assign to him to comment on findings so far.
,
Jan 23 2018
nasko@ and I looked at a couple of ETW traces. There does seem to be a small difference based on eyeballing things, but it didn't seem very large. Most of the samples recorded were in a renderer process, not the browser process. We did not open the Chrome Task Manager though, as it seems to increase load quite a bit since it dumps memory statistics fairly often.
,
Jan 24 2018
Initial report had: "Eyeballing it, the browser process, with Site Isolation enabled, use a sustained 5-10% more CPU with spiky behavior that gives little periods of sustained 30% burn."

Based on all the investigation, it is clear that the sustained CPU usage in the browser process is mostly due to Task Manager running. The delta on top of it is small, depends on the OS, and is mostly caused by the rate of postMessage calls dispatched between cross-site iframes. With Site Isolation, they are routed through the browser process, so it is expected. As dcheng@ outlined in comment 14, at least on Windows the CPU cycles to do the routing are a small percentage compared to the rest of the activity in the system. I don't know how to collect data on CPU cycles used on other OS versions, so maybe someone else can help with that.

The periods of sustained 30% burn cited in the report are encountered when subframes are navigating from one document to another, which causes network activity and processing. This is also expected and consistent whether we have --site-per-process or not.

Based on this, it looks to me that everything is working as expected. If anyone can help collect CPU cycles data on Linux and Mac, please let me know. I will keep the bug open for a bit longer to see if we can collect more data, and if not, will resolve as WontFix in the future.
,
Jan 24 2018
Nice analysis! So the polling and data collection methodology of task manager is just doing something weird here that is reporting misleading/incorrectly high cpu usage stats?
,
Jan 31 2018
The NextAction date has arrived: 2018-01-31
,
Feb 1 2018
ajwong@ - on Windows, where we could look at ETW tracing, the reported CPU usage wasn't as high as it was reported on Linux. I'm not an expert in the area, so I'd assume on Linux our polling in the task manager behaves differently. Overall, the main contributor we could find to using more CPU cycles was the postMessage calls made across processes. There wasn't anything else we could find in traces. Since postMessage must cross processes by design with Site Isolation, I'm going to close this as WontFix, as there isn't anything else actionable we can do here. If there are other cases that are found to lead to higher CPU usage, please file a bug.
Comment 1 by ajwong@chromium.org
, Jan 7 2018