Issue 631485 VTDecoderXPCService leaks IOSurfaces (causing OOM black boxes on screen)
Starred by 27 users. Reported by jani.kra...@gmail.com, Jul 26 2016
Status: ExternalDependency
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug



Chrome Version       :  52.0.2743.82 (64-bit)
URLs (if applicable) :
OS version               : 10.11.6
Behavior in Safari 3.x/4.x (if applicable): normal
Behavior in Firefox 3.x (if applicable): normal 
Behavior in Chrome for Windows: N/a

What steps will reproduce the problem?
(1) Open a website in a tab and leave it
(2) Open a new tab and continue to browse
(3) Periodically check back on the initial tab

What is the expected result?
Website has no change in appearance

What happens instead?
Parts of the website will be covered by black boxes, which will flicker as you hover over.

Chrome seems to trigger VTDecoderXPCService, which consumes memory. The same goes for the Google Chrome Helper tasks. Why does a simple, plain HTML site need 1 GB of memory?


 
Screen Shot 2016-07-26 at 15.52.22.png
729 KB View Download
Comment 1 by redins...@gmail.com, Jul 27 2016
I've been wrangling with this issue for a few hours. Have found after a restart and watching memory usage in OS X and Chrome Task Manager, usage was high-ish but no black boxes. As soon as I opened Atom I got black boxes in Chrome. Wonder if a conflict has arisen with the new version of chrome against some Chromium apps? 

OP, can you test whether you have any other apps relying on Chromium/Chrome rendering that affect this one way or the other?
Comment 2 by redins...@gmail.com, Jul 27 2016
CURSES! The moment I post this I get black boxes returning even with Atom closed. 
Screen Shot 2016-07-27 at 4.46.21 PM.png
265 KB View Download
Screen Shot 2016-07-27 at 4.45.27 PM.png
641 KB View Download
It started happening to me after the latest update.
Update: terminated all extensions and started working on a tab-by-tab basis:

Inbox is OK; Tweetdeck starts showing black boxes.
Components: Internals>GPU>Internals
Labels: -Pri-3 Pri-1
Status: Untriaged
Summary: Graphics distortion (black boxes) (was: Graphica distortion (black boxes))
Could these issues be due to the change in Mac compositing hitting issues on specific machines? I'm trying to figure out why Chromium developers, who use Macs very heavily, don't see it.

Could you both please tell us exactly what machine you're seeing this on, and paste in the contents of chrome://gpu? redinsect@ I see you've already done that over on crbug.com/630394. And what is "Atom" in this context?

And are you sure you have no malicious extensions or other malware?

Assigning to the GPU team. If this were paint or invalidation I would expect to see it everywhere.

I don't think this is the same as 630394.
this has not been an issue until now.
Seeing this on two different machines with the same version of Chrome: a 13" MBP and a 15" MBP, both 2015, i7, 16 GB.

Contents of chrome://gpu:

Graphics Feature Status
Canvas: Software only, hardware acceleration unavailable
Flash: Software only, hardware acceleration unavailable
Flash Stage3D: Software only, hardware acceleration unavailable
Flash Stage3D Baseline profile: Software only, hardware acceleration unavailable
Compositing: Software only, hardware acceleration unavailable
Multiple Raster Threads: Unavailable
Native GpuMemoryBuffers: Software only, hardware acceleration unavailable
Rasterization: Software only, hardware acceleration unavailable
Video Decode: Software only, hardware acceleration unavailable
Video Encode: Software only, hardware acceleration unavailable
WebGL: Unavailable
Driver Bug Workarounds
disable_multimonitor_multisampling
disable_texture_cube_map_seamless
disable_webgl_rgb_multisampling_usage
init_varyings_without_static_use
msaa_is_slow
pack_parameters_workaround_with_pack_buffer
regenerate_struct_names
scalarize_vec_and_mat_constructor_args
set_zero_level_before_generating_mipmap
unfold_short_circuit_as_ternary_operation
unpack_alignment_workaround_with_unpack_buffer
use_intermediary_for_copy_texture_image
validate_multisample_buffer_allocation
Problems Detected
GPU process was unable to boot: GPU access is disabled in chrome://settings.
Disabled Features: all
Multisampling is buggy on OSX when multiple monitors are connected: 237931
Applied Workarounds: disable_multimonitor_multisampling
Multisampled renderbuffer allocation must be validated on some Macs: 290391
Applied Workarounds: validate_multisample_buffer_allocation
Unfold short circuit on Mac OS X: 307751
Applied Workarounds: unfold_short_circuit_as_ternary_operation
Mac drivers handle varyings without static use incorrectly: 322760
Applied Workarounds: init_varyings_without_static_use
Always rewrite vec/mat constructors to be consistent: 398694
Applied Workarounds: scalarize_vec_and_mat_constructor_args
Mac drivers handle struct scopes incorrectly: 403957
Applied Workarounds: regenerate_struct_names
On Intel GPUs MSAA performance is not acceptable for GPU rasterization: 527565
Applied Workarounds: msaa_is_slow
glGenerateMipmap fails if the zero texture level is not set on some Mac drivers: 560499
Applied Workarounds: set_zero_level_before_generating_mipmap
Pack parameters work incorrectly with pack buffer bound: 563714
Applied Workarounds: pack_parameters_workaround_with_pack_buffer
Alignment works incorrectly with unpack buffer bound: 563714
Applied Workarounds: unpack_alignment_workaround_with_unpack_buffer
copyTexImage2D fails when reading from IOSurface on multiple GPU types.: 581777
Applied Workarounds: use_intermediary_for_copy_texture_image
Seamless cubemap does not work for Mac Intel: 597794
Applied Workarounds: disable_texture_cube_map_seamless
Multisample renderbuffers with format GL_RGB8 have performance issues on Intel GPUs.: 607130
Applied Workarounds: disable_webgl_rgb_multisampling_usage
Version Information
Data exported	7/27/2016, 3:16:21 PM
Chrome version	Chrome/52.0.2743.82
Operating system	Mac OS X 10.11.6
Software rendering list version	11.7
Driver bug list version	8.68
ANGLE commit id	f07246f6a06d
2D graphics backend	Skia
Command Line Args	Chrome.app/Contents/MacOS/Google Chrome --flag-switches-begin --enable-password-generation --enable-features=enable-automatic-password-saving --flag-switches-end
Driver Information
Initialization time	26
In-process GPU	false
Sandboxed	true
GPU0	VENDOR = 0x8086, DEVICE= 0x0d26 *ACTIVE*
Optimus	false
AMD switchable	false
Driver vendor	
Driver version	
Driver date	
Pixel shader version	
Vertex shader version	
Max. MSAA samples	
Machine model name	MacBookPro
Machine model version	11.2
GL_VENDOR	
GL_RENDERER	
GL_VERSION	
GL_EXTENSIONS	
Disabled Extensions	
Window system binding vendor	
Window system binding version	
Window system binding extensions	
Direct rendering	Yes
Reset notification strategy	0x0000
GPU process crash count	3
Compositor Information
Tile Update Mode	Zero-copy
Partial Raster	Disabled
GpuMemoryBuffers Status
ATC	Software only
ATCIA	Software only
DXT1	Software only
DXT5	Software only
ETC1	Software only
R_8	GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
BGR_565	Software only
RGBA_4444	Software only
RGBX_8888	Software only
RGBA_8888	GPU_READ, SCANOUT
BGRX_8888	GPU_READ, SCANOUT
BGRA_8888	GPU_READ, SCANOUT, GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
YUV_420	Software only
YUV_420_BIPLANAR	GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
UYVY_422	GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
Log Messages
GpuProcessHostUIShim: You killed the GPU process! Why?
GpuProcessHostUIShim: You killed the GPU process! Why?
GpuProcessHostUIShim: You killed the GPU process! Why?


Components: Internals>Compositing
Thanks for the info. One other request: Do you see the same behavior in Incognito mode?  I know we're asking a lot but we really want to figure this out.
yes, see attached
Screen Shot 2016-07-27 at 17.14.08.png
1.7 MB View Download
In your about:gpu, I see that hardware acceleration is disabled. Can you verify that it is?

In particular, could you go to about:settings, go to "advanced", and un-check "use hardware acceleration when available".

Does the issue still happen after you do that and restart Chrome?

WRT the black tiles, that looks like the GPU driver is failing allocations (not much can be done there).

WRT the flickering in #7, that looks vaguely like issue 543324, but we had only seen that on some ATI cards, and it would never happen in software mode.
Labels: Needs-Feedback
Hardware acceleration is enabled, i.e. "use hardware acceleration if available" is checked. I will try the toggle and restart later. Is there an older version of Chrome available to test in parallel (v51, for example)?

Canvas: Hardware accelerated
Flash: Hardware accelerated
Flash Stage3D: Hardware accelerated
Flash Stage3D Baseline profile: Hardware accelerated
Compositing: Hardware accelerated
Multiple Raster Threads: Enabled
Native GpuMemoryBuffers: Hardware accelerated
Rasterization: Software only. Hardware acceleration disabled
Video Decode: Hardware accelerated
Video Encode: Hardware accelerated
WebGL: Hardware accelerated


see attached the series of screenshots with rising memory usage. same tabs open, no other activity


gpu.jpg
770 KB View Download
Screen Shot 2016-07-27 at 20.00.20.jpg
1.2 MB View Download
Screen Shot 2016-07-27 at 20.01.14.jpg
1.3 MB View Download
Screen Shot 2016-07-27 at 20.05.18.jpg
1.3 MB View Download
Screen Shot 2016-07-27 at 20.10.56.jpg
1.3 MB View Download
Cc: ccameron@chromium.org
Components: -Internals>Compositing Internals>Compositing>Rasterization
Status: Available
Comment 16 by enne@chromium.org, Jul 27 2016
Cc: ericrk@chromium.org vmi...@chromium.org
This seems to happen when Chrome pushes VTDecoderXPCService memory usage above 1.5 GB.
Only the tabs shown were open, and they had been running for a while. No other apps were running.

No software updates other than the Chrome update to the latest version.
Can we get an earlier version of Chrome to test?
Screen Shot 2016-07-27 at 23.16.34.jpg
928 KB View Download
Screen Shot 2016-07-27 at 23.16.46.jpg
776 KB View Download
Screen Shot 2016-07-27 at 23.17.07.jpg
1.2 MB View Download
Screen Shot 2016-07-27 at 23.18.07.jpg
926 KB View Download
@ schenney & ccameron - confirming as above that the issue persists in incognito. 

Have disabled hardware acceleration and restarted Chrome - will monitor over the next hour. If black-boxing reappears I'll disable all extensions and continue testing. Note: black boxes only started occurring just under 24 hours ago, seemingly at the exact time Chrome updated to version 52.x

Machine specs: 

Brand new (1 week old) 15" Macbook Pro (lower tier, integrated graphics, 16GB RAM). More detail attached.  

As per question above, Atom in this instance is: https://atom.io/ which I believe uses chromium for rendering? Have noticed rare instances where the text area display in atom also started black-boxing, only started happening since the same symptoms occurred in Chrome proper. No issues with Chrome or Atom over the last week until yesterday. 




Screen Shot 2016-07-28 at 9.08.23 AM.png
77.6 KB View Download
Screen Shot 2016-07-28 at 9.08.38 AM.png
62.1 KB View Download
Screen Shot 2016-07-28 at 9.08.46 AM.png
110 KB View Download
Screen Shot 2016-07-28 at 9.08.53 AM.png
81.1 KB View Download
Following up, with hardware acceleration off - flickering has reappeared (tweetdeck is a prime offender), but no outright black boxes yet. Will continue to test for a bit, before disabling all extensions and going from there. 

Tweetdeck behaviour attached (.mov screencap).  
tweetdeck-flickering.mov
8.2 MB Download
Disabled all extensions, restarted Chrome = same flickering in Tweetdeck. 
Restarted MacBook = same flickering in Tweetdeck.

Still no black boxes.

Have re-enabled hardware acceleration... still have all extensions disabled. Will continue to monitor. 
OK black boxes are back (all extensions disabled, hardware acceleration on) - like jani said in comment #17 this happened immediately on VTDecoderXPCService breaking 1.5GB memory. (See attached). 

In summary of my testing: 

- Black boxes seem to occur when hardware acceleration is ON and VTDecoderXPCService exceeds 1.5GB memory

- Black boxes occur in normal or incognito tabs/windows

- Black boxes occur regardless of extensions (tried with all my extensions on and off in all configurations)

- When hardware acceleration is OFF I don't seem to get the black boxes (VTDecoderXPCService never starts up?), but DO get flickering and odd rendering (as per above post with attached .mov tweetdeck is a prime offender and is essentially unusable; image viewing in G+ has odd behaviour like overlays rendering behind the image then hiccuping into view, see attached.)


gplus-image-hiccup.mp4
7.1 MB View Download
Can confirm the latest Canary suffers the same problem. 54.0.2808.0

Open a YouTube video in Chrome and keep it playing (choose a long one). VTDecoderXPCService will slowly eat more and more memory. Once it crosses 1.5GB (approx) the black box artefacts start appearing.

Restarting Chrome fixes the problem until the next time VTDecoderXPCService gets to 1.5 GB.
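
The growth described above can also be tracked from a terminal instead of by watching Activity Monitor. What follows is a minimal Python sketch, not a tool used in this thread. It assumes `ps -axo rss=,comm=` is available (as on macOS), and it treats the 1.5 GB figure as the approximate level reported by commenters rather than a documented limit. The function names are hypothetical, and the resident size reported by `ps` may not match the figure Activity Monitor shows.

```python
import subprocess

PROCESS_NAME = "VTDecoderXPCService"
# Approximate level reported in this thread, not a documented limit.
THRESHOLD_KIB = int(1.5 * 1024 * 1024)


def resident_kib(ps_output, process_name=PROCESS_NAME):
    """Sum the RSS (in KiB) of every process whose command ends with process_name.

    ps_output is the text printed by `ps -axo rss=,comm=`: one process per
    line, RSS first, then the command path (which may contain spaces).
    """
    total = 0
    for line in ps_output.splitlines():
        fields = line.strip().split(None, 1)
        if len(fields) != 2 or not fields[0].isdigit():
            continue
        rss, command = fields
        # Compare only the last path component of the command.
        if command.rsplit("/", 1)[-1] == process_name:
            total += int(rss)
    return total


def current_resident_kib():
    """Run ps once and return the decoder service's resident memory in KiB."""
    result = subprocess.run(
        ["ps", "-axo", "rss=,comm="], capture_output=True, text=True, check=True
    )
    return resident_kib(result.stdout)


def over_threshold(kib, threshold_kib=THRESHOLD_KIB):
    """True once the service has reached the level where black boxes were reported."""
    return kib >= threshold_kib
```

Calling `over_threshold(current_resident_kib())` every minute or so while a long video plays should show whether the artefacts line up with the memory level on a given machine.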
awesome investigation work, thanks. 
Thanks. Just to keep adding detail - after several hours of black-boxing at 1.5-1.6GB usage of VTDecoderXPCService, I've now pushed it all the way out to 1.9GB before getting the black-boxing.

No idea what changed, but very possible VTDecoderXPCService is a red herring, or at least a symptom and not a cause. 

Anyway, Chromium devs, happy to keep giving you any/all info you need to help trouble shoot this one. 

Just to reconfirm (and same with jani in previous posts) this wasn't happening at all on these same machines until the 52 update. 
Cc: sande...@chromium.org
Owner: ccameron@chromium.org
It looks like there are two distinct problems here.

1. With hardware acceleration, it sounds like VTDecoderXPCService is leaking decoded frames, causing allocation failures, which in turn cause the black boxes. In M52 we started using AVFoundation to draw frames produced by VideoToolbox. Our usage of this is very straightforward, and this has been in Canary since March, so this is something of a surprise.

2. In software mode, the issue reported in #19 looks to be a double-buffering issue where the WindowServer is reading from a buffer while we are writing into it. This is theoretically possible -- I can look into making that go away.
I had similar issues while watching HTML5 videos. Didn't find this issue before, so opened one myself: https://bugs.chromium.org/p/chromium/issues/detail?id=632178 for anyone interested.
redins... was so kind to link me to this thread. Thx!
Issue 632178 has been merged into this issue.
Cc: ligim...@chromium.org
Thanks for the feedback -- I have some theories about what may be causing this, but since I can't reproduce it, I'll need your help to test them.

I have three builds in the file at
https://drive.google.com/file/d/0B6kh5pYRi1dKYVh0WnVkSnRzNUE/view?usp=sharing

They're Chromium (open source version), not Chrome, and you can run them side-by-side with Chrome still open.

The 3 versions are "default_build", which is a control, "no_software_av", which will affect how YouTube and Netflix are presented on-screen, and "no_av_ever", which affects all video.

Can you try to reproduce the issue with the 3 builds, and let me know what your results are?

If we're lucky, then your results will be
 default_build: problem exists
 no_software_av: problem does not exist
 no_av_ever: problem does not exist

If that's the case, then there is basically no down-side to the fix. If we don't see the problem with default_build, then the results are invalid. If the problem continues to exist with no_software_av but goes away with no_av_ever, then the fix will negatively affect our power consumption (at least until we can figure out the exact cause).
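
The decision logic in the comment above can be written down as a small lookup. This is only an illustrative Python sketch of the reasoning as stated. The function name and the returned strings are hypothetical, and the last branch covers combinations the comment does not discuss.

```python
def interpret_results(default_build, no_software_av, no_av_ever):
    """Map the three build outcomes to the conclusions stated in the comment.

    Each argument is True if the black boxes reproduced with that build.
    """
    if not default_build:
        # The control build must show the problem for the test to mean anything.
        return "invalid: the control build did not reproduce the problem"
    if not no_software_av and not no_av_ever:
        return "lucky case: the fix has basically no downside"
    if no_software_av and not no_av_ever:
        return "the fix will negatively affect power consumption"
    # Any other combination is not discussed in the comment.
    return "outcome not covered by the comment"
```

For example, `interpret_results(True, False, False)` corresponds to the "lucky" outcome listed above.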
Cool, thanks.
Running the 1st one, default_build, on its own.
Labels: M-52
I am unable to reproduce in 52.0.2743.82 in Mac 10.11.5.

GPU: Intel HD 3000 OpenGL Engine with Hardware acceleration.

Tried playing netflix, youtube, vimeo videos, works fine.
Ok I just checked all 3 versions and I think we got lucky:

as predicted:
 default_build: problem exists
 no_software_av: problem does not exist
 no_av_ever: problem does not exist

I added a screenshot where you can see the black boxes as well as the increasing memory usage (pie chart in the top panel - red indicates compressed memory!)

My specs are:
OS: 64bit Mac OS X 10.9.5 13F1911
Kernel: x86_64 Darwin 13.4.0
CPU: Intel Core2 Duo CPU P8600 @ 2.40GHz
GPU: NVIDIA GeForce 9400M / NVIDIA GeForce 9600M GT
RAM: 8192MiB

I hope this helps, thanks for your work! 
blackboxes.png
801 KB View Download
Thank you so much -- that's excellent news!

If it's not too much, could I request another experiment?

I'm trying to find the exact place where this is causing problems. The "no_software_av" build disallowed use of AVSampleBufferDisplayLayer with software-decoded video. I'd like to see if this is unique to the software-decode path, or if the problem exists in the hardware-decode path as well.

To test this, could you install the "h264ify" Chrome extension at
  https://chrome.google.com/webstore/detail/h264ify/aleakchihdccplidncghkekgioiakgal?hl=en-US
on the "no_software_av" build and then try to reproduce the problem using YouTube?

The "h264ify" extension will switch the build to requesting hardware-decoded h264 from YouTube, instead of software-decoded vp9. To test if h264ify has taken effect, you should be able to notice that your CPU usage when watching YouTube should be lower.

If the problem does not exist when using h264ify, then that's good news.
Did as you described and when using

h264ify with no_software_av

the problem does occur! However, the black boxes seem to be limited to the video window itself. I guess that's h264ify. I added a screenshot where one can see the memory increasing until video playback fails and the memory is freed up again.

Hope this helps anyway. 
Bildschirmfoto 2016-07-29 um 00.08.39.png
679 KB View Download
I just tested "no_software_av" with h264ify running. It took longer, but I finally encountered black boxes (though far fewer, and they cleared quickly), along with Tweetdeck content jittering and disappearing/reappearing.

When this happened, YouTube playback failed, as per the attached screenshots. (Note: it looks like VTDecoderXPCService reset moments prior to these screenshots.)
Screen Shot 2016-07-29 at 11.11.42 AM.png
847 KB View Download
Screen Shot 2016-07-29 at 11.11.29 AM.png
578 KB View Download
Screen Shot 2016-07-29 at 11.11.22 AM.png
852 KB View Download
Cc: erikc...@chromium.org
Thanks. It appears that the "h264" versus "vp9" issue is a red herring -- the problem is our use of AVSampleBufferDisplayLayer -- somehow that seems to be causing a leak. What puzzles me is that it only seems to be happening on a few systems (and, by Murphy's Law, none that we have in-house).

When you're experiencing the runaway memory allocation followed by black boxes, what happens if you run "ioclasscount IOSurface" at the command line? Does the count increment as the video plays?

For me it reads
  ccameron-macbookpro:src ccameron$ ioclasscount IOSurface
  IOSurface = 129
and stays more-or-less constant (if I open more windows, it goes up, but it goes back down when I close them).
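
To answer the question above without retyping the command, the count can be sampled on a timer. Here is a minimal Python sketch, assuming the `ioclasscount` tool is on the PATH (macOS only) and that it prints a line of the form `IOSurface = 129` as shown above. The function names and the default interval are hypothetical.

```python
import re
import subprocess
import time


def parse_count(output, class_name="IOSurface"):
    """Extract the instance count from output such as 'IOSurface = 129'."""
    pattern = r"^\s*" + re.escape(class_name) + r"\s*=\s*(\d+)\s*$"
    match = re.search(pattern, output, re.MULTILINE)
    return int(match.group(1)) if match else None


def sample_counts(samples=10, interval_seconds=30.0):
    """Run `ioclasscount IOSurface` repeatedly and return the counts in order."""
    counts = []
    for index in range(samples):
        result = subprocess.run(
            ["ioclasscount", "IOSurface"],
            capture_output=True, text=True, check=True,
        )
        counts.append(parse_count(result.stdout))
        # Sleep between samples, but not after the last one.
        if index < samples - 1:
            time.sleep(interval_seconds)
    return counts
```

A list that stays roughly flat while a video plays matches the healthy behaviour described above; a list that keeps climbing would point to a leak.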
@erik - I quit Chrome (stable), reopened with 8 pinned tabs and one full YouTube video that started playing, then ran "ioclasscount IOSurface" every 30 seconds (approx)

Karls-MacBook-Pro:~ karlsmith$ ioclasscount IOSurface
IOSurface = 66
Karls-MacBook-Pro:~ karlsmith$ ioclasscount IOSurface
IOSurface = 69
Karls-MacBook-Pro:~ karlsmith$ ioclasscount IOSurface
IOSurface = 69
Karls-MacBook-Pro:~ karlsmith$ ioclasscount IOSurface
IOSurface = 69
Karls-MacBook-Pro:~ karlsmith$ ioclasscount IOSurface
IOSurface = 68
Karls-MacBook-Pro:~ karlsmith$ ioclasscount IOSurface
IOSurface = 68


Erik, sorry missed the part about running that when we get the black boxes. I'll wait for it to reoccur then try again. 
Here you go, count doesn't seem to increment as videos play as far as I can tell unless it's happening very gradually which may be the case. To me it seems like over time it matches the increase in the VTDecoderXPCService...

The first run of "ioclasscount IOSurface" in the screenshot was when VTDecoderXPCService was at about 1.1GB and climbing. The last two are after the black boxes appeared.


Screen Shot 2016-07-29 at 3.04.58 PM.png
435 KB View Download
Cc: kbr@chromium.org
Woah, 4000-5000 IOSurfaces is a huge leak.

I wonder why this is only affecting a couple of machines (every local machine is having no problems with this).

It would be great if we could track where these IOSurfaces are coming from -- we were able to do that with ioreg prior to 10.10, but that tracking disappeared. I'm reaching out to see if we can get any information there.
Happy to keep helping as much as I can - it's very difficult for me to use Chrome right now, so I would love to help fix it. Let me know if there's anything further I can do.
It may be that we're enqueueing samples when -[AVSampleBufferDisplayLayer isReadyForMoreMediaData] would return false, which is "safe", but "highly discouraged."

I'll spin up a build tomorrow that blows up if that's ever the case.

In the meantime, could you try running with the command line flag "--show-fps-counter", and report what you see as the memory is blowing up? It could be that we're doing updates too rapidly, and that is getting things into a bad state.

Also, if you are having trouble using chrome, and just want to use it without these issues, the command line flag "--disable-mac-overlays" should fix the problem, albeit at a big cost to your battery.

Regardless of the outcome, the next M52 spin will have this issue fixed.

Also, if anyone affected by this issue is in the bay area and wants me to check out their machine in person, drop me a line in email (@chromium.org).
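
The two command-line flags mentioned in this comment can be passed when starting Chrome from a script. This is a minimal Python sketch: the flags are the ones quoted above, while the binary path is an assumption (the usual install location of Chrome stable on macOS) and the function names are hypothetical.

```python
import subprocess

# Assumption: default install location of Chrome stable on macOS.
CHROME_BINARY = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"


def chrome_command(show_fps_counter=False, disable_mac_overlays=False,
                   binary=CHROME_BINARY):
    """Build the argument list for launching Chrome with the flags from this thread."""
    args = [binary]
    if show_fps_counter:
        args.append("--show-fps-counter")
    if disable_mac_overlays:
        args.append("--disable-mac-overlays")
    return args


def launch(**flags):
    """Start Chrome with the chosen flags and return the process handle."""
    return subprocess.Popen(chrome_command(**flags))
```

Quit any running Chrome first; if an instance is already running, the flags may not take effect. `launch(disable_mac_overlays=True)` corresponds to the workaround described above, and `launch(show_fps_counter=True)` to the diagnostic request.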
Ran the no_software_av build with the h264ify extension, running YouTube, the Tweetdeck website, Hangouts and Google Play Music (see the screenshots and the timestamps on them).

black boxes appeared when VTDecoderXPCService exceeded 1.6GB memory usage

ioclasscount IOSurface trace: 

about:gpu output

Graphics Feature Status
Canvas: Hardware accelerated
Flash: Hardware accelerated
Flash Stage3D: Hardware accelerated
Flash Stage3D Baseline profile: Hardware accelerated
Compositing: Hardware accelerated
Multiple Raster Threads: Enabled
Native GpuMemoryBuffers: Hardware accelerated
Rasterization: Hardware accelerated
Video Decode: Hardware accelerated
Video Encode: Hardware accelerated
VPx Video Decode: Hardware accelerated
WebGL: Hardware accelerated
Driver Bug Workarounds
disable_framebuffer_cmaa
disable_multimonitor_multisampling
disable_webgl_rgb_multisampling_usage
msaa_is_slow
pack_parameters_workaround_with_pack_buffer
regenerate_struct_names
scalarize_vec_and_mat_constructor_args
set_zero_level_before_generating_mipmap
unfold_short_circuit_as_ternary_operation
unpack_alignment_workaround_with_unpack_buffer
use_intermediary_for_copy_texture_image
use_shadowed_tex_level_params
validate_multisample_buffer_allocation
Problems Detected
Multisampling is buggy on OSX when multiple monitors are connected: 237931
Applied Workarounds: disable_multimonitor_multisampling
Multisampled renderbuffer allocation must be validated on some Macs: 290391
Applied Workarounds: validate_multisample_buffer_allocation
Unfold short circuit on Mac OS X: 307751
Applied Workarounds: unfold_short_circuit_as_ternary_operation
Always rewrite vec/mat constructors to be consistent: 398694
Applied Workarounds: scalarize_vec_and_mat_constructor_args
Mac drivers handle struct scopes incorrectly: 403957
Applied Workarounds: regenerate_struct_names
On Intel GPUs MSAA performance is not acceptable for GPU rasterization: 527565
Applied Workarounds: msaa_is_slow
glGenerateMipmap fails if the zero texture level is not set on some Mac drivers: 560499
Applied Workarounds: set_zero_level_before_generating_mipmap
Pack parameters work incorrectly with pack buffer bound: 563714
Applied Workarounds: pack_parameters_workaround_with_pack_buffer
Alignment works incorrectly with unpack buffer bound: 563714
Applied Workarounds: unpack_alignment_workaround_with_unpack_buffer
copyTexImage2D fails when reading from IOSurface on multiple GPU types.: 581777
Applied Workarounds: use_intermediary_for_copy_texture_image
Multisample renderbuffers with format GL_RGB8 have performance issues on Intel GPUs.: 607130
Applied Workarounds: disable_webgl_rgb_multisampling_usage
Mac Drivers store texture level parameters on int16_t that overflow: 610153
Applied Workarounds: use_shadowed_tex_level_params
Limited enabling of Chromium GL_INTEL_framebuffer_CMAA: 535198
Applied Workarounds: disable_framebuffer_cmaa
Version Information
Data exported	7/29/2016, 10:07:56 AM
Chrome version	Chrome/54.0.2810.0
Operating system	Mac OS X 10.11.6
Software rendering list version	11.9
Driver bug list version	8.80
ANGLE commit id	c9bde92635e8
2D graphics backend	Skia
Command Line Args	-psn_0_286790 --flag-switches-begin --flag-switches-end
Driver Information
Initialization time	27
In-process GPU	false
Sandboxed	true
GPU0	VENDOR = 0x8086, DEVICE= 0x0d26 *ACTIVE*
Optimus	false
AMD switchable	false
Driver vendor	
Driver version	10.14.73
Driver date	
Pixel shader version	1.20
Vertex shader version	1.20
Max. MSAA samples	8
Machine model name	MacBookPro
Machine model version	11.2
GL_VENDOR	Intel Inc.
GL_RENDERER	Intel Iris Pro OpenGL Engine
GL_VERSION	2.1 INTEL-10.14.73
GL_EXTENSIONS	GL_ARB_color_buffer_float GL_ARB_depth_buffer_float GL_ARB_depth_clamp GL_ARB_depth_texture GL_ARB_draw_buffers GL_ARB_draw_elements_base_vertex GL_ARB_draw_instanced GL_ARB_fragment_program GL_ARB_fragment_program_shadow GL_ARB_fragment_shader GL_ARB_framebuffer_object GL_ARB_framebuffer_sRGB GL_ARB_half_float_pixel GL_ARB_half_float_vertex GL_ARB_instanced_arrays GL_ARB_multisample GL_ARB_multitexture GL_ARB_occlusion_query GL_ARB_pixel_buffer_object GL_ARB_point_parameters GL_ARB_point_sprite GL_ARB_provoking_vertex GL_ARB_seamless_cube_map GL_ARB_shader_objects GL_ARB_shader_texture_lod GL_ARB_shading_language_100 GL_ARB_shadow GL_ARB_sync GL_ARB_texture_border_clamp GL_ARB_texture_compression GL_ARB_texture_compression_rgtc GL_ARB_texture_cube_map GL_ARB_texture_env_add GL_ARB_texture_env_combine GL_ARB_texture_env_crossbar GL_ARB_texture_env_dot3 GL_ARB_texture_float GL_ARB_texture_mirrored_repeat GL_ARB_texture_non_power_of_two GL_ARB_texture_rectangle GL_ARB_texture_rg GL_ARB_transpose_matrix GL_ARB_vertex_array_bgra GL_ARB_vertex_blend GL_ARB_vertex_buffer_object GL_ARB_vertex_program GL_ARB_vertex_shader GL_ARB_window_pos GL_EXT_abgr GL_EXT_bgra GL_EXT_blend_color GL_EXT_blend_equation_separate GL_EXT_blend_func_separate GL_EXT_blend_minmax GL_EXT_blend_subtract GL_EXT_clip_volume_hint GL_EXT_debug_label GL_EXT_debug_marker GL_EXT_draw_buffers2 GL_EXT_draw_range_elements GL_EXT_fog_coord GL_EXT_framebuffer_blit GL_EXT_framebuffer_multisample GL_EXT_framebuffer_multisample_blit_scaled GL_EXT_framebuffer_object GL_EXT_framebuffer_sRGB GL_EXT_geometry_shader4 GL_EXT_gpu_program_parameters GL_EXT_gpu_shader4 GL_EXT_multi_draw_arrays GL_EXT_packed_depth_stencil GL_EXT_packed_float GL_EXT_provoking_vertex GL_EXT_rescale_normal GL_EXT_secondary_color GL_EXT_separate_specular_color GL_EXT_shadow_funcs GL_EXT_stencil_two_side GL_EXT_stencil_wrap GL_EXT_texture_array GL_EXT_texture_compression_dxt1 GL_EXT_texture_compression_s3tc 
GL_EXT_texture_env_add GL_EXT_texture_filter_anisotropic GL_EXT_texture_integer GL_EXT_texture_lod_bias GL_EXT_texture_rectangle GL_EXT_texture_shared_exponent GL_EXT_texture_sRGB GL_EXT_texture_sRGB_decode GL_EXT_timer_query GL_EXT_transform_feedback GL_EXT_vertex_array_bgra GL_APPLE_aux_depth_stencil GL_APPLE_client_storage GL_APPLE_element_array GL_APPLE_fence GL_APPLE_float_pixels GL_APPLE_flush_buffer_range GL_APPLE_flush_render GL_APPLE_object_purgeable GL_APPLE_packed_pixels GL_APPLE_pixel_buffer GL_APPLE_rgb_422 GL_APPLE_row_bytes GL_APPLE_specular_vector GL_APPLE_texture_range GL_APPLE_transform_hint GL_APPLE_vertex_array_object GL_APPLE_vertex_array_range GL_APPLE_vertex_point_size GL_APPLE_vertex_program_evaluators GL_APPLE_ycbcr_422 GL_ATI_separate_stencil GL_ATI_texture_env_combine3 GL_ATI_texture_float GL_ATI_texture_mirror_once GL_IBM_rasterpos_clip GL_NV_blend_square GL_NV_conditional_render GL_NV_depth_clamp GL_NV_fog_distance GL_NV_light_max_exponent GL_NV_texgen_reflection GL_NV_texture_barrier GL_SGIS_generate_mipmap GL_SGIS_texture_edge_clamp GL_SGIS_texture_lod
Disabled Extensions	
Window system binding vendor	
Window system binding version	
Window system binding extensions	
Direct rendering	Yes
Reset notification strategy	0x0000
GPU process crash count	0
Compositor Information
Tile Update Mode	Zero-copy
Partial Raster	Enabled
GpuMemoryBuffers Status
ATC	Software only
ATCIA	Software only
DXT1	Software only
DXT5	Software only
ETC1	Software only
R_8	GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
BGR_565	Software only
RGBA_4444	Software only
RGBX_8888	Software only
RGBA_8888	GPU_READ, SCANOUT
BGRX_8888	GPU_READ, SCANOUT
BGRA_8888	GPU_READ, SCANOUT, GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
YVU_420	Software only
YUV_420_BIPLANAR	GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT
UYVY_422	GPU_READ_CPU_READ_WRITE, GPU_READ_CPU_READ_WRITE_PERSISTENT


Log Messages
[666:15527:0729/090808:ERROR:vt_video_decode_accelerator_mac.cc(649)] : Illegal attempt to decode without IDR. Discarding decode requests until next IDR.
[666:1295:0729/093322:ERROR:gl_image_io_surface.mm(318)] : Error in CGLTexImageIOSurface2D for the Y plane. 10008
[666:1295:0729/093720:ERROR:interface_registry.cc(80)] : Failed to locate a binder for interface: mojom::ResourceUsageReporter
Screen Shot 2016-07-29 at 09.20.31.png
4.6 MB View Download
Screen Shot 2016-07-29 at 09.47.17.png
2.3 MB View Download
Screen Shot 2016-07-29 at 09.58.38.png
2.0 MB View Download
stable-chrome-blackbox.png
2.2 MB View Download
Screen Shot 2016-07-29 at 10.19.20.png
3.9 MB View Download
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 365
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 357
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 357
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 472
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 511
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 546
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 776
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 971
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 926
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 1185
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 1641
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 1675
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 2085
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 2186
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 2398
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 2560
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 2643
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 2812
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 3148
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 3386
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 3579
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 3990
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 4415
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 4642
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 4918
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 5213
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 5331
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 5474
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 5708
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 5694
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 5700
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 5575
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 5581
JanezKraners-MacBook-Pro:~ janez.kraner$ ioclasscount IOSurface
IOSurface = 5581
JanezKraners-MacBook-Pro:~ janez.kraner$ 
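For anyone who would rather summarize a pasted transcript like the one above than eyeball it, here is a minimal TypeScript sketch. It assumes the output format shown above (one `IOSurface = N` reading per line). The names `parseIOSurfaceCounts` and `summarizeGrowth` and the default threshold of 1000 are illustrative assumptions, not part of any Chromium or Apple tool.

```typescript
// Extract the numeric readings from pasted `ioclasscount IOSurface` output.
// Shell prompt lines and anything else that is not a reading are ignored.
function parseIOSurfaceCounts(transcript: string): number[] {
  const readingPattern = /^IOSurface = (\d+)$/;
  const counts: number[] = [];
  for (const line of transcript.split("\n")) {
    const match = readingPattern.exec(line.trim());
    if (match) counts.push(Number(match[1]));
  }
  return counts;
}

interface GrowthSummary {
  first: number;
  last: number;
  peak: number;
  suspicious: boolean; // true when the peak exceeds the (assumed) threshold
}

// Summarize a series of readings. The threshold is a rough, hypothetical
// cut-off for "probably leaking"; tune it for your own machine.
function summarizeGrowth(counts: number[], threshold = 1000): GrowthSummary | null {
  if (counts.length === 0) return null;
  const first = counts[0] ?? 0;
  const last = counts[counts.length - 1] ?? 0;
  const peak = Math.max(...counts);
  return { first, last, peak, suspicious: peak > threshold };
}
```

Fed the transcript above, this would report a first reading of 365, a last reading of 5581 and a peak of 5708, so the run would be flagged as suspicious under the assumed threshold.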
Labels: ReleaseBlock-Stable
Adding "ReleaseBlock-Stable" as we're planning to take this fix in for next stable refresh per comment #43.
Thanks for the reports!

I have a new build to test, this time going on the theory that AVSampleBufferDisplayLayer's internal buffers are getting over-stuffed. The build is called "check_ready_and_flush" and is at

  https://drive.google.com/file/d/0B6kh5pYRi1dKQXZQUXY4LXdrNEE/view?usp=sharing

This build will work with h264ify or without (no difference). See if this causes the black boxes. Also see if it causes a gray/pink/yellow grid to appear (the grid will tell me if we hit the condition that I'm concerned about).
Just tested "check_ready_and_flush" (with h264ify). No luck and no gray/pink/yellow grid appears. Sorry :(


Bildschirmfoto 2016-07-29 um 19.04.07.png
547 KB View Download
Oh no.

Quick question to verify: The ioclasscount was small (like ~100) before Chromium started, and then it blew up while watching a video, right? If it's already >1000 before Chromium starts, then there will be black boxes (because there isn't enough memory around anymore), and you may need to reboot first (but, before that, run the attached program described below!)

Oh -- you're on 10.9! That's before they broke the IOSurface reporting. Instead of ioclasscount, try running this program (I've attached the source -- the instructions to build and run are at the top -- "g++ iosurface_dump.cc -framework IOKit -framework CoreFoundation && ./a.out").

That will tell me which process is responsible for holding these surfaces. Could you attach its output before starting the Chromium build, and again once Chromium has caused the black boxing problem?

I have one more build, which very aggressively flushes these buffers -- it's "flush_every_frame", and is at
  https://drive.google.com/file/d/0B6kh5pYRi1dKUWlvcXNmUXNEcEk/view?usp=sharing
It should put a green border around all videos.


iosurface_dump.cc
5.3 KB View Download
Oh, I just started the ioclasscount very late. That's why it's already so high. It's usually around ~150 or so. I will try out the things you mentioned and get back to you in a bit...
Hey, I just tried to produce the output with iosurface_dump.cc, still using check_ready_and_flush. This time, however, I also enabled --show-fps-counter, and nothing happened. No black boxes, no memory overflow. All seemed good. I attached an output of iosurface_dump.cc called "all_ok", taken during video playback. Then I switched --show-fps-counter off again to produce an output with black boxes, but when they appeared and I started iosurface_dump, it got stuck. I couldn't kill it or anything. I tried to reboot: nothing. I had to do a hard reset. Maybe too much data? The file is empty, so unfortunately nothing got logged. But maybe the iosurface_dump.cc helps?
btw this time I also disabled h264ify.
iosurface_dump_all_ok.txt
55.1 KB View Download
iosurface_dump_before.txt
46.8 KB View Download
meant to say: "But maybe the  --show-fps-counter thing helps?"
The flush_every_frame version works! Can see the green border and no problems so far.
It's odd that --show-fps-counter helped. I'm worried now that just drawing the green border fixed things in flush_every_frame.

I think we're close -- I've removed the green border and chopped flush_every_frame into a handful of separate parts. There are 3 builds here, can you let me know if they start causing the leak?

  https://drive.google.com/file/d/0B6kh5pYRi1dKeXIxdUQyNER1U3M/view?usp=sharing

For my reference, the builds are:

test_a:
  same as flush_every_frame, but no green border
  - flushAndRemoveImage before every enqueueSampleBuffer
  - flushAndRemoveImage in ContentLayer dtor
  - no calls to IOSurfaceIncrementUseCount
  - no fslp frames

test_b:
  - flush before every enqueueSampleBuffer
  - flushAndRemoveImage in ContentLayer dtor

test_c:
  - no calls to IOSurfaceIncrementUseCount
  - flushAndRemoveImage in ContentLayer dtor

Thank you again for all of your help!!

Hey, sorry for the delay. Here are the results:

test_a OK
test_b OK
test_c NOT OK --> MEMORY/BLACK BOXES

hope this helps! 
M52 Stable Release is blocked due to this issue. The original plan was to cut the RC at 5.00 pm today.

Chris/Ken, do you think this is critical for M52 Stable? We can take the fixes in later Stable refreshes. Please confirm.
test_a : black box and memory issues
see screenshot
Screen Shot 2016-07-29 at 23.23.17.jpg
789 KB View Download
Project Member Comment 58 by bugdroid1@chromium.org, Jul 29 2016
Labels: merge-merged-2743
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/52c1eb8aeaf97d9158b24bf237b88ba7d8ee396b

commit 52c1eb8aeaf97d9158b24bf237b88ba7d8ee396b
Author: Christopher Cameron <ccameron@chromium.org>
Date: Fri Jul 29 22:41:18 2016

Mac: Disable AVSampleBufferDisplayLayer

There are reports of this leaking IOSurfaces.

BUG=631485

Review URL: https://codereview.chromium.org/2189423003 .

Cr-Commit-Position: refs/branch-heads/2743@{#711}
Cr-Branched-From: 2b3ae3b8090361f8af5a611712fc1a5ab2de53cb-refs/heads/master@{#394939}

[modify] https://crrev.com/52c1eb8aeaf97d9158b24bf237b88ba7d8ee396b/ui/accelerated_widget_mac/ca_renderer_layer_tree.mm

Approved M52 merge by chat. As per ccameron@, the fix (CL listed at comment#58) disables the feature that is new in M52 as it is safest to do.
Summary: AVSampleBufferDisplayLayer leaks IOSurfaces (causing OOM black boxes on screen) (was: Graphics distortion (black boxes))
[Changing the title of the bug]

Well, crap. So #55 gave me hope that we had a fix here. But #57 leaves me basically where we started.

AVSampleBufferDisplayLayer is the way to play accelerated video with minimal power usage (see issue 594449), so it's really upsetting to have to disable this.


test_b : black box and memory issues
see screenshots
Screen Shot 2016-07-30 at 01.03.32.jpg
1.1 MB View Download
Screen Shot 2016-07-30 at 01.04.26.jpg
986 KB View Download
It looks like there are different issues between 10.9 and 10.11 -- the bug in 10.9 was solved by adding just a "flush" -- the bug in 10.11 seems to have more problems.

Thanks sebastian.@ -- I think that 10.9 has a fix now

jani.@ and redinsect@, a couple of things I'd like you to try:

1. Did you have any luck with the "check_ready_and_flush" build from https://bugs.chromium.org/p/chromium/issues/detail?id=631485#c47 ? Did a grid appear before the black tiles?

2. I reduced Chrome's video decode path to a stand-alone application, which I've put at
  https://drive.google.com/file/d/0B6kh5pYRi1dKLTEwdlBmRVdybzg/view?usp=sharing
Could you try to run it and see if your IOSurface count grows while it is running?

It is already built, so, from a command line, do "./test_player 720p30fps.mp4", or alternatively type "make". If it has issues running, and you have Xcode, you can also do "make clean && make".

Also, if anyone is having this issue on 10.11 in the SF bay area or in the LA area, drop me a line via email so that I can take a look in person.
Btw, my one remaining theory is that our YUV->RGB [1] OpenGL shader has a bad interaction with AVSampleBufferDisplayLayer (like the infamous issue 158469).

I'll spin up a build that just skips that part, and we'll see if that plugs the leak. That may have to wait until Monday.

[1] https://cs.chromium.org/chromium/src/ui/gl/gl_image_io_surface.mm?rcl=0&l=335
Hey Chris - #2 (test_player) result attached. Looks totally fine (assuming I'm doing it right). 
Screen Shot 2016-07-30 at 4.10.27 PM.png
2.0 MB View Download
#1 Check ready and flush - I'm about 20 mins into testing (running x4 YouTube tabs playing video) and looks promising so far - memory usage is staying low! Will keep testing though and give you a definitive result in the next hour. 
Screen Shot 2016-07-30 at 4.25.13 PM.png
867 KB View Download
Cc: abodenha@chromium.org danakj@chromium.org rookrishna@chromium.org abod...@chromium.org ananthak@chromium.org dhadd...@chromium.org vollick@chromium.org
Issue 611310 has been merged into this issue.
Very interesting. Does it also stay low if you leave the YouTube videos visible (like, on-screen)? When they're in background tabs, they don't send their frames to the AVSampleBufferDisplayLayer (which is the thing that appears to be causing the leak).

Hey Chris, yeah just an update, still in the same session and I'm definitely getting inflated mem usage. Haven't had the black boxes yet, but trying to trigger them (and see if the grid appears). 
Screen Shot 2016-07-30 at 4.52.50 PM.png
1.9 MB View Download
@ccameron no, i didn't try "check_ready_and_flush" build, will do it next and then test the latest build.

btw: i don't have to keep the tab visible. get issues regardless 
OK another update, again this is running "Check ready and flush"

I managed to spike VTDecoderXPCService above 2GB briefly, before it settled back at around 1.7GB; this was after it had sat for a while at about 500MB. I did this by loading facebook and buzzfeed video, and watching video after video in theatre mode, switching every 10-20 seconds or so. On each new video load, sometimes there would be no increase in mem usage, and sometimes it would jump 400-500MB at a time.

During this time ioclasscount IOSurface stayed no higher than 3200, though I got it to briefly spike (see attached) then it dropped down again. 

During this whole time, no black boxes, no pink/grey/yellow grid.
Screen Shot 2016-07-30 at 5.08.35 PM.png
1.9 MB View Download
"Check ready and flush"
blackboxes, no grid appeared
Screen Shot 2016-07-30 at 11.50.29.jpg
1.0 MB View Download
Screen Shot 2016-07-30 at 11.50.42.jpg
959 KB View Download
Screen Shot 2016-07-30 at 11.53.02.jpg
798 KB View Download
Screen Shot 2016-07-30 at 11.54.19.jpg
1.0 MB View Download
Re #69 -- the not needing to be visible -- that's *very* concerning!

I just realized, I only had users on 10.9 verifying that the fix that I'm planning to ship to M52 actually works!

The fix that we're pushing to M52 next week is the "no_av_ever" build in #30. Did that fix the issue for you?

If not, then I'll need to scramble to find a new fix (my last remaining guess is #63).
I'll run "no_av_ever" later.
I'm trying to see if there's a way to detect-and-recover from this.

You mentioned that when you quit Chrome then (1) VTDecoderXPCService memory usage returns to normal, and (2) 'ioclasscount IOSurface' returns to normal.

Q1. Does this return to normal if you close all of the Chrome windows that are playing video? Or does it remain high?

Q2. What about closing all Chrome windows, but leaving the app running?

Q3. What about when you kill the GPU process (go to "Window" in the menu bar, "Task Manager", select "GPU Process", and press "End Task")?

If this is the case, then we can make it so that we use AVSampleBufferDisplayLayer, but we kill the GPU process if we see VTDecoderXPCService's memory blow up, and disallow further use of AVSampleBufferDisplayLayer.

(this is all assuming that no_av_ever works ... if it doesn't, we're pretty hosed).
Also, when you start seeing the memory usage increase, do you always see the error

 ERROR:vt_video_decode_accelerator_mac.cc(649) : Illegal attempt to decode without IDR. Discarding decode requests until next IDR.

in your logs in about:gpu?
Comment 76 by j...@6bit.com, Jul 30 2016
I run into issues similar to this constantly, but only when my laptop switches from integrated to high perf graphics, or vice versa.

I use an external thunderbolt display which automatically kicks in high perf graphics when I get to work, and then it goes back to integrated when I leave at the end of the day. There are also a number of programs that annoyingly force a high-perf graphics switch because of bugs (skype, hipchat, VLC), necessitating a restart of Chrome.

I don't know if this is the same issue, but the only way I've been able to fix the problem is to just restart Chrome any time the switch happens. Occasionally I can get a tab to show content instead of weird black box designs by tearing it off the tab bar and dropping it back.

Running dev branch 54.0.2810.2
Screenshot 2016-07-30 15.51.29.png
83.6 KB View Download
tested with "no_av_ever": black boxes appearing

tested with 2 video-playing tabs: youtube and bbc iplayer; the other one that affects memory is tweetdeck

A1: closing down tabs that are media related (iplayer, youtube, tweetdeck) resets VTDecoderXPCService memory usage and brings down the 'ioclasscount IOSurface' value - black boxes go away (see screenshots)

with regards to the Q in #75: the error isn't present at the beginning, then it occurs; see dump https://gist.github.com/anonymous/89d16c666a0687f9d6753c96d6f75cfe
start.jpg
1.1 MB View Download
Screen Shot 2016-07-31 at 00.32.41.jpg
805 KB View Download
Screen Shot 2016-07-31 at 00.33.06.jpg
904 KB View Download
Screen Shot 2016-07-31 at 00.34.32.jpg
1.9 MB View Download
Screen Shot 2016-07-31 at 00.37.26.jpg
1.1 MB View Download
"The fix that we're pushing to M52 next week is the "no_av_ever" build in #30. Did that fix the issue for you?"

Chris, about 45-60 mins of testing just now, and that build seems to be OK to me. Looks like it's releasing memory before it has a chance to go nuts? Never got higher than 800 ioclasscount IOSurface, and usually settled anywhere between 300-600 depending on switching tabs/vids etc.
To summarize so far:

flush_every_frame (aka test_a):
 - makes "reasonable" changes to our use of AVFoundation
   - fixes the issue for sebastian (using OS X 10.9)
   - does not fix the issue for jani (OS X 10.11)
   - not tested for redinsect

no_av_ever:
 - restores M51 behavior, doesn't allow improved battery life
   - fixes the issue for sebastian (using OS X 10.9)
   - does not fix the issue for jani (OS X 10.11)
   - not sure for redinsect

This appears only related to accelerated video. I believe this because VTDecoderXPCService should not be invoked for software decoded video.

I'm looking at all of the other diffs that went into M51->M52. The other thing that comes to mind is how we handle CVPixelBufferRefs. In particular, the changes in
  https://codereview.chromium.org/1910633004
    - in M52 but not M51
    - pass CVPixelBuffer to AVSampleBufferDisplayLayer
  https://codereview.chromium.org/1881783002
    - in M52 but not M51
    - remove an assertion about CVPixelBufferRef lifetime
  https://codereview.chromium.org/1861923002
    - in M51 and M52
    - initialize GLImageIOSurface with CVPixelBufferRef instead of IOSurfaceRef

So, with this, I have one more idea of how to handle this -- treat CVPixelBufferRef the way we did in M51. I'll spin up a build to do this.

I'm also going to merge that into the M52 branch, in case we're doing another build.
I have the CVPixelBuffer build ready, it's at:

  https://drive.google.com/file/d/0B6kh5pYRi1dKX2xGb3VsRXNUMWs/view?usp=sharing

jani (and redinsect), could you give this a try?
Project Member Comment 81 by bugdroid1@chromium.org, Jul 31 2016
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/074273d8b3ebf34c38df90970bdc9174658b3ff0

commit 074273d8b3ebf34c38df90970bdc9174658b3ff0
Author: Christopher Cameron <ccameron@chromium.org>
Date: Sun Jul 31 19:21:46 2016

Mac h264: Do not retain CVPixelBufferRefs

This (in particular, sending the retained CVPixelBufferRefs over to an
AVSampleBufferDisplayLayer) may be causing IOSurface leaks in M52.

BUG=631485

Review URL: https://codereview.chromium.org/2197893002 .

Cr-Commit-Position: refs/branch-heads/2743@{#714}
Cr-Branched-From: 2b3ae3b8090361f8af5a611712fc1a5ab2de53cb-refs/heads/master@{#394939}

[modify] https://crrev.com/074273d8b3ebf34c38df90970bdc9174658b3ff0/media/gpu/vt_video_decode_accelerator_mac.cc

The above patch merges the changes from the "no_cvpixel_or_avlayer" build in #80 to M52 (because there is no risk).

If "no_cvpixel_or_avlayer" works, then I'll have one more patch to see if we can keep the battery improvement (and roll it out in M53). In theory it should be possible.

Fingers crossed that the CVPixelBufferRef issue is the problem.
started testing "no_cvpixel_or_avlayer"

immediate observation is that .mp4 video playback is broken. it starts and immediately stops. example in tweetdeck: https://pbs.twimg.com/tweet_video/CoqL7nEVIAA39Cq.mp4 , or channel 4 f1 race rewind from today. all mp4 videos in tweetdeck played fine with all the other versions of chromium i've tested. youtube plays fine though

see the issue https://drive.google.com/open?id=0B77k0uPbvCnvTWZvQU9KTngtTkk
tested "no_cvpixel_or_avlayer" : black boxes
it's also been one of the quickest to increase VTDecoderXPCService's memory consumption

closing down youtube in this scenario didn't make a dent in VTDecoderXPCService's memory. shutting down the tweetdeck site did - it reset it back to the normal 24 mb

chrome://gpu errors
https://gist.github.com/anonymous/c79b2f7971281585facf287473b5cb42


start.jpg
1.1 MB View Download
Screen Shot 2016-07-31 at 23.00.48.jpg
964 KB View Download
Screen Shot 2016-07-31 at 23.01.48.jpg
1.0 MB View Download
Screen Shot 2016-07-31 at 23.02.24.jpg
1.0 MB View Download
Sorry about the no_cvpixel_or_avlayer build (oops, now I need to fix that in the M52 tree too ... sorry TPMs!!!!). Frantically un-doing that. I've deleted no_cvpixel_or_avlayer from Google Drive.

I'm beginning to suspect that TweetDeck is the source of all of these problems, and it's only coincidence that we're seeing issues with video playback.

Can you ever reproduce the problem without using TweetDeck?

Is TweetDeck an extension, or just a webpage?

[I'm creating a twitter account to test this now, to see if I can reproduce the problem].
One more thing to check -- can you go to Task Manager, add the column "Gpu Memory" (right-click on the column titles), and attach a screenshot sorted by that? That might give some hints as to who is eating tons of memory.
tweetdeck is just a webpage...
as i can't play any video other than youtube with this build, the VTDecoderXPCService memory usage is remaining constantly low (2.4 mb)

see attached screenshot.
I will now add a tweetdeck tab and take a screenshot with task manager mem usage
Screen Shot 2016-07-31 at 23.40.29.jpg
926 KB View Download
Oh -- sorry, for the "Gpu Memory", use any build that causes the issue (Chrome Canary, Chrome 52 are fine). And keep TweetDeck there.

What I'm looking for is if our Chrome task manager sees the memory usage too (if it does, then the problem should be easier to track).
to add to this: my daily chrome usage has been the same (work and home) for the last year or so: 5 pinned tabs on start-up (2x Inbox, feedly, hangouts, tweetdeck), and they run all day. Not sure what changed in the v52 upgrade... hence i noticed this issue immediately after the update
^^^ Same here, Pinned: G+, Gmail, Inbox, Photos, Hangouts, Tweetdeck, Trello - all pinned all day - never had problems till build 52. 

Sorry, Chris - do you want us to use Chrome stable, including Tweetdeck - get the black boxes to happen, then give you the chrome://gpu output AND taskmanager screenshot? 

Sorry, I wasn't clear.

I'm looking for a screenshot of Task Manager with the "Gpu Memory" column when the black boxes problem is occurring. You can use whichever build is most convenient for you to reproduce the black boxes problem (so, Chrome 52 would be fine).
that's fine, in test now. will take screenshot when it reaches that point
Thank you ccameron@ for trying very hard to fix this issue. M52 is already in stable and the bar is VERY high. We can take a change only if it is baked/verified in canary and safe to merge.

Per our chat on Friday, I only approved the change listed at #58 to land directly on M52 branch 2743, as it just disables the feature that is new in M52 and it is the safest thing to do (see #59). We're planning to cut M52 Stable RC tomorrow, so please revert any changes (#58 and #81) if either or both of them are not safe.

Any other merges to M52, please re-request a merge by applying "Merge-Request-52" label. Thank you very much.

 

Re #94, sorry! I've reverted #81.

#58 is known to be safe.
Project Member Comment 96 by bugdroid1@chromium.org, Jul 31 2016
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1066f888e958cd68cb6fb9bc60aa12e0db379937

commit 1066f888e958cd68cb6fb9bc60aa12e0db379937
Author: Christopher Cameron <ccameron@chromium.org>
Date: Sun Jul 31 23:15:11 2016

Revert "Mac h264: Do not retain CVPixelBufferRefs"

This reverts commit 074273d8b3ebf34c38df90970bdc9174658b3ff0.

BUG=631485

Review URL: https://codereview.chromium.org/2201673002 .

Cr-Commit-Position: refs/branch-heads/2743@{#715}
Cr-Branched-From: 2b3ae3b8090361f8af5a611712fc1a5ab2de53cb-refs/heads/master@{#394939}

[modify] https://crrev.com/1066f888e958cd68cb6fb9bc60aa12e0db379937/media/gpu/vt_video_decode_accelerator_mac.cc

Got the black boxes on stable. Attached is Task Manager, Activity Monitor, ioclasscount IOSurface, and Chrome://GPU

Chris, yeah looks like Tweetdeck may be the cause? (How effing frustrating). Still weird that prob surfaced in 52... but Tweetdeck development seems neglected at best :/


Screen Shot 2016-08-01 at 9.42.08 AM.png
985 KB View Download
gpu-chrome-stable-tweetdeck-probably.txt
110 KB View Download
Oh -- for the task manager, can you add the "Gpu Memory" column (right-click on the column titles, and it will be an option to click on).
(adding example screenshot of selecting the column)
gpumem.png
50.0 KB View Download
Ah crap - yep. Stand by, just need to blackbox again. 
tested with no_cvpixel_or_avlayer build
attached screenshot with GPU mem in task manager

about://gpu log message shortly before black boxes:
Log Messages
[5672:1295:0731/233405:ERROR:interface_registry.cc(80)] : Failed to locate a binder for interface: mojom::ResourceUsageReporter
[5672:1295:0801/002444:ERROR:gl_image_io_surface.mm(318)] : Error in CGLTexImageIOSurface2D for the Y plane. 10008
Screen Shot 2016-08-01 at 01.03.18.jpg
1.0 MB View Download
Screen Shot 2016-08-01 at 01.03.43.jpg
1.2 MB View Download
Here you go, mate. 
Screen Shot 2016-08-01 at 10.31.48 AM.png
1.4 MB View Download
Thanks! That very effectively shoots down my theory about blaming tweet deck.

FYI, my local tweet deck is still hanging out at <200 IOSurfaces.

I'm continuing to scrape through the changes that went into M52, and there was a change to our VTDecompressionSession instance in https://codereview.chromium.org/1882533002.

I've pulled that out, and also torn out all of the other things I'd already torn out (plus canvas), and put it in the build "no_cvpixel-no_av-no_seekcl-no_canvas", which I tested (sorry about the last build), and I've uploaded to:

  https://drive.google.com/file/d/0B6kh5pYRi1dKNy01SXhTNEVfRTg/view?usp=sharing

Could you see if the black box issue reproduces with that.

(now fingers crossed that it's the VT patch -- that would make a lot of sense, given the other observations).
Re #95, thank you so much for quickly reverting #81 and good to know #58 is safe.
Hey Chris - very minimal black boxes, but there they are. 
Sorry, attached. 
Screen Shot 2016-08-01 at 12.28.53 PM.png
1.9 MB View Download
gpu-20160801-1229.txt
90.5 KB View Download
Sorry again, ^^^above was using "no_cvpixel-no_av-no_seekcl-no_canvas"
I would say, Tweetdeck still "seems" to be involved somehow? When I refreshed the Tweetdeck tab, the VTDecoderXPCService amount reset, and so did ioclasscount IOSurface. 
Quick experiment running "no_cvpixel-no_av-no_seekcl-no_canvas": 

- I removed Tweetdeck tab (didn't restart Chromium build)
- Have been running for over an hour with normal usage (plenty of tabs, video playing, etc)
- VTDecoderXPCService max about 600MB when flipping around Facebook video, quickly resets when leaving/closing the tab
- ioclasscount IOSurface max about 600 but stays closer to ~200-300
- Zero performance problems experienced so far. 

Will run this config (I actually have Tweetdeck running in standalone native Mac app) for the rest of the day and report back. 
Thanks!

I'm wondering if TweetDeck opens lots of decoder instances.

I added some instrumentation whose output will appear when the build is run from the command line

    ccameron-macbookpro:out ccameron$ ./instrumented_no_cvpixel_seek_av_canvas/Chromium.app/Contents/MacOS/Chromium 
    2016-07-31 21:35:54.436 Chromium[77225:745124] NSWindow warning: adding an unknown subview: <FullSizeContentView: 0x7fa5b4c64000>. Break on NSLog to debug.
    2016-07-31 21:35:54.436 Chromium[77225:745124] Call stack:
    (
        "+callStackSymbols disabled for performance reasons"
    )
    ***
    ***
    *** VTVideoDecodeAccelerator count: 1
    ***
    ***
    ***
    ***
    *** VTVideoDecodeAccelerator::Frame count: 11 (decoder count: 1)
    ***
    ***
    ***
    ***
    *** VTVideoDecodeAccelerator count: 2
    ***
    ***
    ***
    ***
    *** VTVideoDecodeAccelerator::Frame count: 22 (decoder count: 2)

...

And added it to "instrumented_no_cvpixel_seek_av_canvas", at

    https://drive.google.com/file/d/0B6kh5pYRi1dKQ1Q4N0t0dkphVGc/view?usp=sharing

How high do you see these decoder counts going?
Summary: VTDecoderXPCService leaks IOSurfaces (causing OOM black boxes on screen) (was: AVSampleBufferDisplayLayer leaks IOSurfaces (causing OOM black boxes on screen))
VTVideoDecodeAccelerator::Frame count: 88 (decoder count: 10)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 99 (decoder count: 10)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 110 (decoder count: 10)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 121 (decoder count: 10)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 132 (decoder count: 10)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 143 (decoder count: 10)
***

Here's a minute of scrolling up/down in the tweetdeck columns. 
VTVideoDecodeAccelerator_v1.txt
7.0 KB View Download
Seems to keep increasing.

*** VTVideoDecodeAccelerator::Frame count: 495 (decoder count: 40)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 484 (decoder count: 40)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 495 (decoder count: 40)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 506 (decoder count: 40)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 517 (decoder count: 40)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 528 (decoder count: 40)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 539 (decoder count: 40)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 550 (decoder count: 40)

Wow! 40 instances of accelerated h264 decoders, and 550 decoded frames -- that's crazy-high! And I would bet it tracks with "ioclasscount IOSurface" (like, there are probably ~100-200 more IOSurfaces than VTVideoDecodeAccelerator::Frames, but as they grow, they grow in tandem).
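To track these numbers over a longer session without scrolling through the terminal, the instrumentation lines can be parsed mechanically. The following is a minimal TypeScript sketch that assumes the exact line format printed by the instrumented build above; `parseDecoderLog` and `peakSample` are hypothetical helper names, not anything that exists in Chromium.

```typescript
interface DecoderSample {
  frames: number;   // "VTVideoDecodeAccelerator::Frame count"
  decoders: number; // "decoder count"
}

// Pull (frame count, decoder count) pairs out of the instrumented build's
// console output. Separator lines ("***") and other log lines are skipped.
function parseDecoderLog(logText: string): DecoderSample[] {
  const linePattern =
    /VTVideoDecodeAccelerator::Frame count: (\d+) \(decoder count: (\d+)\)/;
  const samples: DecoderSample[] = [];
  for (const line of logText.split("\n")) {
    const match = linePattern.exec(line);
    if (match) {
      samples.push({ frames: Number(match[1]), decoders: Number(match[2]) });
    }
  }
  return samples;
}

// Return the sample with the highest frame count, or null for an empty log.
function peakSample(samples: DecoderSample[]): DecoderSample | null {
  let highest: DecoderSample | null = null;
  for (const sample of samples) {
    if (highest === null || sample.frames > highest.frames) highest = sample;
  }
  return highest;
}
```

Comparing the peak from this with the peak of 'ioclasscount IOSurface' over the same session would be one way to check whether the two really do grow in tandem.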

So, it looks to me that tweetdeck is DoSing the system. We don't put any guards in place to prevent this from happening (any webpage is perfectly welcome to DoS the system this way).

My tweetdeck only gets to ~10 decoders, but that's probably because of the content that I'm looking at. But I do see "ioclasscount IOSurface" growing, and I do see VTDecoderXPCService growing.

I'd guess that this appeared in M52 because we held on to IOSurfaces "just a bit longer" than in M51, and that pushed things over the edge.

Also, the "Gpu Memory" column in task manager doesn't take into account these decoded frames (hmm, maybe I should add an IOSurface memory column).

So, the plan for this:
* We should look into ways to prevent this sort of DoSing (that's a longer-term project)
* Maybe devrel can reach out to Twitter about these problems
* I'll leave AVSampleBufferDisplayLayer disabled for M52
* I'll add the "flush" fix to AVSampleBufferDisplayLayer for M53, and ship that then

Thanks Chris! FYI - *** VTVideoDecodeAccelerator::Frame count: 2453 (decoder count: 165)

So what do you think is the rough ETA for this to be resolved in stable?

For other users who have found this thread: close/reopen tweetdeck.twitter.com *regularly* to resolve OOM errors in the meantime :)
Comment 117 Deleted
Wow! That definitely solves the mystery!

WRT when we'll get a fix on stable for this, I'm not sure -- we'll need to figure something out between devrel, media team, etc. Chrome 52 is almost-completely-frozen (so if this is mostly tweetdeck's fault, we're not likely to push further changes for it).

I'll update tomorrow when I get more information about this from the rest of the team.
Also, thank you everyone for the debugging help!
Hi folks – I'm Tom, lead on the frontend of TweetDeck. Thanks for your extensive efforts looking into this — it's something I've been seeing lots, probably because I often run 3 or 4 instances of TweetDeck at a time!

There is one likely culprit for this: gifs. We render them as videos (mp4, <video>), and don't pause them when they're offscreen, although we sometimes remove them from DOM completely to keep a lid on the DOM node count.

--> How can we prove it's the gifs? Can we see the source of these decoders?

The team can put some work into pausing offscreen videos, but we don't have the resources to do it right now. However, we're planning to replace our infinite list implementation this quarter. DOM-thrift is at the top of that project's priorities.

(Alternatively, we recently changed the way we embed videos in TweetDeck to use an iframed player (video/mp2t, <video>) but this is probably not the issue as it takes a few clicks to watch one and you're seeing high instances of the h264 decoders.)
Labels: -Pri-1 -ReleaseBlock-Stable Pri-2
Hi tashworth@, you'll likely want to resolve this sooner than later as the M52 release will be going out soon and we don't feel that this is a stable blocker. Infinitely loading videos like this will either cause OOM or thread cap issues. We can idle collect software-decoded paused instances, but if the videos are playing or hardware decoded they won't be collected.

There are some improvements we could make in later versions to collect hardware decoded videos and possibly auto-pause videos w/o sound once off-screen, but that's not something we can land with M52 or even M53.

The most reasonable way to fix this is by clearing the src attribute (set '') _and_ removing the element from the dom -- this is the advice we gave Vine a long time ago. Unfortunately the HTML5 media spec allows elements to continue to function even when removed from the DOM, so it's necessary to set src='' to completely delete them.

The IntersectionObserver API should make it fairly easy to implement this; clear src='' some distance off page and bring it back once the page comes back into visibility.
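As a rough illustration of that suggestion, here is a minimal TypeScript sketch. It is not TweetDeck's or Chrome's code: `unloadVideo`, `restoreVideo`, `watchVideos` and the `savedSources` map are hypothetical names, the "100% 0px" margin is an arbitrary choice, and calling `load()` after clearing `src` is an assumption about how to make the element drop its current resource promptly.

```typescript
// Minimal shape of a <video> element that this sketch relies on.
interface VideoLike {
  src: string;
  pause(): void;
  load(): void;
}

// Remembers the original URL of each unloaded video (hypothetical bookkeeping).
const savedSources = new WeakMap<VideoLike, string>();

// Clear src so the decoder can be released; pausing alone is not enough.
function unloadVideo(video: VideoLike): void {
  if (savedSources.has(video)) return; // already unloaded
  savedSources.set(video, video.src);
  video.pause();
  video.src = "";
  video.load(); // assumption: ask the element to drop its current resource
}

// Put the saved URL back once the video is near the viewport again.
function restoreVideo(video: VideoLike): void {
  const original = savedSources.get(video);
  if (original === undefined) return; // never unloaded
  savedSources.delete(video);
  video.src = original;
  video.load();
}

// Unload videos that are more than one viewport height away from the
// visible area and restore them when they come back within that margin.
function watchVideos(videos: HTMLVideoElement[], margin = "100% 0px"): IntersectionObserver {
  const observer = new IntersectionObserver(
    (entries) => {
      for (const entry of entries) {
        const video = entry.target as HTMLVideoElement;
        if (entry.isIntersecting) restoreVideo(video);
        else unloadVideo(video);
      }
    },
    { rootMargin: margin },
  );
  videos.forEach((video) => observer.observe(video));
  return observer;
}
```

A page would call something like `watchVideos(Array.from(document.querySelectorAll("video")))` once its list has rendered. Elements that get removed from the DOM entirely should still go through `unloadVideo` first, for the reason given above.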
Thanks for the tips — we'll look at those options. We already know what's onscreen pretty well — the hard bit is managing starting & stopping correctly.

The rendering engine supporting this natively would be *very* helpful. This is tough in JavaScript without dropping frames, especially alongside the other things TweetDeck is doing concurrently.
Thanks, happy to help. FWIW, anything we implemented in the rendering engine would not be any better at preserving frames than JavaScript. Suspend/resume works by destroying the internal playback engine and seeking to the last known time upon resume.
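
For comparison, the same suspend/resume idea done from JavaScript might look like this sketch. It is not Chromium's internal implementation; `suspendVideo`, `resumeVideo` and the two interfaces are hypothetical names. It assumes the browser applies the seek once the restored source has loaded its metadata; a stricter version would wait for the `loadedmetadata` event before seeking.

```typescript
// Only the fields used here; a real HTMLVideoElement satisfies this.
interface SeekableVideo {
  src: string;
  currentTime: number;
}

// What we need to remember while the video is torn down.
interface SuspendedState {
  src: string;
  time: number;
}

// Record the source and position, then clear src to drop the playback engine.
function suspendVideo(video: SeekableVideo): SuspendedState {
  const state: SuspendedState = { src: video.src, time: video.currentTime };
  video.src = '';
  return state;
}

// Restore the source and seek back to the last known time.
function resumeVideo(video: SeekableVideo, state: SuspendedState): void {
  video.src = state.src;
  video.currentTime = state.time; // assumption: honoured once metadata loads
}
```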
hi, thanks for getting in touch Tom.
Perhaps autoplay could be replaced with click-to-play. All autoplaying is annoying, to say the least.
If it helps, here's some more info: I'm having the same issue (even as we speak) running the latest Chrome (stable channel) on a mid-2011 iMac with 20GB of RAM, running OSX 10.9.5. I don't use TweetDeck, but Tom mentioned something about GIFs. I keep a http://messenger.com (the standalone Facebook chat) tab open most of the time, and there are constantly animated GIFs in the conversations (as can be seen in the screenshot: in messenger.com all the images in the current conversation reappear as a mosaic at the bottom right, and when they are GIFs, the mosaic is "animated"). The black-boxes problem for me starts some time after opening a YouTube video. In the screenshot you can see the ioclasscount IOSurface count (north of 5000, and it goes up merely by scrolling). I haven't tested any of the builds, but would gladly test something if you need me to!

-Fotis
Screen Shot 2016-08-03 at 4.10.38 PM.png
237 KB View Download
chrome___gpu__ redinsect@.pdf
538 KB Download
To #125, that's a similar-but-separate issue 632178, which is related to any video playback on some macOS 10.9 systems. That issue has been fixed, and the fix should be pushed to users in the next day or so (if it hasn't already).
Issue 632558 has been merged into this issue.
Issue 639070 has been merged into this issue.
Project Member Comment 129 by bugdroid1@chromium.org, Aug 21 2016
Labels: merge-merged-2785
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e7f13a23d26aa74bde7e6758e4c46071b819453e

commit e7f13a23d26aa74bde7e6758e4c46071b819453e
Author: Christopher Cameron <ccameron@chromium.org>
Date: Sun Aug 21 23:43:10 2016

Mac: Disable AVSampleBufferDisplayLayer on Mac <10.11

There are reports of this leaking IOSurfaces.

BUG=631485

Review URL: https://codereview.chromium.org/2269473002 .

Cr-Commit-Position: refs/branch-heads/2785@{#698}
Cr-Branched-From: 68623971be0cfc492a2cb0427d7f478e7b214c24-refs/heads/master@{#403382}

[modify] https://crrev.com/e7f13a23d26aa74bde7e6758e4c46071b819453e/ui/accelerated_widget_mac/ca_renderer_layer_tree.mm

Cc: rnimmagadda@chromium.org
Labels: TE-Verified-M53 TE-Verified-53.0.2785.80
Unable to repro this issue on MAC (10.11.6) on Chrome Beta Version - 53.0.2785.80. Looks like the issue is fixed.

Screen-recording is attached.

Hence adding the TE-Verified labels.


631485.mov
26.8 MB Download
this has NOT been resolved in version 53.0.2785.89 (64-bit) on OSX 10.11.6
see attachment
Screen Shot 2016-09-04 at 18.13.40.jpg
864 KB View Download
Screen Shot 2016-09-04 at 18.14.19.jpg
716 KB View Download
Screen Shot 2016-09-04 at 18.15.27.jpg
117 KB View Download
Screen Shot 2016-09-04 at 18.17.09.jpg
525 KB View Download
Please see #120, #121, and #122 -- this is, in effect, a DoS by the website.
Hi, 
I'm familiar with the comments and also non-existence of this prior to v52. 
This shouldn't have a TE-Verified-53.0.2785.80 label on it.
Labels: -Needs-Feedback -TE-Verified-53.0.2785.80
Status: ExternalDependency
Removed the verify labels and changed the status to ExternalDependency.
Hi,

David here, also a frontend dev on TweetDeck. We have recently put out an update to explicitly clear the "src" attribute before removing videos from the DOM. Whilst this doesn't eliminate the problem, it hopefully helps ease it a bit. There are certainly cases where the issue will still crop up (for example, a user who filters their content to display *ONLY* animated GIF videos).

While we are discussing this, I was wondering what your thoughts are on browser behaviour in this particular scenario...

The media spec makes a valid case for media elements with audio to continue playing in the background, even when detached from the DOM. But in our case, the videos don't have an audio track. Is it a reasonable expectation that removing a video with no sound from the DOM makes it a valid candidate to be collected? The spec does allow for GC in this scenario, and I totally understand that implementation details are never easy. But I'm interested in the conversation, and curious what you think.

Having said that, I'm going to re-iterate what my colleague Tom has said above, and we are working on a larger body of performance work, which includes improving how videos are handled on our side.

how recent?

I have a normal timeline and I don't do any scrolling or video playback, and it still happens...
The change to clear the src attribute went out mid-August (not that recent after all!). 

Curious why you'd still be seeing it though, especially if you've got no animated-gif videos happening.
As soon as I load the site, the VTDecoderXPCService, which has been running for a while at 8.9MB, shoots up to 35.8MB.
It then continues to rise, even though there is barely a new tweet and no scrolling of the timeline.
See the attachments and note the timestamps.

Screen Shot 2016-09-09 at 21.38.38.jpg
1.3 MB View Download
Screen Shot 2016-09-09 at 21.41.00.jpg
1.3 MB View Download
Screen Shot 2016-09-09 at 21.41.49.jpg
1.3 MB View Download
Screen Shot 2016-09-09 at 21.43.03.jpg
1.3 MB View Download
Cc: sureshkumari@chromium.org
Issue 694311 has been merged into this issue.
Issue 698844 has been merged into this issue.
My bug 698844 was merged into this issue as a duplicate. I'm seeing the issue with Chrome 56, but I see no activity on this bug to indicate that it is still open. Is this bug really a duplicate and not fixed?
Issue 647967 has been merged into this issue.