
Issue 631485

Starred by 33 users

VTDecoderXPCService leaks IOSurfaces (causing OOM black boxes on screen)

Reported by jani.kra...@gmail.com, Jul 26 2016

Issue description

Chrome Version       :  52.0.2743.82 (64-bit)
URLs (if applicable) :
OS version               : 10.11.6
Behavior in Safari 3.x/4.x (if applicable): normal
Behavior in Firefox 3.x (if applicable): normal 
Behavior in Chrome for Windows: N/a

What steps will reproduce the problem?
(1) Open a website in a tab and leave it
(2) Open a new tab and continue to browse
(3) Periodically check back on the initial tab

What is the expected result?
Website has no change in appearance

What happens instead?
Parts of the website will be covered by black boxes, which will flicker as you hover over.

Chrome seems to trigger VTDecoderXPCService, which consumes memory. Same goes for Google Chrome Helper tasks. When does a simple plain HTML site need 1 GB of memory???


 
Screen Shot 2016-07-26 at 15.52.22.png (729 KB)
Labels: ReleaseBlock-Stable
Adding "ReleaseBlock-Stable" as we're planning to take this fix in for next stable refresh per comment #43.
Thanks for the reports!

I have a new build to test, this time going on the theory that AVSampleBufferDisplayLayer's internal buffers are getting over-stuffed. The build is called "check_ready_and_flush" and is at

  https://drive.google.com/file/d/0B6kh5pYRi1dKQXZQUXY4LXdrNEE/view?usp=sharing

This build will work with h264ify or without (no difference). See if this causes the black boxes. Also see if it causes a gray/pink/yellow grid to appear (the grid will tell me if we hit the condition that I'm concerned about).
Just tested "check_ready_and_flush" (with h264ify). No luck and no gray/pink/yellow grid appears. Sorry :(


Bildschirmfoto 2016-07-29 um 19.04.07.png (547 KB)
Oh no.

Quick question to verify: The ioclasscount was small (like ~100) before Chromium started, and then it blew up while watching a video, right? If it's already >1000 before Chromium starts, then there will be black boxes (because there isn't enough memory around anymore), and you may need to reboot first (but, before that, run the attached program described below!)

Oh -- you're on 10.9! That's before they broke the IOSurface reporting. Instead of ioclasscount, try running this program (attached the source -- the instructions to build and run are at the top -- "g++ iosurface_dump.cc -framework IoKit -framework CoreFoundation && ./a.out").

That will tell me what process is responsible for holding these surfaces. Could you attach its output before starting the Chromium build, and again once Chromium has caused the black boxing problem?
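
For anyone comparing the before/after numbers by hand, the check can be sketched roughly as below. This is only an illustration: it assumes `ioclasscount IOSurface` prints a line shaped like `IOSurface = 150` (an assumption about the tool's output format), it uses the rough figures from this comment (~100 is normal, >1000 before launch means reboot first), and the function names are hypothetical.

```typescript
// Hypothetical helper: pull the count out of `ioclasscount IOSurface` output.
// Assumes a line shaped like "IOSurface = 150"; returns null if none is found.
function parseIOSurfaceCount(output: string): number | null {
  const match = /^\s*IOSurface\s*=\s*(\d+)\s*$/m.exec(output);
  return match ? Number(match[1]) : null;
}

type Verdict = "reboot-first" | "leak-suspected" | "looks-normal";

// The default limit follows the rough ">1000" figure above; it is not a hard limit.
function assessCounts(before: number, after: number, limit: number = 1000): Verdict {
  if (before > limit) return "reboot-first";  // already exhausted before Chromium started
  if (after > limit) return "leak-suspected"; // grew past the limit while Chromium ran
  return "looks-normal";
}

const countBefore = parseIOSurfaceCount("IOSurface = 150");
const countAfter = parseIOSurfaceCount("IOSurface = 3200");
if (countBefore !== null && countAfter !== null) {
  console.log(assessCounts(countBefore, countAfter)); // prints "leak-suspected"
}
```

The two sample strings stand in for captures taken before launching the build and after the black boxes appear.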

I have one more build, which very aggressively flushes these buffers -- it's "flush_every_frame", and is at
  https://drive.google.com/file/d/0B6kh5pYRi1dKUWlvcXNmUXNEcEk/view?usp=sharing
It should put a green border around all videos.


iosurface_dump.cc (5.3 KB)
Oh, I just started the ioclasscount very late. That's why it's already so high. Usually it's around ~150 or so. I will try out the things you mentioned and get back to you in a bit...
Hey, I just tried to produce the output with iosurface_dump.cc, still using check_ready_and_flush. This time, however, I also enabled --show-fps-counter, and nothing happened. No black boxes, no memory overflow. All seemed good. I attached an output of iosurface_dump.cc called "all_ok", taken during video playback. Then I switched --show-fps-counter off again to produce an output with black boxes, but when they appeared and I started iosurface_dump, it got stuck. I couldn't kill it or anything. Tried to reboot - nothing. Had to do a hard reset. Maybe too much data? The file is empty, so unfortunately nothing got logged. But maybe the iosurface_dump.cc helps?
btw this time I also disabled h264ify.
iosurface_dump_all_ok.txt (55.1 KB)
iosurface_dump_before.txt (46.8 KB)
meant to say: "But maybe the  --show-fps-counter thing helps?"
The flush_every_frame version works! Can see the green border and no problems so far.
It's odd that --show-fps-counter helped. I'm worried now that just drawing the green border fixed things in flush_every_frame.

I think we're close -- I've removed the green border and chopped flush_every_frame into a handful of separate parts. There are 3 builds here, can you let me know if they start causing the leak?

  https://drive.google.com/file/d/0B6kh5pYRi1dKeXIxdUQyNER1U3M/view?usp=sharing

For my reference, the builds are:

test_a:
  same as flush_every_frame, but no green border
  - flushAndRemoveImage before every enqueueSampleBuffer
  - flushAndRemoveImage in ContentLayer dtor
  - no calls to IOSurfaceIncrementUseCount
  - no fslp frames

test_b:
  - flush before every enqueueSampleBuffer
  - flushAndRemoveImage in ContentLayer dtor

test_c:
  - no calls to IOSurfaceIncrementUseCount
  - flushAndRemoveImage in ContentLayer dtor

Thank you again for all of your help!!

Hey, sorry for the delay. Here are the results:

test_a OK
test_b OK
test_c NOT OK --> MEMORY/BLACK BOXES

hope this helps! 
M52 Stable Release is blocked due to this issue. The original plan was to cut the RC at 5.00 pm today.

Chris/Ken, so you think this is critical for M52 Stable? We can take the fixes for further Stable refreshes. Please confirm.
test_a : black box and memory issues
see screenshot
Screen Shot 2016-07-29 at 23.23.17.jpg (789 KB)
Project Member

Comment 58 by bugdroid1@chromium.org, Jul 29 2016

Labels: merge-merged-2743
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/52c1eb8aeaf97d9158b24bf237b88ba7d8ee396b

commit 52c1eb8aeaf97d9158b24bf237b88ba7d8ee396b
Author: Christopher Cameron <ccameron@chromium.org>
Date: Fri Jul 29 22:41:18 2016

Mac: Disable AVSampleBufferDisplayLayer

There are reports of this leaking IOSurfaces.

BUG=631485

Review URL: https://codereview.chromium.org/2189423003 .

Cr-Commit-Position: refs/branch-heads/2743@{#711}
Cr-Branched-From: 2b3ae3b8090361f8af5a611712fc1a5ab2de53cb-refs/heads/master@{#394939}

[modify] https://crrev.com/52c1eb8aeaf97d9158b24bf237b88ba7d8ee396b/ui/accelerated_widget_mac/ca_renderer_layer_tree.mm

Approved M52 merge by chat. As per ccameron@, the fix (CL listed at comment#58) disables the feature that is new in M52 as it is safest to do.
Summary: AVSampleBufferDisplayLayer leaks IOSurfaces (causing OOM black boxes on screen) (was: Graphics distortion (black boxes))
[Changing the title of the bug]

Well, crap. So #55 gave me hope that we had a fix here. But #57 leaves me basically where we started.

AVSampleBufferDisplayLayer is the way to play accelerated video with minimal power usage (see issue 594449), so it's really upsetting to have to disable this.


test_b : black box and memory issues
see screenshots
Screen Shot 2016-07-30 at 01.03.32.jpg (1.1 MB)
Screen Shot 2016-07-30 at 01.04.26.jpg (986 KB)
It looks like there are different issues between 10.9 and 10.11 -- the bug in 10.9 was solved by adding just a "flush" -- the bug in 10.11 seems to have more problems.

Thanks sebastian.@ -- I think that 10.9 has a fix now

jani.@ and redinsect@, a couple of things I'd like you to try:

1. Did you have any luck with the "check_ready_and_flush" build from https://bugs.chromium.org/p/chromium/issues/detail?id=631485#c47 ? Did a grid appear before the black tiles?

2. I reduced Chrome's video decode path into a stand-alone application, which I've put at
  https://drive.google.com/file/d/0B6kh5pYRi1dKLTEwdlBmRVdybzg/view?usp=sharing
Could you try to run it and see if your IOSurface count grows while it is running?

It is already built, so, from a command line, do "./test_player 720p30fps.mp4", or alternatively type "make". If it has issues running, and you have Xcode, you can also do "make clean && make".

Also, if anyone is having this issue on 10.11 in the SF bay area or in the LA area, drop me a line via email so that I can take a look in person.
Btw, my one remaining theory is that our YUV->RGB [1] OpenGL shader has a bad interaction with AVSampleBufferDisplayLayer (like the infamous issue 158469).

I'll spin up a build that just skips that part, and we'll see if that plugs the leak. That may have to wait until Monday.

[1] https://cs.chromium.org/chromium/src/ui/gl/gl_image_io_surface.mm?rcl=0&l=335
Hey Chris - #2 (test_player) result attached. Looks totally fine (assuming I'm doing it right). 
Screen Shot 2016-07-30 at 4.10.27 PM.png (2.0 MB)
#1 Check ready and flush - I'm about 20 mins into testing (running x4 YouTube tabs playing video) and looks promising so far - memory usage is staying low! Will keep testing though and give you a definitive result in the next hour. 
Screen Shot 2016-07-30 at 4.25.13 PM.png (867 KB)
Cc: abodenha@chromium.org danakj@chromium.org rookrishna@chromium.org abod...@chromium.org ananthak@chromium.org dhadd...@chromium.org vollick@chromium.org
 Issue 611310  has been merged into this issue.
Very interesting. Does it also stay low if you leave the YouTube videos visible (like, on-screen)? When they're in background tabs, they don't send their frames to the AVSampleBufferDisplayLayer (which is the thing that appears to be causing the leak).

Hey Chris, yeah just an update, still in the same session and I'm definitely getting inflated mem usage. Haven't had the black boxes yet, but trying to trigger them (and see if the grid appears). 
Screen Shot 2016-07-30 at 4.52.50 PM.png (1.9 MB)
@ccameron no, i didn't try "check_ready_and_flush" build, will do it next and then test the latest build.

btw: i don't have to keep the tab visible. get issues regardless 
OK another update, again this is running "Check ready and flush"

I managed to spike VTDecoderXPCService above 2GB briefly, before it settled back at around 1.7GB; this is after sitting for a while at about 500MB. I did this by loading facebook, buzzfeed video, and watching video after video in theatre mode, switching every 10-20 seconds or so. On each new video load sometimes there would be no increase in mem usage, and sometimes it would jump 400-500MB at a time.

During this time ioclasscount IOSurface stayed no higher than 3200, though I got it to briefly spike (see attached) then it dropped down again. 

During this whole time, no black boxes, no pink/grey/yellow grid.
Screen Shot 2016-07-30 at 5.08.35 PM.png (1.9 MB)
"Check ready and flush"
blackboxes, no grid appeared
Screen Shot 2016-07-30 at 11.50.29.jpg (1.0 MB)
Screen Shot 2016-07-30 at 11.50.42.jpg (959 KB)
Screen Shot 2016-07-30 at 11.53.02.jpg (798 KB)
Screen Shot 2016-07-30 at 11.54.19.jpg (1.0 MB)
Re #69 -- the not needing to be visible -- that's *very* concerning!

I just realized, I only had users on 10.9 verifying that the fix that I'm planning to ship to M52 actually works!

The fix that we're pushing to M52 next week is the "no_av_ever" build in #30. Did that fix the issue for you?

If not, then I'll need to scramble to find a new fix (my last remaining guess is #63).
I'll run "no_av_ever" later.
I'm trying to see if there's a way to detect-and-recover from this.

You mentioned that when you quit Chrome then (1) VTDecoderXPCService memory usage returns to normal, and (2) 'ioclasscount IOSurface' returns to normal.

Q1. Does this return to normal if you close all of the Chrome windows that are playing video? Or does it remain high?

Q2. What about closing all Chrome windows, but leaving the app running?

Q3. What about when you kill the GPU process (go to "Window" in the menu bar, "Task Manager", select "GPU Process", and press "End Task")?

If this is the case, then we can make it so that we use AVSampleBufferDisplayLayer, but we kill the GPU process if we see VTDecoderXPCService's memory blow up, and disallow further use of AVSampleBufferDisplayLayer.
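
A rough sketch of that detect-and-recover policy, as pure decision logic. None of the names below are Chromium APIs: `AvLayerWatchdog`, the 1 GiB limit, and the `killGpuProcess` callback are all hypothetical stand-ins, and the real memory sampling is left out.

```typescript
// Hypothetical sketch of the detect-and-recover idea; not Chromium code.
class AvLayerWatchdog {
  private avLayerAllowed = true;
  private limitBytes: number;         // assumed threshold, e.g. 1 GiB
  private killGpuProcess: () => void; // stand-in for ending the GPU process

  constructor(limitBytes: number, killGpuProcess: () => void) {
    this.limitBytes = limitBytes;
    this.killGpuProcess = killGpuProcess;
  }

  // Call periodically with the decoder service's current memory usage.
  onMemorySample(decoderServiceBytes: number): void {
    if (this.avLayerAllowed && decoderServiceBytes > this.limitBytes) {
      this.avLayerAllowed = false; // disallow further use for this session
      this.killGpuProcess();       // only helps if killing the process frees the surfaces (Q3)
    }
  }

  canUseAvLayer(): boolean {
    return this.avLayerAllowed;
  }
}

const GIB = 1024 * 1024 * 1024;
let gpuKills = 0;
const watchdog = new AvLayerWatchdog(GIB, () => { gpuKills += 1; });
watchdog.onMemorySample(0.5 * GIB); // under the limit: nothing happens
watchdog.onMemorySample(2 * GIB);   // over the limit: trips once
watchdog.onMemorySample(3 * GIB);   // already tripped: no second kill
console.log(watchdog.canUseAvLayer(), gpuKills); // prints: false 1
```

The watchdog trips at most once, so a session that hits the limit falls back to the non-AVSampleBufferDisplayLayer path for good.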

(this is all assuming that no_av_ever works ... if it doesn't, we're pretty hosed).
Also, when you start seeing the memory usage increase, do you always see the error

 ERROR:vt_video_decode_accelerator_mac.cc(649) : Illegal attempt to decode without IDR. Discarding decode requests until next IDR.

in your logs in about:gpu?

Comment 76 by j...@6bit.com, Jul 30 2016

I run into issues similar to this constantly, but only when my laptop switches from integrated to high perf graphics, or vice versa.

I use an external thunderbolt display which automatically kicks in high perf graphics when I get to work, and then it goes back to integrated when I leave at the end of the day. There are also a number of programs that annoyingly force a high-perf graphics switch because of bugs (skype, hipchat, VLC), necessitating a restart of Chrome.

I don't know if this is the same issue, but the only way I've been able to fix the problem is to just restart Chrome any time the switch happens. Occasionally I can get a tab to show content instead of weird black box designs by tearing it off the tab bar and dropping it back.

Running dev branch 54.0.2810.2
Screenshot 2016-07-30 15.51.29.png (83.6 KB)
test with "no_av_ever": black boxes appearing

tested with 2 video playing tabs: youtube and bbc iplayer; the other that affects memory is tweetdeck

A1: closing down tabs that are media related (iplayer, youtube, tweetdeck) resets VTDecoderXPCService memory usage and brings down the 'ioclasscount IOSurface' value - black boxes go away (see screenshots)

with regards to Q in #75: the error isn't present at the beginning, then it occurs; see dump https://gist.github.com/anonymous/89d16c666a0687f9d6753c96d6f75cfe
start.jpg (1.1 MB)
Screen Shot 2016-07-31 at 00.32.41.jpg (805 KB)
Screen Shot 2016-07-31 at 00.33.06.jpg (904 KB)
Screen Shot 2016-07-31 at 00.34.32.jpg (1.9 MB)
Screen Shot 2016-07-31 at 00.37.26.jpg (1.1 MB)
"The fix that we're pushing to M52 next week is the "no_av_ever" build in #30. Did that fix the issue for you?"

Chris, about 45-60 mins of testing just now, and that build seems to be OK to me. Looks like it's releasing memory before it has a chance to go nuts? Never got higher than 800 ioclasscount IOSurface, and usually settled anywhere between 300-600 depending on switching tabs/vids etc.
To summarize so far:

flush_every_frame (aka test_a):
 - makes "reasonable" changes to our use of AVFoundation
   - fixes the issue for sebastian (using OS X 10.9)
   - does not fix the issue for jani (OS X 10.11)
   - not tested for redinsect

no_av_ever:
 - restores M51 behavior, doesn't allow improved battery life
   - fixes the issue for sebastian (using OS X 10.9)
   - does not fix the issue for jani (OS X 10.11)
   - not sure for redinsect

This appears only related to accelerated video. I believe this because VTDecoderXPCService should not be invoked for software decoded video.

I'm looking at all of the other diffs that went into M51->M52. The other thing that comes to mind is how we handle CVPixelBufferRefs. In particular, the changes in
  https://codereview.chromium.org/1910633004
    - in M52 but not M51
    - pass CVPixelBuffer to AVSampleBufferDisplayLayer
  https://codereview.chromium.org/1881783002
    - in M52 but not M51
    - remove an assertion about CVPixelBufferRef lifetime
  https://codereview.chromium.org/1861923002
    - in M51 and M52
    - initialize GLImageIOSurface with CVPixelBufferRef instead of IOSurfaceRef

So, with this, I have one more idea of how to handle this -- treat CVPixelBufferRef the way we did in M51. I'll spin up a build to do this.

I'm also going to merge that into the M52 branch, in case we're doing another build.
I have the CVPixelBuffer build ready, it's at:

  https://drive.google.com/file/d/0B6kh5pYRi1dKX2xGb3VsRXNUMWs/view?usp=sharing

jani (and redinsect), could you give this a try?
Project Member

Comment 81 by bugdroid1@chromium.org, Jul 31 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/074273d8b3ebf34c38df90970bdc9174658b3ff0

commit 074273d8b3ebf34c38df90970bdc9174658b3ff0
Author: Christopher Cameron <ccameron@chromium.org>
Date: Sun Jul 31 19:21:46 2016

Mac h264: Do not retain CVPixelBufferRefs

This (in particular, sending the retained CVPixelBufferRefs over to an
AVSampleBufferDisplayLayer) may be causing IOSurface leaks in M52.

BUG=631485

Review URL: https://codereview.chromium.org/2197893002 .

Cr-Commit-Position: refs/branch-heads/2743@{#714}
Cr-Branched-From: 2b3ae3b8090361f8af5a611712fc1a5ab2de53cb-refs/heads/master@{#394939}

[modify] https://crrev.com/074273d8b3ebf34c38df90970bdc9174658b3ff0/media/gpu/vt_video_decode_accelerator_mac.cc

The above patch merges the changes from the "no_cvpixel_or_avlayer" build in #80 to M52 (because there is no risk).

If "no_cvpixel_or_avlayer" works, then I'll have one more patch to see if we can keep the battery improvement (and roll it out in M53). In theory it should be possible.

Fingers crossed that the CVPixelBufferRef issue is the problem.
started testing "no_cvpixel_or_avlayer"

immediate observation is that .mp4 video playback is broken. it starts and immediately stops. example in tweetdeck: this https://pbs.twimg.com/tweet_video/CoqL7nEVIAA39Cq.mp4 or channel 4 f1 race rewind from today. all mp4 videos in tweetdeck played fine with all the other versions of chromium i've tested. youtube plays fine though

see the issue https://drive.google.com/open?id=0B77k0uPbvCnvTWZvQU9KTngtTkk
tested "no_cvpixel_or_avlayer": black boxes
it's also been one of the quickest to increase VTDecoderXPCService's memory consumption

closing down youtube in this scenario didn't make a dent in VTDecoderXPCService's memory. shutting down the tweetdeck site did - reset it back to the normal 24 MB

chrome://gpu errors
https://gist.github.com/anonymous/c79b2f7971281585facf287473b5cb42


start.jpg (1.1 MB)
Screen Shot 2016-07-31 at 23.00.48.jpg (964 KB)
Screen Shot 2016-07-31 at 23.01.48.jpg (1.0 MB)
Screen Shot 2016-07-31 at 23.02.24.jpg (1.0 MB)
Sorry about the no_cvpixel_or_avlayer build (oops, now I need to fix that in the M52 tree too ... sorry TPMs!!!!). Frantically un-doing that. I've deleted no_cvpixel_or_avlayer from Google Drive.

I'm beginning to suspect that TweetDeck is the source of all of these problems, and it's only coincidence that we're seeing issues with video playback.

Can you ever reproduce the problem without using TweetDeck?

Is TweetDeck an extension, or just a webpage?

[I'm creating a twitter account to test this now, to see if I can reproduce the problem].
One more thing to check -- can you go to Task Manager, and add the column "Gpu Memory" (right-click on the column titles), and attach a screenshot sorted by that? That might give some hints as to who is eating tons of memory.
tweetdeck is just a webpage...
as i can't play any video other than youtube with this build, the VTDecoderXPCService memory usage is remaining constantly low (2.4 MB)

see attached screenshot.
I will now add a tweetdeck tab and take a screenshot with task manager mem usage
Screen Shot 2016-07-31 at 23.40.29.jpg (926 KB)
Oh -- sorry, for the "Gpu Memory", use any build that causes the issue (Chrome Canary, Chrome 52 are fine). And keep TweetDeck there.

What I'm looking for is if our Chrome task manager sees the memory usage too (if it does, then the problem should be easier to track).
To add to this: my daily Chrome usage has been the same (work and home) for the last year or so: 5 pinned tabs on start-up (2x Inbox, feedly, hangouts, tweetdeck), and they run all day. Not sure what changed in the v52 upgrade... hence I noticed this issue immediately after the update.
^^^ Same here, Pinned: G+, Gmail, Inbox, Photos, Hangouts, Tweetdeck, Trello - all pinned all day - never had problems till build 52. 

Sorry, Chris - do you want us to use Chrome stable, including Tweetdeck - get the black boxes to happen, then give you the chrome://gpu output AND taskmanager screenshot? 

Sorry, I wasn't clear.

I'm looking for a screenshot of Task Manager with the "Gpu Memory" column when the black boxes problem is occurring. You can use whichever build is most convenient for you to reproduce the black boxes problem (so, Chrome 52 would be fine).
that's fine, in test now. will take screenshot when it reaches that point
Thank you ccameron@ for trying very hard to fix this issue. M52 is already in stable and bar is VERY high. We can take change only if it is baked/verified in canary and safe to merge. 

Per our chat on Friday, I only approved the change listed at #58 to land directly on M52 branch 2743, as it just disables the feature that is new in M52 and is the safest thing to do (see #59). We're planning to cut M52 Stable RC tomorrow, so please revert any changes (#58 and #81) if either or both of them are not safe.

Any other merges to M52, please re-request a merge by applying "Merge-Request-52" label. Thank you very much.

 

Re #94, sorry! I've reverted #81.

#58 is known to be safe.
Project Member

Comment 96 by bugdroid1@chromium.org, Jul 31 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1066f888e958cd68cb6fb9bc60aa12e0db379937

commit 1066f888e958cd68cb6fb9bc60aa12e0db379937
Author: Christopher Cameron <ccameron@chromium.org>
Date: Sun Jul 31 23:15:11 2016

Revert "Mac h264: Do not retain CVPixelBufferRefs"

This reverts commit 074273d8b3ebf34c38df90970bdc9174658b3ff0.

BUG=631485

Review URL: https://codereview.chromium.org/2201673002 .

Cr-Commit-Position: refs/branch-heads/2743@{#715}
Cr-Branched-From: 2b3ae3b8090361f8af5a611712fc1a5ab2de53cb-refs/heads/master@{#394939}

[modify] https://crrev.com/1066f888e958cd68cb6fb9bc60aa12e0db379937/media/gpu/vt_video_decode_accelerator_mac.cc

Got the black boxes on stable. Attached is Task Manager, Activity Monitor, ioclasscount IOSurface, and Chrome://GPU

Chris, yeah looks like Tweetdeck may be the cause? (How effing frustrating). Still weird that prob surfaced in 52... but Tweetdeck development seems neglected at best :/


Screen Shot 2016-08-01 at 9.42.08 AM.png (985 KB)
gpu-chrome-stable-tweetdeck-probably.txt (110 KB)
Oh -- for the task manager, can you add the "Gpu Memory" column (right-click on the column titles, and it will be an option to click on).
(adding example screenshot of selecting the column)
gpumem.png (50.0 KB)
Ah crap - yep. Stand by, just need to blackbox again. 
tested with no_cvpixel_or_avlayer build
attached screenshot with GPU mem in task manager

about://gpu log message shortly before black boxes:
Log Messages
[5672:1295:0731/233405:ERROR:interface_registry.cc(80)] : Failed to locate a binder for interface: mojom::ResourceUsageReporter
[5672:1295:0801/002444:ERROR:gl_image_io_surface.mm(318)] : Error in CGLTexImageIOSurface2D for the Y plane. 10008
Screen Shot 2016-08-01 at 01.03.18.jpg (1.0 MB)
Screen Shot 2016-08-01 at 01.03.43.jpg (1.2 MB)
Here you go, mate. 
Screen Shot 2016-08-01 at 10.31.48 AM.png (1.4 MB)
Thanks! That very effectively shoots down my theory about blaming tweet deck.

FYI, my local tweet deck is still hanging out at <200 IOSurfaces.

I'm continuing to scrape through the changes that went into M52, and there was a change to our VTDecompressionSession instance in https://codereview.chromium.org/1882533002.

I've pulled that out, and also torn out all of the other things I'd already torn out (plus canvas), and put it in the build "no_cvpixel-no_av-no_seekcl-no_canvas", which I tested (sorry about the last build), and I've uploaded to:

  https://drive.google.com/file/d/0B6kh5pYRi1dKNy01SXhTNEVfRTg/view?usp=sharing

Could you see if the black box issue reproduces with that.

(now fingers crossed that it's the VT patch -- that would make a lot of sense, given the other observations).
Re #95, thank you so much for quickly reverting #81 and good to know #58 is safe.
Hey Chris - very minimal black boxes, but there they are. 
Sorry, attached. 
Screen Shot 2016-08-01 at 12.28.53 PM.png (1.9 MB)
gpu-20160801-1229.txt (90.5 KB)
Sorry again, ^^^above was using "no_cvpixel-no_av-no_seekcl-no_canvas"
I would say, Tweetdeck still "seems" to be involved somehow? When I refreshed the Tweetdeck tab, the VTDecoderXPCService amount reset, and so did ioclasscount IOSurface. 
Quick experiment running "no_cvpixel-no_av-no_seekcl-no_canvas": 

- I removed Tweetdeck tab (didn't restart Chromium build)
- Have been running for over an hour with normal usage (plenty of tabs, video playing, etc)
- VTDecoderXPCService max about 600MB when flipping around Facebook video, quickly resets when leaving/closing the tab
- ioclasscount IOSurface max about 600 but stays closer to ~200-300
- Zero performance problems experienced so far. 

Will run this config (I actually have Tweetdeck running in standalone native Mac app) for the rest of the day and report back. 
Thanks!

I'm wondering if TweetDeck opens lots of decoder instances.

I added some instrumentation which will appear when run at the command line

    ccameron-macbookpro:out ccameron$ ./instrumented_no_cvpixel_seek_av_canvas/Chromium.app/Contents/MacOS/Chromium 
    2016-07-31 21:35:54.436 Chromium[77225:745124] NSWindow warning: adding an unknown subview: <FullSizeContentView: 0x7fa5b4c64000>. Break on NSLog to debug.
    2016-07-31 21:35:54.436 Chromium[77225:745124] Call stack:
    (
        "+callStackSymbols disabled for performance reasons"
    )
    ***
    ***
    *** VTVideoDecodeAccelerator count: 1
    ***
    ***
    ***
    ***
    *** VTVideoDecodeAccelerator::Frame count: 11 (decoder count: 1)
    ***
    ***
    ***
    ***
    *** VTVideoDecodeAccelerator count: 2
    ***
    ***
    ***
    ***
    *** VTVideoDecodeAccelerator::Frame count: 22 (decoder count: 2)

...

And added it to "instrumented_no_cvpixel_seek_av_canvas", at

    https://drive.google.com/file/d/0B6kh5pYRi1dKQ1Q4N0t0dkphVGc/view?usp=sharing

How high do you see these decoder counts going?
Summary: VTDecoderXPCService leaks IOSurfaces (causing OOM black boxes on screen) (was: AVSampleBufferDisplayLayer leaks IOSurfaces (causing OOM black boxes on screen))
VTVideoDecodeAccelerator::Frame count: 88 (decoder count: 10)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 99 (decoder count: 10)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 110 (decoder count: 10)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 121 (decoder count: 10)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 132 (decoder count: 10)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 143 (decoder count: 10)
***

Here's a minute of scrolling up/down in the tweetdeck columns. 
VTVideoDecodeAccelerator_v1.txt (7.0 KB)
Seems to keep increasing.

*** VTVideoDecodeAccelerator::Frame count: 495 (decoder count: 40)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 484 (decoder count: 40)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 495 (decoder count: 40)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 506 (decoder count: 40)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 517 (decoder count: 40)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 528 (decoder count: 40)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 539 (decoder count: 40)
***
***
***
***
*** VTVideoDecodeAccelerator::Frame count: 550 (decoder count: 40)

Wow! 40 instances of accelerated h264 decoders, and 550 decoded frames -- that's crazy-high! And I would bet it tracks with "ioclasscount IOSurface" (like, there are probably ~100-200 more IOSurfaces than VTVideoDecodeAccelerator::Frames, but as they grow, they grow in tandem).
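
To pull the peak numbers out of a long pasted log rather than eyeballing it, something like the following could be used. It is a sketch that assumes the log lines look exactly like the instrumented build's output quoted in this thread; the type and function names are made up for illustration.

```typescript
interface DecoderSample {
  frames: number;
  decoders: number;
}

// Parses lines shaped like:
//   *** VTVideoDecodeAccelerator::Frame count: 550 (decoder count: 40)
// Separator lines ("***") and anything else are ignored.
function parseDecoderSamples(log: string): DecoderSample[] {
  const pattern = /VTVideoDecodeAccelerator::Frame count: (\d+) \(decoder count: (\d+)\)/g;
  const samples: DecoderSample[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(log)) !== null) {
    samples.push({ frames: Number(m[1]), decoders: Number(m[2]) });
  }
  return samples;
}

// Peak values across the log; a peak that keeps rising suggests decoders are never torn down.
function peakDecoderUsage(samples: DecoderSample[]): DecoderSample {
  let peak: DecoderSample = { frames: 0, decoders: 0 };
  for (let i = 0; i < samples.length; i++) {
    peak = {
      frames: Math.max(peak.frames, samples[i].frames),
      decoders: Math.max(peak.decoders, samples[i].decoders),
    };
  }
  return peak;
}

const sampleLog = [
  "*** VTVideoDecodeAccelerator::Frame count: 495 (decoder count: 40)",
  "***",
  "*** VTVideoDecodeAccelerator::Frame count: 484 (decoder count: 40)",
  "*** VTVideoDecodeAccelerator::Frame count: 550 (decoder count: 40)",
].join("\n");
console.log(peakDecoderUsage(parseDecoderSamples(sampleLog)));
```

Comparing the peak frame count against `ioclasscount IOSurface` taken at the same moment would be one way to check the "grow in tandem" guess.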

So, it looks to me that tweetdeck is DoSing the system. We don't put any guards in place to prevent this from happening (any webpage is perfectly welcome to DoS the system this way).

My tweetdeck only gets to ~10 decoders, but that's probably because of the content that I'm looking at. But I do see "ioclasscount IOSurface" growing, and I do see VTDecoderXPCService growing.

I'd guess that this appeared in M52 because we held on to IOSurfaces "just a bit longer" than in M51, and that pushed things over the edge.

Also, the "Gpu Memory" column in task manager doesn't take into account these decoded frames (hmm, maybe I should add an IOSurface memory column).

So, the plan for this:
* We should look into ways to prevent this sort of DoSing (that's a longer-term project)
* Maybe devrel can reach out to Twitter about these problems
* I'll leave AVSampleBufferDisplayLayer disabled for M52
* I'll add the "flush" fix to AVSampleBufferDisplayLayer for M53, and ship that then

Thanks Chris! FYI - *** VTVideoDecodeAccelerator::Frame count: 2453 (decoder count: 165)

So what do you think is the rough ETA for this to be resolved in stable?

For other users who have found this thread: close/reopen tweetdeck.twitter.com *regularly* to resolve OOM errors in the meantime :)

Comment 117 Deleted

Wow! That definitely solves the mystery!

WRT when we'll get a fix on stable for this, I'm not sure -- we'll need to figure something out between devrel, media team, etc. Chrome 52 is almost-completely-frozen (so if this is mostly tweetdeck's fault, we're not likely to push further changes for it).

I'll update tomorrow when I get more information about this from the rest of the team.
Also, thank you everyone for the debugging help!
Hi folks – I'm Tom, lead on the frontend of TweetDeck. Thanks for your extensive efforts looking into this — it's something I've been seeing lots, probably because I often run 3 or 4 instances of TweetDeck at a time!

There is one likely culprit for this: gifs. We render them as videos (mp4, <video>), and don't pause them when they're offscreen, although we sometimes remove them from DOM completely to keep a lid on the DOM node count.

--> How can we prove it's the gifs? Can we see the source of these decoders?

The team can put some work into pausing offscreen videos, but we don't have the resources to do it right now. However, we're planning to replace our infinite list implementation this quarter. DOM-thrift is at the top of that project's priorities.

(Alternatively, we recently changed the way we embed videos in TweetDeck to use an iframed player (video/mp2t, <video>) but this is probably not the issue as it takes a few clicks to watch one and you're seeing high instances of the h264 decoders.)
Labels: -Pri-1 -ReleaseBlock-Stable Pri-2
Hi tashworth@, you'll likely want to resolve this sooner than later as the M52 release will be going out soon and we don't feel that this is a stable blocker. Infinitely loading videos like this will either cause OOM or thread cap issues. We can idle collect software-decoded paused instances, but if the videos are playing or hardware decoded they won't be collected.

There are some improvements we could make in later versions to collect hardware decoded videos and possibly auto-pause videos w/o sound once off-screen, but that's not something we can land with M52 or even M53.

The most reasonable way to fix this is by clearing the src attribute (set '') _and_ removing the element from the DOM -- this is the advice we gave Vine a long time ago. Unfortunately the HTML5 media spec allows elements to continue to function even when removed from the DOM, so it's necessary to set src='' to completely delete them.

The IntersectionObserver API should make it fairly easy to implement this; clear src='' some distance off page and bring it back once the page comes back into visibility.
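
A minimal sketch of that park-and-restore idea, under stated assumptions: `ParkableVideo` is a hypothetical helper (not a TweetDeck or Chrome API), the 1000px margin is an arbitrary choice, and only the src clearing is shown, not the removal from the DOM. The state handling is kept free of DOM types so it can run anywhere; the browser wiring is given as a comment.

```typescript
// Minimal shape of the element we touch; an HTMLVideoElement has this property.
interface VideoLike {
  src: string;
}

// Hypothetical helper: remembers a video's source so it can be cleared while
// offscreen and put back when the element scrolls into view again.
class ParkableVideo {
  private savedSrc: string | null = null;
  private video: VideoLike;

  constructor(video: VideoLike) {
    this.video = video;
  }

  // Intended to be driven by IntersectionObserverEntry.isIntersecting.
  onVisibilityChange(isIntersecting: boolean): void {
    if (!isIntersecting && this.savedSrc === null) {
      this.savedSrc = this.video.src;
      this.video.src = ""; // per the advice above, clearing src lets the decoder go away
    } else if (isIntersecting && this.savedSrc !== null) {
      this.video.src = this.savedSrc;
      this.savedSrc = null;
    }
  }

  isParked(): boolean {
    return this.savedSrc !== null;
  }
}

// Browser wiring, shown as a comment because it needs a DOM:
//   const parked = new ParkableVideo(videoElement);
//   new IntersectionObserver(
//     (entries) => entries.forEach((e) => parked.onVisibilityChange(e.isIntersecting)),
//     { rootMargin: "1000px 0px" } // assumed margin: park only when well offscreen
//   ).observe(videoElement);

const fakeVideo: VideoLike = { src: "clip.mp4" };
const demoParkable = new ParkableVideo(fakeVideo);
demoParkable.onVisibilityChange(false);
console.log(demoParkable.isParked(), JSON.stringify(fakeVideo.src));
demoParkable.onVisibilityChange(true);
console.log(demoParkable.isParked(), fakeVideo.src);
```

One observer per element keeps the sketch short; a single shared observer with a lookup from element to helper would likely be the better fit for a long list.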
Thanks for the tips — we'll look at those options. We already know what's onscreen pretty well — the hard bit is managing starting & stopping correctly.

The rendering engine supporting this natively would be *very* helpful. This is tough in JavaScript without dropping frames, especially alongside the other things TweetDeck is doing concurrently.
Thanks, happy to help. FWIW, anything we implemented in the rendering engine would not be any better at preserving frames than JavaScript. Suspend/resume works by destroying the internal playback engine and seeking to the last known time upon resume.
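
As a rough page-level analogue of that suspend/resume behaviour (a sketch under assumptions, not Chrome's internal implementation), a script could record the playback position before clearing src and seek back to it on resume. SeekableVideoLike, SuspendedState, suspendVideo and resumeVideo are hypothetical names; src and currentTime are the real HTMLMediaElement properties they mirror.

```typescript
// Structural stand-in for HTMLVideoElement (assumption: only these two fields are used).
interface SeekableVideoLike {
  src: string;
  currentTime: number;
}

// What must survive while the element holds no media resource.
interface SuspendedState {
  src: string;
  time: number;
}

// Capture the position first: clearing src drops the media resource, so the
// position would otherwise be lost.
function suspendVideo(video: SeekableVideoLike): SuspendedState {
  const state: SuspendedState = { src: video.src, time: video.currentTime };
  video.src = '';
  return state;
}

// Reload the resource, then seek to the last known time.
function resumeVideo(video: SeekableVideoLike, state: SuspendedState): void {
  video.src = state.src;
  video.currentTime = state.time;
}
```

Because the resource is reloaded from scratch, frames between suspend and resume are simply never decoded, which matches the point above that this approach cannot preserve frames.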
Hi, thanks for getting in touch, Tom.
Perhaps autoplay could be replaced with click-to-play. All autoplaying is annoying, to say the least.
If it helps, here's some more info: I'm having the same issue (even as we speak) running the latest Chrome (stable channel) on a mid-2011 iMac with 20GB of RAM, running OSX 10.9.5. I don't use TweetDeck, but Tom mentioned something about GIFs. If it helps, I keep a http://messenger.com (the standalone Facebook chat) tab open most of the time, and there are constantly animated GIFs in the conversations (can be seen in the screenshot - in messenger.com all the images in the current conversation reappear in the form of a mosaic at the bottom right - when they are GIFs, the mosaic is "animated"). The black-boxes problem for me starts some time after opening some YouTube video. In the screenshot you can see the ioclasscount IOSurface count (north of 5000, and it goes up merely by scrolling). I haven't tested any of the builds, but would gladly test something if you need me to!

-Fotis
Screen Shot 2016-08-03 at 4.10.38 PM.png
237 KB View Download
chrome___gpu__ redinsect@.pdf
538 KB Download
To #125, that's a similar-but-separate  issue 632178 , which is related to any video playback on some macOS 10.9 systems. That issue has been fixed, and the fix should be pushed to users in the next day or so (if it hasn't already).
 Issue 632558  has been merged into this issue.
 Issue 639070  has been merged into this issue.
Project Member

Comment 129 by bugdroid1@chromium.org, Aug 21 2016

Labels: merge-merged-2785
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e7f13a23d26aa74bde7e6758e4c46071b819453e

commit e7f13a23d26aa74bde7e6758e4c46071b819453e
Author: Christopher Cameron <ccameron@chromium.org>
Date: Sun Aug 21 23:43:10 2016

Mac: Disable AVSampleBufferDisplayLayer on Mac <10.11

There are reports of this leaking IOSurfaces.

BUG=631485

Review URL: https://codereview.chromium.org/2269473002 .

Cr-Commit-Position: refs/branch-heads/2785@{#698}
Cr-Branched-From: 68623971be0cfc492a2cb0427d7f478e7b214c24-refs/heads/master@{#403382}

[modify] https://crrev.com/e7f13a23d26aa74bde7e6758e4c46071b819453e/ui/accelerated_widget_mac/ca_renderer_layer_tree.mm

Cc: rnimmagadda@chromium.org
Labels: TE-Verified-M53 TE-Verified-53.0.2785.80
Unable to repro this issue on Mac (10.11.6) on Chrome Beta Version - 53.0.2785.80. Looks like the issue is fixed.

Screen-recording is attached.

Hence adding the TE-Verified labels.


631485.mov
26.8 MB Download
this has NOT been resolved in version 53.0.2785.89 (64-bit) on OSX 10.11.6
see attachment
Screen Shot 2016-09-04 at 18.13.40.jpg
864 KB View Download
Screen Shot 2016-09-04 at 18.14.19.jpg
716 KB View Download
Screen Shot 2016-09-04 at 18.15.27.jpg
117 KB View Download
Screen Shot 2016-09-04 at 18.17.09.jpg
525 KB View Download
Please see #120, #121, and #122 -- this is, in effect, a DoS by the website.
Hi, 
I'm familiar with the comments, and also with the fact that this did not occur prior to v52.
This shouldn't have a TE-Verified-53.0.2785.80 label on it.
Labels: -Needs-Feedback -TE-Verified-53.0.2785.80
Status: ExternalDependency (was: Available)
Removed the verify labels and changed status to ExternalDependency.
Hi,

David here, also frontend dev on TweetDeck. We have recently put out an update to explicitly clear the "src" attribute before removing videos from the DOM. Whilst this doesn't eliminate the problem, it hopefully helps ease it a bit. There are certainly cases where the issue will still crop up (for example: a user who filters their content to display *ONLY* animated GIF videos).

While we are discussing this, I was wondering what your thoughts are on browser behaviour in this particular scenario...

The media spec makes a valid case for media elements with audio to continue playing in the background - even when detached from the DOM. But in our case, the videos don't have an audio track. Is it a reasonable expectation that removing a video with no sound from the DOM makes it a valid candidate to be collected? The spec does allow for GC in this scenario, and I totally understand that implementation details are never easy. But I'm interested in the conversation, and curious what you think.

Having said that, I'm going to re-iterate what my colleague Tom has said above, and we are working on a larger body of performance work, which includes improving how videos are handled on our side.

How recent?

I have a normal timeline and I don't do any scrolling or playback of videos, and it still happens...
The change to clear the src attribute went out mid-August (not that recent after all!). 

Curious why you'd still be seeing it though, especially if you've got no animated-gif videos happening.
As soon as I load the site up, the VTDecoderXPCService, which has been running for a while at 8.9MB, shoots up to 35.8MB.
It then continues to rise, even though there is barely a new tweet and no scrolling of the timeline.
See attachments and note the time stamps.

Screen Shot 2016-09-09 at 21.38.38.jpg
1.3 MB View Download
Screen Shot 2016-09-09 at 21.41.00.jpg
1.3 MB View Download
Screen Shot 2016-09-09 at 21.41.49.jpg
1.3 MB View Download
Screen Shot 2016-09-09 at 21.43.03.jpg
1.3 MB View Download
Cc: sureshkumari@chromium.org
 Issue 694311  has been merged into this issue.
 Issue 698844  has been merged into this issue.
My  bug 698844  was merged into this issue as a duplicate. I'm seeing the issue with Chrome 56, but I see no activity on this bug to indicate that it is still open. Is this bug really a duplicate and not fixed?
 Issue 647967  has been merged into this issue.
 Issue 719938  has been merged into this issue.
We are seeing this with VS Code (using Electron) ever since we updated Electron to Chrome 56. We did NOT see this while we were using Electron with a Chrome version of 53.

It reproduces no matter if GPU is enabled or disabled (via --disable-gpu flag). 

More steps to reproduce using Chrome 58:
1. Connect your MacBook Pro laptop (mine is a Retina, 15-inch, Mid 2014) to an external 4k or 5k monitor
2. Mirror the display and make sure the resolution is set to at least 3360x1890
3. open https://sourcegraph.com/github.com/Microsoft/vscode/-/blob/src/bootstrap.js#L42:6 and 3 editors side by side (to open more than one editor, click on the Split icon next to the "View on GitHub" label top right of the editor)
4. make sure the window is maximized
5. scroll around and generally use the web site UI
=> flickering 