Potential memory leak in cc::Display::DrawAndSwap [LatencyInfo]. |
|||||||||||
Issue descriptionI've been looking at sources of browser memory bloat by using native heap profiling on my own browser. Full details: https://docs.google.com/document/d/1fN5balfyrd7sRpd6DRaUI1TwoOwYjLyRSd7mwZT5US8/edit# Over the course of 1 week, the browser process created ~2k [LatencyInfo] objects that it did not destroy. This is suggestive of a large leak. Probably a bug in how stored_latency_info_ get's cleared. Each screenshot shows: 1) # of objects created [that have not been destroyed] 2) The stack trace of the code that created the object.
,
Jun 13 2017
It looks like we do a) copy all latencyinfo in stored_latency_info_ into the frame b) clear stored_latency_info_ c) draw d) if (!should_swap), copy all latencyinfo in frame into stored_latency_info_ So if you kept drawing but not swapping, you'd keep a growing list of latency infos. e) if (should_swap), move the latencyinfo in frame to SwapBuffers() f) in GLRenderer, pass them to OutputSurfaceFrame g) in GpuBrowserCompositorOutputSurface they're copied to GetCommandBufferProxy()->SetLatencyInfo(), at which point it'd be a different malloc h) in SoftwareRenderer pass them to OutputSurfaceFrame i) in SoftwareBrowserCompositorOutputSurface, copy them to a callback for &RenderWidgetHostImpl::CompositorFrameDrawn So I guess it would be d) ?
,
Jun 13 2017
,
Jun 13 2017
Hm I guess I'm not sure what the malloc is there though. It's in the constructor of LatencyInfo. Maybe that's std::string, or SmallMap. In that case i) isn't it cuz the Callback will be destroyed once it runs taking the LatencyInfos with it. g) could be, it does CommandBufferProxyImpl::SetLatencyInfo() which adds them to a vector of latency_info_. That is cleared on Flush() or OrderingBarrier() if put_offset_changed. But if the context doesn't do one of those then maybe that list is growing forever?
,
Jun 13 2017
I guess notably this is on mac, where the display compositor context is not shared with any meaningful ui compositor that uses it to do anything. The display compositor still does draw and such, but it doesn't flush or orderingbarrier maybe?
,
Jun 14 2017
> Hm I guess I'm not sure what the malloc is there though Ignore it. This is the implementation underneath "new operator". Your code is allocating an object with "new".
,
Jun 14 2017
Are you sure? There is a vector of LatencyInfos, not of unique_ptr<LatencyInfo>s. https://cs.chromium.org/chromium/src/cc/surfaces/display.h?rcl=014fa484cfd3972bfae6296776502369750a6ed2&l=121
,
Jun 16 2017
The stacktrace would suggest it's created in d), though the stack trace also suggests that there is malloc in latency info's default copy constructor. Latency info is bools, enums, string, and base::flat_map. I don't think these being copied would ends up calling malloc? What is the best way to experiment to see whether it's d) not swapping ot g) not flushing?
,
Jun 16 2017
> Latency info is bools, enums, string, and base::flat_map. I don't think these being copied would ends up calling malloc? I imagine that copying a base::flat_map [which contains a std::vector] would call new, which in turn calls malloc.
,
Jun 20 2017
Ping. Can someone from the GPU team pick this up?
,
Jun 20 2017
,
Jun 20 2017
The LatencyInfo vector is only used if it contains less than kMaxLatencyInfoNumber LatencyInfo objects - there's a bunch of calls to LatencyInfo::Verify which checks against that, and if it returns false, the latency info is not processed. So one option is to limit the size of of stored_latency_info_ to kMaxLatencyInfoNumber. We can probably just clear it if it grows beyond kMaxLatencyInfoNumber. This should also let us replace all those LatencyInfo::Verify calls with CHECKs, which would be good.
,
Jun 20 2017
Does that account for this? > Over the course of 1 week, the browser process created ~2k [LatencyInfo] objects that it did not destroy.
,
Jun 20 2017
Are you asking if these Verify checks are causing the leaks? If so, then I don't think it does.. I am just saying that if the issue is in (d) as you suspect, then we can fix it by putting a cap on stored_latency_info_ size.
,
Jun 20 2017
tdresser@ could you please help find an owner for this LatencyInfo issue?
,
Jun 21 2017
Over to dtapuska@ for triage. The potential solution in #12 sounds like a reasonable starting point to me.
,
Jun 30 2017
Marking as assigned for dtapuska to look at.
,
Jul 6 2017
,
Jul 6 2017
,
Jul 6 2017
,
Jul 6 2017
,
Jul 6 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/20548e47b8881bb32efa2d9b8db489d7b20eaf39 commit 20548e47b8881bb32efa2d9b8db489d7b20eaf39 Author: nzolghadr <nzolghadr@chromium.org> Date: Thu Jul 06 15:41:27 2017 Limit the size of latency info storage When there is no swaping we copy back the latency infos from the frame back into the stored latency info list. If this list grows beyond a particular size we never use it anyways. So we can just not store if the size ever grows more than the threshold. BUG= 732979 CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel Review-Url: https://codereview.chromium.org/2966803002 Cr-Commit-Position: refs/heads/master@{#484609} [modify] https://crrev.com/20548e47b8881bb32efa2d9b8db489d7b20eaf39/cc/surfaces/display.cc [modify] https://crrev.com/20548e47b8881bb32efa2d9b8db489d7b20eaf39/cc/surfaces/display.h [modify] https://crrev.com/20548e47b8881bb32efa2d9b8db489d7b20eaf39/cc/surfaces/display_unittest.cc
,
Jul 10 2017
|
|||||||||||
►
Sign in to add a comment |
|||||||||||
Comment 1 by ericrk@chromium.org
, Jun 13 2017