
Issue 798012

Starred by 4 users

Issue metadata

Status: Fixed
Owner:
Closed: Jan 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug




GpuDataManagerImpl::log_messages_ consumes 3GB+ memory.

Project Member Reported by erikc...@chromium.org, Dec 28 2017

Issue description

We have examples from the wild of this vector containing > 13 million elements, using 1.1GB of memory. For an unsymbolized native heap dump, see go/crash/55d9c6818b004951.

Stack trace of allocation frame attached as an image.

It looks like nothing ever clears log_messages_. Can we make this an LRU cache with:
  * A reasonable maximum size [e.g. 1000 elements]
  * A TTL for elements [so that log messages don't affect memory usage days later].
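
The proposal above can be sketched roughly as follows. This is a minimal illustration, not the Chromium code: `BoundedLog`, its method names, and the caller-supplied `now` parameter are all hypothetical. Because a log is append-only, eviction here is oldest-first rather than true LRU, and the sketch assumes timestamps never decrease between calls.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bounded log with a size cap and a TTL (not Chromium code).
class BoundedLog {
 public:
  using Clock = std::chrono::steady_clock;

  BoundedLog(std::size_t max_entries, Clock::duration ttl)
      : max_entries_(max_entries), ttl_(ttl) {
    assert(max_entries_ > 0);
  }

  // Appends a message stamped with `now`, then evicts stale or excess entries.
  void Add(std::string message, Clock::time_point now) {
    entries_.push_back({std::move(message), now});
    Prune(now);
  }

  // Drops entries older than the TTL, then the oldest entries beyond the cap.
  void Prune(Clock::time_point now) {
    while (!entries_.empty() && now - entries_.front().added > ttl_)
      entries_.pop_front();
    while (entries_.size() > max_entries_)
      entries_.pop_front();
  }

  // Returns the retained messages, oldest first.
  std::vector<std::string> Snapshot() const {
    std::vector<std::string> out;
    for (const Entry& e : entries_) out.push_back(e.message);
    return out;
  }

  std::size_t size() const { return entries_.size(); }

 private:
  struct Entry {
    std::string message;
    Clock::time_point added;
  };

  std::size_t max_entries_;
  Clock::duration ttl_;
  std::deque<Entry> entries_;
};
```

Note that expired entries are only dropped when `Add` or `Prune` runs, so a TTL on its own does not release memory in an idle process unless something calls `Prune` periodically.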


 
Attachment: Screen Shot 2017-12-28 at 2.21.32 PM.png (170 KB)
Oops, wrong link. I've attached a symbolized trace.

The total memory usage of the browser process is 7.5GB. At least 3GB is directly attributable to this vector.
Attachment: trace-5c2d6eba89a9aa4e.gz (175 KB)
Summary: GpuDataManagerImpl::log_messages_ consumes 3GB+ memory. (was: GpuDataManagerImpl::log_messages_ consumes 1GB+ memory.)

Comment 3 by piman@chromium.org, Jan 2 2018

Cc: kbr@chromium.org
Components: -Internals>GPU Internals>GPU>Internals
linky for that other trace: https://crash.corp.google.com/browse?q=reportid=%275c2d6eba89a9aa4e%27#0

Yeah, this sounds bad. We should definitely not keep an unbounded log like this.
An upper limit on the number of log lines (or total size even) sounds good. It doesn't seem worth it to wire through a time and deal with time expiration though.
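
For the "total size" variant mentioned above, a rough sketch might look like the following. `ByteBudgetLog` and its members are hypothetical names, not anything from the Chromium tree, and the budget counts only the characters in each message, ignoring allocator and container overhead.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical log bounded by total message bytes (not Chromium code).
class ByteBudgetLog {
 public:
  explicit ByteBudgetLog(std::size_t max_bytes) : max_bytes_(max_bytes) {
    assert(max_bytes_ > 0);
  }

  // Appends a message, then evicts the oldest messages until the payload
  // fits the budget. A single message larger than the budget is dropped too.
  void Add(std::string message) {
    bytes_ += message.size();
    entries_.push_back(std::move(message));
    while (bytes_ > max_bytes_ && !entries_.empty()) {
      bytes_ -= entries_.front().size();
      entries_.pop_front();
    }
  }

  const std::deque<std::string>& entries() const { return entries_; }
  std::size_t bytes() const { return bytes_; }

 private:
  std::size_t max_bytes_;
  std::size_t bytes_ = 0;
  std::deque<std::string> entries_;
};
```

A byte budget bounds memory more directly than an entry count when message lengths vary widely, at the cost of slightly more bookkeeping.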


Separately, 3GB of memory is on the order of several (tens of?) millions of items, and that seems wrong too. It is probably 1+ log per frame: we don't have the uptime, but given that this is on the current dev build, it has been running for less than 2 weeks. That part is probably not actionable as is.


We should also look at what is worth logging there. Generally we have everything that we log from the GPU process, which includes legitimate errors (e.g. failed to load GL libraries, lost GPU process), but also GL errors from all contexts. They're all kinda useful sometimes for debugging, but the majority is noise. In particular, errors from the WebGL contexts, which are also sent to the client renderer (displayed in devtools) probably don't need to be logged in about:gpu.
Cc: mariakho...@chromium.org
I will point out that without an expiration, we should expect the memory overhead for a long-running instance of chrome to always be at the maximum size. Perhaps instead we should allow the entire buffer to be purged on low-memory conditions?

+maria, since this likely also affects android.
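
The purge-on-low-memory idea could be sketched as below. Everything here is hypothetical: `PurgeableLog` and `OnLowMemory` are illustrative names, and the sketch assumes some platform hook calls `OnLowMemory` when memory is tight. How that signal is delivered is outside this example.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical capped log that can be emptied on demand (not Chromium code).
class PurgeableLog {
 public:
  explicit PurgeableLog(std::size_t max_entries) : max_entries_(max_entries) {
    assert(max_entries_ > 0);
  }

  // Appends a message and drops the oldest one once the cap is exceeded.
  void Add(std::string message) {
    entries_.push_back(std::move(message));
    if (entries_.size() > max_entries_) entries_.pop_front();
  }

  // Assumed to be called by a low-memory notification. Swapping with an
  // empty deque releases the storage rather than just resetting the size.
  void OnLowMemory() { std::deque<std::string>().swap(entries_); }

  std::size_t size() const { return entries_.size(); }

 private:
  std::size_t max_entries_;
  std::deque<std::string> entries_;
};
```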

Comment 5 by piman@chromium.org, Jan 2 2018

1000 log lines (which is generous) is probably around 100k, we can certainly have a tighter limit on Android too. I'm not sure it's worth overthinking this.
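
As a rough check on that estimate, the figures reported earlier in this issue can be combined as below. This is back-of-the-envelope arithmetic only: it assumes decimal units, treats "> 13 million" as exactly 13 million (so the per-entry figure is an upper bound), and the constant names are made up for the example.

```cpp
// Figures quoted in this issue; decimal units are an assumption.
constexpr double kObservedBytes = 1.1e9;   // "1.1GB"
constexpr double kObservedEntries = 13e6;  // "> 13 million elements"

// Average footprint per log entry implied by the report (about 85 bytes).
constexpr double kBytesPerEntry = kObservedBytes / kObservedEntries;

// Footprint of a 1000-entry cap at that average (about 85 KB).
constexpr double kProposedCap = 1000;
constexpr double kCappedBytes = kProposedCap * kBytesPerEntry;

// Entries implied by the 3GB report at that average (about 35 million).
constexpr double kReportedBytes = 3e9;     // "3GB+"
constexpr double kImpliedEntries = kReportedBytes / kBytesPerEntry;
```

Under these assumptions a 1000-entry cap comes out near 85 KB, in line with the "around 100k" figure, and 3GB corresponds to tens of millions of entries.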
oh sure, that seems fine. Didn't know what type of limit you were considering.
Owner: erikc...@chromium.org
Status: Assigned (was: Untriaged)
Adding a 1k limit is easy, I'll take that.

Comment 8 by bugdroid1@chromium.org, Jan 4 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/98676f6ea0ebcf48a40d7bda0d8457196caee458

commit 98676f6ea0ebcf48a40d7bda0d8457196caee458
Author: erikchen <erikchen@chromium.org>
Date: Thu Jan 04 01:08:10 2018

Add a gpu log limit of 1000 entries.

Some clients emit too many entries, which causes memory bloat. This has been
observed to consume 3GB+ in the wild.

Bug:  798012 
Change-Id: Ia230d9bd338c3173fa0f3fb86e641c731605b016
Reviewed-on: https://chromium-review.googlesource.com/849485
Reviewed-by: Antoine Labour <piman@chromium.org>
Commit-Queue: Erik Chen <erikchen@chromium.org>
Cr-Commit-Position: refs/heads/master@{#526882}
[modify] https://crrev.com/98676f6ea0ebcf48a40d7bda0d8457196caee458/content/browser/gpu/gpu_data_manager_impl_private.cc
[modify] https://crrev.com/98676f6ea0ebcf48a40d7bda0d8457196caee458/content/browser/gpu/gpu_data_manager_impl_private.h

Status: Fixed (was: Assigned)
Issue 807154 has been merged into this issue.
Issue 807154: Apparently this was causing OOMs in the wild.
Labels: Hotlist-HeapProfilingInField
