New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 664350 link

Starred by 2 users

Issue metadata

Status: WontFix
Owner:
Closed: May 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 3
Type: Bug
Hotlist-MemoryInfra



Sign in to add a comment

ConvertTraceEventsToTraceFormat recycles a lot of memory

Project Member Reported by dskiba@chromium.org, Nov 11 2016

Issue description

Sometimes during last stages of tracing (i.e. after you clicked 'Stop' on the tracing dialog) renderer process dies because it exhausts AllocationRegister' allocations capacity (which is 1.5 M).

I have long suspected that it has to do with json, because adding more stuff to HeadDumpWriter's Entry crashes renderer almost always.

So I used ThreadHeapUsageTracker to get a sense what happens in ConvertTraceEventsToTraceFormat(), and voila:

[Format is bytes(count), i.e. the first line allocated 7624640 in 120005 allocations.]

102662 | ConvertTraceEventsToTraceFormat: allocated: 7624640(120005) freed: 7559072(120003), diff: 65568(2)
102648 | ConvertTraceEventsToTraceFormat: allocated: 45995216(693196) freed: 43514928(693176), diff: 2480288(20)
102638 | ConvertTraceEventsToTraceFormat: allocated: 61042160(1007373) freed: 58582544(1007361), diff: 2459616(12)
102967 | ConvertTraceEventsToTraceFormat: allocated: 76251120(1265357) freed: 73110368(1265343), diff: 3140752(14)
102695 | ConvertTraceEventsToTraceFormat: allocated: 131873424(2116273) freed: 126565920(2116256), diff: 5307504(17)
102587 | ConvertTraceEventsToTraceFormat: allocated: 67354096(1093805) freed: 63700672(1093772), diff: 3653424(33)

102695 = renderer
102967 = renderer
102662 = utility
102648 = gpu-process
102587 = browser

(Patch attached.)
 
trace_log.diff
1.9 KB Download
Owner: alph@chromium.org
Status: Assigned (was: Untriaged)

Comment 2 by alph@chromium.org, Dec 6 2016

Components: -Platform>DevTools>Memory Internals>Tracing
Owner: fmea...@chromium.org
Doesn't seem to be related to DevTools. Fadi could you please take a look.
Cc: -primiano@chromium.org l...@chromium.org fmea...@chromium.org
Owner: primiano@chromium.org
Assigning to primiano for now, if you don't have enough bandwidth, feel free to bounce it back.

Comment 4 by dskiba@chromium.org, Jan 14 2017

Cc: mariakho...@chromium.org ssid@chromium.org
OK, some more data on this. As we saw, ConvertTraceEventsToTraceFormat did over 2 millions allocations in renderer 102695 (2,116,256 to be precise).

I tried native heap profiling on Android, and found that it was crashing because apparently 1.5 M allocations configured in AllocationRegister was not enough. I checked last allocation (which caused the crash) and found that it was from ConvertTraceEventsToTraceFormat() converting trace events into JSON.

What's interesting is that ConvertTraceEventsToTraceFormat() uses HEAP_PROFILER_SCOPED_IGNORE macro to  attribute allocations with "tracing_overhead" type. But that the macro doesn't actually ignore allocations, they are still inserted to the AllocationRegister, and still cause it to overflow.

I decided to ignore allocations in HEAP_PROFILER_SCOPED_IGNORE scopes for real, i.e. simply not register them. To get some numbers to justify that I logged max allocation count (i.e. max count of touched entries in allocation hash table, next_unused_cell_) in ctor/dtor of HeapProfilerScopedIgnore class.

The results of running startup tracing on Android in simplest possible setup (run Chrome to the NTP, take dump after 13 seconds):

GPU       ~HeapProfilerScopedIgnore() in ConvertTraceEventsToTraceFormat: 148735 allocations (grew by 144423)
Renderer  ~HeapProfilerScopedIgnore() in ConvertTraceEventsToTraceFormat: 429624 allocations (grew by 407158)
Browser   ~HeapProfilerScopedIgnore() in ConvertTraceEventsToTraceFormat: 2435242 allocations (grew by 2299684)

(Oh right, I also increased AllocationRegister capacity to 2.5M to avoid crashes.)

So yeah, the process of converting stuff to JSON at some point results in 2,299,684 new allocations being alive at the same time. No wonder capacity of 1.5M allocations was not enough.
max-allocation-count.diff
6.0 KB Download
First of all thanks for the analysis as usual!

I added some comments in the CL. Considering that the heap profiler memory is not resident until we actually use it, I'd just bump it to 5M entries.
By ignoring HEAP_PROFILER_SCOPED_IGNORE (instead of marking as overhead) we'd cause even more discrepancies in the total reported by malloc vs heap profiler and that might be extremely puzzling.
Can we afford 60 MB per process until we get to TracingV2 and get rid of most ConvertableToTraceFormats?
If the answer is no, a possible alternative action is to use a separate:
1. counter for "ignored bytes"
2. a more compact hashtable that just maps addr to size, without any furhter context (to adjust the counter down when freeing).
BUt this is extra work and complexity, and not sure we should go there.

Comment 6 by dskiba@chromium.org, Jan 17 2017

As we know, right now mallinfo() on Android reports lots of other stuff, and it's simply not possible to correlate it with what heap profiler measures. Changing heap profiler to report lower number isn't going to change that - there is simply no such thing as "more discrepancies", because those numbers just can't be correlated.

Next, the point of "tracing_overhead" value is moot. Especially with it being in the malloc tree. I've complained about it before - during investigations you're not interested in tracing_overhead, but yet you have to do this little exercise of subtracting tracing_overhead from heap profiler's total to get "real" total.

So from my point of view, we're bumping capacities and wasting memory just to account for (terribly) memory inefficient JSON conversion to get a useless number.

Other way might be to actually fix JSON conversion to not produce 2.3M simultaneously allocated objects in the process.

Regarding V2 - how it's going to convert from its protobuf to JSON? If there is straight protobuf -> JSON conversion function, then I think it can be pretty easily adapted to handle TracedValues too.
allright then

> Regarding V2 - how it's going to convert from its protobuf to JSON? If there is straight protobuf -> JSON conversion function, then I think it can be pretty easily adapted to handle TracedValues too.
Ok you make a good point, this will apply as long as we have to support JSON.
We could write a smarter JSON writer, which doesn't need the base::values. but right now there are too many other more important things to do, so let's keep focus.

will lg the cl, thanks for the good points.

Comment 8 by dskiba@chromium.org, Jan 17 2017

OK, thanks! I'll take a look at JSON conversion at my spare time, hopefully by the time V2 lands we'll have something better.

Comment 9 by dskiba@chromium.org, Jan 17 2017

As https://codereview.chromium.org/2637783002 is about to kill tracing_overhead, it's time to see what it is.

I modified AllocationContextTracker to prepend stack frames with "tracing_overhead" frame, instead of completely overwriting frames with it. I.e. you can now drill into "tracing_overhead" item.

Then I took a simple trace (NTP, 13s): https://drive.google.com/file/d/0B_Hmi138MnbJWVZyejVYdDZCNmM/view?usp=sharing

Findings:

Browser: tracing_overhead is 290 KiB, most of which (215 KiB) comes from ProcessMetricsMemoryDumpProvider::DumpProcessMemoryMaps() adding entries to ProcessMemoryDump::process_mmaps_::vm_regions_.

Renderer: tracing_overhead is 4821 KiB. There are three main contributors:

2268 KiB: BlinkGCMemoryDumpProvider::OnMemoryDump(), with majority (2047 KiB) coming from TracedValue::SetString() (Pickle::WriteString -> Pickle::Resize -> realloc).
2298 KiB: PartitionAllocMemoryDumpProvider::OnMemoryDump(), again 2047 KiB coming from TracedValue::SetString().
204 KiB: ProcessMetricsMemoryDumpProvider::DumpProcessMemoryMaps()

Everything else is due to deduplicators (which we measure and report later), and minor things that are related to tracing machinery.

Given that I think it's pretty reasonable to drop tracing_overhead, because:

1. It lists allocations that are ProcessMemoryDump-related, so once dump is destroyed, memory is freed.
2. It depends on when MallocDumpProvider is dumped. For example for the renderer we see allocations from blink_gc / partition_alloc only because MallocDumpProvider is triggered after. Thus tracing_overhead misses allocations by MallocDumpProvider itself (of similar magnitude), and all dump providers that come after.


Components: -Internals>Tracing Internals>Instrumentation>Memory
Thanks for looking into this. Yeah I always knew about VMMaps being silly, we'll get there at some point :)
THere is one thing missing, that you don't see because your trace is short. If you do a longer trace, or if you just record in circular buffer mode, you should see something like 30 MB for the trace buffer.

Anyways this is not that much related with the heap traffic that JSON serialization causes really.
Project Member

Comment 11 by bugdroid1@chromium.org, Jan 18 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9ab14b2d78c1d15909fa2d6cc12760a757edb405

commit 9ab14b2d78c1d15909fa2d6cc12760a757edb405
Author: dskiba <dskiba@chromium.org>
Date: Wed Jan 18 21:53:42 2017

[tracing] Don't register tracing-related allocations.

Tracing-related scopes are instrumented with HEAP_PROFILER_SCOPED_IGNORE
macro, which caused all scoped allocations to have "tracing_overhead" type,
but didn't prevent those allocations to be registered.

However, it was discovered (see the bug) that the last stages of tracing
create substantial pressure on allocation register, adding millions of
short-lived allocations. Instead of bumping allocation register capacity
once again, this CL makes HEAP_PROFILER_SCOPED_IGNORE to completely ignore
allocations in its scope, i.e. not register them.

BUG= 664350 

Review-Url: https://codereview.chromium.org/2637783002
Cr-Commit-Position: refs/heads/master@{#444500}

[modify] https://crrev.com/9ab14b2d78c1d15909fa2d6cc12760a757edb405/base/trace_event/heap_profiler_allocation_context_tracker.cc
[modify] https://crrev.com/9ab14b2d78c1d15909fa2d6cc12760a757edb405/base/trace_event/heap_profiler_allocation_context_tracker.h
[modify] https://crrev.com/9ab14b2d78c1d15909fa2d6cc12760a757edb405/base/trace_event/heap_profiler_allocation_context_tracker_unittest.cc
[modify] https://crrev.com/9ab14b2d78c1d15909fa2d6cc12760a757edb405/base/trace_event/malloc_dump_provider.cc
[modify] https://crrev.com/9ab14b2d78c1d15909fa2d6cc12760a757edb405/third_party/WebKit/Source/platform/PartitionAllocMemoryDumpProvider.cpp
[modify] https://crrev.com/9ab14b2d78c1d15909fa2d6cc12760a757edb405/third_party/WebKit/Source/platform/heap/BlinkGCMemoryDumpProvider.cpp

Project Member

Comment 12 by bugdroid1@chromium.org, Jul 16 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/131e21f45f5d7b04874b636c1ffa642c13944f1e

commit 131e21f45f5d7b04874b636c1ffa642c13944f1e
Author: dskiba <dskiba@chromium.org>
Date: Sun Jul 16 19:41:39 2017

[tracing] Optimize TracedValue::AppendAsTraceFormat().

This CL optimizes TracedValue::AppendAsTraceFormat() by removing
intermediate base::Value conversion.

HeapProfilerPerfTest.AppendStackFramesAsTraceFormat shows ~1.9x improvement
on macOS (1730ms -> 920ms).

TBR=jbauman@chromium.org

BUG= 739378 ,  664350 

Review-Url: https://codereview.chromium.org/2975033002
Cr-Commit-Position: refs/heads/master@{#487019}

[modify] https://crrev.com/131e21f45f5d7b04874b636c1ffa642c13944f1e/base/trace_event/trace_event_argument.cc
[modify] https://crrev.com/131e21f45f5d7b04874b636c1ffa642c13944f1e/base/trace_event/trace_event_argument_unittest.cc
[modify] https://crrev.com/131e21f45f5d7b04874b636c1ffa642c13944f1e/cc/base/filter_operations_unittest.cc
[modify] https://crrev.com/131e21f45f5d7b04874b636c1ffa642c13944f1e/components/tracing/common/graphics_memory_dump_provider_android_unittest.cc

Status: WontFix (was: Assigned)
Chrome is moving to perfetto, where this won't be an issue.

Sign in to add a comment