Currently the profiler stores every collected stack individually in memory. At 300 stacks, ~22 frames/stack, and 16 bytes/frame, this works out to about 106 KB (300 x 22 x 16 = 105,600 bytes) per collection per thread, plus module information and overhead.
In the typical case most of the stacks are idling in the message loop and so are identical. We can substantially reduce memory usage by storing a single copy of each such stack rather than one copy per sample.
Proposed approach: store only the distinct stacks, rather than one stack per sample, in StackSamplingProfiler::CallStackProfile. Add a vector of struct { size_t index; uint32_t process_milestones; } with one entry per sample, recording the index of that sample's stack among the distinct stacks plus the process milestones at the time of the sample.
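A minimal sketch of the proposed layout, to make the idea concrete. This is not the actual StackSamplingProfiler code: Frame, Stack, Sample, DedupedProfile, RecordSample and StackAt are hypothetical names chosen for illustration, and the Frame fields are an assumption picked only to match the 16 bytes/frame figure on a 64-bit build. The lookup map is also an assumption; it is one simple way to find an already-stored stack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical frame representation (assumed fields, 16 bytes on 64-bit).
struct Frame {
  uintptr_t instruction_pointer;
  size_t module_index;
};

// Ordering so that whole stacks (vectors of frames) can be used as map keys.
inline bool operator<(const Frame& a, const Frame& b) {
  return std::tie(a.instruction_pointer, a.module_index) <
         std::tie(b.instruction_pointer, b.module_index);
}

using Stack = std::vector<Frame>;

// One entry per collected sample, as described in the proposal.
struct Sample {
  size_t index;                 // Index into the distinct stacks.
  uint32_t process_milestones;  // Milestones at the time of the sample.
};

class DedupedProfile {
 public:
  // Records one sample, storing the stack only if it has not been seen before.
  void RecordSample(const Stack& stack, uint32_t process_milestones) {
    auto it = stack_to_index_.find(stack);
    if (it == stack_to_index_.end()) {
      it = stack_to_index_.emplace(stack, distinct_stacks_.size()).first;
      distinct_stacks_.push_back(stack);
    }
    samples_.push_back({it->second, process_milestones});
  }

  // Returns the stack that was observed for the given sample.
  const Stack& StackAt(size_t sample_index) const {
    assert(sample_index < samples_.size());
    return distinct_stacks_[samples_[sample_index].index];
  }

  const std::vector<Stack>& distinct_stacks() const { return distinct_stacks_; }
  const std::vector<Sample>& samples() const { return samples_; }

 private:
  std::vector<Stack> distinct_stacks_;
  std::vector<Sample> samples_;
  // Needed only while collecting; it holds a second copy of each distinct
  // stack, so a real implementation would likely discard it afterwards.
  std::map<Stack, size_t> stack_to_index_;
};
```

With this layout, a collection in which most samples are the same idle stack stores that stack once, and each additional sample costs only one Sample entry instead of a full copy of the stack's frames.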
Comment 1 by chengx@chromium.org, Jan 23 2018
Owner: chengx@chromium.org
Status: Assigned (was: Available)