We currently use libMalloc on macOS. We might see performance improvements if we move to TCMalloc.
I'm going to roughly divide performance into three buckets:
1) speed of allocation/free/etc.
2) total memory overhead, during quiescence
3) fragmentation
I have not investigated (1) or (3).
For (2):
libMalloc uses three zones: large, small and tiny. The large zone keeps a free list whose maximum size is [total machine memory] / 1024. Running the command:
"""
ps aux | grep Canary | awk '{print $2}' | xargs -I '{}' sudo vmmap '{}' | grep "MALLOC.*(empty).*see MALLOC"
"""
shows that for my ~16 tabs across 2 profiles [with some extensions], there are ~40 processes with a combined ~300MB of dirty memory in the free lists, primarily in the large zones. This is on a machine with 64GB memory. I expect this number to scale roughly with the number of processes.
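To put the numbers above in context, here is a rough back-of-the-envelope calculation. It assumes the [total machine memory] / 1024 cap applies independently to each process, and it treats "64GB" and "300MB" as binary units (GiB/MiB); the function and variable names are made up for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def large_free_list_cap_bytes(total_machine_memory_bytes):
    """Cap described above: [total machine memory] / 1024."""
    return total_machine_memory_bytes // 1024


machine_memory = 64 * GIB    # the 64GB machine from the measurement
process_count = 40           # ~40 processes observed
observed_dirty = 300 * MIB   # ~300MB of dirty free-list memory observed

# Assumption: every process has its own large zone with its own cap.
cap_per_process = large_free_list_cap_bytes(machine_memory)
worst_case_total = cap_per_process * process_count
observed_per_process = observed_dirty / process_count

print(cap_per_process // MIB)       # 64
print(worst_case_total // MIB)      # 2560
print(observed_per_process / MIB)   # 7.5
```

Under those assumptions the observed ~300MB is well below the theoretical ceiling of 40 x 64MiB, so the cost could grow substantially on a workload that frees more large allocations.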
If we used TCMalloc or any custom implementation, we could add optimizations for background processes [e.g. purge free lists]. This would be in line with purge-and-suspend, and other such optimizations.
Not sure if this is worthwhile though, especially since we don't know how a custom allocator would change (1) or (3).
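For illustration, the "purge free lists for background processes" idea could look roughly like the toy model below. This is only a sketch of the policy, not libMalloc or TCMalloc code: FreeListCache and on_backgrounded are hypothetical names, and a real allocator would return pages to the OS rather than just reset a counter.

```python
MIB = 1024 ** 2


class FreeListCache:
    """Hypothetical capped free list; a toy model, not a real allocator."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.blocks = []        # sizes of freed blocks kept around for reuse
        self.cached_bytes = 0   # dirty memory currently held in the list

    def free(self, size):
        """Cache a freed block if it fits under the cap.

        Returns the number of bytes released immediately (0 if cached).
        """
        if self.cached_bytes + size <= self.max_bytes:
            self.blocks.append(size)
            self.cached_bytes += size
            return 0
        return size

    def purge(self):
        """Drop every cached block and report how many bytes were released."""
        released = self.cached_bytes
        self.blocks.clear()
        self.cached_bytes = 0
        return released


def on_backgrounded(cache):
    """Hypothetical hook: purge the free list when the process is backgrounded."""
    return cache.purge()


cache = FreeListCache(max_bytes=64 * MIB)
for _ in range(3):
    cache.free(8 * MIB)

released = on_backgrounded(cache)
print(released // MIB)      # 24
print(cache.cached_bytes)   # 0
```

The trade-off in this model is the same one raised above: purging lowers memory overhead while the process is idle, but the next large allocations after it comes back to the foreground can no longer be served from the free list, which touches (1).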
Comment 1 by haraken@chromium.org, Jul 21