GrTextBlobCache's GrMemoryPool wastes 32 KiB on Android |
||||
Issue descriptionGrMemoryPool::CreateBlock(size) allocates 'size + kHeaderSize' bytes. GrTextBlobCache creates its pool with 'preallocSize' and 'minAllocSize' both set to 1 << 17 (kPreAllocSize / kMinGrowthSize). What ends up being allocated by GrMemoryPool::CreateBlock() is '1 << 17 + kHeaderSize + kPerAllocPad' bytes, i.e. 131112. jemalloc (default Android's allocator since Android L) has pretty coarse allocation classes, see https://docs.google.com/spreadsheets/d/1zRdgXSgbxQDUoHoO1iiv0KTG8fBESeWHj-BUOZYqZ40/edit?usp=sharing According to that sheet (and as seen by a tool that I wrote) allocating 131112 actually allocates 163840, i.e. wastes 32728 bytes. While browsing TheVerge GrTextBlobCache caused GrMemoryPool to allocate 9 blocks, wasting 287 KiB total. I propose changing semantics of 'preallocSize' and 'minAllocSize' GrMemoryPool constructor arguments to include any internal size overhead (i.e. kHeaderSize). I.e. preallocSize / minAllocSize will specify memory sizes that GrMemoryPool will end up allocating (in most cases). Note that both preallocSize / minAllocSize already include some internal overhead, since they account for kPerAllocPad only once. I.e. the following: GrMemoryPool pool(sizeof(MyClass), 2 * sizeof(MyClass)); pool.allocate(sizeof(MyClass)); pool.allocate(sizeof(MyClass)); pool.allocate(sizeof(MyClass)); will end up allocating 3 blocks (and wasting sizeof(MyClass)): 1. First block is of size 'sizeof(MyClass) + kHeaderSize + kPerAllocPad', allocated by the constructor. Used by the first invocation of pool.allocate(). 2. Second block is of size '2 * sizeof(MyClass) + kHeaderSize + kPerAllocPad', allocated by the second invocation of pool.allocate(). 3. Third block is allocated by the third invocation of pool.allocate(), because internally it will ask for 'sizeof(MyClass) + kPerAllocPad', but there is only sizeof(MyClass) bytes available, since fMinAllocSize only includes kPerAllocPad once. So sizeof(MyClass) from the second block is wasted. Same is true for GrMemoryPool(2 * sizeof(MyClass), sizeof(MyClass)) - it will also end up with 3 blocks. GrMemoryPool used in 4 places in Skia, and only place one uses it to store predefined objects: InstancedRendering (stores Batch::Draw objects). Other places use random constants and won't possibly notice change in semantics. I think that InstancedRendering case can be solved by introducing GrObjectPool class that operates on counts, not sizes: template <class T> class GrObjectPool { public: GrObjectPool(size_t preallocCount, size_t minAllocCount); T* allocate(); void release(T*); ... }; Since GrObjectPool knows both counts it can add correct number of kPerAllocPad to make sure objects fit exactly. And with that '(Draw*)fInstancedRendering->fDrawPool.allocate(sizeof(Draw))' becomes just 'fInstancedRendering->fDrawPool.allocate()', which looks cleaner.
,
Sep 30 2016
,
Sep 30 2016
Is there any way we could know that the allocation classes are so we can round up?
,
Sep 30 2016
There is malloc_usable_size() that returns real allocated size for a pointer. But it has several issues: 1. It's different depending on platform (dlmalloc_usable_size on some early Android versions, malloc_size() on macOS, and HeapSize on Windows). 2. It can have arbitrary performance overhead, since it's not expected to be called often.
,
Sep 30 2016
Is there no malloc_good_size() equivalent that we can call before allocating?
,
Sep 30 2016
No, no such think on most platforms. On Android with jemalloc we can guess size classes (since they are more or less fixed), but we'll need some runtime checks, because early versions of Android use dlmalloc.
,
Oct 24 2016
Guys, what is our plan here? If you're OK with proposed fix, I can submit a patch.
,
Oct 27 2016
Guys, I see that there is a test for GrMemoryPool (tests/GrMemoryPoolTest.cpp), but it doesn't seem to be compiled. What is the reason for that?
,
Oct 27 2016
Do you mean, it's not being compiled, or that when you compile and link it the unit test doesn't show up? If it's not being compiled, that's odd... better paste out the log of `ninja -C out/Whatever -t commands dm` somewhere for me to look at. If it doesn't seem to be doing anything, better check the value of SK_SUPPORT_GPU. It appears to guard the bulk of the file's code.
,
Oct 27 2016
AFAIK it is compiled and runs. I thought I had commented the other day, but I guess I closed the tab before saving the changes... I'm ok with your proposed plan.
,
Oct 27 2016
Thanks guys! I'll work on the patch next week. The issue was that Chrome's third_party/skia/gn/ lacks tests.gni. I'll use standalone repo.
,
Oct 27 2016
tests.gni landed today, and I think just rolled into Chrome. If you sync up it should be there. That said, tests/GrMemoryPoolTest.cpp should always have been compiled. It just wasn't listed in our build explicitly. It was globbed up by a call to exec_script.
,
Nov 29 2016
The following revision refers to this bug: https://skia.googlesource.com/skia.git/+/e4cd00699167cefde9abedbd49ede64f82d552c7 commit e4cd00699167cefde9abedbd49ede64f82d552c7 Author: dskiba <dskiba@chromium.org> Date: Tue Nov 29 14:50:35 2016 Make GrMemoryPool play nice with bucketing allocators. Some memory allocators have very coarse size buckets, so for example on Android (jemalloc) an attempt to allocate 32 KiB + 1 byte will end up allocating 40 KiB, wasting 8 KiB. GrMemoryPool ctor takes two arguments that specify prealloc / block sizes, and then inflates them to accommodate some bookkeeping structures. Since most places create GrMemoryPools with pow2 numbers (which have buckets in most allocators) the inflation causes allocator to select next size bucket, wasting memory. This CL makes GrMemoryPool to stop inflating sizes it was created with, and allocate specified amounts exactly. Part of allocated memory is then used for bookkeeping structures. Additionally, GrObjectMemoryPool template is provided, which takes prealloc / block object counts (instead of sizes) and guarantees that specified number of objects will fit in prealloc / block spaces. BUG= 651872 GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2525773002 Review-Url: https://codereview.chromium.org/2525773002 [modify] https://crrev.com/e4cd00699167cefde9abedbd49ede64f82d552c7/src/gpu/GrMemoryPool.cpp [modify] https://crrev.com/e4cd00699167cefde9abedbd49ede64f82d552c7/src/gpu/GrMemoryPool.h [modify] https://crrev.com/e4cd00699167cefde9abedbd49ede64f82d552c7/src/gpu/instanced/InstancedRendering.cpp [modify] https://crrev.com/e4cd00699167cefde9abedbd49ede64f82d552c7/src/gpu/instanced/InstancedRendering.h [modify] https://crrev.com/e4cd00699167cefde9abedbd49ede64f82d552c7/tests/GrMemoryPoolTest.cpp
,
Nov 29 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/5dbfcd08e6a1f561fc9fc4ac97b569c1a4580e9b commit 5dbfcd08e6a1f561fc9fc4ac97b569c1a4580e9b Author: skia-deps-roller <skia-deps-roller@chromium.org> Date: Tue Nov 29 15:54:35 2016 Roll src/third_party/skia/ c51c18fd7..e4cd00699 (1 commit). https://skia.googlesource.com/skia.git/+log/c51c18fd7837..e4cd00699167 $ git log c51c18fd7..e4cd00699 --date=short --no-merges --format='%ad %ae %s' 2016-11-29 dskiba Make GrMemoryPool play nice with bucketing allocators. BUG= 651872 Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md If the roll is causing failures, see: http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel TBR=brianosman@google.com Review-Url: https://codereview.chromium.org/2536203002 Cr-Commit-Position: refs/heads/master@{#434999} [modify] https://crrev.com/5dbfcd08e6a1f561fc9fc4ac97b569c1a4580e9b/DEPS
,
Dec 15 2016
|
||||
►
Sign in to add a comment |
||||
Comment 1 by fmalita@chromium.org
, Sep 30 2016