New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 651872 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Last visit 15 days ago
Closed: Dec 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 3
Type: Bug



Sign in to add a comment

GrTextBlobCache's GrMemoryPool wastes 32 KiB on Android

Project Member Reported by dskiba@chromium.org, Sep 30 2016

Issue description

GrMemoryPool::CreateBlock(size) allocates 'size + kHeaderSize' bytes. GrTextBlobCache creates its pool with 'preallocSize' and 'minAllocSize' both set to 1 << 17 (kPreAllocSize / kMinGrowthSize). What ends up being allocated by GrMemoryPool::CreateBlock() is '1 << 17 + kHeaderSize + kPerAllocPad' bytes, i.e. 131112.

jemalloc (default Android's allocator since Android L) has pretty coarse allocation classes, see https://docs.google.com/spreadsheets/d/1zRdgXSgbxQDUoHoO1iiv0KTG8fBESeWHj-BUOZYqZ40/edit?usp=sharing

According to that sheet (and as seen by a tool that I wrote) allocating 131112 actually allocates 163840, i.e. wastes 32728 bytes. While browsing TheVerge GrTextBlobCache caused GrMemoryPool to allocate 9 blocks, wasting 287 KiB total.

I propose changing semantics of 'preallocSize' and 'minAllocSize' GrMemoryPool constructor arguments to include any internal size overhead (i.e. kHeaderSize). I.e. preallocSize / minAllocSize will specify memory sizes that GrMemoryPool will end up allocating (in most cases).

Note that both preallocSize / minAllocSize already include some internal overhead, since they account for kPerAllocPad only once. I.e. the following:

GrMemoryPool pool(sizeof(MyClass), 2 * sizeof(MyClass));
pool.allocate(sizeof(MyClass));
pool.allocate(sizeof(MyClass));
pool.allocate(sizeof(MyClass));

will end up allocating 3 blocks (and wasting sizeof(MyClass)):

1. First block is of size 'sizeof(MyClass) + kHeaderSize + kPerAllocPad', allocated by the constructor. Used by the first invocation of pool.allocate().
2. Second block is of size '2 * sizeof(MyClass) + kHeaderSize + kPerAllocPad', allocated by the second invocation of pool.allocate().
3. Third block is allocated by the third invocation of pool.allocate(), because internally it will ask for 'sizeof(MyClass) + kPerAllocPad', but there is only sizeof(MyClass) bytes available, since fMinAllocSize only includes kPerAllocPad once. So sizeof(MyClass) from the second block is wasted.

Same is true for GrMemoryPool(2 * sizeof(MyClass), sizeof(MyClass)) - it will also end up with 3 blocks.

GrMemoryPool used in 4 places in Skia, and only place one uses it to store predefined objects: InstancedRendering (stores Batch::Draw objects). Other places use random constants and won't possibly notice change in semantics.

I think that InstancedRendering case can be solved by introducing GrObjectPool class that operates on counts, not sizes:

template <class T>
class GrObjectPool {
public:
	GrObjectPool(size_t preallocCount, size_t minAllocCount);

	T* allocate();
	void release(T*);

	...
};

Since GrObjectPool knows both counts it can add correct number of kPerAllocPad to make sure objects fit exactly.

And with that '(Draw*)fInstancedRendering->fDrawPool.allocate(sizeof(Draw))' becomes just 'fInstancedRendering->fDrawPool.allocate()', which looks cleaner.
 
Cc: mtklein@chromium.org bsalomon@chromium.org reed@chromium.org

Comment 2 by bsalo...@google.com, Sep 30 2016

Cc: csmartdalton@chromium.org

Comment 3 by bsalo...@google.com, Sep 30 2016

Is there any way we could know that the allocation classes are so we can round up?

Comment 4 by dskiba@chromium.org, Sep 30 2016

There is malloc_usable_size() that returns real allocated size for a pointer. But it has several issues:

1. It's different depending on platform (dlmalloc_usable_size on some early Android versions, malloc_size() on macOS, and HeapSize on Windows).

2. It can have arbitrary performance overhead, since it's not expected to be called often. 
Is there no malloc_good_size() equivalent that we can call before allocating?

Comment 6 by dskiba@chromium.org, Sep 30 2016

No, no such think on most platforms. On Android with jemalloc we can guess size classes (since they are more or less fixed), but we'll need some runtime checks, because early versions of Android use dlmalloc.

Comment 7 by dskiba@chromium.org, Oct 24 2016

Labels: Performance-Memory Performance-Memory-Q4
Guys, what is our plan here? If you're OK with proposed fix, I can submit a patch.

Comment 8 by dskiba@chromium.org, Oct 27 2016

Guys, I see that there is a test for GrMemoryPool (tests/GrMemoryPoolTest.cpp), but it doesn't seem to be compiled.

What is the reason for that?
Do you mean, it's not being compiled, or that when you compile and link it the unit test doesn't show up?  If it's not being compiled, that's odd... better paste out the log of `ninja -C out/Whatever -t commands dm` somewhere for me to look at.

If it doesn't seem to be doing anything, better check the value of SK_SUPPORT_GPU.  It appears to guard the bulk of the file's code.
AFAIK it is compiled and runs.

I thought I had commented the other day, but I guess I closed the tab before saving the changes... I'm ok with your proposed plan.
Thanks guys! I'll work on the patch next week.

The issue was that Chrome's third_party/skia/gn/ lacks tests.gni. I'll use standalone repo.


tests.gni landed today, and I think just rolled into Chrome.  If you sync up it should be there.

That said, tests/GrMemoryPoolTest.cpp should always have been compiled.  It just wasn't listed in our build explicitly.  It was globbed up by a call to exec_script.
Project Member

Comment 13 by bugdroid1@chromium.org, Nov 29 2016

The following revision refers to this bug:
  https://skia.googlesource.com/skia.git/+/e4cd00699167cefde9abedbd49ede64f82d552c7

commit e4cd00699167cefde9abedbd49ede64f82d552c7
Author: dskiba <dskiba@chromium.org>
Date: Tue Nov 29 14:50:35 2016

Make GrMemoryPool play nice with bucketing allocators.

Some memory allocators have very coarse size buckets, so for example on
Android (jemalloc) an attempt to allocate 32 KiB + 1 byte will end up
allocating 40 KiB, wasting 8 KiB.

GrMemoryPool ctor takes two arguments that specify prealloc / block sizes,
and then inflates them to accommodate some bookkeeping structures. Since
most places create GrMemoryPools with pow2 numbers (which have buckets in
most allocators) the inflation causes allocator to select next size bucket,
wasting memory.

This CL makes GrMemoryPool to stop inflating sizes it was created with, and
allocate specified amounts exactly. Part of allocated memory is then used for
bookkeeping structures. Additionally, GrObjectMemoryPool template is provided,
which takes prealloc / block object counts (instead of sizes) and guarantees
that specified number of objects will fit in prealloc / block spaces.

BUG= 651872 
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2525773002

Review-Url: https://codereview.chromium.org/2525773002

[modify] https://crrev.com/e4cd00699167cefde9abedbd49ede64f82d552c7/src/gpu/GrMemoryPool.cpp
[modify] https://crrev.com/e4cd00699167cefde9abedbd49ede64f82d552c7/src/gpu/GrMemoryPool.h
[modify] https://crrev.com/e4cd00699167cefde9abedbd49ede64f82d552c7/src/gpu/instanced/InstancedRendering.cpp
[modify] https://crrev.com/e4cd00699167cefde9abedbd49ede64f82d552c7/src/gpu/instanced/InstancedRendering.h
[modify] https://crrev.com/e4cd00699167cefde9abedbd49ede64f82d552c7/tests/GrMemoryPoolTest.cpp

Project Member

Comment 14 by bugdroid1@chromium.org, Nov 29 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/5dbfcd08e6a1f561fc9fc4ac97b569c1a4580e9b

commit 5dbfcd08e6a1f561fc9fc4ac97b569c1a4580e9b
Author: skia-deps-roller <skia-deps-roller@chromium.org>
Date: Tue Nov 29 15:54:35 2016

Roll src/third_party/skia/ c51c18fd7..e4cd00699 (1 commit).

https://skia.googlesource.com/skia.git/+log/c51c18fd7837..e4cd00699167

$ git log c51c18fd7..e4cd00699 --date=short --no-merges --format='%ad %ae %s'
2016-11-29 dskiba Make GrMemoryPool play nice with bucketing allocators.

BUG= 651872 

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls

CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel
TBR=brianosman@google.com

Review-Url: https://codereview.chromium.org/2536203002
Cr-Commit-Position: refs/heads/master@{#434999}

[modify] https://crrev.com/5dbfcd08e6a1f561fc9fc4ac97b569c1a4580e9b/DEPS

Owner: dskiba@chromium.org
Status: Fixed (was: Untriaged)

Sign in to add a comment