New issue
Advanced search Search tips

Issue 864462 link

Starred by 46 users

Issue metadata

Status: Fixed
Owner:
Closed: Oct 3
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Android , Windows , Chrome , Mac , Fuchsia
Pri: 2
Type: Feature

Blocked on:
issue 890752



Sign in to add a comment

Initialization of memory provided by PartitionAlloc is slow

Reported by ydor...@gmail.com, Jul 17

Issue description

UserAgent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36

Steps to reproduce the problem:
Run the attached index.html (or navigate to https://algotec.github.io/alloc-clamped-array/), click on 'Run test'. 
The tester allocates 1000 1MB array buffers, stores their references in an array, and measures the amount of time it took.

What is the expected behavior?
Chrome memory allocation time is similar or better than other browsers

What went wrong?
Chrome allocation time is significantly worse in all cases except one (Edge, Win10). See the attached excel for a detailed comparison with other browsers.
It varies between 1.5-2 (IE11) to 50-70 faster (Firefox)

Did this work before? N/A 

Does this work in other browsers? Yes

Chrome version: 67.0.3396.99  Channel: stable
OS Version: 6.1 (Windows 7, Windows Server 2008 R2)
Flash Version:
 
index.html
817 bytes View Download
browser-memory-allocation.xlsx
9.8 KB Download
Components: Blink>JavaScript
Labels: Needs-Triage-M67
Cc: susan.boorgula@chromium.org
Labels: Triaged-ET M-69 FoundIn-69 Target-69 OS-Linux OS-Mac
Status: Untriaged (was: Unconfirmed)
ydorman@ Thanks for the issue.

Able to reproduce this issue on Windows 10, Mac OS 10.13.3 and Ubuntu 17.10 on the reported version 67.0.3396.99 and the latest Canary 69.0.3493 as per the original comment.
Attached is the screen shot for reference.

This is a Non-Regression issue as this behavior is observed from M-60 chrome builds.
Hence marking this as Untriaged for further updates from Dev.

Thanks..
864462-M60.PNG
59.7 KB View Download
Cc: jgruber@chromium.org
Owner: petermarshall@chromium.org
Status: Assigned (was: Untriaged)
Cc: haraken@chromium.org jochen@chromium.org petermarshall@chromium.org
Owner: haraken@chromium.org
Summary: Initialization of memory provided by PartitionAlloc is slow (was: Uint8ClampedArray allocations are slower on Chrome than in other browsers)
A huge amount of the overhead is coming from initializing the memory to 0. I verified this by commenting out the memset here: https://cs.chromium.org/chromium/src/third_party/blink/renderer/platform/wtf/typed_arrays/array_buffer_contents.cc?l=115&rcl=90be568fb49ceac0dfad0783d293bf31baccebe1

Without the memset, the time goes from ~500ms to ~100ms. Firefox is ~250ms for this benchmark.

Clearly we need the memory to be set to 0 for correctness. When I run this same benchmark in d8, which uses malloc/calloc/free for the ArrayBufferAllocator implementation, the result is ~16ms. That indicates to me that calloc does something smart - maybe pre-zeroed chunks or something.

Maybe it would be possible for PartitionAlloc to provide an interface to allocate initialized memory? Then it could do something under the hood to speed up these cases. 

Also open to other ideas. Adding folks who might be able to help.
Cc: palmer@chromium.org
This is interesting. Palmer's experiment (https://chromium-review.googlesource.com/c/chromium/src/+/1134523) is saying that adding one more zapping to the memory region did not regress performance. That contradicts with the observation of this bug...

Palmer: Do you have any thoughts?


Actually, for an allocation this big, d8 uses mmap with MAP_ANONYMOUS, meaning the memory will be initialized to 0.

The mmap call is here: https://cs.chromium.org/chromium/src/v8/src/base/platform/platform-posix.cc?l=143&rcl=c1e3354d56b2da398aa06a2429a09e7b878fe8eb

Which is called in a large chain from here: https://cs.chromium.org/chromium/src/v8/src/d8.cc?l=94&rcl=c1e3354d56b2da398aa06a2429a09e7b878fe8eb

Components: Blink>MemoryAllocator
Labels: OS-Android OS-Chrome OS-Fuchsia
My CL zaps memory on *freeing*, not on allocation, which wouldn't affect this benchmark. (And, apparently, not the others. :) )

I agree with #5 that a good way to handle this would be for PartitionAlloc to have an interface allowing the caller to get the memory zero'd, and then for the implementation to optimize by doing nothing if the allocation was satisfied with mmap. The way to do this would be to define a new flag for `PartitionAllocGenericFlags`. https://cs.chromium.org/chromium/src/base/allocator/partition_allocator/partition_alloc.h?l=316

haraken, do you want to do this or shall I?
Regarding this benchmark, yes. Your zapping on freeing won't affect the benchmark.

However, the benchmark result is saying that "a memset overhead on allocation is huge".

My question is: why can we argue that "a memset overhead on freeing is negligible" while "a memset overhead on allocation is huge"?


#10: Well, a memset on allocation is huge *in this particular allocation pattern*. Whether real applications really allocate 1,000 1 MiB regions, *and do so in their critical path*, is questionable.

This bug is about optimizing for a benchmark, which I am normally against, but since the optimization is nice to have and since we can probably implement with not too much difficulty, I'm not against it. And who knows, it might benefit some real applications.

I do also suspect allocations are more likely to be in the critical path for real applications than frees are. For example, when you close a tab and a million frees happen, it's pretty well isolated from your other tabs (especially now with Site Isolation, which creates more and shorter-lived processes).
Thanks for the clarification! Your analysis sounds reasonable to me.

+1 to implementing a new flag for PartitionAllocGenericFlags as an optimization for this case.

palmer@ about #11: We opened this bug due to a real world scenario. We develop a medical imaging application, and in some scenarios the user may scroll very quickly through many different images spread out on 2 to 4 large monitors (each up to 5MegaPixel). In some cases, we need to process 200 images per second coming from the server, using a web socket. When trying to optimize this scenario in different ways we noticed this bug. Obviously, if we could save 1ms per image then we'd free up 200ms per second that can be used to decompress and display images.
Labels: -Type-Bug Type-Feature
Owner: palmer@chromium.org
Status: Started (was: Assigned)
Labels: -Hotlist-Interop
Project Member

Comment 16 by bugdroid1@chromium.org, Sep 7

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b60e5cc0bd4b351c5898dc3637f537c355546c11

commit b60e5cc0bd4b351c5898dc3637f537c355546c11
Author: Chris Palmer <palmer@chromium.org>
Date: Fri Sep 07 00:51:36 2018

[PartitionAlloc] Add a flag to allow callers to zero-fill allocations.

This enables callers to avoid doing it themselves (centralizing the `memset` in
1 place, rather than duplicating it at every call site that wants it), and to
allow the PartitionAlloc implementation to avoid doing it itself if the
operating system has already zero'd the new page(s).

Bug:  864462 , 680657 
Change-Id: Ibeae0bf7b9f20238334bf9bc022d0f243a77ef11
Reviewed-on: https://chromium-review.googlesource.com/1162825
Commit-Queue: Chris Palmer <palmer@chromium.org>
Reviewed-by: Kentaro Hara <haraken@chromium.org>
Reviewed-by: Albert J. Wong <ajwong@chromium.org>
Cr-Commit-Position: refs/heads/master@{#589389}
[modify] https://crrev.com/b60e5cc0bd4b351c5898dc3637f537c355546c11/base/allocator/partition_allocator/partition_alloc.cc
[modify] https://crrev.com/b60e5cc0bd4b351c5898dc3637f537c355546c11/base/allocator/partition_allocator/partition_alloc.h
[modify] https://crrev.com/b60e5cc0bd4b351c5898dc3637f537c355546c11/base/allocator/partition_allocator/partition_alloc_constants.h
[modify] https://crrev.com/b60e5cc0bd4b351c5898dc3637f537c355546c11/base/allocator/partition_allocator/partition_alloc_unittest.cc
[modify] https://crrev.com/b60e5cc0bd4b351c5898dc3637f537c355546c11/base/allocator/partition_allocator/partition_bucket.cc
[modify] https://crrev.com/b60e5cc0bd4b351c5898dc3637f537c355546c11/base/allocator/partition_allocator/partition_cookie.h
[modify] https://crrev.com/b60e5cc0bd4b351c5898dc3637f537c355546c11/base/allocator/partition_allocator/partition_page.h

Labels: -M-69 -Target-69 M-71
Status update:

https://chromium-review.googlesource.com/1162825 creates the new interface

https://chromium-review.googlesource.com/c/chromium/src/+/1212443 uses it in 2 call sites (although there may be other places we can use it, too)

But we don't yet have the actual optimization in place, inside Partition Alloc. I have a draft CL (not uploaded yet) that doesn't work yet, and I'll try to hammer that out. The innards of the PA implementation are always enjoyably tricky...
Project Member

Comment 18 by bugdroid1@chromium.org, Sep 7

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f94b670562073c228efaf1f32a20708dc0425db3

commit f94b670562073c228efaf1f32a20708dc0425db3
Author: Findit <findit-for-me@appspot.gserviceaccount.com>
Date: Fri Sep 07 01:37:54 2018

Revert "[PartitionAlloc] Add a flag to allow callers to zero-fill allocations."

This reverts commit b60e5cc0bd4b351c5898dc3637f537c355546c11.

Reason for revert:

Findit (https://goo.gl/kROfz5) identified CL at revision 589389 as the
culprit for failures in the build cycles as shown on:
https://findit-for-me.appspot.com/waterfall/culprit?key=ag9zfmZpbmRpdC1mb3ItbWVyRAsSDVdmU3VzcGVjdGVkQ0wiMWNocm9taXVtL2I2MGU1Y2MwYmQ0YjM1MWM1ODk4ZGMzNjM3ZjUzN2MzNTU1NDZjMTEM

Sample Failed Build: https://ci.chromium.org/buildbot/chromium.linux/Cast%20Audio%20Linux/20069

Sample Failed Step: base_unittests

Original change's description:
> [PartitionAlloc] Add a flag to allow callers to zero-fill allocations.
> 
> This enables callers to avoid doing it themselves (centralizing the `memset` in
> 1 place, rather than duplicating it at every call site that wants it), and to
> allow the PartitionAlloc implementation to avoid doing it itself if the
> operating system has already zero'd the new page(s).
> 
> Bug:  864462 , 680657 
> Change-Id: Ibeae0bf7b9f20238334bf9bc022d0f243a77ef11
> Reviewed-on: https://chromium-review.googlesource.com/1162825
> Commit-Queue: Chris Palmer <palmer@chromium.org>
> Reviewed-by: Kentaro Hara <haraken@chromium.org>
> Reviewed-by: Albert J. Wong <ajwong@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#589389}

Change-Id: Ifd54700a2acb141c458df69304d7ccdce2719cc5
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug:  864462 , 680657 
Reviewed-on: https://chromium-review.googlesource.com/1212346
Cr-Commit-Position: refs/heads/master@{#589404}
[modify] https://crrev.com/f94b670562073c228efaf1f32a20708dc0425db3/base/allocator/partition_allocator/partition_alloc.cc
[modify] https://crrev.com/f94b670562073c228efaf1f32a20708dc0425db3/base/allocator/partition_allocator/partition_alloc.h
[modify] https://crrev.com/f94b670562073c228efaf1f32a20708dc0425db3/base/allocator/partition_allocator/partition_alloc_constants.h
[modify] https://crrev.com/f94b670562073c228efaf1f32a20708dc0425db3/base/allocator/partition_allocator/partition_alloc_unittest.cc
[modify] https://crrev.com/f94b670562073c228efaf1f32a20708dc0425db3/base/allocator/partition_allocator/partition_bucket.cc
[modify] https://crrev.com/f94b670562073c228efaf1f32a20708dc0425db3/base/allocator/partition_allocator/partition_cookie.h
[modify] https://crrev.com/f94b670562073c228efaf1f32a20708dc0425db3/base/allocator/partition_allocator/partition_page.h

Project Member

Comment 19 by bugdroid1@chromium.org, Sep 7

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/7bd9d693667d585c44d2d29666860c678c19bc83

commit 7bd9d693667d585c44d2d29666860c678c19bc83
Author: Chris Palmer <palmer@chromium.org>
Date: Fri Sep 07 23:27:10 2018

[PartitionAlloc] Add a flag to allow callers to zero-fill allocations.

This enables callers to avoid doing it themselves (centralizing the `memset` in
1 place, rather than duplicating it at every call site that wants it), and to
allow the PartitionAlloc implementation to avoid doing it itself if the
operating system has already zero'd the new page(s).

This is a re-land, with fixes, of
https://chromium-review.googlesource.com/c/chromium/src/+/1162825.

Bug:  864462 , 680657 
TBR: ajwong,haraken
Change-Id: Icfe05bca82c79c8a8029db40b2902271d3d10347
Reviewed-on: https://chromium-review.googlesource.com/1214173
Reviewed-by: Chris Palmer <palmer@chromium.org>
Commit-Queue: Chris Palmer <palmer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#589717}
[modify] https://crrev.com/7bd9d693667d585c44d2d29666860c678c19bc83/base/allocator/partition_allocator/partition_alloc.cc
[modify] https://crrev.com/7bd9d693667d585c44d2d29666860c678c19bc83/base/allocator/partition_allocator/partition_alloc.h
[modify] https://crrev.com/7bd9d693667d585c44d2d29666860c678c19bc83/base/allocator/partition_allocator/partition_alloc_constants.h
[modify] https://crrev.com/7bd9d693667d585c44d2d29666860c678c19bc83/base/allocator/partition_allocator/partition_alloc_unittest.cc
[modify] https://crrev.com/7bd9d693667d585c44d2d29666860c678c19bc83/base/allocator/partition_allocator/partition_bucket.cc
[modify] https://crrev.com/7bd9d693667d585c44d2d29666860c678c19bc83/base/allocator/partition_allocator/partition_cookie.h
[modify] https://crrev.com/7bd9d693667d585c44d2d29666860c678c19bc83/base/allocator/partition_allocator/partition_page.h

Project Member

Comment 20 by bugdroid1@chromium.org, Sep 8

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/03cfb8fd0e4558e6b2d1bd9570441dfe412f3202

commit 03cfb8fd0e4558e6b2d1bd9570441dfe412f3202
Author: Chris Palmer <palmer@chromium.org>
Date: Sat Sep 08 06:35:23 2018

[PartitionAlloc] Fix a crash when allocation fails and PartitionAllocZeroFill is used.

Bug:  864462 
TBR: ajwong,haraken
Change-Id: Ia32bd6ec4aad86c0581a0512cabeb658546d6327
Reviewed-on: https://chromium-review.googlesource.com/1214692
Commit-Queue: Chris Palmer <palmer@chromium.org>
Reviewed-by: Chris Palmer <palmer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#589774}
[modify] https://crrev.com/03cfb8fd0e4558e6b2d1bd9570441dfe412f3202/base/allocator/partition_allocator/partition_alloc.h

Project Member

Comment 21 by bugdroid1@chromium.org, Sep 10

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d96ca641e402036a04ff8c80a075aead40872437

commit d96ca641e402036a04ff8c80a075aead40872437
Author: Chris Palmer <palmer@chromium.org>
Date: Mon Sep 10 22:05:12 2018

Use the new PartitionAlloc API to zero-fill allocations.

Also embellish the PA API a bit to make accessing the new zero-fill
functionality more usable.

There are likely additional call sites that can make use of this functionality.

Bug:  864462 
Change-Id: I8881c5774b6083b210f5160d4b4f2ac345f2467e
Reviewed-on: https://chromium-review.googlesource.com/1212443
Commit-Queue: Chris Palmer <palmer@chromium.org>
Reviewed-by: Kentaro Hara <haraken@chromium.org>
Reviewed-by: Hongchan Choi <hongchan@chromium.org>
Cr-Commit-Position: refs/heads/master@{#590081}
[modify] https://crrev.com/d96ca641e402036a04ff8c80a075aead40872437/base/allocator/partition_allocator/partition_alloc.h
[modify] https://crrev.com/d96ca641e402036a04ff8c80a075aead40872437/third_party/blink/renderer/platform/audio/audio_array.h
[modify] https://crrev.com/d96ca641e402036a04ff8c80a075aead40872437/third_party/blink/renderer/platform/wtf/allocator/partitions.h
[modify] https://crrev.com/d96ca641e402036a04ff8c80a075aead40872437/third_party/blink/renderer/platform/wtf/typed_arrays/array_buffer_contents.cc

Status: Fixed (was: Started)
Looking at the test app (https://algotec.github.io/alloc-clamped-array/) in Chrome Stable (69), I see it takes about 650+ ms on my machine. In a Release build of tip-of-tree with the fix in #22 applied, it's down to about 60 ms, which is comparable with Firefox on my machine. 10x speedup, as hoped for. So I think we're good here.
Blockedon: 890752
Status: Started (was: Fixed)
Status: Fixed (was: Started)
See also https://chromium-review.googlesource.com/1256211. I think we should be back in business.

Sign in to add a comment