Initialization of memory provided by PartitionAlloc is slow
Reported by
ydor...@gmail.com,
Jul 17
|
||||||||||||||
Issue descriptionUserAgent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36 Steps to reproduce the problem: Run the attached index.html (or navigate to https://algotec.github.io/alloc-clamped-array/), click on 'Run test'. The tester allocates 1000 1MB array buffers, stores their references in an array, and measures the amount of time it took. What is the expected behavior? Chrome memory allocation time is similar or better than other browsers What went wrong? Chrome allocation time is significantly worse in all cases except one (Edge, Win10). See the attached excel for a detailed comparison with other browsers. It varies between 1.5-2 (IE11) to 50-70 faster (Firefox) Did this work before? N/A Does this work in other browsers? Yes Chrome version: 67.0.3396.99 Channel: stable OS Version: 6.1 (Windows 7, Windows Server 2008 R2) Flash Version:
,
Jul 17
,
Jul 18
ydorman@ Thanks for the issue. Able to reproduce this issue on Windows 10, Mac OS 10.13.3 and Ubuntu 17.10 on the reported version 67.0.3396.99 and the latest Canary 69.0.3493 as per the original comment. Attached is the screen shot for reference. This is a Non-Regression issue as this behavior is observed from M-60 chrome builds. Hence marking this as Untriaged for further updates from Dev. Thanks..
,
Jul 31
,
Aug 1
A huge amount of the overhead is coming from initializing the memory to 0. I verified this by commenting out the memset here: https://cs.chromium.org/chromium/src/third_party/blink/renderer/platform/wtf/typed_arrays/array_buffer_contents.cc?l=115&rcl=90be568fb49ceac0dfad0783d293bf31baccebe1 Without the memset, the time goes from ~500ms to ~100ms. Firefox is ~250ms for this benchmark. Clearly we need the memory to be set to 0 for correctness. When I run this same benchmark in d8, which uses malloc/calloc/free for the ArrayBufferAllocator implementation, the result is ~16ms. That indicates to me that calloc does something smart - maybe pre-zeroed chunks or something. Maybe it would be possible for PartitionAlloc to provide an interface to allocate initialized memory? Then it could do something under the hood to speed up these cases. Also open to other ideas. Adding folks who might be able to help.
,
Aug 1
,
Aug 1
This is interesting. Palmer's experiment (https://chromium-review.googlesource.com/c/chromium/src/+/1134523) is saying that adding one more zapping to the memory region did not regress performance. That contradicts with the observation of this bug... Palmer: Do you have any thoughts?
,
Aug 1
Actually, for an allocation this big, d8 uses mmap with MAP_ANONYMOUS, meaning the memory will be initialized to 0. The mmap call is here: https://cs.chromium.org/chromium/src/v8/src/base/platform/platform-posix.cc?l=143&rcl=c1e3354d56b2da398aa06a2429a09e7b878fe8eb Which is called in a large chain from here: https://cs.chromium.org/chromium/src/v8/src/d8.cc?l=94&rcl=c1e3354d56b2da398aa06a2429a09e7b878fe8eb
,
Aug 1
My CL zaps memory on *freeing*, not on allocation, which wouldn't affect this benchmark. (And, apparently, not the others. :) ) I agree with #5 that a good way to handle this would be for PartitionAlloc to have an interface allowing the caller to get the memory zero'd, and then for the implementation to optimize by doing nothing if the allocation was satisfied with mmap. The way to do this would be to define a new flag for `PartitionAllocGenericFlags`. https://cs.chromium.org/chromium/src/base/allocator/partition_allocator/partition_alloc.h?l=316 haraken, do you want to do this or shall I?
,
Aug 2
Regarding this benchmark, yes. Your zapping on freeing won't affect the benchmark. However, the benchmark result is saying that "a memset overhead on allocation is huge". My question is: why can we argue that "a memset overhead on freeing is negligible" while "a memset overhead on allocation is huge"?
,
Aug 2
#10: Well, a memset on allocation is huge *in this particular allocation pattern*. Whether real applications really allocate 1,000 1 MiB regions, *and do so in their critical path*, is questionable. This bug is about optimizing for a benchmark, which I am normally against, but since the optimization is nice to have and since we can probably implement with not too much difficulty, I'm not against it. And who knows, it might benefit some real applications. I do also suspect allocations are more likely to be in the critical path for real applications than frees are. For example, when you close a tab and a million frees happen, it's pretty well isolated from your other tabs (especially now with Site Isolation, which creates more and shorter-lived processes).
,
Aug 3
Thanks for the clarification! Your analysis sounds reasonable to me. +1 to implementing a new flag for PartitionAllocGenericFlags as an optimization for this case.
,
Aug 3
palmer@ about #11: We opened this bug due to a real world scenario. We develop a medical imaging application, and in some scenarios the user may scroll very quickly through many different images spread out on 2 to 4 large monitors (each up to 5MegaPixel). In some cases, we need to process 200 images per second coming from the server, using a web socket. When trying to optimize this scenario in different ways we noticed this bug. Obviously, if we could save 1ms per image then we'd free up 200ms per second that can be used to decompress and display images.
,
Aug 3
,
Aug 27
,
Sep 7
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b60e5cc0bd4b351c5898dc3637f537c355546c11 commit b60e5cc0bd4b351c5898dc3637f537c355546c11 Author: Chris Palmer <palmer@chromium.org> Date: Fri Sep 07 00:51:36 2018 [PartitionAlloc] Add a flag to allow callers to zero-fill allocations. This enables callers to avoid doing it themselves (centralizing the `memset` in 1 place, rather than duplicating it at every call site that wants it), and to allow the PartitionAlloc implementation to avoid doing it itself if the operating system has already zero'd the new page(s). Bug: 864462 , 680657 Change-Id: Ibeae0bf7b9f20238334bf9bc022d0f243a77ef11 Reviewed-on: https://chromium-review.googlesource.com/1162825 Commit-Queue: Chris Palmer <palmer@chromium.org> Reviewed-by: Kentaro Hara <haraken@chromium.org> Reviewed-by: Albert J. Wong <ajwong@chromium.org> Cr-Commit-Position: refs/heads/master@{#589389} [modify] https://crrev.com/b60e5cc0bd4b351c5898dc3637f537c355546c11/base/allocator/partition_allocator/partition_alloc.cc [modify] https://crrev.com/b60e5cc0bd4b351c5898dc3637f537c355546c11/base/allocator/partition_allocator/partition_alloc.h [modify] https://crrev.com/b60e5cc0bd4b351c5898dc3637f537c355546c11/base/allocator/partition_allocator/partition_alloc_constants.h [modify] https://crrev.com/b60e5cc0bd4b351c5898dc3637f537c355546c11/base/allocator/partition_allocator/partition_alloc_unittest.cc [modify] https://crrev.com/b60e5cc0bd4b351c5898dc3637f537c355546c11/base/allocator/partition_allocator/partition_bucket.cc [modify] https://crrev.com/b60e5cc0bd4b351c5898dc3637f537c355546c11/base/allocator/partition_allocator/partition_cookie.h [modify] https://crrev.com/b60e5cc0bd4b351c5898dc3637f537c355546c11/base/allocator/partition_allocator/partition_page.h
,
Sep 7
Status update: https://chromium-review.googlesource.com/1162825 creates the new interface https://chromium-review.googlesource.com/c/chromium/src/+/1212443 uses it in 2 call sites (although there may be other places we can use it, too) But we don't yet have the actual optimization in place, inside Partition Alloc. I have a draft CL (not uploaded yet) that doesn't work yet, and I'll try to hammer that out. The innards of the PA implementation are always enjoyably tricky...
,
Sep 7
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/f94b670562073c228efaf1f32a20708dc0425db3 commit f94b670562073c228efaf1f32a20708dc0425db3 Author: Findit <findit-for-me@appspot.gserviceaccount.com> Date: Fri Sep 07 01:37:54 2018 Revert "[PartitionAlloc] Add a flag to allow callers to zero-fill allocations." This reverts commit b60e5cc0bd4b351c5898dc3637f537c355546c11. Reason for revert: Findit (https://goo.gl/kROfz5) identified CL at revision 589389 as the culprit for failures in the build cycles as shown on: https://findit-for-me.appspot.com/waterfall/culprit?key=ag9zfmZpbmRpdC1mb3ItbWVyRAsSDVdmU3VzcGVjdGVkQ0wiMWNocm9taXVtL2I2MGU1Y2MwYmQ0YjM1MWM1ODk4ZGMzNjM3ZjUzN2MzNTU1NDZjMTEM Sample Failed Build: https://ci.chromium.org/buildbot/chromium.linux/Cast%20Audio%20Linux/20069 Sample Failed Step: base_unittests Original change's description: > [PartitionAlloc] Add a flag to allow callers to zero-fill allocations. > > This enables callers to avoid doing it themselves (centralizing the `memset` in > 1 place, rather than duplicating it at every call site that wants it), and to > allow the PartitionAlloc implementation to avoid doing it itself if the > operating system has already zero'd the new page(s). > > Bug: 864462 , 680657 > Change-Id: Ibeae0bf7b9f20238334bf9bc022d0f243a77ef11 > Reviewed-on: https://chromium-review.googlesource.com/1162825 > Commit-Queue: Chris Palmer <palmer@chromium.org> > Reviewed-by: Kentaro Hara <haraken@chromium.org> > Reviewed-by: Albert J. Wong <ajwong@chromium.org> > Cr-Commit-Position: refs/heads/master@{#589389} Change-Id: Ifd54700a2acb141c458df69304d7ccdce2719cc5 No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: 864462 , 680657 Reviewed-on: https://chromium-review.googlesource.com/1212346 Cr-Commit-Position: refs/heads/master@{#589404} [modify] https://crrev.com/f94b670562073c228efaf1f32a20708dc0425db3/base/allocator/partition_allocator/partition_alloc.cc [modify] https://crrev.com/f94b670562073c228efaf1f32a20708dc0425db3/base/allocator/partition_allocator/partition_alloc.h [modify] https://crrev.com/f94b670562073c228efaf1f32a20708dc0425db3/base/allocator/partition_allocator/partition_alloc_constants.h [modify] https://crrev.com/f94b670562073c228efaf1f32a20708dc0425db3/base/allocator/partition_allocator/partition_alloc_unittest.cc [modify] https://crrev.com/f94b670562073c228efaf1f32a20708dc0425db3/base/allocator/partition_allocator/partition_bucket.cc [modify] https://crrev.com/f94b670562073c228efaf1f32a20708dc0425db3/base/allocator/partition_allocator/partition_cookie.h [modify] https://crrev.com/f94b670562073c228efaf1f32a20708dc0425db3/base/allocator/partition_allocator/partition_page.h
,
Sep 7
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/7bd9d693667d585c44d2d29666860c678c19bc83 commit 7bd9d693667d585c44d2d29666860c678c19bc83 Author: Chris Palmer <palmer@chromium.org> Date: Fri Sep 07 23:27:10 2018 [PartitionAlloc] Add a flag to allow callers to zero-fill allocations. This enables callers to avoid doing it themselves (centralizing the `memset` in 1 place, rather than duplicating it at every call site that wants it), and to allow the PartitionAlloc implementation to avoid doing it itself if the operating system has already zero'd the new page(s). This is a re-land, with fixes, of https://chromium-review.googlesource.com/c/chromium/src/+/1162825. Bug: 864462 , 680657 TBR: ajwong,haraken Change-Id: Icfe05bca82c79c8a8029db40b2902271d3d10347 Reviewed-on: https://chromium-review.googlesource.com/1214173 Reviewed-by: Chris Palmer <palmer@chromium.org> Commit-Queue: Chris Palmer <palmer@chromium.org> Cr-Commit-Position: refs/heads/master@{#589717} [modify] https://crrev.com/7bd9d693667d585c44d2d29666860c678c19bc83/base/allocator/partition_allocator/partition_alloc.cc [modify] https://crrev.com/7bd9d693667d585c44d2d29666860c678c19bc83/base/allocator/partition_allocator/partition_alloc.h [modify] https://crrev.com/7bd9d693667d585c44d2d29666860c678c19bc83/base/allocator/partition_allocator/partition_alloc_constants.h [modify] https://crrev.com/7bd9d693667d585c44d2d29666860c678c19bc83/base/allocator/partition_allocator/partition_alloc_unittest.cc [modify] https://crrev.com/7bd9d693667d585c44d2d29666860c678c19bc83/base/allocator/partition_allocator/partition_bucket.cc [modify] https://crrev.com/7bd9d693667d585c44d2d29666860c678c19bc83/base/allocator/partition_allocator/partition_cookie.h [modify] https://crrev.com/7bd9d693667d585c44d2d29666860c678c19bc83/base/allocator/partition_allocator/partition_page.h
,
Sep 8
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/03cfb8fd0e4558e6b2d1bd9570441dfe412f3202 commit 03cfb8fd0e4558e6b2d1bd9570441dfe412f3202 Author: Chris Palmer <palmer@chromium.org> Date: Sat Sep 08 06:35:23 2018 [PartitionAlloc] Fix a crash when allocation fails and PartitionAllocZeroFill is used. Bug: 864462 TBR: ajwong,haraken Change-Id: Ia32bd6ec4aad86c0581a0512cabeb658546d6327 Reviewed-on: https://chromium-review.googlesource.com/1214692 Commit-Queue: Chris Palmer <palmer@chromium.org> Reviewed-by: Chris Palmer <palmer@chromium.org> Cr-Commit-Position: refs/heads/master@{#589774} [modify] https://crrev.com/03cfb8fd0e4558e6b2d1bd9570441dfe412f3202/base/allocator/partition_allocator/partition_alloc.h
,
Sep 10
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/d96ca641e402036a04ff8c80a075aead40872437 commit d96ca641e402036a04ff8c80a075aead40872437 Author: Chris Palmer <palmer@chromium.org> Date: Mon Sep 10 22:05:12 2018 Use the new PartitionAlloc API to zero-fill allocations. Also embellish the PA API a bit to make accessing the new zero-fill functionality more usable. There are likely additional call sites that can make use of this functionality. Bug: 864462 Change-Id: I8881c5774b6083b210f5160d4b4f2ac345f2467e Reviewed-on: https://chromium-review.googlesource.com/1212443 Commit-Queue: Chris Palmer <palmer@chromium.org> Reviewed-by: Kentaro Hara <haraken@chromium.org> Reviewed-by: Hongchan Choi <hongchan@chromium.org> Cr-Commit-Position: refs/heads/master@{#590081} [modify] https://crrev.com/d96ca641e402036a04ff8c80a075aead40872437/base/allocator/partition_allocator/partition_alloc.h [modify] https://crrev.com/d96ca641e402036a04ff8c80a075aead40872437/third_party/blink/renderer/platform/audio/audio_array.h [modify] https://crrev.com/d96ca641e402036a04ff8c80a075aead40872437/third_party/blink/renderer/platform/wtf/allocator/partitions.h [modify] https://crrev.com/d96ca641e402036a04ff8c80a075aead40872437/third_party/blink/renderer/platform/wtf/typed_arrays/array_buffer_contents.cc
,
Sep 28
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/f369a29a01b3d8e8269afae32157f3e4e7ec7e3f commit f369a29a01b3d8e8269afae32157f3e4e7ec7e3f Author: Chris Palmer <palmer@chromium.org> Date: Fri Sep 28 21:19:40 2018 [PartitionAlloc] Optimize the `PartitionAllocZeroFill` option. When we have satisfied the allocation with a new OS page, we can skip the `memset`. Bug: 864462 Change-Id: Ib13539a83a931dbad3e588001aa248ff994d3900 Reviewed-on: https://chromium-review.googlesource.com/1213957 Commit-Queue: Chris Palmer <palmer@chromium.org> Reviewed-by: Kentaro Hara <haraken@chromium.org> Cr-Commit-Position: refs/heads/master@{#595218} [modify] https://crrev.com/f369a29a01b3d8e8269afae32157f3e4e7ec7e3f/base/allocator/partition_allocator/partition_alloc.h [modify] https://crrev.com/f369a29a01b3d8e8269afae32157f3e4e7ec7e3f/base/allocator/partition_allocator/partition_alloc_unittest.cc [modify] https://crrev.com/f369a29a01b3d8e8269afae32157f3e4e7ec7e3f/base/allocator/partition_allocator/partition_bucket.cc [modify] https://crrev.com/f369a29a01b3d8e8269afae32157f3e4e7ec7e3f/base/allocator/partition_allocator/partition_bucket.h [modify] https://crrev.com/f369a29a01b3d8e8269afae32157f3e4e7ec7e3f/base/allocator/partition_allocator/partition_root_base.h
,
Sep 28
Looking at the test app (https://algotec.github.io/alloc-clamped-array/) in Chrome Stable (69), I see it takes about 650+ ms on my machine. In a Release build of tip-of-tree with the fix in #22 applied, it's down to about 60 ms, which is comparable with Firefox on my machine. 10x speedup, as hoped for. So I think we're good here.
,
Oct 1
,
Oct 1
,
Oct 3
See also https://chromium-review.googlesource.com/1256211. I think we should be back in business. |
||||||||||||||
►
Sign in to add a comment |
||||||||||||||
Comment 1 by dtapu...@chromium.org
, Jul 17