Transferring an ArrayBuffer back and forth between window and worker seems to leak memory
Issue description

Chrome Version: 58.0.3029.18 (Official Build) dev (64-bit)
OS: Chrome OS [9334.10.0 (Official Build) dev-channel samus]

What steps will reproduce the problem?
(1) Go to https://darinf.github.io/worker_render/main.html
(2) Hit the "start" button
(3) Watch task manager

Memory usage should remain stable, but instead it seems to grow unbounded.
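A minimal sketch of the kind of main-thread ping-pong the testcase exercises (the actual main.html is not reproduced here; the buffer size and variable names are assumptions):

  const worker = new Worker('worker.js');
  let buffer = new ArrayBuffer(4 * 1024 * 1024); // size is a placeholder

  worker.onmessage = (event) => {
    // The worker hands the (transferred) buffer back; immediately bounce it again.
    buffer = event.data;
    worker.postMessage(buffer, [buffer]);
  };

  // Kick off the ping-pong.
  worker.postMessage(buffer, [buffer]);

Each bounce transfers (detaches) the buffer on the sending side, so in principle no copies of the backing store should accumulate.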
Mar 18 2017
Sigbjorn: Maybe the array buffer accounting is not working as expected (and thus V8 is not aware of the memory allocated by array buffers)? Or is it really leaking?
Mar 18 2017
The content_shell memory profile is flat with ToT (with https://codereview.chromium.org/2741793003/ in scope, which isn't on M58), but I do see a steady buildup with Canary 59.0.3044.0 (Windows).
Mar 18 2017
(I don't have access to my desktop until Tuesday, but) maybe it would be helpful to tweak Darin's example and force a GC at some point. If the memory is gone, it's a bug in the array buffer accounting. If the memory is not gone, it's a real memory leak.
Mar 18 2017
Yes, I can take a look at how this one comes about. Issues around a balanced Blink GC activation policy for these external (PartitionAlloc) allocation buildups are an ongoing topic - see issue 697192, issue 114548, issue 698981, and this one.
Mar 18 2017
The SerializedScriptValue allocation accounting is behaving correctly. The problem is that we don't have support for idle GCs on non-worker thread heaps.
Mar 18 2017
Too many negatives - no support on _worker_ threads is what I meant to write.
Mar 19 2017
> The problem is that we don't have support for idle GCs on worker thread heaps.

Would you elaborate on the problem a bit more? Technically I think we can easily add an idle GC for worker threads because Blink Scheduler supports idle tasks on non-main threads now.
Mar 19 2017
https://darinf.github.io/worker_leak/worker.js deserializes the message payload and creates an ArrayBuffer on the worker's heap. The worker then creates a Uint32Array view of that buffer, updates it, and posts it back. And repeats.

We do try to schedule idle GCs for the worker heap as part of the (moderate) allocations that it performs, but they go nowhere, per https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/heap/ThreadState.cpp?type=cs&q=scheduleIdleGC+package:%5Echromium$&l=649

If we wait long enough, a conservative or memory pressure GC might kick in for the worker. (I confirmed that the buildup disappeared if a precise GC was scheduled instead for every N of these failed idle GC attempts.)
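For reference, a rough reconstruction of the worker loop described above (illustrative only, not the actual worker.js; the fill value is a placeholder):

  self.onmessage = (event) => {
    const buffer = event.data;              // deserialized ArrayBuffer on the worker heap
    const uint32 = new Uint32Array(buffer); // a fresh view object every iteration
    uint32.fill(0xff00ff00);                // update the contents
    self.postMessage(buffer, [buffer]);     // transfer it back and wait for the next round
  };

Each round leaves short-lived wrapper objects (the view and the now-detached buffer handle) behind on the worker heap, and only a Blink GC on that thread will reclaim them.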
Mar 19 2017
Thanks!

> We do try to schedule idle GCs for the worker heap as part of the (moderate) allocations that it performs, but they go nowhere per https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/heap/ThreadState.cpp?type=cs&q=scheduleIdleGC+package:%5Echromium$&l=649

I guess we can now remove the !isMainThread check since idle tasks are supported on non-main threads.
Mar 19 2017
Awesome, didn't know that this had been added. The scheduled idle (worker) GC task doesn't get to run for this testcase, for some reason.
Mar 19 2017
It looks like IdleHelper::EnableLongIdlePeriod() never reaches a quiescent state for the worker, but the delayed task it posts to re-check the quiescence state later upsets that very state when it gets to run, so it just re-posts that re-checking task ad infinitum. In other words, if there are no other calls to EnableLongIdlePeriod() for the worker thread scheduler (I can't see any apart from the initial one), then no idle delayed tasks get to run, as far as I can tell. Should we create a new issue for enabling worker idle GCs?
Mar 19 2017
+Sami +Alex
Mar 20 2017
We'll take a look.
Mar 20 2017
The root cause is that we currently have a pretty arbitrary definition of idleness for workers. We deem a worker idle if no tasks* have run for 300ms. This example sends messages much more frequently than that. Maybe we should consider having short idle periods (somehow) on worker threads?

* On a monitored queue. The scheduler's control task queue isn't monitored.
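In pseudocode, the heuristic described here amounts to something like the following (not actual scheduler code; only the 300ms constant comes from the comment above):

  const QUIESCENCE_WINDOW_MS = 300;
  function workerSeemsIdle(lastMonitoredTaskEndMs, nowMs) {
    // A long idle period only starts once no task on a monitored queue
    // has run for the whole quiescence window.
    return nowMs - lastMonitoredTaskEndMs >= QUIESCENCE_WINDOW_MS;
  }

Because the testcase posts messages far more often than every 300ms, this predicate never becomes true, so no idle tasks (and therefore no idle GCs) get to run on the worker.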
Mar 20 2017
Everything should still work even if we don't have idle time - idle tasks are just an optional bonus for reducing the amount of inline GC. Since we're almost busy-looping here (?), running no idle tasks seems correct. Is it the lack of a V8 or an Oilpan GC that's hurting us here? Do things eventually recover thanks to an inline GC?
Mar 20 2017
This is on the Blink GC side - the itinerant garbage is cleared out if a precise Blink GC runs; see #10. For this testcase, any idle period only lines up with whenever the main thread enjoys one. We could track when a scheduled idle worker GC doesn't get to run, and become increasingly keen on scheduling a precise (or conservative) GC instead.
Mar 20 2017
Random idea: if we measured the average CPU utilization of the worker thread over time and designated the remaining time as idle, we could then start handing out idle periods, say, every 100ms.
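A sketch of that idea, assuming the scheduler can sample how much of each window the worker thread spent running tasks (the 100ms window comes from the comment; everything else is hypothetical):

  const WINDOW_MS = 100;
  function idleBudgetForNextWindow(busyMsInLastWindow) {
    // Whatever fraction of the last window was not spent running tasks
    // is handed out as the idle period for the next window.
    return Math.max(0, WINDOW_MS - busyMsInLastWindow);
  }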
Mar 20 2017
Given that this is not a memory leak and that Oilpan's conservative GC is actually triggered at some point (although it should be triggered a bit earlier), this wouldn't be a P1 issue, right?
Mar 21 2017
External memory is reported to V8 at a really low rate (~100K/second) and hovers at around a 20M peak.

./out/Release/chrome --user-data-dir=/tmp/bla --js-flags="--trace-gc --trace-gc-verbose" https://darinf.github.io/worker_leak/main.html

...
ThreadState::scheduleV8FollowupGCIfNeeded (gcType=MinorGC)
[1:0x38ec84c3000] 51610 ms: Scavenge 3.8 (9.5) -> 1.8 (9.5) MB, 0.1 / 0.0 ms allocation failure
[1:0x38ec84c3000] Memory allocator, used: 9728 KB, available: 2120192 KB
[1:0x38ec84c3000] New space, used: 0 KB, available: 2013 KB, committed: 4096 KB
[1:0x38ec84c3000] Old space, used: 680 KB, available: 0 KB, committed: 904 KB
[1:0x38ec84c3000] Code space, used: 1140 KB, available: 0 KB, committed: 1704 KB
[1:0x38ec84c3000] Map space, used: 66 KB, available: 0 KB, committed: 532 KB
[1:0x38ec84c3000] Large object space, used: 0 KB, available: 2119671 KB, committed: 0 KB
[1:0x38ec84c3000] All spaces, used: 1887 KB, available: 2121685 KB, committed: 7236 KB
[1:0x38ec84c3000] External memory reported: 20992 KB
[1:0x38ec84c3000] Total time spent in GC: 322.3 ms
ThreadState::scheduleGCIfNeeded
heapGrowingRate=100.0, partitionAllocGrowingRate=100.0
heapGrowingRate=100.0, partitionAllocGrowingRate=100.0
Scheduled ConservativeGC
ThreadHeap::collectGarbage (gcReason=ConservativeGC, lazySweeping=1, time=0.4ms)
[1:0x38ec84c3000] Fast promotion mode: false survival rate: 0%
ThreadState::scheduleV8FollowupGCIfNeeded (gcType=MinorGC)
[1:0x38ec84c3000] 51644 ms: Scavenge 3.8 (9.5) -> 1.8 (9.5) MB, 19.5 / 19.5 ms allocation failure
[1:0x38ec84c3000] Memory allocator, used: 9728 KB, available: 2120192 KB
[1:0x38ec84c3000] New space, used: 0 KB, available: 2013 KB, committed: 4096 KB
[1:0x38ec84c3000] Old space, used: 680 KB, available: 0 KB, committed: 904 KB
[1:0x38ec84c3000] Code space, used: 1140 KB, available: 0 KB, committed: 1704 KB
[1:0x38ec84c3000] Map space, used: 66 KB, available: 0 KB, committed: 532 KB
[1:0x38ec84c3000] Large object space, used: 0 KB, available: 2119671 KB, committed: 0 KB
[1:0x38ec84c3000] All spaces, used: 1887 KB, available: 2121685 KB, committed: 7236 KB
[1:0x38ec84c3000] External memory reported: 19253 KB
[1:0x38ec84c3000] Total time spent in GC: 341.8 ms
[1:0x38ec84c3000] Fast promotion mode: false survival rate: 0%

In V8, only Scavenges are triggered, which may imply conservative Blink GCs. The external memory limit of 64M that would trigger a full V8 GC (and also imply a precise Blink GC) is never reached. At around 20M of external memory (cf. the trace above), a conservative Blink GC kicks in and frees up only 1M. Why is it not able to free up more? What would it take to free up more? A precise Blink GC would release everything here.
Mar 21 2017
I let the testcase run for a couple of hours and observed a handful of precise and conservative GCs going ahead on the worker thread. The renderer process size had more than doubled when those kicked in - task manager's "private memory" size showed a sizable reduction after those (-20M), so they do appear effective. I think we should go ahead and enable idle (Blink) GCs on worker thread heaps, and then we can decide if we ought to support "idle-starved" workers better with respect to GC policy.
Mar 21 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/05567589ab894a0524ac8263250905484fc3845e

commit 05567589ab894a0524ac8263250905484fc3845e
Author: sigbjornf <sigbjornf@opera.com>
Date: Tue Mar 21 14:42:00 2017

Enable idle GC for worker thread heaps.

Idle task processing is now supported on threads other than the main thread, hence go ahead and enable idle GCs for worker thread heaps.

R=haraken
BUG=702902
Review-Url: https://codereview.chromium.org/2765843002
Cr-Commit-Position: refs/heads/master@{#458405}

[modify] https://crrev.com/05567589ab894a0524ac8263250905484fc3845e/third_party/WebKit/Source/platform/heap/ThreadState.cpp
Mar 21 2017
When trying out the testcase with V8 >= 5.9.61, the worker heap growth is considerably steeper, triggering conservative Blink GCs fairly often. hpayer@, ulan@: anything from https://chromium.googlesource.com/v8/v8/+log/a98d0e95..f8043964 that could explain that? (That's where a Blink bisect ends up: r458174.)
Mar 22 2017
Bisected to "Migrate %TypedArray%.prototype.fill to C++": https://chromium.googlesource.com/v8/v8/+/cb903e317338798fec3a43fea2ec285d3626c333 I don't know why that would cause increased heap growth. cbruni@, loorongjie@, any ideas?
Mar 22 2017
Could it be the "DisallowHeapAllocation no_gc" before std::fill? If %TypedArray%.prototype.fill is called very frequently, I am guessing this line will prevent the GC from doing more aggressive sweeping, though I am not familiar with V8's GC internals. https://cs.chromium.org/chromium/src/v8/src/elements.cc?l=2883
Mar 22 2017
loorongjie@, that place should be ok, since there is no allocation going on. I think I found a place that can lead to redundant heap number allocations. Left a comment in the CL: https://codereview.chromium.org/2735563002/#msg74
Mar 22 2017
As stated in the CL: per spec, we are currently forced to do the input conversion on every [[Set]]. I don't think this was fully intended. However, this only occurs on the slow path (i.e., when we don't use numbers as input values to fill).
Mar 22 2017
For the testcase's use of fill(), see https://darinf.github.io/worker_render/worker.js
Mar 22 2017
The "offending" line is:

uint32.fill(colors[Math.floor(Math.random() * 10)]);

colors.length is 8, causing a fill with undefined from time to time, which in turn makes us use the fully spec-compliant slow path. Filed a spec issue: https://github.com/tc39/ecma262/issues/855

In the meantime,

uint32.fill(colors[Math.floor(Math.random() * colors.length)]);

should do the trick.
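To make the failure mode concrete, a small illustration (the palette values are made up):

  const colors = [1, 2, 3, 4, 5, 6, 7, 8];               // colors.length === 8
  const value = colors[Math.floor(Math.random() * 10)];  // indexes 8 and 9 yield undefined
  new Uint32Array(4).fill(value);                        // undefined coerces to 0, but at the time a
                                                         // non-number argument forced the slow path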
Mar 22 2017
Very nice - the previous polyfill for fill() already implemented what the spec hopefully will allow, with less allocation churn.
Mar 22 2017
> The "offending" line is:
> uint32.fill(colors[Math.floor(Math.random() * 10)]);

Thanks for catching my bug! D'oh :-/
Mar 22 2017
All hail auto-coercion :) A fast path for this case is on the way: https://codereview.chromium.org/2769673002
Mar 22 2017
In case it is interesting, here's a version of the same test without the fill(undefined) calls: https://darinf.github.io/worker_leak_2/main.html It still uses more memory than I would expect, but I see the GC doing its job.
Mar 23 2017
Filed issue 704453 for any adjustments to idle GC handling for workers. It looks like we're about 'done' here.
Mar 24 2017
The following revision refers to this bug:
https://chromium.googlesource.com/v8/v8.git/+/a1f2239e0b7daff6bbf728d4f744ef6ee52f5299

commit a1f2239e0b7daff6bbf728d4f744ef6ee52f5299
Author: loorongjie <loorongjie@gmail.com>
Date: Fri Mar 24 22:43:35 2017

Move Oddball/String to %Typearray%.prototype.fill fast path

ToNumber for Oddball/String has no side-effect, no need to go through %Typearray%.prototype.fill slow path.

BUG=v8:5929, chromium:702902
Review-Url: https://codereview.chromium.org/2769673002
Cr-Commit-Position: refs/heads/master@{#44129}

[modify] https://crrev.com/a1f2239e0b7daff6bbf728d4f744ef6ee52f5299/src/elements.cc
[modify] https://crrev.com/a1f2239e0b7daff6bbf728d4f744ef6ee52f5299/test/mjsunit/es6/typedarray-fill.js
Apr 4 2017
The following revision refers to this bug:
https://chromium.googlesource.com/v8/v8.git/+/2b1b32253bd4f0d6d97ae6c804d8e66eb1bfd07b

commit 2b1b32253bd4f0d6d97ae6c804d8e66eb1bfd07b
Author: Loo Rong Jie <loorongjie@gmail.com>
Date: Tue Apr 04 12:51:56 2017

[typedarray] ToNumber coercion is done only once for TA.p.fill

Update according to new spec change at https://github.com/tc39/ecma262/pull/856
- Call ToNumber only once in BUILTIN
- Remove unused FillNumberSlowPath
- FillImpl assumes obj_value->IsNumber() is true
- Update test

Bug: v8:5929, chromium:702902
Change-Id: Ic83e6754d043582955b81c76e68f95e1c6b7e901
Reviewed-on: https://chromium-review.googlesource.com/465646
Reviewed-by: Daniel Ehrenberg <littledan@chromium.org>
Reviewed-by: Camillo Bruni <cbruni@chromium.org>
Commit-Queue: Camillo Bruni <cbruni@chromium.org>
Cr-Commit-Position: refs/heads/master@{#44373}

[modify] https://crrev.com/2b1b32253bd4f0d6d97ae6c804d8e66eb1bfd07b/src/builtins/builtins-typedarray.cc
[modify] https://crrev.com/2b1b32253bd4f0d6d97ae6c804d8e66eb1bfd07b/src/elements.cc
[modify] https://crrev.com/2b1b32253bd4f0d6d97ae6c804d8e66eb1bfd07b/test/mjsunit/es6/typedarray-fill.js
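The spec change is observable from script: with an object whose valueOf has side effects, the coercion count shows whether ToNumber ran per element or once per fill() call. A small illustration (assumes an engine with the new behaviour):

  let coercions = 0;
  const tricky = { valueOf() { coercions += 1; return 7; } };
  new Uint32Array(4).fill(tricky);
  // Under the old spec text the conversion could run on every [[Set]] (up to 4 times here);
  // after https://github.com/tc39/ecma262/pull/856 and this CL it runs exactly once.
  console.log(coercions);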
May 26 2017
(blink-worker triage) How is this?