
Issue 702902

Starred by 1 user

Issue metadata

Status: Fixed
Owner: ----
Closed: May 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug




Transferring ArrayBuffer back-n-forth between window and worker seems to leak memory

Project Member Reported by darin@chromium.org, Mar 18 2017

Issue description

Chrome Version: 58.0.3029.18 (Official Build) dev (64-bit)
OS: Chrome OS [9334.10.0 (Official Build) dev-channel samus]

What steps will reproduce the problem?
(1) Go to https://darinf.github.io/worker_render/main.html
(2) Hit the "start" button
(3) Watch task manager

Memory usage should remain stable, but instead it seems to grow unbounded.
 

Comment 1 by darin@chromium.org, Mar 18 2017

Here's a better, more reduced test case:
https://darinf.github.io/worker_leak/main.html
Cc: sigbjo...@opera.com jbroman@chromium.org nhiroki@chromium.org
Sigbjorn: Maybe the array buffer accounting is not working as expected (and thus V8 is not aware of the memory allocated by array buffers)? Or is it really leaking?


Labels: -Pri-3 Pri-1

Comment 4 by sigbjo...@opera.com, Mar 18 2017

content_shell memory profile is flat with ToT (with https://codereview.chromium.org/2741793003/ in scope, which isn't on M58), but I do see steady buildup with Canary 59.0.3044.0 (Windows).
(I don't have access to my desktop until Tuesday, but) maybe it would be helpful to tweak Darin's example and force a GC at some point. If the memory is gone, it's a bug in the array buffer accounting. If the memory is not gone, it's a real memory leak.

Comment 6 by sigbjo...@opera.com, Mar 18 2017

Yes, I can take a look at how this one enters.

Issues around a balanced Blink GC activation policy for these external (PA) allocation buildups are an ongoing topic - issue 697192, issue 114548, this, and issue 698981.

Comment 7 by sigbjo...@opera.com, Mar 18 2017

The SerializedScriptValue allocation accounting is behaving correctly. The problem is that we don't have support for idle GCs on non-worker thread heaps.

Comment 8 by sigbjo...@opera.com, Mar 18 2017

Too many negatives; no support on _worker_ threads was what I meant to write.
> The problem is that we don't have support for idle GCs on worker thread heaps.

Would you elaborate on the problem a bit more?

Technically I think we can easily add an idle GC for worker threads because Blink Scheduler supports idle tasks on non-main threads now.


https://darinf.github.io/worker_leak/worker.js deserializes the message payload and creates an ArrayBuffer on the worker's heap. The worker then creates an Uint32Array view of that buffer, updates it and posts it back. And repeats.
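The worker-side loop just described can be sketched as follows (an assumed shape for illustration, not the actual worker.js at the URL above):

```javascript
// Sketch of the worker-side loop described above (hypothetical shape):
// view the received buffer, update it, and transfer it back.
function updateBuffer(buffer) {
  const uint32 = new Uint32Array(buffer); // view over the transferred buffer
  uint32.fill(0xff00ff00);                // write a solid pixel color
  return buffer;
}

// Only wire up messaging when running inside a worker context.
if (typeof self !== "undefined" && typeof self.postMessage === "function") {
  self.onmessage = (e) => {
    // Transfer the buffer back to the window; no copy is made.
    self.postMessage(updateBuffer(e.data), [e.data]);
  };
}
```

Each iteration deserializes the payload into a fresh ArrayBuffer on the worker's heap, which is exactly the garbage that piles up when no GC runs on that thread.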

We do try to schedule idle GCs for the worker heap as part of the (moderate) allocations that it performs, but they go nowhere per https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/heap/ThreadState.cpp?type=cs&q=scheduleIdleGC+package:%5Echromium$&l=649

If we wait long enough, a conservative or memory pressure GC might kick in for the worker. (I confirmed that the buildup disappeared if a precise GC was scheduled instead for every N of these failed idle GC attempts.)
Thanks!

> We do try to schedule idle GCs for the worker heap as part of the (moderate) allocations that it performs, but they go nowhere per https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/heap/ThreadState.cpp?type=cs&q=scheduleIdleGC+package:%5Echromium$&l=649

I guess we can now remove the !isMainThread check since idle tasks are supported on non-main threads.

Awesome, didn't know that this had been added. The scheduled idle (worker) GC task doesn't get to run for this testcase, for some reason.
It looks like IdleHelper::EnableLongIdlePeriod() never reaches a quiescence state for the worker. The delayed task it posts to re-check the quiescence state later upsets that very state when it gets to run, so it just re-posts that rechecking task ad infinitum. In other words, if there are no other calls to EnableLongIdlePeriod() for the worker thread scheduler (I can't see any besides the initial one), then no idle delayed tasks get to run, as far as I can tell.
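The re-posting behavior can be modeled with a toy simulation (hypothetical names and a deliberate simplification, not the actual IdleHelper code):

```javascript
// Toy model of the quiescence re-check loop described above: the re-check
// task itself counts as a task on the monitored queue, so quiescence is
// never observed and the re-check is re-posted forever.
let tasksSinceLastCheck = 1; // some task has run since the last check
const pending = [];          // the worker's task queue
let idlePeriodStarted = false;

function enableLongIdlePeriod() {
  if (tasksSinceLastCheck === 0) {
    idlePeriodStarted = true; // quiescent: idle tasks may now run
    return;
  }
  tasksSinceLastCheck = 0;
  pending.push(enableLongIdlePeriod); // post a delayed re-check task
}

function runOnePostedTask() {
  const task = pending.shift();
  tasksSinceLastCheck++; // running the re-check upsets the very state
  task();                //   it is about to inspect
}

enableLongIdlePeriod(); // the initial call
for (let i = 0; i < 100; i++) runOnePostedTask();
// idlePeriodStarted stays false; pending always holds one re-check task
```

The fix direction discussed later is to either not count the re-check task as activity or to grant idle periods some other way.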

Should we create a new issue for enabling worker idle GCs?
Cc: skyos...@chromium.org alexclarke@chromium.org
+Sami +Alex

We'll take a look.
The root cause is we currently have a pretty arbitrary definition of idleness for workers.  We deem a worker idle if no tasks* have run for 300ms.  This example is sending messages much more frequently than that.  Maybe we should consider having short idle periods (somehow) on worker threads?

* On a monitored queue. The scheduler's control task queue isn't monitored.
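The 300 ms heuristic can be sketched as follows (an assumed simplification for illustration, not the scheduler's actual code):

```javascript
// Sketch of the worker idleness heuristic described above: a worker is
// deemed idle only after no monitored task has run for 300 ms.
const IDLE_DELAY_MS = 300;

function makeIdleDetector(now = Date.now) {
  let lastTaskTime = now();
  return {
    onTaskRun() { lastTaskTime = now(); }, // a monitored task just ran
    isIdle() { return now() - lastTaskTime >= IDLE_DELAY_MS; },
  };
}
// A worker pumping messages every few milliseconds never crosses the
// threshold, so idle GC tasks never get a chance to run.
```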
Cc: hpayer@chromium.org u...@chromium.org
Everything should still work even if we don't have idle time -- idle tasks are just an optional bonus for reducing the amount of inline GC. Since we're almost busy-looping here (?), running no idle tasks seems correct.

Is it the lack of a V8 or an Oilpan GC that's hurting us here? Do things eventually recover thanks to an inline GC?
Components: Blink>JavaScript>GC Blink>Scheduling
This is on the Blink GC side - itinerant garbage is cleared out if a precise Blink GC runs; see #10.

For this testcase, any idle period matches up with whenever the main thread enjoys one. We could track if a scheduled idle worker GC doesn't get to run, and become increasingly keen on scheduling a precise (or conservative) GC.
Random idea: if we measured the average CPU utilization of the worker thread over time and designated the remaining time as idle, we could then start handing out idle periods, say, every 100ms.
Status: Available (was: Untriaged)
Given that this is not a memory leak and that Oilpan's conservative GC is actually triggered at some point (although it should be triggered a bit earlier), this wouldn't be a P1 issue, right?

External memory is reported to V8 at a really low rate, ~100K/second, and will hover at around a 20M peak.

./out/Release/chrome --user-data-dir=/tmp/bla --js-flags="--trace-gc --trace-gc-verbose" https://darinf.github.io/worker_leak/main.html

...
ThreadState::scheduleV8FollowupGCIfNeeded (gcType=MinorGC)
[1:0x38ec84c3000]    51610 ms: Scavenge 3.8 (9.5) -> 1.8 (9.5) MB, 0.1 / 0.0 ms  allocation failure 
[1:0x38ec84c3000] Memory allocator,   used:   9728 KB, available: 2120192 KB
[1:0x38ec84c3000] New space,          used:      0 KB, available:   2013 KB, committed:   4096 KB
[1:0x38ec84c3000] Old space,          used:    680 KB, available:      0 KB, committed:    904 KB
[1:0x38ec84c3000] Code space,         used:   1140 KB, available:      0 KB, committed:   1704KB
[1:0x38ec84c3000] Map space,          used:     66 KB, available:      0 KB, committed:    532 KB
[1:0x38ec84c3000] Large object space, used:      0 KB, available: 2119671 KB, committed:      0 KB
[1:0x38ec84c3000] All spaces,         used:   1887 KB, available: 2121685 KB, committed:   7236KB
[1:0x38ec84c3000] External memory reported:  20992 KB
[1:0x38ec84c3000] Total time spent in GC  : 322.3 ms
ThreadState::scheduleGCIfNeeded
heapGrowingRate=100.0, partitionAllocGrowingRate=100.0
heapGrowingRate=100.0, partitionAllocGrowingRate=100.0
Scheduled ConservativeGC
ThreadHeap::collectGarbage (gcReason=ConservativeGC, lazySweeping=1, time=0.4ms)
[1:0x38ec84c3000] Fast promotion mode: false survival rate: 0%
ThreadState::scheduleV8FollowupGCIfNeeded (gcType=MinorGC)
[1:0x38ec84c3000]    51644 ms: Scavenge 3.8 (9.5) -> 1.8 (9.5) MB, 19.5 / 19.5 ms  allocation failure 
[1:0x38ec84c3000] Memory allocator,   used:   9728 KB, available: 2120192 KB
[1:0x38ec84c3000] New space,          used:      0 KB, available:   2013 KB, committed:   4096 KB
[1:0x38ec84c3000] Old space,          used:    680 KB, available:      0 KB, committed:    904 KB
[1:0x38ec84c3000] Code space,         used:   1140 KB, available:      0 KB, committed:   1704KB
[1:0x38ec84c3000] Map space,          used:     66 KB, available:      0 KB, committed:    532 KB
[1:0x38ec84c3000] Large object space, used:      0 KB, available: 2119671 KB, committed:      0 KB
[1:0x38ec84c3000] All spaces,         used:   1887 KB, available: 2121685 KB, committed:   7236KB
[1:0x38ec84c3000] External memory reported:  19253 KB
[1:0x38ec84c3000] Total time spent in GC  : 341.8 ms
[1:0x38ec84c3000] Fast promotion mode: false survival rate: 0%

In V8, only Scavenges are triggered, which may imply conservative Blink GCs. The external memory limit of 64M, which would trigger a full GC and also imply a precise Blink GC, is never reached.

At around 20M external memory (cf. trace above) a conservative Blink GC kicks in and frees up only 1M. Why is it not able to free up more? What would it take to free up more? A precise Blink GC would release everything here.
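The threshold behavior described above amounts to a simple check (a sketch with the 64M figure taken from this comment, not from V8 source constants):

```javascript
// Sketch of the external-memory trigger described above: a full V8 GC
// (which would also imply a precise Blink GC) fires only once reported
// external memory exceeds ~64M. This testcase peaks around 20M, so only
// Scavenges run and the precise collection never happens.
const kExternalAllocationLimit = 64 * 1024 * 1024; // assumed 64M threshold

function triggersFullGC(reportedExternalBytes) {
  return reportedExternalBytes >= kExternalAllocationLimit;
}
```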

I let the testcase run for a couple of hours and observed a handful of precise and conservative GCs going ahead on the worker thread. The renderer process size had more than doubled when those kicked in - task manager's "private memory" size showed a sizable reduction after those (-20M), so they do appear effective.

I think we should go ahead and enable idle (Blink) GCs on worker thread heaps, and then we can decide if we ought to support "idle-starved" workers better wrt GC policy.



Comment 25 by bugdroid1@chromium.org, Mar 21 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/05567589ab894a0524ac8263250905484fc3845e

commit 05567589ab894a0524ac8263250905484fc3845e
Author: sigbjornf <sigbjornf@opera.com>
Date: Tue Mar 21 14:42:00 2017

Enable idle GC for worker thread heaps.

Idle task processing is now supported on threads other than
the main thread, hence go ahead and enable idle GCs for
worker thread heaps.

R=haraken
BUG= 702902 

Review-Url: https://codereview.chromium.org/2765843002
Cr-Commit-Position: refs/heads/master@{#458405}

[modify] https://crrev.com/05567589ab894a0524ac8263250905484fc3845e/third_party/WebKit/Source/platform/heap/ThreadState.cpp

When trying out the testcase with v8 >= 5.9.61, the worker heap growth is considerably steeper, triggering Blink conservative GCs fairly often.

hpayer@ ulan@: anything from https://chromium.googlesource.com/v8/v8/+log/a98d0e95..f8043964 that could explain that? (i.e., that's where a Blink bisect ends up at, r458174.)

Comment 27 by u...@chromium.org, Mar 22 2017

Cc: loorong...@gmail.com cbruni@chromium.org
Bisected to "Migrate %TypedArray%.prototype.fill to C++": https://chromium.googlesource.com/v8/v8/+/cb903e317338798fec3a43fea2ec285d3626c333

I don't know why that would cause increased heap growth. cbruni@, loorongjie@, any ideas?



Could it be the "DisallowHeapAllocation no_gc" before std::fill? If %TypedArray%.prototype.fill is called very frequently, I am guessing this line will prevent the GC from doing more aggressive sweeping, though I am not familiar with V8's GC internals.

https://cs.chromium.org/chromium/src/v8/src/elements.cc?l=2883

Comment 29 by u...@chromium.org, Mar 22 2017

loorongjie@, that place should be ok, since there is no allocation going on.

I think I found a place that can lead to redundant heap number allocations. Left a comment in the CL:
https://codereview.chromium.org/2735563002/#msg74
As stated in the CL: as per spec, we are currently forced to do the input conversion on every [[Set]]. I don't think this was fully intended.
However, this only occurs on the slow path (i.e., when we don't use numbers as input values to fill).
See the testcase's use of fill() at https://darinf.github.io/worker_render/worker.js
The "offending" line is:
uint32.fill(colors[Math.floor(Math.random() * 10)]);

colors.length is 8, causing a fill with undefined from time to time, which in turn makes us use the fully spec-compliant slow path.

Filed a spec issue: https://github.com/tc39/ecma262/issues/855
In the meantime, uint32.fill(colors[Math.floor(Math.random() * colors.length)]); should do the trick.
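The off-by-range read and its coercion behavior can be seen in miniature (illustrative color values; the actual palette is in the testcase):

```javascript
// The testcase's bug in miniature: colors has 8 entries, but the random
// index ranges over [0, 10), so indices 8 and 9 read undefined, and
// fill(undefined) forces the fully spec-compliant slow path.
const colors = [0xff0000, 0x00ff00, 0x0000ff, 0xffff00,
                0xff00ff, 0x00ffff, 0xffffff, 0x000000]; // length 8
const uint32 = new Uint32Array(4);
uint32.fill(colors[9]); // colors[9] is undefined
// ToNumber(undefined) is NaN, which a Uint32Array stores as 0
```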
Very nice; the previous polyfill for fill() implemented what the spec hopefully will allow, with less allocation churn.

Comment 34 by darin@chromium.org, Mar 22 2017

> The "offending" line is:
> uint32.fill(colors[Math.floor(Math.random() * 10)]);

Thanks for catching my bug! D'oh :-/
All hail auto-coercion :)
Fast-path for this case is on the way: https://codereview.chromium.org/2769673002

Comment 36 by darin@chromium.org, Mar 22 2017

In case it is interesting, here's a version of the same test w/o the fill(undefined) calls: https://darinf.github.io/worker_leak_2/main.html

It still uses more memory than I would expect, but I see the GC doing its job.
Filed issue 704453 for any adjustments to idle GC handling for workers.

It looks like we're about 'done' here.

Comment 38 by bugdroid1@chromium.org, Mar 24 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/v8/v8.git/+/a1f2239e0b7daff6bbf728d4f744ef6ee52f5299

commit a1f2239e0b7daff6bbf728d4f744ef6ee52f5299
Author: loorongjie <loorongjie@gmail.com>
Date: Fri Mar 24 22:43:35 2017

Move Oddball/String to %Typearray%.prototype.fill fast path

ToNumber for Oddball/String has no side-effect, no need to go
through %Typearray%.prototype.fill slow path.

BUG= v8:5929 , chromium:702902 

Review-Url: https://codereview.chromium.org/2769673002
Cr-Commit-Position: refs/heads/master@{#44129}

[modify] https://crrev.com/a1f2239e0b7daff6bbf728d4f744ef6ee52f5299/src/elements.cc
[modify] https://crrev.com/a1f2239e0b7daff6bbf728d4f744ef6ee52f5299/test/mjsunit/es6/typedarray-fill.js

Components: Blink>Workers

Comment 40 by bugdroid1@chromium.org, Apr 4 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/v8/v8.git/+/2b1b32253bd4f0d6d97ae6c804d8e66eb1bfd07b

commit 2b1b32253bd4f0d6d97ae6c804d8e66eb1bfd07b
Author: Loo Rong Jie <loorongjie@gmail.com>
Date: Tue Apr 04 12:51:56 2017

[typedarray] ToNumber coercion is done only once for TA.p.fill

Update according to new spec change at
https://github.com/tc39/ecma262/pull/856

- Call ToNumber only once in BUILTIN
- Remove unused FillNumberSlowPath
- FillImpl assumes obj_value->IsNumber() is true
- Update test

Bug: v8:5929 , chromium:702902 

Change-Id: Ic83e6754d043582955b81c76e68f95e1c6b7e901
Reviewed-on: https://chromium-review.googlesource.com/465646
Reviewed-by: Daniel Ehrenberg <littledan@chromium.org>
Reviewed-by: Camillo Bruni <cbruni@chromium.org>
Commit-Queue: Camillo Bruni <cbruni@chromium.org>
Cr-Commit-Position: refs/heads/master@{#44373}
[modify] https://crrev.com/2b1b32253bd4f0d6d97ae6c804d8e66eb1bfd07b/src/builtins/builtins-typedarray.cc
[modify] https://crrev.com/2b1b32253bd4f0d6d97ae6c804d8e66eb1bfd07b/src/elements.cc
[modify] https://crrev.com/2b1b32253bd4f0d6d97ae6c804d8e66eb1bfd07b/test/mjsunit/es6/typedarray-fill.js

(blink-worker triage)
How is this?
Status: Fixed (was: Available)
