new ImageData() in a for loop causes memory growing crazy
Issue description
Run the attached HTML file and observe memory in Chrome's task manager. Memory climbs to around 2GB and stays at 2GB. Is this expected behavior? Shouldn't garbage collection be triggered, since the image data is not referenced by anything? jochen@ and haraken@: could you please take a look? Thanks.
,
Jul 6 2016
Repros on ToT. V8 side of repro:
(1) AdjustAmountOfExternalAllocatedMemory gets called for each 100M image data added.
(2) At ~200M we trigger incremental marking.
(3) Each further ImageData allocation adds more external memory, triggering incremental marking steps.
(3a) We don't make progress fast enough, so at some point we sit idle at ~2GB.
(4) At some point V8 is idle long enough, triggering the finalization of incremental marking in idle time. Finalization results in a major GC (Mark-Compact).
(5) V8 follows up with 3 memory reducing GCs as the page gets idle.
Leaving aside that we could do better in (2)->(4), e.g. do a major GC earlier, this is expected behavior. I would assume that global handles are processed asynchronously after the tasks are created in (4). However, I don't see reporting of negative amounts, so I assume something is wrong with handle processing. (WTF::ArrayBufferContents::~DataHolder calls AdjustAmountOfExternalAllocatedMemory)
,
Jul 6 2016
,
Jul 11 2016
I confirmed that WTF::ArrayBufferContents::~DataHolder and the corresponding AdjustAmountOfExternalAllocatedMemory with negative values are called as expected, although it takes a long time for them to get called. I guess the problem is that we don't run GC often. Once GC runs, the objects get destroyed, and then WTF::ArrayBufferContents::~DataHolder and the corresponding AdjustAmountOfExternalAllocatedMemory get called. Unless GC runs, Blink doesn't call AdjustAmountOfExternalAllocatedMemory with negative values.
,
Jul 13 2016
yukishiino@: I ran the example on ToT. Memory usage stayed at 2GB for more than 2 minutes, at which point I killed the window. It appears to me that GC never runs, which is strange.
,
Jul 13 2016
,
Jul 14 2016
mlippautz@, do you have any ideas why GC doesn't run for a long time?
,
Jul 14 2016
#5: Can you clarify what "It appears to me that GC never runs..." means? Have you verified that it does not run with --js-flags="--trace-gc --trace-gc-verbose"? Trying the testcase from #0 I see 3 V8 GCs running. A regular MC and 3 memory-reducing MC. What we seem to miss is the round-trip through the oilpan heap.
,
Jul 14 2016
,
Jul 14 2016
mlippautz@: I just ran it with --js-flags="--trace-gc --trace-gc-verbose", but again, memory usage sat at 2GB for about 2 minutes and I killed the window. I do see 3 GC runs, but they don't seem to reduce memory usage. Here is the output on my console:
[1:0x17c69a153000] Memory reducer: call rate 0.000, low alloc, foreground
[1:0x17c69a153000] Memory reducer: started GC #1
[1:0x17c69a153000] Heap growing factor 1.1 based on mu=0.970, speed_ratio=387474 (gc=387474, mutator=1)
[1:0x17c69a153000] Grow: old size: 1916 KB, new limit: 11116 KB (1.1)
[1:0x17c69a153000] Memory reducer: finished GC #1 (will do more)
[1:0x17c69a153000] 9299 ms: Mark-sweep 1.9 (9.0) -> 1.9 (8.0) MB, 2.7 / 0.0 ms (+ 4.4 ms in 4 steps since start of marking, biggest step 2.3 ms) [idle notification: finalize incremental marking] [GC in old space requested].
[1:0x17c69a153000] Memory allocator, used: 8232 KB, available: 1458136 KB
[1:0x17c69a153000] New space, used: 0 KB, available: 1007 KB, committed: 1024 KB
[1:0x17c69a153000] Old space, used: 1422 KB, available: 506 KB, committed: 2000 KB
[1:0x17c69a153000] Code space, used: 427 KB, available: 0 KB, committed: 1024 KB
[1:0x17c69a153000] Map space, used: 66 KB, available: 0 KB, committed: 1112 KB
[1:0x17c69a153000] Large object space, used: 0 KB, available: 1457095 KB, committed: 0 KB
[1:0x17c69a153000] All spaces, used: 1916 KB, available: 1458609 KB, committed: 5160 KB
[1:0x17c69a153000] External memory reported: 1953145 KB
[1:0x17c69a153000] Total time spent in GC : 5.3 ms
[1:0x17c69a153000] Memory reducer: call rate 0.000, low alloc, foreground
[1:0x17c69a153000] Memory reducer: started GC #2
[1:0x17c69a153000] Increasing marking speed to 3 due to high promotion rate
[1:0x17c69a153000] Heap growing factor 1.1 based on mu=0.970, speed_ratio=12210 (gc=312437, mutator=26)
[1:0x17c69a153000] Grow: old size: 1916 KB, new limit: 11116 KB (1.1)
[1:0x17c69a153000] Memory reducer: finished GC #3 (done)
[1:0x17c69a153000] 9907 ms: Mark-sweep 1.9 (8.0) -> 1.9 (8.0) MB, 2.2 / 0.0 ms (+ 3.8 ms in 4 steps since start of marking, biggest step 1.9 ms) [idle notification: finalize incremental marking] [GC in old space requested].
[1:0x17c69a153000] Memory allocator, used: 8232 KB, available: 1458136 KB
[1:0x17c69a153000] New space, used: 0 KB, available: 1007 KB, committed: 1024 KB
[1:0x17c69a153000] Old space, used: 1422 KB, available: 0 KB, committed: 2000 KB
[1:0x17c69a153000] Code space, used: 427 KB, available: 0 KB, committed: 1024 KB
[1:0x17c69a153000] Map space, used: 66 KB, available: 0 KB, committed: 1112 KB
[1:0x17c69a153000] Large object space, used: 0 KB, available: 1457095 KB, committed: 0 KB
[1:0x17c69a153000] All spaces, used: 1916 KB, available: 1458103 KB, committed: 5160 KB
[1:0x17c69a153000] External memory reported: 1953145 KB
[1:0x17c69a153000] Total time spent in GC : 7.6 ms
,
Jul 14 2016
Chrome features multiple GCs: one in Blink and one in V8. Without explaining the whole object model and how GCs are triggered: the GCs you see there are V8 GCs that *should* trigger follow-up work in Blink, which should result in Blink reporting that memory was freed. The external memory reported is incremented by Blink whenever V8 holds onto memory it does not manage. It's incremented for each ImageData object but never reset. The problem is that the first round trip between V8 and Blink does not result in an adjustment of the amount of external memory reported by V8. Most GCs are triggered based on allocations, so a page sitting idle will not generate any more GCs (2 minutes or 30 minutes, it does not matter). The memory reducing GCs you see are actually time triggered and indicate that V8 thinks it's idle. Yuki, Kentaro: Can you have a look at whether Oilpan is properly triggered?
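The accounting described in this comment can be illustrated with a toy model. This is only a sketch and not Chromium code: the `ExternalMemoryCounter` class and `runBlinkGc` function are hypothetical names, and the 20 images of 5000x5000 RGBA pixels are an assumption based on the sizes mentioned in this thread. It shows that the reported figure only drops once the Blink side runs and reports negative deltas.

```javascript
// Toy model (not Chromium code): the external-memory figure only drops when
// the embedder reports a negative delta, which happens after Blink's own GC
// destroys the backing stores.
class ExternalMemoryCounter {
  constructor() {
    this.reportedBytes = 0;
  }
  // Hypothetical stand-in for Isolate::AdjustAmountOfExternalAllocatedMemory.
  adjust(deltaBytes) {
    this.reportedBytes += deltaBytes;
    return this.reportedBytes;
  }
}

const BYTES_PER_IMAGE = 5000 * 5000 * 4; // RGBA pixels, 100,000,000 bytes
const counter = new ExternalMemoryCounter();
const pendingBackingStores = [];

// The page allocates 20 ImageData objects; each one reports a positive delta.
for (let i = 0; i < 20; i++) {
  counter.adjust(+BYTES_PER_IMAGE);
  pendingBackingStores.push(BYTES_PER_IMAGE);
}

// In this model a V8 GC on its own changes nothing in the counter.
const afterV8Gc = counter.reportedBytes;

// Only the follow-up Blink GC runs the destructors that report negative deltas.
function runBlinkGc() {
  while (pendingBackingStores.length > 0) {
    counter.adjust(-pendingBackingStores.pop());
  }
}
runBlinkGc();
const afterBlinkGc = counter.reportedBytes;

console.log(afterV8Gc, afterBlinkGc);
```

In this model the counter sits at 2,000,000,000 bytes (roughly the 1953145 KB shown in the trace) until the Blink step runs, which matches the symptom of V8 GCs completing without the reported external memory going down.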
,
Jul 14 2016
,
Jul 14 2016
,
Jul 15 2016
,
Jul 15 2016
keishi@, could you take a look or triage this issue?
,
Jul 26 2016
any updates on this one? Thanks.
,
Jul 26 2016
I think I found an issue where the Oilpan GC that should follow up a V8 GC was not firing as intended. https://codereview.chromium.org/2178393002
,
Jul 27 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/9b5399503fe60d4939f60bfc5c6acd4131fe6208 commit 9b5399503fe60d4939f60bfc5c6acd4131fe6208 Author: keishi <keishi@chromium.org> Date: Wed Jul 27 12:39:42 2016 Fix BlinkGC triggering so it is more sensitive to PartitionAlloc only allocations We were not triggering a BlinkGC when partition alloc grew a lot but Oilpan heap allocated object size was small. BUG= 626082 Review-Url: https://codereview.chromium.org/2178393002 Cr-Commit-Position: refs/heads/master@{#408102} [modify] https://crrev.com/9b5399503fe60d4939f60bfc5c6acd4131fe6208/third_party/WebKit/Source/platform/heap/ThreadState.cpp [modify] https://crrev.com/9b5399503fe60d4939f60bfc5c6acd4131fe6208/third_party/WebKit/Source/platform/heap/ThreadState.h
,
Jul 28 2016
Hi keishi@, I saw you had a CL for this. I tried it, and it does seem to trigger GC faster at the end of the for loop.
However, if I change the script to this:
for (var i = 0; i < 200; i++) {
  var image = new ImageData(5000, 5000);
}
which makes it 200 loops instead of 20, then I get an OOM. Is this expected behavior?
,
Jul 28 2016
,
Jul 29 2016
BlinkGC not happening immediately after a V8 GC is probably a regression I introduced in https://chromium.googlesource.com/chromium/src/+/d2466040fdea5f25406f9c6ac07efab10c46548d and that should be fixed with r408102.
The peak memory usage is a problem that seems to have started long ago, I'm guessing when we introduced V8 incremental GC. Here is what seems to be happening:
- Create ImageData #1 (adjustAmountOfExternalAllocatedMemory is called)
- Create ImageData #2
- Create ImageData #3
- Inside Isolate::AdjustAmountOfExternalAllocatedMemory, external_memory becomes greater than external_memory_limit, which starts a V8 incremental GC
- Create ImageData #4
- Create ImageData #5
- V8 incremental marking runs bit by bit while the script runs
- Create ImageData #6
- ...
- Create ImageData #200 (memory usage reaches 20GB)
- V8 incremental GC completes and the garbage is disposed, but only #1, #2, #3 are collected because they were the only garbage when the GC started.
- ImageData #4 - #200 stay until the next V8 GC
Can someone on the V8 team (ulan@?) comment on whether this is expected? Thanks
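The timeline above can be reproduced with a small toy simulation. This is a sketch under stated assumptions, not V8 code: `simulateSnapshotGc` is a hypothetical function, sizes are in MB, and `limitMB: 250` is an arbitrary value chosen only so that marking starts after the third image, as in the description.

```javascript
// Toy model of the timeline described above; this is not V8 code.
// limitMB is a hypothetical threshold picked so marking starts after image #3.
function simulateSnapshotGc({ images, imageMB, limitMB }) {
  let allocatedMB = 0;
  let snapshotMB = null; // garbage that existed when marking started
  let peakMB = 0;

  for (let i = 1; i <= images; i++) {
    allocatedMB += imageMB; // each new ImageData reports external memory
    peakMB = Math.max(peakMB, allocatedMB);
    if (snapshotMB === null && allocatedMB > limitMB) {
      // Marking starts: images created later are not part of the snapshot.
      snapshotMB = allocatedMB;
    }
    // Marking advances bit by bit here but does not finish while the loop runs.
  }

  // When the GC finally completes, only the snapshot's garbage is freed.
  const freedMB = snapshotMB === null ? 0 : snapshotMB;
  return { peakMB, freedMB, remainingMB: allocatedMB - freedMB };
}

const outcome = simulateSnapshotGc({ images: 200, imageMB: 100, limitMB: 250 });
console.log(outcome); // { peakMB: 20000, freedMB: 300, remainingMB: 19700 }
```

With 200 images the model peaks at 20000 MB and frees only the first 300 MB, which is the "only #1, #2, #3 are collected" outcome. The model deliberately ignores everything else V8 does, so it says nothing about real timings.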
,
Aug 4 2016
IIUC, it is expected that the major GC collects only #1, #2 and #3 because V8's major GC is now a snapshot GC. The problem is that V8 doesn't trigger the major GC until the externally allocated memory reaches 20 GB. That looks too late. Maybe should we have a mechanism to kick off the major GC when the externally allocated memory exceeds some threshold? ulan@, hpayer@, jochen@: Any thoughts on this?
,
Aug 18 2016
,
Aug 18 2016
> Maybe should we have a mechanism to kick off the major GC when the externally allocated memory exceeds some threshold?
I thought that was the whole point of Blink declaring its objects' externally allocated memory to V8: to trigger GCs. It certainly used to work that way. For many use cases it is important that we still sometimes interrupt JS to perform a synchronous GC. This is causing regressions (OOM crashes) in important sites. Among other things, PDF.js is having a hard time with this. See issue 623375. Bumping priority.
,
Aug 18 2016
V8 triggers an incremental GC once we hit 192M. It then performs another incremental marking step whenever external memory is adjusted again. The final GC in this case is triggered because the page is idle. To mitigate this scenario we could start an incremental GC way earlier and have a hard limit at which we would do a full GC. We have to be careful with adjusting the behavior, as this code path is also triggered in latency-critical applications.
,
Aug 18 2016
Even in latency-critical applications, I would imagine that jank is better than a crash? Would it be possible to have some kind of last-chance GC? For example, when a memory allocation is attempted in PartitionAlloc and cannot be fulfilled, do a synchronous full GC before generating an OOM crash, and then try the allocation again.
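The "last-chance GC" idea proposed here amounts to an allocate, collect, retry pattern. The following is only an illustrative sketch: `allocateWithLastChanceGc`, the toy `heap` object, and its numbers are all hypothetical and have nothing to do with the real PartitionAlloc interface. It also ignores whether the allocator is at a point where running a GC is safe.

```javascript
// Sketch of the "last-chance GC" idea (hypothetical API, not PartitionAlloc).
function allocateWithLastChanceGc(tryAllocate, runFullGc) {
  let block = tryAllocate();
  if (block !== null) return block;
  runFullGc();           // synchronous full collection before giving up
  block = tryAllocate(); // retry exactly once
  if (block === null) throw new Error("out of memory");
  return block;
}

// Toy heap: completely full, but everything in it is garbage.
const heap = { capacityMB: 300, usedMB: 300, garbageMB: 300 };

const tryAllocate100 = () => {
  if (heap.usedMB + 100 > heap.capacityMB) return null; // allocation fails
  heap.usedMB += 100;
  return { sizeMB: 100 };
};

const fullGc = () => {
  heap.usedMB -= heap.garbageMB; // reclaim all garbage
  heap.garbageMB = 0;
};

const block = allocateWithLastChanceGc(tryAllocate100, fullGc);
console.log(block.sizeMB, heap.usedMB); // 100 100
```

The first attempt fails, the collection frees the garbage, and the single retry succeeds. If the retry also failed, the sketch would throw, which corresponds to the OOM crash in the proposal.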
,
Aug 19 2016
keishi@ is actively working on this. keishi@: Would you post the latest status to this thread?
,
Sep 1 2016
Bump - any updates on this? Seems to block https://crbug.com/630394 as well.
,
Sep 5 2016
On the V8 side we recently lowered the limit at which we start incremental GCs to 64M. We also added a hard limit at which we do full GCs, independent of any other state of V8, set to half of V8's heap size (which amounts to 700M on 64-bit desktop). For 32-bit or low-end devices the limit is way smaller. For this specific case on 64-bit desktop: with these changes we now trigger incremental marking at 64M and hit full GCs at around 700M. The delta of 700M is cleaned up by memory reducing GCs that kick in when the page stays idle. We are also switching incremental marking to be based on tasks, which will improve the case further, as we make marking progress without having to call the V8 API explicitly. Triggering GCs from other allocators, such as PartitionAlloc, is currently not on our agenda, as you need a proper safepoint to call a GC.
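The two limits described in this comment can be pictured as a simple classifier. This is a sketch, not V8's heuristic: the function name `externalMemoryAction`, the `>=` comparisons, and treating "M" as 1024 * 1024 bytes are assumptions; only the 64M and 700M figures come from the comment.

```javascript
const MB = 1024 * 1024;

// Hypothetical classifier: the limit values come from the comment above,
// the function name and the >= comparisons are assumptions.
function externalMemoryAction(externalBytes, softLimit = 64 * MB, hardLimit = 700 * MB) {
  if (externalBytes >= hardLimit) return "full-gc";
  if (externalBytes >= softLimit) return "incremental-marking";
  return "none";
}

const IMAGE_BYTES = 5000 * 5000 * 4; // one 5000x5000 RGBA ImageData
const actions = [];
for (let count = 0; count <= 8; count++) {
  // Classify the external memory after `count` images, assuming none were freed.
  actions.push(externalMemoryAction(count * IMAGE_BYTES));
}
console.log(actions);
```

Under these assumptions the very first 5000x5000 image already crosses the soft limit, and the hard limit is crossed at the eighth image if nothing has been freed in between.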
,
Oct 6 2016
Issue 652394 has been merged into this issue.
,
Oct 7 2016
,
Oct 7 2016
Issue 653260 has been merged into this issue.
,
Oct 14 2016
I also get out-of-memory on Chrome 53.0.2785.143 m (64-bit, Windows) with a ChromaKey script based on a video texture in a worker: https://bugs.chromium.org/p/chromium/issues/detail?id=242215#c30
,
Nov 2 2016
Bump, is there any update here? It's been nearly 2 months since the last update from an '@chromium.org'. This effectively breaks any long-running (even a few minutes on low-memory machines) canvas image data manipulation. Are there any workarounds or flags to set to trigger earlier GC, which would contain this memory leak? Is there any timeline for when this will be fixed? It's been nearly 4 months.
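One page-side mitigation that may help in some cases, offered here only as an assumption and not as something confirmed in this thread, is to avoid allocating a new ImageData per iteration and reuse a single pixel buffer instead. The sketch below uses a hypothetical `processFrames` helper and falls back to a plain object when `ImageData` is not defined, so that it also runs outside a browser.

```javascript
// Sketch of a possible mitigation (an assumption, not advice from the thread):
// allocate one pixel buffer and reuse it across iterations.
function processFrames(frameCount, width, height, fillFrame) {
  const pixels = new Uint8ClampedArray(width * height * 4); // single allocation
  // In a browser the same buffer can back a single ImageData object.
  const image = typeof ImageData !== "undefined"
    ? new ImageData(pixels, width, height)
    : { data: pixels, width, height }; // fallback so the sketch runs outside a browser
  for (let i = 0; i < frameCount; i++) {
    fillFrame(image.data, i); // overwrite in place instead of allocating
  }
  return image;
}

// Small dimensions keep the example cheap; the callback is a placeholder.
const result = processFrames(200, 50, 50, (data, i) => data.fill(i % 256));
console.log(result.data.length, result.data[0]); // 10000 199
```

This only applies when the code controls its own allocations. It does not help when ImageData objects are created by APIs such as getImageData, and it does not change the GC behavior discussed in this issue.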
,
May 29 2017
For a priority 1 bug this has been remarkably long-lived. Any updates?
,
May 30 2017
With the adjustments made to V8 GC I can no longer reproduce the issue with the test cases provided in Comments 0 and 19. Marking this bug as fixed. If you still have issues with ImageData memory usage, please file a new bug with new reproduction steps or test cases. Also please include the specs of your device including amount of system RAM.
,
May 30 2017
In issue 623375, which was the original issue I opened about this problem, and in Version 61.0.3114.0 (Official Build) canary (64-bit) on HP Spectre X360 with 8GB RAM and Win 10 x64 I'm still getting a crash with Not enough memory. FYI
,
May 30 2017
Still present for me too on version 60.0.3107.4 (Official Build) dev (64-bit) on Linux 4.10 with 32 GB RAM. Looking at the task manager, memory usage sometimes spikes to > 1 GB in the test case from Comment 0. Furthermore, if I increase the number of iterations to 50+, I get OOM crashes. Seems like GC is happening but may be delayed?
,
May 30 2017
mlippautz@ could you take a look? I took a trace that logs external_memory with the test case from Comment 19. Are external_memory_limit and external_memory_at_last_mark_compact_ supposed to keep increasing like this (up to 4GB)? Maybe we should be using the external_memory after the BlinkGC sweep in order to update external_memory_limit and external_memory_at_last_mark_compact_? Or should we be relying on LowMemoryNotification to avoid using up all RAM?
,
May 30 2017
Re-assigning as hpayer@ is currently looking into redesigning external memory handling.
,
Jun 1 2017
,
Jun 2 2017
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/502c6ae6a03979efbd3e006e6a0b8c3369ca2bbc commit 502c6ae6a03979efbd3e006e6a0b8c3369ca2bbc Author: hpayer <hpayer@chromium.org> Date: Fri Jun 02 09:40:16 2017 [heap] Activate memory reducer on external memory activity. BUG=chromium:728228, chromium:626082 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_chromium_rel_ng Review-Url: https://codereview.chromium.org/2917853004 Cr-Commit-Position: refs/heads/master@{#45671} [modify] https://crrev.com/502c6ae6a03979efbd3e006e6a0b8c3369ca2bbc/include/v8.h [modify] https://crrev.com/502c6ae6a03979efbd3e006e6a0b8c3369ca2bbc/src/api.cc [modify] https://crrev.com/502c6ae6a03979efbd3e006e6a0b8c3369ca2bbc/src/heap/heap.cc [modify] https://crrev.com/502c6ae6a03979efbd3e006e6a0b8c3369ca2bbc/src/isolate.cc
,
Jun 3 2017
So I tried Version 61.0.3119.0 (Official Build) canary (64-bit) which has JavaScript V8 6.1.60 that supposedly has the fix (https://chromium.googlesource.com/v8/v8.git/+log/6.1.60) on Win 10 x64 w/ 8GB RAM and issue 626082 is still UNRESOLVED (OOM). FYI
,
Jun 3 2017
Interestingly, the script in comment #19 doesn't crash the tab anymore. Memory grows to max 3.8GB then settles for a couple of seconds at 3GB and then is cleared. So it seems that there's an improvement, which is insufficient to fix issue 626082 .
,
Jun 3 2017
If I increase the loop count to 400 I manage to crash the tab at ~3.6GB. FYI
,
Jun 8 2017
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/8d75644fc0ce1cee5d6eca42006f4c4aa89e9b86 commit 8d75644fc0ce1cee5d6eca42006f4c4aa89e9b86 Author: hpayer <hpayer@chromium.org> Date: Thu Jun 08 08:58:30 2017 [heap] Use larger marking steps during external allocation pressure BUG= chromium:626082 , chromium:728228 Review-Url: https://codereview.chromium.org/2927553003 Cr-Commit-Position: refs/heads/master@{#45784} [modify] https://crrev.com/8d75644fc0ce1cee5d6eca42006f4c4aa89e9b86/src/heap/heap.cc
,
Jun 8 2017
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/195eab4619ace704d34ebd00b197ff8d7c739df7 commit 195eab4619ace704d34ebd00b197ff8d7c739df7 Author: machenbach <machenbach@chromium.org> Date: Thu Jun 08 21:19:44 2017 Revert of [heap] Use larger marking steps during external allocation pressure (patchset #4 id:60001 of https://codereview.chromium.org/2927553003/ ) Reason for revert: Blocks the roll. Fails some layout tests: https://build.chromium.org/p/tryserver.v8/builders/v8_linux_blink_rel/builds/21757 STDERR: # Fatal error in ../../v8/src/heap/heap.cc, line 957 STDERR: # Check failed: 1.0 <= pressure (1 vs. -0.00503761). Original issue's description: > [heap] Use larger marking steps during external allocation pressure > > BUG= chromium:626082 , chromium:728228 > > Review-Url: https://codereview.chromium.org/2927553003 > Cr-Commit-Position: refs/heads/master@{#45784} > Committed: https://chromium.googlesource.com/v8/v8/+/8d75644fc0ce1cee5d6eca42006f4c4aa89e9b86 TBR=ulan@chromium.org,hpayer@chromium.org # Skipping CQ checks because original CL landed less than 1 days ago. NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true BUG= chromium:626082 , chromium:728228 Review-Url: https://codereview.chromium.org/2925333002 Cr-Commit-Position: refs/heads/master@{#45797} [modify] https://crrev.com/195eab4619ace704d34ebd00b197ff8d7c739df7/src/heap/heap.cc
,
Jun 12 2017
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/b011c781cf639194aa596eb9cbffb42cf63635e5 commit b011c781cf639194aa596eb9cbffb42cf63635e5 Author: hpayer <hpayer@chromium.org> Date: Mon Jun 12 10:37:49 2017 [heap] Reland use larger marking steps during external allocation pressure This reverts commit 195eab4619ace704d34ebd00b197ff8d7c739df7. BUG= chromium:626082 , chromium:728228 Review-Url: https://codereview.chromium.org/2931393002 Cr-Commit-Position: refs/heads/master@{#45843} [modify] https://crrev.com/b011c781cf639194aa596eb9cbffb42cf63635e5/src/heap/heap.cc
,
Jun 15 2017
With 61.0.3130.0 (Official Build) canary (64-bit) which has JavaScript V8 6.1.156 on 8GB 64-bit Win 10 machine script in c19 runs fine even with 400 iterations. The pdf file in bug 626082 still crashes the extension (the tab) with OOM message. Note that while rendering the pdf file, the process view window also becomes unresponsive, so it's impossible to see how much memory is consumed.
,
Jun 19 2017
Michael, can you take over and look into the PDF issue?
,
Jun 19 2017
#49. Can you provide a proper link to the PDF file? The bug mentioned is a self reference :)
,
Jun 19 2017
Sorry, see bug 623375, comment 49 for the file and the link to pdf.js extension. Note that you need to have zoom level at 150% and you need to scroll to page 2.
,
Jun 19 2017
Version 61.0.3136.0 (Developer Build) (64-bit): I could not reproduce the PDF issue. I see a lot of external memory being allocated, but the GC is triggering reliably.
,
Jun 19 2017
I can easily reproduce this up to and including Version 61.0.3135.0 (Official Build) canary (64-bit). (Haven't tried 61.0.3136.0 yet). 1.How much memory do you have? (I have 8GB) 2.Did you set zooming factor to 150%? 3.Did you scroll past p.2?
,
Jun 19 2017
1. a lot more (100GB+). I see that the external memory held alive is below 256M though, so this should not be an issue. Can you provide a crash id from chrome://crashes ? 2. Yes 3. Yes
,
Jun 19 2017
1. There's an OOM crash, but no crash report. How do I force one? 2. At lower zoom levels (~80%), there's no crash, and I saw with Process Explorer that memory spiked to about 3GB.
,
Jun 20 2017
Same in Version 61.0.3136.0 (Official Build) canary (64-bit) 8GB No crash report
,
Jun 21 2017
Can I help with some trace or whatever? Just let me know.
,
Jun 27 2017
Interestingly, the problem does NOT occur in Version 61.0.3142.1 (Official Build) canary SyzyASan (32-bit). Not sure why, or why it's 32-bit now instead of the usual 64-bit, even though the system is Win 10 64-bit.
,
Jun 28 2017
With Version 61.0.3142.3 (Official Build) canary (64-bit) we're back to "normal" buggy behavior.
,
Jun 30 2017
With Version 61.0.3144.0 (Official Build) canary (64-bit) managed to get this crash ID: Uploaded Crash Report ID 8bb2295e38000000 (Local Crash ID: c8f388ff-51e4-4d45-80bf-9198543fdb21)
,
Jun 30 2017
Another one: Uploaded Crash Report ID 29d2307c68000000 (Local Crash ID: 8cdb6c62-baca-4a14-b0c1-b3335705e69c) from Version 61.0.3145.0 (Official Build) canary (64-bit)
,
Jul 9 2017
The phenomenon of CORRECT operation in 32b SyzyASan and INCORRECT operation in 64b regular versions of Chrome persists. In Version 61.0.3152.1 (Official Build) canary SyzyASan (32-bit) the operation with pdf file is correct - no crash, no black pages.
,
Jul 10 2017
So I wanted to see whether it's the 32b or SyzyASan that makes Chrome work correctly on pdf links. Installed Version 61.0.3141.8 (Official Build) dev (32-bit) and it works CORRECTLY on a 8GB machine. So the current status is that 32b version works CORRECTLY and the 64b version works INCORRECTLY on pdf file on a 8GB machine. Hope this helps.
,
Aug 6 2017
I profiled this recently, and it turns out that we were not making progress fast enough with (incremental) GCs in V8 and the handover to Blink. This is a known issue and we are working on multiple fronts to address these sorts of problems. Examples are faster marking, better integration with Blink, and incremental marking in Blink, all of which can help in those cases. We are working on these things at a larger scale, but I don't see an immediate fix for this particular issue, as we would probably require multiple synchronous rounds of GC between V8 and Blink.
,
Aug 6 2017
I'm just wondering why the difference between 32b and 64b: the former works, the latter doesn't.
,
Aug 10 2017
32bit and 64bit environments diverge wrt. heuristics. E.g., the incremental marker starts off with a limit of 1ms slices of marking to go easy on the 16ms rendering budget for a frame. Because of this initially fixed window size and variable object sizes on different architectures (e.g. word length, but also other differences) we get different marking speeds and progress. (There are a lot more nuanced differences and this one just came up at the top of my head.)
,
Aug 17 2017
@mlippautz It FEELS like the 64b version has issues that the 32b does not have. In 32b, there's a freeze presumably when GC kicks in, but it manages to resume. In 64b version there're black screens and eventually a crash. I wonder whether there're some additional bugs hiding under the hood.
,
Aug 17 2017
Statement in #65 still holds. If you can provide a trace/analysis/profile dump of your feelings we can reconsider. Otherwise this is blocked on improving marking speed in issue 694255 .
,
Aug 18 2017
Don't know what has changed, but I could NOT crash the pdf file in Version 62.0.3189.0 (Official Build) canary (64-bit) anymore.
,
Oct 11 2017
We shipped concurrent marking in M63 ( issue 694255 ). I could not reproduce the crash in pdf file in the most recent Chrome Dev 64-bit. Closing based on this and comment #70. Please re-open if crash reproduces in M63.
Comment 1 by jochen@chromium.org
, Jul 6 2016