Starred by 31 users
Status: Assigned
OS: All
Priority: 2
Type: Bug

Blocked on:
issue 7025


Substring of huge string retains huge string in memory
Reported by adam.hoo...@gmail.com, Sep 3 2013
Steps to reproduce:

  var s = "a huge, huge, huge string...";

  s = s.substring(0, 5);

Expected results: s takes five bytes of memory, plus some overhead.

Actual results: s takes a huge, huge, huge amount of memory.

Unfortunately, most String functions use substring() or no-ops internally, so they do not work around the problem: concatenating with an empty string, match(), replace() with no match, search(), slice(), split(), substr(), substring(), toString(), trim(), valueOf().

My workaround is:

function unleakString(s) { return (' ' + s).substr(1); }

But it's not satisfying, because it breaks an abstraction and forces me to think about memory allocation.

Perhaps there should be some special logic for handling substring() when the output string length is less than, say, 0.125 times the input string length? That's not perfect, but in my opinion the pros of this solution outweigh the cons.
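To make the suggestion concrete, here is a rough userland sketch of such a heuristic. The names substringWithCopy and COPY_RATIO are made up for illustration and are not part of V8 or any library, and the copy itself relies on the same concatenate-then-substr trick as unleakString(), so whether it really produces a fresh string depends on the engine.

```javascript
// Hypothetical helper: not a V8 or standard-library API.
// COPY_RATIO mirrors the 0.125 threshold floated above; the value is illustrative.
var COPY_RATIO = 0.125;

function substringWithCopy(s, start, end) {
  var piece = s.substring(start, end);
  if (piece.length < COPY_RATIO * s.length) {
    // Small result from a large input: force a copy the same way
    // unleakString() does. Whether the engine really allocates a
    // fresh string here is an assumption, not a guarantee.
    return (' ' + piece).substr(1);
  }
  // Large result: keep whatever representation the engine chose.
  return piece;
}
```

The returned value is the same as plain substring() in both branches; only the (assumed) memory behaviour differs.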

This crops up constantly when scraping HTML.
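A minimal sketch of how this shows up when scraping, assuming a made-up extractTitle() step and a simplistic regular expression (real pages need a proper parser). The captured group may be a slice that references the whole page, so it is copied before being kept around.

```javascript
function unleakString(s) { return (' ' + s).substr(1); }

// Hypothetical scraper step: pull the <title> text out of a large page.
function extractTitle(html) {
  var match = /<title>([^<]*)<\/title>/i.exec(html);
  if (!match) return null;
  // match[1] may be a slice backed by `html`; copy it before storing.
  return unleakString(match[1]);
}

var page = '<html><head><title>Example</title></head><body>' +
    new Array(100001).join('x') + '</body></html>';
var title = extractTitle(page);
page = null; // with the copy, nothing in `title` should keep the page alive
```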

 
Another approach not needing any kind of ad-hoc copy heuristic is to teach the garbage collector about projection functions. This is a well-known technique described e.g. in http://homepages.inf.ed.ac.uk/wadler/papers/leak/leak.ps.
I briefly glanced at the paper. It's not that easy:
- replacing a sliced string by a sequential string with same content usually uses up more memory
- in return, releasing the backing store of the sliced string frees up memory

It's really hard to tell whether flattening the sliced string is worthwhile or not, especially since we don't have a good way to track which strings refer to the same backing store. Two years ago we made an attempt to flatten sliced strings when GC happens, but it was rejected because it could cause an explosion during GC.
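The trade-off in the two bullets above can be put into a toy cost model. This is only an illustration: flatteningSavings is a made-up name, sizes are counted in characters, object headers are ignored, and it assumes the collector knows every slice of a backing store, which is exactly the information said to be hard to track.

```javascript
// Toy cost model, sizes in characters; object headers are ignored.
// parentLength: length of the shared backing string.
// sliceLengths: lengths of all live sliced strings that reference it.
// parentOtherwiseLive: true if something besides the slices keeps the parent alive.
function flatteningSavings(parentLength, sliceLengths, parentOtherwiseLive) {
  // Flattening copies every slice into its own sequential string.
  var copied = sliceLengths.reduce(function (sum, n) { return sum + n; }, 0);
  // The backing store is only released if nothing else references it.
  var freed = parentOtherwiseLive ? 0 : parentLength;
  return freed - copied; // positive: flattening saves memory
}

flatteningSavings(1000000, [5], false);          // 999995: one tiny slice, clear win
flatteningSavings(1000, [900, 900, 900], false); // -1700: overlapping slices, net loss
flatteningSavings(1000000, [5], true);           // -5: parent stays alive anyway
```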
The point here is: On the one hand, our sliced strings are an optimization for one kind of use case (probably chosen from one of the "great" benchmarks we care about); on the other hand, sliced strings in their current naive form are a source of unlimited(!) space leaks for other use cases. So the question is: Can we mitigate the latter problem at least a bit? While a general solution might involve quite some work, solving easier cases might be relatively straightforward, e.g. when a single sliced string is the only reason for keeping another string alive.

We are not the first ones encountering this kind of problem, so there should be a variety of solutions already out there. Let's keep this issue open.
Labels: Type-FeatureRequest Priority-Low
Owner: svenpanne@chromium.org
Status: Accepted
Sven, assigning to you since you suggested keeping the issue open :)
Cc: svenpanne@chromium.org
Owner: yangguo@chromium.org
Comment 6 by habl...@google.com, Apr 29 2015
Status: Assigned
Comment 7 by bit@google.com, Jul 30 2015
Guys, people are starting to propose hacks as workarounds for this bug :(

https://github.com/mrdoob/three.js/issues/9679
https://github.com/mrdoob/three.js/pull/9680/files
Comment 9 by kbr@chromium.org, Sep 13 2016
Cc: -svenpanne@chromium.org kbr@chromium.org hpayer@chromium.org
Labels: -Priority-Low -Type-FeatureRequest OS-All Priority-Medium Type-Bug
This is affecting the most popular JavaScript library for drawing 3D graphics on the web. Per the above links, could this please be given some more thought? Thanks.

Comment 10 by kbr@chromium.org, Sep 13 2016
Components: Runtime Language
Comment 11 by adamk@chromium.org, Sep 13 2016
Components: -Language
Labels: Priority-2
Blockedon: 7025