New issue
Advanced search Search tips

Issue 701344 link

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: Apr 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Windows , Mac
Pri: 1
Type: Bug-Regression



Sign in to add a comment

RangeError when postMessaging large object from worker

Reported by asher...@gmail.com, Mar 14 2017

Issue description

UserAgent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36

Steps to reproduce the problem:
1. Visit https://asherkin.github.io/vtable/
2. Hover over "Team Fortress 2" and click "Server"
3. Wait for the binary to download and process

What is the expected behavior?
The progress bar will be replaced by a text input that requests you enter a classname.

What went wrong?
The progress bar will get stuck at 100%, opening the JS console will reveal the following:

Uncaught RangeError: Maximum call stack size exceeded
    at self.onmessage (worker.js:299)

The referenced line is:

self.postMessage(out);

`out` is a rather large, cyclical object.
Object {classes: Array(3568), functions: Array(26398)}

Did this work before? Yes 56.0.2924.87

Chrome version: 57.0.2987.98  Channel: stable
OS Version: 10.0
Flash Version: 

I am happy to build a minimal testcase, but reporting ASAP as this has broken a live site.
 

Comment 1 by asher...@gmail.com, Mar 14 2017

Sorry, forgot to mention in the wizard - the changes made in  issue #148757  look very likely related.

I'm guessing I can work around this by posting multiple, smaller objects?

Comment 2 by woxxom@gmail.com, Mar 14 2017

You can bisect the breaking change using https://www.chromium.org/developers/bisect-builds-py
Or by manually trying out snapshots:
  x86: https://commondatastorage.googleapis.com/chromium-browser-snapshots/index.html?prefix=Win/
  x64: https://commondatastorage.googleapis.com/chromium-browser-snapshots/index.html?prefix=Win_64/
Snapshot number for 56.0.2924.87 is 433059
Snapshot number for 57.0.2987.98 is 444943
These numbers can be retrieved on https://omahaproxy.appspot.com/

Owner: jbroman@chromium.org
Status: Assigned (was: Unconfirmed)
This is the regression which started in Chrome 57, Please find the bisect result below :

You are probably looking for a change made after 434684 (known good), but no lat
er than 434685 (first known bad).
CHANGELOG URL:
  https://chromium.googlesource.com/chromium/src/+log/ce1c7996be64c3af7243256c5032f11aef6fe7cd..19b847a9301f7139b7ff85cb32eb13e514fa1ab1

Comment 4 Deleted

Able to reproduce the issue with Chrome Stable(57.0.2987.98), Dev(58.0.3029.19) and Canary(59.0.3041.0) on Windows(7 and 10), Mac and Linux.
Labels: -Pri-2 prestable-57.0.2987.98 Pri-1

Comment 7 by gov...@chromium.org, Mar 14 2017

URGENT - PTAL.
Your bug is labelled as Stable ReleaseBlock, pls make sure to land the fix and get it merged into the release branch ASAP. Thank you!

Know that this issue shouldn't block the release?  Remove the ReleaseBlock-Stable label or move to M58.
Cc: esprehn@chromium.org
+esprehn since we discussed this case way back when this was being implemented.

This is the "very deep object" case. Essentially, the new implementation uses the stack to recurse, and this object is very deep (if you traverse depth-first, as per spec we must). So we attempt roughly this traversal:

out.classes[0].vtables[0].functions[0].classes[0].vtables[0].functions[0]...

And this overflows the stack. The old implementation had more complicated code to keep this stack on the heap instead (though it still had a hard-coded maximum depth -- it just happens that the depth here is between the depth that's possible on the available stack and the old depth limit). At the time that this was moving into V8, I noticed that doing that more complicated work imposed a measurable (on the order of a few percent -- I don't recall exactly) performance overhead.

At the time we reasoned that JSON.stringify also rejects very deep structures for the same reason, so this seemed an acceptable thing to do. AFAIK this is the first user report of an issue arising from this decision (no reports in canary, dev, or beta), which suggests that doing this is indeed uncommon.

If we wanted to back that out, it'd be hard now that 57 is rolling out to stable. While I think we could reasonably safely switch to the older, slower implementation, that'd only help for one milestone. And changing our minds to add the complexity to the v8::ValueSerializer implementation would be a big change that I could not justify merging to stable.

WDYT?


To the reporter: Yes,  bug 148757  is exactly related (its primary goal is performance of large messages -- aside from hitting the max depth, this should have significantly improved postMessage performance for objects like yours). As far as workarounds: the object size is not exactly the issue, it's the depth. One fix, for instance, might be having the "functions" array in the vtable objects contain indexes into out.functions rather than direct pointers. That would make the message object wide but not deep, which should work without issue.
I think this is working as intended, had this app been using json instead they'd probably have hit this limit too, and I don't think we're going to rewrite the json serializer (and other browsers use the stack same as us). 

Comment 10 by asher...@gmail.com, Mar 15 2017

Thank you for the detailed reply jbroman,

Interestingly, if I process a smaller binary (i.e. "Team Fortress 2" -> "Engine" on the site) it does complete - so I'm guessing there is cycle avoidance, but the depth-first nature is blowing the stack first?

For me, this webapp being broken until I can rewrite things is not a significant issue, so I will not take it personally if this is WONTFIX given the lack of early reports - sadly it is not a project I have done any development on in over a year, so I was not aware until I received reports from users of it being broken. I was very happy that postMessage would allow cyclical objects unlike JSON.stringify, should have expected that to come back and bite me.

It is a shame there is no way to just transfer the object across completely without serialization - the worker ends right after posting it.
Labels: -ReleaseBlock-Stable -prestable-57.0.2987.98
Removing Release-Block-Stable to unblock stable release of M57.

Cycles are fine -- the problem is when the depth-first traversal finds a non-cyclic path that exceeds the stack. The algorithm looks roughly like:

- if the object is in the visited set, refer to its ID and skip the following steps
- add the object to the visited set
- recurse on the properties of the object

The depth of that recursion, though, is finite.

Unfortunately, there's considerable complexity around just moving the object reference (objects are closely tied to the context they are created in, and they are allocated on separate heaps).
Status: WontFix (was: Assigned)
I haven't heard any more reports of this, so I'm going to WontFix this per #9, sorry.

Sign in to add a comment