Starred by 12 users

Issue metadata

Status: WontFix
Owner:
Closed: Apr 2017
Cc:
Components:
EstimatedDays: ----
NextAction: 2017-03-01
OS: Mac
Pri: 2
Type: Bug




Issue 169705: WebWorker postMessage very slow when passing TypedArray subarray

Reported by flo...@gmail.com, Jan 12 2013 Project Member

Issue description

UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.52 Safari/537.17

Steps to reproduce the problem:
1. navigate to http://n3demos.appspot.com/dsocharviewer_debug.html, which is doing asynchronous decompressing of loaded asset files via a single web worker in an emscripten application
2. notice how Chrome becomes unresponsive/sluggish while the characters are loading (switch to next character / appearance with Cursor Up and Cursor Right, try to rotate the camera with LeftMouseButton+MouseMove)
3. repeat the same in Firefox, and notice how the page remains responsive while loading data

The problem has existed since we implemented asynchronous decompression of loaded data through a web worker (see below).

What is the expected behavior?
Chrome should remain responsive / smooth when calling web worker postMessage with a typed array.

We suspect that postMessage in the function _emscripten_call_worker() is very slow (see below for details).

Also see here this thread on the emscripten discussion list:
https://groups.google.com/forum/?fromgroups=#!topic/emscripten-discuss/R3W-Encm_DI

What went wrong?
The application is downloading asset files through an XHR, and in the XHR onload callback sends the downloaded, compressed data (about 5kByte to a few dozen kByte) to a single web worker, which decompresses the data through zlib and sends the decompressed data back to the browser thread. 

Sending the data to the web worker is unusually slow on Chrome, so the browser becomes very unresponsive (it's actually slower than doing the decompression on the browser thread without a web worker). The same code on Firefox runs as expected.

This is currently throttled to at most 5 messages at once (without throttling the Chrome tab may even time-out and show an oops).

You can test this by trying to rotate the camera (LeftMouse+Move) while data is loading. On Chrome this is very sluggish.

This is the function that the JS profiler identifies as the culprit. The only potentially expensive operation in it is the postMessage call:

    function _emscripten_call_worker(id, funcName, data, size, callback, arg) {
      funcName = Pointer_stringify(funcName);
      var info = Browser.workers[id];
      var callbackId = -1;
      if (callback) {
        // Store the callback; its index in info.callbacks is sent as callbackId.
        callbackId = info.callbacks.length;
        info.callbacks.push({
          func: Runtime.getFuncWrapper(callback, 'viii'),
          arg: arg
        });
        info.awaited++;
      }
      info.worker.postMessage({
        'funcName': funcName,
        'callbackId': callbackId,
        // subarray() returns a view on the whole heap buffer, not a copy.
        'data': data ? HEAPU8.subarray(data, data + size) : 0
      });
    }

Did this work before? N/A 

Chrome version: 24.0.1312.52  Channel: beta
OS Version: OS X 10.7.5

This is an emscripten-compiled application. The version at http://n3demos.appspot.com/dsocharviewer_debug.html is neither optimized nor minified; a minified and optimized version lives at http://n3demos.appspot.com/dsocharviewer.html. Other demos are:

http://n3demos.appspot.com/dragons.html
http://n3demos.appspot.com/instancing.html

and the debug-versions:

http://n3demos.appspot.com/dragons_debug.html
http://n3demos.appspot.com/instancing_debug.html
 

Comment 1 by flo...@gmail.com, Jan 13 2013

Here's more info, which I think makes it clearer what's going on. This app has emscripten's heap typed array set to 32 MB, and postMessage takes hundreds of milliseconds in this case. The time grows with the size of the heap array, not with the size of the subarray actually passed to postMessage (only 5 kByte to 20 kByte). Here is how long postMessage takes for the same subarray data with different heap sizes:

For a 32MB heap array:
Chrome: 150ms .. 200ms
Firefox: 0ms

For a 256MB heap array:
Chrome: 1000ms .. 1800ms(!) (Chrome basically becomes unusable)
Firefox: 0ms

The workaround in Chrome is to make an explicit copy of the subarray data into a new array:

... new Uint8Array(HEAPU8.subarray((data),(data + size))) ...

With this Chrome also only takes 0ms, but I think this workaround isn't desirable in the long run since it makes an extra copy of the data.
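
As a minimal sketch of how that workaround could sit inside the message construction (this is not the actual emscripten patch): `buildWorkerMessage` is a hypothetical helper, the 32 MB `HEAPU8` is a stand-in for the real heap, and the function name, offset and size are made-up values.

```javascript
// Stand-in for emscripten's HEAPU8 (assumption: 32 MB, as in the report).
const HEAPU8 = new Uint8Array(32 * 1024 * 1024);

// Hypothetical helper: builds the object that would be handed to postMessage.
function buildWorkerMessage(funcName, callbackId, data, size) {
  return {
    funcName: funcName,
    callbackId: callbackId,
    // Copy the slice into its own small buffer so that only `size` bytes
    // need to be cloned, instead of the whole heap behind the view.
    data: data ? new Uint8Array(HEAPU8.subarray(data, data + size)) : 0,
  };
}

// Made-up function name, heap offset and payload size.
const message = buildWorkerMessage('decompress', 0, 1024, 5 * 1024);
console.log(message.data.buffer.byteLength); // 5120
```

The copy costs one allocation and one small memcpy per message, which is the overhead the comment above calls undesirable in the long run.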

Comment 2 by alonza...@gmail.com, Jan 13 2013

I'm pushing the workaround to emscripten as a temporary measure, but it would be much better not to need it, since it adds unnecessary overhead (an allocation, a copy and an object to GC).

Comment 3 by tkent@chromium.org, Jan 15 2013

Cc: dim...@chromium.org noel@chromium.org
Labels: -Webkit-JavaScript WebKit-Core Area-WebKit

Comment 4 by bugdroid1@chromium.org, Mar 10 2013

Project Member
Labels: -WebKit-Core -Area-WebKit Cr-Content Cr-Content-Core

Comment 5 by bugdroid1@chromium.org, Apr 5 2013

Project Member
Labels: -Cr-Content Cr-Blink

Comment 6 by kettin...@gmail.com, Feb 22 2014

I have the same problem. My Chrome version is 33.0.1750.117 beta-m. I think the issue is not solved yet.

Comment 7 Deleted

Comment 8 by thunder...@illyriad.co.uk, Aug 7 2014

.subarray is just a view on the underlying ArrayBuffer, which is 32MB or 256MB in your examples. For example, HEAPU8.subarray((data),(data + size)).buffer.byteLength will show this, so it is copying the full buffer.
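
A small sketch that makes the view-versus-copy difference visible. The names `heap`, `offset` and `size` are illustrative, and the 32 MB buffer only stands in for the emscripten heap.

```javascript
const HEAP_BYTES = 32 * 1024 * 1024;
const heap = new Uint8Array(HEAP_BYTES); // stand-in for HEAPU8

// Illustrative location of a 20 kByte payload inside the heap.
const offset = 4096;
const size = 20 * 1024;

// subarray() copies nothing: the view shares the heap's ArrayBuffer.
const view = heap.subarray(offset, offset + size);

// new Uint8Array(typedArray) allocates a fresh buffer with just those bytes.
const copy = new Uint8Array(view);

console.log(view.byteLength);        // 20480
console.log(view.buffer.byteLength); // 33554432 (the whole heap)
console.log(view.byteOffset);        // 4096
console.log(copy.buffer.byteLength); // 20480
console.log(copy.byteOffset);        // 0

// Writes through the heap are visible in the view but not in the copy.
heap[offset] = 7;
console.log(view[0], copy[0]);       // 7 0
```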

Comment 9 by michalch...@gmail.com, Apr 6 2015

The browsers that I've checked seem to show similar performance profiles. Sending a .subarray seems to be slower if the buffer it's a view on is bigger, while copying the .subarray first seems to be independent of the size of the original buffer:

http://jsperf.com/web-worker-sub-array-copy

Looking at what a TypedArray is, this seems to make sense. A TypedArray has byteOffset and buffer properties. If you send a TypedArray and expect it to be re-created on the other side exactly as it was, the buffer it is a view on has to be sent as well.
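
This can be checked without a worker. The sketch below assumes a runtime with a global `structuredClone` (modern browsers, Node 17+), which runs the same structured clone algorithm that postMessage uses; the 1 MB buffer and the offsets are arbitrary illustrative values.

```javascript
// Arbitrary 1 MB backing buffer standing in for a large heap.
const heap = new Uint8Array(1024 * 1024);

// A 16-byte view starting 256 bytes into the buffer.
const view = heap.subarray(256, 256 + 16);
view[0] = 42;

// Cloning the view re-creates it as it was, including the buffer behind it.
const received = structuredClone(view);
console.log(received.byteOffset, received.byteLength, received.buffer.byteLength);

// Cloning an explicit copy only carries the 16 bytes that were copied.
const receivedCopy = structuredClone(new Uint8Array(view));
console.log(receivedCopy.byteOffset, receivedCopy.byteLength, receivedCopy.buffer.byteLength);
```

In the first case the cloned view should keep its byteOffset of 256 and sit on a full-size buffer; in the second the buffer should be just 16 bytes long.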

Due to this, I don't think this is a bug.

Comment 10 by tkent@chromium.org, Jul 15 2015

Labels: -Cr-Content-Core

Comment 11 by nhiroki@chromium.org, Jul 16 2015

Labels: -Cr-Blink Cr-Blink-Workers

Comment 12 by kinuko@chromium.org, Aug 11 2015

Owner: nhiroki@chromium.org
Status: WontFix
I'd like to close this because the latest Chrome's behavior no longer seems strangely slow, or the slowness seems to be happening in an expected way (as was mentioned in #9).

For the repro step noted in the OP (http://n3demos.appspot.com/dsocharviewer_debug.html), Firefox seems to load the image faster, but Chrome's behavior looks roughly comparable and doesn't become sluggish.

So the following statement no longer seems to apply (this must have been the era where we created a new process for each worker)

"Sending the data to the web worker is unusually slow on Chrome, so that the browser becomes very unresponsive (it's actually slower then doing the decompressing on the browser thread without web worker), the same code on Firefox runs as expected."

Comment 13 by kinuko@chromium.org, Aug 11 2015

Please feel free to open a bug with a more specific or appropriate subject if any of you observe unexpected performance issue around Workers. Thanks!

Comment 14 by rogerscg...@gmail.com, Feb 2 2017

It's 2017 now. When sending subarrays of a Float32Array via WebWorker postMessage, I also saw huge overhead and processing time, regardless of subarray size. The workaround above, constructing a new Float32Array, worked perfectly for me, despite the increased memory usage. I'm not sure whether this is expected behavior for typed arrays, but if it isn't, I would suggest that a Chromium developer look into it.

Comment 15 by nhiroki@chromium.org, Feb 3 2017

NextAction: 2017-02-08
Status: Unconfirmed (was: WontFix)
rogerscgreg@, thank you for reporting it. What is your environment (OS, Chrome version)? Are you able to make a minimal reproduction? That would be very helpful for confirming/triaging the issue.

I probably won't have time to investigate this until next week, so I've set a NextAction date.

Comment 16 by horo@chromium.org, Feb 10 2017

NextAction date came. Ping :)

Comment 17 by falken@chromium.org, Feb 21 2017

Components: Blink>Messaging
NextAction: 2017-03-01
Status: Assigned (was: Unconfirmed)

Comment 18 by jbroman@chromium.org, Feb 21 2017

Is Chrome (ideally, 57 or greater, since there was a rewrite of the relevant code in that milestone) still substantially slower than other browsers here?

The HTML spec (https://html.spec.whatwg.org/multipage/infrastructure.html#structuredclone) requires that we clone the entire underlying array buffer, which is probably why this is slow for your use. But that issue should exist in other vendors as well.

We _might_ have some additional slowness due to the fact that we presently force the TypedArray to have a full ArrayBuffer internally (which may involve one additional copy of the buffer, one time per backing buffer). It's not apparent whether that's the case here.

A workaround, if the data you need to send is much smaller than the backing buffer, is to explicitly make a separate copy and then send that.

Comment 19 by noel@chromium.org, Feb 22 2017

I'd like to hear some data, per #18, on whether the rewrite in 57 sped things up.

Meanwhile, are TypedArray objects transferable per the spec? I think so, but you'd need to check. If the caller does not need to retain a copy of the object posted to the worker, then transfer it. The speed increase is ~massive [1]

[1] https://developers.google.com/web/updates/2011/12/Transferable-Objects-Lightning-Fast

Comment 20 by jbroman@chromium.org, Feb 22 2017

#19: Rewrite was bug 148757. tl;dr: to 3x win on JSON-y objects where a lot of the work is traversing JS object graphs. But if the data is mostly a big array buffer, then we were already spending most of our time in a memcpy.

You can transfer the ArrayBuffer underlying the TypedArray, but for the reporter's use case, that would detach and send the entire emscripten memory, which is almost certainly not desirable.
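
One way to combine the two suggestions, which nobody in this thread proposed explicitly, is to copy the small slice first and then transfer only the copy's buffer, leaving the heap attached. The sketch assumes a global `structuredClone` with the `transfer` option (modern browsers, Node 17+) as a synchronous stand-in for `worker.postMessage(message, [payload.buffer])`; the buffer size and offsets are illustrative.

```javascript
// Arbitrary 1 MB backing buffer standing in for the emscripten heap.
const heap = new Uint8Array(1024 * 1024);
heap[512] = 9;

// Step 1: copy the 64-byte slice out of the heap (one small memcpy).
const payload = new Uint8Array(heap.subarray(512, 512 + 64));

// Step 2: transfer the copy's buffer instead of cloning it.
// With a real worker this would be worker.postMessage(msg, [payload.buffer]).
const received = structuredClone({ data: payload }, { transfer: [payload.buffer] });

console.log(payload.byteLength);       // 0: the sender's copy is now detached
console.log(received.data.byteLength); // 64
console.log(received.data[0]);         // 9
console.log(heap.byteLength);          // 1048576: the heap itself is untouched
```

This still pays for the initial copy, so it mainly helps when the payload is large enough that cloning it a second time during postMessage would be noticeable.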

Comment 21 by falken@chromium.org, Apr 19 2017

Status: WontFix (was: Assigned)
We set next action two months ago but do we have an action to take here? 

Let's open a new bug if there's data or a repro on Chrome 57+. The repro in the original bug no longer seems strangely slow (thanks for keeping it up this long).
