New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 759236 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Nov 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 2
Type: Feature

Blocked on:
issue 786686



Sign in to add a comment

Uint8ClampedArray.prototype.set is slow

Reported by w...@playcanvas.com, Aug 25 2017

Issue description

UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.101 Safari/537.36

Steps to reproduce the problem:
1. Go here: https://jsperf.com/typed-array-set-performance

Ooor, setup:

  var src = new Uint8ClampedArray(512 * 512);
  for (var i = 0; i < src.length; i++) {
      src[i] = i;
  }
  var dst = new Uint8Array(512 * 512);

Slow:

dst.set(src);

Faster:

for (var i = 0, len = src.length; i < len; i++) {
    dst[i] = src[i];
}

Fastest:

for (var i = 0, len = src.length; i < len;) {
    dst[i] = src[i]; i++;
    dst[i] = src[i]; i++;
    dst[i] = src[i]; i++;
    dst[i] = src[i]; i++;
    dst[i] = src[i]; i++;
    dst[i] = src[i]; i++;
    dst[i] = src[i]; i++;
    dst[i] = src[i]; i++;
    dst[i] = src[i]; i++;
    dst[i] = src[i]; i++;
    dst[i] = src[i]; i++;
    dst[i] = src[i]; i++;
    dst[i] = src[i]; i++;
    dst[i] = src[i]; i++;
    dst[i] = src[i]; i++;
    dst[i] = src[i]; i++;
}

What is the expected behavior?
Uint8ClampedArray.prototype.set should be by far the fastest (as in Edge).

What went wrong?
Uint8ClampedArray.prototype.set is super-slow in Chrome.

Did this work before? N/A 

Does this work in other browsers? No
 In FF and Safari, Uint8ClampedArray.prototype.set is a similar speed as copying via for loop. Chrome is significantly slower. Edge is crazy fast.

Chrome version: 60.0.3112.101  Channel: n/a
OS Version: 10.0
Flash Version: 

Keep up the great work! :D
 

Comment 1 by w...@playcanvas.com, Aug 25 2017

I should point out that copying from Uint8Array is super-fast. It's because I'm using Uint8ClampedArray.

Comment 2 by kbr@chromium.org, Aug 26 2017

Cc: bmeu...@chromium.org yangguo@chromium.org
Status: Available (was: Unconfirmed)
V8 folks: could you please devote some time to optimizing this intrinsic? It's affecting a partner's WebGL engine. Thanks.

Cc: fran...@chromium.org
Labels: -Type-Bug Type-Feature
q
Labels: -OS-Windows Performance OS-All
Owner: fran...@chromium.org
Status: Assigned (was: Available)
Assigning to franzih@ for investigation. We're already planning to optimize TypedArray.prototype.set. The fact that your particular repro is faster when copying from Uint8Array is probably because then src and dst have the same type, which takes a different path.

Comment 5 by d...@acmer.me, Sep 10 2017

Few months ago I prepared some tests and found that the next code gives the best performance:

var b = new Uint8Array(clampedArray.buffer);
simpleArray.set(b)

For more information see:
https://jsperf.com/uint8-array-vs-uint8-clamped-array
https://jsperf.com/clamped-array-vs-array
Right, that's not surprising, as you keep things monomorphic that way.

Comment 7 by kbr@chromium.org, Oct 6 2017

Any updates on this? Speeding up Uint8ClampedArray <-> Uint8Array copies will make this project go faster:
https://github.com/playcanvas/playcanvas-ar

Try it out, it's pretty neat. The Hiro marker seems to be the one to which an object is attached in the demo.

Labels: -Performance Performance-Browser

Comment 9 by kbr@chromium.org, Nov 21 2017

Blockedon: 786686
Cc: peter.wm...@gmail.com
Owner: jgruber@chromium.org
Peter is currently looking into this.

Current numbers for 64.0.3280 (Linux 64bit), https://jsperf.com/typed-array-set-performance:

TypedArray set: 2844 +-5%
for loop: 3491 +- 3%
Partially unrolled for loop: 2941 +- 5%

Firefox 52.5.0:

TypedArray set: 27221 +- 3%
for loop: 5009 +- 3%
Partially unrolled for loop: 8317 +- 3%

Clearly we can still improve here.
NextAction: 2017-11-30
https://crrev.com/c/790756 is expected to bring a 15x speedup of the TypedArray.p.set test case posted in #0. 

Setting NextAction to possibly merge to 64/63 once we have canary and clusterfuzz coverage.
Project Member

Comment 12 by bugdroid1@chromium.org, Nov 28 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/v8/v8.git/+/e1028d711e7841b715924a20531b13e25efffeb8

commit e1028d711e7841b715924a20531b13e25efffeb8
Author: peterwmwong <peter.wm.wong@gmail.com>
Date: Tue Nov 28 15:48:45 2017

[typedarray] Add Uint8 fast path for TA.p.set

Use memmove if source and target are either Uint8Array or Uint8ClampedArray.

Bug:  v8:7123 , chromium:759236 
Change-Id: If82bf10165cfc67274f36bb772ce9676a768dcc8
Reviewed-on: https://chromium-review.googlesource.com/790756
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#49678}
[modify] https://crrev.com/e1028d711e7841b715924a20531b13e25efffeb8/src/builtins/builtins-typedarray-gen.cc

As this is not a regression, I don't think we should merge this back. 
NextAction: ----
Status: Fixed (was: Assigned)
Not backmerging to 63 as per #13. Just to clarify, should we also not merge to 64?
Offline discussion: also not merging to 64.
But unless we're unlucky (depending on the exact branch point), this will make it to 64.

Sign in to add a comment