Issue metadata
Sign in to add a comment
|
Chrome 58 V8 performance is much worser than before (for example, Chrome 56.0.2924.87)
Reported by
shader.y...@gmail.com,
May 17 2017
|
||||||||||||||||||||||
Issue descriptionUserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 Steps to reproduce the problem: 1. Open link jsir.mujigame.com/yy/kingdoms/index.html via Chrome 58 and Chrome 56 2. The performance of game login screen on Chrome 58 is much worser than on Chrome 56 3. Open DevTools to collect performance information, it shows there are many Animation Frame Fired Recurring handler took warnings on Chrome 58, but there is no warning on Chrome 56 4. If I switched Experimental JavaScript Compilation Pipeline to enabled on Chrome 58, its V8 performance is back and good, JS works well 5. following is the requestAnimationFrame implementation: window.requestAnimationFrame(function() { qFrame.push(Date.now() / 1000); wake(); }); What is the expected behavior? Chrome 58 V8 performance is good as Chrome 56 What went wrong? Chrome 58 V8 performance is bad Did this work before? Yes 56.0.2924.87 Chrome version: 58.0.3029.110 Channel: stable OS Version: OS X 10.12.4 Flash Version: Is it an issue for V8 Crankshaft to Turbofan?
,
May 18 2017
,
May 23 2017
Updated new test link to http://test.mujigame.com/yy/kingdoms/index.html And for issue 717636, I got following information:
,
May 25 2017
59 is going out any time now, and since perf appears to be good again, I guess that's WontFix
,
May 27 2017
It looks great! Looking forward to the new version.
,
Jun 14 2017
Unfortunately, I updated my Chrome to Version 59.0.3071.86 (Official Build) (64-bit), the performance is still bad, is it a Turbofan issue?
,
Jun 14 2017
There is still "Recurring Handler" issue as Chrome 58.
,
Jun 14 2017
It looks typed array operation performance is low, because I captured obvious bottleneck code is:
var $MEMCPY = function(dst, src, size) {
for (var i = 0; i < size; ++i)
$U8[dst+i] = $U8[src+i];
return dst;
}
if I changed to
var $MEMCPY = function(dst, src, size) {
$U8.set(new Uint8Array($MEM, src, size), dst);
return dst;
}
the performance was improved, maybe there are some other points still affecting the performance, but is this new V8 engine issue?
,
Jun 14 2017
Thanks for the investigation! I saw the same problem in your example (it looks like function _102 from your website). I do not think this is an issue with the new compiler (Turbofan). In fact, in 58 you should still get the old compiler (Crankshaft). It should not be hard to verify which compiler was used for particular function - just run chrome with --js-flags="--trace-opt", that should print the name of the compiler for each optimized function. I think we have done some other changes in how we deal with typed arrays, I will try to look into that. By the way, the game does not seem to run to with Chromium, only with Chrome. Any idea why that could be?
,
Jun 14 2017
By the way, if you want to squeeze more performance out of memcpy, you can go a bit further with truncation inside the memcpy function, see below. This should be about 30% faster.
function memcpy(dst, src, size) {
dst = dst|0;
src = src|0;
size = size|0;
for (var i = 0; i < size; i = i + 1|0) {
a[dst + i|0] = a[src + i|0];
}
return dst;
}
If your memcpy calls are larger, you could make this faster by copying 4 byte chunks using a uint32 typed array view, but you would have to handle the array start and end separately.
,
Jun 14 2017
Thanks for your response!
Actually I tested it on Chrome 56, the performance of code
var $MEMCPY = function(dst, src, size) {
for (var i = 0; i < size; ++i)
$U8[dst+i] = $U8[src+i];
return dst;
}
is very good, but performance of code
var $MEMCPY = function(dst, src, size) {
$U8.set(new Uint8Array($MEM, src, size), dst);
return dst;
}
is acceptable, about 1/2 ~ 2/3 of former.
And could you give the error info with Chromium?
Thanks a lot!
,
Jun 14 2017
Got it, pretty thanks for your suggestion! I am trying it.
,
Jun 14 2017
According to my testing, the version of memcpy with truncations (#10) is also faster in Chrome 56. As for Chromium, here is the error copied from the JS console (I am sorry for losing the formatting): index.html:1 Uncaught (in promise) DOMException: Unable to decode audio data Promise (async) r.onreadystatechange @ user.1497438712014.js:2 XMLHttpRequest.send (async) a @ user.1497438712014.js:2 r @ user.1497438712014.js:2 (anonymous) @ user.1497438712014.js:2 n.open @ user.1497438712014.js:2 (anonymous) @ user.1497438712014.js:2 _14.(anonymous function) @ 0.js:1 _97 @ head.1497438712009.js:1 T @ tail.1497438712012.js:1 c @ tail.1497438712012.js:1 _91 @ head.1497438712009.js:1 (anonymous) @ user.1497438712014.js:2 Promise resolved (async) (anonymous) @ user.1497438712014.js:2 _2.(anonymous function) @ 000000.chunk.1497438712022.js:1 _2.(anonymous function) @ 000000.chunk.1497438712022.js:1 _2.(anonymous function) @ 009600.chunk.1497438712144.js:1 _2.(anonymous function) @ 004800.chunk.1497438712078.js:1 _14.(anonymous function) @ 0.js:1 _97 @ head.1497438712009.js:1 T @ tail.1497438712012.js:1 c @ tail.1497438712012.js:1 _91 @ head.1497438712009.js:1 c @ user.1497438712014.js:1 f.onload @ user.1497438712014.js:1
,
Jun 14 2017
Updated, so far,
function memcpy(dst, src, size) {
dst = dst|0;
src = src|0;
size = size|0;
for (var i = 0; i < size; i = i + 1|0) {
a[dst + i|0] = a[src + i|0];
}
return dst;
}
works well, any new update I will tell you, thanks buddy!
,
Jun 14 2017
For index.html:1 Uncaught (in promise) DOMException: Unable to decode audio data, does Chromium support Web Audio API?
,
Jun 14 2017
With lots of help from @jkummerow, we figured out the problem: keyed store ICs do not play well with prototype changes. Compilers do not really matter - this shows with both Crankshaft and Turbofan.
Below is an example illustrating the problem on a simple function that sets 10M elements in a byte array. If such a function lives across a prototype chain change, it takes >700ms to set the elements; without any prototype chain change, it takes <20ms. (Note that the exact setup is subtle; the problem only occurs if the prototype changes while the function is not optimized, after the function ran at least once.)
Later, I will create a new bug to track the problem with a more detailed explanation.
To mitigate the problem, I suggest avoiding prototype chain changes on your typed arrays. This should recover your performance.
Unfortunately, it is unlikely we can come up with a quick fix on our side that we could merge back to M59 or M60. We are hoping to come with a fix for M61.
Thanks for submitting this super-interesting bug!
var a = new Uint8Array(10000000);
function time(s, f) {
var start = Date.now();
f();
console.log(s + " " + (Date.now() - start));
}
function memset(s) {
for (var i = 0; i < s; i++) {
a[i] = 42;
}
}
memset(10);
Uint8Array.prototype.foo = function() {}
time("memset after proto change: ", () => memset(10000000));
function memset2(s) {
for (var i = 0; i < s; i++) {
a[i] = 42;
}
}
time("memset2 without proto change: ", () => memset2(10000000));
,
Jun 14 2017
Oh, regarding Chromium - the problem was that Chromium does not come with some audio codecs.
,
Jun 15 2017
Great, thanks a lot for your great job, I will track it more. :)
,
Jun 15 2017
And, we double checked our code again, we didn't change any prototype, could you tell me how did you find it is related to prototype changes, maybe it is another issue?
,
Jun 16 2017
@jarin if you file a new issue, please tell me the track number, thanks a lot!
,
Jun 19 2017
Hi, I looked at this a bit more closely, and it appears there is some prototype manipulation in file http://testb.mujigame.com/yy/kingdoms/user.1497846917037.js, which trips up v8. Just search for __proto__ in that file. This is another performance bug in v8, and we will fix it. If you are interested we might have some mitigations; essentially, after setup you would need to read a property from the prototype (such as 'set') several times (in a loop) so that we switch the prototype object to fast prototype mode. FYI, I have also created a bug for keyed store problem with typed arrays - there we should not care about prototype changes. https://crbug.com/v8/6502
,
Jun 20 2017
Based on a suggestion from @verwaest, I have landed a mitigation for the prototype dictionary->fast mode switch in M61, see below. When I tried it locally, it seemed to solve your problem. commit d6c9e534c862a07b0d027c746273aede07ce41c5 Author: jarin <jarin@chromium.org> Date: Mon Jun 19 03:17:30 2017 -0700 [ic] Make prototypes fast when storing through keyed store IC. Toon suggested this as a mitigation to the problem of prototype fast mode switching invalidating prototype chain validity cell, and thus sending keyed store ICs to megamorphic state. BUG= chromium:723479 Review-Url: https://codereview.chromium.org/2943313002 Cr-Commit-Position: refs/heads/master@{#45993}
,
Jun 26 2017
Closing this bug, this particular issue should be solved by the fix from #23, the more general problem is tracked by https://crbug.com/v8/6502.
,
Aug 9 2017
Thanks a lot for your great work! |
|||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||
Comment 1 by ligim...@chromium.org
, May 17 2017Labels: -Pri-2 Performance Pri-1