
Issue 6663

Starred by 2 users

Issue metadata

Status: Fixed
Closed: Aug 2017
HW: All
NextAction: ----
OS: All
Priority: 2
Type: Bug


object creation seems to slow down

Reported by, Aug 2 2017

Issue description

OS: Mac OS X 10.12
Architecture: x64

What steps will reproduce the problem?


Using V8 (from node), I got:

empty loop: 62.695ms
literal: 2367.563ms

With V8 (Node 8.2.1), I got:

empty loop: 47.828ms
literal: 1010.933ms

While empty-looping has slowed down, allocating objects seems to have slowed down about 2x.

Things are getting better on V8 6.1.459:

empty loop: 67.859ms
literal: 1758.019ms

Labels: Performance HW-All OS-All Priority-2
Status: Available (was: Untriaged)
Maybe related to object literals? +cc bmeurer, cbruni.
I'll have a look tomorrow.
I have checked also with constructors and classes, and I have observed the same behavior. I'm not sure how to measure the impact of this in production applications.

Comment 4 by, Aug 2 2017

I would bet this is related to the loop simply being eliminated. I expect different results if there's a meaningful effect occurring in the loop body.

Comment 5 by, Aug 2 2017

then again, I dunno --- it seems like on OSR entry, the `literals` function should see similar benefits, I would be surprised if objects were still allocated/the loop was still there. But maybe the baseline case + GC pauses would be enough to make the 1000-2000ms difference.

Anyways, leaving the discussion to more knowledgable people :)
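As an aside on the loop-elimination hypothesis above: a common way to rule out dead-code elimination in microbenchmarks like this is to make the loop's result observable. A generic sketch (not V8-specific; function and variable names are illustrative):

```javascript
// Keep the loop result observable so an optimizing compiler
// cannot legally eliminate the allocation in the loop body.
function literalObservable (max) {
  var last = null
  for (var i = 0; i < max; i++) {
    last = { i }          // allocation under test
  }
  return last             // consuming the result keeps the loop "live"
}

// Using (or printing) the return value prevents elimination.
console.log(literalObservable(1000).i)  // prints 999
```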
It does seem the loop is not eliminated and the objects are being created. 

My bad, the numbers above are for V8 (Node 6).

empty loop: 48.482ms
literal: 1039.456ms

V8 5.8 has the same performance as V8 6.0.

Both versions run 1550-1560 Scavenge cycles.
I think your benchmark is not measuring what you want to measure. This seems to be dominated by the store buffer overflow on the assignment to the "obj" global object property (in case of d8) or context slot (in case of node). If you modify the benchmark slightly (in a safe way that both d8 and node behave similarly), then you actually measure the allocation overhead I think:

(function() {
  var max = 100000000

  console.time('empty loop')

  function loop () {
    for (var i = 0; i < max; i++) {}
  }
  loop()

  console.timeEnd('empty loop')

  console.time('literal')

  function literal () {
    var obj = null
    for (var i = 0; i < max; i++) {
        obj = { i }
    }
    return obj;
  }
  literal()

  console.timeEnd('literal')
})()

Now the results look roughly the same (within noise):

$ ./d8- matteo.js
empty loop: 108.429000
literal: 468.233000
$ ./d8-6.1.534.15 matteo.js
empty loop: 108.305000
literal: 471.232000
$ out/Release/d8 matteo.js
empty loop: 108.241000
literal: 480.974000
$ ~/Applications/node-v6.10.0-linux-x64/bin/node matteo.js
empty loop: 81.328ms
literal: 460.296ms
$ ~/Applications/node-v8.1.2-linux-x64/bin/node matteo.js
empty loop: 82.764ms
literal: 459.173ms

But it's also possible that I'm missing something here.
Just to clarify: why did the previous one regress between V8 5.1 and V8 6.0?
Has assigning to an outer context regressed?

(function () {
  var max = 100000000

  console.time('empty loop')

  function loop () {
    for (var i = 0; i < max; i++) {}
  }
  loop()

  console.timeEnd('empty loop')

  var n = 0

  console.time('literal')

  function literal () {
    for (var i = 0; i < max; i++) {
      n = i
    }
  }
  literal()

  console.timeEnd('literal')
})()

The above takes the same amount of time on both Node 6.11 and 8.2.1, but it increases by 30% on node master with V8 6.0, and it drops again with V8 6.1.
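For reference, the two store patterns under discussion here reduce to the following contrast (an illustrative sketch; function names are mine, and the performance claim in the comments paraphrases the store-buffer explanation given earlier in this thread):

```javascript
var max = 1000000

// Store to a variable in an outer context: per the discussion above,
// each write may need write-barrier / store-buffer bookkeeping in a
// generational GC, since the context can live in old space while the
// freshly allocated object lives in new space.
var outer = null
function storeOuter () {
  for (var i = 0; i < max; i++) outer = { i }
  return outer
}

// Store to a function-local variable: the slot lives in the frame,
// so no old-to-new remembered-set entry is required per write.
function storeLocal () {
  var obj = null
  for (var i = 0; i < max; i++) obj = { i }
  return obj
}
```

Both functions compute the same final value; only the location of the written slot differs, which is exactly the difference between the original benchmark and the modified one in #7.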
Components: GC
AFAIK the way the store buffer is implemented has changed significantly since 5.1 (which is ancient BTW). I added the GC folks to help answer the question.
Attached is a profile of the initial benchmark. We indeed spend around 34% of time on store buffer management. 21% just on '__lll_lock_wait', which I assume is also part of store buffers.

Anyway, the modified benchmark from #7 gives me the following results on ToT:

$ out/release/d8 ~/crbug-v8-6664-obj-creation.js 
empty loop: 112.856000
literal: 80.606000
Attachments: 104 KB, 303 bytes
FYI: the __lll_lock_wait calls come from StoreBuffer::FlipStoreBuffers.
Bisected this down to:

9d1488e4b0b3d9b2a630c602346f1b8417f2a7c4 is the first bad commit
commit 9d1488e4b0b3d9b2a630c602346f1b8417f2a7c4
Author: hpayer <>
Date:   Wed Nov 30 04:17:00 2016 -0800

    [heap] Reduce store buffer size to increase chance to run concurrent store buffer processing thread more often.

Prior to this CL: literal: 1644.279
After    this CL: literal: 2222.33

Increasing kStoreBufferSize decreases the regression.
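To illustrate why kStoreBufferSize matters here, a toy model of a fixed-size store buffer (a JavaScript sketch with made-up names; V8's real store buffer is a concurrent C++ component, and FlipStoreBuffers is the analogue of the "flip" below):

```javascript
// Toy model: record slot entries until the buffer fills, then "flip"
// (hand the full buffer off for processing) and start over.
function simulateStores (storeCount, bufferSize) {
  var buffer = []
  var flips = 0
  for (var i = 0; i < storeCount; i++) {
    buffer.push(i)                 // one entry per recorded store
    if (buffer.length === bufferSize) {
      flips++                      // cost: synchronization on each flip
      buffer = []
    }
  }
  return flips
}

// Halving the buffer size roughly doubles the number of flips (and the
// lock traffic that goes with them) for the same store workload.
console.log(simulateStores(1000000, 1024))  // 976
console.log(simulateStores(1000000, 512))   // 1953
```

This is consistent with the bisection result: the CL that shrank the store buffer made a store-heavy benchmark flip (and contend on __lll_lock_wait) more often.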

Assigning to current memory sheriff. Ulan, could you please take a look?

Comment 13 by, Aug 3 2017

jgruber@, my numbers are a bit different. I ran each of the three revisions 50 times (raw results attached).

Averages [with min-max range]:
5.1        (91b368): literal: 1232.2  [1222-1248]
before CL  (4d75ea): literal: 2185.86 [1892-2478]
after CL   (9d1488): literal: 2511.24 [2262-2800]

The regression of the CL was about 325ms.
There seems to be another regression of 950ms between the 5.1 and the 4d75ea.

Note that at 4d75ea the variability is huge. Maybe you hit the lower end of the range in #12?

Attachments: 751 bytes, 751 bytes, 750 bytes, 288 bytes

Comment 14 by, Aug 3 2017

Status: Assigned (was: Available)
> There seems to be another regression of 950ms between the 5.1 and the 4d75ea.
About 200ms regression happens before 1fb449.
I bisected the remaining 750ms to

commit a9e6bbba263c98090f96bb0dccff09d8ffb86c0a (refs/bisect/bad)
Author: hpayer <>
Date:   Fri Nov 11 06:00:55 2016 -0800

    [heap] Reland concurrent store buffer processing.
    BUG=chromium:648973,  chromium:648568 
    Cr-Commit-Position: refs/heads/master@{#40928}

So most of the regression comes from the store buffer changes.

Comment 15 by, Aug 3 2017

Labels: HW-x64
This simple optimization removes about 1000ms from the runtime on TOT:

Comment 16 by, Aug 3 2017

Labels: -HW-x64
Can some of you check if this is what is happening in

This is the comparison between master and node 6.x:

timers/set-immediate-breadth.js millions=10           -35.94 %        *** 1.835500e-05

It is using a similar pattern:

I have tried applying the patch and it did not help, so it might be something different after all.
#17: Are store buffers prominent in --prof output? (Run the benchmark with --prof, then ${v8dir}/tools/linux-tick-processor <produced_output_file>.log)
Project Member

Comment 19 by, Aug 7 2017

The following revision refers to this bug:

commit 35f9b26601bb50d9fe968c4a6069fbb1968f008a
Author: Ulan Degenbaev <>
Date: Mon Aug 07 09:09:32 2017

[heap] De-duplicate insertions to the old-to-new remembered set.

Bug:  v8:6663 
Change-Id: I8bf7169c21141a34e3bcb0bb2193ceb1746b33b2
Reviewed-by: Michael Lippautz <>
Commit-Queue: Ulan Degenbaev <>
Cr-Commit-Position: refs/heads/master@{#47186}
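The fix above de-duplicates insertions into the old-to-new remembered set. Its effect on a benchmark that writes the same slot over and over can be sketched like this (a hypothetical simplification in JavaScript, not V8's actual data structure; checking only the most recent entry is one plausible de-duplication strategy):

```javascript
// Naive: every store appends an entry, so a loop writing one slot a
// thousand times floods the buffer with identical entries.
function insertAll (slots) {
  var buffer = []
  for (var s of slots) buffer.push(s)
  return buffer.length
}

// De-duplicated: skip the insert when the slot matches the most recent
// entry, collapsing consecutive repeated stores to the same slot.
function insertDeduped (slots) {
  var buffer = []
  for (var s of slots) {
    if (buffer.length === 0 || buffer[buffer.length - 1] !== s) {
      buffer.push(s)
    }
  }
  return buffer.length
}

var sameSlot = new Array(1000).fill(0xdead)
console.log(insertAll(sameSlot))     // 1000 entries
console.log(insertDeduped(sameSlot)) // 1 entry
```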

Store buffers are not prominent:

It's something else.
I opened a new bug for the issues in #17 and #20: .

Ulan, can we close this one?

Comment 22 by, Aug 8 2017

Status: Fixed (was: Assigned)
