New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 667370 link

Starred by 9 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 2
Type: Feature

Blocking:
issue 793374



Sign in to add a comment

[css-contain] Poor layout performance even with CSS containment

Reported by a...@scirra.com, Nov 21 2016

Issue description

UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.0 Safari/537.36

Steps to reproduce the problem:
1. Visit demo at: https://www.scirra.com/labs/bugs/containlayoutperf/
2. Click a list item near the top to play a collapse animation (which reduces its height to 0px over time)
3. Observe performance (measure in timeline to see layout timings etc.)

What is the expected behavior?
The list contains a few thousand variable-height items, all using contain: content. This should guarantee to the browser that if one of the items changes, it doesn't need to do any layout for any of the items other than the one that changed, except maybe to shunt up items beneath the item that has changed height. It should be able to do this quickly because it doesn't need to look inside any of the other list items, only work out what their new offset is.

So a simple offset of a few thousand items seems reasonable to assume will be <5ms layout time.

What went wrong?
The performance is very poor with items near the top of the list. Chrome's timeline indicates ~30ms layout times. Presumably Chrome is doing a full layout of all the items beneath the changed item. At the bottom of the list, it's faster, but still has long "Update layer tree" times (not sure what that is?)

Firefox is able to smoothly animate items anywhere in the list, and its own profiler shows ~1ms layout times on my machine at both the top and bottom of the list.

Did this work before? N/A 

Does this work in other browsers? Yes

Chrome version: 56.0.2924.0  Channel: canary
OS Version: 10.0
Flash Version: Shockwave Flash 24.0 r0

This makes it hard to have long lists in Chrome which perform well.

The issue also seems to reproduce if you set a fixed height for the list items and use strict containment.
 

Comment 1 by e...@chromium.org, Nov 21 2016

Cc: cbiesin...@chromium.org robho...@gmail.com e...@chromium.org msten...@opera.com
Labels: -Type-Bug Type-Feature
Status: Available (was: Unconfirmed)

Comment 2 by msten...@opera.com, Nov 21 2016

We lay out all siblings that come after the block clicked. It gets marked for layout here:

  if (!child.needsLayout()) {
    if (newLogicalTop != oldLogicalTop && child.shrinkToAvoidFloats()) {
      // The child's width is affected by adjacent floats. When the child shifts
      // to clear an item, its width can change (because it has more available
      // width).
      layoutScope.setChildNeedsLayout(&child);

in LayoutBlockFlow::positionAndLayoutOnceIfNeeded(). shrinkToAvoidFloats() is true because .listItem is overflow:hidden (and also because of contain:content), and therefore establishes a block formatting context. Width is auto, so they will be affected by adjacent floats. Not that there are any floats at all in the test document, so I suppose we could check for that as well.

Comment 3 by msten...@opera.com, Nov 21 2016

Probably not that simple. If we have floats and then remove them, how can the blocks tell if they used to be adjacent to floats or not?

Different thought:

One problem is that Blink calculates logical width as part of laying out an object, so that, in order to calculate the logical width, we have to mark it for layout and then lay out, and, even if it turns out that the width remained unchanged, we still need to lay out the contents. So we generally end up laying out one level too deep. Pretty devastating in this case.

Comment 4 by e...@chromium.org, Nov 21 2016

If it's too hard to fix now we should at the very least make sure that this is fast in LayoutNG.

Comment 5 by a...@scirra.com, Feb 7 2017

Is there anything that can be done about this in the short term? This issue makes it hard to build scalable web apps.

Comment 6 by msten...@opera.com, Feb 7 2017

Using wrapper blocks should in theory work around the sub-optimal performance in Blink.

Instead of this:

<div id="listContainer">
    <div class="listItem">, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate</div>
    <div class="listItem"> tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. E</div>
    ...

Try this:

<div id="listContainer">
    <div>
        <div class="listItem">, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate</div>
    </div>
    <div>
        <div class="listItem"> tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. E</div>
    </div>
    ...

Or this:

<div id="listContainer">
    <div class="listItem">
        <div>, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate</div>
    </div>
    <div class="listItem">
        <div>tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. E</div>
    </div>
    ...

Comment 7 by a...@scirra.com, Feb 7 2017

I used your first workaround here: https://www.scirra.com/labs/bugs/containlayoutperf-workaround/

It seems to help a lot, but during the animation on a high-end dev machine, I still get per-frame timings like this:

3ms layout
3.3ms update layer tree
1.2ms hit test
8.3ms update layer tree
2.4ms paint

Some frames "update layer tree" takes 11.3ms. So it seems even with the workaround, it is almost completely consuming the frame budget to run the animation. This basically means web apps with large variable content have to disable animations, which is the workaround we are currently using!

Comment 8 by msten...@opera.com, Feb 7 2017

Is performance satisfactory if you animate one of the elements near the end of the container?

Comment 9 by a...@scirra.com, Feb 7 2017

It's still not great animating near the end - per-frame timings look like this:

2.3ms layout
6.9ms update layer tree
2ms paint

Still over 50% the frame budget on a high-end desktop.
OK, I think this proves that it wouldn't be enough to fix what I pointed out in #c3 (which is basically what you worked around in #c7). It's probably too expensive to simply examine and walk past (even if we don't lay them out) all previous siblings that don't need layout. Maybe some fast-paths for containment are possible.

Comment 11 by a...@scirra.com, Feb 13 2017

I verified this in the production version of our webapp: using the nesting workaround makes the layout time fast, but does not improve the "update layer tree" time.

So really the problem is not layout performance, it looks like O(n) time for "update layer tree" when making layout changes even when CSS containment is used.

IMO this is a bug, not a feature.

Comment 12 by msten...@opera.com, Feb 13 2017

Cc: pdr@chromium.org
Regarding #c7, using about://tracing for "Rendering", it seems that FrameView::invalidateTree() (the actual name of the function in the source code is FrameView::invalidateTreeIfNeeded()) gets most of the blame. Maybe the paint team has an idea. I think it's safe to say that this test causes an unusually high number of PaintLayer objects, and maybe we don't handle that too well?

Comment 13 by msten...@opera.com, Feb 14 2017

Components: Blink>Paint
I tested with my own build of Chrome on ToT now. The results should be comparable to what's in the dev-channel.

If I run chrome://tracing with the test case in #7, record a new trace and select "Rendering", and click on an element near the bottom of the document, and then stop recording afterwards, I can see that a lot of time is spent in FrameView::prePaint() (3 or 4 times more than what's spent in layout). It seems to just visit every sibling in the tree, which is a lot!

As an experiment, I removed some code to be able to skip most of the siblings here, just to point out what causes the slowness: https://codereview.chromium.org/2689213004

If I apply that patch, the time spent in pre-painting is ~negligible (one tenth of what's spent in layout). But only if I click on a block near the end of the document. If I click on some block near the top, we end up visiting all the siblings that follow the clicked block in pre-painting, it seems. Most of this work is done on objects that are way below the viewport (so they should be rather uninteresting).

So we need improvements on the paint side to get anywhere here.

Comment 14 by a...@scirra.com, Feb 14 2017

I feel like there are two issues here:

1) layout time takes unnecessarily long unless you add an extra nested element
2) "update layer tree" time takes unnecessarily long (apparently visiting every element in tree), with no known workaround (?)

Should there be a second issue filed to track these separately?

Comment 15 by msten...@opera.com, Feb 14 2017

I think one bug report is enough for now. I also think we have two problems to fix. Fixing the paint problem ("update layer tree" or FrameView::prePaint()) would help a lot, but we most likely need improvements on the layout side as well, even if the block wrapper work-around is acceptable for you. I measured layout to take about 3ms on my relatively powerful workstation, but that probably translates to something much bigger on typical phones, etc.

Comment 16 by a...@scirra.com, Feb 14 2017

I am also seeing issues with poor "hit test" performance, and I filed a separate issue for that: https://bugs.chromium.org/p/chromium/issues/detail?id=692050

Comment 17 by a...@scirra.com, Feb 14 2017

Also FWIW, these mentioned issues are the #1 performance problem we've had when porting a ~250k LOC Windows desktop game development IDE to the browser - you can see what the webapp we're developing looks like here: https://www.scirra.com/blog/184/a-first-look-at-construct-3

Comment 18 by msten...@opera.com, Feb 15 2017

Cc: wangxianzhu@chromium.org
Cc: chrishtr@chromium.org
Labels: M-58
@chrishtr: TL;DR: CSS containment (along with some change) causes full subtree walk in PrePaint. mstensho@opera.com tested with https://codereview.chromium.org/2689213004 to show the performance impact (#13).
For painting, the measurement is FrameView::invalidateTree with --disable-slimming-paint-invalidation, and FrameView::prePaint with --enable-slimming-paint-invalidation (this is default now, but we have experiment for 20% users with it disabled). Filed  bug 692614  for the regression of FrameView::prePaint compared to FrameView::invalidateTree.
What I observed seems much worse than the measurements in the previous comments:

layout: 85ms
invalidateTree: 18ms  (with --disable-slimming-paint-invalidation)
prePaint: 24ms (with --enable-slimming-paint-invalidation)
paint: 4ms

See the traces in  bug 692614 .

I tried on a powerful machine on Linux, with official build and ToT release build (without DCHECK).

Am I missing anything?

Comment 22 by msten...@opera.com, Feb 15 2017

Did you use the test in #c7? Because, the original test in the description field demonstrates some additional layout performance problems.
Just tried the test in #c7. Layout time reduced dramatically, with other numbers unchanged. Perhaps I need to check my 'powerful' machine.

Raised priority of  bug 692614 .
contain: content does cause a subtree walk in PrePaint due to change of clip. However, even
if it was contain: layout (which has no clip), there is still a subtree walk, basically because
the visual rects need to be updated for all DOM elements on the page.

Digging some more now to see how we can avoid some or most of that work.
Visual rects & paint offsets for a DOM element are the positions and pixel
bounds on screen for what it draws.

We use these to determine invalidations for the display list output and pixels
on screen when the elements move.

However, there are changes coming in 2018 which should allow us to avoid doing
work proportional to more than the set of elements with contain:content (i.e. avoid
subtrees below them).
Owner: chrishtr@chromium.org
Status: Assigned (was: Available)
Blocking: 793374
Cc: -msten...@opera.com mstensho@chromium.org
Cc: -cbiesin...@chromium.org ojan@chromium.org
See "Different thought" in Comment 3 for improvements in layout. Looks like further investigation here showed that improvements are needed on the painting side as well, though.
Cc: r...@igalia.com
Owner: cbiesin...@chromium.org
Project Member

Comment 32 by bugdroid1@chromium.org, Dec 11

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/234c90912ed41094328ae5cfa98ec846fc30721a

commit 234c90912ed41094328ae5cfa98ec846fc30721a
Author: Christian Biesinger <cbiesinger@chromium.org>
Date: Tue Dec 11 01:46:59 2018

Add perf test for contain: contents

Adds the testcase from crbug.com/667370 as a perf test, so we can track
improvements to it.

R=chrishtr@chromium.org

Bug: 667370
Change-Id: I5b9a4854b21de4fc26a5f515154a93d9148890ef
Reviewed-on: https://chromium-review.googlesource.com/c/1357543
Reviewed-by: Chris Harrelson <chrishtr@chromium.org>
Commit-Queue: Christian Biesinger <cbiesinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#615374}
[add] https://crrev.com/234c90912ed41094328ae5cfa98ec846fc30721a/third_party/blink/perf_tests/layout/contain-content-style-change.html

Nice! And LayoutNG is already 3 times faster than legacy, most likely because NG has already addressed what comment #3 points out: Legacy layout calculates the inline size (logical width) as part of layout. NG, on the other hand, checks up front if the inline size is going to change, and, if not, we'll use the previously cached result.
Nice! change-text-css-contain is still 90% slower though :(
> change-text-css-contain is still 90% slower though :(

Yes I know, this is because the optimization I implemented only works on legacy layout.
I've been experimenting with porting it to LayoutNG too but I didn't manage to do it so far, I need to talk more with koji to look for a possible solution.
Yeah, we'll likely need a different approach for NG and it'll be needed to get good editing performance as well.
Summary: [css-contain] Poor layout performance even with CSS containment (was: Poor layout performance even with CSS containment)
> Yeah, we'll likely need a different approach for NG and it'll be needed to get good editing performance as well.

I reported a new bug #914744 for this particular optimization.

Sign in to add a comment