New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 878336 link

Starred by 2 users

Rendering time (compared to Chrome 66) increased by 30% on Chrome 68 and by 100% on Chrome Beta 69

Project Member Reported by atul.mog...@synerzip.com, Aug 28

Issue description

Chrome Version: 
  1. 68.0.3440.106 (Official Build) (64-bit)
  2. 69.0.3497.57 (Official Build) beta (64-bit)

OS: Ubuntu 14.04

What steps will reproduce the problem?
(1) Open poc.html in Chrome 66.0.3359.181 (Stable) and note the timing printed on developer console.
(2) Open poc.html in Chrome 68.0.3440.106 (Stable) and note the timing printed on developer console.
(3) Open poc.html in Chrome 69.0.3497.57 (Beta) and note the timing printed on developer console.

What is the expected result?
  The rendering timing on Chrome 66, Chrome 68 and Chrome 69 should be more or less same.

What happens instead?
  1. The rendering time on Chrome 68 (Stable) has been increased by about 30%
  2. The rendering time on Chrome 69 (Beta) has been increased by about 100%

Additional Information -
  The average time after opening poc.html 20 times on Chrome 66 was 11.04 seconds.
  The average time after opening poc.html 20 times on Chrome 68 was 14.55 seconds.
  The average time after opening poc.html 20 times on Chrome 69 was 20.03 seconds.

  Similar behavior is observed on Windows and Mac OS.
 
poc.html
1.9 KB View Download
Cc: nagmani....@synerzip.com
Labels: Needs-Bisect
Cc: phanindra.mandapaka@chromium.org
Labels: Needs-Feedback Triaged-ET
atul.moglewar@ Thanks for the issue...

As per comment #0, Tried to this issue on reported chrome and latest chrome stable using Ubuntu 14.04. Attaching screenshot for reference.  
Steps:
-----
1. Launched chrome
2. Downloaded given poc.html file and opened in new tab
As we are seen different render values from M-66 to latest chrome 71.0.3543.0 and also noted the time
M-66: 15 seconds , M-68: 18 seconds, M-69 Stable: 18 seconds, M-70 Dev : 18 seconds and M-71 : 19 seconds as per screenshot.

@Reporter: Could you please check the attached screenshots and please let us know if anything missed from our end and also can you please upgrade to latest chrome stable 69.0.3497.81, you can download latest chrome builds here:" https://www.chromium.org/getting-involved/dev-channel ". Let us know whether issue still persists. If possible can you please provide us to expected render time/value.

Thanks..!
878336-1.PNG
241 KB View Download
878336-2.PNG
227 KB View Download
878336-3.PNG
120 KB View Download
The steps you did are correct, nothing else are required. 

I work on application which create large number (more than 100 thousands) of dynamic nodes, attach it to documentfragment and documentfragment is later attached to dom. Our test report is showing big difference in rendering time between 66 and later versions. The poc.html simulates the same behavior as our app (by attaching 20K dynamic nodes). If you increase number of nodes in poc.html, rendering time increases drastically.

As per the screen shot 878336-1.png, the time taken on Chrome 66 is 13 seconds and Chrome 68 is 19 seconds. So the rendering has increased by 46.15%. Also I would request you to refresh the page 20-30 times and take the average rendering time.

I just expect the rendering time on Chrome 66, Chrome 68 and Chrome 69 should be more or less same.

PS: Issue still persist on latest chrome version (69.0.3497.81). 
Labels: -Needs-Feedback
Feedback has been provided, so removing the label.
Labels: -Needs-Bisect -FoundIn-67 RegressedIn-67 Target-70 Target-71 M-71 Needs-Triage-M69 FoundIn-71 FoundIn-70 hasbisect Target-69 FoundIn-69
Owner: samans@chromium.org
Status: Assigned (was: Untriaged)
Able to reproduce the issue on the reported chrome 69.0.3497.57,latest chrome 71.0.3562.0 using Windows 10, Mac 10.13.6 and Ubuntu 14.04.
As the good and bad builds seen on dot builds hence providing Manual Bisect information for same.

Bisect Info:
================
Good build: 67.0.3396.20
Bad build:  67.0.3396.22

CHANGELOG URL:
https://chromium.googlesource.com/chromium/src/+log/67.0.3396.20..67.0.3396.22?pretty=fuller&n=10000

Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1010702

Saman Sami:Please confirm the issue and help in re-assigning if it is not related to your change.

NOTE:Issue is showing as never visited, Hence assigning to Reviewer  Saman Sami

Thanks..!
Components: -Blink Internals>Compositing
Owner: chrishtr@chromium.org
The bisect doesn't seem right. Reassigning to myself to re-triage.
Labels: Hotlist-Partner-GSuite
Owner: masonfreed@chromium.org
Hi Mason, could you redo the bisect on this one?
Cc: chrishtr@chromium.org
Owner: mlippautz@chromium.org
Ok, I gave this a shot. It is hard to bisect, as the POC takes 10-20 seconds, and the run time varies quite substantially from run to run. In addition, the first few runs seem to take 30-50% longer, and then the times go down. So you have to run it 3 times before you really look at the numbers. With all of that uncertainty, here's what I got.

My timings of about 3 runs, after "warming up" for 3 runs:
530009 - 8-10 seconds (good)
557405 - 9-11 seconds (good)
570418 - 9-11 seconds (good)
572726 - 11-13 seconds (good)
572751 - 10-11 seconds (good)
572763 - 10-11 seconds (good)
572771 - 10-11 seconds (good)
572774 - 10-12 seconds (good)
572775 - 16-18 seconds (bad)
572777 - 15-18 seconds (bad)
572833 - 15-18 seconds (bad)
572908 - 15-19 seconds (bad)
573168 - 15-17 seconds (bad)
573769 - 15-18 seconds (bad)
575177 - 15-17 seconds (bad)
579998 - 16-18 seconds (bad)

You are probably looking for a change made after 572774 (known good), but no later than 572775 (first known bad).
CHANGELOG URL:
  https://chromium.googlesource.com/chromium/src/+log/b32131c2cdc3375cf384419fc6ddaae78c0c9e16..41fabafdeff26738bdc6cf88ab15ab4c9e08e84c

Reassigning to mlippautz@chromium.org based on that CL.



Cc: hpayer@chromium.org
Labels: -Pri-1 Pri-3
The benchmark (poc.html) in #0 only creates live memory. The CL mentioned does affect sweeping of unused memory, so for an single run of the benchmark this should not matter.

However,  I assume the benchmark is repeated by just reloading the page?

In this case, garbage collections will trigger and throw away the old benchmarking state.

Now, the CL could could make sense as it fixes a bug where the GC held alive non-used memory for too long. In that case, the benchmark was obviously running faster as it could just reuse that memory that was lying around.

It seems like you are measuring artifacts of previous runs here.
Components: -Internals>Compositing Blink>JavaScript>GC Blink>Infra
Ping on this bug. mlippautz are you saying you can't reproduce the issue, or that
you're guessing the bisect is incorrect? Could you try to do it again and see if
you get the same results?
Labels: Restrict-View-Google
Labels: -Restrict-View-Google
I am saying the the regression is reproducible but that the benchmark is potentially not testing what it is supposed to test.

A single run of that benchmark will only generate live memory and thus my CL is not affecting its performance.

I was asking whether "reloading" means pressing the reload button in the browser. In that case my CL can have an effect because we would previously pool *all* memory for another GC cycle. We considered this as a bug and are now returning memory to the OS. We may implement *some* pooling in future.

If this benchmark represents the application than only the first load should be considered as representative in which case my CL would have no effect.
Cc: -hpayer@chromium.org mlippautz@chromium.org
Components: -Blink>JavaScript>GC
Labels: -hasbisect Needs-Bisect
Owner: ----
Status: Untriaged (was: Assigned)
After some offline clarifications it is clear that the benchmark tries to measure *initial load* (cold load). So warming up and ignoring the first runs is not the right way to run the repro.

I quickly checked locally with before and after my CL and the scores for initial load were:
With CL: [75228:75228:1213/151131.237206:INFO:CONSOLE(49)] "Content rendered on screen after 17074"
Without CL: [83613:83613:1213/151446.782643:INFO:CONSOLE(49)] "Content rendered on screen after 17690"

It also makes sense that subsequent, warmed up, loads are faster without my CL as they can completely reuse *all* of the memory that was left behind. That was a bug though.

For initial load my CL is a noop.

Please bisect using only initial load.
Owner: masonfreed@chromium.org
Status: Assigned (was: Untriaged)
-> masonfreed for redo of bisect.
I will redo the bisect. My original bisect (roughly) followed the instructions in c#4, to "refresh the page 20-30 times". So I will now launch a fresh Chrome and time it once, then redo that multiple times per revision. Due to the time this will take, this bisect might take me a few days to complete, just FYI.
This is a tough one to bisect. First, the times are quite variable within a single version of Chrome - there is a roughly 4.3% coefficient of variation. Second, there seem to be perhaps two changes causing the issue. Prior to about revision 543034, the numbers (on my fairly fast Linux desktop) are around 11300. After about revision 545559, the numbers are around 14900. Between those two revisions, there seems to be a middle-range centered around about 13100. I haven't found the breakpoints between 11300 and 13100, or between 13100 and 14900. Or even made myself 100% sure that there really are two changes causing this.

For all of the raw data and some plots, see this document: https://docs.google.com/spreadsheets/d/1ec-Ok_KKlAdd-N8CBhk4obp67J610MxkyXxR_2qDB88/edit?usp=sharing

I will continue to revise the bisect, but for now, I think we're looking pretty confidently at something within this range:

You are probably looking for a change made after 543034 (known good), but no later than 545559 (first known bad).
https://crrev.com/543034..545559

That's still a lot of revisions. I'll keep digging.
In case anyone is also looking at this - here is a slightly modified poc.html that pops up the result instead of requiring you to open devtools.
poc.html
1.9 KB View Download
Cc: wangxianzhu@chromium.org pdr@chromium.org
Ok, there are definitely two revisions causing this slowdown. I believe I have correctly bisected the first of them down to this:

https://chromium.googlesource.com/chromium/src/+/3a3c78a924a686ed0d3f90d765b00cdd78453e11

Before that revision, the poc takes an average of 11044ms. After that revision, it takes 12830ms, an increase of 16%.
The "intermediate-to-slow" transition is much more difficult to locate. Above some revision (between 545419 and 545429?) the variance seems to go up, which makes it harder to find definite revisions. At this point my guess is somewhere between 545400 and 545559, but that could be wrong.

I've added a lot more data to the Sheets document, please take a look if you're interested.

I've also recorded a trace from a good, intermediate, and bad revision. Please see attached.

It is likely worthwhile seeing what the breakdown is between a) creating the div elements and attaching them to the rootEl document fragment, and b) appending that document fragment to the document and having it render onscreen.  I have not done that, but can, if someone thinks it worthwhile.
trace_revision543283_good.json.gz
3.0 MB Download
trace_revision544688_intermediate.json.gz
4.2 MB Download
trace_revision568703_bad.json.gz
4.5 MB Download
Can we reproduce the 100% regression on M69? I think that one is more important than the 30% regression on M67/M68.
I think the original 30% and 100% numbers were ballpark. Based on my fairly extensive testing (see the spreadsheet link in c#20), I think the first regression is 18%, and the second is an additional 13% (for a total of 33% vs. M66). So actually, the first one is more impactful. And, helpfully, that one does seem to fairly reliably bisect, see c#22.
Labels: Needs-Feedback
NextAction: 2019-01-01
Ok, atul.moglewar@synerzip.com, we think we know what the majority of the difference is between these revisions. The original poc.html adds 520,000 lines of text in the #pane-content-right div. That div is nested inside a div that has "transform: scale(0.9);" style. In M69, we made a change in the way we handle cull rects, which help avoid painting content that is offscreen. For a number of reasons, when an element is nested inside an element containing a transform, we cannot use this optimization, and we instead paint the entire contents of the element. In your case, the child div contains 520,000 lines of text, and the entire thing is being painted. If you instead use the attached poc-fixed.html, the times are roughly the same before and after.

Please take a look and let us know if you agree. Hopefully this gives you more information about how to optimize your code for performance in this case.
poc-fixed.html
1.9 KB View Download
With the optimization suggested, the timings for poc and the actual application are roughly same as that of M66. 

However, the scaling was intentionally applied in the application to support zoom-in and zoom-out. We support 10 levels of Zooms and 0.9 scale is the default level. So removing "transform: scale(0.9);" won't be a solution for me. I would love to see fix from Chrome, since the same html/css structure used to work until M66. But if thats impossible, I will have to see alternative for achieving zoom functionality without using "transform: scale".
I'm glad the timings went back to what you expect when the transform is removed.

It might help us if we knew what the actual application is - then we could brainstorm other approaches. It is possible that you could use CSS zoom (https://developer.mozilla.org/en-US/docs/Web/CSS/@viewport/zoom) or changing the font size and element sizes directly instead of using transform:scale. Or using Javascript to only append a more limited set of nodes instead of all 520,000 at the same time? These are just suggestions, I haven't tried them to make sure they work. At this point, I am doubtful that we will be able to re-implement the cull-rects inside of the transform node, sorry.
Alright, I will explore other approaches. 

The application is QuickSheet from the "Office Editing for Docs, Sheets & Slides"
chrome extension, available at https://chrome.google.com/webstore/detail/office-editing-for-docs-s/gbkeegbaiigmenfmjfclcdgdpimamgkj
Status: WontFix (was: Assigned)
Nice extension.

Ok, I'm guessing the feature is the zoom in/out buttons shown in the attached image? If so, I can see how you're using that - makes sense. Perhaps it would "fix" most of the problem if you set the default zoom to 1.0 instead of 0.9? I would guess that a majority of users don't change the zoom level often, so at least this would reduce the impact. Otherwise, I can see how this might be some work to overcome. It seems like the best bet would be to reduce the amount of data added to the document at one time, though that obviously wouldn't be easy.

One other comment - I'm not sure how, but it appears that your extension disables the effect of the standard browser zoom feature. CTRL + and CTRL - do not have any effect on your page. Perhaps you could remove the custom zoom in/out buttons and just re-enable the standard browser zoom? That feature should not affect speed in the same way that the transform:scale does, so it would also fix your problem. Just a thought.
Screenshot-quicksheet.png
32.7 KB View Download
Yes, the feature is zoom in/out buttons shown in the attached image. Setting the default zoom to 1.0 is a good suggestion, though PM has to approve it.

If I'm not wrong, Browser's default zoom will zoom in/out the entire html document, whereas we zoom in/out only the grid part.
Ok, hopefully the 1.0 zoom helps.

Yes, you're right, the browser zoom would zoom the entire page. But... that is also typical browser behavior that users are accustomed to, at least if they use that feature. So zooming everything might not be so bad? It would definitely be faster, in the zoomed case.

All the popular spreadsheet application including Google Sheet, zooms in/out just the grid area, so QuickSheet zooming in/out the entire application area would be a deviation.

Since performance of zoom in/out (using transform:scale) after the initial load is not that bad, I think best bet would be not to apply any scale by default. That way 
  - I won't be doing many changes in the code 
  - No performance degrade
  - Zoom in/out will work as expected. 
The NextAction date has arrived: 2019-01-01

Sign in to add a comment