Issue metadata
Sign in to add a comment
|
UseModernMediaControls makes Wikipedia's first paint slower on Chrome 67
Reported by
phedens...@wikimedia.org,
Jun 3 2018
|
||||||||||||||||||||
Issue descriptionUserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36 Steps to reproduce the problem: Wikipedia seems to render a little but slower on 67 than on 66, I've seen this on multiple synthetic environments and waiting on getting more users on 67 to also see it RUM metrics. 1. Access for example https://en.wikipedia.org/wiki/Barack_Obama (I can see this on many pages) with Chrome 67 2. Do the same with Chrome 66 and collect visual metrics and compare them. In our lab environment we can see that First Visual Change, Speed Index and first paint (collected directly from Chrome) is affected. I'm pretty sure this is a Chrome issue, because I can switch between just the browser version and see the metrics change. I've seen this now both using Browsertime, sitespeed.io and WebPageTest. Since I can see it in different tools, changing the browser back and forth, I'm pretty sure the change is isolated to the Chrome version. I've documented the difference at Github https://github.com/sitespeedio/sitespeed.io/issues/2069 and on https://phabricator.wikimedia.org/T196242 For WebPageTest we can see the diff is 100 ms, in our isolated environment (using Chrome and WebPageReplay) the diff is 33 ms on most pages (that's the lowest granularity we have). Checking the differences just using first paint (collected from Chrome) the diff is ca 50 ms. I also run tests on a couple of other sites but I could only see Wikipedia affected. But maybe something changed so this is expected, let me know if that is the case! What is the expected behavior? The render time shouldn't differ between browser versions. What went wrong? The latest Chrome 67 renders a little bit slower than Chrome 66. Did this work before? N/A Does this work in other browsers? Yes Chrome version: 66.0.3359.139 Channel: n/a OS Version: OS X 10.13.4 Flash Version: At the moment we have so little traffic from users on Chrome 67 so we cannot see a diff in the RUM data, you can look yourself: https://grafana.wikimedia.org/dashboard/db/navigation-timing-by-browser?refresh=5m&orgId=1&var-metric=firstPaint&var-browserFamily=Chrome&var-browserVersion=66&var-browserVersion=67&var-m=median
,
Jun 3 2018
We also keep raw logs from Chrome. This is when we test https://en.wikipedia.org/wiki/Facebook with 67: http://webpagereplay-wikimedia.s3-website-us-east-1.amazonaws.com/?prefix=enwiki/desktop/chrome/100/Facebook/2018-06-03-15-41/ And then with 66: http://webpagereplay-wikimedia.s3-website-us-east-1.amazonaws.com/?prefix=enwiki/desktop/chrome/100/Facebook/2018-06-03-14-24/ When I look through the only big difference I could see is that in 67 time spent in UpdateLayoutTree is larger. It seems to spend 100 ms more there in 67 than on 66. But I could have missed something, it's quite much raw data :)
,
Jun 3 2018
,
Jun 4 2018
I've look into it some more and I don't see the same pattern on all URLs (some are a little faster too). There's a summary here: https://phabricator.wikimedia.org/T196242#4252705 The change I could see in the events is that more time is spent for all URLs on UploadLayerTree (100 ms max) and a little more on BlinkGCMarking (30 ms). Else everything looks the same.
,
Jun 4 2018
This seems likely due to the Slimming Paint V1.75 change. Do we have any insight on whether this is a known regression offset by other progression?
,
Jun 4 2018
,
Jun 4 2018
Thanks phedenskog@wikimedia.org for the report. Just a question about the summary in https://phabricator.wikimedia.org/T196242#4252705: In the last 3 graphs, what do the three numbers (e.g. "UpdateLayoutTree 192ms 296ms 291ms" in the last graph) in each row mean? Are they CPU duration numbers before and after the time? I looked into the traces, and put the numbers into a spreadsheet (https://docs.google.com/spreadsheets/d/1GmLaVp6-bn_YP4Bxf5G69dZDq_vV6sVPzoeH9WRUDMo/edit?usp=sharing), summarized as below: 66 67 Layout 362 470 +29.8% UpdateLayoutTree 278 382 +37.3% UpdateLayerTree 138 103 -25.7% Based on the result, M67 actually has progression for UpdateLayerTree (which is expected because SPv175 reduces PrePaint work). The increase of Layout and UpdateLayoutTree is unexpected. I also noticed that we seem to schedule animation and layout differently. On M67 we schedule more animation and layout. See the attached screenshots. I suspect the changes of Layout and UpdateLayoutTree are caused by some reason other than SPv175. I'm looking into whether we can bisect this.
,
Jun 4 2018
Ah sorry thsi is a little bit unclear: > In the last 3 graphs, what do the three numbers (e.g. "UpdateLayoutTree 192ms 296ms 291ms" in the last graph) in each row mean? Are they CPU duration numbers before and after the time? Yep the blue vertical line is when I did the change to 67 (from 66). We run Chrome with -*,devtools.timeline and we summarize time spent per event type.
,
Jun 4 2018
Can I interpret the following line UpdateLayoutTree 192ms 296ms 291ms as: 192ms for 66 296ms when changed to 67 291ms for 67 ? The numbers of UpdateLayerTree look at the same scale as my statistics (see #c7), while UpdateLayerTree looks different. I see progression of UpdateLayerTree. Are we using the different sets of traces? My method is: open a trace, select all the captured events, and look at the summary of slices. I did this for all of the 11x2 traces for m66 and m67 in #c2.
,
Jun 4 2018
Now I believe this is a scheduling issue. The increased Layout and UpdateLayoutTree times are both because of an extra event happens after an Animation, respectively.
(ULT = UpdateLayoutTree)
M66:
ULT-138ms Layout-273ms
(~same)
M67 (new events marked '(+)'):
Animation(+) Animation(+)
ULT-129ms Layout-288ms Layout-57ms(+)
(~same) ULT-77ms(+)
Based on the above behaviors, I bisected to
You are probably looking for a change made after 542547 (known good), but no later than 542562 (first known bad).
CHANGELOG URL:
https://chromium.googlesource.com/chromium/src/+log/c06159611876d8b4bb1ea997e5647f8ba888ea60..89f128f6486e5efe7e8f44ebf8cddcdfad627cd3
My test method during bisect is:
1. Open https://en.wikipedia.org/wiki/Facebook (listed in the bisect.py command line);
2. Open DevTools
3. Switch to Performance panel, and click Record
4. Reload the page and wait for completion
5. Stop recording in DevTools
6. Expand "Interactions" and "Animation" below "Interactions"
7. Observe if there are animation events (purple) the blue dashed vertical line
Suepecting https://chromium.googlesource.com/chromium/src/+/f8bf3f03dd8f4b81cb36d21c8af07844de4e9ecb. Could it cause previously throttled frame not throttled?
,
Jun 4 2018
The test page has no iframes in it, so it can't be a throttling issue. The test page also doesn't use IntersectionObserver, so it seems quite unlikely that my change is the cause. Punting back to wangxianzhu.
,
Jun 4 2018
Sorry for punting back, but I locally bisected, and the result still pointed to:[f8bf3f03dd8f4b81cb36d21c8af07844de4e9ecb] [IntersectionObserver] Don't snap bounding rects.
,
Jun 8 2018
Just want say this is really important for us and it would super if you have time to look at it before 67 has fully rolled out. I've checked our metrics and 67 is slowly getting more user (25% of 66+67 right now at least for the users where we collect timing metrics) but it's hard to say if the first paint is affected (it looks that way but I want more users to be sure).
,
Jun 8 2018
szager@ told me offline that code changed in the CL didn't execute for the case. I'll investigate more to figure out the root case of the extra animations.
,
Jun 11 2018
I bisected locally again, and the result still pointed to szager@'s CL. However, just on that revision the changed code is not executed when loading wikipedia. Then I tried from the revision before szager@'s CL, and add one CL each time, then the extra animation appeared after crrev.com/fdfc5ae4154a67e0539b9d0a77b3c5e2da38f40b. So I guess my bisects were influenced by the field trial. I tried --disable-features=UseModernMediaControls on crrev.com/fdfc5ae4154a67e0539b9d0a77b3c5e2da38f40b, and the extra animations didn't appear. Also tried to enable and disable feature on ToT, and the extra animations appear only with the feature enabled.
,
Jun 11 2018
+beccahughes who looked at a controls performance. What's surprising here is that none of the pages I tried (Barack Obama, Facebook and Sweden) have an audio or video element in the DOM. I wonder if there is something created but not added to the tree. If that's the case, on top of my head, only the loading of the CSS would be different.
,
Jun 20 2018
Pinging this again, is there something we can do to help or can we add more info? This is really important for us. Thanks
,
Jun 20 2018
I had a quick look and I could see audio elements on all three of those pages quickly flash with our new media controls and then disappear.
,
Jul 11
To echo what beccahughes said, all of these pages do have media controls. You'll find them at the bottom of the page, below the citations. Other wikis that use a slightly different layout have a much more pronounced version of this. For example, if you look at https://nl.wikipedia.org/wiki/Amsterdam you'll see the same behavior for the audio player on that page, which is on the top right-hand corner of the page just below the search bar. Whether this is the culprit for the ~100ms increase in firstPaint times after Chrome 67 isn't clear, but it would seem to be worth addressing in its own right.
,
Aug 23
,
Sep 28
I've been going through our RUM metrics from when users switched from Chrome 66 to 67 and I think I can see slower first paint with 67. Attaching three images: Our user change from 66 -> 67, median and p75 first paint during that period. |
|||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||
Comment 1 by phedens...@wikimedia.org
, Jun 3 2018