Chrome 69 introduced slower firstPaint on some Wikipedia sites/pages
Reported by phedens...@wikimedia.org, Sep 20
UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36

Steps to reproduce the problem:
Wikipedia renders slower on 69 than on 68; I've seen this in multiple synthetic environments. In some cases it looks like a 30-50% increase, which is substantial.
1. Access for example https://zh.wikipedia.org/wiki/%E5%8C%97%E4%BA%AC%E5%B8%82 with Chrome 69
2. Do the same with Chrome 68, collect first paint for both, and compare them

You can also test https://zh.wikipedia.org/wiki/Facebook or https://zh.wikipedia.org/wiki/%E6%96%AF%E5%BE%B7%E5%93%A5%E5%B0%94%E6%91%A9. We see this on all pages we test on the Chinese wiki, and on the Russian and Japanese wikis too.

In our lab environment we can see that First Visual Change, Speed Index and first paint (collected directly from Chrome) are affected. I'm pretty sure this is a Chrome issue, because I can switch just the browser version and see the metrics change. It's somehow content related, because for some wikis (the English Wikipedia for example) there's no change, but for others there is a change across the board.

I've attached trace logs from running 68 vs 69. I can see that on 69 more time is spent on layout, but I need your help to understand why.

What is the expected behavior?

What went wrong?
First paint increased just by updating the browser.

Did this work before? Yes 68
Does this work in other browsers? N/A

Chrome version: 69.0.3497.100 Channel: stable
OS Version: OS X 10.13.6
Flash Version:
It'd be helpful if you do a bisect as per https://www.chromium.org/developers/bisect-builds-py A random guess would be ICU 62.1 update in r573650.
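For readers unfamiliar with bisecting, here is a minimal sketch of the idea in Python. It is not the bisect-builds.py script itself; `first_bad` and the `is_bad` callback are hypothetical names, and the sketch assumes the revision list is sorted, the first revision is known good, the last is known bad, and the regression persists once introduced.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[int], is_bad: Callable[[int], bool]) -> int:
    """Return the first revision for which is_bad() is True.

    Assumes revisions[0] is good, revisions[-1] is bad, and that every
    revision after the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return revisions[hi]


# Hypothetical revision numbers; in practice is_bad() would run the page
# load test against the build at that revision and check first paint.
culprit = first_bad(list(range(100, 200)), lambda rev: rev >= 157)
```

Each step halves the range, so roughly log2(N) builds need to be tested rather than all N.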
Yeah I see. I'm not sure I can reproduce it locally, but I can try later on. We run the tests on machines where we try to have an isolated environment and run the tests against wiki pages replayed by WebPageReplay. I've checked our RUM data, but so far the rollout of 69 hasn't reached the majority of our users, so it's hard to see anything.
I've checked and I can see this on the French wiki too, and for some of the pages on the Swedish and English Wikipedia; however, the change isn't as big as it is for Chinese/Japanese/Russian.
phedenskog@ Thanks for the issue. Tested this issue on Mac OS 10.13.3 and Windows 10 on the M-68 build 68.0.3440.106 and latest Stable 69.0.3497.100 by following the below steps.
1. Launched Chrome and navigated to https://zh.wikipedia.org/wiki/%E5%8C%97%E4%BA%AC%E5%B8%82.
2. Opened Devtools -> Performance and recorded the performance for ~10 secs.
3. In Summary tab, in M-68 build, Painting is ~360ms, whereas in M-69 build it is ~196ms.
Attached are the screenshots for reference. Could you please check and confirm if this is the issue seen. Also request you to check and confirm if anything is missed from our end in triaging the issue. Thanks.
Hmm, how do you set the connectivity when you test? It's important that you test under the exact same circumstances. How many runs do you do? Could you check the two attached trace files to see if you can spot something there? (I'm no expert, so I'm having a hard time using them.) I can collect more metrics (setting up other trace logs etc. for Chrome), but it's really hard for me to use the bisect tool in our setup.
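Since a single run can be noisy, one way to compare the two versions is to take several runs of each under the same conditions and compare medians. A minimal sketch, assuming first paint values in milliseconds have already been collected per run; `relative_change` is a hypothetical helper and the numbers below are made up for illustration, not measurements from this bug.

```python
from statistics import median


def relative_change(baseline_ms, candidate_ms):
    """Relative change of the median first paint; 0.30 means 30% slower."""
    base = median(baseline_ms)
    cand = median(candidate_ms)
    return (cand - base) / base


# Made-up example values, five runs per browser version.
chrome_68_runs = [400, 410, 395, 405, 420]
chrome_69_runs = [560, 540, 575, 550, 565]
change = relative_change(chrome_68_runs, chrome_69_runs)
print(f"median first paint changed by {change:.0%}")
```

The median is used rather than the mean so that one outlier run does not dominate the comparison.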
Thank you for providing more feedback. Adding the requester to the cc list. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
phedenskog@ Thanks for the update. As per comment #7, the attached trace files need to be checked, which is out of scope for triaging at the TE end. Hence adding the 'TE-NeedsTriageHelp' label and requesting the appropriate team to look into the issue and help in further triaging. Thanks.
Mac triage: marking this directly for Blink>Layout triage.
I've looked at our RUM metrics when users switched from 68 -> 69 and I'm pretty sure it matches what we've seen in our test environment. Attaching three screenshots: How users switched from 68->69 and our median and p75 first paint during that period.
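For anyone wanting to reproduce this kind of comparison from their own RUM data, here is a rough sketch of computing the median and p75 first paint per browser version. The `(version, first_paint_ms)` tuple layout and the `summarize` helper are assumptions for illustration, not Wikimedia's actual pipeline.

```python
from collections import defaultdict
from statistics import median, quantiles


def summarize(samples):
    """Group (chrome_major_version, first_paint_ms) samples by version.

    Returns {version: {"median": ..., "p75": ...}}. Each version needs at
    least two samples for the quartile computation.
    """
    by_version = defaultdict(list)
    for version, ms in samples:
        by_version[version].append(ms)

    summary = {}
    for version, values in by_version.items():
        # quantiles(n=4) returns the three quartile cut points; index 2 is p75.
        p75 = quantiles(values, n=4, method="inclusive")[2]
        summary[version] = {"median": median(values), "p75": p75}
    return summary


# Made-up samples for illustration only.
samples = [(68, ms) for ms in (100, 200, 300, 400, 500)]
samples += [(69, ms) for ms in (150, 250, 350, 450, 550)]
result = summarize(samples)
```

Comparing the per-version summaries over the rollout window avoids mixing the two populations while the share of 69 users is still growing.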
Both the paint and rendering times are significantly higher according to the screenshots in comment 6. We could really use a bisect here.
Maybe we can look into UKM data here - that's what it's for. I'll own it to do that analysis.
Googlers: I believe this bug + doc detail the issue: https://bugs.chromium.org/p/chromium/issues/detail?id=891871 https://goto.google.com/pemsf
So this is WontFix then, because we are measuring different things?
Hmm so you have changed your implementation of first paint? But we still see the same issue when we are measuring from the outside using a video capture (as described in the first comment). Let me know if you need more info.
I never got around to looking at UKM for this, and I suspect we can't because the M-69 release is more than 30 days old. I'm assigning to the speed team in the event they know how to see if this is a measuring artifact or a real change. And maybe who to assign this to for investigation. It comes to mind that maybe it is the change to not block on CSS below the body, because it could take longer to show correctly styled content. Does Wikipedia have CSS outside of the <head>, and is that what your metric is looking for?
Yeah, this is likely to be: https://developers.google.com/web/updates/2018/10/paint-timing-issues It should be fixed in M70, feel free to reopen if not.
Sorry for not getting back.
> Does Wikipedia have CSS outside of the <head>, and is that what your metric is looking for?
I don't think so. But maybe it could be related? I need to go through the URLs where I could spot the biggest diffs. I've talked with Gilles about this and I'm not 100% convinced that it was that paint timing issue. Let me explain: I could see the regression in two different ways of measuring:
1. In RUM data, where we collect data from the paint timing API (that would match https://developers.google.com/web/updates/2018/10/paint-timing-issues)
2. In synthetic tests, where we record a video and analyze it with VisualMetrics (that is not related to the metrics recorded by the paint timing API).
I could spot the regression directly when I updated to 69 in our tests, and since those metrics are unrelated to Chrome's internal paint timing, I think it could be something else. I could easily do runs with different versions if you could guide me on which trace categories I should turn on (if those would help you).
Thanks for the additional details. If you're seeing it in video analysis, you're right, this isn't it. Based on the initial discussion of increased time spent in layout, over to chrishtr@ for triage.