4% Mac page load regression in M49
Issue description
4% overall page load time regression at the median on Mac that's currently rolled out to stable: https://uma.googleplex.com/timeline_v2?q=%7B%22day_count%22%3A%2290%22%2C%22end_date%22%3A%22latest%22%2C%22window_size%22%3A%221%22%2C%22filters%22%3A%5B%7B%22fieldId%22%3A%22channel%22%2C%22operator%22%3A%22COMPARE%22%2C%22study%22%3A%22%22%2C%22selected%22%3A%5B%224%22%5D%7D%2C%7B%22fieldId%22%3A%22platform%22%2C%22operator%22%3A%22COMPARE%22%2C%22study%22%3A%22%22%2C%22selected%22%3A%5B%22M%22%5D%7D%2C%7B%22fieldId%22%3A%22milestone%22%2C%22operator%22%3A%22COMPARE%22%2C%22study%22%3A%22%22%2C%22selected%22%3A%5B%2249%22%2C%2248%22%5D%7D%5D%2C%22histograms%22%3A%5B%22PageLoad.Timing2.NavigationToFirstContentfulPaint%22%5D%2C%22default_entry_values%22%3A%7B%22measureModel%22%3A%7B%22measure%22%3A%22%22%2C%22buckets%22%3A%5B%5D%2C%22percentiles%22%3A%5B%2250%22%5D%2C%22selectedFormulas%22%3A%5B%5D%2C%22allFormulas%22%3A%5B%5D%7D%2C%22zeroBased%22%3Atrue%2C%22logScale%22%3Afalse%2C%22showLowVolumeData%22%3Afalse%2C%22showVersionAnnotations%22%3Atrue%7D%2C%22entries%22%3A%5B%7B%22measureModel%22%3A%7B%22measure%22%3A%22percentile%22%7D%2C%22zeroBased%22%3Afalse%7D%5D%7D Based on timing and characteristics, it seems likely to be related to fa7fc32c5940dfd3d734ed3231b1295da4c3303e. Full list of regressions attributed to that revision can be found at https://chromeperf.appspot.com/group_report?bug_id=564223. According to eae "likely due to a large perf regressions for AAT fonts." Related memory regression in issue 564223.
,
Mar 31 2016
See https://bugs.chromium.org/p/chromium/issues/detail?id=576989#c16 for more details on the AAT performance regression that has been addressed.
,
Mar 31 2016
Confirmed that Linux and Windows get better, and that most of the regressions on the bots are resolved. However, I'm still concerned because PLT doesn't come back down in 50, when that second Harfbuzz roll should fix things. Are there any known issues outstanding past 50? https://uma.googleplex.com/timeline_v2?q=%7B%22day_count%22%3A%2290%22%2C%22end_date%22%3A%22latest%22%2C%22window_size%22%3A%221%22%2C%22filters%22%3A%5B%7B%22fieldId%22%3A%22channel%22%2C%22operator%22%3A%22COMPARE%22%2C%22study%22%3A%22%22%2C%22selected%22%3A%5B%224%22%2C%223%22%5D%7D%2C%7B%22fieldId%22%3A%22platform%22%2C%22operator%22%3A%22COMPARE%22%2C%22study%22%3A%22%22%2C%22selected%22%3A%5B%22M%22%5D%7D%5D%2C%22histograms%22%3A%5B%22PageLoad.Timing2.NavigationToFirstContentfulPaint%22%5D%2C%22default_entry_values%22%3A%7B%22measureModel%22%3A%7B%22measure%22%3A%22%22%2C%22buckets%22%3A%5B%5D%2C%22percentiles%22%3A%5B%2250%22%5D%2C%22selectedFormulas%22%3A%5B%5D%2C%22allFormulas%22%3A%5B%5D%7D%2C%22zeroBased%22%3Atrue%2C%22logScale%22%3Afalse%2C%22showLowVolumeData%22%3Afalse%2C%22showVersionAnnotations%22%3Atrue%7D%2C%22entries%22%3A%5B%7B%22measureModel%22%3A%7B%22measure%22%3A%22percentile%22%7D%2C%22zeroBased%22%3Afalse%7D%5D%7D
,
Mar 31 2016
Yeah that's bad and requires some further investigation. Interesting that it doesn't match the benchmarks. Thanks Ryan!
,
Apr 25 2016
FYI, the numbers for M50 seem to be on par with M48. Yay. I'll keep monitoring to ensure this is the case as M50 rolls out.
,
Apr 26 2016
,
Apr 27 2016
M50 keeps matching M48, so it looks like the fixes in M50 worked and returned performance to M48 levels. Yay.
,
May 3 2016
I'm confused, the numbers here for M50 look worse than M49, if anything. Certainly not back down to M48 levels. Unless I'm looking at the wrong thing?
,
May 3 2016
I think this graph is a little bit easier to understand: https://uma.googleplex.com/timeline_v2?q=%7B%22day_count%22%3A%22180%22%2C%22end_date%22%3A%22latest%22%2C%22window_size%22%3A%221%22%2C%22filters%22%3A%5B%7B%22fieldId%22%3A%22channel%22%2C%22operator%22%3A%22COMPARE%22%2C%22study%22%3A%22%22%2C%22selected%22%3A%5B%224%22%5D%7D%2C%7B%22fieldId%22%3A%22platform%22%2C%22operator%22%3A%22COMPARE%22%2C%22study%22%3A%22%22%2C%22selected%22%3A%5B%22M%22%5D%7D%2C%7B%22fieldId%22%3A%22milestone%22%2C%22operator%22%3A%22COMPARE%22%2C%22study%22%3A%22%22%2C%22selected%22%3A%5B%2250%22%2C%2249%22%2C%2248%22%5D%7D%5D%2C%22histograms%22%3A%5B%5B%22PageLoad.Timing2.NavigationToFirstContentfulPaint%22%5D%5D%2C%22default_entry_values%22%3A%7B%22measureModel%22%3A%7B%22measure%22%3A%22%22%2C%22buckets%22%3A%5B%5D%2C%22percentiles%22%3A%5B%2250%22%5D%2C%22selectedFormulas%22%3A%5B%5D%2C%22allFormulas%22%3A%5B%5D%7D%2C%22zeroBased%22%3Atrue%2C%22logScale%22%3Afalse%2C%22showLowVolumeData%22%3Afalse%2C%22showVersionAnnotations%22%3Atrue%7D%2C%22entries%22%3A%5B%7B%22measureModel%22%3A%7B%22measure%22%3A%22percentile%22%7D%2C%22zeroBased%22%3Afalse%7D%5D%7D The early adopters of M50 were on M48 level, but as soon as the bulk of people moved over it quickly went back up to M49 levels.
,
May 3 2016
,
May 4 2016
Restricting it to just M49 Beta and Stable https://uma.googleplex.com/timeline_v2?q=%7B%22day_count%22%3A%22180%22%2C%22end_date%22%3A%22latest%22%2C%22window_size%22%3A%221%22%2C%22filters%22%3A%5B%7B%22fieldId%22%3A%22channel%22%2C%22operator%22%3A%22COMPARE%22%2C%22study%22%3A%22%22%2C%22selected%22%3A%5B%224%22%2C%223%22%5D%7D%2C%7B%22fieldId%22%3A%22platform%22%2C%22operator%22%3A%22COMPARE%22%2C%22study%22%3A%22%22%2C%22selected%22%3A%5B%22M%22%5D%7D%2C%7B%22fieldId%22%3A%22milestone%22%2C%22operator%22%3A%22COMPARE%22%2C%22study%22%3A%22%22%2C%22selected%22%3A%5B%2249%22%5D%7D%5D%2C%22histograms%22%3A%5B%5B%22PageLoad.Timing2.NavigationToFirstContentfulPaint%22%5D%5D%2C%22default_entry_values%22%3A%7B%22measureModel%22%3A%7B%22measure%22%3A%22%22%2C%22buckets%22%3A%5B%5D%2C%22percentiles%22%3A%5B%2250%22%5D%2C%22selectedFormulas%22%3A%5B%5D%2C%22allFormulas%22%3A%5B%5D%7D%2C%22zeroBased%22%3Atrue%2C%22logScale%22%3Afalse%2C%22showLowVolumeData%22%3Afalse%2C%22showVersionAnnotations%22%3Atrue%7D%2C%22entries%22%3A%5B%7B%22measureModel%22%3A%7B%22measure%22%3A%22percentile%22%7D%2C%22zeroBased%22%3Afalse%7D%5D%7D the regression takes off after 49.0.2623.75 Beta. The change list https://chromium.googlesource.com/chromium/src/+log/49.0.2623.63..49.0.2623.75?pretty=fuller&n=10000 includes the cherry-picked IOSurface change that I've suspected in the other long regression thread. Issue 606850 tracks an experiment to disable that change for some users, so let's see if that makes a difference with this metric as well.
,
May 4 2016
,
May 17 2016
Can we get an owner for this perf regression? This probably should have blocked M50.
,
May 17 2016
+ccameron, erikchen as owners of issue 606850.
,
May 17 2016
Given the discussion elsewhere on this issue, I'm less confident that the IOSurface issue, if it is indeed the cause of the DrawInterval regression, is also the cause of this problem. With the experiment getting underway, we should learn more.
,
May 17 2016
M51 Stable is launching very soon! Your bug is labelled as Stable ReleaseBlock, so please make sure to land the fix and get it merged ASAP. All changes MUST be merged into the release branch by 5pm on May 20 to make it into the desktop Stable final build cut. Thank you!
,
May 19 2016
Any update on a fix for this bug? We're very close to the M51 stable candidate cut, and this is an M51 stable blocker.
,
May 20 2016
I don't think that this should be RBS for M51.
,
May 20 2016
rschoen@, pinkerton@, could you please remove the RB-stable if c#18 sounds right to you? Also, I am a little confused after reading through the comments. Can one of you summarize which releases see this regression and how bad the current numbers are?
,
May 20 2016
I think that a lot of the confusion here is coming from the fact that this bug conflates two issues (perhaps related, but with different symptoms).

Issue 1: M49 in BETA regressed this metric by 30 ms. I could hazard a guess that this is because we started using the CoreAnimation renderer. In particular, we started using IOSurfaces for all textures, which are much more expensive to allocate. There exists a Finch experiment that we could create to verify this hypothesis. If this is the case, it's "good to know, worth the tradeoff".

Issue 2: M49 in STABLE sees this metric DEGRADE over time. This is the thing where, on M49, we go from ~840 ms on April 23 to ~920 ms by May 18. This looks to me to be caused by Chrome running for a long time. Is there a way that we could break this data down by Chrome's uptime? A long time ago we found an OS X bug where they leak IOSurfaces (see crrev.com/168186). Perhaps we're triggering that again? Also concerning is issue 580616, where some people complained that performance degraded to an unacceptable level after several days of use. We haven't been able to reproduce that issue (and had the theory that it was specific to a particular GPU).

Two things to try:
1. Add a UMA stat that measures the system's IOSurface load. I can put something together to merge into M52. We can then track that as M52 goes off to stable. It may be that we will see a leak there.
2. Determine the cause of the metric's regression in Beta by a Finch experiment. In particular, chop the users into 3 groups:
   1. Disable the CoreAnimation renderer and disable using IOSurfaces for compositor resources
   2. Disable the CoreAnimation renderer but still use IOSurfaces for compositor resources
   3. Enable the CoreAnimation renderer (requires IOSurfaces)
   That will answer the question of "where did issue 1 come from". Sending this experiment to stable would regress users' battery life, so we probably should just do it in beta.
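A rough sketch of the "break this data down by Chrome's uptime" idea, written as offline analysis in Python. Everything here is an assumption for illustration: the sample format `(uptime_hours, fcp_ms)`, the bucket edges, and the numbers themselves are made up, and this is not how UMA stores or exposes data.

```python
from collections import defaultdict
from statistics import median

# Hypothetical uptime bucket edges in hours: [0, 1), [1, 24), [24, 72), [72, inf).
UPTIME_BUCKET_EDGES_HOURS = [1, 24, 72]


def uptime_bucket(uptime_hours):
    """Return the index of the uptime bucket a sample falls into."""
    for index, edge in enumerate(UPTIME_BUCKET_EDGES_HOURS):
        if uptime_hours < edge:
            return index
    return len(UPTIME_BUCKET_EDGES_HOURS)


def median_fcp_by_uptime(samples):
    """Group (uptime_hours, fcp_ms) samples by uptime bucket and take each median."""
    grouped = defaultdict(list)
    for uptime_hours, fcp_ms in samples:
        grouped[uptime_bucket(uptime_hours)].append(fcp_ms)
    return {bucket: median(values) for bucket, values in sorted(grouped.items())}


# Made-up samples, for illustration only.
samples = [(0.5, 830), (0.2, 850), (5, 845), (30, 900), (100, 930), (120, 910)]
result = median_fcp_by_uptime(samples)
print(result)
```

If the median climbs steadily across buckets, that would be consistent with the "gets slower the longer Chrome runs" theory; a flat profile would point elsewhere.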
,
May 20 2016
WRT the RBS label:
- The "Issue 1" above isn't worth RBS -- not a big enough regression.
- The "Issue 2" above isn't going to be solved for M52. At most I could maybe get the IOSurface tracking UMA merged in, to see if that's correlated.
rschoen@: Is there a way to break UMAs down by how long Chrome has been running? That might be revealing.
,
May 20 2016
+rkaplow to answer "Is there a way to break UMAs down by how long Chrome has been running?"
,
May 21 2016
eae@ - was your AlwaysUseComplexText placed under an experiment before it went completely live? I would like to look at the experiment data. --- Taking another look at this, I understand how I arrived at the conclusion that the IOSurface change may be responsible, and think that may still be a possibility. I also think I understand how rschoen@ arrived at his conclusion. In my analysis, my theory for the jump was a cherry-picked change. However if you don't assume a cherry picked change and follow the regression trail backwards, you reach a different conclusion. In this graph: https://uma.googleplex.com/timeline_v2?q=%7B%22day_count%22%3A%22240%22%2C%22end_date%22%3A%22latest%22%2C%22window_size%22%3A%221%22%2C%22filters%22%3A%5B%7B%22fieldId%22%3A%22channel%22%2C%22operator%22%3A%22COMPARE%22%2C%22study%22%3A%22%22%2C%22selected%22%3A%5B%221%22%2C%222%22%2C%223%22%5D%7D%2C%7B%22fieldId%22%3A%22platform%22%2C%22operator%22%3A%22COMPARE%22%2C%22study%22%3A%22%22%2C%22selected%22%3A%5B%22M%22%5D%7D%5D%2C%22histograms%22%3A%5B%5B%22PageLoad.Timing2.NavigationToFirstContentfulPaint%22%5D%5D%2C%22default_entry_values%22%3A%7B%22measureModel%22%3A%7B%22measure%22%3A%22%22%2C%22buckets%22%3A%5B%5D%2C%22percentiles%22%3A%5B%2250%22%5D%2C%22selectedFormulas%22%3A%5B%5D%2C%22allFormulas%22%3A%5B%5D%7D%2C%22zeroBased%22%3Atrue%2C%22logScale%22%3Afalse%2C%22showLowVolumeData%22%3Afalse%2C%22showVersionAnnotations%22%3Atrue%7D%2C%22entries%22%3A%5B%7B%22measureModel%22%3A%7B%22measure%22%3A%22percentile%22%7D%2C%22zeroBased%22%3Afalse%7D%5D%7D the data show a regression in Beta when going from 48.0.2564.82 -> 49.0.2623.28, and a regression in Dev when going from 48.0.2564.22 -> 49.0.2587.3. So basically a regression baked into the transition from M48 to M49 in Beta and Dev. If you look at Canary before the Dev regression there's a regression that starts to take off at 48.0.2571.0. The AlwaysUseComplexText change appears between 48.0.2571.0 and 48.0.2574.0. 
However I think the regression already began before this change landed. Taking a stab at 48.0.2560.0 as a reasonable point before the regression, and noting that this is a Mac-only regression, the one change that stands out is: 885da5130d948ba7f6d721888db7114a4f912789 mac: Some consumers of SharedMemory require a POSIX fd. https://chromium.googlesource.com/chromium/src/+/885da5130d948ba7f6d721888db7114a4f912789 Perhaps passing memory using POSIX is more expensive.
,
May 24 2016
,
Jun 1 2016
Moving this nonessential bug to the next milestone. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jul 1 2016
erikchen, any idea if the change shrike identified in #23 could be to blame?
,
Jul 4 2016
I can look into it, although it's pretty unlikely that the POSIX/Mach shared memory changes are responsible, since those were gated behind an experiment, which showed improvements to PLT.
,
Jul 6 2016
ccameron's analysis appears accurate. The degradation of the metric over time after M49 was released is highly telling. shrike's analysis relies on the assumption that the regression on Dev channel happened between D4 and D5 in the attached image. Given the amount of noise in the Dev channel (e.g. D2 has the same value as D5), it doesn't really seem like we can pinpoint a regression to that range.
,
Jul 6 2016
Re: my suspicions in #23, I agree with erikchen@ that the regression is not related to the shared memory change. But in #23 I also thought it might be related to the IOSurface clearing change. Again restricting the IOSurface experiment to users with spinning disks and 2 GB of RAM, there is a significant regression. On the theory that the cost of clearing an IOSurface is not negligible for these users, it makes sense that this could increase the time from navigation to first contentful paint. It appears that the slowdown is more sensitive to RAM than to spinning disk (the regression pretty much disappears when you look just at disk). https://uma.googleplex.com/p/chrome/variations/?sid=051c2d8734976b2e694437ebe2c0b239
Re: #20, ccameron@ - we should try to see if there is a problem with things getting slower the longer you run the browser.
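To make the "restrict the experiment to a hardware segment" comparison concrete, here is a minimal Python sketch. The record layout `(group, ram_gb, fcp_ms)`, the group names `control` and `enabled`, the 2 GB cutoff as a segment boundary, and all numbers are assumptions for illustration; none of this reflects the real variations dashboard or its data.

```python
from collections import defaultdict
from statistics import median

LOW_RAM_CUTOFF_GB = 2  # assumed segment boundary, taken from the "2 GB of RAM" filter


def segment_for(ram_gb):
    """Classify a client as low-RAM or other (hypothetical segmentation)."""
    return "low_ram" if ram_gb <= LOW_RAM_CUTOFF_GB else "other"


def median_delta_by_segment(records):
    """Return {segment: median(enabled) - median(control)} in milliseconds."""
    grouped = defaultdict(lambda: defaultdict(list))
    for group, ram_gb, fcp_ms in records:
        grouped[segment_for(ram_gb)][group].append(fcp_ms)
    return {
        segment: median(groups["enabled"]) - median(groups["control"])
        for segment, groups in grouped.items()
    }


# Made-up records, for illustration only.
records = [
    ("control", 2, 1000), ("control", 2, 1100),
    ("enabled", 2, 1200), ("enabled", 2, 1300),
    ("control", 8, 800), ("control", 16, 820),
    ("enabled", 8, 805), ("enabled", 16, 825),
]
deltas = median_delta_by_segment(records)
print(deltas)
```

A delta that is large in one segment and near zero in the other is the shape described above, though a real analysis would also need confidence intervals before drawing conclusions.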
,
Jul 14 2016
This issue is Pri-1 but has already been moved once. Lowering the priority and moving to the next milestone. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Aug 12 2016
Looking at IOSurface experiment data, IOSurface clearing was not the cause of the regression. https://uma.googleplex.com/p/chrome/variations/?sid=48d6b2d63f010ff0720b0778c3de7d99
,
Aug 22 2016
,
Aug 25 2016
,
Aug 26 2016
Any updates on this? It seems like we're back to thinking it's likely fa7fc32c5940dfd3d734ed3231b1295da4c3303e? Assigning to eae in that case.
,
Sep 14 2016
eae@, could you please provide an update on this?
,
Oct 13 2016
This has not recovered, and actually continues to regress. Anything we can do? https://uma.googleplex.com/timeline_v2?sid=86bbb93e2c9a449539a4f8e5bd86c04c
,
Oct 18 2016
We knew that fa7fc32c5940dfd3d734ed3231b1295da4c3303e would regress performance on some sites for an improvement on others and an improvement in correctness. The regressions have since been addressed by subsequent performance improvements. Our other performance tests are indicating that text rendering has gotten *faster* over the last few releases so if we're seeing a continuous regression then it likely has a different cause.
,
Nov 1 2016
+parisa as FYI. What are the next steps? Do we need to re-bisect and find a different culprit? Upgrading to P1 as this has been a regression for 5 milestones.
,
Nov 1 2016
A couple of observations:
1. The regression in M49 is real, is specific to Mac, and affects the distribution up through about the 90th percentile - the tail seems unchanged.
2. PageLoad.Timing2.NavigationToCommit is unchanged: https://uma.googleplex.com/timeline_v2?sid=ddb9b13461cdf775e1a50fa9a10f0124 This indicates that the regression occurs in the period between commit and first contentful paint.
3. The increase in M49 stable starting on May 25 at the end of the graph in comment #20 is not a real regression. M50 became the dominant browser version as of May 25, so any users on M49 after this point are users who are failing to upgrade to M50. We often see higher latency for users that remain on old versions, as these users likely have different network/device/etc. characteristics from users who successfully upgrade. To control for this, you can include a "Version tag / contains / dominant" filter in your filtering criteria.
4. The Timing2 variants of these metrics have been replaced by PaintTiming equivalents. We stopped logging Timing2 metrics in M53. M54 became the dominant browser as of late October, so future analysis should look at PageLoad.PaintTiming.NavigationToFirstContentfulPaint.
5. The regression doesn't appear to have recovered: https://uma.googleplex.com/timeline_v2?sid=1e2e9c4170cdc926c9e093267275c8d1
6. The regression appears to have been introduced between the 2564 and 2587 branches, based on dev channel analysis (https://uma.googleplex.com/p/chrome/timeline_v2/?sid=1ebaed7e293611c97ff0eb938b3eb869). Given that the regression also emerges at the 48/49 boundary, this may also be caused by a field trial targeting 49.
7. Canary channel data is unfortunately a bit too noisy for us to see precisely when the regression started on canary: https://uma.googleplex.com/p/chrome/timeline_v2/?sid=36615395e55dc1cd65dfb623e9fb1dec
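The population-shift effect in observation 3 can be illustrated with a small Python sketch. It treats "dominant" as "the milestone with the most samples on a given day", which is only one plausible reading of the dashboard filter and is an assumption here; the record layout `(day, milestone, fcp_ms)` and the numbers are likewise made up.

```python
from collections import Counter, defaultdict
from statistics import median


def dominant_milestone_by_day(records):
    """Map each day to the milestone that contributed the most samples that day."""
    counts = defaultdict(Counter)
    for day, milestone, _fcp_ms in records:
        counts[day][milestone] += 1
    return {day: counter.most_common(1)[0][0] for day, counter in counts.items()}


def median_fcp(records, milestone, dominant_only=False):
    """Median FCP for a milestone, optionally only on days when it was dominant."""
    dominant = dominant_milestone_by_day(records)
    values = [
        fcp_ms
        for day, sample_milestone, fcp_ms in records
        if sample_milestone == milestone
        and (not dominant_only or dominant[day] == milestone)
    ]
    return median(values) if values else None


# Made-up records: M49 dominates on 05-20, M50 dominates on 05-26.
records = [
    ("05-20", 49, 840), ("05-20", 49, 850), ("05-20", 49, 860), ("05-20", 50, 830),
    ("05-26", 49, 920), ("05-26", 50, 835), ("05-26", 50, 845), ("05-26", 50, 855),
]
naive = median_fcp(records, 49)
filtered = median_fcp(records, 49, dominant_only=True)
print(naive, filtered)
```

In this toy data the slow M49 straggler on 05-26 pulls the unfiltered median up, while the dominant-only median is unaffected, which is the distortion the filter is meant to control for.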
,
Nov 1 2016
kouhei and annie, is it possible for us to do a bisect using page cycler v2 to find regressions in the TTFCP metric on mac, in the 2564..2587 range?
,
Nov 3 2016
kouhei, bmcquade: do you know which chart on chromeperf you'd like to bisect? If you paste a link I can help out.
,
Nov 8 2016
Kouhei, is there a chromeperf chart that runs page cycler v2 on a representative set of URLs? any such chart/benchmark should suffice here.
,
Apr 10 2017
Given the lack of recent activity and the age of the regression I'm closing this bug as WontFix.
Comment 1 by e...@chromium.org
, Mar 31 2016