Issue metadata
Sign in to add a comment
|
3.3%-400.2% regression in battor.steady_state at 552586:552591 |
||||||||||||||||||||||
Issue descriptionSee the link to graphs below.
,
Apr 23 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/14cbe751c40000
,
Apr 23 2018
📍 Found a significant difference after 1 commit. https://pinpoint-dot-chromeperf.appspot.com/job/14cbe751c40000 Make --site-per-process the default on ToT via fieldtrial_testing_config by lukasza@chromium.org https://chromium.googlesource.com/chromium/src/+/fb1ccf02ee8ca79e1404abfd3a3a7d540b7d2dbd Understanding performance regressions: http://g.co/ChromePerformanceRegressions
,
Apr 23 2018
,
Apr 23 2018
+charliea@ who owns the battor.steady_state benchmark I assume that the new behavior with site-per-process is expected and should be used as the new baseline. OTOH, I am not sure who can make this decision and how.
,
Apr 23 2018
I think we know about and are investigating some of the regressions (paint metrics, input event latency), but there are also regressions in power (perhaps tied to regressions in average CPU time?), which I don't think we've thought about much before and might want to sanity check.
,
May 1 2018
One interesting thing to note is that the Taobao story exercised in this bug visits https://world.taobao.com/ which doesn't have that many OOPIFs - there are only 2 sites on that webpage: taobao.com and alicdn.com
,
May 1 2018
,
May 1 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/14c96c33c40000
,
May 1 2018
FWIW, I've tried to gather ETW traces when locally running the "taobao" performance story/test on the 68.0.3416.0 canary (once with --site-per-process cmdline flag and once without the flag and with the trials disabled via chrome://flags). I've saved the traces at https://drive.google.com/open?id=1zYBrlfQVRp4arUsD7d_J_Rfua8aZi1WB (Google-internal, but I can re-share if needed). I suspect that with site-per-process a page can schedule work in more than 1 renderer process at a time (which leads to higher CPU utilization). OTOH, I am not sure how to verify this hypothesis with the gathered ETW traces. Also - I've just realized that the pinpoint job in #c3 talks about timeToFirstContentfulPaint, so I've scheduled another pinpoint job in #c9, hoping to get a confirmation of a regression in cpu_time_percentage.
,
May 1 2018
📍 Found a significant difference after 1 commit. https://pinpoint-dot-chromeperf.appspot.com/job/14c96c33c40000 Make --site-per-process the default on ToT via fieldtrial_testing_config by lukasza@chromium.org https://chromium.googlesource.com/chromium/src/+/fb1ccf02ee8ca79e1404abfd3a3a7d540b7d2dbd Understanding performance regressions: http://g.co/ChromePerformanceRegressions
,
May 2 2018
,
May 2 2018
,
May 2 2018
I've just realized that the ETW traces in #c10 are gathered for a wrong story - the regression in battor.steady_state / cpu_time_percentage / #c11 happens (as seen in https://chromeperf.appspot.com/group_report?bug_id=835861) mostly in the following stories: - www.indiatimes.com - www.uol.com.br - www.slideshare.net/patrickmeenan Unfortunately, it turns out that battor.steady_state is a Mac-only benchmark (according to docs/speed/benchmark/harnesses/power_perf.md). Attempting to run the benchmark (and gather ETW traces) on Windows gives back: D:\src\chromium\src>python tools\perf\run_benchmark run battor.steady_state --story-filter http...www.uol.com.br --extra-browser-args=--site-per-process --browser=canary ... battor.steady_state is disabled on the selected browser I don't know how to do a CPU sampling profile diff that spans *all* chrome processes... :-(
,
May 2 2018
FWIW, sadrul@'s LatencyInfo fix (r554321) doesn't seem to impact/recover the regresson shown for battor.steady_state / cpu_time_percentage_avg at https://chromeperf.appspot.com/group_report?bug_id=835861 - the chart shows a bump at r552589, but is more-or-less flat at r554321.
,
May 2 2018
,
May 3 2018
Lukasz, you can try to run the story on Windows using: --also-run-disabled-tests. I'm not sure if it will work, but it's worth a try and, given the fact that the same thing that's affecting Mac seems to be affecting Windows, it seems likely that problems you find using ETW on Windows will carry over to Mac.
,
May 4 2018
I've tried manually opening the websites on Windows and gathering ETW traces, but unfortunately nothing actionable or surprising jumped out at me :-( I wonder if it is possible to compare the benchmark results with some UMA data. I don't see good UMA metrics to compare against, but let me ask around. So far I've found: - Power.BatteryDischargeRate (CrOS-only) - doesn't change much - PerformanceMonitor.*CPU.(Browser|Renderer)Process (Windows-only) - difficult to draw overall conclusions
,
May 22 2018
Unfortunately, other than manually digging through the two ETW traces and perhaps looking at some code paths that you might expect to be stressed more with site isolation, I don't have any great suggestions on how to diff the two traces. There definitely does definitely seem to be a ~3% power regression though.
,
May 22 2018
Note that some discussion of these results is also happening in go/site-isolation-performance#heading=h.u3cs4xppn2j2 (google-internal link, sorry...)
,
Aug 30
lukasza@: any update from site-isolation-performance discussion?
,
Aug 30
RE: #c21: mustaq@ I don't think there are any actionable conclusions - maybe we just need to accept the new power consumption level as the new normal / as necessary tax for the better security guarantees offered by Site Isolation. I've also tried to look at more complete data we have from having Site Isolation on the stable channel in M67 and M68 (I've added it to go/site-isolation-performance#heading=h.u3cs4xppn2j2). Interestingly, M67 data shows no statistically significant regression (which might support resolving this bug as WontFix) although M68 data shows a regression of about 1.92% at the 50th percentile (this regression seems like a separate bug, because it happened independently from and after we've turned on Site Isolation). Given the above, I'd be leaning toward resolving this bug as WontFix. If you think there are some remaining actions we should be taking here, please reactivate.
,
Oct 8
To follow-up, no statistically significant changes have been found in M69 in Power.BatteryDischargeRate - go/site-isolation-performance#heading=h.u3cs4xppn2j2 |
|||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||
Comment 1 by 42576172...@developer.gserviceaccount.com
, Apr 23 2018