High CPU use in base::Time::Explode on Windows
Issue description

base::Time::Explode invokes NtQuerySystemInformation under the covers, which is known to be a very expensive function (measured at ~4 ms/call) and also queries registry keys. See the profile in 68.0.3428.0 canary on 64-bit Windows for details: https://uma.googleplex.com/p/chrome/callstacks?sid=0c531cc7f843d8d77be0c1900d3c709b

In some cases this is being called often enough to take a large percentage of CPU time for some clients. In profiles observed executing this function via SiteEngagementScore::AddPoints(), the percent of time spent executing it, at every 5th percentile, is:

0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 26, 59, 80

So the top 10% of profiles containing SiteEngagementScore::AddPoints() show 26+% of their execution time there, entirely due to base::Time::Explode.
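For illustration, a minimal sketch (hypothetical helper, not code from the bug) of the kind of call site that pays this cost: every LocalMidnight() call goes through base::Time::Explode on Windows, and the result is recomputed on every call.

  #include "base/time/time.h"

  // Hypothetical helper showing the hot pattern from the profile: each call
  // re-runs base::Time::Explode (via LocalMidnight()), which on Windows
  // reaches the expensive system calls described above.
  base::Time StartOfLocalDay(base::Time t) {
    return t.LocalMidnight();  // no caching: the full cost is paid every call
  }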
May 16 2018
So... LocalExplode (called by LocalMidnight) is expensive because of the call to Explode(true, ...), which calls SystemTimeToTzSpecificLocalTime, which is expensive because the first parameter is nullptr, so the OS spends a lot of time in RetrieveTimeZoneInformation. A few ideas come to mind:

1) Does SiteEngagementScore::AddPoints need to be using local time? Could it just use UTC and call that good? In particular, this would guarantee that all days are the same length, whereas currently the length of the days being measured can vary dramatically.

2) Could SiteEngagementScore::AddPoints use a consistent time zone, instead of looking up the time zone every time? This would avoid the cost and, like using UTC, would presumably avoid having variable-length days.

3) Failing that, perhaps the number of calls to LocalMidnight() could be reduced. In particular, the result of last_engagement_time_.LocalMidnight() could be cached and only updated when last_engagement_time_ changes (a sketch follows after this comment).

These changes won't make a dramatic difference, but they will help to lower Chrome's CPU usage and battery consumption, something which UMA should not be affecting. I'm happy to provide Windows consulting for any possible changes, but I don't know what the site engagement implications are of using UTC or a constant-per-run time zone.
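A minimal sketch of idea 3, assuming hypothetical cache members on SiteEngagementScore (the real class layout may differ): recompute LocalMidnight() only when last_engagement_time_ actually changes.

  // Sketch only: cache the expensive LocalMidnight() result, keyed on the
  // time it was derived from. cached_midnight_source_ and
  // cached_last_engagement_midnight_ would be new base::Time members.
  base::Time SiteEngagementScore::LastEngagementMidnight() {
    if (last_engagement_time_ != cached_midnight_source_) {
      cached_midnight_source_ = last_engagement_time_;
      cached_last_engagement_midnight_ = last_engagement_time_.LocalMidnight();
    }
    return cached_last_engagement_midnight_;
  }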
May 22 2018
Thanks for filing this. Answers inline:

1) Does SiteEngagementScore::AddPoints need to be using local time? Could it just use UTC and call that good? In particular, this would guarantee that all days are the same length, whereas currently the length of the days being measured can vary dramatically.

We have a number of values which reset per day, e.g. there's a maximum amount of engagement that can be earned per day. Perhaps we could remove the daily limit (and thus avoid needing to call LocalMidnight() at all)? +calamity, who implemented most of the score accumulation / decay logic, for his thoughts. (A UTC sketch follows after this comment.)

2) Could SiteEngagementScore::AddPoints use a consistent time zone, instead of looking up the time zone every time? This would avoid the cost and, like using UTC, would avoid having variable-length days, presumably.

Is this equivalent to just using UTC, or to caching the time zone information?

3) Failing that, perhaps the number of calls to LocalMidnight() could be reduced. In particular, the result of last_engagement_time_.LocalMidnight() could be cached and only updated when last_engagement_time_ changes.

This seems like a straightforward optimisation. Is there a way to estimate how much of the CPU cost of AddPoints this would save on Windows?
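For reference, a minimal sketch of the UTC option from question 1 (assumed usage of base::Time::UTCExplode / FromUTCExploded; not a reviewed change): bucket engagement by UTC day instead of local midnight, which avoids the SystemTimeToTzSpecificLocalTime path described in the previous comment.

  // Sketch only: derive a UTC day boundary instead of local midnight. The
  // trade-off is that the daily limit would roll over at UTC midnight rather
  // than the user's local midnight.
  base::Time UTCMidnightFor(base::Time t) {
    base::Time::Exploded exploded;
    t.UTCExplode(&exploded);
    exploded.hour = 0;
    exploded.minute = 0;
    exploded.second = 0;
    exploded.millisecond = 0;
    base::Time midnight;
    if (!base::Time::FromUTCExploded(exploded, &midnight))
      return t;  // conversion failure: fall back to the unmodified time
    return midnight;
  }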
May 22 2018
In answer to the last question, I would expect there to be a linear relationship between the number of calls made to LocalMidnight() and the CPU cost.
May 22 2018
#4: Certainly, there would be a linear relationship. But |last_engagement_time_| can change pretty frequently - in fact, it may be updated almost every time AddPoints is called. So 3) alone is unlikely to meaningfully reduce the number of calls to LocalMidnight(), which is why I wanted to know if we could estimate the saving prior to doing anything (e.g. what exactly is this profiler? Who is it sampling? Is it regular browsing activity?).
May 23 2018
The sampling profiler is measuring (in this case) aggregate steady-state UI thread execution of canary users on 64-bit Chrome on Windows. Additional background on the profiler is at go/uma-sampling-profiler-overview#heading=h.gjgriz8s7dlc. The easiest way to understand whether #3 would make a difference would likely be to make the change and look at the results in the new canary release. The CPU usage analysis in this bug is not self-service yet, but I'd be happy to rerun it.