linux.perf bot running power.desktop tests should disable CPU frequency scaling
Issue description: This is in relation to https://bugs.chromium.org/p/chromium/issues/detail?id=891356#c10 . Specifically, I was assigned a bug showing a significant regression in power.desktop test cases for linux.perf. The power.desktop test suite doesn't measure power directly. It indirectly measures CPU utilization via chrome://tracing. AFAICT, there is no measure of CPU clock frequency surfaced via chrome://tracing, and this is problematic. If we can distribute our load more uniformly over more cores, we may win on power because we can use a lower clock frequency. However, the slower cores will show higher utilization, which will regress the utilization metrics even though power improved. It's pretty straightforward to disable frequency scaling on Linux, and doing so would help this suite of tests. Another option is to surface CPU frequency and factor that in. The best option is to actually measure power consumption. This is certainly causing flake.
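For reference, a minimal sketch of one common way to pin frequency scaling on Linux: writing a governor name to each core's cpufreq `scaling_governor` file in sysfs. The function name `set_governor` and the `sysfs_root` parameter are hypothetical, added here for illustration and so the sketch can be tried against a scratch directory. This is not necessarily how the perf bots would do it, and it assumes the kernel exposes the cpufreq sysfs interface and that the script runs as root.

```python
import glob
import os


def set_governor(governor="performance", sysfs_root="/sys/devices/system/cpu"):
    """Write `governor` to every cpuN/cpufreq/scaling_governor under sysfs_root.

    Hypothetical helper: returns the list of files that were written.
    Writing to the real sysfs tree requires root privileges.
    """
    # cpu[0-9]* matches cpu0, cpu1, ... but not sibling dirs such as cpuidle.
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    written = []
    for path in sorted(glob.glob(pattern)):
        with open(path, "w") as f:
            f.write(governor)
        written.append(path)
    return written
```

Calling `set_governor("performance")` as root asks each core to stay at the top of its frequency range. This does not fully remove frequency variation: hardware-managed boost is still controlled by the CPU itself, so treat this as reducing noise rather than eliminating it.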
,
Oct 4
Is this something we should do for rendering benchmarks too? So that the cpu-time we get from the benchmark is more reliable?
,
Oct 4
There's always a tradeoff between being representative of real user experience, and being stable and easy to understand. I don't have a good intuition for whether the tradeoff would make sense in this case. Perhaps we should be counting CPU cycles instead of task durations? I know Android folks are getting a fair bit of mileage out of counting CPU cycles.
,
Oct 5
(Adding brucedawson@, power benchmarks owner) tdresser@ that's super interesting - do you understand how they calculate this in practice? It does seem like the ideal solution. It seems like you'd need to know: 1) The frequencies that all CPUs are running at, at all times. This is easy to get with systrace on Android and ftrace on Linux, but I'm not sure about Mac and Win. 2) What processor each Chrome thread was running on at each point in time. I'm not sure if we currently collect this or how hard it would be to get. Maybe this is also available via systrace? This would at least be enough to get us an approximation of CPU cycles. There is trickiness, though. We only have a single "cpu time" number for a trace event. If the wall time for an event was 100 seconds, the CPU time was 80s, and we know that the processor was running at 0.2 GHz for the first 90s and 1.0 GHz for the last 10s, it's impossible to know precisely what the total clock cycles are, because we don't know which parts of the wall time the 80s of CPU time fell in. It turns out, though, that we already have this problem (when we have a range of interest that terminates in the middle of an event) and we already assume that the CPU time is evenly distributed (https://cs.chromium.org/chromium/src/third_party/catapult/tracing/tracing/model/thread.html?type=cs&q=getcputimeforrange&sq=package:chromium&g=0&l=258), which is probably a reasonable estimate in practice. Frankly, I think the biggest thing holding us back from implementing this is a willingness to commit the time to get it done. I'd guess this would take O(weeks) to implement, and I'm just not sure how many problems we currently experience due to this.
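To make the 100s/80s example above concrete, here is a rough sketch of the "evenly distributed" estimate together with the bounds it sits between. The function names and the `(duration_s, freq_hz)` segment representation are hypothetical, chosen for illustration; this is not the catapult implementation.

```python
def estimate_cycles(wall_s, cpu_s, freq_segments):
    """Estimate cycles assuming CPU time is spread evenly over the wall time.

    freq_segments: list of (duration_s, freq_hz) tuples covering the event.
    """
    busy_fraction = cpu_s / wall_s
    return sum(busy_fraction * dur * hz for dur, hz in freq_segments)


def cycle_bounds(cpu_s, freq_segments):
    """Return (min, max) cycles consistent with cpu_s and the frequency timeline."""

    def fill(segments):
        # Greedily place the CPU time into segments in the given order.
        remaining, cycles = cpu_s, 0.0
        for dur, hz in segments:
            used = min(remaining, dur)
            cycles += used * hz
            remaining -= used
        return cycles

    by_freq = sorted(freq_segments, key=lambda seg: seg[1])
    # Slowest segments first gives the minimum; fastest first gives the maximum.
    return fill(by_freq), fill(list(reversed(by_freq)))


# The example from the comment: 0.2 GHz for 90s, then 1.0 GHz for 10s.
segments = [(90, 0.2e9), (10, 1.0e9)]
estimate = estimate_cycles(100, 80, segments)
low, high = cycle_bounds(80, segments)
```

With these numbers the even-distribution estimate is about 22.4 billion cycles, while the true value could lie anywhere between 16 billion (all 80s spent at 0.2 GHz) and 24 billion (the full 10s at 1.0 GHz plus 70s at 0.2 GHz).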
,
Oct 5
,
Oct 5
I think that QueryProcessCycleTime is supposed to measure CPU cycles consumed by a process and it should account for the constantly changing CPU speeds (although even then it would miss some things, because TurboBoost is managed by the CPU itself). I would want to do some testing of this function before trusting it. If this function doesn't work then trying to recreate this would be crazy hard. All of the information should be available in ETW traces but a) recording ETW traces requires admin privileges and b) I am not convinced that the CPU frequency data in ETW traces can be trusted. Disabling CPU scaling is also a reasonable solution. It is imperfect, to be sure, but it will give us a much more sensitive measurement of whether we are increasing or decreasing the CPU load.
,
Oct 5
,
Oct 10
This isn't just power.desktop. I was hit by similar "regressions" with media.desktop test suites as well. I think that this issue is the cause of much flake with CPU usage based tests.
,
Oct 22
*nudge*
,
Oct 23
Trying to disable CPU scaling SGTM. Ned, who is the right owner for this now?
,
Oct 23
#10: that would fit the core automation team, but we don't have any bandwidth to take this now, so just keep the component "Speed > Benchmarks" for now so we can backlog this.
Comment 1 by tdres...@chromium.org
, Oct 4