chromeos-kernel-4_4 schedfreq governor excessive context switching |
||||||
Issue descriptionThe chromeOS 4.4 kernel uses the schedfreq governor. This governor, when adjusting the available capacity, wakes up a kernel thread to perform the OPP adjustment. By default, these adjustments are throttled to 40us and 500ms for up and down adjustments respectively. However, this throttling is done within the kthread itself, meaning we are frequently waking up a kthread only to immediately put it back to sleep, if requests are still being throttled. We should move the throttle check to take place before the request to the kthread is sent, avoiding unnecessary context switches. By doing this, context switches to kschedfreq:X threads were reduced to 25% of what they were before, when idle at the login screen of the chromebook. This also results in an 2-3.7% improvement in page loading on a chromebook that uses this kernel, kevin, and a 10% improvement to the GLaquarium benchmark. It appears to have minimal effect on the scrolling smoothness benchmark, a 0.29% increase in the percentage of smooth frames. It should be noted that this design is already in effect in the schedutil governor, which has replaced the schedfreq governor in mainline, and on AOSP's 4.4 kernel.
,
Feb 2 2018
,
Feb 14 2018
Some power information from the latest version of the patch is attached. This comes from running the 'power_LoadTest.1hour' test 4 times both when establishing a baseline, and measuring the effect of the patch. The test itself seemed to fail right at the end when processing the results (see attached callstack), but still seems to have collected the power/frequency data. Looks like the average frequency of the big cluster increases by 6.8%, while the average frequency of the little cluster decreases by 1.3%, which will explain the improvements in GLAquarium. This comes with 2% decrease in the mean page load time (and 14.6% decrease in minimum page load time). There is a 5% increase in 'energy rate', but only a 0.3% decrease in 'minutes battery life'. The percentage of time spent in sleep increases from 82.2 to 84.8% for the little cluster, and from 69.1 to 71.1% for the big cluster. I've attached the graphs showing the raw values, and values normalised w.r.t. the baseline, listing the percentage increase or decrease.
,
Feb 14 2018
Thanks for collecting all of this -- so it looks like average frequencies went up but idle/sleep time also went up. Energy used also went up a bit, so we're paying for the increased performance there. Does that sound like an accurate summary?
,
Feb 14 2018
#4: Yep, accurate to me. In light of this, what's your view on the acceptability of the change? I'm interested too in the relationship between the increase in energy_rate/used (5%), and the decrease in minutes_battery_life, which is a much smaller percentage (0.3%). Is one of them more of a deciding factor when considering a patch?
,
Feb 14 2018
re #5 -- IIRC energy_rate on 1-hour runs isn't terribly useful because the way we calculate it (voltage * current) -- however, battery voltage could be different depending on how charged the battery was at the start of the run. So if all runs were all done with fully charged batteries it might be comparable. But I think there's another keyval called wh_energy_powerlogger which tries to account for this difference. Do you happen to have the energy_powerlogger keyvals? Or possibly could you do full runs?
,
Feb 15 2018
#6: Looks like the energy_powerlogger keyvals show a 0.34% increase with the patched version, although the raw values are: baseline: 3.665079 baseline: 3.607865 baseline: 3.741675 baseline: 3.690458 patched: 3.788502 patched: 3.714660 patched: 3.559400 patched: 3.692270 And the way the tests were run, the 8 1-hour runs were run sequentially, from a single battery charge. So you would say that's an incorrect way of running it, and it should always be run from a fully charged battery? Anyway, I'll try to get a full run without and with the patch.
,
Feb 19 2018
@sonnyrao:
The results from one full run with and without the patch are in:
metric | baseline | patched
----------------------+----------+--------
minutes_battery_life | 598 | 583
wh_energy_used | 36.746 | 36.7536
w_energy_rate | 3.801 | 3.896
wh_energy_powerlogger | 36.361 | 36.989
,
Feb 21 2018
re #7 and 8, so 0.34% increase on powerlogger is pretty negligable. on the full run it looks really close except for minutes_battery_life for some reason Both of your runs were over 9.5 hours though. I think this looks okay and we can land it, adding tbroch as FYI (in case he wants to veto).
,
Feb 21 2018
Would be good to attach the full logs from both full runs but see no reason to veto. Power increase seems worth the perf improvements IMO.
,
Feb 21 2018
@tbroch: I've attached the full logs for baseline and patched. You'll notice the keys are marked INVALID_, but the failure seems to have happened in the results post-processing stage, see test-failure-callstack.txt in #3, so I don't believe this has much bearing our interpretation of the results. FWIW, my platform/crostestutils checkout is currently at 6b10c01 - "check_ethernet: try each recovery method only once". I've created https://bugs.chromium.org/p/chromium/issues/detail?id=814265 if you have any advice on fixing the issue. @sonnyrao: Great! I'll try running the patch through the bots again, but can you CQ+2 it, if it passes, once @tbroch confirms he's okay with it still?
,
Feb 21 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/2db6b207e5b932f41a6ca533a0483641e5c73b61 commit 2db6b207e5b932f41a6ca533a0483641e5c73b61 Author: Stephen Kyle <stephen.kyle@arm.com> Date: Wed Feb 21 14:46:06 2018 CHROMIUM: sched: reduce calls to kschedfreq Perform sched gov throttle check when adjusting capacity, rather than waking up the kthread to have it perform the throttle check. This leads to roughly a quarter of the number of context switches to kschedfreq:X happening when idling at the login screen. These threads were often preempting the renderer thread during page loading, so this leads to an improvement for page loading. BUG= chromium:808376 TEST="at login screen, check value of: 'perf stat -e cs -p <kschedfreq:0 PID> -r 5 -- sleep 10' cros_workon --board=kevin start chromeos-kernel-4_4 cros_workon_make --board=kevin chromeos-kernel-4_4 --install ./update_kernel --remote=<DUT IP> confirm value is now ~25% of previous value after reboot" Signed-off-by: Stephen Kyle <stephen.kyle@arm.com> Change-Id: Ibca405a28b80f205ab021802947e5f0b9cf5aa6b Reviewed-on: https://chromium-review.googlesource.com/897816 Commit-Ready: Stephen Kyle <stephen.kyle@arm.com> Tested-by: Stephen Kyle <stephen.kyle@arm.com> Reviewed-by: Sonny Rao <sonnyrao@chromium.org> Reviewed-by: Dmitry Torokhov <dtor@chromium.org> [modify] https://crrev.com/2db6b207e5b932f41a6ca533a0483641e5c73b61/kernel/sched/cpufreq_sched.c
,
Feb 21 2018
,
Feb 21 2018
Thanks for full logs. The invalid piece is a known issue, https://bugs.chromium.org/p/chromium/issues/detail?id=794601 Put the keyvals in this spreadsheet for archival & side-by-side comparisons https://docs.google.com/spreadsheets/d/1tUAO_XJhGSBo3RJ9hhCbvl4cBs3fHPgUW55x-nqjZ-k/edit#gid=0 Confirming things look good between the runs.
,
Feb 23 2018
|
||||||
►
Sign in to add a comment |
||||||
Comment 1 by stephen....@arm.com
, Feb 2 2018Labels: OS-Chrome