New issue
Advanced search Search tips

Issue 808376 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Feb 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Feature



Sign in to add a comment

chromeos-kernel-4_4 schedfreq governor excessive context switching

Project Member Reported by stephen....@arm.com, Feb 2 2018

Issue description

The chromeOS 4.4 kernel uses the schedfreq governor. This governor, when adjusting the available capacity, wakes up a kernel thread to perform the OPP adjustment. By default, these adjustments are throttled to 40us and 500ms for up and down adjustments respectively. However, this throttling is done within the kthread itself, meaning we are frequently waking up a kthread only to immediately put it back to sleep, if requests are still being throttled.

We should move the throttle check to take place before the request to the kthread is sent, avoiding unnecessary context switches. 

By doing this, context switches to kschedfreq:X threads were reduced to 25% of what they were before, when idle at the login screen of the chromebook. This also results in an 2-3.7% improvement in page loading on a chromebook that uses this kernel, kevin, and a 10% improvement to the GLaquarium benchmark. It appears to have minimal effect on the scrolling smoothness benchmark, a 0.29% increase in the percentage of smooth frames.

It should be noted that this design is already in effect in the schedutil governor, which has replaced the schedfreq governor in mainline, and on AOSP's 4.4 kernel.
 
Cc: sonnyrao@chromium.org diand...@chromium.org
Labels: OS-Chrome
Components: OS>Kernel

Comment 3 by stephen....@arm.com, Feb 14 2018

Some power information from the latest version of the patch is attached. This comes from running the 'power_LoadTest.1hour' test 4 times both when establishing a baseline, and measuring the effect of the patch. The test itself seemed to fail right at the end when processing the results (see attached callstack), but still seems to have collected the power/frequency data.

Looks like the average frequency of the big cluster increases by 6.8%, while the average frequency of the little cluster decreases by 1.3%, which will explain the improvements in GLAquarium. 

This comes with 2% decrease in the mean page load time (and 14.6% decrease in minimum page load time). There is a 5% increase in 'energy rate', but only a 0.3% decrease in 'minutes battery life'. 

The percentage of time spent in sleep increases from 82.2 to 84.8% for the little cluster, and from 69.1 to 71.1% for the big cluster.

I've attached the graphs showing the raw values, and values normalised w.r.t. the baseline, listing the percentage increase or decrease.
01-battery-life.png
10.8 KB View Download
01-NORM-battery-life.png
11.0 KB View Download
02-NORM-page-load.png
16.1 KB View Download
02-page-load.png
15.6 KB View Download
03-energy.png
12.4 KB View Download
03-NORM-energy.png
14.1 KB View Download
04-avg-freq.png
14.7 KB View Download
04-NORM-avg-freq.png
16.1 KB View Download
05-little-cluster.png
35.8 KB View Download
05-NORM-little-cluster.png
42.9 KB View Download
06-big-cluster.png
41.6 KB View Download
06-NORM-big-cluster.png
51.2 KB View Download
07-cpuidle.png
28.9 KB View Download
07-NORM-cpuidle.png
34.7 KB View Download
test-failure-callstack.txt
2.1 KB View Download
Thanks for collecting all of this -- so it looks like average frequencies went up but idle/sleep time also went up.  Energy used also went up a bit, so we're paying for the increased performance there.  Does that sound like an accurate summary?

Comment 5 by stephen....@arm.com, Feb 14 2018

#4: Yep, accurate to me. In light of this, what's your view on the acceptability of the change? 

I'm interested too in the relationship between the increase in energy_rate/used (5%), and the decrease in minutes_battery_life, which is a much smaller percentage (0.3%). Is one of them more of a deciding factor when considering a patch?
re #5 -- IIRC energy_rate on 1-hour runs isn't terribly useful because the way we calculate it (voltage * current) -- however, battery voltage could be different depending on how charged the battery was at the start of the run. So if all runs were all done with fully charged batteries it might be comparable.

But I think there's another keyval called wh_energy_powerlogger which tries to account for this difference.

Do you happen to have the energy_powerlogger keyvals?  Or possibly could you do full runs?

Comment 7 by stephen....@arm.com, Feb 15 2018

#6: Looks like the energy_powerlogger keyvals show a 0.34% increase with the patched version, although the raw values are:

baseline: 3.665079
baseline: 3.607865
baseline: 3.741675
baseline: 3.690458

patched: 3.788502
patched: 3.714660
patched: 3.559400
patched: 3.692270

And the way the tests were run, the 8 1-hour runs were run sequentially, from a single battery charge. So you would say that's an incorrect way of running it, and it should always be run from a fully charged battery?

Anyway, I'll try to get a full run without and with the patch.

Comment 8 by stephen....@arm.com, Feb 19 2018

@sonnyrao:

The results from one full run with and without the patch are in:

     metric           | baseline | patched
----------------------+----------+--------
minutes_battery_life  | 598      | 583
wh_energy_used        | 36.746   | 36.7536
w_energy_rate         | 3.801    | 3.896
wh_energy_powerlogger | 36.361   | 36.989

Cc: tbroch@chromium.org
re #7 and 8, so 0.34% increase on powerlogger is pretty negligable.  on the full run it looks really close except for minutes_battery_life for some reason

Both of your runs were over 9.5 hours though.  I think this looks okay and we can land it, adding tbroch as FYI (in case he wants to veto).

Would be good to attach the full logs from both full runs but see no reason to veto.  Power increase seems worth the perf improvements IMO.
@tbroch: I've attached the full logs for baseline and patched. You'll notice the keys are marked INVALID_, but the failure seems to have happened in the results post-processing stage, see test-failure-callstack.txt in #3, so I don't believe this has much bearing our interpretation of the results. FWIW, my platform/crostestutils checkout is currently at 6b10c01 - "check_ethernet: try each recovery method only once". I've created https://bugs.chromium.org/p/chromium/issues/detail?id=814265 if you have any advice on fixing the issue.

@sonnyrao: Great! I'll try running the patch through the bots again, but can you CQ+2 it, if it passes, once @tbroch confirms he's okay with it still?
baseline_test_report.log
138 KB View Download
patched_test_report.log
134 KB View Download
Project Member

Comment 12 by bugdroid1@chromium.org, Feb 21 2018

Labels: merge-merged-chromeos-4.4
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/2db6b207e5b932f41a6ca533a0483641e5c73b61

commit 2db6b207e5b932f41a6ca533a0483641e5c73b61
Author: Stephen Kyle <stephen.kyle@arm.com>
Date: Wed Feb 21 14:46:06 2018

CHROMIUM: sched: reduce calls to kschedfreq

Perform sched gov throttle check when adjusting capacity, rather than
waking up the kthread to have it perform the throttle check.

This leads to roughly a quarter of the number of context switches to
kschedfreq:X happening when idling at the login screen. These threads were
often preempting the renderer thread during page loading, so this leads to
an improvement for page loading.

BUG= chromium:808376 
TEST="at login screen, check value of:
        'perf stat -e cs -p <kschedfreq:0 PID> -r 5 -- sleep 10'
      cros_workon --board=kevin start chromeos-kernel-4_4
      cros_workon_make --board=kevin chromeos-kernel-4_4 --install
      ./update_kernel --remote=<DUT IP>
      confirm value is now ~25% of previous value after reboot"

Signed-off-by: Stephen Kyle <stephen.kyle@arm.com>

Change-Id: Ibca405a28b80f205ab021802947e5f0b9cf5aa6b
Reviewed-on: https://chromium-review.googlesource.com/897816
Commit-Ready: Stephen Kyle <stephen.kyle@arm.com>
Tested-by: Stephen Kyle <stephen.kyle@arm.com>
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
Reviewed-by: Dmitry Torokhov <dtor@chromium.org>

[modify] https://crrev.com/2db6b207e5b932f41a6ca533a0483641e5c73b61/kernel/sched/cpufreq_sched.c

Labels: M-66
Status: Fixed (was: Started)
Thanks for full logs.  The invalid piece is a known issue,

  https://bugs.chromium.org/p/chromium/issues/detail?id=794601

Put the keyvals in this spreadsheet for archival & side-by-side comparisons

https://docs.google.com/spreadsheets/d/1tUAO_XJhGSBo3RJ9hhCbvl4cBs3fHPgUW55x-nqjZ-k/edit#gid=0

Confirming things look good between the runs.
Cc: dbasehore@chromium.org

Sign in to add a comment