New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 736746 link

Starred by 1 user

Issue metadata

Status: Archived
Owner:
Closed: Jul 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug



Sign in to add a comment

factory/kernel: Fix the default way to control CPU frequency.

Project Member Reported by hungte@chromium.org, Jun 26 2017

Issue description

In factory, we usually need to change cpu frequency settings in order to fully control CPU (for example, for stress-testing CPU, or to heat up for thermal tests).

Previously this can be done by:
 (1) enable or disable service 'thermal' or 'dptf'
 (2) writing 'interactive' or 'userspace' to /sys/devices/system/cpu/cpu*/cpufreq

However, on most recent Chromebooks, both 'interactive' and 'userspace' will get an "[Errno 22] Invalid argument".

Is it still possible to control cpufreq on ChromeOS? Or will you recommend us to only controlling thermal/dptf, and not touching cpufreq?

From Duncan:
"I think with the switch to the intel_pstate driver this may not work as expected (or even be possible via sysfs) but we can investigate.
Please file a bug and if we can't figure out how with the new driver we can add Intel and see if they can implement something."
 
Cc: marc.her...@intel.com
HWP might make this even more difficult.

It looks like drivers/cpufreq/intel_pstate.c:intel_pstate_set_policy() exits early in the case of HWP.

Comment 2 by tbroch@chromium.org, Jun 27 2017

I think we need to step back and ask is this even necessary still.  

Hungte can you outline where existing frequency control is in factory tests and what we hope to accomplish by setting the frequency?


Comment 3 by hungte@chromium.org, Jun 27 2017

currently this is used for two tests.

1. CPU stressing, via stessapptest (https://github.com/stressapptest/stressapptest).

We run multiple tests in parallel and want to make sure CPU is running at full speed.

2. Thermal tests. The thermal tests need to verify if system can run in given temperature, so we have to first disable thermal/dptf services, and try to heat up by stressing CPU (yes, using stressapptest again), and see if the thermal sensors can raise to given readings.

3. chipset specific test. Some ARM systems may want to try if the core can really run in specified frequency, for example big/little arch.


Since the cpufreq has been broken for a while, I suspect if we really need that for item 1 and 2.

It is OK if the answer is "there's no need to tune CPU frequency for stress tests or heating CPU, and there's no need to verify if CPU can run in given frequency". Just want to make sure I can get some confirm before removing that code in factory side.
> In factory, we usually need to change cpu frequency settings in order to fully control CPU...

Drop "fully". From https://www.kernel.org/doc/Documentation/cpu-freq/intel-pstate.txt

"For contemporary Intel processors, the frequency is controlled by the
processor itself and the P-State exposed to software is related to
performance levels.  The idea that frequency can be set to a single
frequency is fictional for Intel Core processors. Even if the scaling
driver selects a single P-State, the actual frequency the processor
will run at is selected by the processor itself."

Now there are still strings you can pull / policies which can be tweaked, see URL above. Depending on the purpose(s) of these tests (I don't know about them) you may still want to configure things like "make sure the CPU runs/heats up as fast as possible". However this is still going to be thermally dependent in one way or the other because you can't escape physics. At least switch the terminology to "max performance" and not "max frequency" any more since the latter is misleading. 

> HWP might make this even more difficult.

For two reasons: 1. HWP ("SpeedShift") means software has much less control than ever before; 2. Only newer Core processors feature HWP, so tests could need conditional logic.

PS: please Cc: Brian Bian.

Cc: brian.b...@intel.com
> > HWP might make this even more difficult.
> 
> For two reasons: 1. HWP ("SpeedShift") means software has much less control than ever before

Yeah that was my thought too, this may not be possible at all with HWP.

for #1 - we want it to run as fast as possible... within thermal limits.  i.e. we can't always run it at max frequency (turbo or other) or we risk damaging the hardware.

for #2 - this might benefit from re-thinking what we are trying to test.  For a real-world scenario the system (x86 with DPTF anyway) is often tuned to around one specific sensor that is chosen to correlate to a skin temp.  the factory already logs temperatures during run-in and it might be interesting to check those against a threshold value to determine if it is getting too hot.  However often this tuning is not done until late stages of the project so it would be in flux during development, creating even more factory churn and burden...

Comment 6 by hungte@chromium.org, Jul 10 2017

So sounds like I should change them to

 'performance' when we need to stress cpu, and
 'powersave' when leaving stress mode,
 and stop trying to set fixed speed (scaling_setspeed).

Does that sound ok?
That sounds like a correct abstraction, however he devil is in the details - for instance in the differences between hardware architectures and generations.

Comment 8 by hungte@chromium.org, Jul 11 2017

Summary: factory/kernel: Fix the default way to control CPU frequency. (was: kernel: Need a way to control CPU frequency.)
Yes I know, but it's better than nothing (it's broken currently).
We can revise this again when we see a different arch/gen that those settings do not work.
Project Member

Comment 9 by bugdroid1@chromium.org, Jul 13 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/factory/+/d4638bf38dfce49fb18748d88d1f2ec983ca8132

commit d4638bf38dfce49fb18748d88d1f2ec983ca8132
Author: Hung-Te Lin <hungte@chromium.org>
Date: Thu Jul 13 07:48:19 2017

test/utils: Fix cpufreq_manager to work on modern x86 CPUs.

The governor 'interactive' and 'userspace' are not supported by most
Chromebooks today, also scaling_setspeed is not supported.

BUG= chromium:736746 
TEST=make test; manually started factory UI and not seeing errors.

Change-Id: Ib3f48db81161bd6dc79076166f142fa4310a61dc
Reviewed-on: https://chromium-review.googlesource.com/566201
Commit-Ready: Hung-Te Lin <hungte@chromium.org>
Tested-by: Hung-Te Lin <hungte@chromium.org>
Reviewed-by: Hung-Te Lin <hungte@chromium.org>
Reviewed-by: Pi-Hsun Shih <pihsun@chromium.org>

[modify] https://crrev.com/d4638bf38dfce49fb18748d88d1f2ec983ca8132/py/test/utils/cpufreq_manager.py

Owner: hungte@chromium.org
Status: Fixed (was: Untriaged)
Project Member

Comment 11 by bugdroid1@chromium.org, Jul 13 2017

Labels: merge-merged-factory-eve-9667.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/factory/+/d3009c5c9168d17a595c426183e6b95927a41407

commit d3009c5c9168d17a595c426183e6b95927a41407
Author: Hung-Te Lin <hungte@chromium.org>
Date: Thu Jul 13 14:38:45 2017

test/utils: Fix cpufreq_manager to work on modern x86 CPUs.

The governor 'interactive' and 'userspace' are not supported by most
Chromebooks today, also scaling_setspeed is not supported.

BUG= chromium:736746 
TEST=make test; manually started factory UI and not seeing errors.

Change-Id: Ib3f48db81161bd6dc79076166f142fa4310a61dc
Reviewed-on: https://chromium-review.googlesource.com/567843
Reviewed-by: Hung-Te Lin <hungte@chromium.org>
Commit-Queue: Hung-Te Lin <hungte@chromium.org>
Tested-by: Hung-Te Lin <hungte@chromium.org>
Trybot-Ready: Hung-Te Lin <hungte@chromium.org>

[modify] https://crrev.com/d3009c5c9168d17a595c426183e6b95927a41407/py/test/utils/cpufreq_manager.py

Comment 12 by dchan@chromium.org, Jan 22 2018

Status: Archived (was: Fixed)

Sign in to add a comment