Issue 696835

Starred by 1 user

Issue metadata

Status: WontFix
Owner: ----
Closed: Mar 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug




Temperature telemetry displays improvements as regressions

Project Member Reported by tguilbert@chromium.org, Feb 28 2017

Issue description

Currently, temperature telemetry tests do not have any knowledge of whether a decrease in temperature is good or bad.

For example, this alert classifies a decrease in average temperature as a regression:
https://chromeperf.appspot.com/group_report?keys=agxzfmNocm9tZXBlcmZyFAsSB0Fub21hbHkYgIDgxJiOjAgM

crouleau@ looked into this and found that this seems to be the only Chromium file that mentions the board temperature (https://cs.chromium.org/chromium/src/tools/perf/metrics/power.py?q=board_temperature+package:%5Echromium$&l=189).

crouleau@ also pointed out that none of the metrics in that file use the improvement_direction (https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/value/improvement_direction.py) property of ScalarValue (https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/value/scalar.py#L16).

nednguyen@, do you have a suggestion as to how to fix this?

 
Cc: eakuefner@chromium.org charliea@chromium.org
Sorry, what is the context of this? Are you monitoring the temperature in Telemetry tests?

One thing to be aware of is that those power metrics are meant to be replaced by BattOr, which is more precise & more trustworthy. So please don't count on those.
Is there any case when a decrease in temperature *is* a regression? It seems like a decrease in temperature is pretty universally good for computers.
The context is that we are getting Telemetry alerts from decreases in temperature, and we figured it would be good to turn them off.
That means this metric has not worked for a very long time. I think it's better to stop monitoring it & remove it.
I think that the metric does basically work: The data is consistent. It just flags improvements as regressions and does not flag regressions at all. All we need to do is to teach Telemetry which direction is correct. I could write a quick changelist to do this, but I'm not entirely sure whether or not it would work or how to test it.
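A minimal sketch of what "teaching Telemetry which direction is correct" amounts to. This is not Telemetry's actual code. Direction and classify_change are hypothetical names, and Direction only stands in for the role of the improvement_direction property mentioned in the issue description. The idea is that the same temperature drop is classified differently depending on the direction declared for the metric:

```python
from enum import Enum


class Direction(Enum):
    """Hypothetical stand-in for a metric's improvement direction."""
    UP = "up"      # larger values are better
    DOWN = "down"  # smaller values are better


def classify_change(before, after, direction):
    """Classify a change in a metric value given its improvement direction."""
    if after == before:
        return "no change"
    went_up = after > before
    # A change is an improvement when it moves in the declared direction.
    improved = went_up == (direction is Direction.UP)
    return "improvement" if improved else "regression"


# A temperature drop, with the metric declared "smaller is better":
print(classify_change(50.0, 45.0, Direction.DOWN))
# The same drop, if the metric were treated as "larger is better":
print(classify_change(50.0, 45.0, Direction.UP))
```

Under this assumption, declaring the temperature metrics as "smaller is better" would flip the classification of the alert above, and a temperature increase would be flagged instead.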
I think the point that Ned was raising is: if this metric hasn't been providing useful alerts since it was written, and no one has bothered to fix it until this point, how useful is the metric really?
Cc: johnchen@chromium.org
Status: WontFix (was: Untriaged)
Probably no one bothered to fix it because we didn't realize it was broken. How would we know it was broken?

It seems like we can just mark this as WontFix and try to be careful not to make similar mistakes in our tbmv2 implementation: issue 700160.

As far as I know, power metrics in tbmv2 do not contain temperature, and BattOr doesn't have any way to measure temperature. So if we want to keep measuring temperature in the future, we'll have to port the existing code (which uses Intel machine-specific registers) to tbmv2. How much interest do we have in continuing to support temperature measurements?
I don't think we care about temperature except as a proxy for power usage. BattOr measures power usage directly, so we don't care about temperature.
^ What Caleb said. Is the reason we care about temperature to have a proxy for power usage on systems where BattOrs aren't available?
