
Issue 708809

Issue metadata

Status: Untriaged
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 3
Type: Bug




Monitor Windows timer interrupt frequency on test bots, and watch out for Go runtime bug

Project Member Reported by brucedaw...@chromium.org, Apr 5 2017

Issue description

I've heard that there are some plans to convert some of the swarming infrastructure from Python to Go.

Now, I'm sure Go is a fine language, but the Go runtime on Windows currently has a quirk with performance implications that we need to think about.

Stop me if you know about https://github.com/golang/go/issues/8687, which is a bug with the Go runtime raising the global Windows timer interrupt frequency to 1 kHz. This then changes how the Windows scheduler behaves, how timers behave, and it affects power consumption.

When I measured this years ago it caused a 0.3 W increase in power usage. Your mileage may vary, a lot, but we clearly need to watch out for this, especially since the timer frequency is raised any time any Go program is running.

So... things we might want to do include:
- Monitor the timer interrupt frequency so we know if something is raising it. This is easy to do in various ways. Note that Chrome sometimes raises it while running, but ideally it should not be raised on a quiet system.
- Apply one of the packages discussed in the bug to our Go programs so that they lower it back to normal immediately.
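
For the monitoring item, here is a minimal Python sketch of one way to read the current timer interval. It assumes that calling the undocumented ntdll export NtQueryTimerResolution through ctypes is acceptable on the bots. The helper names (query_timer_resolution, is_raised, interval_to_ms) are made up for illustration, and the sketch returns None on non-Windows machines rather than failing.

```python
import ctypes
import sys

# Windows reports timer intervals in 100-nanosecond units.
UNITS_PER_MS = 10_000


def interval_to_ms(units_100ns):
    """Convert a timer interval from 100 ns units to milliseconds."""
    return units_100ns / UNITS_PER_MS


def is_raised(current_units, coarsest_units):
    """True if the current interval is shorter than the coarsest (default) one,
    which means something has raised the timer interrupt frequency."""
    return current_units < coarsest_units


def query_timer_resolution():
    """Return (coarsest, finest, current) intervals in 100 ns units.

    Returns None when not running on Windows. Relies on the undocumented
    ntdll export NtQueryTimerResolution, so treat this as an assumption.
    """
    if sys.platform != "win32":
        return None
    coarsest = ctypes.c_ulong()
    finest = ctypes.c_ulong()
    current = ctypes.c_ulong()
    status = ctypes.WinDLL("ntdll").NtQueryTimerResolution(
        ctypes.byref(coarsest), ctypes.byref(finest), ctypes.byref(current))
    if status != 0:  # NTSTATUS 0 is STATUS_SUCCESS
        raise OSError("NtQueryTimerResolution failed: 0x%08x"
                      % (status & 0xFFFFFFFF))
    return coarsest.value, finest.value, current.value


if __name__ == "__main__":
    result = query_timer_resolution()
    if result is None:
        print("Not running on Windows; nothing to check.")
    else:
        coarsest, finest, current = result
        state = "RAISED" if is_raised(current, coarsest) else "default"
        print("current timer interval: %.3f ms (%s)"
              % (interval_to_ms(current), state))
```

A bot could run something like this periodically on an otherwise quiet machine and log whenever the state is RAISED. It only says that the interval is raised, not which process raised it.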

See this blog post for more back story:
https://randomascii.wordpress.com/2013/07/08/windows-timer-resolution-megawatts-wasted/

For spot checking the timer frequency you can use the Sysinternals ClockRes tool.
For nanosecond-accurate measurement of when the timer frequency is changed (probably overkill) you can use ETW tracing.
For finding who has raised the timer frequency you can use ETW tracing or "powercfg /energy /duration 0".
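
If the spot check is scripted, the ClockRes output has to be parsed. Below is a rough Python sketch of that. The sample text is my assumption about what the tool prints (three "timer interval" lines in milliseconds) and should be checked against the real output. The function names are hypothetical.

```python
import re

# Assumed shape of ClockRes output; verify against the real tool.
SAMPLE_OUTPUT = """\
Maximum timer interval: 15.625 ms
Minimum timer interval: 0.500 ms
Current timer interval: 1.000 ms
"""

_INTERVAL_LINE = re.compile(
    r"^(Maximum|Minimum|Current) timer interval:\s*([0-9.]+)\s*ms",
    re.MULTILINE)


def parse_clockres(text):
    """Return a dict such as {'maximum': 15.625, ...} with values in ms."""
    return {name.lower(): float(value)
            for name, value in _INTERVAL_LINE.findall(text)}


def timer_is_raised(intervals):
    """True if the current interval is shorter than the maximum (default) one."""
    return intervals["current"] < intervals["maximum"]


if __name__ == "__main__":
    intervals = parse_clockres(SAMPLE_OUTPUT)
    print(intervals, "raised" if timer_is_raised(intervals) else "default")
```

In practice the text would come from running the tool (for example with subprocess) instead of from SAMPLE_OUTPUT.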

 
We don't have any long-lived Go applications running on developers' workstations.

I don't have concerns about the workers themselves and don't see value in fixing tooling that is meant to run only on the infrastructure. The impact is ~zero.

Maybe add a perf test to make sure the tests themselves are not missing the timeEndPeriod() call?
There is no need to fix infrastructure-only machines. Agreed that the impact is tiny. It is only important on machines that will be *measuring* performance.

When processes end they implicitly get a timeEndPeriod call, but if one test raised the timer frequency and didn't lower it, that would be bad. Adding a test for that would be good, but where? Ideally it would be the last test to run in every test process, which I don't think is directly supported right now.
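
One possible shape for such a check, as a hedged Python sketch: wrap a test (or the whole test run) in a context manager that reads the timer interval before and after, and fails if the interval ended up shorter. The names assert_timer_restored and TimerResolutionLeak are hypothetical. query_current_interval is an assumed callable that returns the current interval in 100 ns units, to be supplied by whatever measurement method the harness uses.

```python
import contextlib


class TimerResolutionLeak(AssertionError):
    """Raised when code left the timer interrupt frequency raised."""


@contextlib.contextmanager
def assert_timer_restored(query_current_interval):
    """Fail if the timer interval is shorter on exit than it was on entry.

    query_current_interval is a hypothetical callable returning the current
    timer interval in 100 ns units.
    """
    before = query_current_interval()
    yield
    after = query_current_interval()
    if after < before:
        raise TimerResolutionLeak(
            "timer interval went from %d to %d (100 ns units); "
            "a timeEndPeriod() call is probably missing" % (before, after))


if __name__ == "__main__":
    # Fake readings standing in for real measurements: 15.625 ms, then 1 ms.
    readings = iter([156250, 10000])
    try:
        with assert_timer_restored(lambda: next(readings)):
            pass  # the test body would run here
    except TimerResolutionLeak as leak:
        print("leak detected:", leak)
```

This does not solve the "last test to run" problem. It only shows what the check itself could look like once the harness has a place to hook it in.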

Cc: dtu@chromium.org mar...@chromium.org martiniss@chromium.org nednguyen@chromium.org
Components: Infra>Client>Perf
Owner: ----
Status: Untriaged (was: Assigned)
cc'ing a few people who may have an opinion.
Hmm, I thought that when perf tests are running, we make sure that no background processes are being run by infra?
Re #4: That's the plan, but we don't do that currently.
Cc: charliea@chromium.org
Never mind my comment in #4, I understand the issue now. It is not about a background infra process; rather, switching infra code to Go can raise "the global Windows timer interrupt frequency to 1 kHz".
I agree that there should be some test or some "prepare_machine_for_perf_testing" step whose only job is to tune OS parameters for perf testing.

+Marc: is it possible to run something like a prepare_machine_for_perf_testing script with root access on a swarming bot? I imagine that some of the configuration may require root.
