
Issue 681918


Issue metadata

Status: Duplicate
Owner:
Closed: Mar 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug




puppet causing noise on Telemetry Mac power benchmarks due to periods of ~80% CPU usage

Project Member Reported by charliea@chromium.org, Jan 17 2017

Issue description

It seems that puppet is causing noise in our Telemetry Mac power benchmarks (mac-rel-retina).

Looking at the perf dashboard for average power use while sitting idle on about:blank (see puppet_ran.png), there are clear spikes for particular benchmark runs. Digging into the trace for this run (see puppet.html), there's a clear section of the trace towards the end where the power spikes. Our CPU snapshotting mechanism shows that, during this time, puppet is running and consuming 80% of the CPU (see puppet_cpu_snapshot.png).

Compare this to one of the lower-power traces (see non_puppet.html), where no such high-power area of the trace exists.

maruel@, is there any way that we can prevent puppet from running during benchmarks so that it doesn't introduce noise into the results? Do you have any idea who I might be able to talk to about this?
 
pupper_ran.png (86.4 KB)
puppet.html (3.9 MB)
puppet_cpu_snapshot.png (258 KB)
non_puppet.html (3.9 MB)
Cc: -charliea@google.com
Cc: rnep...@chromium.org

Comment 3 by mar...@chromium.org, Jan 17 2017

Cc: iannucci@chromium.org
Robbie worked on a plan for this but that will be only for tasks run on Swarming.
Cc: benhenry@chromium.org dpranke@chromium.org
Dirk, for background: Charlie is working on power benchmarks using BattOr. Power benchmarks are even more sensitive to background noise than regular benchmarks. We've added a "CPU snapshot" feature into tracing where we can look at spikes in power usage that seem unrelated to Chrome and correlate with other processes on the machine that were running and using CPU.
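
For illustration only, here is a minimal sketch of the kind of correlation described above: given a snapshot of per-process CPU usage, flag any non-Chrome process that is busy enough to show up as power noise. This is not the actual tracing implementation. The function names, the 50% threshold, and the input format (text shaped like the output of `ps -Ao pcpu,comm`) are all assumptions made for the example.

```python
from typing import List, NamedTuple, Sequence


class ProcessSample(NamedTuple):
    cpu_percent: float
    command: str


def parse_ps_output(ps_output: str) -> List[ProcessSample]:
    """Parse text shaped like `ps -Ao pcpu,comm` output (assumed format)."""
    samples = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 1)  # CPU column, then the rest as the command
        if len(parts) != 2:
            continue
        try:
            cpu = float(parts[0])
        except ValueError:
            continue  # ignore rows whose first column is not a number
        samples.append(ProcessSample(cpu, parts[1].strip()))
    return samples


def noisy_processes(
    samples: Sequence[ProcessSample],
    threshold_percent: float = 50.0,  # hypothetical cutoff, not from the bug
    ignore: Sequence[str] = ("Google Chrome",),
) -> List[ProcessSample]:
    """Return non-ignored processes at or above the threshold, busiest first."""
    flagged = [
        s
        for s in samples
        if s.cpu_percent >= threshold_percent
        and not any(name in s.command for name in ignore)
    ]
    return sorted(flagged, key=lambda s: s.cpu_percent, reverse=True)
```

Running `noisy_processes(parse_ps_output(text))` on a snapshot taken during a power spike would list candidates such as puppet, which could then be lined up against the spike's time window in the trace.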

For this particular issue, it's puppet, but labs recently updated their version of gagent to fix a similar issue.

The long term plan for this has been swarming dark mode: https://docs.google.com/document/d/1xPhPCtXXsRSqY74hh4DjAt2hvNDiq0w1XLe7VmumpkA/edit

Our desktop bots are now swarmed, and Charlie is primarily looking at ways to reduce noise on desktop.
Given that all perf benchmarks will soon be run on swarmed bots (still waiting on you, Android), a plan that addresses this only on swarmed bots is fine with me. However, it is important to me that this gets resolved in the short term and that we don't have to wait on a distant long-term fix.
Cc: bpastene@chromium.org
Owner: friedman@chromium.org
Status: Assigned (was: Untriaged)
Assigning to friedman, but he's still out on leave. Given that we need this when Android is on Swarming, let's make sure Ben and Elliott sign off before moving all of Android infra to swarming.
This has nothing to do with Android, it's about Mac, so not a blocker for android perf -> swarming.
benhenry@, do you have an estimate of when this might be fixed? As the graphs above show, this is adding quite a bit of noise to BattOr power benchmarks.
benhenry@ and I talked offline, and he confirmed my suspicion that friedman@ isn't likely to get to this within the next week or two because he isn't due back from paternity leave until 1/30. I told benhenry@ that I'd really like this done within a month from now, but it isn't ruining our results so much that it's a P1.
Is this not a dupe of crbug.com/416072 ?
Sorry - I filed it as a separate bug because it wasn't clear to me if Puppet even *should* be consuming as much CPU as it currently is. I know that when we noticed this for gagent, it was a simple bug that only required a roll of gagent in order to fix it. In an ideal world, the fix would be as easy as that for puppet.
Alright, we really need to move the ball forward here.

Elliott - what question are you waiting for speed to answer? Please restate explicitly.
Labels: Performance-Power
Comment 10, this seems like a dupe of the main issue.
Mergedinto: 416072
Status: Duplicate (was: Assigned)
Cool story, bro. Here, let me help you.
