Implement TBMv2 version of thread_times metric. |
||||||||||||
Issue descriptionNed suggested that I consider try out the thread_times measurement as a more granular metric for measuring CPU power consumption. See Issue 640054 for more details. I did this. The first thing I noticed is that if the user story doesn't have a page interaction, the measurement fails in TimelineController """ 87 if len(self._smooth_records) == 0: 88 raise legacy_page_test.Failure('No interaction record was created.') """"
,
Aug 29 2016
In the long run, I think that it probably makes sense to have thread_times as a diagnostic for our other power metrics. That way, thread_times will help give context to changes in the power data (i.e. this regression follows a clear spike in thread times), but we're not paying attention to thread_times changes that don't affect power (which IMO we shouldn't be: at least not in power benchmarks).
,
Aug 29 2016
What do you mean by "thread_times changes that don't affect power"?
,
Aug 29 2016
Sorry: looking back, that wasn't the best explanation. In TBMv2, there are metrics, which are generally the things that you actually care about going up or down. From the power team's standpoint, power is the thing that we care about going up or down, and is thus the metric that we want to be alerting on. However, in TBMv2, we also have the option of giving additional context to each data point in the metric through something called a diagnostic. A diagnostic is basically numerical metadata that you can add to a value that helps you better understand why it might have gone up or down. What I'm saying is that, at least from the power standpoint, we don't actually care about whether thread_times increases unless it also causes energy consumption to increase. (Note: maybe other benchmarks should care, but the power benchmarks shouldn't.) Instead thread_times as a metric independent of our power metrics, we should attach thread_times as a diagnostic to each power metric that helps provide information on why the power may have increased. As far as the perf dashboard UI goes, this means that instead of having two separate metrics (thread times and average_power, for example) that you have to drag to the graph in the perf dashboard individually, there'd only be one metric, average_power. When you click on each data point in the average_power graph, you'd get something like: Revision range: aaaaa-bbbbb Average power: 15mW (Thread times: 350ms) The main reason for doing this is that our power metrics should give an accurate reflection of whether the state of power w/in Chrome is getting better or worse. If we managed to decrease thread times but increase average power by offloading more work to the graphics card. A couple of other candidates for diagnostics might be things like: - GPU time - Idle wakeups - Screen brightness - % of CPU occupied by non-Chrome processes at start of benchmark
,
Sep 8 2016
Thanks for the explanation. I agree we should add CPU time as a diagnostic to average power. That being said, CPU time might prove worthwhile as a metric, since average_power is computed via BattOr, which measures all processes, whereas CPU time can be scoped to only the processes that we care about. This means that CPU time is much less susceptible to noise from other processes. It will be prone to incorrect diagnoses when we upload work from CPU to GPU, but that's a rather exceptional use case. Perhaps average_power should be a diagnostic for CPU time?
,
Sep 8 2016
It looks like this task mostly involves porting a bunch of logic from tools/perf/metrics/timeline.py into TBMv2. I think this should be as straight forward as adding a new metric to third_party/catapult/tracing/tracing/metrics. I'll give that a shot and see what happens.
,
Sep 10 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/c4b1e75d9334ca63343d5e44768c246d796c8386 commit c4b1e75d9334ca63343d5e44768c246d796c8386 Author: nednguyen <nednguyen@google.com> Date: Sat Sep 10 12:00:05 2016 Manually roll src/third_party/catapult/ da6d44e4a..c0b988891 (3 commits). https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/da6d44e4a2a5..c0b988891fd1 $ git log da6d44e4a..c0b988891 --date=short --no-merges --format='%ad %ae %s' 2016-09-09 erikchen Add a new cpu time TBMv2 system health metric. 2016-09-09 nednguyen Update the lock operation in cloud_storage with better implementation using py_utils.lock 2016-09-09 rnephew [BattOr] Kill BattOr shell if there is a problem with communicating during clock sync. BUG= 640312 , 637904 , 645106 , 645720 TBR=catapult-sheriff@chromium.org, jbudorick@chromium.org Review-Url: https://codereview.chromium.org/2326063004 Cr-Commit-Position: refs/heads/master@{#417833} [modify] https://crrev.com/c4b1e75d9334ca63343d5e44768c246d796c8386/DEPS [modify] https://crrev.com/c4b1e75d9334ca63343d5e44768c246d796c8386/build/android/test_runner.pydeps
,
Sep 12 2016
nednguyen: I put together a CL to deal with "overhead" slices: https://codereview.chromium.org/2334863002/ However, it looks like no overhead slices are emitted by Chrome tracing. Maybe this should be removed? https://cs.chromium.org/search/?q=trace_event_overhead&sq=package:chromium&type=cs https://cs.chromium.org/chromium/src/tools/perf/metrics/timeline.py?q=trace_event_overhead&sq=package:chromium&dr=C&l=99
,
Sep 15 2016
Tried to enable the metric: https://codereview.chromium.org/2338433002/ Running into issues where Telemetry/tracing can't handle the size of outputted trace. > Same issue: > https://build.chromium.org/p/tryserver.chromium.perf/builders/mac_retina_perf_cq/builds/499/steps/battor.steady_state/logs/stdio > """ > INFO:root:Trace (TraceDataPart("cpuSnapshots")) of size 25539 bytes saved. > INFO:root:Trace (TraceDataPart("tabIds")) of size 40 bytes saved. > INFO:root:Trace (TraceDataPart("telemetry")) of size 22057 bytes saved. > INFO:root:Trace (TraceDataPart("traceEvents")) of size 330205233 bytes saved. > [ RUN ] /tmp/tmp7WE6SW.html > TraceImportError: Invalid string length > """
,
Sep 15 2016
,
Sep 20 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/68b995c4d3eb37217ce143c225a7d2eed81d320a commit 68b995c4d3eb37217ce143c225a7d2eed81d320a Author: erikchen <erikchen@chromium.org> Date: Tue Sep 20 01:23:15 2016 Add cpuTimeMetric to BattOr benchmark. BUG= 640312 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.perf:android_s5_perf_cq;master.tryserver.chromium.perf:linux_perf_cq;master.tryserver.chromium.perf:mac_retina_perf_cq;master.tryserver.chromium.perf:winx64_10_perf_cq Review-Url: https://codereview.chromium.org/2338433002 Cr-Commit-Position: refs/heads/master@{#419624} [modify] https://crrev.com/68b995c4d3eb37217ce143c225a7d2eed81d320a/tools/perf/benchmarks/battor.py [modify] https://crrev.com/68b995c4d3eb37217ce143c225a7d2eed81d320a/tools/perf/page_sets/idle_after_loading_stories.py
,
Sep 22 2016
I have a CL in progress that refactors the cpu time metric, and fixes some small logic issues. https://codereview.chromium.org/2351963006/ There are a couple more changes that need to happen: * Only consider slices that comes from Chrome processes. Note that chrome_model_helper only knows how to detect Renderer, Browser, and GPU processes. * Ignore async slices. * For total "trace" duration, only consider slices that come from Chrome processes.
,
Sep 24 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/175242c00bc4e762699f2b4fe28e667d672f65a3 commit 175242c00bc4e762699f2b4fe28e667d672f65a3 Author: catapult-deps-roller <catapult-deps-roller@chromium.org> Date: Fri Sep 23 23:58:41 2016 Roll src/third_party/catapult/ 7bd10eda4..a8deb272b (1 commit). https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/7bd10eda47f1..a8deb272b550 $ git log 7bd10eda4..a8deb272b --date=short --no-merges --format='%ad %ae %s' 2016-09-23 erikchen Add range of interest support to cpuTimeMetric. BUG= 640312 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel TBR=catapult-sheriff@chromium.org Review-Url: https://codereview.chromium.org/2368723002 Cr-Commit-Position: refs/heads/master@{#420793} [modify] https://crrev.com/175242c00bc4e762699f2b4fe28e667d672f65a3/DEPS
,
Sep 26 2016
My first thought for C#12 was that we could emit a origin:chrome metadata parameter for each Chrome process. Bug doing some digging around, it looks like we dynamically construct the process object by glomming together slices, and then assign the name field by using the existing *Helper processes and looking for specific signatures. I chatted with benjhayden@, and he suggested that processes have a parent-pid, so I could trace everything back to the root process. This assumes a tree hierarchy, with the browser as root], but seems like it could work. He suggested I cc primiano.
,
Sep 27 2016
Specifically, I wondered if per-process TracingControllers could log ppid in metadata events. Reconstructing the process tree would require that all of Chrome's processes run a TracingController, not just the browser and renderer processes, and that all of the per-process metadata events would be preserved into the log instead of rotated out of the ring buffer. Now that I think about it, though, logging pids is a security no-no, whereas a simple origin:chrome tag would be innocuous. Hm. What about coming at it from the other direction? Could you filter out events from processes that are known to not be chrome? Are the non-chrome tracing agents more easily enumerable than chrome processes? I would love to see the model_helpers/ get some attention. Definitely talk to primiano about what TracingController can do to help here. :-)
,
Sep 27 2016
> Now that I think about it, though, logging pids is a security no-no, whereas a simple origin:chrome tag would be innocuous. Hm. I think we already do that, no? each individual trace event has the pid. Definitely supportive to add some metadata to tell deterministically (without magic guesses) which process is which. I guess I just miss a bit of context (didn't have time to read this full bug yet). Can we talk about this in the upcoming PI meeting? (sorry, bit out of bandwidth this week)
,
Sep 27 2016
Right, the current process's pid is logged, but I thought that it would be bad form to log another processes's pid. But you're right, if malicious code is injected, then it wouldn't need to look in the trace log, it can just call ppid() itself if it wants. Maybe I'm overcomplicating it, maybe logging ppid and reconstructing the process tree would work. The Performance Instrumentation and Modelling weekly on Thursday, right? (I think all perf-insights (the old PI) is gone. :-) Sure, I'll talk to fmeawad about putting it on the agenda.
,
Sep 28 2016
,
Sep 30 2016
The question I am trying to answer is: What is the CPU usage of Chrome in a trace. More generally: Chrome does too much [unnecessary] work. I want a metric that can capture the amount of work that Chrome is doing, so that we can measure this and drive it down. If all Chrome work is captured by task schedulers, and we log the cpu time for each top level task, this should be relatively easy to compute. This information is contained in a trace that includes the "toplevel" category, but there are several points that need fixing. 1) Traces can include slices from other sources [systrace, telemetry, etc.]. We don't want to count this. Note that we don't want to count systrace since that would double-count work. 2) When looking at some real traces, I noticed that there are slices with the "toplevel" label that are nested in other slices. Maybe this isn't the right label to be using. I guess another solution is to add another label to tasks posted by task schedulers, and only count slices that have that special label. 3) I wonder how we measure CPU duration vs slice duration. e.g. if a task spends a long time waiting on disk [but CPU is idle], slice duration should be long and CPU duration should be low. Is this actually the case?
,
Sep 30 2016
Re #19, the other tricky thing is dealing with being descheduled. We may want to know CPU duration vs slice duration vs self duration.
,
Sep 30 2016
#19 1) How would counting systrace slices double-count cpu activity? Is there a bug with cpuSelfTime? Do we need to change systrace somehow, like make it set the pid field of its events to its own pid, and use a different field to indicate the pid of the target process? I am not very familiar with systrace, but if it's doing something wrong then we can fix it, right? 2) Can you filter by slice title as well as category? 3) I believe so, yes. Slice.cpuDuration and Slice.cpuSelfTime should handle that case, unless you've found clear indication of a specific bug.
,
Sep 30 2016
Just chatted with benjhayden, fmeawad, nednguyen. * Only Chrome ever emits the cpuDuration property, so in the short term, that's an acceptable way of distinguishing whether a slice comes from Chrome. * For normalization time, we can find the browser process, so we can find the first/last slices there, and that will allow us to not be egregiously off with normalization.
,
Oct 28 2016
I was thinking about the second bullet point in c#22. I no longer think it's a good idea. Doing so makes the assumption that there is a browser process in a system that is otherwise Chrome-agnostic. This has potential failure conditions, just like the existing behavior, which I think is "good enough". The thread_times metrics looks pretty stable, so I'm going to close this bug. Feel free to reopen if you think there is more discussion necessary.
,
Oct 29 2016
Erik: I disagree with your assessment in @23. To be precise, there metric here is intended to be "cpu time usage of Chrome". What we are computing at the moment is the total number of cpu time of everything in the traces normalized by the trace's total time. It's just happens to approximate the cpu usage well enough because: 1) Our combined trace currently only have cpu time data on Chrome's event. 2) Our combined trace's begin & end time is not to far off from when Chrome actually starts & end. We won't know when conditions 1) & 2) will be break, so I think the metrics' implementation should not be considered complete.
,
Oct 29 2016
,
Oct 31 2016
I've been doing some CPU usage testing of Chrome and Edge during media playback and one think that this reminds me of is that CPU usage caused by chrome is not entirely in chrome.exe. When playing back h264 video in Chrome on a SurfaceBook I see three Chrome processes that use 4.65%, 4.40% and 3.25% of CPU time. I also see dwm.exe (desktop compositing) using 0.80% and audiodg.exe (audio mixing) also using 0.80%. The dwm.exe and audiodg.exe CPU time is triggered by chrome and is therefore a real part of the cost of playing video. I also a small amount of CPU time in the system process - it is possible for Chrome to trigger significant work here. Sometimes the biggest CPU optimizations actually show up as reductions in these other processes. In some cases changes in how we handle video, audio, and networking will change the CPU usage in these other processes, and ideally we would account for this. One way to account for all of these costs would be to monitor idle time instead of Chrome.exe time. This has the advantage of automatically accounting for everything. It also has the disadvantage of being more sensitive to noise caused by random Windows processes that spring to life for random Windows reasons. Something to consider.
,
Oct 31 2016
CL landed: https://codereview.chromium.org/2446423009/ brucedawson: We do have existing mechanisms to monitor CPU usage of all processes on Mac, although the values aren't summed. I believe the telemetry team is actively working on tracking down and getting rid of random processes that start themselves while we're runnign tests. This bug tracks TBMv2 thread_times, which by definition uses data from tracing.
,
Nov 1 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/25ae60df508ec4ae305ba3b5610793af69e5a0be commit 25ae60df508ec4ae305ba3b5610793af69e5a0be Author: catapult-deps-roller <catapult-deps-roller@chromium.org> Date: Tue Nov 01 01:03:35 2016 Roll src/third_party/catapult/ 3c6de14e2..b28a7dab0 (2 commits). https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/3c6de14e28c7..b28a7dab0434 $ git log 3c6de14e2..b28a7dab0 --date=short --no-merges --format='%ad %ae %s' 2016-10-31 erikchen Normalize thread_times metric against the bounds of the browser process. 2016-10-31 eakuefner [Telemetry] Add support for macOS Sierra BUG= 640312 Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md If the roll is causing failures, see: http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel TBR=catapult-sheriff@chromium.org Review-Url: https://codereview.chromium.org/2463743004 Cr-Commit-Position: refs/heads/master@{#428893} [modify] https://crrev.com/25ae60df508ec4ae305ba3b5610793af69e5a0be/DEPS
,
Nov 3 2016
I was trying to use the thread_times metric on a real trace: https://bugs.chromium.org/p/chromium/issues/detail?id=646211#c6 The problem I'm having is that the thread_times metric includes threads from the chrome://tracing process which is very CPU intensive. As a short term solution, I propose checking the process's labels i.e. if (processes.labels[0] == "chrome://tracing") I imagine we'll want a better long term solution. benjhayden@, nednguyen@: does this sound okay in the short term?
,
Nov 3 2016
Erik: I forgot to mention that we have rendererHelper.isChromeTracingUI method (https://github.com/catapult-project/catapult/blob/master/tracing/tracing/metrics/system_health/loading_metric.html#L426)
,
Nov 3 2016
Ha. It would appear someone beat me to it:
"""
this.isChromeTracingUI_ =
process.labels !== undefined &&
process.labels.length === 1 &&
process.labels[0] === 'chrome://tracing';
"""
,
Nov 5 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/c9da801622a97bfa43a367741e6bf666e39e2b1e commit c9da801622a97bfa43a367741e6bf666e39e2b1e Author: catapult-deps-roller <catapult-deps-roller@chromium.org> Date: Sat Nov 05 01:54:43 2016 Roll src/third_party/catapult/ 22b8550d6..3660f3bfa (2 commits). https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/22b8550d67b9..3660f3bfa20a $ git log 22b8550d6..3660f3bfa --date=short --no-merges --format='%ad %ae %s' 2016-11-04 erikchen Ignore tracing process for TBMv2 cpu time metric. 2016-11-04 lpy Keep selected field inside diagnostics table for consistency. BUG= 640312 Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md If the roll is causing failures, see: http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel TBR=catapult-sheriff@chromium.org Review-Url: https://codereview.chromium.org/2475373003 Cr-Commit-Position: refs/heads/master@{#430124} [modify] https://crrev.com/c9da801622a97bfa43a367741e6bf666e39e2b1e/DEPS
,
Nov 9 2016
|
||||||||||||
►
Sign in to add a comment |
||||||||||||
Comment 1 by nedngu...@google.com
, Aug 23 2016That is because the measurement originally created to measure cpu usage of animations only. To mark animation spans in the browser, the pages in the page_set would use: action_runner.CreateInteractionRecord('Scrolling') to mark the region where measurement is supposed to apply to.