New issue
Advanced search Search tips

Issue 888114 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Nov 26
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug

Blocking:
issue 884851



Sign in to add a comment

rendering benchmark produces both _per_frame and _per_second metrics

Project Member Reported by sadrul@chromium.org, Sep 21

Issue description

The rendering benchmark currently produces both _per_frame and _per_second metrics. If we are looking at the amount of work we are doing per frame, then the _per_frame metrics should be sufficient. So let's remove the _per_second metrics.

Removing the metrics should also help with reducing the memory usage of the OOPIFs that show the results in pinpoint [1], so hopefully we can run the rendering benchmarks on the full set of pages, and still see the results correctly without the oopif getting killed because of oom.

[1] https://github.com/catapult-project/catapult/issues/4571#issuecomment-413663566
 
My opinion: I think that ???_per_frame = ???_per_second * frame_times. So, I agree that it makes sense to keep 2 of the three metrics; having all three is redundant.

But, I think _per_frame should be removed, because among the tree metrics, the CPU per second metric has a much more clear meaning. The other two are hacky and have vague, implementation-based meanings (e.g.  crbug.com/888551  that you just filed). I think having one clean and one hacky metric is better than having two hacky metrics.
Once we fix the implementation to have a consistent meaning of 'frame', _per_frame metrics would be as informative as _per_second metrics, right?

|frame_times| and |*_per_frame| metrics used to be measured by different benchmarks (smoothness and thread_times), which I think is why the meaning of 'frame' diverged between the two over time. Since they both will be part of the same rendering benchmark from now on, there should be a single implementation, and so the meaning of 'frame' doesn't remain implementation-specific anymore. So I don't think the _per_frame metrics would be vague.

Both _per_frame and _per_second metrics essentially convey the same information. I personally look at _per_frame when doing comparison myself (and I believe that is the case for some other folks too). But I don't have strong opinions either way.

One additional note about _per_frame metrics: we currently measure this as 'total cpu time for the range-of-interest' / 'number of frames'. So this gives us an average cpu-time for each frame. Instead of doing this, I wanted to measure the actual cpu-time spent for each frame separately, and report that. That would allow us to track cpu-time for individual frames, and investigate individual frames that might be doing too much work. Something like this would be more difficult to do with _per_second metrics. So I guess I am actually leaning towards keeping the _per_frame metrics over the _per_second ones.
> But, I think _per_frame should be removed, because among the tree metrics,
> the CPU per second metric has a much more clear meaning. 

This is a case where something seems clearer in theory but is less useful in
practice.  It would be good to use these to debug some regressions or improvements
to get a personal sense for it.

The analogy to think of is when comparing car efficiency we look at
Miles-per-Gallon or Gallons-per-Mile.  It's not so useful to say a car uses
10 Gallons-per-Hour without looking at how many miles are traveled (how much
work is produced) for those gallons.
These are just my opinion and are not meant to block anything. You and Victor have more experience in using these metrics.

> Once we fix the implementation to have a consistent meaning of 'frame', _per_frame metrics would be as informative as _per_second metrics, right?

If we delete _per_frame, then we don't have the issues in  crbug.com/888551  at all :)

> ... So I don't think the _per_frame metrics would be vague.

the CPU per second metric has a more clear mining from a user's perspective: if it's high, the users see it's CPU fan turned on when they scroll a pate. Multiplying that by the average length of the display compositor frame has a less clear meaning from user's perspective. And, it does not add any more information.

> ... I wanted to measure the actual cpu-time spent for each frame separately, and report that.

I absolute agree that this would be a better metric than the current per_frame metric (that I'm proposing to remove). But, if you want this metric, I think we should have the CPU per second metric, too, because that information will be lost. E.g. assume that in a 10 second animation, you have a single CPU-heavy (5s) task. You will get many many 16ms frames with little CPU usage and a single bad frame with 5s CPU. The average will be dominated by the smooth frames and remain small. But, you have increased CPU usage by a factor of 2.
per_second isn't useful for optimizing.  In the past, we have made
performance improvements that reduce frame_times, and therefore increase
per_second metrics.  If we ever increase cpu usage without improving
performance, it will be reflected in per_frame metrics.

If you have a single CPU-heavy task for 5s on some thread, it would be
reflected in cpu time per frame metric for that thread because you'd get
~300 samples with ~16ms cpu time per frame for that thread even if the
animation runs at a smooth 60 fps.
I thought c#2 is talking about a new metric that we compute CPU for each frame separately and then look at some percentile of the distribution (the criteria for firing an alert) instead of the average; otherwise, if we look at the average, how is it different than the current _per_frame version?
We would look at average at first since that's the default, but once we have that data, it's trivial to transition to different %iles (using summaryOptions for the histograms).
The average is equivalent to the current _per_frame metric (assuming  crbug.com/888551  is fixed), isn't it? And the %iles, while interesting metric, will miss the 2x regression in the example in c#6.
The metrics also report max, min, std etc., so we can track all those too, trivially. The main thing to note here is that we would have per-frame costs for analysis, which I think we all agree would be useful.
Project Member

Comment 10 by bugdroid1@chromium.org, Sep 25

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/bbd91a9041b819ae6bbd6c505b717aeb1150a16d

commit bbd91a9041b819ae6bbd6c505b717aeb1150a16d
Author: Sadrul Habib Chowdhury <sadrul@chromium.org>
Date: Tue Sep 25 01:02:33 2018

rendering: Remove _per_second metrics.

Remove the various _per_second metrics in favour of the _per_frame
metrics. (see linked bug for more details)

BUG= 888114 

Change-Id: I95c8b547cfdf03414f9abaec7d32ceee907df687
Reviewed-on: https://chromium-review.googlesource.com/1235340
Reviewed-by: Victor Miura <vmiura@chromium.org>
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Commit-Queue: Sadrul Chowdhury <sadrul@chromium.org>
Cr-Commit-Position: refs/heads/master@{#593788}
[modify] https://crrev.com/bbd91a9041b819ae6bbd6c505b717aeb1150a16d/tools/perf/measurements/rendering_unittest.py
[modify] https://crrev.com/bbd91a9041b819ae6bbd6c505b717aeb1150a16d/tools/perf/measurements/thread_times_unittest.py
[modify] https://crrev.com/bbd91a9041b819ae6bbd6c505b717aeb1150a16d/tools/perf/metrics/timeline.py
[modify] https://crrev.com/bbd91a9041b819ae6bbd6c505b717aeb1150a16d/tools/perf/metrics/timeline_unittest.py

Project Member

Comment 11 by bugdroid1@chromium.org, Sep 28

The following revision refers to this bug:
  https://chromium.googlesource.com/catapult/+/a0faa9d4d70b8b2e68d73993a4841db2a93d71b5

commit a0faa9d4d70b8b2e68d73993a4841db2a93d71b5
Author: Ehsan Chiniforooshan <chiniforooshan@chromium.org>
Date: Fri Sep 28 21:41:21 2018

Telemetry: cpu_per_frame metrics in TBMv2

This is the last set of metrics that we need to migrate to TBMv2.
After landing this CL, I'll start deleting legacy codes.

Bug: chromium:760553
Bug:  chromium:888551 
Bug:  chromium:888114 

Change-Id: I5bf47cf0d6d693e0a1e793b18686911a1dcec078
Reviewed-on: https://chromium-review.googlesource.com/1244036
Commit-Queue: Ehsan Chiniforooshan <chiniforooshan@chromium.org>
Reviewed-by: Ben Hayden <benjhayden@chromium.org>

[modify] https://crrev.com/a0faa9d4d70b8b2e68d73993a4841db2a93d71b5/tracing/tracing/metrics/rendering/frame_time.html
[modify] https://crrev.com/a0faa9d4d70b8b2e68d73993a4841db2a93d71b5/tracing/tracing/metrics/rendering/rendering_metric.html
[modify] https://crrev.com/a0faa9d4d70b8b2e68d73993a4841db2a93d71b5/tracing/tracing/metrics/rendering/cpu_utilization_test.html
[modify] https://crrev.com/a0faa9d4d70b8b2e68d73993a4841db2a93d71b5/tracing/tracing/metrics/rendering/cpu_utilization.html

Project Member

Comment 12 by bugdroid1@chromium.org, Sep 29

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f0ac64f08cdcc2d964a60be45db795505784d931

commit f0ac64f08cdcc2d964a60be45db795505784d931
Author: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Date: Sat Sep 29 19:17:06 2018

Roll src/third_party/catapult d525ef309fca..98289bcecf60 (5 commits)

https://chromium.googlesource.com/catapult.git/+log/d525ef309fca..98289bcecf60


git log d525ef309fca..98289bcecf60 --date=short --no-merges --format='%ad %ae %s'
2018-09-28 simonhatch@chromium.org Dashboard - Add missing index for querying last alert.
2018-09-28 chiniforooshan@chromium.org Telemetry: cpu_per_frame metrics in TBMv2
2018-09-28 manojgupta@google.com benchmark_runner: Emit all stories in a benchmark.
2018-09-28 dtu@chromium.org [pinpoint] Don't set difference_count for try jobs.
2018-09-28 jbudorick@chromium.org devil: remove host arm build rules.


Created with:
  gclient setdep -r src/third_party/catapult@98289bcecf60

The AutoRoll server is located here: https://autoroll.skia.org/r/catapult-autoroll

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should
be CC'd on the roll, and stop the roller if necessary.

CQ_INCLUDE_TRYBOTS=luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel

BUG= chromium:890325 ,chromium:760553, chromium:888551 , chromium:888114 , chromium:890041 , chromium:887888 
TBR=sullivan@chromium.org

Change-Id: I25e06429f53579449a9ebd2d1ba948ff6409f4e7
Reviewed-on: https://chromium-review.googlesource.com/1252266
Reviewed-by: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#595322}
[modify] https://crrev.com/f0ac64f08cdcc2d964a60be45db795505784d931/DEPS

Cc: sadrul@chromium.org
Owner: chiniforooshan@chromium.org
This is now fixed?
Status: Fixed (was: Started)
We don't have _per_second metrics anymore.

Sign in to add a comment