
Issue 770596

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: Oct 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug




video_VDAPerf.h264: performance major regression on squawks

Project Member Reported by hywu@chromium.org, Oct 2 2017

Issue description

Comment 1 by hywu@chromium.org, Oct 2 2017

Summary: video_VDAPerf.h264: performance major regression on squawks (was: video_VDAPerf.h264: performance major regression on Kevin)
Labels: videoshortlist
Owner: hiroh@chromium.org
Status: Assigned (was: Untriaged)
Owner: acourbot@chromium.org
Taking ownership for investigation.
Status: Started (was: Assigned)
Labels: -Pri-3 Pri-1
Raising priority as this is quite a large regression.
Cc: hiroh@chromium.org
Mmm, it seems like I cannot reproduce this on Clapper. I tried 62.9901.32.0, 63.9964.0.0 and a local ToT. For all three, crowd2160_h264.cpu_usage.kernel was consistent, around 0.27-0.28. hiroh@, do we have hard evidence that this is reproducible on Clapper? Otherwise I may need a Squawks device.
I could reproduce the result of crowd2160_h264.cpu_usage.kernel == 0.82 on chromeos6-row2-rack12-host22.cros with R63-9997.0.0 (which was already flashed when I locked it). Flashing in the lab is quite slow though, so bisecting may take some time.

Comment 8 by hiroh@chromium.org, Oct 4 2017

Although I have never tried it myself, it should be reproducible on clapper according to the following graph.
https://chromeperf.appspot.com/report?sid=6bc0bccc3a79a48d50d5dc2e9000a05827685442c74d33bd32b9013b9a425ce9&rev=32220000996400000
Mmm, you're right. Let me double-check then; it is strange that I could repro this in the lab but not on the local clapper.

Btw, the graph timeline is strange: 2017-10-03 comes before 2017-09-23 for some reason. Can someone explain that?
> Btw, the graph timeline is strange: 2017-10-03 comes before 2017-09-23 for some reason. Can someone explain that?

I think this is because the measured values are sorted by the combined Chrome/ChromeOS version rather than by date.
The ChromeOS R62 branch stays at 9901 and increments the second number, e.g. 9901.36.0 -> 9901.37.0.
Therefore all measured values for 62.9901.*.0 are grouped before those for 63.*.0.0, even when the 62-branch measurement is from a later date.
Otherwise (i.e. if sorted purely by date), the R62 and R63 ChromeOS points would be scrambled together on the graph.
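To illustrate the resulting ordering, here is a minimal sketch; the data points and the version_key helper below are purely hypothetical, not taken from chromeperf:

# Hypothetical (version, measurement date) points. The R62 branch keeps
# producing 62.9901.*.0 builds after the first R63 builds already exist.
points = [
    ("62.9901.36.0", "2017-09-28"),
    ("62.9901.37.0", "2017-10-03"),
    ("63.9950.0.0", "2017-09-23"),
    ("63.9964.0.0", "2017-09-29"),
]

def version_key(version):
    # Sort numerically by each component of the combined Chrome/ChromeOS version.
    return tuple(int(part) for part in version.split("."))

# Sorting by version groups all 62.9901.*.0 points before the 63.*.0.0 ones,
# so a point measured on 2017-10-03 appears before one measured on 2017-09-23.
for version, date in sorted(points, key=lambda p: version_key(p[0])):
    print(version, date)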
Very weird. I could test on a local Squawks with both R63-9964.0.0 and R63-10000.0.0; crowd2160_h264.cpu_usage.kernel is around 0.28 in both cases, same as on the local Clapper.

The difference seems to come from whether we are running the test locally or on a lab machine?
Turns out the lab and local squawks on which I tested had different memory configurations:

Remote (chromeos6-row2-rack12-host22):
MemTotal:        1963636 kB

Local:
MemTotal:        3958800 kB

All the lab machines within pool:suite have the sku:squawks_intel_dual_2Gb label. It would be interesting to borrow (lock) one of the ones with the sku:squawks_intel_dual_4Gb label and see if we can repro on them, but they are not within the suite pool.
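As a reference, a minimal sketch of the memory check; the /proc/meminfo line format is the one quoted above, while the helper itself is just illustrative:

def mem_total_kb(meminfo_path="/proc/meminfo"):
    # Lines look like "MemTotal:        1963636 kB"; return the value in kB.
    with open(meminfo_path) as f:
        for line in f:
            if line.startswith("MemTotal:"):
                return int(line.split()[1])
    raise ValueError("MemTotal not found in " + meminfo_path)

# A ~2GB lab squawks reports about 1963636 kB, a local ~4GB one about 3958800 kB.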
When running R63-9964.0.0 on the 4GB squawks we observed that available memory fell below 1.3GB, which suggests that 2GB devices would run out of memory, and that could explain the regression. However, after trying R62-9901.32.0 on the same device, I could observe the same behavior. So excessive memory consumption does not seem to be the cause of the regression here.

However, after double-checking the metrics, I noticed that the regression *could* be observed on the 4GB device as well, contrary to what I initially stated:

R62-9901.32.0:
cpu_usage.kernel: 0.137
cpu_usage.user: 0.058

R63-9964.0.0:
cpu_usage.kernel: 0.29
cpu_usage.user: 0.09

The numbers are lower than on the 2GB devices (probably because of lower memory pressure), but the difference is quite noticeable. So it looks like we have a basis for bisecting locally!

Another observation, this time when running the R62-9901.32.0 VDAPerf binary on an R63-9964.0.0 sysroot:
cpu_usage.kernel: 0.136
cpu_usage.user: 0.057

So it looks like the regression is introduced by user space. Testing VDAPerf is fast, so I am now trying to narrow the range as much as possible.
The result of the bisect is that the regression was introduced by 9964 itself; 9963 still shows good results:

cpu_usage.kernel: 0.136
cpu_usage.user: 0.06

Which means the culprit is hiding within this set of changes, kernel changes aside:

https://crosland.corp.google.com/log/9963.0.0..9964.0.0
Also, the Chromium version switched from 63.0.3218.0 to 63.0.3222.0, so that may be the cause too.
Cc: owenlin@chromium.org
Bisected culprit to https://chromium-review.googlesource.com/c/chromium/src/+/654459.

It looks like this change only affects the tests, so it does not introduce a real-life regression. Still, it would be interesting to understand why offscreen surfaces imply more kernel CPU usage.
Status: WontFix (was: Started)
So, it turns out the cpu_usage metrics are computed from the output of the "time" command run on video_decode_accelerator_unittest. Since we do not render anymore, the total runtime decreases while the kernel time and user time remain the same: this is what actually makes these metrics increase.
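A minimal sketch of that effect, assuming the metric is simply CPU time divided by wall-clock runtime as described above; the absolute times below are made up, only the resulting ratios mirror the measured values:

def cpu_usage(cpu_seconds, wall_seconds):
    # The metric is the CPU time reported by "time" divided by the total runtime.
    return cpu_seconds / wall_seconds

kernel_seconds = 2.7           # hypothetical, unchanged by the rendering change

wall_with_rendering = 20.0     # hypothetical wall-clock time when frames are rendered
wall_without_rendering = 9.3   # shorter, because rendering was removed by the CL

print(cpu_usage(kernel_seconds, wall_with_rendering))     # ~0.135, the "good" numbers
print(cpu_usage(kernel_seconds, wall_without_rendering))  # ~0.29, the "regressed" numbers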

In other words, there is no regression; we just need to come up with a better metric to measure performance. Closing this as not a bug.
Opened crbug.com/771919 to track the replacement of these metrics.
