
Issue 666725

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: May 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 2
Type: Feature

Blocked on:
issue 677787

Blocking:
issue 666724




Delete WebRTC performance tests that frequently cause false alarms

Project Member Reported by phoglund@chromium.org, Nov 18 2016

Issue description

Do the following:
- for each bug filed in the past 6 months, note down which tests/metrics were alerting
- if the outcome tends to be that the alert led to a bug fix, keep it
- if the outcome tends to be a false alarm, delete the test (or find some way to fix it, if you feel adventurous).

We should think of false-positive metrics as flaky tests, which means they provide negative value. They're not acceptable, and we should delete them (but if it opens up a large hole in test coverage, launch some effort to cover that hole).
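The triage procedure above could be sketched roughly as follows. This is a hypothetical illustration, not a real tool: the metric names, outcome labels, and the 50% keep threshold are all assumptions.

```python
# Hypothetical sketch of the triage described above: tally each alerting
# metric's outcomes over the past 6 months and keep only metrics whose
# alerts usually led to a real fix. Names and thresholds are assumptions.
from collections import Counter

def triage(alerts, keep_threshold=0.5):
    """alerts: list of (metric_name, outcome) pairs, where outcome is
    'fixed' (the alert led to a bug fix) or 'false_alarm' (wontfix'd)."""
    outcomes = {}
    for metric, outcome in alerts:
        outcomes.setdefault(metric, Counter())[outcome] += 1
    decisions = {}
    for metric, counts in outcomes.items():
        fix_rate = counts["fixed"] / sum(counts.values())
        decisions[metric] = "keep" if fix_rate >= keep_threshold else "delete"
    return decisions

alerts = [
    ("ramp_up_time", "false_alarm"),
    ("ramp_up_time", "false_alarm"),
    ("ramp_up_time", "fixed"),
    ("screenshare_psnr", "fixed"),
    ("screenshare_psnr", "fixed"),
]
print(triage(alerts))
# -> {'ramp_up_time': 'delete', 'screenshare_psnr': 'keep'}
```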


 
Project Member

Comment 1 by bugdroid1@chromium.org, Nov 30 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/external/webrtc.git/+/aff96361d3936e34a270f93ca11d0cf868239517

commit aff96361d3936e34a270f93ca11d0cf868239517
Author: phoglund <phoglund@webrtc.org>
Date: Wed Nov 30 13:04:39 2016

Greatly reduce number of level controller tests.

These tests have been causing a large number of false alerts, which is
tons of work for the perf sheriff. It's infeasible to test every single
permutation, so we must choose carefully. This CL reduces the number
of RESULT lines from the test from ~450 to ~50. I attempted to choose
interesting permutations, but you probably know better what's
interesting...

https://chromeperf.appspot.com/report?sid=a7193c96f708018848ca07ad9c78ac657cadab3c70b3939c42bd7d70a092d61a
also suggests most of the metrics have enormous standard deviations,
so maybe you could look into how stable the metrics really are
and remove/stabilize the ones that aren't?

BUG= chromium:666725 

Review-Url: https://codereview.webrtc.org/2529393006
Cr-Commit-Position: refs/heads/master@{#15329}

[modify] https://crrev.com/aff96361d3936e34a270f93ca11d0cf868239517/webrtc/modules/audio_processing/level_controller/level_controller_complexity_unittest.cc
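The stability check suggested in the commit message above could look something like this. A minimal sketch under assumptions: the sample data and the 20% noise threshold are made up, and a real analysis would pull samples from the chromeperf dashboard instead.

```python
# Hypothetical sketch: flag metrics whose standard deviation is large
# relative to their mean (coefficient of variation), since those will
# trip the regression detector on pure noise. Data and threshold are
# assumptions for illustration only.
import statistics

def noisy_metrics(samples_by_metric, max_cv=0.20):
    """samples_by_metric: {metric_name: [recent perf results]}.
    Returns the metrics whose relative noise exceeds max_cv."""
    noisy = []
    for metric, samples in samples_by_metric.items():
        cv = statistics.stdev(samples) / statistics.mean(samples)
        if cv > max_cv:
            noisy.append(metric)
    return noisy

samples = {
    "level_controller_complexity": [10.1, 9.9, 10.0, 10.2],  # stable
    "apm_total_time": [10.0, 15.0, 5.0, 20.0],               # very noisy
}
print(noisy_metrics(samples))
# -> ['apm_total_time']
```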

Cc: holmer@chromium.org mflodman@chromium.org
I did an analysis of our performance tests: go/webrtc-perf-test-evaluation

The top bad actors, ranked by wontfix'd bugs, are:
13 webrtc_perf browsertest (stats browsertest)
10 ramp-up-* tests
9 psnr/encode_time/sender_time (webrtc_perf_tests full_stack tests)
9 video_quality browser test
4 screenshare_slides + a bunch of ignores

The screenshare tests cause many alerts, but they also find lots of legit issues. That's harder to say for the others above.
Magnus already talked about trimming down video perf tests; the above suggests you should cut down on full stack tests, ramp-up tests and maybe screenshare slides tests (though those seem to do a lot of good besides the alerts). Please go ahead and do that :) 

I will trim down on the perf browsertest and video quality tests. There are certain metrics that cause the majority of the regressions, like byte counters, so I'll get rid of or try to fix those.
Project Member

Comment 4 by bugdroid1@chromium.org, Dec 14 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/054f7b1e4a7b3bbf8b1effb31ae4620918fe8787

commit 054f7b1e4a7b3bbf8b1effb31ae4620918fe8787
Author: phoglund <phoglund@chromium.org>
Date: Wed Dec 14 09:48:22 2016

Delete flaky metrics from the browser perf tests.

These metrics suffer from the fact that they count the number of bytes
since the beginning of the call, and average them during the 10-second
measuring window after 60 seconds of warmup. This means that if there's
any variance in the warmup period, or the call gets established a little
earlier or later, it will have a big impact on the numbers.

I think we can just delete them. An alternative approach is to
do like in https://codereview.chromium.org/2545553003 and subtract the
count at the beginning of the measuring window, which means we
effectively measure the bitrate during the measuring window. That's
covered by the test added in that patch though, only it's going through
the new stats API instead of the old one.So, let's just delete them
from the old API for now.
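The problem described in the commit above can be illustrated with a toy example. This is a hypothetical sketch with made-up numbers: a cumulative counter averaged over the window drags warmup variance into the metric, while subtracting the count at the window start (the approach in the linked CL) measures only the window itself.

```python
# Hypothetical illustration: two runs with identical steady-state bitrate,
# but run B establishes the call 5 seconds later. The old cumulative
# average shifts between runs; the windowed bitrate does not.

def cumulative_average(byte_counts, window_start):
    # Old metric: average the since-call-start counters over the window.
    window = byte_counts[window_start:]
    return sum(window) / len(window)

def windowed_bitrate(byte_counts, window_start):
    # Alternative: bytes transferred during the measuring window only.
    return byte_counts[-1] - byte_counts[window_start]

# 71 one-second samples of a cumulative byte counter: 60 s warmup plus
# a 10 s measuring window, at a steady 1000 bytes/s once established.
run_a = [1000 * t for t in range(71)]            # call starts immediately
run_b = [0] * 5 + [1000 * t for t in range(66)]  # call starts 5 s late

# The cumulative average differs between the runs (65000.0 vs 60000.0)...
print(cumulative_average(run_a, 60), cumulative_average(run_b, 60))
# ...while the windowed bitrate is identical (10000 bytes in both).
print(windowed_bitrate(run_a, 60), windowed_bitrate(run_b, 60))
```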

BUG= 666725 
CC=ehmaldonado@chromium.org
R=kjellander@chromium.org

Review-Url: https://codereview.chromium.org/2568133006
Cr-Commit-Position: refs/heads/master@{#438474}

[modify] https://crrev.com/054f7b1e4a7b3bbf8b1effb31ae4620918fe8787/chrome/browser/media/webrtc/webrtc_browsertest_perf.cc

Cc: mcasas@chromium.org
Miguel, you think we can delete sample_duration + interarrival time from our content_browsertests? I can't find a single bug where they led to a fix, and they're moderately noisy (have caused 4 false alerts). Are those things covered by webrtc.smoothness nowadays?
Owner: ehmaldonado@chromium.org
I think deleting the content_browsertests metrics and attacking the worst offenders in webrtc_perf_tests is what we have left to do here.

Miguel will take over from here on out since I'm heading out on parental leave.

Comment 7 by mcasas@chromium.org, Dec 14 2016

Cc: qiangchen@chromium.org niklase@chromium.org
#5: probably we can delete them.  I'm not super sure of webrtc.smoothness coverage, either niklase@ or qiangchen@ should know more.
What do the sample_duration + interarrival time tests actually measure? The range 0--300 microseconds doesn't make sense to me.
Blockedon: 677787
Edward: can you reply to #8 and then decide if we should drop them?

After doing so, I think this bug should be closed.
I don't know what those are. Maybe Miguel knows?
Cc: -mflodman@chromium.org chfremer@chromium.org
They are the duration and time between two VideoCaptureDeviceClient::OnIncomingCapturedData events.

I don't know what that event is. Maybe when a frame from the camera becomes available?

chfremer: Do you know what those events mean?

https://cs.chromium.org/chromium/src/media/capture/video/video_capture_device_client.cc?type=cs&q=VideoCaptureDeviceClient::OnIncomingCapturedData
Also, they're not covered by webrtc.smoothness. And I don't think they belong there.
Cc: mflodman@chromium.org
#13: Yes, the method VideoCaptureDeviceClient::OnIncomingCapturedData() is called by the platform-specific (or fake) video capture device implementations when they deliver a frame.

The method itself does many things. Based on the comment at [1], I assume the tracing was intended to measure the duration of rotating frames as well as the libyuv-based conversion to I420. I am not sure how useful this measurement is, but it may be good to get an alert if these conversions suddenly become more expensive.

BTW, how can I find the corresponding graphs on chromeperf.appspot.com?

[1] https://cs.chromium.org/chromium/src/content/browser/webrtc/webrtc_getusermedia_browsertest.cc?l=648
Status: Fixed (was: Assigned)
Bumping this old discussion.
Re #16: Example graphs are at https://chromeperf.appspot.com/report?sid=8f5b5a97be2f761a2b87ca1a8218c84e806979e8860e431a85908f6e5fa53248

Looking back over more than 2 years of data, we haven't had a single regression on these metrics, but that can also mean those code paths haven't changed. Since they don't cause any flaky alerts, I'm OK with keeping them. It would be good if they could be moved into a telemetry test, though it's not that important.

Since this bug is about deleting tests frequently causing false alarms, I think we should close it now.
