New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 628630 link

Starred by 3 users

Issue metadata

Status: Verified
Owner:
Last visit > 30 days ago
Closed: Nov 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Feature

Blocked on:
issue 653659



Sign in to add a comment

Make WebRTC CrOS site tests report perf rather than hard-checking limits

Project Member Reported by phoglund@chromium.org, Jul 15 2016

Issue description

For

video_WebRtcCamera, 
video_WebRtcPeerConnectionWithCamera

and maybe

video_WebRtcMediaRecorder, 

Start reporting perf instead of hard-asserting limits. Hook up to WebRTC perf sheriff rotation and ensure results are monitored on n devices to be determined by Rohit.
 
Labels: -Type-Bug Type-Feature
Cc: blum@chromium.org
Project Member

Comment 3 by bugdroid1@chromium.org, Oct 1 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/377beb3a9c3d8acc61d4af7eedbb16f4fe591f77

commit 377beb3a9c3d8acc61d4af7eedbb16f4fe591f77
Author: Patrik Höglund <phoglund@chromium.org>
Date: Wed Sep 28 15:54:33 2016

Rewrite video_WebRtcCamera to report perf instead of hard-coded limits.

It hasn't worked very well to hard-assert that there are less than
10 black frames per run. All CrOS devices are different, and it's
much easier to deal with the differences (and act on any regressions)
if we report to perf instead.

Also clean up the error flow in the method that prints the perf
values, and rename some variables.

BUG= chromium:628630 
TEST=Ran test on Link platform.

Change-Id: I6434a0d6d598d2c0cd88a034c396b0e9e3484456
Reviewed-on: https://chromium-review.googlesource.com/390812
Commit-Ready: Patrik Höglund <phoglund@chromium.org>
Tested-by: Edward Lesmes <ehmaldonado@chromium.org>
Reviewed-by: Edward Lesmes <ehmaldonado@chromium.org>
Reviewed-by: Dan Shi <dshi@google.com>

[modify] https://crrev.com/377beb3a9c3d8acc61d4af7eedbb16f4fe591f77/client/site_tests/video_WebRtcCamera/control
[modify] https://crrev.com/377beb3a9c3d8acc61d4af7eedbb16f4fe591f77/client/site_tests/video_WebRtcCamera/video_WebRtcCamera.py

Rohit, the first test is in. The video_WebRtcCamera test hasn't showed up in the Chromium perf dashboard yet though. Do you think I need to do anything else to get it to show up, than what I did in the patch?
Ok, I can see my code appears to be executing, but it isn't reporting perf yet. I wonder, maybe it needs to be in the magical crosbolt_perf_perbuild suite that video_webrtcPerf is in?

It's currently in suite:camera_perbuild, suite:bvt-perbuild. Since this is now a performance test, it doesn't really do much unless it can report its perf results, so I think I need to take it out of those suites... Well, it will still fail if we completely fail to acquire the camera.

Rohit, do you think we will lose coverage by jumping off the camera_perbuild suite, e.g. if the camera suite runs on more devices than the perf suite?
Patrik, sorry for the delayed reply. 

https://chromium-review.googlesource.com/#/c/394006/ should start uploading perf results to the dashboard.

Right now the test is in suite:bvt-perbuild, suite:av_webcam.

Vin created a new device setup with ample ambiance lightning (suite:av_webcam) so we are planing to run WebRTC camera related tests in this test setup.




> Right now the test is in suite:bvt-perbuild, suite:av_webcam.

Well, not anymore, since I moved them to the performance test suite :) I did this because it seemed required to make results actually go up into the perf dashboard. Maybe I'm wrong here, and that it just takes 5+ days to get the tests to actually report results? The optimal would be to run on the suite with good lighting _and_ report perf into the dashboard.
Blockedon: 653659
I agree. I think we have everything ready to run this test in the new environment. 

Only blocked on  Issue 653659 .
Ok. I could not find a control file for the av_webcam suite so I assumed it's not up yet. I think the end state we want to be in here is to be in the av_webcam suite only. I'll send a patch to put both tests in av_webcam, bvt-perbuild for now and remove bvt-perbuild later (since perf numbers will be more useful for the well-lit av_webcam).
The first test results for video_webrtCamera are in! They look pretty unstable so far though. The total number of frames varies wildly among devices, and sometimes they're all frozen/black only to recover the next run. Also, since there's only a couple runs per day the blame lists are going to be enormous.

I think the answer is to narrow down the test to devices we know to be reliable. Which devices are going to be in the av_webcam suite? How often can we make those devices run tests? The more, the better.

https://chromeperf.appspot.com/report?sid=d0fa9e3aaaf3d75dc0b65d52e85068df63f155c160bbc9ed4909de9ad760b7e3
Ok, and the reason the peerconnection test isn't reporting yet is because I put it in the wrong suite, all right... https://chromium-review.googlesource.com/#/c/394630/ will fix that. Please take a look at that one, Rohit.
Once that's done, the next step is to hook this up to the WebRTC rotation, announce the new tests and document them so WebRTC sheriffs can actually look at results.
I chumped your change so we should be all set.
I agree, later we will be removing this test from bvt-perbuild to get the perf data from more controlled environment, and we can take action if required.

Vin is adding more devices into his setup.
Project Member

Comment 15 by bugdroid1@chromium.org, Oct 14 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/a936f1297d6f75e1cad3ec25ea02517145dc2939

commit a936f1297d6f75e1cad3ec25ea02517145dc2939
Author: Chung-yih Wang <cywang@google.com>
Date: Fri Oct 07 15:27:02 2016

perf_dashboard_config.json: Add video_WebRtcPeerConnectionCamera

BUG= chromium:628630 
TEST=None

Change-Id: I7db39894213f41f884b74cea4b7bbb44f0830daa
Signed-off-by: Chung-yih Wang <cywang@google.com>
Reviewed-on: https://chromium-review.googlesource.com/394410
Reviewed-by: Patrik Höglund <phoglund@chromium.org>
Reviewed-by: Dan Shi <dshi@google.com>

[modify] https://crrev.com/a936f1297d6f75e1cad3ec25ea02517145dc2939/tko/perf_upload/perf_dashboard_config.json

Ok, I've looked at the results now and I think they look reasonably correct! Let's get off bvt-perbuild and onto av_webcam only then. We can't ask the WebRTC sheriffs to monitor the 30+ devices in bvt-perbuild, so I'm going to add the ones in av_webcam (10 devices): https://wmatrix.googleplex.com/unfiltered?releases=tot&suites=av_webcam and we'll see how it turns out.
And here, by the way, is apparent why the old tests didn't work very well: sometimes all frames are just black: https://chromeperf.appspot.com/report?sid=1491f26986746bfb52eb8b6aa622ab13874f3a6446edc3a3611bb427feace5a6

The code starts looking at frames when the video goes into onplay(), so it looks like we're just getting black frames off the camera. On the other hand, we haven't seen that from the getUserMedia test, which looks pretty solid: https://chromeperf.appspot.com/report?sid=f1f86f261f71ce00ae66a351affe243c25356600d17048c22d47966d586f9a66

I can at least see the peerconnection camera test has an error: it looks at video frames from the local preview, but that part is already covered by the getUserMedia test, so it should look at the remote view instead.
I see a few other things we could do better in the peerconnection test in particular. It uses setInterval, which isn't reliable at all. I'll file a bug to replace the whole frame capture mechanism with mediarecorder, and we should get more reliable results.
Ok, I've added the following to the WebRTC perf sheriff rotation now; will announce that to perf sheriffs and update documentation. Then I think we're done here (minus the spin-off bug to improve frame recording).

ChromeOSVideo/cros-auron_paine/video_WebRtcCamera/*
ChromeOSVideo/cros-candy/video_WebRtcCamera/*
ChromeOSVideo/cros-chell/video_WebRtcCamera/*
ChromeOSVideo/cros-falco/video_WebRtcCamera/*
ChromeOSVideo/cros-lars/video_WebRtcCamera/*
ChromeOSVideo/cros-link/video_WebRtcCamera/*
ChromeOSVideo/cros-nyan_blaze/video_WebRtcCamera/*
ChromeOSVideo/cros-peach_pit/video_WebRtcCamera/*
ChromeOSVideo/cros-peppy/video_WebRtcCamera/*
ChromeOSVideo/cros-veyron_minnie/video_WebRtcCamera/*
ChromeOSVideo/cros-auron_paine/video_WebRtcPeerConnectionWithCamera/*
ChromeOSVideo/cros-candy/video_WebRtcPeerConnectionWithCamera/*
ChromeOSVideo/cros-chell/video_WebRtcPeerConnectionWithCamera/*
ChromeOSVideo/cros-falco/video_WebRtcPeerConnectionWithCamera/*
ChromeOSVideo/cros-lars/video_WebRtcPeerConnectionWithCamera/*
ChromeOSVideo/cros-link/video_WebRtcPeerConnectionWithCamera/*
ChromeOSVideo/cros-nyan_blaze/video_WebRtcPeerConnectionWithCamera/*
ChromeOSVideo/cros-peach_pit/video_WebRtcPeerConnectionWithCamera/*
ChromeOSVideo/cros-peppy/video_WebRtcPeerConnectionWithCamera/*
ChromeOSVideo/cros-veyron_minnie/video_WebRtcPeerConnectionWithCamera/*
Thanks Patrick! Couple notes:

1. I just repaired the Wizpig and that *should* start showing up in test results today.
2. Will be installing a Gru later today in the lab. 

Also, one question:
We were thinking of including some other WebRTC tests in the av_webcam suite, for completeness - video_ChromeRTCHWDecodeUsed, video_ChromeRTCHWEncodeUsed & video_WebRtcMediaRecorder. Any concerns or objections?
Re #20: Ok! You can add whatever you want to av_webcam as far I am concerned. Do you mean we should also add those tests to the WebRTC rotation?

Also: how often does av_webcam run? The more often we can make it run, the better. Perf regressions are often hard to narrow down, and since we don't have auto-bisect it's good if we can get the blame lists down to a minimum.
Tests should now alert; perf sheriffs are notified, and documentation is updated: https://sites.google.com/a/google.com/rtc-platform/test-engineering/performance-tests#TOC-CrOS-Site-Tests-video_WebRtcCamera-video_WebRtcPeerConnectionWithCamera.

There is just one thing left to do, which is to get off bvt-perbuild. I had to hand-write patterns as you can see in #19. If we only run on the more limited device set in av_webcam, I can make the patterns simpler (also it won't require maintenance when the device set in av_webcam changes).
Spun out https://bugs.chromium.org/p/chromium/issues/detail?id=661472 to improve peerconnection test.
video_ChromeRTCHWDecodeUsed, video_ChromeRTCHWEncodeUsed are running in CrOS CQ and use fake video stream. These tests are mainly designed to test CrOS WebRTC HW encode/decode code path so they don't need to run in av_webcam, and not required to  be added in the rotation.
Project Member

Comment 26 by bugdroid1@chromium.org, Nov 8 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/a4cb1a38f3a1133999d95ee1a54c99f4fc78e5bf

commit a4cb1a38f3a1133999d95ee1a54c99f4fc78e5bf
Author: Patrik Höglund <phoglund@chromium.org>
Date: Wed Nov 02 09:04:33 2016

Get WebRTC tests off bvt-perbuild and its huge device pool.

These tests now only really report reliable results if they run
on devices with adequate lighting. We have that in av_webcam but
not in the device pool of bvt-perbuild. Also, we can make the
WebRTC sheriff rotation monitoring rules simpler and reduce
the maintenance (since there's no way we can monitor all 30+
devices in bvt-perbuild; the 10 in av_webcam is more feasible).

This loses a minimal amount of correctness test coverage for
the devices that are in bvt_perbuild but not av_webcam; these
tests would have caught cases where WebRTC crashes or fails
before we get a call up. I think this will be addressed by
628633 though, where I'll implement some test that doesn't
look to closely at the actual camera frames. That test can
run on bvt-perbuild.

BUG= chromium:628630 
TEST=Can't test; control file change

Change-Id: I30cfca6894367601321889dae6049a9fd465da0e
Reviewed-on: https://chromium-review.googlesource.com/406827
Commit-Ready: Patrik Höglund <phoglund@chromium.org>
Tested-by: Rohit Makasana <rohitbm@chromium.org>
Reviewed-by: Rohit Makasana <rohitbm@chromium.org>
Reviewed-by: Vinayak Suley <vsuley@chromium.org>

[modify] https://crrev.com/a4cb1a38f3a1133999d95ee1a54c99f4fc78e5bf/client/site_tests/video_WebRtcPeerConnectionWithCamera/control
[modify] https://crrev.com/a4cb1a38f3a1133999d95ee1a54c99f4fc78e5bf/client/site_tests/video_WebRtcCamera/control

I tried to verify the tests are under monitoring, but I may be hitting bugs in the perf dashboard. Filed https://github.com/catapult-project/catapult/issues/2993....
Pinged issue, let's see if they say anything.
Hmm. I can see that alerts are being generated [1], but they are getting picked up by the "CrOS Video sheriff". This means they won't reach the WebRTC sheriff.

Rohit, is it alright to have the CrOS video sheriff triage failures in these tests? I think in general they should be filed on Blink > WebRTC > Video and triaged according to the regular video team triage process.

Documentation is at https://sites.google.com/a/google.com/rtc-platform/test-engineering/performance-tests#TOC-CrOS-Site-Tests-video_WebRtcCamera-video_WebRtcPeerConnectionWithCamera.

If you agree all I have to do is to remove the tests from the WebRTC sheriff config. Is there any CrOS video sheriff documentation I can update with descriptions of the new tests?

[1]: https://chromeperf.appspot.com/report?sid=c461e48c0e78ac2f3dedc1badb2608171be6f85c568dc96e9f1d507af39c76c4


Sorry, I am not aware of CrOS Video sheriffing. Could you clarify on picked up by the "CrOS Video sheriff"?

Pawel, is video-eng team monitoring video_* performance tests?
I mean this sheriff rotation: https://chromeperf.appspot.com/alerts?sortby=end_revision&sortdirection=down&sheriff=Chrome%20OS%20Video%20Perf%20Sheriff

I got the name somewhat wrong, its proper name is "ChromeOS Video Perf Sheriff". I can't find any mention of it that isn't years old :) Is the rotation abandoned?
Cc: wuchengli@chromium.org
I am not sure on that rotation. 

Pawel, Wu-Cheng to confirm if this video_* performance test sheriffing is still on as per #31 ?
Humm. It's still at 500 alerts (and 500 is the max, so that's probably just the tip of the iceberg). If you don't monitor your alerts you may as well delete the video_ tests for all the good they're going to do you. I understand why it generates so may alerts as well; if you click "list tests" there is an insane number of tests there, probably because of the many CrOS platforms.

With that said, performance test monitoring is hard. We have a cleanup effort underway for the tests in the WebRTC rotation: https://bugs.chromium.org/p/chromium/issues/detail?id=666724. Maybe you need a similar effort?

So short term, I tried to take the webrtc tests out of the CrOS rotation so they report to the WebRTC rotation, but looks like the perf dashboard doesn't accept negative patterns. I can rebuild the CrOS video rotation using positive patterns, but then maybe it's better to create a new perf root name (ChromeOSWebRtcPerf or something). The best solution is of course if you resurrect the rotation and start triaging our alerts :) WDYT?
I think creating a new WebRTC specific rotation makes more sense here.

I thought the dashboard only raises an alert whenever there is a regression with consistent reporting, not just one time spike. 
Cc: harpreet@chromium.org
Re #34: after offline discussion we decided to move the tests to the WebRTC rotation. Please review https://chromium-review.googlesource.com/#/c/414004/.

Re #34: the dashboard files alerts only if the increase is persistent, or if the spike is longer than five builds or so.
Project Member

Comment 37 by bugdroid1@chromium.org, Nov 23 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/f4cddc4c34f9da94208b5dc20a82b69e8a9d9498

commit f4cddc4c34f9da94208b5dc20a82b69e8a9d9498
Author: Patrik Höglund <phoglund@chromium.org>
Date: Wed Nov 23 09:46:50 2016

Moving video_WebRtc* tests to the WebRTC perf rotation.

This changes the perf path for the video_WebRtc* tests so they're
no longer picked up by the abandoned CrOS video perf sheriff
rotation. We can then add them to the WebRTC perf rotation.

I also pick up video_WebRtcPerf here, as a trial. We'll see how
stable and useful it is for alerting.

BUG= chromium:628630 
TEST=can't test, config change.

Change-Id: I25e7078308019f80cd16ab1564e8f76cb97ea356
Reviewed-on: https://chromium-review.googlesource.com/414004
Reviewed-by: Rohit Makasana <rohitbm@chromium.org>
Commit-Queue: Rohit Makasana <rohitbm@chromium.org>
Tested-by: Rohit Makasana <rohitbm@chromium.org>

[modify] https://crrev.com/f4cddc4c34f9da94208b5dc20a82b69e8a9d9498/tko/perf_upload/perf_dashboard_config.json

Owner: pprabhu@chromium.org
pprabhu@, can you push the new dashboard config to prod?
Owner: shuqianz@chromium.org
shuqianz@ is the deputy this week.
will do a push today
Ok. I've reconfigured the WebRTC rotation meanwhile to look at the following:

ChromeOSWebRtcVideo/*/video_WebRtcCamera/*
ChromeOSWebRtcVideo/*/video_WebRtcPeerConnectionWithCamera/*
ChromeOSWebRtcVideo/cros-auron-paine/video_WebRtcPerf/*
ChromeOSWebRtcVideo/cros-buddy/video_WebRtcPerf/*
ChromeOSWebRtcVideo/cros-panther/video_WebRtcPerf/*

I've already restricted the camera and peerconnection tests to reasonable set of devices earlier, so I'm more careful with webrtcperf until we know it's stable.
I see ChromeOSWebRtcVideo has appeared as a new master, under the old one. I'll try to migrate the data we have so we only see the new one...
Owner: phoglund@chromium.org
WebRtcCamera and PeerConnectionWithCamera are migrated. I will migrate at least the ones we have under monitoring for webrtcperf; a bug in the dashboard makes it really tedious to migrate all of them.
Ok, I've created migration jobs for all tests we have under monitoring. I'll wait for those to go through, then I think we're done here.
Status: Fixed (was: Assigned)
Confirmed the tests are now reporting and we're receiving alerts. I'm following up on some cosmetic problems with the dashboard here: https://github.com/catapult-project/catapult/issues/3038, but those are not a big deal. I'm calling this fixed now. Thanks all who helped!
Status: Verified (was: Fixed)
Thanks, Patrick! 

Sign in to add a comment