New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 620565 link

Starred by 4 users

Issue metadata

Status: Verified
Owner:
Last visit > 30 days ago
Closed: Jul 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug

Blocked on:
issue chrome-os-partner:50290



Sign in to add a comment

RtcVideoEncoder should fallback if encoder doesn't provide timestamp

Project Member Reported by wuchengli@chromium.org, Jun 16 2016

Issue description

Version: CrOS nyan-big 8451.0.0

What steps will reproduce the problem?
(1) https://apprtc.appspot.com/?debug=loopback&vsc=vp8

What do you see instead?
Video hanged.


As of 8451.0.0, nyan HW encoder doesn't support timestamp yet. RTCVideoEncoder::Impl::BitstreamBufferReady should use the current time if |timestamp| is 0. Some of the code in the original patch of #1 should be kept. Something like this:
if (timestamp != 0) {
  rtp_timestamp = static_cast<uint32_t>(timestamp.InMilliseconds()) * 90;
} else {
  rtp_timestamp = rtc::TimeMicros() / 1000;
}

#1
https://codereview.chromium.org/1996453003/diff/200001/content/renderer/media/rtc_video_encoder.cc


We need to fix this in M53. I think some platforms (Mac/android/Windows) may not support timestamp yet.
 
Labels: VideoShortList

Comment 2 by pbos@chromium.org, Jun 16 2016

Cc: mnilsson@chromium.org phoglund@chromium.org posciak@chromium.org mflodman@chromium.org
phoglund@: Do we have insufficient device coverage for webrtc end-to-end quality tests (PSNR etc)? This caused a video hang for a lot of ChromeOS devices, getting one frame through at most.
Re #2: Nope, it's just mac/linux/win for the full-blown video quality tests. We rely on manual testing and dogfooding etc. to catch bugs like these currently.

We do have webrtc stats tests ported to telemetry, and they run on most CrOS devices. Those results aren't monitored in practice because there's too many devices and too much churn and too much data. So if this bug shows up as a blip on some WebRTC stat, and we started to monitor them we could maybe have caught this bug.
I think recent regressions are happening a lot on ARM devices.

Can we start with top affected platforms? If we can set alerts on the perf dashboard, you can reduce manual monitoring as well.
Cc: vsu...@google.com
I have tried forcing to use current timestamp as in https://codereview.chromium.org/2079553002, but it does not solve the video hang problem.
Cc: jansson@chromium.org rohi...@chromium.org wuchengli@chromium.org avkodipelli@chromium.org kjellander@chromium.org
 Issue 620848  has been merged into this issue.
Labels: -Pri-2
Labels: Bug-Regression
pbos@ Is it possible to count the number of encoded frames in Javascript? Does WebRTC have an API for that? If yes, someone can write a test to catch this in the future.
I uploaded a similar patch. https://codereview.chromium.org/2071953003
Unfortunately nyan is not setting timestamp to 0. It's giving negative timestamp from testing. I'll checking with Nvidia how they set the timestamp. If the timestamp is uninitialized, we'll have to revert to the current time until nyan merges the new library (tracked in http://crosbug.com/p/50290).
Re #4, #10: There is already a test that looks for frozen frames: https://chromium.googlesource.com/chromiumos/third_party/autotest/+/master/client/site_tests/video_WebRtcPeerConnectionWithCamera/video_WebRtcPeerConnectionWithCamera.py. I can see there are a large number of failures in that test from today already: https://wmatrix.googleplex.com/failures/video?tests=video_WebRtcPeerConnectionWithCamera. Another view: https://wmatrix.googleplex.com/reasons?suites=bvt-perbuild&tests=video_WebRtcPeerConnectionWithCamera

Those failures include nyan-blaze (is that the same as nyan-big?)

I dug into one of the test failures where I could find logs, and lo and behold: 06/16 09:21:10.405 INFO |video_WebRtcPeerCo:0129| FrameStats: {u'numFrames': 178, u'numBlackFrames': 4, u'numFrozenFrames': 176}
(source: https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/66962962-chromeos-test/chromeos2-row3-rack8-host8/video_WebRtcPeerConnectionWithCamera/debug/)

The test fails if the number of frozen frames is above 10, and it detects 178 above. We don't need new tests, we need to look at the ones we already have. 

Given that this test is in go/wmatrix, I'm assuming the ball is in your park on the CrOS side, but let us know if there's anything we can do to help.

Re #4: Rohit, can you compile a list of the regressions you have seen on ARM devices? It would be interesting to see if there's anything we are not covering.
Re #12: sorry, I mean the total number of frames recorded is 178 and out of those 176 are frozen.
Re #12: And to be more concrete: what needs to happen to start monitoring the video_WebRtc* tests? Do they need to be monitored at an earlier stage to catch the bugs faster? I'm assuming you want more testing here because you feel the bug was caught too late. In that case, when is early enough?

Comment 15 by pbos@chromium.org, Jun 17 2016

Shouldn't these tests be hooked up to perf graphs like our perf bots?
Our goal was to setup these tests in bvt-cq to catch any regressions early enough but these tests are not stable enough to be added to bvt-cq.

Not talking about this specific issue but would it help to setup a few CrOS bots at WebRTC side commit queue to ensure that any WebRTC related regressions don't make into real builds?

Setting up alerts for WebRTC perf tests might be also useful if WebRTC test can monitor them.
Re #16: Ok then, is it just a question of making them more stable? Similar tests in Chromium land are stable so it can certainly be done.

As for WebRTC-side bots: seems they would have to be ARM bots to catch these issues, CrOS/Linux bots would not be enough. Are there any Chromium-side bots today that run on ARM (e.g. laptops)? We used to have CrOS/Linux bots in the WebRTC waterfalls a couple years back, but we removed them because they were constantly broken for infra reasons. They would not have helped in this case anyway.

Re #15: Yes, that would be better than hard-asserting on perf numbers, but cpaulin@ never got around to doing that. Site tests can report to chromeperf, right Rohit?

There are then these possibilities as I see it: 1) stabilize the current tests and add to bvt-cq 2) rewrite to report to chromeperf + add to WebRTC perf sheriff rotation or 3) all of the above.
numFrozenFrames should be enough to catch this. I wonder why nyan-big and nyan-kitty have been passing. big, blaze, and kitty should have the same issue.

https://wmatrix.googleplex.com/unfiltered?hide_missing=True&releases=tot&tests=video_WebRtcPeerConnectionWithCamera&days_back=20
Shenghao. FYI. I have a patch to change RtcVideoEncoder to use the current timestamp because of nyan. https://codereview.chromium.org/2071953003
Re #18: Yeah, a random pick from a successful run yields

FrameStats: {u'numFrames': 69, u'numBlackFrames': 0, u'numFrozenFrames': 0}

That looks a whole lot better. wuchengli@, can you still repro this manually on device? Would be nice if you could run the test on that device just to see what the test does. The test is fairly new so it could be missing things.
Sorry, I was OOO on friday.

#17, yes, autotests can report numbers to chromeperf. 1) & 2) sound like a good start. I can create tracking issues and assign them to you? 


Blockedon: chrome-os-partner:50290
Owner: wuchengli@chromium.org
Sheng-hao. Let me take this bug.

This issue is currently blocked by new nyan library.
Re #21: I can do 2) at least. You never answered why you considered the tests to be unstable or what the criteria for bvt-cq is. You have any data or bugs on this instability?
This issue is still blocked by new Nyan LDK. I'll upload a patch today or tomorrow.
pbos@ I'm testing my patch (https://codereview.chromium.org/2105683002). I synced chrome today. The timestamp from WebRTC was 0 in my testing. Any idea why? That is, I printed input_image.ntp_time_ms() in RTCVideoEncoder::Encode() and it was 0.


Comment 26 by pbos@chromium.org, Jun 28 2016

FTR: From offline discussion it looks like it's not propagated correctly when doing native handles.
Re #26: that's correct. I'll fix it in RtcVideoEncoder in the next patchset tomorrow.
Re #23: these tests fail on random boards mainly because of "Failed [1280, 720] local peer connection test" error. I will send an email with more details define AIs on this.

The criteria for including a test into bvt-cq is have the test with almost no flakiness.A randomly failing test can't be included in bvt-cq as it halts the development. 

https://wmatrix.googleplex.com/unfiltered?hide_missing=True&releases=tot&tests=video_WebRtcPeerConnectionWithCamera&days_back=15
Project Member

Comment 29 by bugdroid1@chromium.org, Jul 1 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a31c66943657af646c8baaaab4ce42bb0f650931

commit a31c66943657af646c8baaaab4ce42bb0f650931
Author: wuchengli <wuchengli@chromium.org>
Date: Fri Jul 01 07:39:20 2016

RtcVideoEncoder: use the timestamp from encoder.

Fallback to the current time if the timestamp from the encoder
callback is 0.

BUG= chromium:620565 
TEST=Run apprtc loopback on oak.

Review-Url: https://codereview.chromium.org/2105683002
Cr-Commit-Position: refs/heads/master@{#403423}

[modify] https://crrev.com/a31c66943657af646c8baaaab4ce42bb0f650931/content/renderer/media/gpu/rtc_video_encoder.cc

Status: Fixed (was: Assigned)
Status: Verified (was: Fixed)
Verified on 8530.10.0 / 53.0.2785.10

Sign in to add a comment