samus: many decode errors on apprtc loopback with H264
Issue description
Reproduced on samus M55 CrOS 8867.0.0. Previously not reproducible on M54 CrOS 8556.0.0.
Steps:
1. Navigate to https://appr.tc/?debug=loopback&vsc=h264.
2. Observe ui logs, which are filled with:
ERROR:h264_decoder_impl.cc(333)] avcodec_decode_video2 error: -1094995529
However, about:histograms/Media.RTC shows HW encoder and decoder were initialized successfully.
Emircan: would you be able to please triage? Thank you.
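For reference, the numeric code in that log line is one of FFmpeg's tag-style error values: four ASCII characters packed into an integer and negated. The snippet is a minimal Python sketch for reading the tag back out. The function name is hypothetical and the code is not part of Chromium or FFmpeg.

```python
def decode_fferrtag(err: int) -> str:
    """Return the four ASCII characters packed into a negative FFmpeg error code.

    FFmpeg builds tag-style errors as -(a | b << 8 | c << 16 | d << 24),
    so the tag is read back from the least significant byte upward.
    """
    value = -err
    return "".join(chr((value >> shift) & 0xFF) for shift in (0, 8, 16, 24))


print(decode_fferrtag(-1094995529))  # prints "INDA"
```

The tag "INDA" corresponds to AVERROR_INVALIDDATA, i.e. the software decoder was handed data it could not parse as a valid stream.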
,
Oct 7 2016
mflodman@ can you triage? posciak@ are you getting working video for a while before this happens or does it fail immediately? emircan@ do you know what settings the HW encoder is using or if this has changed recently? For example, we currently only support baseline profile (fyi +magjed is working on high profile support), any changes to settings need to be negotiated and supported by both HW and SW implementations.
,
Oct 10 2016
,
Oct 10 2016
hbos@ there is one change regarding scaling input frames, see issue 630577 . Re changes within CrOS, wuchengli@ and posciak@ would know more.
,
Oct 11 2016
Magnus, Can someone in your team take a first look at this?
,
Oct 17 2016
Sorry for ping ponging, but I don't think the mobile team is the right owner for SW codec errors. hbos@ is the sole author of webrtc/modules/video_coding/codecs/h264/h264_decoder_impl.cc, but if he can't take this, then maybe some owner of webrtc/modules/video_coding/codecs? Stefan - Can you take a look?
,
Oct 17 2016
I will not be able to work on this until next week since I'm travelling. Erik, can you work with magjed and hbos to try to reproduce and figure out what's happened here? My guess would be that we're passing down delta frames without the corresponding key frame. The jitter buffer probably thinks it's ok since it doesn't know the HW codec has failed and fallen back to SW. I would have expected it to recover after some time though since it returns an error which in turn should lead to a PLI being sent by the receiver. https://cs.chromium.org/chromium/src/third_party/webrtc/modules/video_coding/codecs/h264/h264_decoder_impl.cc?rcl=0&l=330 I think the right thing would have been to reset the jitter buffer and send a PLI once we fall back to software, but even if we don't I expect things to get resolved after some time when a new key frame is received. A simple thing to try could be to ensure that we have decoded a key frame in h264_decoder_impl.cc before we decode any delta frames. If we get a delta frame before a key frame we should return an error so that a PLI is sent.
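The "simple thing to try" from this comment could look roughly like the Python sketch below. It only illustrates the idea; the real decoder is C++ in h264_decoder_impl.cc. The class name, method names and return codes are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical return codes; the real decoder uses its own constants.
DECODE_OK = 0
DECODE_ERROR = -1


@dataclass
class KeyFrameGate:
    """Rejects delta frames until a key frame has been seen."""

    have_key_frame: bool = False

    def check(self, is_key_frame: bool) -> int:
        if is_key_frame:
            self.have_key_frame = True
            return DECODE_OK
        # An error here is what would lead the receiver to request
        # a new key frame (PLI), as suggested in the comment.
        return DECODE_OK if self.have_key_frame else DECODE_ERROR

    def reset(self) -> None:
        """Call after a decoder switch, for example a HW to SW fallback."""
        self.have_key_frame = False
```

With a gate like this, a software decoder that takes over mid-stream would report an error on the first delta frame instead of feeding it to the codec without a reference key frame.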
,
Oct 19 2016
Is this also reproducible in Hangout?
,
Oct 19 2016
I tried reproducing on my workstation without success, so I'm assuming samus has some kind of hw codec? Is it reproducible on any other platform? I'm having trouble finding a Pixel2 to test on...
,
Oct 24 2016
Yes, samus has both a H264 HW encoder and decoder. But this should be reproducible on basically any Chrome OS device apart from really old ones, as almost anything has a HW H264 decoder.
,
Oct 26 2016
I can reproduce it on a guado running 56.0.2899. AppRTC loopback does not start. I don't see the logs mentioned in the opening post as I just have a release build image. I tried running an AppRTC call between CrOS and Mac. What I see is that on CrOS I get both encode and decode going, but nothing gets decoded on Mac. Logs showed that Mac does not receive any frames to decode.
,
Oct 26 2016
Note that #11 is using H264. Using VP8(HW) and VP9(SW) works on AppRTC loopback. This device crashes in login screen when I deploy a debug build. So I added some logs to release to see what is happening. HW Encode is working fine. RTCVideoDecoder::Create is called and creates Hw decoder as well. However, RTCVideoDecoder::InitDecode() never gets called. https://paste.googleplex.com/5437628061057024 I wonder if this is something to do with H264 profiles matching. magjed@, did that make it to M55?
,
Oct 26 2016
> I wonder if this is something to do with H264 profiles matching. magjed@, did that make it to M55? No, no changes to H264 profiles matching in M55.
,
Oct 27 2016
,
Oct 27 2016
I've finally managed to get a suitable chromebook and repro this. Didn't see the mentioned logging though, or any other logging so far that looks suspicious. When I try this out I can get artifacts if I put finger over the camera a few times, but then the artifacts go away if I move around. I see a few mentions about resetting codec from hd to vga and back, wondering if that can cause anything.
,
Oct 27 2016
It looks like there are two issues here. AppRTC loopback not starting for H264, which I mentioned in #11 and #12, does not happen on M55 (checked 8872.27.0). So, I filed a separate bug about it, see 660221. On M55 8841.0.0 guado, I can reproduce the issue and logs mentioned in the opening post [0]. There are no error logs from rtc_video_decoder.cc, so it looks like it is hitting the line on resolution change not supported [1]. This is in line with what sprang@ explained. Note that right before the errors start, there is a V4L2VideoEncodeAccelerator::SetOutputFormat() error.
[0] https://paste.googleplex.com/4945582648983552
[1] https://cs.chromium.org/chromium/src/content/renderer/media/gpu/rtc_video_decoder.cc?rcl=1477581616&l=223
,
Oct 28 2016
@15: Which Chromebook did you try on? This seems to reproduce only on Intels from my testing...
,
Oct 28 2016
@16: did something change wrt. resolution change? I think previously we were reinitializing RTCVideoDecoder on res change? Would we be getting an error from avcodec_decode_video2 because of falling back to software on res change? Also, if I disable HW *encoder*, the errors are gone. So I think this actually suggests that it's the encoded stream produced by HW encoder causing issues for the SW decoder. This would mean we have three issues:
1) loopback for H264 not starting
2) resolution change not supported by RTCVD and fallback to SW decoder
3) stream produced by HW encoder in loopback resulting in above decode errors when using SW decoder to decode it.
,
Oct 28 2016
I added logs to print out the decode frame size. The frame that causes error has size 0x0, so this might be similar to earlier issue 641600. However, I printed out the frames coming from rtc_video_encoder and they are all 1280x720. https://paste.googleplex.com/6124570230652928 @17: I am not aware of any changes on that behavior. Also, 2 isn't necessarily a bug since there isn't much to do if the size is not supported.
,
Oct 31 2016
I used a chromebook minnie. Looking closer it seems that I did in fact only use h264 hw decoder and did encoding in software (which explains the poor frame rate). With sw decode, I could not reproduce the issue. I'm building a custom image, so I can do some logging.
,
Nov 1 2016
Sorry, unfortunately, minnie actually does not support H264 HW encode... Do you have any Intel-based chromebooks available?
,
Nov 8 2016
Hi, would you perhaps have any updates please? Thanks.
,
Nov 8 2016
I have not been able to get hold of an Intel chromebook I can put in dev mode, but since I can reproduce this with only hardware decode, I went ahead with that. I've been trying different things, like disabling CPU adaptation to prevent resolution changes, but I can still trigger the errors. I also looked into a few bugs related to H264 decodability (https://codereview.webrtc.org/2385143002), but still hit errors. I also actually triggered this on M54, but not as frequently. I'm wondering if there are actually two separate issues. As I don't really have time to spend on this right now, I'd appreciate it if someone with access to the mentioned hardware could take over. +niklase, didn't your team do some work on hw codecs? Feel free to take over or reassign if this is wrong.
,
Nov 9 2016
Over to emircan that looked at this today.
,
Nov 9 2016
sprang@, as I explained on #19, what triggers the errors is that Chrome receives a keyframe with size 0x0 in rtc_video_decoder from webrtc. After that it falls back to software and errors follow. See l.2513 on https://paste.googleplex.com/5013609293807616. I added logs in video_receiver.cc and they also point to a 0x0 keyframe parsed from the packets. I also added logs to see if rtc_video_encoder sends a corrupt frame, but they are all 1280x720 and have non-zero payload sizes. Another interesting observation here is that we have ~4 key frames per second, which is too high. Do you think any of the changes to rtp_format_h264.cc would be related? https://chromium.googlesource.com/external/webrtc/trunk/webrtc/+log/master/modules/rtp_rtcp/source/rtp_format_h264.cc I haven't worked on WebRTC packet parsing or hardware H264 for CrOS, so I am really confused about what is going wrong here. I can help you with debug logs if you can give me a range to look at, but please take ownership of the bug if you think it is appropriate. posciak@, do you think https://codereview.chromium.org/2274493002 would require changes in WebRTC?
,
Nov 9 2016
I'm not seeing that behavior on Minnie, so that seems to be another issue. I tried on peach pit, verified that both encode and decode is using hw, but was unable to reproduce any error. I tried reproducing on a chromebook pixel, but contrary to what I read, that doesn't seem to support hw h264 encoding. Could you try to reproduce this on samus, on latest and greatest, where a jitter buffer bug causing incorrect handling of SPS/PPS is fixed? Otherwise I guess we'll have to order new hardware, cause I'm sort of blind if I can't reproduce.
,
Nov 9 2016
+philipel who has worked on some parts of the H264 specifics in the jitter buffer.
,
Nov 9 2016
@26, I don't have a samus available, and I have been reproducing this on guado using M55 so far. On M56, we have another issue that AppRTC loopback does not even connect with H264 HW encode, see #16 and issue 660221 . Maybe the jitter buffer bug/fix is the underlying reason for these problems? Also, as far as I read rtp_format_h264.cc, there are cases where width/height is not set but frames are marked as keyframes, i.e. IDR and SEI [0]. Are those expected to be sent back to Chromium with width/height 0? If yes, then it is expected that we hit the check on [1] and fall back to SW. [0] https://cs.chromium.org/chromium/src/third_party/webrtc/modules/rtp_rtcp/source/rtp_format_h264.cc?rcl=0&l=586 [1] https://cs.chromium.org/chromium/src/content/renderer/media/gpu/rtc_video_decoder.cc?rcl=0&l=222
,
Nov 10 2016
We are looking at a regression first introduced in 8941.0.0 that could be causing this. Please see issue 662792 for details.
,
Nov 10 2016
Assigning to kcwu@ as well as I can't make any progress on this.
,
Nov 14 2016
Kaung-che. Now that all fixes were in, is H264 loopback working now?
,
Nov 15 2016
Without my patch, apprtc loopback cannot start ( issue 660221 ). After my patch, apprtc loopback can start but something is wrong, as described in this issue earlier. I saw the same error messages:
[1:23:1115/191150:VERBOSE2:rtc_video_decoder.cc(221)] Got key frame. size=0x0
[1:23:1115/191150:VERBOSE1:rtc_video_decoder.cc(227)] Resolution unsupported, falling back to software decode
[1:23:1115/191150:WARNING:video_decoder.cc(78)] Decoder falling back to software decoding.
[1:23:1115/191150:ERROR:h264_decoder_impl.cc(328)] avcodec_decode_video2 error: -1094995529
It switched between SW and HW decoding back and forth every few frames. I have tried replacing libva*.so with my local build made with gcc, but it didn't help.
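When comparing builds it can help to count how often the 0x0 key frame shows up in a chrome log. The Python sketch below assumes verbose logging is enabled for rtc_video_decoder.cc and that the message text matches the lines quoted in this comment. The helper names are hypothetical.

```python
import re
from typing import List, Tuple

# Matches lines such as:
# [...:VERBOSE2:rtc_video_decoder.cc(221)] Got key frame. size=0x0
KEY_FRAME_RE = re.compile(
    r"rtc_video_decoder\.cc\(\d+\)\] Got key frame\. size=(\d+)x(\d+)"
)


def key_frame_sizes(log_text: str) -> List[Tuple[int, int]]:
    """Return the (width, height) of every key frame reported in the log."""
    return [(int(w), int(h)) for w, h in KEY_FRAME_RE.findall(log_text)]


def count_zero_size_key_frames(log_text: str) -> int:
    """Count key frames that were reported with size 0x0."""
    return sum(1 for size in key_frame_sizes(log_text) if size == (0, 0))
```

The line number inside the parentheses is matched loosely because it differs between builds (221 here, 218 in a later comment).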
,
Nov 15 2016
Pushing to M57. Please update if that's wrong.
,
Nov 15 2016
Keeping this on 55 for the time being, might be critical to get fixed.
,
Nov 16 2016
kcwu@, Have you tried with a fresh chrome build as well, so that the recent H264 fixes in the jitter buffer are included? The HW codec implementation in samus, what exactly is that? Something of our own design like kepler, something in the gpu? I haven't been able to get hold of hardware to repro this, so we may need to go out and acquire some, but I can't even find a samus for sale... :/
,
Nov 16 2016
philipel@, I'll assign this to you for now. Please work with pawel@ on reproing and determining if this is indeed a jitter buffer problem.
,
Nov 16 2016
re #35, recent samus R56-8994.0.0 has the same issue (avcodec_decode_video2 error)
,
Nov 16 2016
I'm having decoding problems using the new H264 negotiation that was fixed in issue 645599 . magjed@ suspects that the problem that I'm having is related to this issue. Can you please check what is happening. Logs attached. Thanks
,
Nov 17 2016
Today I realized that this problem also occurs on M54. I tried the first version where CrOS HW H264 was enabled for WebRTC [0] on 8564.0/54.0.2791.0 guado and the problem is still there. It doesn't look like a regression on the WebRTC packet handling side. I am assigning it to wuchengli@ to take a look at whether it is related to the kernel fix [1]. Can you take a look at why the HW encoder sends a keyframe with size 0? It doesn't make sense to disable HW encode for CrOS though, as things work fine after falling back to SW decode on the remote peer. Doing SW encode would be much more expensive than SW decode. Additionally, I tested a call between guado CrOS 54.0.2791.0 and Mac ToT. As expected, Mac ToT falls back to SW decode as it receives a keyframe of size 0.
[0] https://codereview.chromium.org/2125163003
[1] https://bugs.chromium.org/p/chromium/issues/detail?id=625073#c3
,
Nov 18 2016
I spent some time yesterday looking into this bug and I saw the same thing as in #39 (error decoding frames produced by M54 hw encoder), although it seems to occur much more often on M55. Something strange I also noticed: when running apprtc between an M54/55 chromebook and my desktop, for the first ~3 seconds I received keyframes frequently even though I didn't have any decoding issues.
,
Nov 21 2016
I'll take a look.
,
Nov 23 2016
H264 apprtc loopback using HW decode and encode work on samus 9014.0.0.
,
Nov 23 2016
H264 apprtc loopback using HW decode and encode work on guado 9014.0.0.
,
Nov 24 2016
For #42 and #43, was it still producing the errors in the initial report? Thanks.
,
Nov 24 2016
Re 44: no. There were no avcodec_decode_video2 error: -1094995529. I tried guado M56 9000.3.0. There were avcodec_decode_video2 errors. This means the issue was fixed recently. I've attached chrome log.
,
Nov 24 2016
I tried M57 9014.0.0 on guado again. This time it has avcodec_decode_video2 errors.
,
Nov 24 2016
I tried M57 9014.0.0 on samus again. I could reproduce the errors. I don't know why I couldn't reproduce it in #42 and #43. I deployed my self-built chrome (e0e9385ba796fd29e1f0f11126a18e173364c864, #431149). It's a bit old (Nov 9). I could reproduce the issue. I enabled the logs in RtcVideoDecoder. Like emircan said, rtc_video_decoder got key frame 0x0 and fell back to software.
[1:22:1124/174704:VERBOSE2:rtc_video_decoder.cc(218)] Got key frame. size=0x0
[1:22:1124/174704:VERBOSE1:rtc_video_decoder.cc(224)] Resolution unsupported, falling back to software decode
,
Nov 24 2016
I enabled all verbose logs. The first frame to RtcVideoDecoder::Decode was 1280x720 key frame. The second frame to RtcVideoDecoder::Decode was a 0x0 key frame.
,
Nov 24 2016
I'm OOO tomorrow. I'll continue investigating next Monday. I'm removing M54 ReleaseBlock-Stable because it's too late for M54.
,
Nov 28 2016
I talked to Stefan.
- It could be that you're getting a frame with only sps/pps and then a frame without sps/pps, but only IDR. In that case I guess we won't be able to parse width/height for the IDR.
- RtpDepacketizerH264::ProcessStapAOrSingleNalu (#1) is where width/height and frame type are set.
If that's the case, I think RtcVideoDecoder shouldn't fall back to software decoder if it gets a keyframe size 0x0. I'll try changing the code tomorrow.
#1: https://cs.chromium.org/chromium/src/third_party/webrtc/modules/rtp_rtcp/source/rtp_format_h264.cc?rcl=1480321223&l=486
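To make the sps/pps versus IDR point concrete, the Python sketch below is a simplified model of how a frame can end up flagged as a key frame while carrying no resolution. It is not the RtpDepacketizerH264 code. The nal_unit_type values come from the H.264 specification; the function names, the sps_size stand-in and the sei_is_key switch are assumptions for illustration.

```python
from typing import List, Optional, Tuple

# nal_unit_type values from the H.264 specification.
NALU_IDR = 5
NALU_SEI = 6
NALU_SPS = 7
NALU_PPS = 8


def nal_unit_type(first_byte: int) -> int:
    """The type is the low five bits of the first byte of a NAL unit."""
    return first_byte & 0x1F


def summarize_frame(
    nalu_types: List[int],
    sps_size: Optional[Tuple[int, int]] = None,
    sei_is_key: bool = True,
) -> Tuple[bool, Tuple[int, int]]:
    """Return (is_key_frame, (width, height)) for the NAL units of one frame.

    sps_size stands in for the resolution a real parser would read from the
    SPS payload; it is only used when an SPS is actually present.
    """
    key_types = {NALU_IDR} | ({NALU_SEI} if sei_is_key else set())
    is_key = any(t in key_types for t in nalu_types)
    size = sps_size if (NALU_SPS in nalu_types and sps_size) else (0, 0)
    return is_key, size
```

In this model summarize_frame([NALU_SPS, NALU_PPS, NALU_IDR], (1280, 720)) is a key frame with a size, while summarize_frame([NALU_IDR]) and summarize_frame([NALU_SEI]) are key frames of size 0x0, which is the case that reached RtcVideoDecoder.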
,
Nov 28 2016
Just a heads up, we are nearing 55 stable and this is marked as a blocker, if we can get a fix in the next two days we can make the targeted RC, if not we may have to punt or delay.
,
Nov 29 2016
I'm going to punt this to M56 because I don't see a real broken use case. Please reply if there's concern.
,
Nov 29 2016
- I changed RtcVideoDecoder not to fall back to software when it gets a keyframe with size 0x0. Then H264 loopback worked and didn't fall back to software. But sometimes video hanged for a few seconds. It could be a different issue.
- I found Sei would also set frame type to kVideoFrameKey (#1). Is it expected that a Sei frame is a key frame with size 0x0 sent to RtcVideoDecoder?
- I reverted https://codereview.webrtc.org/2385143002 and most of the size 0x0 kVideoFrameKey cases were gone except one.
- I tried to dump the streams using VideoSendStream::EnableEncodedFrameRecording and VideoReceiveStream::EnableEncodedFrameRecording. It's not working yet. I'm trying to figure out why.
I have a question. When RtcVideoEncoder encodes n frames (EncodedImageCallback.OnEncodedImage), does it mean RtcVideoDecoder::Decode will get n frames to decode?
#1: https://cs.chromium.org/chromium/src/third_party/webrtc/modules/rtp_rtcp/source/rtp_format_h264.cc?rcl=1480321223&l=530
,
Nov 29 2016
- I forgot to say. I added some logs in rtp_format_h264.cc and size 0x0 keyframe is not IDR.
,
Nov 30 2016
Thanks Wu-Cheng, when you say "video hanged for a few seconds. It could be a different issue." Were we still encoding and/or decoding during that time? I'm not sure why SEI message NALUs are treated as keyframes, they are not actual frames. Perhaps the interpretation of kVideoFrameKey is different...?
,
Nov 30 2016
I dumped a h264 stream from VideoSendStream. It's in ivf format. I could use mplayer to play it and it looked fine.
,
Nov 30 2016
+noahric I made SEI not counted as keyframe locally. RtcVideoDecoder stopped getting size 0x0 keyframes and apprtc h264 loopback could work. SEI was added as a keyframe by https://codereview.webrtc.org/1664733002 in Feb this year. Should we modify RtcVideoDecoder not to fall back to software decoder when getting a size 0x0 keyframe? Or should we stop setting SEI as keyframes?
,
Nov 30 2016
I don't know why other devices like elm don't have SEI issues when doing h264 apprtc loopback.
,
Nov 30 2016
Either patch below can make H264 loopback work.
- Do not fall back to software when getting size 0x0 keyframe. https://codereview.chromium.org/2532953009
- Do not set SEI nalu as a keyframe. https://codereview.webrtc.org/2544463002
,
Nov 30 2016
I discussed with Stefan. If we don't mark SEI as keyframe, we may still get an IDR which is marked as keyframe but doesn't have a proper size. So RtcVideoDecoder shouldn't fall back to software when it gets a keyframe without size. The intention of the fallback code is to handle the case where the new resolution is not supported by the hardware decoder. 0x0 is not really a new resolution and RtcVideoDecoder should just send it to VDA. Tomorrow I'll clean up https://codereview.chromium.org/2532953009 and start code review. Then we don't really care if SEI is a key frame or not.
,
Dec 2 2016
https://codereview.chromium.org/2532953009 has been sent to CQ. This is what happened:
(1) RtcVideoDecoder got a size 0x0 keyframe and fell back to software decoder.
(2) SW decoder couldn't decode the frame (probably because the frame with the resolution was sent to HW decoder) and showed the error. ERROR:h264_decoder_impl.cc(333)] avcodec_decode_video2 error: -1094995529
(3) The next several delta frames were sent to SW decoder. SW decoder could not decode them and continued to print errors.
(4) The next keyframe with resolution arrived. HW decoder could decode it and we stopped falling back to software decoder.
(5) When (1) happened again, all of the above repeated.
With the fix, there's no fallback to software decoder and there are no error logs. The fix is small. But there's no way to test every decoder and encoder combination on all the different platforms. I believe the problem is only on H264 HW decode via webrtc. H264 HW encode is fine (e.g. casting). I don't know of any use case of H264 decode via webrtc on ChromeOS. I'm proposing not to merge the fix to M56. Please reply if you have any concern.
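The sequence (1) to (5) can be replayed with a small model. This Python sketch is a simplification, not the RtcVideoDecoder implementation: the resolution limits are hypothetical, and it assumes the decoder choice is only revisited on key frames.

```python
from typing import Iterable, List, Tuple

Frame = Tuple[bool, int, int]  # (is_key_frame, width, height)

# Hypothetical limits of the hardware decoder, for illustration only.
MIN_WIDTH, MIN_HEIGHT = 16, 16
MAX_WIDTH, MAX_HEIGHT = 1920, 1088


def should_fall_back(frame: Frame, ignore_zero_size: bool) -> bool:
    """Decide whether a frame should push decoding to software."""
    is_key, width, height = frame
    if not is_key:
        return False
    if ignore_zero_size and width == 0 and height == 0:
        # Behaviour with the fix: 0x0 is not treated as a new resolution.
        return False
    supported = (MIN_WIDTH <= width <= MAX_WIDTH
                 and MIN_HEIGHT <= height <= MAX_HEIGHT)
    return not supported


def route_frames(frames: Iterable[Frame], ignore_zero_size: bool) -> List[str]:
    """Return "hw" or "sw" for each frame in arrival order."""
    using_sw = False
    routes = []
    for frame in frames:
        if frame[0]:  # only key frames change the decoder choice
            using_sw = should_fall_back(frame, ignore_zero_size)
        routes.append("sw" if using_sw else "hw")
    return routes


frames = [(True, 1280, 720), (True, 0, 0), (False, 0, 0),
          (False, 0, 0), (True, 1280, 720), (False, 0, 0)]
print(route_frames(frames, ignore_zero_size=False))
print(route_frames(frames, ignore_zero_size=True))
```

Without the fix the model sends the 0x0 key frame and the following delta frames to software until the next key frame with a resolution arrives; with the fix every frame stays on the hardware path.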
,
Dec 2 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/91f58a78aa62ea4b6e3a81a98f017fdc1c0ed644
commit 91f58a78aa62ea4b6e3a81a98f017fdc1c0ed644
Author: wuchengli <wuchengli@chromium.org>
Date: Fri Dec 02 10:06:01 2016
RtcVideoDecoder: do not fallback to software when getting size 0x0 keyframe.
A keyframe doesn't mean it has a resolution like IDR. On the other hand, SPS is not a keyframe but it has a resolution. Size 0x0 is not a new resolution and we should not fallback to software decoder.
BUG= 653434
TEST=H264/VP8 apprtc loopback and Hangout on Sentry and Elm.
Review-Url: https://codereview.chromium.org/2532953009
Cr-Commit-Position: refs/heads/master@{#435901}
[modify] https://crrev.com/91f58a78aa62ea4b6e3a81a98f017fdc1c0ed644/content/renderer/media/gpu/rtc_video_decoder.cc
[modify] https://crrev.com/91f58a78aa62ea4b6e3a81a98f017fdc1c0ed644/content/renderer/media/gpu/rtc_video_decoder_unittest.cc
,
Dec 2 2016
Patrick. Will new H264 video_WebRtcPeerConnectionWithCamera catch this bug? In this issue, H264 loopback still works. But it switches between hardware and software decoder several times. There's a one second delay during fallback.
,
Dec 2 2016
What are the symptoms? If video freezes or is black, then the answer is yes.
,
Dec 2 2016
That is, of course, provided the WebRTC sheriff understands what's going on and files a bug in the appropriate place. I'm a bit concerned with the new CrOS alerts; even if we get them, it's extremely difficult to narrow down a culprit.
,
Dec 7 2016
I'm having some decoding problems when negotiating in H264. I've installed the latest version that contains this fix: Version 57.0.2945.0 (64-bit). Can you please tell me what is going on. I've attached the log files below
,
Dec 8 2016
Re 66: Please file a new bug. The log looks like a different issue. Please provide the device name, the exact repro steps (did you use apprtc loopback?), and get a bugreport by running generate_logs in the command line. The bugreport is big so please use google drive to share.
[30148:18424:1207/155145.284:WARNING:generic_decoder.cc(164)] Failed to decode frame with timestamp 81025, error code: -1
[30148:2884:1207/155145.431:ERROR:rtc_video_decoder.cc(530)] VDA Error:2
[30148:18424:1207/155145.450:ERROR:rtc_video_decoder.cc(180)] Decoding error occurred.
I tested tot 9060.0.0 on samus and peach pi. Both vp8 and h264 apprtc loopback worked.
,
Dec 8 2016
Ok. I will do that thanks. This was done in Windows 10 with Chromium Version 57.0.2945.0 (64-bit) that I've downloaded and installed from here on 07/12/2016 http://chromium.woolyss.com/download/. The logs that I've posted were generated using "--enable-logging --v=1" I looked up a bit and I don't see the option "generate_logs" can you tell me how I can enable that?
,
Jan 27 2017
,
Mar 16 2017
Verified on 9202.53.0, 57.0.2987.112.
Comment 1 by emir...@chromium.org, Oct 6 2016
Owner: hbos@chromium.org