WebRTC: RTP synchronization and PTS assignment issues
Reported by
mparisd...@gmail.com,
Sep 29 2016
Issue description
UserAgent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:49.0) Gecko/20100101 Firefox/49.0
Steps to reproduce the problem:
Hello,
I am running into synchronization and PTS assignment issues when our implementation interoperates with Chrome.
The inter-stream synchronization (commonly known as lip sync) and the PTS assignment are based on:
Timestamp of the RTP packets received [1]
RTP timestamp of the RTCP SR (Sender Report) packets received [2]
NTP timestamp of the RTCP SR (Sender Report) packets received [2]
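For reference, the NTP timestamp carried in a Sender Report is a 64-bit fixed-point value: the upper 32 bits are seconds since 1900 and the lower 32 bits are a binary fraction of a second [2]. The following is a minimal Python sketch of how such a value can be decoded; the helper names `ntp_to_sec_ns` and `format_hms` are made up for illustration and do not come from any library.

```python
from typing import Tuple


def ntp_to_sec_ns(ntp_time: int) -> Tuple[int, int]:
    """Split a 64-bit NTP timestamp into (seconds since 1900, nanoseconds)."""
    seconds = ntp_time >> 32                        # upper 32 bits: whole seconds
    fraction = ntp_time & 0xFFFFFFFF                # lower 32 bits: fraction of a second
    nanoseconds = (fraction * 1_000_000_000) >> 32  # scale the fraction to ns, rounding down
    return seconds, nanoseconds


def format_hms(seconds: int, nanoseconds: int) -> str:
    """Render as hours:minutes:seconds.nanoseconds (hypothetical display helper)."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}.{nanoseconds:09d}"


sec, ns = ntp_to_sec_ns(15816590577043708928)
print(format_hms(sec, ns))
```

Integer arithmetic is used on purpose: a 64-bit NTP value does not fit in a double without losing the low-order bits of the fraction.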
Specifically, I have problems with the video PTS assignment due to:
Huge gaps in the timestamps of the RTP packets sent by Chrome.
The inherent problem of receiving RTP and RTCP SR packets out of sync (not serialized).
RTCP SR packets can even be lost.
To illustrate the problem, here is a real example. It shows part of a video RTP stream where the clock rate is 90000:
The RTCP-SR-M is received
SSRC: 100020, rtp_time: 1874314573, ntp_time: 15816590577043708928, ntp_ns_time: 1022940:49:24.462815999
The RTP packets of the VideoFrame-N are received
seqnum 8016, rtptime 1874315833, pts 0:00:22.727008174
seqnum 8017, rtptime 1874315833, pts 0:00:22.727008174
seqnum 8018, rtptime 1874315833, pts 0:00:22.727008174
seqnum 8019, rtptime 1874315833, pts 0:00:22.727008174
[5 more frames received, seqnums 8020-8040]
The RTP packets of the VideoFrame-(N+6) are received
seqnum 8041, rtptime 1874342833, pts 0:00:23.027008174
seqnum 8042, rtptime 1874342833, pts 0:00:23.027008174
seqnum 8043, rtptime 1874342833, pts 0:00:23.027008174
seqnum 8044, rtptime 1874342833, pts 0:00:23.027008174
An RTCP-SR-(M+1) is received
SSRC: 100020, rtp_time: 1874345983, ntp_time: 15816590579105293230, ntp_ns_time: 1022940:49:24.942815999
The RTP packets of the VideoFrame-(N+7) are received
Note the huge forward jump in the RTP timestamps here: 3.687 seconds [(1874674663 - 1874342833) / 90000]
seqnum 8045, rtptime 1874674663, pts 0:00:26.845008174
seqnum 8046, rtptime 1874674663, pts 0:00:26.845008174
seqnum 8047, rtptime 1874674663, pts 0:00:26.845008174
seqnum 8048, rtptime 1874674663, pts 0:00:26.845008174
seqnum 8049, rtptime 1874674663, pts 0:00:26.845008174
seqnum 8050, rtptime 1874674663, pts 0:00:26.845008174
seqnum 8051, rtptime 1874674663, pts 0:00:26.845008174
seqnum 8052, rtptime 1874674663, pts 0:00:26.845008174
[3 more frames received, seqnums 8053-8068]
The RTP packets of the VideoFrame-(N+11) are received
seqnum 8069, rtptime 1874699503, pts 0:00:27.121008174
seqnum 8070, rtptime 1874699503, pts 0:00:27.121008174
seqnum 8071, rtptime 1874699503, pts 0:00:27.121008174
The RTCP-SR-(M+2) is received
Note the huge forward jump between the RTP timestamp of this RTCP SR packet and that of the previous one: 3.948 seconds [(1874701303 - 1874345983) / 90000]
but only a small gap between their NTP timestamps: 0.368 seconds [25.310815999 - 24.942815999]
SSRC: 100020, rtp_time: 1874701303, ntp_time: 15816590580685841195, ntp_ns_time: 1022940:49:25.310815999
The RTP packets of the VideoFrame-(N+12) are received
Note the huge backward jump in the resulting PTS here, caused by using the info from the latest RTCP SR
seqnum 8072, rtptime 1874704003, pts 0:00:23.591008174
seqnum 8073, rtptime 1874704003, pts 0:00:23.591008174
seqnum 8074, rtptime 1874704003, pts 0:00:23.591008174
seqnum 8075, rtptime 1874704003, pts 0:00:23.591008174
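The backward jump can be reproduced with a small Python sketch of the usual SR-based mapping: take the most recent SR as a reference point and extrapolate the sender wallclock time from the RTP timestamp difference. This illustrates the general technique under stated assumptions, not the code of our implementation or of Chrome; the function name is hypothetical, and only the seconds part of each ntp_ns_time is kept to keep the numbers readable.

```python
CLOCK_RATE = 90_000  # video clock rate used in the example


def rtp_to_wallclock_ns(rtptime, sr_rtp_time, sr_ntp_ns, clock_rate=CLOCK_RATE):
    """Map an RTP timestamp to sender wallclock time (ns) using one SR as reference."""
    # Signed 32-bit difference, so the mapping survives RTP timestamp wraparound.
    diff = (rtptime - sr_rtp_time) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 0x100000000
    return sr_ntp_ns + diff * 1_000_000_000 // clock_rate


# (rtp_time, NTP time in ns) taken from the example above
SR_M1 = (1874345983, 24_942_815_999)
SR_M2 = (1874701303, 25_310_815_999)

frame_n11 = rtp_to_wallclock_ns(1874699503, *SR_M1)            # mapped with RTCP-SR-(M+1)
frame_n12_latest_sr = rtp_to_wallclock_ns(1874704003, *SR_M2)  # mapped with RTCP-SR-(M+2)
frame_n12_same_sr = rtp_to_wallclock_ns(1874704003, *SR_M1)    # what RTCP-SR-(M+1) would give

print((frame_n12_latest_sr - frame_n11) / 1e9)  # -3.53
print((frame_n12_same_sr - frame_n11) / 1e9)    # 0.05
```

The -3.53 second step matches the PTS difference between VideoFrame-(N+11) and VideoFrame-(N+12) in the log (23.591008174 - 27.121008174), while keeping the previous SR as reference would have produced the expected 0.05 second frame interval.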
What is the expected behavior?
From this example, I would say that the PTS values of VideoFrame-(N+7) to VideoFrame-(N+10) are not valid and should have been based on the RTCP-SR-(M+2), but the receiver cannot know that when those frames arrive.
So, if I am not wrong, there is no solution for this with the currently implemented protocols. To solve the problem we would need the NTP time in each RTP packet, as proposed in "Rapid Synchronisation of RTP" [3] (discussed in [4]).
What went wrong?
Anyway, I would like to know:
How does Chrome deal with this problem?
How does Chrome assign the PTS for the video frames and audio samples received?
Could anybody please point me to the relevant source code? (I could not find it.)
Did this work before? No
Chrome version: 53.0.2785.116 Channel: n/a
OS Version: Ubuntu 14.04
Flash Version: Shockwave Flash 11.2 r202
Related to https://groups.google.com/forum/?fromgroups#!topic/discuss-webrtc/npLmOesI8A4
Refs
[1] https://tools.ietf.org/html/rfc3550#section-5.1
[2] https://tools.ietf.org/html/rfc3550#section-6.4.1
[3] https://tools.ietf.org/html/rfc6051
[4] https://groups.google.com/forum/#!searchin/discuss-webrtc/from$3Ame%7Csort:relevance/discuss-webrtc/XRd5WEWD5VM/1_4FaUpVBQAJ
,
Sep 30 2016
holmer: can you have a look?
,
Sep 30 2016
Åsa, is this similar to what you have been looking at?
,
Sep 30 2016
Nisse, do you have any insights to share regarding the timestamping issue seen here?
,
Sep 30 2016
I'm afraid I'm not familiar with the RTCP protocol, so let me think aloud, and tell me if you agree. What I see:
1. The timestamp jump between RTP N+6 and N+7 seems wrong (unless there really was a pause in the video).
2. I agree the different RTP vs. NTP increment in RTCP-SR-(M+2) is strange, but it might be a consequence of (1).
3. I find it a bit odd that the RTP timestamps in the RTCP messages don't correspond to the timestamp of any actual RTP packet. As I said, I'm not really familiar with the protocol, but I would have expected the RTCP-SR to contain the RTP timestamp of the most recent RTP packet, together with the corresponding NTP time (where "corresponding" is some hairy logic).
Also note that this may well work quite differently in M53, M54, and current master. In the current code, the intention is that the frame timestamp at input to the video pipeline should always be in system monotonic time (rtc::TimeMicros, which on POSIX is clock_gettime(CLOCK_MONOTONIC, ...)), reducing the number of places where various offsets are derived and added to the timestamps. The RTP time could then be derived by simply scaling this microsecond frame time (except if it is randomized as required(?) by the spec), while the NTP time is derived by adding an offset which I think is computed once and not updated mid-stream. I don't know if we do any send-side tweaks to the NTP time to improve lip sync; as far as I understand, audio and video sources don't have any shared clock, so the best we can do is to read rtc::TimeMicros as soon as possible when the data arrives to us from the source, and try to sync using that.
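The scaling mentioned above can be sketched in a few lines of Python. This is only an illustration of the idea, assuming a 90 kHz video clock and a per-stream random starting offset; the function name and parameters are hypothetical and this is not the WebRTC code.

```python
import random

VIDEO_CLOCK_RATE = 90_000  # assumed video RTP clock rate


def rtp_timestamp_from_us(time_us, random_offset=0, clock_rate=VIDEO_CLOCK_RATE):
    """Scale a monotonic time in microseconds to a 32-bit RTP timestamp."""
    ticks = time_us * clock_rate // 1_000_000   # microseconds -> clock ticks
    return (ticks + random_offset) & 0xFFFFFFFF  # RTP timestamps wrap at 32 bits


# The offset is chosen once per stream and then kept constant.
stream_offset = random.getrandbits(32)
first = rtp_timestamp_from_us(0, stream_offset)
one_second_later = rtp_timestamp_from_us(1_000_000, stream_offset)
print((one_second_later - first) & 0xFFFFFFFF)  # 90000
```

With this scheme, a jump in the RTP timestamps can only come from a jump in the frame time fed into the pipeline, which is why the quality of the capture timestamp matters.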
,
Sep 30 2016
I suspect that the problem experienced here is the same as what Åsa has been looking at. It's triggered when the camera provides a capture timestamp for each frame to webrtc, and the time between when the capture timestamp is set and when webrtc gets the frame varies a lot. Since we used the capture timestamp and the system time to extrapolate RTP timestamps for the RTCP sender reports, this could cause issues similar to this one. If this theory is correct, I think nisse's change to add a capture timestamp aligner (https://chromium.googlesource.com/external/webrtc/+/61050f67efbf3e3df2c6aa353db0d027be3eac4f/webrtc/base/timestampaligner.cc) should solve the issue. This hopefully solves the problem on the send side, and I guess that should be enough. We could however also look at improving the robustness of our sync code by doing similar averaging on the receive side, but in this case that wouldn't have helped, since the receiving client was Firefox (if I'm reading the report correctly). mparisdiaz@gmail.com, could you try with Chrome Dev or Canary and see if you can repro?
,
Oct 4 2016
Hello @holmer, your suspicions make sense in relation to what I have seen, because the huge gaps take place when the machine where the browser is running is overloaded (e.g. when adding 3 participants in a room application). After doing some tests comparing the Stable version (53.0.2785.116) to Canary (55.0.2880.0), I can say that it works much better in Canary, so great work ;).
Regarding improving/fixing the synchronization on the receive side, as I commented before: "if I am not wrong, with the currently implemented protocols there is not a solution for that. To solve the problem we should have the NTP time in each RTP packet as proposed in "Rapid Synchronisation of RTP" [3] (discussed in [4])." In this way:
- The receive side delegates the correctness of the synchronization to the sender side (which is the media source), as is done now.
- The receive side does not suffer from the implicit race conditions caused by receiving the media (RTP) and the sync info (wallclock in RTCP) separately.
- The receive side always receives the info needed for syncing (wallclock) together with the media (RTP) to set the final PTS, which simplifies the synchronization algorithms.
Refs
[3] https://tools.ietf.org/html/rfc6051
[4] https://groups.google.com/forum/#!searchin/discuss-webrtc/from$3Ame%7Csort:relevance/discuss-webrtc/XRd5WEWD5VM/1_4FaUpVBQAJ
,
Oct 4 2016
Yes, I think RFC 6051 could be a nice improvement. Feel free to file a feature request for it on bugs.webrtc.org, but I suggest we close this issue.
,
Oct 4 2016
Yes, I think that we can close this issue. Moreover, I have already opened the feature request: https://bugs.chromium.org/p/webrtc/issues/detail?id=6474 Thanks!!
,
Oct 4 2016
Comment 1 by ajha@chromium.org, Sep 30 2016
Labels: TE-NeedsTriageHelp