[R-C] Congestion Control BOF
Randell Jesup
randell-ietf at jesup.org
Tue Oct 11 04:08:24 CEST 2011
On 10/8/2011 11:29 PM, Justin Uberti wrote:
>
>
> On Sat, Oct 8, 2011 at 10:39 PM, Randell Jesup <randell-ietf at jesup.org
> <mailto:randell-ietf at jesup.org>> wrote:
>
> Well, I'm probably being overly-worried about processing delays (and
> in particular differing delays for audio and video). Let's say
> audio gets sampled at X, and (ignoring other processing steps) takes
> 1ms to encode. It gets to the wire at X + <other steps> + 1. Let's
> say video is also sampled at X, and (ignoring other processing
> steps) takes 10ms to encode. It gets to the wire at X + <other
> steps> + 10. So we've added a 9ms offset to all our A/V sync, and
> in this case it's in the "wrong" direction (people are more
> sensitive to early-audio than early-video). And if "other steps" on
> each side don't balance (and they may not), it could be worse. I
> also worry more that in a browser, with no access to true RT_PRI
> processing, the delays could be significantly variable (we get
> preempted by some other process/thread for 10 or 20ms, etc). Also,
> if the receiver isn't careful it could be tricked into skipping
> frames it should be displaying due to jitter in the packet-to-packet
> timestamps.
>
>
> So perhaps I'm not being overly-worried. I realize that I'm trading
> off accuracy in bandwidth estimation (or if you prefer, reaction
> speed) for ease in getting a consistent framerate and best-possible
> A/V sync.
> In a perfect world we'd record the sampling time and the delta until
> it was submitted to sendto(), so we'd have both. (You could use a
> header extension to do that).
>
>
> There's a lot more going on here. The algorithmic delays for audio and
> video will often be different, the capture delays perhaps wildly so. In
> addition, you won't want to just dump the video directly onto the wire -
> typically it will be leaked out over some interval to avoid bandwidth
> spikes, and the audio will have to maintain some jitter buffer to
> prevent underrun - so I think the encoding processing deltas will be
> nominal compared to the other delays in the pipeline.
Sure - though you have the sampling time of the audio and video, and if
you do your job right on the playback side, they'll be rock-solid synced
(and that can be done even if there's static drift between the audio and
video timestamp clocks). So long as you don't use time-on-wire
timestamps...
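
To put numbers on the earlier example (audio takes 1ms to encode, video takes 10ms, both sampled at the same instant X), here is a minimal sketch of the skew a receiver would see under each timestamping choice. The function names and the value chosen for X are made up for illustration, and "other steps" is assumed equal on both paths:

```python
def wire_time_ms(sample_ms, encode_ms, other_steps_ms=0.0):
    """Time the packet reaches the wire: X + <other steps> + encode time."""
    return sample_ms + other_steps_ms + encode_ms


def av_skew_ms(audio_ts_ms, video_ts_ms):
    """Skew implied by the timestamps, in ms.

    Positive means the audio is stamped earlier than the matching video,
    so a receiver trusting the timestamps would play the audio early.
    """
    return video_ts_ms - audio_ts_ms


X = 1000.0  # arbitrary common sampling instant, in ms (illustrative value)

audio_wire = wire_time_ms(X, encode_ms=1.0)
video_wire = wire_time_ms(X, encode_ms=10.0)

# Sample-time timestamps: both streams carry X, so there is no built-in skew.
skew_sample_time = av_skew_ms(X, X)

# Time-on-wire timestamps: the encode-time difference leaks into A/V sync.
skew_time_on_wire = av_skew_ms(audio_wire, video_wire)

print(skew_sample_time, skew_time_on_wire)
```

With sample-time timestamps the skew is zero; with time-on-wire timestamps it is the 9ms early-audio offset described above, before any scheduling jitter is added on top.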
> I think this also does illustrate why having "time-on-wire" timestamping
> is really useful for increasing estimation accuracy :-)
BTW, I was serious when I said you could improve on this with an RTP
header extension carrying the "time-on-the-wire" delta from the sample
time. However, I don't think we need it here. Since it would be entirely
optional, and receivers that don't understand it would simply ignore it,
it could be added later.
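
As a rough sketch of what such an extension could look like, the following packs a send-time delta into an RFC 5285 one-byte-header extension block and parses it back. Everything specific here is an assumption for illustration: the extension ID (1), the 3-byte field width, the microsecond units, and the function names. No such extension is defined in this thread, and real IDs would be negotiated in signaling.

```python
import struct

ONE_BYTE_PROFILE = 0xBEDE  # RFC 5285 marker for one-byte-header extensions
EXT_ID = 1                 # hypothetical ID; real IDs are negotiated (extmap)
DELTA_BYTES = 3            # assumed width: 24-bit delta in microseconds


def pack_send_delta(delta_us):
    """Build an extension block holding (wire time - sample time) in us."""
    if not 0 <= delta_us < (1 << (8 * DELTA_BYTES)):
        raise ValueError("delta does not fit in 24 bits")
    # Element header byte: 4-bit ID, then 4-bit (data length - 1).
    element = bytes([(EXT_ID << 4) | (DELTA_BYTES - 1)])
    element += delta_us.to_bytes(DELTA_BYTES, "big")
    # 1 + 3 bytes is exactly one 32-bit word, so no padding is needed.
    return struct.pack("!HH", ONE_BYTE_PROFILE, len(element) // 4) + element


def parse_send_delta(ext):
    """Return the delta in us, or None if the element is absent."""
    if len(ext) < 4:
        return None
    profile, words = struct.unpack("!HH", ext[:4])
    if profile != ONE_BYTE_PROFILE:
        return None
    body = ext[4:4 + 4 * words]
    i = 0
    while i < len(body):
        if body[i] == 0:          # padding byte
            i += 1
            continue
        ext_id = body[i] >> 4
        length = (body[i] & 0x0F) + 1
        if ext_id == 15:          # reserved ID: stop parsing
            break
        data = body[i + 1:i + 1 + length]
        if ext_id == EXT_ID and len(data) == DELTA_BYTES:
            return int.from_bytes(data, "big")
        i += 1 + length           # skip elements we don't understand
    return None


ext = pack_send_delta(9500)       # e.g. 9.5 ms between sampling and sendto()
print(ext.hex(), parse_send_delta(ext))
```

The RTP timestamp would still be the sampling time, so playback sync is unaffected. A receiver that wants better bandwidth estimation can add the delta back to recover the time-on-wire, and one that doesn't know the ID just skips the element.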
--
Randell Jesup
randell-ietf at jesup.org