[RTW] Realtime communication over HTTP (Re: Baseline in or out of scope)

Sun Feb 27 00:54:35 CET 2011

> When we encounter congestion, audio-over-TCP will experience this as
> jitter, while audio-over-UDP will experience this as packet loss, so the
> experience may be different.

"Jitter" that exceeds your playout deadline is equivalent to packet loss
in a real-time system. If you don't have decoded audio ready to go when
it's time to play it out,
there's nothing to be done but invoke your packet loss concealment 
algorithms, regardless of whether the packet is simply late, or not 
coming at all. UDP encounters this just as much as TCP does, and while 
you can attempt to mitigate the effect with a jitter buffer, there are 
limits to what it can do. You can also snapshot internal codec state and 
go back and re-decode if a packet does arrive late, but this is 
expensive and complicated, and the only benefit is a slightly faster 
recovery time: it's too late to fix the initial loss.
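
To make the "late is the same as lost" point concrete, here is a minimal
sketch of the playout decision. It is an illustration only, not code from
any real stack: the 20 ms frame size, the Packet type, and the
playout_decisions() helper are all assumptions made up for this example.
A frame is decoded only if its packet arrived by the frame's playout
deadline; otherwise concealment runs, whether the packet is late or
missing.

```python
from dataclasses import dataclass

FRAME_MS = 20  # assumed frame duration; real codecs vary


@dataclass
class Packet:
    seq: int           # frame sequence number
    arrival_ms: float  # time the packet reached the receiver


def playout_decisions(packets, num_frames, buffer_ms):
    """For each frame, return "decode" if its packet arrived by the
    playout deadline, else "conceal" (late and lost look identical)."""
    arrival = {p.seq: p.arrival_ms for p in packets}
    decisions = []
    for seq in range(num_frames):
        # Frame `seq` is played out buffer_ms after the stream start,
        # plus one frame duration per preceding frame.
        deadline = buffer_ms + seq * FRAME_MS
        t = arrival.get(seq)
        if t is not None and t <= deadline:
            decisions.append("decode")
        else:
            decisions.append("conceal")
    return decisions


# Packet 2 is badly delayed, packet 3 never arrives.
packets = [Packet(0, 5), Packet(1, 25), Packet(2, 130), Packet(4, 85)]
print(playout_decisions(packets, 5, buffer_ms=40))
print(playout_decisions(packets, 5, buffer_ms=100))
```

With a 40 ms buffer, frames 2 and 3 are both concealed. With a 100 ms
buffer the late packet 2 makes its deadline, but only at the cost of
60 ms more latency on every frame, and the lost packet 3 is still
concealed.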

The practical difference between TCP and UDP is that, when you encounter 
congestion, TCP wastes time and bandwidth continually trying to 
re-transmit packets that you no longer care about (making the congestion 
worse), while UDP does not. There are many other differences, but this 
is the one that can't be engineered around. Thus, HTTP will always be 
suboptimal. It's better than nothing if you're behind a firewall that 
doesn't allow UDP, but that doesn't mean that everyone should have to 
use a suboptimal solution just because 5-10% of users do.
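
A toy model of the time cost, under loudly stated assumptions: a fixed
50 ms one-way delay, a fixed 200 ms retransmission timeout, exactly one
retransmission per lost packet, and no congestion control or fast
retransmit. None of these numbers come from a real TCP implementation,
and the function names are hypothetical. It shows only that in-order
reliable delivery holds back every packet behind the one being
retransmitted, while datagram delivery loses just that one packet.

```python
def datagram_delivery(n, lost, delay_ms=50, frame_ms=20):
    """Per-packet delivery time; lost packets are never delivered."""
    return [None if i in lost else i * frame_ms + delay_ms
            for i in range(n)]


def in_order_delivery(n, lost, delay_ms=50, frame_ms=20, rto_ms=200):
    """Per-packet delivery time when the receiver must release data in
    order: a retransmitted packet delays everything queued behind it."""
    times = []
    ready = 0.0
    for i in range(n):
        arrive = i * frame_ms + delay_ms
        if i in lost:
            arrive += rto_ms  # assumed: one retransmit after a fixed RTO
        ready = max(ready, arrive)  # cannot be released before earlier data
        times.append(ready)
    return times


def missed_deadlines(times, buffer_ms, delay_ms=50, frame_ms=20):
    """Count frames not available by their playout deadline."""
    missed = 0
    for i, t in enumerate(times):
        deadline = i * frame_ms + delay_ms + buffer_ms
        if t is None or t > deadline:
            missed += 1
    return missed


lost = {5}
print(missed_deadlines(datagram_delivery(20, lost), buffer_ms=60))
print(missed_deadlines(in_order_delivery(20, lost), buffer_ms=60))
```

In this model a single lost packet costs the datagram receiver one
frame, while the in-order receiver misses seven (frames 5 through 11),
because the retransmitted packet arrives long after anyone could use it
and blocks the frames behind it until it does.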