[R-C] LEDBAT vs RTCWeb

Jim Gettys jg at freedesktop.org
Wed Apr 11 14:31:51 CEST 2012


On 04/11/2012 07:31 AM, Harald Alvestrand wrote:
> On 04/11/2012 12:43 PM, Jim Gettys wrote:
>> On 04/11/2012 02:16 AM, Harald Alvestrand wrote:
>>> On 04/10/2012 09:14 PM, Jim Gettys wrote:
>>>> On 04/10/2012 02:58 PM, Randell Jesup wrote:
>>>>> 100ms is just bad, bad, bad for VoIP on the same links.  The only case
>>>>> where I'd say it's ok is where it knows it's competing with
>>>>> significant TCP flows.  If it reverted to 0 queuing delay or close
>>>>> when the channel is not saturated by TCP, then we might be ok (not
>>>>> sure).  But I don't think it does that.
>>>>>
>>>> You aren't going to see delay under saturating load under 100ms unless
>>>> the bottleneck link is running a working AQM; that's the property of
>>>> tail drop, and the "rule of thumb" for sizing buffers has been of order
>>>> 100ms.  This is to ensure maximum bandwidth over continental paths of a
>>>> single TCP flow.
>>>>
>>>> Unfortunately, the bloat in the broadband edge is often/usually much,
>>>> much higher than this, being best measured in seconds :-(.
>>>> http://gettys.files.wordpress.com/2010/12/uplink_buffer_all.png
>>>> http://gettys.files.wordpress.com/2010/12/downlink_buffer_all.png
>>>> (thanks to the Netalyzr folks).
>>> the encouraging thing in those (depressing) charts is that the fiber
>>> stuff (green subcloud) seems to be less broken than the DSL. So the
>>> future may actually be less depressing than the past.
>> Get out your anti-depressants.  The ICSI data *understates* the severity
>> of the problem.
>>
>> The ICSI data tops out at 20Mbps due to a limitation in their server
>> systems, so we don't really know how good/bad fiber is (since most fiber
>> tiers of service start around 20Mbps and so won't show up where it
>> should on that plot).
>>
>> Secondly, the home router situation is even worse than broadband.  As
>> soon as the bandwidth is higher in the broadband hop than the wireless
>> hop (and 802.11g tops out at about 20-22Mbps), the bottleneck shifts to
>> the wireless hop, and you have the problem on either side of the
>> wireless hop (our OS's and home routers).  This is why I spend my time
>> on home routers and Linux.  Home routers/our operating systems have yet
>> more bloat than broadband, typically.
>>
>> We have a disaster on our hands.
>>
>> Sorry to be the bearer of such horrifying news.
> I know - well enough to want to smile at any small hint of a silver
> lining :-)
>
I know you (and I) would like some smiles...

Unfortunately, given the limitations of ICSI's buffering test, we don't
know much about faster devices.  This means much/most of fiber and the
higher tiers of cable are not testable by that test, and you won't see
valid results.  (You can go look at the Netalyzr papers for better
understanding and interpretation of the scatter plots.)  They are
working to try to raise the bandwidth limit of that test.

But I'm *very* pessimistic, for the reasons I now explain.

There seem to be two common cases among engineers building network devices:
    1) no conscious thought about buffering: use all the RAM you've got,
or defaults inherited from other technologies (e.g. gigabit Ethernet
defaults applied to wireless, for a concrete example).
    2) the usual 100ms rule of thumb *at the highest bandwidth the
device can run* for a single TCP flow so that it benchmarks well.
Sometimes this buffering seems to get stretched to hide firmware bugs
too...  We *know* that hiding firmware bugs is occurring in practice; it
isn't a hypothetical.

So, say you have a cable modem or fiber box that is designed to be able
to run at 100Mbps, and you are buying 20Mbps.  You end up at least five
times over-buffered out of the starting gate (at 1/2 second, presuming
they designed to the usual 100ms metric).

The first generation DOCSIS 3 modems, for example, are designed to run
up to 150Mbps.  I've seen corresponding amounts of buffering present;
God forbid you only buy 10Mbps and use such a modem (until the DOCSIS
amendment is fully deployed, you'll get today's full glory).  Then the
buffering turns into a couple seconds.  I think it likely (actually as
certain as I can be without having run tests myself) that fiber has the
same problem: else they would flunk their bandwidth tests, and not get
bought by the ISP.  So the fact that fiber doesn't show as much
buffering on the ICSI data is an artefact, almost certainly.
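
As a rough illustration of the arithmetic in the two examples above, here
is a minimal sketch.  The function names are made up for this example, and
it assumes the buffer is sized in bytes to exactly 100ms at the device's
design rate, which real devices may or may not do.

```python
def buffer_bytes(design_rate_bps: float, target_delay_s: float = 0.100) -> float:
    """Bytes of buffer needed to hold target_delay_s of traffic at the design rate."""
    return design_rate_bps * target_delay_s / 8


def drain_time_s(buf_bytes: float, service_rate_bps: float) -> float:
    """Seconds needed to drain a full buffer at the rate the customer actually buys."""
    return buf_bytes * 8 / service_rate_bps


# Box designed for 100Mbps, customer buys 20Mbps (assumed 100ms sizing rule).
delay_100_on_20 = drain_time_s(buffer_bytes(100e6), 20e6)

# First generation DOCSIS 3 style modem designed for 150Mbps, customer buys 10Mbps.
delay_150_on_10 = drain_time_s(buffer_bytes(150e6), 10e6)

print(f"100Mbps design at 20Mbps: {delay_100_on_20:.2f} s of buffering")
print(f"150Mbps design at 10Mbps: {delay_150_on_10:.2f} s of buffering")
```

Under those assumptions the first case comes out at about half a second
and the second at about a second and a half, i.e. the over-buffering
factor is simply the ratio of design rate to purchased rate.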

And as soon as this bandwidth exceeds your share of your 802.11 link,
you get all the problems of both home routers and our operating systems,
rather than the broadband link.  There, there seem to be hundreds (or
even more than 1000) of packets of buffering present, depending on the
vintage of your OS and home router.  Since the buffering here is in
packets rather than bytes, you get points on the ICSI data that are not
the "power of two" structure you see prominently in that data.
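
To see why packet-counted buffers behave differently from byte-counted
ones, here is a small sketch.  All the numbers are assumptions chosen for
illustration (1000 queued packets, a roughly 20Mbps wireless hop, and
1500-byte versus 64-byte packets); they are not measurements.

```python
def queue_delay_s(packets: int, packet_bytes: int, link_rate_bps: float) -> float:
    """Seconds needed to drain a full queue of equal-sized packets at the link rate."""
    return packets * packet_bytes * 8 / link_rate_bps


# Assumed queue depth of 1000 packets on an assumed ~20Mbps 802.11g hop.
full_size_delay = queue_delay_s(1000, 1500, 20e6)   # full-size data packets
small_pkt_delay = queue_delay_s(1000, 64, 20e6)     # small packets, e.g. ACK-sized

# Same queue when your share of the wireless link is only 5Mbps (assumed).
slow_share_delay = queue_delay_s(1000, 1500, 5e6)

print(f"1500-byte packets at 20Mbps: {full_size_delay:.3f} s")
print(f"  64-byte packets at 20Mbps: {small_pkt_delay:.4f} s")
print(f"1500-byte packets at  5Mbps: {slow_share_delay:.3f} s")
```

Because the limit is a packet count, the number of bytes queued (and so
the delay) depends on the packet sizes in flight, so there is no fixed
byte size for the buffer to show up as.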

And I won't even go into what may be going on elsewhere in the internet
(e.g. the broadband head end CMTS, DSLAM or fiber box) beyond saying
that they likely are not running AQM....

It's a mess.

Sorry to be so depressing on a fine morning.  I see no silver lining.
                            - Jim


More information about the Rtp-congestion mailing list