Bufferbloat is a drag, indeed. The way I see it, a delay-sensing CC will <i>have</i> to lose the battle against TCP flows over "bufferbloated" devices, at least in the steady-state case (i.e., persistent TCP flows). Otherwise we will have created an algorithm that may just as well fill up the buffers for itself, even in a situation without competing flows. I'm pessimistic that there is any other way around this than QoS.<div>

<br></div><div>For the transient case that you are describing here, I think that we still cannot do much about the actual filling of the buffers; we are not to blame for it. But as you point out, we can probably do some more about how to respond when the transient cross-traffic hits. I'm not sure how, though.</div>

<div><br></div><div>/Henrik</div><div><br></div><div><br><br><div class="gmail_quote">On Wed, Oct 12, 2011 at 7:12 AM, Randell Jesup <span dir="ltr">&lt;<a href="mailto:randell-ietf@jesup.org">randell-ietf@jesup.org</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">Jim:  We're moving this discussion to the newly-created mailing sub-list -<br>

   <a href="mailto:Rtp-congestion@alvestrand.no" target="_blank">Rtp-congestion@alvestrand.no</a><br>

   <a href="http://www.alvestrand.no/mailman/listinfo/rtp-congestion" target="_blank">http://www.alvestrand.no/mailman/listinfo/rtp-congestion</a><br>

<br>

If you'd like to continue this discussion (and I'd love you to do so), please join the mailing list.  (Patrick, you may want to join too and read the very small backlog of messages (perhaps 10 so far)).<br>

<br>

On 10/11/2011 4:17 PM, Jim Gettys wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

On 10/11/2011 03:11 AM, Henrik Lundin wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

<br>

I do not agree with you here. When an over-use is detected, we propose<br>

to measure the /actual/ throughput (over the last 1 second), and set<br>

the target bitrate to beta times this throughput. Since the measured<br>

throughput is a rate that evidently was feasible (at least during that<br>

1 second), any beta &lt; 1 should ensure that the buffers get drained,<br>

but of course at different rates depending on the magnitude of beta.<br>

</blockquote>

Take a look at the data from the ICSI netalyzr: you'll find scatter<br>

plots at:<br>

<br>

<a href="http://gettys.wordpress.com/2010/12/06/whose-house-is-of-glasse-must-not-throw-stones-at-another/" target="_blank">http://gettys.wordpress.com/2010/12/06/whose-house-is-of-glasse-must-not-throw-stones-at-another/</a><br>


<br>

Note the different coloured lines.  They represent the amount of<br>

buffering measured in the broadband edge in *seconds*.  Also note that<br>

for various reasons, the netalyzr data is actually likely<br>

underestimating the problem.<br>

</blockquote>

<br>

Understood.  Though that's not entirely relevant to this problem, since the congestion-control mechanisms we're using/designing here are primarily buffer-sensing algorithms that attempt to keep the buffers in a drained state.  If there's no competing traffic at the bottleneck, they're likely to do so fairly well, though more simulation and real-world tests are needed.  I'll note that several organizations (Google/GIPS, Radvision and my old company WorldGate) have found that these types of congestion-control algorithms are quite effective in practice.<br>


<br>

However, it isn't irrelevant to the problem either:<br>

<br>

This class of congestion-control algorithms is subject to "losing" if faced with a sustained high-bandwidth TCP flow like some of your tests, since they back off when TCP isn't seeing any restriction (loss) yet. Eventually TCP will fill the buffers.<br>


<br>

More importantly, perhaps, bufferbloat combined with the high 'burst' nature of browser network systems (and websites) optimizing for page-load time means you can get a burst of data at a congestion point that isn't normally the bottleneck.<br>


<br>

The basic scenario goes like this:<br>

<br>

1. established UDP flow near bottleneck limit at far-end upstream<br>

2. near-end browser (or browser on another machine in the same house)<br>

   initiates a page-load<br>

3. near-end browser opens "many" tcp connections to the site and<br>

   other sites that serve pieces (ads, images, etc) of the page.<br>

4. Rush of response data saturates the downstream link to the<br>

   near-end, which was not previously the bottleneck.  Due to<br>

   bufferbloat, this can cause a significant amount of data to be<br>

   temporarily buffered, delaying competing UDP data significantly<br>

   (tenths of a second, perhaps &gt;1 second in some cases).  This is hard<br>

   to model accurately; real-world tests are important.<br>

5. Congestion-control algorithm notices transition to buffer-<br>

   induced delay, and tells the far side to back off.  The latency<br>

   of this decision may help us avoid over-reacting, as we have to<br>

   see increasing delay which takes a number of packets (at least<br>

   1/10 second, and easily could be more).  Also, the result of<br>

   the above "inrush"/pageload-induced latency may not trigger the<br>

   congestion mechanisms we discuss here, as we might see a BIG jump<br>

   in delay followed by steady delay or a ramp down (since if the<br>

   buffer has suddenly jumped from drained to full, all it can do is<br>

   be stable or drain).<br>
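<br>
The point in step 5 about a sudden jump can be illustrated with a minimal sketch. It assumes a hypothetical detector that signals over-use only when delay rises on several consecutive packets; the function name, window size and threshold are illustrative assumptions, not taken from any algorithm discussed in this thread.<br>

```python
def detect_overuse(delays_ms, window=5, min_rise_ms=1.0):
    """Return the packet indices at which delay has risen by at least
    min_rise_ms on each of the last `window` consecutive packets.

    Hypothetical trend detector, for illustration only.
    """
    triggers = []
    rising = 0  # current run length of consecutive delay increases
    for i in range(1, len(delays_ms)):
        if delays_ms[i] - delays_ms[i - 1] >= min_rise_ms:
            rising += 1
        else:
            rising = 0
        if rising >= window:
            triggers.append(i)
    return triggers


# Gradual ramp: delay grows by 5 ms per packet as the buffer fills.
ramp = [20 + 5 * i for i in range(10)]

# Sudden inrush: delay jumps from 20 ms to 500 ms in one step, then stays flat.
jump = [20] * 5 + [500] * 10

print(detect_overuse(ramp))  # [5, 6, 7, 8, 9]
print(detect_overuse(jump))  # []
```

Under these assumptions the ramp triggers the detector, while the single large jump followed by steady delay never does, even though the absolute delay is far higher. A detector that also looks at absolute delay relative to a baseline would behave differently.<br>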

<br>

Note that Google's current algorithm (which you comment on above) uses recent history to choose the reduction, so it's hard to say what the result would be in this case. If it invokes the backoff at the start of the pageload, then the bandwidth received recently is the current bandwidth, so the new bandwidth is current minus small_delta.  If it happens after data has queued behind the burst of TCP traffic, then when the backoff is generated we'll have gotten almost no data through "recently" and we may back off all the way to min bandwidth. That would be an over-reaction, and whether it happens depends on the algorithm's time constant and on how fast the burst can fill the downstream buffers.<br>
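<br>
The two cases can be made concrete with a rough sketch of the "beta times measured throughput" rule quoted earlier. The 1-second window comes from the description above; the value of beta, the minimum rate, and the byte counts are assumptions chosen only for illustration.<br>

```python
def backoff_target(bytes_received, window_s=1.0, beta=0.9, min_rate_bps=30_000):
    """Target bitrate after an over-use signal: beta times the throughput
    measured over the last window, floored at a minimum rate.

    beta and min_rate_bps are assumed values, not taken from the thread.
    """
    measured_bps = 8 * bytes_received / window_s
    return max(min_rate_bps, beta * measured_bps)


# Case 1: backoff fires at the start of the pageload, while roughly
# 1 Mbit/s is still arriving (125,000 bytes in the last second).
early = backoff_target(125_000)

# Case 2: backoff fires after our packets have queued behind the TCP
# burst, so only 2,000 bytes arrived in the last second.
late = backoff_target(2_000)

print(early)  # about 900 kbit/s: current rate minus a small delta
print(late)   # clamped to the 30 kbit/s floor
```

With these assumed numbers the same rule yields a mild reduction in the first case and a drop to the minimum in the second, purely because of when the over-use signal arrives relative to the burst.<br>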


<br>

Now, in practice this is likely messier: the pageload doesn't generate a huge sudden block of data that fills the buffers, so there's some upward slope to delay as you head to saturation of the downstream buffers.  There's very little you can do about this, and backing off a lot may actually help: the less data you put onto the end of this overloaded queue (assuming the pageload flow has ended or soon will), the sooner the queue will drain and low latency will be re-established.<br>


<br>

Does the ICSI data call out *where* the buffer-bloat occurs?<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Then realise that when congested, nothing you do can react faster than<br>

the RTT including the buffering.<br>

<br>

So if your congestion is in the broadband edge (where it often/usually<br>

is), you are in a world of hurt, and you can't use any algorithm that<br>

has fixed time constants, even one as long as 1 second.<br>

<br>

Wish this weren't so, but it is.<br>

<br>

Bufferbloat is a disaster...<br>

</blockquote>
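<br>
As a back-of-the-envelope illustration of why the effective RTT includes the buffering: the time a full FIFO buffer adds is its size divided by the link rate. The buffer size, link rate and base RTT below are made-up example values, not figures from the netalyzr data.<br>

```python
def queue_delay_s(buffer_bytes, link_rate_bps):
    """Time to drain a full FIFO buffer at the given link rate."""
    return 8 * buffer_bytes / link_rate_bps


def min_reaction_time_s(base_rtt_s, buffer_bytes, link_rate_bps):
    """Lower bound on feedback-loop latency when the buffer is full:
    the unloaded RTT plus the queueing delay at the bottleneck."""
    return base_rtt_s + queue_delay_s(buffer_bytes, link_rate_bps)


# Example (assumed values): a 256 KiB buffer in front of a 1 Mbit/s uplink.
delay = queue_delay_s(256 * 1024, 1_000_000)
print(delay)  # 2.097152 seconds of queueing delay

# With a 50 ms unloaded RTT, no control loop can react faster than this.
print(min_reaction_time_s(0.050, 256 * 1024, 1_000_000))
```

In this example the queueing delay alone exceeds a 1-second measurement window, which is the situation the quoted text warns about for algorithms with fixed time constants.<br>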

<br>

Given the loss-based algorithms for TCP/etc, yes.  We have to figure out how to (as reliably *as possible*) deliver low-latency data in this environment.<br><span class="HOEnZb"><font color="#888888">

<br>

<br>

-- <br>

Randell Jesup<br>

<a href="mailto:randell-ietf@jesup.org" target="_blank">randell-ietf@jesup.org</a><br>

</font></span></blockquote></div><br></div>