Draft: draft-arberg-pppoe-mtu-gt1492-03
Reviewer: Harald Alvestrand [harald@alvestrand.no]
Review Date: Wednesday 4/19/2006 9:33 AM
IETF LC Date: 3/02/2006

Summary: Problem description is wrong. Solution is OK.

Unlike some of the commentary on the IETF list, I see no general problem with the solution, as long as we have decided that violating an IEEE standard is OK. Also, I see no big problem with publishing an Info document showing a somewhat-distasteful fix to a somewhat-distasteful problem.

But I really dislike the fact that the problem description is wrong.

This is the problem description, from section 2:

   <----- IPoE -----> <--------- PPPoE session --------->

                                     +-----+            +-----+
   +--+              +---+           |     |            |     |
   |PC|--------------| RG|-----------|DSLAM|------------| BRAS|
   +--+              +---+           |     |            |     |
                                     +-----+            +-----+

   Fig. 2: Next generation broadband network designs with PPPoE.

   In the network design shown in figure 2, fragmentation becomes a
   major problem since the subscriber session is a combination of IPoE
   and PPPoE. The IPoE typically use a MTU of 1500 octets. However,
   when the Residential Gateway and the BRAS are the PPPoE session
   endpoints, and therefore negotiate a MTU/MRU of 1492 octets
   resulting in a large number of fragmented packets in the network.

But in this case, there is no big problem. Both ends of the session shown in the picture know what's going on, and can adjust expectations accordingly.

The REAL problem occurs in this config:

<----- IPoE -----> <------ PPPoE session --------------><-----Internet---->

                               +-----+         +-----+   +----+   +-------+
+--+              +---+        |     |         |     |   |    |   |       |
|PC|--------------| RG|--------|DSLAM|---------| BRAS|---| FW |---| Server|
+--+              +---+        |     |         |     |   |    |   |       |
                               +-----+         +-----+   +----+   +-------+

"FW" is a firewall. "Server" is a server that "PC" wants to access.

"RG" and "BRAS" set up the PPPoE session, and limit the MTU to 1492 octets. They see no problem. But - "PC", "FW" and "Server" have no way to find out that there is a PPPoE session on the path at all.
In a normal Web experience, "PC" will send out a small query (Web retrieval), which will get to "Server". "Server" will send back a reply in a 1500-byte packet (with DF set), which hits BRAS. BRAS sends back "packet too big", "Server" adjusts its MTU for that session, and everyone's happy. One extra RTT per session; no big deal.

The failure mode is when "FW" is configured (stupidly) to drop incoming ICMP "packet too big" messages. "Server" never gets an ack to its packet, retransmits at the same size, never gets an ack to that - and remains stuck in that mode forever.

I've tried to live in that mode; at the time (2003), it seemed that about 20% of the Web was configured this way.

I believe the document would make a much clearer case for applying its fix if it explained that this is the scenario for which the PPPoE user experience is mind-bogglingly bad.

Apart from that - I say go for it.
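
For readers who want the two outcomes side by side, here is a toy model of the exchange between "Server" and "BRAS" described above. It is a sketch only, not anything taken from the draft: the function name `send_with_pmtud`, the `Result` record and the `max_attempts` cap are hypothetical. The cap is needed because a real server stuck in the failure mode would keep retransmitting indefinitely. The 1492 figure is the usual 1500-octet Ethernet payload minus 8 octets of PPPoE and PPP overhead.

```python
from dataclasses import dataclass

ETHERNET_MTU = 1500  # size of the first reply packet "Server" sends
PPPOE_MTU = 1492     # 1500 minus 6 octets PPPoE header and 2 octets PPP protocol ID


@dataclass
class Result:
    delivered: bool   # did the reply ever reach "PC"?
    final_size: int   # packet size "Server" was using at the end
    attempts: int     # number of transmissions made


def send_with_pmtud(firewall_drops_icmp: bool, max_attempts: int = 5) -> Result:
    """Toy model: "Server" sends one full-size, DF-marked packet towards "PC".

    max_attempts is an assumption of this sketch; it only stops the loop.
    """
    size = ETHERNET_MTU
    for attempt in range(1, max_attempts + 1):
        if size <= PPPOE_MTU:
            # The packet fits into the PPPoE session and is delivered.
            return Result(True, size, attempt)
        # BRAS can neither forward nor fragment the packet (DF is set),
        # so it answers with ICMP "packet too big".
        if not firewall_drops_icmp:
            # "Server" receives the ICMP and lowers its packet size.
            size = PPPOE_MTU
        # Otherwise "FW" discards the ICMP, and "Server" retransmits
        # at the same size on the next pass through the loop.
    return Result(False, size, max_attempts)


if __name__ == "__main__":
    print("ICMP passes FW: ", send_with_pmtud(firewall_drops_icmp=False))
    print("ICMP dropped:   ", send_with_pmtud(firewall_drops_icmp=True))
```

With the ICMP message passing through, the model delivers on the second transmission at 1492 octets, which is the "one extra RTT" case. With the firewall dropping it, every attempt goes out at 1500 octets and none arrives.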