Network Working Group | F. L. Templin, Ed. |
Internet-Draft | Boeing Research & Technology |
Intended status: Informational | October 18, 2012 |
Expires: April 19, 2013 |
Operational Issues with Tunnel Maximum Transmission Unit (MTU)
draft-generic-v6ops-tunmtu-10.txt
The Maximum Transmission Unit (MTU) for popular IP-within-IP tunnels is currently recommended to be set to 1500 (or less) minus the length of the encapsulation headers when static MTU determination is used. This requires the tunnel ingress to either fragment any IP packet larger than the MTU or drop the packet and return an ICMP Packet Too Big (PTB) message. Concerns for operational issues with Path MTU Discovery (PMTUD) point to the possibility of MTU-related black holes when a packet is dropped due to an MTU restriction. The current "Internet cell size" is effectively 1500 bytes (i.e., the minimum MTU configured by the vast majority of links in the Internet) and should therefore also be the minimum MTU assigned to tunnels, but this has proven to be problematic in common operational practice. This document therefore discusses operational issues for IP-within-IP tunnel MTUs.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http:/⁠/⁠datatracker.ietf.org/⁠drafts/⁠current/⁠.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 19, 2013.
Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http:/⁠/⁠trustee.ietf.org/⁠license-⁠info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
The Maximum Transmission Unit (MTU) for popular IP-within-IP tunnels is currently recommended to be set to 1500 (or less) minus the length of the encapsulation headers when static MTU determination is used. This requires the tunnel ingress to either fragment any IP packet larger than the MTU or drop the packet and return an ICMP Packet Too Big (PTB) message [RFC0791][RFC2460]. Concerns for operational issues with Path MTU Discovery (PMTUD) [RFC1191][RFC1981] point to the possibility of MTU-related black holes when a packet is dropped due to an MTU restriction. The current "Internet cell size" is effectively 1500 bytes (i.e., the minimum MTU configured by the vast majority of links in the Internet) and should therefore also be the minimum MTU assigned to tunnels, but this has proven to be problematic in common operational practice. This document therefore discusses operational issues for IP-within-IP tunnel MTUs.
Pushing the tunnel MTU to 1500 bytes or beyond is met with the challenge that the addition of encapsulation headers would cause an inner IP packet that is 1500 bytes (or slightly smaller) to appear as a slightly larger than 1500 byte outer IP packet on the wire, where it may be too large to traverse the path in one piece. When an IP tunnel configures an MTU smaller than 1500 bytes, packets that are small enough to traverse earlier links in the path toward the final destination may be dropped at the tunnel ingress which then returns a PTB message to the original source. However, operational experience has shown that the PTB messages can be lost in the network [RFC2923], in which case the source does not receive notification of the loss.
It is therefore highly desirable that the tunnel configure an MTU of at least 1500 bytes even though encapsulation would cause some tunneled packets to be slightly larger than 1500 bytes. In that case, the tunnel ingress would need to make special adaptations to deliver packets that are no larger than 1500 bytes yet larger than can be accommodated in a single piece.
One possibility is to use IP fragmentation of the inner IP layer protocol before encapsulation so that inner packet fragments can be delivered via the tunnel without loss due to a size restriction and then reassembled at the final destination. This option removes the burden from the tunnel endpoints, but is only available for IPv4 packets (since IPv6 deprecates router fragmentation [RFC2460]), and is further only available when the IPv4 header sets the Don't Fragment (DF) bit in the IPv4 header to 0. For IPv6, the tunnel endpoint can invoke inner packet fragmentation only if it has a way of requesting the original host to perform the fragmentation itself (e.g., see: [I-D.generic-6man-tunfrag]).
A second possibility is to use IP fragmentation of the outer IP layer protocol following encapsulation so that the outer packet fragments can be delivered via the tunnel without loss due to a size restriction and then reassembled at the tunnel egress. This option is available for tunnels over both IPv4 and IPv6, and indeed the tunnel ingress is permitted to use IPv6 fragmentation since it is acting as a "host" (i.e., and not a router) for the encapsulated packets it produces. While IPv6 fragmentation is assumed to be "safe at all speeds", IPv4 fragmentation is dangerous at high data rates due to the possibility of Identification field wrapping while reassemblies are still active [RFC4963]. Also, if outer IP fragmentation were used the tunnel ingress has no assurance that the egress can reassemble packets larger than 1500 bytes, since the Minimum Reassembly Unit (MRU) is 1500 bytes for IPv6 [RFC2460] and only 576 bytes for IPv4 [RFC1122].
A third possibility for accommodating inner packets that are slightly too large is the use of "tunnel fragmentation" based on a mid-layer encapsulation that is inserted between the inner and outer IP headers. Tunnel fragmentation requires separate packet Identification and segmentation control bits in the mid-layer encapsulation that are distinct from those that appear in the inner and/or outer headers. As for outer fragmentation, the tunnel egress is responsible for reassembly. Tunnel fragmentation can be particularly useful for tunnels over IPv4, since the mid-layer encapsulation can include an extended Identification field that avoids the identification wrapping issue discussed above. However, tunnel fragmentation is not used in common widely-deployed tunneling mechanisms at the time of this writing. An example of tunnel fragmentation appears in SEAL [I-D.templin-intarea-seal].
Following any inner, tunnel or outer fragmentation, the ingress must allow the encapsulated packets or fragments to be further fragmented by a router on the path that configures a link with a too-small MTU. These fragments would be reassembled by the tunnel egress the same as if the fragmentation occurred within the tunnel ingress. This final form of fragmentation is undesirable and should be avoided if at all possible through the application of fragmentation at the tunnel ingress. However, common widely-deployed tunneling mechanisms at the time of this writing make no such provisions.
In addition to failure to accommodate packets up to 1500 bytes in length, current tunneling solutions typically do not make provisions for delivering packets that are larger than 1500 bytes. As long as they are no larger than the underlying link used for tunneling, the tunnel ingress should admit such "jumbo" packets into the tunnel and allow them to either be delivered to the egress in one piece or be dropped with the possibility of a PTB message being returned. The original host will then be able to determine the correct packet sizes whether or not PTB messages are delivered if it is using [RFC4821]. However, this approach is not used in common widely-deployed tunneling mechanisms at the time of this writing.
The operational issues discussed in this document apply to existing IPv6-within-IPv4 transition mechanisms, including configured tunnels [RFC4213], 6to4 [RFC3056], Teredo [RFC4380], ISATAP [RFC5214], DSMIP [RFC5555], 6rd [RFC5969], etc.
The issues further apply to existing IP-within-IP tunneling mechanisms of all varieties, including GRE [RFC1701], IPv4-in-IPv4 [RFC2003], IPv6-in-IPv6 [RFC2473], IPv4-in-IPv6 [RFC6333], IPsec [RFC4301], etc.
There are no IANA considerations for this document.
The security considerations for the various tunneling mechanisms apply also to this document.
This method was inspired through discussion on the IETF v6ops and NANOG mailing lists in the May/June 2012 timeframe.
[RFC0791] | Postel, J., "Internet Protocol", STD 5, RFC 791, September 1981. |
[RFC2460] | Deering, S.E. and R.M. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998. |