TSVWG | R. Penno |
Internet-Draft | Cisco |
Intended status: Best Current Practice | S. Perreault |
Expires: August 22, 2015 | Viagenie |
S. Kamiset | |
Insieme Networks | |
M. Boucadair | |
France Telecom | |
K. Naito | |
NTT | |
February 18, 2015 |
Network Address Translation (NAT) Behavioral Requirements Updates
draft-ietf-tsvwg-behave-requirements-update-01
This document clarifies and updates several requirements of RFC4787, RFC5382 and RFC5508 based on operational and development experience. The focus of this document is NAPT44.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 22, 2015.
Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
[RFC4787], [RFC5382] and [RFC5508] greatly advanced NAT interoperability and conformance. However, as NAT deployment became widespread and the technology evolved, development and operational experience showed that some areas of the original documents need further clarification or updates. This document provides such clarifications and updates.
This document focuses solely on NAPT44 and its goal is to clarify, fill gaps or update requirements of [RFC4787], [RFC5382] and [RFC5508].
The creation of completely new requirements not associated with the documents cited above is out of the scope of this document. New requirements would be better served elsewhere and, if they are CGN specific, in an update to [RFC6888].
The reader should be familiar with the terms defined in [RFC2663], [RFC4787], [RFC5382], and [RFC5508].
[RFC5382] specifies TCP timers associated with various connection states but does not specify the TCP state machine a NAPT44 should use as a basis to apply such timers. The TCP state machine depicted in Figure 1, adapted from [RFC6146], provides guidance on how TCP session tracking could be implemented. It is non-normative.
                +-----------------------------+
                |                             |
                V                             |
             +------+   CV4                   |
             |CLOSED|-----SYN------+          |
             +------+              |          |
                ^                  |          |
                |TCP_TRANS T.O.    |          |
                |                  V          |
            +-------+          +-------+      |
            | TRANS |          |V4 INIT|      |
            +-------+          +-------+      |
              |    ^               |          |
        data pkt   |               |          |
              | V4 or V4 RST       |          |
              | TCP_EST T.O.       |          |
              V    |           SV4 SYN        |
         +--------------+          |          |
         | ESTABLISHED  |<---------+          |
         +--------------+                     |
          |           |                       |
      CV4 FIN     SV4 FIN                     |
          |           |                       |
          V           V                       |
      +---------+   +----------+              |
      |CV4 FIN  |   | SV4 FIN  |              |
      |  RCV    |   |   RCV    |              |
      +---------+   +----------+              |
          |           |                       |
      SV4 FIN     CV4 FIN                 TCP_TRANS
          |           |                     T.O.
          V           V                       |
      +----------------------+                |
      | CV4 FIN + SV4 FIN RCV|----------------+
      +----------------------+
Figure 1
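The labelled edges of Figure 1 can be written down as a transition table. The following is a minimal, non-normative sketch in Python. The names `State`, `Event`, `TRANSITIONS` and `next_state` are hypothetical, reading CV4 and SV4 as the client and server sides is an assumption, and so is the choice to ignore events for which the figure shows no edge.

```python
from enum import Enum, auto


class State(Enum):
    CLOSED = auto()
    V4_INIT = auto()
    ESTABLISHED = auto()
    TRANS = auto()
    CV4_FIN_RCV = auto()
    SV4_FIN_RCV = auto()
    CV4_SV4_FIN_RCV = auto()


class Event(Enum):
    CV4_SYN = auto()            # SYN from the CV4 side (assumed: client)
    SV4_SYN = auto()            # SYN from the SV4 side (assumed: server)
    CV4_FIN = auto()
    SV4_FIN = auto()
    RST = auto()                # "V4 or V4 RST" in the figure
    DATA_PKT = auto()
    TCP_EST_TIMEOUT = auto()    # "TCP_EST T.O."
    TCP_TRANS_TIMEOUT = auto()  # "TCP_TRANS T.O."


# One entry per labelled edge in Figure 1.
TRANSITIONS = {
    (State.CLOSED, Event.CV4_SYN): State.V4_INIT,
    (State.V4_INIT, Event.SV4_SYN): State.ESTABLISHED,
    (State.ESTABLISHED, Event.RST): State.TRANS,
    (State.ESTABLISHED, Event.TCP_EST_TIMEOUT): State.TRANS,
    (State.TRANS, Event.DATA_PKT): State.ESTABLISHED,
    (State.TRANS, Event.TCP_TRANS_TIMEOUT): State.CLOSED,
    (State.ESTABLISHED, Event.CV4_FIN): State.CV4_FIN_RCV,
    (State.ESTABLISHED, Event.SV4_FIN): State.SV4_FIN_RCV,
    (State.CV4_FIN_RCV, Event.SV4_FIN): State.CV4_SV4_FIN_RCV,
    (State.SV4_FIN_RCV, Event.CV4_FIN): State.CV4_SV4_FIN_RCV,
    (State.CV4_SV4_FIN_RCV, Event.TCP_TRANS_TIMEOUT): State.CLOSED,
}


def next_state(state: State, event: Event) -> State:
    # Assumption: an event with no edge in the figure leaves the state as is.
    return TRANSITIONS.get((state, event), state)
```

A session tracker would call `next_state` for every packet or timer expiry and remove the session when the result is `State.CLOSED`.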
[RFC5382]:REQ-5 The transitory connection idle-timeout is defined as the minimum time a TCP connection in the partially open or closing phases must remain idle before the NAT considers the associated session a candidate for removal. However, [RFC5382] does not clearly state whether the timeouts for these two phases can be configured separately.
This document clarifies that a NAT device SHOULD provide different knobs for configuring the partially open and closing idle timeouts. This document further acknowledges that most TCP flows are very short (less than 10 seconds) [FLOWRATE][TCPWILD] and therefore a partially open timeout of 4 minutes might be excessive if security is a concern. Therefore, it MAY be configured to be less than 4 minutes in such cases. There may also be other cases in which a timeout of 4 minutes is excessive. One such case and its solution are described below.
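As an illustration of separate knobs, the two transitory timeouts could be held in independent configuration fields. This is only a sketch: the class name `TcpIdleTimeouts`, its field names and the phase strings are hypothetical, and the 240-second defaults simply mirror the 4 minutes discussed above.

```python
from dataclasses import dataclass


@dataclass
class TcpIdleTimeouts:
    # Values in seconds; 240 s mirrors the 4 minutes discussed in the text.
    partial_open: int = 240  # knob for partially open connections
    closing: int = 240       # separate knob for closing connections

    def for_phase(self, phase: str) -> int:
        """Return the idle timeout that applies to a transitory phase."""
        if phase == "partial_open":
            return self.partial_open
        if phase == "closing":
            return self.closing
        raise ValueError(f"unknown transitory phase: {phase}")
```

For example, `TcpIdleTimeouts(partial_open=30)` shortens only the partially open timeout and leaves the closing timeout untouched.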
The TCP TIME_WAIT state is described in [RFC0793]. The TCP TIME_WAIT state needs to be kept for 2MSL before a connection is CLOSED, for the reasons listed below:
These points are important for TCP to work without problems.
[RFC5382] leaves the handling of TCP connections in TIME_WAIT state unspecified and mentions that TIME_WAIT state is not part of the transitory connection idle-timeout. If the NAT device honors the TIME_WAIT state, each TCP connection and its associated resources are kept for a certain period, typically four minutes, which consumes port resources.
[RFC6191] explains that in certain situations it is necessary to reduce the TIME_WAIT state and defines such a mechanism using TCP timestamps and sequence numbers. When a connection request is received with a four-tuple that is in the TIME_WAIT state, the connection request may be accepted if the sequence number or the timestamp of the incoming SYN segment is greater than the last sequence number or timestamp seen on the previous incarnation of the connection.
This document specifies that a NAT device should keep TCP connections in TIME_WAIT state unless it implements the proposal described in the following sub-section.
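The comparison described above can be sketched as follows. This is a simplified illustration and not the full decision procedure of [RFC6191]: the function names are hypothetical, timestamps are compared first when both incarnations carry them, and the sequence number is used as the fallback. Both values are 32-bit counters, so the comparison is done modulo 2**32.

```python
from typing import Optional


def wraps_after(a: int, b: int) -> bool:
    """True if 32-bit counter value a is later than b, modulo 2**32."""
    return a != b and ((a - b) & 0xFFFFFFFF) < 0x80000000


def may_accept_syn(syn_seq: int, syn_tsval: Optional[int],
                   last_seq: int, last_tsval: Optional[int]) -> bool:
    """Decide whether a SYN may replace a connection held in TIME_WAIT."""
    if syn_tsval is not None and last_tsval is not None:
        if wraps_after(syn_tsval, last_tsval):
            return True       # newer timestamp: accept
        if syn_tsval != last_tsval:
            return False      # older timestamp: treat as an old duplicate
    # No usable timestamps, or equal timestamps: compare sequence numbers.
    return wraps_after(syn_seq, last_seq)
```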
This section proposes applying the [RFC6191] mechanism at the NAT. This mechanism MAY be adopted for both clients' and remote hosts' TCP active close.
client                     NAT                   remote host
  |                         |                         |
  |           FIN           |           FIN           |
  |------------------------>|------------------------>|
  |                         |                         |
  |           ACK           |           ACK           |
  |<------------------------|<------------------------|
  |           FIN           |           FIN           |
  |<------------------------|<------------------------|
  |                         |                         |
  |      ACK(TSval=A)       |           ACK           |
  |------------------------>|------------------------>|
  |                         |  -                      |
  |                         |  |                      |
  |                         |  |                      |
  |                         |  |                      |
  |                         |  | TIME_WAIT            |
  |                         |  | ->assassinated at x  |
  |                         |  |                      |
  |                         |  |                      |
  |                         |  |                      |
  |      SYN(TSval>A)       |  x        SYN           |
  |------------------------>|------------------------>|
  |                         |  -                      |
  |                         |  |                      |
  |                         |  | SYN_SENT             |
  |                         |  |                      |
  |                         |  |                      |
PAWS also works to discard old duplicate packets at the NAT. A packet can be discarded as an old duplicate if it is received with a timestamp or sequence number value less than a value recently received on the connection.
For these mechanisms to work, we need to consider the case in which several clients whose timestamp or sequence number values are not successive (i.e., not monotonically increasing among clients) are connected to a NAT device. Two mechanisms that solve this problem and allow [RFC6191] and PAWS to be applied at the NAT are described below. These mechanisms are optional.
Rewrite the timestamp and sequence number values of outgoing packets at the NAT so that they are monotonically increasing. This can be done by adopting the following mechanisms at the NAT.
When packets come back as replies from remote hosts, the NAT rewrites the timestamp and sequence number values again to restore the original values. This can be done by adopting the following mechanisms at the NAT.
Adopt the following mechanisms at the NAT.
Packets from other clients that are not chosen by these mechanisms are rejected at the NAT, unless an unassigned port is left.
Another scenario needs to be solved for [RFC6191] to work with NAT. If the remote TCP did not receive the acknowledgment of its connection termination request, the NAT device, on behalf of the client, resends the last ACK packet when it receives a FIN packet of the previous connection after the state of the previous connection has been deleted from the NAT. This mechanism MAY be used when the client starts the closing process and the remote host did not receive the last ACK.
To solve the port shortage problem on the client side, the behavior of the remote host should be compliant with [RFC6191] or with the mechanism described in Section 4.2.2.13 of [RFC1122], since the NAT may reuse the same 5-tuple for a new connection. We have investigated the behavior of several OSes (e.g., Linux, FreeBSD, Windows, MacOS) and found that they implement the server-side behavior of both.
[RFC5382] leaves the handling of TCP RST packets unspecified. This document does not try to standardize such behavior but clarifies, based on operational experience, that a NAT that receives a TCP RST for an active mapping and performs session tracking MAY immediately delete the session and remove any state associated with it. If a NAT device that performs TCP session tracking receives a TCP RST for the first session that created a mapping, it MAY remove the session and the mapping immediately.
[RFC4787] [RFC5382]: REQ-1 Current RFCs specify a specific port overlapping behavior, i.e., that the external IP:port can be reused for connections originating from the same internal source IP:port irrespective of the destination. This is known as endpoint-independent mapping. This document clarifies that this port overlapping behavior can be extended to connections originating from different internal source IP:ports as long as their destinations are different. This is known as EDM (Endpoint-Dependent Mapping). A NAT MAY implement the optional mechanism described below.
If the destination addresses and ports are different for outgoing connections started by local clients, the NAT MAY assign the same external port as the source port for those connections. The port overlapping mechanism manages mappings between external packets and internal packets by looking at and storing their 5-tuple (protocol, source address, source port, destination address, destination port). This enables concurrent use of a single NAT external port for multiple transport sessions, which enables the NAT to work correctly in networks with limited IP address resources.
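A minimal sketch of such a 5-tuple table is shown below, assuming a single external address. The class `PortOverlapTable`, its method names and the candidate port list are hypothetical and serve only to illustrate how one external port can be shared across different destinations.

```python
class PortOverlapTable:
    def __init__(self):
        # Internal 5-tuple -> external port chosen for it.
        self.by_internal = {}
        # (protocol, external port, remote address, remote port)
        # -> (internal address, internal port), used for return traffic.
        self.by_external = {}

    def map_outbound(self, proto, src_ip, src_port, dst_ip, dst_port, ports):
        key = (proto, src_ip, src_port, dst_ip, dst_port)
        if key in self.by_internal:
            return self.by_internal[key]
        for port in ports:
            ext_key = (proto, port, dst_ip, dst_port)
            # A port is reusable as long as no other session already uses
            # it toward the same destination address and port.
            if ext_key not in self.by_external:
                self.by_internal[key] = port
                self.by_external[ext_key] = (src_ip, src_port)
                return port
        return None  # every candidate port is taken toward this destination

    def map_inbound(self, proto, ext_port, remote_ip, remote_port):
        return self.by_external.get((proto, ext_port, remote_ip, remote_port))
```

With candidate ports `[40000, 40001]`, two clients connecting to different destinations both receive port 40000, while a third client connecting to an already used destination receives 40001.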
Discussions:
[RFC4787] and [RFC5382] require "endpoint-independent mapping" at the NAT, and a port overlapping NAT cannot meet the requirement. This mechanism can degrade the transparency of the NAT in that its mapping mechanism is endpoint-dependent, which makes NAT traversal harder. However, if a NAT adopts endpoint-independent mapping together with endpoint-dependent filtering, then the actual behavior of the NAT will be the same as that of a port overlapping NAT.
[RFC4787]: REQ-2 [RFC5382]:ND Address Pooling Paired (APP) behavior for NAT is recommended in previous documents, but the behavior when a public IPv4 address runs out of ports is left undefined. This document clarifies that if APP is enabled, new sessions from a subscriber that already has a mapping associated with a public IP that ran out of ports SHOULD be dropped. The administrator MAY provide a knob that allows a NAT device to start using ports from another public IP when the one that anchored the APP mapping ran out of ports. This is a trade-off between subscriber service continuity and strict APP enforcement. (Note: this is sometimes referred to as 'soft-APP'.)
[RFC4787]:REQ-8 and [RFC5382]:REQ-3 Endpoint-independent filtering (EIF) could potentially result in security attacks from the public realm. In order to handle this, when possible there MUST be strict filtering checks in the inbound direction. A knob SHOULD be provided to limit the number of inbound sessions and a knob SHOULD be provided to enable or disable EIF on a per-application basis. This is especially important in the case of mobile networks, where such attacks can consume radio resources and count against the user quota.
[RFC4787]:REQ-8 and [RFC5382]:REQ-3 Current RFCs do not specify whether EIF mappings are protocol independent. In other words, if an outbound TCP SYN creates a mapping, it is left undefined whether inbound UDP packets destined to that mapping should be forwarded. This document specifies that EIF mappings SHOULD be protocol independent in order to allow inbound packets for protocols that multiplex TCP and UDP over the same IP:port through the NAT and also to maintain compatibility with stateful NAT64 [RFC6146]. However, the administrator MAY provide a configuration knob to make it protocol dependent.
[RFC4787]: REQ-6 [RFC5382]: ND The NAT mapping refresh direction MAY have a "NAT Inbound refresh behavior" of "True", but the RFCs do not clarify how this applies to EIF mappings. The issue in question is whether inbound packets that match an EIF mapping but do not create a new session due to a security policy should refresh the mapping timer. This document clarifies that even when a NAT device has an inbound refresh behavior of "True", such packets SHOULD NOT refresh the mapping. Otherwise, a simple attack that sends one packet every 2 minutes can keep the mapping alive indefinitely.
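The difference between strict APP and the optional 'soft-APP' knob can be sketched as a small allocator. The class `PairedAllocator`, the `soft_app` flag and the per-address free port lists are hypothetical names chosen for illustration. A return value of `None` stands for dropping the new session.

```python
class PairedAllocator:
    def __init__(self, free_ports, soft_app=False):
        # free_ports: public IP -> list of free ports on that IP.
        self.free_ports = {ip: list(ports) for ip, ports in free_ports.items()}
        self.soft_app = soft_app
        self.anchor = {}  # subscriber -> public IP it is paired with

    def _first_ip_with_ports(self):
        for ip, ports in self.free_ports.items():
            if ports:
                return ip
        return None

    def allocate(self, subscriber):
        ip = self.anchor.get(subscriber)
        if ip is None:
            # First session of this subscriber: pair it with a public IP.
            ip = self._first_ip_with_ports()
            if ip is None:
                return None
            self.anchor[subscriber] = ip
        elif not self.free_ports[ip]:
            if not self.soft_app:
                return None  # strict APP: drop the new session
            ip = self._first_ip_with_ports()  # soft-APP: use another IP
            if ip is None:
                return None
        return ip, self.free_ports[ip].pop(0)
```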
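One way to picture the protocol-independence knob is through the lookup key used for EIF mappings: the protocol is either left out of the key or included in it. The sketch below is illustrative only, and the class `EifMappings` and its method names are hypothetical.

```python
class EifMappings:
    def __init__(self, protocol_independent=True):
        self.protocol_independent = protocol_independent
        self._mappings = set()

    def _key(self, proto, ext_ip, ext_port):
        # Protocol-independent mappings ignore the transport protocol.
        if self.protocol_independent:
            return (ext_ip, ext_port)
        return (proto, ext_ip, ext_port)

    def add(self, proto, ext_ip, ext_port):
        self._mappings.add(self._key(proto, ext_ip, ext_port))

    def allows_inbound(self, proto, ext_ip, ext_port):
        return self._key(proto, ext_ip, ext_port) in self._mappings
```

With the default setting, a mapping created by an outbound TCP SYN also admits inbound UDP to the same external IP:port. With `protocol_independent=False`, it does not.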
In the case of NAT outbound refresh behavior there are certain types of packets that should not refresh the mapping even if their direction is outbound. For example, if the mapping is kept alive by ICMP Errors or TCP RST outbound packets sent as response to inbound packets, these SHOULD NOT refresh the mapping.
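The refresh rules above can be condensed into a single predicate. This is a sketch under simplifying assumptions: the `Packet` fields and the function name are hypothetical, and every outbound ICMP Error or TCP RST is treated as a response to inbound traffic, which a real implementation would need to determine more carefully.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    outbound: bool
    accepted: bool = True     # False if a security policy blocked the session
    icmp_error: bool = False
    tcp_rst: bool = False


def refreshes_mapping(pkt: Packet, inbound_refresh: bool) -> bool:
    if pkt.outbound:
        # Assumption: outbound ICMP Errors and RSTs are replies to inbound
        # packets, so they do not keep the mapping alive.
        return not (pkt.icmp_error or pkt.tcp_rst)
    # Inbound packets refresh only if the knob is on and the packet was
    # not rejected by a security policy.
    return inbound_refresh and pkt.accepted
```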
[RFC4787] [RFC5382]: REQ-1 Current RFCs do not specify whether EIM mappings are protocol independent. In other words, if an outbound TCP SYN creates a mapping, it is left undefined whether outbound UDP can reuse such a mapping and create a session. On the other hand, Stateful NAT64 [RFC6146] clearly specifies three binding information bases (TCP, UDP, ICMP). This document clarifies that EIM mappings SHOULD be protocol dependent. A knob MAY be provided in order to allow protocols that multiplex TCP and UDP over the same source IP and port to use a single mapping.
A NAT device MAY disable port parity preservation for dynamic mappings. Nevertheless, a NAT SHOULD support means to explicitly request the preservation of port parity (e.g., [I-D.ietf-pcp-port-set]).
A NAT SHOULD follow the recommendations specified in Section 4 of [RFC6056] especially:
A NAT SHOULD handle the Identification field of translated IPv4 packets as specified in Section 9 of [RFC6864].
Section 3.1 of [RFC5508] says that ICMP Query Mappings are to be maintained by the NAT device. However, the RFC does not discuss Query Mapping timeout values. Section 3.2 of that RFC only discusses ICMP Query Session Timeouts.
ICMP Query Mappings MAY be deleted once the last session using the mapping is deleted.
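A simple way to realize this is to track which sessions use each mapping and drop the mapping when that set becomes empty. The sketch below is illustrative. The class `IcmpQueryMappings` and the shape of the mapping and session identifiers are hypothetical.

```python
class IcmpQueryMappings:
    def __init__(self):
        self._sessions = {}  # mapping key -> set of session ids using it

    def add_session(self, mapping, session):
        self._sessions.setdefault(mapping, set()).add(session)

    def remove_session(self, mapping, session):
        live = self._sessions.get(mapping)
        if live is None:
            return
        live.discard(session)
        if not live:
            # Last session gone: the mapping itself can be deleted.
            del self._sessions[mapping]

    def has_mapping(self, mapping):
        return mapping in self._sessions
```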
[RFC5508]:REQ-7 This requirement specifies that NAT devices enforcing Basic NAT MUST support traversal of hairpinned ICMP Query sessions. This implicitly means that address mappings from external address to internal address (similar to Endpoint Independent Filters) MUST be maintained to allow inbound ICMP Query sessions. If an ICMP Query is received on an external address, the NAT device can then translate it to an internal IP.
[RFC5508]:REQ-7 This requirement specifies that all NAT devices (i.e., Basic NAT as well as NAPT devices) MUST support the traversal of hairpinned ICMP Error messages. This requires NAT devices to maintain address mappings from external IP address to internal IP address in addition to the ICMP Query Mappings described in Section 3.1 of that RFC.
This document does not require any IANA action.
In the case of EIF mappings, due to the high risk of resource crunch, a NAT device MAY provide a knob to limit the number of inbound sessions spawned from an EIF mapping.
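Such a knob amounts to a per-mapping counter with a configurable ceiling. The following sketch assumes hypothetical names (`InboundSessionLimiter`, `max_sessions`) and identifies a mapping by an arbitrary hashable key.

```python
from collections import Counter


class InboundSessionLimiter:
    def __init__(self, max_sessions):
        self.max_sessions = max_sessions
        self.active = Counter()  # EIF mapping -> number of inbound sessions

    def try_open(self, mapping):
        """Admit a new inbound session unless the mapping is at its limit."""
        if self.active[mapping] >= self.max_sessions:
            return False
        self.active[mapping] += 1
        return True

    def close(self, mapping):
        if self.active[mapping] > 0:
            self.active[mapping] -= 1
```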
[I-D.ietf-tcpm-tcp-security] contains a detailed discussion of the security implications of TCP Timestamps and of different timestamp generation algorithms.
Thanks to Dan Wing, Suresh Kumar, Mayuresh Bakshi, Rajesh Mohan and Senthil Sivamular for review and discussions.
[FLOWRATE] | Zhang, Y., Breslau, L., Paxson, V. and S. Shenker, "On the Characteristics and Origins of Internet Flow Rates", . |
[I-D.ietf-pcp-port-set] | Qiong, Q., Boucadair, M., Sivakumar, S., Zhou, C., Tsou, T. and S. Perreault, "Port Control Protocol (PCP) Extension for Port Set Allocation", Internet-Draft draft-ietf-pcp-port-set-07, November 2014. |
[I-D.ietf-tcpm-tcp-security] | Gont, F., "Survey of Security Hardening Methods for Transmission Control Protocol (TCP) Implementations", Internet-Draft draft-ietf-tcpm-tcp-security-03, March 2012. |
[TCPWILD] | Qian, F., Subhabrata, S., Spatscheck, O., Morley Mao, Z. and W. Willinger, "TCP Revisited: A Fresh Look at TCP in the Wild", . |