PCP working group | D. Wing, Ed. |
Internet-Draft | Cisco |
Intended status: Standards Track | S. Cheshire |
Expires: January 07, 2012 | Apple |
M. Boucadair | |
France Telecom | |
R. Penno | |
Juniper Networks | |
P. Selkirk | |
Internet Systems Consortium | |
July 06, 2011 |
Port Control Protocol (PCP)
draft-ietf-pcp-base-13
The Port Control Protocol allows an IPv6 or IPv4 host to control how incoming IPv6 or IPv4 packets are translated and forwarded by a network address translator (NAT) or simple firewall, and also allows a host to optimize its outgoing NAT keepalive messages.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 07, 2012.
Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
The Port Control Protocol (PCP) provides a mechanism to control how incoming packets are forwarded by upstream devices such as NAT64, NAT44, and firewall devices, and a mechanism to reduce application keepalive traffic. PCP is primarily designed to be implemented in the context of both Carrier-Grade NATs (CGN) and small NATs (e.g., residential NATs). PCP allows hosts to operate servers for a long time (e.g., a webcam) or a short time (e.g., while playing a game or on a phone call) when behind a NAT device, including when behind a CGN operated by their Internet service provider.
PCP allows applications to create mappings from an external IP address and port to an internal IP address and port. These mappings are required for successful inbound communications destined to machines located behind a NAT or a firewall.
After creating a mapping for incoming connections, it is necessary to inform remote computers about the IP address and port for the incoming connection. This is usually done in an application-specific manner. For example, a computer game might use a rendezvous server specific to that game (or specific to that game developer), a SIP phone would use a SIP proxy, and a client using DNS-Based Service Discovery [DNS-SD] would use DNS Update [RFC2136] [RFC3007]. PCP does not provide this rendezvous function. The rendezvous function will support IPv4, IPv6, or both. Depending on that support and the application's support of IPv4 or IPv6, the PCP client will need an IPv4 mapping, an IPv6 mapping, or both.
Many NAT-friendly applications send frequent application-level messages to ensure their session will not be timed out by a NAT. These are commonly called "NAT keepalive" messages, even though they are not sent to the NAT itself (rather, they are sent 'through' the NAT). These applications can reduce the frequency of those NAT keepalive messages by using PCP to learn (and influence) the NAT mapping lifetime. This helps reduce bandwidth on the subscriber's access network, traffic to the server, and battery consumption on mobile devices.
Many NATs and firewalls have included application layer gateways (ALGs) to create mappings for applications that establish additional streams or accept incoming connections. ALGs incorporated into NATs may also modify the application payload. Industry experience has shown that these ALGs are detrimental to protocol evolution. PCP allows an application to create its own mappings in NATs and firewalls, reducing the incentive to deploy ALGs in NATs and firewalls.
PCP can be used in various deployment scenarios, including:
The PCP OpCodes defined in this document are designed to support transport-layer protocols that use a 16-bit port number (e.g., TCP, UDP, SCTP, DCCP). Protocols that do not use a port number (e.g., IPsec ESP) are beyond the scope of this document, as is using PCP to request forwarding of all traffic to a single default host (often nicknamed a "DMZ").
PCP assumes a single-homed IP address model. That is, for a given IP address of a host, only one default route exists to reach the Internet. This is important because after a PCP mapping is created and an inbound packet (e.g., TCP SYN) arrives at the host, the outbound response (e.g., TCP SYNACK) has to go through the same path so it is seen by the firewall or rewritten by the NAT. This restriction exists because otherwise there would need to be a PCP-enabled NAT for every egress (because the host could not reliably determine which egress path packets would take) and the client would need to be able to reliably make the same internal/external mapping in every NAT gateway, which in general is not possible (because the other NATs might have the necessary port mapped to another host).
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in "Key words for use in RFCs to Indicate Requirement Levels" [RFC2119].
For simplicity in building and parsing request and response packets, PCP always uses fixed-size 128-bit IP address fields for both IPv6 addresses and IPv4 addresses.
When the address field holds an IPv6 address, the fixed-size 128-bit IP address field holds the IPv6 address stored as-is.
When the address field holds an IPv4 address, IPv4-mapped IPv6 addresses [RFC4291] are used (::ffff:0:0/96). These have the first 80 bits set to zero and the next 16 set to one, while the last 32 bits are filled with the IPv4 address. This is unambiguously distinguishable from a legal IPv6 address, because IPv4-mapped IPv6 addresses [RFC4291] are not used as either the source or destination address of actual IPv6 packets.
When checking for an IPv4-mapped IPv6 address, all of the first 96 bits MUST be checked for the pattern; it is not sufficient to check only for ones in bits 81-96.
The all-zeroes IPv6 address is expressed by filling the fixed-size 128-bit IP address field with all zeroes. The all-zeroes IPv4 address is expressed as: 80 bits of zeros, 16 bits of ones, and 32 bits of zeros.
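The address encoding above can be sketched in C as follows. This is an illustrative, non-normative sketch; the function names pcp_encode_ipv4 and pcp_is_ipv4_mapped are assumptions of this example and are not defined by PCP.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Store a 4-octet IPv4 address (network byte order) in a 16-octet PCP
 * address field as an IPv4-mapped IPv6 address: 80 zero bits, 16 one
 * bits, then the 32-bit IPv4 address. Names are illustrative only. */
void pcp_encode_ipv4(const uint8_t ipv4[4], uint8_t field[16])
{
    memset(field, 0, 10);         /* first 80 bits: zero */
    field[10] = 0xFF;             /* next 16 bits: ones  */
    field[11] = 0xFF;
    memcpy(field + 12, ipv4, 4);  /* last 32 bits: IPv4 address */
}

/* Return true only if ALL of the first 96 bits match the IPv4-mapped
 * pattern; checking the 0xFFFF octets alone is not sufficient. */
bool pcp_is_ipv4_mapped(const uint8_t field[16])
{
    static const uint8_t prefix[12] =
        { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xFF, 0xFF };
    return memcmp(field, prefix, sizeof prefix) == 0;
}
```

With this encoding, the all-zeroes IPv4 address produces a field that still passes pcp_is_ipv4_mapped(), whereas the all-zeroes IPv6 address (sixteen zero octets) does not.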
The PCP server receives and responds to PCP requests. The PCP server functionality is typically a capability of a NAT or firewall device, as shown in Figure 1. It is also possible for the PCP functionality to be provided by some other device, which communicates with the actual NAT or firewall via some other proprietary mechanism, as long as from the PCP client's perspective such split operation is indistinguishable from the integrated case.
                            +-----------------+
   +------------+           | NAT or firewall |
   | PCP client |-<network>-+      with       +---<Internet>
   +------------+           |   PCP server    |
                            +-----------------+
All PCP messages contain a request (or response) header containing an OpCode, any relevant OpCode-specific information, and zero or more Options. The packet layout for the common header, and operation of the PCP client and PCP server, are described in the following sections. The information in this section applies to all OpCodes. Behavior of the OpCodes defined in this document is described in Section 8 and Section 9.
All requests have the following format:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Version = 1  |R|   OpCode    |  PCP Client's Port (16 bits)  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Requested Lifetime (32 bits)                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |           PCP Client's IP Address (always 128 bits)           |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                                                               :
   :            (optional) OpCode-specific information             :
   :                                                               :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                                                               :
   :                    (optional) PCP Options                     :
   :                                                               :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
These fields are described below:
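As an illustration of the request layout shown in the figure above, the following C sketch serializes the fixed 24-octet request header. It is non-normative and rests on stated assumptions: multi-octet fields are written in network byte order, the R bit is the most significant bit of the second octet and is left clear in requests, and the function name pcp_write_request_header is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PCP_VERSION      1
#define PCP_REQ_HDR_LEN  24   /* 4 + 4 + 16 octets, per the layout above */

/* Write the common request header into buf. Returns the number of
 * octets written, or 0 if buf is too small. Assumes network byte
 * order for the port and lifetime fields (an assumption of this
 * sketch). client_ip is the 128-bit address field. */
size_t pcp_write_request_header(uint8_t *buf, size_t buflen,
                                uint8_t opcode, uint16_t client_port,
                                uint32_t lifetime,
                                const uint8_t client_ip[16])
{
    if (buflen < PCP_REQ_HDR_LEN)
        return 0;

    buf[0] = PCP_VERSION;
    buf[1] = opcode & 0x7F;                 /* R bit (top bit) clear */
    buf[2] = (uint8_t)(client_port >> 8);
    buf[3] = (uint8_t)(client_port & 0xFF);
    buf[4] = (uint8_t)(lifetime >> 24);
    buf[5] = (uint8_t)(lifetime >> 16);
    buf[6] = (uint8_t)(lifetime >> 8);
    buf[7] = (uint8_t)(lifetime & 0xFF);
    memcpy(buf + 8, client_ip, 16);         /* always 128 bits */
    return PCP_REQ_HDR_LEN;
}
```

OpCode-specific information and any Options would be appended after the 24 octets written here.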
All responses have the following format:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Version = 1  |R|   OpCode    |   Reserved    |  Result Code  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Lifetime (32 bits)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Epoch (32 bits)                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                      Reserved (96 bits)                       |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                                                               :
   :           (optional) OpCode-specific response data            :
   :                                                               :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                      (optional) Options                       :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
These fields are described below:
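As an illustration of the response layout shown in the figure above, the following C sketch extracts the first 12 octets of a response (version, R bit, OpCode, result code, lifetime, and Epoch). It is non-normative. The struct and function names are hypothetical, network byte order is assumed, and the length limits applied here (at least 12 octets, at most 1024, a multiple of 4) are the ones this document gives for client-side response validation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the leading response fields. */
struct pcp_response_header {
    uint8_t  version;
    bool     r_bit;        /* most significant bit of the second octet */
    uint8_t  opcode;       /* low 7 bits of the second octet */
    uint8_t  result_code;
    uint32_t lifetime;
    uint32_t epoch;
};

/* Read a 32-bit big-endian (network byte order) value. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns false if the message length is invalid, true otherwise. */
bool pcp_parse_response_header(const uint8_t *msg, size_t len,
                               struct pcp_response_header *out)
{
    if (len < 12 || len > 1024 || len % 4 != 0)
        return false;

    out->version     = msg[0];
    out->r_bit       = (msg[1] & 0x80) != 0;
    out->opcode      = msg[1] & 0x7F;
    /* msg[2] is the Reserved octet and is not interpreted here. */
    out->result_code = msg[3];
    out->lifetime    = read_be32(msg + 4);
    out->epoch       = read_be32(msg + 8);
    return true;
}
```

A caller would still need to match the response against an outstanding request before acting on these fields.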
A PCP OpCode can be extended with one or more Options. Options can be used in requests and responses. The decision about whether to include a given piece of information in the base OpCode format or in an Option is an engineering trade-off between packet size and code complexity. For information that is usually (or always) required, placing it in the fixed OpCode data results in simpler code to generate and parse the packet, because the information is at a fixed location in the OpCode data, but wastes space in the packet in the event that field is all-zeroes because the information is not needed or not relevant. For information that is required less often, placing it in an Option results in slightly more complicated code to generate and parse packets containing that Option, but saves space in the packet when that information is not needed. Placing information in an Option also means that an implementation that never uses that information doesn't even need to implement code to generate and parse it. For example, a client that never requests mappings on behalf of some other device doesn't need to implement code to generate the THIRD_PARTY Option, and a PCP server that doesn't implement the necessary security measures to create third-party mappings safely doesn't need to implement code to parse the THIRD_PARTY Option.
Options use the following Type-Length-Value format:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Option Code  |   Reserved    |         Option-Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                        (optional) data                        :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The description of the fields is as follows:
The handling of an Option by the PCP client and PCP server MUST be specified in an appropriate document, which MUST include whether the PCP Option can appear in a request and/or response, whether it can appear more than once, and indicate what sort of Option data it conveys. If several Options are included in a PCP request, they MAY be encoded in any order by the PCP client, but MUST be processed by the PCP server in the order in which they appear.
If, while processing an Option, an error is encountered that causes a PCP error response to be generated, the PCP request MUST cause no state change in the PCP server or the PCP-controlled device (i.e., it rolls back any changes it might have made while processing the request). The response MUST encode the Options in the same order, but MAY omit some PCP Options in the response, to indicate the PCP server does not understand that Option or that Option is not permitted to be included in responses by the definition of the Option itself. Additional Options included in the response (if any) MUST be included at the end. A certain Option MAY appear more than once in a request or in a response, if permitted by the definition of the Option itself. If the Option's definition allows the Option to appear only once but it appears more than once in a request, the PCP server MUST respond with the MALFORMED_OPTION result code; if this occurs in a response, the PCP client processes the first occurrence and ignores the other occurrences as if they were not present.
If the "O" bit (high bit) in the Option Code is clear, a PCP server MUST process this Option. If the PCP server does not implement this Option, or cannot perform the function indicated by this Option (e.g., due to a parsing error with the Option), it MUST generate an error response with code UNSUPP_OPTION or MALFORMED_OPTION (as appropriate) and include the UNPROCESSED Option in the response (Section 6.7.1).
If the "O" bit is set, a PCP server MAY process or ignore this Option, entirely at its discretion.
PCP clients are free to ignore any or all Options included in responses, although naturally if a client explicitly requests an Option where correct handling of that Option requires processing the Option data in the response, that client is expected to implement code to do that.
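The Option processing rules above can be sketched as a single in-order pass over the Options area. The C code below is non-normative and makes two assumptions that this section does not confirm: that Option-Length counts the octets of Option data only (excluding the 4-octet Option header), and that Option data is zero-padded to a 4-octet boundary. The function names and the caller-supplied list of supported Option codes are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define PCP_OPT_O_BIT 0x80   /* high bit of the Option Code */

/* Return 1 if code appears in the caller's list of supported codes. */
static int is_supported(uint8_t code, const uint8_t *supported, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (supported[i] == code)
            return 1;
    return 0;
}

/* Walk the Options in the order they appear. Collects the codes of
 * mandatory-to-process Options (O bit clear) that are not supported.
 * Returns the number of such Options, or -1 if the Options area is
 * truncated. At most max_unprocessed codes are stored.
 * ASSUMPTION: Option-Length excludes the 4-octet header, and data is
 * padded to a multiple of 4 octets. */
int pcp_scan_options(const uint8_t *opts, size_t len,
                     const uint8_t *supported, size_t nsupported,
                     uint8_t *unprocessed, size_t max_unprocessed)
{
    size_t off = 0;
    int count = 0;

    while (off < len) {
        if (len - off < 4)
            return -1;                       /* truncated header */

        uint8_t code  = opts[off];
        size_t  dlen  = ((size_t)opts[off + 2] << 8) | opts[off + 3];
        size_t  padded = (dlen + 3) & ~(size_t)3;

        if (padded > len - off - 4)
            return -1;                       /* truncated data */

        if ((code & PCP_OPT_O_BIT) == 0 &&
            !is_supported(code, supported, nsupported)) {
            if ((size_t)count < max_unprocessed)
                unprocessed[count] = code;
            count++;
        }
        off += 4 + padded;
    }
    return count;
}
```

A server using such a pass would answer a non-zero count with an error response carrying the collected codes in the UNPROCESSED Option, and a negative return with a malformed-message error.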
Option definitions MUST include the information below:
The following result codes may be returned as a result of any OpCode received by the PCP server. The only success result code is 0; other values indicate an error. If a PCP server encounters multiple errors during processing of a request, it SHOULD use the most specific error message.
Additional result codes, specific to the OpCodes and Options defined in this document, are listed in Section 8.2 and Section 10.1.
PCP messages MUST be sent over UDP [RFC0768]. Every PCP request generates a response, so PCP does not need to run over a reliable transport protocol.
PCP is idempotent, meaning that if the PCP client sends the same request multiple times (or the PCP client sends the request once and it is duplicated by the network), and the PCP server processes those requests multiple times, the result is the same as if the PCP server had processed only one of those duplicate requests.
This section details operation specific to a PCP client, for any OpCode. Procedures specific to the MAP OpCodes are described in Section 8, and procedures specific to the PEER OpCodes are described in Section 9.
Prior to sending its first PCP message, the PCP client determines which server to use. The PCP client performs the following steps to determine its PCP server:
For the purposes of this document, only a single PCP server address is supported. Should future specifications define configuration methods that provide a list of PCP server addresses, those specifications will define how clients select one or more addresses from that list.
With that PCP server address, the PCP client formulates its PCP request. The PCP request contains a PCP common header, PCP OpCode and payload, and (possibly) Options. As with all UDP or TCP client software on any operating system, when several independent PCP clients exist on the same host, each uses a distinct source port number to disambiguate their requests and replies. The PCP client's source port SHOULD be randomly generated [RFC6056].
To assist with detecting an on-path NAT, the PCP header includes the source IP address and port of the PCP message itself. On operating systems that support the sockets API, the following steps are RECOMMENDED to determine the correct source address and port to include in the PCP header:
When attempting to contact a PCP server, the PCP client initializes a timer to 2 seconds. The PCP client sends a PCP message to the first server in its list of PCP servers. If no response is received before the timer expires, the timer is doubled (to 4 seconds) and the request is re-transmitted. If no response is received before the timer expires, the timer is doubled again (to 8 seconds) and the request is re-transmitted.
Once a PCP client has successfully received a response from a PCP server on that interface, it sends subsequent PCP requests to that same server, with a retransmission timer of 2 seconds. If, after 2 seconds, a response is not received from that PCP server, the same back-off algorithm described above is performed.
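The doubling of the retransmission timer described above can be expressed as a small C helper. This is a non-normative sketch; the function name is hypothetical, and the clamp on the attempt count exists only to keep the shift within 32 bits (the text above does not state an upper bound on the timer).

```c
#include <stdint.h>

#define PCP_INITIAL_TIMER_SECS 2u

/* Timer, in seconds, to wait after the given transmission attempt
 * (0 = first transmission): 2, 4, 8, ... seconds.
 * The clamp at 30 is a guard added by this sketch, not a protocol
 * rule. */
uint32_t pcp_retransmit_timer(unsigned attempt)
{
    if (attempt > 30)
        attempt = 30;
    return PCP_INITIAL_TIMER_SECS << attempt;
}
```

After a successful response, a client following the text above would start again from attempt 0 for its next request to the same server.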
This section details operation specific to a PCP server. Processing SHOULD be performed in the order of the following paragraphs.
A PCP server MUST only accept normal (non-THIRD_PARTY) PCP requests from a client on the same interface it would normally receive packets from that client, and silently ignores PCP requests arriving on any other interface. For example, a residential NAT gateway only accepts PCP requests arriving on its (LAN) interface connecting to the internal network, and silently ignores PCP requests arriving on its external (WAN) interface. A PCP server which supports THIRD_PARTY requests MAY be configured to accept THIRD_PARTY requests on other interfaces from properly authorized clients.
Upon receiving a request, the PCP server parses and validates it. A valid request contains a valid PCP common header, one valid PCP OpCode, and zero or more Options (which the server might or might not comprehend). If an error is encountered during processing, the server generates an error response which is sent back to the PCP client. Processing of an OpCode and its Options is specific to each OpCode.
If the server is overloaded by requests (from a particular client or from all clients), it MAY simply discard requests, as the requests will be retried by PCP clients, or it MAY generate the SERVER_OVERLOADED error response.
If the received message is shorter than 4 octets or has the R bit set, the message is simply dropped. If the length of the message exceeds 1024 octets or is not a multiple of 4 octets, it is invalid. Invalid requests are handled by copying up to 1024 octets of the request into the response, setting the result code to MALFORMED_REQUEST, and zero-padding the response to a multiple of 4 octets if necessary. If the version number is not supported, a response is generated with the UNSUPP_VERSION result code and the other steps detailed in Section 6.6. If the OpCode is not supported, a response is generated with the UNSUPP_OPCODE result code.
If the source IP address and port of the received packet do not match the contents of the PCP Client's IP Address and PCP Client's Port fields, a response is generated with the ADDRESS_MISMATCH result code. This is done to detect and prevent accidental use of PCP where a non-PCP-aware NAT exists between the PCP client and PCP server.
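The common-header checks in the preceding paragraphs can be sketched in C as shown below. The sketch is non-normative. The enum names are hypothetical labels for the outcomes described above (numeric result code values are deliberately not shown), the OpCode check is omitted, and the test for a message shorter than the 24-octet common request header is an assumption added so that the address fields can be read safely.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical outcome labels; not wire values. */
enum pcp_verdict {
    PCP_DROP,               /* silently discard                   */
    PCP_MALFORMED_REQUEST,  /* respond with MALFORMED_REQUEST     */
    PCP_UNSUPP_VERSION,     /* respond with UNSUPP_VERSION        */
    PCP_ADDRESS_MISMATCH,   /* respond with ADDRESS_MISMATCH      */
    PCP_CONTINUE            /* go on to OpCode/Option processing  */
};

/* src_ip is the packet's source address in the 128-bit PCP form
 * (IPv4-mapped for IPv4); src_port is in host byte order. */
enum pcp_verdict pcp_check_request(const uint8_t *msg, size_t len,
                                   const uint8_t src_ip[16],
                                   uint16_t src_port)
{
    if (len < 4 || (msg[1] & 0x80))          /* too short or R bit set */
        return PCP_DROP;
    if (len > 1024 || len % 4 != 0)
        return PCP_MALFORMED_REQUEST;
    if (msg[0] != 1)                         /* only version 1 here */
        return PCP_UNSUPP_VERSION;
    if (len < 24)                            /* ASSUMPTION of this sketch */
        return PCP_MALFORMED_REQUEST;

    uint16_t hdr_port = (uint16_t)((msg[2] << 8) | msg[3]);
    if (hdr_port != src_port || memcmp(msg + 8, src_ip, 16) != 0)
        return PCP_ADDRESS_MISMATCH;

    return PCP_CONTINUE;
}
```

The order of the checks follows the order of the paragraphs above, in line with the recommendation that processing be performed in that order.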
Error responses have the same packet layout as success responses, with fields from the request copied into the response, and fields assigned by the PCP server are set as indicated in Figure 3.
The PCP client receives the response and verifies that the source IP address and port belong to the PCP server of an outstanding PCP request. It validates that the OpCode matches an outstanding PCP request. Responses shorter than 12 octets, longer than 1024 octets, or not a multiple of 4 octets are invalid and ignored, likely causing the request to be re-transmitted. The response is further matched by comparing fields in the response OpCode-specific data to fields in the request OpCode-specific data, as described by the processing for that OpCode. After these matches are successful, the PCP client checks the Epoch field to determine if it needs to restore its state to the PCP server (see Section 6.5).
If the result code is 0 (SUCCESS), the PCP client knows the request was successful.
If the result code is not 0, the request failed. If the result code is UNSUPP_VERSION, processing continues as described in Section 6.6. If the result code is SERVER_OVERLOADED, the PCP client SHOULD NOT send *any* further requests to that PCP server for the indicated error lifetime. For other error result codes, the PCP client SHOULD NOT resend the same request for the indicated error lifetime. If the PCP server indicates an error lifetime in excess of 30 minutes, the PCP client MAY choose to set its retry timer to 30 minutes.
If the PCP client has discovered a new PCP server (e.g., connected to a new network), the PCP client SHOULD immediately begin communicating with this PCP server, without regard to hold times from communicating with a previous PCP server.
Hosts which desire a PCP mapping might be multi-interfaced (i.e., own several logical/physical interfaces). Indeed, a host can be configured with several IPv4 addresses (e.g., WiFi and Ethernet) or dual-stacked. These IP addresses may have distinct reachability scopes (e.g., if IPv6 they might have global reachability scope as for Global Unicast Address (GUA, [RFC3587]) or limited scope as for Unique Local Address (ULA) [RFC4193]).
IPv6 addresses with global reachability (e.g., GUA) SHOULD be used as the source address when generating a PCP request. IPv6 addresses without global reachability (e.g., ULA [RFC4193]) SHOULD NOT be used as the source address when generating a PCP request. If IPv6 privacy addresses [RFC4941] are used for PCP mappings, a new PCP request will need to be issued whenever the IPv6 privacy address is changed. This PCP request SHOULD be sent from the IPv6 privacy address itself. It is RECOMMENDED that mappings to the previous privacy address be deleted.
Due to the ubiquity of IPv4 NAT, IPv4 addresses with limited scope (e.g., private addresses [RFC1918]) MAY be used as the source address when generating a PCP request.
As mentioned in Section 2.3, only single-homed CP routers are in scope. Therefore, there is no viable scenario where a host located behind a CP router is assigned two Global Unicast Addresses belonging to different global IPv6 prefixes.
Every PCP response sent by the PCP server includes an Epoch field. This field increments by 1 every second, and is used by the PCP client to determine if PCP state needs to be restored. If the PCP server resets or loses the state of its explicit dynamic mappings (that is, those mappings created by PCP MAP requests), due to reboot, power failure, or any other reason, it MUST reset its Epoch time and begin counting again from 0. Similarly, if the public IP address(es) of the NAT (controlled by the PCP server) changes, the Epoch MUST be reset to 0. A PCP server MAY maintain one Epoch value for all PCP clients, or MAY maintain distinct Epoch values (per PCP client, per interface, or based on other criteria); this choice is implementation-dependent.
Whenever a client receives a PCP response, the client computes its own conservative estimate of the expected Epoch value by taking the Epoch value in the last packet it received from the gateway and adding 7/8 (87.5%) of the time elapsed since that packet was received. If the Epoch value in the newly received packet is less than the client's conservative estimate by more than one second, then the client concludes that the PCP server lost state, and the client MUST immediately renew all its active port mapping leases as described in Section 11.3.1.
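The client's Epoch check described above can be written as a short C function. This sketch is non-normative; the function name and the choice of whole-second integer arithmetic are assumptions of the example. Integer division rounds the estimate down, which errs on the conservative side.

```c
#include <stdbool.h>
#include <stdint.h>

/* prev_epoch:   Epoch value from the last response received
 * elapsed_secs: seconds elapsed since that response was received
 * new_epoch:    Epoch value in the newly received response
 *
 * Returns true if the new Epoch is less than the conservative
 * estimate (prev_epoch + 7/8 of elapsed time) by more than one
 * second, meaning the client concludes the server lost state. */
bool pcp_server_lost_state(uint32_t prev_epoch, uint32_t elapsed_secs,
                           uint32_t new_epoch)
{
    uint64_t estimate = (uint64_t)prev_epoch +
                        ((uint64_t)elapsed_secs * 7) / 8;
    return (uint64_t)new_epoch + 1 < estimate;
}
```

For example, with a previous Epoch of 1000 and 80 seconds elapsed, the estimate is 1070. A new Epoch of 1080 or 1069 is accepted, while a new Epoch of 1068 or lower (such as 5 after a reboot) triggers renewal of all active mappings.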
A PCP client sends its requests using PCP version number 1. Should later updates to this document specify different message formats with a version number greater than 1, it is expected that PCP servers will still support version 1 in addition to the newer version(s). However, in the event that a server returns a response with result code UNSUPP_VERSION, the client MAY log an error message to inform the user that it is too old to work with this server.
When sending a response containing the UNSUPP_VERSION result code, the PCP message MUST be 12 octets long.
If future PCP versions greater than 1 are specified, version negotiation is expected to proceed as follows:
The following Option can appear in certain PCP responses, without regard to the OpCode.
If the PCP server cannot process a mandatory-to-process Option, for whatever reason, it includes the UNPROCESSED Option in the response, shown in Figure 5. This helps with debugging interactions between the PCP client and PCP server. This Option MUST NOT appear more than once in a PCP response. The unprocessed Options are listed once, and the Option data is zero-filled to the necessary 32-bit boundary. If a certain Option appeared more than once in the PCP request, that Option value can appear once or as many times as it occurred in the request. The order of the Options in the PCP request has no relationship with the order of the Option values in this UNPROCESSED Option. This Option MUST NOT appear in a response unless the associated request contained at least one mandatory-to-process Option.
The UNPROCESSED Option is formatted as follows, showing an example of two Option codes that were unprocessed:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Option-code-1 | Option-code-2 |            padding            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Padding: 0, 1, 2, or 3 octets. If the number of Option-codes is not a multiple of 4, padding is used to make it 32-bit aligned. The padding MUST be zeroed on sending, and MUST be ignored by the receiver.
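The padding rule for the UNPROCESSED Option data can be sketched in C as follows. The sketch is non-normative and builds only the Option data (the list of unprocessed Option codes plus padding), not the Option header; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy ncodes unprocessed Option codes into out and zero-pad to the
 * next multiple of 4 octets. Returns the padded length, or 0 if there
 * are no codes or out is too small. */
size_t pcp_build_unprocessed_data(const uint8_t *codes, size_t ncodes,
                                  uint8_t *out, size_t outlen)
{
    size_t padded = (ncodes + 3) & ~(size_t)3;

    if (ncodes == 0 || padded > outlen)
        return 0;

    memcpy(out, codes, ncodes);
    memset(out + ncodes, 0, padded - ncodes);  /* padding MUST be zero */
    return padded;
}
```

With two unprocessed Option codes, as in the figure above, the result is 4 octets: the two codes followed by two zero octets of padding.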
There are four uses for the MAP and PEER OpCodes defined in this document: a host operating a server and wanting an incoming connection (Section 7.1); a host operating a client and server on the same port (Section 7.2); a host operating a client and wanting to optimize the application keepalive traffic (Section 7.3); and a host operating a client and wanting to restore lost state in its NAT (Section 7.4). These are discussed in the following sections.
When operating a server (Section 7.1 and Section 7.2), the PCP client knows whether it wants an IPv4 listener, an IPv6 listener, or both on the Internet. The PCP client also knows whether it has an IPv4 address or an IPv6 address configured on itself. It combines this knowledge to decide which of its PCP servers to send the request to (e.g., a PCP server on its IPv4 interface or on its IPv6 interface), and whether to send one or two MAP requests for each of its interfaces. For example, if the PCP client has only an IPv4 address but wants both IPv6 and IPv4 listeners, it sends a MAP4 request and a MAP6 request from its IPv4 interface. If the PCP client has both an IPv4 and an IPv6 address and wants only an IPv4 listener, it sends one MAP request from its IPv4 interface (if the PCP server supports NAT44 or IPv4 firewall) or one MAP request from its IPv6 interface (if the PCP server supports NAT64). The PCP client can simply request the desired mapping to determine if the PCP server supports the desired mapping. Applications that embed IP addresses in payloads (e.g., FTP, SIP) will find it beneficial to avoid address family translation, if possible.
It is REQUIRED that the PCP-controlled device assign the same external IP address to PCP-created explicit dynamic mappings and to implicit dynamic mappings for a given Internal Host. It is RECOMMENDED that static mappings for that Internal Host (e.g., those created by a command-line interface on the PCP server or PCP-controlled device) also be assigned to the same IP address. Once all internal addresses assigned to a given Internal Host have no implicit dynamic mappings and have no explicit dynamic mappings in the PCP-controlled device, a subsequent PCP request for that Internal Address MAY be assigned to a different External Address. Generally, this re-assignment would occur when a CGN device is load balancing newly-seen hosts to its public IPv4 address pool.
A host operating a server (e.g., a web server) listens for traffic on a port, but the server never initiates traffic from that port. For this to work across a NAT or a firewall, the host needs to (a) create a mapping from a public IP address and port to itself as described in Section 8 and (b) publish that public IP address and port via some sort of rendezvous server (e.g., DNS, a SIP message, a proprietary protocol). Publishing the public IP address and port is out of scope of this specification. To accomplish (a), the host follows the procedures described in this section.
As normal, the application needs to begin listening on a port. Then, the application constructs a PCP message with the appropriate MAP OpCode, depending on whether it is listening on an IPv4 or IPv6 address and whether it wants a public IPv4 or IPv6 address.
The following pseudo-code shows how PCP can be reliably used to operate a server:
   /* start listening on the local server port */
   int s = socket(...);
   bind(s, ...);
   listen(s, ...);

   getsockname(s, &internal_sockaddr, ...);
   bzero(&external_sockaddr, sizeof(external_sockaddr));

   while (1) {
       /* Note: the "time_to_send_pcp_request()" check below includes:
        * 1. Sending the first request
        * 2. Retransmitting requests due to packet loss
        * 3. Resending a request due to impending lease expiration
        * The PCP packet sent is identical in all cases, apart from the
        * Suggested External Address and Port which may change over time
        */
       if (time_to_send_pcp_request())
           pcp_send_map_request(internal_sockaddr.sin_port,
               internal_sockaddr.sin_addr,
               &external_sockaddr, /* will be zero the first time */
               requested_lifetime, &assigned_lifetime);

       if (pcp_response_received())
           update_rendezvous_server("Client Ident", external_sockaddr);

       if (received_incoming_connection_or_packet())
           process_it(s);

       if (other_work_to_do())
           do_it();

       /* ... */

       block_until_we_need_to_do_something_else();
   }
A host operating a client and server on the same port (e.g., Symmetric RTP [RFC4961] or SIP Symmetric Response Routing (rport) [RFC3581]) first establishes a local listener, (usually) sends the local and public IP addresses and ports to a rendezvous service (which is out of scope of this document), and initiates an outbound connection from that same source address and same port. To accomplish this, the application uses the procedure described in this section.
An application that is using the same port for outgoing connections as well as incoming connections MUST first signal its operation of a server using the PCP MAP OpCode, as described in Section 8, and receive a positive PCP response before it sends any packets from that port.
The following pseudo-code shows how PCP can be used to operate a symmetric client and server:
   /* start listening on the local server port */
   int s = socket(...);
   bind(s, ...);
   listen(s, ...);

   getsockname(s, &internal_sockaddr, ...);
   bzero(&external_sockaddr, sizeof(external_sockaddr));

   while (1) {
       /* Note: the "time_to_send_pcp_request()" check below includes:
        * 1. Sending the first request
        * 2. Retransmitting requests due to packet loss
        * 3. Resending a request due to impending lease expiration
        * The PCP packet sent is identical in all cases, apart from the
        * Suggested External Address and Port which may change over time
        */
       if (time_to_send_pcp_request())
           pcp_send_map_request(internal_sockaddr.sin_port,
               internal_sockaddr.sin_addr,
               &external_sockaddr, /* will be zero the first time */
               requested_lifetime, &assigned_lifetime);

       if (pcp_response_received())
           update_rendezvous_server("Client Ident", external_sockaddr);

       if (received_incoming_connection_or_packet())
           process_it(s);

       if (need_to_make_outgoing_connection())
           make_outgoing_connection(s, ...);

       if (data_to_send())
           send_it(s);

       if (other_work_to_do())
           do_it();

       /* ... */

       block_until_we_need_to_do_something_else();
   }
A host operating a client (e.g., XMPP client, SIP client) sends from a port, and may receive responses, but never accepts incoming connections from other Remote Peers on this port. It wants to ensure the flow to its Remote Peer is not terminated (due to inactivity) by an on-path NAT or firewall. To accomplish this, the application uses the procedure described in this section.
Middleboxes such as NATs or firewalls need to see occasional traffic or will terminate their session state, causing application failures. To avoid this, many applications routinely generate keepalive traffic for the primary (or sole) purpose of maintaining state with such middleboxes. Applications can reduce such application keepalive traffic by using PCP.
To use PCP for this function, the application first connects to its server, as normal. Afterwards, it issues a PCP request with the PEER4 or PEER6 OpCode as described in Section 9. The PEER4 OpCode is used if from the host's point of view it is using IPv4 for its communication to its peer; PEER6 if from the host's point of view it is using IPv6 (e.g., a host behind NAT64 would use PEER6 because from that host's point of view it is using IPv6). The same 5-tuple as used for the connection to the server is placed into the PEER4 or PEER6 payload.
The following pseudo-code shows how PCP can be reliably used with a dynamic socket, for the purposes of reducing application keepalive messages:
    int s = socket(...);
    connect(s, &remote_peer, ...);

    getsockname(s, &internal_sockaddr, ...);
    bzero(&external_sockaddr, sizeof(external_sockaddr));

    while (1)
        {
        /* Note: the "time_to_send_pcp_request()" check below includes:
         * 1. Sending the first request
         * 2. Retransmitting requests due to packet loss
         * 3. Resending a request due to impending lease expiration
         * The PCP packet sent is identical in all cases, apart from the
         * Suggested External Address and Port which may change over time
         */
        if (time_to_send_pcp_request())
            pcp_send_peer_request(internal_sockaddr.sin_port,
                internal_sockaddr.sin_addr,
                &external_sockaddr, /* will be zero the first time */
                remote_peer, requested_lifetime, &assigned_lifetime);

        if (data_to_send())
            send_it(s);

        if (other_work_to_do())
            do_it();

        /* ... */

        block_until_we_need_to_do_something_else();
        }
After a NAT loses state (e.g., because of a crash or power failure), it is useful for clients to re-establish TCP mappings on the NAT. This allows servers on the Internet to see traffic from the same IP address and port, so that sessions can be resumed exactly where they were left off. This can be useful for long-lived connections (e.g., instant messaging) or for connections transferring a lot of data (e.g., FTP). This can be accomplished by establishing a TCP connection normally, then sending a PEER request and remembering the External Address and External Port returned in the PEER response. Later, when the NAT has lost state, the client can send a PEER request with the Suggested External Port and Suggested External Address remembered from the previous session, which will create a mapping in the NAT that functions exactly as an implicit dynamic mapping. The client then resumes sending TCP data to the server.
This section defines two OpCodes, MAP4 and MAP6, which control forwarding from a NAT (or firewall) to an Internal Host.
All compliant PCP Servers MUST support one or both MAP opcodes, appropriate to the address families they support (e.g., a traditional NAT44 gateway is not required to support MAP6). PCP Servers SHOULD provide a configuration option to allow administrators to disable MAP support if they wish.
The internal address is the source IP address of the PCP request message itself, unless the THIRD_PARTY Option is used.
Mappings created by PCP MAP requests are, by definition, Endpoint Independent Mappings with Endpoint Independent Filtering (unless the FILTER Option is used), even on a NAT that usually creates Endpoint Dependent Mappings or Endpoint Dependent Filtering for outgoing connections, since the purpose of an (unfiltered) MAP mapping is to receive inbound traffic from any remote endpoint, not from only one specific remote endpoint.
Note also that all NAT mappings (created by PCP or otherwise) are by necessity bidirectional and symmetric. For any packet going in one direction (in or out) that is translated by the NAT, a reply going in the opposite direction needs to have the corresponding opposite translation done so that the reply arrives at the right endpoint. This means that if a client creates a MAP mapping, and then later sends an outgoing packet using the mapping's internal source port, the NAT should translate that packet's Internal Address and Port to the mapping's External Address and Port, so that replies addressed to the External Address and Port are correctly translated to the mapping's Internal Address and Port.
The operation of the MAP OpCodes is described in this section.
The two MAP OpCodes (MAP4, MAP6) share a similar packet layout for both requests and responses. Because of this similarity, they are shown together. For both of the MAP OpCodes, if the assigned External IP address and assigned External Port in the PCP response always match the Internal IP Address and Port in the PCP request, then the functionality is purely a firewall; otherwise it pertains to a network address translator which might also perform firewall-like functions.
The following diagram shows the format of the OpCode-specific information in a request for the MAP4 and MAP6 OpCodes.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Protocol    |          Reserved (24 bits)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Internal Port          |    Suggested External Port    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |        Suggested External IP Address (always 128 bits)       |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
These fields are described below:
The following diagram shows the format of OpCode-specific information in a response packet for the MAP4 and MAP6 OpCodes:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Protocol    |          Reserved (24 bits)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Internal Port          |    Assigned External Port     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |        Assigned External IP Address (always 128 bits)        |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
These fields are described below:
In addition to the general PCP result codes (Section 5.4), the following additional result codes may be returned as a result of the two MAP OpCodes received by the PCP server. Each error code below is classified as either a 'long lifetime' error or a 'short lifetime' error, which provides guidance to PCP server developers for the value of the Lifetime field for these errors. It is RECOMMENDED that short lifetime errors use a 30-second lifetime and long lifetime errors use a 30-minute lifetime.
Additional result codes may be returned if the THIRD_PARTY Option is used, see Section 10.1.
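The OpCode-specific MAP layout shown in the diagrams earlier in this section occupies 24 octets. The following C fragment is a minimal sketch of serializing it, under two assumptions not stated in this section: multi-octet fields are sent in network byte order, and the caller already holds the address in its 128-bit form. The structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PCP_MAP_PAYLOAD_LEN 24   /* 4 + 2 + 2 + 16 octets, per the diagram */

/* Hypothetical in-memory form of the OpCode-specific MAP fields. */
struct pcp_map_fields {
    uint8_t  protocol;          /* assumed to be an IANA protocol number */
    uint16_t internal_port;     /* host byte order */
    uint16_t external_port;     /* suggested (request) or assigned (response) */
    uint8_t  external_addr[16]; /* always 128 bits on the wire */
};

/* Serialize into buf, which must hold PCP_MAP_PAYLOAD_LEN octets.
 * Assumption: multi-octet fields use network byte order. */
size_t pcp_map_encode(const struct pcp_map_fields *m, uint8_t *buf)
{
    assert(m != NULL && buf != NULL);
    memset(buf, 0, PCP_MAP_PAYLOAD_LEN);          /* zeroes the Reserved bits */
    buf[0] = m->protocol;
    buf[4] = (uint8_t)(m->internal_port >> 8);
    buf[5] = (uint8_t)(m->internal_port & 0xff);
    buf[6] = (uint8_t)(m->external_port >> 8);
    buf[7] = (uint8_t)(m->external_port & 0xff);
    memcpy(buf + 8, m->external_addr, 16);
    return PCP_MAP_PAYLOAD_LEN;
}
```

Because requests and responses share the same shape, the same routine could serve both directions, with the external fields read as "suggested" in one case and "assigned" in the other.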
This section and Section 8.6 describe the operation of a PCP client when sending requests with OpCodes MAP4 and MAP6.
The request MAY contain values in the Suggested External Port and Suggested External IP Address fields. This allows the PCP client to attempt to rebuild lost state on the PCP server, which improves the chances of existing connections surviving, and helps the PCP client avoid having to change information maintained at its rendezvous server. Of course, due to other activity on the network (e.g., by other users or network renumbering), the PCP server may not be able to grant the suggested External IP Address and Port, and in that case it will allocate a different External IP Address and Port.
An existing mapping can have its lifetime extended by the PCP client. To do this, the PCP client sends a new MAP request indicating the internal port. The PCP MAP request SHOULD also include the currently allocated external IP address and port as the suggested external IP address and port, so that if the NAT gateway has lost state it can recreate the lost mapping with the same parameters.
The PCP client SHOULD renew the mapping before its expiry time, otherwise it will be removed by the PCP server (see Section 8.6). In order to prevent excessive PCP chatter, it is RECOMMENDED to send a single renewal request packet when a mapping is halfway to expiration time, then, if no SUCCESS result is received, another single renewal request 3/4 of the way to expiration time, and then another at 7/8 of the way to expiration time, and so on, subject to the constraint that renewal requests MUST NOT be sent less than four seconds apart (a PCP client MUST NOT send a flood of ever-closer-together requests in the last few seconds before a mapping expires).
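The renewal schedule above amounts to waiting half of the time still remaining before each attempt, with a floor of four seconds. A minimal sketch in C follows; the function name and the choice to pass the remaining lifetime in whole seconds are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PCP_MIN_RENEWAL_GAP 4u   /* seconds, from the MUST NOT above */

/* Seconds to wait before the next renewal attempt, given the seconds
 * remaining until the mapping expires at the time this is called.
 * Halving the remaining time yields attempts at 1/2, 3/4, 7/8, ... of
 * the lifetime; the floor keeps attempts at least four seconds apart. */
uint32_t pcp_renewal_delay(uint32_t remaining)
{
    uint32_t delay = remaining / 2;
    return delay < PCP_MIN_RENEWAL_GAP ? PCP_MIN_RENEWAL_GAP : delay;
}
```

For a 3600-second lifetime this gives waits of 1800, 900, and 450 seconds, which correspond to attempts at 1/2, 3/4, and 7/8 of the lifetime. Once fewer than four seconds remain, the returned delay exceeds the remaining lifetime, so a caller would stop renewing and treat the mapping as expired.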
This section and Section 8.6 describe the operation of a PCP server when processing a request with the OpCodes MAP4 or MAP6. Processing SHOULD be performed in the order of the following paragraphs.
If the requested lifetime is non-zero, it indicates a request to create a mapping or extend the lifetime of an existing mapping. However, if the request also contains Internal Port equal to 0 or Protocol equal to 0, the server MUST generate a MALFORMED_REQUEST error.
If the requested lifetime is zero, it indicates a request to delete an existing mapping or set of mappings.
Processing of the lifetime is described in Section 8.6.
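The first decisions a server makes on a MAP request, as described above, can be sketched as a small classification step. The enum below is a local illustration only; its members name the outcomes in the text and do not represent on-the-wire result code values.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative outcomes of the initial checks; not wire values. */
enum pcp_map_action {
    PCP_ACT_MALFORMED_REQUEST,   /* reply with MALFORMED_REQUEST */
    PCP_ACT_CREATE_OR_REFRESH,   /* non-zero lifetime, valid fields */
    PCP_ACT_DELETE               /* zero lifetime */
};

enum pcp_map_action pcp_map_classify(uint32_t lifetime, uint8_t protocol,
                                     uint16_t internal_port)
{
    if (lifetime == 0)
        return PCP_ACT_DELETE;   /* port 0 is allowed here ("all ports") */
    if (internal_port == 0 || protocol == 0)
        return PCP_ACT_MALFORMED_REQUEST;
    return PCP_ACT_CREATE_OR_REFRESH;
}
```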
If the PCP-controlled device is stateless (that is, it does not establish any per-flow state, and simply rewrites the address and/or port in a purely algorithmic fashion), the PCP server simply returns an answer indicating the external IP address and port yielded by this stateless algorithmic translation. This allows the PCP client to learn its external IP address and port as seen by remote peers. Examples of stateless translators include stateless NAT64 and 1:1 NAT44, both of which modify addresses but not port numbers.
If an Option with value less than 128 exists (i.e., mandatory to process) but that Option does not make sense (e.g., the PREFER_FAILURE Option is included in a request with lifetime=0), the request is invalid and generates a MALFORMED_OPTION error.
If a mapping already exists for the requested Internal Address and Port, the PCP server MUST refresh the lifetime of that already-existing mapping, and return the already-existing External Address and Port in its response.
If no mapping already exists for the requested Internal Address and Port, and the PCP server is able to create a mapping using the Suggested External Address and Port, it SHOULD do so. This is beneficial for re-establishing state lost when the PCP server loses its state (e.g., due to a reboot). If the PCP server cannot allocate the Suggested External Address and Port but can allocate some other External Address and Port (and the request did not contain the PREFER_FAILURE Option) the PCP server MUST do so and return the newly allocated External Address and Port in the response. Cases where a NAT gateway cannot allocate the Suggested External Address and Port include:
By default, a PCP-controlled device MUST NOT create mappings for a protocol not indicated in the request. For example, if the request was for a TCP mapping, a UDP mapping MUST NOT be created.
If the THIRD_PARTY Option is not present in the request, the source IP address of the PCP packet is used as the Internal Address for the mapping. If the THIRD_PARTY Option is present, the PCP server validates that the client is authorized to make mappings on behalf of the indicated Internal IP Address. This validation depends on the PCP deployment scenario; see Section 13.3 for an example validation procedure. If the source IP address of the PCP request is not authorized to make mappings on behalf of the indicated Internal IP Address, an error response MUST be generated with result code NOT_AUTHORIZED.
Mappings typically consume state on the PCP-controlled device, and it is RECOMMENDED that a per-host and/or per-subscriber limit be enforced by the PCP server to prevent exhausting the mapping state. If this limit is exceeded, the result code USER_EX_QUOTA is returned.
If all of the preceding operations were successful (did not generate an error response), then the requested mapping is created or refreshed as described in the request and a SUCCESS response is built. This SUCCESS response contains the same OpCode as the request, but with the "R" bit set.
This section describes the operation of the PCP client when it receives a PCP response for the OpCodes MAP4 or MAP6.
After performing common PCP response processing, the response is further matched with an outstanding request by comparing the protocol, internal IP address, and internal port. On an error response, the assigned external address and assigned external port can also be used to match the responses (which is useful if several requests with the PREFER_FAILURE Option are outstanding). Other fields are not compared, because the PCP server sets those fields.
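As a minimal sketch of the matching rule above, the following C fragment compares only the protocol, internal IP address, and internal port when looking for the outstanding request that a response belongs to. The key structure and function name are hypothetical, and the caller is assumed to have already filled in the internal address (it is not part of the OpCode-specific MAP fields).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key holding only the fields the client compares. */
struct pcp_map_key {
    uint8_t  protocol;
    uint8_t  internal_addr[16];
    uint16_t internal_port;
};

/* Return the index of the outstanding request that the response
 * matches, or -1 if none does. Lifetime and external fields are
 * deliberately ignored because the server sets them. */
int pcp_match_map_response(const struct pcp_map_key *pending, size_t n,
                           const struct pcp_map_key *resp)
{
    assert(resp != NULL && (pending != NULL || n == 0));
    for (size_t i = 0; i < n; i++) {
        if (pending[i].protocol == resp->protocol &&
            pending[i].internal_port == resp->internal_port &&
            memcmp(pending[i].internal_addr, resp->internal_addr, 16) == 0)
            return (int)i;
    }
    return -1;
}
```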
On a successful response, the PCP client can use the External IP Address and Port as desired. Typically the PCP client will communicate the External IP Address and Port to another host on the Internet using an application-specific rendezvous mechanism such as DNS SRV records.
The PCP client MUST also set a timer or otherwise schedule an event to renew the mapping before its lifetime expires. Renewing a mapping is performed by sending another MAP request, exactly as described above in Section 8.3, except that the Suggested External Address and Port SHOULD be set to the values received in the successful response. This allows the same mapping to be recreated in the event of PCP server state loss. From the PCP server's point of view a MAP request to renew a mapping is identical to a MAP request to request a new mapping, and is handled identically. Indeed, in the event of PCP server state loss, a renewal request from a PCP client will appear to the server to be a request for a new mapping, with a particular Suggested External Address and Port, which happens to be what the PCP server previously allocated. See also Section 11.3.2.
On an error response, clients SHOULD NOT repeat the same request to the same PCP server within the lifetime returned in the response.
The PCP client requests a certain lifetime, and the PCP server responds with the assigned lifetime. The PCP server MAY grant a lifetime smaller or larger than the requested lifetime. The PCP server SHOULD be configurable for permitted minimum and maximum lifetime, and the RECOMMENDED values are 120 seconds for the minimum value and 24 hours for the maximum. It is RECOMMENDED that the server be configurable to restrict lifetimes to less than 24 hours, because mappings will consume ports even if the Internal Host is no longer interested in receiving the traffic or no longer connected to the network. These recommendations are not strict, and deployments should evaluate the trade-offs to determine their own minimum and maximum lifetime values.
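One simple way a server could apply its configured bounds is to clamp the requested lifetime, as in the following sketch. Clamping is only one possible policy, since the server MAY grant any lifetime it chooses; the names below are illustrative, and the defaults are the RECOMMENDED values from the paragraph above.

```c
#include <assert.h>
#include <stdint.h>

#define PCP_DEFAULT_MIN_LIFETIME 120u                /* seconds */
#define PCP_DEFAULT_MAX_LIFETIME (24u * 60u * 60u)   /* 24 hours */

/* Clamp a non-zero requested lifetime into [min_lt, max_lt].
 * A requested lifetime of zero is a deletion and is handled
 * elsewhere, so it must not be passed to this function. */
uint32_t pcp_assign_lifetime(uint32_t requested, uint32_t min_lt,
                             uint32_t max_lt)
{
    assert(requested != 0);
    assert(min_lt <= max_lt);
    if (requested < min_lt)
        return min_lt;
    if (requested > max_lt)
        return max_lt;
    return requested;
}
```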
Once a PCP server has responded positively to a mapping request for a certain lifetime, the port forwarding is active for the duration of the lifetime unless the lifetime is reduced by the PCP client (to a shorter lifetime or to zero) or until the PCP server loses its state (e.g., crashes). Mappings created by PCP MAP requests are not special or different from mappings created in other ways. In particular, it is implementation-dependent if outgoing traffic extends the lifetime of such mappings beyond the PCP-assigned lifetime. PCP clients MUST NOT depend on this behavior to keep mappings active, and MUST explicitly renew their mappings as required by the Lifetime field in PCP response messages.
If the requested lifetime is zero (lifetime==0) then:
The suggested external address and port fields are ignored in requests where the requested lifetime is 0.
If the PCP client attempts to delete a single static mapping (i.e., a mapping created outside of PCP itself), the error NOT_AUTHORIZED is returned. If the PCP client attempts to delete an implicit dynamic mapping, the PCP server deletes the mapping and filtering and responds with the SUCCESS result code. If the PCP client attempts to delete a mapping that does not exist, the SUCCESS result code is returned (this is necessary for PCP to be idempotent). If the PCP MAP request was for port=0 (indicating 'all ports'), the PCP server deletes all of the explicit dynamic mappings it can (but not any implicit or static mappings), and returns a SUCCESS response. If the deletion request was properly formatted and successfully processed, a SUCCESS response is generated with lifetime of 0 and the server copies the protocol and internal port number from the request into the response. An explicit dynamic mapping MUST NOT have its lifetime reduced by transport protocol messages (e.g., TCP RST, TCP FIN).
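The single-mapping deletion cases above can be summarized as a decision keyed on the kind of mapping found for the requested protocol and internal port. The following C fragment is a sketch under that reading; the enums and structure are hypothetical, and the enum members are symbolic rather than on-the-wire result code values.

```c
#include <assert.h>

/* What the server found for the requested protocol and internal port. */
enum pcp_mapping_kind {
    PCP_MAPPING_NONE,
    PCP_MAPPING_STATIC,
    PCP_MAPPING_IMPLICIT_DYNAMIC,
    PCP_MAPPING_EXPLICIT_DYNAMIC
};

enum pcp_result { PCP_RES_SUCCESS, PCP_RES_NOT_AUTHORIZED };

struct pcp_delete_outcome {
    enum pcp_result result;
    int remove_mapping;          /* non-zero: delete mapping and filtering */
};

struct pcp_delete_outcome pcp_delete_single(enum pcp_mapping_kind kind)
{
    struct pcp_delete_outcome out = { PCP_RES_SUCCESS, 0 };
    switch (kind) {
    case PCP_MAPPING_STATIC:     /* created outside PCP: refuse */
        out.result = PCP_RES_NOT_AUTHORIZED;
        break;
    case PCP_MAPPING_IMPLICIT_DYNAMIC:
    case PCP_MAPPING_EXPLICIT_DYNAMIC:
        out.remove_mapping = 1;
        break;
    case PCP_MAPPING_NONE:       /* SUCCESS keeps deletion idempotent */
        break;
    }
    return out;
}
```

The "all ports" case (internal port 0) is not covered here; per the text it removes explicit dynamic mappings only.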
An application that forgets its PCP-assigned mappings (e.g., the application or OS crashes) will request new PCP mappings. This may consume port mappings, if the application binds to a different Internal Port every time it runs. The application will also likely initiate new implicit dynamic mappings without using PCP, which will also consume port mappings. If there is a port mapping quota for the Internal Host, frequent restarts such as this may exhaust the quota. PCP provides some protections against such port consumption: When a PCP client first acquires a new IP address (e.g., reboots or joins a new network), it SHOULD remove mappings that may already be instantiated for that new Internal Address. To do this, the PCP client sends a MAP request with protocol, internal port, and lifetime set to 0. Some port mapping APIs (e.g., the "DNSServiceNATPortMappingCreate" API provided by Apple's Bonjour on Mac OS X, iOS, Windows, Linux [Bonjour]) automatically monitor for process exit (including application crashes) and automatically send port mapping deletion requests if the process that requested them goes away without explicitly relinquishing them.
To reduce unwanted traffic and data corruption, External UDP and TCP ports SHOULD NOT be re-used for an interval (TIME_WAIT interval [RFC0793]). However, the PCP server SHOULD allow the previous user of the External Port to re-acquire the same port during that interval.
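A server could implement this hold-down by recording, for each released External Port, who last held it and when it was released. The sketch below assumes such a record exists; the structure, the function name, and the use of a caller-supplied hold interval in seconds are all illustrative choices, not requirements of this document.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record kept for a recently released External Port. */
struct pcp_released_port {
    uint8_t  prev_owner[16];   /* Internal Address of the previous user */
    uint64_t released_at;      /* seconds, on any monotonic clock */
};

/* Non-zero if 'requester' may be given the port at time 'now'.
 * The previous user may re-acquire it at any time; everyone else
 * has to wait until hold_secs have elapsed since the release. */
int pcp_port_may_be_reused(const struct pcp_released_port *p,
                           const uint8_t requester[16],
                           uint64_t now, uint64_t hold_secs)
{
    assert(p != NULL && requester != NULL && now >= p->released_at);
    if (now - p->released_at >= hold_secs)
        return 1;
    return memcmp(p->prev_owner, requester, 16) == 0;
}
```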
As a side-effect of creating a mapping, ICMP messages associated with the mapping MUST be forwarded (and also translated, if appropriate) for the duration of the mapping's lifetime. This is done to ensure that ICMP messages can still be used by hosts, without application programmers or PCP client implementations needing to signal PCP separately to create ICMP mappings for those flows.
The customer premises router might obtain a new IP address. This can occur because of a variety of reasons including a reboot, power outage, DHCP lease expiry, or other action by the ISP. If this occurs, traffic forwarded to the host's previous address might be delivered to another host which now has that address. This affects both implicit dynamic mappings and explicit dynamic mappings. However, this same problem already occurs today when a host's IP address is re-assigned, without PCP and without an ISP-operated CGN. The solution is the same as today: the problems associated with host renumbering are caused by host renumbering and are eliminated if host renumbering is avoided. PCP defined in this document does not provide machinery to reduce the host renumbering problem.
When an Internal Host changes its IP address (e.g., by having a different address assigned by the DHCP server) the NAT (or firewall) will continue to send traffic to the old IP address. Typically, the Internal Host will no longer receive traffic sent to that old IP address. Assuming the Internal Host wants to continue receiving traffic, it needs to install new mappings for its new IP address. The suggested external port field will not be fulfilled by the PCP server, in all likelihood, because it is still being forwarded to the old IP address. Thus, a mapping is likely to be assigned a new external port number and/or public IP address. Note that such host renumbering is not expected to happen routinely on a regular basis for most hosts, since most hosts renew their DHCP leases before they expire (or re-request the same address after reboot) and most DHCP servers honor such requests and grant the host the same address it was previously using before the reboot.
A host might gain or lose interfaces while existing mappings are active (e.g., Ethernet cable plugged in or removed, joining/leaving a WiFi network). Because of this, if the PCP client is sending a PCP request to maintain state in the PCP server, it SHOULD ensure those PCP requests continue to use the same interface (e.g., when refreshing mappings). If the PCP client is sending a PCP request to create new state in the PCP server, it MAY use a different source interface or different source address.
This section defines two OpCodes, PEER4 and PEER6, for controlling dynamic connections.
The use of these OpCodes is described in this section.
All compliant PCP Servers MUST support one or both PEER opcodes, appropriate to the address families they support (e.g., a traditional NAT44 gateway is not required to support PEER6). PCP Servers SHOULD provide a configuration option to allow administrators to disable PEER support if they wish.
Note that mappings created or managed using PCP PEER requests may be Endpoint Independent Mappings or Endpoint Dependent Mappings, with Endpoint Independent Filtering or Endpoint Dependent Filtering, consistent with the existing behavior of the NAT gateway or firewall in question for implicit mappings it creates automatically as a result of observing outgoing traffic from Internal Hosts.
The PEER OpCodes provide the ability for the PCP client to create, query and (possibly) extend the lifetime of a mapping and its associated filtering.
The two PEER OpCodes (PEER4 and PEER6) share a similar packet layout for both requests and responses. Because of this similarity, they are shown together.
The following diagram shows the request packet format for PEER4 and PEER6. This packet format is aligned with the response packet format:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Protocol    |          Reserved (24 bits)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Internal Port          |    Suggested External Port    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Remote Peer Port        |      Reserved (16 bits)       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |           Remote Peer IP Address (always 128 bits)           |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |        Suggested External IP address (always 128 bits)       |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
These fields are described below:
When attempting to re-create a lost mapping, the Suggested External IP Address and Port are set to the External IP Address and Port fields received in a previous PEER response from the PCP server. On an initial PEER request, the Suggested External IP Address and Port are set to zero.
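The PEER request layout shown above occupies 44 octets. The following C fragment sketches its serialization, including the rule that the external fields are zero on an initial request. It assumes network byte order for multi-octet fields and addresses already held in 128-bit form; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PCP_PEER_PAYLOAD_LEN 44   /* 4 + 4 + 4 + 16 + 16 octets */

/* Hypothetical in-memory form of the OpCode-specific PEER fields. */
struct pcp_peer_fields {
    uint8_t  protocol;
    uint16_t internal_port;
    uint16_t remote_port;
    uint8_t  remote_addr[16];
    uint16_t external_port;       /* zero on an initial request */
    uint8_t  external_addr[16];   /* all zero on an initial request */
};

static void put_u16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);     /* assumption: network byte order */
    p[1] = (uint8_t)(v & 0xff);
}

size_t pcp_peer_encode(const struct pcp_peer_fields *f, uint8_t *buf)
{
    assert(f != NULL && buf != NULL);
    memset(buf, 0, PCP_PEER_PAYLOAD_LEN);   /* zeroes both Reserved fields */
    buf[0] = f->protocol;
    put_u16(buf + 4, f->internal_port);
    put_u16(buf + 6, f->external_port);
    put_u16(buf + 8, f->remote_port);
    memcpy(buf + 12, f->remote_addr, 16);
    memcpy(buf + 28, f->external_addr, 16);
    return PCP_PEER_PAYLOAD_LEN;
}
```

To re-create a lost mapping, a client would copy the External IP Address and Port from the earlier PEER response into external_addr and external_port before encoding.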
The following diagram shows the response packet format for PEER4 and PEER6:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Protocol    |          Reserved (24 bits)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Internal Port          |         External Port         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Remote Peer Port        |      Reserved (16 bits)       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |           Remote Peer IP Address (always 128 bits)           |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |            External IP Address (always 128 bits)             |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
In addition to the general PCP result codes (Section 5.4), the PCP server may return the same result codes for PEER OpCodes as for MAP OpCodes (see Section 8.2).
This section describes the operation of a client when generating the OpCodes PEER4 or PEER6.
The PEER4 or PEER6 OpCodes MAY be sent before or after establishing bi-directional communication with the remote peer. If sent before, PEER4 or PEER6 OpCodes will create a mapping in the PCP-controlled device. If sent after, the PEER4 or PEER6 OpCodes query the state of the implicit dynamic mapping, recreate the implicit dynamic mapping if it has been lost, and possibly modify its lifetime (for the purpose described in Section 7.3).
The PEER4 and PEER6 OpCodes contain a description of the remote peer address, from the perspective of the PCP client. Note that when the PCP-controlled device is performing address family translation (NAT46 or NAT64), the remote peer address from the perspective of the PCP client is different from the remote peer address on the other side of the address family translation device.
This section describes the operation of a server when receiving a request with the OpCode PEER4 or PEER6. Processing SHOULD be performed in the order of the following paragraphs.
On receiving the PEER4 or PEER6 OpCode, the PCP server examines the mapping table. If the requested mapping does not yet exist, it is created, and the Suggested External Address and Port are honored (if possible; if not possible, a mapping to a different External Address and Port is created). By having PEER create such a mapping, we avoid a race condition between the PEER request or the initial outgoing packet arriving at the NAT gateway first, and allow PEER to be used to recreate an implicit dynamic mapping (see last paragraph of Section 11.3.1).
The PEER4 or PEER6 OpCode MAY reduce the lifetime of an existing mapping; this is implementation-dependent.
If the PCP-controlled device can extend the lifetime of a mapping, the PCP server uses the smaller of its configured maximum lifetime value and the requested lifetime from the PEER request, and sets the lifetime to that value.
If all of the preceding operations were successful (did not generate an error response), then a SUCCESS response is generated, with the Lifetime field containing the lifetime of the mapping.
After a successful PEER response is sent, it is implementation-specific if the PCP-controlled device destroys the mapping when the lifetime expires, or if the PCP-controlled device's implementation allows traffic to keep the mapping alive. Thus, if the PCP client wants the mapping to persist beyond the lifetime, it MUST refresh the mapping (by sending another PEER message) prior to the expiration of the lifetime. If the mapping is terminated by the TCP client or server (e.g., TCP FIN or TCP RST), the mapping will be destroyed normally; the mapping will not persist for the time indicated by Lifetime. This means the Lifetime in a PEER response indicates how long the mapping will persist in the absence of a transport termination message (e.g., TCP RST).
This section describes the operation of a client when processing a response with the OpCode PEER4 or PEER6.
After performing common PCP response processing, the response is further matched with a request by comparing the protocol, internal IP address, internal port, remote peer address and remote peer port. Other fields are not compared, because the PCP server changes those fields to provide information about the mapping created by the OpCode.
On a successful response, the application can use the assigned lifetime value to reduce its frequency of application keepalives for that particular NAT mapping. Of course, there may be other reasons, specific to the application, to use more frequent application keepalives. For example, the PCP assigned lifetime could be one hour but the application may want to maintain state on its server (e.g., "busy" / "away") more frequently than once an hour.
If the PCP client wishes to keep this mapping alive beyond the indicated lifetime, it SHOULD issue a new PCP request prior to the expiration. That is, inside->outside traffic is not sufficient to ensure the mapping will continue to exist. It is RECOMMENDED to send a single renewal request packet when a mapping is halfway to expiration time, then, if no SUCCESS response is received, another single renewal request 3/4 of the way to expiration time, and then another at 7/8 of the way to expiration time, and so on, subject to the constraint that renewal requests MUST NOT be sent less than four seconds apart (a PCP client MUST NOT send a flood of ever-closer-together requests in the last few seconds before a mapping expires).
This section describes Options for the MAP4, MAP6, PEER4 and PEER6 OpCodes. These Options MUST NOT appear with other OpCodes, unless permitted by those OpCodes.
This Option is used when a PCP client wants to control a mapping to an Internal Host other than itself. This is used with both MAP and PEER OpCodes.
A THIRD_PARTY Option MUST NOT contain the same address as the source address of the packet. A PCP server receiving a THIRD_PARTY Option specifying the same address as the source address of the packet MUST return a MALFORMED_REQUEST result code. This is because many PCP servers may not implement the THIRD_PARTY Option at all, and a client using the THIRD_PARTY Option to specify the same address as the source address of the packet will cause mapping requests to fail where they would otherwise have succeeded.
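The server-side check described above reduces to comparing the address carried in the THIRD_PARTY Option with the source address of the request. A minimal sketch follows, assuming both addresses have already been brought into the same 128-bit representation; the enum and function names are illustrative, and the enum members are not on-the-wire values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative outcome of the THIRD_PARTY address check. */
enum pcp_tp_check { PCP_TP_OK, PCP_TP_MALFORMED_REQUEST };

/* Reject a THIRD_PARTY Option that merely repeats the source address. */
enum pcp_tp_check pcp_third_party_check(const uint8_t src_addr[16],
                                        const uint8_t third_party_addr[16])
{
    assert(src_addr != NULL && third_party_addr != NULL);
    if (memcmp(src_addr, third_party_addr, 16) == 0)
        return PCP_TP_MALFORMED_REQUEST;
    return PCP_TP_OK;
}
```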
Where possible, it may be beneficial if a client using the THIRD_PARTY Option to create and maintain mappings on behalf of some other device can take steps to verify that the other device is still present and active on the network. Otherwise the client using the THIRD_PARTY Option to maintain mappings on behalf of some other device risks maintaining those mappings forever, long after the device that required them has gone. This would defeat the purpose of PCP mappings having a finite lifetime so that they can be automatically deleted after they are no longer needed.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |            Internal IP Address (always 128 bits)             |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The fields are described below:
This Option:
The following additional result code may be returned as a result of using this Option.
A PCP server MAY be configured to permit or to prohibit the use of the THIRD_PARTY Option. If this Option is permitted, properly authorized clients may perform these operations on behalf of other hosts. If this Option is prohibited, and a PCP server receives a PCP MAP request with a THIRD_PARTY Option, it MUST generate an UNAUTH_THIRD_PARTY_INTERNAL_ADDRESS response.
It is RECOMMENDED that customer premises equipment implementing a PCP Server be configured to prohibit third party mappings by default. With this default, if a user wants to create a third party mapping, the user needs to interact out-of-band with their customer premises router (e.g., using its HTTP administrative interface).
It is RECOMMENDED that service provider NAT and firewall devices implementing a PCP Server be configured to permit the THIRD_PARTY Option, when sent by a properly authorized host. If the packet arrives from an unauthorized host, the PCP server MUST generate an UNAUTH_THIRD_PARTY_INTERNAL_ADDRESS error.
Determining which PCP clients are authorized to use the THIRD_PARTY Option for which other hosts is deployment-dependent. For example, an ISP using Dual-Stack Lite could choose to allow a client connecting over a given IPv6 tunnel to manage mappings for any other host connecting over the same IPv6 tunnel, or the ISP could choose to allow only the DS-Lite B4 element to manage mappings for other hosts connecting over the same IPv6 tunnel. A cryptographic authentication and authorization model is outside the scope of this specification. Note that the THIRD_PARTY Option is not needed for today's common scenario of an ISP offering a single IP address to a customer who is using NAT to share that address locally, since in this scenario all the customer's hosts appear to be a single host from the point of view of the ISP.
A PCP client can delete all PCP-created explicit dynamic mappings (i.e., those created by PCP MAP requests) that it is authorized to delete by sending a PCP MAP request including a zero-length THIRD_PARTY Option.
This Option is only used with the MAP4 and MAP6 OpCodes.
This Option indicates that if the PCP server is unable to map the Suggested External Port, then rather than returning an external port that it can allocate, the PCP server should instead allocate no external port and return an error. The error returned would be a general MAP error (e.g., NOT_AUTHORIZED) or the result code specific to this Option, CANNOT_PROVIDE_EXTERNAL_PORT.
The result code CANNOT_PROVIDE_EXTERNAL_PORT is returned if the Suggested External Port cannot be mapped. This can occur because the External Port is already in use by another host's implicit dynamic mapping, an explicit dynamic mapping, or a static mapping, or because the same Internal Address and Port already has an implicit dynamic mapping to a different External Port than the one requested. The server MAY set the Lifetime in the response to the remaining lifetime of the conflicting mapping, rounded up to the next larger integer number of seconds.
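The conflict check and the lifetime rounding can be illustrated with a small Python sketch. The function name `prefer_failure_check` and the `ports_in_use` table (external port mapped to the remaining lifetime, in seconds, of the mapping that holds it) are assumptions made for this example. The sketch also assumes the server chooses to report the remaining lifetime, which the text above leaves optional.

```python
import math
from typing import Dict, Optional, Tuple


def prefer_failure_check(
    suggested_port: int, ports_in_use: Dict[int, float]
) -> Tuple[Optional[str], Optional[int]]:
    """Return (result_code, lifetime) for a request carrying PREFER_FAILURE.

    (None, None) means the suggested port is free and normal MAP
    processing can continue.
    """
    remaining = ports_in_use.get(suggested_port)
    if remaining is None:
        return (None, None)
    # Round the conflicting mapping's remaining lifetime up to whole seconds.
    return ("CANNOT_PROVIDE_EXTERNAL_PORT", math.ceil(remaining))
```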
This Option exists solely for use by UPnP IGD interworking [I-D.bpw-pcp-upnp-igd-interworking], where the semantics of UPnP IGD version 1 only allow the UPnP IGD client to dictate mapping a specific port. A PCP server MAY support this Option, if its designers wish to support downstream devices that perform UPnP IGD interworking. PCP servers MAY choose to rate-limit their handling of PREFER_FAILURE requests, to protect themselves from a rapid flurry of 65535 consecutive PREFER_FAILURE requests from clients probing to discover which external ports are available. PCP servers that are not intended to support downstream devices that perform UPnP IGD interworking are not required to support this Option. PCP clients other than UPnP IGD interworking clients SHOULD NOT use this Option because it results in inefficient operation, and they cannot safely assume that all PCP servers will implement it. It is anticipated that this Option will be deprecated in the future as more clients adopt PCP natively and the need for UPnP IGD interworking declines.
This Option indicates that filtering incoming packets is desired. The Remote Peer Port and Remote Peer IP Address indicate the permitted remote peer's source IP address and port for packets from the Internet. The remote peer prefix length indicates the length of the remote peer's IP address that is significant; this allows a single Option to permit an entire subnet. After processing this MAP request containing the FILTER Option and generating a successful response, the PCP-controlled device will drop packets received on its public-facing interface that don't match the filter fields. After dropping the packet, if its security policy allows, the PCP-controlled device MAY also generate an ICMP error in response to the dropped packet.
The FILTER packet layout is described below:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Reserved    | Prefix Length |       Remote Peer Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|           Remote Peer IP address (always 128 bits)            |
|                                                               |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
These fields are described below:
This Option:
The Prefix Length indicates how many bits of the IPv6 address or IPv4 address are used for the filter. For MAP4, a Prefix Length of 32 indicates the entire IPv4 address is used. For MAP6, a Prefix Length of 128 indicates the entire IPv6 address is used. For MAP4 the minimum Prefix Length value is 0 and the maximum value is 32. For MAP6 the minimum Prefix Length value is 0 and the maximum value is 128. Values outside those ranges cause a MALFORMED_OPTION result code.
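The range check on the Prefix Length, together with one possible way of serializing the FILTER fields, can be sketched in Python. The function names are hypothetical. The field widths (8-bit Reserved, 8-bit Prefix Length, 16-bit Remote Peer Port, followed by a 128-bit address) are an assumption about the FILTER layout, as is sending the Reserved field as zero. The sketch packs IPv6 addresses only; how an IPv4 address is placed in the 128-bit field is not covered here.

```python
import ipaddress
import struct
from typing import Optional

# Maximum Prefix Length per OpCode, as stated in the text above.
MAX_PREFIX_LENGTH = {"MAP4": 32, "MAP6": 128}


def validate_prefix_length(opcode: str, prefix_length: int) -> Optional[str]:
    """Return None if the Prefix Length is in range, else the error code."""
    if 0 <= prefix_length <= MAX_PREFIX_LENGTH[opcode]:
        return None
    return "MALFORMED_OPTION"


def pack_filter_fields(prefix_length: int, remote_port: int, remote_ipv6: str) -> bytes:
    """Pack Reserved | Prefix Length | Remote Peer Port | 128-bit address."""
    address = ipaddress.IPv6Address(remote_ipv6).packed  # 16 bytes
    # "!" selects network byte order; Reserved is sent as zero (assumption).
    return struct.pack("!BBH16s", 0, prefix_length, remote_port, address)
```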
If multiple occurrences of the FILTER Option exist in the same MAP request, they are processed in the order received (as per normal PCP Option processing), and the filters they request MAY overlap. If a mapping already exists (with or without a filter) and the server receives a MAP request with FILTER, the filters indicated in the new request are added to any existing filters. If a MAP request has a lifetime of 0 and contains the FILTER Option, the error MALFORMED_OPTION is returned.
If any occurrence of the FILTER Option in a request packet is not successfully processed, then an error is returned (e.g., MALFORMED_OPTION if one of the Options was malformed) and, as with other PCP errors, returning an error causes no state to be changed in the PCP server or in the PCP-controlled device.
To remove all existing filters, the Prefix Length 0 is used. There is no mechanism to remove a specific filter.
To change an existing filter, the PCP client sends a MAP request containing two FILTER Options, the first Option containing a Prefix Length of 0 (to delete all existing filters) and the second containing the new remote peer's IP address and port. Other FILTER Options in that PCP request, if any, add more allowed Remote Peers.
The PCP server or the PCP-controlled device is expected to have a limit on the number of remote peers it can support. This limit might be as small as one. If a MAP request would exceed this limit, the entire MAP request is rejected with the result code EXCESSIVE_REMOTE_PEERS, and the state on the PCP server is unchanged.
All PCP servers MUST support at least one filter per MAP mapping.
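The FILTER processing rules above (in-order processing, Prefix Length 0 clearing existing filters, the remote peer limit, and leaving state untouched on error) can be illustrated with the following Python sketch. The function name `apply_filter_options`, the tuple representation of a filter, and the `max_peers` parameter are assumptions made for this example, not part of the protocol.

```python
from typing import List, Optional, Tuple

# A filter is modelled as (prefix_length, remote_address, remote_port).
Filter = Tuple[int, str, int]


def apply_filter_options(
    existing: List[Filter],
    requested: List[Filter],
    lifetime: int,
    max_peers: int = 1,
) -> Tuple[Optional[str], List[Filter]]:
    """Return (result_code, filters); result_code None means no error.

    On any error the original filter list is returned unchanged.
    """
    if lifetime == 0 and requested:
        return ("MALFORMED_OPTION", existing)
    working = list(existing)  # work on a copy so errors change no state
    for candidate in requested:  # processed in the order received
        if candidate[0] == 0:
            working = []  # Prefix Length 0 removes all existing filters
        else:
            working.append(candidate)
    if len(working) > max_peers:
        return ("EXCESSIVE_REMOTE_PEERS", existing)
    return (None, working)
```

Working on a copy and returning the untouched `existing` list on failure is one simple way to keep the "no state changed on error" property.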
The use of the FILTER Option can be seen as a performance optimization. All software using PCP to receive incoming connections also has to deal with the case where it may be directly connected to the Internet and receive unrestricted incoming TCP connections and UDP packets. If such software wishes to restrict incoming traffic to a specific source address or group of source addresses, it therefore already needs to check the source address of incoming traffic and reject unwanted traffic. However, the FILTER Option is a particularly useful performance optimization for battery-powered wireless devices, because it can enable them to conserve battery power by not having to wake up just to reject unwanted traffic.
This section provides non-normative guidance that may be useful to implementors.
For implicit dynamic mappings, some existing NAT devices have endpoint-independent mapping (EIM) behavior while other NAT devices have endpoint-dependent mapping (EDM) behavior. NATs which have EIM behavior do not suffer from the problem described in this section. The IETF strongly encourages EIM behavior [RFC4787][RFC5382].
In such EDM NAT devices, the same external port may be used by an implicit dynamic mapping (from the same Internal Host or from a different Internal Host) and an explicit dynamic mapping. This complicates the interaction with the MAP4 and MAP6 OpCodes. With such NAT devices, there are two ways envisioned to implement the MAP4 and MAP6 OpCodes:
No matter if a NAT is EIM or EDM, it is possible that one (or more) implicit dynamic mappings, using the same internal port on the Internal Host, might be created before or after a MAP request. When this occurs, it is important that the NAT honor the Lifetime returned in the MAP response. Specifically, if a mapping was created with the MAP OpCode, the implementation needs to ensure that termination of an implicit dynamic mapping (e.g., via a TCP FIN handshake) does not prematurely destroy the MAP-created mapping. On a NAT that implements endpoint-independent mapping with endpoint-independent filtering, this could be implemented by extending the lifetime of the implicit dynamic mapping to the lifetime of the explicit dynamic mapping.
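One way to picture the lifetime rule is to track the implicit and explicit expiry times of a mapping separately and to treat the mapping as live until the later of the two. The Python sketch below is only an illustration of that idea; the `MappingEntry` class and its field names are hypothetical, and times are plain numbers of seconds on an arbitrary clock.

```python
from dataclasses import dataclass


@dataclass
class MappingEntry:
    """Hypothetical NAT table entry shared by implicit and explicit use."""
    explicit_expiry: float = 0.0  # end of the Lifetime returned to the MAP client
    implicit_expiry: float = 0.0  # end of the implicit dynamic mapping

    def on_implicit_termination(self, now: float) -> None:
        # E.g., a TCP FIN handshake: only the implicit part ends.
        self.implicit_expiry = now

    def is_active(self, now: float) -> bool:
        # The entry survives until the later of the two expiry times.
        return now < max(self.explicit_expiry, self.implicit_expiry)
```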
If an event occurs that causes the PCP server to lose explicit dynamic mapping state (such as a crash or power outage), the mappings created by PCP are lost. Such loss of state is rare in a service provider environment (due to redundant power, disk drives for storage, etc.), but more common in a residential NAT device which does not write this information to non-volatile memory. Of course, due to outright failure of service provider equipment (e.g., software malfunction), state may still be lost.
The Epoch allows a client to deduce when a PCP server may have lost its state. When the Epoch value is observed to be smaller than expected, the PCP client can attempt to recreate the mappings following the procedures described in this section.
A mapping renewal packet is formatted identically to an original mapping request; from the point of view of the client it is a renewal of an existing mapping, but from the point of view of a newly rebooted PCP server it appears as a new mapping request. In the normal process of routinely renewing its mappings before they expire, a PCP client will automatically recreate all its lost mappings.
When the PCP server loses state and begins processing new PCP messages, its Epoch is reset and begins counting again from zero (per the procedure of Section 6.5). As the result of receiving a packet where the Epoch field indicates that a reboot or similar loss of state has occurred, the client can renew its port mappings sooner, without waiting for the normal routine renewal time.
A PCP client refreshes a mapping by sending a new PCP request containing information from the earlier PCP response. The PCP server will respond indicating the new lifetime. It is possible, due to reconfiguration or failure of the PCP server, that the public IP address and/or public port, or the PCP server itself, has changed (due to a new route to a different PCP server). To detect such events more quickly, the PCP client may find it beneficial to use shorter lifetimes (so that it communicates with the PCP server more often). If the PCP client has several mappings, the Epoch value only needs to be retrieved for one of them to verify the PCP server has not lost explicit dynamic mapping state.
If the client wishes to check the PCP server's Epoch, it sends a PCP request for any one of the client's mappings. This will return the current Epoch value. In that request the PCP client could extend the mapping lifetime (by asking for more time) or maintain the current lifetime (by asking for the same number of seconds that it knows are remaining of the lifetime).
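A client-side check for "Epoch smaller than expected" could look like the following Python sketch. It assumes the Epoch counts seconds and that the client compares each received value against the previous value plus the locally elapsed time. The `EpochTracker` class and the `tolerance` parameter (slack for clock drift and network delay) are assumptions of this example; the authoritative procedure is the one referenced in Section 6.5.

```python
from typing import Optional


class EpochTracker:
    """Hypothetical helper that watches the Epoch of one PCP server."""

    def __init__(self, tolerance: float = 2.0) -> None:
        self.tolerance = tolerance
        self.last_epoch: Optional[float] = None
        self.last_seen: Optional[float] = None

    def observe(self, epoch: float, now: float) -> bool:
        """Record an Epoch from a response; True suggests server state loss."""
        lost = False
        if self.last_epoch is not None and self.last_seen is not None:
            expected = self.last_epoch + (now - self.last_seen)
            lost = epoch < expected - self.tolerance
        self.last_epoch, self.last_seen = epoch, now
        return lost
```

When `observe` returns `True`, the client would renew all of its mappings right away rather than waiting for the routine renewal time.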
If a PCP client changes its Internal IP Address (e.g., because the Internal Host has moved to a new network), and the PCP client wishes to still receive incoming traffic, it needs to create new mappings on that new network. New mappings will typically also require an update to the application-specific rendezvous server if the External Address or Port are different from the previous values (see Section 7.1 and Section 8.7).
As with implicit dynamic mappings created by outgoing TCP packets, explicit dynamic mappings created via PCP use the source IP address of the packet as the Internal Address for the mappings. Therefore ingress filtering [RFC2827] should be used on the path between the Internal Host and the PCP Server to prevent the injection of spoofed packets onto that path.
On PCP-controlled devices that create state when a mapping is created (e.g., NAT), the PCP server SHOULD maintain per-host and/or per-subscriber quotas for mappings. It is implementation-specific whether the PCP server uses separate quotas for implicit, explicit, and static mappings, a combined quota for all of them, or some other policy.
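As a non-normative illustration, a combined per-host quota (one of the policies mentioned above) might be tracked as in the Python sketch below. The class name `MappingQuota` and its methods are hypothetical, and the limit value is a deployment choice that this document does not specify.

```python
from collections import Counter


class MappingQuota:
    """Hypothetical combined per-host quota for all mapping types."""

    def __init__(self, per_host_limit: int) -> None:
        self.per_host_limit = per_host_limit
        self.in_use: Counter = Counter()

    def try_allocate(self, host: str) -> bool:
        """Count one more mapping for host; False if the quota is exhausted."""
        if self.in_use[host] >= self.per_host_limit:
            return False
        self.in_use[host] += 1
        return True

    def release(self, host: str) -> None:
        """Give back one mapping when it expires or is deleted."""
        if self.in_use[host] > 0:
            self.in_use[host] -= 1
```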
The PEER OpCode can create a mapping, which behaves exactly as if an implicit dynamic mapping had been created (e.g., by a TCP SYN). In that case, the security implications for PEER are similar to MAP, described below. When PEER is used to create, query or extend an existing mapping, it does not introduce any new security considerations, unless the THIRD_PARTY Option is included. Discussion of the THIRD_PARTY Option is below.
Internet service providers do not generally filter traffic from the Internet towards their subscribers (with the exception of wireless providers who are interested in protecting both their radio access network and their subscribers' battery lifetime). However, when an ISP introduces stateful address sharing with a NAT device, such filtering will occur as a side effect of the NAT device. Filtering occurs as a side effect of IPv4 NAT devices and may also occur with some IPv6 CPE devices [RFC6092]. Unlike the PEER OpCode, the MAP OpCode allows a PCP client to create a mapping so that a host can receive inbound traffic and operate a server. In some deployments the ability to accept connections from any host on the Internet may be considered a security issue. Security considerations for the MAP OpCode are described in the following sections.
Because of the state created in a NAT or firewall, a per-host and/or per-subscriber quota will likely exist for both implicit dynamic mappings and explicit dynamic mappings. A host might make an excessive number of implicit or explicit dynamic mappings, consuming an inordinate number of ports, causing a denial of service to other hosts. Thus, Section 12.2 recommends that hosts be limited to a reasonable number of explicit dynamic mappings.
An attacker on the path between the PCP client and PCP server can drop PCP requests, drop PCP responses, or spoof a PCP error, all of which will effectively deny service. Through such actions, the PCP client would not be aware that the PCP server might have actually processed the PCP request.
It is important to prevent a host from fraudulently creating, deleting, or refreshing a mapping (or filtering) for another host, because this can expose the other host to unwanted traffic and consume the other host's mapping quota. Both implicit and explicit dynamic mappings are created based on the source IP address in the packet, and hence depend on ingress filtering to guard against spoofed source IP addresses. Thus, PCP relies on the same ingress filtering as today's implicit dynamic mappings and PCP does not create a new requirement for ingress filtering.
The THIRD_PARTY Option contains an Internal Address field, which allows a PCP client to create, extend, or delete an implicit or explicit dynamic mapping for another host, as described in Section 10.1.
In most cases PCP Servers will reject all THIRD_PARTY requests.
The one scenario where it is currently envisaged that THIRD_PARTY will be used is for DS-Lite deployments where the B4 device implements a UPnP IGD Interworking gateway which handles IGD requests from clients on the local network and makes PCP mapping requests on their behalf, or the B4 device implements an administrative web-based interface to allow users to manually create mapping requests. In this case it is envisaged that the DS-Lite PCP server will be configured to allow only B4 devices to make THIRD_PARTY requests, and only on behalf of other Internal Hosts sharing the same DS-Lite IPv6 tunnel. Since the B4 device is itself the DS-Lite IPv6 tunnel endpoint, it is in a position to guard against spoofed packets being injected into that tunnel using the B4 device's IPv4 source address, so the DS-Lite PCP server can trust that packets received over the DS-Lite IPv6 tunnel with the B4 device's source IPv4 address did in fact originate from the B4 device.
In the time between when a PCP server loses state and the PCP client notices the lower than expected Epoch value, it is possible that the PCP client's mapping will be acquired by another host (via an explicit dynamic mapping or implicit dynamic mapping). This means incoming traffic will be sent to a different host ("theft"). A mechanism to immediately inform the PCP client of state loss would reduce this interval, but would not eliminate this threat. The PCP client can reduce this interval by using a relatively short lifetime; however, this increases the amount of PCP chatter. This threat is reduced by using persistent storage of explicit dynamic mappings in the PCP server (so it does not lose explicit dynamic mapping state), or by ensuring the previous external IP address and port cannot be used by another host (e.g., by using a different IP address pool).
IANA is requested to perform the following actions:
PCP will use port 5351 (currently assigned by IANA to NAT-PMP [I-D.cheshire-nat-pmp]). We request that IANA re-assign that same port number to PCP, and relinquish UDP port 44323.
[Note to RFC Editor: Please remove the text about relinquishing port 44323 prior to publication.]
IANA shall create a new protocol registry for PCP OpCodes, initially populated with the values in Section 8, Section 9, and the value 0 for the "no-op" operation PCP Rapid Recovery [I-D.cheshire-pcp-recovery]. The value 127 is reserved.
Additional OpCodes in the range 5-95 can be created via Specification Required [RFC5226], and the range 96-126 is for Private Use [RFC5226].
IANA shall create a new registry for PCP result codes, numbered 0-255, initially populated with the result codes from Section 5.4, Section 8.2, and Section 10.1. The values 0 and 255 are reserved.
Additional Result Codes can be defined via Specification Required [RFC5226].
IANA shall create a new registry for PCP Options, numbered 0-255 with an associated mnemonic. The values 0-127 are mandatory-to-process, and 128-255 are optional to process. The initial registry contains the Options described in Section 10. The Option values 127 and 255 are reserved.
Additional PCP Option codes in the ranges 5-63 and 128-191 can be created via Specification Required [RFC5226], and the ranges 64-126 and 192-254 are for Private Use [RFC5226].
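The Option code ranges requested above can be summarized with a small Python lookup. The function name `classify_option_code` is hypothetical, and codes 0-4, which the ranges quoted above do not assign to a policy, are simply reported as "initial assignment" here as an assumption of this sketch.

```python
from typing import Tuple


def classify_option_code(code: int) -> Tuple[str, str]:
    """Return (processing_class, registration_policy) for a PCP Option code."""
    if not 0 <= code <= 255:
        raise ValueError("PCP Option codes are numbered 0-255")
    processing = "mandatory-to-process" if code <= 127 else "optional-to-process"
    if code in (127, 255):
        policy = "Reserved"
    elif 5 <= code <= 63 or 128 <= code <= 191:
        policy = "Specification Required"
    elif 64 <= code <= 126 or 192 <= code <= 254:
        policy = "Private Use"
    else:
        # Codes 0-4: not covered by the ranges above (assumption of this sketch).
        policy = "initial assignment"
    return (processing, policy)
```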
Thanks to Xiaohong Deng, Alain Durand, Christian Jacquenet, Jacni Qin, Simon Perreault, and James Yu for their comments and review. Thanks to Simon Perreault for highlighting the interaction of dynamic connections with PCP-created mappings.
Thanks to Francis Dupont for his several thorough reviews of the specification, which improved the protocol significantly.
The Port Control Protocol (PCP) is a successor to the NAT Port Mapping Protocol, NAT-PMP [I-D.cheshire-nat-pmp], and shares similar semantics, concepts, and packet formats. Because of this, NAT-PMP and PCP both use the same port, and rely on the version negotiation capabilities of NAT-PMP and PCP to determine which version to use. This section describes how an orderly transition may be achieved.
A client supporting both NAT-PMP and PCP SHOULD send its request using the PCP packet format. This will be received by a NAT-PMP server or a PCP server. If received by a NAT-PMP server, the response will be as indicated by the NAT-PMP specification [I-D.cheshire-nat-pmp], which will cause the client to downgrade to NAT-PMP and re-send its request in NAT-PMP format. If received by a PCP server, the response will be as described by this document and processing continues as expected.
A PCP server supporting both NAT-PMP and PCP can handle requests in either format. The first byte of the packet indicates if it is NAT-PMP (first byte zero) or PCP (first byte non-zero).
A PCP-only gateway receiving a NAT-PMP request (identified by the first byte being zero) will interpret the request as a version mismatch. Normal PCP processing will emit a PCP response that is compatible with NAT-PMP, without any special handling by the PCP server.
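The first-byte test used by a server that supports both protocols can be sketched in Python as follows. The function name `classify_request` is hypothetical, and rejecting an empty datagram with an exception is a choice made for this example rather than behavior described in this document.

```python
def classify_request(datagram: bytes) -> str:
    """Tell NAT-PMP from PCP by the first byte of a received request."""
    if not datagram:
        raise ValueError("empty datagram")
    # First byte zero means NAT-PMP; any non-zero first byte means PCP.
    return "NAT-PMP" if datagram[0] == 0 else "PCP"
```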
[Note to RFC Editor: Please remove this section prior to publication.]