PCP working group                                           D. Wing, Ed.
Internet-Draft                                                     Cisco
Intended status: Standards Track                             S. Cheshire
Expires: August 26, 2011                                           Apple
                                                            M. Boucadair
                                                          France Telecom
                                                                R. Penno
                                                        Juniper Networks
                                                               F. Dupont
                                             Internet Systems Consortium
                                                       February 22, 2011
Port Control Protocol (PCP)
draft-ietf-pcp-base-05
Port Control Protocol allows a host to control how incoming IPv6 or IPv4 packets are translated and forwarded by a network address translator (NAT) or simple firewall to an IPv6 or IPv4 host, and also allows a host to optimize its NAT keepalive messages.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 26, 2011.
Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Port Control Protocol (PCP) provides a mechanism to control how incoming packets are forwarded by upstream devices such as NAT64, NAT44, and firewall devices, and a mechanism to reduce application keepalive traffic. PCP is primarily designed to be implemented in the context of both Carrier-Grade NATs (CGN) and small NATs (e.g., residential NATs). PCP allows hosts to operate servers for a long time (e.g., a webcam) or a short time (e.g., while playing a game or on a phone call) when behind a NAT device, including when behind a CGN operated by their Internet service provider.
PCP allows applications to create mappings from an external IP address and port to an internal IP address and port. If the PCP-controlled device is a NAT, a mapping is created; if the PCP-controlled device is a firewall, a mapping is created in the firewall. These mappings are required for successful inbound communications destined to machines located behind a NAT.
After creating a mapping for incoming connections, it is necessary to inform remote computers about the IP address and port for the incoming connection. This is usually done in an application-specific manner. For example, a computer game would use a rendezvous server specific to that game (or specific to that game developer), and a SIP phone would use a SIP proxy. PCP does not provide this rendezvous function. The rendezvous function will support IPv4, IPv6, or both. Depending on that support and the application's support of IPv4 or IPv6, the PCP client will need an IPv4 or IPv6 mapping or both.
Many NAT-friendly applications send frequent application-level messages to ensure their session will not be timed out by a NAT. These are commonly called "NAT keepalive" messages, even though they are not sent to the NAT itself (rather, they are sent 'through' the NAT). These applications can reduce the frequency of those NAT keepalive messages by using PCP to learn (or control) the NAT mapping lifetime. This helps reduce bandwidth on the subscriber's access network, traffic to the server, and battery consumption on mobile devices.
Many NATs and firewalls have included application layer gateways (ALGs) to create mappings for applications that establish additional streams or accept incoming connections. ALGs incorporated into NATs additionally modify the application payload. Industry experience has shown that these ALGs are detrimental to protocol evolution. PCP allows an application to create its own mappings in NATs and firewalls, removing the incentive to deploy ALGs in NATs and firewalls.
PCP can be used in various deployment scenarios, including:
The PCP OpCodes defined in this document are designed to support transport protocols that use a 16-bit port number (e.g., TCP, UDP, SCTP, DCCP). Transport protocols that do not use a port number (e.g., IPsec ESP), and the ability to use PCP to forward all traffic to a single default host (often nicknamed "DMZ"), are beyond the scope of this document.
The PCP machinery assumes a single-homed host model. That is, for a given IP version, only one default route exists to reach the Internet. This is important because after a PCP mapping is created and an inbound packet (e.g., TCP SYN) arrives at the host, the outbound response (e.g., TCP SYNACK) has to go through the same path so that the proper address rewriting takes place on that outbound response packet. This restriction exists because otherwise there would need to be one PCP server for each egress: the host could not reliably determine which egress path packets would take, so the client would need to be able to reliably make the same internal/external mapping in every NAT gateway, which in general is not possible.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].
The PCP server receives PCP requests. The PCP server might be integrated within the NAT or firewall device (as shown in Figure 1) which is expected to be a common deployment.
                            +-----------------+
   +------------+           | NAT or firewall |
   | PCP client |-<network>-+      with       +---<Internet>
   +------------+           |   PCP server    |
                            +-----------------+
It is also possible to operate the PCP server in a separate device from the NAT, so long as such operation is indistinguishable from the PCP client's perspective.
All PCP messages contain a request (or response) header containing an opcode, any relevant opcode-specific information, and zero or more options. The packet layout for the common header, and operation of the PCP client and PCP server are described in the following sections. The information in this section applies to all OpCodes. Behavior of the OpCodes defined in this document is described in Section 8 and Section 9.
All requests have the following format:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1| Version = 0 |R|   OpCode    |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
   |                      Reserved (48 bits)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                                                               :
   :             (optional) opcode-specific information            :
   :                                                               :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                                                               :
   :             (optional) PCP Options                            :
   :                                                               :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
These fields are described below:
All responses have the following format:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1| Version = 0 |R|   OpCode    |   Reserved    |  Result Code  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             Epoch                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                                                               :
   :             (optional) OpCode-specific response data          :
   :                                                               :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :             (optional) Options                                :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
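The two common-header layouts can be illustrated with a short C sketch. It is not a normative encoding: the bit widths are read directly off the figures (a leading 1 bit, a 7-bit version, the R bit, a 7-bit OpCode), the Epoch is assumed to be carried in network byte order, and the names pcp_put_request_header, pcp_get_response_header, and struct pcp_response_header are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PCP_VERSION 0u
#define PCP_HDR_LEN 8u   /* both common headers occupy 64 bits */

/* Write the 8-octet request header into buf; returns octets written. */
size_t pcp_put_request_header(uint8_t *buf, uint8_t opcode)
{
    assert(buf != NULL && opcode < 0x80);      /* OpCode is a 7-bit field */
    memset(buf, 0, PCP_HDR_LEN);               /* zero the 48 reserved bits */
    buf[0] = (uint8_t)(0x80u | PCP_VERSION);   /* leading 1 bit, then version */
    buf[1] = opcode;                           /* R bit clear: a request */
    return PCP_HDR_LEN;
}

struct pcp_response_header {
    uint8_t  version;
    uint8_t  is_response;   /* the R bit */
    uint8_t  opcode;
    uint8_t  result_code;
    uint32_t epoch;
};

/* Decode the 8-octet response header. Returns 0 on success, or -1 if the
 * buffer is too short or the leading bit is not 1. */
int pcp_get_response_header(const uint8_t *buf, size_t len,
                            struct pcp_response_header *out)
{
    if (len < PCP_HDR_LEN || (buf[0] & 0x80u) == 0)
        return -1;
    out->version     = buf[0] & 0x7Fu;
    out->is_response = (buf[1] >> 7) & 1u;
    out->opcode      = buf[1] & 0x7Fu;
    out->result_code = buf[3];                 /* buf[2] is reserved */
    /* ASSUMPTION: Epoch is big-endian (network byte order). */
    out->epoch = ((uint32_t)buf[4] << 24) | ((uint32_t)buf[5] << 16) |
                 ((uint32_t)buf[6] << 8)  |  (uint32_t)buf[7];
    return 0;
}
```

A caller would append any OpCode-specific data and Options after the octets written by pcp_put_request_header.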
These fields are described below:
A PCP OpCode can be extended with an Option. Options can be used in requests and responses. It is anticipated that Options will include information that is associated with the normal function of an OpCode. For example, an Option could indicate DSCP [RFC2474] markings to apply to incoming or outgoing traffic associated with a PCP mapping, or an Option could include descriptive text (e.g., "for my webcam").
Options use the following Type-Length-Value format:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Option Code  |   Reserved    |         Option-Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                        (optional) data                        :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The description of the fields is as follows:
A given Option MAY be included in a request or a response, as permitted by that Option. If a given Option was included in a request, and understood and processed by the PCP server, it MUST be included in the response. The handling of an Option by the PCP client and PCP server MUST be specified in an appropriate document and must include whether the PCP Option can appear (one or more times) in a request, and indicate the contents of the Option in the request and in the response. If an Option appears in a request or response more often than allowed, the first occurrence(s) are used and the others are simply ignored as if they were not present.
If several Options are included in a PCP request or response, they can be encoded in any order by the PCP client and are processed in the order received. If, while processing an option, an error is encountered that causes a PCP error response to be generated, the PCP request causes no state change in the PCP server or the PCP-controlled device (i.e., it rolls back any changes it might have made while processing the request). The response MUST encode the Options in the same order, but may omit some PCP Options in the response, as is necessary to indicate the PCP server does not understand that Option or simply because that Option is not included in responses. Additional Options included in the response (if any) MUST be included at the end.
A certain Option MAY appear more than once in a request or in a response, if permitted by the definition of the Option itself. If the Option's definition allows the Option to appear once but it appears more than once in a request, the PCP server MUST respond with the MALFORMED_OPTION result code; if this occurs in a response, the PCP client processes the first occurrence and ignores the other occurrences.
If the "P" bit in the OpCode is set,
If the "P" bit is clear, the PCP server MAY process or ignore this Option.
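The Type-Length-Value layout and the "processed in the order received" rule can be sketched as a simple walker over the Options area of a message. This is an illustrative sketch only: the name pcp_walk_options and the callback type are hypothetical, and the unit of the Option-Length field is passed in as a parameter because it is not restated here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Callback invoked once per option, in the order received. */
typedef void (*pcp_option_cb)(uint8_t code, const uint8_t *data,
                              size_t data_len, void *ctx);

/* Walk the options in buf[0..len). Returns the number of options seen,
 * or -1 if an option header or its data runs past the end of the buffer.
 * ASSUMPTION: length_unit is the number of octets represented by one
 * unit of Option-Length (for example 1 if the field counts octets). */
int pcp_walk_options(const uint8_t *buf, size_t len, size_t length_unit,
                     pcp_option_cb cb, void *ctx)
{
    size_t off = 0;
    int count = 0;

    assert((buf != NULL || len == 0) && length_unit > 0);
    while (off < len) {
        if (len - off < 4)
            return -1;                     /* truncated option header */
        uint8_t code = buf[off];           /* buf[off + 1] is reserved */
        size_t data_len =
            ((((size_t)buf[off + 2]) << 8) | buf[off + 3]) * length_unit;
        off += 4;
        if (data_len > len - off)
            return -1;                     /* data overruns the message */
        if (cb != NULL)
            cb(code, buf + off, data_len, ctx);
        off += data_len;
        count++;
    }
    return count;
}
```

A server would typically run such a walk once to validate the whole Options area before acting on any Option, which makes it easier to honor the requirement that a failed request causes no state change.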
To enhance interoperability, newly defined Options should avoid interdependencies with each other.
New Options MUST include the information below:
The following result codes may be returned as a result of any OpCode received by the PCP server. The only success result code is 0; all other values indicate an error.
Additional result codes, specific to the OpCodes defined in this document, are listed in Section 8.2 and Section 9.2.
PCP messages MUST be sent over UDP, and the PCP server MUST listen for PCP requests on the PCP port number (Section 13.1). Every PCP request generates a response, so PCP does not need to run over a reliable transport protocol.
PCP is idempotent, so if the PCP client sends the same request multiple times and the PCP server processes those requests, the same result occurs. The order of operation is that a PCP client generates and sends a request to the PCP server which processes the request and generates a response back to the PCP client.
This section details operation specific to a PCP client, for any OpCode. Procedures specific to the MAP OpCodes are described in Section 8, and procedures specific to the PEER OpCodes are described in Section 9.
Prior to sending its first PCP message, the PCP client determines which servers to use. The PCP client performs the following steps to determine its PCP server(s):
With that list of PCP servers, the PCP client formulates its PCP request. The PCP request contains a PCP common header, PCP OpCode and payload, and (possibly) Options. It initializes a retransmission timer to 4 seconds. (As with all UDP or TCP clients on any operating system, when several PCP clients are embedded in the same host, each uses a distinct source port number to disambiguate their requests and replies.) The PCP client sends a PCP message to each server in sequence, waiting for a response until its timer expires. Once a PCP client has successfully communicated with a PCP server, it continues communicating with that PCP server until that PCP server becomes non-responsive, which causes the PCP client to attempt to re-iterate the procedure starting with the first PCP server on its list. If a hard ICMP error is received the PCP client SHOULD immediately abort trying to contact that PCP server (see Section 2 of [RFC5461] for discussion of ICMP and ICMPv6 hard errors). If no response is received from any of those servers, it doubles its retransmission timer and tries each server again. This is repeated 4 times (for a total of 5 transmissions to each server). If, after these transmissions, the PCP client has still not received a response, the PCP client SHOULD abort the procedure.
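The retransmission schedule above (4 seconds, doubling after each unanswered pass over the server list, five transmissions in total) can be sketched as follows. The function names and the try_server callback are hypothetical, and the sketch omits the hard-ICMP-error shortcut and the rule that a client stays with a server once it has responded.

```c
#include <assert.h>
#include <stddef.h>

#define PCP_INITIAL_TIMEOUT_S 4u
#define PCP_MAX_TRANSMISSIONS 5u   /* one initial try plus four repeats */

/* Seconds to wait for a response in the given round (0-based).
 * Returns 0 once the client should abort the procedure. */
unsigned pcp_round_timeout(unsigned round)
{
    if (round >= PCP_MAX_TRANSMISSIONS)
        return 0;
    return PCP_INITIAL_TIMEOUT_S << round;     /* 4, 8, 16, 32, 64 */
}

/* Hypothetical transport hook: send the request to one server and wait
 * up to timeout_s seconds. Returns 1 if a response arrived, else 0. */
typedef int (*pcp_try_fn)(size_t server_index, unsigned timeout_s, void *ctx);

/* Try each server in sequence, doubling the timer after each full pass.
 * Returns the index of the first server that answered, or -1. */
int pcp_contact_servers(size_t n_servers, pcp_try_fn try_server, void *ctx)
{
    assert(try_server != NULL);
    for (unsigned round = 0; pcp_round_timeout(round) != 0; round++)
        for (size_t i = 0; i < n_servers; i++)
            if (try_server(i, pcp_round_timeout(round), ctx))
                return (int)i;
    return -1;
}
```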
Upon receiving a response (success or error), the PCP client does not change to a different PCP server. That is, it does not "shop around" trying to find a PCP server to service its (same) request.
This section details operation specific to a PCP server.
Upon receiving a PCP request message, the PCP server parses and validates it. A valid request contains a valid PCP common header, one valid PCP OpCode, and zero or more Options (which the server might or might not comprehend). If an error is encountered during processing, the server generates an error response which is sent back to the PCP client. Processing of an OpCode and its Options is specific to each OpCode.
If the received message is shorter than 4 octets, has the R bit set, or the first bit is clear, the request is simply dropped. If the version number is not supported, a response is generated containing the UNSUPP_VERSION response code and the protocol version which the server does understand (if the server understands a range of protocol versions then it returns the supported version closest to the version in the request).
If the OpCode is not supported, a response is generated with the UNSUPP_OPCODE response code. If the length of the request exceeds 1024 octets or is not a multiple of 4 octets, it is invalid. Invalid requests are handled by copying up to 1024 octets of the request into the response, setting the response code to MALFORMED_REQUEST, and zero-padding the response to a multiple of 4 octets if necessary.
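The initial checks on a received request can be sketched as a single triage function. This is a minimal sketch under stated assumptions: the checks are applied in the order the text lists them (the text does not mandate a precedence), the server implements only version 0, and the enum, the function name, and the supported-OpCode array are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum pcp_triage {
    PCP_TRIAGE_DROP,               /* silently discard the message */
    PCP_TRIAGE_UNSUPP_VERSION,
    PCP_TRIAGE_UNSUPP_OPCODE,
    PCP_TRIAGE_MALFORMED_REQUEST,
    PCP_TRIAGE_ACCEPT              /* continue with OpCode processing */
};

enum pcp_triage pcp_triage_request(const uint8_t *msg, size_t len,
                                   const uint8_t *supported, size_t n)
{
    size_t i;

    assert(msg != NULL || len == 0);
    if (len < 4)
        return PCP_TRIAGE_DROP;
    if ((msg[0] & 0x80u) == 0 || (msg[1] & 0x80u) != 0)
        return PCP_TRIAGE_DROP;            /* first bit clear, or R bit set */
    if ((msg[0] & 0x7Fu) != 0)
        return PCP_TRIAGE_UNSUPP_VERSION;  /* this sketch handles version 0 */

    for (i = 0; i < n; i++)
        if (supported[i] == (msg[1] & 0x7Fu))
            break;
    if (i == n)
        return PCP_TRIAGE_UNSUPP_OPCODE;

    if (len > 1024 || len % 4 != 0)
        return PCP_TRIAGE_MALFORMED_REQUEST;
    return PCP_TRIAGE_ACCEPT;
}
```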
Error responses have the same packet layout as success responses, with fields from the request copied into the response, and other fields assigned by the PCP server set to 0.
The PCP client receives the response and verifies that the source IP address and port belong to the PCP server of an outstanding request. It validates that the version number and OpCode match an outstanding request. Responses shorter than 12 octets, longer than 1024 octets, or not a multiple of 4 octets are invalid and ignored, likely causing the request to be re-transmitted. The response is further matched by comparing fields in the response OpCode-specific data to fields in the request OpCode-specific data. After a successful match with an outstanding request, the PCP client checks the Epoch field to determine if it needs to restore its state to the PCP server (see Section 6.5).
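The length, version, and OpCode checks on a received response can be sketched as below. The struct and function names are hypothetical, the header bit positions are read off the response layout figure, and the sketch leaves out the source address check and the OpCode-specific field comparison, which depend on the socket API and on the OpCode in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What the client remembers about a request it is waiting on. */
struct pcp_outstanding {
    uint8_t version;
    uint8_t opcode;
};

/* Returns 1 if the response passes the generic checks for this
 * outstanding request, or 0 if it should be ignored. */
int pcp_response_matches(const uint8_t *msg, size_t len,
                         const struct pcp_outstanding *req)
{
    assert(msg != NULL && req != NULL);
    if (len < 12 || len > 1024 || len % 4 != 0)
        return 0;                              /* invalid length */
    if ((msg[0] & 0x7Fu) != req->version)
        return 0;                              /* version mismatch */
    if ((msg[1] & 0x7Fu) != req->opcode)
        return 0;                              /* OpCode mismatch */
    return 1;
}
```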
If the response code is 0, the PCP client knows the request was successful.
If the response code is not 0, the request failed. If the response code is UNSUPP_VERSION, processing continues as described in Section 6.6. The PCP client MAY resend the same message but MUST first wait for the smaller of 30 minutes or the value of the Lifetime field. If the PCP client has re-discovered a new PCP server (e.g., connected to a new network), the PCP client is not restricted from communicating immediately with its new PCP server.
Non-0 responses will normally have a value in the Assigned Lifetime field. Clients SHOULD NOT repeat the same request to the same PCP server within the lifetime given in the error response. In the case of the SERVER_OVERLOADED error response, clients SHOULD NOT send *any* further requests to that PCP server within the lifetime given in the SERVER_OVERLOADED error response.
Hosts which desire a PCP mapping might be multi-interfaced (i.e., own several logical/physical interfaces). Indeed, a host can be dual-stack or be configured with several IP addresses. These IP addresses may have distinct reachability scopes (e.g., if IPv6 they might have global reachability scope as for GUA (Global Unicast Address) or limited scope such as ULA (Unique Local Address, [RFC4193])).
IPv6 addresses with global reachability scope SHOULD be used as the internal IP address when requesting a PCP mapping in a PCP-controlled device. IPv6 addresses with limited scope (e.g., ULA) SHOULD NOT be indicated as the internal IP address in a PCP message.
As mentioned in Section 2.3, only single-homed CP routers are in scope. Therefore, there is no viable scenario where a host located behind a CP router is assigned with two GUA addresses belonging to the same global IPv6 prefix.
Every PCP response sent by the PCP server includes an Epoch field. This field increments by 1 every second, and indicates to the PCP client if PCP state needs to be restored. If the PCP server resets or loses the state of its explicit dynamic Mappings (that is, those mappings created by PCP MAP requests), due to reboot, power failure, or any other reason, it MUST reset its Epoch time to 0. Similarly, if the public IP address(es) of the NAT (controlled by the PCP server) changes, the Epoch MUST be reset to 0. A PCP server MAY maintain one Epoch value for all PCP clients, or MAY maintain distinct Epoch values for each PCP client; this choice is implementation-dependent.
Whenever a client receives a PCP response, the client computes its own conservative estimate of the expected Epoch value by taking the Epoch value in the last packet it received from the gateway and adding 7/8 (87.5%) of the time elapsed since that packet was received. If the Epoch value in the newly received packet is less than the client's conservative estimate by more than one second, then the client concludes that the PCP server lost state, and the client MUST immediately renew all its active port mapping leases as described in Section 8.9.1.
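The client-side Epoch check can be sketched as follows. The struct and function names are hypothetical, time is taken from a caller-supplied clock in whole seconds, and the integer division rounds the 7/8 term down, which only makes the estimate slightly more conservative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pcp_epoch_state {
    uint32_t last_epoch;     /* Epoch from the most recent response */
    uint64_t last_seen_s;    /* client clock, in seconds, when it arrived */
};

/* Record a newly received Epoch and report whether the PCP server
 * appears to have lost state (1) or not (0). */
int pcp_epoch_update(struct pcp_epoch_state *st, uint32_t new_epoch,
                     uint64_t now_s)
{
    assert(st != NULL && now_s >= st->last_seen_s);

    /* Conservative estimate: last Epoch plus 7/8 of the elapsed time. */
    uint64_t elapsed  = now_s - st->last_seen_s;
    uint64_t estimate = (uint64_t)st->last_epoch + (elapsed * 7) / 8;

    /* State was lost if the new Epoch is below the estimate by more
     * than one second, i.e. new_epoch < estimate - 1. */
    int lost = (uint64_t)new_epoch + 1 < estimate;

    st->last_epoch  = new_epoch;
    st->last_seen_s = now_s;
    return lost;
}
```

When the function returns 1, the client would renew all of its active mappings; the state is seeded from the first response received from a given PCP server.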
When the PCP server reduces its Epoch value, the PCP clients will send PCP requests to refresh their mappings. The PCP server needs to be scaled appropriately to accommodate this traffic. Because PCP lacks a mechanism to simultaneously inform all PCP clients of the Epoch value, the PCP clients will not flood the PCP server simultaneously when the PCP server reduces its Epoch value.
A PCP client sends its requests using PCP version number 0. Should later updates to this document specify different message formats with a version number greater than zero, it is expected that PCP servers will still support version 0 in addition to the newer version(s). However, in the event that a server returns a response with error code UNSUPP_VERSION, the client MAY log an error message to inform the user that it is too old to work with this server, and the client SHOULD set a timer to retry its request in 30 minutes (in case this was a temporary condition and the server configuration is changed to rectify the situation).
If future PCP versions greater than zero are specified, version negotiation is expected to proceed as follows:
The following options can appear in certain PCP responses.
If the PCP server cannot process a mandatory-to-process option, for whatever reason, it includes the UNPROCESSED Option in the response, shown in Figure 5. This helps with debugging interactions between the PCP client and PCP server. For simplicity, no more than 4 options can be encoded. This option MUST NOT appear more than once in a PCP response, no matter how many PCP options appeared in the request and were unprocessed by the PCP server. If only one Option code was unprocessed, that option code is placed in option-code-1 (and the other three fields are set to zero); if two Option codes were unprocessed, their option codes are placed in option-code-1 and option-code-2, and so on. If a certain Option appeared more than once in the PCP request, that Option value appears only once in the option-code fields. The order of the Options in the PCP request has no relationship with the order of the Option values in this UNPROCESSED Option. This Option MUST NOT appear in a response unless the associated request contained at least one mandatory-to-process Option.
The UNPROCESSED option is formatted as follows:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | option-code-1 | option-code-2 | option-code-3 | option-code-4 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | option-code-5 | option-code-6 | option-code-7 | option-code-8 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
There are three uses for the MAP and PEER OpCodes defined in this document: a host operating a server (and wanting an incoming connection), a host operating a client (and wanting to optimize the application keepalive traffic), and a host operating a client and server on the same port. These are discussed in the following sections.
When operating a server (Section 7.1 and Section 7.3) the PCP client knows if it wants an IPv4 listener, IPv6 listener, or both on the Internet. The PCP client also knows if it has an IPv4 interface on itself or an IPv6 interface on itself. It takes the union of this knowledge to decide whether to send one or two MAP requests for each of its interfaces. Applications that embed IP addresses in payloads (e.g., FTP, SIP) will find it beneficial to avoid address family translation, if possible.
A host operating a server (e.g., a web server) listens for traffic on a port, but the server never initiates traffic from that port. For this to work across a NAT or a firewall, the application needs to (a) create a mapping from a public IP address and port to itself as described in Section 8 and (b) publish that public IP address and port via some sort of rendezvous server (e.g., DNS, a SIP message, a proprietary protocol). Publishing the public IP address and port is out of scope of this specification. To accomplish (a), the application follows the procedures described in this section.
As normal, the application needs to begin listening to a port, and to ensure that it can get exclusive use of that port it needs to choose a port that is not in the operating system's ephemeral port range. Then, the application constructs a PCP message with the appropriate MAP OpCode depending on if it is listening on an IPv4 or IPv6 interface and if it wants a public IPv4 or IPv6 address.
The following pseudo-code shows how PCP can be reliably used to operate a server:
   /* start listening on the local server port */
   int s = socket(...);
   internal_sockaddr = ...;
   bind(s, &internal_sockaddr, ...);
   listen(s, ...);

   requested_external_sockaddr = 0;
   pcp_send_map_request(internal_sockaddr,
                        requested_external_sockaddr,
                        &assigned_external_sockaddr,
                        requested_lifetime, &assigned_lifetime);

   update_rendezvous_server("Client 12345", assigned_external_sockaddr);

   while (1) {
       int c = accept(s, ...);
       /* ... */
   }
A host operating a client (e.g., XMPP client, SIP client) sends from a port but never accepts incoming connections on this port. It wants to ensure the flow to its server is not terminated (due to inactivity) by an on-path NAT or firewall. To accomplish this, the application uses the procedure described in this section.
Middleboxes such as NATs or firewalls need to see occasional traffic or will terminate their session state, causing application failures. To avoid this, many applications routinely generate keepalive traffic for the primary (or sole) purpose of maintaining state with such middleboxes. Applications can reduce such application keepalive traffic by using PCP.
To use PCP for this function, the application first connects to its server, as normal. Afterwards, it issues a PCP request with the PEER4 or PEER6 OpCode as described in Section 9. The PEER4 OpCode is used if the host is using IPv4 for its communication to its peer; PEER6 if using IPv6. The same 5-tuple as used for the connection to the server is placed into the PEER4 or PEER6 payload.
The following pseudo-code shows how PCP can be reliably used with a dynamic socket, for the purposes of reducing application keepalive messages:
   int s = socket(...);
   connect(s, &remote_peer, ...);
   getsockname(s, &internal_address, ...);

   requested_external_address = 0;
   pcp_send_peer_request(internal_address,
                         requested_external_address,
                         &assigned_external_address,
                         remote_peer,
                         requested_lifetime, &assigned_lifetime);
A host operating a client and server on the same port (e.g., Symmetric RTP [RFC4961] or SIP Symmetric Response Routing (rport) [RFC3581]) first establishes a local listener, (usually) sends the local and public IP addresses and ports to a rendezvous service (which is out of scope of this document), and (usually) initiates outbound connections from that same source address. To accomplish this, the application uses the procedure described in this section.
An application that is using the same port for outgoing connections as well as incoming connections MUST first signal its operation of a server using the PCP MAP OpCode, as described in Section 8, and receive a positive PCP response before it sends any packets from that port.
The following pseudo-code shows how PCP can be used to operate a symmetric client and server:
   /* start listening on the local server port */
   int s = socket(...);
   internal_sockaddr = ...;
   bind(s, &internal_sockaddr, ...);
   listen(s, ...);

   requested_external_sockaddr = 0;
   pcp_send_map_request(internal_sockaddr,
                        requested_external_sockaddr,
                        &assigned_external_sockaddr,
                        requested_lifetime, &assigned_lifetime);

   update_rendezvous_server("Client 12345", assigned_external_sockaddr);

   send_packet(s, "Hello World");

   while (1) {
       int c = accept(s, ...);
       /* ... */
   }
This section defines two OpCodes which control forwarding from a NAT (or firewall) to an internal host. They are:
The operation of these OpCodes is described in this section.
The two MAP OpCodes (MAP4, MAP6) share a similar packet layout for both requests and responses. Because of this similarity, they are shown together. For both of the MAP OpCodes, if the assigned external IP address and assigned external port match the request's source IP address and the MAP OpCode's internal port, respectively, the functionality is purely a firewall; otherwise it pertains to a network address translator which might also perform firewall functions.
The following diagram shows the request packet format for MAP4 and MAP6. This packet format is aligned with the response packet format:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Protocol    |              Reserved (24 bits)               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                                                               :
   : Requested external IP address (32 or 128, depending on OpCode):
   :                                                               :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         internal port         |    requested external port    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
These fields are described below:
The following diagram shows the response packet format for MAP4 and MAP6 OpCodes:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Protocol    |              Reserved (24 bits)               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                                                               :
   : Assigned external IP address (32 or 128, depending on OpCode) :
   :                                                               :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                           Lifetime                            :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         internal port         |    assigned external port     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
These fields are described below:
In addition to the general PCP result codes (Section 5.4), the following additional result codes may be returned as a result of the MAP OpCodes received by the PCP server. These errors are considered 'long lifetime' or 'short lifetime', which provides guidance to PCP server developers for the duration of the Lifetime value for these errors. It is RECOMMENDED that short lifetime errors use a 30-second lifetime and long lifetime errors use a 30-minute lifetime.
Other result codes are defined following the procedure in Section 13.3.
This section describes the operation of a PCP client when sending requests with OpCodes MAP4 and MAP6.
There are certain sentinel values used by the MAP OpCodes:
The request MAY contain values in the requested-external-ip-address and requested-external-port fields. This allows the PCP client to attempt to rebuild the PCP server's state, so that the PCP client could avoid having to change information maintained at the rendezvous server. Of course, due to other activity on the network (e.g., by other users or network renumbering), the PCP server may not be able to fulfill the request.
An existing mapping can have its lifetime extended by the PCP client. To do this, the PCP client sends a new MAP request indicating the internal IP address and port(s).
The PCP client SHOULD renew the mapping before its expiry time, otherwise it will be removed by the PCP server (see Section 8.6). In order to prevent excessive PCP chatter, it is RECOMMENDED to send a single renewal request packet when a mapping is halfway to expiration time, then, if no positive response is received, another single renewal request 3/4 of the way to expiration time, and then another at 7/8 of the way to expiration time, and so on, subject to the constraint that renewal requests MUST NOT be sent less than four seconds apart (a PCP client MUST NOT send an infinite number of ever-closer-together requests in the last few seconds before a mapping expires).
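The recommended renewal timing (halfway to expiry, then 3/4, then 7/8, and so on, never less than four seconds apart) can be sketched as a function that lists the send times for one mapping. The function name is hypothetical, times are whole seconds measured from the moment the mapping was granted, and the list assumes that no renewal receives a positive response; a real client would stop at the first success and start over with the new lifetime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCP_MIN_RENEW_GAP_S 4u   /* renewals are never closer than this */

/* Fill offsets[] with the times, in seconds after the mapping was
 * granted, at which renewal requests would be sent.
 * Returns the number of entries written (at most max). */
size_t pcp_renewal_offsets(uint32_t lifetime_s, uint32_t *offsets,
                           size_t max)
{
    size_t n = 0;
    uint32_t when = 0;
    uint32_t remaining = lifetime_s;

    assert(offsets != NULL || max == 0);
    while (n < max) {
        uint32_t step = remaining / 2;   /* halve the time left to expiry */
        if (step == 0 || (n > 0 && step < PCP_MIN_RENEW_GAP_S))
            break;                       /* next attempt would be too soon */
        when += step;
        remaining -= step;
        offsets[n++] = when;
    }
    return n;
}
```

For a 120-second lifetime this yields attempts at 60, 90, 105, 112, and 116 seconds; the next halving would leave only 2 seconds between requests, so the sequence stops there.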
This section describes the operation of a PCP server when processing a request with the OpCodes MAP4 or MAP6.
If the requested lifetime is 0, it indicates a request to delete the mapping immediately. If the target-ip-address is 0, it indicates all IP addresses belonging to this subscriber should have all their mappings removed [[THIRD_PARTY]]. If internal-port is 0, it means the delete request is for all ports of the particular protocol. On a deletion request, the requested external port field is ignored by the server. PCP MAP requests only control mappings created by MAP requests. So, if the PCP client attempts to delete a static mapping (i.e., a mapping created outside of PCP itself), the PCP server deletes all of the PCP-created mappings but MUST respond with UNABLE_TO_DELETE_ALL result code, with the other fields encoded as described above. If the PCP client attempts to delete a mapping that does not exist, the success response code is returned. If the PCP client is not authorized to delete this mapping, NOT_AUTHORIZED is returned. If the deletion request was properly formatted, a positive response is generated with lifetime of 0 and the server copies the protocol and internal port number from the request into the response; this positive response is generated even if there is no mapping (because the mapping could have been already deleted by a previous PCP transaction).
If the requested lifetime is not zero, it indicates a request to create a mapping or extend the lifetime of an existing mapping.
Processing of the lifetime is described in Section 8.6.
If the PCP-controlled device is stateless (that is, it does not establish any per-flow state, and simply rewrites the address and/or port in a purely algorithmic fashion), the PCP server simply returns an answer indicating the external IP address and port yielded by this stateless algorithmic translation. This allows the PCP client to learn its external IP address and port as seen by remote peers. Examples of stateless translators include stateless NAT64 and 1:1 NAT44, both of which modify addresses but not port numbers.
If an Option with a value of 128 or greater (that is, a mandatory-to-process Option) exists but that option does not make sense (e.g., the PREFER_FAILURE option is included in a request with lifetime=0), the request is invalid and generates a MALFORMED_OPTION error.
By default, a PCP-controlled device MUST NOT create mappings for a protocol not indicated in the request. For example, if the request was for a TCP mapping, a UDP mapping MUST NOT be created.
If the THIRD_PARTY option is not present in the request, the source IP address of the PCP packet is used when creating the mapping. If the THIRD_PARTY option is present, the PCP server validates the indicated target IP address belongs to the same subscriber. This validation depends on the PCP deployment scenario; see Section 12.3 for the validation procedure. If the internal IP address in the PCP request does not belong to the subscriber, an error response MUST be generated with result code NOT_AUTHORIZED.
Mappings typically consume state on the PCP-controlled device, and it is RECOMMENDED that a per-subscriber or per-host limit be enforced by the PCP server to prevent exhausting the mapping state. If this limit is exceeded, the response code USER_EX_QUOTA is returned.
If all of the preceding operations were successful (did not generate an error response), then the requested mappings are created as described in the request and a positive response is built. This positive result contains the same OpCode as the request, but with the "R" bit set.
As a side-effect of creating a mapping, ICMP messages associated with the mapping MUST be forwarded (and also translated, if appropriate) for the duration of the mapping's lifetime. This is done to ensure that ICMP messages can still be used by hosts, without application programmers or PCP client implementations needing to signal PCP separately to create ICMP mappings for those flows.
This section describes the operation of the PCP client when it receives a PCP response for the OpCodes MAP4 or MAP6.
A response is matched with a request by comparing the protocol, internal IP address, and internal port. Other fields are not compared, because the PCP server sets those fields.
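As an illustration of this matching rule, a client might keep its outstanding MAP requests in a table keyed by exactly the three compared fields. The class and method names below are hypothetical, and the sketch deliberately ignores retransmission and timeouts.

```python
class PendingMapRequests:
    """Track outstanding MAP requests by (protocol, internal IP, internal port)."""

    def __init__(self):
        self._pending = {}

    def add(self, protocol, internal_ip, internal_port, context):
        # 'context' is whatever the application wants back when the
        # response arrives (a callback, a label, ...).
        self._pending[(protocol, internal_ip, internal_port)] = context

    def match(self, protocol, internal_ip, internal_port):
        # Fields set by the server (external address, external port,
        # lifetime) are intentionally not part of the key.
        return self._pending.pop((protocol, internal_ip, internal_port), None)
```

A response whose key is not in the table (for example, a duplicate) simply yields `None` and can be discarded.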
On a successful response, the PCP client can use the external IP address and port(s) as desired. Typically the PCP client will communicate the external IP address and port(s) to another host on the Internet using an application-specific rendezvous mechanism such as DNS SRV records.
If the response code is IMPLICIT_MAPPING_EXISTS, it indicates the PCP client is attempting to use MAP when an implicit dynamic connection already exists for the same internal host and internal port. This can occur with certain types of NATs. When this is received, if the PCP client still wants to establish a mapping, the PCP client MUST choose a different internal port and send a new PCP request specifying that port.
On an error response, clients SHOULD NOT repeat the same request to the same PCP server within the lifetime returned in the response.
The PCP client requests a certain lifetime, and the PCP server responds with the assigned lifetime. The PCP server MAY grant a lifetime smaller or larger than the requested lifetime. The PCP server SHOULD be configurable for permitted minimum and maximum lifetime, and the RECOMMENDED values are 120 seconds for the minimum value and 24 hours for the maximum. It is NOT RECOMMENDED that the server allow lifetimes exceeding 24 hours, because they will consume ports even if the internal host is no longer interested in receiving the traffic or no longer connected to the network.
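One possible server policy, shown here only as a sketch, is to clamp the requested lifetime into the configured range. The constant and function names are hypothetical; the text above permits other policies, since the server MAY grant any smaller or larger lifetime. A requested lifetime of 0 is passed through unchanged because it denotes a deletion request.

```python
MIN_LIFETIME = 120            # seconds, the RECOMMENDED minimum
MAX_LIFETIME = 24 * 60 * 60   # seconds, the RECOMMENDED maximum (24 hours)


def assign_lifetime(requested, minimum=MIN_LIFETIME, maximum=MAX_LIFETIME):
    """Return the lifetime a server using a simple clamping policy would grant."""
    if requested == 0:
        return 0                                  # deletion, not subject to the minimum
    return max(minimum, min(requested, maximum))  # clamp into [minimum, maximum]
```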
Once a PCP server has responded positively to a mapping request for a certain lifetime, the port forwarding is active for the duration of the lifetime unless the lifetime is reduced by the PCP client (to a shorter lifetime or to zero) or until the PCP server loses its state (e.g., crashes). However, if the PCP lifetime has reached zero yet there is still active inside-to-outside traffic, the PCP server MAY, if it desires, keep the mapping active until the inside-to-outside traffic has stopped.
An application that forgets its PCP-assigned mappings (e.g., the application or OS crashes) will request new PCP mappings. This will consume port mappings. The application will also likely initiate new implicit dynamic mappings (e.g., TCP connections) without using PCP, which will also consume port mappings. If there is a port mapping quota for the internal host, frequent restarts such as this may exhaust the quota. PCP provides no explicit protection against such port consumption. In such environments, it is RECOMMENDED that applications use shorter PCP lifetimes to reduce the impact of consuming the user's port quota. An operating system or framework that issues a mapping request to "delete all" (protocol=0, port=0, lifetime=0) on reboot protects itself against this resource exhaustion by voluntarily relinquishing all of its old mappings before beginning to request new ones. The PCP server MAY choose to allocate the same (recently relinquished) mappings when mappings are re-requested by the booting OS. Some port mapping APIs (such as the "DNSServiceNATPortMappingCreate" API provided by Apple's Bonjour on Mac OS X, iOS, Windows, Linux, etc.) automatically monitor for process exit (including application crashes) and automatically send port mapping deletion requests if the process that requested them goes away without explicitly relinquishing them.
In order to reduce unwanted traffic and data corruption, a port that was mapped using the MAP OpCode SHOULD NOT be assigned to another internal target, or another subscriber, for 120 seconds (MSL, [RFC0793]). However, the PCP server MUST allow the same internal target to re-acquire the same port during that same interval.
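The hold-down rule can be sketched as a small table of recently released ports. The class name, method names, and the use of caller-supplied timestamps are all assumptions made for illustration; the sketch also treats the SHOULD NOT as a hard rule, which a real server is not obliged to do.

```python
class PortHoldDown:
    """Remember released external ports and who last held them."""

    def __init__(self, hold_seconds=120):
        self.hold_seconds = hold_seconds
        # (protocol, external_port) -> (internal_target, release_time)
        self._released = {}

    def release(self, protocol, external_port, internal_target, now):
        self._released[(protocol, external_port)] = (internal_target, now)

    def may_assign(self, protocol, external_port, internal_target, now):
        entry = self._released.get((protocol, external_port))
        if entry is None:
            return True                      # port was never held, or hold expired
        previous_target, released_at = entry
        if now - released_at >= self.hold_seconds:
            del self._released[(protocol, external_port)]
            return True                      # hold-down interval is over
        # Inside the interval only the previous holder may re-acquire the port.
        return previous_target == internal_target
```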
When a PCP client first acquires a new IP address, it may want to remove mappings that may have been instantiated for a previous host. To do this, the PCP client sends a MAP request with protocol, external port, internal port, and lifetime set to 0.
The customer premises router might obtain a new IPv4 address or new IPv6 prefix. This can occur because of a variety of reasons including a reboot, power outage, DHCP lease expiry, or other action by the ISP. If this occurs, traffic forwarded to the subscriber might be delivered to another customer who now has that address. This affects both implicit dynamic mappings and explicit dynamic mappings. However, this same problem occurs today when a subscriber's IP address is re-assigned, without PCP and without an ISP-operated CGN. The solution is the same as today: the problems associated with subscriber renumbering are caused by subscriber renumbering and are eliminated if subscriber renumbering is avoided. PCP defined in this document does not provide machinery to reduce the subscriber renumbering problem.
When a new Internal Address is assigned to a host embedding a PCP client, the NAT (or firewall) controlled by the PCP server will continue to send traffic to the old IP address. Assuming the PCP client wants to continue receiving traffic, it needs to install new mappings for its new IP address. The requested external port field will not be fulfilled by the PCP server, in all likelihood, because it is still being forwarded to the old IP address. Thus, a mapping is likely to be assigned a new external port number and/or public IP address. Note that this scenario is not expected to happen routinely for most hosts, since most hosts renew their DHCP leases before they expire (or re-request the same address after reboot) and most DHCP servers honor such requests and grant the host the same address it was using before the reboot.
This Option is used when a PCP client wants to control a mapping to another host. A PCP server will only support this option if sent by an authorized PCP client. Determining which PCP clients are authorized to use the THIRD_PARTY option depends on the deployment scenario. For Dual-Stack Lite deployments, the PCP server only supports this option if the source IPv4 address is the B4's source IP address. For other scenarios, the subscriber has only one IPv4 address and this Option serves no purpose (and will only generate error messages from the server). If a subscriber has more than one IPv4 address, the ISP MUST determine its own policy for how to identify the trusted device within the subscriber's home. This might be, for example, the lowest- or highest-numbered host address for that user's IPv4 prefix.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
:                                                               :
: Target Internal IP address (32 bits or 128 bits, depending    :
: on Option length)                                             :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The fields are described below:
This Option:
A PCP server is configured to permit third party mappings or to restrict third party mappings. If third party mappings are permitted, any host on the network can create, modify, or destroy mappings for another host on the network, which is generally undesirable. If third party mappings are restricted, only a certain host on the network can perform third party mappings. If a PCP server is configured to restrict third party mappings, and receives a PCP MAP request with a Target Address that does not match the source IP address of that request, it MUST generate a UNAUTH_TARGET response. A customer premise router SHOULD be configured to restrict third party mappings.
It is RECOMMENDED that PCP servers embedded into customer premise equipment be configured to restrict third party mappings. With this configuration, if a user wants to create a third party mapping, the user needs to interact out-of-band with their customer premise router (e.g., using its administrative interface).
It is RECOMMENDED that PCP servers embedded into service provider NAT and firewall devices be configured to permit third party mappings. With this configuration, if a user wants to create a third party mapping, the user needs to interact out-of-band with their customer premise router (e.g., using its administrative interface). This is because the service provider's PCP server only allows third party mappings from the subscriber's customer premise router. To do this, the PCP server needs certain knowledge about the network's subscribers. It needs to determine the IP address of the subscriber's customer premise router and to determine the IP subnet assigned to the subscriber. This knowledge might be dynamic (e.g., database query into the service provider's user database for every incoming PCP request), might be a table (e.g., subscribers with a certain IPv4 network prefix all have an IPv4 /24, other IPv4 prefixes have an IPv4 /32, certain IPv6 prefixes have an IPv6 /32, and so on), or might be very static (e.g., all subscribers have one IPv4 address). In many common deployments, there is only one IPv4 address assigned to a subscriber, and thus the Target Address will always match the source address of the PCP message. If there are multiple IPv4 or multiple IPv6 addresses assigned to a subscriber, the PCP server allows the highest-numbered address to perform third party mappings. Thus, on a network supporting PCP with multiple addresses assigned to a subscriber, the highest-numbered host SHOULD be the subscriber's customer premise router. Upon receiving a MAP request where the Target Address does not match the source IP address of the request, the PCP server determines if the source IP address of the request is the subscriber's highest numbered address, following the procedure above. If not, the PCP server MUST generate an UNAUTH_SOURCE_ADDRESS error.
Then the PCP server determines if the Target Address belongs to the same subscriber as the source IP address of the PCP packet, using the procedure described above. If not, the PCP server MUST generate an UNAUTH_TARGET_ADDRESS error.
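The two checks above can be sketched as follows. This is an illustration under stated assumptions: each subscriber is modeled as an explicit set of assigned addresses of a single address family (a real server might instead consult a prefix table or a database), and the function name `check_third_party` and the string return values are hypothetical stand-ins for the corresponding result codes.

```python
import ipaddress


def check_third_party(source, target, subscribers):
    """Decide whether 'source' may create a mapping for 'target'.

    'subscribers' is a list of sets; each set holds the ipaddress objects
    assigned to one subscriber (all of the same address family).
    """
    src = ipaddress.ip_address(source)
    tgt = ipaddress.ip_address(target)
    if src == tgt:
        return "SUCCESS"                      # not a third party request
    # Find the subscriber that owns the source address.
    owned = next((addrs for addrs in subscribers if src in addrs), None)
    # Only the subscriber's highest-numbered address may act for others.
    if owned is None or src != max(owned):
        return "UNAUTH_SOURCE_ADDRESS"
    # The target must belong to the same subscriber.
    if tgt not in owned:
        return "UNAUTH_TARGET_ADDRESS"
    return "SUCCESS"
```

With a subscriber holding 192.0.2.1 through 192.0.2.3, a request from 192.0.2.3 for 192.0.2.1 succeeds, while the same request from 192.0.2.2 is rejected as an unauthorized source.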
This Option indicates packet filtering is desired. The remote peer port and remote peer IP Address indicate the permitted remote peer's source IP address and port for packets from the Internet. The remote peer prefix length indicates the length of the remote peer's IP address that is significant; this allows a single Option to permit an entire subnet. After processing this MAP request and generating a successful response, the PCP-controlled device will simply drop packets with a source IP address, transport, or port that do not match the fields.
The REMOTE_PEER_FILTER packet layout is described below:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Reserved    | prefix-length |       Remote Peer Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
:                                                               :
: Remote Peer IP address (32 bits if MAP4,                      :
: 128 bits if MAP6)                                             :
:                                                               :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
These fields are described below:
This Option:
Because of interactions with dynamic ports this Option MUST only be used by a client that is operating a server (that is, using the MAP OpCode), as this ensures that no other application will be assigned the same ephemeral port for its outgoing connection. Other use by a PCP client is NOT RECOMMENDED and will cause some UNSAF NAT traversal mechanisms [RFC3424] to fail where they would have otherwise succeeded, breaking other applications running on this same host.
The prefix-length indicates how many bits of the IPv6 address or IPv4 address are used for the filter. For MAP4, a prefix-length of 32 indicates the entire IPv4 address is used. For MAP6, a prefix-length of 128 indicates the entire IPv6 address is used. For MAP4 the minimum prefix-length value is 0 and the maximum value is 32. For MAP6 the minimum prefix-length value is 0 and the maximum value is 128. Values outside those ranges cause a MALFORMED_OPTION response code.
If multiple occurrences of REMOTE_PEER_FILTER exist in the same MAP request, they are processed in the same order received, and they MUST all be successfully processed or return an error (e.g., MALFORMED_OPTION if one of the options was malformed). As with other PCP errors, returning an error causes no state to be changed in the PCP server or in the PCP-controlled device. If a mapping already exists (with or without a filter) and the server receives a MAP request with REMOTE_PEER_FILTER, the filters indicated in the new request are added to any existing filters. If a MAP request has a lifetime of 0 and contains the REMOTE_PEER_FILTER option, the error MALFORMED_OPTION is returned.
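The range check and the prefix comparison can be sketched with Python's standard `ipaddress` module. The function names are hypothetical, raising `ValueError("MALFORMED_OPTION")` is just a stand-in for returning that result code, and the sketch covers only the address part of the filter, not the transport or port.

```python
import ipaddress


def filter_network(remote_ip, prefix_length):
    """Validate prefix_length for the address family and return the
    network that the filter covers."""
    addr = ipaddress.ip_address(remote_ip)
    # max_prefixlen is 32 for IPv4 and 128 for IPv6.
    if not 0 <= prefix_length <= addr.max_prefixlen:
        raise ValueError("MALFORMED_OPTION")
    # strict=False masks off the host bits beyond prefix_length.
    return ipaddress.ip_network(f"{addr}/{prefix_length}", strict=False)


def source_permitted(source_ip, network):
    """True if the packet's source address falls inside the filter."""
    return ipaddress.ip_address(source_ip) in network
```

For example, a filter for 192.0.2.77 with prefix-length 24 covers 192.0.2.0/24, so a packet from 192.0.2.5 is permitted and one from 192.0.3.5 is not.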
To remove all existing filters, the prefix-length 0 is used. There is no mechanism to remove a specific filter.
To change an existing filter, the PCP client sends a MAP request containing two REMOTE_PEER_FILTER options, the first option containing a prefix-length of 0 (to delete all existing filters) and the second containing the new remote peer's IP address and port. Other REMOTE_PEER_FILTER options in that PCP request, if any, add more allowed remote hosts.
The PCP server or the PCP-controlled device is expected to have a limit on the number of remote peers it can support. This limit might be as small as one. If a MAP request would exceed this limit, the entire MAP request is rejected with the result code EXCESSIVE_REMOTE_PEERS, and the state on the PCP server is unchanged.
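The add, clear, and limit semantics described above can be sketched as a pure function over the filter list. The function name, the tuple layout `(prefix_length, remote_ip, remote_port)`, and the string result values are assumptions for illustration; option validation is omitted.

```python
def apply_filter_options(existing, options, max_peers):
    """Apply REMOTE_PEER_FILTER options, in order, to a mapping's filters.

    'existing' and the returned list hold (prefix_length, remote_ip,
    remote_port) tuples. Returns (result, filters); on error the original
    list is returned untouched.
    """
    updated = list(existing)          # work on a copy so errors change nothing
    for prefix_length, remote_ip, remote_port in options:
        if prefix_length == 0:
            updated.clear()           # prefix-length 0 removes all filters
        else:
            updated.append((prefix_length, remote_ip, remote_port))
    if len(updated) > max_peers:
        return "EXCESSIVE_REMOTE_PEERS", existing
    return "SUCCESS", updated
```

Replacing a filter is then a request carrying a prefix-length 0 option followed by the new peer, which works even on a device whose limit is a single remote peer.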
If this option appears in a request, the following additional result code could be returned:
This option indicates that if the PCP server is unable to allocate the requested port, then instead of returning an available port that it *can* allocate, the PCP server should instead allocate no port and return result code CANNOT_HONOR_EXTERNAL_PORT.
This option is intended solely for use by UPnP IGD interworking [I-D.bpw-pcp-upnp-igd-interworking], where the semantics of IGD version 1 do not provide any way to indicate to an IGD client that any port is available other than the one it requested. A PCP server MAY support this option, if its designers wish to support downstream devices that perform IGD interworking. PCP servers MAY choose to rate-limit their handling of PREFER_FAILURE requests, to protect themselves from a rapid flurry of 65535 consecutive PREFER_FAILURE requests from clients probing to discover which external ports are available. PCP servers that are not intended to support downstream devices that perform IGD interworking are not required to support this option. PCP clients other than IGD interworking clients SHOULD NOT use this option because it results in inefficient operation, and they cannot safely assume that all PCP servers will implement it. The option is provided only because the semantics of IGD version 1 offer no viable alternative way to implement an IGD interworking function. It is anticipated that this option will be deprecated in the future as more clients adopt PCP natively and the need for IGD interworking declines.
If an event occurs that causes the PCP server to lose state (such as a crash or power outage), the mappings created by PCP are lost. Such loss of state is rare in a service provider environment (due to redundant power, disk drives for storage, etc.). But such loss of state is more common in a residential NAT device which does not write information to its non-volatile memory.
The Epoch allows a client to deduce when a PCP server may have lost its state. If this occurs, the PCP client can attempt to recreate the mappings following the procedures described in this section.
The PCP server SHOULD store mappings in persistent storage so when it is powered off or rebooted, it remembers the port mapping state of the network. Due to the physical architecture of some PCP servers, this is not always achievable (e.g., some non-volatile memory can withstand only a certain number of writes, so writing PCP mappings to such memory is generally avoided).
However, maintaining this state is not essential for correct operation. When the PCP server loses state and begins processing new PCP messages, its Epoch is reset to zero (per the procedure of Section 6.5).
A mapping renewal packet is formatted identically to an original mapping request; from the point of view of the client it is a renewal of an existing mapping, but from the point of view of the PCP server it appears as a new mapping request.
As the result of receiving a packet where the Epoch field indicates that a reboot or similar loss of state has occurred, the client renews its port mappings.
The discussion in this section focuses on recreating inbound port mappings after loss of PCP server state, because that is the more serious problem. Losing port mappings for outgoing connections destroys those currently active connections, but does not prevent clients from establishing new outgoing connections. In contrast, losing inbound port mappings not only destroys all existing inbound connections, but also prevents the reception of any new inbound connections until the port mapping is recreated. Accordingly, we consider recovery of inbound port mappings the more important priority. However, clients that want outgoing connections to survive a NAT gateway reboot can also achieve that using PCP. After initiating an outbound TCP connection (which will cause the NAT gateway to establish an implicit port mapping) the client should send the NAT gateway a port mapping request for the source port of its TCP connection, which will cause the NAT gateway to send a response giving the external port it allocated for that mapping. The client can then store this information, and use it later to recreate the mapping if it determines that the NAT gateway has lost its mapping state.
A PCP client can refresh a mapping by sending a new PCP request containing information from the earlier PCP response. The PCP server will respond indicating the new lifetime. It is possible, due to failure of the PCP server, that the public IP address and/or public port, or the PCP server itself, has changed (due to a new route to a different PCP server). To detect such events more quickly, the PCP client may find it beneficial to use shorter lifetimes (so that it communicates with the PCP server more often). If the PCP client has several mappings, the Epoch value only needs to be retrieved for one of them to verify the PCP server has not lost port forwarding state.
If the client wishes to check the PCP server's Epoch, it sends a PCP request for any one of the client's mappings. This will return the current Epoch value. In that request the PCP client could extend the mapping lifetime (by asking for more time) or maintain the current lifetime (by asking for the same number of seconds that it knows are remaining of the lifetime).
This section defines two OpCodes for controlling dynamic connections. They are:
The operation of these OpCodes is described in this section.
The two PEER OpCodes (PEER4 and PEER6) share a similar packet layout for both requests and responses. Because of this similarity, they are shown together. For both of the PEER OpCodes, if the internal IP address and internal port fields of the request both match the external IP address and external port fields of the response, the IP addresses and ports are not changed and thus the functionality is purely a firewall; otherwise it pertains to a network address translator which might also perform firewall functions.
The following diagram shows the request packet format for PEER4 and PEER6. This packet format is aligned with the response packet format:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Protocol    |              Reserved (24 bits)               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
:                                                               :
: Internal IP address (32 bits if PEER4, 128 bits if PEER6)     :
:                                                               :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
:                                                               :
: Remote Peer IP address (32 bits if PEER4, 128 bits if PEER6)  :
:                                                               :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
:                                                               :
: Reserved (128 bits)                                           :
:                                                               :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Requested lifetime                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         internal port         |      reserved (16 bits)       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       remote peer port        |      reserved (16 bits)       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
These fields are described below:
The following diagram shows the response packet format for PEER4 and PEER6:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Protocol    |  External_AF  |      Reserved (16 bits)       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
:                                                               :
: Internal IP address (32 bits if PEER4, 128 bits if PEER6)     :
:                                                               :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
:                                                               :
: Remote Peer IP address (32 bits if PEER4, 128 bits if PEER6)  :
:                                                               :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
:                                                               :
: External IP address (always 128 bits)                         :
:                                                               :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Assigned Lifetime                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         internal port         |         external port         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       remote peer port        |      reserved (16 bits)       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
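To make the two layouts concrete, the following sketch encodes the OpCode-specific part of a PEER4 request and decodes the corresponding part of a PEER4 response with Python's `struct` module. It covers only the fields shown in the diagrams; the common PCP header that precedes them is not handled. The function names and the dictionary keys are hypothetical, and the 128-bit External IP address is returned as raw bytes because its encoding is not restated here.

```python
import socket
import struct

# Network byte order; 'x' marks reserved (zero) padding bytes.
# Request:  protocol, 24 reserved bits, internal IP, remote peer IP,
#           128 reserved bits, lifetime, internal port, 16 reserved bits,
#           remote peer port, 16 reserved bits.
PEER4_REQUEST = struct.Struct("!B3x4s4s16xIH2xH2x")
# Response: protocol, External_AF, 16 reserved bits, internal IP,
#           remote peer IP, external IP (128 bits), lifetime,
#           internal port, external port, remote peer port, 16 reserved bits.
PEER4_RESPONSE = struct.Struct("!BB2x4s4s16sIHHH2x")


def pack_peer4_request(protocol, internal_ip, internal_port,
                       remote_ip, remote_port, lifetime):
    return PEER4_REQUEST.pack(protocol,
                              socket.inet_aton(internal_ip),
                              socket.inet_aton(remote_ip),
                              lifetime, internal_port, remote_port)


def unpack_peer4_response(data):
    (protocol, external_af, internal_ip, remote_ip, external_ip,
     lifetime, internal_port, external_port, remote_port) = \
        PEER4_RESPONSE.unpack(data)
    return {
        "protocol": protocol,
        "external_af": external_af,
        "internal_ip": socket.inet_ntoa(internal_ip),
        "remote_ip": socket.inet_ntoa(remote_ip),
        "external_ip_raw": external_ip,     # 16 raw bytes, left undecoded
        "assigned_lifetime": lifetime,
        "internal_port": internal_port,
        "external_port": external_port,
        "remote_port": remote_port,
    }
```

Both structures are 40 bytes long for PEER4, which reflects the request and response formats being aligned with each other.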
In addition to the general PCP result codes (Section 5.4) the following additional result codes may be returned as a result of the two PEER OpCodes received by the PCP server.
Other result codes are defined following the procedure in Section 13.3.
This section describes the operation of a client when generating the OpCodes PEER4 or PEER6.
The PEER4 or PEER6 OpCodes MUST NOT be sent until bi-directional communication with the remote peer has been established; for TCP, this means completing the TCP 3-way handshake.
This section describes the operation of a server when receiving a request with the OpCodes PEER4 or PEER6.
The PEER OpCodes provide a single function: the ability for the PCP client to query and (possibly) extend the lifetime of an existing mapping.
On receiving the PEER4 or PEER6 OpCode, the PCP server examines the mapping table. If a mapping does not exist, the NONEXIST_PEER error is returned.
If the PCP-controlled device can have its lifetime adjusted, the PCP server uses the smaller of its configured maximum lifetime value and the requested lifetime from the PEER request, and sets the lifetime to that value.
Note: The PEER4 or PEER6 OpCodes can never reduce the lifetime of an existing mapping, nor can those OpCodes delete a mapping. If the mapping is terminated by the TCP client or server (e.g., TCP FIN or TCP RST), the mapping will eventually be destroyed normally; the earlier use of PEER does not extend the lifetime in that case.
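The server-side lifetime rules for PEER can be sketched as follows. The function name, the dictionary used as a mapping table, and the string result values are hypothetical; the sketch assumes the PCP-controlled device does allow its lifetime to be adjusted and models the lifetime as seconds remaining.

```python
def handle_peer(mappings, key, requested, configured_max):
    """Process a PEER request against 'mappings', a dict from a flow key
    to the number of seconds the mapping has left.

    Returns (result, assigned_lifetime).
    """
    if key not in mappings:
        return "NONEXIST_PEER", 0
    # Use the smaller of the requested and configured maximum lifetime...
    granted = min(requested, configured_max)
    # ...but never reduce the lifetime the mapping already has.
    mappings[key] = max(mappings[key], granted)
    return "SUCCESS", mappings[key]
```

Because of the final `max`, a PEER request with a small or zero lifetime acts as a pure query: it returns the current lifetime without shortening or deleting the mapping.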
If all of the preceding operations were successful (did not generate an error response), then a SUCCESS response is generated, with the assigned-lifetime containing the lifetime of the mapping.
This section describes the operation of a client when processing a response with the OpCodes PEER4 or PEER6.
A response is matched with a request by comparing the protocol, external AF, internal IP address, internal port, remote peer address and remote peer port. Other fields are not compared, because the PCP server changes those fields to provide information about the mapping created by the OpCode.
On a successful response, the PCP client uses the assigned lifetime value to reduce its frequency of application keepalives for that particular NAT mapping. Of course, there may be other reasons, specific to the application, to use more frequent application keepalives. For example, the PCP assigned-lifetime could be one hour but the application may want to ensure the server is still accessible (e.g., has not crashed) more frequently than once an hour.
If the error response is NONEXIST_PEER, the PCP client may have sent its PEER request before the PCP-controlled device had installed the mapping, or the mapping may have been destroyed (e.g., due to a TCP FIN). If the PCP client believes the mapping should exist, the PCP client SHOULD retry the request after a brief delay (e.g., 5 seconds).
Other error responses SHOULD NOT be retried.
It is REQUIRED that the PCP-controlled device assign the same external IP address to PCP-created explicit dynamic mappings and to implicit dynamic mappings. It is RECOMMENDED that static mappings (e.g., those created by a command language interface on the PCP server or PCP-controlled device) also be assigned to the same IP address.
Once all internal hosts belonging to a given subscriber have no implicit dynamic mappings and have no explicit dynamic mappings in the PCP-controlled device, a subsequent PCP request for that internal host MAY be assigned to a different external IP address. Generally, this re-assignment would occur when a CGN device is load balancing newly-seen hosts to its public IPv4 address pool.
To prevent spoofing of PCP requests, ingress filtering [RFC2827] MUST be performed by devices between the PCP clients and PCP server. For example, with a PCP server integrated into a customer premise router, the ethernet switch needs to perform ingress filtering. As another example, with a PCP server deployed by a service provider, the service provider's aggregation router (the first device connecting to subscribers) needs to do ingress filtering.
The interesting components in a Dual-Stack Lite deployment are the B4 element (which is the customer premises router) and the AFTR element (which is the device that both terminates the IPv4-over-IPv6 tunnel and also implements the Carrier-Grade NAT44 function). The B4 element does not need to perform a NAT function (and usually does not perform a NAT function), but it does operate its own DHCP server and is the local network's default router.
Various PCP deployment scenarios can be considered to control the PCP server embedded in the AFTR element:
Two modes are identified to forward PCP packets to a PCP server controlling the provisioned AFTR as described in the following sub-sections.
In this mode, the B4 element does no processing at all of the PCP messages, and forwards them as any other UDP traffic. With DS-Lite, this means that IPv4 PCP messages issued by internal PCP clients are encapsulated into the IPv6 tunnel sent to the AFTR as for any other IPv4 packets. The IPv6 address used as source address MUST be the same as the one used by the B4 element. The AFTR decapsulates the IPv4 packets and processes the PCP requests (because the destination IPv4 address points to the PCP server embedded in the AFTR).
Another alternative for deployment of PCP in a DS-Lite context is to rely on a PCP Proxy in the B4 element. Protocol exchanges between the PCP Proxy and the PCP server are conveyed using plain IPv6 (no tunnelling is used). Nevertheless, the IPv6 address used as source address by the PCP Proxy MUST be the same as the one used by the B4 element.
Hosts behind a NAT64 device can make use of PCP in order to perform port reservation (to get a publicly routable IPv4 port).
Residential subscribers in NAT44 (and NAT444) deployments are usually given one IPv4 address, but may also be given several IPv4 addresses. These addresses are not routable on the IPv4 Internet, but are routable between the subscriber's home and the ISP's CGN. To accommodate multiple hosts within a home, especially when provided insufficient IPv4 addresses for the number of devices in the home, subscribers operate a NAPT device. When this occurs in conjunction with an upstream NAT44, this is nicknamed "NAT444".
Many IPv6 deployments will include a simple firewall [RFC6092], which permits outgoing packets to initiate bi-directional communication but blocks unsolicited incoming packets, which is similar to PCP's security model that allows a host to create a mapping to itself. In many situations, especially residential networks that lack an IT staff, the security provided by an IPv6 simple firewall and the security provided by PCP are compatible. In such situations, the IPv6 simple firewall and the IPv6 host can use PCP's OpCode MAP6 to allow unsolicited incoming packets, so the host can operate a server.
The PCP client's source port SHOULD be randomly generated [RFC6056]. The PCP server MUST only listen for requests from its internal interfaces, and MUST NOT listen for requests on its Internet-facing interfaces.
This document defines Port Control Protocol and two types of OpCodes, PEER and MAP. The PEER OpCode allows querying and extending (if permitted) the lifetime of an existing implicit dynamic mapping, so a host can reduce its keepalive messages. The MAP OpCode allows creating a mapping so a host can receive incoming unsolicited connections from the Internet in order to run a server.
The PEER OpCode does not introduce any new security considerations.
On today's Internet, ISPs do not typically filter incoming traffic for their subscribers. However, when an ISP introduces stateful address sharing with a NAPT device, such filtering will occur as a side effect. Filtering will also occur with IPv6 CPE [I-D.ietf-v6ops-cpe-simple-security]. The MAP OpCode allows a PCP client to create a mapping so that a host can receive inbound traffic and operate a server. Security considerations for the MAP OpCode are described in the following sections.
Because of the state created in a NAPT or firewall, a per-subscriber quota will likely exist for both implicit dynamic mappings (e.g., outgoing TCP connections) and explicit dynamic mappings (PCP). A subscriber might make an excessive number of implicit or explicit dynamic mappings, consuming an inordinate number of ports and causing a denial of service to other subscribers. Thus, section XXX recommends that subscribers be limited to a reasonable number of explicit dynamic mappings.
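One way a PCP server might enforce such a limit is sketched below. The class name MappingQuota, its methods, and the subscriber identifiers are hypothetical, and the limit itself is a deployment choice that this document does not specify.

```python
class MappingQuota:
    """Track explicit dynamic mappings per subscriber against a fixed limit."""

    def __init__(self, limit):
        self.limit = limit  # assumed per-subscriber maximum
        self.count = {}     # subscriber id -> number of active mappings

    def try_add(self, subscriber):
        """Record one new mapping; return False if the quota is exhausted."""
        used = self.count.get(subscriber, 0)
        if used >= self.limit:
            return False
        self.count[subscriber] = used + 1
        return True

    def release(self, subscriber):
        """Record that one mapping expired or was deleted."""
        used = self.count.get(subscriber, 0)
        if used > 0:
            self.count[subscriber] = used - 1
```

Because the count is kept per subscriber, one subscriber reaching its limit does not prevent another subscriber from creating mappings.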
It is important to prevent a subscriber from creating a mapping for another subscriber, because this allows incoming packets from the Internet and consumes the other user's mapping quota. Both implicit dynamic mappings (e.g., outgoing TCP connections) and explicit dynamic mappings (PCP) need ingress filtering. Thus, PCP does not create a new requirement for ingress filtering.
The THIRD_PARTY option, used with the MAP OpCode, contains a Target Address field, which allows a PCP client to create an explicit dynamic mapping for another host. Hosts within a subscriber's network cannot create, modify, or delete mappings of other hosts, except by using the administrative interface of the customer premises router (Section 8.8.1).
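The restriction above could be expressed as a simple acceptance check, sketched here under stated assumptions: the function name request_permitted and the from_admin_interface flag are hypothetical, and how a device decides that a request arrived through its administrative interface is outside the scope of this example.

```python
import ipaddress


def request_permitted(source_addr, target_addr, from_admin_interface=False):
    """Decide whether a mapping request for target_addr may be honored.

    A host may always act on its own mappings. Acting on another host's
    mappings is allowed only through the administrative interface
    (an assumed flag supplied by the caller).
    """
    src = ipaddress.ip_address(source_addr)
    dst = ipaddress.ip_address(target_addr)
    if src == dst:
        return True
    return from_admin_interface
```

Parsing both addresses with ipaddress means that different textual forms of the same IPv6 address compare as equal.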
IANA is requested to perform the following actions:
IANA has assigned UDP port 44323 for PCP.
IANA shall create a new protocol registry for PCP OpCodes, initially populated with the values in Section 8 and Section 9.
New OpCodes in the range 1-95 can be created via Standards Action [RFC5226], and the range 96-128 is for Private Use [RFC5226].
IANA shall create a new registry for PCP result codes, numbered 0-255, initially populated with the result codes from Section 5.4, Section 8.2, Section 8.8.2, and Section 9.2.
Additional Result Codes can be defined via Specification Required [RFC5226].
IANA shall create a new registry for PCP Options, numbered 0-255 with an associated mnemonic. The values 0-127 are optional-to-process, and 128-255 are mandatory-to-process. The initial registry contains the options described in Section 8.8, and the option values 0 and 255 are reserved.
New PCP option codes in the ranges 0-63 and 128-191 can be created via Standards Action [RFC5226], and the ranges 64-127 and 192-255 are for Private Use [RFC5226].
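The option code space described above can be summarized in a short classification routine. This is a sketch only: the function name and the returned labels are hypothetical, and it assumes the boundary value 192 falls in the Private Use range.

```python
def classify_option_code(code):
    """Return (processing, policy) for a PCP option code in 0-255."""
    if not 0 <= code <= 255:
        raise ValueError("option code must be in 0-255")
    # 0-127 are optional-to-process, 128-255 are mandatory-to-process.
    processing = "optional" if code <= 127 else "mandatory"
    if code in (0, 255):
        policy = "reserved"
    elif code <= 63 or 128 <= code <= 191:
        policy = "standards-action"
    else:
        # 64-127 and 192-254 (192 assumed to be Private Use).
        policy = "private-use"
    return processing, policy
```

For example, classify_option_code(130) returns ("mandatory", "standards-action"), while classify_option_code(70) returns ("optional", "private-use").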
Thanks to Alain Durand, Christian Jacquenet, and Simon Perreault for their comments and review. Thanks to Simon Perreault for highlighting the interaction of dynamic connections with PCP-created mappings.