Network Working Group B. Williams
Internet-Draft Akamai, Inc.
Intended status: Standards Track March 2012

Overlay Path Option for IP and TCP

Abstract

Data transport through overlay networks often uses either connection termination or network address translation (NAT) in such a way that the public IP addresses of the true endpoint machines involved in the data transport are invisible to each other. This document describes IPv4, IPv6, and TCP options for communicating this information from the overlay network to the endpoint machines.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


Table of Contents

1. Introduction

An overlay network is a network of machines distributed throughout multiple autonomous systems within the public Internet that can be used to improve the performance of data transport. IP packets from the sender are delivered first to one of the machines that make up the overlay network. That machine will relay the IP packets via one or more other machines in the overlay network, applying various performance enhancement methods and eventually delivering the packets to the real intended receiver. Such overlay networks are used to improve the performance of content delivery [IEEE1344002]. Overlay networks are also used for peer-to-peer data transport [RFC5694], and they have been suggested for use in improved scalability for the internet routing infrastructure [RFC6179].

Data transport through such an overlay network will often use network address translation for the source (SNAT) or destination (DNAT) addresses [RFC2663] in such a way that the public IP addresses of the true endpoint machines involved in the data transport are invisible to each other. For example, the actual sender and receiver may use two completely different pairs of source and destination addresses to identify the connection on the sending and receiving networks.

---------------------------------------------------------------------

          ip hdr contains:               ip hdr contains:
SENDER -> src = sender   --> OVERLAY --> src = overlay2  --> RECEIVER 
          dst = overlay1                 dst = receiver  

---------------------------------------------------------------------
        

This can be problematic for network security and diagnostics, since the source and destination IP addresses used by the sender are not accessible by the receiver.

Some application layer protocols provide functionality that allows the overlay network to communicate the sender's public IP address to the receiver, such as the HTTP X-FORWARDED-FOR header field. However, use of such application specific options requires both the overlay network and the receiver to perform application layer processing with associated overhead. It also requires separate implementations for each supported application type.

Some proprietary implementations use IP-in-IP or UDP tunneling in order to communicate the information in question, but these solutions require continuous network overhead throughout the life of the connection.

In order to limit the network and processing overhead associated with the other commonly used approaches, the mechanism described herein uses an internet or transport layer protocol option to communication the required address(es).

Theoretically, the existing IP Record Route option could be used to provide the required information, but there are a few problems associated with this approach. First, this use would be a re-purposing of the option, and thus existing implementations of support for the option would quite possibly conflict with this new use. More importantly, many IP-layer devices are configured to drop packets that include IP protocol options. Allowing the information to be transmitted at either the internet layer or the transport layer allows for more reliable packet delivery in a broad range of network configurations.

This document describes IPv4 [RFC0791], IPv6 [RFC2460], and TCP [RFC0793] options for communicating this information from the overlay to the destination network/machine. The list of addresses specified in the option may include any addresses that might be useful to the eventual receiver for security and/or diagnostic purposes, including but not limited to the source and destination addresses as seen by the sender.

The IPv4 and TCP protocol options described herein provide limited support for IPv6 addresses. It should be noted that inclusion of even a single IPv6 address would require the option to consume nearly half of the total option space provided by TCP and IPv4, which means that the entire TCP option space would be consumed for SYN packets that include the most commonly used options (i.e. MSS, WSOPT, SACK permitted, and TSOPT). This may prevent effective use of the IPv4 and TCP protocol options for communicating IPv6 addresses in some circumstances.

2. Detailed Use Case

An example use case that is satisfied by this design is as follows.

  1. The sending endpoint host performs a DNS lookup that returns the IP address of a machine on the overlay network. The sending endpoint addresses its packets with its own public IP address as the source and the provided overlay IP address as the destination.
  2. The overlay ingress host receives the packet on its public IP address, and forwards the packet through the overlay to the egress host. The overlay egress host performs SNAT, replacing the source IP address of the sending endpoint with its own IP address in order to ensure that return traffic will also use the overlay network. This use of SNAT hides the client's public IP address for the receiving network.
  3. For load balancing and diagnostic purposes, it is important for one or more machines on the receiver's network to be able to determine the public IP address associated with the sending host (i.e. the address that was hidden due to the use of SNAT by the overlay).
  4. The overlay egress host is located on the receiver's network, which means there is a relatively small set of addresses for machines that may be producing packets that include the overlay path option.

Under these circumstances, the overlay path option will contain a single IP address: the public IP address of the sending host. If the receiving network must use the IP address included in the option for a purpose that requires trust, the fact that the overlay egress host is under the receiver's administrative control allows the receiver to apply the necessary limitations to the network configuration. For example, under these circumstances, the receiver's firewall device could be configured to drop packets from external hosts if they contain the overlay path option.

3. Option Format

Some implementations already exist for version 1 of the overlay path option. However, version 1 of the option does not provide support for communicating IPv6 addresses in either the IPv4 or TCP option. Both version 1 and version 2 of the option are described here in order to reflect the requirements of current and future implementors.

It is up to the implementor whether version 1 is supported or both versions are supported. A receiving implementation that supports version 2 MUST also support version 1. The format changes defined for version 2 directly support the required backward compatibility.

When a receiving implementation encounters the overlay path option with an unsupported version number, the receiver MAY either ignore the option or drop the packet. The appropriate response will be dependent upon how the overlay path option's value is used by the receiver.

3.1. Version 1

Version 1 of the option supports only IPv4 addresses. The option format for both IPv4 and TCP is identical.

+---------+---------+---------+--------------------------------+ 
|Type/Kind| Length  | Version | Addresses ...
+---------+---------+---------+--------------------------------+ 
     1         1         1            4 x Address Count
----------------------------------------------------------------
          

IPv4 Type:
The type value for IPv4 is TBD (see also IANA Considerations [iana]).
Copied flag:
1 (All fragments must carry the option.)
Option class:
2 (debugging/measurement)
Option number:
TBD (decimal)

TCP Kind:
The option kind value for TCP is TBD (see also IANA Considerations [iana]).
Length:
The length of the option is variable, based on the number of addresses provided. The minimum value is 7 (3 1-octet fields plus one 4-octet address). The option MUST be ignored if the length value cannot represent 3 octets plus a list of 4-octet address value.
Version:
The version number is 1.
Addresses:
Version 1 of the option supports only IPv4 addresses. The remainder of the option space is filled with standard 32-bit IPv4 addresses. In practice, the first address will be the public source address used by the sender and the second address (if present) will be the public destination address used by the sender. However, the nature of the addresses provided may vary depending on the nature of the overlay network in question and is not required to include every IP address used for the connection. The list of IP addresses MUST be provided in order of traversal from sender to receiver.

3.2. Version 2

Version 2 of the options supports either IPv4 addresses or IPv6 addresses, but it does not support a mix of IPv4 and IPv6 options within the same option value. Version 2 provides not only IPv4 and TCP options, but also an IPv6 option for inclusion in the IPv6 Hop-by-hop Options header. When IPv6 address support is required, the implementation SHOULD use the IPv6 header option whenever possible in order to avoid exhaustion of the TCP option space. The option format for all three protocols is identical.

+---------------+---------------+---------------+----------------------\ 
|   Type/Kind   |    Length     |Fmly| Version  |    Addresses ...
+---------------+---------------+---------------+----------------------\
       8b              8b       | 3b     5b     |
                                -----------------
        1               1              1            Addr Size x Count
------------------------------------------------------------------------
          

IPv4 Type:
Identical to Version 1.
TCP Kind:
Identical to Version 1.
IPv6 Type:
The Type value for IPv6 is TBD (see also IANA Considerations [iana]).
act flag:
00 (skip over option)
chg flag:
0 (option data does not change en-route)
rest:
TBD (decimal)

Length:
The length of the option is variable, based on the address family and the number of addresses provided. The minimum value is 7 (3 1-octet fields plus one 4-octet IPv4 address). The option MUST be ignored if the length value cannot represent 3 octets plus a list of addresses of the correct address family.
Family/Version:
The third octet is comprised of two fields: family and version.Note that the possible family values have been selected to support backward compatibility with the 8-bit version field in version 1 of the option format.
Family:
The high order 3 bits of the third octet indicate the address family for all IP addresses represented in the variable-length Addresses field. The allowed values are:
0:
Address family is IPv4.
1:
Address family is IPv6.

Version:
The low order 5 bits of the third octet indicate the protocol version number. The version number is 2.
Addresses:
The remainder of the option space is filled with either 32-bit IPv4 or 128-bit IPv6 addresses, as indicated by the Family field. In practice, the first address will be the public source address used by the sender and the second address (if present) will be the public destination address used by the sender. However, the nature of the addresses provided may vary depending on the nature of the overlay network in question and is not required to include every IP address used for the connection. The list of IP addresses MUST be provided in order of traversal from sender to receiver.

4. Network Traversal

The following block diagram illustrates the use of addresses in the IPv4 header and the overlay path option as a packet traverses the network from sender to receiver. The diagram assumes that the overlay network uses separate addresses (overlay1 and overlay2) for ingress and egress.

-----------------------------------------------------------------

                             SENDER
                               |
                               V
                       +----------------+
                       |                |
                       | src: sender    |
                       | dst: overlay1  |
                       | opt: none      |
                       |                |
                       +----------------+
                               |
                               V
                            OVERLAY
                            NETWORK
                               |
                               V
                       +----------------+
                       |                |
                       | src: overlay2  |
                       | dst: receiver  |
                       | opt: sender    |
                       |                |
                       +----------------+
                               |
                               V
                            RECEIVER

-----------------------------------------------------------------
        

5. Option Use

Use of the TCP option allows an implementation to minimize the impact of this option on bandwidth utilization. Due to the connection-oriented nature of TCP, the addresses used by the overlay network cannot change throughout the life of the connection. For this reason, it is not necessary for the overlay network to include the overlay path option on every packet. On the other hand, it is not enough for the option to be provided exclusively in the TCP SYN packet because the use of SYN cookies, for example, would mean that connection state is not stored until completion of the three-way handshake. For this reason, the overlay network MUST include the TCP overlay path option in every outgoing packet until the receiver has either acknowledged or transmitted at least one byte of real data. The overlay network SHOULD discontinue inclusion of the TCP overlay path option after the first byte is either received or acknowledged. The receiver MAY ignore the TCP overlay path option on subsequent packets after successfully processing one instance of the option attached to a single in-order TCP packet.

IP is not connection oriented, which means that the above described optimization is not possible. In order to make effective use of the TCP optimization, an overlay network SHOULD only send the IP option on packets that do not use TCP as the transport layer protocol. When the IP option is in use, the overlay network MUST transmit the option with every packet. The receiver MUST NOT assume that that addresses in the IP overlay path option will remain consistent, but instead MUST be prepared to handle address changes in an application appropriate way.

Use of the IP option is dependent upon support for IP options in all routers between the overlay egress point and the packet receiver. If any router along the path is configured to drop packets with unknown IPv4 options (or any IP options, as is sometimes done as part of a DoS protection scheme), then use of the IP option will cause connections to simply fail. For this reason, the IP option SHOULD only be used in environments where the full path between the overlay egress machine and the packet receiver is under common administrative control.

As explained above, the intention of both the TCP and IP options is to provide the receiver with public IP addresses that it would otherwise have seen if the overlay network were not in use. There are security implications associated with exposing a network's use of the private [RFC1918] address space to the public internet, and for this reason, the overlay path option SHOULD NOT be used to communicate RFC1918 addresses in packets that traverse the public internet.

6. Security Considerations

This specification provides no authentication/validity verification for the data contained in the address fields. For this reason, the data contained in the addresses field of the new option cannot itself be considered inherently secure. In other words, confidence in the validity of the source address of the IPv4/IPv6 packet does not translate into confidence in the validity of the addresses in the overlay path option. With this exception, this specification does not alter the inherent security of IPv4, IPv6, or TCP.

The addresses provided in the option SHOULD NOT be used for purposes that require a trust relationship between the overlay network and the receiver (e.g. billing and/or intrusion prevention) unless a mechanism outside the scope of this specification is used to ensure the necessary level of trust. As noted above, one possible example of such a mechanism would be to place the overlay egress host on the receiver's own network and to configure the receiver's firewall to drop any packets from external hosts that provide the overlay path option. When the receiving network uses the values provided by the option in a way that does not require trust (e.g. maintaining session affinity in a load-balancing system), then use of a mechanism to enforce the trust relationship may not be required.

7. Forward Compatibility Support

The most common use of this option on the internet today will require recording IP addresses for a single address family only. However, it may be important in the future to be able to record a mix of IPv4 and IPv6 addresses. Alternatively, future security requirements may demand the use of, for example, a keyed hash for data integrity and authentication purposes and/or inclusion of additional information specific to the sender's connection.

To balance current-day performance and efficiency against the need for future extensibility, the option includes a version field, so that future requirements can be met without the need to consume a new option number.

8. IANA Considerations

This document will make a request to IANA to allocate new option/kind values for IPv4, IPv6, and TCP.

9. References

[RFC0791] Postel, J., "Internet Protocol", RFC 791, STD 5, September 1981.
[RFC0793] Postel, J., "Transmission Control Protocol", RFC 793, STD 7, September 1981.
[RFC1918] Rekhter, Y., Moskowitz, B., Karrenberg, D., de Groot, G.J. and E. Lear, "Address Allocation for Private Internets", RFC 1918, BCP 5, February 1996.
[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6)", RFC 2460, December 1998.
[RFC2663] Srisuresh, P. and M. Holdrege, "IP Network Address Translator (NAT) Terminology and Considerations", RFC 2663, August 1999.
[RFC5694] Camarillo, G., "Peer-to-Peer (P2P) Architecture: Definition, Taxonomies, Examples, and Applicability", RFC 5694, November 2009.
[RFC6179] Templin, F., "The Internet Routing Overlay Network (IRON)", RFC 6179, March 2011.
[IEEE1344002] Byers, J.W., Considine, J., Mitzenmacher, M. and S. Rost, "Informed content delivery across adaptive overlay networks: IEEE/ACM Transactions on Networking, Vol 12, Issue 5, ppg 767-780", October 2004.

Author's Address

Brandon Williams Akamai, Inc. Cambridge, MA USA EMail: brandon.williams@akamai.com