Network Working Group | C.D. Donley |
Internet-Draft | C.G. Grundemann |
Intended status: Experimental Protocol | V.S. Sarawat |
Expires: March 29, 2012 | K.S. Sundaresan |
CableLabs | |
September 26, 2011 |
Deterministic Address Mapping to Reduce Logging in Carrier Grade NATs
draft-donley-behave-deterministic-cgn-00
Many Carrier Grade NAT solutions require per-connection logging. Unfortunately, such logging is not scalable to many residential broadband services. This document suggests a way to manage Carrier Grade NAT translations in such a way as to significantly reduce the amount of logging required while providing traceability for abuse response. This method also provides a way of including geo-location significance in such assignments.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on March 29, 2012.
Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
The world is rapidly running out of unallocated IPv4 addresses. To meet the growing demand for Internet service from new subscribers, devices, and service types, ISPs will be forced to share a single public IPv4 address among multiple subscribers using a technology such as Carrier Grade Network Address Translation (CGN) [RFC6264]. However, address sharing poses additional challenges to ISPs in responding to law enforcement requests or attack/abuse reports. In order to respond to such requests to identify a specific user associated with an IP address, an ISP will need to map a subscriber's internal source IP address and source port with the global public IP address and source port provided by the CGN for every connection initiated by the user.
CGN connection logging satisfies the need to identify attackers and respond to abuse/law enforcement requests, but it imposes significant operational challenges on ISPs. In lab testing, we have observed CGN log messages to be approximately 150 bytes long for NAT444 [I-D.shirasaki-nat444], and 175 bytes for DS-Lite [RFC6333] (individual log messages vary somewhat in size). Although we are not aware of definitive studies of connection rates per subscriber, reports from several ISPs in the US put the average number of connections per household at approximately 33,000 per day. If each connection is individually logged at 150 bytes per message, this translates to a data volume of approximately 5 MB per subscriber per day, or about 150 MB per subscriber per month; however, specific data volumes may vary across different ISPs based on myriad factors. Based on available data, a 1-million subscriber service provider will generate approximately 150 terabytes of log data per month, or 1.8 petabytes per year.
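The volume estimate above can be reproduced with a few lines of arithmetic. The sketch below uses the NAT444 figures quoted in the text; the 30-day month and decimal units (1 MB = 10^6 bytes) are assumptions made here for illustration.

```python
# Back-of-the-envelope log volume using the figures quoted in the text.
BYTES_PER_LOG = 150            # approximate NAT444 log message size
CONNECTIONS_PER_DAY = 33_000   # reported average per household
SUBSCRIBERS = 1_000_000
DAYS_PER_MONTH = 30            # assumption for this sketch

per_sub_day = BYTES_PER_LOG * CONNECTIONS_PER_DAY   # bytes per subscriber per day
per_sub_month = per_sub_day * DAYS_PER_MONTH        # bytes per subscriber per month
total_month = per_sub_month * SUBSCRIBERS           # bytes per month, all subscribers
total_year = total_month * 12                       # bytes per year, all subscribers

print(f"{per_sub_day / 1e6:.2f} MB per subscriber per day")
print(f"{per_sub_month / 1e6:.1f} MB per subscriber per month")
print(f"{total_month / 1e12:.1f} TB per month for {SUBSCRIBERS:,} subscribers")
print(f"{total_year / 1e15:.2f} PB per year")
```

The unrounded results (4.95 MB, 148.5 MB, 148.5 TB, 1.78 PB) match the rounded figures in the text.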
As an alternative to per-connection logging, CGNs could deterministically map internal addresses to external addresses in such a way as to be able to algorithmically calculate the mapping without relying on per connection logging. This document describes a method for such CGN address mapping, combined with block port reservations, that significantly reduces the burden on ISPs while offering the ability to map a subscriber's inside IP address with an outside address and port observed on the Internet.
While a subscriber uses thousands of connections per day, most subscribers use far fewer at any given time. When the compression ratio is low (i.e., the ratio of the number of subscribers to the number of public IPv4 addresses allocated to a CGN is closer to 10:1 than 1000:1), each subscriber could expect to have access to thousands of TCP/UDP ports at any given time. Thus, as an alternative to logging each connection, CGNs could deterministically map customer private addresses on the inside of the CGN to public addresses on the outside of the CGN. This algorithm will allow an operator to identify a subscriber internal IP address when provided the public side IP and port number without having to examine the CGN translation logs. This prevents an operator from having to transport and store massive amounts of session data from the CGN and then process it to identify a subscriber.
Deterministic NAT requires configuration of the following variables:
Note: The inside address range (I) will be an IPv4 range in NAT444 operation ([I-D.shirasaki-nat444]) and an IPv6 range in DS-Lite operation ([RFC6333]).
The CGN then reserves ports as follows:
Thus, the CGN will maintain translation mapping information for all connections within its internal translation tables; however, it only needs to externally log translations for dynamically-assigned ports.
As described in [RFC6269], CGN implementation can reduce the level of confidence and level of granularity of geo-location information. However, the level of confidence in geo-location data can be increased, even in a centralized CGN deployment, by sub-dividing inside and outside ranges. If I and O are subdivided such that I-1 corresponds to a particular headend or central office (CO), and I-2 corresponds with another headend/CO, etc., then geo-location data tied to address ranges O-1 and O-2, etc. can be accurate down to the headend/CO level, approximately the same level of granularity available in residential broadband services without CGNs. This information can help content providers enforce regional content licensing restrictions, target advertising to local markets, and assist with emergency services provisioning.
To illustrate the use of deterministic NAT, consider a simple example. The operator configures an inside address range (I) of 192.168.0.0/28, which contains 14 usable host addresses, and an outside address (O) of 203.0.113.1. The dynamic address pool factor (D) is set to '2'. Thus, the total compression ratio is 1:(14+2) = 1:16. Only the system ports (i.e., ports < 1024) are reserved. This configuration causes the CGN to preallocate (65536-1024)/16 = 4032 TCP and 4032 UDP ports per inside IPv4 address. For the purposes of this example, assume that they are allocated sequentially, where 192.168.0.1 maps to 203.0.113.1 ports 1024-5055, 192.168.0.2 maps to 203.0.113.1 ports 5056-9087, etc. The 14 preallocated ranges end at port 57471, so the dynamic port range contains ports 57472-65535. Finally, the maximum ports per subscriber (M) is set to 5040.
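The preallocation in this example can be sketched in Python as follows. This is only an illustration of the sequential assignment assumed in the example, not a normative algorithm; the function name `preallocated_ranges` and its parameters are hypothetical, and the way the dynamic factor is added to the host count follows the worked numbers above.

```python
import ipaddress

def preallocated_ranges(inside, dynamic_factor, reserved_below=1024):
    """Return ({inside_address: (first_port, last_port)}, dynamic_range).

    Assumes sequential assignment on a single outside address, as in the
    worked example; real deployments may use a different algorithm.
    """
    hosts = list(ipaddress.ip_network(inside).hosts())   # 14 hosts for a /28
    ratio = len(hosts) + dynamic_factor                   # 14 + 2 = 16
    per_host = (65536 - reserved_below) // ratio          # 4032 ports each
    ranges = {}
    for index, host in enumerate(hosts):
        first = reserved_below + index * per_host
        ranges[host] = (first, first + per_host - 1)
    # Whatever is left above the last preallocated range is the dynamic pool.
    dynamic = (reserved_below + len(hosts) * per_host, 65535)
    return ranges, dynamic

ranges, dynamic = preallocated_ranges("192.168.0.0/28", dynamic_factor=2)
print(ranges[ipaddress.ip_address("192.168.0.1")])   # (1024, 5055)
print(ranges[ipaddress.ip_address("192.168.0.2")])   # (5056, 9087)
print(dynamic)                                        # (57472, 65535)
```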
When subscriber 1 using 192.168.0.1 initiates a low volume of connections (e.g. < 4032 concurrent connections), the CGN maps the outgoing source address/port to the preallocated range. These translation mappings are not logged.
When subscriber 2 concurrently uses more than the allocated 4032 ports (e.g. for peer-to-peer, mapping, video streaming, or other connection-intensive traffic types), the CGN allocates up to an additional 1008 ports using bulk port reservations. In this example, subscriber 2 uses outside ports 5056-9087, and then ten 100-port blocks between 58000 and 58999. Connections using ports 5056-9087 are not logged, while 10 log entries are created for ports 58000-58099, 58100-58199, 58200-58299, ..., 58900-58999.
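A minimal sketch of the bulk port reservation described above is shown below. The `DynamicPool` class, its field names, and the log tuple layout are hypothetical and chosen only for illustration; the 100-port block size, the starting port 58000, and the 1008-port cap (5040 - 4032) are taken from the example. The point is that one log entry is written per block rather than per connection.

```python
from dataclasses import dataclass, field

@dataclass
class DynamicPool:
    next_port: int                 # first unassigned port in the dynamic range
    last_port: int = 65535
    block_size: int = 100          # block size used in the example
    log: list = field(default_factory=list)

    def grant_block(self, inside, outside, extra_held, max_extra, timestamp):
        """Grant one more block, or return None if a limit would be exceeded."""
        if extra_held + self.block_size > max_extra:
            return None                        # per-subscriber cap reached
        first = self.next_port
        last = first + self.block_size - 1
        if last > self.last_port:
            return None                        # dynamic pool exhausted
        self.next_port = last + 1
        # One log entry per block, not per connection.
        self.log.append((timestamp, inside, outside, first, last))
        return first, last

pool = DynamicPool(next_port=58000)
held = 0
# Subscriber 2 keeps asking for blocks until the 1008-port cap stops it.
while pool.grant_block("192.168.0.2", "203.0.113.1", held, 5040 - 4032, "t0") is not None:
    held += pool.block_size

print(held, len(pool.log))   # 1000 extra ports, 10 log entries
```

With a 100-port block size, an eleventh block would bring the subscriber to 1100 extra ports, above the 1008-port cap, so the sketch stops at ten blocks, matching the ten log entries in the example.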
If a law enforcement agency reports abuse from 203.0.113.1, port 2001, the operator can reverse the mapping algorithm to determine that subscriber 1 generated the traffic without consulting logs. If a second abuse report comes in for 203.0.113.1, port 58204, the operator will determine that port 58204 is within the dynamic pool range, consult the log file, and determine that subscriber 2 generated the traffic (assuming that the law enforcement timestamp matches the operator timestamp).
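The reverse lookup described above can be sketched as follows, assuming the sequential allocation and the configuration values from the example. The function name `find_subscriber` and its return convention are hypothetical; an actual CGN would apply whatever assignment algorithm it was configured with at the reported time.

```python
import ipaddress

def find_subscriber(port, inside="192.168.0.0/28", dynamic_factor=2,
                    reserved_below=1024):
    """Map an observed outside port back to an inside address.

    Returns ("preallocated", address), ("dynamic", None) when the bulk
    allocation logs must be consulted, or ("reserved", None).
    """
    hosts = list(ipaddress.ip_network(inside).hosts())
    per_host = (65536 - reserved_below) // (len(hosts) + dynamic_factor)
    dynamic_start = reserved_below + len(hosts) * per_host
    if port < reserved_below:
        return ("reserved", None)        # never assigned to subscribers
    if port >= dynamic_start:
        return ("dynamic", None)         # look up the block in the logs
    return ("preallocated", hosts[(port - reserved_below) // per_host])

print(find_subscriber(2001))    # preallocated range of 192.168.0.1
print(find_subscriber(58204))   # dynamic pool: consult the log file
```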
In this example, there are no log entries for the majority of subscribers, who only use pre-allocated ports. Only minimal logging would be needed for those few subscribers who exceed their pre-allocated ports and obtain extra bulk port assignments from the dynamic pool. Logging data for those users will include inside address, outside address, outside port range, and timestamp.
To identify a subscriber based on an observed external IPv4 address, port, and timestamp, an operator needs to know how the CGN was configured with regard to internal and external IP addresses, dynamic address pool factor, maximum ports per user, and reserved port range at any given time. Therefore, the CGN MUST generate a log message any time such variables are changed. Also, the CGN SHOULD generate such a log message once per day to facilitate quick identification of the relevant configuration in the event of an abuse notification.
Such a log message MUST, at minimum, include the timestamp, inside prefix I, inside mask, outside prefix O, outside mask, D, M, and reserved port range; for example:
[Wed Oct 11 14:32:52 2000]:192.168.0.0:28:203.0.113.0:32:2:5040:1-1023,5004,5060.
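As an illustration, a configuration log line in the form shown above could be parsed as sketched below. The field order is taken from the preceding description; the function name `parse_config_log`, the dictionary keys, and the treatment of the trailing period as sentence punctuation are assumptions. Splitting on ':' only works for an IPv4 inside prefix; a DS-Lite deployment with an IPv6 inside prefix would need a different delimiter or parser.

```python
from datetime import datetime

EXAMPLE = "[Wed Oct 11 14:32:52 2000]:192.168.0.0:28:203.0.113.0:32:2:5040:1-1023,5004,5060."

def parse_config_log(line):
    """Parse one configuration log line into a dictionary (IPv4 only)."""
    line = line.strip().rstrip(".")              # drop the trailing period
    stamp, rest = line[1:].split("]:", 1)        # "[timestamp]:" then fields
    inside, inside_mask, outside, outside_mask, d, m, reserved = rest.split(":")
    ranges = []
    for item in reserved.split(","):             # "1-1023" or a single port
        low, _, high = item.partition("-")
        ranges.append((int(low), int(high or low)))
    return {
        "timestamp": datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y"),
        "inside": f"{inside}/{inside_mask}",
        "outside": f"{outside}/{outside_mask}",
        "D": int(d),
        "M": int(m),
        "reserved": ranges,
    }

cfg = parse_config_log(EXAMPLE)
print(cfg["inside"], cfg["outside"], cfg["D"], cfg["M"], cfg["reserved"])
```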
This document makes no request of IANA.
The security considerations applicable to NAT operation for various protocols as documented in, for example, RFC 4787 [RFC4787] and RFC 5382 [RFC5382] also apply to this document.
TBD
[RFC2119] | Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. |
[RFC4787] | Audet, F. and C. Jennings, "Network Address Translation (NAT) Behavioral Requirements for Unicast UDP", BCP 127, RFC 4787, January 2007. |
[RFC6269] | Ford, M., Boucadair, M., Durand, A., Levis, P. and P. Roberts, "Issues with IP Address Sharing", RFC 6269, June 2011. |
[RFC5382] | Guha, S., Biswas, K., Ford, B., Sivakumar, S. and P. Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142, RFC 5382, October 2008. |
[RFC6264] | Jiang, S., Guo, D. and B. Carpenter, "An Incremental Carrier-Grade NAT (CGN) for IPv6 Transition", RFC 6264, June 2011. |
[I-D.ietf-pcp-base] | Wing, D., Cheshire, S., Boucadair, M., Penno, R. and P. Selkirk, "Port Control Protocol (PCP)", Internet-Draft draft-ietf-pcp-base-13, July 2011. |
[RFC6333] | Durand, A., Droms, R., Woodyatt, J. and Y. Lee, "Dual-Stack Lite Broadband Deployments Following IPv4 Exhaustion", RFC 6333, August 2011. |
[I-D.ietf-tsvwg-iana-ports] | Cotton, M., Eggert, L., Touch, J., Westerlund, M. and S. Cheshire, "Internet Assigned Numbers Authority (IANA) Procedures for the Management of the Service Name and Transport Protocol Port Number Registry", Internet-Draft draft-ietf-tsvwg-iana-ports-10, February 2011. |
[I-D.shirasaki-nat444] | Yamagata, I., Shirasaki, Y., Nakagawa, A., Yamaguchi, J. and H. Ashida, "NAT444", Internet-Draft draft-shirasaki-nat444-04, July 2011. |
[I-D.tsou-behave-natx4-log-reduction] | Tsou, T., Li, W. and T. Taylor, "Port Management To Reduce Logging In Large-Scale NATs", Internet-Draft draft-tsou-behave-natx4-log-reduction-02, September 2010. |