This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”
The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.
This Internet-Draft will expire on April 18, 2010.
Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
The success of the Internet is largely down to the elegant manner in which it shares capacity amongst all users while avoiding congestion collapse. However, this relies on the cooperation of all end users to work efficiently. Increasingly, a small minority of users are able to grab larger and larger shares of the network, leading ISPs to impose arbitrary controls on traffic. These controls set ISPs on a direct collision course with their customers and the regulators.
The root of the problem lies in the fact that ISPs are unable to see the most important information about the traffic, namely the amount of congestion that traffic is going to cause in the network. We propose congestion exposure as a possible solution. Every packet will carry an accurate prediction of the congestion it expects to cause downstream. This memo sets out the motivations for congestion exposure and introduces a strawman protocol designed to achieve congestion exposure.
1. Introduction
    1.1. Definitions
2. The Problem
    2.1. The Impact of Congestion
    2.2. Making Congestion Visible
    2.3. ECN - a Step in the Right Directions
3. Existing Approaches to Traffic Control
    3.1. Passive Measurement
        3.1.1. Volume Accounting
        3.1.2. Rate Measurement
    3.2. Active Discrimination
        3.2.1. Bottleneck Rate Policing
        3.2.2. DPI and Application Rate Policing
4. Why Now?
5. Requirements for a Solution
6. A Strawman Congestion Exposure Protocol
7. Use Cases
8. IANA Considerations
9. Security Considerations
10. Conclusions
11. Acknowledgements
12. Informative References
1. Introduction
The Internet has grown from humble origins to become a global phenomenon with billions of end-users able to share the network and exchange data and more. One of the key elements in this success has been the use of distributed algorithms such as TCP that share capacity while avoiding congestion collapse. These algorithms rely on the end-users altruistically reducing their transmission rate in response to any congestion they see.
In recent years ISPs have seen a small minority of customers taking a large share of the network by using protocols that open multiple simultaneous TCP connections and by remaining connected for hours or even days at a time. This issue first came to the fore with the advent of “always on” broadband connections. Peer-to-peer protocols have frequently been held responsible [RFC5594], but streaming video traffic is becoming increasingly significant. In order to improve the network experience for the majority of their customers, many ISPs have imposed controls on how their network’s capacity is shared. Approaches include volume counting or charging, and application rate limiting. Typically these traffic controls, whilst not impacting most customers, set a restriction on a customer’s level of network usage, as defined in a “fair usage policy”.
We believe that such traffic controls seek to control the wrong quantity. What matters in the network is neither the volume of traffic nor the rate of traffic; it is the amount of congestion. Congestion means that your traffic impacts other users, and conversely that their traffic impacts you. So if there is little other traffic there should be no restriction on the amount a user can send; restrictions should only apply when others are sending enough traffic to cause congestion. In fact some of the current work at the IETF [LEDBAT] and IRTF [CC-open-research] already reflects this thinking. For example, a LEDBAT user reduces their transmission rate when they detect an increase in end-to-end delay (as a measure of incipient congestion). However, these techniques currently rely on voluntary, altruistic action by end users. ISPs cannot enforce their use. This leads to our second point.
We believe that congestion needs to be visible to network nodes and not just to the end hosts as is the case today. By this we mean that a network can detect how much congestion traffic causes between a point in the network and the destination (“rest-of-path congestion”). This capability is new; today a network can only detect how much congestion traffic has suffered between the source and a point in the network. Such a capability enables an ISP to give incentives for the use, without restrictions, of LEDBAT-like applications whilst perhaps restricting TCP and UDP ones.
So we propose a new approach which we call congestion exposure. We propose that congestion information should be made visible at the IP layer, where it is accessible by any application or transport protocol. Once the information is exposed in this way, it is then possible to use it to measure the true impact of any traffic on the network. One use of this information would be to measure the congestion attributable to a given application or user and thereby incentivise the use of protocols such as [LEDBAT] which aim to reduce the congestion caused by bulk data transfers. It is possible to imagine many other ways to use the exposed congestion information; some are outlined in Section 7 (Use Cases).
1.1. Definitions
Throughout this document we use two terms that need to be carefully defined to avoid ambiguity:
- Upstream congestion is defined as the congestion that has already been experienced by a packet as it travels along its path. In other words at any point on the path it is the congestion between that point and the source of the packet.
- Downstream congestion is defined as the congestion that a packet still has to experience on the remainder of its path. In other words at any point it is the congestion still to be experienced as the packet travels between that point and its destination.
2. The Problem
ISPs are facing a quandary – traffic is growing rapidly yet they know that any increases in capacity will be of most benefit to the most aggressive users. At the same time, traffic patterns are changing significantly [Cisco-VNI] and all the while increasing competition is squeezing their profit margins. Faced with these problems, some ISPs are seeking to reduce what they regard as "heavy usage" in order to improve the service experienced by the majority of their customers.
2.1. The Impact of Congestion
As the Internet has grown, access rates have lagged behind those in the core. However over recent years this has started to change. Increasingly large numbers of people now access the network via broadband connections and the speed they can get is steadily increasing. Alongside this have gone significant changes in traffic patterns. We have been through a boom in large-scale data transfer by peer-to-peer networks and now are seeing an even larger boom in streaming media with applications such as the BBC iPlayer becoming increasingly popular. The main effect of this has been that users now routinely see their network connections running slow in the evenings [OfCom].
2.2. Making Congestion Visible
Unfortunately ISPs are only able to see limited information about the traffic they forward. As we will see in Section 3, they are forced to use the only information they do have available, which leads to myopic control that has scant regard for the actual impact of the traffic or the underlying network conditions. All their approaches are flawed because they measure the wrong metric. The volume or rate of a given flow doesn’t directly affect other users, but the congestion the flow causes does. This can be seen with a simple illustration. A 5Mbps flow in an otherwise empty 10Mbps bottleneck causes no congestion and so affects no other users. By contrast a 1Mbps flow entering a 10Mbps bottleneck that is already fully occupied causes significant congestion and impacts every other user sharing that bottleneck. So the real problem that needs to be addressed is how to close this information gap. How can we expose congestion at the IP layer so that it can be used as the basis for measuring the impact of any traffic on the network as a whole?
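The 5Mbps and 1Mbps illustration in this section can be made concrete with a small sketch. It rests on a deliberately crude model that is our assumption and not part of this memo: the bottleneck drops (or marks) whatever fraction of the offered load exceeds its capacity, and a flow's contribution is its rate multiplied by that fraction. The function names are hypothetical.

```python
def congestion_fraction(offered_mbps, capacity_mbps):
    """Fraction of the offered load that exceeds capacity (assumed model)."""
    total = sum(offered_mbps)
    if total <= capacity_mbps:
        return 0.0
    return (total - capacity_mbps) / total


def congestion_caused(flow_mbps, other_mbps, capacity_mbps):
    """Rate (Mbps) of this flow's traffic that is in excess of capacity."""
    fraction = congestion_fraction([flow_mbps] + list(other_mbps), capacity_mbps)
    return flow_mbps * fraction


# A 5 Mbps flow alone in a 10 Mbps bottleneck causes no congestion.
print(congestion_caused(5.0, [], 10.0))

# A 1 Mbps flow joining a bottleneck that already carries 10 Mbps pushes
# the excess fraction above zero for every flow sharing it.
print(congestion_fraction([1.0, 10.0], 10.0))
```

Under this model the larger flow scores zero while the smaller one does not, which is exactly the distinction that volume and rate measurements cannot make.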
2.3. ECN - a Step in the Right Directions
Explicit Congestion Notification [RFC3168] allows routers to explicitly tell end-hosts that they are approaching the point of congestion. ECN builds on Active Queue Management mechanisms such as Random Early Detection (RED) [RFC2309] by allowing the router to mark a packet with a Congestion Experienced (CE) codepoint, rather than dropping it. The probability of a packet being marked increases with the length of the queue and thus the rate of CE marks is a guide to the level of congestion at that queue. This CE codepoint travels forward through the network to the receiver which then informs the sender that it has seen congestion. The sender is then required to respond as if it had experienced a packet loss. Because the CE codepoint is visible in the IP layer, this approach reveals the upstream congestion level for a packet.
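The relationship between queue length and marking probability can be sketched as follows. This is a simplified form of the RED marking curve: the thresholds and maximum probability are illustrative assumptions rather than recommended values, and a real RED queue works from a smoothed average queue length that this sketch simply takes as an input.

```python
def mark_probability(avg_queue, min_th=5.0, max_th=15.0, max_p=0.1):
    """Probability of CE-marking an arriving packet (simplified RED curve).

    All default parameter values are illustrative assumptions.
    """
    if avg_queue < min_th:
        return 0.0                      # short queue: never mark
    if avg_queue >= max_th:
        return 1.0                      # long queue: mark everything
    # Between the thresholds the probability rises linearly up to max_p.
    return max_p * (avg_queue - min_th) / (max_th - min_th)


for queue in (2, 5, 10, 14, 20):
    print(queue, mark_probability(queue))
```

Because the probability rises with queue length, the fraction of CE-marked packets seen further along the path gives an estimate of how congested that queue is.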
The choice of two ECT code-points in the ECN field [RFC3168] permitted future flexibility, optionally allowing the sender to encode the experimental ECN nonce [RFC3540] in the packet stream. This mechanism has since been included in the specifications of DCCP [RFC4340]. The ECN nonce is an elegant scheme that allows the sender to detect if someone in the feedback loop - the receiver especially - tries to claim no congestion was experienced when in fact congestion led to packet drops or ECN marks. For each packet it sends, the sender chooses between the two ECT codepoints in a pseudo-random sequence. Then, whenever the network marks a packet with CE, if the receiver wants to deny congestion happened, she has to guess which ECT codepoint was overwritten. She has only a 50:50 chance of being correct each time she denies a congestion mark or a drop, which ultimately will give her away.
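The 50:50 argument above can be sketched numerically. The code below is not the nonce-sum mechanism of [RFC3540]; it is a simplified model, assumed for illustration, in which the receiver must guess one hidden bit for each congestion signal she conceals. The function names are hypothetical.

```python
import random


def undetected_probability(concealed_marks):
    """Chance that every guess at an overwritten codepoint is correct."""
    return 0.5 ** concealed_marks


def receiver_caught(concealed_marks, rng):
    """Simulate one cheating receiver; True if any guess is wrong."""
    for _ in range(concealed_marks):
        nonce_bit = rng.randint(0, 1)   # sender's choice of ECT codepoint
        guess = rng.randint(0, 1)       # receiver cannot see it after CE
        if guess != nonce_bit:
            return True
    return False


for marks in (1, 5, 10, 20):
    print(marks, undetected_probability(marks))

print(receiver_caught(10, random.Random(2009)))
```

The probability of escaping detection halves with every concealed signal, so a receiver that lies persistently is exposed quickly.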
So is ECN the solution? Alas not. ECN does allow downstream nodes to measure the upstream congestion for any flow, but this is not enough. What is needed is knowledge of the downstream congestion level, which requires additional information that is still concealed from the network by design.
3. Existing Approaches to Traffic Control
Existing approaches intended to address the problems outlined above can be broadly divided into two groups - those that passively monitor traffic and can thus measure the apparent impact of a given flow of packets and those that can actively discriminate against certain packets, flows, applications or users based on various characteristics or metrics.
3.1. Passive Measurement
Passive measurement of traffic relies on using the information that can be measured directly or is revealed in the IP header of the packet. Architecturally, passive measurement is best since it fits with the idea of the hourglass design of the Internet [RFC3439]. This asserts that "the complexity of the Internet belongs at the edges, and the IP layer of the Internet should remain as simple as possible."
3.1.1. Volume Accounting
Volume accounting is a passive technique that is often used to discriminate between users. The volume of traffic sent by a given user or network is one of the easiest pieces of information to monitor in a network. Measuring the size of every packet and adding them up is a simple operation and to make it even easier, every IP packet carries its overall size in the header. Consequently this has long been a favoured measure used by operators to control their customers.
The precise manner in which this volume information is used may vary. Typically ISPs may impose an overall volume cap on their customers (perhaps 10 Gbytes a month). Alternatively they may choose to restrict only the minority of heaviest users in some fashion.
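Volume accounting of this kind can be sketched in a few lines. The record format, the function names and the reading of "10 Gbytes" as 10 * 10^9 bytes are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed interpretation of the example cap of 10 Gbytes a month.
MONTHLY_CAP_BYTES = 10 * 10**9


def account_volume(packets):
    """Sum the IP total-length value of every packet per customer.

    `packets` is an iterable of (customer_id, total_length_bytes) pairs.
    """
    usage = defaultdict(int)
    for customer_id, total_length in packets:
        usage[customer_id] += total_length
    return dict(usage)


def over_cap(usage, cap_bytes=MONTHLY_CAP_BYTES):
    """Customers whose accumulated volume exceeds the cap."""
    return sorted(c for c, volume in usage.items() if volume > cap_bytes)


usage = account_volume([("alice", 1500), ("bob", 40), ("alice", 1500)])
print(usage, over_cap(usage, cap_bytes=1000))
```

Note that the counter advances by the same amount whether a packet crossed an idle network or a congested one, which is why volume alone says little about impact on other users.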
3.1.2. Rate Measurement
Traffic rates are often used as the basis of accounting at borders between ISPs. For instance a contract might specify a charge based on the 95th percentile of the peak rate of traffic crossing the border every month. Such bulk rate measurements are relatively easy to make. With a little extra effort it is possible to measure the rate of a given flow by using the 3-tuple of source and destination IP address and protocol number.
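Both measurements can be sketched briefly. The nearest-rank percentile used here is one common convention and is an assumption on our part, since contracts differ in how they sample and rank rates; the record layout and function names are likewise hypothetical.

```python
from collections import defaultdict


def percentile_95(rate_samples):
    """95th percentile of the samples by the nearest-rank method."""
    if not rate_samples:
        raise ValueError("no samples")
    ordered = sorted(rate_samples)
    rank = (95 * len(ordered) + 99) // 100   # integer ceiling, 1-based rank
    return ordered[rank - 1]


def flow_rates_bps(packets, interval_s):
    """Average bit rate per (source, destination, protocol) 3-tuple.

    `packets` is an iterable of (src, dst, protocol, length_bytes) tuples
    observed during one measurement interval of `interval_s` seconds.
    """
    byte_counts = defaultdict(int)
    for src, dst, protocol, length in packets:
        byte_counts[(src, dst, protocol)] += length
    return {flow: 8 * count / interval_s for flow, count in byte_counts.items()}


print(percentile_95(list(range(1, 101))))
print(flow_rates_bps([("a", "b", 6, 1000), ("a", "b", 6, 500)], 1.0))
```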
3.2. Active Discrimination
[RFC5290] seeks to reinforce [RFC3439] by stating that "...differential treatment of traffic can clearly be useful..." but adding that such techniques are only useful "...as *adjuncts* to simple best-effort traffic, not as *replacements* of simple best-effort traffic." We fully agree with the authors that the network should be built on the concept of simple best effort traffic. However, as this section shows, a number of approaches have emerged over recent years that explicitly differentiate between different traffic types, applications and even users.
3.2.1. Bottleneck Rate Policing
Bottleneck rate policers such as [XCHOKe] and [pBox] have been proposed as approaches for rate policing traffic. But they must be deployed at bottlenecks in order to work. Unfortunately, a large proportion of traffic traverses at least two bottlenecks which limits the utility of this approach. Such rate policers also make an assumption about what constitutes acceptable behaviour. If these bottleneck policers were widely deployed, the Internet would find itself with one universal rate adaptation policy embedded throughout the network. With TCP's congestion control algorithm approaching its scalability limits as the network bandwidth continues to increase, new algorithms are being developed for high-speed congestion control. Embedding assumptions about acceptable rate adaptation would make evolution to such new algorithms extremely painful.
3.2.2. DPI and Application Rate Policing
Some operators use deep packet inspection (DPI) and traffic analysis to identify certain applications believed to have an excessive impact on the network. These so-called heavy applications are generally things like peer-to-peer or streaming video. Having identified a flow as belonging to a heavy application, the operator is able to use standard Diffserv [RFC2475] approaches such as policing and traffic shaping to limit the throughput given to that flow. This has fuelled the on-going battle between application developers and DPI vendors. When operators first started to limit the throughput of P2P, it soon became common knowledge that turning on encryption could boost your throughput. The DPI vendors then improved their equipment so that it could identify P2P traffic by the pattern of packets it sends. This risks becoming an endless vicious cycle - an arms race that neither side can win. Furthermore such techniques may put the operator in direct conflict with the customers, regulators and content providers.
4. Why Now?
Congestion has long been the key metric for deciding how fast a flow can safely send traffic. Since the late 1990s it has also been recognised as a key metric for measuring the impact of traffic across the network [Kelly]. The IETF is keen to encourage techniques that reduce congestion. For instance the [LEDBAT] working group focuses on broadly applicable techniques that allow large amounts of data to be transmitted without substantially affecting the delays experienced by other users and applications, thus reducing the congestion that traffic is causing.
But users will only want to take advantage of such techniques if they actually improve their performance. As long as ISPs continue to use rate and volume as the key metrics for determining when to control traffic there is no incentive to use LEDBAT or other protocols that reduce congestion. We believe that congestion exposure gives ISPs the information they need to be able to discriminate in favour of such low-congestion transports.
5. Requirements for a Solution
This section lists the main requirements for any solution to this problem. We believe that a solution that meets most of these requirements is likely to be better than one that doesn't.
Many of these requirements are by no means unique to the problem of congestion exposure. Incremental deployment for instance is a critical requirement for any new protocol that affects something as fundamental as IP. Being robust under attack is also a pre-requisite for any protocol to succeed in the real Internet and this is covered in more detail in Section 9 (Security Considerations).
6. A Strawman Congestion Exposure Protocol
In this section we are going to explore a simple strawman protocol that would solve the congestion exposure problem. This protocol neatly illustrates how a solution might work. A practical implementation of this protocol has been produced and both simulations and real-life testing show that it works. The protocol is based on a concept known as re-feedback [Re-fb] and assumes that routers can measure their congestion level precisely.
Re-feedback, standing for re-inserted feedback, is a system designed to allow end-hosts to reveal to the network information they have received via conventional feedback (for instance RTT or congestion level).
In our strawman protocol we imagine that packets have two "congestion" fields in their IP header. One field records the upstream congestion level, and routers indicate their current congestion level by changing this field in every packet. So as the packet traverses the network it builds up a record of the overall congestion along its path in this field. This data is then sent back to the sender, who uses it to determine its transmission rate. Using re-feedback, the sender now inserts this congestion value in the second, whole-path congestion field on every packet it sends out. Thus at any node downstream of the sender you can see the upstream congestion for the packet (the congestion thus far) and the whole-path congestion (with a time lag of one RTT), and can calculate the downstream congestion by subtracting the former from the latter.
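The two-field scheme can be sketched as a short simulation. The sketch assumes that congestion levels simply add along the path and that each router reads the fields on arrival, before adding its own contribution; the field names, the example path values and the `traverse` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    upstream: float = 0.0     # first field: accumulated by routers in flight
    whole_path: float = 0.0   # second field: re-inserted by the sender


def traverse(packet, path_congestion):
    """Carry a packet along the path.

    Returns the downstream congestion each router could compute on arrival
    (whole_path minus upstream), before it adds its own congestion level.
    """
    seen = []
    for level in path_congestion:
        seen.append(packet.whole_path - packet.upstream)
        packet.upstream += level
    return seen


path = [0.01, 0.0, 0.03]       # assumed per-router congestion levels

first = Packet()               # no feedback yet, so whole_path is unset
traverse(first, path)
feedback = first.upstream      # the receiver returns this to the sender

second = Packet(whole_path=feedback)   # re-feedback, one RTT later
print([round(value, 3) for value in traverse(second, path)])
```

The first packet carries no useful whole-path value, which reflects the one-RTT lag noted above. From the second packet onwards each router can estimate the congestion that remains between itself and the destination.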
So congestion exposure can be achieved by coupling congestion notification from routers with the re-insertion of this information by the sender. This establishes an information symmetry between users and network providers which opens the door for the evolution of new congestion responses which are not bound to a "universal rate adaptation policy".
7. Use Cases
Once downstream congestion information is revealed in the IP header it can be used for a number of purposes. Precise details of how the information might be used are beyond the scope of this document but this section will give an overview of some possible uses.
It allows an ISP to accurately identify which traffic is having the greatest impact on the network and either police directly on that basis or use it to determine which users should be policed. It can form the basis of inter-domain contracts between operators. It could even be used as the basis for inter-domain routing, thus encouraging operators to invest appropriately in improving their infrastructure.
From Rich Woundy: "I would add a section about use cases. The primary use case would seem to be an "incentive environment that ensures optimal sharing of capacity", although that could use a better title. Other use cases may include "DDoS mitigation", "end-to-end QoS", "traffic engineering", and "inter-provider service monitoring". (You can see I am stealing liberally from the motivation draft here. We'll have to see whether the other use cases are "core" to this group, or "freebies" that come along with re-ECN as a particular protocol.)"
My take on this is we need to concentrate on one or two major use cases. The most obvious one is using this to control user-behaviour and encourage the use of "congestion friendly" protocols such as LEDBAT. {Comments from Louise Krug} simply say that operators MUST turn off any kind of rate limitation for LEDBAT traffic and what they might mean for the amount of bandwidth they see compared to a throttled customer? You could then extend that to say how it leads to better QoS differentiation under the assumption that there is a broad traffic mix anyway? Not sure how much detail you want to go into here though?
8. IANA Considerations
This document makes no request to IANA.
9. Security Considerations
This section needs to briefly cover the obvious security aspects of any congestion exposure scheme: source address spoofing, DoS, integrity of signals, honesty. It might also be the place to mention the possible reluctance to reveal too much information to the whole network (some ISPs view congestion level as a commercially sensitive concept). It needs to concentrate on two things in particular: DoS attacks that seek to corrupt the congestion information and how to ensure the honest declaration of information (both by the network and by the end-user).
10. Conclusions
Congestion exposure is the idea that traffic itself indicates to all nodes on its path how much congestion it causes on the entire path. It is useful for network operators to police flows only if they really cause congestion in the Internet instead of doing blind rate capping independently of the congestion situation. This change would give incentives to users to adopt new transport protocols such as LEDBAT which try to avoid congestion more than TCP does. Requirements for congestion exposure in the IP header were summarized, one technical solution was presented, and additional use cases for congestion exposure were discussed.
11. Acknowledgements
A number of people have provided text and comments for this memo. The document is being produced in support of a BoF on Congestion Exposure as discussed extensively on the <re-ecn@ietf.org> mailing list.
12. Informative References
[CC-open-research]  Welzl, M., Scharf, M., Briscoe, B., and D. Papadimitriou, “Open Research Issues in Internet Congestion Control,” draft-irtf-iccrg-welzl-congestion-control-open-research-05 (work in progress), September 2009.
[Cisco-VNI]  Cisco Systems, inc., “Cisco Visual Networking Index: Forecast and Methodology, 2008–2013,” June 2009.
[Fairness]  Briscoe, B., Moncaster, T., and A. Burness, “Problem Statement: Transport Protocols Don't Have To Do Fairness,” draft-briscoe-tsvwg-relax-fairness-01 (work in progress), July 2008.
[Kelly]  Kelly, F., Maulloo, A., and D. Tan, “Rate control for communication networks: shadow prices, proportional fairness and stability,” Journal of the Operational Research Society 49(3) 237--252, 1998.
[LEDBAT]  Shalunov, S., “Low Extra Delay Background Transport (LEDBAT),” draft-shalunov-ledbat-congestion-00 (work in progress), March 2009.
[OfCom]  Ofcom: Office of Communications, “UK Broadband Speeds 2008: Research report,” January 2009.
[RFC2309]  Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering, S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, S., Wroclawski, J., and L. Zhang, “Recommendations on Queue Management and Congestion Avoidance in the Internet,” RFC 2309, April 1998.
[RFC2475]  Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., and W. Weiss, “An Architecture for Differentiated Services,” RFC 2475, December 1998.
[RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, “The Addition of Explicit Congestion Notification (ECN) to IP,” RFC 3168, September 2001.
[RFC3439]  Bush, R. and D. Meyer, “Some Internet Architectural Guidelines and Philosophy,” RFC 3439, December 2002.
[RFC3448]  Handley, M., Floyd, S., Padhye, J., and J. Widmer, “TCP Friendly Rate Control (TFRC): Protocol Specification,” RFC 3448, January 2003.
[RFC3540]  Spring, N., Wetherall, D., and D. Ely, “Robust Explicit Congestion Notification (ECN) Signaling with Nonces,” RFC 3540, June 2003.
[RFC4340]  Kohler, E., Handley, M., and S. Floyd, “Datagram Congestion Control Protocol (DCCP),” RFC 4340, March 2006.
[RFC5290]  Floyd, S. and M. Allman, “Comments on the Usefulness of Simple Best-Effort Traffic,” RFC 5290, July 2008.
[RFC5594]  Peterson, J. and A. Cooper, “Report from the IETF Workshop on Peer-to-Peer (P2P) Infrastructure, May 28, 2008,” RFC 5594, July 2009.
[Re-fb]  Briscoe, B., Jacquet, A., Di Cairano-Gilfedder, C., Salvatori, A., Soppera, A., and M. Koyabe, “Policing Congestion Response in an Internetwork Using Re-Feedback,” ACM SIGCOMM CCR 35(4) 277--288, August 2005.
[XCHOKe]  Chhabra, P., Chuig, S., Goel, A., John, A., Kumar, A., Saran, H., and R. Shorey, “XCHOKe: Malicious Source Control for Congestion Avoidance at Internet Gateways,” Proceedings of IEEE International Conference on Network Protocols (ICNP-02), November 2002.
[pBox]  Floyd, S. and K. Fall, “Promoting the Use of End-to-End Congestion Control in the Internet,” IEEE/ACM Transactions on Networking 7(4) 458--472, August 1999.
Authors' Addresses

Toby Moncaster (editor)
BT
B54/70, Adastral Park
Martlesham Heath
Ipswich IP5 3RE
UK
Phone: +44 7918 901170
EMail: toby.moncaster@bt.com

Louise Krug
BT
B54/77, Adastral Park
Martlesham Heath
Ipswich IP5 3RE
UK
EMail: louise.burness@bt.com

Michael Menth
University of Wuerzburg
room B206, Institute of Computer Science
Am Hubland
Wuerzburg D-97074
Germany
Phone: +49 931 888 6644
EMail: menth@informatik.uni-wuerzburg.de

João Taveira Araújo
UCL
GS206 Department of Electronic and Electrical Engineering
Torrington Place
London WC1E 7JE
UK
EMail: j.araujo@ee.ucl.ac.uk

Steven Blake
Extreme Networks
Pamlico Building One, Suite 100
3306/08 E. NC Hwy 54
RTP, NC 27709
US
EMail: sblake@extremenetworks.com

Richard Woundy
Comcast
Comcast Cable Communications
27 Industrial Avenue
Chelmsford, MA 01824
US
EMail: richard_woundy@cable.comcast.com
URI: http://www.comcast.com