Network Working Group                                      F. Baker, Ed.
Internet-Draft                                             Cisco Systems
Obsoletes: 2309 (if approved)                             April 24, 2013
Intended status: Best Current Practice
Expires: October 26, 2013
IETF Recommendations Regarding Active Queue Management
draft-baker-aqm-recommendation-01
This memo presents recommendations to the Internet community concerning measures to improve and preserve Internet performance. It presents a strong recommendation for testing, standardization, and widespread deployment of active queue management in routers, to improve the performance of today's Internet. It also urges a concerted effort of research, measurement, and ultimate deployment of router mechanisms to protect the Internet from flows that are not sufficiently responsive to congestion notification.
The note largely repeats the recommendations of RFC 2309, updated after fifteen years of experience and new research.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on October 26, 2013.
Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
The Internet protocol architecture is based on a connectionless end-to-end packet service using the Internet Protocol, whether IPv4 [RFC0791] or IPv6 [RFC2460]. The advantages of its connectionless design, flexibility and robustness, have been amply demonstrated. However, these advantages are not without cost: careful design is required to provide good service under heavy load. In fact, lack of attention to the dynamics of packet forwarding can result in severe service degradation or "Internet meltdown". This phenomenon was first observed during the early growth phase of the Internet of the mid 1980s [RFC0896][RFC0970], and is technically called "congestive collapse".
The original fix for Internet meltdown was provided by Van Jacobson. Beginning in 1986, Jacobson developed the congestion avoidance mechanisms that are now required in TCP implementations [Jacobson88] [RFC1122]. These mechanisms operate in the hosts to cause TCP connections to "back off" during congestion. We say that TCP flows are "responsive" to congestion signals (i.e., marked or dropped packets) from the network. It is primarily these TCP congestion avoidance algorithms that prevent the congestive collapse of today's Internet.
However, that is not the end of the story. Considerable research has been done on Internet dynamics since 1988, and the Internet has grown. It has become clear that the TCP congestion avoidance mechanisms [RFC5681], while necessary and powerful, are not sufficient to provide good service in all circumstances. Basically, there is a limit to how much control can be accomplished from the edges of the network. Some mechanisms are needed in the routers to complement the endpoint congestion avoidance mechanisms.
It is useful to distinguish between two classes of router algorithms related to congestion control: "queue management" versus "scheduling" algorithms. To a rough approximation, queue management algorithms manage the length of packet queues by marking or dropping packets when necessary or appropriate, while scheduling algorithms determine which packet to send next and are used primarily to manage the allocation of bandwidth among flows. While these two router mechanisms are closely related, they address rather different performance issues.
This memo highlights two performance issues. The first issue is the need for an advanced form of queue management that we call "active queue management." Section 2 summarizes the benefits that active queue management can bring. A number of Active Queue Management procedures are described in the literature, with different characteristics. This document does not recommend any of them in particular, but does make recommendations that ideally would affect the choice of procedure used in a given implementation.
The second issue, discussed in Section 3 of this memo, is the potential for future congestive collapse of the Internet due to flows that are unresponsive, or not sufficiently responsive, to congestion indications. Unfortunately, there is no consensus solution to controlling congestion caused by such aggressive flows; significant research and engineering will be required before any solution will be available. It is imperative that this work be energetically pursued, to ensure the future stability of the Internet.
Section 4 concludes the memo with a set of recommendations to the Internet community concerning these topics.
The discussion in this memo applies to "best-effort" traffic, which is to say, traffic generated by applications that accept the occasional loss, duplication, or reordering of traffic in flight. It is most effective, on time scales of a single RTT or a small number of RTTs, for elastic traffic [RFC1633], but also impacts real time traffic generated by adaptive applications.
[RFC2309] resulted from past discussions of end-to-end performance, Internet congestion, and RED in the End-to-End Research Group of the Internet Research Task Force (IRTF). This update results from experience with that and other algorithms, and the Active Queue Management discussion within the IETF.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
The traditional technique for managing router queue lengths is to set a maximum length (in terms of packets) for each queue, accept packets for the queue until the maximum length is reached, then reject (drop) subsequent incoming packets until the queue decreases because a packet from the queue has been transmitted. This technique is known as "tail drop", since the packet that arrived most recently (i.e., the one on the tail of the queue) is dropped when the queue is full. This method has served the Internet well for years, but it has two important drawbacks.
Besides tail drop, two alternative queue disciplines that can be applied when the queue becomes full are "random drop on full" or "drop front on full". Under the random drop on full discipline, a router drops a randomly selected packet from the queue (which can be an expensive operation, since it naively requires an O(N) walk through the packet queue) when the queue is full and a new packet arrives. Under the "drop front on full" discipline [Lakshman96], the router drops the packet at the front of the queue when the queue is full and a new packet arrives. Both of these solve the lock-out problem, but neither solves the full-queues problem described above.
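The three drop-on-full disciplines described above can be compared in a minimal sketch. The function name, the discipline labels, and the use of a Python deque are illustrative assumptions, not part of this memo; the sketch also assumes a maximum length of at least one packet.

```python
import random
from collections import deque


def enqueue(queue, packet, max_len, discipline="tail", rng=random):
    """Add a packet to a bounded FIFO; return the dropped packet, or None.

    `queue` is a collections.deque whose index 0 is the front (the next
    packet to transmit). `max_len` is assumed to be at least 1.
    """
    if len(queue) < max_len:
        queue.append(packet)
        return None
    if discipline == "tail":
        # Tail drop: the arriving packet is rejected.
        return packet
    if discipline == "front":
        # Drop front on full: discard the oldest packet, admit the new one.
        dropped = queue.popleft()
    elif discipline == "random":
        # Random drop on full: discard a randomly chosen queued packet.
        # Deleting from the middle of a deque is O(N), as noted above.
        index = rng.randrange(len(queue))
        dropped = queue[index]
        del queue[index]
    else:
        raise ValueError(f"unknown discipline: {discipline}")
    queue.append(packet)
    return dropped
```

In all three cases the queue is still allowed to fill completely before anything is dropped, which is why none of them addresses the full-queues problem.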
We know in general how to solve the full-queues problem for "responsive" flows, i.e., those flows that throttle back in response to congestion notification. In the current Internet, dropped packets serve as a critical mechanism of congestion notification to end nodes. The solution to the full-queues problem is for routers to drop packets before a queue becomes full, so that end nodes can respond to congestion before buffers overflow. We call such a proactive approach "active queue management". By dropping packets before buffers overflow, active queue management allows routers to control when and how many packets to drop.
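As an illustration only, dropping before the queue is full might look like the following sketch. It is loosely modeled on RED-style linear drop probabilities; the class name, the thresholds min_th and max_th, the ceiling max_p, and the averaging weight are all assumptions chosen for the example, and this memo does not recommend any particular algorithm.

```python
import random
from collections import deque


class EarlyDropQueue:
    """Toy active queue manager: may drop arrivals before the buffer fills."""

    def __init__(self, capacity, min_th, max_th, max_p=0.1, weight=0.2, rng=None):
        if not 0 <= min_th < max_th <= capacity:
            raise ValueError("need 0 <= min_th < max_th <= capacity")
        self.capacity = capacity
        self.min_th = min_th      # below this average, never drop early
        self.max_th = max_th      # at or above this average, always drop
        self.max_p = max_p        # drop probability just below max_th
        self.weight = weight      # smoothing factor for the average
        self.rng = rng or random.Random()
        self.packets = deque()
        self.avg = 0.0            # smoothed queue length, in packets

    def drop_probability(self):
        """Rise linearly from 0 at min_th to max_p just below max_th."""
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def enqueue(self, packet):
        """Return True if the packet was queued, False if it was dropped."""
        # Exponentially weighted moving average of the instantaneous length.
        self.avg += self.weight * (len(self.packets) - self.avg)
        if len(self.packets) >= self.capacity:
            return False  # hard limit: the buffer really is full
        if self.rng.random() < self.drop_probability():
            return False  # early drop: signal congestion before overflow
        self.packets.append(packet)
        return True

    def dequeue(self):
        """Remove and return the packet at the front, or None if empty."""
        return self.packets.popleft() if self.packets else None
```

Because the drop decision is driven by a smoothed queue length rather than by buffer exhaustion, the router, not the arrival pattern, determines when and how many packets are dropped.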
In summary, an active queue management mechanism can provide the following advantages for responsive flows.
One of the keys to the success of the Internet has been the congestion avoidance mechanisms of TCP. Because TCP "backs off" during congestion, a large number of TCP connections can share a single, congested link in such a way that bandwidth is shared reasonably equitably among similarly situated flows. The equitable sharing of bandwidth among flows depends on the fact that all flows are running basically the same congestion avoidance algorithms, conformant with the current TCP specification [RFC1122].
Flows that behave under congestion like a flow produced by a conformant TCP have come to be called "TCP Friendly" [RFC5348]. A TCP Friendly flow is responsive to congestion notification, and in steady state it uses no more bandwidth than a conformant TCP running under comparable conditions (drop rate, RTT, MTU, etc.).
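One rough way to make "no more bandwidth than a conformant TCP" concrete is the widely used simplified steady-state approximation, rate = sqrt(3/2) * MSS / (RTT * sqrt(p)), where p is the packet loss rate. This approximation is not taken from this memo, and [RFC5348] specifies a more detailed throughput equation; the function names and the tolerance parameter below are hypothetical.

```python
import math


def tcp_friendly_rate(mss_bytes, rtt_seconds, loss_rate):
    """Approximate steady-state TCP throughput in bytes per second.

    Uses the simplified square-root approximation, which ignores
    retransmission timeouts and so is optimistic at high loss rates.
    """
    if mss_bytes <= 0 or rtt_seconds <= 0:
        raise ValueError("mss_bytes and rtt_seconds must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return math.sqrt(1.5) * mss_bytes / (rtt_seconds * math.sqrt(loss_rate))


def is_tcp_friendly(measured_rate, mss_bytes, rtt_seconds, loss_rate, tolerance=1.0):
    """True if a flow's measured rate does not exceed the TCP estimate.

    `tolerance` is an assumed slack factor; 1.0 means no slack.
    """
    return measured_rate <= tolerance * tcp_friendly_rate(
        mss_bytes, rtt_seconds, loss_rate
    )
```

For example, with a 1460-byte segment, a 100 ms RTT, and 1% loss, the estimate is roughly 179 kilobytes per second; quadrupling the loss rate halves it.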
It is convenient to divide flows into three classes: (1) TCP Friendly flows, (2) unresponsive flows, i.e., flows that do not slow down when congestion occurs, and (3) flows that are responsive but are not TCP Friendly. The last two classes contain more aggressive flows that pose significant threats to Internet performance, as we will now discuss.
The projected increase in more aggressive flows of both these classes, as a fraction of total Internet traffic, clearly poses a threat to the future Internet. There is an urgent need for measurements of current conditions and for further research into the various ways of managing such flows. There are many difficult issues in identifying and isolating unresponsive or Non-TCP-Friendly flows at an acceptable router overhead cost. Finally, there is little measurement or simulation evidence available about the rate at which these threats are likely to be realized, or about the expected benefit of router algorithms for managing such flows.
There is an issue about the appropriate granularity of a "flow". There are a few "natural" answers: 1) a TCP or UDP connection (source address/port, destination address/port); 2) a source/destination host pair; 3) a given source host or a given destination host. We would guess that the source/destination host pair gives the most appropriate granularity in many circumstances. However, it is possible that different vendors/providers could set different granularities for defining a flow (as a way of "distinguishing" themselves from one another), or that different granularities could be chosen for different places in the network. It may be the case that the granularity is less important than the fact that we are dealing with more unresponsive flows at *some* granularity. The granularity of flows for congestion management is, at least in part, a policy question that needs to be addressed in the wider IETF community.
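The effect of the chosen granularity can be seen in a small sketch that classifies the same packets under each of the "natural" answers above. The Packet fields, the granularity labels, and the inclusion of the protocol in the connection key are assumptions made for illustration.

```python
from collections import Counter, namedtuple

# Hypothetical minimal packet header; field names are illustrative.
Packet = namedtuple("Packet", "protocol src_addr src_port dst_addr dst_port")


def flow_key(packet, granularity):
    """Return the tuple that identifies the packet's flow at a granularity."""
    if granularity == "connection":
        # A TCP or UDP connection: addresses and ports (plus protocol here).
        return (packet.protocol, packet.src_addr, packet.src_port,
                packet.dst_addr, packet.dst_port)
    if granularity == "host_pair":
        return (packet.src_addr, packet.dst_addr)
    if granularity == "source_host":
        return (packet.src_addr,)
    if granularity == "destination_host":
        return (packet.dst_addr,)
    raise ValueError(f"unknown granularity: {granularity}")


def packets_per_flow(packets, granularity):
    """Count packets in each flow at the chosen granularity."""
    return Counter(flow_key(p, granularity) for p in packets)
```

Two connections between the same pair of hosts count as two flows at connection granularity but as one flow at host-pair granularity, so any per-flow limit applies to different aggregates depending on the choice.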
The IRTF, in developing [RFC2309], and the IETF, in subsequent discussion, have developed a set of specific recommendations regarding the implementation and operational use of Active Queue Management procedures. These include:
These recommendations are expressed using the word "SHOULD". This is in recognition that there may be use cases unenvisaged in this document in which the recommendation does not apply. However, care should be taken in concluding that one's use case falls in that category; during the life of the Internet, such use cases have been rarely if ever observed and reported on. To the contrary, available research [Papagiannaki] says that even high speed links in network cores that are normally very stable in depth and behavior experience occasional issues that need moderation.
In short, Active Queue Management procedures are designed to minimize delay induced in the network by queues which have filled as a result of host behavior. Marking and loss behaviors signal to the senders of data that network buffers are becoming unnecessarily full, and they would do well to moderate their behavior.
Means of signaling to an endpoint regarding its effect on the network and how it might consider adapting include, at least:
The use of advanced scheduling mechanisms, such as priority queuing, classful queuing, and fair queuing, is often effective in helping a network serve the needs of an application. Such mechanisms can be used to manage traffic passing a choke point, as discussed in [RFC2474] and [RFC2475]. They are used operationally when an operator considers it important to do so.
Loss has two effects. It protects the network, which is the primary reason the network imposes it. Its use as a signal to TCP or SCTP is a pragmatic heuristic; "when the network discards a message in flight, it may imply the presence of faulty equipment or media in a path, and it may imply the presence of congestion. Presume the latter." However, it also has an effect on the efficiency of the data flow. The data in question must be retransmitted, or its absence must otherwise be adapted to by the application in question, which implies at least inefficient use of available bandwidth and may affect other data flows. Hence, loss is not entirely positive; it is a necessary evil.
Explicit Congestion Notification, however, sends a signal that is assuredly about congestion, and avoids the unintended consequences of loss.
Hence, network communication to the host regarding the moderation of its traffic flow SHOULD use an AQM algorithm to determine which packets it should affect, and then implement that effect by marking ECN-capable traffic "Congestion Experienced (CE)" or dropping non-ECN-capable traffic.
Due to the possibility of abuse, the queue must also impose an upper bound, so that even ECN-capable traffic experiences tail drop if necessary. Equipment must be designed for this case, although in theory it should be very uncommon.
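The mark-or-drop decision and the hard upper bound can be sketched as follows. The two-bit ECN codepoints are those defined in [RFC3168]; the function name, its parameters, and the assumption that a separate AQM algorithm has already decided whether this packet should carry a signal are illustrative.

```python
# Two-bit ECN field codepoints as defined in RFC 3168.
NOT_ECT = 0b00  # transport is not ECN-capable
ECT_1 = 0b01    # ECN-capable transport, codepoint ECT(1)
ECT_0 = 0b10    # ECN-capable transport, codepoint ECT(0)
CE = 0b11       # Congestion Experienced


def apply_signal(ecn_field, queue_len, hard_limit, aqm_selected):
    """Return (action, ecn_field) for an arriving packet.

    action is "forward", "mark", or "drop". `aqm_selected` is the
    verdict of whatever AQM algorithm is in use (not modeled here).
    """
    if queue_len >= hard_limit:
        # Upper bound reached: tail drop applies even to ECN-capable traffic.
        return ("drop", ecn_field)
    if not aqm_selected:
        return ("forward", ecn_field)
    if ecn_field == NOT_ECT:
        # Non-ECN-capable traffic can only be signaled by loss.
        return ("drop", ecn_field)
    # ECN-capable (or already marked) traffic is marked instead of dropped.
    return ("mark", CE)
```

Keeping the hard limit as the first check ensures that a sender which sets an ECN-capable codepoint but ignores marks cannot grow the queue without bound.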
A number of algorithms have been proposed. Many require some form of tuning or initial condition, which makes them difficult to use operationally. Hence, self-tuning algorithms are to be preferred.
Active Queue Management algorithms often target TCP [RFC0793], as it is by far the predominant transport in the Internet today. However, we have significant use of UDP [RFC0768] in voice and video services, and find utility in SCTP [RFC4960] and DCCP [RFC4340]. Hence, Active Queue Management algorithms that are effective with all of those transports and the applications that use them are to be preferred.
The terms "knee" and "cliff" are defined by [Jain94]. They respectively refer to the minimum and maximum values of the effective window that have the effect of maximizing transmission rate in a congestion control algorithm such as is used by TCP or SCTP. For the sender of data, exceeding the cliff is ineffective, as it (by definition) induces loss; operating at a point close to the cliff has a negative impact on other traffic and applications, triggering operator activities such as discussed in [RFC6057].
Operating below the knee is also ineffective, as it fails to use available network capacity. Hence, if the objective is to deliver data from its source to its recipient in the least possible time, the behavior of any TCP/SCTP congestion control algorithm SHOULD be to seek and use effective window values at or above the knee and well below the cliff.
[RFC2309] called for, as its second recommendation, further research in the interaction between network queues and host applications, and the means of signaling between them. This research occurred, and we as a community have learned a lot. However, we are not done. An obvious example in 2013 is in the use of Map/Reduce applications in data centers; do we need to extend our taxonomy of TCP/SCTP sessions to include not only "mice" and "elephants", but "lemmings"? "Lemmings" are flash crowds of "mice" that the network inadvertently tries to signal to as if they were elephant flows, resulting in head of line blocking in data center applications.
Hence, this document reiterates the call: we need continuing research as applications develop.
This memo asks the IANA for no new parameters.
While security is a very important issue, it is largely orthogonal to the performance issues discussed in this memo. We note, however, that denial-of-service attacks may create unresponsive traffic flows that are indistinguishable from flows from normal high-bandwidth isochronous applications, and the mechanisms suggested in the recommendation in support of ongoing research will be equally applicable to such attacks.
This document, by itself, presents no new privacy issues.
The original recommendation in [RFC2309] was written by the End-to-End Research Group, which is to say Bob Braden, Dave Clark, Jon Crowcroft, Bruce Davie, Steve Deering, Deborah Estrin, Sally Floyd, Van Jacobson, Greg Minshall, Craig Partridge, Larry Peterson, KK Ramakrishnan, Scott Shenker, John Wroclawski, and Lixia Zhang. This is an edited version of that document, with much of its text and arguments unchanged.
The need for an updated document was agreed to in the tsvarea meeting at IETF 86. This document was reviewed on the aqm@ietf.org list. Comments came from Colin Perkins, Richard Scheffenegger, and Dave Taht.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3168] Ramakrishnan, K., Floyd, S. and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, September 2001.
[RFC6679] Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P. and K. Carlberg, "Explicit Congestion Notification (ECN) for RTP over UDP", RFC 6679, August 2012.
[RFC4301] Kent, S. and K. Seo, "Security Architecture for the Internet Protocol", RFC 4301, December 2005.
[RFC4774] Floyd, S., "Specifying Alternate Semantics for the Explicit Congestion Notification (ECN) Field", BCP 124, RFC 4774, November 2006.
[RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion Notification", RFC 6040, November 2010.