Internet DRAFT - draft-iyengar-burst-mitigation
Internet Engineering Task Force Janardhan Iyengar
INTERNET DRAFT University of Delaware
draft-iyengar-burst-mitigation-01.txt Mark Allman
Expires: July, 2006 ICIR/ICSI
Ethan Blanton
Purdue University
January, 2006
TCP Burst Mitigation Through Congestion Window Limiting
draft-iyengar-burst-mitigation-01.txt
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Copyright Notice
Copyright (C) The Internet Society (2006).
Abstract
This document describes Congestion Window Limiting (CWL), a method
for mitigating micro-bursts in TCP by limiting the congestion window
during interruptions in TCP's acknowledgment clock.
Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
The reader is expected to be familiar with terminology from
[RFC2581].
Iyengar, Allman, Blanton [Page 1]
draft-iyengar-burst-mitigation-01.txt January 2006
1. Introduction
TCP dynamics and application sending patterns can cause a TCP sender
to inject bursts into the network with potentially harmful effects
for both the network and the sender. Bursting can stress network
queues causing loss in the bursting connection as well as in other
flows sharing the stressed queues. Bursting can also cause scaling
on short timescales [JD03] and increase queueing delays in routers.
This document draws from previously proposed burst mitigation
techniques and presents one possible technique to reduce some of
TCP's burstiness.
In this document, we are concerned with one type of bursting which
we call "micro-bursts". Micro-bursts are generated by a TCP in
response to changes in the cumulative acknowledgment point. Each
TCP segment carrying a cumulative acknowledgment (ACK) that slides
the sender's transmission window allows previously unsent data
segments to be transmitted (when application data is available).
These segments are ideally transmitted at the line rate of the
sender's network (assuming the host's CPU can produce packets fast
enough). We refer to such bursts of segments sent in response to
receipt of a single ACK as "micro-bursts".
TCP exhibits other bursting behaviors as well, which we collectively
term "macro-bursts" since they tend to occur over longer
timescales than micro-bursts. Macro-bursts can be caused by several
TCP and/or network phenomena, such as slow start [RFC2581] and ACK
compression [ZSC91]. Although macro-bursts and their mitigation
have also been the topic of much research ([AB05] briefly discusses
this research), we limit ourselves to only micro-burst mitigation in
this document.
Several situations can cause micro-bursting:
* Although TCP's cumulative ACK mechanism is robust to loss, ACK
loss causes a TCP sender's transmission window to slide less
frequently and by a larger amount each time, potentially
triggering large micro-bursts in the process.
* An application can send data in a bursty fashion, causing TCP to
transmit micro-bursts.
* Reordered ACKs cause an ACK stream that appears similar to an ACK
stream with loss, causing similar micro-bursting.
* In some cases, when a TCP sender exits fast recovery, a large
number of segments are transmitted at line rate [FF96]. This
dynamic occurs when the sender cannot transmit enough new
segments during the recovery phase (e.g., due to ACK loss) and
therefore stores "permission to send" until a cumulative ACK
arrives. This phenomenon is discussed in [FF96], where the
"MaxBurst" mechanism is introduced to contain the consequent
burst (see discussion in section 3).
These and other causes of bursting are described in more detail in
[JD03,AB05].
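As a rough illustration of the first cause listed above, the
following Python sketch estimates the micro-burst released by one
cumulative ACK that arrives after a run of lost ACKs. It assumes a
sender that always has data to send, is limited only by cwnd, and
whose window stays constant over the interval. The function name and
its parameters are hypothetical and do not come from [JD03] or
[AB05].

```python
def burst_after_ack_loss(segments_per_ack, lost_acks):
    """Estimate the segments released by the first ACK that arrives
    after `lost_acks` consecutive ACKs were dropped.

    Illustrative model only: each ACK normally covers
    `segments_per_ack` segments (2 with delayed ACKs), so the
    surviving ACK also covers everything the lost ACKs would have.
    """
    if segments_per_ack < 1 or lost_acks < 0:
        raise ValueError("segments_per_ack >= 1 and lost_acks >= 0 required")
    return segments_per_ack * (lost_acks + 1)


# With delayed ACKs and three lost ACKs in a row, one ACK slides the
# window by eight segments, all of which may be sent back to back.
print(burst_after_ack_loss(2, 3))
```

Under these assumptions, the burst grows linearly with the number of
consecutive ACKs lost.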
In this document, we present one possible method for mitigating TCP
micro-bursts called Congestion Window Limiting (CWL), which is based
on work in [HTH01] and originally outlined in [AB05]. Alternate
schemes have been proposed to mitigate the impact of micro-bursts,
as discussed in section 3. We note that the question of whether or
not micro-bursts need mitigation remains open. [JD03] suggests that
TCP's bursting may need mitigation from the perspective of the
network, while [BA05] suggests that micro-bursts often do not cause
loss within the bursting connection. By specifying a particular
mitigation technique, this document intends to draw community
attention to the issue of micro-bursts and to generate discussion,
further exploration, and experimentation in the area.
2. Congestion Window Limiting (CWL)
CWL introduces a new parameter called "BLimit", which represents the
largest acceptable micro-burst a TCP should transmit.
Each time an ACK is received that slides the transmission window,
the congestion window (cwnd) modification (increase or decrease)
procedures outlined in [RFC2581] MUST be applied. When using CWL,
the following steps MUST be executed before any data is sent in
response to the received ACK:
(1) If cwnd > (FlightSize + BLimit), TCP will likely send a
micro-burst, and steps (2) and (3) MUST be applied; otherwise,
skip (2) and (3) and transmit data as usual. When this
condition holds, a micro-burst fails to occur only if not
enough application data is available to transmit.
(2) If ssthresh < cwnd then ssthresh MUST be set to cwnd.
(3) Set cwnd = (FlightSize + BLimit).
After these steps, available application data should be transmitted
as allowed by the cwnd and the receiver's advertised window.
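Steps (1) through (3) can be sketched in Python as follows. This is
a minimal illustration, not a reference implementation: it assumes
that all quantities are in bytes and that the [RFC2581] cwnd update
for the incoming ACK has already been applied. The function name
cwl_on_ack and its argument names are hypothetical.

```python
def cwl_on_ack(cwnd, ssthresh, flight_size, blimit):
    """Apply CWL steps (1)-(3) and return the new (cwnd, ssthresh).

    All arguments are in bytes. The caller is assumed to have already
    run the normal congestion window update for this ACK.
    """
    # Step (1): would the sender be allowed a burst larger than BLimit?
    if cwnd > flight_size + blimit:
        # Step (2): remember the pre-reduction window in ssthresh.
        if ssthresh < cwnd:
            ssthresh = cwnd
        # Step (3): clamp cwnd so at most BLimit new bytes can be sent.
        cwnd = flight_size + blimit
    return cwnd, ssthresh


# Example values (assumed): MSS of 1460 bytes, cwnd of 20 segments,
# ssthresh of 10 segments, 4 segments in flight, BLimit of 4380 bytes.
print(cwl_on_ack(29200, 14600, 5840, 4380))
```

In this example the sender could otherwise emit 16 segments at once;
after the clamp it may send only 4380 bytes (three segments) beyond
what is already in flight, and ssthresh records the previous window.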
CWL controls bursts by reducing cwnd when the ACK clock is lost or
interrupted to the point where the cumulative ACK will trigger a
burst of segments in excess of BLimit. History information
maintained in ssthresh allows the connection to exponentially
increase the cwnd (via slow start) back to the size before the
reduction.
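The following Python sketch gives a rough sense of how quickly slow
start can restore the window after a CWL reduction. It is an
idealized model: it assumes that cwnd doubles once per round-trip
time while below ssthresh, that the sender always has data to send,
and that no loss occurs. The function name rtts_to_recover is
hypothetical.

```python
def rtts_to_recover(cwnd, ssthresh):
    """Count idealized slow-start round trips until cwnd reaches ssthresh.

    Assumes cwnd doubles every RTT (no delayed ACKs, no loss, no
    application limit), which real connections only approximate.
    """
    if cwnd <= 0:
        raise ValueError("cwnd must be positive")
    rtts = 0
    while cwnd < ssthresh:
        cwnd = min(2 * cwnd, ssthresh)
        rtts += 1
    return rtts


# Assumed example: cwnd was clamped from 29200 to 10220 bytes and
# ssthresh was set to 29200 bytes by step (2).
print(rtts_to_recover(10220, 29200))
```

Under these assumptions the reduced window in the example returns to
its earlier size within two round trips.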
BLimit SHOULD be chosen such that bursts are no larger than those
allowed by [RFC3390]. From [RFC3390], we therefore choose:
BLimit = min (4*MSS, max (2*MSS, 4380 bytes)) (1)
If useful, BLimit MAY be smaller than allowed by equation (1).
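Equation (1) can be evaluated directly. The Python sketch below
computes BLimit in bytes for a few example MSS values; the function
name blimit and the chosen MSS values are illustrative assumptions.

```python
def blimit(mss):
    """Return BLimit in bytes per equation (1) for the given MSS (bytes)."""
    return min(4 * mss, max(2 * mss, 4380))


# Print BLimit in bytes and in whole segments for sample MSS values.
for mss in (536, 1460, 4000):
    limit = blimit(mss)
    print(mss, limit, limit // mss)
```

For an MSS of 536 bytes the limit is four segments (2144 bytes), for
1460 bytes it is three segments (4380 bytes), and for 4000 bytes it
is two segments (8000 bytes).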
3. Related Work
CWL makes TCP congestion control more conservative and is therefore
implicitly allowed by [RFC2581].
Congestion Window Validation (CWV) [RFC2861] attempts to protect the
network from a sender's incorrect or stale view of the available
capacity along the path. [RFC2861] recommends (i) not increasing
the cwnd when it is not fully used by an application-limited sender,
and (ii) decaying the cwnd after a sufficiently long idle period to
avoid use of an unvalidated cwnd. [RFC2861] suggests reducing the
cwnd of an application-limited sender by half for each idle RTO
interval. While CWV can prevent micro-bursts in some situations,
this is accidental and not part of the problem CWV is trying to
solve. CWL, on the other hand, aims at preventing micro-bursts by
reducing the cwnd when appropriate, and in doing so, protects the
network from an application-limited sender with stale cwnd
information. CWL also prevents a cwnd from increasing during
application-limited periods by limiting it to (FlightSize +
BLimit). Note that CWL is more aggressive in reducing cwnd than
[RFC2861].
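To make the comparison concrete, the following Python sketch
contrasts a simplified form of the CWV idle decay with the CWL
limit. It is a rough illustration only: the CWV model halves cwnd
once per idle RTO with a floor of one MSS (the floor is an
assumption of this sketch) and omits the other parts of [RFC2861],
while the CWL model applies only step (3) of section 2. Both
function names are hypothetical.

```python
def cwv_idle_decay(cwnd, idle_rtos, mss):
    """Simplified CWV model: halve cwnd for each idle RTO interval.

    The one-MSS floor is an assumption of this sketch.
    """
    for _ in range(idle_rtos):
        cwnd = max(cwnd // 2, mss)
    return cwnd


def cwl_limit(cwnd, flight_size, blimit):
    """CWL step (3) only: cwnd never exceeds FlightSize + BLimit."""
    return min(cwnd, flight_size + blimit)


# Assumed example: cwnd of 29200 bytes, MSS of 1460 bytes, BLimit of
# 4380 bytes, nothing in flight, sender idle for two RTO intervals.
print(cwv_idle_decay(29200, 2, 1460))
print(cwl_limit(29200, 0, 4380))
```

With these example numbers, the simplified CWV decay leaves a cwnd of
7300 bytes after two idle RTOs, while CWL limits cwnd to 4380 bytes
as soon as the condition in step (1) is met.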
Several techniques have been proposed in the past for controlling
micro-bursts, as follows:
* As noted above, [FF96] introduces the "MaxBurst" mechanism.
MaxBurst is an additional constraint that limits the number of
data segments that can be transmitted in response to any given
ACK.
CWL provides a single control for the amount of data a TCP
connection can transmit into the network at any given point.
This is arguably a clean approach to controlling the load
imposed on the network. On the other hand, by introducing a
second control, MaxBurst provides for separation of concerns.
In other words, limiting the sizes of micro-bursts is, in some
sense, a different task than limiting the overall transmission
rate to control network congestion; therefore, using two
different controls may make sense. A drawback of MaxBurst,
however, is that the two transmission controllers may interact
poorly, causing undesirable side effects. When BLimit ==
MaxBurst, CWL and MaxBurst perform similarly [AB05].
* [HTH01] introduces an algorithm called "Use it or Lose it"
(UI/LI), which modifies the cwnd to reflect the actual
number of outstanding bytes, thereby controlling bursts in
response to an ACK. UI/LI is used in SCTP [RFC2960,SA+05] and
provides the basis for CWL. CWL extends UI/LI by modifying
ssthresh, enabling a sender to slow start up to the last
known safe cwnd (step (2) of the algorithm in section 2). Without
explicitly setting ssthresh as part of the burst mitigation
process, the UI/LI algorithm is non-deterministic in its use of
slow start after reducing cwnd. [AB05] illustrates cases where
slow start is used and cases where it is not used, simply
depending on the state of the connection before UI/LI reduces
the cwnd.
* Rate-Based Pacing [VH97] limits the sending rate and prevents
bursts by pacing data into the network until the ACK clock is
established. Although this solution can
be very effective in burst mitigation in some cases, it requires
a new timer and parameters for pacing out the data segments.
Further, as shown in [AB05], there are cases where there is no
natural "lull" in the connection into which segments can be
nicely paced. Therefore, the exact application of pacing
requires more research.
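The difference between the MaxBurst approach and CWL, the first
technique discussed in the list above, can be sketched in Python as
follows. This is a simplified model under stated assumptions: the
receiver's advertised window is ignored, the ssthresh handling of
CWL step (2) is omitted, and MaxBurst is expressed in segments. The
function names are hypothetical and are not taken from [FF96] or
[AB05].

```python
def burst_with_maxburst(cwnd, flight_size, mss, maxburst_segments):
    """MaxBurst model: cap the segments sent per ACK, leave cwnd alone."""
    allowed = max(cwnd - flight_size, 0) // mss
    return min(allowed, maxburst_segments), cwnd


def burst_with_cwl(cwnd, flight_size, mss, blimit):
    """CWL model: reduce cwnd itself, then send what cwnd allows."""
    if cwnd > flight_size + blimit:
        cwnd = flight_size + blimit
    return max(cwnd - flight_size, 0) // mss, cwnd


# Assumed example: MSS of 1460 bytes, cwnd of 29200 bytes, 5840 bytes
# in flight, MaxBurst of 3 segments, BLimit of 3 * 1460 = 4380 bytes.
print(burst_with_maxburst(29200, 5840, 1460, 3))
print(burst_with_cwl(29200, 5840, 1460, 4380))
```

In this example both approaches send three segments in response to
the ACK. They differ in the state left behind: MaxBurst keeps cwnd at
29200 bytes, whereas CWL lowers it to 10220 bytes.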
4. Discussion
We emphasize that the question of whether or not micro-bursts need
mitigation remains open. While this document provides the
specification for one mitigation technique based on current
knowledge, continued research on bursts and alternative mitigation
mechanisms is strongly encouraged.
Finally, we note that some TCP stacks may already implement some
form of micro-burst mitigation, although the mechanisms used may not
be well understood and have not been through IETF community
review. This document presents an initial step towards encouraging
better-understood and community-reviewed micro-burst mitigation
mechanisms.
5. Security Considerations
This document calls for reducing the congestion window during loss
of TCP's ACK clock. An attacker can therefore reduce throughput of
a TCP connection by causing ACK loss or reordering of data or ACKs.
6. IANA Considerations
None.
Acknowledgments
Discussions with Sally Floyd have shaped some of the thinking that
is contained in this document.
Normative References
[RFC2119] S. Bradner. Key words for use in RFCs to Indicate
Requirement Levels, March 1997. BCP 14, RFC 2119.
[RFC2581] M. Allman, V. Paxson, W. Stevens. TCP Congestion Control,
April 1999. RFC 2581.
Informative References
[RFC2861] M. Handley, J. Padhye, S. Floyd. TCP Congestion Window
Validation, June 2000. RFC 2861.
[RFC2960] R. Stewart, Q. Xie, K. Morneault, C. Sharp,
H. Schwarzbauer, T. Taylor, I. Rytina, M. Kalla, L. Zhang,
V. Paxson. Stream Control Transmission Protocol, October 2000.
RFC 2960.
[RFC3390] M. Allman, S. Floyd, C. Partridge. Increasing TCP's
Initial Window, October 2002. RFC 3390.
[AB05] M. Allman, E. Blanton. Notes on Burst Mitigation for
Transport Protocols. ACM Computer Communication Review, 35(2),
April 2005.
[BA05] E. Blanton, M. Allman. On the Impact of Bursting on TCP
Performance. Proceedings of the Workshop for Passive and Active
Measurement, March 2005.
[FF96] K. Fall, S. Floyd. Simulation-based Comparisons of Tahoe,
Reno, and SACK TCP. Computer Communication Review, 26(3), July
1996.
[HTH01] A. Hughes, J. Touch, J. Heidemann. Issues in TCP Slow-Start
Restart After Idle. Internet draft
<draft-hughes-restart-00.txt>, December 2001 (expired).
URL: http://www.isi.edu/touch/pubs/draft-hughes-restart-00.txt.
[JD03] H. Jiang, C. Dovrolis. Source-Level IP Packet Bursts: Causes
and Effects. In ACM SIGCOMM/Usenix Internet Measurement
Conference, October 2003.
[SA+05] R. Stewart, I. Arias-Rodriguez, K. Poon, A. Caro,
M. Tuexen. SCTP Specification Errata and Issues. Internet draft
<draft-ietf-tsvwg-sctpimpguide-16.txt>, October 2005 (work in
progress).
[VH97] V. Visweswaraiah, J. Heidemann. Improving Restart of
Idle TCP Connections. Technical Report 97-661, University of
Southern California, November 1997.
[ZSC91] L. Zhang, S. Shenker, D. Clark. Observations on the
Dynamics of a Congestion Control Algorithm: The Effects of
Two-Way Traffic. ACM SIGCOMM, September 1991.
Authors' Addresses
Janardhan Iyengar
Protocol Engineering Lab, CIS Department
University of Delaware
103 Smith Hall
Newark, DE 19716
Email: iyengar@cis.udel.edu
URL: http://www.cis.udel.edu/~iyengar/
Mark Allman
ICSI Center for Internet Research
1947 Center Street, Suite 600
Berkeley, CA 94704-1198
Phone: (440) 235-1792
Email: mallman@icir.org
URL: http://www.icir.org/mallman/
Ethan Blanton
Purdue University Computer Sciences
250 North University Street
West Lafayette, IN 47907
Email: eblanton@cs.purdue.edu
URL: http://www.cs.purdue.edu/homes/eblanton/
Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed
to pertain to the implementation or use of the technology described
in this document or the extent to which any license under such
rights might or might not be available; nor does it represent that
it has made any independent effort to identify any such rights.
Information on the procedures with respect to rights in RFC
documents can be found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use
of such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository
at http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Disclaimer of Validity
This document and the information contained herein are provided on
an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright Statement
Copyright (C) The Internet Society (2006). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
Acknowledgment
Funding for the RFC Editor function is currently provided by the
Internet Society.