Internet DRAFT - draft-touch-tcpm-automatic-iw
draft-touch-tcpm-automatic-iw
TCPM Working Group J. Touch
Internet Draft USC/ISI
Intended status: Standards Track July 16, 2012
Expires: January 2013
Automating the Initial Window in TCP
draft-touch-tcpm-automatic-iw-03.txt
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
This Internet-Draft will expire on January 16, 2011.
Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Simplified BSD License text as described in
Touch, (TBD) Expires January 16, 2013 [Page 1]
Internet-Draft Automating the Initial Window in TCP July 2012
Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Simplified BSD License.
Abstract
The Initial Window (IW) provides the starting point for TCP's
feedback-based congestion control algorithm. Its value has increased
over time to increase performance and to reflect increased
capability of Internet devices. This document describes a mechanism
to adjust the IW over long timescales, to make future changes more
safely deployed and to potentially avoid reexamination of this value
in the future.
Table of Contents
1. Introduction...................................................2
2. Conventions used in this document..............................3
3. Design Considerations..........................................3
4. Proposed IW Algorithm..........................................4
5. Discussion.....................................................7
6. Observations...................................................8
7. Security Considerations........................................9
8. IANA Considerations............................................9
9. Conclusions....................................................9
10. References....................................................9
10.1. Normative References.....................................9
10.2. Informative References...................................9
11. Acknowledgments..............................................10
1. Introduction
TCP's congestion control algorithm uses an initial window value
(IW), both as a starting point for new connections and after one RTO
or more [RFC2581][RFC2861]. This value has evolved over time,
originally one maximum segment size (MSS), and increased to the
lesser of four MSS or 4,380 bytes [RFC3390][RFC5681]. For typical
Internet connections with an maximum transmission units (MTUs) of
1500 bytes, this permits three segments of 1,460 bytes each.
The IW value was originally implied in the original TCP congestion
control description, and documented as a standard in 1997
[RFC2001][Ja88]. The value was last updated in 1998 experimentally,
and moved to the standards track in 2002 [RFC2414][RFC3390]. There
have been recent proposals to update the IW based on further
increases in host and router capabilities and network capacity, some
Touch, (TBD) Expires January 16, 2013 [Page 2]
Internet-Draft Automating the Initial Window in TCP July 2012
focusing on specific values (e.g., IW=10), and others prescribing a
schedule for increases over time (e.g., IW=6 for 2011, increasing by
1-2 MSS per year).
This document proposes that TCP can objectively measure when an IW
is too large, and that such feedback should be used over long
timescales to adjust the IW automatically. The result should be
safer to deploy and might avoid the need to repeatedly revisit IW
size over time.
2. Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC-2119 [RFC2119].
In this document, these words will appear with that interpretation
only when in ALL CAPS. Lower case uses of these words are not to be
interpreted as carrying RFC-2119 significance.
In this document, the characters ">>" preceding an indented line(s)
indicates a compliance requirement statement using the key words
listed above. This convention aids reviewers in quickly identifying
or finding the explicit compliance requirements of this RFC.
3. Design Considerations
TCP's IW value has existed statically for over two decades, so any
solution to adjusting the IW dynamically should have similarly
stable, non-invasive effects on the performance and complexity of
TCP. In order to be fair, the IW should be similar for most machines
on the public Internet. Finally, a desirable goal is to develop a
self-correcting algorithm, so that IW values that cause network
problems can be avoided. To that end, we propose the following list
of design goals:
o Impart little to no impact to TCP in the absence of loss, i.e.,
it should not increase the complexity of default packet
processing in the normal case.
o Adapt to network feedback over long timescales, avoiding values
that persistently cause network problems.
o Decrease the IW in the presence of sustained loss of IW segments,
as determined over a number of different connections.
Touch, (TBD) Expires January 16, 2013 [Page 3]
Internet-Draft Automating the Initial Window in TCP July 2012
o Increase the IW in the absence of sustained loss of IW segments,
as determined over a number of different connections.
o Operate conservatively, i.e., tend towards leaving the IW the
same in the absence of sufficient information, and give greater
consideration to IW segment loss than IW segment success.
We expect that, without other context, a good IW algorithm will
converge to a single value, but this is not required. An endpoint
with additional context or information, or deployed in a constrained
environment, can always use a different value. In specific,
information from previous connections, or sets of connections with a
similar path, can already be used as context for such decisions
[RFC2140].
However, if a given IW value persistently causes packet loss during
the initial burst of packets, it is clearly inappropriate and could
be inducing unnecessary loss in other competing connections. This
might happen for sites behind very slow boxes with small buffers,
which may or may not be the first hop.
4. Proposed IW Algorithm
Below is a simple description of the proposed IW algorithm. It
relies on the following parameters:
o MinIW = 3 MSS or 4,380 bytes (as per RFC3390]
o MaxIW = 10
o MulDecr = 0.5
o AddIncr = 2 MSS
o Threshold = 0.05
We assume that the minimum IW (MinIW) should be as currently
specified [RFC3390]. The maximum IW can be set to a fixed value
[Ch10], or set based on a schedule if trusted time references are
available [Al10]; here we prefer a fixed value. We also propose to
use an AIMD algorithm, with increase and decreases as noted.
Although these parameters are somewhat arbitrary, their initial
values are not important except that the algorithm is AIMD and the
MaxIW should not exceed that recommended for other systems on the
Internet. Current proposals, including default current operation,
are degenerate cases of the algorithm below for given parameters -
Touch, (TBD) Expires January 16, 2013 [Page 4]
Internet-Draft Automating the Initial Window in TCP July 2012
notably MulDec = 1.0 and AddIncr = 0 MSS, thus disabling the
automatic part of the algorithm.
The proposed algorithm is as follows:
0. On boot:
IW = MaxIW; # assume this is in bytes, and an even number of MSS
1. Upon starting a new connection
CWND = IW;
conncount++;
IWnotchecked = 1; # true
2. During a connection's SYN-ACK processing, if SYN-ACK includes
ECN, treat as if the IW is too large
if (IWnotchecked && (synackecn == 1)) {
losscount++;
IWnotchecked = 0; # never check again
}
3. During a connection, if retransmission occurs, check the seqno of
the outgoing packet (in bytes) to see if the resent segment fixes
an IW loss:
if (Retransmitting && IWnotchecked && ((ISN - seqno) < IW))) {
losscount++;
IWnotchecked = 0; # never do this entire "if" again
} else {
IWnotchecked = 0; # you're beyond the IW so stop checking
}
4. Once every 1000 conections, as a separate process (i.e., not as
part of processing a given connection):
Touch, (TBD) Expires January 16, 2013 [Page 5]
Internet-Draft Automating the Initial Window in TCP July 2012
if (conncount > 1000) {
if (losscount/conncount > threshold) {
# the number of connections with errors is too high
IW = IW * MulDecr;
} else {
IW = IW + AddIncr;
}
}
We recognize that this algorithm can yield a false positive when the
sequence number wraps around. This can be avoided using either PAWS
[RFC1323] context or 64-bit internal sequence numbers (as in TCP-AO
[RFC5925]). Alternately, false positives can be allowed since they
are expected to be infrequent and thus will not affect the overall
statistics of the algorithm.
The following additional constraints are imposed:
>> The automatic IW algorithm MUST initialize to MaxIW, in the
absence of other context information.
If there are too few connections to make a decision, or if there is
otherwise insufficient information to increase the IW, then the
MaxIW defaults to the current recommended value.
>> An implementation may allow the MaxIW to grow beyond the
currently recommended Internet default, but not more than 2 segments
per calendar year.
If an endpoint has a persistent history of successfully transmitting
IW segments without loss, then it is allowed to probe the Internet
to determine if larger IW values have similar success. This probing
is limited and requires a trusted time source, otherwise the MaxIW
remains constant.
>> An implementation MUST adjust the IW based on loss statistics at
least once every 1000 connections.
An endpoint needs to be sufficiently reactive to IW loss.
>> An implementation MUST decrease the IW by at least one MSS when
indicated during an evaluation interval.
An endpoint that detects loss needs to decrease its IW by at least
one MSS, otherwise it is not participating in an automatic reactive
algorithm.
Touch, (TBD) Expires January 16, 2013 [Page 6]
Internet-Draft Automating the Initial Window in TCP July 2012
>> An implementation MUST increase by no more than 2 MSS per
evaluation interval.
An endpoint that does not experience IW loss needs to probe the
network incrementally.
>> An implementation SHOULD use an IW that is an integer multiple of
2 MSS.
The IW should remain a multiple of 2 MSS segments, to enable
efficient ACK compression without incurring unnecessary timeouts.
>> An implementation MUST decrease the IW if more than 95% of
connections have IW losses.
Again, this is to ensure an implementation is sufficiently reactive.
>> An implementation MAY group IW values and statistics within
subsets of connections. Such grouping MAY use any information about
connections to form groups except loss statistics.
There are some TCP connections which might not be counted at all,
such as those to/from loopback addresses, or those within the same
subnet as that of a local interface (for which congestion control is
sometimes disabled anyway). This may also include connections that
terminate before the IW is full, i.e., as a separate check at the
time of the connection closing.
The period over which the IW is updated is intended to be a long
timescale, e.g., a month or so, or 1,000 connections, whichever is
longer. An implementation might check the IW once a month, and
simply not update the IW or clear the connection counts in months
where the number of connections is too small.
5. Discussion
There are numerous parameters to the above algorithm that are
compliant with the given requirements; this is intended to allow
variation in configuration and implementation while ensuring that
all such algorithms are reactive and safe.
This algorithm continues to assume segments because that is the
basis of most TCP implementations. It might be useful to consider
revising the specifications to allow byte-based congestion given
sufficient experience.
Touch, (TBD) Expires January 16, 2013 [Page 7]
Internet-Draft Automating the Initial Window in TCP July 2012
The algorithm checks for IW losses only during the first IW after a
connection start; it does not check for IW losses elsewhere the IW
is used, e.g., during slow-start restarts.
>> An implementation MAY detect IW losses during slow-start restarts
in addition to losses during the first IW of a connection. In this
case, the implementation MUST count each restart as a "connection"
for the purposes of connection counts and periodic rechecking of the
IW value.
False positives can occur during some kinds of segment reordering,
e.g., that might trigger spurious retransmissions even without a
true segment loss. These are not expected to be sufficiently common
to dominate the algorithm and its conclusions.
This mechanism does require additional per-connection state which is
currently common in some implementations, and is useful for other
reasons (e.g., the ISN is used in TCP-AO [RFC5925]). The mechanism
also benefits from persistent state kept across reboots, as would be
other state sharing mechanisms (e.g., TCP Control Block Sharing
[RFC2140]). The mechanism is inspired by RFC 2140's use of
information across connections.
The receive window (RWIN) is not involved in this calculation. The
size of RWIN is determined by receiver resources, and provides space
to accommodate segment reordering. It is not involved with
congestion control, which is the focus of this document and its
management of the IW.
6. Observations
The IW may not converge to a single, global value. It also may not
converge at all, but rather may oscillate by a few MSS as it
repeatedly probes the Internet for larger IWs and fails. Both
properties are consistent with TCP behavior during each individual
connection.
This mechanism assumes that losses during the IW are due to IW size.
Persistent errors that drop packets for other reasons - e.g., OS
bugs, can cause false positives. Again, this is consistent with
TCP's basic assumption that loss is caused by congestion and
requires backoff. This algorithm treats the IW of new connections as
a long-timescale backoff system.
Touch, (TBD) Expires January 16, 2013 [Page 8]
Internet-Draft Automating the Initial Window in TCP July 2012
7. Security Considerations
This algorithm presents an opportunity for an intelligent attack to
reduce the IW of a given system, by repeatedly dropping packets
during the IW only. An intermediate that can drop packets in a
controlled manner can already impact the performance of a
connection, and can reduce the congestion window of an ongoing
connection in ways that impact performance more than just dropping
during the IW.
8. IANA Considerations
This document has no IANA considerations. This section should be
removed prior to publication.
9. Conclusions
<Add any conclusions>
10. References
10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3390] Allman, M., Floyd, S., Partridge, C., "Increasing TCP's
Initial Window", RFC 3390 (Standards Track), Oct. 2002.
[RFC5681] Allman, M., Paxson, V., Blanton, E., "TCP Congestion
Control," RFC 5681 (Standards Track), Sep. 2009.
10.2. Informative References
[Al10] Allman, M., "Initial Congestion Window Specification",
(work in progress), draft-allman-tcpm-bump-initcwnd-00,
Nov. 2010.
[Ch10] Chu, J., Dukkipati, N., Cheng, Y., Mathis, M., "Increasing
TCP's Initial Window," (work in progress), draft-ietf-
tcpm-initcwnd-04, Jun. 2012.
[Ja88] Jacobson, V., M. Karels, "Congestion Avoidance and
Control", Proc. Sigcomm 1988.
[RFC1323] Jacobson, V., Braden, R., Borman, D., "TCP Extensions for
High Performance", RFC 1323, May 1992.
Touch, (TBD) Expires January 16, 2013 [Page 9]
Internet-Draft Automating the Initial Window in TCP July 2012
[RFC2001] Stevens, W., "TCP Slow Start, Congestion Avoidance, Fast
Retransmit, and Fast Recovery Algorithms", RFC2001
(Standards Track), Jan. 1997.
[RFC2140] Touch, J., "TCP Control Block Interdependence", RFC 2140 /
STD 7(Informational), Apr. 1997.
[RFC2414] Allman, M., Floyd, S., Partridge, C., "Increasing TCP's
Initial Window", RFC 2414 (Experimental), Sept. 1998.
[RFC2581] Allman, M., Paxson, V., Stevens, W., "TCP Congestion
Control," RFC2581 (Standards Track), Apr. 1999.
[RFC2861] Handley, M., Padhye, J., Floyd, S., "TCP Congestion Window
Validation", RFC2861 (Experimental), June 2000.
[RFC5925] Touch, J., A. Mankin, R. Bonica, "The TCP Authentication
Option", RFC 5925 (Standards Track), June 2010.
11. Acknowledgments
Mark Allman and Aki Nyrjinen contributed to the development of this
algorithm. Members of the TCPM mailing list also participated in
providing useful feedback.
This document was prepared using 2-Word-v2.0.template.dot.
Authors' Addresses
Joe Touch
USC/ISI
4676 Admiralty Way
Marina del Rey, CA 90292-6695 U.S.A.
Phone: +1 (310) 448-9151
Email: touch@isi.edu
Touch, (TBD) Expires January 16, 2013 [Page 10]