Internet DRAFT - draft-sallantin-iccrg-initial-spreading
INTERNET-DRAFT R.Sallantin
Intended Status: Proposed Standard CNES/TAS/TESA
Expires: September 13, 2014 C.Baudoin
F.Arnal
Thales Alenia Space
E.Dubois
CNES
E.Chaput
A.Beylot
IRIT
March 12, 2014
Safe increase of the TCP's Initial Window
Using Initial Spreading
draft-sallantin-iccrg-initial-spreading-01
Abstract
This document proposes a new fast start-up mechanism for TCP that can
be used to speed up the beginning of an Internet connection and thus
improve the performance of short-lived TCP connections.
Initial Spreading makes it possible to safely increase the Initial
Window (IW) size in all cases, and notably in congested networks.
By combining the increase in the IW with the spacing of the segments
belonging to the IW, Initial Spreading is a very simple mechanism
that improves the performance of short-lived TCP flows and does not
deteriorate the performance of long-lived TCP flows.
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
Sallantin, et al. Expires September 2014 [Page 1]
INTERNET DRAFT Initial spreading March 12, 2014
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
Copyright and License Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1 Introduction
2 Terminology
3 Initial Spreading mechanism
4 Spreading Time Choice
  4.1 Considerations
  4.2 Burst impact on losses
  4.3 Tmax
  4.4 Algorithm
5 Implementation considerations
  5.1 Timers
  5.2 Pacing in AQM
  5.3 TSO/GSO
  5.4 Delayed Ack
6 Open discussions
  6.1 Increasing the upper bound TCP's IW to more than 10 segments
  6.2 Initial Spreading and LFN
7 Security Considerations
8 IANA Considerations
9 References
  9.1 Normative References
  9.2 Informative References
Authors' Addresses
1 Introduction
Whether due to a long delay (e.g., Long Fat Networks, LFNs) or to a
large queuing latency, a long Round Trip Time (RTT) deteriorates
regular slow-start performance. Short-lived connections are
particularly affected [FA11]. Several protocols and even new network
architectures have been proposed to deal with this issue.
The original idea of Initial Spreading [SB13] was to consider a long
RTT as a resource to exploit, rather than as a constraint to bypass.
As soon as the RTT is larger than a few milliseconds, it can be used
as an opportunity to safely send a large amount of data during the
first RTT after the connection has opened. Spacing the data along the
RTT is intended to give each segment a high and independent
probability of being successfully received.
This approach resembles a combination of two TCP mechanisms: Pacing
and an increase in the Initial Window (IW). Both mechanisms have been
studied in depth to design Initial Spreading as an efficient fast
start-up TCP mechanism that avoids their respective flaws and
weaknesses.
The original Pacing idea is to space the segments of the same window
along an RTT to avoid generating bursts as far as possible. Hence,
each segment arrives separately at the buffer and the impact on its
queue is minimized. The bit rate can then reach its maximum. However,
[AS00] pointed out that this lack of bursts is responsible for poor
performance: Pacing has a tendency to overload the network and then
cause a synchronization of the flows, which seriously damages both
individual and global performance.
RFC 6928 [RFC6928] suggests enlarging the IW size up to ten
segments. Several articles and studies have shown that this would
allow 90% of the connections to be transmitted in one RTT [DR10]. In
most cases, and in particular when the network is not congested, this
solution is probably the best one for dealing with short-lived TCP
flows. However, in a congested environment, sending a large IW in one
burst is likely to impact the buffers and then deteriorate the
individual connection. Correlation between the segments of the same
burst is responsible for major impairments for short-lived
connections, and in particular for connections that can be sent in
one RTT (number of segments to be transmitted lower than the upper
bound value of the TCP's IW):
o a decrease in the probability of successfully transmitting the
entire window.
o an increase in the probability of successive segment losses.
o a significant reduction in the number of potential Duplicate
Acknowledgements, which are necessary to trigger fast loss recovery
mechanisms and avoid waiting for a Retransmission Timeout.
For the particular case of short-lived connections, experiments
have shown that the loss of one segment of the initial burst could
not be recovered using these recovery mechanisms.
In favor of a conservative approach, [RFC3390] recommended the use of
an IW of 3 segments.
Both mechanisms therefore suffer from a burst-related phenomenon, but
in opposite ways.
Initial Spreading has been designed to tackle the burst issues
described above. Simulations and experiments show that Initial
Spreading is efficient not only for LFNs but also for networks with a
small RTT.
2 Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
3 Initial Spreading mechanism
The Initial Spreading mechanism [SB13] uses the permitted upper bound
value of the TCP's IW (e.g., RFC 6928 [RFC6928] suggests using 10 for
this value). Initial Spreading spaces out a number of segments
lower than or equal to this value across the first RTT before letting
the TCP algorithm continue conventionally:
(1) The RTT is measured during the SYN-SYN/ACK exchange.
(2) According to the RTT value, a Spreading Time (Tspreading) is
computed (cf. Section 4). Depending on the number of segments to
be sent, up to n segments are sent, one every Tspreading, where n
is the permitted upper bound value of the TCP's IW.
(3) After the transmission of the IW, the regular TCP algorithm is
used.
Thus, bursts do not downgrade the transmission of short-lived
connections, but continue to prevent an overload of the network in
the case of long-lived connections.
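As an illustration only, steps (2) and (3) above can be sketched in
Python. The function name spreading_schedule_us and the choice of
integer microseconds are assumptions made for this sketch and are not
part of the mechanism's specification; Tspreading is taken as an
input here.

```python
def spreading_schedule_us(segments_to_send, n, tspreading_us):
    """Return the send offsets, in microseconds after the handshake,
    of the segments transmitted during the first RTT.

    segments_to_send: number of segments queued by the application
    n: permitted upper bound value of the TCP's IW (e.g. 10)
    tspreading_us: Spreading Time in microseconds (unit is an assumption)
    """
    # At most n segments belong to the IW; a shorter flow sends fewer.
    count = min(segments_to_send, n)
    # One segment every Tspreading, the first one immediately.
    return [i * tspreading_us for i in range(count)]


# A 4-segment flow with an IW bound of 10 and Tspreading = 2 ms.
print(spreading_schedule_us(4, 10, 2000))        # [0, 2000, 4000, 6000]
# A long flow is capped at n segments; regular TCP then takes over.
print(len(spreading_schedule_us(50, 10, 2000)))  # 10
```

In this sketch a flow shorter than the IW is fully scheduled within
the first RTT, while a longer flow only has its first n segments
spread.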
4 Spreading Time Choice
4.1 Considerations
It has been observed that most of the gains enabled by Initial
Spreading in congested environments come from the independence of
the segments sent during the first RTT. Indeed, experiments have
shown that, by preventing bursts, Initial Spreading gives each
segment of the IW an independent loss probability.
This reduces the latency variance and, in turn, the average latency.
However, precautions should be taken not to deteriorate the
performance in uncongested networks.
To be efficient, Initial Spreading should therefore balance several
constraints:
o Tspreading MUST be large enough for the losses to be
uncorrelated.
o Tspreading SHOULD be as short as possible so as not to add
unnecessary delay (notably in uncongested networks).
o The implementation MUST be lightweight and respect kernel
constraints.
4.2 Burst impact on losses
It has been observed that two segments belong to the same burst if
they encounter the same bottleneck buffer state, and that the
minimal spreading depends on the bottleneck throughput. Segments
spread with Tspreading < MSS/BottleneckThroughput (the time needed to
transmit one segment at the bottleneck) will face the same buffer
state, and will therefore not be spread enough for the losses to be
uncorrelated.
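A minimal sketch of this threshold follows. It reads the threshold as
the time needed to serialize one MSS at the bottleneck (MSS divided
by throughput), which is the reading consistent with the 2 ms and
6 Mb/s figures given in Section 4.3. The function names and the
1500-byte segment size are assumptions chosen for illustration.

```python
def min_spreading_us(mss_bytes, bottleneck_bps):
    """Time, in microseconds, to transmit one segment at the bottleneck."""
    return mss_bytes * 8 * 1_000_000 / bottleneck_bps


def losses_can_decorrelate(tspreading_us, mss_bytes, bottleneck_bps):
    """True if two consecutive segments are spaced by at least one
    segment transmission time, so they may see different buffer states."""
    return tspreading_us >= min_spreading_us(mss_bytes, bottleneck_bps)


# 1500-byte segments (assumed size) on a 6 Mb/s bottleneck.
print(min_spreading_us(1500, 6_000_000))              # 2000.0
# A 2 ms spacing is enough at 6 Mb/s but not at 2 Mb/s.
print(losses_can_decorrelate(2000, 1500, 6_000_000))  # True
print(losses_can_decorrelate(2000, 1500, 2_000_000))  # False
```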
4.3 Tmax
Tmax is the upper bound value of Tspreading. It has two main
purposes:
o it enables Initial Spreading not to depend on the RTT
measurement, which introduces some uncertainty in the
mechanism and increases the latency variance.
o it reduces the mean latency.
The choice of Tmax is therefore a trade-off. A larger Tmax would
enable Initial Spreading to be efficient with a lower bottleneck
throughput (cf. Section 4.2), whereas a lower value would reduce the
impact of Initial Spreading on uncongested networks, but would also
decrease its benefits.
If Tspreading is not large enough to ensure loss independence,
Initial Spreading does not introduce additional delay but performs
similarly to RFC 6928.
The authors RECOMMEND the use of a Tmax equal to 2 ms. This value
enhances the performance of networks with a bit rate greater than
6 Mb/s, and introduces a maximal additional latency of 2*n ms.
4.4 Algorithm
Tspreading is computed as follows:
1. RTT/n is compared to Tmax, the maximal value of spreading,
with n the permitted upper bound value of the TCP's IW.
2. If RTT/n < Tmax,
   Tspreading = RTT/n
3. If RTT/n >= Tmax,
   Tspreading = Tmax
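The computation above can be sketched as follows. The function name
compute_tspreading_us, the constant name TMAX_US and the use of
microseconds are assumptions of this sketch; the 2 ms default is the
Tmax value recommended in Section 4.3.

```python
TMAX_US = 2000  # Tmax of 2 ms, the value recommended in Section 4.3


def compute_tspreading_us(rtt_us, n, tmax_us=TMAX_US):
    """Return Tspreading in microseconds.

    rtt_us: RTT measured during the SYN-SYN/ACK exchange
    n: permitted upper bound value of the TCP's IW
    """
    candidate = rtt_us / n
    if candidate < tmax_us:
        # Short RTT: spread the IW evenly over the whole RTT.
        return candidate
    # Long RTT: cap the spacing so the added latency stays bounded.
    return tmax_us


# 10 ms RTT, IW bound of 10: the segments are spaced by RTT/n.
print(compute_tspreading_us(10_000, 10))    # 1000.0
# 600 ms RTT (e.g. a satellite path): the spacing is capped at Tmax.
print(compute_tspreading_us(600_000, 10))   # 2000
```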
5 Implementation considerations
In this section, we discuss a number of aspects surrounding the
Initial Spreading implementations.
5.1 Timers
High resolution timers MUST be used instead of jiffy timers to
implement Initial Spreading.
Using a jiffy timer may result in the transmission of new bursts and
reduce the benefits of Initial Spreading: emissions of multiple
TCP flows are synchronized via the jiffies timer, so when m parallel
flows are sent, a burst of m segments may be transmitted.
Finally, using high resolution timers (HRTimer) keeps the Initial
Spreading algorithm simple (cf. Section 4.4), and notably avoids the
need for a lower bound value for Tspreading.
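The synchronization effect can be illustrated with a small model in
which a timer may only fire on tick boundaries. This is a sketch
under stated assumptions, not kernel code: the 4 ms tick (as with a
kernel built with HZ=250), the function names and the requested send
times are all hypothetical.

```python
def round_up_to_tick(t_us, tick_us):
    """Expiry time of a timer that can only fire on tick boundaries."""
    return (t_us + tick_us - 1) // tick_us * tick_us


def burst_sizes(requested_us, tick_us):
    """Map each actual firing time to the number of segments sent then."""
    counts = {}
    for t in requested_us:
        fire = round_up_to_tick(t, tick_us)
        counts[fire] = counts.get(fire, 0) + 1
    return counts


# Four parallel flows (m = 4) each want to send one segment at
# distinct times within the same 4 ms interval.
requested = [1000, 1700, 2500, 3900]

# Coarse 4 ms tick: all four timers expire together, a burst of m.
print(burst_sizes(requested, 4000))  # {4000: 4}
# 1 us resolution: every segment keeps its own send time.
print(burst_sizes(requested, 1))     # {1000: 1, 1700: 1, 2500: 1, 3900: 1}
```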
5.2 Pacing in AQM
The authors RECOMMEND applying the pacing in the Active Queue
Management (AQM). This would reduce the overhead in the TCP stack.
5.3 TSO/GSO
TCP Segmentation Offload (TSO) and Generic Segmentation Offload (GSO)
are used to reduce the CPU overhead of TCP/IP on fast networks.
Instead of doing the segmentation in the kernel, large packets are
sent to the Network Interface Card (NIC). The segmentation is then
achieved by the NIC or just before the entry into the driver's xmit
routine.
In its current design, Initial Spreading does not work when TSO or
GSO is activated, but using Initial Spreading with TSO/GSO
deactivated still provides better performance.
Two options can be foreseen for the joint use of Initial Spreading
and TSO/GSO:
(1) disable TSO/GSO for the first RTT, with no impact on performance
since the throughput is limited by the IW.
(2) implement Initial Spreading using the TCP Offload Engine (TOE)
[RFC5532].
5.4 Delayed Ack
The use of Delayed Ack (Del Ack) does not degrade the efficiency of
Initial Spreading.
Regarding long-lived connections, and notably TCP's steady state, the
effects of Del Ack are lessened by newer TCP flavors (such as TCP
Cubic or Compound TCP [HR08][TS06]), which tend to adapt their
congestion algorithm to take into account whether the receiver uses
the Del Ack option or not. In doing so, they can prevent the
connection from being too slow while still reducing acknowledgment
traffic. For short-lived connections, the use of Del Ack does not
modify the transmission of the IW. There is therefore no change in
the burst propagation.
6 Open discussions
In this section, we introduce possible improvements for Initial
Spreading and new perspectives.
6.1 Increasing the upper bound TCP's IW to more than 10 segments
[DR10] has shown that an IW of 10 segments makes it possible to send
more than 90% of web objects in one RTT. The authors therefore
recommend using Initial Spreading as a complement to [RFC6928].
If the average size of the web objects continues to evolve, Initial
Spreading can be used to raise the IW size. Simulations and
experiments showed even better results with an IW equal to 12.
Thus, Initial Spreading paves the way for a larger IW. Further
studies are needed to assess the impact on networks, notably in terms
of individual performance, fairness, friendliness and global
performance.
6.2 Initial Spreading and LFN
The space community designed middleboxes to mitigate poor TCP
performance in networks with a large RTT [FA11]. Performance
Enhancing Proxies (PEPs) are generally used in LFNs, and in
particular in satellite communication systems [RFC3135], and offer
very good TCP performance.
Nevertheless, some recent studies have emphasized major impairments
caused by the use of satellite-specific transport solutions, and
notably TCP-PEPs, in a global context. Breaking the end-to-end TCP
semantics, which is required to isolate the satellite segment, is
notably responsible for increased complexity in mobility scenarios
or security contexts. This strongly reduces the benefits of PEPs and
reopens the debate on their relevance [DF10].
Many researchers have pointed out that new TCP releases perform well
for long-lived TCP connections, even in satellite environments
[SC12], but continue to suffer from very poor performance for
short-lived TCP connections.
Initial Spreading reduces the consequences of the RTT for short-lived
TCP connections and could be an end-to-end alternative to PEPs.
7 Security Considerations
The security considerations found in [RFC5681] apply to this
document. No additional security problems have been identified with
Initial Spreading at this time.
8 IANA Considerations
This document contains no IANA considerations.
9 References
9.1 Normative References
[RFC3390] M. Allman, S. Floyd, and C. Partridge, "Increasing TCP's
Initial Window," RFC 3390, IETF, Proposed Standard, Oct. 2002.
[RFC5532] T. Talpey, C. Juszczak, "Network File System (NFS) Remote
Direct Memory Access (RDMA) Problem Statement," RFC 5532,
IETF, Informational, May 2009.
[RFC6928] J. Chu, N. Dukkipati, Y. Cheng, and M. Mathis, "Increasing
TCP's Initial Window," RFC 6928, IETF, Experimental, Apr.
2013.
[AH98] M. Allman, C. Hayes, and S. Ostermann, "An evaluation of TCP
with Larger Initial Windows," ACM Computer Communication
Review, 1998.
[AS00] A. Aggarwal, S. Savage, and T. Anderson, "Understanding the
performance of TCP pacing," in INFOCOM, vol. 3, Mar. 2000,
pp. 1157-1165.
[DR10] N. Dukkipati, T. Refice, Y. Cheng, J. Chu, T. Herbert, A.
Agarwal, A. Jain, and N. Sutin, "An Argument for
Increasing TCP's Initial Congestion Window," SIGCOMM
Comput. Commun. Rev., vol. 40, no. 3, pp. 26-33, Jun.
2010.
[SB13] R. Sallantin, C. Baudoin, E. Chaput, E. Dubois, F. Arnal, and
A. Beylot, "Initial spreading: a fast start-up tcp
mechanism," proceedings of LCN, 2013.
9.2 Informative References
[RFC3135] J. Border, M. Kojo, J. Griner, G. Montenegro, Z. Shelby,
"Performance Enhancing Proxies Intended to Mitigate Link-
Related Degradations," RFC 3135, IETF, Informational, June
2001.
[DF10] E. Dubois, J. Fasson, C. Donny, and E. Chaput, "Enhancing TCP
based communications in mobile satellite scenarios: TCP
PEPs issues and solutions," in Proc. 5th Advanced
Satellite Multimedia Systems Conference (ASMS) and the
11th Signal Processing for Space Communications Workshop
(SPSC), pages 476-483, 2010.
[FA11] A. Fairhurst, G. Arjuna, H. Cruickshank, and C. Baudoin,
"Transport challenges facing a next generation hybrid
satellite internet," in International Journal of Satellite
Communications and networking, 2011.
[HR08] S. Ha, I. Rhee, and L. Xu, "CUBIC: A New TCP-Friendly High-
Speed TCP Variant," SIGOPS Oper. Syst. Rev., vol. 42, no.
5, pp. 64-74, Jul. 2008.
[LC09] R. Lacamera, D. Caini, C. Firrincieli, "Comparative
performance evaluation of TCP variants on satellite
environments," in ICC'09 Proceedings of the 2009 IEEE
International Conference on Communications, pages
5161-5165, 2009.
[SC12] R. Sallantin, E. Chaput, E. P. Dubois, C. Baudoin, F. Arnal,
and A.-L. Beylot, "On the sustainability of PEPs for
satellite Internet access," in ICSSC. AIAA, 2012.
[TS06] K. Tan, J. Song, Q. Zhang, and M. Sridharan, "Compound TCP: A
Scalable and TCP-friendly Congestion Control for High-
speed Networks," in 4th International workshop on
Protocols for Fast Long-Distance Networks (PFLDNet), 2006.
Authors' Addresses
Comments are solicited and should be addressed to the research
group's mailing list at iccrg@irtf.org and/or the authors:
Renaud Sallantin
CNES/TAS/TESA
IRIT/ENSEEIHT 2, rue Charles Camichel BP 7122
31071 Toulouse Cedex 7
France
Phone: +33 6 48 07 86 44
Email: renaud.sallantin@gmail.com
Cedric Baudoin
Thales Alenia Space (TAS)
26 Avenue Jean Francois Champollion,
31100 Toulouse
France
Email: cedric.baudoin@thalesaleniaspace.com
Fabrice Arnal
Thales Alenia Space
Email: fabrice.arnal@thalesaleniaspace.com
Emmanuel Dubois
Centre National des Etudes Spatiales (CNES)
18 Avenue Edouard Belin
31400 Toulouse
France
Email: emmanuel.Dubois@cnes.Fr
Emmanuel Chaput
IRIT
IRIT / ENSEEIHT 2, rue Charles Camichel BP 7122
31071 Toulouse Cedex 7
France
Email: emmanuel.chaput@enseeiht.fr
Andre-Luc Beylot
IRIT
Email: andre-Luc.Beylot@enseeiht.fr