Network Working Group M. Bagnulo
Internet-Draft UC3M
Intended status: Experimental Y. Nishida
Expires: April 2, 2018 GE Global Research
September 29, 2017
TCP ESN: Extended Sequence Numbers for TCP
draft-bagnulo-tcpm-esn-00.txt
Abstract
This note defines the Extended Sequence Number (ESN) experimental
modification to TCP, which enlarges TCP's sequence number space using
the TimeStamp (TS) option. It also modifies the Window Scale (WS)
option to support the larger receiver window enabled by the extended
sequence number space. At this stage, the purpose of this document is
to present different design choices in order to generate discussion
about the approach to follow.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 2, 2018.
Copyright Notice
Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
Bagnulo & Nishida Expires April 2, 2018 [Page 1]
Internet-Draft TCP ESN September 2017
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Overview
2. Design rationale
2.1. Reduced option space consumption in the SYN and graceful fallback
2.2. Deployability
3. RTTM With Extended Sequence Number Prefix
4. Middleboxes Implications
5. SACK for Extended Sequence Number
6. Impacts On Other TCP Extensions
6.1. PAWS
6.2. Eifel Detection Algorithm
7. Acknowledgments
8. Security Considerations
9. IANA Considerations
10. References
10.1. Normative References
10.2. Informative References
Authors' Addresses
1. Overview
The proposed Extended Sequence Number (ESN) mechanism re-purposes the
TS option [RFC7323] to carry a prefix for the sequence number and a
prefix for the Acknowledgment number, enlarging the sequence number
space used in TCP connections.
As currently defined, the TS option contains two 32-bit fields, TSval
and TSecr. The current ESN proposal re-defines TSval to carry a
prefix for the sequence number and TSecr to carry a prefix for the
Acknowledgment number. In this way, the actual sequence number
corresponding to the first data byte contained in the segment would
be the concatenation of the value contained in the TSval and the
value of the Sequence Number field of the TCP header. The
Acknowledgment sequence number would be the concatenation of the
value contained in the TSecr and the value of the Acknowledgment
Number field of the TCP header.
The proposed ESN mechanism also modifies the WS option as follows.
First, shift values up to 46 are allowed (enabling a receive window of
up to 2^62). These are encoded in the 6 least significant bits of
shift.count. Second, the remaining two (most significant) bits are
turned into flags. In particular, the most significant bit is used
as the ESN flag to indicate ESN support in the connection.
Specifically, when the ESN bit is set to 1 in the WS option carried in
a SYN or a SYN/ACK, it means that: i) the TS option is being used for
extended sequence numbers, as defined above, and ii) the sender of the
WS option with the ESN bit set supports a receiver window of up to
2^62 in this connection. The ESN flag defined this way allows
endpoints to express and negotiate ESN support during the TCP 3-way
handshake.
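The modified shift.count layout described above can be sketched as follows. This is a minimal illustration, not a normative encoding: the function and constant names are hypothetical, and the second most significant bit (the other flag, whose meaning the text does not define) is simply left at zero.

```python
from typing import Tuple

ESN_FLAG = 0x80        # most significant bit of shift.count: ESN flag
SHIFT_MASK = 0x3F      # six least significant bits: the shift value
MAX_LEGACY_SHIFT = 14  # largest shift allowed by RFC 7323
MAX_ESN_SHIFT = 46     # largest shift with ESN (16 + 46 = 62 bits)


def encode_shift_count(shift: int, esn: bool) -> int:
    """Build the one-byte shift.count field of a WS option."""
    limit = MAX_ESN_SHIFT if esn else MAX_LEGACY_SHIFT
    if not 0 <= shift <= limit:
        raise ValueError(f"shift {shift} outside 0..{limit}")
    return (ESN_FLAG if esn else 0) | shift


def decode_shift_count(byte: int) -> Tuple[int, bool]:
    """Split a received shift.count byte into (shift, esn_flag)."""
    return byte & SHIFT_MASK, bool(byte & ESN_FLAG)
```

A legacy value such as 7 decodes with the ESN flag cleared, which is what makes the fallback described later possible.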
The sequence number of a TCP segment using ESN is the result of
prepending the prefix carried in the TS Value to the sequence number
contained in the Sequence Number field of the TCP header. Similarly,
the ACK number is the result of prepending the value in the TS Echo
Reply to the value in the ACK field of the TCP header.
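The concatenation can be sketched as below, assuming a full 32-bit prefix (the RTTM alternatives discussed in Section 3 use shorter prefixes). The helper names are illustrative only.

```python
from typing import Tuple


def extended_seq(prefix: int, seq32: int) -> int:
    """Prepend the TS-carried prefix to the 32-bit header field."""
    return (prefix << 32) | seq32


def split_extended_seq(ext: int) -> Tuple[int, int]:
    """Return (prefix, seq32) for an extended sequence number."""
    return ext >> 32, ext & 0xFFFFFFFF
```

Note that advancing past the end of the 32-bit space carries into the prefix: the byte following extended sequence number 0xFFFFFFFF has prefix 1 and a Sequence Number field of 0.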
When a client wants to use extended sequence numbers for a new
connection, it sends a SYN with both the TS and the WS options. In
the WS option, it sets the ESN flag to signal that it wants to use
ESN for this connection. It encodes the most significant bits of the
sequence number in the TS Value and the remaining bits of the
extended sequence number in the Sequence Number field of the TCP
header. Since the ACK flag is not set in the TCP header of the SYN
packet, the TS Echo Reply is set to zero (as defined in [RFC7323]).
If the server also supports the extended sequence number mechanism,
the server replies with a SYN/ACK carrying both the TS and WS
options. In the WS option it sets the ESN flag to confirm the ESN
support. It encodes the prefix of its own extended sequence number
in the TS Value and the prefix of the ACK in the TS Echo Reply.
If the server does not support ESN, it will respond with a SYN/ACK
containing a WS option carrying a value no greater than 14, i.e., with
the most significant bit set to 0. It may also include the TS option,
indicating its willingness to use timestamps as defined in [RFC7323]
in this connection. Upon reception of the SYN/ACK, the client can
gracefully fall back to using TS as defined in [RFC7323]; in
particular, PAWS can be used.
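The client-side outcome of this negotiation can be sketched as a small classification function. The names are hypothetical. The handling of a SYN/ACK that sets the ESN flag but carries no TS option is an assumption of this sketch (it is treated as a failed negotiation), since the text does not cover that case.

```python
from typing import Optional

ESN_FLAG = 0x80  # most significant bit of the WS shift.count byte


def client_mode(synack_shift_count: Optional[int],
                synack_has_ts: bool) -> str:
    """Classify the SYN/ACK received in reply to a SYN that offered ESN.

    synack_shift_count is the raw shift.count byte, or None when the
    SYN/ACK carried no WS option.
    """
    esn_confirmed = (
        synack_shift_count is not None
        and bool(synack_shift_count & ESN_FLAG)
        and synack_has_ts
    )
    if esn_confirmed:
        return "esn"                 # TS fields now carry prefixes
    if synack_has_ts:
        return "rfc7323-timestamps"  # graceful fallback, PAWS available
    return "no-timestamps"
```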
2. Design rationale
Our proposal is to reuse the TCP TS option to carry a sequence number
offset in addition to the existing 32-bit sequence number. This
approach is similar to [I-D.looney-tcpm-64-bit-seqnos], with one
distinct difference: while [I-D.looney-tcpm-64-bit-seqnos] proposes
to allocate a new TCP option, we propose to use the existing TS
option instead. We believe this approach has the following
advantages.
2.1. Reduced option space consumption in the SYN and graceful fallback
The maximum size of the TCP header (including options) is 60 bytes
(this is because the Data Offset field of the TCP header is 4 bits
long and expresses the offset in 32-bit words). Since the basic TCP
header is 20 bytes, a segment can carry 40 bytes of options at most.
This is particularly pressing for the TCP SYN and TCP SYN/ACK
packets. Currently, there is a fair number of options that are
frequently carried in SYN packets, especially in high performance
communications. These include the MSS option (4 bytes) [RFC0793],
the SACK permitted option (2 bytes) [RFC2018], the Window Scale option
(3 bytes) and the TimeStamp option (used for PAWS) (10 bytes)
[RFC7323]. All these options account for 19 bytes. There are other
options that are becoming increasingly popular. For instance, the
option length of TCP Fast Open (TFO) [RFC7413] is 6 bytes or 18 bytes
depending on the length of the cookie used. There are other options
that require SYN and SYN/ACK option space such as MP_CAPABLE in
[RFC6824], or TCP-AO [RFC5925].
This means that, for instance, a TCP client that would like to
initiate a connection including the MSS option, the SACK permitted
option, the WS and TS options, and also a TFO option with a long
cookie (18 bytes) would not have room to carry an additional 10-byte
option for the extended sequence number. Since our approach reuses
the TS option, no additional option space is needed for the extended
sequence number.
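The option-space budget behind this argument can be checked with a few lines of arithmetic. This sketch assumes the MSS option occupies 4 bytes (kind, length, and a 2-byte value), uses the 18-byte TFO variant, and ignores any padding needed for 32-bit alignment. The table and function names are illustrative.

```python
from typing import Iterable

MAX_TCP_HEADER = 60   # 4-bit Data Offset, counted in 32-bit words
BASE_TCP_HEADER = 20  # header without options

# Option sizes in bytes (assumed values, see the lead-in).
OPTION_BYTES = {
    "MSS": 4,
    "SACK-permitted": 2,
    "WS": 3,
    "TS": 10,
    "TFO-long-cookie": 18,
}


def remaining_option_space(options: Iterable[str]) -> int:
    """Bytes of option space left after the listed options."""
    used = sum(OPTION_BYTES[name] for name in options)
    return MAX_TCP_HEADER - BASE_TCP_HEADER - used


common = ["MSS", "SACK-permitted", "WS", "TS"]
left_with_tfo = remaining_option_space(common + ["TFO-long-cookie"])
```

Under these assumptions only a few bytes remain once TFO is added, which is too little for a separate 10-byte option.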
The proposed ESN approach allows for using the extended sequence
number if both endpoints support it, while enabling graceful
fallback. A client supporting ESN would include the TS option and set
the flag in the WS option indicating ESN support. If the server
does not support ESN, the connection can still be established using
32-bit sequence numbers and the TS and WS options as defined in
[RFC7323] (in particular, PAWS can be used in the connection).
2.2. Deployability
[HONDA11] reported that unknown options in the SYN are more likely to
be removed than known options. Hence, we believe that using existing
options has a better chance of avoiding unwanted middlebox
interference. It would nevertheless be useful to perform further
measurements specifically about how frequently the TS option is
removed.
3. RTTM With Extended Sequence Number Prefix
[RFC7323] defines two uses for the TS option: PAWS and RTTM. When
re-purposing the TS option for ESN, we argue that the use of TS for
carrying the extended sequence number subsumes the use of PAWS.
However, this is not the case for RTTM. We identify the following
alternatives in order to achieve RTTM when re-purposing the TS option
for ESN.
Option 1:
This approach uses the most significant bit (MSB) of both TSval
and TSecr as a flag, as depicted in Figure 1. If the MSB is set
to 1, the field contains a sequence number prefix. If it is
reset, the field contains a timestamp. This means
that we use 31 bits for the extended sequence number prefix,
resulting in 63-bit sequence numbers. The main problem here
is that the segments containing the timestamp lack the sequence
number prefix information. So, for instance, it is not possible
to have more than 2^32 bytes in flight if any of the segments in
flight is carrying an actual timestamp, since there is the
possibility of confusion (in particular, if the receive window is
large enough to accommodate two packets with the same 32-bit
sequence number, then the receiver would not be able to figure
out the right place for the packet that carries the timestamp and
does not carry the sequence number prefix). So, if we want to
use this option, the receiver window cannot be larger than 2^32.
However, this restriction does not address all the problems. If
a duplicated packet carrying a timestamp in the TS option is
delayed by one RTT or more and the 32-bit sequence number wraps
around, then the receiver can potentially mistake this old
duplicated packet for a new packet with the same sequence number
suffix. It would be possible to rely on PAWS for detecting and
eliminating these packets. However, in order for PAWS to be used,
it is necessary to keep the timestamp information stored in
TS.recent updated. This requires that at least a few actual
timestamps are exchanged every 2^31 sequence numbers.
Summarizing, the constraints to use this option are that
the flight size is less than 2^32 and that at least n (n=4?)
timestamps are exchanged every 2^32 bytes of data. We believe
this is a poor alternative, especially due to the flight-size
constraint.
+-------+-------+-+---------------------+-+----------------------+
|Kind=8 |  10   |F|   TSval or Prefix   |F|   TSecr or Prefix    |
+-------+-------+-+---------------------+-+----------------------+
    8       8    1           31          1           31
Figure 1: Time Stamp Option format for Option 1
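The ambiguity that limits Option 1 can be made concrete. The sketch below lists every position inside the receive window where a segment that carries only a 32-bit sequence number (because its TS field holds a timestamp) could belong. The function name and the window representation are assumptions made for illustration.

```python
from typing import List

SEQ_SPACE = 1 << 32  # size of the 32-bit sequence number space


def candidate_positions(rcv_nxt: int, rcv_wnd: int, seq32: int) -> List[int]:
    """Extended sequence numbers in [rcv_nxt, rcv_nxt + rcv_wnd)
    whose low 32 bits equal seq32."""
    first = rcv_nxt + ((seq32 - rcv_nxt) % SEQ_SPACE)
    return list(range(first, rcv_nxt + rcv_wnd, SEQ_SPACE))
```

With a window of at most 2^32 there is never more than one candidate, so the segment can be placed unambiguously. With a window of 2^33 the same 32-bit value matches two positions, which is the confusion described above.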
Option 2:
This approach uses the TSecr in some packets to exchange
timestamps. The idea here is that all data segments carry the
extended sequence number prefix in the TSval but that some
packets do not carry ACK information, which is acceptable because
we use cumulative ACKs, as long as this only affects a few packets
(e.g., one packet per RTT does not carry ACK information). In order
to enable both uses of the TSecr (timestamp or sequence number
prefix), we need to use 2 bits to encode whether the TSecr
carries an extended sequence number prefix for the ACK, a
timestamp, or a timestamp echo. This implies that there are 30
bits left in TSecr for the actual value, resulting in 30-bit
timestamps and 62-bit sequence numbers. The receiver of a packet
whose TS option carries an actual timestamp or timestamp
echo should discard the ACK information, since it cannot know
the prefix of the sequence number carried in the ACK field. This
option seems a reasonable trade-off. If this option is adopted,
RTTM could only be used sporadically. However, this may not be a
concern, since it is likely that it would be possible to measure
the RTT at least once every RTT, which is likely to be enough for
estimating the RTT for the RTO calculation (see [RFC7323] for
further details).
+-------+-------+--+--------------------+--+---------------------+
|Kind=8 |  10   |F |  TSval or Prefix   |F |   TSecr or Prefix   |
+-------+-------+--+--------------------+--+---------------------+
    8       8    2           30          2           30
Figure 2: Time Stamp Option format for Option 2
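A possible encoding of the 2-bit flag of Option 2 is sketched below. The code points (0, 1, 2) are hypothetical, since the text does not assign them, and the value 3 is left unassigned, so unpacking it raises an error. The last function shows the receiver rule that ACK information is ignored when the TSecr does not carry a prefix.

```python
from enum import IntEnum
from typing import Optional, Tuple

VALUE_BITS = 30
VALUE_MASK = (1 << VALUE_BITS) - 1


class FieldKind(IntEnum):
    # Hypothetical code points; the draft does not assign them.
    PREFIX = 0
    TIMESTAMP = 1
    TIMESTAMP_ECHO = 2


def pack_field(kind: FieldKind, value: int) -> int:
    """Build a 32-bit TS field: 2-bit kind followed by a 30-bit value."""
    if not 0 <= value <= VALUE_MASK:
        raise ValueError("value does not fit in 30 bits")
    return (int(kind) << VALUE_BITS) | value


def unpack_field(field: int) -> Tuple[FieldKind, int]:
    """Split a 32-bit TS field into (kind, 30-bit value)."""
    return FieldKind(field >> VALUE_BITS), field & VALUE_MASK


def extended_ack(tsecr_field: int, ack32: int) -> Optional[int]:
    """Return the 62-bit ACK number, or None when the TSecr carries a
    timestamp or timestamp echo and the ACK must be ignored."""
    kind, value = unpack_field(tsecr_field)
    if kind is not FieldKind.PREFIX:
        return None
    return (value << 32) | ack32
```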
Option 3:
This approach splits the TSval and the TSecr into two 16-bit
fields each, resulting in 16-bit timestamps and 48-bit sequence
numbers. 48-bit sequence numbers are a significant improvement
over the current 32-bit sequence numbers, so they are probably
sufficient. It is possible to encode the timestamp information using
16 bits. For example, [I-D.trammell-tcpm-timestamp-interval]
proposes to encode timestamp information using 16 bits, which
could be used in this option.
+-------+-------+-----------+-----------+------------+-----------+
|Kind=8 |  10   |  Prefix   |   TSval   |   Prefix   |   TSecr   |
+-------+-------+-----------+-----------+------------+-----------+
    8       8        16          16           16           16
Figure 3: Time Stamp Option format for Option 3
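The split used by Option 3 can be sketched as follows, with the prefix in the upper 16 bits of each field as Figure 3 suggests. The helper names are illustrative, and the sketch does not model how the 16-bit timestamp itself is encoded.

```python
from typing import Tuple


def pack_option3_field(prefix16: int, ts16: int) -> int:
    """Build a 32-bit TS field: 16-bit prefix, then 16-bit timestamp."""
    if not (0 <= prefix16 <= 0xFFFF and 0 <= ts16 <= 0xFFFF):
        raise ValueError("both halves must fit in 16 bits")
    return (prefix16 << 16) | ts16


def unpack_option3_field(field: int) -> Tuple[int, int]:
    """Split a 32-bit TS field into (prefix16, ts16)."""
    return field >> 16, field & 0xFFFF


def seq48(prefix16: int, seq32: int) -> int:
    """Form the 48-bit sequence number from prefix and header field."""
    return (prefix16 << 32) | seq32
```

Unlike Options 1 and 2, every segment carries both a prefix and a timestamp here, at the cost of a shorter prefix.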
Option 4:
This approach uses the TS option for only one purpose per
connection: either the original purpose or ESN. This is less
attractive because RTTM cannot be used together with ESN in the
same connection.
+-------+-------+-----------------------+------------------------+
|Kind=8 |  10   |        Prefix         |         Prefix         |
+-------+-------+-----------------------+------------------------+
    8       8              32                       32
Figure 4: Time Stamp Option format for Option 4
Based on the observations above, we believe options 2 and 3 merit
further discussion, while options 1 and 4 can be discarded due to
their major drawbacks.
4. Middleboxes Implications
It has been observed in [HONDA11] that some middleboxes insert the TS
option. Also, there may be middleboxes that modify the sequence
number while not terminating the connection. In order to detect
these cases, which would break the proposed mechanism, it would be
beneficial to add an extra safety measure requiring that the prefix
encoded in the TS option replicates the most significant bits of the
value included in the Sequence Number field. In this way, a server
supporting the extended sequence number mechanism can not only verify
the flag in the WS option, but also check whether the TS value matches
the 31 most significant bits of the Sequence Number field in the
TCP header. If they do not match, the server should not negotiate
the use of the extended sequence number mechanism (i.e., it replies
with a WS option in which the flag for the extended sequence number
mechanism is reset). This is adopted from
[I-D.looney-tcpm-64-bit-seqnos].
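The consistency check can be sketched as a single comparison. The sketch assumes a 31-bit prefix, as in the text above, and the parameter prefix_bits is added only to show how the check would change for other prefix lengths. The function name is hypothetical.

```python
def syn_prefix_consistent(ts_prefix: int, seq32: int,
                          prefix_bits: int = 31) -> bool:
    """True when the prefix in the TS option replicates the most
    significant bits of the 32-bit Sequence Number field."""
    return ts_prefix == seq32 >> (32 - prefix_bits)


# A client following the rule derives the prefix from its own ISN.
isn = 0xDEADBEEF
prefix = isn >> 1
```

If a middlebox rewrites the Sequence Number field or inserts its own TS option, the two values are unlikely to agree, and the server declines ESN.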
If the server is a legacy server, it will reply without the
WS option or with the WS option with a shift.count value lower than
15. In this case, the client falls back to regular TCP, without
extended sequence numbers and with regular timestamps.
5. SACK for Extended Sequence Number
In the case of SACK blocks, there are two possible complementary
approaches:
1. Use the currently defined SACK option, which identifies blocks
using 32-bit sequence numbers. When it is used in a connection that
has successfully negotiated ESN, the prefix carried in the TSecr of
the segment also applies to the sequence numbers identifying the
SACK blocks. The limitation of this approach is that all SACK
blocks in a single SACK option must use the same prefix, which
prevents SACKing older blocks. However, it is not clear
whether we really need to report a wide range of SACK blocks in a
single SACK option. Another issue would be the case where a SACK
option is detached from the original packet and attached to a
different one. One possible mitigation for this would be
discarding SACK information that looks suspicious, since SACK is
optional information and is usually carried in multiple ACKs.
2. Define a new SACK block option for extended sequence numbers, as
proposed in [I-D.looney-tcpm-64-bit-seqnos].
There are a couple of observations regarding the second approach,
which uses the new SACK block option. First, note that the current
SACK permitted option could still be used. Hence, if a connection
negotiated both SACK and ESN, we may presume that it supports the new
SACK block option. If the ESN negotiation fails, 32-bit SACK blocks
are to be used for that connection, providing graceful
fallback.
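For the first approach, which keeps the existing SACK option, the receiver of an ACK would expand each 32-bit block edge with the prefix carried in the TSecr of the same segment. The following is a minimal sketch under that reading. The names are illustrative, a full 32-bit prefix is assumed, and blocks that straddle a prefix boundary are not handled.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # (left edge, right edge)


def expand_sack_blocks(tsecr_prefix: int,
                       blocks32: List[Block]) -> List[Block]:
    """Apply the single TSecr prefix to every 32-bit SACK block edge."""
    base = tsecr_prefix << 32
    return [(base | left, base | right) for left, right in blocks32]
```

Because one prefix covers the whole option, a block whose edges belong to an older prefix cannot be expressed, which is the limitation noted above.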
6. Impacts On Other TCP Extensions
Since this proposal repurposes the timestamp option, other TCP
extensions that use the option will be affected. We investigated
the impact on the following TCP extensions and propose
modifications to make them work with the proposal.
6.1. PAWS
In order to perform PAWS, receivers need to check whether the
timestamp option in an arriving packet contains a sequence number
prefix or timestamp info by checking the most significant bit. If it
contains timestamp info, the receiver processes it as described in
Section 5.3 of [RFC7323]. If it contains a sequence number prefix,
the receiver can determine the extended sequence number of the packet
based on the
info. If the extended sequence number is outside of the window, the
packet is discarded, as it would be under PAWS.
6.2. Eifel Detection Algorithm
If the Eifel detection algorithm [RFC3522] is activated, the sender
performs the logic described in Section 3.2 of [RFC3522] with the
following two modifications. First, the TCP sender MUST set
timestamp info when it retransmits packets. Second, if the TCP
sender receives an ACK with a sequence number prefix for the
retransmitted packet, it should treat the ACK as if its timestamp
were smaller than the value of RetransmitTS.
7. Acknowledgments
8. Security Considerations
9. IANA Considerations
10. References
10.1. Normative References
[RFC0793] Postel, J., "Transmission Control Protocol", STD 7,
RFC 793, DOI 10.17487/RFC0793, September 1981,
<https://www.rfc-editor.org/info/rfc793>.
[RFC7323] Borman, D., Braden, B., Jacobson, V., and R.
Scheffenegger, Ed., "TCP Extensions for High Performance",
RFC 7323, DOI 10.17487/RFC7323, September 2014,
<https://www.rfc-editor.org/info/rfc7323>.
10.2. Informative References
[HONDA11] Honda, M., Nishida, Y., Raiciu, C., Greenhalgh, A.,
Handley, M., and H. Tokuda, "Is it still possible to
extend TCP?", ACM IMC 2011, 2011.
[I-D.looney-tcpm-64-bit-seqnos]
Looney, J., "64-bit Sequence Numbers for
TCP", draft-looney-tcpm-64-bit-seqnos-00 (work in
progress), March 2017.
[I-D.trammell-tcpm-timestamp-interval]
Scheffenegger, R., Kuehlewind, M., and B. Trammell,
"Encoding of Time Intervals for the TCP Timestamp Option",
draft-trammell-tcpm-timestamp-interval-01 (work in
progress), July 2013.
[RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
Selective Acknowledgment Options", RFC 2018,
DOI 10.17487/RFC2018, October 1996,
<https://www.rfc-editor.org/info/rfc2018>.
[RFC3522] Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm
for TCP", RFC 3522, DOI 10.17487/RFC3522, April 2003,
<https://www.rfc-editor.org/info/rfc3522>.
[RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP
Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
June 2010, <https://www.rfc-editor.org/info/rfc5925>.
[RFC6824] Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
"TCP Extensions for Multipath Operation with Multiple
Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013,
<https://www.rfc-editor.org/info/rfc6824>.
[RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014,
<https://www.rfc-editor.org/info/rfc7413>.
Authors' Addresses
Marcelo Bagnulo
UC3M
Email: marcelo@it.uc3m.es
Yoshifumi Nishida
GE Global Research
2623 Camino Ramon
San Ramon, CA 94583
USA
Email: nishida@wide.ad.jp