Transport Services and low latency
draft-petlund-latency-transport-services-00

Abstract

This document categorises different classes of network latency, discusses possible metrics for determining the characteristics of latency-sensitive flows and addresses the use of transport services as a means for achieving transport latency reduction.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on August 18, 2014.

Copyright Notice

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

1. Introduction
1.1. Requirements Language
2. Time-dependent applications
2.1. On the characteristics of latency-sensitive traffic
3. Challenges in identifying traffic characteristics
4. Examples of choices of protocols and options that influence latency
4.1. Protocols
4.2. Options and mechanisms
5. Discussion
6. IANA Considerations
7. Security Considerations
8. References
8.1. Normative References
8.2. Informative References
Author's Address

1. Introduction

Modern operating systems provide a myriad of different protocols and options to tweak the network performance. Even for veterans within the field of transport protocols it is hard to stay fully up to date on all possibilities and combinations of options that may help reduce latency for a networked application. Also, care needs to be taken so that the transport protocols and options chosen will not be disruptive to other services or to an application if it changes network behaviour. For application developers in general to be able to select the best possible subset of mechanisms and protocols to support their time-dependent networked application, a measure of abstraction is required. This document discusses different classes of network latency with examples of how to reduce the delay for each class. It also makes suggestions for how an application can specify its intended behaviour to the transport services as a foundation for optimising the underlying services with regard to latency.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. Time-dependent applications

While completion time for bulk data transfer is deemed important, the focus of industry and research has lately shifted towards understanding the diversity of Internet traffic today. Many flows are in some way application limited, that is, their sending pattern is not exclusively determined by congestion control, but also by the timing of data received from the application. Such applications are in many cases time-dependent, since the events driving the data production are triggered by real-life interaction or events [AP09].

While downloading applications dominated the Internet for a long time, overprovisioning in backbone networks has allowed interactive applications to succeed. Audio and video conferences that previously required leased lines are conducted over the Internet. Multiplayer games that are played over the Internet are also prevalent. Even Web applications generate increasingly interactive Internet traffic as more web pages are generated dynamically and contain interactively updated elements. The flows of such time-dependent applications now represent a large proportion of the total number of flows in the Internet. These flows have to tackle a large variety of access network technologies, all with different characteristics. Depending on the type of application, "low latency" may carry several meanings from a transport viewpoint. The next section elaborates on different classes of latency.

2.1. On the characteristics of latency-sensitive traffic

The scenarios described in this section are derived from a set of known latency-creating factors from which networked applications suffer. These categories are not mutually exclusive; an application may suffer from more than one of them. They are, however, distinguishable at the transport layer and may require different avoidance techniques. The main categories of application latency that have been identified so far are:

(1)

Real-time interaction for applications with a lasting duration:

(a): Per-packet latency: applications that have signalling-like communication driven by actions or events. Each small message is equally important, and congestion control does not limit the flow since a queue never builds on the sender side. Examples include sensors triggered by events or online gaming traffic.
(b): Per-burst latency: interactive elements are sent in bursts over a persistent connection. Collapsing the congestion window (cwnd) between bursts increases the per-burst completion time, and thus the delivery latency for each burst, while keeping the cwnd open may cause overestimation of the available bandwidth. Examples include video streaming over persistent connections and financial applications coordinated by trading synchronisation barriers.
(c): Per-burst latency with multiple reconnects: the above scenario, but where the stream makes new connections at regular intervals. This behaviour aggravates the bursty scenario by adding connection initialisation latency and start-up latency to each newly initialised connection. Examples include several methods of adaptive TCP-based video streaming and Web-based Content Management Systems.

(2)

Start-up latency: the time it takes for a connection to find its correct send rate. The faster the available bandwidth can be determined and the flow reaches the correct send rate, the lower the latency. Examples include non-adaptive high-quality video-on-demand delivery and Cloud-based office applications.

(3)

Flow completion time: when a browser opens a range of different connections to load a specific webpage, the latency experienced by the user is determined by the completion time of the subflows. Such flows are often very small, frequently no more than one packet, which motivates measures like increasing the initial window. Examples include heavily styled web pages, messaging systems and news tickers, but this problem also affects interactive web browsing.

There is also the latency induced by extra RTTs needed to set up a connection, for instance to initiate a security protocol or to negotiate options.
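The effect of the window size on per-burst latency, start-up latency and flow completion time can be illustrated with some simple arithmetic. The following is a minimal sketch, not part of any specification: the function name is hypothetical, and it assumes an idealised slow start with no loss, no delayed ACKs, no receiver-window limit and a window that doubles every round trip.

```python
def slow_start_rounds(segments: int, initial_window: int) -> int:
    """Count the round trips needed to send `segments` full-sized segments.

    Idealised model (an assumption of this sketch): the sender transmits a
    full window each round trip and the window doubles after every round.
    """
    if segments <= 0:
        return 0
    if initial_window <= 0:
        raise ValueError("initial_window must be positive")
    rounds, sent, window = 0, 0, initial_window
    while sent < segments:
        sent += window      # one window's worth of segments per round trip
        window *= 2         # slow start doubles the window each round
        rounds += 1
    return rounds


if __name__ == "__main__":
    # A 30-segment burst sent from a collapsed or freshly opened window.
    for iw in (1, 3, 10):
        print(f"initial window {iw:2d}: {slow_start_rounds(30, iw)} round trips")
```

Under these assumptions, a burst that starts from a small window needs several more round trips than one that starts from a large window, which is the cost paid when the cwnd is collapsed between bursts or when each burst uses a new connection.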

There are flows that fluctuate between the behaviours described above. In such cases, care must be taken to not blindly apply mechanisms that will reduce the latency for one of the cases, but increase latency for others. In addition, some particular applications suffer more when there is a large variation in latency (jitter) than from a somewhat higher mean latency. This includes applications with real-time interaction, such as on-line games. In general, latency is characterised by a number of features including its higher order moments and distribution. The most important set of features varies between different latency-sensitive applications. There is a need to consider all of these traffic behaviours to properly address the topic of latency for transport services.

3. Challenges in identifying traffic characteristics

It can be hard to reliably identify the flow characteristics from the viewpoint of the transport layer. When a flow starts, the transport can make no assumptions about what its traffic pattern will be [MF14], unless it guesses from the 5-tuple. For the transport to make a qualified decision on which protocols and options to apply to a given flow to reduce latency, it needs more information, which can be obtained in two ways:

(1): The transport service tries to identify the traffic characteristics of the flow. This is a possibility for long-lasting flows that can be instrumented by the transport service. Such monitoring is, however, challenging since the experienced behaviour is dependent on network characteristics like RTT and bottleneck capacity.
(2): The application informs the transport layer about its intended behaviour. For short flows and for reducing start-up latency, this is the only option, as no information is yet available about the flow's characteristics.
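The two sources of information can be pictured as two optional fields attached to a flow. The sketch below is purely illustrative: the type names, field names and the rule of preferring the application's declaration over observation are assumptions of this example and are not defined by this document or by any existing API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LatencyClass(Enum):
    """Latency categories, named after the classes discussed in Section 2.1."""
    PER_PACKET = "per-packet"
    PER_BURST = "per-burst"
    PER_BURST_RECONNECT = "per-burst with reconnects"
    START_UP = "start-up"
    FLOW_COMPLETION = "flow completion"


@dataclass
class FlowKnowledge:
    """What the transport knows about one flow (hypothetical structure)."""
    declared: Optional[LatencyClass] = None  # told by the application
    observed: Optional[LatencyClass] = None  # inferred by monitoring the flow


def best_known_class(flow: FlowKnowledge) -> Optional[LatencyClass]:
    """Prefer the application's declaration; fall back to observation.

    Returns None when neither source is available, which is the situation
    of a new flow whose application has declared nothing.
    """
    return flow.declared if flow.declared is not None else flow.observed
```

A new flow without a declaration yields no class at all, which mirrors the point that, for short flows, only the application can supply the information in time.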

For the transport layer to be able to identify relevant traffic characteristics, it is useful to review the metrics that are available to the transport layer. Examples of metrics are:

(1): Packets in flight: "FLIGHT SIZE", according to the TCP congestion control specification [RFC5681], is the amount of data that has been sent but not yet cumulatively acknowledged; it is counted here in packets. For efficient retransmission in reliable protocols this is an important metric, as a flow with fewer than four packets in flight cannot generate the three duplicate ACKs needed to trigger a fast retransmit. This leads to high recovery delays for many application-limited flows. Packets in flight is, however, not a static indicator of traffic behaviour, as it is dependent on the RTT of the connection.
(2): Packet intertransmission time: the rate at which the application delivers data to the transport layer over time gives an indication of the traffic characteristics of the application. A constantly application-limited flow will not need a tuned congestion control, but will be sensitive to recovery delays as discussed in the item on "Packets in flight".
(3): Payload size: the payload size is application-specific and does not relate to any network phenomena. Time-dependent, event-driven traffic often sends packets with payload sizes smaller than the maximum transmission unit (MTU) [AP09]. For greedy streams, the packets will all fill an MTU.
(4): Stream duration: analysis of Internet traffic shows that a majority of the flows are very short in duration [MF14]. When a browser loads a webpage, dozens of connections are usually opened, each transferring a small part of the webpage content. Streams carrying event-based or interactive communication, on the contrary, are usually persistent and longer-lasting. Thus, measuring whether a stream terminates within a short time interval (less than one second) can provide information that may help predict the continued behaviour of the flow.
(5): Burstiness: if a flow has a traffic pattern with bursts of activity followed by periods of inactivity, getting up to speed quickly in the active periods will be important to the application. The size of each burst is also relevant to the choices made by the transport layer.
(6): Send queue backlog: a flow that is network limited will build a send queue while waiting for data to be transferred. By monitoring the send queue size, the transport layer can get relevant information about the flow.

The combination of information provided by the above listed metrics may help the transport services to make qualified decisions on the flow characteristics and choose the right services for reducing latency.
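Several of the metrics listed above can be derived from a simple log of send events. The following is a minimal sketch under stated assumptions: the record layout, the function name, the 1448-byte full-sized payload and the 0.5 second idle gap that separates bursts are all illustrative choices, not values taken from this document. Packets in flight and send queue backlog need acknowledgement and socket-buffer state and are therefore not covered.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Packet:
    send_time: float  # seconds since the flow started
    payload: int      # payload size in bytes


def flow_metrics(packets, full_payload=1448, idle_gap=0.5):
    """Summarise a non-empty, time-ordered list of sent packets.

    `full_payload` and `idle_gap` are assumed thresholds for this sketch.
    """
    # Intertransmission times between consecutive packets.
    gaps = [b.send_time - a.send_time for a, b in zip(packets, packets[1:])]
    # A new burst starts whenever the sender was idle for longer than idle_gap.
    bursts = 1 + sum(1 for g in gaps if g > idle_gap)
    full = sum(1 for p in packets if p.payload >= full_payload)
    return {
        "duration": packets[-1].send_time - packets[0].send_time,
        "mean_intertransmission": mean(gaps) if gaps else 0.0,
        "mean_payload": mean(p.payload for p in packets),
        "full_sized_fraction": full / len(packets),
        "bursts": bursts,
    }


if __name__ == "__main__":
    trace = [Packet(0.0, 100), Packet(0.1, 100),
             Packet(1.1, 1448), Packet(1.2, 1448)]
    for name, value in flow_metrics(trace).items():
        print(f"{name}: {value}")
```

A low full-sized fraction combined with long intertransmission times would suggest an application-limited flow, whereas a fraction close to one suggests a greedy stream. Any actual decision thresholds would have to be chosen with the RTT dependence noted above in mind.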

4. Examples of choices of protocols and options that influence latency

List examples of protocols and options that influence latency.

4.1. Protocols

Examples of protocols with a short discussion on latency implications.

TCP:
UDP:
SCTP:
Other: DCCP ++?

4.2. Options and mechanisms

Examples of protocol options and mechanisms with a short discussion on their influence on latency.

Nagle's algorithm (delays small packets)
Delayed ACKs (delays feedback)
Limited transmit
Early retransmit
RTO restart
New CWV
TFO
PR-SCTP
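As an illustration of how an application selects one of these options today, the sketch below disables Nagle's algorithm on a TCP socket through the standard sockets API. The helper name is hypothetical; TCP_NODELAY itself is the widely available socket option. Most of the other mechanisms in the list are controlled by the operating system rather than per socket, and how they are exposed differs between platforms, so they are not shown.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create an unconnected TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY asks the stack to send small segments immediately instead
    # of holding them back while earlier data is still unacknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


if __name__ == "__main__":
    s = make_low_latency_socket()
    enabled = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
    print("TCP_NODELAY enabled:", enabled)
    s.close()
```

Disabling Nagle's algorithm helps per-packet latency for small, event-driven messages, at the cost of more small packets on the wire. It is exactly this kind of per-option knowledge that the application developer currently needs to have.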

5. Discussion

Whether properties should be submitted by the applications in addition to services is an item for discussion and should be treated in future revisions of the document.

6. IANA Considerations

This memo includes no request to IANA.

7. Security Considerations

This document does not raise any new security issues.

8. References

8.1. Normative References

[RFC2119]	Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC5681]	Allman, M., Paxson, V. and E. Blanton, "TCP Congestion Control", RFC 5681, September 2009.

8.2. Informative References

[AP09]	Petlund, A.P., "Improving Latency for interactive, thin-stream applications over reliable transport", Thesis Unipub, Kristian Ottosens hus, Pb. 33 Blindern, 0313 Oslo, December 2009.
[MF14]	Fuchs, M.F., "Time-Dependent Thin Transport Layer Streams: Characterization, Empirical Observation and Protocol Support", Master Thesis University of Kaiserslautern, Germany, January 2014.

Author's Address

Andreas Petlund
Simula Research Laboratory
Rolfsbukta 4 B
Fornebu, 1364
Norway

Phone: +47 99 27 36 22
EMail: apetlund@simula.no