Network Working Group M. Kuehlewind, Ed.
Internet-Draft B. Trammell, Ed.
Intended status: Informational ETH Zurich
Expires: September 19, 2016 March 18, 2016
Use Cases for a Substrate Protocol for User Datagrams (SPUD)
draft-kuehlewind-spud-use-cases-01
Abstract
This document identifies use cases for explicit cooperation between
endpoints and middleboxes in the Internet under endpoint control.
These use cases range from relatively low-level applications
(improving the ability for UDP-based protocols to traverse firewalls)
through support for new transport services (in-flow prioritization
for graceful in-network degradation of media streams). They are
intended to provide background for deriving the requirements for a
Substrate Protocol for User Datagrams (SPUD), as discussed at the IAB
Stack Evolution in a Middlebox Internet (SEMI) workshop in January
2015 and the SPUD BoF session at IETF 92 in March 2015.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 19, 2016.
Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
Kuehlewind & Trammell Expires September 19, 2016 [Page 1]
Internet-Draft SPUD Use Cases March 2016
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
1.1. Principles and Assumptions
1.1.1. Trust and Integrity
1.1.2. Endpoint Control
1.1.3. Least Exposure
2. Firewall Traversal for UDP-Encapsulated Traffic
2.1. Problem Statement
2.2. Information Exposed
2.3. Mechanism
2.4. Deployment Incentives
2.5. Security, Privacy, and Trust
3. On-Path State Lifetime Discovery and Management
3.1. Problem Statement
3.2. Information Exposed
3.3. Mechanism
3.4. Deployment Incentives
3.5. Security, Privacy, and Trust
4. Path MTU Discovery
4.1. Problem Statement
4.2. Information Exposed
4.3. Mechanism
4.4. Deployment Incentives
4.5. Security, Privacy, and Trust
5. Low-Latency Service
5.1. Problem Statement
5.2. Information Exposed
5.3. Mechanism
5.4. Deployment Incentives
5.5. Security, Privacy, and Trust
6. Reordering Sensitivity
6.1. Problem Statement
6.2. Information Exposed
6.3. Mechanism
6.4. Deployment Incentives
6.5. Security, Privacy, and Trust
7. Application-Limited Flows
7.1. Problem Statement
7.2. Information Exposed
7.3. Mechanism
7.4. Deployment Incentives
7.5. Security, Privacy, and Trust
8. Priority Multiplexing
8.1. Problem Statement
8.2. Information Exposed
8.3. Mechanism
8.4. Deployment Incentives
8.5. Security, Privacy, and Trust
9. In-Band Measurement
9.1. Problem Statement
9.2. Information Exposed
9.3. Mechanism
9.4. Deployment Incentives
9.5. Security, Privacy, and Trust
10. IANA Considerations
11. Security Considerations
12. Acknowledgments
13. Informative References
Authors' Addresses
1. Introduction
This document describes use cases for a common Substrate Protocol for
User Datagrams (SPUD) that could be used by superstrate transport or
application protocols to explicitly expose information to and
exchange information with middleboxes about application traffic and
network conditions.
For each use case, we first describe a problem that is difficult or
impossible to solve with presently deployable protocols within the
present Internet architecture. We then discuss which information is
exposed by endpoints about the traffic sent, and/or by SPUD-aware
middleboxes and routers about the path that traffic will traverse.
We also suggest potential mechanisms to use that exposed information
at middleboxes and/or endpoints, in order to demonstrate the
feasibility of applying the exposed information to the given use
case. The described mechanisms are not necessarily proposals for
moving forward, nor do they necessarily represent the best approach
for applying the exposed information, but should illustrate and
motivate the applicability of the exposed information. We further
discuss incentives for deployment and any security, privacy, and
trust issues that arise in exposing and/or making use of the
information.
1.1. Principles and Assumptions
We make a few assumptions about first principles in elaborating these
use cases.
1.1.1. Trust and Integrity
In this document, we assume no pre-existing trust relationship
between the communication endpoints and any middlebox or router on
the path. We must therefore always assume that information that is
exposed can be incorrect, and/or that the information will be
ignored.
This implies that while endpoints can verify the integrity of
information exposed by remote endpoints, they cannot verify the
integrity of information exposed by middleboxes. Middleboxes cannot
verify the integrity of any information at all. In limited
situations where a trust relationship can be established, e.g.,
between a managed end-user device in an enterprise network and a
corporate firewall, this verifiability can be improved.
1.1.2. Endpoint Control
We further assume that all information exposure by middleboxes
happens under explicit endpoint control. For that reason, the
information exposed by middleboxes in this document takes only two
forms. In the first form, "accumulation", the endpoint creates space
in the header for middleboxes to use to signal to the remote
endpoint, which then sends the information back to the originating
endpoint via a feedback channel. In the second form, the middlebox
sends a packet directly back to the endpoint with additional
information about why a packet was dropped. Other communications
patterns may be possible, depending on the first principles chosen;
this is a subject of future work.
1.1.3. Least Exposure
Additionally, this document follows the principle of least exposure:
in each use case, we attempt to define the minimum amount of
information exposed by endpoints and middleboxes required by the
proposed mechanism to solve the identified problem. In addition to
being good engineering practice, this approach reduces the risk to
privacy through inadvertent irrelevant metadata exposure, reduces the
amount of information available for application fingerprinting, and
reduces the risk that exposed information could otherwise be used for
unintended purposes.
2. Firewall Traversal for UDP-Encapsulated Traffic
We presume, following an analysis of requirements in
[I-D.trammell-spud-req], as well as trends in transport protocol
development (e.g. QUIC, the RTCWEB data channel), that UDP
encapsulation will prove a viable approach for deploying new
protocols in the Internet. This, however, leads us to a first
problem that must be solved.
2.1. Problem Statement
UDP is often blocked by firewalls, or only enabled for a few well-
known applications (e.g. DNS, NTP). Recent measurement work has
shown that somewhere between 4% and 8% of Internet hosts may be
affected by UDP impairment, depending on the population studied.
Some networks (e.g. enterprise networks behind corporate firewalls)
are far more likely to block UDP than others (e.g. residential
wireline access networks).
In addition, some network operators assume that UDP is not often used
for high-volume traffic, and is often a source of spoofing or
reflected attack traffic, and is therefore safe to block or
rate-limit. This assumption is becoming less true than it once was: the
volume of (good) UDP traffic is growing, mostly due to voice and
video (real-time) services (e.g. RTCWEB) where TCP is not suitable.
Even if firewall vendors and administrators are willing to change
firewall rules to allow more diverse UDP services, it is hard to
track session state for UDP traffic. As UDP is unidirectional, it is
unknown whether the receiver is willing to accept the connection.
Further, there is no way to determine how long state must be maintained
once established. To efficiently establish state along the path we
need an explicit contract, as is done implicitly with TCP today.
2.2. Information Exposed
To maintain state in the network, it must be possible to easily
assign each packet to a session that is passing a certain network
node. This state should be bound to something beyond the five-tuple
to link packets together. In [I-D.trammell-spud-req], we propose the
use of identifiers for groups of packets, called "tubes". This
allows for differential treatment of different packets within one
five-tuple flow, presuming the application has control over
segmentation and can provide requirements on a per-tube basis. Tube
IDs must be hard to guess: a tube ID in addition to a five-tuple as
an identifier, given significant entropy in the tube ID, provides an
additional assurance that only devices along the path or devices
cooperating with devices along the path can send packets that will be
recognized by middleboxes and endpoints as valid.
Further, to maintain state, the sender must explicitly indicate the
start and end of a tube to the path, while the receiver must confirm
connection establishment. This, together with the first packet
following the confirmation, provides a guarantee of return
routability; i.e. that the sender is actually at the address it says
it is. This implies all SPUD tubes must be bidirectional, or at
least support a feedback channel for this confirmation. Even though
UDP is not a bidirectional transport protocol, often services on top
of UDP are bidirectional anyway. Even if not, we only require one
packet to acknowledge a new connection. This is low overhead for
this basic security feature. This connection set-up should not
impose any additional start-up latency, so the sender must also be
able to send payload data in the first packet.
If a firewall blocks a SPUD packet, it can be beneficial for the
sender to know why the packet was blocked. Therefore a SPUD-aware
middlebox should be able to send error messages. Such an error
message can either be sent directly to the sender itself, or
alternatively to the receiver, which can decide whether or not to
forward the error message to the sender.
2.3. Mechanism
A firewall or middlebox can use the tube ID as an identifier for its
session state information. If the tube ID is large enough it will be
hard for a non-eavesdropping attacker to guess the ID.
If a firewall receives a SPUD message that signals the start of a
connection, it can decide to establish new state for this tube.
Alternatively, it can also forward the packet to the receiver and
wait to see whether the connection is wanted before establishing
state. To avoid forwarding unknown payload, a firewall might want to
forward the initial SPUD packet without payload and only send the
full packet if the connection has been accepted by the receiver.
The firewall must still maintain a timer to delete the state of a
tube if no packets were received for a while. However, if an end
signal is received the firewall can remove the state information
faster.
If a firewall receives a SPUD message which does not indicate the
start of a new tube and no state is available for this tube, it may
decide to block the traffic. This can happen if the state has
already timed out or if the traffic was rerouted. In addition a
firewall may send an error message to the sender or the receiver
indicating that no state information is available. If the sender
receives such a message it can resend a start signal (potentially
together with other tube state information) and continue its
transmission.
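As an illustration only, the state handling described in this section
could be sketched as follows. The class and method names, the signal
labels ("start", "ack", "end"), and the 30-second idle timeout are
assumptions made for this sketch and are not defined by SPUD. The
five-tuple is assumed to be normalized so that both directions of a
flow map to the same key.

```python
from dataclasses import dataclass


@dataclass
class TubeState:
    confirmed: bool = False   # receiver has acknowledged the tube
    last_seen: float = 0.0    # time of the last packet seen for the tube


class TubeFirewall:
    """Toy per-tube state table; every name and value here is hypothetical."""

    def __init__(self, idle_timeout=30.0):
        self.idle_timeout = idle_timeout
        self.tubes = {}  # (five_tuple, tube_id) -> TubeState

    def expire(self, now):
        """Delete state for tubes that have been idle longer than the timeout."""
        stale = [key for key, state in self.tubes.items()
                 if now - state.last_seen > self.idle_timeout]
        for key in stale:
            del self.tubes[key]

    def handle(self, five_tuple, tube_id, signal, now):
        """Return "forward" or "drop-no-state" for one SPUD packet.

        signal is "start", "ack", "end", or None for a plain data packet.
        """
        self.expire(now)
        key = (five_tuple, tube_id)
        if signal == "start":
            # Start of a tube: establish new (not yet confirmed) state.
            self.tubes[key] = TubeState(last_seen=now)
            return "forward"
        state = self.tubes.get(key)
        if state is None:
            # No state: block, and possibly send an error message so the
            # sender can resend a start signal.
            return "drop-no-state"
        state.last_seen = now
        if signal == "ack":
            state.confirmed = True
        elif signal == "end":
            # Explicit end signal: remove state without waiting for the timer.
            del self.tubes[key]
        return "forward"
```

In this sketch a packet for an unknown or expired tube is dropped until
the sender resends a start signal, which matches the recovery path
described above.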
2.4. Deployment Incentives
The ability to use existing firewall management best practices with
new transport services over SPUD is necessary to ensure the
deployability of SPUD. In today's Internet, application developers
really only have two choices for transport protocols: TCP, or
transports implemented at the application layer and encapsulated over
UDP. SPUD provides a common shim layer for the second case, and the
firewall traversal facility it provides makes these transports more
likely to deploy.
It is not expected that the information provided by SPUD will enable
all generic UDP-encapsulated transports to safely pass firewalls.
However, it does make state handling easier for new services that a
firewall administrator is willing to allow.
2.5. Security, Privacy, and Trust
The tube ID is scoped to the five-tuple. While this makes the tube
ID useless for session mobility, it does mean that the valid ID space
is sufficiently sparse to maintain the "hard to guess" property, and
prevents tube IDs from being misused to track flows from the same
endpoint across multiple addresses. This limitation may need further
discussion.
By providing information about connection setup, SPUD exposes
information equivalent to that available in the TCP header. It makes
connection lifetime information explicit and accessible without
specific higher-layer/application-level knowledge.
3. On-Path State Lifetime Discovery and Management
Once the problem of connection setup is solved, the problem arises of
managing the lifetime of state associated with that connection at
various devices along the path: NAT and stateful firewall state
timeouts are a common cause of connectivity issues in the Internet.
3.1. Problem Statement
Devices along the path that must keep state in order to function
cannot assume that signals tearing down a connection are provided
reliably. This is also the case for current TCP traffic. Therefore,
all stateful on-path devices must implement a mechanism to remove the
state if no traffic is seen for a given flow or tube for a while.
Usually this is implemented by maintaining a timeout since the last
observed packet.
If the timeouts are set too low, on-path state might be discarded
while the endpoint connection is still alive; in the case of
firewalls and NATs, this can lead to unreliable connectivity. The
common solution to this problem is for applications or transport
protocols that have no productive traffic to send to emit
"heartbeat" or "keep-alive" packets to reset the state timeout along
the path. However, since the minimum timeout along the path is
unknown to the endpoint, implementers of transport and application
protocols must guess at a suitable heartbeat interval. A default
value of 150ms is commonly used today. This represents a
fairly rapid generation of nonproductive traffic, and is especially
onerous on battery-powered mobile devices, which must wake up radios
and switch to a higher-power mode to transmit these nonproductive
packets, leading to suboptimal power usage and shorter battery life.
3.2. Information Exposed
SPUD can be used to request that SPUD-aware middleboxes along the
path expose their minimum state timeout value. Here, the sending
endpoint sends an "accumulate minimum timeout" request along with some
scratch space for middleboxes to place their timeout information in.
Each middlebox inspects this value, and writes its own timeout only
if lower than the present value.
Applications may also send a "timeout proposal" to devices along the
path using a SPUD declaration that a given tube will send a packet at
least once per interval, and if no packet is seen within that
interval, it is safe to tear down state.
These two declarations may be used together, with middleboxes willing
to use the application's value setting their timeouts on a per-tube
basis, or exposing a lower timeout value to allow the application to
adjust.
3.3. Mechanism
If a SPUD-aware middlebox that uses a timeout to clean up per-tube
state receives a SPUD minimum timeout accumulation, it should expose
its own timeout value if smaller than the one already given.
Alternatively, if a value is already given, it might decide to use
the given value as timeout for the state information of this tube.
An endpoint receiving an accumulated minimum timeout should send it
back to its remote peer via a feedback channel. Timeouts on each
direction of a connection between two endpoints may, of course, be
different, and are handled separately by this mechanism.
If a SPUD-aware middlebox that uses a timeout to clean up per-tube
state receives a timeout proposal, it should set its timeout
accordingly, subject to its own policy and configuration.
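A minimal sketch of the two middlebox behaviors described above is
given below. The function names, and the use of None to represent
scratch space that no middlebox has written yet, are assumptions of
this sketch; timeouts are in seconds.

```python
def accumulate_min(scratch, own_timeout):
    """Middlebox step: write our timeout only if lower than the present value.

    scratch is None while no middlebox has written a value (an assumption of
    this sketch; the encoding of empty scratch space is not specified here).
    """
    if scratch is None or own_timeout < scratch:
        return own_timeout
    return scratch


def accept_proposal(proposed, policy_max):
    """Middlebox step: follow a timeout proposal, capped by local policy."""
    return min(proposed, policy_max)


def path_minimum(middlebox_timeouts):
    """Value the receiving endpoint would feed back to its remote peer."""
    scratch = None
    for timeout in middlebox_timeouts:
        scratch = accumulate_min(scratch, timeout)
    return scratch
```

For a path with three SPUD-aware middleboxes using timeouts of 120, 30,
and 300 seconds, path_minimum([120.0, 30.0, 300.0]) yields 30.0, which
the receiver would then return over the feedback channel.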
These mechanisms are of course completely advisory: there may be
non-SPUD-aware middleboxes on path which will ignore any proposed
timeout and not expose their timeout information, and middleboxes
must be configured with the maximum timeout proposal they will accept
in order to defend against state exhaustion attacks.
Endpoints must therefore combine the use of these signals with a
dynamic timeout discovery and adaptation mechanism, which uses the
signals to set initial guesses as to the path timeout.
3.4. Deployment Incentives
Initially, if not widely deployed, there will not be much benefit to
using this extension.
However, we can assume that there are usually only a small number of
middleboxes on a given network path that hold per-tube state
information. Endpoints have an incentive to request minimum timeout
and to propose timeouts to improve convergence time for dynamic
timeout adaptation mechanisms, and middleboxes have an incentive to
cooperate to improve reliability of connections as well as state
management. It is therefore likely that if information is exposed by
a middlebox, this information is correct and can be used.
The more SPUD gets deployed, the more often endpoints will be able to
set the heartbeat interval correctly. This will reduce the amount of
unproductive traffic as well as the number of reconnections that
cause additional latency.
Likewise, SPUD-aware middleboxes that expose timeout information are
able to handle timeouts more flexibly, e.g. announcing lower timeout
values when they have less space available for new state. Further, if
an endpoint announces a low pre-set value because the endpoint knows
that it will only have short idle periods, the timeout interval could
be reduced.
3.5. Security, Privacy, and Trust
Timeout proposals increase the risk of state exhaustion attacks for
SPUD-aware middleboxes that naively follow them. Likewise,
accumulated minimum timeouts could be used by malicious middleboxes
to induce floods of useless heartbeat traffic along the path, and/or
exhaust resources on endpoints that naively follow them. All timeout
proposals and minimum timeouts must therefore be inputs to a dynamic
timeout selection process, both at endpoints and on-path devices,
which use these signals as hints but clamp their timeouts to sane
values set by local policy.
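The clamping described here might look like the following on the
endpoint side. This is a sketch only: the function name, the default
of 15 seconds, the floor and ceiling of 5 and 120 seconds, and the
safety factor of 0.5 are hypothetical policy values chosen for
illustration, not recommendations.

```python
def heartbeat_interval(hint, default=15.0, floor=5.0, ceiling=120.0, safety=0.5):
    """Pick a heartbeat interval in seconds from an untrusted path timeout hint.

    hint is the accumulated minimum timeout, or None if none was received.
    All default parameter values are hypothetical local policy.
    """
    # Send heartbeats well within the advertised timeout, or fall back to
    # a local default when the path exposed nothing.
    candidate = default if hint is None else hint * safety
    # Clamp to local policy so that a bogus hint cannot induce heartbeat
    # floods (too small) or let state silently expire (too large).
    return max(floor, min(ceiling, candidate))
```

With these values, a hint of 60 seconds gives an interval of 30
seconds, while an implausibly small hint such as one millisecond is
clamped to the 5-second floor.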
While device timeout and heartbeat interval are generally not linked
to privacy-sensitive information, a timeout proposal may add a number
of bits of entropy to an endpoint's unique fingerprint. It is
therefore advisable to suggest a small number of useful timeout
proposals, in order to reduce this value's contribution to an
endpoint fingerprint.
4. Path MTU Discovery
Similar to the state timeout problem is the Path MTU problem:
differing MTUs on different devices along the path can lead to
fragmentation or connectivity issues. This problem is made worse by
the increasing proliferation of tunnels in the Internet, which reduce
the MTU by the amount required for tunnel headers.
4.1. Problem Statement
In order to efficiently send packets along a path end to end, they
must be sized to fit in the MTU of the "narrowest" link along the
path. Algorithms for path MTU discovery have been defined and
standardized for a quarter century, in [RFC1191] for IPv4 and
[RFC1981] for IPv6, but they are not often implemented due in part to
widespread impairment of ICMP. Packetization Layer Path MTU
Discovery [RFC4821] (PLPMTUD) is a more recent attempt to solve the
problem, which has the advantage of being transport-protocol
independent and functional without ICMP feedback. SPUD, as a shim
between UDP and superstrate transport protocols, is at the right
place in the stack to implement PLPMTUD, and explicit cooperation can
enhance its operation.
4.2. Information Exposed
SPUD can be used to request that SPUD-aware middleboxes along the
path expose their next-hop path MTU value. Here, the sending
endpoint sends an "accumulate minimum MTU" request along with some
scratch space for middleboxes to place the next-hop MTU for the given
tube. Each middlebox inspects this value, and writes its own next-
hop MTU only if lower than the present value.
A SPUD-aware middlebox that receives a packet that is too big for the
next-hop MTU can send back a signal associated with the tube directly
to the sender, including the next-hop MTU.
4.3. Mechanism
PLPMTUD functions by dynamically increasing the size of packets sent,
and reacting to the loss of the first "too large" packet as an MTU
reduction signal, instead of a congestion signal. This must be
implemented in cooperation with the superstrate transport protocol,
as it is responsible for how non-MTU-related loss is treated.
When an endpoint receives an accumulated minimum MTU, it should
send it back to its remote peer via a feedback channel. The
minimum of this value and any direct next-hop MTU signals received
from SPUD-aware middleboxes can be used as a hint to the sender's
PLPMTUD process, as a likely upper bound for path MTU associated with
a tube.
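The combination of the two signals into a PLPMTUD hint can be sketched
as below. The function name and parameters are hypothetical. The
floor of 1280 bytes is the IPv6 minimum link MTU and is used here only
as an example of a local policy bound; the local interface MTU of 1500
bytes is likewise an assumption.

```python
def mtu_hint(accumulated_min, next_hop_signals, local_mtu=1500, floor=1280):
    """Return a likely upper bound (in bytes) for the PLPMTUD search on a tube.

    accumulated_min is the fed-back accumulated minimum MTU, or None.
    next_hop_signals is a list of next-hop MTUs received directly from
    SPUD-aware middleboxes. local_mtu and floor are hypothetical policy
    values of this sketch.
    """
    candidates = [local_mtu]
    if accumulated_min is not None:
        candidates.append(accumulated_min)
    candidates.extend(next_hop_signals)
    # The result is only a hint: it is bounded below by local policy, and
    # the PLPMTUD process remains free to probe above it.
    return max(floor, min(candidates))
```

For example, an accumulated minimum of 1400 bytes together with a
direct next-hop signal of 1380 bytes gives a hint of 1380 bytes.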
4.4. Deployment Incentives
As with state lifetime discovery, these signals are of little initial
utility to endpoints before SPUD-aware middleboxes are deployed.
However, SPUD-aware middleboxes that sit at potential MTU breakpoints
along a path, either those which terminate tunnels or bridge networks
with two different link types, have an incentive to improve
reliability by responding to accumulation requests and sending next-
hop MTU messages to SPUD-aware endpoints.
4.5. Security, Privacy, and Trust
As with state lifetime discovery, minimum MTU and next-hop MTU
signals could be used by malicious middleboxes to set the endpoint's
maximum packet size to inefficiently small sizes, if the endpoint
follows them naively. For that reason, endpoints should use this
information only as hints to improve the operation of PLPMTUD, and
may probe above the value derived from the SPUD-supplied information
when deemed appropriate by endpoint policy or transport protocol
requirements.
5. Low-Latency Service
5.1. Problem Statement
Networks are often optimized for low loss rates and high throughput
by providing large buffers that can absorb traffic spikes and rate
variations while holding enough data to keep the link full. This is
beneficial for applications like high-priority bulk transfer, where
only the total transfer time is of interest. High-volume interactive
applications, such as videoconferencing, however, have very different
requirements. Usually these applications can tolerate higher loss
rates, while having hard latency requirements.
Large network buffers may induce high queuing delays due to cross
traffic using loss-based congestion control, which must periodically
fill the buffer to induce loss during probing for additional
bandwidth. This queueing delay can negatively impact the quality of
experience for competing interactive applications, even making them
unusable.
5.2. Information Exposed
The simplest mechanism for solving this problem is to separate loss-
sensitive from latency-sensitive traffic, as proposed using DSCP
codepoints in [I-D.you-tsvwg-latency-loss-tradeoff]. This signal
could also be emitted as a per-packet signal within SPUD, since DSCP
codepoints are often used for internal traffic engineering and
therefore cleared at network borders. This indication does not
prioritize one kind of traffic over the other: while loss-sensitive
traffic might face larger buffer delay but lower loss rate, latency-
sensitive traffic has to make exactly the opposite tradeoff.
An endpoint can also indicate a maximum acceptable single-hop
queueing delay per tube, expressed in milliseconds. While this
mechanism does not guarantee that sent packets will experience less
than the requested queueing delay, it can significantly reduce the
amount of traffic uselessly sitting in queues, since at any given
instant only a small number of queues along a path (usually only zero
or one) will be full.
5.3. Mechanism
A middlebox may use the loss/latency tradeoff signal to assign
packets to the appropriate type of service, if different services are
implemented at this middlebox. Traffic not indicating a low loss or
low latency preference would still be assigned to today's best-effort
service, while a new low latency service would be introduced in
addition.
The simplest implementation of such a low latency service (without
disturbing existing traffic) is to manage traffic with the latency-
sensitive flag set in a separate queue. This queue either provides
only a short buffer, which imposes a hard limit on the maximum
(per-queue) delay, or uses an AQM (such as PIE or CoDel) that is
configured to keep the queuing delay low.
In such a two-queue system the network provider must decide about
bandwidth sharing between both services, and might or might not
expose this information. Initially there will only be a few flows
that indicate low latency preference. Therefore at the beginning
this service might have a low maximum bandwidth share assigned in the
scheduler. However, the sharing ratio should be adapted to the
traffic load/number of flows in each service class over long
timescales.
Applications and endpoints setting the latency sensitivity flag on a
tube must be prepared to experience relatively higher loss rates on
that tube, and should use techniques such as Forward Error Correction
(FEC) to cope with these losses.
If a maximum per-hop delay is indicated by the sender, a SPUD-aware
router might drop any packet which would be placed in a queue that
has more than the maximum single-hop delay at that point in time
before queue admission. Thereby the overall congestion can be
reduced early instead of withdrawing the packet at the receiver after
it has blocked network resources for other traffic.
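One way a router could apply this check at queue admission is sketched
below. It assumes, for illustration, that the current queueing delay
is estimated from the queue backlog and the drain rate of the outgoing
link; the function names and this estimator are not part of SPUD.

```python
def queue_delay_ms(queued_bytes, link_rate_bps):
    """Estimate the time (ms) a newly arriving packet would wait in the queue."""
    return queued_bytes * 8 * 1000 / link_rate_bps


def admit(queued_bytes, link_rate_bps, max_hop_delay_ms):
    """Return True to enqueue the packet, False to drop it before admission.

    max_hop_delay_ms is the maximum single-hop delay signaled by the sender,
    or None if the packet carries no such indication.
    """
    if max_hop_delay_ms is None:
        return True
    return queue_delay_ms(queued_bytes, link_rate_bps) <= max_hop_delay_ms
```

With 125,000 bytes queued on a 10 Mbit/s link the estimated delay is
100 ms, so a packet indicating a 50 ms maximum per-hop delay would be
dropped rather than enqueued.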
A transport protocol at an endpoint indicating the maximum per-hop
delay must be aware that it might face higher loss rates under
congestion than competing traffic on the same bottleneck.
5.4. Deployment Incentives
Application developers go to a great deal of effort to make latency-
sensitive traffic work over today's Internet. However, if large
delays are induced by the network, an application at the endpoint
cannot do much. Therefore applications can benefit from further
support by the network.
Network operators have already realized a need to better support low
latency services. However, they want to avoid any service
degradation for existing traffic as well as risking stability due to
large configuration changes. Introducing an additional service for
latency-sensitive traffic that can exist in parallel to today's
network service helps address this problem.
5.5. Security, Privacy, and Trust
An application cannot benefit from wrongly indicating loss- or
latency-sensitivity, as it has to make a tradeoff between low loss
and potential high delay or low delay and potential high loss.
A simple classification of traffic as loss- or latency-sensitive does
not expose privacy-critical information about the user's behavior;
indeed, it exposes far less than presently used by DPI-based traffic
classifiers that would be used to determine the latency sensitivity
of traffic passing a middlebox.
6. Reordering Sensitivity
6.1. Problem Statement
TCP's fast retransmit mechanism interprets the reception of three
duplicate acknowledgements (where the acknowledgement number is the
same as in the previous acknowledgement) as a signal of loss.
However, a packet missing from the sequence number space has not
necessarily been lost. Simple reordering where one packet takes a
longer path than (at least three) subsequent packets can have the
same effect.
In addition, in TCP, loss is an implicit signal for network
congestion. Therefore the reception of three duplicate
acknowledgements will cause a TCP sender to reduce its sending rate.
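The effect of reordering on duplicate acknowledgements can be
illustrated with a small simulation. This is a simplified sketch:
packets are numbered 0, 1, 2, ... rather than by byte sequence
number, one ACK is sent per arriving packet, and the function names
are chosen for this example only.

```python
def cumulative_acks(arrival_order):
    """Return the cumulative ACK sent after each arriving packet.

    The ACK number is the number of the next packet the receiver
    expects in sequence.
    """
    received = set()
    next_expected = 0
    acks = []
    for seq in arrival_order:
        received.add(seq)
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def triggers_fast_retransmit(acks, threshold=3):
    """True if some ACK number is repeated `threshold` times in a row
    after its first occurrence (i.e. `threshold` duplicate ACKs)."""
    last = None
    duplicates = 0
    for ack in acks:
        if ack == last:
            duplicates += 1
            if duplicates >= threshold:
                return True
        else:
            last, duplicates = ack, 0
    return False
```

For the arrival order 0, 2, 3, 4, 1 no packet is lost, yet packet 1
is overtaken by three later packets, the receiver emits three
duplicate acknowledgements, and a sender using this rule would
retransmit and reduce its rate. If packet 1 is overtaken by only two
packets (0, 2, 3, 1, 4), the threshold is not reached.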
To avoid unnecessary performance decreases, today's in-network
mechanisms usually aim to avoid reordering. However, this
complicates these mechanisms significantly and usually requires
per-flow state, e.g. in the case of Equal Cost Multipath (ECMP)
routing, where a hash of the 5-tuple would need to be mapped to the
right path.
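As an illustration of the ECMP case, the following sketch maps a
5-tuple to one of several equal-cost paths so that all packets of a
flow follow the same path. The function name and the choice of
SHA-256 are assumptions made for this example; real routers typically
use cheaper, hardware-friendly hash functions.

```python
import hashlib


def ecmp_path(src_ip, dst_ip, proto, src_port, dst_port, n_paths):
    """Map a 5-tuple to a path index in the range [0, n_paths)."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first four bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:4], "big") % n_paths
```

Because the mapping depends only on the 5-tuple, packets of one flow
cannot overtake each other on different paths; the cost is that a
single flow can never be spread across several paths.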
Even though the majority of traffic in the Internet is still TCP, it
is likely that new protocols will be designed such that they are
(more) robust to reordering. Further, with an increasing deployment
of ECN, even TCP's congestion control reaction based on duplicate
acknowledgements could be relaxed (e.g. by reducing the sending rate
gradually depending on the number of lost packets).
However, as middleboxes cannot know whether a certain traffic flow is
sensitive to reordering or not, they have to treat all traffic
equally and always try to avoid reordering. (This not only
complicates these mechanisms but might also block the deployment of
new services.)
6.2. Information Exposed
Reordering-sensitivity is a per-tube signal (as reordering can only
happen within a flow of multiple packets). However, to avoid state
in middleboxes, it would be beneficial to have a reordering-sensitive
flag in each packet.
A transport should leave the flag unset, indicating that it is not
sensitive to reordering, e.g. if it uses a more advanced mechanism
(than duplicate acknowledgements) for loss detection, or if the
congestion control reaction to this signal imposes only a small
performance penalty, or if the flow is short enough that reordering
will not impact its performance.
6.3. Mechanism
A middlebox that implements an in-network function that could lead to
varying end-to-end delay and reordering (as packets might overtake
each other on different paths or within the network device) does not
need to perform any additional action if the reordering-sensitivity
flag is not set. However, if the flag is set, the middlebox should
avoid reordering, e.g. by holding per-tube state and making sure that
all packets belonging to the same tube will not be reordered.
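The following sketch shows one way a multipath-capable middlebox
could act on the flag, following the semantics of this section (a set
flag asks the middlebox to avoid reordering). The PathSelector class,
the tube identifier, and the round-robin spraying policy are
hypothetical and serve only to illustrate where per-tube state is
needed and where it is not.

```python
import itertools


class PathSelector:
    """Choose an output path for each packet of a tube."""

    def __init__(self, n_paths):
        self._round_robin = itertools.cycle(range(n_paths))
        self._pinned = {}   # per-tube state: tube id -> path index

    def select(self, tube_id, reordering_sensitive):
        if not reordering_sensitive:
            # Flag not set: no state, packets may use any path.
            return next(self._round_robin)
        # Flag set: keep every packet of the tube on one path.
        if tube_id not in self._pinned:
            self._pinned[tube_id] = next(self._round_robin)
        return self._pinned[tube_id]
```

Only tubes that set the flag consume state in the middlebox; all
other traffic is forwarded without any per-tube lookup.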
6.4. Deployment Incentives
Today, middleboxes assume by default that all traffic is
reordering-sensitive, which complicates certain in-network mechanisms
and might also block the deployment of new services. If a middlebox
knew that certain traffic is not reordering-sensitive, it could
reduce state, speed up processing, or even implement new services.
Applications that are not loss-sensitive (e.g. because they use FEC)
usually are also not reordering-sensitive. At the same time, these
applications are often sensitive to latency. If the transport
handles reordering appropriately and signals this semantic
information to the network, the appropriate network treatment can
likely also result in lower end-to-end delay, or at least enable the
network device to avoid imposing any additional delay (e.g. to set up
state) on these packets.
6.5. Security, Privacy, and Trust
No trust relationship is needed, as the provided information does not
result in preferential treatment. Only transport semantics are
exposed, and these do not contain any private information.
7. Application-Limited Flows
7.1. Problem Statement
Many flows are application-limited, where the application itself
adapts the limit to changing traffic conditions or link
characteristics, such as with unicast adaptive bitrate streaming
video. This adaptation is difficult, since TCP cross-traffic will
often probe for available bandwidth more aggressively than the
application's control loop. Further complicating the situation is
the fact that rate adaptation may have negative effects on the user's
quality of experience, and should therefore be done infrequently.
7.2. Information Exposed
A SPUD endpoint sending application-limited traffic can provide an
explicit per-tube indication of the maximum intended data rate needed
by the current encoding or data source. If the bottleneck device is
SPUD-aware, it can use this information to decide how to correctly
treat the tube, e.g. setting a rate limit or scheduling weight if
served from its own queue.
A SPUD endpoint could also send a "minimum rate limit accumulation"
request, similar to the other accumulation requests outlined above,
where SPUD-aware routers and middleboxes could note the maximum
bandwidth available to a tube. Receiving this signal on a feedback
channel could allow a sender to more quickly adapt its sending rate.
This rate limit information might be derived from local per-flow or
per-tube rate limit policy, as well as from current information about
load at the router.
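A minimal sketch of such an accumulation along a path is given below.
The function name, the use of bits per second, and the convention
that a hop without a limit to report is represented by None are
assumptions made for this example; no wire encoding is implied.

```python
def accumulate_min_rate(initial_bps, hop_limits_bps):
    """Return the accumulated rate limit after traversing all hops.

    initial_bps is the value the sender places in the request, and
    hop_limits_bps lists, in path order, the limit each hop would
    report (None for a hop that is not SPUD-aware or has no limit).
    """
    value = initial_bps
    for limit in hop_limits_bps:
        if limit is not None:
            # Each SPUD-aware hop lowers the value to its own limit
            # if that limit is smaller than what it received.
            value = min(value, limit)
    return value
```

For a path whose SPUD-aware hops report 10, 4, and 8 Mbit/s, the
value fed back to the sender would be 4 Mbit/s, i.e. the limit at the
tightest reporting hop.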
These signals can be sent throughout the lifetime of the flow, to
help adapt to changing application demands and/or network conditions.
7.3. Mechanism
The maximum expected data rate exposed by the endpoints could be used
to make routing and queue selection decisions at SPUD-aware routers,
if different paths or queues with different capacity, delay, and load
characteristics are available.
A rate limit indicated by a SPUD-aware router can be used by the
sender to choose an encoding. However, the sender should still
implement a mechanism to probe for available bandwidth to verify the
provided information. As a certain rate limit is expected, the
sender should probe carefully around this rate.
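As an illustration, a sender with a fixed ladder of encodings might
combine the indicated limit with its own measurement as sketched
below. The function name, the encoding ladder, and the policy of
trusting the smaller of the indicated and measured rates are
assumptions for this example, not a recommended algorithm.

```python
def choose_encoding(encodings_bps, indicated_limit_bps,
                    measured_bps=None):
    """Pick the highest encoding rate that fits the usable rate.

    The indicated limit is treated as a hint: if the sender has its
    own estimate from probing, the smaller of the two values is used.
    If no encoding fits, the lowest available encoding is returned.
    """
    budget = indicated_limit_bps
    if measured_bps is not None:
        budget = min(budget, measured_bps)
    fitting = [rate for rate in encodings_bps if rate <= budget]
    return max(fitting) if fitting else min(encodings_bps)
```

With encodings at 0.5, 1.5, 3, and 6 Mbit/s and an indicated limit of
4 Mbit/s, the sender would select 3 Mbit/s, falling back to
1.5 Mbit/s if probing only confirms 2 Mbit/s.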
These mechanisms can also be used for rate increases. If a sender
receives an indication that more bandwidth is available, it should
probe carefully instead of switching to the higher rate immediately,
and decrease its sensitivity to loss (e.g. through the use of
additional FEC), which will provide additional protection as soon as
the new capacity limit is reached. Likewise, a SPUD-aware router
that receives an indication that a flow intends to increase its rate
might prioritize this flow for a certain (short) time to enable a
smoother transition.
7.4. Deployment Incentives
Endpoints that indicate maximum sending rate for application-limited
traffic on SPUD-aware networks allow the operators of those networks
to better handle traffic. This can benefit the service quality and
increase the user's satisfaction with the provided network service.
Currently applications have no good indication when to change their
coding rate. Rate increases are especially hard. Further, frequent
rate changes should be avoided for quality of experience.
Cooperative indication of intended and available sending rate for
application-limited flows can simplify probing, and provide signals
beyond loss to react effectively to congestion.
7.5. Security, Privacy, and Trust
Both endpoints and SPUD-aware middleboxes should react defensively to
rate limit and rate intention information. Endpoints and middleboxes
should use measurement and probing to verify that rate information is
accurate, but the exposed rate information can be used as hints to
routing, scheduling, and rate determination processes.
8. Priority Multiplexing
8.1. Problem Statement
Many services require multiple parallel transmissions to transfer
different kinds of data which have clear priority relationships among
them. For example, in WebRTC, audio frames should be prioritized
over video frames. Sometimes these transmissions happen in different
flows, and sometimes some packets within a flow have higher priority
than others, for example I-frames in video transmissions. However,
current networks will treat all packets the same in case of
congestion and might e.g. drop audio packets while video and control
traffic are still transmitted.
8.2. Information Exposed
A SPUD sender may indicate that one tube should "yield" to another,
i.e. that it should have lower relative priority than another tube
in the same flow. Similarly, individual packets within a tube could
be marked as having lower priority. This information can be used to
preferentially drop less important packets, e.g. those carrying
information that could be recovered by FEC.
With a stronger integration of codec and transport protocols, SPUD
could even indicate more fine-grained priority levels to provide
automatic graceful degradation of service within the network itself.
8.3. Mechanism
Designing a general-purpose mechanism that maps relative priorities
from the yield information exposed via SPUD to correct per-tube and
per-packet treatment at any point in the Internet is an extremely
hard problem and a possible subject for future research. It appears
impossible at this writing to design a straightforward function that
maps these relative per-flow priorities to absolute priorities across
flows in a fair way.
However, in the not-uncommon case that exists in many access
networks, where the bottleneck link has per-user queues and can
enforce per-user fairness, the relative priorities can be mapped to
absolute priorities, and simple priority queueing at the bottleneck
can be used. Lower priority packets within a tube, however, should
be assigned to the tube's priority class, and preferentially dropped
instead, e.g. using a different drop threshold at the queue.
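The per-user case can be sketched as a strict priority scheduler with
one FIFO per priority class, where low-priority packets within a tube
share the tube's class but face a lower drop threshold. All class
names, parameters, and threshold values below are hypothetical.

```python
from collections import deque


class PriorityClassQueue:
    """FIFO for one priority class with two drop thresholds."""

    def __init__(self, capacity, low_prio_threshold):
        self.capacity = capacity
        self.low_prio_threshold = low_prio_threshold
        self.packets = deque()

    def enqueue(self, packet, low_priority=False):
        # Low-priority packets are refused at a shorter queue length.
        limit = self.low_prio_threshold if low_priority else self.capacity
        if len(self.packets) >= limit:
            return False
        self.packets.append(packet)
        return True


class PerUserScheduler:
    """Strict priority queueing for one user; class 0 is highest."""

    def __init__(self, n_classes, capacity, low_prio_threshold):
        self.classes = [PriorityClassQueue(capacity, low_prio_threshold)
                        for _ in range(n_classes)]

    def enqueue(self, packet, tube_class, low_priority=False):
        return self.classes[tube_class].enqueue(packet, low_priority)

    def dequeue(self):
        for queue in self.classes:
            if queue.packets:
                return queue.packets.popleft()
        return None
```

Because low-priority packets stay in their tube's class, they are
never reordered relative to other packets of the same tube; they are
only more likely to be dropped when the queue builds up.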
8.4. Deployment Incentives
Deployment incentives for priority multiplexing are similar to those
for bandwidth declaration for app-limited flows as in Section 7.4:
endpoints that correctly declare priority information will experience
better quality of service on SPUD-enabled networks, and SPUD-enabled
networks get information that allows them to better manage traffic.
8.5. Security, Privacy, and Trust
Since yield information can only be used to disadvantage an
application's traffic relative to its own traffic, there is no
incentive for applications to declare incorrect yielding.
The pattern and relative volume of traffic in different yield classes
may be used to "fingerprint" certain applications, though it is not
clear whether this provides additional information beyond that
contained in inter-packet delay and volume patterns.
9. In-Band Measurement
9.1. Problem Statement
The current Internet protocol stack has very limited facilities for
network measurement and diagnostics. The only explicit measurement
feature built into the stack is ICMP Echo ("ping"). In the meantime,
the Internet measurement community has defined many inference- and
assumption-based approaches for getting better information out of the
network: traceroute and BGP looking glasses for topology information,
TCP sequence number and TCP timestamp based approaches for latency
and loss estimation, and so on. Each of these uses values placed on
the wire for the internal use of the protocol, not for measurement
purposes, and does not necessarily apply to new protocols or survive
changes to the use of those values by protocol implementations.
Approaches involving the encryption of transport protocol and
application headers (indeed, including the one the authors advance in
[I-D.trammell-spud-req]) will break most of these as well.
Replacing the information used for measurement with values defined
explicitly to be used for measurement in a transport protocol
independent way allows explicit endpoint control of measurability and
measurement overhead.
We note that current work in IPPM [I-D.ietf-ippm-6man-pdm-option]
proposes a roughly equivalent, IPv6-only, kernel-implementation-only
facility.
9.2. Information Exposed
The "big five" metrics - latency, loss, jitter, data rate / goodput,
and reordering - can be measured using a relatively simple set of
primitives. Packet receipt acknowledgment using a cumulative nonce
echo allows both endpoint and on-path measurement of loss and
reordering as well as goodput (when combined with layer 3 packet
length headers). A timestamp echo facility, analogous to TCP's
timestamp option but using an explicitly defined, constant-rate clock
and exposing the local delta (the time between receipt and subsequent
transmission), allows measurement of latency and jitter.
The cumulative nonce echo consists of two values: a number
identifying a given packet (nonce), which also identifies all
retransmissions of the packet, and a number which is the sum of all
packet identifiers received from the remote endpoint (echo), modulo
the maximum value of the echo field. Nonces need not be sequential,
or even monotonic, but two packets with the same nonce should not be
simultaneously in flight. These are exposed on a per-packet basis,
but need not appear on every packet in the tube or flow, with the
caveat that lower sampling rates lead to lower sensitivity.
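The arithmetic of the cumulative echo can be sketched as follows.
The 16-bit field width, the interpretation of the modulus as two to
the power of the field width, and the function names are assumptions
made for this example; this document does not fix any of them.

```python
def cumulative_echo(received_nonces, echo_bits=16):
    """Sum of all received nonces, reduced to the echo field width."""
    modulus = 1 << echo_bits
    total = 0
    for nonce in received_nonces:
        total = (total + nonce) % modulus
    return total


def unaccounted_sum(sent_nonces, echo, echo_bits=16):
    """Difference between what was sent and what the echo confirms.

    Zero means the echo is consistent with every sent packet having
    been received. If exactly one packet is missing, the result is
    that packet's nonce (modulo the field size).
    """
    modulus = 1 << echo_bits
    return (cumulative_echo(sent_nonces, echo_bits) - echo) % modulus
```

For example, if packets with nonces 7, 1000, 42, and 65000 are sent
and all but the second arrive, the echo is 65049 and the unaccounted
sum is 1000, which an endpoint or on-path observer can match to the
missing packet. The sum does not depend on arrival order, so
reordering has to be detected from the order of the nonces
themselves.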
The timestamp echo consists of three values: The time in terms of
ticks of a constant rate clock that a packet is sent, the echo of the
last such timestamp received from the remote endpoint, and the number
of ticks of the sender's clock between the receipt of the last
timestamp from the remote endpoint and the transmission of the packet
containing the echo. This last delta value is the missing link in
TCP sequence number based and timestamp option based latency
estimation.
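A round-trip time computation from these three values might look as
follows. The sketch assumes that both endpoints count ticks at the
same, explicitly defined rate, and it ignores counter wraparound; the
TimestampEcho type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TimestampEcho:
    sent_ticks: int    # sender's clock when this packet was sent
    echo_ticks: int    # last timestamp received from the remote end
    delta_ticks: int   # sender's ticks between receiving echo_ticks
                       # and sending this packet


def rtt_ticks(now_ticks, received):
    """Round-trip time seen by the endpoint whose timestamp is echoed.

    now_ticks is the local clock on receipt of `received`. The time
    the remote endpoint held the timestamp before echoing it is
    subtracted, leaving only the time spent in the network.
    """
    return now_ticks - received.echo_ticks - received.delta_ticks
```

If an endpoint sends a packet at tick 1000, and at tick 1250 receives
a packet echoing 1000 with a delta of 30, the network round-trip time
is 220 ticks. Without the delta, the 30 ticks the remote endpoint
waited before replying would be wrongly counted as network delay.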
The information exposed is roughly equivalent to that currently
exposed by TCP as a side effect of its operation, but is defined such
that it is explicitly useful for measurement, useful regardless of
transport protocol, and such that information exposure is under the
explicit control of the endpoint (when the superstrate transport
protocol's headers are encrypted).
9.3. Mechanism
The nonce and timestamp echo information, emitted as per-packet
signals in the SPUD header, can be used by any device which can see
it to estimate performance metrics on a per-tube basis. This
includes both remote endpoints and passive performance measurement
devices colocated with network gateways.
9.4. Deployment Incentives
Initial deployment of this facility is most likely in closed networks
such as enterprise data centers, where a single administrative entity
owns the network and the endpoints, can control which flows and tubes
are annotated with measurement information, and can benefit from the
additional insight given during network troubleshooting by explicit
measurement headers.
Further, since the provided measurement information is exposed by
SPUD to the far-endpoint, it can be used for performance enhancement
on these layers. Once the facility is deployed in SPUD-aware
endpoints, it can also be used for inter-network and cross-Internet
performance measurement and debugging (replacing today's processing-
intensive DPI mechanisms).
9.5. Security, Privacy, and Trust
The cumulative nonce and timestamp echo leak no more information
about the traffic than the TCP header does. Indeed, since the
cumulative nonce does not include sequence number information or
other protocol-internal information, it allows passive measurement of
loss and latency without giving measurement devices access to
information they could use to spoof valid packets within a transport
layer connection.
In order to prevent middleboxes from modifying measurement-relevant
information, these per-packet signals will need to be integrity
protected by SPUD.
Performance measurement boxes at gateways which observe and aggregate
these signals will necessarily need to trust their accuracy, but can
verify their plausibility by calculating nonce sums and synchronizing
timing clocks.
10. IANA Considerations
This document has no actions for IANA.
11. Security Considerations
Security and privacy considerations for each use case are given in
the corresponding Security, Privacy, and Trust subsection.
12. Acknowledgments
This document grew in part out of discussions of initial use cases
for middlebox cooperation at the IAB SEMI Workshop and the IETF 92
SPUD BoF; thanks to the participants. Some use case details came out
of discussions with the authors of [I-D.trammell-spud-req]: in
addition to the editors of this document, David Black, Ken Calvert,
Ted Hardie, Joe Hildebrand, Jana Iyengar, and Eric Rescorla.
Section 9 is based in part on discussions and ongoing work with Mark
Allman and Rob Beverly.
This work is supported by the European Commission under Horizon 2020
grant agreement no. 688421 Measurement and Architecture for a
Middleboxed Internet (MAMI), and by the Swiss State Secretariat for
Education, Research, and Innovation under contract no. 15.0268. This
support does not imply endorsement.
13. Informative References
[I-D.hildebrand-spud-prototype]
Hildebrand, J. and B. Trammell, "Substrate Protocol for
User Datagrams (SPUD) Prototype", draft-hildebrand-spud-
prototype-03 (work in progress), March 2015.
[I-D.ietf-ippm-6man-pdm-option]
Elkins, N. and M. Ackermann, "IPv6 Performance and
Diagnostic Metrics (PDM) Destination Option", draft-ietf-
ippm-6man-pdm-option-01 (work in progress), October 2015.
[I-D.trammell-spud-req]
Trammell, B. and M. Kuehlewind, "Requirements for the
design of a Substrate Protocol for User Datagrams (SPUD)",
draft-trammell-spud-req-02 (work in progress), March 2016.
[I-D.you-tsvwg-latency-loss-tradeoff]
You, J., Welzl, M., Trammell, B., Kuehlewind, M., and K.
Smith, "Latency Loss Tradeoff PHB Group", draft-you-tsvwg-
latency-loss-tradeoff-00 (work in progress), March 2016.
[RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
DOI 10.17487/RFC1191, November 1990,
<http://www.rfc-editor.org/info/rfc1191>.
[RFC1981] McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
for IP version 6", RFC 1981, DOI 10.17487/RFC1981, August
1996, <http://www.rfc-editor.org/info/rfc1981>.
[RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU
Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007,
<http://www.rfc-editor.org/info/rfc4821>.
Authors' Addresses
Mirja Kuehlewind (editor)
ETH Zurich
Gloriastrasse 35
8092 Zurich
Switzerland
Email: mirja.kuehlewind@tik.ee.ethz.ch
Brian Trammell (editor)
ETH Zurich
Gloriastrasse 35
8092 Zurich
Switzerland
Email: ietf@trammell.ch