Internet DRAFT - draft-iyengar-minion-concept
draft-iyengar-minion-concept
Network Working Group J. Iyengar
Internet-Draft Franklin and Marshall College
Intended status: Standards Track S. Cheshire
Expires: April 24, 2014 J. Graessley
Apple
October 21, 2013
Minion - Service Model and Conceptual API
draft-iyengar-minion-concept-02
Abstract
Minion uses TCP-format packets on-the-wire, to provide full
compatibility with existing NATs, Firewalls, and similar middleboxes,
but provides a richer set of facilities to the application. Minion's
richer facilities include a message-oriented API rather than TCP's
unstructured byte-stream service model, multiplexing of multiple
messages (or message streams) on a single connection, interleaving of
multiplexed messages (to eliminate head-of-line blocking), message
cancellation, request/reply support, ordered and unordered messages,
superseding messages, chained messages, multiple priority levels with
byte-granularity preemption, and DTLS Security. Minion can be
implemented entirely as a user-level library, without waiting for any
special support from OS vendors, and provides immediate benefits to
application developers. Additionally, Minion is able to take
advantage of some simple kernel extensions to provide enhanced
services that go beyond what is possible with traditional TCP. These
kernel extensions are optional, and even without them, Minion offers
worthwhile benefits to application developers.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 24, 2014.
Iyengar, et al. Expires April 24, 2014 [Page 1]
Internet-Draft Minion Conceptual API October 2013
Copyright Notice
Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
1. Conventions and Terminology Used in this Document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
"Key words for use in RFCs to Indicate Requirement Levels" [RFC2119].
2. Introduction
Back in 1983 application developers had the choice of UDP [RFC0768]
or TCP [RFC0793]. UDP preserves message boundaries, but provides no
reliability, ordering, flow control, or congestion control, and only
supports small messages (typically UDP packets larger than 1468 bytes
result in undesirable IP fragmentation). TCP provides these
important facilities, and doesn't impose any message size limit --
but only because it doesn't have any concept of messages, and doesn't
claim to preserve message boundaries. Consequently, whichever base
protocol the application developer chose, they were left building
part of the transport-layer solution themselves.
Thirty years later, in 2013, little has changed. Application
developers on mainstream platforms like Android and iOS still have
the same two choices -- UDP and TCP.
Attempts to provide richer application facilities have failed to
achieve widespread adoption. Protocols like SCTP are not supported
by mainstream NAT gateways. Consequently, mainstream apps for
platforms like Android and iOS don't use SCTP, because it would
severely limit their real-world deployment. Consequently, operating
systems like Android and iOS don't have in-kernel native
implementations of SCTP, because there's little developer demand for
Iyengar, et al. Expires April 24, 2014 [Page 2]
Internet-Draft Minion Conceptual API October 2013
a protocol they can't use. Consequently, there's little incentive
for NAT gateway vendors to do the work to add support for a protocol
that's neither supported in the popular operating systems nor used by
mainstream applications.
Like SCTP, Minion goes beyond UDP and TCP by providing richer
application facilities, making it possible to create applications
that work better and more reliably (and can be brought to market
quicker and easier) than is possible when each application has to re-
create those facilities from scratch every time.
However, unlike SCTP, Minion provides facilities that can be used by
an application developer immediately, without having to wait for OS
support or NAT gateway support. OS support and NAT gateway support
can come later, and provide additional incremental improvements.
This incremental deployment path -- which begins first with the
application developer who can choose to use Minion and immediately
reap the benefits of that decision -- is an important property of
Minion, and removes one of the major obstacles that hindered SCTP
adoption.
When used without kernel support, Minion acts like a typical TCP-
based application protocol, and as such, performs as well as any
other TCP-based application protocol. However, unlike most
application-specific protocols, Minion also offers the potential of
kernel support giving better low-latency message performance and
better prioritization. A general application protocol is unlikely to
receive special-case kernel support tailored to support that one
specific application, but as a general-purpose transport protocol
built to support a wide range of applications, special kernel support
for Minion is feasible.
Minion preserves the important properties of TCP, like congestion
control, while adding a range of richer application facilities:
Message Oriented
Rather than an unstructured byte stream, Minion supports messages.
TCP provides an unstructured byte stream, but virtually every
application needs to send and receive semantic messages, which
means that virtually every application needs to build its own
message framing mechanism on top of TCP. In contrast, Minion
respects and preserves semantic boundaries. If an application
writes a 27-byte message followed by a 53-byte message, then a 27-
byte message and a 53-byte message are delivered to the receiving
client, not a single combined block of 80 bytes.
Iyengar, et al. Expires April 24, 2014 [Page 3]
Internet-Draft Minion Conceptual API October 2013
Arbitrary Size Messages
While most Minion messages are expected to be small, Minion itself
imposes no upper limit on message size. For example, a 6 gigabyte
movie download could be sent as a single Minion message. Messages
do not have to fit in memory. A very large message can be
generated incrementally by the sender, and will be delivered
incrementally to the receiver.
Multiplexing
Multiple messages can be sent, in both directions, on a single
Minion connection. Unlike protocols like HTTP/1.0 where each
request used a separate connection, many Minion messages can share
a connection.
Interleaving
When multiple (possibly large) messages are being sent
concurrently on a single Minion connection, the connection
bandwidth is shared round-robin between the messages. This avoids
head-of-line blocking, where messages are blocked waiting for a
large message to complete.
Cancellation
Messages do not have to be sent to completion. If either the
sender or the receiver determines that a message is no longer
needed, then that single message can be cancelled without having
to tear down the entire Minion connection.
Unordered Messages
One of the main arguments that is often presented to justify why a
particular application protocol is built on UDP instead of TCP is
that, "UDP is better for 'real time' applications." The
supporting reasoning for this is often that, "TCP insists on
continuing to retransmit data long after the client doesn't need
any more." In truth the real problem is not retransmission; it is
that the conventional TCP APIs don't allow received data to be
delivered out of order. Suppose a TCP sender has 50 packets in
flight at any given time (e.g. the bandwidth x delay product is 75
kB) then the loss of a single packet causes all 49 following
packets to stall at the receiver because the API doesn't allow for
them to be delivered to the client until the missing packet has
been received. A simple kernel extension (in the form of a new
socket option) removes this limitation, and allows out-of-order
data to be delivered to the client. This avoids the problem where
a single lost TCP segment causes all the following TCP segments to
be delayed.
Note that this kernel extension is not *required* for a client to
use Minion; it is an optional extension that provides better
Iyengar, et al. Expires April 24, 2014 [Page 4]
Internet-Draft Minion Conceptual API October 2013
performance for real-time applications in situations where there
is packet loss or reordering. For many applications it is an
irrelevant benefit and they can operate perfectly well without it.
For a few applications it is a significant benefit, and it allows
Minion to provide the low-latency performance that often drives
developers to use UDP.
Sender Ordering
While out-of-order data can be useful, there are are also many
cases where there is a logical order to send the data. For
example, H.264 video P-frames may not strictly depend upon each
other for decoding, but there is still a logical order to send
them. It makes sense for earlier frames to be sent sooner than
later frames. Therefore, delivering all frames concurrently,
sharing the available bandwidth fairly, may cause the first frame
to be delivered unecessarily late, and the last frame to be
delivered unecessarily early. Respecting the natural order of the
frames makes sense. However, in the event of packet loss that
delays a particular frame beyond its playback time, there is no
reason to artificially delay delivery of later frames that have
already been received. For data of this kind, the sender uses
Sender Ordering to indicate that a particular message should
follow another message on the wire.
Receiver Ordering
There are cases where the delivery ordering requirements are
stricter than merely expressing a preference about transmission
order. In some cases the receiver cannot usefully use (certain)
messages out of order, and delivering them potentially out of
order would burden the application with the task of sorting the
messages back into the correct order before processing them. In
such cases, it is more convenient for the application to have
Minion deliver messages to it in the right order. For this reason
Minion supports receiver-ordered messages as well as sender-
ordered and unordered messages. Unordered messages and ordered
messages are supported simultaneously on a single Minion
connection.
Other transport protocols support the notion of multiple message
streams sharing a single connection. Minion takes this idea and
generalizes it to the more expressive notion of Receiver Ordering
Message Dependencies. Receiver Ordering Message Dependencies
indicate that a dependent message must not be delivered before the
message it depends upon.
Traditional message streams can be created in Minion by using a
sequence of Receiver Ordering Message Dependencies: If message B
is specified to follow message A, and message C is specified to
Iyengar, et al. Expires April 24, 2014 [Page 5]
Internet-Draft Minion Conceptual API October 2013
follow message B, and so on, then messages A,B,C... form an
ordered "stream". Similarly, if at the same time message Q is
specified to follow message P, and message R is specified to
follow message Q, and so on, then messages P,Q,R... form another
independent ordered "stream" of their own.
In addition to such disjoint ordered streams (A,B,C... and
P,Q,R...), Receiver Ordering Message Dependencies also allow
richer relationships to be expressed. For example, in H.264
video, P-frames reference I-frames, but P-frames do not reference
other P-frames. If a single P-frame is lost or delayed, it is not
necessary to delay all subsequent P-frames. Each P-frame has a
time it is due to be displayed, and when that time arrives the
frame should be displayed if possible, even if (or especially if)
preceding P-frames did not arrive in time. However, there is no
benefit in delivering a P-frame to the application before the
I-frame it depends upon.
To give another example, a web browser client may need to retrieve
many resources to display a page, but it cannot display *any* of
the page until it has received the style sheet. Consequently it
would be beneficial if the web browser client could request all of
the resources it needs, but for each one, indicate that it depends
on the style sheet resource (or upon some other resource which
depends by transitive closure on the sheet resource). This
dependency information tells the sender that it should not devote
*any* bytes of available bandwidth to delivering other resources
until after it has completed sending the all-important style
sheet.
Superseding Messages
Similar to Receiver Ordered Messages are Superseding Messages.
Where a Receiver Ordered Message follows its predecessor, a
Superseding Message follows its predecessor *and renders it
irrelevant*. For example, in a real-time video game, a message
giving the player's current position supersedes any previous
message giving a previous position. Generally, old position data
becomes irrelevant when newer position data is available.
Messages of this kind are specified to be Superseding Messages.
On the sending side, if Message B is specified to supersede
Message A, and Message A has not yet been sent, then Message A is
discarded without being sent. This avoids wasting network
capacity sending obsolete data. If Message A has been sent, but
(due to packet loss or reordering) arrives at the receiver after
Message B, then the receiver discards the late-arriving message A
without delivering it to the client. This avoids delivering data
to the application in the wrong order. The Superseding Message
mechanism allows client application code to specify that a certain
Iyengar, et al. Expires April 24, 2014 [Page 6]
Internet-Draft Minion Conceptual API October 2013
group of messages have a specified delivery order, without
delaying later messages in the event that a prior message has not
yet arrived.
Chained Messages
Minion is intended to be used to deliver messages containing a
single logical semantic unit. Although Minion can "stream" a
message of unbounded size to the receiver, Minion is not generally
intended to be used to batch multiple logical semantic units into
single large message, which is then "streamed" to the receiver,
which then parses the incoming "streamed" message as it arrives
for the logical semantic units contained within it. Part of the
purpose of Minion is to take the burden of message framing off the
application writer; treating a single unbounded "streaming" Minion
message like a TCP connection places the burden of parsing firmly
back in the hands of the application developer.
In the event that a logical message contains multiple related
parts, like a header with an associated body, Minion can
facilitate this structuring through the use of Chained Messages.
Chained Messages are substantially similar to Receiver Ordering
Message Dependencies, except that in addition to controlling the
order of data transmission on the wire, and the order of message
delivery to the client, the chained message relationship is also
exposed to the client application at the receiving end. Instead
of being delivered to the Minion connection's main message handler
function, the way most messages are, Chained Messages are
delivered instead to the Chained Message handler function of the
previous message in the chain.
The Chained Message mechanism allows an application to provide a
main message handler function that receives and processes the
"header" portion of each two-part message, and that main message
handler function in turn provides a different message handler
function that receives and processes the subsequent "body"
portion.
As with other messages, each component in a message chain can
optionally generate an explicit reply, which is delivered to the
reply handler for the specific message in the chain that elicited
the reply.
If any message in a chain is cancelled by the sender or the
receiver, then all subsequent messages in that chain are
implicitly cancelled.
The sender of a chain of messages may wait at each step for a
Iyengar, et al. Expires April 24, 2014 [Page 7]
Internet-Draft Minion Conceptual API October 2013
reply confirming that the previous message was acceptable before
sending the next message of the chain, or it may send the entire
chain and let the receiver cancel the message chain if an error
occurs.
Request/Reply Support
Many application protocols are request/reply-oriented. Minion
facilitates this by allowing an outgoing message to be explicitly
identified as a reply to a previously-received message, which
causes the reply message to be delivered automatically to the
appropriate message handler at the receiving end.
When a message has a reply handler, the remote peer is expected to
generate exactly one reply for this message. If the remote peer
does not, that is a client programming bug. This is because the
outbound message ID is reserved until one reply is received, then
discarded. If no reply is received, the ID will remain active
indefinitely. If a reply is sent for a message that does not have
a reply handler (or, for a message that does have a reply handler,
more than one reply is sent) then that is also a client
programming bug. If this happens then an error message SHOULD be
logged (to facilitate debugging) and the message MUST NOT be
delivered to the client application (e.g. reply messages that do
not match any current active message ID MUST NOT be delivered to
the connection's default message handler instead).
When a client application generates a reply to a received Minion
message, the reply inherits the ordering properties of the message
to which it replies.
If a request message was unordered, then its reply is unordered.
If a request message B was declared to have a Sender Ordering
contstraint that it follow message A on the wire, then its reply
message B' automatically inherits the Sender Ordering contstraint
that it follow reply message A' on the wire. In the event that
message A does not generate an actual reply A', before reply
message B' will be eligible for sending, the client application
must signal via some other API mechanism (e.g. by disposing of the
message object) that it has finished handling message A.
If a request message B was declared to have a Receiver Ordering
contstraint that it be delivered after message A, then its reply
message B' automatically inherits the Receiver Ordering
contstraint that it must be delivered after reply message A'.
If a request message B was declared to be a Chained Message
following message A, then its reply message B' becomes a Receiver
Iyengar, et al. Expires April 24, 2014 [Page 8]
Internet-Draft Minion Conceptual API October 2013
Ordering message that must be delivered after reply message A'.
Subject to the Sender Ordering and Receiver Ordering constraints,
replies can be interleaved and share bandwidth just as other
messages do.
Replies can themselves generate further replies, resulting in an
unbounded back-and-forth of ping-pong messages, each going to the
appropriate reply handler on the receiving side.
Priority Levels
While Sender Ordering, Receiver Ordering, Superseding Messages,
and Message Chaining allow relationships between messages to be
adequately expressed where they are known in advance, sometimes
there are urgent messages that need to be sent at short notice
that are not known in advance. For example, consider a music
application which is streaming out audio data with a generous
playback buffer, and then the user performs a user-interface
operation to change the volume level. We would like this volume
change to be performed as promptly as possible, regardless of how
much audio data is queued up in the transmit buffer. For this
reason Minion supports four priority levels. A higher-priority
message can preempt a lower-priority message at any arbitrary byte
boundary in the lower-priority data stream. (This byte-
granularity preemption is made possible by the Minion wire
protocol [minprot]).
Minion provides strict priorities, meaning that no lower-priority
data at all is sent as long as there is higher-priority data
waiting. This means that a sustained flow of higher-priority data
can starve lower-priority data indefinitely. For this reason,
Minion priorities are intended to support small amounts of high-
priority data intermixed with larger amounts of lower-priority
data. If the amount of high-priority data exceeds the current
throughput of the Minion connection then all the available
throughput will be consumed attempting to meet the high-priority
data demand, and no lower-priority data will be sent. If this
outcome is undesirable to the application, it should ensure that
it does not generate sustained high-priority data at a rate
exceeding the network throughput for a prolonged period of time.
Minion does not attempt to provide proportional or weighted
bandwidth allocation between different priority levels.
The Minion model is that if message B has a Receiver Ordering or
Sender Ordering dependency upon message A, then Minion should not
expend any available throughput delivering any part of message B
until after message A has been entirely sent. Similarly, if
message C is higher priority than messages A and B, then Minion
Iyengar, et al. Expires April 24, 2014 [Page 9]
Internet-Draft Minion Conceptual API October 2013
should not expend any available throughput delivering any part of
messages A or B until after message C has been entirely sent.
Because of the strict priority policy, a Minion message may have a
Receiver Ordering or Sender Ordering dependency on another message
of equal or higher priority, but not on a message of lower
priority.
The priority of a Minion message cannot be changed mid-
transmission.
DTLS Security
Minion includes security support. Because of the potential for
out-of-order message reception, Minion uses DTLS (which includes
an explicit record number) instead of TLS (which assumes strictly-
ordered delivery over TCP).
Iyengar, et al. Expires April 24, 2014 [Page 10]
Internet-Draft Minion Conceptual API October 2013
3. Stream-Free Operation
Minion eschews the notion of multiple message streams multiplexed on
a single connection. Rather than requiring every message to be
assigned to one particular ordered stream (or assigned to no steam
for unordered messages), each Minion message stands alone. Messages
are ordered and scheduled not by virtue of what linear ordered stream
they are assigned to, but by virtue of explicitly expressed message
dependencies.
Iyengar, et al. Expires April 24, 2014 [Page 11]
Internet-Draft Minion Conceptual API October 2013
4. Conceptual API
While different implementations in different languages may provide
APIs that differ in details and programming model, the common
conceptual framework of Minion APIs is as follows:
o Outbound Connections
* Create new Minion Connection to remote peer, with handler
function or object to receive incoming messages.
* Close Minion Connection when it is no longer needed.
o Inbound Connections
* Listen on a port for incoming connections.
* Upon receipt of incoming connection request, a new Minion
Connection object (substantially similar to the Outbound Minion
Connection object above) is generated, and delivered to the
application.
* Set handler function or object to handle incoming messages
received on an inbound connection.
* Stop listening for incoming connections.
o Sending Messages
* Create new Outbound Minion Message, associated with an existing
connection (either outbound or inbound), specifying the
priority level for the message.
* Optionally, indicate Sender Ordering for this message by
reference to some previously-created Minion message.
* Optionally, indicate Receiver Ordering for this message, or
Superseding for this message, or Chaining for this message, or
that this message is a reply to some previously received Minion
message. Note that these four options are mutually exclusive.
An outgoing message can be identified as a response to a
received message, or a subsequent member of a multi-part
message chain, or a message with a Receiver Ordering Message
Dependency, or a message that supersedes an earlier message,
but not more than one of these four things.
* Optionally, provide a reply handler function or object to
receive replies to this message.
Iyengar, et al. Expires April 24, 2014 [Page 12]
Internet-Draft Minion Conceptual API October 2013
* Provide (possibly incomplete) data for the message.
* Optionally, add further units of data to the message.
* Indicate when message is complete. This tells the Minion
implementation layer that it should now send the message.
Alternatively, a message can also be cancelled if it is no
longer needed.
* Dispose of the message when it is no longer needed.
o Receiving Messages
* Upon receipt of a message, a handler function or object is
handed a new inbound message:
* If the message is a chained continuation message, and a
specific handler exists for that chain, then that specific
handler is invoked.
* Else, if the message is a reply, and a specific handler exists
for the originating message, then that specific handler is
invoked.
* Else, the Minion Connection's generic message handler is
invoked.
* Read data from the message. For large messages, this may not
be the entire message. After one or more reads, a return code
(or similar) indicates to the application when the message is
complete (or alternatively, that is is incomplete, and will not
be completed, because it has been cancelled by the sender).
* The application may decide to reject a message before it has
been entirely received, by canceling it.
* The message handler may generate outbound messages in response
to the received message, including outbound explicit reply
messages, outbound chained messages, and simple outbound
standalone messages.
* Dispose of the received message when it is no longer needed.
Iyengar, et al. Expires April 24, 2014 [Page 13]
Internet-Draft Minion Conceptual API October 2013
5. Connection-Level Flow Control
Minion operates using connection-level flow control. Not having
streams, Minion does not implement per-stream flow control; nor does
it implement message-level flow control. This means that if a Minion
client sends a message or messages to a peer that fails to consume
those messages, then the entire connection could become staled until
those messages are consumed. For this reason Minion clients should
refrain from sending messages that the receiving end is not in a
position to consume. Minion supports arbitrarily large messages, but
this facility should only be used in cases where the receiver is
expected to be able to consume the message as it arrives.
[Should we discover that there are credible cases where message-level
flow control proves to be necessary, this design decision can be
reconsidered.]
6. Client Isolation
Minion allows multiple messages to share the available throughput of
a single connection. The sources of those multiple messages (if not
the same application) are assumed to be mutually trusting. Minion
does not attempt to prevent one message source on a connection from
consuming an unfair share of the bandwidth, nor does Minion attempt
to guard against a client that fails to read its messages, causing
the receive window to close, thereby preventing any messages from
being received.
In the event that some proxy or similar technology allows multiple
mutually untrusting clients to share a single Minion connection, that
application-layer code that is allowing the single Minion connection
to be shared is responsible for policing the traffic so that the
single Minion connection is shared reasonably.
Iyengar, et al. Expires April 24, 2014 [Page 14]
Internet-Draft Minion Conceptual API October 2013
7. TCP-derived benefits
By virtue of building upon TCP, Minion is able to leverage work being
done to enhance TCP. For example, Minion could eliminate the three
way handshake to set up a new connection via the new developments
being made surrounding TCP Fast Open.
Similarly, Minion could offer mobility and load balancing across
multiple paths by building on Multipath TCP. In addition, future
extensions to Minion could implement Multipath functionality using
messages at the Minion-layer rather than using TCP options, which
could improve compatibility with middleboxes that drop packets
containing TCP options.
8. IANA Considerations
No IANA actions are required by this document.
9. Security Considerations
No new security risks occur as a result of using this protocol.
10. Acknowledgements
Many thanks to Bryan Ford, Padma Bhooma and Anumita Biswas for their
contributions to the development of Minion.
11. References
11.1. Normative References
[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768,
August 1980.
[RFC0793] Postel, J., "Transmission Control Protocol", STD 7,
RFC 793, September 1981.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
11.2. Informative References
[minprot] Iyengar, J., "Minion - Wire Protocol",
draft-iyengar-minion-protocol-02 (work in progress),
Iyengar, et al. Expires April 24, 2014 [Page 15]
Internet-Draft Minion Conceptual API October 2013
October 2013.
Authors' Addresses
Janardhan Iyengar
Franklin and Marshall College
Mathematics and Computer Science
PO Box 3003
Lancaster, Pennsylvania 17604-3003
USA
Phone: +1 717 358 4774
Email: janardhan.iyengar@fandm.edu
Stuart Cheshire
Apple Inc.
1 Infinite Loop
Cupertino, California 95014
USA
Phone: +1 408 974 3207
Email: cheshire@apple.com
Josh Graessley
Apple Inc.
1 Infinite Loop
Cupertino, California 95014
USA
Phone: +1 408 974 5710
Email: jgraessley@apple.com
Iyengar, et al. Expires April 24, 2014 [Page 16]