DetNet                                                           N. Finn
Internet-Draft                               Huawei Technologies Co. Ltd
Intended status: Standards Track                                B. Varga
Expires: May 3, 2018                                           J. Farkas
                                                                Ericsson
                                                        October 30, 2017
DetNet Bounded Latency
draft-finn-bounded-latency-00
This document presents a model for DetNet to achieve bounded latency and zero congestion loss using existing and in-progress standards from IEEE 802 and RFCs from the IETF.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 3, 2018.
Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
The ability for IETF Deterministic Networking (DetNet) or IEEE 802.1 Time-Sensitive Networking (TSN) to provide bounded latency and zero congestion loss depends upon A) configuring and allocating network resources for the exclusive use of DetNet/TSN flows; B) identifying, in the data plane, the resources to be utilized by any given packet; and C) the detailed behavior of those resources, especially transmission queue selection, so that latency bounds can be reliably assured. Thus, DetNet is an example of an INTSERV Guaranteed Quality of Service [RFC2212].
As explained in [I-D.ietf-detnet-architecture], DetNet flows are characterized by 1) a maximum bandwidth, guaranteed either by the transmitter or by strict input metering; and 2) a requirement for a guaranteed worst-case end-to-end latency. That latency guarantee, in turn, provides the opportunity to supply enough buffer space to guarantee zero congestion loss. To be of use to the applications identified in [I-D.ietf-detnet-use-cases], it must be possible to calculate, before the transmission of a DetNet flow commences, the worst-case network latency and the amount of buffer space required at each hop to ensure against congestion loss. The detailed behavior of the mechanism(s) used to select the next packet for transmission at each output port is critical in making this determination. A detailed timing model, breaking down the time taken for each packet to traverse each element in the model, along with possible variations, is required, because seemingly minor implementation variations can generate large uncertainties in the number of required buffers. Such inconsistencies must be identified, and where possible, minimized. This timing model must also include non-TSN/DetNet queuing techniques insofar as their use can affect the DetNet flows.
The IEEE 802.1 Working Group has standardized a number of specific techniques that can be used by routers or hosts. These documents include [IEEE8021Q] (Clause 34), [IEEE802.1Qch], [IEEE802.1Qci], [IEEE8021Qbv], [IEEE8021Qbu], [IEEE8023br].
[[NOTE (to be removed from a future revision): The queuing and transmission selection methods defined in IEEE 802.1Q and its amendments are all in the context of implementing those methods in an 802.1Q bridge; they are not all specified for use in an end station, much less in a router. It is the intention of the authors of this draft to create a document in some Standards Development Organization (SDO) that provides normative reference points for a document from any SDO describing any device, e.g. a host or a router. That would make the 802.1 queuing techniques readily available to a router or host. As that document develops, so too will this draft evolve.]]
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
The lowercase forms with an initial capital "Must", "Must Not", "Shall", "Shall Not", "Should", "Should Not", "May", and "Optional" in this document are to be interpreted in the sense defined in [RFC2119], but are used where the normative behavior is defined in documents published by SDOs other than the IETF.
This document uses the terms defined in [I-D.ietf-detnet-architecture].
In Figure 1 we see a breakdown of the per-hop latency experienced by a packet in terms that are suitable for computing both hop-by-hop latency, and per-hop buffer requirements.
      DetNet relay node A        DetNet relay node B
      +-----------------+        +-----------------+
      |        Queue    |        |        Queue    |
      |       +-+-+-+   |        |       +-+-+-+   |
   -->+       | | | +   +------->+       | | | +   +--->
      |       +-+-+-+   |        |       +-+-+-+   |
      |                 |        |                 |
      +-----------------+        +-----------------+
      |<----->|<--->|<->|<------>|<----->|<--->|<->|<--
   2,3    4      5    1    2,3       4      5    1  2,3
        1: Output delay       3: Preemption delay
        2: Link delay         4: Processing delay
                              5: Queuing delay
Figure 1: Timing model for DetNet or TSN
In Figure 1, we see two DetNet relay nodes (typically, bridges or routers), with a wired link between them. In this model, the only queues we deal with explicitly are attached to the output port; other queues are modeled as variations in the other delay times. (E.g., an input queue could be modeled as either a variation in the link delay [2] or the processing delay [4].) There are five delays that a packet can experience from hop to hop.
Not shown in Figure 1 are the other output queues that we presume are also attached to that same output port as the queue shown, and against which this shown queue competes for transmission opportunities.
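As a minimal sketch of how the five delays of Figure 1 might be combined, the following Python fragment adds the minimum and maximum of each delay to obtain a per-hop latency range, and sums those ranges along a path. The class names, field names, and numeric values are illustrative assumptions, not part of any DetNet or IEEE 802.1 specification, and simply adding the worst cases is a conservative simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelayRange:
    """Minimum and maximum of one delay component, in microseconds."""
    min_us: float
    max_us: float


@dataclass(frozen=True)
class HopDelays:
    """The five delays of Figure 1 for one hop (field names are illustrative)."""
    output: DelayRange      # delay 1
    link: DelayRange        # delay 2
    preemption: DelayRange  # delay 3
    processing: DelayRange  # delay 4
    queuing: DelayRange     # delay 5


def hop_bounds(hop):
    """Return (best-case, worst-case) latency of one hop in microseconds."""
    parts = (hop.output, hop.link, hop.preemption, hop.processing, hop.queuing)
    return (sum(p.min_us for p in parts), sum(p.max_us for p in parts))


def path_bounds(hops):
    """Add the per-hop bounds along a path (conservative: worst cases are summed)."""
    per_hop = [hop_bounds(h) for h in hops]
    return (sum(lo for lo, _ in per_hop), sum(hi for _, hi in per_hop))


def variation_1_to_4(hop):
    """Total variation of delays 1-4 for one hop, in microseconds."""
    parts = (hop.output, hop.link, hop.preemption, hop.processing)
    return sum(p.max_us - p.min_us for p in parts)


# Made-up example values, not taken from any standard.
example_hop = HopDelays(
    output=DelayRange(1.0, 1.0),
    link=DelayRange(5.0, 5.0),
    preemption=DelayRange(0.0, 2.0),
    processing=DelayRange(3.0, 4.0),
    queuing=DelayRange(0.0, 10.0),
)
```

With these made-up values, hop_bounds(example_hop) gives a range of 9 to 22 microseconds, and variation_1_to_4(example_hop) gives 3 microseconds of variation in delays 1-4.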
The initial and final measurement point in this analysis (that is, the definition of a "hop") is the point at which a packet is selected for output. In general, any queue selection method that is suitable for use in a DetNet network includes a detailed specification as to exactly when packets are selected for transmission. Any variations in any of the delay times 1-4 result in a need for additional buffers in the queue. If all delays 1-4 are constant, then any variation in the time at which packets are inserted into a queue depends entirely on the timing of packet selection in the previous node. If the delays 1-4 are not constant, then additional buffers are required in the queue to absorb these variations. Thus:
When the input rate to an output queue exceeds the output rate for a sufficient length of time, the queue must overflow. This is congestion loss, and this is what deterministic networking seeks to avoid.
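The overflow condition can be illustrated with a small slot-based simulation. This is only a sketch under simplifying assumptions (fixed-size packets, a fixed number of arrivals and departures per time slot, a FIFO with a hard capacity); the function name and parameters are hypothetical.

```python
from collections import deque


def simulate_queue(arrivals_per_slot, departures_per_slot, capacity, slots):
    """Run a bounded FIFO for `slots` time slots.

    Returns (packets dropped, packets still queued at the end).
    """
    queue = deque()
    dropped = 0
    for slot in range(slots):
        # Arrivals first: a packet is dropped if the queue is already full.
        for index in range(arrivals_per_slot):
            if len(queue) < capacity:
                queue.append((slot, index))
            else:
                dropped += 1  # congestion loss
        # Then the output port sends up to departures_per_slot packets.
        for _ in range(departures_per_slot):
            if queue:
                queue.popleft()
    return dropped, len(queue)
```

When the input rate equals the output rate, simulate_queue(1, 1, 4, 10) drops nothing. When the input rate is twice the output rate, simulate_queue(2, 1, 4, 10) starts dropping as soon as the four buffers fill, and a larger capacity only postpones the first drop.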
Imagine a completely saturated DetNet network, in which all traffic is part of some number of DetNet flows, and 100% of each link's bandwidth is allocated to some number of DetNet flows using that link. Every source is transmitting at exactly its allotted rate. The DetNet flows traverse the network in all directions; no two DetNet flows take exactly the same path through the network. Imagine that there are no variations in the output delay (1), link delay (2), and processing delay (4), and there is no preemption delay (3).
Imagine now that one DetNet flow, DetNet flow A, stops. On some output port through which DetNet flow A was passing, when the transmission opportunity for one of DetNet flow A's packets comes up, the DetNet relay node must either output nothing, or output a packet belonging to some other DetNet flow B. If it outputs a packet from DetNet flow B, then in the long term, it is exceeding the normal rate for DetNet flow B, and runs the risk of overflowing the queues for DetNet flow B in the next hop. With sufficient analysis, it may be possible to determine the limits for how much excess data in DetNet flow B, or DetNet flow C, from this and from other ports feeding the next hop, can be accommodated before causing an overflow.
However, this analysis is very difficult. DetNet avoids the analysis by transmitting nothing (or transmitting a non-DetNet packet) when it has nothing to transmit for a given DetNet flow. This leads to DetNet making the following requirement for DetNet relay nodes:
For every DetNet flow traversing a DetNet relay node, sufficient data is buffered in that DetNet relay node to ensure that a transmission opportunity for that DetNet flow is never missed, unless the source of the DetNet flow slows or stops. That is, for every DetNet flow, over some finite time scale, the input rate equals the output rate.
Sophisticated QoS mechanisms are available in Layer 3 (L3); see, e.g., [RFC7806] for an overview. In general, we assume that "Layer 3" queues, shapers, meters, etc., are instantiated hierarchically above the "Layer 2" queuing mechanisms, among which packets compete for opportunities to be transmitted on a physical (or sometimes, logical) medium. These "Layer 2 queuing mechanisms" are not the province solely of bridges; they are an essential part of any DetNet relay node. As illustrated by numerous implementation examples, some of the "Layer 3" mechanisms described in documents such as [RFC7806] are often integrated, in an implementation, with the "Layer 2" mechanisms also implemented in the same system. An integrated model is needed in order to successfully predict the interactions among the different queuing mechanisms needed in a network carrying both DetNet flows and non-DetNet flows. See Section 6 for a more complete discussion of the expanded model.
Figure 2 shows the (very simple) model for the flow of packets through the queues of an IEEE 802.1Q bridge. Packets are assigned to a class of service. The classes of service are mapped to some number of physical FIFO queues. IEEE 802.1Q allows a maximum of 8 classes of service, but it is more common to implement 2 or 4 queues on most ports.
                  |
   +--------------V---------------+
   | Class of Service Assignment  |
   +--+-------+---------------+---+
      |       |               |
   +--V--+ +--V--+         +--V--+
   |Class| |Class|         |Class|
   |  0  | |  1  |  . . .  |  n  |
   |queue| |queue|         |queue|
   +--+--+ +--+--+         +--+--+
      |       |               |
   +--V-------V---------------V--+
   |   Transmission selection    |
   +--------------+--------------+
                  |
                  V
Figure 2: IEEE 802.1Q Queuing Model: Data flow
Some relevant mechanisms are hidden in this figure, and are performed in the "Class n queue" box:
The Class of Service Assignment function can be quite complex, since the introduction of [IEEE802.1Qci]. In addition to the Layer 2 priority expressed in the 802.1Q VLAN tag, a bridge can utilize any of the following information to assign a packet to a particular class of service (queue):
The "Transmission selection" function decides which queue is to transfer its oldest packet to the output port when a transmission opportunity arises.
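A minimal sketch of the data flow of Figure 2 follows, with one FIFO per class of service and a transmission selection step that takes the oldest packet of the chosen queue. Strict priority is assumed here as the selection rule, with a higher class number meaning higher priority; both are assumptions made for illustration, and the class and method names are hypothetical.

```python
from collections import deque


class OutputPort:
    """One FIFO queue per class of service, as in Figure 2."""

    def __init__(self, num_classes):
        self.queues = [deque() for _ in range(num_classes)]

    def enqueue(self, packet, traffic_class):
        """Place a packet in the queue of its assigned class of service."""
        self.queues[traffic_class].append(packet)

    def select_for_transmission(self):
        """Return the oldest packet of the highest non-empty class, or None.

        Assumption: strict priority, higher class number wins.
        """
        for traffic_class in range(len(self.queues) - 1, -1, -1):
            if self.queues[traffic_class]:
                return self.queues[traffic_class].popleft()
        return None  # no transmission opportunity is used


port = OutputPort(num_classes=4)
port.enqueue("a", 0)
port.enqueue("b", 3)
port.enqueue("c", 3)
port.enqueue("d", 1)
order = [port.select_for_transmission() for _ in range(5)]
```

In this example, the two class 3 packets leave first in arrival order, then the class 1 packet, then the class 0 packet, and the fifth call finds every queue empty.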
Figure 2 must be modified if the output port supports preemption ([IEEE8021Qbu] and [IEEE8023br]). This modification is shown in Figure 3.
                                  |
   +------------------------------V------------------------------+
   |                 Class of Service Assignment                 |
   +--+-------+-------+-------+-------+-------+-------+-------+--+
      |       |       |       |       |       |       |       |
   +--V--+ +--V--+ +--V--+ +--V--+ +--V--+ +--V--+ +--V--+ +--V--+
   |Class| |Class| |Class| |Class| |Class| |Class| |Class| |Class|
   |  a  | |  b  | |  c  | |  d  | |  e  | |  f  | |  g  | |  h  |
   |queue| |queue| |queue| |queue| |queue| |queue| |queue| |queue|
   +--+--+ +--+--+ +--+--+ +--+--+ +--+--+ +--+--+ +--+--+ +--+--+
      |       |       |       +-+     |       |       |       |
      |       |       |         |     |       |       |       |
   +--V-------V-------V------+ +V-----V-------V-------V-------V--+
   | Interrupted xmit select | |     Preempting xmit select      | 802.1
   +-------------+-----------+ +----------------+----------------+
                 |                              |                  ======
   +-------------V-----------+ +----------------V----------------+
   |     Preemptible MAC     | |           Express MAC           | 802.3
   +--------+----------------+ +----------------+----------------+
            |                                   |
   +--------V-----------------------------------V----------------+
   |                     MAC merge sublayer                      |
   +--------------------------+----------------------------------+
                              |
   +--------------------------V----------------------------------+
   |                 PHY (unaware of preemption)                 |
   +--------------------------+----------------------------------+
                              |
                              V
Figure 3: IEEE 802.1Q Queuing Model: Data flow with preemption
From Figure 3, we can see that, in the IEEE 802 model, the preemption feature is modeled as consisting of two MAC/PHY stacks, one for packets that can be interrupted, and one for packets that can interrupt the interruptible packets. The Class of Service (queue) determines which packets are which. In Figure 3, the classes of service are marked "a, b, ..." instead of with numbers, in order to avoid any implication about which numeric Layer 2 priority values correspond to preemptible or preempting queues. Although it shows three queues going to the preemptible MAC/PHY, any assignment is possible.
In Figure 4, we expand the "Transmission selection" function of Figure 3.
Figure 4 does NOT show the data path. It shows an example of a configuration of the IEEE 802.1Q transmission selection box shown in Figure 2 and Figure 3. Each queue m presents a "Class m Ready" signal. These signals go through various logic, filters, and state machines, until a single queue's "not empty" signal is chosen for presentation to the underlying MAC/PHY. When the MAC/PHY is ready to take another output packet, then a packet is selected from the one queue (if any) whose signal manages to pass all the way through the transmission selection function.
   +-----+ +-----+ +-----+ +-----+ +-----+ +-----+ +-----+ +-----+
   |Class| |Class| |Class| |Class| |Class| |Class| |Class| |Class|
   |  1  | |  0  | |  4  | |  5  | |  6  | |  7  | |  2  | |  3  |
   |Ready| |Ready| |Ready| |Ready| |Ready| |Ready| |Ready| |Ready|
   +--+--+ +--+--+ +--+--+ +-XXX-+ +--+--+ +--+--+ +--+--+ +--+--+
      |       |       |               |       |       |       |
      |    +--V--+ +--V--+ +--+--+ +--V--+    |    +--V--+ +--V--+
      |    |Prio.| |Prio.| |Prio.| |Prio.|    |    |Sha- | |Sha- |
      |    |  0  | |  4  | |  5  | |  6  |    |    |  per| |  per|
      |    | PFC | | PFC | | PFC | | PFC |    |    |  A  | |  B  |
      |    +--+--+ +--+--+ +-XXX-+ +-XXX-+    |    +--+--+ +-XXX-+
      |       |       |                       |       |
   +--V--+ +--V--+ +--V--+ +--+--+ +--+--+ +--V--+ +--V--+ +--+--+
   |Time | |Time | |Time | |Time | |Time | |Time | |Time | |Time |
   | Gate| | Gate| | Gate| | Gate| | Gate| | Gate| | Gate| | Gate|
   |  1  | |  0  | |  4  | |  5  | |  6  | |  7  | |  2  | |  3  |
   +--+--+ +-XXX-+ +--+--+ +--+--+ +-XXX-+ +--+--+ +-XXX-+ +--+--+
      |               |                       |
   +--V-------+-------V-------+--+            |
   |802.1Q Enhanced Transmission |            |
   | Selection (ETS) = Weighted  |            |
   |     Fair Queuing (WFQ)      |            |
   +--+-------+------XXX------+--+            |
      |                                       |
   +--V-------+-------+-------+-------+-------V-------+-------+--+
   |         Strict Priority selection (rightmost first)         |
   +-XXX------+-------+-------+-------+-------+-------+-------+--+
                                              |
                                              V
Figure 4: 802.1Q Transmission Selection
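The way a "Class m Ready" signal is filtered on its way to the MAC/PHY can be sketched as a chain of boolean tests. The fragment below is a simplified illustration that models only shapers, time gates, and a final strict priority choice; it omits the PFC and ETS stages, does not reproduce the exact state drawn in Figure 4, and uses hypothetical function and parameter names.

```python
def select_class(ready, gate_open, shaper_ok, priority_order):
    """Return the class whose Ready signal survives every filter, or None.

    ready:          dict mapping class -> True if its queue is not empty
    gate_open:      dict mapping class -> True if its time gate is open
    shaper_ok:      dict mapping class -> True if its shaper allows sending;
                    classes without a shaper are simply absent
    priority_order: list of classes, highest strict priority first
    """
    for cls in priority_order:
        if not ready.get(cls, False):
            continue  # queue empty: no signal to propagate
        if not shaper_ok.get(cls, True):
            continue  # signal blocked by the shaper
        if not gate_open.get(cls, False):
            continue  # signal blocked by a closed time gate
        return cls    # first surviving signal wins under strict priority
    return None


# Made-up state for illustration only.
ready = {1: True, 2: True, 3: True, 7: True}
gate_open = {1: True, 2: False, 3: True, 7: True}
shaper_ok = {2: True, 3: False}
chosen = select_class(ready, gate_open, shaper_ok, priority_order=[3, 2, 7, 1])
```

Here class 3 is held back by its shaper and class 2 by its closed gate, so class 7 is the queue from which the next packet is taken, even though class 1 is also ready.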
The following explanatory notes apply to Figure 4
Using the model of Section 4, we can model any system, even one that is very complex, including separate line cards, MAC/PHY modules, mid-planes, backplanes, control/forwarding boards, etc. However, in a complex case, the variations in the processing delay (4) may become so large as to make any latency or buffer requirement analysis relatively useless.
If a DetNet node is sufficiently complex that simply assigning a minimum and maximum to some delay (typically, the processing delay, 4) results in insufficiently accurate computations for latency or buffer requirements, the DetNet node can be modeled as a federation of DetNet relay nodes, each conforming to the model.
In the simplest example, a system with input queues on each port could be modeled as having a two-port DetNet relay node inserted into each input port, each with some number of output queues (which model the input queues).
Extending the models described in Section 5 to routers requires a number of steps:
A QoS architecture integrating both Layer 3 and Layer 2 features is necessary to exploit the benefits provided by the different layers if a DetNet network includes link(s) or sub-network(s) equipped with TSN features. For instance, it can be crucial for a time-critical DetNet flow to leverage TSN features in a Layer 2 sub-network in order to meet the DetNet flow's requirements, which may be spoiled otherwise.
Figure 5 provides a theoretical illustration for the integration of the Layer 3 and Layer 2 QoS architecture. The figure only shows the queuing after the routing decision. The figure also illustrates potential implementation-dependent borders (Brdr). The borders shown in the figure are critical in the sense that the high priority DetNet flows may, in some implementations, have to be transferred through these borders via different Service Access Points (SAPs) than the low priority (background) flows. Having a single SAP for these very different traffic types may result in QoS degradation for the DetNet flows, because packets of other flows could delay the transmission of DetNet packets. For instance, different SAPs are needed for the DetNet flows and other flows when they get to Layer 3 queuing after the routing decision via Brdr-d. Furthermore, a different SAP may be needed for DetNet packets than for other packets when they get to Layer 2 queuing from Layer 3 queuing via Brdr-c. Certainly, in the 802.1/802.3 model, different SAPs are needed for the express and for the preemptible frames when they get to the MAC layer from Layer 2 queuing via Brdr-b, which is provided by the IEEE 802.1Q architecture as shown in Figure 3. It depends on the implementation whether or not Brdr-a exists.
                   |
   +---------------V-----------+
   |        Forwarding         |
   +--------+----------+--+----+
            |          |  |          === Brdr-d
   +--------V--------+ |  |
   | CoS Assignment  | |  |
   +-----------------+ |  |
   |Que-|Que-|..|Que-| |  |          Layer 3 queuing
   | ue | ue |..| ue | |  |          and shaping
   +-----------------+ |  |          (optional)
   | Xmit selection  | |  |
   +--------+----+---+ |  |
            |    |     |  |          === Brdr-c
          +-V----V-----V--V-+
          | CoS Assignment  |
          +-----------------+        Layer 2 queuing
          |Que-|Que-|..|Que-|        and shaping
          | ue | ue |..| ue |        (always present)
          +-----------------+
          | Xmit selection  |
          +--+-----------+--+
             |           |           === Brdr-b
      +------V----+  +---V-------+
      |Preemptible|  |  Express  |
      |    MAC    |  |    MAC    |
      +------+----+  +----+------+
             |            |          === Brdr-a
      +------V------------V------+
      |           PHY            |
      +------------+-------------+
                   |
                   V
Figure 5: Combined L2/L3 Queueing Data Model