Internet Engineering Task Force E. Grossman, Ed.
Internet-Draft DOLBY
Intended status: Informational C. Gunther
Expires: April 21, 2016 HARMAN
P. Thubert
P. Wetterwald
CISCO
J. Raymond
HYDRO-QUEBEC
J. Korhonen
BROADCOM
Y. Kaneko
Toshiba
S. Das
Applied Communication Sciences
Y. Zha
HUAWEI
October 19, 2015

Deterministic Networking Use Cases
draft-grossman-detnet-use-cases-00

Abstract

This draft documents requirements in several diverse industries to establish multi-hop paths for characterized flows with deterministic properties. In this context, deterministic implies that streams can be established which provide guaranteed bandwidth and latency, which can be established from either a Layer 2 or Layer 3 (IP) interface, and which can co-exist on an IP network with best-effort traffic.

Additional requirements include optional redundant paths, very high reliability paths, time synchronization, and clock distribution. Industries considered include wireless for industrial applications, professional audio, electrical utilities, building automation systems, radio/mobile access networks, automotive, and gaming.

For each case, this document will identify the application, identify representative solutions used today, and describe what new uses an IETF DetNet solution may enable.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on April 21, 2016.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


1. Pro Audio Use Cases

(This section was derived from draft-gunther-detnet-proaudio-req-01)

1.1. Introduction

The professional audio and video industry includes music and film content creation, broadcast, cinema, and live exposition as well as public address, media and emergency systems at large venues (airports, stadiums, churches, theme parks). These industries have already gone through the transition of audio and video signals from analog to digital; however, the interconnect systems remain primarily point-to-point with a single signal (or a small number of signals) per link, interconnected with purpose-built hardware.

These industries are now attempting to transition to packet-based infrastructure for distributing audio and video in order to reduce cost, increase routing flexibility, and integrate with existing IT infrastructure.

However, several requirements for making a network the primary infrastructure for audio and video are not met by today's networks, and these are our concern in this draft.

The principal requirement is that pro audio and video applications become able to establish streams that provide guaranteed (bounded) bandwidth and latency from the Layer 3 (IP) interface. Such streams can be created today within standards-based Layer 2 islands; however, these are not sufficient to enable effective distribution over wider areas (for example, broadcast events that span wide geographical areas).

Some proprietary systems have been created which enable deterministic streams at Layer 3; however, they are engineered networks in that they require careful configuration to operate, often require that the system be over-designed, and assume that all devices on the network voluntarily play by the rules of that network. Enabling these industries to successfully transition to an interoperable multi-vendor packet-based infrastructure requires effective open standards, and we believe that establishing relevant IETF standards is a crucial factor.

It would be highly desirable if such streams could be routed over the open Internet; however, even intermediate solutions with more limited scope (such as enterprise networks) can provide a substantial improvement over today's networks, and a solution that only provides for the enterprise network scenario is an acceptable first step.

We also present more fine-grained requirements of the audio and video industries, such as safety and security, redundant paths, support for devices with limited computing resources on the network, and the requirement that reserved stream bandwidth be available for use by best-effort traffic when that stream is not currently in use.

1.2. Fundamental Stream Requirements

The fundamental stream properties are guaranteed bandwidth and deterministic latency as described in this section. Additional stream requirements are described in a subsequent section.

1.2.1. Guaranteed Bandwidth

Transmitting audio and video streams is unlike common file transfer activities because guaranteed delivery cannot be achieved by retrying the transmission; by the time the missing or corrupt packet has been identified it is too late to execute a retry operation, and stream playback is interrupted, which is unacceptable in, for example, a live concert. In some contexts large amounts of buffering can be used to provide enough delay to allow time for one or more retries; however, this is not an effective solution when live interaction is involved, and is not considered an acceptable general solution for pro audio and video. (Have you ever tried speaking into a microphone through a sound system that has an echo coming back at you? It makes it almost impossible to speak clearly.)

Providing a way to reserve a specific amount of bandwidth for a given stream is a key requirement.

1.2.2. Bounded and Consistent Latency

Latency in this context means the amount of time that passes between when a signal is sent over a stream and when it is received, for example the amount of time delay between when you speak into a microphone and when your voice emerges from the speaker. Any delay longer than about 10-15 milliseconds is noticeable by most live performers, and greater latency makes the system unusable because it prevents them from playing in time with the other players (see slide 6 of [SRP_LATENCY]).

The 15ms latency bound is made even more challenging because network-based music production with live electric instruments often uses multiple stages of signal processing connected in series (for example, from a guitar through a series of digital effects processors). In that case the latencies add, so the sum of the latencies of the individual stages must remain less than 15ms.

In some situations it is acceptable at the local location for content from the live remote site to be delayed to allow for a statistically acceptable amount of latency in order to reduce jitter. However, once the content begins playing in the local location any audio artifacts caused by the local network are unacceptable, especially in those situations where a live local performer is mixed into the feed from the remote location.

In addition to being bounded to within some predictable and acceptable amount of time (which may be 15 milliseconds or more or less depending on the application) the latency also has to be consistent. For example when playing a film consisting of a video stream and audio stream over a network, those two streams must be synchronized so that the voice and the picture match up. A common tolerance for audio/video sync is one NTSC video frame (about 33ms) and to maintain the audience perception of correct lip sync the latency needs to be consistent within some reasonable tolerance, for example 10%.

A common architecture for synchronizing multiple streams that have different paths through the network (and thus potentially different latencies) is to enable measurement of the latency of each path, and have the data sinks (for example speakers) buffer (delay) all packets on all but the slowest path. Each packet of each stream is assigned a presentation time which is based on the longest required delay. This implies that all sinks must maintain a common time reference of sufficient accuracy, which can be achieved by any of various techniques.

This type of architecture is commonly implemented using a central controller that determines path delays and arbitrates buffering delays.
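
The buffering arithmetic described above can be illustrated with a minimal sketch. It assumes the controller already holds a measured latency for each path; the function names, sink names and latency values are hypothetical and are not taken from any existing protocol.

```python
def buffering_delays_ms(path_latency_ms):
    """Extra delay each sink adds so that all sinks present in step.

    path_latency_ms maps a sink name to its measured path latency.
    """
    slowest = max(path_latency_ms.values())
    return {sink: slowest - latency
            for sink, latency in path_latency_ms.items()}


def presentation_time_ms(send_time_ms, path_latency_ms):
    """Presentation time based on the longest required delay."""
    return send_time_ms + max(path_latency_ms.values())


# Hypothetical measured path latencies in milliseconds.
paths = {"stage_left": 2, "stage_right": 3, "delay_tower": 7}
delays = buffering_delays_ms(paths)
for sink, delay in sorted(delays.items()):
    print(f"{sink}: buffer {delay} ms")
print("present at", presentation_time_ms(1000, paths), "ms")
```

In this sketch the sink on the slowest path adds no buffering, and every other sink delays its packets by the difference between its own path latency and the slowest one.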

1.2.2.1. Optimizations

The controller might also perform optimizations based on the individual path delays, for example sinks that are closer to the source can inform the controller that they can accept greater latency since they will be buffering packets to match presentation times of farther away sinks. The controller might then move a stream reservation on a short path to a longer path in order to free up bandwidth for other critical streams on that short path. See slides 3-5 of [SRP_LATENCY].

Additional optimization can be achieved in cases where sinks have differing latency requirements, for example in a live outdoor concert the speaker sinks have stricter latency requirements than the recording hardware sinks. See slide 7 of [SRP_LATENCY].

Device cost can be reduced in a system with guaranteed reservations with a small bounded latency due to the reduced requirements for buffering (i.e. memory) on sink devices. For example, a theme park might broadcast a live event across the globe via a layer 3 protocol; in such cases the size of the buffers required is proportional to the latency bounds and jitter caused by delivery, which depends on the worst case segment of the end-to-end network path. For example, on today's open Internet the latency is typically unacceptable for audio and video streaming without many seconds of buffering. In such scenarios a single gateway device at the local network that receives the feed from the remote site would provide the expensive buffering required to mask the latency and jitter issues associated with long distance delivery. Sink devices in the local location would have no additional buffering requirements, and thus no additional costs, beyond those required for delivery of local content. The sink device would be receiving packets identical to those sent by the source and would be unaware that there were any latency or jitter issues along the path.

1.3. Additional Stream Requirements

The requirements in this section are more specific yet are common to multiple audio and video industry applications.

1.3.1. Deterministic Time to Establish Streaming

Some audio systems installed in public environments (airports, hospitals) have unique requirements with regards to health, safety and fire concerns. One such requirement is a maximum of 3 seconds for a system to respond to an emergency detection and begin sending appropriate warning signals and alarms without human intervention. For this requirement to be met, the system must support a bounded and acceptable time from a notification signal to specific stream establishment. For further details see [ISO7240-16].

Similar requirements apply when the system is restarted after a power cycle, cable re-connection, or system reconfiguration.

In many cases such re-establishment of streaming state must be achieved by the peer devices themselves, i.e. without a central controller (since such a controller may only be present during initial network configuration).

Video systems introduce related requirements, for example when transitioning from one camera feed to another. Such systems currently use purpose-built hardware to switch feeds smoothly; however, there is a current initiative in the broadcast industry to switch to a packet-based infrastructure (see [STUDIO_IP] and the ESPN DC2 use case described below).

1.3.2. Use of Unused Reservations by Best-Effort Traffic

In cases where stream bandwidth is reserved but not currently used (or is under-utilized) that bandwidth must be available to best-effort (i.e. non-time-sensitive) traffic. For example a single stream may be nailed up (reserved) for specific media content that needs to be presented at different times of the day, ensuring timely delivery of that content, yet in between those times the full bandwidth of the network can be utilized for best-effort tasks such as file transfers.
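
As a rough illustration of this requirement, the sketch below computes how much link capacity is available to best-effort traffic at a given instant. It assumes a simple model in which only the bandwidth actually in use by reserved streams is subtracted from the link capacity; the function name and the example figures are hypothetical.

```python
def best_effort_capacity_mbps(link_capacity_mbps, reserved_streams):
    """Capacity currently available to best-effort traffic.

    reserved_streams is a list of (reserved_mbps, in_use_mbps) pairs.
    Idle or under-utilized reservations are lent to best-effort traffic.
    """
    total_reserved = sum(reserved for reserved, _ in reserved_streams)
    if total_reserved > link_capacity_mbps:
        raise ValueError("reservations exceed link capacity")
    # A stream can never use more than it has reserved.
    in_use = sum(min(used, reserved) for reserved, used in reserved_streams)
    return link_capacity_mbps - in_use


# Hypothetical 1000 Mbps link with two reserved streams.
idle = best_effort_capacity_mbps(1000, [(400, 0), (100, 60)])
busy = best_effort_capacity_mbps(1000, [(400, 400), (100, 100)])
print("best effort while media stream is idle:", idle, "Mbps")
print("best effort while both streams are active:", busy, "Mbps")
```

The reservation itself stays in place in both cases, so the reserved stream can resume at its scheduled time without renegotiation.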

This also addresses a concern of IT network administrators who are considering adding reserved-bandwidth traffic to their networks: that users will reserve large amounts of bandwidth and then never release it even though they are not using it, so that eventually no bandwidth is left.

1.3.3. Layer 3 Interconnecting Layer 2 Islands

As an intermediate step (short of providing guaranteed bandwidth across the open internet) it would be valuable to provide a way to connect multiple Layer 2 networks. For example layer 2 techniques could be used to create a LAN for a single broadcast studio, and several such studios could be interconnected via layer 3 links.

1.3.4. Secure Transmission

Digital Rights Management (DRM) is very important to the audio and video industries. Any time protected content is introduced into a network there are DRM concerns that must be addressed (see [CONTENT_PROTECTION]). Many aspects of DRM are outside the scope of network technology; however, there are cases when a secure link supporting authentication and encryption is required by content owners to carry their audio or video content when it is outside their own secure environment (for example, see [DCI]).

As an example, two techniques are Digital Transmission Content Protection (DTCP) and High-Bandwidth Digital Content Protection (HDCP). HDCP content is not approved for retransmission within any other type of DRM, while DTCP may be retransmitted under HDCP. Therefore if the source of a stream is outside of the network and it uses HDCP protection it is only allowed to be placed on the network with that same HDCP protection.

1.3.5. Redundant Paths

On-air and other live media streams must be backed up with redundant links that seamlessly act to deliver the content when the primary link fails for any reason. In point-to-point systems this is provided by an additional point-to-point link; the analogous requirement in a packet-based system is to provide an alternate path through the network such that no individual link can bring down the system.

1.3.6. Link Aggregation

For transmitting streams that require more bandwidth than a single link in the target network can support, link aggregation is a technique for combining (aggregating) the bandwidth available on multiple physical links to create a single logical link of the required bandwidth. However, if aggregation is to be used, the network controller (or equivalent) must be able to determine the maximum latency of any path through the aggregate link (see Bounded and Consistent Latency section above).

1.3.7. Traffic Segregation

Sink devices may be low cost devices with limited processing power. In order to not overwhelm the CPUs in these devices it is important to limit the amount of traffic that these devices must process.

As an example, consider the use of individual seat speakers in a cinema. These speakers are typically required to be low cost since a single theater can contain hundreds of seats. Discovery protocols alone in a one-thousand-seat theater can generate enough broadcast traffic to overwhelm a low-powered CPU. Thus an installation like this will benefit greatly from some type of traffic segregation that can define groups of seats to reduce traffic within each group. All seats in the theater must still be able to communicate with a central controller.

There are many techniques that can be used to support this requirement including (but not limited to) the following examples.

1.3.7.1. Packet Forwarding Rules, VLANs and Subnets

Packet forwarding rules can be used to prevent some extraneous streaming traffic from reaching potentially low-powered sink devices; however, there may be other types of broadcast traffic that should be eliminated by other means, for example VLANs or IP subnets.

1.3.7.2. Multicast Addressing (IPv4 and IPv6)

Multicast addressing is commonly used to keep bandwidth utilization of shared links to a minimum.

Because of the MAC Address forwarding nature of Layer 2 bridges it is important that a multicast MAC address is only associated with one stream. This will prevent reservations from forwarding packets from one stream down a path that has no interested sinks simply because there is another stream on that same path that shares the same multicast MAC address.

Since each multicast MAC address can represent 32 different IPv4 multicast addresses, a process must be put in place to make sure this does not occur. Requiring use of IPv6 addresses can achieve this; however, due to the continued prevalence of IPv4 installations, solutions that are effective for IPv4 are also required.
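
The 32-to-1 ambiguity follows from the standard mapping of IPv4 multicast addresses onto Ethernet MAC addresses, in which only the low-order 23 bits of the group address are copied into the MAC address under the 01:00:5e prefix. The following sketch illustrates the mapping; the function names are hypothetical helpers written for this example.

```python
import ipaddress


def ipv4_multicast_to_mac(group):
    """Map an IPv4 multicast group to its Ethernet multicast MAC address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF          # only 23 of the 28 group bits survive
    mac = (0x01005E << 24) | low23        # fixed 01:00:5e prefix
    return ":".join(f"{(mac >> shift) & 0xFF:02x}"
                    for shift in range(40, -8, -8))


def colliding_groups(group):
    """All IPv4 multicast groups that share a MAC address with 'group'."""
    base = int(ipaddress.IPv4Address(group)) & 0x7FFFFF
    # The 5 discarded bits (bits 23..27) can take 32 different values.
    return [str(ipaddress.IPv4Address((0xE << 28) | (hi << 23) | base))
            for hi in range(32)]


groups = colliding_groups("224.1.1.1")
print(ipv4_multicast_to_mac("224.1.1.1"))
print(len(groups), "groups share this MAC, for example", groups[:3])
```

An address allocation process for streams could use a check of this kind to refuse a new group address whose MAC address is already in use by another stream.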

1.4. Integration of Reserved Streams into IT Networks

A commonly cited goal of moving to a packet-based media infrastructure is that costs can be reduced by using off-the-shelf, commodity network hardware. In addition, economy of scale can be realized by combining media infrastructure with IT infrastructure. In keeping with these goals, stream reservation technology should be compatible with existing protocols, and not compromise use of the network for best-effort (non-time-sensitive) traffic.

1.5. Security Considerations

Many industries that are moving from the point-to-point world to the digital network world have little understanding of the pitfalls that they can create for themselves with improperly implemented network infrastructure. DetNet should consider ways to provide security against DoS attacks in solutions directed at these markets. Some considerations are given here as examples of ways that we can help new users avoid common pitfalls.

1.5.1. Denial of Service

One security pitfall that this author is aware of involves the use of technology that allows a presenter to throw the content from their tablet or smart phone onto the A/V system that is then viewed by all those in attendance. The facility introducing this technology was quite excited to allow such modern flexibility to those who came to speak. One thing they hadn't realized was that since no security was put in place around this technology it left a hole in the system that allowed other attendees to "throw" their own content onto the A/V system.

1.5.2. Control Protocols

Professional audio systems can include amplifiers that are capable of generating hundreds or thousands of watts of audio power which if used incorrectly can cause hearing damage to those in the vicinity. Apart from the usual care required by the systems operators to prevent such incidents, the network traffic that controls these devices must be secured (as with any sensitive application traffic). In addition, it would be desirable if the configuration protocols that are used to create the network paths used by the professional audio traffic could be designed to protect devices that are not meant to receive high-amplitude content from having such potentially damaging signals routed to them.

1.6. A State-of-the-Art Broadcast Installation Hits Technology Limits

ESPN recently constructed a state-of-the-art 194,000 sq ft, $125 million broadcast studio called DC2. The DC2 network is capable of handling 46 Tbps of throughput with 60,000 simultaneous signals. Inside the facility are 1,100 miles of fiber feeding four audio control rooms. (See details at [ESPN_DC2].)

In designing DC2 they replaced as much point-to-point technology as they possibly could with packet-based technology. They constructed seven individual studios using Layer 2 LANs (using IEEE 802.1 AVB) that were entirely effective at routing audio within the LANs, and they were very happy with the results. However, to interconnect these Layer 2 LAN islands they ended up using dedicated links because there is no standards-based routing solution available.

This is the kind of motivation we have for developing these standards: customers are ready and able to use them.

1.7. Acknowledgements

The editors would like to acknowledge the help of the following individuals and the companies they represent:

Jeff Koftinoff, Meyer Sound

Jouni Korhonen, Associate Technical Director, Broadcom

Pascal Thubert, CTAO, Cisco

Kieran Tyrrell, Sienda New Media Technologies GmbH

2. Utility Telecom Use Cases

(This section was derived from draft-wetterwald-detnet-utilities-reqs-02)

2.1. Overview

[I-D.finn-detnet-problem-statement] defines the characteristics of a deterministic flow as a data communication flow with bounded latency, extraordinarily low frame loss, and very narrow jitter. This document intends to define the utility requirements for deterministic networking.

Utility Telecom Networks

The business and technology trends that are sweeping the utility industry will drastically transform the utility business from the way it has been for many decades. At the core of many of these changes is a drive to modernize the electrical grid with an integrated telecommunications infrastructure. However, interoperability concerns, legacy networks, disparate tools, and stringent security requirements all add complexity to the grid transformation. Given the range and diversity of the requirements that should be addressed by the next generation telecommunications infrastructure, utilities need to adopt a holistic architectural approach to integrate the electrical grid with digital telecommunications across the entire power delivery chain.

Many utilities still rely on complex environments formed of multiple application-specific, proprietary networks. Information is siloed between operational areas. This prevents utility operations from realizing the operational efficiency benefits, visibility, and functional integration of operational information across grid applications and data networks. The key to modernizing grid telecommunications is to provide a common, adaptable, multi-service network infrastructure for the entire utility organization. Such a network serves as the platform for current capabilities while enabling future expansion of the network to accommodate new applications and services.

To meet this diverse set of requirements, both today and in the future, the next generation utility telecommunications network will be based on an open-standards-based IP architecture. An end-to-end IP architecture takes advantage of nearly three decades of IP technology development, facilitating interoperability across disparate networks and devices, as has already been demonstrated in many mission-critical and highly secure networks.

The IEC (International Electrotechnical Commission) and various National Committees have mandated a specific ad hoc group (AHG8) to define the migration strategy to IPv6 for all the IEC TC57 power automation standards. IPv6 is seen as the obvious future telecommunications technology for the Smart Grid. The ad hoc group disclosed its conclusions to the IEC coordination group at the end of 2014.

It is imperative that utilities participate in standards development bodies to influence the development of future solutions and to benefit from shared experiences of other utilities and vendors.

2.2. Telecommunications Trends and General Telecommunications Requirements

These general telecommunications requirements are over and above the specific requirements of the use cases that have been addressed so far. These include both current and future telecommunications related requirements that should be factored into the network architecture and design.

2.2.1. General Telecommunications Requirements

2.2.1.1. Migration to Packet-Switched Network

Throughout the world, utilities are increasingly planning for a future based on smart grid applications requiring advanced telecommunications systems. Many of these applications utilize packet connectivity for communicating information and control signals across the utility's Wide Area Network (WAN), made possible by technologies such as multiprotocol label switching (MPLS). The data that traverses the utility WAN includes:

WANs support this wide variety of traffic to and from substations, the transmission and distribution grid, generation sites, between control centers, and between work locations and data centers. To maintain this rapidly expanding set of applications, many utilities are taking steps to evolve present time-division multiplexing (TDM) based and frame relay infrastructures to packet systems. Packet-based networks are designed to provide greater functionality and higher levels of service for applications, while continuing to deliver reliability and deterministic (real-time) traffic support.

2.2.2. Applications, Use cases and traffic patterns

Among the numerous applications and use cases that a utility deploys today, many rely on high availability and deterministic behaviour of the telecommunications networks. Protection use cases and generation control are the most demanding and can't rely on a best effort approach.

2.2.2.1. Transmission use cases

Protection means not only the protection of the human operator but also the protection of the electrical equipment and the preservation of the stability and frequency of the grid. If a fault occurs on the transmission or distribution of electricity, serious harm could occur to the human operator and to very costly electrical equipment, and the grid could be perturbed, leading to blackouts. The time and reliability requirements are very strict in order to avoid dramatic impacts on the electrical infrastructure.

2.2.2.1.1. Teleprotection

The key criteria for measuring Teleprotection performance are command transmission time, dependability and security. These criteria are defined by the IEC standard 60834 as follows:

Additional key elements that may impact Teleprotection performance include the bandwidth rate of the Teleprotection system and its resiliency or failure recovery capacity. Transmission time, bandwidth utilization and resiliency are directly linked to the telecommunications equipment and the connections that are used to transfer the commands between relays.

2.2.2.1.1.1. Latency Budget Consideration

Delay requirements for utility networks may vary depending upon a number of parameters, such as the specific protection equipment used. Most power line equipment can tolerate short circuits or faults for up to approximately five power cycles before sustaining irreversible damage or affecting other segments in the network. This translates to a total fault clearance time of 100ms (for 50Hz lines). As a safety precaution, however, the actual operation time of protection systems is limited to 70-80 percent of this period, including fault recognition time, command transmission time and line breaker switching time. Some system components, such as large electromechanical switches, require a particularly long time to operate and take up the majority of the total clearance time, leaving only a 10ms window for the telecommunications part of the protection scheme, independent of the distance to travel. Given the sensitivity of the issue, new networks impose requirements that are even more stringent: IEC standard 61850 limits the transfer time for protection messages to 1/4 - 1/2 cycle or 4 - 8ms (for 60Hz lines) for the most critical messages.
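
The figures in this budget follow from the duration of a power cycle, and can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the choice of 50Hz for the clearance time is an assumption made because five cycles equal 100ms only at that frequency.

```python
def cycle_ms(frequency_hz):
    """Duration of one power cycle in milliseconds."""
    return 1000.0 / frequency_hz


def fault_clearance_ms(frequency_hz, cycles=5):
    """Total fault clearance time for a given number of power cycles."""
    return cycles * cycle_ms(frequency_hz)


def operating_window_ms(clearance_ms, low=0.70, high=0.80):
    """Operation time limited to 70-80 percent of the clearance time."""
    return (low * clearance_ms, high * clearance_ms)


def transfer_time_bounds_ms(frequency_hz):
    """Transfer time limits of 1/4 to 1/2 cycle for critical messages."""
    one_cycle = cycle_ms(frequency_hz)
    return (one_cycle / 4, one_cycle / 2)


clearance = fault_clearance_ms(50)            # assumption: 50Hz line
window = operating_window_ms(clearance)
bounds = transfer_time_bounds_ms(60)          # 60Hz line, as in the text
print(f"clearance time: {clearance:.0f} ms")
print(f"operating window: {window[0]:.0f}-{window[1]:.0f} ms")
print(f"transfer time: {bounds[0]:.2f}-{bounds[1]:.2f} ms")
```

On a 60Hz line the same five cycles last about 83ms, so the remaining budget for telecommunications is correspondingly tighter.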

2.2.2.1.1.2. Asymmetric delay

In addition to minimal transmission delay, a differential protection telecommunications channel must be synchronous, i.e., experiencing symmetrical channel delay in transmit and receive paths. This requires special attention in jitter-prone packet networks. While optimally Teleprotection systems should support zero asymmetric delay, typical legacy relays can tolerate discrepancies of up to 750us.

The main tools available for lowering delay variation below this threshold are:

2.2.2.1.1.2.1. Other traffic characteristics

2.2.2.1.1.2.2. Teleprotection network requirements

The following table captures the main network requirements (based on the IEC 61850 standard).

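Teleprotection network requirements

+-------------------------------+------------------------------------------+
| Teleprotection Requirement    | Attribute                                |
+-------------------------------+------------------------------------------+
| One way maximum delay         | 4-10 ms                                  |
| Asymmetric delay required     | Yes                                      |
| Maximum jitter                | less than 250 us (750 us for legacy IED) |
| Topology                      | Point to point, point to Multi-point     |
| Availability                  | 99.9999                                  |
| Precise timing required       | Yes                                      |
| Recovery time on node failure | less than 50ms - hitless                 |
| Performance management        | Yes, Mandatory                           |
| Redundancy                    | Yes                                      |
| Packet loss                   | 0.1% to 1%                               |
+-------------------------------+------------------------------------------+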

2.2.2.1.2. Inter-Trip Protection scheme

Inter-tripping is the controlled tripping of a circuit breaker to complete the isolation of a circuit or piece of apparatus in concert with the tripping of other circuit breakers. The main use of such schemes is to ensure that protection at both ends of a faulted circuit will operate to isolate the equipment concerned. Inter-tripping schemes use signaling to convey a trip command to remote circuit breakers to isolate circuits.

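Inter-Trip protection network requirements

+-----------------------------------+--------------------------------------+
| Inter-Trip protection Requirement | Attribute                            |
+-----------------------------------+--------------------------------------+
| One way maximum delay             | 5 ms                                 |
| Asymmetric delay required         | No                                   |
| Maximum jitter                    | Not critical                         |
| Topology                          | Point to point, point to Multi-point |
| Bandwidth                         | 64 Kbps                              |
| Availability                      | 99.9999                              |
| Precise timing required           | Yes                                  |
| Recovery time on node failure     | less than 50ms - hitless             |
| Performance management            | Yes, Mandatory                       |
| Redundancy                        | Yes                                  |
| Packet loss                       | 0.1%                                 |
+-----------------------------------+--------------------------------------+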

2.2.2.1.3. Current Differential Protection Scheme

Current differential protection is commonly used for line protection, and is typical for protecting parallel circuits. A main advantage of differential protection is that, compared to overcurrent protection, it allows only the faulted circuit to be de-energized in case of a fault. At both ends of the line, the current is measured by the differential relays, and based on Kirchhoff's law, both relays will trip the circuit breaker if the current going into the line does not equal the current going out of the line. This type of protection scheme assumes some form of communications being present between the relays at both ends of the line, to allow both relays to compare measured current values.

For example, with two parallel lines, a fault in line 1 will cause overcurrent to flow in both lines, but because the current in line 2 is a through-flowing current, it is measured equal at both ends of the line, and therefore the differential relays on line 2 will not trip line 2. Line 1 will be tripped, as the relays will not measure the same currents at both ends of the line.

Line differential protection schemes assume a very low telecommunications delay between both relays, often as low as 5ms. Moreover, as those systems are often not time-synchronized, they also assume symmetric telecommunications paths with constant delay, which allows comparing current measurement values taken at the exact same time.

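Current Differential Protection requirements

+---------------------------------------------+------------------------------------------+
| Current Differential protection Requirement | Attribute                                |
+---------------------------------------------+------------------------------------------+
| One way maximum delay                       | 5 ms                                     |
| Asymmetric delay required                   | Yes                                      |
| Maximum jitter                              | less than 250 us (750 us for legacy IED) |
| Topology                                    | Point to point, point to Multi-point     |
| Bandwidth                                   | 64 Kbps                                  |
| Availability                                | 99.9999                                  |
| Precise timing required                     | Yes                                      |
| Recovery time on node failure               | less than 50ms - hitless                 |
| Performance management                      | Yes, Mandatory                           |
| Redundancy                                  | Yes                                      |
| Packet loss                                 | 0.1%                                     |
+---------------------------------------------+------------------------------------------+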

2.2.2.1.4. Distance Protection Scheme

The Distance (Impedance Relay) protection scheme is based on voltage and current measurements. A fault on a circuit will generally create a sag in the voltage level. If the ratio of voltage to current measured at the protection relay terminals, which equates to an impedance element, falls within a set threshold, the circuit breaker will operate. The operating characteristics of this protection are based on the line characteristics. This means that when a fault appears on the line, the impedance setting in the relay is compared to the apparent impedance of the line from the relay terminals to the fault. If the apparent impedance is determined to be below the relay setting, it is determined that the fault is within the zone of protection. When the transmission line length is under a minimum length, distance protection becomes more difficult to coordinate. In these instances the best choice of protection is current differential protection.
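The statement that the breaker operates when the ratio of voltage to current falls within a set threshold can be sketched in Python as follows. The function names, the zone reach of 20 ohms, and the sample phasors are hypothetical values chosen for illustration; real relays evaluate the complex impedance against a characteristic in the R-X plane, whereas only the magnitude is compared here.

```python
def apparent_impedance(voltage: complex, current: complex) -> complex:
    """Impedance seen from the relay terminals, Z = V / I (phasors)."""
    return voltage / current


def fault_in_zone(voltage: complex, current: complex,
                  zone_reach_ohms: float) -> bool:
    """True if the apparent impedance magnitude is within the zone reach."""
    return abs(apparent_impedance(voltage, current)) < zone_reach_ohms


# Normal load: full voltage, moderate current, so the impedance is high.
normal = fault_in_zone(voltage=66_000 + 0j, current=400 + 0j,
                       zone_reach_ohms=20.0)
# Fault: the voltage sags and the current rises, so the impedance drops.
faulted = fault_in_zone(voltage=30_000 + 0j, current=3_000 + 0j,
                        zone_reach_ohms=20.0)
print(normal, faulted)
```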

Distance Protection requirements

Requirement                    Attribute
-----------------------------  ----------------------------------------
One way maximum delay          5 ms
Asymmetric delay Required      No
Maximum jitter                 Not critical
Topology                       Point to point, point to Multi-point
Bandwidth                      64 Kbps
Availability                   99.9999%
Precise timing required        Yes
Recovery time on node failure  less than 50 ms - hitless
Performance management         Yes, Mandatory
Redundancy                     Yes
Packet loss                    0.1%

2.2.2.1.5. Inter-Substation Protection Signaling

This use case describes the exchange of Sampled Value and/or GOOSE (Generic Object Oriented Substation Events) messages between Intelligent Electronic Devices (IEDs) in two substations for protection and tripping coordination. The two IEDs are in a master-slave mode.

The Current Transformer or Voltage Transformer (CT/VT) in one substation sends the sampled analog voltage or current value to the Merging Unit (MU) over hard wire. The merging unit sends the time-synchronized 61850-9-2 sampled values to the slave IED. The slave IED forwards the information to the master IED in the other substation. The master IED makes the determination (for example, based on sampled value differentials) to send a trip command to the originating IED. Once the slave IED/Relay receives the GOOSE trip for breaker tripping, it opens the breaker. It then sends a confirmation message back to the master. All data exchanges between IEDs are through Sampled Value and/or GOOSE messages.

Inter-Substation Protection requirements

Requirement                    Attribute
-----------------------------  ----------------------------------------
One way maximum delay          5 ms
Asymmetric delay Required      No
Maximum jitter                 Not critical
Topology                       Point to point, point to Multi-point
Bandwidth                      64 Kbps
Availability                   99.9999%
Precise timing required        Yes
Recovery time on node failure  less than 50 ms - hitless
Performance management         Yes, Mandatory
Redundancy                     Yes
Packet loss                    1%

2.2.2.1.6. Intra-Substation Process Bus Communications

This use case describes the data flow from the CT/VT to the IEDs in the substation via the Merging Unit (MU). The CT/VT in the substation sends the sampled value (analog voltage or current) to the MU over hard wire. The MU sends the time-synchronized 61850-9-2 sampled values to the IEDs in the substation in GOOSE message format. The GPS Master Clock can send 1PPS or IRIG-B format to the MU through a serial port, or use the IEEE 1588 protocol via the network. Process bus communication using 61850 simplifies connectivity within the substation, removing the requirement for multiple serial connections and the slow serial bus architectures that are typically used. This also ensures increased flexibility and increased speed with the use of multicast messaging between multiple devices.

Intra-Substation Protection requirements

Requirement                    Attribute
-----------------------------  ----------------------------------------
One way maximum delay          5 ms
Asymmetric delay Required      No
Maximum jitter                 Not critical
Topology                       Point to point, point to Multi-point
Bandwidth                      64 Kbps
Availability                   99.9999%
Precise timing required        Yes
Recovery time on node failure  less than 50 ms - hitless
Performance management         Yes, Mandatory
Redundancy                     Yes - No
Packet loss                    0.1%

2.2.2.1.7. Wide Area Monitoring and Control Systems

The application of synchrophasor measurement data from Phasor Measurement Units (PMUs) to Wide Area Monitoring and Control Systems promises to provide important new capabilities for improving system stability. Access to PMU data enables more timely situational awareness over larger portions of the grid than has been possible historically with normal SCADA (Supervisory Control and Data Acquisition) data. Handling the volume and real-time nature of synchrophasor data presents unique challenges for existing application architectures. A Wide Area Management System (WAMS) makes it possible for the condition of the bulk power system to be observed and understood in real time so that protective, preventative, or corrective action can be taken. Because of the very high sampling rate of measurements and the strict requirement for time synchronization of the samples, WAMS has stringent telecommunications requirements in an IP network, which are captured in the following table:

WAMS Special Communication Requirements

Requirement                    Attribute
-----------------------------  ----------------------------------------
One way maximum delay          50 ms
Asymmetric delay Required      No
Maximum jitter                 Not critical
Topology                       Point to point, point to Multi-point, Multi-point to Multi-point
Bandwidth                      100 Kbps
Availability                   99.9999%
Precise timing required        Yes
Recovery time on node failure  less than 50 ms - hitless
Performance management         Yes, Mandatory
Redundancy                     Yes
Packet loss                    1%

2.2.2.1.8. IEC 61850 WAN engineering guidelines requirement classification

The IEC (International Electrotechnical Commission) has recently published a Technical Report that offers guidelines on how to define and deploy Wide Area Networks for the interconnection of electric substations, generation plants, and SCADA operation centers. IEC 61850-90-12 classifies WAN communication requirements into 4 classes. The following table summarizes these requirements:

61850-90-12 Communication Requirements; Courtesy of IEC

WAN Requirement    Class WA                  Class WB           Class WC             Class WD
-----------------  ------------------------  -----------------  -------------------  ---------------
Application field  EHV (Extra High Voltage)  HV (High Voltage)  MV (Medium Voltage)  General purpose
Latency            5 ms                      10 ms              100 ms               > 100 ms
Jitter             10 us                     100 us             1 ms                 10 ms
Latency Asymmetry  100 us                    1 ms               10 ms                100 ms
Time Accuracy      1 us                      10 us              100 us               10 to 100 ms
Bit Error rate     10^-7 to 10^-6            10^-5 to 10^-4     10^-3
Unavailability     10^-7 to 10^-6            10^-5 to 10^-4     10^-3
Recovery delay     Zero                      50 ms              5 s                  50 s
Cyber security     Extremely high            High               Medium               Medium
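As an illustration of how the classification above might be applied, the following Python sketch finds the most stringent class whose latency, jitter, and latency asymmetry bounds are met by a measured path. The function name `strictest_class_met` is hypothetical, only three of the table's rows are checked, and treating the "> 100 ms" latency of class WD as "no upper bound" is an assumption of this example.

```python
from typing import Optional

# (class, max latency, max jitter, max latency asymmetry), all in seconds.
# Values are copied from the table; the unbounded WD latency is an
# assumption made for this sketch.
WAN_CLASSES = [
    ("WA", 0.005, 10e-6, 100e-6),
    ("WB", 0.010, 100e-6, 1e-3),
    ("WC", 0.100, 1e-3, 10e-3),
    ("WD", float("inf"), 10e-3, 100e-3),
]


def strictest_class_met(latency_s: float, jitter_s: float,
                        asymmetry_s: float) -> Optional[str]:
    """Return the first (most stringent) class whose bounds are all met,
    or None if the path does not even satisfy class WD."""
    for name, max_latency, max_jitter, max_asymmetry in WAN_CLASSES:
        if (latency_s <= max_latency and jitter_s <= max_jitter
                and asymmetry_s <= max_asymmetry):
            return name
    return None


print(strictest_class_met(0.004, 5e-6, 50e-6))
print(strictest_class_met(0.008, 50e-6, 0.5e-3))
```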

2.2.2.2. Distribution use case

2.2.2.2.1. Fault Location Isolation and Service Restoration (FLISR)

As the name implies, Fault Location, Isolation, and Service Restoration (FLISR) refers to the ability to automatically locate the fault, isolate the fault, and restore service in the distribution network. It is a self-healing feature whose purpose is to minimize the impact of faults by serving portions of the loads on the affected circuit by switching to other circuits. It reduces the number of customers that experience a sustained power outage by reconfiguring distribution circuits. This will likely be the first widespread application of distributed intelligence in the grid. Secondary substations can be connected to multiple primary substations. Normally, static power switch statuses (open/closed) in the network dictate the power flow to secondary substations. Reconfiguring the network in the event of a fault is typically done manually on site by operating switchgear to energize/de-energize alternate paths. Automating the operation of substation switchgear allows the utility to have a more dynamic network where the flow of power can be altered under fault conditions but also during times of peak load. It allows the utility to shift peak loads around the network or, to be more precise, to alter the configuration of the network to move loads between different primary substations. The FLISR capability can be enabled in two modes:

  • Managed centrally from DMS (Distribution Management System), or
  • Executed locally through distributed control via intelligent switches and fault sensors.

There are 3 distinct sub-functions that are performed:

1. Fault Location Identification

This sub-function is initiated by SCADA inputs, such as lockouts and fault indications/location, by input from the Outage Management System (OMS), and in the future by inputs from fault-predicting devices. It determines the specific protective device that has cleared the sustained fault, identifies the de-energized sections, and estimates the probable location of the actual or the expected fault. It distinguishes faults cleared by controllable protective devices from those cleared by fuses, and identifies momentary outages and inrush/cold load pick-up currents. This step is also referred to as Fault Detection Classification and Location (FDCL). This step helps to expedite the restoration of faulted sections through fast fault location identification and improved diagnostic information available for crew dispatch. It also provides visualization of fault information to design and implement a switching plan to isolate the fault.

2. Fault Type Determination

I. Indicates faults cleared by controllable protective devices by distinguishing between:

a. Faults cleared by fuses

b. Momentary outages

c. Inrush/cold load current

II. Determines the faulted sections based on SCADA fault indications and protection lockout signals

III. Increases the accuracy of the fault location estimation based on SCADA fault current measurements and real-time fault analysis

3. Fault Isolation and Service Restoration

Once the location and type of the fault have been pinpointed, the system will attempt to isolate the fault and restore the non-faulted sections of the network. This can have three modes of operation:

I. Closed-loop mode : This is initiated by the Fault Location sub-function. It generates a switching order (i.e., a sequence of switching operations) for the remotely controlled switching devices to isolate the faulted section and restore service to the non-faulted sections. The switching order is automatically executed via SCADA.

II. Advisory mode : This is initiated by the Fault Location sub-function. It generates a switching order for remotely and manually controlled switching devices to isolate the faulted section and restore service to the non-faulted sections. The switching order is presented to the operator for approval and execution.

III. Study mode : This is initiated by the operator. It analyzes a saved case modified by the operator, and generates a switching order under the operating conditions specified by the operator.
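As an illustration of what a switching order might look like, the following Python sketch derives one for a deliberately simple feeder. Everything about the model is an assumption made for this example and is not taken from any FLISR product: sections in series fed from one primary substation, a switch named SW<k> between sections k and k+1, a feeder breaker that is assumed to have already tripped, and a normally open tie switch named TIE towards a second primary substation.

```python
from typing import List


def switching_order(num_sections: int, faulted: int) -> List[str]:
    """Switching order for a hypothetical radial feeder.

    Sections 1..num_sections are in series. The feeder breaker (already
    tripped by the fault) feeds section 1, switch SW<k> sits between
    sections k and k+1, and TIE connects the last section to a second
    primary substation.
    """
    if not 1 <= faulted <= num_sections:
        raise ValueError("faulted section out of range")
    order = []
    # Isolation: open the switches on both sides of the faulted section.
    if faulted > 1:
        order.append(f"OPEN SW{faulted - 1}")
    if faulted < num_sections:
        order.append(f"OPEN SW{faulted}")
    # Restoration: re-energize upstream sections from the original
    # substation and downstream sections through the tie switch.
    if faulted > 1:
        order.append("CLOSE FEEDER_BREAKER")
    if faulted < num_sections:
        order.append("CLOSE TIE")
    return order


print(switching_order(num_sections=4, faulted=2))
```

In closed-loop mode such a list would be executed automatically via SCADA, while in advisory mode it would be presented to the operator for approval.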

With the increasing volume of data collected through fault sensors, utilities will use Big Data query and analysis tools to study outage information in order to anticipate and prevent outages. By detecting failure patterns and their correlation with asset age, type, load profiles, time of day, weather conditions, and other factors, they can discover the conditions that lead to faults and take the necessary preventive and corrective measures.

FLISR Communication Requirements

Requirement                    Attribute
-----------------------------  ----------------------------------------
One way maximum delay          80 ms
Asymmetric delay Required      No
Maximum jitter                 40 ms
Topology                       Point to point, point to Multi-point, Multi-point to Multi-point
Bandwidth                      64 Kbps
Availability                   99.9999%
Precise timing required        Yes
Recovery time on node failure  Depends on customer impact
Performance management         Yes, Mandatory
Redundancy                     Yes
Packet loss                    0.1%

2.2.2.3. Generation use case

2.2.2.3.1. Frequency Control / Automatic Generation Control (AGC)

The system frequency should be maintained within a very narrow band. Deviations from the acceptable frequency range are detected and forwarded to the Load Frequency Control (LFC) system so that required up or down generation increase / decrease pulses can be sent to the power plants for frequency regulation. The trend in system frequency is a measure of mismatch between demand and generation, and is a necessary parameter for load control in interconnected systems.

Automatic generation control (AGC) is a system for adjusting the power output of generators at different power plants, in response to changes in the load. Since a power grid requires that generation and load closely balance moment by moment, frequent adjustments to the output of generators are necessary. The balance can be judged by measuring the system frequency; if it is increasing, more power is being generated than used, and all machines in the system are accelerating. If the system frequency is decreasing, more demand is on the system than the instantaneous generation can provide, and all generators are slowing down.

Where the grid has tie lines to adjacent control areas, automatic generation control helps maintain the power interchanges over the tie lines at the scheduled levels. The AGC takes into account various parameters including the most economical units to adjust, the coordination of thermal, hydroelectric, and other generation types, and even constraints related to the stability of the system and capacity of interconnections to other power grids.

For the purpose of AGC, static frequency measurements and averaging methods are used to get a more precise measure of system frequency in steady-state conditions.
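A minimal sketch of this averaging step, in Python, is shown below. It follows the reasoning given earlier in this section: a frequency below nominal means demand exceeds generation, and a frequency above nominal means the opposite. The nominal frequency, the deadband, and the function name `generation_pulse` are hypothetical values chosen for illustration; a real LFC system is considerably more elaborate.

```python
from statistics import mean
from typing import Sequence

NOMINAL_HZ = 60.0    # assumption; 50.0 in many systems
DEADBAND_HZ = 0.02   # hypothetical tolerance around the nominal value


def generation_pulse(samples_hz: Sequence[float]) -> str:
    """Average the frequency samples and decide which pulse to send."""
    average = mean(samples_hz)
    if average < NOMINAL_HZ - DEADBAND_HZ:
        return "raise"   # demand exceeds generation
    if average > NOMINAL_HZ + DEADBAND_HZ:
        return "lower"   # generation exceeds demand
    return "hold"


print(generation_pulse([59.95, 59.96, 59.94]))
print(generation_pulse([60.00, 60.01, 59.99]))
```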

During disturbances, more real-time dynamic measurements of system frequency are taken using PMUs, especially when different areas of the system exhibit different frequencies. But that is outside the scope of this use case.

FCAG (Frequency Control Automatic Generation) Communication Requirements

Requirement                    Attribute
-----------------------------  ----------------------------------------
One way maximum delay          500 ms
Asymmetric delay Required      No
Maximum jitter                 Not critical
Topology                       Point to point
Bandwidth                      20 Kbps
Availability                   99.999%
Precise timing required        Yes
Recovery time on node failure  N/A
Performance management         Yes, Mandatory
Redundancy                     Yes
Packet loss                    1%
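To give a sense of scale for the availability figures in the requirement tables, the following Python sketch converts an availability value into a maximum downtime per year. It assumes that the figures are percentages of time measured over a 365-day year, which the tables do not state explicitly; the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # non-leap year


def max_downtime_seconds_per_year(availability_percent: float) -> float:
    """Longest cumulative outage per year allowed by the availability."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


five_nines = max_downtime_seconds_per_year(99.999)   # about 315 s
six_nines = max_downtime_seconds_per_year(99.9999)   # about 32 s
print(round(five_nines, 1), round(six_nines, 1))
```

Under these assumptions, the step from 99.999 to 99.9999 reduces the yearly outage budget from roughly five minutes to roughly half a minute.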

2.2.3. Specific Network topologies of Smart Grid Applications

Utilities often have very large private telecommunications networks that cover an entire territory or country. The main purpose of the network, until now, has been to support transmission network monitoring, control, and automation, remote control of generation sites, and providing FCAPS (Fault, Configuration, Accounting, Performance, Security) services from centralized network operation centers.

Going forward, one network will support operation and maintenance of electrical networks (generation, transmission, and distribution), voice and data services for tens of thousands of employees and for exchange with neighboring interconnections, and administrative services. To meet those requirements, a utility may deploy several physical networks leveraging different technologies across the country: an optical network and a microwave network, for instance. Each protection and automation system between two points has two telecommunications circuits, one on each network. Path diversity between two substations is key. Regardless of the event type (hurricane, ice storm, etc.), one path shall stay available so the SPS can still operate.

In the optical network, signals are transmitted over tens of thousands of circuits using fiber optic links, microwave, and telephone cables. This network is the nervous system of the utility's power transmission operations. The optical network represents tens of thousands of km of cable deployed along the power lines.

Because of the vast distances between transmission substations (for example, as far as 280 km apart), the fiber signal can be amplified to reach a distance of 280 km without attenuation.

2.2.4. Precision Time Protocol

Some utilities do not use GPS clocks in generation substations. One of the main reasons is that some of the generation plants are 30 to 50 meters deep underground, where the GPS signal can be weak and unreliable. Instead, atomic clocks are used. The clocks are synchronized with one another. Rubidium clocks provide clock and 1 ms timestamps for IRIG-B. Some companies plan to transition to the Precision Time Protocol (IEEE 1588), distributing the synchronization signal over the IP/MPLS network.

The Precision Time Protocol (PTP) is defined in IEEE standard 1588. PTP is applicable to distributed systems consisting of one or more nodes, communicating over a network. Nodes are modeled as containing a real-time clock that may be used by applications within the node for various purposes such as generating time-stamps for data or ordering events managed by the node. The protocol provides a mechanism for synchronizing the clocks of participating nodes to a high degree of accuracy and precision.

PTP operates based on the following assumptions :

  • It is assumed that the network eliminates cyclic forwarding of PTP messages within each communication path (e.g., by using a spanning tree protocol). PTP eliminates cyclic forwarding of PTP messages between communication paths.
  • PTP is tolerant of an occasional missed message, duplicated message, or message that arrived out of order. However, PTP assumes that such impairments are relatively rare.
  • PTP was designed assuming a multicast communication model. PTP also supports a unicast communication model as long as the behavior of the protocol is preserved.
  • Like all message-based time transfer protocols, PTP time accuracy is degraded by asymmetry in the paths taken by event messages. Asymmetry is not detectable by PTP; however, if it is known, PTP can correct for it.

A time-stamp event is generated at the time of transmission and reception of any event message. The time-stamp event occurs when the message's timestamp point crosses the boundary between the node and the network.
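The effect of path asymmetry can be illustrated with the delay request-response arithmetic of IEEE 1588, which combines four such time-stamps. The Python sketch below is illustrative only; the function names, the turnaround time, and the sample values (all in microseconds) are assumptions made for this example.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Delay request-response arithmetic.

    t1: master clock time when Sync is sent
    t2: slave clock time when Sync is received
    t3: slave clock time when Delay_Req is sent
    t4: master clock time when Delay_Req is received
    """
    mean_path_delay = ((t2 - t1) + (t4 - t3)) / 2
    offset_from_master = ((t2 - t1) - (t4 - t3)) / 2
    return offset_from_master, mean_path_delay


def exchange(true_offset, delay_ms, delay_sm, t1=0, turnaround=150):
    """Generate the four time-stamps for a slave whose clock is ahead of
    the master by true_offset, given the master-to-slave (delay_ms) and
    slave-to-master (delay_sm) one-way delays."""
    t2 = t1 + delay_ms + true_offset   # read on the slave clock
    t3 = t2 + turnaround               # read on the slave clock
    t4 = t3 - true_offset + delay_sm   # read on the master clock
    return t1, t2, t3, t4


symmetric = ptp_offset_and_delay(*exchange(100, 50, 50))
asymmetric = ptp_offset_and_delay(*exchange(100, 70, 30))
print(symmetric)    # the true offset of 100 is recovered
print(asymmetric)   # the offset is wrong by half the asymmetry
```

With a 40 us difference between the two directions, the computed offset is off by 20 us while the computed mean path delay is unchanged, which is why the protocol cannot detect the asymmetry on its own.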

IEC 61850 will recommend the use of the IEEE PTP 1588 Utility Profile (as defined in IEC 62439-3 Annex B), which offers the support of redundant attachment of clocks to Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR) networks.

2.3. IANA Considerations

This memo includes no request to IANA.

2.4. Security Considerations

2.4.1. Current Practices and Their Limitations

Grid monitoring and control devices are already targets for cyber attacks, and legacy telecommunications protocols have many intrinsic network-related vulnerabilities. DNP3, Modbus, PROFIBUS/PROFINET, and other protocols are designed around a common paradigm of request and respond. Each protocol is designed for a master device such as an HMI (Human Machine Interface) system to send commands to subordinate slave devices to retrieve data (reading inputs) or control (writing to outputs). Because many of these protocols lack authentication, encryption, or other basic security measures, they are prone to network-based attacks, allowing a malicious actor or attacker to utilize the request-and-respond system as a mechanism for command-and-control-like functionality. Specific security concerns common to most industrial control protocols, including utility telecommunication protocols, include the following:

  • Network or transport errors (e.g. malformed packets or excessive latency) can cause protocol failure.
  • Protocol commands may be available that are capable of forcing slave devices into inoperable states, including powering off devices, forcing them into a listen-only state, or disabling alarming.
  • Protocol commands may be available that are capable of restarting communications and otherwise interrupting processes.
  • Protocol commands may be available that are capable of clearing, erasing, or resetting diagnostic information such as counters and diagnostic registers.
  • Protocol commands may be available that are capable of requesting sensitive information about the controllers, their configurations, or other need-to-know information.
  • Most protocols are application layer protocols transported over TCP; therefore it is easy to transport commands over non-standard ports or inject commands into authorized traffic flows.
  • Protocol commands may be available that are capable of broadcasting messages to many devices at once (i.e. a potential DoS).
  • Protocol commands may be available to query the device network to obtain defined points and their values (i.e. a configuration scan).
  • Protocol commands may be available that will list all available function codes (i.e. a function scan).
  • Bump in the wire (BITW) solutions : A hardware device is added to provide IPsec services between two routers that are not capable of IPsec functions. This special IPsec device will then intercept outgoing datagrams, add IPsec protection to them, and strip it off incoming datagrams. BITW can add IPsec to legacy hosts and can retrofit non-IPsec routers to provide security benefits. The disadvantages are complexity and cost.

These inherent vulnerabilities, along with increasing connectivity between IT and OT networks, make network-based attacks very feasible. Simple injection of malicious protocol commands provides control over the target process. Altering legitimate protocol traffic can also alter information about a process and disrupt the legitimate controls that are in place over that process. A man-in-the-middle attack could provide both control over a process and misrepresentation of data back to operator consoles.

2.4.2. Security Trends in Utility Networks

Although advanced telecommunications networks can assist in transforming the energy industry, playing a critical role in maintaining high levels of reliability, performance, and manageability, they also introduce the need for an integrated security infrastructure. Many of the technologies being deployed to support smart grid projects such as smart meters and sensors can increase the vulnerability of the grid to attack. Top security concerns for utilities migrating to an intelligent smart grid telecommunications platform center on the following trends:

  • Integration of distributed energy resources
  • Proliferation of digital devices to enable management, automation, protection, and control
  • Regulatory mandates to comply with standards for critical infrastructure protection
  • Migration to new systems for outage management, distribution automation, condition-based maintenance, load forecasting, and smart metering
  • Demand for new levels of customer service and energy management

This development of a diverse set of networks to support the integration of microgrids, open-access energy competition, and the use of network-controlled devices is driving the need for a converged security infrastructure for all participants in the smart grid, including utilities, energy service providers, large commercial and industrial, as well as residential customers. Securing the assets of electric power delivery systems, from the control center to the substation, to the feeders and down to customer meters, requires an end-to-end security infrastructure that protects the myriad of telecommunications assets used to operate, monitor, and control power flow and measurement. Cyber security refers to all the security issues in automation and telecommunications that affect any functions related to the operation of the electric power systems. Specifically, it involves the concepts of:

  • Integrity : data cannot be altered undetectably
  • Authenticity : the telecommunications parties involved must be validated as genuine
  • Authorization : only requests and commands from the authorized users can be accepted by the system
  • Confidentiality : data must not be accessible to any unauthenticated users
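The first two concepts can be illustrated with a keyed message authentication code. The Python sketch below uses the standard library `hmac` module; the pre-shared key, the command strings, and the function names `sign` and `verify` are hypothetical and serve only as an illustration. It is not a description of how any utility protocol secures its messages.

```python
import hashlib
import hmac

SHARED_KEY = b"example-key-not-for-production"  # hypothetical pre-shared key


def sign(command: bytes, key: bytes = SHARED_KEY) -> bytes:
    """Compute an HMAC-SHA256 tag over the command."""
    return hmac.new(key, command, hashlib.sha256).digest()


def verify(command: bytes, tag: bytes, key: bytes = SHARED_KEY) -> bool:
    """Check the tag in constant time; fails if the command was altered
    or if the sender did not hold the same key."""
    return hmac.compare_digest(sign(command, key), tag)


tag = sign(b"OPEN BREAKER 12")
print(verify(b"OPEN BREAKER 12", tag))   # genuine command
print(verify(b"OPEN BREAKER 13", tag))   # command altered in transit
```

A tag of this kind addresses integrity and authenticity only; authorization and confidentiality need separate mechanisms.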

When designing and deploying new smart grid devices and telecommunications systems, it's imperative to understand the various impacts of these new components under a variety of attack situations on the power grid. Consequences of a cyber attack on the grid telecommunications network can be catastrophic. This is why security for smart grid is not just an ad hoc feature or product; it's a complete framework integrating both physical and cyber security requirements and covering the entire smart grid network from generation to distribution. Security has therefore become one of the main foundations of the utility telecom network architecture and must be considered at every layer with a defense-in-depth approach. Migrating to IP-based protocols is key to addressing these challenges for two reasons:

1. IP enables a rich set of features and capabilities to enhance the security posture

2. IP is based on open standards, which allows interoperability between different vendors and products, driving down the costs associated with implementing security solutions in OT networks.

Securing OT (Operational Technology) telecommunications over packet-switched IP networks follows the same principles that are foundational for securing the IT infrastructure, i.e., consideration must be given to enforcing electronic access control for both person-to-machine and machine-to-machine communications, and providing the appropriate levels of data privacy, device and platform integrity, and threat detection and mitigation.

2.5. Acknowledgements

Faramarz Maghsoodlou, Ph. D. IoT Connected Industries and Energy Practice Cisco

Pascal Thubert, CTAO Cisco

3. Building Automation Systems Use Cases

3.1. Introduction

A Building Automation System (BAS) is a system that manages various equipment and sensors in buildings (e.g., heating, cooling, and ventilating) to improve residents' comfort, reduce energy consumption, and respond automatically to failures and emergencies. For example, a BAS measures the temperature of a room by using various sensors and then controls the HVAC (Heating, Ventilating, and Air Conditioning) system automatically to maintain the temperature level and minimize the energy consumption.

There are typically two layers of network in a BAS. The upper one is called the management network and the lower one is called the field network. In management networks, an IP-based communication protocol is used, while in field networks, non-IP-based communication protocols (a.k.a. field protocols) are mainly used.

There are many field protocols used in today's deployments, in which some medium access control and physical layer protocols are standards-based and others are proprietary. Therefore the BAS needs to have multiple MAC/PHY modules and interfaces to make use of devices based on multiple field protocols. This situation not only makes the BAS more expensive, with a long development cycle for multiple devices, but also creates the issue of vendor lock-in with multiple types of management applications.

The other issue with some of the existing field networks and protocols is security. When these protocols and networks were developed, it was assumed that the field networks were isolated physically from external networks, and therefore network and protocol security was not a concern. However, in today's world many BASes are managed remotely and are connected to shared IP networks, and it is also not uncommon for the same IT infrastructure to be used, be it in office, home, or enterprise networks. Adding network and protocol security to an existing system is a non-trivial task.

This document first describes the BAS functionalities, its architecture and current deployment models. Then we discuss the use cases and field network requirements that need to be satisfied by deterministic networking.

3.2. BAS Functionality

A Building Automation System (BAS) is a system that manages various devices in buildings automatically. A BAS primarily performs the following functions:

  • Measures states of devices at regular intervals. For example, temperature, humidity, or illuminance of rooms, on/off state of room lights, open/close state of doors, fan speed, valve state, running mode of HVAC, and its power consumption.
  • Stores the measured data into a database (Note: The database keeps the data for several years).
  • Provides the measured data for BAS operators for visualization.
  • Generates alarms for abnormal state of devices (e.g., calling operator's cellular phone, sending an e-mail to operators and so on).
  • Controls devices on demand.
  • Controls devices with a pre-defined operation schedule (e.g., turn off room lights at 10:00 PM).

3.3. BAS Architecture

A typical BAS architecture is described below in Figure 1. There are several elements in a BAS.

        +----------------------------+
        |                            |
        |       BMS        HMI       |
        |        |          |        |
        |  +----------------------+  |
        |  |  Management Network  |  |
        |  +----------------------+  |
        |        |          |        |
        |        LC         LC       |
        |        |          |        |
        |  +----------------------+  |
        |  |     Field Network    |  |
        |  +----------------------+  |
        |     |     |     |     |    |
        |    Dev   Dev   Dev   Dev   |
        |                            |
        +----------------------------+

        BMS := Building Management Server
        HMI := Human Machine Interface
        LC  := Local Controller
        

Figure 1: BAS architecture

Human Machine Interface (HMI): It is commonly a computing platform (e.g., desktop PC) used by operators. Operators perform the following operations through HMI.

  • Monitoring devices: HMI displays measured device states. For example, latest device states, a history chart of states, a popup window with an alert message. Typically, the measured device states are stored in BMS (Building Management Server).
  • Controlling devices: HMI provides the ability to control the devices. For example, turn on a room light, set a target temperature for the HVAC. Several parameters (a target device, a control value, etc.) can be set by the operators, which the HMI then sends to an LC (Local Controller) via the management network.
  • Configuring an operational schedule: HMI provides scheduling capability through which operational schedule is defined. For example, schedule includes 1) a time to control, 2) a target device to control, and 3) a control value. A specific operational example could be turn off all room lights in the building at 10:00 PM. This schedule is typically stored in BMS.
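The three elements of an operational schedule listed above can be sketched as a small data structure. The Python below is a minimal illustration only; the class name `ScheduleEntry`, the helper `due_entries`, the device identifier, and the matching to the minute are assumptions made for this example and are not taken from any BAS product.

```python
from dataclasses import dataclass
from datetime import time
from typing import List


@dataclass(frozen=True)
class ScheduleEntry:
    at: time      # 1) a time to control
    target: str   # 2) a target device to control
    value: str    # 3) a control value


def due_entries(schedule: List[ScheduleEntry], now: time) -> List[ScheduleEntry]:
    """Entries whose control time matches the current hour and minute."""
    return [entry for entry in schedule
            if (entry.at.hour, entry.at.minute) == (now.hour, now.minute)]


# "Turn off all room lights in the building at 10:00 PM."
schedule = [ScheduleEntry(at=time(22, 0), target="all-room-lights", value="off")]
print(due_entries(schedule, time(22, 0, 30)))
print(due_entries(schedule, time(21, 59)))
```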

Building Management Server (BMS) collects device states from LCs (Local Controllers) and stores them in a database. According to its configuration, BMS executes the following operations automatically.

  • BMS collects device states from LCs at regular intervals and then stores the information in a database.
  • BMS sends control values to LCs according to a pre-configured schedule.
  • BMS sends an alarm signal to operators if it detects abnormal device states. For example, turning on a red lamp, calling operators' cellular phone, sending an e-mail to operators.
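The collection and alarm operations above can be sketched as a single polling step. The Python below is a minimal illustration under stated assumptions: the function name `poll_once`, the in-memory list standing in for the database, the device names, and the alarm limits are all hypothetical, and the call that reads states from an LC is replaced by a stub.

```python
from typing import Callable, Dict, List, Tuple


def poll_once(read_states: Callable[[], Dict[str, float]],
              limits: Dict[str, Tuple[float, float]],
              database: List[Dict[str, float]]) -> List[str]:
    """Collect device states once, store them, and return alarm messages
    for every state outside its configured (low, high) limits."""
    states = read_states()
    database.append(states)
    alarms = []
    for device, value in states.items():
        low, high = limits.get(device, (float("-inf"), float("inf")))
        if not low <= value <= high:
            alarms.append(f"ALARM {device}={value}")
    return alarms


# Stub standing in for a read from a Local Controller.
def read_from_lc() -> Dict[str, float]:
    return {"room1.temp": 35.0, "room2.temp": 22.0}


database: List[Dict[str, float]] = []
limits = {"room1.temp": (18.0, 28.0), "room2.temp": (18.0, 28.0)}
alarms = poll_once(read_from_lc, limits, database)
print(alarms)
```

A BMS would repeat such a step at its configured interval and hand the returned messages to whatever notification channel is in use (lamp, phone call, or e-mail).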

BMS and HMI communicate with Local Controllers (LCs) via IP-based communication protocols such as BACnet/IP [bacnetip] and KNX/IP [knx]. These protocols are commonly called management protocols. LCs measure device states and provide the information to BMS or HMI. These devices may include HVAC, fans, doors, valves, lights, and sensors (e.g., temperature, humidity, and illuminance). An LC can also set control values to the devices. An LC sometimes has additional functions, for example, sending a device state to BMS or HMI if the device state exceeds a certain threshold value, or applying feedback control to a device to keep the device state at a certain value. A typical example of an LC is a PLC (Programmable Logic Controller).

Each LC is connected with a different field network and communicates with several tens or hundreds of devices via the field network. Today there are many field protocols used in the field network. Depending on the type of field protocol used, LC interfaces and their hardware/software could be different. Field protocols are currently non-IP based; some of them are standards-based (e.g., LonTalk [lontalk], Modbus [modbus], Profibus [profibus], FL-net [flnet]) and others are proprietary.

3.4. Deployment Model

An example BAS deployment model for medium and large buildings is depicted in Figure 2 below. In this case the physical layout of the entire system spans multiple floors, and there is normally a monitoring room where the BAS management entities are located. Each floor has one or more LCs, depending on the number of devices connected to the field network.

        +--------------------------------------------------+
        |                                          Floor 3 |
        |     +----LC~~~~+~~~~~+~~~~~+                     |
        |     |          |     |     |                     |
        |     |         Dev   Dev   Dev                    |
        |     |                                            |
        |---  |  ------------------------------------------|
        |     |                                    Floor 2 |
        |     +----LC~~~~+~~~~~+~~~~~+  Field Network      |
        |     |          |     |     |                     |
        |     |         Dev   Dev   Dev                    |
        |     |                                            |
        |---  |  ------------------------------------------|
        |     |                                    Floor 1 |
        |     +----LC~~~~+~~~~~+~~~~~+   +-----------------|
        |     |          |     |     |   | Monitoring Room |
        |     |         Dev   Dev   Dev  |                 |
        |     |                          |    BMS   HMI    |
        |     |   Management Network     |     |     |     |
        |     +--------------------------------+-----+     |
        |                                |                 |
        +--------------------------------------------------+
        

Figure 2: Deployment model for Medium/Large Buildings

Each LC is then connected to the monitoring room via the management network. In this scenario, the management functions are performed locally and reside within the building. In most cases, Fast Ethernet (e.g., 100BASE-TX) is used for the management network. In the field network, a variety of physical interfaces such as RS232C and RS485 are used. Since the management network is non-real-time, Ethernet without quality of service is sufficient for today's deployments. However, the requirements are different for field networks when they are replaced by either Ethernet or wireless technologies supporting real-time requirements (Section 3.5).

Figure 3 depicts a deployment model in which the management can be hosted remotely. This deployment is becoming popular for small office and residential buildings, for which a standalone monitoring system is not cost effective. In such a scenario, multiple buildings are managed by a remote management monitoring system.

                                                 +---------------+
                                                 | Remote Center |
                                                 |               |
                                                 |  BMS     HMI  |
        +------------------------------------+   |   |       |   |
        |                            Floor 2 |   |   +---+---+   |
        |    +----LC~~~~+~~~~~+ Field Network|   |       |       |
        |    |          |     |              |   |     Router    |
        |    |         Dev   Dev             |   +-------|-------+
        |    |                               |           |
        |--- | ------------------------------|           |
        |    |                       Floor 1 |           |
        |    +----LC~~~~+~~~~~+              |           |
        |    |          |     |              |           |
        |    |         Dev   Dev             |           |
        |    |                               |           |
        |    |   Management Network          |     WAN   |
        |    +------------------------Router-------------+
        |                                    |
        +------------------------------------+
        

Figure 3: Deployment model for Small Buildings

In either case, interoperability today is limited to the management network and its protocols. In existing deployments, there is limited opportunity for interoperability in the field network because of its non-IP-based design and requirements.

3.5. Use cases and Field Network Requirements

In this section, we describe several use cases and corresponding network requirements.

3.5.1. Environmental Monitoring

In this use case, LCs measure environmental data (e.g., temperature, humidity, illuminance, CO2) from several sensor devices at each measurement interval. LCs keep the latest value of each sensor. BMS sends data requests to LCs to collect the latest values, then stores the collected values in a database. Operators check the latest environmental data as displayed by the HMI. BMS also checks the collected data automatically and notifies the operators if a room condition is becoming bad (e.g., too hot or too cold). The following table lists the field network requirements, where the number of devices in a typical building is ~100s per LC.

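Field Network Requirements for Environmental Monitoring

        +----------------------+-------------+
        | Metric               | Requirement |
        +----------------------+-------------+
        | Measurement interval | 100 msec    |
        | Availability         | 99.999 %    |
        +----------------------+-------------+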

In some cases BMS sends data requests every 1 second in order to draw a historical chart with 1 second granularity. A 100 msec measurement interval is therefore sufficient for this use case, because a granularity 10 times finer than the interval of data requests is typically considered accurate enough. An LC needs to measure the values of all sensors connected to it at each measurement interval. The delay of each individual communication is not critical in this scenario. The important requirement is completing the measurements of all sensor values within the specified measurement interval. The availability requirement in this use case is very high (99.999 %).
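The arithmetic behind these numbers can be written out as a short sketch. The per-sensor budget assumes that an LC polls its sensors one after another within a measurement interval, and that there are exactly 100 sensors on the LC; both are assumptions made here for illustration and are not stated requirements.

```python
def measurement_interval_ms(request_interval_ms, oversampling=10):
    """Measurement interval that is `oversampling` times finer than the
    interval of BMS data requests (10 times in the text)."""
    return request_interval_ms / oversampling


def per_sensor_budget_ms(interval_ms, sensors_per_lc):
    """Time available per sensor if all sensors of an LC are measured
    sequentially within one interval (sequential polling is assumed)."""
    return interval_ms / sensors_per_lc


interval = measurement_interval_ms(1000)       # 1 second data requests
budget = per_sensor_budget_ms(interval, 100)   # assumed 100 sensors on the LC
```

Under these assumptions the LC has about 1 msec per sensor, which is why completing the whole measurement round matters more than the delay of any single exchange.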

3.5.2. Fire Detection

In the case of fire detection, HMI needs to show a popup window with an alert message within a few seconds after an abnormal state is detected. BMS needs to perform certain operations if it detects fire, for example stopping an HVAC unit, closing fire shutters, and turning on fire sprinklers. The following table describes the requirements, where the number of devices in a typical building is ~10s per LC.

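Field Network Requirements for Fire Detection

        +----------------------+---------------+
        | Metric               | Requirement   |
        +----------------------+---------------+
        | Measurement interval | 10s of msec   |
        | Communication delay  | < 10s of msec |
        | Availability         | 99.9999 %     |
        +----------------------+---------------+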

In order to perform the above operations within a few seconds (1 or 2 seconds) after detecting fire, LCs should measure sensor values at a regular interval of less than 10s of msec. If an LC detects an abnormal sensor value, it immediately sends alarm information to BMS and HMI. BMS then controls the HVAC, fire shutters, or fire sprinklers. HMI then displays a popup window and generates the alert message. Since the management network does not operate in real time, and software running on BMS or HMI requires 100s of msec, the communication delay should be less than ~10s of msec. The availability requirement in this use case is very high (99.9999 %).
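The reasoning above amounts to a latency budget. The sketch below adds up the contributions for one set of illustrative values (50 msec measurement interval, 50 msec communication delay, 500 msec for the management network and software); these specific numbers are assumptions chosen within the orders of magnitude given in the text.

```python
def alarm_latency_ms(measurement_interval_ms, comm_delay_ms, software_delay_ms):
    """Worst-case time from fire onset to operator alert.

    Worst case: the fire starts right after a sample was taken, so a full
    measurement interval passes before the abnormal value is seen.
    """
    return measurement_interval_ms + comm_delay_ms + software_delay_ms


DEADLINE_MS = 1000  # lower end of the "1 or 2 seconds" target
total = alarm_latency_ms(50, 50, 500)  # illustrative values, see lead-in
within_deadline = total <= DEADLINE_MS
```

Because the software share is large and not deterministic, the field network share has to stay in the 10s of msec range to leave margin.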

3.5.3. Feedback Control

Feedback control is used to keep a device state at a certain value, for example keeping a room temperature at 27 degrees Celsius or keeping a water flow rate at 100 L/min. The target device state is normally pre-defined in LCs or provided by BMS or HMI.

In the feedback control procedure, an LC repeats the following actions at a regular interval (the feedback interval).

  1. The LC measures device states of the target device.
  2. The LC calculates a control value by considering the measured device state.
  3. The LC sends the control value to the target device.
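The three steps above can be sketched as a loop. The proportional control law, the gain, and the toy room model are assumptions chosen only to make the example runnable; the text does not prescribe a particular control algorithm, and a real LC would wait one feedback interval between iterations.

```python
def control_step(measured, setpoint, gain):
    """Step 2: derive a control value from the measured state
    (proportional term only, an assumed control law)."""
    return gain * (setpoint - measured)


def run_loop(read_state, send_control, setpoint, gain, iterations):
    for _ in range(iterations):
        measured = read_state()                         # step 1: measure
        value = control_step(measured, setpoint, gain)  # step 2: calculate
        send_control(value)                             # step 3: send
        # A real LC would now sleep until the next feedback interval.


class ToyRoom:
    """Hypothetical device whose temperature moves by the control value."""

    def __init__(self, temperature):
        self.temperature = temperature

    def read(self):
        return self.temperature

    def apply(self, value):
        self.temperature += value


room = ToyRoom(20.0)
run_loop(room.read, room.apply, setpoint=27.0, gain=0.5, iterations=10)
```

With this gain the remaining error halves at every iteration, so delay or jitter between measuring and actuating directly degrades how closely the loop tracks the target.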

The feedback interval depends strongly on the characteristics of the device and on the target quality of control. While a feedback interval of several tens of milliseconds is sufficient to control a valve that regulates a water flow, controlling DC motors requires an interval of several milliseconds. The following table describes the field network requirements, where the number of devices in a typical building is ~10s per LC.

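Field Network Requirements for Feedback Control

        +----------------------+---------------------+
        | Metric               | Requirement         |
        +----------------------+---------------------+
        | Feedback interval    | ~10 msec - 100 msec |
        | Communication delay  | < 10s of msec       |
        | Communication jitter | < 1 msec            |
        | Availability         | 99.9999 %           |
        +----------------------+---------------------+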

Small communication delay and jitter are required in this use case in order to provide high-quality feedback control. This is currently offered in production environments with high availability (99.9999 %).

3.6. Security Considerations

Both network and physical security of BAS are important. While physical security is present in today's deployments, adequate network security and access control are either not implemented or not configured properly. This was sufficient while the networks were isolated and not connected to IT or other infrastructure networks, but when IT and OT (Operational Technology) are connected in the same infrastructure network, network security is essential. The management network, being an IP-based network, does have the protocols and knobs to enable network security, but in many cases a BAS does not, for example, use device authentication or encryption for data in transit. Moreover, many of today's field networks do not provide any security at all. The following are the high-level security requirements that the network should provide:

  • Authentication between management and field devices (both local and remote)
  • Integrity and data origin authentication of communication data between field and management devices
  • Confidentiality of data when communicated to a remote device
  • Availability of network data for normal and disaster scenario

4. Wireless for Industrial Use Cases

(This section was derived from draft-thubert-6tisch-4detnet-01)

4.1. Introduction

The emergence of wireless technology has enabled a variety of new devices to get interconnected, at a very low marginal cost per device, at any distance ranging from Near Field to interplanetary, and in circumstances where wiring may not be practical, for instance on fast-moving or rotating devices.

At the same time, a new breed of Time Sensitive Networks is being developed to enable traffic that is highly sensitive to jitter, quite sensitive to latency, and with a high degree of operational criticality so that loss should be minimized at all times. Such traffic is not limited to professional Audio/Video networks, but is also found in command and control operations such as industrial automation and vehicular sensors and actuators.

At IEEE802.1, the Audio/Video Task Group [IEEE802.1TSNTG] was renamed Time Sensitive Networking (TSN) to address Deterministic Ethernet. The Medium Access Control (MAC) of IEEE802.15.4 [IEEE802154] has evolved with the new TimeSlotted Channel Hopping (TSCH) [I-D.ietf-6tisch-tsch] mode for deterministic industrial-type applications. TSCH was introduced with the IEEE802.15.4e [IEEE802154e] amendment and will be wrapped up in the next revision of the IEEE802.15.4 standard. For all practical purposes, this document is expected to be insensitive to the future versions of the IEEE802.15.4 standard, which is thus referenced undated.

Though at a different time scale, both TSN and TSCH standards provide Deterministic capabilities to the point that a packet that pertains to a certain flow crosses the network from node to node following a very precise schedule, as a train that leaves intermediate stations at precise times along its path. With TSCH, time is formatted into timeSlots, and an individual cell is allocated to unicast or broadcast communication at the MAC level. The time-slotted operation reduces collisions, saves energy, and makes it possible to engineer the network more closely for deterministic properties. The channel hopping aspect is a simple and efficient technique to combat multi-path fading and co-channel interference (for example from Wi-Fi emitters).

The 6TiSCH Architecture [I-D.ietf-6tisch-architecture] defines a remote monitoring and scheduling management of a TSCH network by a Path Computation Element (PCE), which cooperates with an abstract Network Management Entity (NME) to manage timeSlots and device resources in a manner that minimizes the interaction with and the load placed on the constrained devices.

This Architecture applies the concepts of Deterministic Networking on a TSCH network to enable the switching of timeSlots in a G-MPLS manner. This document details the dependencies that 6TiSCH has on PCE [PCE] and DetNet [I-D.finn-detnet-architecture] to provide the necessary capabilities that may be specific to such networks. In turn, DetNet is expected to integrate and maintain consistency with the work that has taken place and is continuing at IEEE802.1TSN and AVnu.

4.2. Terminology

Readers are expected to be familiar with all the terms and concepts that are discussed in "Multi-link Subnet Support in IPv6" [I-D.ietf-ipv6-multilink-subnets].

The draft uses terminology defined or referenced in [I-D.ietf-6tisch-terminology] and [I-D.ietf-roll-rpl-industrial-applicability].

The draft also conforms to the terms and models described in [RFC3444] and uses the vocabulary and the concepts defined in [RFC4291] for the IPv6 Architecture.

4.3. 6TiSCH Overview

The scope of the present work is a subnet that, in its basic configuration, is made of a TSCH [I-D.ietf-6tisch-tsch] MAC Low Power Lossy Network (LLN).

            ---+-------- ............ ------------
               |      External Network       |
               |                          +-----+
            +-----+                       | NME |
            |     | LLN Border            |     |
            |     | router                +-----+
            +-----+
          o    o   o
   o     o   o     o
      o   o LLN   o    o     o
         o   o   o       o
                 o

Figure 4: Basic Configuration of a 6TiSCH Network

In the extended configuration, a Backbone Router (6BBR) federates multiple 6TiSCH LLNs in a single subnet over a backbone. 6TiSCH 6BBRs synchronize with one another over the backbone, so as to ensure that the multiple LLNs that form the IPv6 subnet stay tightly synchronized.

               ---+-------- ............ ------------
                  |      External Network       |
                  |                          +-----+
                  |             +-----+      | NME |
               +-----+          |  +-----+   |     |
               |     | Router   |  | PCE |   +-----+
               |     |          +--|     |
               +-----+             +-----+
                  |                   |
                  | Subnet Backbone   |
            +--------------------+------------------+
            |                    |                  |
         +-----+             +-----+             +-----+
         |     | Backbone    |     | Backbone    |     | Backbone
    o    |     | router      |     | router      |     | router
         +-----+             +-----+             +-----+
    o                  o                   o                 o   o
        o    o   o         o   o  o   o         o  o   o    o
   o             o        o  LLN      o      o         o      o
      o   o    o      o      o o     o  o   o    o    o     o

Figure 5: Extended Configuration of a 6TiSCH Network

If the Backbone is Deterministic, then the Backbone Router ensures that the end-to-end deterministic behavior is maintained between the LLN and the backbone. This SHOULD be done in conformance to the DetNet Architecture [I-D.finn-detnet-architecture], which studies Layer-3 aspects of Deterministic Networks and covers networks that span multiple Layer-2 domains. One particular requirement is that the PCE MUST be able to compute a deterministic path end-to-end across the TSCH network and an IEEE802.1 TSN Ethernet backbone, and DetNet MUST enable end-to-end deterministic forwarding.

6TiSCH defines the concept of a Track, which is a complex form of a uni-directional Circuit ([I-D.ietf-6tisch-terminology]). As opposed to a simple circuit that is a sequence of nodes and links, a Track is shaped as a directed acyclic graph towards a destination to support multi-path forwarding and route around failures. A Track may also branch off and rejoin, for the purpose of the so-called Packet Replication and Elimination (PRE), over non congruent branches. PRE may be used to complement layer-2 Automatic Repeat reQuest (ARQ) to meet industrial expectations in Packet Delivery Ratio (PDR), in particular when the Track extends beyond the 6TiSCH network.


                  +-----+
                  | IoT |
                  | G/W |
                  +-----+
                     ^  <---- Elimination
                    | |
     Track branch   | |
            +-------+ +--------+ Subnet Backbone
            |                  |
         +--|--+            +--|--+
         |  |  | Backbone   |  |  | Backbone
    o    |  |  | router     |  |  | router
         +--/--+            +--|--+
    o     /    o     o---o----/       o
        o    o---o--/   o      o   o  o   o
   o     \  /     o               o   LLN    o
      o   v  <---- Replication
          o


Figure 6: End-to-End deterministic Track

In the example above, a Track is laid out from a field device in a 6TiSCH network to an IoT gateway that is located on an IEEE802.1 TSN backbone.

The Replication function in the field device sends a copy of each packet over two different branches, and the PCE schedules each hop of both branches so that the two copies arrive in due time at the gateway. In case of a loss on one branch, hopefully the other copy of the packet still makes it in due time. If two copies make it to the IoT gateway, the Elimination function in the gateway ignores the extra packet and presents only one copy to upper layers.
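A minimal sketch of the Replication and Elimination functions follows. It assumes that each packet carries a flow identifier and a sequence number that the gateway can use to recognize duplicates; that packet format, and all names below, are illustrative assumptions and are not defined by 6TiSCH or DetNet.

```python
def replicate(packet, branches):
    """Replication: hand one copy of the packet to every branch."""
    return [(branch, packet) for branch in branches]


class Eliminator:
    """Elimination: pass the first copy of a packet, ignore later copies."""

    def __init__(self):
        # Unbounded history for brevity; a real implementation would age it.
        self.seen = set()

    def accept(self, packet):
        key = (packet["flow"], packet["seq"])  # assumed duplicate key
        if key in self.seen:
            return False
        self.seen.add(key)
        return True


packet = {"flow": "track-1", "seq": 7, "data": b"reading"}
copies = replicate(packet, ["branch-A", "branch-B"])
gateway = Eliminator()
delivered = [p for _, p in copies if gateway.accept(p)]
```

If one branch loses its copy, the list of arriving copies is simply shorter and the surviving copy is still the first one seen by the gateway.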

At each 6TiSCH hop along the Track, the PCE may schedule more than one timeSlot for a packet, so as to support Layer-2 retries (ARQ). It is also possible that the field device only uses the second branch if sending over the first branch fails.

In current deployments, a TSCH Track does not necessarily support PRE but is systematically multi-path. This means that a Track is scheduled so as to ensure that each hop has at least two forwarding solutions, and the forwarding decision is to try the preferred one and use the other in case of Layer-2 transmission failure as detected by ARQ.
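The forwarding decision described above can be sketched as trying the forwarding solutions in order of preference. The `transmit` callback stands in for a Layer-2 transmission with ARQ and is a hypothetical interface introduced for this example.

```python
def forward(frame, next_hops, transmit):
    """Try each forwarding solution in order of preference.

    `transmit(hop, frame)` is assumed to return True when the frame was
    acknowledged and False when ARQ reports a transmission failure.
    Returns the hop that accepted the frame, or None if all failed.
    """
    for hop in next_hops:
        if transmit(hop, frame):
            return hop
    return None


def flaky_link(hop, frame):
    # Toy link model: the preferred next hop never acknowledges.
    return hop != "preferred"


chosen = forward(b"frame", ["preferred", "alternate"], flaky_link)
```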

4.3.1. TSCH and 6top

6top is a logical link control sitting between the IP layer and the TSCH MAC layer, which provides the link abstraction that is required for IP operations. The 6top operations are specified in [I-D.wang-6tisch-6top-sublayer].

The 6top data model and management interfaces are further discussed in [I-D.ietf-6tisch-6top-interface] and [I-D.ietf-6tisch-coap].

The architecture defines "soft" cells and "hard" cells. "Hard" cells are owned and managed by a separate scheduling entity (e.g., a PCE) that specifies the slotOffset/channelOffset of the cells to be added/moved/deleted, in which case 6top can only act as instructed, and may not move hard cells in the TSCH schedule on its own.

4.3.2. SlotFrames and Priorities

A slotFrame is the base object that the PCE needs to manipulate to program a schedule into an LLN node. Elaboration on that concept can be found in section "SlotFrames and Priorities" of [I-D.ietf-6tisch-architecture].

IEEE802.15.4 TSCH avoids contention on the medium by formatting time and frequencies in cells of transmission of equal duration. In order to describe that formatting of time and frequencies, the 6TiSCH architecture defines a global concept that is called a Channel Distribution and Usage (CDU) matrix; a CDU matrix is a matrix of cells with a height equal to the number of available channels (indexed by ChannelOffsets) and a width (in timeSlots) that is the period of the network scheduling operation (indexed by slotOffsets) for that CDU matrix. The size of a cell is a timeSlot duration, and values of 10 to 15 milliseconds are typical in 802.15.4 TSCH to accommodate the transmission of a frame and an ack, including the security validation on the receive side, which may take up to a few milliseconds on some device architectures.

The frequency used by a cell in the matrix rotates in a pseudo-random fashion, from an initial position at an epoch time, as the matrix iterates over and over.
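The rotation can be illustrated with the channel selection rule commonly described for IEEE802.15.4 TSCH, in which a timeslot counter (the Absolute Slot Number, ASN, counted from the epoch) is added to the channelOffset and used as an index into a hopping sequence. The hopping sequence below is made up for the example and is not a standardized one.

```python
# Illustrative hopping sequence over the 16 channels (11..26) of the
# 2.4 GHz band; the ordering is an assumption for this sketch.
HOPPING_SEQUENCE = [15, 20, 25, 26, 11, 12, 13, 14,
                    16, 17, 18, 19, 21, 22, 23, 24]


def channel_for(asn, channel_offset, sequence=HOPPING_SEQUENCE):
    """Physical channel used by a cell at timeslot counter `asn`."""
    return sequence[(asn + channel_offset) % len(sequence)]


# The same cell (channelOffset 2) observed over 16 consecutive timeslots.
channels_seen = {channel_for(asn, 2) for asn in range(16)}
```

Because the slot counter keeps increasing while the matrix repeats, a given cell lands on a different frequency at each iteration unless the matrix width is a multiple of the sequence length.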

A CDU matrix is computed by the PCE, but unallocated timeSlots may be used opportunistically by the nodes for classical best effort IP traffic. The PCE has precedence in the allocation in case of a conflict.

In a given network, there might be multiple CDU matrices that operate with different widths, so they have different durations and represent different periodic operations. It is recommended that all CDU matrices in a 6TiSCH domain operate with the same cell duration and are aligned, so as to reduce the chances of interference from slotted-aloha operations. The PCE MUST compute the CDU matrices and share that knowledge with all the nodes. The matrices are used in particular to define slotFrames.

A slotFrame is a MAC-level abstraction that is common to all nodes and contains a series of timeSlots of equal length and precedence. It is characterized by a slotFrame_ID, and a slotFrame_size. A slotFrame aligns to a CDU matrix for its parameters, such as number and duration of timeSlots.

Multiple slotFrames can coexist in a node schedule, i.e., a node can have multiple activities scheduled in different slotFrames, based on the precedence of the 6TiSCH topologies. The slotFrames may be aligned to different CDU matrices and thus have different widths. There is typically one slotFrame for scheduled traffic that has the highest precedence and one or more slotFrame(s) for RPL traffic. The timeSlots in the slotFrame are indexed by the SlotOffset; the first cell is at SlotOffset 0.

The 6TiSCH architecture introduces the concept of chunks ([I-D.ietf-6tisch-terminology]) to operate such spectrum distribution for a whole group of cells at a time. The CDU matrix is formatted into a set of chunks, each of them identified uniquely by a chunk-ID. The PCE MUST compute the partitioning of CDU matrices into chunks and share that knowledge with all the nodes in a 6TiSCH network.
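As a rough sketch of how coexisting slotFrames could be resolved in a node, the code below maps a timeslot counter to a SlotOffset in each slotFrame and picks the activity of the slotFrame with the highest precedence. The modulo mapping, the "lower number wins" convention, and the field names are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SlotFrame:
    slotframe_id: int
    size: int          # number of timeSlots (slotFrame_size)
    precedence: int    # lower number wins (convention assumed here)
    cells: dict = field(default_factory=dict)  # SlotOffset -> activity


def active_cell(slotframes, asn):
    """Return (slotframe_id, activity) for timeslot counter `asn`,
    or None if no slotFrame has a cell scheduled in that timeSlot."""
    candidates = []
    for sf in slotframes:
        offset = asn % sf.size  # SlotOffset within this slotFrame
        if offset in sf.cells:
            candidates.append((sf.precedence, sf.slotframe_id, sf.cells[offset]))
    if not candidates:
        return None
    _, slotframe_id, activity = min(candidates)
    return slotframe_id, activity


scheduled = SlotFrame(0, 101, precedence=0, cells={3: "track TX"})
rpl = SlotFrame(1, 7, precedence=1, cells={3: "RPL shared"})
frames = [scheduled, rpl]
```

With these two slotFrames, timeslot 3 is claimed by both and goes to the scheduled traffic, while timeslot 10 falls on SlotOffset 3 of the RPL slotFrame only.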


             +-----+-----+-----+-----+-----+-----+-----+     +-----+
chan.Off. 0  |chnkA|chnkP|chnk7|chnkO|chnk2|chnkK|chnk1| ... |chnkZ|
             +-----+-----+-----+-----+-----+-----+-----+     +-----+
chan.Off. 1  |chnkB|chnkQ|chnkA|chnkP|chnk3|chnkL|chnk2| ... |chnk1|
             +-----+-----+-----+-----+-----+-----+-----+     +-----+
               ...
             +-----+-----+-----+-----+-----+-----+-----+     +-----+
chan.Off. 15 |chnkO|chnk6|chnkN|chnk1|chnkJ|chnkZ|chnkI| ... |chnkG|
             +-----+-----+-----+-----+-----+-----+-----+     +-----+
                0     1     2     3     4     5     6          M

Figure 7: CDU matrix Partitioning in Chunks

The appropriation of a chunk can be requested explicitly by the PCE from any node. After a successful appropriation, the PCE owns the cells in that chunk, and may use them as hard cells to set up Tracks.

4.3.3. Schedule Management by a PCE

6TiSCH supports a mixed model of centralized routes and distributed routes. Centralized routes can for example be computed by an entity such as a PCE. Distributed routes are computed by RPL.

Both methods may inject routes in the Routing Tables of the 6TiSCH routers. In either case, each route is associated with a 6TiSCH topology that can be a RPL Instance topology or a track. The 6TiSCH topology is indexed by an Instance ID, in a format that reuses the RPLInstanceID as defined in RPL [RFC6550].

Both RPL and PCE rely on shared sources such as policies to define Global and Local RPLInstanceIDs that can be used by either method. It is possible for centralized and distributed routing to share the same topology. Generally they will operate in different slotFrames, and centralized routes will be used for scheduled traffic and will have precedence over distributed routes in case of conflict between the slotFrames.

Section "Schedule Management Mechanisms" of the 6TiSCH architecture describes 4 paradigms to manage the TSCH schedule of the LLN nodes: Static Scheduling, neighbor-to-neighbor Scheduling, remote monitoring and scheduling management, and Hop-by-hop scheduling. The Track operation for DetNet corresponds to a remote monitoring and scheduling management by a PCE.

The 6top interface document [I-D.ietf-6tisch-6top-interface] specifies the generic data model that can be used to monitor and manage resources of the 6top sublayer. Abstract methods are suggested for use by a management entity in the device. The data model also enables remote control operations on the 6top sublayer.

[I-D.ietf-6tisch-coap] defines a mapping of the 6top set of commands, which is described in [I-D.ietf-6tisch-6top-interface], to CoAP resources. This allows an entity to interact with the 6top layer of a node that is multiple hops away in a RESTful fashion.

[I-D.ietf-6tisch-coap] also defines a basic set of CoAP resources and associated RESTful access methods (GET/PUT/POST/DELETE). The payload (body) of the CoAP messages is encoded using the CBOR format. The PCE commands are expected to be issued directly as CoAP requests or to be mapped back and forth into CoAP by a gateway function at the edge of the 6TiSCH network. For instance, it is possible that a mapping entity on the backbone transforms a non-CoAP protocol such as PCEP into the RESTful interfaces that the 6TiSCH devices support. This architecture will be refined to comply with DetNet [I-D.finn-detnet-architecture] when the work is formalized.

4.3.4. Track Forwarding

By forwarding, this specification means the per-packet operation that delivers a packet to a next hop or to an upper layer in this node. Forwarding is based on pre-existing state that was installed as a result of the routing computation of a Track by a PCE. The 6TiSCH architecture supports three different forwarding models: G-MPLS Track Forwarding (TF), 6LoWPAN Fragment Forwarding (FF), and IPv6 Forwarding (6F), which is the classical IP operation. The DetNet case relates to the Track Forwarding operation under the control of a PCE.

A Track is a unidirectional path between a source and a destination. In a Track cell, the normal operation of IEEE802.15.4 Automatic Repeat-reQuest (ARQ) usually happens, though the acknowledgment may be omitted in some cases, for instance if there is no scheduled cell for a retry.

Track Forwarding is the simplest and fastest. A bundle of cells set to receive (RX-cells) is uniquely paired to a bundle of cells that are set to transmit (TX-cells), representing a layer-2 forwarding state that can be used regardless of the network layer protocol. This model can effectively be seen as a Generalized Multi-protocol Label Switching (G-MPLS) operation in that the information used to switch a frame is not an explicit label, but rather related to other properties of the way the packet was received, a particular cell in the case of 6TiSCH. As a result, as long as the TSCH MAC (and Layer-2 security) accepts a frame, that frame can be switched regardless of the protocol, whether this is an IPv6 packet, a 6LoWPAN fragment, or a frame from an alternate protocol such as WirelessHART or ISA100.11a.
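The pairing of receive and transmit bundles can be sketched as a small switching table keyed by the cell in which a frame was received. Cells are represented as (slotOffset, channelOffset) tuples; the class and method names are hypothetical and only illustrate that the frame content plays no part in the switching decision.

```python
class TrackSwitch:
    """Layer-2 switching state: RX-cell -> Track -> TX bundle."""

    def __init__(self):
        self.rx_to_track = {}  # (slotOffset, channelOffset) -> track id
        self.tx_bundle = {}    # track id -> list of TX-cells

    def add_track(self, track_id, rx_cells, tx_cells):
        for cell in rx_cells:
            if cell in self.rx_to_track:
                raise ValueError("RX-cell already paired with a Track")
            self.rx_to_track[cell] = track_id
        self.tx_bundle[track_id] = list(tx_cells)

    def switch(self, rx_cell, frame):
        """Return (tx_cells, frame) with the frame untouched, or None if
        the frame was not received in a cell that belongs to a Track."""
        track_id = self.rx_to_track.get(rx_cell)
        if track_id is None:
            return None
        return self.tx_bundle[track_id], frame


switch = TrackSwitch()
switch.add_track("track-1", rx_cells=[(3, 0), (4, 0)], tx_cells=[(7, 2), (8, 2)])
result = switch.switch((3, 0), b"\x01\x02")
```

The lookup never inspects `frame`, which is the property that lets IPv6 packets, 6LoWPAN fragments, or frames of another protocol follow the same Track.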

A data frame that is forwarded along a Track normally has a destination MAC address that is set to broadcast - or a multicast address depending on MAC support. This way, the MAC layer in the intermediate nodes accepts the incoming frame and 6top switches it without incurring a change in the MAC header. In the case of IEEE802.15.4, this means effectively broadcast, so that along the Track the short address for the destination of the frame is set to 0xFFFF.

A Track is thus formed end-to-end as a succession of paired bundles, a receive bundle from the previous hop and a transmit bundle to the next hop along the Track, and a cell in such a bundle belongs to at most one Track. For a given iteration of the device schedule, the effective channel of the cell is obtained by adding a pseudo-random number to the channelOffset of the cell, which results in a rotation of the frequency that is used for transmission. The bundles may be computed so as to accommodate both variable rates and retransmissions, so they might not be fully used at a given iteration of the schedule. The 6TiSCH architecture provides additional means to avoid waste of cells as well as overflows in the transmit bundle, as follows:

On the one hand, a TX-cell that is not needed for the current iteration may be reused opportunistically on a per-hop basis for routed packets. When all of the frames that were received for a given Track are effectively transmitted, any available TX-cell for that Track can be reused for upper layer traffic for which the next-hop router matches the next hop along the Track. In that case, the cell that is being used is effectively a TX-cell from the Track, but the short address for the destination is that of the next-hop router. It follows that a frame that is received in a RX-cell of a Track with a destination MAC address set to this node as opposed to broadcast must be extracted from the Track and delivered to the upper layer (a frame with an unrecognized MAC address is dropped at the lower MAC layer and thus is not received at the 6top sublayer).

On the other hand, it might happen that there are not enough TX-cells in the transmit bundle to accommodate the Track traffic, for instance if more retransmissions are needed than provisioned. In that case, the frame can be placed for transmission in the bundle that is used for layer-3 traffic towards the next hop along the Track as long as it can be routed by the upper layer, that is, typically, if the frame transports an IPv6 packet. The MAC address should be set to the next-hop MAC address to avoid confusion. It follows that a frame that is received over a layer-3 bundle may in fact be associated with a Track. In a classical IP link such as an Ethernet, off-track traffic is typically traffic in excess of the reservation, which is routed along the non-reserved path based on its QoS setting. But with 6TiSCH, since the use of the layer-3 bundle may be due to transmission failures, it makes sense for the receiver to recognize a frame that should be re-tracked, and to place it back on the appropriate bundle if possible. A frame should be re-tracked if the Per-Hop-Behavior group indicated in the Differentiated Services Field in the IPv6 header is set to Deterministic Forwarding, as discussed in Section 4.4.1. A frame is re-tracked by scheduling it for transmission over the transmit bundle associated with the Track, with the destination MAC address set to broadcast.
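The receive-side rules from the two preceding paragraphs can be summarized as a small decision function. The Deterministic Forwarding Per-Hop-Behavior is modelled as a boolean because its codepoint is not given here; the function name and return labels are illustrative assumptions.

```python
BROADCAST = 0xFFFF  # IEEE802.15.4 short broadcast address


def classify(in_track_cell, dmac, my_addr, deterministic_phb=False):
    """Decide what to do with a received frame.

    in_track_cell     -- True if the frame arrived in an RX-cell of a Track
    dmac              -- destination short MAC address of the frame
    my_addr           -- short MAC address of this node
    deterministic_phb -- True if the IPv6 header indicates Deterministic
                         Forwarding (modelled as a flag in this sketch)
    """
    if in_track_cell:
        if dmac == BROADCAST:
            return "switch"       # stay on the Track
        if dmac == my_addr:
            return "upper-layer"  # opportunistic reuse of a Track cell
        return "drop"             # normally filtered by the MAC already
    if deterministic_phb:
        return "re-track"         # back onto the Track, dmac = broadcast
    return "route"                # ordinary layer-3 forwarding
```

For example, `classify(False, 0x0001, 0x0001, deterministic_phb=True)` yields the re-track decision for a Track frame that overflowed into the layer-3 bundle at the previous hop.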

There are 2 modes for a Track, transport mode and tunnel mode.

4.3.4.1. Transport Mode

In transport mode, the Protocol Data Unit (PDU) is associated with flow-dependent meta-data that refers uniquely to the Track, so the 6top sublayer can place the frame in the appropriate cell without ambiguity. In the case of IPv6 traffic, this flow identification is transported in the Flow Label of the IPv6 header. Associated with the source IPv6 address, the Flow Label forms a globally unique identifier for that particular Track that is validated at egress before restoring the destination MAC address (DMAC) and punting to the upper layer.

                       |                                    ^
   +--------------+    |                                    |
   |     IPv6     |    |                                    |
   +--------------+    |                                    |
   |  6LoWPAN HC  |    |                                    |
   +--------------+  ingress                              egress
   |     6top     |   sets     +----+          +----+     restores
   +--------------+  dmac to   |    |          |    |     dmac to
   |   TSCH MAC   |   brdcst   |    |          |    |      self
   +--------------+    |       |    |          |    |       |
   |   LLN PHY    |    +-------+    +--...-----+    +-------+
   +--------------+

Track Forwarding, Transport Mode

4.3.4.2. Tunnel Mode

In tunnel mode, the frames originate from an arbitrary protocol over a compatible MAC that may or may not be synchronized with the 6TiSCH network. An example of this would be a router with a dual radio that is capable of receiving and sending WirelessHART or ISA100.11a frames with the second radio, by presenting itself as an Access Point or a Backbone Router, respectively.

In that mode, some entity (e.g. PCE) can coordinate with a WirelessHART Network Manager or an ISA100.11a System Manager to specify the flows that are to be transported transparently over the Track.

   +--------------+
   |     IPv6     |
   +--------------+
   |  6LoWPAN HC  |
   +--------------+             set            restore
   |     6top     |            +dmac+          +dmac+
   +--------------+          to|brdcst       to|nexthop
   |   TSCH MAC   |            |    |          |    |
   +--------------+            |    |          |    |
   |   LLN PHY    |    +-------+    +--...-----+    +-------+
   +--------------+    |   ingress                 egress   |
                       |                                    |
   +--------------+    |                                    |
   |   LLN PHY    |    |                                    |
   +--------------+    |                                    |
   |   TSCH MAC   |    |                                    |
   +--------------+    | dmac =                             | dmac =
   |ISA100/WiHART |    | nexthop                            v nexthop
   +--------------+

Figure 8: Track Forwarding, Tunnel Mode

In that case, the flow information that identifies the Track at the ingress 6TiSCH router is derived from the RX-cell. The dmac is set to this node but the flow information indicates that the frame must be tunneled over a particular Track so the frame is not passed to the upper layer. Instead, the dmac is forced to broadcast and the frame is passed to the 6top sublayer for switching.

At the egress 6TiSCH router, the reverse operation occurs. Based on metadata associated with the Track, the frame is passed to the appropriate link layer with the destination MAC restored.

4.3.4.3. Tunnel Metadata

Metadata coming with the Track configuration is expected to provide the destination MAC address of the egress endpoint as well as the tunnel mode and specific data depending on the mode, for instance a service access point for frame delivery at egress. If the tunnel egress point does not have a MAC address that matches the configuration, the Track installation fails.

In transport mode, if the final layer-3 destination is the tunnel termination, then it is possible that the IPv6 address of the destination is compressed at the 6LoWPAN sublayer based on the MAC address. It is thus mandatory at the ingress point to validate that the MAC address that was used at the 6LoWPAN sublayer for compression matches that of the tunnel egress point. For that reason, the node that injects a packet on a Track checks that the destination is effectively that of the tunnel egress point before it overwrites it to broadcast. The 6top sublayer at the tunnel egress point reverts that operation to the MAC address obtained from the tunnel metadata.
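The ingress validation and egress restoration described above can be sketched as follows. This is only an illustration under stated assumptions: the Frame and TrackMetadata types, the function names, and the textual form of the broadcast address are hypothetical and are not defined by 6TiSCH.

```python
from dataclasses import dataclass

# Assumed textual representation of the broadcast address (illustrative only).
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


@dataclass
class Frame:
    dmac: str       # destination MAC address currently carried by the frame
    payload: bytes


@dataclass
class TrackMetadata:
    egress_mac: str  # destination MAC of the tunnel egress point (from Track configuration)


def inject_on_track(frame: Frame, track: TrackMetadata) -> Frame:
    """Ingress: check that the destination MAC is that of the tunnel egress
    point, then overwrite it to broadcast."""
    if frame.dmac.lower() != track.egress_mac.lower():
        raise ValueError("destination MAC does not match the tunnel egress point")
    return Frame(dmac=BROADCAST_MAC, payload=frame.payload)


def deliver_at_egress(frame: Frame, track: TrackMetadata) -> Frame:
    """Egress: revert the destination MAC to the value from the tunnel metadata."""
    return Frame(dmac=track.egress_mac, payload=frame.payload)
```

The check at ingress matters because, once the destination MAC is overwritten to broadcast, the address used for 6LoWPAN compression can only be recovered from the tunnel metadata.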

4.4. Operations of Interest for DetNet and PCE

In a classical system, the 6TiSCH device does not place the request for bandwidth between itself and another device in the network. Rather, an Operation Control System invoked through a Human/Machine Interface (HMI) indicates the Traffic Specification, in particular in terms of latency and reliability, and the end nodes. With this, the PCE must compute a Track between the end nodes and provision the network with per-flow state that describes the per-hop operation for a given packet, the corresponding timeSlots, and the flow identification that makes it possible to recognize when a certain packet belongs to a certain Track, sort out duplicates, etc.

For a static configuration that serves a certain purpose for a long period of time, it is expected that a node will be provisioned in one shot with a full schedule, which incorporates the aggregation of its behavior for multiple Tracks. 6TiSCH expects that the programming of the schedule will be done over CoAP as discussed in 6TiSCH Resource Management and Interaction using CoAP [I-D.ietf-6tisch-coap].

But a hybrid mode may be required as well, whereby a single Track is added, modified, or removed, for instance if it appears that a Track does not perform as expected for, say, PDR. For that case, the expectation is that a protocol that flows along a Track (to be), in a fashion similar to classical Traffic Engineering (TE) [CCAMP], may be used to update the state in the devices. 6TiSCH provides means for a device to negotiate a timeSlot with a neighbor, but in general that flow was not designed and no protocol was selected, and it is expected that DetNet will determine the appropriate end-to-end protocols to be used in that case.

Stream Management Entity


                      Operational System and HMI

   -+-+-+-+-+-+-+ Northbound -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

             PCE         PCE              PCE              PCE

   -+-+-+-+-+-+-+ Southbound -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

           --- 6TiSCH------6TiSCH------6TiSCH------6TiSCH--
  6TiSCH /     Device      Device      Device      Device   \
  Device-                                                    - 6TiSCH
         \     6TiSCH      6TiSCH      6TiSCH      6TiSCH   /  Device
           ----Device------Device------Device------Device--

			

Figure 9

4.4.1. Packet Marking and Handling

Section "Packet Marking and Handling" of [I-D.ietf-6tisch-architecture] describes the packet tagging and marking that is expected in 6TiSCH networks.

4.4.1.1. Tagging Packets for Flow Identification

For packets that are routed by a PCE along a Track, the tuple formed by the IPv6 source address and a local RPLInstanceID is tagged in the packets to identify uniquely the Track and associated transmit bundle of timeSlots.

As a result, the tagging that is used for a DetNet flow outside the 6TiSCH LLN MUST be swapped into 6TiSCH formats and back as the packet enters and then leaves the 6TiSCH network.

Note: The method and format used for encoding the RPLInstanceID at 6lo is generalized to all 6TiSCH topological Instances, which includes Tracks.

4.4.1.2. Replication, Retries and Elimination

6TiSCH expects elimination and replication of packets along a complex Track, but has no position about how the sequence numbers would be tagged in the packet.

As it stands, 6TiSCH expects that timeSlots corresponding to copies of the same packet along a Track are correlated by configuration, and does not need to process the sequence numbers.

The semantics of the configuration MUST enable correlated timeSlots to be grouped for transmit (and respectively receive) with an 'OR' relation, and then an 'AND' relation MUST be configurable between groups. The semantics are that if the transmit (and respectively receive) operation succeeded in one timeSlot in an 'OR' group, then all the other timeSlots in the group are ignored. Now, if there are at least two groups, the 'AND' relation between the groups indicates that one operation must succeed in each of the groups.

On the transmit side, timeSlots provisioned for retries along the same branch of a Track are placed in the same 'OR' group. The 'OR' relation indicates that if a transmission is acknowledged, then further transmissions SHOULD NOT be attempted for timeSlots in that group. There are as many 'OR' groups as there are branches of the Track departing from this node. Different 'OR' groups are programmed for the purpose of replication, each group corresponding to one branch of the Track. The 'AND' relation between the groups indicates that transmission over each of the branches MUST be attempted regardless of whether a transmission succeeded in another branch. It is also possible to place cells to different next-hop routers in the same 'OR' group. This allows routing along multi-path Tracks, trying one next hop and then another only if sending to the first fails.

On the receive side, all timeSlots are programmed in the same 'OR' group. Retries of the same copy as well as converging branches for elimination are converged, meaning that the first successful reception is enough and that all the other timeSlots can be ignored.
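The transmit-side 'OR'/'AND' semantics described above can be illustrated with a minimal sketch. The OrGroup type, the run_transmit function, and the send callback are hypothetical names chosen for this example; 6TiSCH does not define such an API, and a real schedule is driven by the TSCH slotframe rather than by a loop.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class OrGroup:
    """TimeSlots that are alternatives for one branch of the Track."""
    slots: List[int] = field(default_factory=list)


def run_transmit(groups: List[OrGroup], send: Callable[[int], bool]) -> List[bool]:
    """Attempt a transmission in every 'OR' group (the 'AND' relation).

    Within a group, timeSlots are tried in order until one transmission is
    acknowledged; the remaining timeSlots of that group are then skipped.
    Returns one success flag per group.
    """
    results = []
    for group in groups:  # 'AND': every group is attempted, whatever happened before
        acked = False
        for slot in group.slots:
            if send(slot):  # send() returns True when the transmission is acknowledged
                acked = True
                break       # 'OR': ignore the other timeSlots of this group
        results.append(acked)
    return results
```

With two groups of timeSlots [1, 2, 3] and [4, 5], and acknowledgments received only in timeSlots 2 and 5, the sketch attempts timeSlots 1, 2, 4 and 5 and skips timeSlot 3. On the receive side, the same function with a single group models the elimination behavior.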

4.4.1.3. Differentiated Services Per-Hop-Behavior

Additionally, an IP packet that is sent along a Track uses the Differentiated Services Per-Hop-Behavior Group called Deterministic Forwarding, as described in [I-D.svshah-tsvwg-deterministic-forwarding].

4.4.2. Topology and capabilities

6TiSCH nodes are usually IoT devices, characterized by a very limited amount of memory, just enough buffers to store one or a few IPv6 packets, and limited bandwidth between peers. As a result, a node will maintain only a small amount of peering information, and will not be able to store many packets waiting to be forwarded. Peers can be identified through MAC or IPv6 addresses, but a Cryptographically Generated Address [RFC3972] (CGA) may also be used.

Neighbors can be discovered over the radio using mechanisms such as beacons, but, though the neighbor information is available in the 6TiSCH interface data model, 6TiSCH does not describe a protocol to proactively push the neighborhood information to a PCE. This protocol should be described and should operate over CoAP. The protocol should be able to carry multiple metrics, in particular the same metrics as used for RPL operations [RFC6551].

The energy that the device consumes in sleep, transmit and receive modes can be evaluated and reported. So can the amount of energy that is stored in the device and the power that can be scavenged from the environment. The PCE SHOULD be able to compute Tracks that will implement policies on how the energy is consumed, for instance to balance consumption between nodes, or to ensure that the spent energy does not exceed the scavenged energy over a period of time.

4.5. Security Considerations

On top of the classical protection of control signaling that can be expected to support DetNet, it must be noted that 6TiSCH networks operate on limited resources that can be depleted rapidly if an attacker manages to operate a DoS attack on the system, for instance by placing a rogue device in the network, or by obtaining management control and setting up extra paths.

4.6. Acknowledgments

This specification derives from the 6TiSCH architecture, which is the result of multiple interactions, in particular during the 6TiSCH (bi)Weekly Interim call, relayed through the 6TiSCH mailing list at the IETF.

The authors wish to thank: Kris Pister, Thomas Watteyne, Xavier Vilajosana, Qin Wang, Tom Phinney, Robert Assimiti, Michael Richardson, Zhuo Chen, Malisa Vucinic, Alfredo Grieco, Martin Turon, Dominique Barthel, Elvis Vogli, Guillaume Gaillard, Herman Storey, Maria Rita Palattella, Nicola Accettura, Patrick Wetterwald, Pouria Zand, Raghuram Sudhaakar, and Shitanshu Shah for their participation and various contributions.

5. Cellular Radio Use Cases

(This section was derived from draft-korhonen-detnet-telreq-00)

5.1. Introduction and background

The recent developments in telecommunication networks, especially in the cellular domain, are heading towards transport networks where precise time synchronization support has to be one of the basic building blocks. While the transport networks themselves have practically transitioned to all-IP packet based networks to meet the bandwidth and cost requirements, highly accurate clock distribution has become a challenge. Earlier, the transport networks in the cellular domain were typically time-division multiplexing (TDM) based and provided frequency synchronization capabilities as a part of the transport media. Alternatively, other technologies such as Global Positioning System (GPS) or Synchronous Ethernet (SyncE) [SyncE] were used. New radio access network deployment models and architectures may require time sensitive networking services with strict requirements on other parts of the network that previously were not considered to be packetized at all. Time and synchronization support is already topical for backhaul and midhaul packet networks [MEF], and is becoming a real issue for fronthaul networks. Specifically in the fronthaul networks, the timing and synchronization requirements can be extreme for packet based technologies, for example, in the order of sub +-20 ns packet delay variation (PDV) and frequency accuracy of +-0.002 PPM [Fronthaul].

Both Ethernet and IP/MPLS [RFC3031] (and PseudoWires (PWE) [RFC3985] for legacy transport support) have become popular tools to build and manage new all-IP radio access networks (RAN) [I-D.kh-spring-ip-ran-use-case]. Although various timing and synchronization optimizations have already been proposed and implemented, including 1588 PTP enhancements [I-D.ietf-tictoc-1588overmpls][I-D.mirsky-mpls-residence-time], these solutions are not necessarily sufficient for the forthcoming RAN architectures, nor do they guarantee the higher time-synchronization requirements [CPRI]. There are also existing solutions for TDM over IP [RFC5087] [RFC4553] or Ethernet transports [RFC5086]. The most interesting and important existing work for time sensitive networking has been done for Ethernet [TSNTG], which specifies the use of the IEEE 1588 Precision Time Protocol (PTP) [IEEE1588] in the context of IEEE 802.1D and IEEE 802.1Q. While IEEE 802.1AS [IEEE8021AS] specifies a Layer-2 time synchronizing service, other specifications, such as IEEE 1722 [IEEE1722], specify Ethernet-based Layer-2 transport for time-sensitive streams. New promising work seeks to enable the transport of time-sensitive fronthaul streams in Ethernet bridged networks [IEEE8021CM]. Similarly to IEEE 1722, there is an ongoing standardization effort to define a Layer-2 transport encapsulation format for transporting radio over Ethernet (RoE) in the IEEE 1904.3 Task Force [IEEE19043].

As already mentioned, all-IP RANs and various "haul" networks would benefit from time synchronization and time-sensitive transport services. Although Ethernet appears to be the unifying technology for the transport, there is still a disconnect in providing Layer-3 services. The protocol stack typically has a number of layers below the Ethernet Layer-2 that is visible to the Layer-3 IP transport. It is not uncommon that on top of the lowest layer (optical) transport there is a first layer of Ethernet followed by one or more layers of MPLS, PseudoWires and/or other tunneling protocols, finally carrying the Ethernet layer visible to the user plane IP traffic. While there are existing technologies, especially in the MPLS/PWE space, to establish circuits through the routed and switched networks, there is a lack of signaling of the time synchronization and time-sensitive stream requirements/reservations for Layer-3 flows in a way that addresses the entire transport stack, including the Ethernet layers that need to be configured. Furthermore, not all "user plane" traffic will be IP. Therefore, the same solution also needs to address the use cases where the user plane traffic is again another layer of Ethernet frames. There is existing work describing the problem statement [I-D.finn-detnet-problem-statement] and the architecture [I-D.finn-detnet-architecture] for deterministic networking (DetNet) that eventually aims to provide solutions for time-sensitive (IP/transport) streams with deterministic properties over Ethernet-based switched networks.

This document describes requirements for deterministic networking in the context of cellular telecom transport networks. The requirements include time synchronization, clock distribution and ways of establishing time-sensitive streams for both Layer-2 and Layer-3 user plane traffic using IETF protocol solutions.

5.2. Network architecture

Figure 10 illustrates a typical, 3GPP defined, cellular network architecture, which also has fronthaul and midhaul network segments. The fronthaul refers to the network connecting base stations (base band processing units) to the remote radio heads (antennas). The midhaul network typically refers to the network inter-connecting base stations (or small/pico cells).

Fronthaul networks build on the available excess time after the base band processing of the radio frame has completed. Therefore, the available time for networking is actually very limited, which in practice determines how far the remote radio heads can be from the base band processing units (i.e. base stations). For example, in the case of LTE radio the Hybrid ARQ processing of a radio frame is allocated 3 ms. Typically the processing completes much earlier (say up to 400 us, could be much less, though), thus allowing the remaining time to be used e.g. for the fronthaul network. 200 us equals roughly 40 km of optical fiber based transport (assuming the round trip time would total 2*200 us). The base band processing time and the available "delay budget" for the fronthaul are subject to change, possibly dramatically, in the forthcoming "5G" to meet, for example, the envisioned reduced radio round trip times, and other architectural and service requirements [NGMN].
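The 40 km figure can be reproduced with a short calculation. The sketch below assumes a propagation speed in optical fiber of about 200,000 km/s (roughly 5 us per km), a value that is not stated in the text; the function and constant names are illustrative only, and node and buffering delays are ignored.

```python
# Assumed propagation speed in optical fiber: about 0.2 km per microsecond
# (roughly two thirds of the speed of light in vacuum). Not taken from the text.
FIBER_SPEED_KM_PER_US = 0.2


def max_fiber_reach_km(round_trip_budget_us: float) -> float:
    """Return the one-way fiber distance that fits in a round-trip delay budget.

    Only propagation delay is counted; any per-node or buffering delay
    would reduce the reach further.
    """
    one_way_us = round_trip_budget_us / 2.0
    return one_way_us * FIBER_SPEED_KM_PER_US


# A round trip of 2*200 us gives a reach of about 40 km under this assumption.
reach_km = max_fiber_reach_km(2 * 200)
```

Any delay consumed by intermediate nodes comes out of the same budget, so the reach in a multi-hop fronthaul would be shorter than this propagation-only estimate.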

The maximum "delay budget" is then consumed by all nodes and required buffering between the remote radio head and the base band processing, in addition to the distance incurred delay. Packet delay variation (PDV) is problematic to fronthaul networks and must be minimized. If the transport network cannot guarantee low enough PDV, additional buffering has to be introduced at the edges of the network to buffer out the jitter. Any buffering will eat up the total available delay budget, though. Section 5.3 discusses the PDV requirements in more detail.

           Y (remote radios)
            \
        Y__  \.--.                   .--.       +------+
           \_(    `.     +---+     _(Back`.     | 3GPP |
    Y------( Front  )----|eNB|----(  Haul  )----| core |
          ( `  .Haul )   +---+   ( `  .  )  )   | netw |
          /`--(___.-'      \      `--(___.-'    +------+
       Y_/     /            \.--.       \
            Y_/            _( Mid`.      \
                          (   Haul )      \
                         ( `  .  )  )      \
                          `--(___.-'\_____+---+    (small cells)
                                \         |SCe|__Y
                               +---+      +---+
                            Y__|eNB|__Y
                               +---+
                             Y_/   \_Y ("local" radios)

Figure 10: Generic 3GPP-based cellular network architecture with Front/Mid/Backhaul networks

5.3. Time synchronization requirements

For cellular networks starting from Long Term Evolution (LTE) [TS36300] [TS23401] radio, phase synchronization is also needed in addition to frequency synchronization. The commonly referenced fronthaul network synchronization requirements are typically drawn from the common public radio interface (CPRI) [CPRI] specification that defines the transport protocol between the base band processing - radio equipment controller (REC) and the remote antenna - radio equipment (RE). However, the fundamental requirements still originate from the respective cellular system and radio specifications such as the 3GPP ones [TS25104][TS36104][TS36211] [TS36133].

The fronthaul time synchronization requirements for the current 3GPP LTE-based networks are listed below:

Transport link contribution to radio frequency error:
   +-2 PPB. The given value is considered to be "available" for the fronthaul link out of the total 50 PPB budget reserved for the radio interface.

Delay accuracy:
   +-8.138 ns i.e. +-1/32 Tc (UMTS Chip time, Tc, 1/3.84 MHz) to downlink direction and excluding the (optical) cable length in one direction. Round trip accuracy is then +-16.276 ns. The value is this low to meet the 3GPP timing alignment error (TAE) measurement requirements.

Packet delay variation (PDV):
  • For multiple input multiple output (MIMO) or TX diversity transmissions, at each carrier frequency, TAE shall not exceed 65 ns (i.e. 1/4 Tc).
  • For intra-band contiguous carrier aggregation, with or without MIMO or TX diversity, TAE shall not exceed 130 ns (i.e. 1/2 Tc).
  • For intra-band non-contiguous carrier aggregation, with or without MIMO or TX diversity, TAE shall not exceed 260 ns (i.e. one Tc).
  • For inter-band carrier aggregation, with or without MIMO or TX diversity, TAE shall not exceed 260 ns.
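The Tc-based figures above all follow from the UMTS chip rate of 3.84 MHz given in the text. The sketch below recomputes them as a consistency check; the function and constant names are illustrative, and the results are rounded in the text (for example 65 ns for 1/4 Tc).

```python
CHIP_RATE_HZ = 3.84e6  # UMTS chip rate, as given in the text (Tc = 1/3.84 MHz)


def chip_time_ns(chip_rate_hz: float = CHIP_RATE_HZ) -> float:
    """Return the chip time Tc in nanoseconds."""
    return 1e9 / chip_rate_hz


def fraction_of_tc_ns(fraction: float) -> float:
    """Return the given fraction of Tc in nanoseconds."""
    return fraction * chip_time_ns()


one_way_accuracy_ns = fraction_of_tc_ns(1 / 32)    # about 8.138 ns
round_trip_accuracy_ns = 2 * one_way_accuracy_ns   # about 16.276 ns
tae_mimo_ns = fraction_of_tc_ns(1 / 4)             # about 65 ns
tae_contiguous_ca_ns = fraction_of_tc_ns(1 / 2)    # about 130 ns
tae_non_contiguous_ca_ns = fraction_of_tc_ns(1.0)  # about 260 ns
```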

The above listed time synchronization requirements are hard to meet even with point-to-point connected networks, not to mention cases where the underlying transport network actually consists of multiple hops. It is expected that network deployments have to deal with the jitter requirements by buffering at the very ends of the connections, since trying to meet the jitter requirements in every intermediate node is likely to be too costly. However, every measure to reduce jitter and delay on the path is valuable, as it makes it easier to meet the end-to-end requirements.

In order to meet the timing requirements, both senders and receivers must be in perfect sync. This calls for a very accurate clock distribution solution. Basically, all means and hardware support for guaranteeing accurate time synchronization in the network are needed. As an example, support for 1588 transparent clocks (TC) in every intermediate node would be helpful.

5.4. Time-sensitive stream requirements

In addition to the time synchronization requirements listed in Section 5.3, the fronthaul networks assume practically error-free transport. The maximum bit error rate (BER) has been defined to be 10^-12. When packetized, that would equal roughly a packet error rate (PER) of 2.4*10^-9 (assuming ~300 byte packets). Retransmitting lost packets and/or using forward error correction (FEC) to circumvent bit errors is practically impossible due to the additional incurred delay. Using redundant streams for better delivery guarantees is also practically impossible due to the high bandwidth requirements of fronthaul networks. For instance, the current uncompressed CPRI bandwidth expansion ratio is roughly 20:1 compared to the IP layer user payload it carries in a "radio sample form".
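The BER to PER conversion above can be checked with a few lines of code. This sketch assumes independent bit errors, an assumption the text does not state explicitly, and treats a packet as lost if any of its bits is in error; the function name is illustrative.

```python
import math


def packet_error_rate(ber: float, packet_bytes: int) -> float:
    """Return the probability that at least one bit of a packet is in error.

    Assumes independent bit errors, so PER = 1 - (1 - BER)^bits.
    log1p/expm1 keep the result accurate when BER is very small.
    """
    bits = packet_bytes * 8
    return -math.expm1(bits * math.log1p(-ber))


# For BER = 10^-12 and 300 byte packets this is approximately 2.4*10^-9,
# which matches the simple approximation PER ~= bits * BER.
per = packet_error_rate(1e-12, 300)
```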

The other fundamental assumption is that fronthaul links are symmetric. Last, all fronthaul streams (carrying radio data) have equal priority and cannot delay or pre-empt each other. This implies that the network always has to be sufficiently undersubscribed to guarantee that each time-sensitive flow meets its schedule.

Mapping the fronthaul requirements to [I-D.finn-detnet-architecture] Section 3 "Providing the DetNet Quality of Service", the aspects that seem usable are:

  • (a) Zero congestion loss.
  • (b) Pinned-down paths.

The current time-sensitive networking features may still not be sufficient for fronthaul traffic. Therefore, having specific profiles that take the requirements of fronthaul into account is deemed to be useful [IEEE8021CM].

The actual transport protocols and/or solutions to establish the required transport "circuits" (pinned-down paths) for fronthaul traffic are still undefined. Those are likely to include, but are not limited to, solutions directly over Ethernet, over IP, and over MPLS/PseudoWire transport.

5.5. Security considerations

Establishing time-sensitive streams in the network entails reserving networking resources, sometimes for a considerably long time. It is important that these reservation requests be authenticated to prevent malicious reservation attempts from hostile nodes or even accidental misconfiguration. This is especially important in cases where the reservation requests span administrative domains. Furthermore, the reservation information itself should be digitally signed to reduce the risk of a legitimate node pushing a stale or hostile configuration into a networking node.

6. Other Use Cases

(This section was derived from draft-zha-detnet-use-case-00)

6.1. Introduction

The rapid growth of today's communication systems and their reach into almost all aspects of daily life have led to great dependency on the services they provide. The communication network, as it is today, has applications such as multimedia and peer-to-peer file sharing distribution that require Quality of Service (QoS) guarantees in terms of delay and jitter to maintain a certain level of performance. Meanwhile, mobile wireless communications have become an important part of modern society, with increasing importance over recent years. A hard real-time and highly reliable communication network is essential for current and next generation mobile wireless networks, as well as for their bearer networks, to meet end-to-end (E2E) performance requirements.

Conventional transport networks are IP-based because of the bandwidth and cost requirements. However, delay and jitter guarantees become a challenge in case of contention, since the service is not deterministic but best effort. With increasingly strict demands on latency control in future networks [METIS], deterministic networking [I-D.finn-detnet-architecture] is a promising solution for ultra-low-delay applications and use cases. Delay-sensitive networking requirements already raise typical issues in midhaul and backhaul networks supporting LTE and future 5G networks [net5G]. Beyond the telecom industry, other vertical industries also have an increasing demand for delay-sensitive communications as automation becomes critical.

More specifically, CoMP techniques, D-2-D, industrial automation and gaming/media services all depend heavily on low-delay communications as well as high reliability to guarantee service performance. Note that deterministic networking is not the same as low latency, as it focuses on the worst-case delay bound over the duration of a certain application or service. It can be argued that without high certainty and an absolute delay guarantee, low delay provisioning is just relative [rfc3393], which is not sufficient for some delay-critical services since a delay violation in a single instance cannot be tolerated. Overall, the requirements from vertical industries seem to be well aligned with the expected low latency and highly deterministic performance of future networks.

This document describes several use cases and scenarios with requirements on deterministic delay guarantee within the scope of the deterministic network [I-D.finn-detnet-problem-statement].

6.2. Critical Delay Requirements

Delay and jitter requirements have been taken into account as a major component of QoS provisioning since the birth of the Internet. Delay-sensitive networking is of increasing importance and has become fundamental to mobile wireless communications, as well as to the application areas that rely heavily on low-delay communications. Because IP networking is best effort, the main way to serve delay-sensitive services today is to mitigate contention and buffering: more bandwidth is assigned to keep links lightly loaded or, in other words, to reduce the probability of congestion. However, keeping links lightly loaded is not deterministic and cannot provide a deterministic delay guarantee, which limits its ability to serve the applications of future communication systems.

Take as a starting point [METIS], which documents the fundamental challenges and the overall technical goal of the 5G mobile and wireless system. The system should support:

- 1000 times higher mobile data volume per area,
- 10 times to 100 times higher typical user data rate,
- 10 times to 100 times higher number of connected devices,
- 10 times longer battery life for low power devices, and
- 5 times reduced End-to-End (E2E) latency,

at similar cost and energy consumption levels as today's system.

Considering the latency-related part of these requirements, the current LTE networking system has an E2E latency of less than 20ms [LTE-Latency], which leads to an E2E latency of around 5ms for 5G networks. It has been argued that fulfilling such a rigid latency demand at similar cost will be most challenging, as the system also requires 100 times the bandwidth and 100 times the number of connected devices. As a result, simply provisioning redundant bandwidth can no longer be an efficient solution, because bandwidth requirements are higher than ever before. In addition to bandwidth provisioning, a critical flow within its reserved resources should not be affected by other flows, no matter how heavily the network is loaded. Robust protection of critical flows likewise should not depend on redundant bandwidth allocation. Deterministic networking techniques at both Layer 2 and Layer 3 using IETF protocol solutions are promising for serving these scenarios.
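
The 5G latency figure quoted above follows from simple scaling of the LTE bound. As a minimal sketch (the function and constant names are illustrative assumptions, not taken from [METIS] or [LTE-Latency]), dividing the LTE bound by the METIS reduction factor gives the order of magnitude that the text rounds to "around 5ms":

```python
# Illustrative only: names and structure are assumptions, not part of [METIS].
LTE_E2E_LATENCY_MS = 20.0    # upper bound cited from [LTE-Latency]
METIS_LATENCY_REDUCTION = 5  # "5 times reduced E2E latency"


def target_latency_ms(current_ms: float, reduction_factor: float) -> float:
    """Scale a current latency bound down by a reduction factor."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return current_ms / reduction_factor


print(target_latency_ms(LTE_E2E_LATENCY_MS, METIS_LATENCY_REDUCTION))  # 4.0
```

The exact quotient is 4ms, which is consistent with the approximate 5ms target stated in this section.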

6.3. Coordinated multipoint processing (CoMP)

In wireless communication systems, Coordinated Multipoint processing (CoMP) is considered an effective technique for solving the inter-cell interference problem and improving cell-edge user throughput [CoMP].

6.3.1. CoMP Architecture

             +--------------------------+
             |           CoMP           |
             +--+--------------------+--+
                |                    |
          +----------+             +------------+
          |  Uplink  |             |  Downlink  |
          +-----+----+             +--------+---+
                |                           |
     -------------------              -----------------------
     |         |       |              |           |         |
+---------+ +----+  +-----+       +------------+ +-----+  +-----+
|  Joint  | | CS |  | DPS |       |    Joint   | | CS/ |  | DPS |
|Reception| |    |  |     |       |Transmission| | CB  |  |     |
+---------+ +----+  +-----+       +------------+ +-----+  +-----+
     |                                     |
     |-----------                          |-------------
     |          |                          |            |
+------------+  +---------+       +----------+   +------------+
|    Joint   |  |   Soft  |       | Coherent |   |     Non-   |
|Equalization|  |Combining|       |    JT    |   | Coherent JT|
+------------+  +---------+       +----------+   +------------+

Figure 11: Framework of CoMP Technology

As shown in Figure 11, CoMP reception and transmission is a framework in which multiple geographically distributed antenna nodes cooperate to improve the performance of the users served in the common cooperation area. The design principle of CoMP is to extend the current single-cell-to-multi-UE transmission to a multi-cell-to-multi-UE transmission through base station cooperation. In contrast to the single-cell scenario, CoMP has critical issues such as backhaul latency, CSI (Channel State Information) reporting and accuracy, and network complexity. The first two are clearly very delay sensitive and are discussed in the next section.

6.3.2. Delay Sensitivity in CoMP

Signaling exchange between eNBs is the essential feature of CoMP, so backhaul latency is the dominant limitation on CoMP performance. Generally, JT and JP may benefit from coordinating the scheduling (distributed or centralized) of different cells, provided that the signaling exchange between eNBs is limited to 4-10ms. For C-RAN the backhaul latency requirement is 250us, while for D-RAN it is 4-15ms. This delay requirement is not only rigid but also absolute, since any uncertainty in delay will degrade performance significantly. Note that some operators' transport networks are not built to support Layer 3 transfer in the aggregation layer. In such cases the signaling is exchanged through the EPC, which means the delay is expected to be larger. CoMP places high requirements on delay and reliability that current mobile network systems do not meet, and these requirements may impact the architecture of the mobile network.
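
Because the delay requirement is described as absolute, a budget check would plausibly be made against the worst observed delay rather than the average. The following is a minimal sketch under that assumption; the helper name, the use of the upper end of the D-RAN range as its budget, and the sample values are all hypothetical and not taken from [CoMP]:

```python
# Hypothetical helper: the budget values come from the text above; the
# names and the sample data are assumptions made for illustration.
BACKHAUL_BUDGET_US = {
    "C-RAN": 250,     # 250us
    "D-RAN": 15_000,  # upper end of the 4-15ms range
}


def meets_backhaul_budget(ran_type: str, delay_samples_us: list[float]) -> bool:
    """Return True only if the worst observed delay fits the budget."""
    if not delay_samples_us:
        raise ValueError("need at least one delay sample")
    return max(delay_samples_us) <= BACKHAUL_BUDGET_US[ran_type]


samples = [180.0, 210.0, 320.0]  # made-up one-way delays in microseconds
print(meets_backhaul_budget("C-RAN", samples))  # False: 320us exceeds 250us
print(meets_backhaul_budget("D-RAN", samples))  # True
```

In this made-up data set the mean delay is below 250us, yet the C-RAN check fails because a single sample exceeds the budget. That is the sense in which a bound on the average is not sufficient for this use case.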

6.4. Industrial Automation

The traditional term "industrial automation" usually refers to the automation of manufacturing, quality control, and material processing. The "industrial internet" and "Industry 4.0" [EA12] are becoming hot topics built on the Internet of Things. Such highly flexible and dynamic engineering and manufacturing will result in many so-called smart approaches, such as Smart Factory, Smart Products, Smart Mobility, and Smart Home/Buildings. Ultra-high reliability and robustness are undoubtedly a must in data transmission, especially in closed-loop automation control applications, where the delay requirement is below 1ms and the packet loss requirement is less than 10E-9. These critical latency and loss requirements cannot be fulfilled by current 4G communication networks. Moreover, collaboration between industrial automation at remote campuses and cellular and fixed networks has to be built on an integrated, cloud-based platform. In such a platform, deterministic flows should be guaranteed regardless of the amount of other traffic in the network. The lack of such a mechanism is the main obstacle to the deployment of industrial automation.

6.5. Vehicle to Vehicle

V2V communication has gained more and more attention in the last few years and will continue to grow in the future. In addition to short-range direct communication systems, V2V communication requires wireless cellular networks to provide wide-area coverage and more sophisticated services. V2V applications in the area of autonomous driving have very stringent latency and reliability requirements: the timely arrival of information is critical for safety. In addition, because the processing capability of an individual vehicle is limited, passing information to the cloud can provide more functions, such as video processing, audio recognition, or navigation. All of these requirements call for highly reliable connectivity to the cloud. On the other hand, providing low-latency communication is naturally one of the main challenges to be overcome, because of high mobility and the high penetration losses caused by the vehicle itself. As a result, data transmission with latency below 5ms and high reliability, with a PER (Packet Error Rate) below 10E-6, is demanded. Such applications can benefit from the deployment of deterministic networking with high reliability.

6.6. Gaming, Media and Virtual Reality

Online gaming and cloud gaming dominate the gaming market, since they allow multiple players to play together in a more challenging and competitive way. Over the current Internet, latency can be a significant issue that degrades the end users' experience. There are different types of games, and FPS (First Person Shooting) games are considered the most latency-sensitive online games because of their high demands on timing precision and on computing the positions of moving targets. Virtual reality is also receiving more interest than ever before as a novel gaming experience. Delay here can be critical to interaction in the virtual world: disagreement between what is seen and what is felt can cause motion sickness and affect what happens in the game. Supporting fast, real-time, and reliable communication at the PHY/MAC, network, and application layers is the main bottleneck for such use cases.

Media content delivery has been an important use of the Internet and will become even more so. Not only high bandwidth demand but also critical delay and jitter requirements have to be taken into account to meet user demand. To keep video and audio smooth, delay and jitter have to be guaranteed in order to avoid interruptions, which are the killer of all online media on-demand services. With 4K video now and 8K video in the near future, the delay guarantee becomes more challenging than ever before. 4K/8K UHD video service requires 6Gbps-100Gbps for uncompressed video, with compressed video starting from 60Mbps. The delay requirement is 100ms, while some specific interactive applications may require 10ms delay [UHD-video].
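
The uncompressed bandwidth range quoted above can be sanity-checked with a short calculation. The sketch below is illustrative only: the resolutions, frame rates, and pixel format are assumptions chosen to bracket the range, not values taken from [UHD-video].

```python
# Illustrative sketch: the video formats below are assumptions, not values
# taken from [UHD-video].
def raw_bitrate_gbps(width: int, height: int, fps: int, bits_per_pixel: int) -> float:
    """Uncompressed video bit rate in Gbps (10^9 bits per second)."""
    return width * height * fps * bits_per_pixel / 1e9


# 4K at 30 frames/s, 8 bits per colour component, no chroma subsampling
print(round(raw_bitrate_gbps(3840, 2160, 30, 24), 2))   # 5.97
# 8K at 120 frames/s, same pixel format
print(round(raw_bitrate_gbps(7680, 4320, 120, 24), 2))  # 95.55
```

Under these assumed formats the results land close to the 6Gbps and 100Gbps ends of the range given in the text. Other frame rates or bit depths would shift the figures accordingly.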

6.7. Acknowledgments

This document has benefited from reviews, suggestions, comments and proposed text provided by the following members, listed in alphabetical order: Jing Huang, Junru Lin, Lehong Niu and Oilver Huang.

7. Informative References

[ACE] IETF, "Authentication and Authorization for Constrained Environments"
[bacnetip] ASHRAE, "Annex J to ANSI/ASHRAE 135-1995 - BACnet/IP", January 1999.
[CCAMP] IETF, "Common Control and Measurement Plane"
[CoMP] NGMN Alliance, "RAN EVOLUTION PROJECT COMP EVALUATION AND ENHANCEMENT", NGMN Alliance NGMN_RANEV_D3_CoMP_Evaluation_and_Enhancement_v2.0, March 2015.
[CONTENT_PROTECTION] Olsen, D., "1722a Content Protection", 2012.
[CPRI] CPRI Cooperation, "Common Public Radio Interface (CPRI); Interface Specification", CPRI Specification V6.1, July 2014.
[DCI] Digital Cinema Initiatives, LLC, "DCI Specification, Version 1.2", 2012.
[DICE] IETF, "DTLS In Constrained Environments"
[EA12] Evans, P. and M. Annunziata, "Industrial Internet: Pushing the Boundaries of Minds and Machines", November 2012.
[ESPN_DC2] Daley, D., "ESPN's DC2 Scales AVB Large", 2014.
[flnet] Japan Electrical Manufacturers' Association, "JEMA 1479 - English Edition", September 2012.
[Fronthaul] Chen, D. and T. Mustala, "Ethernet Fronthaul Considerations", IEEE 1904.3, February 2015.
[HART] www.hartcomm.org, "Highway Addressable Remote Transducer, a group of specifications for industrial process and control devices administered by the HART Foundation"
[I-D.finn-detnet-architecture] Finn, N., Thubert, P. and M. Teener, "Deterministic Networking Architecture", Internet-Draft draft-finn-detnet-architecture-01, March 2015.
[I-D.finn-detnet-problem-statement] Finn, N. and P. Thubert, "Deterministic Networking Problem Statement", Internet-Draft draft-finn-detnet-problem-statement-04, October 2015.
[I-D.ietf-6tisch-6top-interface] Wang, Q. and X. Vilajosana, "6TiSCH Operation Sublayer (6top) Interface", Internet-Draft draft-ietf-6tisch-6top-interface-04, July 2015.
[I-D.ietf-6tisch-architecture] Thubert, P., "An Architecture for IPv6 over the TSCH mode of IEEE 802.15.4", Internet-Draft draft-ietf-6tisch-architecture-08, May 2015.
[I-D.ietf-6tisch-coap] Sudhaakar, R. and P. Zand, "6TiSCH Resource Management and Interaction using CoAP", Internet-Draft draft-ietf-6tisch-coap-03, March 2015.
[I-D.ietf-6tisch-terminology] Palattella, M., Thubert, P., Watteyne, T. and Q. Wang, "Terminology in IPv6 over the TSCH mode of IEEE 802.15.4e", Internet-Draft draft-ietf-6tisch-terminology-05, July 2015.
[I-D.ietf-6tisch-tsch] Watteyne, T., Palattella, M. and L. Grieco, "Using IEEE802.15.4e TSCH in an IoT context: Overview, Problem Statement and Goals", Internet-Draft draft-ietf-6tisch-tsch-06, March 2015.
[I-D.ietf-ipv6-multilink-subnets] Thaler, D. and C. Huitema, "Multi-link Subnet Support in IPv6", Internet-Draft draft-ietf-ipv6-multilink-subnets-00, July 2002.
[I-D.ietf-roll-rpl-industrial-applicability] Phinney, T., Thubert, P. and R. Assimiti, "RPL applicability in industrial networks", Internet-Draft draft-ietf-roll-rpl-industrial-applicability-02, October 2013.
[I-D.ietf-tictoc-1588overmpls] Davari, S., Oren, A., Bhatia, M., Roberts, P. and L. Montini, "Transporting Timing messages over MPLS Networks", Internet-Draft draft-ietf-tictoc-1588overmpls-07, October 2015.
[I-D.kh-spring-ip-ran-use-case] Khasnabish, B., hu, f. and L. Contreras, "Segment Routing in IP RAN use case", Internet-Draft draft-kh-spring-ip-ran-use-case-02, November 2014.
[I-D.mirsky-mpls-residence-time] Mirsky, G., Ruffini, S., Gray, E., Drake, J., Bryant, S. and S. Vainshtein, "Residence Time Measurement in MPLS network", Internet-Draft draft-mirsky-mpls-residence-time-07, July 2015.
[I-D.svshah-tsvwg-deterministic-forwarding] Shah, S. and P. Thubert, "Deterministic Forwarding PHB", Internet-Draft draft-svshah-tsvwg-deterministic-forwarding-04, August 2015.
[I-D.thubert-6lowpan-backbone-router] Thubert, P., "6LoWPAN Backbone Router", Internet-Draft draft-thubert-6lowpan-backbone-router-03, February 2013.
[I-D.wang-6tisch-6top-sublayer] Wang, Q. and X. Vilajosana, "6TiSCH Operation Sublayer (6top)", Internet-Draft draft-wang-6tisch-6top-sublayer-02, October 2015.
[IEC61850-90-12] TC57 WG10, IEC., "IEC 61850-90-12 TR: Communication networks and systems for power utility automation - Part 90-12: Wide area network engineering guidelines", 2015.
[IEC62439-3:2012] TC65, IEC., "IEC 62439-3: Industrial communication networks - High availability automation networks - Part 3: Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)", 2012.
[IEEE1588] IEEE, "IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems", IEEE Std 1588-2008, 2008.
[IEEE1722] IEEE, "1722-2011 - IEEE Standard for Layer 2 Transport Protocol for Time Sensitive Applications in a Bridged Local Area Network", IEEE Std 1722-2011, 2011.
[IEEE19043] IEEE Standards Association, "IEEE 1904.3 TF", IEEE 1904.3, 2015.
[IEEE802.1TSNTG] IEEE Standards Association, "IEEE 802.1 Time-Sensitive Networks Task Group", March 2013.
[IEEE802154] IEEE standard for Information Technology, "IEEE std. 802.15.4, Part. 15.4: Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Low-Rate Wireless Personal Area Networks"
[IEEE802154e] IEEE standard for Information Technology, "IEEE standard for Information Technology, IEEE std. 802.15.4, Part. 15.4: Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Low-Rate Wireless Personal Area Networks, June 2011 as amended by IEEE std. 802.15.4e, Part. 15.4: Low-Rate Wireless Personal Area Networks (LR-WPANs) Amendment 1: MAC sublayer", April 2012.
[IEEE8021AS] IEEE, "Timing and Synchronizations (IEEE 802.1AS-2011)", IEEE 802.1AS-2011, 2011.
[IEEE8021CM] Farkas, J., "Time-Sensitive Networking for Fronthaul", Unapproved PAR, PAR for a New IEEE Standard; IEEE P802.1CM, April 2015.
[ISA100] ISA/ANSI, "ISA100, Wireless Systems for Automation"
[ISA100.11a] ISA/ANSI, "Wireless Systems for Industrial Automation: Process Control and Related Applications - ISA100.11a-2011 - IEC 62734", 2011.
[ISO7240-16] ISO, "ISO 7240-16:2007 Fire detection and alarm systems -- Part 16: Sound system control and indicating equipment", 2007.
[knx] KNX Association, "ISO/IEC 14543-3 - KNX", November 2006.
[lontalk] ECHELON, "LonTalk(R) Protocol Specification Version 3.0", 1994.
[LTE-Latency] Johnston, S., "LTE Latency: How does it compare to other technologies", March 2014.
[MEF] MEF, "Mobile Backhaul Phase 2 Amendment 1 -- Small Cells", MEF 22.1.1, July 2014.
[METIS] METIS, "Scenarios, requirements and KPIs for 5G mobile and wireless system", ICT-317669-METIS/D1.1, April 2013.
[modbus] Modbus Organization, "MODBUS APPLICATION PROTOCOL SPECIFICATION V1.1b", December 2006.
[net5G] Ericsson, "5G Radio Access, Challenges for 2020 and Beyond", Ericsson white paper wp-5g, June 2013.
[NGMN] NGMN Alliance, "5G White Paper", NGMN 5G White Paper v1.0, February 2015.
[PCE] IETF, "Path Computation Element"
[profibus] IEC, "IEC 61158 Type 3 - Profibus DP", January 2001.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.
[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, December 1998.
[RFC2474] Nichols, K., Blake, S., Baker, F. and D. Black, "Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers", RFC 2474, DOI 10.17487/RFC2474, December 1998.
[RFC3031] Rosen, E., Viswanathan, A. and R. Callon, "Multiprotocol Label Switching Architecture", RFC 3031, DOI 10.17487/RFC3031, January 2001.
[RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V. and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001.
[RFC3393] Demichelis, C. and P. Chimento, "IP Packet Delay Variation Metric for IP Performance Metrics (IPPM)", RFC 3393, DOI 10.17487/RFC3393, November 2002.
[RFC3444] Pras, A. and J. Schoenwaelder, "On the Difference between Information Models and Data Models", RFC 3444, DOI 10.17487/RFC3444, January 2003.
[RFC3972] Aura, T., "Cryptographically Generated Addresses (CGA)", RFC 3972, DOI 10.17487/RFC3972, March 2005.
[RFC3985] Bryant, S. and P. Pate, "Pseudo Wire Emulation Edge-to-Edge (PWE3) Architecture", RFC 3985, DOI 10.17487/RFC3985, March 2005.
[RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 4291, DOI 10.17487/RFC4291, February 2006.
[RFC4553] Vainshtein, A. and YJ. Stein, "Structure-Agnostic Time Division Multiplexing (TDM) over Packet (SAToP)", RFC 4553, DOI 10.17487/RFC4553, June 2006.
[RFC4903] Thaler, D., "Multi-Link Subnet Issues", RFC 4903, DOI 10.17487/RFC4903, June 2007.
[RFC4919] Kushalnagar, N., Montenegro, G. and C. Schumacher, "IPv6 over Low-Power Wireless Personal Area Networks (6LoWPANs): Overview, Assumptions, Problem Statement, and Goals", RFC 4919, DOI 10.17487/RFC4919, August 2007.
[RFC5086] Vainshtein, A., Sasson, I., Metz, E., Frost, T. and P. Pate, "Structure-Aware Time Division Multiplexed (TDM) Circuit Emulation Service over Packet Switched Network (CESoPSN)", RFC 5086, DOI 10.17487/RFC5086, December 2007.
[RFC5087] Stein, Y(J)., Shashoua, R., Insler, R. and M. Anavi, "Time Division Multiplexing over IP (TDMoIP)", RFC 5087, DOI 10.17487/RFC5087, December 2007.
[RFC6282] Hui, J. and P. Thubert, "Compression Format for IPv6 Datagrams over IEEE 802.15.4-Based Networks", RFC 6282, DOI 10.17487/RFC6282, September 2011.
[RFC6550] Winter, T., Thubert, P., Brandt, A., Hui, J., Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur, JP. and R. Alexander, "RPL: IPv6 Routing Protocol for Low-Power and Lossy Networks", RFC 6550, DOI 10.17487/RFC6550, March 2012.
[RFC6551] Vasseur, JP., Kim, M., Pister, K., Dejean, N. and D. Barthel, "Routing Metrics Used for Path Calculation in Low-Power and Lossy Networks", RFC 6551, DOI 10.17487/RFC6551, March 2012.
[RFC6775] Shelby, Z., Chakrabarti, S., Nordmark, E. and C. Bormann, "Neighbor Discovery Optimization for IPv6 over Low-Power Wireless Personal Area Networks (6LoWPANs)", RFC 6775, DOI 10.17487/RFC6775, November 2012.
[SRP_LATENCY] Gunther, C., "Specifying SRP Latency", 2014.
[STUDIO_IP] Mace, G., "IP Networked Studio Infrastructure for Synchronized & Real-Time Multimedia Transmissions", 2007.
[SyncE] ITU-T, "G.8261 : Timing and synchronization aspects in packet networks", Recommendation G.8261, August 2013.
[TEAS] IETF, "Traffic Engineering Architecture and Signaling"
[TS23401] 3GPP, "General Packet Radio Service (GPRS) enhancements for Evolved Universal Terrestrial Radio Access Network (E-UTRAN) access", 3GPP TS 23.401 10.10.0, March 2013.
[TS25104] 3GPP, "Base Station (BS) radio transmission and reception (FDD)", 3GPP TS 25.104 3.14.0, March 2007.
[TS36104] 3GPP, "Evolved Universal Terrestrial Radio Access (E-UTRA); Base Station (BS) radio transmission and reception", 3GPP TS 36.104 10.11.0, July 2013.
[TS36133] 3GPP, "Evolved Universal Terrestrial Radio Access (E-UTRA); Requirements for support of radio resource management", 3GPP TS 36.133 12.7.0, April 2015.
[TS36211] 3GPP, "Evolved Universal Terrestrial Radio Access (E-UTRA); Physical channels and modulation", 3GPP TS 36.211 10.7.0, March 2013.
[TS36300] 3GPP, "Evolved Universal Terrestrial Radio Access (E-UTRA) and Evolved Universal Terrestrial Radio Access Network (E-UTRAN); Overall description; Stage 2", 3GPP TS 36.300 10.11.0, September 2013.
[TSNTG] IEEE Standards Association, "IEEE 802.1 Time-Sensitive Networks Task Group", 2013.
[UHD-video] Holub, P., "Ultra-High Definition Videos and Their Applications over the Network", The 7th International Symposium on VICTORIES Project PetrHolub_presentation, October 2014.
[WirelessHART] www.hartcomm.org, "Industrial Communication Networks - Wireless Communication Network and Communication Profiles - WirelessHART - IEC 62591", 2010.

Authors' Addresses

Ethan Grossman (editor) Dolby Laboratories, Inc. 1275 Market Street San Francisco, CA 94103 USA Phone: +1 415 645 4726 EMail: ethan.grossman@dolby.com URI: http://www.dolby.com
Craig Gunther Harman International 10653 South River Front Parkway South Jordan, UT 84095 USA Phone: +1 801 568-7675 EMail: craig.gunther@harman.com URI: http://www.harman.com
Pascal Thubert Cisco Systems, Inc Building D 45 Allee des Ormes - BP1200 MOUGINS - Sophia Antipolis, 06254 FRANCE Phone: +33 497 23 26 34 EMail: pthubert@cisco.com
Patrick Wetterwald Cisco Systems 45 Allees des Ormes Mougins, 06250 FRANCE Phone: +33 4 97 23 26 36 EMail: pwetterw@cisco.com
Jean Raymond Hydro-Quebec 1500 University Montreal, H3A3S7 Canada Phone: +1 514 840 3000 EMail: raymond.jean@hydro.qc.ca
Jouni Korhonen Broadcom Corporation 3151 Zanker Road San Jose, CA 95134 USA EMail: jouni.nospam@gmail.com
Yu Kaneko Toshiba 1 Komukai-Toshiba-cho, Saiwai-ku, Kawasaki-shi Kanagawa, Japan, EMail: yu1.kaneko@toshiba.co.jp
Subir Das Applied Communication Sciences 150 Mount Airy Road, Basking Ridge New Jersey, 07920, USA, EMail: sdas at appcomsci dot com
Yiyong Zha Huawei Technologies EMail: zhayiyong@huawei.com