Networking Working Group | J. Tripathi, Ed. |
Internet-Draft | J. C. de Oliveira, Ed. |
Intended status: Informational | Drexel University |
Expires: September 30, 2013 | JP. Vasseur, Ed. |
Cisco Systems, Inc. | |
March 29, 2013 |
Why Reactive Protocols are ill-suited for LLNs
draft-tripathi-roll-reactive-applicability-00
This document describes serious issues with the use of reactive routing protocols in Low power and Lossy Networks (LLNs). It studies the routing requirements of various LLN deployments in order to judge how well reactive routing meets them, and points out the shortcomings of reactive protocols when operating in LLNs.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 30, 2013.
Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
In 2008 the IETF chartered a Working Group called ROLL (Routing Over Low power and Lossy networks) with the objective of specifying the routing requirements of Low power and Lossy Networks (LLNs), and then either selecting an existing routing protocol or specifying and designing a new routing protocol in light of the unique requirements of these networks. This led to the specification of a new routing protocol called RPL (see [RFC6550]), along with other specifications related to RPL.
Despite the existence of a standards track routing protocol for LLNs, discussions have taken place as to whether other routing approaches could be used, such as deploying reactive routing protocols in LLNs like smart metering networks, industrial automation networks, and water management networks. The aim of this document is not to discuss a specific reactive routing protocol, but to explain why such routing protocols are ill-suited for LLNs and to highlight the shortcomings and technical issues of using reactive routing in such networks. For the sake of illustration, a reactive protocol, LOADng ([I-D.clausen-lln-loadng]), has been introduced in the IETF, with results reported to demonstrate that it improves on the standard AODV protocol for LLNs ([clausen-load-vtc]). Note that reactive routing is perfectly applicable to other types of ad hoc networks; this document exclusively discusses why reactive routing is ill-suited to LLNs such as Smart Grid networks and Industrial Automation networks, to mention a few.
Please refer to [ROLL-TERMS] for terminology.
An LLN often has a more demanding traffic pattern than simple traffic between two peer nodes (P2P traffic). The nature of the deployments and applications running over an LLN often requires a given node to send the same data to multiple recipients, which requires multicast or point-to-multipoint (P2MP) support from the routing protocol. These destinations may be several hops away, so an efficient dissemination method is needed. Examples of such traffic include:
Indeed, [RFC5548] requires support for P2MP traffic for a protocol to be suitable for U-LLN deployment. It states: "Thus, the protocol(s) should be optimized to support a large number of unicast flows from the sensing nodes or sensing clusters towards a LBR, or highly directed multicast or anycast flows from the nodes towards multiple LBRs." and "A U-LLN may also need to support efficient large-scale messaging to groups of actuators."
While a reactive routing protocol such as LOADng (see [I-D.clausen-lln-loadng]) may be altered to support an on-demand bi-directional route between any two nodes in the network, the protocol has no provision for P2MP traffic. A naive and very costly solution would be to send a separate copy of the same message or application data to each destination. In addition, to find the route to each destination, the node may have to create and broadcast a separate route request (RREQ) message per destination, which creates a huge control overhead. The number of intended destinations for this P2MP traffic can be on the order of hundreds or thousands, which is very common in an LLN deployed in an urban area, or for the Advanced Metering Infrastructure (AMI) meters of a particular region in a smart grid deployment. In that case the control overhead does not scale with the network size, and the situation worsens as nodes are added to a region. Hence, reactive routing protocols in general may be unsuitable for deployment in a large-scale U-LLN where P2MP traffic needs to be supported, even if only once or twice a day.
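The scaling argument can be made concrete with a rough, back-of-the-envelope sketch. It assumes, purely for illustration, that the source issues one network-wide RREQ flood per destination and that every node except the destination transmits each RREQ exactly once; RREP traffic, retransmissions, and any flood-scoping optimization are ignored. The function names are hypothetical and not taken from any protocol specification.

```python
def flood_cost(num_nodes: int) -> int:
    """Transmissions for one network-wide RREQ flood.

    Assumption: the originator and every other node except the
    destination transmit the RREQ exactly once.
    """
    return num_nodes - 1


def p2mp_discovery_cost(num_nodes: int, num_destinations: int) -> int:
    """RREQ transmissions when one source discovers each destination separately."""
    return num_destinations * flood_cost(num_nodes)


if __name__ == "__main__":
    # 2000 nodes is the network size used as an example later in this document.
    for destinations in (1, 100, 1000):
        cost = p2mp_discovery_cost(2000, destinations)
        print(f"{destinations:5d} destinations -> {cost} RREQ transmissions")
```

Under these assumptions the cost grows with the product of the network size and the number of destinations, which is why a single P2MP message to a few hundred meters already costs hundreds of thousands of control transmissions.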
Like P2MP traffic, multipoint-to-point (MP2P) traffic needs to be supported by a routing protocol designed for an LLN. This traffic pattern arises when multiple nodes try to send data to a single sink at the same time. As [RFC5548] describes, "A U-LLN should support occasional large-scale traffic flows from sensing nodes through LBRs (to nodes outside the U-LLN), such as system-wide alerts." By nature, this kind of traffic may be time-critical, and the alerts may need to be delivered within a short time frame. Similar traffic may arise in critical scenarios in an industrial automation LLN deployment, where all nodes in one area may need to report a malfunction, fire, or other emergency in that part of the manufacturing plant.
Since reactive routing does not, by default, provide any efficient MP2P routing paradigm, every node will create its own route request (RREQ) to find the route to the LBR or server if the route has expired or is not cached. Each node thus generates separate broadcast control messages, which may lead to a broadcast storm in the network, similar to the P2MP traffic scenario. The broadcast storm may cause individual route replies (RREPs) to reach the initiating node much later, resulting in a high delay for the time-critical alert packets.
Since reactive routing is not efficient for MP2P traffic and does not follow an aggregation tree, it is also difficult to implement a data aggregation algorithm that is compatible with AODV or any of its derivatives.
An LLN that is provisioned for data gathering only may need to include additional application layer modules in the future. Smart grid deployments may need to implement new modules of management traffic from the base stations to AMI meters, in addition to what is envisioned at present. LLNs are evolving, and new applications and requirements are therefore expected to be part of their future (an LLN may also be re-purposed).
Reactive protocols discover the route to a destination on demand. If communications between the same source-destination pair are spaced far apart in time, the protocol tends to rediscover the route every time the application layer requests communication. With reactive routing protocols, adding a new application module that requires additional data communication on top of the existing traffic pattern will incur additional control overhead. Hence, if a network is designed to operate within a bound on the maximum control overhead, adding new application modules may well push the control overhead beyond the designed limit. For example, a deployment requiring both MP2P and P2P applications may incur more overhead than a deployment that currently performs only data aggregation. Since LLNs will undoubtedly require more application and management modules in the future, a suitable routing protocol should not incur more control overhead with each added traffic flow. For the sake of illustration, many Smart Grid networks that were originally designed for advanced metering now need to be multi-service networks supporting a variety of applications, including meter reading, use of meters for alarms, distribution automation, and electric vehicles, leading to a variety of traffic patterns, each with different Quality of Service requirements.
A reactive protocol is well-suited for a traffic pattern where data transfer is not very frequent. However, if the traffic pattern includes periodic data reporting, even as rarely as a few times a day, it will induce periodic broadcasts of route requests throughout the network. A simple example clarifies this: assume an application in a U-LLN requires periodic data reporting every 6 hours, that is, 4 times a day (morning, noon, evening, and night). If the network consists of 2000 nodes, which is a very conservative number for a typical U-LLN, and each report triggers a route discovery, the application alone generates 8000 route request broadcasts per day, or one network-wide route request broadcast roughly every 11 seconds on average. Thus, over the life of the sensor network, a reactive protocol will use more control overhead than a proactive protocol.
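The arithmetic behind the 11-second figure can be checked with a few lines of code. The sketch assumes that every report finds its cached route expired and therefore triggers one RREQ flood, and that reports are spread evenly over the day; the function name is illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def mean_seconds_between_floods(num_nodes: int, reports_per_node_per_day: int) -> float:
    """Average spacing of RREQ floods if every report triggers one discovery."""
    floods_per_day = num_nodes * reports_per_node_per_day
    return SECONDS_PER_DAY / floods_per_day


if __name__ == "__main__":
    interval = mean_seconds_between_floods(num_nodes=2000, reports_per_node_per_day=4)
    print(f"{interval:.1f} s between RREQ floods on average")  # prints 10.8 s
```

With 2000 nodes and 4 reports per day this gives 86400 / 8000 = 10.8 seconds, which rounds to the figure quoted in the text.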
The amount of flooding needed to discover routes may also be controlled by tuning the route expiry time (or route validity time). While a route is active, nodes should not waste network resources trying to rediscover the route to the same destination, and a high expiry time may prevent flooding the next time data is generated for that destination. However, the path may well have become invalid before the route expires: given LLN link characteristics, link flapping is a very frequent event. With a high route expiry time, a node may therefore discover that a cached path is invalid only when it tries to use it, and is then forced to flood the network again for route discovery. Furthermore, if traffic is sent along a broken path, a new route request is consequently generated, which increases the control traffic load and adds delay for the user data. Thus, increasing the route expiry time or route validity time of an AODV-based reactive protocol may not be an effective way to control flooding in an LLN. Proactively choosing a back-up path is an effective way to ensure a valid routing path in the presence of link flaps.
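The trade-off can be illustrated with a minimal route cache sketch. It is not taken from any reactive protocol specification: the class and method names are hypothetical, and the model assumes a single next hop per destination with an absolute expiry time. It shows that a long lifetime avoids rediscovery only as long as the link to the next hop stays up; once that link fails, the entry is useless regardless of its remaining lifetime.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RouteEntry:
    next_hop: str
    expires_at: float  # absolute time, in seconds


class RouteCache:
    """Hypothetical per-node cache of discovered routes."""

    def __init__(self, route_lifetime: float) -> None:
        self.route_lifetime = route_lifetime
        self._entries: Dict[str, RouteEntry] = {}

    def install(self, destination: str, next_hop: str, now: float) -> None:
        """Store a route learned from a route reply."""
        self._entries[destination] = RouteEntry(next_hop, now + self.route_lifetime)

    def lookup(self, destination: str, now: float) -> Optional[str]:
        """Return the next hop, or None if a new RREQ flood is needed."""
        entry = self._entries.get(destination)
        if entry is None or now >= entry.expires_at:
            self._entries.pop(destination, None)
            return None
        return entry.next_hop

    def report_link_failure(self, next_hop: str) -> None:
        """Drop every route through a neighbor that stopped responding."""
        stale = [d for d, e in self._entries.items() if e.next_hop == next_hop]
        for destination in stale:
            del self._entries[destination]


if __name__ == "__main__":
    cache = RouteCache(route_lifetime=3600.0)          # one-hour lifetime
    cache.install("LBR", next_hop="nodeB", now=0.0)
    print(cache.lookup("LBR", now=1800.0))             # still cached: nodeB
    cache.report_link_failure("nodeB")                 # link flap after 30 minutes
    print(cache.lookup("LBR", now=1801.0))             # None: flood again
```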
Note that there is a lot of experimental evidence supporting the claim that using floods or scoped floods to discover routes is ill-suited to LLNs. This is due to the low-power requirement: in low-power wireless networks, broadcast packets usually cost much more to transmit than unicast ones.
Reactive routing protocols usually rely on caching routes to discovered destinations. Hence, if a node participates in multiple active flows in the network, it needs to store next hop (and validity) information for the source and destination of each of those flows. Depending on the user traffic, the routing table of some nodes thus grows in proportion to the number of flows passing through them, and those nodes require more memory to operate successfully. However, the characteristics of LLNs never guarantee that any node has enough storage space for routing tables. In LLNs it is therefore always advantageous to store routes to only a subset of nodes and still be able to find a path to any destination in the network. Reactive routing protocols, in their current specification or any variant, fail to achieve a low memory requirement in nodes.
Destination-oriented flooding in an LLN tends to worsen this situation. Multiple route requests for different destinations may reach the same node at the same time. Even though the destination may never be reachable through that node, the node, along with its neighbors, still has to process and re-broadcast each request. Vital memory is thus consumed in processing and buffering RREQ/RREP messages.
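A small sketch can show how the routing table of a relay grows with the number of flows it carries. It assumes bi-directional routes, so that every node on a path keeps one entry towards the flow's destination and one towards its source; the function name and the example topology are illustrative assumptions only.

```python
from collections import defaultdict
from typing import Dict, List, Set


def routing_table_sizes(flow_paths: List[List[str]]) -> Dict[str, int]:
    """Count cached route entries per node for a set of active flows.

    Each path lists the nodes from source to destination. Every node on
    the path stores a forward route (to the destination) and a reverse
    route (to the source), except for routes to itself.
    """
    tables: Dict[str, Set[str]] = defaultdict(set)
    for path in flow_paths:
        source, destination = path[0], path[-1]
        for node in path:
            if node != destination:
                tables[node].add(destination)
            if node != source:
                tables[node].add(source)
    return {node: len(entries) for node, entries in tables.items()}


if __name__ == "__main__":
    # Three independent flows that all happen to pass through relay "R".
    flows = [["A", "R", "X"], ["B", "R", "Y"], ["C", "R", "Z"]]
    sizes = routing_table_sizes(flows)
    print(sizes["R"], sizes["A"])  # relay holds 6 entries, an endpoint holds 1
```

The relay's table size is driven by the traffic pattern rather than by anything the relay itself controls, which is the memory concern raised above.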
Apart from providing a route between any two nodes in the network, a routing protocol suitable for LLNs should be able to handle additional constraints. An LLN mainly consists of devices that are constrained in both functionality and memory, and it is inherently heterogeneous. Hence, any routing protocol suitable for LLNs should support node-constraint-based routing. This requirement is mandated in [RFC5548] as follows: "the routing protocol MUST be able to advertise node capabilities that will be exclusively used by the routing protocol engine for routing decision". For example, the routing protocol should avoid a node with little remaining battery power when routing towards a server. Similarly, for industrial automation, [RFC5673] requires a routing protocol to provide device-aware routing: "The routing algorithm MUST support node-constrained routing (e.g., taking into account the existing energy state as a node constraint). Node constraints include power and memory, as well as constraints placed on the device by the user, such as battery life". For home automation routing, [RFC5826] specifies: "The routing protocol SHOULD route via mains-powered nodes if possible. The routing protocol MUST support constraint-based routing taking into account node properties (CPU, memory, level of energy, sleep intervals, safety/convenience of changing battery)".
Clearly, recognizing a node's capabilities and routing accordingly is an important aspect of any routing protocol designed for LLNs. However, AODV-based protocols (such as DYMO [I-D.DYMO] and LOADng), in their current specifications, fail to provide routes based on any such constraint. Currently known reactive routing protocols have no provision to determine whether the next-hop node in a route has enough battery power to sustain the route, whether it is mains-powered, or whether it provides a particular functionality. Thus, these protocols fail to meet the requirements mandated by [RFC5548], [RFC5673] and [RFC5826] for routing in an LLN.
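To make concrete what node-constraint-based routing asks for, the following sketch computes a shortest path that only relays through nodes satisfying a caller-supplied constraint. It is a generic Dijkstra search with a node filter, written for illustration; it does not describe the mechanism of RPL or of any other protocol, and the topology, battery levels, and 20% threshold are invented for the example.

```python
import heapq
from typing import Callable, Dict, List, Optional

Graph = Dict[str, Dict[str, float]]  # node -> {neighbor: link cost}


def constrained_shortest_path(
    graph: Graph, source: str, target: str, node_ok: Callable[[str], bool]
) -> Optional[List[str]]:
    """Lowest-cost path whose intermediate nodes all satisfy node_ok."""
    best = {source: 0.0}
    previous: Dict[str, str] = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            path = [node]
            while node in previous:
                node = previous[node]
                path.append(node)
            return path[::-1]
        if cost > best.get(node, float("inf")):
            continue  # outdated queue entry
        for neighbor, link_cost in graph[node].items():
            # Relays must meet the constraint; the target is always allowed.
            if neighbor != target and not node_ok(neighbor):
                continue
            new_cost = cost + link_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    return None


if __name__ == "__main__":
    graph = {
        "S": {"A": 1, "B": 1},
        "A": {"S": 1, "LBR": 1},
        "B": {"S": 1, "C": 1},
        "C": {"B": 1, "LBR": 1},
        "LBR": {"A": 1, "C": 1},
    }
    battery = {"S": 0.9, "A": 0.05, "B": 0.8, "C": 0.7, "LBR": 1.0}
    print(constrained_shortest_path(graph, "S", "LBR", lambda n: True))
    print(constrained_shortest_path(graph, "S", "LBR", lambda n: battery[n] >= 0.2))
```

Without the constraint the path goes through the nearly depleted node A (S, A, LBR); with it, the longer path S, B, C, LBR is chosen. A protocol can only make this choice if node properties such as the energy level are advertised to the routing engine.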
Some data in an LLN is delay-sensitive by nature. While data generated for periodic reporting can be delivered even a few seconds late, emergency alarms, fault notifications, and alert packets need to be delivered as quickly as possible. According to [RFC5673], in an industrial automation setting, "Non-critical closed-loop applications have a latency requirement that can be as low as 100 milliseconds but many control loops are tolerant of latencies above 1 second". Clearly, alert packets of this type need a path to the destination to be available as soon as they are generated. However, reactive protocols must first find a path once data is generated. Since the receipt of an RREP packet can take up to the network traversal time, the delivery of an emergency packet in a large network may take more than a few hundred milliseconds. Also, if the emergency triggers a system-wide alert from all nodes in the region to one or more servers outside the LLN, each node will create its own broadcast to reach the destination. Similar to the condition in section 3, the resulting broadcast storm may lead to a huge delay in the receipt of RREP packets, and thus in the delivery of the emergency packets. The broadcasts will also cause high control overhead, clogging the network, as well as the loss of some RREQ/RREP packets.
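A simple latency model illustrates why on-demand discovery is a problem for alert traffic. The sketch assumes an idealized, loss-free network in which the RREQ needs one per-hop delay for each hop to the destination and the RREP needs the same on the way back; the hop count and the 20 ms per-hop delay are illustrative assumptions, not measurements.

```python
def discovery_delay_ms(hops: int, per_hop_delay_ms: float) -> float:
    """RREQ travels `hops` hops to the destination, RREP travels them back."""
    return 2 * hops * per_hop_delay_ms


def first_packet_delay_ms(hops: int, per_hop_delay_ms: float, route_cached: bool) -> float:
    """Delay until the first data packet reaches the destination."""
    forwarding = hops * per_hop_delay_ms
    if route_cached:
        return forwarding
    return discovery_delay_ms(hops, per_hop_delay_ms) + forwarding


if __name__ == "__main__":
    hops, per_hop = 10, 20.0  # assumed values for illustration
    print(first_packet_delay_ms(hops, per_hop, route_cached=True))   # 200.0
    print(first_packet_delay_ms(hops, per_hop, route_cached=False))  # 600.0
```

Even in this best case the first packet takes three times as long when the route has to be discovered first; contention and losses during a broadcast storm would only add to the delay.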
TBD