Network Working Group                                      J. Chroboczek
Internet-Draft                         IRIF, University of Paris-Diderot
Intended status: Informational                             April 6, 2018
Expires: October 8, 2018
Applicability of the Babel routing protocol
draft-ietf-babel-applicability-02
Where we argue that although OSPF and IS-IS are fine protocols, there exists a space where the Babel routing protocol (RFC 6126bis) can be useful.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on October 8, 2018.
Copyright (c) 2018 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Babel [RFC6126bis] is a routing protocol based on the familiar distance-vector algorithm (sometimes known as distributed Bellman-Ford) augmented with mechanisms for loop avoidance (there is no "counting to infinity") and starvation avoidance. In this document, we argue that there exist niches where Babel is useful and that are not adequately served by the mature, efficient and highly refined protocols that are usually deployed, such as OSPF [RFC5340] and IS-IS [RFC1195].
At its core, Babel is a traditional distance-vector protocol based on the distributed Bellman-Ford algorithm, similar in principle to RIP [RFC2453], but with two obvious extensions: provisions for explicit neighbour reachability, bidirectional reachability and link-quality sensing, and support for multiple address families (e.g., IPv6 and IPv4) in a single protocol instance.
Algorithms of this class are simple to understand and simple to implement, but unfortunately they do not work very well — they suffer from "counting to infinity", a case of pathologically slow convergence in some topologies after a link has been brought down. Babel uses a mechanism pioneered by EIGRP [DUAL] [RFC7868], known as "feasibility", which avoids routing loops and therefore makes counting to infinity impossible.
Feasibility is a very conservative mechanism, one that not only rejects all looping routes, but also rejects some loop-free routes; it can easily lead to a situation known as starvation, where a router rejects all routes to a given destination, even those that are loop-free. In order to recover from starvation, Babel uses a mechanism pioneered by DSDV [DSDV] and known as "sequenced routes". In Babel, this mechanism is generalised to deal with prefixes of arbitrary length and routes announced at multiple points in a single routing domain (DSDV was a pure mesh protocol, and did not need to deal with such details).
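The interaction between feasibility and sequenced routes can be sketched as follows. This is a minimal illustration and not the normative definition: it assumes that a distance is a (seqno, metric) pair, that sequence numbers are compared modulo 2^16, and that a metric of 0xFFFF denotes a retraction. The names Distance, seqno_newer and is_feasible are hypothetical and chosen for this example only.

```python
from dataclasses import dataclass
from typing import Optional

INFINITY = 0xFFFF  # assumed encoding of a retraction (infinite metric)


@dataclass(frozen=True)
class Distance:
    """A distance as a (seqno, metric) pair (hypothetical helper type)."""
    seqno: int
    metric: int


def seqno_newer(a: int, b: int) -> bool:
    """True if seqno a is strictly newer than seqno b, modulo 2**16."""
    d = (a - b) & 0xFFFF
    return 0 < d < 0x8000


def is_feasible(update: Distance, fd: Optional[Distance]) -> bool:
    """Decide whether an update may be accepted.

    fd is the feasibility distance recorded for the source, or None
    if nothing has been recorded yet.
    """
    if update.metric >= INFINITY:
        return True   # retractions never create a loop
    if fd is None:
        return True   # no recorded distance to compare against
    if seqno_newer(update.seqno, fd.seqno):
        return True   # a newer sequenced route clears starvation
    # Same seqno: accept only a strictly smaller metric.
    return update.seqno == fd.seqno and update.metric < fd.metric


fd = Distance(seqno=7, metric=200)
print(is_feasible(Distance(7, 150), fd))  # strictly better metric
print(is_feasible(Distance(7, 250), fd))  # rejected, possibly loop-free
print(is_feasible(Distance(8, 250), fd))  # accepted thanks to new seqno
```

The second call shows how starvation can arise: a route with a larger metric is rejected whether or not it actually loops. The third call shows how a route carrying a newer sequence number becomes acceptable again at the same metric.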
The sequenced routes algorithm is slow to react to a starvation episode. In Babel, starvation recovery is accelerated by using explicit requests (known as "seqno requests" in the protocol) to signal a starvation episode and to cause a new sequenced route to be propagated in the network. In the absence of packet loss, this mechanism is provably complete and clears the starvation in time proportional to the diameter of the network, at the cost of some additional signalling traffic.
The fairly strong properties of the Babel protocol (convergence, loop avoidance, starvation avoidance) rely on some rather weak properties of the network and the metric being used. The most significant are:
In particular, Babel does not assume a reliable transport, it does not require an ordered transport, it does not require transitive communication, and it does not require that the metric be discrete (continuous metrics are possible, reflecting for example packet loss rates). This is in contrast to traditional link-state routing protocols such as OSPF [RFC5340] or IS-IS [RFC1195], which are layered over a reliable flooding algorithm and make some rather strong requirements on the underlying network and metric.
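As an illustration of a continuous metric reflecting packet loss rates, the following sketch uses an ETX-style link cost (expected transmission count) with additive path metrics. The choice of ETX is an assumption made for this example, and the function names etx_cost and path_metric are hypothetical. Other metrics are possible, provided that adding a link cost always yields a strictly larger metric.

```python
def etx_cost(forward_delivery: float, reverse_delivery: float) -> float:
    """ETX-style cost of a link from its two delivery ratios.

    Each ratio is the fraction of packets delivered in one direction
    and must lie in (0, 1].  The result is a real number >= 1.
    """
    for ratio in (forward_delivery, reverse_delivery):
        if not 0.0 < ratio <= 1.0:
            raise ValueError("delivery ratios must lie in (0, 1]")
    return 1.0 / (forward_delivery * reverse_delivery)


def path_metric(neighbour_metric: float, link_cost: float) -> float:
    """Additive metric: the neighbour's metric plus the link cost."""
    return neighbour_metric + link_cost


# A link that loses half of the packets one way and a fifth the other way.
cost = etx_cost(0.5, 0.8)
metric = path_metric(3.0, cost)
print(cost, metric)
```

Since every link cost is at least 1, the metric strictly increases at each hop, so any cycle has a strictly positive metric.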
Babel is a conceptually simple protocol. It consists of a familiar algorithm (distributed Bellman-Ford) augmented with three simple and well-defined mechanisms (feasibility, sequenced routes and explicit requests). Given a sufficiently friendly audience, the principles behind Babel can be explained in 15 minutes, and a full description of the protocol can be done in 52 minutes (one microcentury).
An important consequence is that Babel is easy to implement. While Babel is a young protocol, there already exist four independent implementations, one of which was reportedly written and debugged in just two nights.
Babel's correctness depends on a small number of fairly weak and reasonably obvious properties. This makes Babel in many ways a robust protocol:
These robustness properties have important consequences for the applicability of the protocol: Babel works (more or less efficiently) in a wide range of networks where traditional routing protocols give up.
Babel's packet format has a number of features designed to make the protocol extensible, and a number of extensions have been designed to make Babel work in situations that were not envisioned when the protocol was initially designed. This extensibility is not an accident, but a consequence of the design of the protocol: it is easy to check whether a given extension violates the assumptions made by the protocol.
Remarkably enough, all of the extensions designed to date interoperate with the base protocol and with each other. Again, this is a consequence of the protocol design: in order to check the interoperability of two implementations of Babel, it is enough to verify that the interaction of the two does not violate the protocol's assumptions.
Notable extensions deployed to date include:
Some other extensions have been designed, but have not seen deployment yet (and their usefulness is yet to be demonstrated):
Babel has some undesirable properties that make it suboptimal or even unusable in some deployments.
The main mechanisms used by Babel to reconverge after a topology change are reactive: triggered updates, triggered retractions and explicit requests. However, in the presence of heavy packet loss, Babel relies on periodic updates to clear routing pathologies. This reliance on periodic updates makes Babel unsuitable in at least two kinds of deployments:
While there exist techniques that allow a Babel speaker to function with a partial routing table (e.g., by using just a default route), the basic design of the protocol is that every Babel speaker has a full routing table. In networks where some nodes are too constrained to hold a full routing table, protocols such as AODVv2 [AODVv2], RPL [RFC6550] and LOADng [LOADng] may be preferable to Babel.
Babel's loop-avoidance mechanism relies on making a route unreachable after a retraction until all neighbours have been guaranteed to have acted upon the retraction, even in the presence of packet loss. Unless the optional algorithm described in Section 3.5.5 of [RFC6126bis] is implemented, this entails that a node is unreachable for a few minutes after the most specific route to it has been retracted. This property may make Babel undesirable in networks that perform automatic aggregation.
In this section, we give a few examples of environments where Babel has been successfully deployed.
Babel is able to deal with both classical, prefix-based ("Internet-style") routing and flat ("mesh-style") routing over non-transitive link technologies. Because of that, it has seen a number of successful deployments in medium-sized hybrid networks, networks that combine a wired, aggregated backbone with meshy wireless bits at the edges. No other routing protocol known to us is similarly robust and efficient in this particular type of network.
The algorithms used by Babel (loop avoidance, hysteresis, delayed updates) allow it to remain stable and efficient in the presence of unstable metrics, even in the presence of a feedback loop. For this reason, it has been successfully deployed in large-scale overlay networks, built out of thousands of tunnels spanning continents, where it is used with a metric computed from links' latencies [DELAY-BASED].
This particular application depends on the extension for RTT-sensitive routing.
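One way of deriving a cost from a link's latency is sketched below. This is only an illustration under stated assumptions: the clamped linear mapping, the function name rtt_penalty and the default thresholds are all hypothetical, and the mapping actually used by the RTT extension may differ.

```python
def rtt_penalty(rtt_ms: float,
                rtt_min_ms: float = 10.0,    # hypothetical threshold
                rtt_max_ms: float = 120.0,   # hypothetical threshold
                max_penalty: float = 150.0   # hypothetical maximum
                ) -> float:
    """Map a measured RTT to an additive cost penalty.

    Below rtt_min_ms the link pays no penalty, above rtt_max_ms it pays
    max_penalty, and in between the penalty grows linearly.
    """
    if rtt_max_ms <= rtt_min_ms:
        raise ValueError("rtt_max_ms must be greater than rtt_min_ms")
    if rtt_ms <= rtt_min_ms:
        return 0.0
    if rtt_ms >= rtt_max_ms:
        return max_penalty
    return max_penalty * (rtt_ms - rtt_min_ms) / (rtt_max_ms - rtt_min_ms)


for sample in (5.0, 65.0, 200.0):
    print(sample, rtt_penalty(sample))
```

Bounding the penalty keeps a single very slow link from dominating route selection, and the flat region below the lower threshold avoids reacting to small variations on fast links.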
While Babel is a general-purpose routing protocol, it has been repeatedly shown to be competitive with dedicated routing protocols for wireless mesh networks [REAL-WORLD] [BRIDGING-LAYERS]. While this particular niche is already served by a number of mature protocols, notably OLSR-ETX and OLSRv2 [RFC7181] equipped with the DAT metric [RFC7779], Babel has seen a moderate amount of successful deployment in pure mesh networks.
Because of its small size and simple configuration, Babel has been deployed in small, unmanaged networks (three to five routers), where it serves as a more efficient replacement for RIP [RFC2453], over which it has two significant advantages: the ability to route multiple address families (IPv6 and IPv4) in a single protocol instance, and good support for using wireless links for transit.
This document requires no IANA actions. [RFC Editor: please remove this section before publication.]
As in all distance-vector routing protocols, a Babel speaker receives reachability information from its neighbours, which by default is trusted. A number of attacks are possible if this information is not suitably protected, either by a lower-layer mechanism or by an extension to the protocol itself (e.g., [RFC7298]).
Implementors and deployers must be aware of the insecure nature of the base protocol, and must take suitable measures to ensure that the protocol is deployed as securely as required by the application.