Internet DRAFT - draft-chroboczek-babel-doesnt-care
draft-chroboczek-babel-doesnt-care
Network Working Group J. Chroboczek
Internet-Draft PPS, University of Paris-Diderot
Intended status: Informational April 4, 2015
Expires: October 6, 2015
Babel doesn't care
draft-chroboczek-babel-doesnt-care-00
Abstract
Where we claim that, unlike typical IGPs, the Babel routing protocol
just doesn't care.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on October 6, 2015.
Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Chroboczek Expires October 6, 2015 [Page 1]
Internet-Draft Babel doesn't care April 2015
Table of Contents
1. Introduction: why Babel doesn't care . . . . . . . . . . . . 2
1.1. Consequences and applicability of the protocol . . . . . 3
2. Structure of the Babel routing protocol . . . . . . . . . . . 3
2.1. Distance-vector protocol . . . . . . . . . . . . . . . . 4
2.2. Link sensing . . . . . . . . . . . . . . . . . . . . . . 4
2.3. Metric agnosticity . . . . . . . . . . . . . . . . . . . 5
2.4. Flexible route selection policies . . . . . . . . . . . . 5
2.5. Loop avoidance . . . . . . . . . . . . . . . . . . . . . 5
2.6. Blackhole elimination . . . . . . . . . . . . . . . . . . 6
3. Examples of how Babel doesn't care . . . . . . . . . . . . . 6
3.1. Lossy networks . . . . . . . . . . . . . . . . . . . . . 6
3.2. Non-transitive (mesh) networks . . . . . . . . . . . . . 7
3.3. Non-default route selection policies . . . . . . . . . . 7
3.4. Non-distributive metrics . . . . . . . . . . . . . . . . 7
3.5. Delay-based routing . . . . . . . . . . . . . . . . . . . 8
3.6. Source-specific routing . . . . . . . . . . . . . . . . . 9
3.7. Interoperability . . . . . . . . . . . . . . . . . . . . 9
4. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . 9
5. References . . . . . . . . . . . . . . . . . . . . . . . . . 10
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 11
1. Introduction: why Babel doesn't care
The core of the Babel routing protocol [RFC6126] consists of a
distance vector protocol based on a distributed variant of the
Bellman-Ford algorithm combined with a loop avoidance and a blackhole
elimination mechanism. As long as these mechanisms are left intact,
Babel will push packets in roughly the right direction following a
loop-free path.
Babel doesn't care what happens in the network, in the metric, or in
the route selection policy, or whether routing information is
propagated in a timely manner through the network. In particular,
Babel has been shown to work on unstable networks, on wireless
networks, on mesh networks, on hybrid networks (networks composed of
an unstable meshy bit combined with a stable wired infrastructure),
on overlay networks, and has been extended with source-specific
routing. Some of these extensions involved fundamental changes to
the structure of metrics or even the forwarding semantics, but, to a
great extent, Babel just didn't care.
This is in contrast with familiar link-state IGPs, such as OSPF and
IS-IS. These link-state protocols deeply care that all nodes in a
given area have their link state databases synchronised in a timely
manner, lest routing loops and other pathologies occur. OSPF and IS-
IS therefore require careful (see?) implementation of timely and
Chroboczek Expires October 6, 2015 [Page 2]
Internet-Draft Babel doesn't care April 2015
reliable flooding, do not support dynamically computed metrics
without careful attention to stability issues, do not allow any form
of filtering or non-default route selection except at area
boundaries, and require a distributive metric.
1.1. Consequences and applicability of the protocol
Babel doesn't care where the routing information comes from --
whatever information it is given, it maintains a set of loop-free
paths. As we show in Section 3, this has allowed Babel to either
work out of the box or be easily adapted to a wide range of
situations that are difficult or outright impossible to solve with
the familiar link-state routing protocols.
This enormous flexibility of the protocol makes Babel particularly
adapted to tricky networks where the features of familiar routing
protocols (shortest-path next-hop routing over transitive links with
statically defined metrics) are not good enough. This can include
mesh networks, overlay networks, radio networks, etc. Since all of
these kinds of networks are handled by a single protocol, Babel is
particularly applicable to hybrid networks, whose different parts
need different algorithms, for example networks composed of a meshy
part and a stable, wired part.
Babel is not necessarily the best choice in large, stable, carefully
administered networks, where a more rigid protocol such as OSPF or
IS-IS will have lower overhead and be easier to debug.
2. Structure of the Babel routing protocol
The Babel routing protocol [RFC6126] is a loop-avoiding distance
vector protocol. The core Babel protocol is reasonably simple (it is
described in 30 pages in RFC 6126 and has been reimplemented from
scratch in slightly over 20 hours by Markus Stenberg), and consists
of the following components:
o a distance vector routing protocol that is metric-agnostic and
supports flexible routing policies;
o explicit mechanisms (Hello/IHU) for link sensing and bidirectional
reachability detection;
o a loop-avoidance mechanism that makes strong guarantees even in
transient state;
o a provably complete blackhole avoidance mechanism.
Chroboczek Expires October 6, 2015 [Page 3]
Internet-Draft Babel doesn't care April 2015
In the rest of this section, we describe these mechanisms in more
detail.
2.1. Distance-vector protocol
At the core of Babel lies a distance-vector routing protocol, based
on a distributed variant of the Bellman-Ford algorithm, similiar to
the core of BGP or EIGRP.
Distance-vector IGPs have become unfashionable sometime in the 1990s.
However, distance vector has a number of very pleasant properties: it
is simple to implement, reasonably robust in the presence of
implementation mistakes, and allows the use of redundant routing
tables (routing tables with multiple routes to a single destination),
which makes reconvergence after a link failure almost instantaneous
(no packet exchanges) when the alternate routes are already
available.
Another important property is that, unlike link-state, distance
vector builds its routes consistently with the direction of
forwarding in next-hop routing. This makes it robust in the presence
of routing inconsistencies, a property on which Babel relies when it
delays sending updates or uses a non-distributive metric (see
Section 3.4 below).
2.2. Link sensing
Like most modern routing protocols, Babel has a simple and
lightweight mechanism for detecting neighbours and eliminating
assymetric neighbours. Every node sends periodic Hello TLVs;
somewhat less often, every node sends IHU ("I Heard You") TLVs that
contain the list of all the nodes from which it has heard a Hello.
Hello TLVs carry a sequence number (not to be confused with the
sequence number used for loop avoidance). Additionally, both Hello
and IHU TLVs carry an explicit interval -- an upper bound on the time
after which a new Hello or IHU can be expected. This allows nodes
with different parameters to reliably establish neighbour
relationships, since the hold time for a neighbour can be determined
dynamically from the explicit interval. Additionally, the Hello
seqno allows a node to vary its Hello to send unscheduled Hellos
without confusing its neigbours, for example after detecting a
mobility event.
Both Hello and IHU TLVs are extensible. For example, the delay-based
routing extension [BABEL-RTT] uses Hellos and IHUs to carry
timestamps and timestamp echoes respectively.
Chroboczek Expires October 6, 2015 [Page 4]
Internet-Draft Babel doesn't care April 2015
2.3. Metric agnosticity
Distance vector's robustness and the extensibility of Babel's packet
format conspire to make Babel metric-agnostic: a metric in Babel
doesn't need to be an integer, it can be any sort of algebra that is
equipped with a strictly-monotonic ordering (Section 3.5.2 of
[RFC6126]). In particular, a Babel metric need not be distributive
-- a property that is used by Babel's radio interference extension
[BABEL-Z]. More about this in Section 3.4 below.
What is more, the loop-avoidance features of Babel ensure that a
packet is pushed in roughly the right direction, following a loop-
free path, even when the metric is extremely unstable. This property
is what makes delay-based routing work reliably even in the presence
of congestion.
2.4. Flexible route selection policies
Distance vector is flexible, in the sense that routes can be dropped
("filtered") out at any point in the network, and that a node can
that is unwilling to forward packets may advertise an artificially
inflated metric ("prepending"). This is very different from link
state, where filtering can only be done at area boundaries. The
reference implementation of Babel includes a powerful language for
filtering and prepending, and this feature has proven to be extremely
useful to work around flaws in the data plane (e.g. for discarding
NATed or firewalled routes).
What is more, route selection need not choose the route with smallest
metric -- any route with finite metric will do, as long as the choice
is consistent with the data plane. This property is used by Babel to
improve stability by applying hysteresis during route selection.
2.5. Loop avoidance
Like EIGRP and BGP, Babel is a loop-avoiding protocol: only those
routes are accepted that can be proven to be loop-free. A route that
has this property is said to be "feasible".
Babel's feasibility condition is based on the one used in EIGRP
[DUAL], but is structured similarly to DSDV [DSDV]. Thus, Babel
avoids EIGRP's active phase (not needed with DSDV-style feasibility)
while also avoiding DSDV's issues with retractions (the even-odd
mechanism of DSDV is not necessary with EIGRP-style feasibility) and
minor metric fluctuations (EIGRP-style feasibility can cushion up to
one hop of fluctuation).
Chroboczek Expires October 6, 2015 [Page 5]
Internet-Draft Babel doesn't care April 2015
Babel's loop-freedom properties are well understood (Section 1.1 of
[RFC6126]) and have been proven (the proof has not been published
yet).
2.6. Blackhole elimination
It is well known that DSDV-style loop avoidance, as used in Babel,
has an issue: it may sometimes cause spurious blackholes. This
situation is known as "starvation" in Babel (Section 2.5 of
[RFC6126]).
DSDV attempted to work around the issue by updating a route's seqno
periodically, which made DSDV unsuitable for high-mobility scenarios:
after moving a DSDV laptop to a new place, the user would need to
wait for up to 30 seconds for a new seqno to be generated and
propagate across the network. What is more, the need to propagate
new seqnos in a timely manner put an upper bound on DSDV's update
interval.
Babel uses a different mechanism for clearing spurious blackholes,
which is based on the observation that it should be the starving node
that triggers a seqno update. When a node detects that it is
suffering from starvation (all the routes to a given destination are
unfeasible), it sends an end-to-end "seqno request" that requests
that the source increase its seqno number. Since seqno requests
follow routes that have not been proven to be loop-free, they are
equipped with a hop count; combined with duplicate request detection,
this mechanism is complete and reasonably efficient.
3. Examples of how Babel doesn't care
In the rest of this document, we describe a number of situations in
which Babel just doesn't care.
3.1. Lossy networks
Babel was designed to support lossy networks well, and one of the
metrics suggested in Appendix A of [RFC6126] depends on packet loss.
This metric tends to fluctuate in normal usage, but two features of
Babel conspire to make it work well: Babel's feasibility function is
able to keep a route feasible when the metric doesn't fluctuate by
more than the cost of one hop, and Babel's loop avoidance ensures
that packets are pushed in the right direction even when the metric
fluctuates.
The metric may fluctuate -- but Babel doesn't care.
Chroboczek Expires October 6, 2015 [Page 6]
Internet-Draft Babel doesn't care April 2015
3.2. Non-transitive (mesh) networks
A non-transitive (or mesh) network is one in which there is no well-
defined notion of link -- where the neighbour relationship between
interfaces is not transitive. Routing protocols such as OSPF and IS-
IS require a transitive network. Babel just doesn't care.
(In fact, it does care a little -- the split horizon optimisation can
be enabled on a transitive, lossless link.)
3.3. Non-default route selection policies
Babel doesn't require that a router always pick the route with
smallest metric. The route selection procedure uses this property to
prefer stable routes (routes that have been around for a while) to
potentially unstable ones (routes that have popped up recently), and
could potentially take into account information that has been
obtained from other sources than the protocol (say, battery charge
information).
Choosing a path other than the shortest one would break a link-state
protocol. Babel doesn't care.
3.4. Non-distributive metrics
In most routing protocols, metrics consist of some set of bounded
integers equipped with the usual addition. But there is no
requirement that the metric have such a simple structure. Babel will
work with any metric that is strictly monotonic on the left:
m < n.m
Link-state protocols make one further requirement on the metric,
called left distributivity or isotonicity, which is not required by
distance vector protocols:
m <= m' ---> n.m <= n.m'
To see how a non-distributive metric can arise, consider Babel's
radio interference extension [BABEL-Z], which aims to minimise the
number of times a route crosses a given frequency. Suppose that A is
a single radio router, while routers B and C have two radios; suppose
further that the dashed frequency has less packet loss than the
dotted frequency:
Chroboczek Expires October 6, 2015 [Page 7]
Internet-Draft Babel doesn't care April 2015
q
p -----
A ----- B ..... C
q'
Then the preferred route from A to C is p.q', while the preferred
route from B to C is q. The metric is non-distributive:
q <= q' but p.q > p.q'
(A note for BGP specialists: this is analogous to what happens in BGP
when q is a customer route, and is therefore preferred by B, while q'
is a shorter peer route, and would therefore be preferable for A.
Many familiar BGP policies can be modelled by non-distributive
metrics.)
Using a non-distributive metric is impossible in link-state
protocols, where it may lead to persistent routing loops. Babel
doesn't care, and converges to a Nash equilibrium. (This assertion
is currently unproven, but similar properties are known to hold for
Bellman-Ford.)
3.5. Delay-based routing
Overlay networks -- networks built over tunnels or VPNs -- present a
complex network topology as a single hop. Efficient routing in such
networks requires distinguishing between local and remote hops, which
is usually done by a human administrator manually tweaking the
metric.
The delay-based extension to Babel [BABEL-RTT] [DELAY-BASED] uses RTT
measurements to automatically prefer local routes. Such an approach
leads to a negative feedback loop between the control plane and the
data plane, which naturally leads to routing oscillations (a route is
preferred when it has low RTT, which causes traffic to flow through
it, which in turn causes its RTT to increase, which causes a
different route to be preferred). In a link-state protocol, such
oscillations would lead to repeated refloodings and SPF
recomputation. Babel, on the other hand, doesn't care: even in the
presence of oscillations, it is pushing packets in the right
direction following loop-free paths.
(Oscillations cause packet reordering, which is something the
transport layer cares about, so while Babel itself might not care, we
have equipped our implementation with a number of mechanisms to limit
the amount of oscillation.)
Chroboczek Expires October 6, 2015 [Page 8]
Internet-Draft Babel doesn't care April 2015
3.6. Source-specific routing
On advice from the Homement working group, we have extended Babel
with support for source-specific routing [BABEL-SS] [SS-ROUTING].
This is a sizeable extension: it changes the behaviour of the data
plane (packets are forwarded according to both source and
destination), it changes the notion of most-specific route, and it
requires no less than three new TLVs to be carried by the protocol.
In this particular case, Babel does care, to a certain extent. The
new TLVs needed to be very carefully encoded so that even in the
presence of non-source-specific routers, the data plane remains
consistent with the control plane, and we needed to make some efforts
to ensure that the blackhole elimination mechanism wouldn't be broken
by source-specific routing. From a different point of view, however,
Babel still doesn't care: the loop avoidance mechanism applied with
no changes to the new kind of routes.
3.7. Interoperability
All of the extensions to Babel described above interoperate, to the
extent possible. For example, a delay-based router will reliably
exchange routes with a non-delay-based one, but will not perform any
RTT estimation. Similarly, a source-specific router will reliably
exchange non-specific routes with a non-source-specific router.
While achieving this requires careful encoding of the extension data
so that just the right bits are discarded by routers that don't
understand the extension (Section 4 of [BABEL-EXT]) and checking that
the forwarding behaviour remains consistent with the routing data,
Babel itself doesn't care -- it just computes paths and applies the
loop avoidance mechanism, whatever the source of the routing
information.
4. Conclusion
We have solved within the Babel routing protocol a number of
interesting routing problems that are difficult or outright
impossible to solve in other routing protocols, and we have done that
without breaking interoperability between the various variants of the
protocol. We claim that this makes Babel particularly suitable for
the kind of messy networks where such problems arise, and in
particular to hybrid networks, where different parts of the network
require different mechanisms to work well but don't necessarily wish
to run multiple routing protocols.
But that's us. Babel doesn't care.
Chroboczek Expires October 6, 2015 [Page 9]
Internet-Draft Babel doesn't care April 2015
5. References
[BABEL-EXT]
Chroboczek, J., "Extension Mechanism for the Babel Routing
Protocol", draft-chroboczek-babel-extension-mechanism-04
(work in progress), March 2015.
[BABEL-RTT]
Jonglez, B. and J. Chroboczek, "Delay-based Metric
Extension for the Babel Routing Protocol", Internet Draft
draft-jonglez-babel-rtt-extension-00, July 2014.
[BABEL-SS]
Boutier, M. and J. Chroboczek, "Source-Specific Routing in
Babel", draft-boutier-babel-source-specific-00 (work in
progress), November 2014.
[BABEL-Z] Chroboczek, J., "Diversity Routing for the Babel Routing
Protocol", draft-chroboczek-babel-diversity-routing-00
(work in progress), July 2014.
[DELAY-BASED]
Jonglez, B., Boutier, M., and J. Chroboczek, "A delay-
based routing metric", March 2015,
<http://arxiv.org/abs/1403.3488>.
Submitted for publication.
[DSDV] Perkins, C. and P. Bhagwat, "Highly Dynamic Destination-
Sequenced Distance-Vector Routing (DSDV) for Mobile
Computers", ACM SIGCOMM'94 Conference on Communications
Architectures, Protocols and Applications 234-244, 1994.
[DUAL] Garcia Luna Aceves, J., "Loop-Free Routing Using Diffusing
Computations", IEEE/ACM Transactions on Networking 1:1,
February 1993.
[RFC6126] Chroboczek, J., "The Babel Routing Protocol", RFC 6126,
February 2011.
[SS-ROUTING]
Boutier, M. and J. Chroboczek, "Source-sensitive routing",
March 2015, <http://arxiv.org/abs/1403.0445>.
To appear in Proc. IFIP Networking 2015.
Chroboczek Expires October 6, 2015 [Page 10]
Internet-Draft Babel doesn't care April 2015
Author's Address
Juliusz Chroboczek
PPS, University of Paris-Diderot
Case 7014
75205 Paris Cedex 13
France
Email: jch@pps.univ-paris-diderot.fr
Chroboczek Expires October 6, 2015 [Page 11]