RTGWG C. Villamizar
Internet-Draft OCCNC
Intended status: Informational March 14, 2014
Expires: September 15, 2014
Stability of Interior Routing and Load Balancing Techniques
draft-villamizar-rtgwg-route-stability-00
Abstract
This is an informational document intended to outline what is known
about routing stability of interior gateway protocols (IGPs), traffic
engineering techniques (TE) used in MPLS, and load balancing
techniques. Known stable and unstable conditions provide bounds of
known behavior. Network operators and equipment suppliers have
relied heavily on operational experience as existence proof of
instability or evidence of apparent stability. Between the bounds of
known stable and unstable behavior there may be conditionally stable
protocols and deployment scenarios. There is some discussion of the
difficulty of mathematical stability proof for some techniques and a
challenge to the research community.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 15, 2014.
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
   1.1. Terminology
   1.2. Document Scope
2. Background
   2.1. Interior Routing
      2.1.1. Distance Vector (DV) Protocols
      2.1.2. Link State (LS) Protocols
   2.2. MPLS Traffic Engineering (TE)
   2.3. Label Distribution Protocol (LDP)
   2.4. IP Fast Reroute (IPFRR) and LDP Fast Reroute (LDPFRR)
   2.5. Protection and Restoration
   2.6. Multipath Load Balancing
   2.7. Common Deployments
3. Problems to Avoid
   3.1. Transient Loops and Black Holes
   3.2. Long Convergence Times
   3.3. Impact of Multiple Failures
   3.4. Oscillations and Slosh
   3.5. Positive Feedback Loops
   3.6. Scaling Limitations
4. Link State (LS) Protocol Behavior
   4.1. Transient Black Holes and Route Loops
   4.2. Excessive Link Up and Link Down Transitions
   4.3. LSPDU Fragment Boundary Change
   4.4. Convergence Time
   4.5. Scaling Issues
      4.5.1. Flooding and Dense Mesh Topologies
      4.5.2. SPF Computation
      4.5.3. External Routes
      4.5.4. Forwarding Entry Installation
   4.6. LDP Impact
      4.6.1. IPFRR and LDPFRR Impact
   4.7. Delay, Jitter, and Loss Metrics
5. MPLS-TE Behavior
   5.1. Transient Behavior
   5.2. Time Complexity of TE Restoration
   5.3. Scaling Issues
   5.4. Delay, Jitter, and Loss Metrics
6. Multipath Load Balancing
   6.1. Existing Multipath Techniques
   6.2. Dynamic Load Balancing Feedback Loop
7. Summary of Factors in Load Balancing Stability
   7.1. Load Balance Granularity
   7.2. Load Balance Adjustment Certainty
   7.3. Load Balance Measurement Compatibility
   7.4. Load Balancing Decision Locality
      7.4.1. Local Decision Load Balance
      7.4.2. Distributed Load Balance
   7.5. Load Balance Timing and Traffic Adjustment
   7.6. Preventing Feedback Problems
8. Open Area for Research
9. Acknowledgements
10. IANA Considerations
11. Security Considerations
12. Informative References
Author's Address
1. Introduction
The consequences of deploying a marginally stable solution can be a
sudden and catastrophic transition to an unstable state. Events
related to instability vary in severity. Network meltdowns are often
documented only in standards forum or operational forum discussions
and, being otherwise undocumented, become the topic of Internet
folklore.
Route oscillations can have a very detrimental effect on the network
performance perceived by end users. In the mid-1990s, severe route
oscillations could cause routers to fail, resulting in more route
oscillation and eventually a provider meltdown. While router
software today is more robust and meltdown is therefore very
unlikely, performance degradation due to route oscillation as
perceived by user traffic can be quite severe. Slow route
oscillation, also known as route slosh, results in performance
degradation, though not as severe.
This document outlines what the routing community knows and doesn't
know about the topic of route stability in the types of networks
deployed today and in potential deployment based on existing IETF
work. A brief section also mentions areas for research (Section 8).
This document tries to avoid straying from well understood routing
protocol behaviors.
1.1. Terminology
Related terms are grouped together.
Villamizar Expires September 15, 2014 [Page 3]
Internet-Draft Routing and Load Balance Stability March 2014
The terms listed below describe undesirable routing behaviors. The
terms are listed in order of decreasing severity.
routing meltdown
A routing meltdown is a sustained period when a very
substantial number of routers either repeatedly crash and
reboot or repeatedly lose routing adjacencies.
route instability
Route instability is a sustained period when route change has a
cascading effect and causes widespread transient problems
such as route black holes or routing loops.
route loop
A route loop occurs when a cycle exists in forwarding
entries. Some protocols are prone to creation of transient
route loops.
black hole
A black hole exists when forwarding entries do not get
packets to their destination. Transient black holes may
occur during route change in some routing protocols.
route oscillation
Route oscillation is a condition where routes keep changing
in the absence of change in any conditions which should
legitimately cause route change.
route slosh
Route slosh is a proper subset of route oscillation. When
route slosh occurs, routes keep switching among two or more
states, with limited brief stability between state changes
due to protocol timers or other factors. Route slosh
is essentially just slow route oscillation.
The set of terms below describe normal routing behaviors and other
routing terms. Related terms are grouped together within this list.
route change
Route change is, as the name implies, when routes and the
corresponding forwarding entries change. Route change may
occur for many reasons.
shared risk link group (SRLG)
An SRLG is a group of links which, due to a shared physical
resource are likely to fail simultaneously or nearly
simultaneously.
network fault
A network fault occurs when a node or link fails. Node
protection is modeled using SRLGs covering all of the links
adjacent to a given node.
multiple fault
A multiple fault refers to the failure of multiple resources
causing links that are not covered by a common SRLG to fail.
link metric
Interior gateway protocols (IGPs) commonly use link metrics,
which are synonymous with link costs, to determine a best
route. See Section 2.1 and Section 4 for discussion of IGPs.
traffic engineering
Traffic engineering (TE) involves capacity information as
well as the simple link metrics used in link state protocols.
The primary focus here is the type of traffic engineering
commonly practiced in RSVP-TE networks. See Section 2.2 and
Section 5 for discussion of MPLS TE.
route recovery
Route recovery refers to getting traffic flowing after some
event such as a network fault disrupts forwarding. Route
recovery may involve protection and/or restoration.
route protection
Route protection refers to a form of recovery using an
alternate path, generally involving no signaling after the
fault to begin using the alternate path, and involving either
local detection of the fault or a simple proactive
connectivity verification.
zero bandwidth protection
In MPLS TE, the use of protection paths with zero bandwidth
allocation is common. This is discussed in Section 2.5.
route restoration
Route restoration refers to recovery that involves exchange
of routing information and/or some form of signaling to set
up an alternate path.
multipath
A definition of the term "multipath" can be found in
[I-D.ietf-rtgwg-cl-requirement]. Common forms of multipath
include Equal Cost Multipath (ECMP), Ethernet Link
Aggregation, and MPLS Link Bundling. See
[I-D.ietf-rtgwg-cl-use-cases] for discussion of existing
forms of multipath.
local decision
A local decision is one in which only a single network
element is involved in the decision.
distributed decision
A distributed decision is one where the independent decisions
of multiple network elements have an effect on the outcome.
offline route computation
Offline route computation is where a centralized global
routing decision is made, with the results of the decision
passed to all affected network elements. Global optimization is often
cited as its advantage, though in IP networks it is seldom
used due to very slow route restoration.
1.2. Document Scope
This document focuses primarily on routing stability of the interior
protocols used in service provider (SP) networks and on load
balancing techniques used in SP networks.
Service providers almost exclusively use link state (LS) routing
protocols as interior gateway protocols (IGPs) rather than distance
vector (DV) protocols. The two link state routing protocols commonly
used are OSPF [RFC2328] [RFC5340] and IS-IS [ISO.10589.1992]
[RFC5308]. The MPLS LDP protocol [RFC5036] creates MPLS forwarding
entries which either follow IGP routes or behave in a manner similar
to a DV protocol. Techniques such as IPFRR and LDPFRR [RFC5714]
[RFC6981] alter the transient behavior of the IGPs or LDP and are
discussed briefly. MPLS traffic engineering (TE) as practiced using
RSVP-TE [RFC3209] makes use of IGP extensions based on [RFC3630]
[RFC5305] to carry TE information. MPLS-TE has inherently different
behavior than IGP routes alone or MPLS using LDP.
There has been activity in the IETF related to using secondary link
metrics related to delay, jitter, and loss
[I-D.ietf-ospf-te-metric-extensions]
[I-D.ietf-isis-te-metric-extensions]. This includes use of these
secondary metrics for MPLS TE path selection
[I-D.ietf-mpls-te-express-path].
Load balancing using hashing techniques dates back to at least the
mid-1980s, but was not formally documented until [RFC2991] and
[RFC2992], and even then the focus was solely on an already outdated
use of hashing in load balancing. It was not until documents such as
[I-D.ietf-rtgwg-cl-use-cases] and [I-D.ietf-mpls-forwarding] that
these load balancing techniques were well documented. There is also
interest in dynamic load balancing which will also make use of delay,
jitter, and loss information, either locally available or IGP
advertised [I-D.ietf-rtgwg-cl-requirement]. The topic of stability
with regard to this type of work is also discussed.
Much has been written about the stability of the BGP routing
protocol. However, since the focus of this document is on interior
routing stability, discussion of BGP stability is out of scope.
There is some brief discussion of the impact of interior routing
stability on BGP routes in Section 4.4 and Section 4.5.
Load balancers typically used by enterprises and to some extent in
data centers are not within scope. This type of load balancer is not
used in SP networks, and its balancing algorithms are largely
proprietary.
2. Background
The greatest concern regarding stability and oscillations has been in
service provider (SP) networks, simply because any instability which
occurs more readily at large scale has in the past affected SP
networks first. It may be that today's very large data centers now
pose unique scalability and stability problems as well, but the focus
remains on SP networks. Some forms of instability can occur in
enterprise networks and even very small networks; however, common
practice has evolved to make such occurrences achievable only through
severe misconfiguration.
This section provides a brief introduction to a set of protocols in
use in SP networks today. The focus is mainly on link-state routing
protocols and LDP, on MPLS TE, and on multipath load balancing.
These are the common underpinnings of SP networks and a recent topic
of stability concerns due to proposed extensions to today's common
practice.
2.1. Interior Routing
Interior routing protocols are commonly referred to as interior
gateway protocols (IGPs). IGP is somewhat of a historic term but
still widely used. There are two types of IGPs in widespread use in
IP networks. There are numerous alternate techniques described in
research, but with no deployment. The two common types are Distance
Vector (DV) and Link State (LS). In later sections only the behavior
of LS protocols is discussed since DV protocols are seldom if ever
used in large SP networks.
2.1.1. Distance Vector (DV) Protocols
Distance vector (DV) routing protocols are rarely if ever used by
service providers. Distance vector routing protocols exchange
distances to remote destinations.
One of the earliest DV protocols in widespread use was Routing
Information Protocol (RIP). The first version is now known as RIP-1
[RFC1058]. RIP-1 was used in the 1980s and 1990s but replaced by
RIP-2 [RFC2453] due to RIP-1 shortcomings, most severe at the time
being lack of CIDR [RFC4632] support. RIPng [RFC2080] provides an
extension to RIP-2 which supports IPv6.
RIP is generally only used in small networks such as homes, small
offices, and small enterprise. RIP's advantage is simplicity. RIP's
disadvantages are many. Among them is slow convergence as networks
become large.
Another DV routing protocol is Enhanced Interior Gateway Routing
Protocol (EIGRP). EIGRP is a much more advanced DV routing protocol
than RIP, eliminating most if not all of RIP's shortcomings. EIGRP
was a Cisco proprietary protocol roughly until 2013 when Cisco gave
permission for others to use EIGRP, recognizing that no SP would use
EIGRP due to the vendor lock-in.
Yet another DV routing protocol is Babel [RFC6126], which is well
suited for wired and wireless networks and is particularly well
suited for ad-hoc wireless networks. Babel is not expected to see any
deployment on SP networks, but could perhaps displace RIP on small
networks such as home, small office, and enterprise.
2.1.2. Link State (LS) Protocols
Link state (LS) routing protocols build a topology database based on
each router advertising a set of neighbor adjacencies and providing
information about the router's capabilities and about each adjacency.
This topology database is known as a link state database (LSDB).
Among the advantages of LS protocols is the ability to carry
additional topology information useful for other purposes, such as
MPLS traffic engineering (TE) (see Section 2.2).
The two major LS protocols are open shortest path first (OSPF)
[RFC2328][RFC5340], defined by the IETF, and intermediate system to
intermediate system (IS-IS) [ISO.10589.1992], defined by ISO but
extended by IETF for IP use and for MPLS TE use.
Differences between OSPF and IS-IS have sparked heated discussion,
though most experts concede that use of either is more a matter of
preference than strong technical merit. OSPF sends a single router
link state advertisement (LSA) per router with multiple opaque LSAs
[RFC5250] carrying additional information. IS-IS sends a single link
state protocol data unit (LSPDU) per router with TLVs packed into
multiple LSPDU fragments. Multiple LSPDU fragments are needed
because the size of the complete LSPDU often far exceeds the link
MTU.
Arguments regarding the technical merit of one of these two LS
protocols over the other are usually focused on information packing
and flooding efficiency. In both protocols the link state database
(LSDB) can become very large in large networks, particularly those
using MPLS TE. Nearly identical MPLS TE extensions exist in OSPF and
IS-IS, with information exchanged in relatively small type-length-
value (TLV) extensions carried in OSPF opaque LSAs or in the IS-IS
LSPDU fragments. In OSPF, related TLVs are packed into an LSA. In
IS-IS many related and unrelated TLVs are packed into a given LSPDU
fragment.
The behavior of LS protocols is discussed in detail in Section 4.
2.2. MPLS Traffic Engineering (TE)
MPLS traffic engineering (TE) makes use of additional information
carried in the link state (LS) routing protocols OSPF and IS-IS (see
Section 2.1.2). MPLS label switched paths (LSPs) are set up using TE
extensions to RSVP, known collectively as RSVP-TE. The base RSVP-TE
protocol specification is [RFC3209].
The behavior of MPLS TE is discussed in detail in Section 5.
2.3. Label Distribution Protocol (LDP)
The MPLS Label Distribution Protocol (LDP) [RFC5036] can either make
use of the best path as computed by the IGP or act as an independent
DV protocol. IETF has decided not to add TE extensions to LDP
[RFC3468]. MPLS provides a light weight tunneling encapsulation and
LDP provides a relatively light weight protocol to support MPLS
forwarding and therefore to efficiently carry pseudowires (PW) and
virtual private network (VPN) services. LDP creates multipoint to
point (MP2P) LSPs directed to a specific forwarding equivalence
class (FEC) rather than the point to point (P2P) LSPs commonly used
in RSVP-TE.
2.4. IP Fast Reroute (IPFRR) and LDP Fast Reroute (LDPFRR)
IP/LDP Fast Reroute (IP/LDP FRR) [RFC5714] is also applicable in MPLS
networks. ECMP and Loop-Free Alternates (LFA) [RFC5286] are well
established IP/LDP FRR techniques and were the first methods to be
widely deployed. Work on IP/LDP FRR is ongoing within the IETF
RTGWG. Two topics actively discussed in RTGWG are microloops and
partial coverage of the established techniques in some network
topologies. [RFC5715] covers the topic of IP/LDP Fast Reroute
microloops and microloops prevention. RTGWG has developed additional
IP/LDP FRR techniques to handle coverage concerns. RTGWG is
extending LFA through the use of remote LFA
[I-D.ietf-rtgwg-remote-lfa]. Other techniques that require new
forwarding paths to be established are also under consideration,
including the IPFRR "not-via" technique defined in [RFC6981] and
maximally redundant trees (MRT)
[I-D.ietf-rtgwg-mrt-frr-architecture]. ECMP, LFA (but not remote
LFA) and MRT swap the top label to an alternate MPLS label. The
other methods operate in a similar manner to RFC 4090 facility backup
and push an additional label. IP/LDP FRR methods which push more
than one label have been suggested but are in early discussion.
The IPFRR techniques apply to LDP fast reroute (LDPFRR). Some
techniques make use of tunneling to improve coverage. The tunneling
could in principle be provided by GRE but in practice MPLS with LDP
is most often used to avoid a potential failure of a specific next-
hop or specific SRLG. The techniques which use tunneling require
that the router know the path that other routers will be using, so it
can verify that the potential failure will be avoided; they therefore
require an LS IGP rather than a DV IGP or the use of LDP as a DV
protocol with no IGP.
2.5. Protection and Restoration
Network protection and restoration are collectively known as recovery.
Getting traffic to flow again after a fault is network recovery,
regardless of how it is accomplished.
Route protection is often handled largely in hardware or forwarding
card firmware rather than control plane software and can achieve
recovery within a few tens of milliseconds. While protection is
fast, protection against multiple faults is highly impractical.
Restoration may involve flooding updated routing information and may
involve signaling. Therefore restoration is inherently slower than
protection, but it covers all variations of multiple faults except
those multiple faults which cause network partitioning and are
inherently irreparable.
The practice of zero bandwidth protection in MPLS FRR [RFC4090]
relies on traffic class (TC) markings [RFC5462] and diffserv
[RFC3270] [RFC4124] to provide very fast recovery for high priority
traffic with no cost due to resource reservation of the protection
paths. This protection of high priority traffic may occur at the
expense of transient congestion for lower priority traffic. When
combined with restoration, particularly if restoration is relatively
fast, any congestion of lower priority traffic is eliminated or at
least minimized shortly after protection goes into effect.
A very cost effective deployment, and therefore quite common, is to
use MPLS FRR [RFC4090] for very fast (tens of msec) single fault
protection and to optimize router code for fast (well under a second)
restoration. Zero bandwidth reservation on the protection paths is
commonly and perhaps exclusively used today. The fast restoration
covers arbitrary multiple faults and relieves congestion after
initial zero bandwidth FRR protection. This type of deployment is
aimed at providing low cost and high availability for preferred
services and also providing low cost and reasonably short disruptions
for lower priority services when faults occur.
2.6. Multipath Load Balancing
Multipath load balancing in use today includes Equal Cost Multipath
(ECMP), Ethernet Link Aggregation [IEEE-802.1AX], and MPLS Link
Bundling [RFC4201]. In each case the load balancing involves a local
decision and generally uses static load balancing. In all of these
cases the load balancing is based on IP header and MPLS label stack
information as described in [I-D.ietf-mpls-forwarding]. Ethernet
Link Aggregation using Ethernet MAC addresses is entirely useless in
SP networks or any routed network. Therefore Ethernet Link
Aggregation as implemented by routers also uses IP header and MPLS
label stack information rather than Ethernet MAC addresses.
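As an illustration of how such a local, static decision preserves
packet order within a microflow, the following Python sketch hashes
an assumed IP 5-tuple and selects a multipath member by modulo
placement. This is a minimal sketch only: real routers use vendor-
specific hash functions and key selection over IP header or MPLS
label stack fields, and the CRC32 hash, member names, and addresses
here are illustrative assumptions.

   import zlib

   def select_member(five_tuple, members):
       # Hash the microflow key so all packets of a flow map to the
       # same multipath member, preserving packet order per microflow.
       key = "|".join(str(field) for field in five_tuple).encode()
       return members[zlib.crc32(key) % len(members)]

   members = ["member-0", "member-1", "member-2", "member-3"]
   # Assumed key: source IP, destination IP, protocol, ports.
   flow = ("192.0.2.1", "198.51.100.7", 6, 49152, 443)
   print(select_member(flow, members))  # a flow always picks one member

Because the hash is a pure function of the packet header, a flow's
placement changes only when the member list changes.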
Static load balancing, being static, is inherently stable. Dynamic
load balancing is still a local decision, but where measurements are
made and the load balance is adjusted based on those measurements.
Measurements can be based on output link utilization, queue occupancy
or queuing delay, or queuing loss including active queue management
(AQM) loss, or some combination of these. Dynamic load balancing has
the potential to be unstable but has been successfully deployed in SP
networks, providing existence proof that stable implementations are
possible.
The behavior of multipath load balancing is discussed in Section 6.
2.7. Common Deployments
Rarely if ever are service provider (SP) network deployments exactly
alike, but most have a great deal in common. Nearly all major SPs
(possibly all) use either OSPF or IS-IS. Some use no MPLS at all,
though this subset is shrinking. Many use LDP but not RSVP-TE. Some
use both LDP and RSVP-TE. Some use RSVP-TE but not LDP though this
may be less common than a decade ago. The use of MPLS FRR [RFC4090]
is thought to be more common than IPFRR or LDPFRR at least among
larger providers.
Protection is not always provided. Fast restoration is considered
essential even where protection is used, due to the high occurrence
of multiple faults despite efforts to account for shared resources
using SRLGs.
All large SPs use multipath techniques, typically ECMP and Link
Aggregation. Static load balancing has generally been used. Though
dynamic load balancing has been successfully deployed, today static
load balancing may be exclusively used due to the disappearance of
the one core router supplier supporting dynamic load balancing.
Given only this set of tools and given fixed link metrics, it is
difficult to build a network which is unstable. It is possible to
build an unstable network, even with fixed link metrics, if the
scaling limitations described in Section 3 are ignored and exceeded,
but as mentioned in Section 2, with today's common practices this
situation is achievable only through severe misconfiguration.
3. Problems to Avoid
Various existing routing protocols exhibit different transient
behaviors and different tendencies toward specific problems. Any
problem which is transient is generally considered less severe than a
problem that is persistent. This section describes some common
problems that new protocols, implementations, and deployments need to
avoid where possible. Section 4, Section 5, and Section 6 discuss
the tendency of some routing protocols or techniques to exhibit these
various problems.
3.1. Transient Loops and Black Holes
After a fault a transient black hole at the failed resource may be
unavoidable. Some routing protocols have a tendency to form
transient black holes elsewhere when a topology change occurs and can
also form transient loops in some circumstances. Transient loops can
cause severe congestion within the loop, affecting traffic which
should not have been impacted by the initial fault.
3.2. Long Convergence Times
The most severe long convergence time problems are almost always due
to poor implementations or badly designed networks. However, even
among competent routing protocol implementations and competent
network designs, some routing protocols are more prone to long
convergence times than others.
While IGP protocols are not entirely immune to long convergence
times, MPLS TE is particularly prone to long convergence times unless
care is taken to avoid scaling issues. The shortest path first
(SPF) computation, sometimes called the shortest path tree (SPT)
computation, in an LS protocol is of time complexity order NlogL,
where N is the number of
nodes and L is the number of links. For MPLS TE, one CSPF
computation is needed for each LSP and each CSPF computation is order
NlogL. A fault may affect the majority of the LSPs for which a given
LSR is ingress and therefore may involve order N^2logL (N squared log
L) computation for that LSR.
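The following Python sketch illustrates why the cost multiplies: one
Dijkstra SPF serves all destinations for an IGP, but a TE ingress
must repeat a path computation per affected LSP. The graph, costs,
and LSP list are illustrative assumptions, and a real CSPF would
additionally prune links with insufficient available bandwidth before
the shortest path computation.

   import heapq

   def spf(graph, src):
       # Dijkstra shortest path first over {node: {neighbor: cost}}.
       dist = {src: 0}
       heap = [(0, src)]
       while heap:
           d, u = heapq.heappop(heap)
           if d > dist.get(u, float("inf")):
               continue
           for v, cost in graph[u].items():
               if d + cost < dist.get(v, float("inf")):
                   dist[v] = d + cost
                   heapq.heappush(heap, (d + cost, v))
       return dist

   graph = {"A": {"B": 1, "C": 4}, "B": {"A": 1, "C": 1, "D": 3},
            "C": {"A": 4, "B": 1, "D": 1}, "D": {"B": 3, "C": 1}}
   lsps = [("A", "D"), ("A", "C"), ("A", "B")]  # LSPs with ingress A
   for ingress, egress in lsps:
       # One CSPF per LSP: a fault affecting most LSPs at an ingress
       # makes the total work order N^2 log L for that LSR.
       print(ingress, "->", egress, "cost", spf(graph, ingress)[egress])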
3.3. Impact of Multiple Failures
There are many situations in which the most common mode of failure or
a common mode of failure causes simultaneous failure of multiple
nodes or more often multiple links. For example, many links may
traverse a common physical fiber or fiber segment using wave division
multiplexing (WDM). These shared resources can be taken into account
using shared risk link groups (SRLG) when computing protection paths.
A multiple failure is one where network elements (nodes and/or links)
which do not share any common SRLG fail simultaneously.
Multiple failures occur on a regular basis. These occur as a result
of a potential shared resource failure which was either unknown,
overlooked, or considered so improbable as to make it more practical
to ignore. Examples of completely unavoidable multiple failures
are fibers on the same earthquake fault line, fibers crossing the
same bridge (where only bridge collapse would cause simultaneous
failure), fibers on both sides of railroad tracks (where only a
train derailment would cause simultaneous failure), and widespread
flooding. There are many other less dramatic failure scenarios which
are more common.
Protection does not cover multiple faults. Covering even double
faults using protection adds significant network complexity. Where
protection requires resource reservation, such as in transport
networks, the need to reserve a multiple of the resource being
protected makes covering multiple faults prohibitively expensive in
most cases.
Using protection covering only single faults and having no
restoration capability is unacceptable for IP networks and
unacceptable for many of today's services. Protection covering only
single faults with no restoration is common practice for transport
protected services, largely accounting for the steep decline in use of
protected transport services. Using restoration with excessively
long convergence times is also unacceptable for IP and for many of
today's services.
3.4. Oscillations and Slosh
An extremely severe problem is persistent route oscillations. Other
than buggy implementations or exceeding implementation scaling
limits, producing oscillations in today's SP networks would require
misuse of link metrics which vary with traffic load or some other
form of positive feedback (see Section 3.5).
When protocol timers or other protocol features limit oscillation to
a fairly long period, the resulting low frequency route oscillation
is known as route slosh. Route slosh is also a severe problem if it
is persistent. If damped such that the slosh impact fairly quickly
reduces to zero, then some slosh may be acceptable.
3.5. Positive Feedback Loops
Any system with positive feedback is unstable. For example, consider
a simple electronic amplifier. With DC positive feedback, any small
signal would cause a saturated output. Amplifiers employ negative
feedback to improve linearity. However, if the feedback is delayed a
little, then positive feedback can occur at high frequencies, which
can result in a fried amplifier.
The classic case of positive feedback in routing is the delay based
link metrics used briefly in the 1980s [DBSPRA]. Various forms of
damping are discussed in the paper describing this use of delay based
link metrics, some with limited success, though in the end ARPANET
discontinued using delay based link metrics.
Use of delay, jitter, or loss has the potential for creating a
situation where a positive feedback loop exists and therefore where
oscillation occurs. This problem is discussed in Section 4.7,
Section 7.4.2 and Section 7.
A similar feedback may occur in MPLS TE if an "auto-bandwidth"
feature is used, though such an occurrence is unlikely. This feature
varies the requested LSP resource reservation based on observed
traffic measured at the LSP ingress. Increasing the LSP resource
reservation could put an LSP on a longer delay path. The additional
delay may reduce the offered load slightly and allow the LSP to be
moved back to its original path. In practice this problem is avoided
by using only very long term filtered measurements, though the result
may be very low frequency route slosh.
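A minimal sketch of such long term filtering follows, assuming an
exponentially weighted moving average with a deliberately small gain
and a hysteresis threshold before re-signaling; the parameter values
and the demonstration traffic are illustrative assumptions, not
recommendations.

   class AutoBandwidthFilter:
       # Very slow EWMA plus hysteresis: brief traffic shifts cannot
       # move the reservation, damping the feedback loop described
       # above at the cost of very slow adaptation.
       def __init__(self, alpha=0.01, threshold=0.10):
           self.alpha = alpha          # small gain => long term filter
           self.threshold = threshold  # re-signal only on >10% drift
           self.filtered = None
           self.reserved = None

       def sample(self, measured_bps):
           if self.filtered is None:
               self.filtered = self.reserved = measured_bps
               return self.reserved
           self.filtered += self.alpha * (measured_bps - self.filtered)
           drift = abs(self.filtered - self.reserved)
           if drift > self.threshold * self.reserved:
               self.reserved = self.filtered  # adjust the reservation
           return self.reserved

   f = AutoBandwidthFilter()
   for rate in [100e6, 140e6, 100e6, 100e6]:  # a brief burst
       print(f.sample(rate))                  # reservation barely moves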
3.6. Scaling Limitations
Routers have finite resources. Scaling problems can occur in the
following situations.
1. When protocols are poorly designed and inherently consume
excessive resources.
2. When specific types of deployments are used with protocols which
consume excessive resources in those configurations.
3. When implementations provide inadequate resources to support what
today are considered normal deployments.
Resources which may become exhausted include specialized buffering,
storage, and computational resources.
Specific examples of scaling problems are given in Section 4.5 and
Section 5.3.
4. Link State (LS) Protocol Behavior
Link state (LS) protocols are simple and effective. Transient
behavior during route change is somewhat prone to route loops but
with reasonably good implementations convergence is fast and
therefore transient problems are short lived.
The two prominent link state (LS) protocols, OSPF and IS-IS, exhibit
very similar behavior. Only where there is a significant difference
will either of the two protocols be mentioned in this section.
4.1. Transient Black Holes and Route Loops
Transient problems arise from the hop by hop nature of forwarding
with direct application of LS protocols. Each router makes a decision
to change routes independently, and the order in which routers learn
of an LSDB change and make a decision based on changes can vary.
Transient situations can arise where routers in a subset of the
topology have installed routes based on a slightly different LSDB and
either route traffic in a two router loop which is suppressed,
resulting in a black hole, or in a multirouter loop, causing a
forwarding loop.
As long as convergence is fast, the impact of transient problems is
small. A forwarding loop can create very severe but temporary
congestion. The temporary congestion created by a forwarding loop
will often affect traffic to destinations that should not have been
affected by the route change. LS convergence, including flooding,
routing decision, and installation of forwarding entries today can be
well under a second but does not rival the few tens of milliseconds
achieved by protection mechanisms.
4.2. Excessive Link Up and Link Down Transitions
Links are considered in an "up" or "down" state depending on whether
some criterion is met. The simplest criterion, successful delivery
of routing protocol hello packets, is among the most problematic.
Under some conditions, a link may undergo excessive state changes
unless the implementation takes some action to prevent this
situation. For example, hysteresis in the criterion may be used, or
link state may be held in a "down" state as a result of frequent
change. Excessive transitions may occur when link state "down"
transitions are quickly triggered by some number of errors and are
quickly reverted to the "up" state if the error rate falls below the
target rate. This can be solved by one or more of a number of means:
1. Hold the link in the down state for a minimum time period which
is long compared to the measurement interval.
2. Make use of hysteresis, such as a long period of completely error
free operation before changing link state back to "up".
3. Make use of a technique which progressively keeps the link down
for longer periods depending on the rate at which the error
thresholds are crossed. For example, [RFC2439] can be applied
to IGP link state at the originating router as well as to BGP,
as sketched after this list.
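The following Python sketch of the third approach borrows the penalty
and decay model of [RFC2439] and applies it to link state advertised
by the originating router; the penalty, suppress, reuse, and
half-life figures are illustrative assumptions.

   import time

   class LinkDamping:
       # Each down transition adds a penalty; the penalty decays
       # exponentially; the link may be advertised up again only once
       # the penalty falls below the reuse threshold.
       def __init__(self, penalty=1000.0, suppress=2000.0,
                    reuse=750.0, half_life=60.0):
           self.value = 0.0
           self.suppressed = False
           self.last = time.monotonic()
           self.penalty, self.suppress = penalty, suppress
           self.reuse, self.half_life = reuse, half_life

       def _decay(self):
           now = time.monotonic()
           self.value *= 0.5 ** ((now - self.last) / self.half_life)
           self.last = now

       def link_down(self):
           self._decay()
           self.value += self.penalty
           if self.value >= self.suppress:
               self.suppressed = True  # stop advertising the link up

       def may_advertise_up(self):
           self._decay()
           if self.value < self.reuse:
               self.suppressed = False
           return not self.suppressed

A link that flaps twice in quick succession accumulates a penalty
above the suppress threshold and is held down for several half-lives,
while a single isolated transition never reaches it.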
If link state changes are frequent, even just for a single link, then
a network running a LS protocol may spend a significant amount of
time in a transient state where route loops and black holes are
possible.
4.3. LSPDU Fragment Boundary Change
A problem unique to IS-IS is related to LSPDU fragment packing and
shifting boundaries. If an implementation maintains a sorted order
among TLVs, maintains tight packing of TLVs into LSPDU fragments,
and moves TLVs from one LSPDU fragment to another to maintain the
tight packing, then a transient condition can occur where the LSPDU
fragment from which a set of TLVs is withdrawn arrives first and an
SPF is triggered before the change to the LSPDU fragment where the
set of TLVs has been added is learned.
A worst case implementation strategy tries to pack the set of LSPDU
fragments as tightly as possible and, in keeping them tightly packed,
frequently moves TLVs across LSPDU boundaries when changes occur.
The boundary shift due to a link changing may result in erroneous
withdrawal of a set of unrelated TLVs. In addition, a larger set of
LSPDU fragments has to be advertised if boundaries are shifted than
if less tight packing is targeted.
There are numerous strategies to avoid moving fragment boundaries and
to eliminate the erroneous withdrawal of TLVs. One strategy is to set
a target in which LSPDU fragments are packed within a fractional
range of full. For example, a target of 1/2 to 3/4 full might be
used. A second strategy is to separate the action of updating TLVs
from the action of moving boundaries to keep within the target
packing range, and when moving boundaries, to delay sending the
withdrawal of the TLVs relative to their addition in the adjacent
fragment. Other strategies to avoid this problem are possible.
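A Python sketch of the first two strategies follows, assuming the
8 KB fragment size and the 1/2 to 3/4 packing target mentioned above;
the flooding stand-in and the withdrawal delay value are illustrative
assumptions.

   import threading

   MTU = 8192              # assumed LSPDU fragment size
   LOW, HIGH = 0.50, 0.75  # target packing range from the text

   def flood_fragment(frag):
       print("flooding fragment with", len(frag), "TLVs")  # stand-in

   def needs_repack(fragments):
       # Repack a fragment only when its fill leaves the target range,
       # not on every TLV change, so boundaries rarely move.
       return [i for i, frag in enumerate(fragments)
               if not LOW <= sum(len(t) for t in frag) / MTU <= HIGH]

   def move_tlv(fragments, tlv, src, dst, delay=1.0):
       # Advertise the addition first; delay the withdrawal so a
       # receiver never sees the TLV vanish before it reappears.
       fragments[dst].append(tlv)
       flood_fragment(fragments[dst])
       def withdraw():
           fragments[src].remove(tlv)
           flood_fragment(fragments[src])
       threading.Timer(delay, withdraw).start()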
4.4. Convergence Time
Link state (LS) convergence time includes flooding, routing decision
(SPF plus changes to inter-area or inter-level routes plus changes to
external routes), and forwarding information base (FIB) update. The
forwarding entries used by external routes carried within the IGP and
by other routing protocols such as BGP are often affected by the
underlying IGP intra-area or intra-level routes. This second set of
routes may have to be changed, though with multistage forwarding
lookups used in many routers today most secondary changes are
completed automatically.
Operational experience has shown that flooding should be prioritized
over route recomputation. The SPF computation on fast hardware with
a small topology may occur in under 10 msec, and in some cases under
1 msec, however on slower hardware and with larger topologies, an SPF
may exceed 100 msec. If reflooding at each hop is delayed by SPF
time, the overall convergence is slowed. In addition, a single
resource fault may result in multiple topology changes (see
Section 3.3 for background). Prioritizing reflooding and slightly
delaying SPF helps ensure that the next SPF will be based on the full
set of changes.
With a decision to prioritize reflooding and to delay SPF, the
question arises as to how long to delay SPF. For most networks an
approximate worst case fault detection time is known. This detection
time often has a lower bound dictated by speed of light propagation
delay (about 1 msec per 200 km or 125 miles of fiber) and a practical
upper bound that is close to the lower bound. For long haul
terrestrial networks a lower bound of 10-20 msec is common. For
trans-oceanic fiber this can be 50-100 msec. A common practice is to
set a timer when the first IGP change is received, reflood as quickly
as possible, and perform an SPF when the timer elapses. With a timer
that is a small multiple of worst case fault detection time the SPF
will tend to occur with a full set of LSDB changes rather than a
partial set of changes.
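The timer logic above can be sketched as follows in Python, assuming
a 100 msec delay as a small multiple of an assumed worst case
detection time; reflooding happens immediately, and one SPF runs over
the whole batch of changes when the timer fires.

   import threading

   def reflood(change):
       print("reflooding", change)  # stand-in for actual flooding

   class SpfScheduler:
       def __init__(self, spf, delay=0.1):  # assumed 100 msec delay
           self.spf, self.delay = spf, delay
           self.pending, self.timer = [], None
           self.lock = threading.Lock()

       def lsdb_change(self, change):
           reflood(change)                  # flooding is prioritized
           with self.lock:
               self.pending.append(change)
               if self.timer is None:       # arm on first change only
                   self.timer = threading.Timer(self.delay, self._run)
                   self.timer.start()

       def _run(self):
           with self.lock:
               batch, self.pending, self.timer = self.pending, [], None
           self.spf(batch)                  # one SPF over all changes

   sched = SpfScheduler(spf=lambda batch: print("SPF over", batch))
   sched.lsdb_change("link A-B down")
   sched.lsdb_change("link A-C down")  # coalesced into the same SPF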
In a multithreaded control processor architecture, it is possible to
continue reflooding while an SPF is in progress, as long as newly
arriving changes do not affect the SPF in progress. In such a case a
delay in starting an SPF is often useful, but the delay can be made
shorter, particularly with a very fast SPF relative to worst case
fault detection time. With a very fast SPF, the limiting factor may
be forwarding entry install time.
Convergence time is also impacted by scaling issues discussed in
Section 4.5.
4.5. Scaling Issues
Scaling issues result from exceeding the practical capacity of a given
resource. Some of the affected resources include the following:
socket buffer
Severe scaling problems can occur in full mesh or very dense
mesh networks due to limited short term buffering of flooded
data. See Section 4.5.1.
computation
Computation issues seldom occur except with poor
implementations or networks containing a very large single
OSPF area or IS-IS level. See Section 4.5.2.
LSDB size
An extremely large number of external routes carried by a LS
protocol can cause scaling problems. See Section 4.5.3.
FIB install speed
The amount of time required to install an extremely large
number of forwarding entries can cause problems. See
Section 4.5.4.
Any of these problems can be made slightly better by an exceptionally
good implementation or much worse by an unusually bad implementation.
Each of the subsections that follow illustrates a type of scaling
issue capable of overwhelming even the best implementations through
careless network design. Particularly bad implementations can have
their own unique scaling limitations. The set of scaling issues
described here is almost certain to be incomplete.
4.5.1. Flooding and Dense Mesh Topologies
A physical fault may affect a number of neighbor adjacencies
advertised by a number of routers. This is generally a relatively
small number, perhaps tens of adjacencies and a smaller number of
routers. In OSPF each router advertises a new router LSA plus it may
advertise opaque LSAs with additional information about the affected
links. In IS-IS the equivalent of the router LSA is generally packed
into one LSPDU fragement. One or more additional LSPDU fragements
may carry the TLVs with additional information about the affected
links. IS-IS passes each LSPDU fragment as a full MTU packet, which
for SP core networks is often in excess of 8 KB.
Each router originating a set of changes passes this updated
information, perhaps a few 10s of KB, to each remaining neighbor. If
the information is not already known from some other source, then any
given neighbor will then reflood to each of its neighbors.
Most SP networks have an average of about 2.5-4 core neighbors per
core router. For this type of topology, known as a sparse mesh, LS
protocol flooding (absent any bad implementation) is fast, efficient,
and free of scaling problems.
A very dense mesh network or a full mesh network represents a
pathologic case. Attempts to use an overlay model often result in a
full mesh or a very dense mesh.
In a full mesh of N routers, each router originating a set of changes
is adjacent to every other router, or N-1 routers. Each of the N-1
routers will reflood to its N-2 neighbors. The original 10s of KB
originated as a result of a single fault is multiplied by a factor
that is order N^2 (N squared). Any given LSR may receive N feeds of
the same information. For example, if N is 1000 and 40 KB of change
was originated, a given router may receive 40 MB of route updates
when a single link changes state. This can overflow the memory
allocated for socket buffers and cause a lot of this flooded
information to drop and have to be retransmitted, lengthening
convergence. Any part of the feed that is dropped at one router from
one neighbor but successfully received from another neighbor will be
sent the other way toward the router from which the information was
dropped.
If a router fails in a full mesh network of N nodes, the situation is
much worse than for a link failure. That one router is adjacent to
N-1 other routers and each must originate at least one LSA or LSPDU
fragment and likely a few. We'll call the number of LSAs or LSPDU
fragments originated per router K. Each of these N-1 routers
originates K LSAs or LSPDU fragments and sends these to their N-2
neighbors, which will reflood to N-3 of their neighbors. The total
number of LSAs or LSPDU fragments sent is order K*N^3. Each router
could receive a feed of K*(N-1) LSAs or LSPDU fragments from each of
N-2 neighbors in a short time period. For example, for K=5, an MTU
of 8 KB (IS-IS), and N=1000, a router can receive a set of 40 MB
feeds totaling about 40 GB. No router today or in the foreseeable
future has that amount of socket buffer space.
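The arithmetic behind these figures can be reproduced with a short
sketch; the 40 KB change size, K=5, 8 KB MTU, and N=1000 are the same
assumed values used in the text.

   def rx_per_router_link_change(n, change_bytes):
       # A router in a full mesh can receive the same change from each
       # of its N-1 neighbors.
       return (n - 1) * change_bytes

   def rx_per_router_node_failure(n, k, mtu):
       # After a node failure, each of N-2 neighbors can deliver a
       # feed of K*(N-1) full MTU fragments: roughly K * N^2 * MTU.
       return (n - 2) * (n - 1) * k * mtu

   print(rx_per_router_link_change(1000, 40000))     # ~40 MB received
   print(rx_per_router_node_failure(1000, 5, 8192))  # ~40 GB received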
Flooding provides a fast way to distribute route changes and works
extremely well on a typical SP network with a sparse mesh. Flooding
simply does not work well in large full mesh topologies or even large
very dense mesh topologies. For example, some attempts to overlay IP
networks with a LS protocol over an ATM full mesh in the mid 1990s
yielded severely unstable (and therefore short lived) networks. Full
mesh or dense mesh topologies should be avoided.
4.5.2. SPF Computation
The shortest path first (SPF) computation used in LS protocols
requires time order NlogL, where N is the number of nodes and L is
the number of links. In network designs with a node degree that is
relatively independent of N, for example a node degree of 3-5 on
average, computation time can be expressed as KNlogN, where K is a
constant. A rule of thumb which provided a reasonable estimator
around the year 2000 used KNlogN as an estimator with K=1usec. This
was actually much faster than most SPF of that day, but today K might
be smaller. For K=1usec and N=1000, KNlogN would be about 10,000
usec or 10 msec.
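The rule of thumb is easy to restate as a function; the value of K
and the use of a base 2 logarithm are assumptions consistent with the
estimate above.

   import math

   def spf_time_estimate(n, k=1e-6):
       # SPF time ~ K * N * log(N), with K = 1 usec circa 2000.
       return k * n * math.log2(n)

   print(spf_time_estimate(1000))  # ~0.01 sec, i.e. about 10 msec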
From the prior example, it is clear that a network with close to
1,000 nodes in a single OSPF area or IS-IS level was supportable at
that time (circa 2000). Some five to ten years earlier and with
slower processors and some poorly written and therefore slower SPF
code, a rule of thumb of 400 nodes was commonly cited as the upper
limit for the size of an OSPF area. The figure 400 was said to be
empirically determined.
Though processors are getting faster and SPF code gradually improves
as well, there will always remain a practical limit to the number of
nodes that can be supported in a given OSPF area or IS-IS level. At
some point convergence times become excessive. In extreme cases
protocol timers of other protocols such as BGP (configured to use
faster timers than the BGP defaults) can expire due to excessive
convergence times and cause further network disruption.
4.5.3. External Routes
Due to OSPF and IS-IS periodic route refresh and due to the way in
which both protocols implement flooding, a very large number of
external routes should not be carried by the IGP. Implementations
have been known to tolerate up to tens of thousands of external
routes, but generally will become unstable with a BGP global routing
feed or worse yet, multiple BGP feeds into the IGP.
4.5.4. Forwarding Entry Installation
Operational experience supports implementations installing forwarding
entries for OSPF intra-area or IS-IS intra-level routes first, then
installing forwarding entries for any inter-area routes or routes
from another level, and then external routes. External routes may be
carried in the IGP or carried by another protocol, most commonly BGP.
BGP may carry a very large number of routes.
By installing intra-area or intra-level forwarding entries first,
accurate forwarding within an area or level is better ensured. By
giving the next priority to inter-area or inter-level routes, accurate
forwarding within the infrastructure of the autonomous system (AS) is
better ensured. This ensures that protocols that depend on this
infrastructure will continue to operate in times of high route change.
Many forwarding engines allow either two stage, multistage, or fully
recursive forwarding lookup. Supporting multiple stages allows the
forwarding for external routes which depend on internal routes to
automatically change if only the internal route has changed. This
can dramatically decrease the number of forwarding entries that need
to be changed and reduce effective convergence time.
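A minimal sketch of such a two stage lookup follows, with assumed
prefixes and interface names: external routes resolve through an
internal entry, so one write to the internal entry implicitly
redirects every dependent external route.

   class Fib:
       def __init__(self):
           self.internal = {}  # IGP prefix -> outgoing interface
           self.external = {}  # external prefix -> IGP next hop prefix

       def lookup(self, prefix):
           via = self.external.get(prefix)
           if via is not None:
               # Second stage resolves through the internal route.
               return self.internal[via]
           return self.internal[prefix]

   fib = Fib()
   fib.internal["10.0.0.1/32"] = "eth0"
   fib.external["203.0.113.0/24"] = "10.0.0.1/32"
   print(fib.lookup("203.0.113.0/24"))   # eth0
   fib.internal["10.0.0.1/32"] = "eth1"  # IGP reroute: one FIB write
   print(fib.lookup("203.0.113.0/24"))   # eth1, no per-route update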
In a particularly bad implementation, if forwarding entry
installation is slow, if only single stage forwarding entries are
supported, and if IGP routes are not prioritized, forwarding entries
can remain
incorrect long enough for IBGP peerings to go down (particularly if
BGP timers are reduced from the BGP default values). In such cases
it is possible for the route change resulting from IBGP peerings to
cause further churn.
Not since the late 1990s have there been implementations with severe
forwarding entry install time problems deployed in SP networks. The
problem is only worth mentioning in hopes that new vendors don't
repeat these mistakes.
4.6. LDP Impact
The behavior of MPLS using LDP is very similar to the behavior of LS
protocols. Some DV protocols support loop suppression through
maintaining a sequence of hops and this too is supported by LDP. The
LDP loop suppression avoids creating transient loops in LDP when
transient loops occur in the LS routing protocol. LDP therefore has
better transient behavior than a LS routing protocol alone. Services
which are carried only by LDP, such as PW and VPN services, cannot
loop. When IP is carried by LDP and where there is a fallback to IP
forwarding, transient loops will remain a problem for that IP
traffic.
4.6.1. IPFRR and LDPFRR Impact
IPFRR and LDPFRR are described in Section 2.4.
IPFRR provides protection against single faults. IPFRR can include
SRLG consideration. Timer mechanisms help avoid microloops by
delaying SPF based on distance from a fault, in that way sequencing
the transition from protection to restoration [RFC6976]. If multiple
faults occur IPFRR has the potential to be slightly
counterproductive. IPFRR may have a greater potential to create
transient loops in the event of a multiple fault. For example, LDP
without LDPFRR can effectively suppress loops, but with tunnels and
label stacking can create transient loops. In addition any loops
that form when a multiple fault occurs may be made to persist longer
due to the timers introduced to suppress microloops.
The protection against single faults offered by IPFRR and LDPFRR
greatly outweighs any transient misbehavior that may occur when
multiple faults occur.
4.7. Delay, Jitter, and Loss Metrics
Past use of delay link metrics in LS protocols has resulted in
oscillations [DBSPRA] (see Section 3.5 for background). The impact
of path change due to link metric change has a major effect on
traffic levels and therefore an effect on the link metric. The
interdependency of path, traffic levels, and link metric creates the
potential for strong positive feedback and severe oscillations.
Oscillations due to the use of link metrics based on delay, jitter,
or loss in LS protocols may be unavoidable. This is discussed
further in Section 7.
5. MPLS-TE Behavior
MPLS traffic engineering (TE) forwarding is not decided hop by hop.
MPLS TE Label Switched Path (LSP) path selection is determined
entirely by the ingress LSR and nearly always includes only strict
hops in the explicit route object (ERO) which dictates the LSP path in
RSVP-TE signaling [RFC3209]. The ingress LSR makes its path decision
based on estimated bandwidth requirements of the LSP and link
bandwidth availability advertised in the IGP. MPLS TE traffic cannot
flow on an LSP until the LSP is set up using RSVP-TE signaling. At
setup time, the estimated bandwidth requirements serve as a
bandwidth reservation. Due to the bandwidth accounting and the
ability of the ingress LSR to direct traffic, the behavior of MPLS TE
is significantly different from that of LS protocols.
MPLS TE makes use of resource accounting, avoiding attempts to
utilize more capacity than any part of the topology can support. In
this way networks using RSVP-TE differ from those using LDP. In
addition, some MPLS TE implementations allow multiple LSP between a
given ingress and egress, each potentially taking different paths,
with load split across these LSP in proportion to their reserved
bandwidth. This is a form of static load balancing (see Section 6
and Section 7).
5.1. Transient Behavior
MPLS TE LSP are created using an explicit path from ingress LSR to
egress LSR. There is no opportunity for differences in the LSDB to
cause forwarding loops. MPLS TE therefore has better transient
response than LS protocols in that it is always loop free.
MPLS FRR [RFC4090] provides protection against single faults. Since
it is initiated at the point where the fault is detected, known as
the point of local repair (PLR), MPLS FRR is very fast, typically a
few tens of milliseconds. This provides an excellent response to
single faults, with recovery time often only slightly longer than
the fault detection time.
When multiple faults occur, MPLS FRR may provide only partial
protection, unable to protect LSP where the fault affects both the
primary path and protection path. In this case recovery relies
entirely on restoration. In either case of single fault or multiple
fault, restoration is used to return to a close to optimal traffic
layout and restore protection to LSP for which protection is needed.
Restoration times can be long if care is not taken in network design
or when using implementations which have not been optimized for very
fast restoration.
5.2. Time Complexity of TE Restoration
MPLS TE relies heavily on fast restoration, though fast restoration
is not easily achieved. Only the best implementations achieve
restoration times in the low number of seconds for regions of MPLS TE
deployment employing a full mesh of LSP with a hundred to a few
hundred nodes.
The reason that achieving fast MPLS TE restoration is difficult is
the computational complexity of TE restoration. This is
discussed in Section 3.2.
Providers can subdivide their topology into multiple interconnected
regions with a full mesh of LSP within each region, greatly improving
convergence time, even if the topology is served by a single OSPF
area or IS-IS level. For example, an OSPF area of 1,000 nodes can be
roughly subdivided into 10 regions of about 100 nodes, plus a core
which interconnects the 10 regions using two or more LSR from each
region and where the core may have well under 100 nodes. Being ten
times smaller, the convergence in each region can be expected to be
more than a hundred times faster. If edge-to-edge LSP are needed, as
they often are, either LDP can be run over the RSVP-TE LSP or RSVP-TE
hierarchy [RFC4206] can be used.
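The expected speedup follows from the restoration complexity
discussed in Section 3.2; the sketch below uses an assumed
per-ingress cost of order N^2 log N (one CSPF of order N log N for
each of order N affected LSPs in a full mesh).

   import math

   def restoration_work(n):
       # Assumed per-ingress worst case: order N LSPs, each needing a
       # CSPF of order N log N.
       return n * n * math.log2(n)

   print(restoration_work(1000) / restoration_work(100))
   # ~150: a region a tenth the size converges well over a hundred
   # times faster, as stated above.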
Further compounding MPLS TE convergence time problems is the
possibility of crankback. Generally proposals in IETF which give an
ingress LSR too little information to compute a path and rely too
heavily on crankback are rejected for that reason. However in a
situation where a fault has occurred and multiple ingress LSR may try
to make use of a next best resource, crankback is likely to occur if
the resource is too limited to accommodate all of the LSP setups.
The impact of crankback can be reduced by implementations which
sequence LSP setups, setting up the larger LSP first and delaying
slightly those LSP setups for which a viable path still exists. The
amount of crankback that occurs can be greatly reduced by these small
delays between non-urgent LSP setup requests, prompt origination of
LSDB changes, particularly for links which are approaching capacity
depletion, and quick reflooding. This short list only touches on the
types of optimizations that have been implemented. A full treatment
of the topic of MPLS-TE optimizations is beyond the scope of this
document.
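As one example of such an optimization, the following sketch orders
LSP setups largest first and applies small jittered delays to
non-urgent setups; the LSP records and delay values are hypothetical,
and a real implementation would interleave this with signaling and
LSDB flooding.

   import random

   def sequence_setups(lsps, base_delay_s=0.05):
       """Order LSP setup requests largest first and assign a small
       jittered delay to non-urgent setups, reducing the chance that
       many ingress LSR race for the same nearly depleted
       resource."""
       ordered = sorted(lsps, key=lambda l: l["bandwidth_mbps"],
                        reverse=True)
       schedule = []
       for i, lsp in enumerate(ordered):
           delay = 0.0 if lsp["urgent"] else (
               i * base_delay_s * random.uniform(0.5, 1.5))
           schedule.append((delay, lsp["name"]))
       return schedule

   lsps = [{"name": "A", "bandwidth_mbps": 100, "urgent": False},
           {"name": "B", "bandwidth_mbps": 400, "urgent": True},
           {"name": "C", "bandwidth_mbps": 50, "urgent": False}]
   for delay, name in sequence_setups(lsps):
       print(f"setup {name} after {delay:.3f}s")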
5.3. Scaling Issues
The most common scaling problem in RSVP-TE is due to the CSPF
computational complexity described in Section 3.2 and Section 5.2.
A less common problem is overload of resources related to flooding,
similar to the scaling problems described in Section 4.5.1 but
potentially made more severe by overly aggressive origination of LSDB
changes, in the worst case sending an LSDB change for each LSP setup.
The scaling limitations of RSVP-TE are dramatically impacted by
network topology. Poor network design can therefore create scaling
problems, the result being highly excessive convergence times. MPLS
regions supporting a full mesh of LSP should generally not exceed
100-200 nodes, and keeping well under 100 where possible is
advisable. Building an RSVP-TE full mesh over a
lower layer full mesh overlay (for example, a layer-2 overlay) will
cause scaling problems related to flooding. If the overlay is large,
CSPF related flooding problems will also occur.
5.4. Delay, Jitter, and Loss Metrics
The problems with link metrics based on delay, jitter, and loss are
not as severe with MPLS TE as they are with link state (LS) protocols
(see Section 3.5 and Section 4.7) if individual LSP are slow to move
relative to readvertisement. One or perhaps a few of many LSP may
change a path to avoid a congested link, unlike LS protocols where a
change in metric causes all traffic for which another path has become
a lower cost to move.
Where there is only a single LSP from any ingress to egress the
problem of coarse granularity of traffic adjustment remains. This
is discussed further in Section 7.
6. Multipath Load Balancing
Multipath load balancing is briefly introduced in Section 2.6. The
minimal requirement for multipath load balancing is that packet order
is maintained within any given microflow, as called for in the
diffserv architecture [RFC2475]. MPLS-TP adds further
requirements that make it necessary to treat an MPLS-TP LSP as a
microflow. Carrying MPLS-TP LSPs within an MPLS LSP is discussed in
[I-D.ietf-mpls-multipath-use].
6.1. Existing Multipath Techniques
Common practice today involves static multipath load balancing based
on the assumption that the hash that is performed on incoming traffic
will evenly distribute the traffic over the hash space. For the case
where a very large number of small microflows exists, this is a valid
assumption and under 5% imbalance is common. In static load
balancing a hash is performed over the input traffic and the hash
space is divided proportionally to the capacity of component links.
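The following sketch illustrates static hash based load balancing;
CRC32 stands in for whatever hash the forwarding hardware uses, and
the link names, capacities, and table size are hypothetical examples.

   import zlib

   # Component links and their capacities (hypothetical figures).
   links = ["ge-0/0/1", "ge-0/0/2", "xe-1/0/0"]
   capacities = [1, 1, 10]  # Gb/s

   # Divide the hash space in proportion to capacity: build a table
   # mapping each hash bucket to a component link.
   NUM_BUCKETS = 64
   total = sum(capacities)
   table = []
   for link, cap in zip(links, capacities):
       table += [link] * round(NUM_BUCKETS * cap / total)

   def select_link(src, dst, proto, sport, dport):
       """Hash the microflow 5-tuple and index the bucket table, so
       all packets of a microflow stay on one link, preserving
       packet order within the microflow."""
       key = f"{src}|{dst}|{proto}|{sport}|{dport}".encode()
       return table[zlib.crc32(key) % len(table)]

   print(select_link("192.0.2.1", "198.51.100.2", 6, 4242, 443))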
[I-D.ietf-rtgwg-cl-requirement] uses the term flow rather than
microflow to acknowledge that the hash used in load balancing quite
often is not based on the defined characteristics of an IP microflow
in [RFC2475]. Often only label stack information is used, which only
works well if entropy labels [RFC6790] or PW flow labels [RFC6391]
are used. Using a hash does not generally identify single flows but
rather identifies sets of flows which all hash to the same value.
Static load balancing has the advantage of being simple and
inherently stable. When a small number of very large flows exist,
usually mixed with a very large number of very small flows, static
load balancing will perform poorly. These large flows may be due to
large PW flows, high capacity IPsec gateways, or other applications.
Dynamic load balancing can accommodate these large flows but has the
potential to be unstable. Dynamic load balance stability is
discussed further in Section 6.2 and Section 7.
6.2. Dynamic Load Balancing Feedback Loop
In dynamic load balancing, the same problem exists as for failed
prior techniques [DBSPRA], but with significant differences. In both
cases a reaction to excessive load moves traffic to another part of
the network, and in both cases moving traffic away from a congested
part of the network changes the parameters being measured. There
are differences between dynamic load balancing used in multipath and
the load shifts in delay based link metrics in a LS protocol. These
differences make it possible to ensure stability in dynamic load
balancing where stability may not be possible for delay based link
metrics in a LS protocol.
Briefly, some differences between delay based link metrics in a LS
protocol and dynamic load balancing are the following.
1. In a LS routing protocol a change to a link metric moves a lot of
traffic. In dynamic load balancing based on hashing, the amount
of traffic that can be moved at a time can be as little as the
traffic in one hash bucket.
2. In dynamic load balancing it is possible to determine the amount
of traffic that will be moved before making an adjustment if
counters are available in the load balancing hardware.
3. When dynamic load balancing is used solely in local decisions,
the measurement and adjustment is made within the same router.
No protocol exchange outside the router is required.
Dynamic load balancing used in local decisions has been successfully
deployed, and stability is more easily proven for local decisions
than for dynamic load balancing used in a distributed decision. See
Section 7.4.
The topic of preventing feedback problems in dynamic load balancing
is discussed further in Section 7.
7. Summary of Factors in Load Balancing Stability
Delay, jitter, loss, and link utilization based link metrics used in
LS protocols (Section 4) or in MPLS TE (Section 5), and multipath
load balancing (Section 6), are all forms of load balancing. All of
these forms of load balancing can be categorized in terms of the
following characteristics.
static or dynamic
Static load balancing is inherently stable. Dynamic load
balancing is not inherently stable but neither is it
inherently unstable.
balance granularity
Dynamic load balancing granularity may be uncontrolled, as
in LS routing protocols, or controlled. A controlled load
balancing granularity may be very coarse such as MPLS TE
where entire LSP are moved, or fine such as hash based
multipath techniques.
adjustment certainty
The amount of traffic moved in an adjustment to dynamic load
balancing may be unknown or known. For example, a link
metric change in a LS protocol forces a load adjustment but
the magnitude of the adjustment is unknown until after the
adjustment is made. When moving an LSP, the amount of
traffic moved is known. When an adjustment is made in a hash
based technique, the amount of traffic moved is unknown if
there is no measurement available to determine the amount of
traffic falling within a group of hash values, but can be
known if counters have been provided in the forwarding
hardware.
measurement compatibility
The units of measurement and the units of load balance
traffic adjustment may not be entirely compatible. For
example, if delay, jitter, and loss are measured at the point
of congestion and the measurements at the point of traffic
adjustment are offered traffic load, it is hard to predict
how movement of traffic measured in Mb/s or Gb/s will impact
delay, jitter, or loss, particularly given the behavior of
TCP congestion avoidance.
decision locality
Measurements may be made locally and traffic adjustments may
be applied locally in which case no protocol extensions are
needed. Prior implementations of dynamic multipath operated
in this way and were deployed successfully. Measurements may
also be made and advertised by a LS protocol such that
decisions made by numerous other nodes affect the traffic at
the links being measured. Stability is more difficult to
ensure with this type of distributed load balance decision.
timing and traffic adjustment
Any dynamic load balancing technique has timing factors that
can impact stability. For example, the frequency of
measurements and any filtering or hysteresis applied to
measurement can be dictated by specifications or made
advisory and modified in specific deployments. The frequency
of adjustments to load balance and changes to the adjustment
amounts can also be dictated by specifications or made
advisory and modified in specific deployments.
Each of the above characteristics affects whether a specific load
balancing technique is stable. Since static load balancing is
inherently stable, the remainder of this section focuses on dynamic
load balancing. The following subsections focus on the
characteristics listed above. Section 7.6 discusses the interaction
of these characteristics and discusses factors in protocol design
which affect the tendency toward feedback problems and potential
instability.
7.1. Load Balance Granularity
It is not possible to ensure stability in a technique which uses
uncontrolled load balancing, such as LS routing protocols. It may be
possible to introduce timing changes or other changes which tend
toward stability or reduce the impact somewhat (such as long timers).
This seemed to be a goal of [DBSPRA]. With no way to ensure
stability and very limited success in mitigating instability, prior
efforts to load balance in LS routing protocols were abandoned and
future attempts strongly discouraged.
Load balancing can generally achieve better results with a fine
granularity than with a coarse or very coarse granularity. However,
techniques which use a coarse granularity can be made stable. MPLS
TE is inherently a load balancing technique which is somewhat static
by virtue of the normally configured bandwidth reservations and with
the coarse granularity of any given LSP bandwidth reservation.
Stability is ensured if LSP remain on their current path even if
better paths become available, subject to relatively long timers
preferably with timer jitter (timer jitter is a random component
included in the timer value).
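A minimal sketch of such a jittered timer follows; the base interval
and jitter fraction are hypothetical tuning values, not
recommendations.

   import random

   def reoptimize_interval(base_s=3600.0, jitter_fraction=0.25):
       """Next re-optimization timer value: a relatively long base
       interval plus a random component (timer jitter), so that LSP
       across the network do not all re-evaluate paths at once."""
       return base_s * (1.0 + random.uniform(-jitter_fraction,
                                             jitter_fraction))

   print(f"next re-optimization in {reoptimize_interval():.0f}s")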
Hash based load balancing is a fine granularity load balancing with
the granularity determined by the number of usable bits in the hash.
For example, entropy labels [RFC6790] or PW flow labels [RFC6391] are
limited to 20 bits of usable hash. This allows dividing traffic into
potentially a million groups of flows. Most implementations today
make use of tables which are much smaller than 20 bits and therefore
provide a somewhat more coarse granularity.
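The relationship between usable hash bits and balance granularity can
be made concrete with a small computation; the table sizes below are
hypothetical examples.

   # Granularity of hash based load balancing: with 2**bits buckets,
   # the smallest unit of traffic that can be moved is one bucket,
   # on average 1/2**bits of the total traffic.
   for bits in (8, 12, 20):
       buckets = 2 ** bits
       print(f"{bits:2d} hash bits -> {buckets:>7} buckets, "
             f"~{100.0 / buckets:.4f}% of traffic per bucket")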
7.2. Load Balance Adjustment Certainty
The amount of traffic that will be moved when an adjustment to
dynamic load balancing is made may be either unknown or known prior
to making the adjustment.
Dynamic load balancing with an unknown traffic adjustment amount has
been deployed successfully. This success relies on there being a
reasonably good correlation between the fraction of the hash space
moved and the fraction of total traffic which falls into that hash
space.
When the traffic adjustment amount is unknown, very large flows
occasionally cause a large change in traffic when only a small change
was intended. In such cases a single hash value may have a
disproportionately large amount of traffic. A pathological behavior
would be to continuously move a small range of hash values back and
forth as a result. This problem is avoidable.
A far better situation exists when the traffic adjustment amount is
known. For MPLS TE LSP, the reservable bandwidth is known and most
LSR can measure traffic on a per LSP basis. For hash based
multipath, few existing implementations have measurements over the
hash space for a particular multipath. Making such measurements is
not difficult but requires that the designer of the forwarding
hardware had the foresight to provide counters in the appropriate
places.
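Given such counters, the amount of traffic an adjustment would move
is known before the move is made, as in the following sketch; the
per-bucket rate dictionary stands in for counters in forwarding
hardware, and all figures are hypothetical.

   def plan_move(bucket_rates_mbps, bucket_to_link, from_link,
                 to_link, target_mbps):
       """Select hash buckets on from_link whose measured rates sum
       to no more than target_mbps; return the buckets and the
       (known) amount of traffic that would actually move."""
       moved, chosen = 0.0, []
       # Consider the smallest buckets first to limit overshoot.
       candidates = sorted(
           (b for b, l in bucket_to_link.items() if l == from_link),
           key=lambda b: bucket_rates_mbps[b])
       for b in candidates:
           if moved + bucket_rates_mbps[b] > target_mbps:
               break
           chosen.append(b)
           moved += bucket_rates_mbps[b]
       return chosen, moved

   rates = {0: 40.0, 1: 5.0, 2: 90.0, 3: 12.0}
   mapping = {0: "L1", 1: "L1", 2: "L1", 3: "L2"}
   # Moves buckets 1 and 0, a known 45.0 Mb/s, toward a 50 Mb/s goal.
   print(plan_move(rates, mapping, "L1", "L2", target_mbps=50.0))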
7.3. Load Balance Measurement Compatibility
Measurements of delay, jitter, and loss at a point of congestion
cannot be readily translated into measurements of utilization at a
point where an adjustment to traffic load balance is made. For
example, one or more large TCP microflows may be in the TCP slow
start phase, where traffic flow increases exponentially until a loss
or ECN notification is detected. Prior to congestion, rising
utilization can be measured directly. After the onset of congestion,
utilization hits 100% and can no longer be used as a measure of
loading at the point of congestion. The amount of rise in delay and
jitter or the rate of rise in delay cannot easily be used to
determine the amount of traffic to move. If loss is occurring, the
amount of loss cannot easily be used to determine the amount of
traffic to move, though increasing loss may be correlated to the
degree of overutilization.
The measurement of delay and jitter is a measure of time, generally
measured in milliseconds (msec). Loss is measured as a fraction of
traffic. There is no reliable means to determine how much traffic
should be moved from one link with a given queuing delay to another
link with a different amount of queuing delay. The same applies to
two links with different amounts of loss or one link with queuing
delay but no loss and another link with loss.
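One way to see the difficulty is with a simple M/M/1 queueing model,
used here purely as an illustration; real traffic rarely satisfies
its assumptions, which is precisely the problem. Even under this
model, inverting a delay measurement into an amount of traffic to
move requires knowing where on a steeply nonlinear curve the link
sits.

   def mm1_delay_ms(utilization, service_time_ms=0.1):
       """Mean M/M/1 queueing delay; an illustrative model only."""
       assert 0 <= utilization < 1
       return service_time_ms / (1.0 - utilization)

   # The same rise in measured delay corresponds to very different
   # amounts of traffic depending on the operating point.
   for u in (0.5, 0.9, 0.95, 0.99):
       print(f"utilization {u:.2f} -> mean delay "
             f"{mm1_delay_ms(u):.2f} ms")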
Jitter is the most problematic of these measures. Jitter is a
function both of the link utilization and the burstiness of traffic
sources. A rise in jitter can occur at very low link utilizations.
Jitter has never been used as the basis of load balancing and
questions remain about the feasibility of doing so.
7.4. Load Balancing Decision Locality
Delay, jitter, and loss based link metrics used in LS protocols
(Section 4) or in MPLS TE (Section 5), and multipath load balancing
(Section 6), are all forms of load balancing. Load balancing can be
applied in two ways.
1. Some types of load balancing can be applied locally, using only
measurements made at local interfaces to impact the proportion of
traffic going out those local interfaces.
2. Load balancing can be applied in a distributed fashion where
measurements are advertised, usually using additional information
carried by a LS protocol, and action is taken by many routers
which make use of the resources for which measurements have been
made.
Some techniques are inherently local or inherently distributed.
Delay, jitter, and loss based link metrics used in LS protocols
(Section 4) or in MPLS TE (Section 5) are inherently distributed.
Ethernet Link Aggregation [IEEE-802.1AX] is a multipath technique
which is inherently local. Other forms of multipath, such as MPLS
Link Bundling [RFC4201] or any multipath technique which can make use
of MPLS LSP as component links, are neither inherently local nor
inherently distributed.
7.4.1. Local Decision Load Balance
It is easier to ensure stability when load balancing is entirely a
local decision. There is no protocol exchange required to do this
sort of load balancing and there are no interoperability issues with
neighbor routers. Therefore there has been no need for IETF to
dictate how this is done.
Common practice has evolved which is well known within the industry,
widely deployed, and very similar across implementations. It has
remained poorly documented in the IETF, being a local decision.
Load balancing in Ethernet Link Aggregation [IEEE-802.1AX] is
inherently a local decision. As commonly practiced today, ECMP and
Link Bundling also make use of local decisions.
7.4.2. Distributed Load Balance
Few attempts to make use of distributed load balancing have reached
the IETF. A prerequisite for distributed load balancing is a means
to advertise the measurement results. This prerequisite is met by
[I-D.ietf-ospf-te-metric-extensions] and
[I-D.ietf-isis-te-metric-extensions] but no means of applying these
metrics has been defined.
A very early attempt to make use of distributed load balancing was
OSPF-OMP, ISIS-OMP, and MPLS-OMP [I-D.ietf-ospf-omp]
[I-D.ietf-isis-omp] [I-D.villamizar-mpls-omp]. These drafts were
abandoned due to provider lack of interest at the time. This was
partially due to the difficulty in providing mathematical proof of
stability.
There has been a renewed though cautious interest in distributed load
balancing as evidenced by [I-D.ietf-rtgwg-cl-requirement] and related
work. Any use of [I-D.ietf-ospf-te-metric-extensions] and
[I-D.ietf-isis-te-metric-extensions] is a distributed load balancing
technique. With these extensions, measurements based on delay may
optionally include queuing delay. Measurements of jitter are
inherently queuing based. Measurements of loss are also queuing
based except for lossy links such as wireless links. The authors of
these documents have intentionally put stability concerns out of
scope due to the difficulty in ensuring stability and even greater
difficulty of mathematically proving stability.
7.5. Load Balance Timing and Traffic Adjustment
In any system with feedback, careful attention to timing is needed
to ensure that the system remains stable. For load balancing this
may be easier to accomplish where measurements and traffic
adjustments are entirely local.
The following timing considerations are related to making
measurements and propagating measurement results (a sketch combining
several of these follows the list):
1. Utilization, delay, jitter, and loss measurement interval for a
direct measurement.
2. Type of filtering of direct measurements, if any, applied to
produce a reportable measurement.
3. Amount of positive or negative change to reportable measurement
needed before updating the reportable measurement. This amount
should be a function of the time since the last reported
measurement and a function of the importance of making an
adjustment at that measurement level.
4. The time necessary to propagate changes must be considered when
trying to determine if a technique is stable. This may be a
small but non-zero time.
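A sketch combining items 1 through 3 above follows; the filter
constant and thresholds are hypothetical tuning values, not
recommendations.

   class MeasurementReporter:
       """Low-pass filter raw utilization samples (item 2) and
       report only when the filtered value has moved enough (item
       3), with the required movement shrinking as the last report
       ages. All constants are hypothetical tuning values."""

       def __init__(self, alpha=0.2, base_threshold=0.10):
           self.alpha = alpha
           self.base_threshold = base_threshold
           self.filtered = None
           self.last_reported = None
           self.intervals = 0

       def sample(self, raw_utilization):
           # Exponentially weighted filtering of each raw sample.
           if self.filtered is None:
               self.filtered = raw_utilization
           else:
               self.filtered += self.alpha * (raw_utilization -
                                              self.filtered)
           self.intervals += 1
           # Report on sufficient change; the threshold relaxes as
           # time since the last report grows.
           threshold = self.base_threshold / self.intervals
           if (self.last_reported is None or
                   abs(self.filtered - self.last_reported)
                   >= threshold):
               self.last_reported = self.filtered
               self.intervals = 0
               return self.filtered  # would then be flooded
           return None

   r = MeasurementReporter()
   for u in (0.50, 0.55, 0.80, 0.82, 0.83):
       print(r.sample(u))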
Upon receiving an updated measurement result, any traffic adjustment
should be deferred for a target time period with jitter applied to
the timer. The target time period should be a function of the
following:
1. the percent contribution to the traffic at the link or path from
which traffic would be moved,
2. the difference between the conditions at the link or path from
which traffic would be moved and the link or path to which the
traffic would be moved,
3. any other factor that the protocol specification deems relevant
or that the implementation deems relevant and that is allowable
by the protocol.
Upon receiving an updated measurement result, any traffic adjustment
amount should be a function of the following:
1. the amount of traffic that needs to be moved from all traffic
sources to achieve balance,
2. the amount of traffic that can be moved from all traffic sources
to achieve balance,
3. the percent contribution to the traffic at the link or path from
which traffic would be moved,
4. the difference between the conditions at the link or path from
which traffic would be moved and the link or path to which the
traffic would be moved,
5. any other factor that the protocol specification deems relevant
or that the implementation deems relevant and that is allowable
by the protocol,
6. any prior traffic adjustment between the set of links or paths
and the time that has elapsed since that adjustment.
It is important to note that for a given update there may be multiple
more lightly loaded links or paths to which traffic can be moved and
multiple more heavily loaded links or paths from which traffic can be
moved. The amount of traffic for which there is
already a scheduled adjustment from or to a given link or path must
be taken into consideration when computing the amount of a new
traffic adjustment.
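The two factor lists above might be combined as in the following
sketch; every weight and constant is a hypothetical placeholder for
values a specification or deployment would choose.

   import random

   def defer_time_s(contribution, imbalance, base_s=10.0):
       """Deferral before acting on an updated measurement: sources
       contributing little traffic, or facing a small imbalance,
       wait longer; jitter desynchronizes sources."""
       t = base_s * (1.0 - contribution) * (1.0 - imbalance)
       return t * random.uniform(0.8, 1.2)

   def adjustment_mbps(needed_mbps, movable_mbps, contribution):
       """Move only this source's proportional share of what is
       needed, bounded by what this source can actually move."""
       return min(needed_mbps * contribution, movable_mbps)

   # An ingress carrying 20% of the traffic on a congested link that
   # needs 100 Mb/s moved, with 40 Mb/s of movable flows:
   print(f"defer {defer_time_s(0.2, 0.3):.1f}s, "
         f"then move {adjustment_mbps(100.0, 40.0, 0.2):.0f} Mb/s")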
7.6. Preventing Feedback Problems
The goal of this section is not to produce an algorithm for traffic
adjustments. The goal is simply to point out factors that will
either increase or decrease the tendency toward feedback problems and
potential instability.
The most severe problem is positive feedback at some frequency (or
time period) causing a growing oscillation until saturation is
reached. Also very severe is an undamped oscillation, which may not
grow but will continue forever.
Oscillations often occur due to overshoot in making an adjustment.
Overshoot can be caused in a distributed load balancing environment
by multiple traffic sources each making an adjustment that
individually would bring the network closer to balance but together
cause overshoot. This is why the contribution of a given traffic
source to the total traffic must be considered.
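The overshoot mechanism is easy to demonstrate with an abstract
example, not a model of any particular protocol: five sources each
observe the same 100 Mb/s imbalance.

   imbalance_mbps = 100.0
   contributions = [0.2] * 5  # five sources, each carrying 20%

   # Each source moving the full observed imbalance overshoots
   # fivefold; scaling by contribution matches the imbalance.
   naive = sum(imbalance_mbps for _ in contributions)
   scaled = sum(imbalance_mbps * c for c in contributions)
   print(f"naive total moved:  {naive:.0f} Mb/s (5x overshoot)")
   print(f"scaled total moved: {scaled:.0f} Mb/s")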
If there is no tendency to overshoot at all, there will be no
oscillation at all. If, in the event of overshoot, subsequent
adjustments are always less aggressive, then there will be a damped
oscillation, known as "ringing". Attempts to respond
quickly to a traffic imbalance may result in some overshoot. Damping
subsequent responses will reduce subsequent overshoot and reduce
ringing. Small overshoot and damped ringing are undesirable but
might be acceptable if there is some other gain such as faster
reaction time.
What follows are some suggestions to help ensure stability and
improve damping.
1. If there was a prior traffic adjustment between two links or
paths, then the following types of bounds should be applied (see
the sketch after this list).
a. If the new traffic adjustment is in the same direction, the
adjustment amount should be no more than the prior adjustment
amount multiplied by some factor (greater than one) that is
determined by the protocol to help ensure stability. For
example, if the prior traffic adjustment was "recent" this
factor might be two or less (a doubling in the next
adjustment or less) and rise proportionately if the elapsed
time was above the "recent" threshold.
b. If the new traffic adjustment is in the opposite direction,
the adjustment amount should be no more than the prior
adjustment amount multiplied by some fraction that is
determined by the protocol to adequately damp any ringing.
For example, if the prior traffic adjustment was "recent"
this factor might be one half and rise proportionately with
time since the last adjustment if the elapsed time was above
the "recent" threshold, reaching unity at some very long time
where prior history may be less relevant.
2. When an updated measurement result arrives and a traffic
adjustment is already scheduled, the computation of the timer
value used to defer taking action and of the adjustment amount should
be repeated. The updated adjustment amount should be used. The
lesser of the remaining time on the prior timer and the new
computed timer value should be used.
3. Analysis of traffic adjustment scheduled time and traffic
adjustment amount should be repeated each time there is a change
to either side of an already scheduled traffic adjustment. When
traffic adjustment is strictly local, this should not impose a
high computational burden. For distributed load balancing, the
computational burden in a large mesh network will have to be taken
into account when choosing protocol timer default value guidance and
safe bounds of timer values.
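A sketch of the bound in item 1 follows; the factors and the "recent"
threshold are hypothetical placeholders for values a protocol would
specify.

   def bound_adjustment(new_amount, prior_amount, same_direction,
                        elapsed_s, recent_s=30.0):
       """Bound a traffic adjustment by the prior adjustment between
       the same links or paths (item 1 above)."""
       age = max(elapsed_s / recent_s, 1.0)  # 1.0 while "recent"
       if same_direction:
           # 1.a: at most double a recent prior move, with the bound
           # rising in proportion to elapsed time after that.
           limit = prior_amount * 2.0 * age
       else:
           # 1.b: at most half of a recent prior move, approaching
           # the full prior amount as the prior move becomes old.
           limit = prior_amount * min(0.5 * age, 1.0)
       return min(new_amount, limit)

   print(bound_adjustment(50.0, 10.0, True, elapsed_s=10.0))   # 20.0
   print(bound_adjustment(50.0, 10.0, False, elapsed_s=10.0))  # 5.0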
It is very difficult to mathematically prove that the specification
for a particular dynamic load balancing method is stable. So far, at
best an argument can be made that specific provisions within a
particular dynamic load balancing method strongly tend toward
stability. See Section 8 for further discussion.
8. Open Area for Research
So far there is no mathematical proof of stability for any given
dynamic load balancing method. Having such a proof would go a long
way toward encouraging deployment of new methods.
Absent a mathematical proof of stability, some dynamic load
balancing methods for which a strong argument could be made for a
tendency toward stability have been deployed. Successful deployments
provide an existence proof that networks can be stable, but this
existence proof falls short of ensuring that all deployments will
always be stable.
The discrete nature of traffic adjustment events, as opposed to
continuous small adjustments, prevents the use of linear systems
analysis. This is unfortunate since stability proofs for linear
systems (or proofs of instability) are relatively simple.
The mesh topology and analog nature of traffic adjustment amount make
it difficult or impossible to apply techniques based on discrete
states, such as Markov chains.
Proof of stability may be easier for dynamic load balancing applied
as a local decision rather than a distributed decision (see
Section 7.4). For a local decision, a set of inputs to a node are
balanced over a set of outputs. Absent routing loops, the output
will not affect the inputs.
For dynamic load balancing with distributed decisions a proof must be
applicable to an arbitrary mesh topology. This is a more difficult
problem than the problem of local decision.
There may be characteristics of the mesh that make a proof for
dynamic load balancing with distributed decisions somewhat less
difficult. It may help that there exists a non-cyclic relationship
between more heavily loaded links and less loaded links, where given
that all routers have the same information, traffic is only moved
from more heavily loaded links to less loaded links. Other
characteristics, such as the non-cyclic nature of the traffic flow,
may help as well.
The problems of providing stability proofs for existing and proposed
mechanisms are clearly difficult and not yet solved. As such this
remains a good topic for research.
9. Acknowledgements
While there was interest in OMP in the 1990s, there was some
discussion of stability on the OSPF, ISIS, and MPLS WG mailing lists
as well as off list. Particularly helpful was discussion with Matt
Mathis (PSC) and Teunis Ott (Bellcore).
10. IANA Considerations
This memo includes no request to IANA.
11. Security Considerations
This document discusses both the known stability or instability of
previously deployed routing and load balancing techniques and the
unknown stability or instability of techniques which may be deployed
in the future. The difficulty in proving stability is also
discussed. Given the scope of the document, with no proposal for new
protocols and only discussion of factors that impact stability, there
are no security concerns raised by this document.
12. Informative References
[DBSPRA] Bertsekas, D., "Dynamic Behavior of Shortest Path Routing
Algorithms for Communication Networks", IEEE Trans. Auto.
Control, 1982.
[I-D.ietf-isis-omp]
Villamizar, C. and T. Li, "IS-IS Optimized Multipath
(ISIS-OMP)", draft-ietf-isis-omp-01 (work in progress),
February 1999.
[I-D.ietf-isis-te-metric-extensions]
Previdi, S., Giacalone, S., Ward, D., Drake, J., Atlas,
A., Filsfils, C., and W. Wu, "IS-IS Traffic Engineering
(TE) Metric Extensions", draft-ietf-isis-te-metric-
extensions-01 (work in progress), October 2013.
[I-D.ietf-mpls-forwarding]
Villamizar, C., Kompella, K., Amante, S., Malis, A., and
C. Pignataro, "MPLS Forwarding Compliance and Performance
Requirements", draft-ietf-mpls-forwarding-09 (work in
progress), March 2014.
[I-D.ietf-mpls-multipath-use]
Villamizar, C., "Use of Multipath with MPLS and MPLS-TP",
draft-ietf-mpls-multipath-use-04 (work in progress),
January 2014.
[I-D.ietf-mpls-te-express-path]
Atlas, A., Drake, J., Giacalone, S., Ward, D., Previdi,
S., and C. Filsfils, "Performance-based Path Selection for
Explicitly Routed LSPs using TE Metric Extensions", draft-
ietf-mpls-te-express-path-00 (work in progress), October
2013.
[I-D.ietf-ospf-omp]
Villamizar, C., "OSPF Optimized Multipath (OSPF-OMP)",
draft-ietf-ospf-omp-02 (work in progress), February 1999.
[I-D.ietf-ospf-te-metric-extensions]
Giacalone, S., Ward, D., Drake, J., Atlas, A., and S.
Previdi, "OSPF Traffic Engineering (TE) Metric
Extensions", draft-ietf-ospf-te-metric-extensions-05 (work
in progress), December 2013.
[I-D.ietf-rtgwg-cl-requirement]
Villamizar, C., McDysan, D., Ning, S., Malis, A., and L.
Yong, "Requirements for Advanced Multipath in MPLS
Networks", draft-ietf-rtgwg-cl-requirement-16 (work in
progress), February 2014.
[I-D.ietf-rtgwg-cl-use-cases]
Ning, S., Malis, A., McDysan, D., Yong, L., and C.
Villamizar, "Advanced Multipath Use Cases and Design
Considerations", draft-ietf-rtgwg-cl-use-cases-05 (work in
progress), November 2013.
[I-D.ietf-rtgwg-mrt-frr-architecture]
Atlas, A., Kebler, R., Envedi, G., Csaszar, A., Tantsura,
J., Konstantynowicz, M., and R. White, "An Architecture
for IP/LDP Fast-Reroute Using Maximally Redundant Trees",
draft-ietf-rtgwg-mrt-frr-architecture-03 (work in
progress), July 2013.
[I-D.ietf-rtgwg-remote-lfa]
Bryant, S., Filsfils, C., Previdi, S., Shand, M., and S.
Ning, "Remote LFA FRR", draft-ietf-rtgwg-remote-lfa-04
(work in progress), November 2013.
[I-D.villamizar-mpls-omp]
Villamizar, C., "OSPF Optimized Multipath (OSPF-OMP)",
draft-villamizar-mpls-omp-01 (work in progress), February
1999.
[IEEE-802.1AX]
IEEE Standards Association, "IEEE Std 802.1AX-2008 IEEE
Standard for Local and Metropolitan Area Networks - Link
Aggregation", 2006, <http://standards.ieee.org/getieee802/
download/802.1AX-2008.pdf>.
[ISO.10589.1992]
International Organization for Standardization,
"Intermediate system to intermediate system intra-domain-
routing routine information exchange protocol for use in
conjunction with the protocol for providing the
connectionless-mode Network Service (ISO 8473)", ISO
Standard 10589, 1992.
[RFC1058] Hedrick, C., "Routing Information Protocol", RFC 1058,
June 1988.
[RFC2080] Malkin, G. and R. Minnear, "RIPng for IPv6", RFC 2080,
January 1997.
[RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998.
[RFC2439] Villamizar, C., Chandra, R., and R. Govindan, "BGP Route
Flap Damping", RFC 2439, November 1998.
[RFC2453] Malkin, G., "RIP Version 2", STD 56, RFC 2453, November
1998.
[RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
and W. Weiss, "An Architecture for Differentiated
Services", RFC 2475, December 1998.
[RFC2991] Thaler, D. and C. Hopps, "Multipath Issues in Unicast and
Multicast Next-Hop Selection", RFC 2991, November 2000.
[RFC2992] Hopps, C., "Analysis of an Equal-Cost Multi-Path
Algorithm", RFC 2992, November 2000.
[RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
Tunnels", RFC 3209, December 2001.
[RFC3270] Le Faucheur, F., Wu, L., Davie, B., Davari, S., Vaananen,
P., Krishnan, R., Cheval, P., and J. Heinanen, "Multi-
Protocol Label Switching (MPLS) Support of Differentiated
Services", RFC 3270, May 2002.
[RFC3468] Andersson, L. and G. Swallow, "The Multiprotocol Label
Switching (MPLS) Working Group decision on MPLS signaling
protocols", RFC 3468, February 2003.
[RFC3630] Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering
(TE) Extensions to OSPF Version 2", RFC 3630, September
2003.
[RFC4090] Pan, P., Swallow, G., and A. Atlas, "Fast Reroute
Extensions to RSVP-TE for LSP Tunnels", RFC 4090, May
2005.
[RFC4124] Le Faucheur, F., "Protocol Extensions for Support of
Diffserv-aware MPLS Traffic Engineering", RFC 4124, June
2005.
[RFC4201] Kompella, K., Rekhter, Y., and L. Berger, "Link Bundling
in MPLS Traffic Engineering (TE)", RFC 4201, October 2005.
[RFC4206] Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP)
Hierarchy with Generalized Multi-Protocol Label Switching
(GMPLS) Traffic Engineering (TE)", RFC 4206, October 2005.
[RFC4632] Fuller, V. and T. Li, "Classless Inter-domain Routing
(CIDR): The Internet Address Assignment and Aggregation
Plan", BCP 122, RFC 4632, August 2006.
[RFC5036] Andersson, L., Minei, I., and B. Thomas, "LDP
Specification", RFC 5036, October 2007.
[RFC5250] Berger, L., Bryskin, I., Zinin, A., and R. Coltun, "The
OSPF Opaque LSA Option", RFC 5250, July 2008.
[RFC5286] Atlas, A. and A. Zinin, "Basic Specification for IP Fast
Reroute: Loop-Free Alternates", RFC 5286, September 2008.
[RFC5305] Li, T. and H. Smit, "IS-IS Extensions for Traffic
Engineering", RFC 5305, October 2008.
[RFC5308] Hopps, C., "Routing IPv6 with IS-IS", RFC 5308, October
2008.
[RFC5340] Coltun, R., Ferguson, D., Moy, J., and A. Lindem, "OSPF
for IPv6", RFC 5340, July 2008.
[RFC5462] Andersson, L. and R. Asati, "Multiprotocol Label Switching
(MPLS) Label Stack Entry: "EXP" Field Renamed to "Traffic
Class" Field", RFC 5462, February 2009.
[RFC5714] Shand, M. and S. Bryant, "IP Fast Reroute Framework", RFC
5714, January 2010.
[RFC5715] Shand, M. and S. Bryant, "A Framework for Loop-Free
Convergence", RFC 5715, January 2010.
[RFC6126] Chroboczek, J., "The Babel Routing Protocol", RFC 6126,
April 2011.
[RFC6391] Bryant, S., Filsfils, C., Drafz, U., Kompella, V., Regan,
J., and S. Amante, "Flow-Aware Transport of Pseudowires
over an MPLS Packet Switched Network", RFC 6391, November
2011.
[RFC6790] Kompella, K., Drake, J., Amante, S., Henderickx, W., and
L. Yong, "The Use of Entropy Labels in MPLS Forwarding",
RFC 6790, November 2012.
[RFC6976] Shand, M., Bryant, S., Previdi, S., Filsfils, C.,
Francois, P., and O. Bonaventure, "Framework for Loop-Free
Convergence Using the Ordered Forwarding Information Base
(oFIB) Approach", RFC 6976, July 2013.
[RFC6981] Bryant, S., Previdi, S., and M. Shand, "A Framework for IP
and MPLS Fast Reroute Using Not-Via Addresses", RFC 6981,
August 2013.
Author's Address
Curtis Villamizar
Outer Cape Cod Network Consulting, LLC
Email: curtis@occnc.com