RIFT WG                                                   Yuehua Wei
Internet-Draft                                           Zheng Zhang
Intended status: Standards Track                     ZTE Corporation
Expires: May 7, 2020                                Dmitry Afanasiev
                                                              Yandex
                                                         Tom Verhaeg
                                          Interconnect Services B.V.
                                                  Jaroslaw Kowalczyk
                                                       Orange Polska
                                                    November 4, 2019
RIFT Applicability
draft-wei-rift-applicability-02
This document discusses the properties, applicability and operational considerations of RIFT in different network scenarios. It intends to provide a rough guide to how RIFT can be deployed to simplify routing operations in Clos topologies and their variations.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 7, 2020.
Copyright (c) 2019 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
This document intends to explain the properties and applicability of RIFT in different deployment scenarios and highlight the operational simplicity of the technology compared to traditional routing solutions. It also documents special considerations when RIFT is used with or without overlays and controllers, and when it corrects topology miscabling and/or node and link failures.
Clos and Fat-Tree topologies have gained prominence in today's networking, primarily as a result of the paradigm shift towards a centralized data-center based architecture that is poised to deliver a majority of computation and storage services in the future.
Today's routing protocols were originally geared towards networks with an irregular topology and a low degree of connectivity. When they are applied to Fat-Tree topologies:
Further content of this document assumes that the reader is familiar with the terms and concepts used in OSPF and IS-IS link-state protocols and at least the sections of RIFT outlining the requirement of routing in IP fabrics and RIFT protocol concepts.
RIFT is a dynamic routing protocol for Clos and fat-tree network topologies. It defines a link-state protocol when "pointing north" and a path-vector protocol when "pointing south".
It floods flat link-state information northbound only, so that each level obtains the full topology of the levels south of it. That information is never flooded East-West or back South again. A top tier node therefore has the full set of prefixes from the SPF calculation.
In the southbound direction the protocol operates like a "fully summarizing, unidirectional" path vector protocol, or rather a distance vector protocol with implicit split horizon, where the information propagates one hop south and is 're-advertised' by nodes at the next lower level, normally as just the default route.
+-----------+ +-----------+ | ToF | | ToF | LEVEL 2 + +-----+--+--+ +-+--+------+ | | | | | | | | | ^ + | | | +-------------------------+ | Distance | +-------------------+ | | | | | Vector | | | | | | | | + South | | | | +--------+ | | | Link+State + | | | | | | | | Flooding | | | +-------------+ | | | North v | | | | | | | | + +-+--+-+ +------+ +-------+ +--+--+-+ | |SPINE | |SPINE | | SPINE | | SPINE | | LEVEL 1 + ++----++ ++---+-+ +--+--+-+ ++----+-+ | + | | | | | | | | | ^ N Distance | +-------+ | | +--------+ | | | E Vector | | | | | | | | | +------> South | +-------+ | | | +-------+ | | | | + | | | | | | | | | + v ++--++ +-+-++ ++-+-+ +-+--++ + |LEAF| |LEAF| |LEAF| |LEAF | LEVEL 0 +----+ +----+ +----+ +-----+
Figure 1: RIFT overview
A middle tier node has only the information necessary for its level: all destinations south of the node based on SPF calculation, the default route, and potential disaggregated routes.
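The difference in what each level knows can be illustrated with a minimal Python sketch. The six-node fabric, the node names and the prefixes below are hypothetical, and the model only tracks which prefixes a node ends up with; it does not implement TIE flooding or SPF as specified by RIFT.

```python
# Hypothetical three-level fabric; names, levels and prefixes are illustrative only.
LEVEL = {"tof1": 2, "tof2": 2, "spine1": 1, "spine2": 1, "leaf1": 0, "leaf2": 0}
LINKS = [("tof1", "spine1"), ("tof1", "spine2"), ("tof2", "spine1"),
         ("tof2", "spine2"), ("spine1", "leaf1"), ("spine1", "leaf2"),
         ("spine2", "leaf1"), ("spine2", "leaf2")]
PREFIXES = {"leaf1": {"10.0.1.0/24"}, "leaf2": {"10.0.2.0/24"}}


def neighbors(node, direction):
    """Neighbors one level up (direction=+1) or one level down (direction=-1)."""
    found = []
    for a, b in LINKS:
        for x, y in ((a, b), (b, a)):
            if x == node and LEVEL[y] - LEVEL[x] == direction:
                found.append(y)
    return found


def southern_prefixes(node):
    """Prefixes learned from northbound flooding: own prefixes plus everything south."""
    known = set(PREFIXES.get(node, ()))
    for child in neighbors(node, -1):
        known |= southern_prefixes(child)
    return known


def routing_view(node):
    """Southern prefixes plus a default route if a northbound neighbor exists."""
    view = southern_prefixes(node)
    if neighbors(node, +1):
        view.add("0.0.0.0/0")
    return view


for name in ("tof1", "spine1", "leaf1"):
    print(name, sorted(routing_view(name)))
```

In this model the top tier node holds every prefix and no default route, the spine holds all prefixes south of it plus the default route, and the leaf holds only its own prefix plus the default route.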
RIFT combines the advantages of both Link-State and Distance Vector:
And RIFT eliminates the disadvantages of Link-State or Distance Vector:
There are therefore two types of link-state database: "north representation" N-TIEs and "south representation" S-TIEs. The N-TIEs contain a link-state topology description of lower levels, and the S-TIEs simply carry default routes for the lower levels.
RIFT has a number of further unique advantages, listed below, which are best understood by reading the details of the RIFT specification.
Although RIFT is specified primarily for "proper" Clos or "fat-tree" structures, it already supports PoD concepts which are, strictly speaking, not found in the original Clos concepts.
Further, the specification explains and supports operations of multi-plane Clos variants where the protocol relies on a set of rings to allow the reconciliation of the topology view of different planes as the most desirable solution, making proper disaggregation viable in case of failures. These observations hold not only in the case of RIFT but in the generic case of dynamic routing on Clos variants with multiple planes and failures in bi-sectional bandwidth, especially on the leaves.
RIFT is not limited to pure Clos divided into PoD and multi-planes but supports horizontal links below the top of fabric level. Those links are used however only as routes of last resort northbound when a spine loses all northbound links or cannot compute a default route through them.
A possible configuration is a "ring" of horizontal links at a level. In the presence of such a "ring" in any level (except the ToF level) neither N-SPF nor S-SPF will provide a "ring-based protection" scheme, since such a computation would necessarily have to deal with breaking of "loops" in the Dijkstra sense; an application for which RIFT is not intended.
A full-mesh connectivity between nodes on the same level can be employed, and that allows N-SPF to provide for any node losing all its northbound adjacencies (as long as any of the other nodes in the level are northbound connected) to still participate in northbound forwarding.
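The "last resort" use of horizontal links can be sketched as a next-hop selection rule. This is a simplified illustration under the assumption that each node already knows its own northbound neighbors and whether its same-level peers are northbound connected; the function and variable names are hypothetical, and the real decision is made by N-SPF.

```python
def northbound_next_hops(node, north, east_west):
    """Pick next hops for northbound traffic.

    north:     node -> list of northbound neighbors still reachable
    east_west: node -> list of same-level neighbors over horizontal links
    """
    if north.get(node):
        # Normal case: horizontal links are not used at all.
        return list(north[node])
    # Last resort: use same-level peers that still have a northbound adjacency.
    return [peer for peer in east_west.get(node, []) if north.get(peer)]


# Hypothetical level with three spines; spine1 and spine3 lost all northbound links.
north = {"spine1": [], "spine2": ["tof1"], "spine3": []}
east_west = {"spine1": ["spine2", "spine3"]}
print(northbound_next_hops("spine1", north, east_west))
```

With a full mesh at the level, the isolated node can still forward northbound as long as at least one of its peers is northbound connected.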
Through relaxations of the specified adjacency forming rules RIFT implementations can be extended to support vertical "shortcuts" as proposed by e.g. [I-D.white-distoptflood]. The RIFT specification itself does not provide the exact details since the resulting solution suffers from either much larger blast radii with increased flooding volumes or in case of maximum aggregation routing bow-tie problems.
RIFT is largely driven by the demands of, and hence ideally suited for, application in the underlay of data center IP fabrics, the vast majority of which seem to be currently (and for the foreseeable future) Clos architectures. It significantly simplifies operation and deployment of such fabrics, as described in Section 4, compared to environments relying on extensive proprietary provisioning and operational solutions.
The demand for bandwidth is increasing steadily, driven primarily by environments close to content producers (server farms connected via DC fabrics) but in proximity to content consumers as well. Consumers are often clustered in metro areas with their own network architectures that can benefit from simplified, regular Clos structures and hence RIFT.
Commercial edifices are often cabled in topologies that are either Clos or its isomorphic equivalents. With many floors the Clos can grow rather high and with that present a challenge for traditional routing protocols (except BGP and by now largely phased-out PNNI) which do not support an arbitrary number of levels which RIFT does naturally. Moreover, due to limited sizes of forwarding tables in active elements of building cabling the minimum FIB size RIFT maintains under normal conditions can prove particularly cost-effective in terms of hardware and operational costs.
It is common in high-speed communications switching and routing devices to use fabrics when a crossbar is not feasible due to cost, head-of-line blocking or size trade-offs. Normally such fabrics are not self-healing or rely on 1:1 or 1+1 protection schemes, but it is conceivable to use RIFT to operate Clos fabrics that can deal effectively with interconnection or subsystem failures in such modules. RIFT is not IP specific, and hence any link addressing connecting internal device subnets is conceivable.
The Cloud Central Office (CloudCO) is a new stage of the telecom Central Office. It takes advantage of Software Defined Networking (SDN) and Network Function Virtualization (NFV) in conjunction with general purpose hardware to optimize current networks. The following figure illustrates this architecture at a high level. It describes a single instance or macro-node of CloudCO. An Access I/O module faces a CloudCO Access Node and the CPEs behind it. A Network I/O module faces the core network. The two I/O modules are interconnected by a leaf and spine fabric. [TR-384]
+---------------------+ +----------------------+ | Spine | | Spine | | Switch | | Switch | +------+---+------+-+-+ +--+-+-+-+-----+-------+ | | | | | | | | | | | | | | | | | +-------------------------------+ | | | | | | | | | | | | | | | | | +-------------------------+ | | | | | | | | | | | | | | | | | +----------------------+ | | | | | | | | | | | | | | | | | | | | | +---------------------------------+ | | | | | | | | | | | | | | | | | | | | | | +-----------------------------+ | | | | | | | | | | | | | | | | | | | | | | +--------------------+ | | | | | | | | | | | | | | | | +--+ +-+---+--+ +-+---+--+ +--+----+--+ +-+--+--+ +--+ |L | | Leaf | | Leaf | | Leaf | | Leaf | |L | |S | | Switch | | Switch | | Switch | | Switch| |S | ++-+ +-+-+-+--+ +-+-+-+--+ +--+-+--+--+ ++-+--+-+ +-++ | | | | | | | | | | | | | | | +-+-+-+--+ +-+-+-+--+ +--+-+--+--+ ++-+--+-+ | | |Compute | |Compute | | Compute | |Compute| | | |Node | |Node | | Node | |Node | | | +--------+ +--------+ +----------+ +-------+ | | || VAS5 || || vDHCP|| || vRouter|| ||VAS1 || | | |--------| |--------| |----------| |-------| | | |--------| |--------| |----------| |-------| | | || VAS6 || || VAS3 || || v802.1x|| ||VAS2 || | | |--------| |--------| |----------| |-------| | | |--------| |--------| |----------| |-------| | | || VAS7 || || VAS4 || || vIGMP || ||BAA || | | |--------| |--------| |----------| |-------| | | +--------+ +--------+ +----------+ +-------+ | | | ++-----------+ +---------++ |Network I/O | |Access I/O| +------------+ +----------+
Figure 2: An example of CloudCO architecture
The spine-leaf architecture deployed inside CloudCO meets the network requirements of being adaptable, agile, scalable and dynamic.
RIFT presents the opportunity for organizations building and operating IP fabrics to simplify their operation and deployments while achieving many desirable properties of dynamic routing on such a substrate:
South reflection is a mechanism by which South Node TIEs are "reflected" back up north to allow nodes in the same level without E-W links to "see" each other.
For example, Spine111, Spine112, Spine121 and Spine122 each reflect Node S-TIEs from ToF21 to ToF22. Likewise, Spine111, Spine112, Spine121 and Spine122 each reflect Node S-TIEs from ToF22 to ToF21. So ToF22 and ToF21 see each other's node information as level 2 nodes.
In an equivalent fashion, as the result of the south reflection between Spine121-Leaf121-Spine122 and Spine121-Leaf122-Spine122, Spine121 and Spine122 know of each other at level 1.
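The effect of south reflection can be sketched as follows. This is a toy model, not the flooding procedure of the RIFT specification: it assumes that every node hands a Node S-TIE to each southbound neighbor and that each such neighbor passes it on to all of its other northbound neighbors. The function name and the data layout are hypothetical.

```python
def reflect(north_adjacencies):
    """Return, per node, the same-level peers it learns about via reflection.

    north_adjacencies: reflector -> set of its northbound neighbors
    """
    seen = {}
    for reflector, parents in north_adjacencies.items():
        for origin in parents:
            for receiver in parents:
                if receiver != origin:
                    # 'receiver' gets the Node S-TIE of 'origin' via 'reflector'.
                    seen.setdefault(receiver, set()).add(origin)
    return seen


# Northbound adjacencies of the spines and of the two leaves in the second PoD.
north = {
    "Spine111": {"ToF21", "ToF22"}, "Spine112": {"ToF21", "ToF22"},
    "Spine121": {"ToF21", "ToF22"}, "Spine122": {"ToF21", "ToF22"},
    "Leaf121": {"Spine121", "Spine122"}, "Leaf122": {"Spine121", "Spine122"},
}
seen = reflect(north)
print(seen["ToF21"], seen["Spine121"])
```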
+--------+ +--------+ | ToF21 | | ToF22 | LEVEL 2 ++--+-+-++ ++-+--+-++ | | | | | | | + | | | | | | | linkTS8 +-------------+ | +-+linkTS3+-+ | | | +--------------+ | | | | | | + | | +----------------------------+ | linkTS7 | | | | | + + + | | | | +-------+linkTS4+------------+ | | | | + + | | | | | | +------------+--+ | | | | | | | linkTS6 | | +-+----++ ++-----++ ++------+ ++-----++ |Spin111| |Spin112| |Spin121| |Spin122| LEVEL 1 +-+---+-+ ++----+-+ +-+---+-+ ++---+--+ | | | | | | | | | +--------------+ | + ++XX+linkSL6+---+ + | | | | linkSL5 | | linkSL8 | +------------+ | | + +---+linkSL7+-+ | + | | | | | | | | +-+---+-+ +--+--+-+ +-+---+-+ +--+-+--+ |Leaf111| |Leaf112| |Leaf121| |Leaf122| LEVEL 0 +-+-----+ ++------+ +-----+-+ +-+-----+ + + + + Prefix111 Prefix112 Prefix121 Prefix122
Figure 3: Suboptimal routing upon link failure use case
As shown in Figure 3, as the result of the south reflection between Spine121-Leaf121-Spine122 and Spine121-Leaf122-Spine122, Spine121 and Spine122 know of each other at level 1.
Without a disaggregation mechanism, when linkSL6 fails, a packet from Leaf121 to Prefix122 will probably go up through linkSL5 to linkTS3 and then down through linkTS4 and linkSL8 to Leaf122, or go up through linkSL5 to linkTS6 and then down through linkTS4 and linkSL8 to Leaf122, based on the pure default route. This is a case of suboptimal routing, or bow-tieing.
With a disaggregation mechanism, when linkSL6 fails, Spine122 will detect the failure from the reflected node S-TIE of Spine121. Based on the disaggregation algorithm provided by RIFT, Spine122 will explicitly advertise Prefix122 in a Disaggregated Prefix S-TIE PrefixesElement(prefix122, cost 1). A packet from Leaf121 to Prefix122 will then only be sent over linkSL7, following a longest-prefix match on Prefix122, and go down through linkSL8 to Leaf122.
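The decision and its effect on forwarding can be sketched in Python. This is a simplified illustration, not the disaggregation algorithm of the RIFT specification: it assumes a node disaggregates the prefixes behind any southbound neighbor that at least one same-level peer (known via reflection) cannot reach. The IP prefixes standing in for Prefix121 and Prefix122 and all function names are hypothetical.

```python
import ipaddress


def prefixes_to_disaggregate(own_south, peers_south, prefixes_behind):
    """Prefixes reachable via this node that some same-level peer cannot reach."""
    result = set()
    for neighbor in own_south:
        if any(neighbor not in adjacencies for adjacencies in peers_south.values()):
            result |= prefixes_behind.get(neighbor, set())
    return result


def lookup(fib, destination):
    """Longest-prefix match; fib maps prefix string -> set of next hops."""
    address = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hops in fib.items():
        network = ipaddress.ip_network(prefix)
        if address in network and (best is None or network.prefixlen > best[0]):
            best = (network.prefixlen, next_hops)
    return best[1] if best else set()


# Spine122's view after linkSL6 (Spine121 to Leaf122) has failed.
extra = prefixes_to_disaggregate(
    own_south={"Leaf121", "Leaf122"},
    peers_south={"Spine121": {"Leaf121"}},
    prefixes_behind={"Leaf121": {"10.1.21.0/24"}, "Leaf122": {"10.1.22.0/24"}},
)

# Leaf121's FIB: default route via both spines plus the disaggregated prefix.
fib = {"0.0.0.0/0": {"Spine121", "Spine122"}}
for prefix in extra:
    fib[prefix] = {"Spine122"}

print(extra, lookup(fib, "10.1.22.5"), lookup(fib, "10.1.11.5"))
```

Traffic for the disaggregated prefix follows the more specific route through Spine122 only, while all other traffic keeps using the default route over both spines.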
+--------+ +--------+ | ToF 21 | | ToF 22 | LEVEL 2 ++-+--+-++ ++-+--+-++ | | | | | | | | | | | | | | | linkTS8 +--------------+ | +--linkTS3-X+ | | | +--------------+ linkTS1 | | | | | | | | +-----------------------------+ | linkTS7 | | | | | | | | | | | linkTS2 +--------linkTS4-X-----------+ | | | | | | | | | | linkTS5 +-+ +---------------+ | | | | | | | linkTS6 | | +-+----++ +-+-----+ ++----+-+ ++-----++ |Spin111| |Spin112| |Spin121| |Spin122| LEVEL 1 +-+---+-+ ++----+-+ +-+---+-+ ++---+--+ | | | | | | | | | +---------------+ | | +----linkSL6----+ | linkSL1 | | | linkSL5 | | linkSL8 | +---linkSL3---+ | | | +----linkSL7--+ | | | | | | | | | | +-+---+-+ +--+--+-+ +-+---+-+ +--+-+--+ |Leaf111| |Leaf112| |Leaf121| |Leaf122| LEVEL 0 +-+-----+ ++------+ +-----+-+ +-+-----+ + + + + Prefix111 Prefix112 Prefix121 Prefix122
Figure 4: Black-holing upon link failure use case
This scenario illustrates a case where a double link failure occurs and black-holing can result.
Without a disaggregation mechanism, when linkTS3 and linkTS4 both fail, packets from Leaf111 to Prefix122 would suffer 50% black-holing based on the pure default route. A packet that goes up through linkSL1 to linkTS1 and would then go down through linkTS3 or linkTS4 will be dropped. A packet that goes up through linkSL3 to linkTS2 and would then go down through linkTS3 or linkTS4 will be dropped as well. This is a case of black-holing.
With a disaggregation mechanism, when linkTS3 and linkTS4 both fail, ToF22 will detect the failure from the node S-TIE of ToF21 reflected by Spine111, Spine112, Spine121 and Spine122. Based on the disaggregation algorithm provided by RIFT, ToF22 will explicitly originate an S-TIE with Prefix121 and Prefix122, which is flooded to Spine111, Spine112, Spine121 and Spine122.
A packet from Leaf111 to Prefix122 will then not be routed to linkTS1 or linkTS2. It will only be routed to linkTS5 or linkTS7, following a longest-prefix match on Prefix122.
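The 50% figure can be reproduced with a small calculation. The sketch assumes that a spine hashes flows evenly across its equal-cost next hops, which is an assumption about the forwarding plane rather than something RIFT mandates; the function name is hypothetical.

```python
def delivered_fraction(next_hops, can_deliver):
    """Share of evenly hashed flows whose next hop can still reach the destination."""
    good = sum(1 for hop in next_hops if can_deliver[hop])
    return good / len(next_hops)


# After linkTS3 and linkTS4 fail, ToF21 has no path into the second PoD.
can_deliver = {"ToF21": False, "ToF22": True}

# Default route only: a spine in the first PoD balances over both ToF nodes.
before = delivered_fraction(["ToF21", "ToF22"], can_deliver)

# After ToF22 disaggregates Prefix121 and Prefix122: only ToF22 is a next hop.
after = delivered_fraction(["ToF22"], can_deliver)

print(before, after)
```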
Each RIFT node may operate in zero touch provisioning (ZTP) mode. It has no configuration (unless it is a Top-of-Fabric at the top of the topology or it is desired to confine it to a leaf role without leaf-2-leaf procedures). In such a case RIFT will fully configure the node's level after it is attached to the topology.
The most important component for ZTP is the automatic level derivation procedure. All the Top-of-Fabric nodes are explicitly marked with the TOP_OF_FABRIC flag; they are the initial 'seeds' needed for other ZTP nodes to derive their level in the topology. The level of each node is then derived from the LIEs received from its neighbors, whereby each node (with the possible exception of configured leaves) tries to attach at the highest possible point in the fabric. This guarantees that even if the diffusion front reaches a node from "below" faster than from "above", it will greedily abandon the already negotiated level derived from nodes topologically below it and properly peer with nodes above.
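The greedy behaviour can be sketched with a small Python class. This is a simplified model that assumes a node derives its level as the highest level offered by any neighbor minus one; hold times, offer validity checks and the state machine of the RIFT specification are left out. The class name, the method names and the level values 24 and 22 are illustrative assumptions.

```python
class ZtpNode:
    """Toy ZTP node that tracks level offers received in LIEs."""

    def __init__(self, configured_level=None):
        self.configured_level = configured_level
        self.offers = {}  # neighbor name -> level offered in its LIE

    def on_lie(self, neighbor, offered_level):
        """Record (or refresh) the level offered by a neighbor."""
        self.offers[neighbor] = offered_level

    def level(self):
        """Configured level if present, else highest offered level minus one."""
        if self.configured_level is not None:
            return self.configured_level
        if not self.offers:
            return None
        return max(self.offers.values()) - 1


node = ZtpNode()
node.on_lie("below", 22)   # the diffusion front arrives from below first
first = node.level()
node.on_lie("above", 24)   # a higher offer arrives later from above
second = node.level()
print(first, second)
```

The node first settles on a level derived from the neighbor below it and abandons that level as soon as a higher offer arrives, which matches the behaviour described above.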
+----------------+ +-----------------+ | ToF21 | +------+ ToF22 | LEVEL 2 +-------+----+---+ | +----+---+--------+ | | | | | | | | | | | | +----------------------------+ | | +---------------------------+ | | | | | | | | | | | | | | | | | +-----------------------+ | | | | +------------------------+ | | | | | | | | | | | | +-+---+-+ +-+---+-+ | +-+---+-+ +-+---+-+ |Spin111| |Spin112| | |Spin121| |Spin122| LEVEL 1 +-+---+-+ ++----+-+ | +-+---+-+ ++----+-+ | | | | | | | | | | +---------+ | link-M | +---------+ | | | | | | | | | | | +-------+ | | | | +-------+ | | | | | | | | | | | +-+---+-+ +--+--+-+ | +-+---+-+ +--+--+-+ |Leaf111| |Leaf112+-----+ |Leaf121| |Leaf122| LEVEL 0 +-------+ +-------+ +-------+ +-------+
Figure 5: A single plane miscabling example
Figure 5 shows a single plane miscabling example. It is a perfect fat-tree fabric except for link-M connecting Leaf112 to ToF22.
The RIFT control protocol can discover the physical links automatically and is able to detect cabling that violates fat-tree topology constraints. It reacts accordingly to such miscabling attempts, at a minimum preventing adjacencies between nodes from being formed and traffic from being forwarded on those miscabled links. In such a scenario Leaf112 will use link-M to derive its level (unless it is configured as a leaf) and can report the links to Spine111 and Spine112 as miscabled unless the implementation allows horizontal links.
Figure 6 shows a multiple plane miscabling example. Since Leaf112 and Spine121 belong to two different PoDs, the adjacency between Leaf112 and Spine121 cannot be formed. The miscabled link-W would be detected and no adjacency would be formed over it.
+-------+ +-------+ +-------+ +-------+ |ToF A1| |ToF A2| |ToF B1| |ToF B2| LEVEL 2 +-------+ +-------+ +-------+ +-------+ | | | | | | | | | | | +-----------------+ | | | | +--------------------------+ | | | | | | | | | | | | | +------+ | | | +------+ | | | +-----------------+ | | | | | | | | +--------------------------+ | | | A | | B | | A | | B | +-----+-+ +-+---+-+ +-+---+-+ +-+-----+ |Spin111| |Spin112| +----+Spin121| |Spin122| LEVEL 1 +-+---+-+ ++----+-+ | +-+---+-+ ++----+-+ | | | | | | | | | | +---------+ | | | +---------+ | | | | | link-W | | | | | +-------+ | | | | +-------+ | | | | | | | | | | | +-+---+-+ +--+--+-+ | +-+---+-+ +--+--+-+ |Leaf111| |Leaf112+------+ |Leaf121| |Leaf122| LEVEL 0 +-------+ +-------+ +-------+ +-------+ +--------PoD#1----------+ +---------PoD#2---------+
Figure 6: A multiple plane miscabling example
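The two miscabling cases can be sketched as a simple link check. This is a deliberately reduced rule set assumed for illustration: a link is accepted when both ends are in the same PoD (or one end has no PoD) and their levels differ by exactly one, or by zero if horizontal links are allowed. The adjacency rules of the RIFT specification contain more cases, and all names below are hypothetical.

```python
def link_status(a, b, level, pod, allow_horizontal=False):
    """Classify a link between nodes a and b.

    level: node -> level
    pod:   node -> PoD identifier; nodes missing from the map belong to no PoD
    """
    pod_a, pod_b = pod.get(a), pod.get(b)
    if pod_a is not None and pod_b is not None and pod_a != pod_b:
        return "miscabled: PoD mismatch"
    difference = abs(level[a] - level[b])
    if difference == 1 or (difference == 0 and allow_horizontal):
        return "ok"
    return "miscabled: level mismatch"


# Figure 5: Leaf112 derived level 1 from ToF22 over link-M.
level5 = {"ToF22": 2, "Spine111": 1, "Leaf112": 1}
pod5 = {"Spine111": 1, "Leaf112": 1}
print(link_status("Leaf112", "ToF22", level5, pod5))
print(link_status("Leaf112", "Spine111", level5, pod5))

# Figure 6: link-W connects Leaf112 (PoD 1) to Spine121 (PoD 2).
level6 = {"Leaf112": 0, "Spine121": 1}
pod6 = {"Leaf112": 1, "Spine121": 2}
print(link_status("Leaf112", "Spine121", level6, pod6))
```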
RIFT provides an optional level determination procedure in its Zero Touch Provisioning mode. Nodes in the fabric without their level configured determine it automatically. This can, however, have counter-intuitive consequences. One extreme failure scenario is depicted in Figure 7: if all northbound links of Spine11 fail at the same time, Spine11 negotiates a lower level than Leaf111 and Leaf112.
To prevent such a scenario, in which leaves would be expected to act as switches, the LEAF_ONLY flag can be set for Leaf111 and Leaf112. Since level -1 is invalid, Spine11 would then not derive a valid level from the topology in Figure 7. It will be isolated from the whole fabric, and it would be up to the leaves to declare the links towards such a spine as miscabled.
+-------+ +-------+ +-------+ +-------+ |ToF A1| |ToF A2| |ToF A1| |ToF A2| +-------+ +-------+ +-------+ +-------+ | | | | | | | +-------+ | | | + + | | ====> | | X X +------+ | +------+ | + + | | | | +----+--+ +-+-----+ +-+-----+ |Spine11| |Spine12| |Spine12| +-+---+-+ ++----+-+ ++----+-+ | | | | | | | +---------+ | | | | | | | | | | +-------+ | | +-------+ | | | | | | | +-+---+-+ +--+--+-+ +-----+-+ +-----+-+ |Leaf111| |Leaf112| |Leaf111| |Leaf112| +-------+ +-------+ +-+-----+ +-+-----+ | | | +--------+ | | +-+---+-+ |Spine11| +-------+
Figure 7: Fallen spine
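The fallen spine case can be worked through numerically. The sketch assumes the simplified rule that a derived level is the highest offered level minus one and that a negative result is invalid. The top-of-fabric level of 24 is an illustrative value, and the function name is hypothetical.

```python
TOP_OF_FABRIC_LEVEL = 24  # illustrative value for the configured ToF level


def derived_level(offered_levels):
    """Highest offered level minus one, or None if no valid level results."""
    if not offered_levels:
        return None
    level = max(offered_levels) - 1
    return level if level >= 0 else None


# Without LEAF_ONLY: every node derives its level via ZTP.
spine12 = derived_level([TOP_OF_FABRIC_LEVEL])
leaf = derived_level([spine12])
fallen_spine11 = derived_level([leaf, leaf])      # only the two leaves remain

# With LEAF_ONLY: both leaves are pinned to level 0.
isolated_spine11 = derived_level([0, 0])

print(spine12, leaf, fallen_spine11, isolated_spine11)
```

In the first case Spine11 ends up one level below the leaves; in the second case it cannot derive any valid level and stays isolated.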
RIFT allows advertising IPv4 prefixes over an IPv6 RIFT network. The IPv6 AF is configured via the usual ND mechanisms, and IPv4 can then use IPv6 next hops analogous to RFC5549. It is expected that the whole fabric supports the same type of forwarding of address families on all the links. RIFT provides an indication of whether a node is IPv4 forwarding capable, and implementations are possible where different routing tables are computed per address family as long as the computation remains loop-free.
+-----+ +-----+ +---+---+ | ToF | | ToF | ^ +--+--+ +-----+ | | | | | | | +-------------+ | | | +--------+ | | | | | | | V6 +-----+ +-+---+ Forwarding |SPINE| |SPINE| | +--+--+ +-----+ | | | | | | | +-------------+ | | | +--------+ | | | | | | | v +-----+ +-+---+ +---+---+ |LEAF | | LEAF| +--+--+ +--+--+ | | IPv4 prefixes| |IPv4 prefixes | | +---+----+ +---+----+ | V4 | | V4 | | subnet | | subnet | +--------+ +--------+
Figure 8: IPv4 over IPv6
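What such a route looks like can be sketched with Python's standard ipaddress module. The FIB layout, the function name and the capability flag are assumptions made for illustration; the sketch only shows an IPv4 prefix being installed with an IPv6 next hop when the neighbor has signalled that it can forward IPv4.

```python
import ipaddress


def install_v4_route(fib, prefix, next_hop, neighbor_v4_capable):
    """Install an IPv4 prefix with an IPv6 next hop if the neighbor forwards IPv4.

    fib maps IPv4Network -> set of IPv6Address next hops.
    Returns True when the route was installed.
    """
    network = ipaddress.ip_network(prefix)
    hop = ipaddress.ip_address(next_hop)
    if network.version != 4 or hop.version != 6:
        raise ValueError("expected an IPv4 prefix and an IPv6 next hop")
    if not neighbor_v4_capable:
        return False
    fib.setdefault(network, set()).add(hop)
    return True


fib = {}
installed = install_v4_route(fib, "192.0.2.0/24", "fe80::1", True)
skipped = install_v4_route(fib, "198.51.100.0/24", "fe80::2", False)
print(installed, skipped, fib)
```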
TODO
TODO
Each RIFT node may operate in zero touch provisioning (ZTP) mode. It has no configuration (unless it is a Top-of-Fabric at the top of the topology or it must operate in the topology as a leaf and/or support leaf-2-leaf procedures) and it will fully configure itself after being attached to the topology.
+---+ +---+ +---+ |ToF| |ToF| |ToF| +---+ +---+ +---+ | | | | | | | +----------------+ | | | | | | | | | +----------------+ | | | | | | | +----------+--+ +--+----------+ | Spine|ToR1 | | Spine|ToR2 | +--+------+---+ +--+-------+--+ +---+ | | | | | | +---+ | | | | | | | | | +-----------------+ | | | | | | +-------------+ | | + | + | | |-----------------+ | X | X | +--------x-----+ | X | + | + | | | + | +---+ +---+ +---+ +---+ | | | | | | | | +---+ +---+ ...............+---+ +---+ SV(1) SV(2) SV(n+1) SV(n)
Figure 9: Dual-homing servers
In a single plane, the worst case is disaggregation of every other server at the same level. Suppose the links from ToR1 to all the leaves become unavailable. All the servers' routes are disaggregated and the FIB of each server will be expanded with n-1 more specific routes.
Sometimes, people may prefer to disaggregate from the ToR to the servers from the start, i.e. the servers have a couple of tens of routes in the FIB from the start, besides default routes, to avoid breakages at rack level. Full disaggregation of the fabric can be achieved by configuration supported by RIFT.
There are many different ways to deploy the controller. One possibility is attaching a controller to the RIFT domain from ToF and another possibility is attaching a controller from the leaf.
+------------+ | Controller | ++----------++ | | | | +----++ ++----+ ---------- | ToF | | ToF | | +--+--+ +-----+ | | | | | | | +-------------+ | | | +--------+ | | | | | | | +-----+ +-+---+ RIFT domain |SPINE| |SPINE| +--+--+ +-----+ | | | | | | | +-------------+ | | | +--------+ | | | | | | | | +-----+ +-+---+ ---------- |LEAF | | LEAF| +-----+ +-----+
Figure 10: Fabric with a controller
If a controller attaches to the RIFT domain from the ToF, it usually uses dual-homing connections. The loopback prefix of the controller should be advertised down by the ToF and spine to the leaves. If the controller loses its link to a ToF, make sure that the ToF withdraws the prefix of the controller (different mechanisms can be used for this).
If the controller is attaching from a leaf to the fabric, no special provisions are needed.
TODO
TODO
+--------+                     +--------+
|        |  LIE           LIE  |        |
|   A    |  +---->     <----+  |   B    |
|        +---------------------+        |
+--------+                     +--------+
   X/24                           Y/24
Figure 11: subnet mismatch
LIEs are exchanged over all links running RIFT to perform Link (Neighbor) Discovery. A node MUST NOT originate LIEs on an address family if it does not process received LIEs on that family. LIEs on the same link are considered part of the same negotiation independent of the address family they arrive on. An implementation MUST be ready to accept TIEs on all addresses it used as source of LIE frames.
As shown in the above figure, without further checks an adjacency between node A and node B may form, but the forwarding between node A and node B may fail because subnet X mismatches with subnet Y.
To prevent this, a RIFT implementation should check for subnet mismatch just like e.g. IS-IS does. This can lead to scenarios where a link, despite the exchange of LIEs in both address families, ends up with an adjacency in a single AF only. This is a consideration especially in Section 4.6 scenarios.
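Such a check can be sketched with Python's standard ipaddress module. The function names and the per-family dictionaries are assumptions made for illustration; the sketch compares the source address of a received LIE with the subnet configured on the local interface, separately for each address family.

```python
import ipaddress


def same_subnet(local_interface, neighbor_address):
    """True if the neighbor's address lies in the local interface's subnet."""
    interface = ipaddress.ip_interface(local_interface)
    address = ipaddress.ip_address(neighbor_address)
    return address.version == interface.version and address in interface.network


def adjacency_families(local_interfaces, neighbor_sources):
    """Address families ('ipv4', 'ipv6') on which the subnets match."""
    matched = []
    for family in ("ipv4", "ipv6"):
        if family in local_interfaces and family in neighbor_sources:
            if same_subnet(local_interfaces[family], neighbor_sources[family]):
                matched.append(family)
    return matched


# Node A's interface addresses and the LIE source addresses seen from node B.
local = {"ipv4": "192.0.2.1/24", "ipv6": "2001:db8::1/64"}
neighbor = {"ipv4": "198.51.100.2", "ipv6": "2001:db8::2"}
print(adjacency_families(local, neighbor))
```

In this example the IPv4 subnets differ, so only the IPv6 adjacency would be usable, which is the single-AF outcome described above.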
+ traffic | v +------+------+ | ToF | +---+-----+---+ | | | | +------------+ | | +------------+ | | | | +---+---+ +-------+ +-------+ +---+---+ | | | | | | | | |Spine11| |Spine12| |Spine21| |Spine22| LEVEL 1 +-+---+-+ ++----+-+ +-+---+-+ ++----+-+ | | | | | | | | | +---------+ | | +---------+ | | | | | | | | | | +-------+ | | | +-------+ | | | | | | | | | | +-+---+-+ +--+--+-+ +-+---+-+ +--+--+-+ | | | | | | | | |Leaf111| |Leaf112| |Leaf121| |Leaf122| LEVEL 0 +-+-----+ ++------+ +-----+-+ +-----+-+ + + + ^ | PrefixA PrefixB PrefixA | PrefixC | + traffic
Figure 12: Anycast
If the traffic comes from the ToF to Leaf111 or Leaf121, both of which have the anycast prefix PrefixA, RIFT can deal with this case well. But if the traffic comes from Leaf122, it will always get to Leaf121 and never get to Leaf111. If the intention is that the traffic should be offloaded to Leaf111, then use policy guided prefixes [PGP reference].
The following people (listed in alphabetical order) contributed significantly to the content of this document and should be considered co-authors:
Tony Przygienda
Juniper Networks
1194 N. Mathilda Ave
Sunnyvale, CA 94089
US
Email: prz@juniper.net