RIFT Working Group Y. Filyurin, Ed.
Internet-Draft Bloomberg LP
Intended status: Informational June 13, 2018
Expires: December 15, 2018
RIFT -- Motivation, Additional Requirements and Use Cases in User Access
Networks
draft-filyurin-rift-access-networks-00
Abstract
RIFT is a new specialized dynamic routing protocol originally
designed for Clos and Fat Tree Data Center networks. It is designed
to work on multilevel network topologies in which nodes in a given
level connect only to nodes in the level directly above or below,
with optional and non-contiguous intra-level connectivity.
While the protocol was originally designed to meet the needs of
Massively Scalable Data Centers, its ability to automatically prune
the information distributed from higher levels to lower levels, as
well as to provide optimal routing for intra- and inter-level traffic,
makes it a good match for user access networks, or for any network
that combines end user access with compute resources providing network
services to these end users. Current directions in distributed
computing seek to blur even that distinction. Large distributed
networks can be created, where virtual compute units can be in all
tiers, combining and crossing many requirements for DC or User Access
design. This draft seeks to analyze these requirements.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on December 15, 2018.
Filyurin Expires December 15, 2018 [Page 1]
Internet-Draft XX June 2018
Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Definitions of Terms Used in This Memo
2. Authors
3. Introduction
4. Additional Requirements for RIFT Access Networks
5. Network Slicing
5.1. Overall Network Slicing
5.2. Identification and Propagation of Slice Information
5.3. Network Instances and RIB and FIB Requirement
5.4. Network Instances and Control Plane
5.5. Network Instances and Forwarding
6. External Routing Information
7. RIFT and Endpoint Address Mobility
7.1. Mobility Use Cases
8. Border Nodes and Superspine East/West traffic
9. Border Nodes and Superspine East/West traffic
10. Security Considerations
11. Conclusions
12. IANA Considerations
13. Acknowledgments
14. References
14.1. Normative References
14.2. Informative References
14.3. URIs
Author's Address
1. Definitions of Terms Used in This Memo
MSDC - Massively Scalable Data Center
IGP - Interior Gateway Protocol
RIB - Routing Information Base
FIB - Forwarding Information Base
MT - Multi-Topology in the context of IS-IS
MI - Multi-Instance in the context of IS-IS
AD - Auto-discovery
UDP - User Datagram Protocol
IID - Instance ID, in the context of control and data plane slicing
of network devices
TIE - Topology Information Element, per original RIFT specification
N-TIE - Northbound Topology Information Element, flooded in the
Northbound direction, per original RIFT specification
S-TIE - Southbound Topology Information Element, propagated in the
Southbound direction, per original RIFT specification
Node TIE - Node Topology Information Element, per original RIFT
specification
Prefix TIE - Prefix Topology Information Element, per original RIFT
specification
Key Value TIE or K/V TIE - A TIE (mainly Southbound) that is
carrying a set of key value pairs, per original RIFT specification
LIE - Link Information Element, per original RIFT specification
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] when, and only when, they appear in all capitals, as
shown here.
2. Authors
The following authors substantially contributed to the current format of
the document:
3. Introduction
Typical access networks are built in a hierarchical fashion using
"Core", "Distribution" and "Access" layers designed to support
collections of wiring distribution blocks that in turn connect to end
user devices, server compute nodes and various forms of utility
devices. This design is just a variation of the Fat Tree design, and
RIFT presents an opportunity to significantly reduce the design
limitations of traditional switched networks, bring seamless mobility
to end systems within the entire access network domain and remove the
operational overhead that comes with provisioning access networks.
All this can be done without forcing lower level network devices to
carry feature sets traditionally found in higher end aggregation
devices.
Decoupling network layer information from device reachability
information allows any network layer information to be propagated,
and thus allows the protocol to be expanded to support routing for
any type of network layer addressing. Use of Policy Guided Prefixes
allows specialized forwarding policies where packets are forwarded
through specialized paths or redirected to specialized service nodes,
such as packet shapers. Use of Key/Value N-TIEs and S-TIEs would
allow propagation of configuration information to facilitate fully
automated deployment and operations. Key/Value TIEs can also be used
to propagate other information that can aid forwarding, such as
interface queuing policies, access control policies or configuration
of auxiliary services such as DHCP relay. The use of IPv6 Link Local
addressing on all infrastructure for exchange of LIEs removes a lot
of operational overhead in bringing up and supporting a RIFT network.
4. Additional Requirements for RIFT Access Networks
The original RIFT specification was created for traditional Data
Center environments. Access networks may call for additional
capabilities. These capabilities are needed because many endpoints
in traditional access environments cannot themselves provide the
traditional delineation between the network infrastructure domain and
the individual workloads running on them, and must rely on the
network edge to provide that delineation.
5. Network Slicing
Network slicing in this context is defined as creating individual,
separate virtual networks within an access network, each connecting a
set of edge devices. The slices are effectively their own virtual Fat
Trees with separate Control Plane data structures holding prefix
information. The protocol processes populate virtual RIBs, which
program the FIB (assuming a common FIB, as in most platforms) to
define instance specific packet identification and per hop forwarding
behavior.
Network slices can also be called network instances. The terms are
often used interchangeably, but a network instance usually refers to a
virtual network construct local to an individual device, whereas a
network slice covers a virtual network carved out of a set of
interconnected devices.
5.1. Overall Network Slicing
The original RIFT specification uses the concepts of Multi-Topology
RFC5120 [1] and Multi-Instance RFC6822 [2] to create network-wide
virtual routing domains. RIFT's ability to form a separate neighbor
relationship for each instance makes the MI approach more appropriate
for creating network slices, allowing multiple virtual Fat Trees to
operate as "ships in the night" with completely separate RIFT
flooding/propagation domains. As part of the initial LIE exchange,
individual adjacencies per instance will be formed, as long as the
nodes can agree on the instance ID. The standard discovery process
can apply, and it could be argued that all auto-configuration
information exchange can happen only in the global instance.
5.2. Identification and Propagation of Slice Information
The process is no different in principle from that of many other forms
of virtual private network services. It starts with Auto-Discovery,
in which nodes hosting a particular instance propagate this
information to other nodes (using Key/Value (K/V) TIEs, for example)
so that individual neighbor relationships can form. Once instance
adjacencies form, all other information can be exchanged and
propagated.
Since RIFT is fundamentally an underlay protocol, and relies on
itself for next hop resolution, instance awareness must exist not just
on the edge devices hosting the instance, but on all transit devices.
In both MT and MI approaches, topologies and instances are explicitly
configured. When provisioning RIFT networks, there must be some
approach to facilitate instance activation on transit devices. Once
a device becomes "instance aware", the LIE exchange can take place
to establish common parameters such as UDP ports, and the neighbor
adjacency can be established using the standard process.
Due to the K/V capabilities of RIFT, there should be no need to define
special Instance ID TLVs or modify the Thrift models. Some external
entity will configure instance parameters and access policies. Once
instance system IDs are established, K/V N-TIEs can be propagated to
higher levels and neighbor adjacencies established.
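As an illustration of carrying instance information in ordinary
key/value pairs, the following Python sketch flattens per-instance
parameters into a K/V set and recovers the list of announced instances
from it. It is not taken from the RIFT specification: the
"instance/<IID>/<field>" key layout, the field names and the sample
values are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceParams:
    """Per-instance parameters a node wants to advertise (hypothetical)."""
    iid: str         # instance ID, e.g. "I11"
    system_id: str   # instance system ID of the advertising node
    udp_port: int    # UDP port proposed for the instance LIE exchange


def encode_kv(params: InstanceParams) -> dict:
    """Flatten instance parameters into key/value pairs.

    The "instance/<IID>/<field>" key layout is an assumption, not a
    format defined by RIFT.
    """
    prefix = f"instance/{params.iid}"
    return {
        f"{prefix}/system-id": params.system_id,
        f"{prefix}/udp-port": str(params.udp_port),
    }


def decode_instances(kv: dict) -> set:
    """Return the instance IDs announced in a set of key/value pairs."""
    return {key.split("/")[1] for key in kv if key.startswith("instance/")}


# A leaf hosting I1 and I11 builds the K/V set it would send northbound.
kv_store = {}
for params in (InstanceParams("I1", "leaf111-i1", 10001),
               InstanceParams("I11", "leaf111-i11", 10011)):
    kv_store.update(encode_kv(params))

print(sorted(decode_instances(kv_store)))
```

A transit device receiving such pairs would only need the decoded
instance IDs to decide which instance data structures to create.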
5.3. Network Instances and RIB and FIB Requirement
While RIFT is an underlay protocol, as soon as individual virtual Fat
Trees are created, packet forwarding on links that are used for
multiple slices can no longer be programmed using standard network
layer information alone. This is a typical case of using a unique
identifier to determine unique per-hop behavior. The price of using
unique identifiers, whether they take the form of shim headers,
special packet metadata or even translation and encapsulation
techniques, is the requirement to create more advanced forwarding
state on the transit network devices. First, there must be an
association that maps a particular identifier to a particular
instance, then an action that makes the forwarding decision
identifying the next hop, and finally an action that adds the right
metadata to the packet, allowing the next hop to perform the same set
of actions.
The problem can be resolved using two standard approaches, short of
deploying multi-operation forwarding devices. Either put the
destination address based forwarding on the edges, which already have
the policies to associate network layer information with instance
information, or create more advanced FIB data structures that map to
hardware operations that allow metadata/address lookup and forwarding
to be done as simple atomic operations.
The first approach to some degree defeats the purpose of using RIFT
as a routing protocol, since access devices would need visibility of
all destinations and the metadata associated with them. The second
approach is more realistic, but these advanced capabilities may only
be available on more advanced devices. Such devices are less likely
to be deployed closer to the edges of a RIFT network, and may conflict
with the requirement for less expensive, less feature-rich access
devices.
Within an MI RIFT domain, there would be three types of forwarding
behavior. The first is found on the leaf devices connecting to
endpoints, which apply policies associating end systems with
instances, impose and dispose of the metadata and forward packets to
transit devices. The second is found on transit devices. These
devices forward exclusively based on metadata, or shim headers,
effectively forwarding traffic to the highest level aggregation
devices. The third is found on the aggregation devices, which
maintain an advanced FIB, process and forward packets, and impose the
metadata used to forward them to the leaf devices.
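The three forwarding behaviors can be sketched as follows. This is a
minimal Python illustration, not an implementation of any particular
data plane: the Packet type, the string tags standing in for metadata
or shim headers, and the table layouts are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    dst: str
    tag: Optional[str] = None   # stands in for metadata or a shim header


def leaf_ingress(pkt: Packet, port: str, port_policy: dict) -> Packet:
    """Leaf: map the ingress port to an instance and impose the tag."""
    pkt.tag = port_policy[port]
    return pkt


def transit_forward(pkt: Packet, tag_to_next_hop: dict) -> str:
    """Transit: pick the next hop from the tag alone, no address lookup."""
    return tag_to_next_hop[pkt.tag]


def aggregation_forward(pkt: Packet, fib: dict) -> str:
    """Aggregation: per-instance address lookup, then re-impose the
    metadata that steers the packet to the destination leaf."""
    leaf, south_tag = fib[(pkt.tag, pkt.dst)]
    pkt.tag = south_tag
    return leaf


def leaf_egress(pkt: Packet) -> Packet:
    """Leaf: dispose of the metadata before delivery to the endpoint."""
    pkt.tag = None
    return pkt


# Hypothetical tables for one packet in instance I-Odd.
pkt = leaf_ingress(Packet(dst="Prefix121"), "eth1", {"eth1": "I-Odd"})
uplink = transit_forward(pkt, {"I-Odd": "Spine21"})
leaf = aggregation_forward(
    pkt, {("I-Odd", "Prefix121"): ("Leaf121", "I-Odd/Leaf121")})
print(uplink, leaf, pkt.tag)
```

Only the aggregation device performs a lookup keyed on both instance
and destination address, which is the point of the split.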
5.4. Network Instances and Control Plane
Taking the example drawing from the original RIFT specification:
. +--------+ +--------+
. | | | | ^ N
. |Spine 21| |Spine 22| |
.Level 2 ++-+--+-++ ++-+--+-++ <-*-> E/W
| | | | | | | | |
. P111/2| |P121 | | | | S v
. ^ ^ ^ ^ | | | |
. | | | | | | | |
. +--------------+ | +-----------+ | | | +---------------+
. | | | | | | | |
. South +-----------------------------+ | | ^
. | | | | | | | All TIEs
. 0/0 0/0 0/0 +-----------------------------+ |
. v v v | | | | |
. | | +-+ +<-0/0----------+ | |
(I1, I11, I-Odd, I-Even)| | | | | |
.+-+----++ optional +-+----++ ++----+-+ ++-----++
.| | E/W link | | | | | |
.|Node111+----------+Node112| |Node121| |Node122|
.+-+---+-+ ++----+-+ +-+---+-+ ++---+--+
. | | | South | | | |
. | +---0/0--->-----+ 0/0 | +----------------+ |
. (I1, I11, I-Odd) | | | | | | |
. | +---<-0/0-----+ | v | +--------------+ | |
. v | (I1, I11, I-Even) | | | |
.+-+---+-+ +--+--+-+ +-+---+-+ +---+-+-+
.| | (L2L) | | | | Level 0 | |
.|Leaf111~~~~~~~~~~~~Leaf112| |Leaf121| |Leaf122|
.+-+-----+ +-+---+-+ +--+--+-+ +-+-----+
. + + \ / + +
. Prefix111 Prefix112 \ / Prefix121 Prefix122
. multi-homed
. Prefix
.+---------- Pod 1 ---------+ +---------- Pod 2 ---------+
A two level spine-and-leaf topology
Assume we take every "Leaf" device (111, 112, 121 and 122) and create
instance I1 on each device, along with the instance policies. At the
same time, leafs 111 and 112 host instance I11 and leafs 121 and 122
host instance I12. Leafs 111 and 121 host I-Odd, and leafs 112 and
122 host I-Even. Northbound K/V TIEs can be used to propagate
instance information and set up instance RIB data structures on the
transit devices. Leafs will have those data structures set up during
instance creation, and transit devices will set them up as soon as
they receive the K/V TIEs. In this example the spines will have RIB
data structures for all the instances. Node 111 and Node 112 should
only have state for I1, I11, I-Odd and I-Even, and Nodes 121 and 122
should have identical state, except that I11 is replaced by I12.
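The resulting instance state can be checked with a small Python
sketch that propagates leaf membership northbound. The link list is
an assumption read from the figure (each leaf connects to both nodes
in its pod, each node to both spines), and the function is an
illustration of the outcome, not of RIFT flooding itself.

```python
# Instances hosted on each leaf, as in the example above.
LEAF_INSTANCES = {
    "Leaf111": {"I1", "I11", "I-Odd"},
    "Leaf112": {"I1", "I11", "I-Even"},
    "Leaf121": {"I1", "I12", "I-Odd"},
    "Leaf122": {"I1", "I12", "I-Even"},
}

# Northbound links (child -> parents), assumed from the figure.
PARENTS = {
    "Leaf111": ["Node111", "Node112"],
    "Leaf112": ["Node111", "Node112"],
    "Leaf121": ["Node121", "Node122"],
    "Leaf122": ["Node121", "Node122"],
    "Node111": ["Spine21", "Spine22"],
    "Node112": ["Spine21", "Spine22"],
    "Node121": ["Spine21", "Spine22"],
    "Node122": ["Spine21", "Spine22"],
}


def propagate_north(leaf_instances, parents):
    """Return {device: instances} once membership information has been
    propagated northbound over every link."""
    state = {dev: set(inst) for dev, inst in leaf_instances.items()}
    pending = list(leaf_instances)
    while pending:
        child = pending.pop()
        for parent in parents.get(child, []):
            known = state.setdefault(parent, set())
            if not state[child] <= known:
                known |= state[child]      # parent learned something new
                pending.append(parent)     # so it must re-advertise
    return state


state = propagate_north(LEAF_INSTANCES, PARENTS)
for device in ("Node111", "Node121", "Spine21"):
    print(device, sorted(state[device]))
```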
The same approach would be applied to forming adjacencies. Once the
initial LIE exchange completes, instance TIEs have been exchanged
between the devices and parameter negotiation is complete, an
instance specific neighbor adjacency can be established. The
creation of all the data structures, TIE flooding and propagation
start at that point.
In the above setup, the leafs maintain the needed Control Plane state
created as part of configuration and propagation of Prefix S-TIEs
from transit nodes. Their 0/0 or ::/0 or any other relevant routing
state within each instance is designed to route packets towards the
spines.
Transit nodes (Node 111, 112, 121 and 122) would have instance
adjacencies with leafs based on which leaf hosts which instance. For
example, all transit nodes will maintain an I1 adjacency with every
leaf, but an I-Even adjacency only with leafs 112 and 122 and an
I-Odd adjacency only with leafs 111 and 121. Since adjacencies are
formed per instance, this is even more flexible than MI-ISIS, and
there is no need for an IID TLV mechanism. A direct association
exists between instance RIB data structures and per instance
adjacencies.
Spines 21 and 22 would create RIB data structures for all the
instances, as the spines are responsible for routing the traffic
between leafs. In our example their adjacencies are still based on
advertised K/V TIEs indicating instance membership. Spines 21 and 22
would have adjacencies with nodes 111 and 112 for all instances
except I12, and with nodes 121 and 122 for all instances except I11.
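One way to read the adjacency rules above is that the instances
running over a link are those known to both of its endpoints. The
Python sketch below computes that intersection for a few links of the
example; the intersection rule itself is an interpretation of the
text rather than a stated RIFT procedure, and only a subset of the
links is listed.

```python
ALL = {"I1", "I11", "I12", "I-Odd", "I-Even"}

# Instance state per device, as described for the example topology.
INSTANCES = {
    "Leaf111": {"I1", "I11", "I-Odd"},
    "Leaf112": {"I1", "I11", "I-Even"},
    "Node111": ALL - {"I12"},
    "Node121": ALL - {"I11"},
    "Spine21": set(ALL),
    "Spine22": set(ALL),
}

# A subset of the links in the figure, enough to show each case.
LINKS = [
    ("Leaf111", "Node111"),
    ("Leaf112", "Node111"),
    ("Node111", "Spine21"),
    ("Node121", "Spine22"),
]


def instance_adjacencies(instances, links):
    """Map each link to the set of instances shared by both endpoints."""
    return {(a, b): instances[a] & instances[b] for a, b in links}


adjacencies = instance_adjacencies(INSTANCES, LINKS)
for link, shared in adjacencies.items():
    print(link, sorted(shared))
```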
All the standard RIFT rules must apply for adjacency establishment on
horizontal links between nodes of the same level. The same rules
must apply for prefix disaggregation and treatment of Policy Guided
Prefix (PGP) TIEs.
A leaf is expected to connect to multiple nodes, and failure of
instance synchronization on the horizontal link indicates either an
outage or an error. It could be up to the implementation to define
the default behavior and the correlation of K/V TIEs with flooded
Node TIEs.
5.5. Network Instances and Forwarding
Leafs apply instance policies, impose and dispose of the metadata and
make forwarding decisions to forward packets to spines through the
transit node devices. This is the primary difference between normal
RIFT operation and per-instance RIFT, and it is designed to address
the forwarding limitations of transit devices, which would otherwise
have to identify the topology, perform the forwarding action within
the context of that topology and potentially add another identifier
for the next device. Leafs therefore must not just forward the
packets, but impose the right information on them, to allow
transparent forwarding to the spines by transit devices. Spines in
turn have the task of identifying the topology, determining the leaf
device for the destination address (for any address schema) and
properly marking the packet for topology identification as it is
forwarded towards the destination leaf.
Techniques for forwarding packets to the spines and then to the
appropriate leafs can be left to implementations or may be hardware
specific: some set-ups are better off with encapsulation, some with
shim headers and some with address manipulation. The forwarding
tables of transit devices must have the information to forward
packets to the spines, and multiple instances can share that
information, as long as the spines can uniquely identify the
instance. This is a potential use case for various techniques,
ranging from simple Label Switched Paths (LSPs) using both label
swapping and forwarding to more complex approaches for path set-up
that use deeper label stacks to identify the devices, the instance
and some other per-hop behavior. This approach, in which all traffic
must transit the spine, conflicts with Requirement #13 as outlined in
the original RIFT draft. That may be quite acceptable in a
traditional user access network, where most of the traffic is
ultimately North/South or has to be North/South due to various
security requirements. But as traffic patterns change, systems
become more distributed and enterprise data processing starts
resembling smaller scale MSDCs, it may be useful to have multiple
levels of devices capable of executing advanced per-hop actions on
packets.
6. External Routing Information
In most environments RIFT will not be the only control plane
protocol. Recent advances in compute virtualization create an
opportunity for designs in which traditional compute hosts run
multiple workloads and network virtualization happens at the network
layer, as opposed to the traditional approach of transport layer
virtualization. As such, individual virtual operating system
instances or virtual processes present their own network layer
addresses. These addresses exist on the network only for the
duration of the workload and in some situations even move. This
applies primarily to Data Center networks, and while such workloads
can be found in access environments, the scale requirements are
unlikely to be significant. In access environments, however, server
compute nodes are replaced by numerous systems that in turn support
mobile devices and special purpose mesh networks.
Mobility will be discussed later, but various control plane protocols
can be deployed on lower level nodes, especially leaf nodes, where
these external protocols are used to create the routing information
used to forward packets to these compute workloads. In addition to
these protocols, Network Admission Control protocols as well as
network discovery protocols can be used to populate device routing
tables. All this routing information must be exchanged with RIFT as
part of an export/import relationship between RIFT and the RIB
manager, or through redistribution between RIFT and the databases of
these protocols. These foreign prefixes are propagated Northbound as
Prefix TIEs, with the ability to carry information that identifies
them as external and additional information allowing non-leaf devices
to treat them in a special way. Prefix TIEs are able to carry an
optional attribute set. As part of this optional set, Route Tags can
be defined and used for external route identification. Aside from
the optional attribute set, there would be no difference between
"internal" and "external" prefixes, as the import process is nearly
identical.
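The import step described above can be sketched in Python as follows.
The PrefixEntry type, the "route_tag" attribute name and the tag
value are hypothetical and only illustrate the idea that an external
prefix differs from an internal one solely by its optional attribute
set.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PrefixEntry:
    """A prefix as it might be carried in a Prefix TIE (hypothetical)."""
    prefix: str
    attributes: dict = field(default_factory=dict)  # optional attribute set


def import_prefix(prefix: str, route_tag: Optional[int] = None) -> PrefixEntry:
    """Import a prefix into RIFT; a route tag marks it as external."""
    entry = PrefixEntry(prefix)
    if route_tag is not None:
        entry.attributes["route_tag"] = route_tag
    return entry


def is_external(entry: PrefixEntry) -> bool:
    """External prefixes are recognized only by the optional attribute."""
    return "route_tag" in entry.attributes


internal = import_prefix("2001:db8:1::/64")
external = import_prefix("2001:db8:2::/64", route_tag=100)
print(is_external(internal), is_external(external))
```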
External routes by default should not be propagated Southbound and
would be subject to the same de-aggregation rules that apply to
normal RIFT operation. External prefixes would only be propagated
Southbound if the node in the southern direction could otherwise
follow the default route in a direction where there would be no
visibility of that route. Implementations should offer the option to
propagate external routes without any explicit configuration.
Situations in which a RIFT domain is used to interconnect other
routing domains can be a match for this requirement.
RIFT is not meant to become an inter-domain routing protocol, but
various forms of stub networks of many compute and transit entities
using other specialized routing protocols could be interconnected
using RIFT domain, as well as connecting to other external systems.
7. RIFT and Endpoint Address Mobility
Most endpoint addressing, including network addressing, belongs to
fixed locations, as the network address is associated with a
connecting interface. When service endpoints have their own
addresses that exist independently of network addresses, this
separation ultimately creates the need for address mobility.
Endpoint address mobility is both the ability to move the association
of any endpoint address to any network device interface and the
ability to reuse any endpoint address anywhere in the mobility
domain.
Numerous traditional approaches exist, ranging from combining locator
and endpoint in a single address, or keeping all endpoints in a
continuous broadcast domain that relies on auto-discovery mechanisms,
to various centralized and distributed Locator/Endpoint mapping
systems that keep track of endpoint mobility. Much work went into
making these approaches scalable at various networking layers, but
the problem has been reduced to one of distributed dynamic routing:
the ability to re-advertise the address of an endpoint so that
traffic is rerouted to it through a different locator.
7.1. Mobility Use Cases
Actual mobility use cases may include activation, deactivation and
moves of virtual compute systems in both server and access
environments. These can range from virtual servers serving clients
to virtual nodes in various peer-to-peer applications. Other use
cases may include activation, deactivation and association of
wireless nodes to different point-to-point, point-to-multipoint and
mesh wireless networks. Whether the locator is a physical compute
node or a wireless access point, the locator serves as the boundary
between the static locator and dynamic endpoint networks. RIFT takes
on dynamic routing in the former to support access to the latter.
RIFT support for mobility is defined in the Mobility section of the
RIFT specification. The fundamental requirement for mobile node
management systems, whether centralized or distributed, is to notify
RIFT when mobility events happen, either through a
redistribution/import mechanism or directly, as RIFT does not have a
native purge mechanism. RIFT will then ensure that the network state
needed to route to the right locator is maintained using time stamp
and sequence counter mechanisms. Both unicast and anycast routing
can be supported.
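A minimal Python sketch of how two advertisements of the same
endpoint address might be compared follows. The ordering used here
(time stamps first, sequence counters as a tie-breaker when the time
stamps fall within a clock tolerance) and the 200 ms tolerance are
assumptions for illustration; the RIFT specification remains the
authority on the actual rules. Counter wrap-around is ignored.

```python
from dataclasses import dataclass

CLOCK_TOLERANCE = 0.2   # seconds; assumed value, not taken from RIFT


@dataclass(frozen=True)
class Attachment:
    """An advertisement of one endpoint address behind one locator."""
    locator: str
    timestamp: float    # seconds, from an assumed synchronized clock
    sequence: int       # per-endpoint counter, wrap-around ignored


def preferred(a: Attachment, b: Attachment,
              tolerance: float = CLOCK_TOLERANCE) -> Attachment:
    """Return the attachment that should win for routing purposes."""
    if abs(a.timestamp - b.timestamp) > tolerance:
        # Clearly different points in time: the later one wins.
        return a if a.timestamp > b.timestamp else b
    # Too close to order by clock: fall back to the sequence counter.
    return a if a.sequence >= b.sequence else b


old = Attachment("Leaf111", timestamp=100.0, sequence=5)
moved = Attachment("Leaf121", timestamp=105.0, sequence=1)
print(preferred(old, moved).locator)
```

Because the newer advertisement simply supersedes the older one, no
explicit purge of the old attachment is needed.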
Address mobility should be supported in both a single global instance
and multi-instance configurations. For non-global instances RIFT
operation should be no different, and each instance would maintain
its own data structures keeping track of timestamps and sequence
numbers. As stated in the original RIFT specification, mobility can
be defined as a service and supported through a separate instance.
If so, the transit nodes between leafs and super-spines either
forward encapsulated packets or are programmed to process just shim
headers and metadata. While this does not reduce the control plane
effort needed to perform mobility at scale (as RIFT is an underlay
protocol), it would reduce FIB sizes and minimize the data plane
requirements.
As outlined in the original RIFT specifications, some environments
would already be designed to support mobility using other techniques
for locator/endpoint separation such as LISP or ILA. While RIFT can
assist these protocols with providing the needed configuration to the
leaf nodes, such as instance mapping and resolver information, the
two systems operate independently.
8. Border Nodes and Superspine East/West traffic
Border Nodes are special purpose leaf nodes connected directly to the
top level of the hierarchy. They may run a foreign routing protocol
and will often be used to interconnect to different networks. Most
of the external routes, including the default routes, would be
originated from them. The first approach is to treat these devices
as any other nodes in the hierarchy. They will assume a lower level,
flood N-TIEs and receive standard Node S-TIEs and all Prefix S-TIEs.
They are just regular leafs, but because of their function, they are
capable of propagating external routing information and also receive
all Prefix TIEs, as opposed to just the originated default.
The second approach is to have a mechanism to treat the super-spine,
other interconnected super-spines and border nodes (which become
super-spines at this point) as part of a single flood domain. This
is similar to treating super-spines as a traditional backbone area in
OSPF or a Level 2 domain in IS-IS. All N-TIEs are flooded on all
links in the highest available level.
This is a request for the only allowed exception to the original
specification, which explicitly states that neither N-SPF nor S-SPF
can provide full loop prevention capability, as the entire Fat Tree
design is not based on continuous connectivity at any level. If the
super-spine domain becomes its own Link State flooding domain, or
East/West TIEs are introduced, then Prefix S-TIEs must be used to
populate the RIB if they are available. In addition, East/West TIEs
can never be used to propagate information as S-TIEs. Either
super-spines do not get to act as backbone areas, or techniques such
as those used for route leaking have to be employed.
9. Border Nodes and Superspine East/West traffic
While super-spines represent the top of the hierarchy and bring
various design ideas and caveats, such as a continuous super-spine
domain, leafs can also take advantage of certain topology
optimizations. In certain set-ups, especially in large campus and
metro area networks, leaf connectivity can be deployed in a "daisy
chain" fashion. In such a set-up, a set of leaf devices is
interconnected, and the "leftmost" and "rightmost" devices provide
connectivity to higher levels of the tree. Similar to the
interconnected super-spine concept, this violates some of the design
principles of the Fat Tree topology, and some accommodations for it
in the RIFT protocol may be required.
RIFT is not designed to provide full ring protection, unless the ring
consists of 2-3 nodes (becoming either an interconnected single tier
or leaf/spine with a single leaf). A ring of more than 3 nodes
becomes a broken Fat Tree topology. Before considering a multilevel
RIFT environment in which the bottom level is a daisy chain of leafs,
we can try a simple ring approach. Assuming two adjacent nodes on
the ring can be configured as SUPER_SPINE, it is theoretically
possible that all other nodes of the newly formed "half-ring" could
have a level assigned to them and, depending on the number of nodes
in the ring, one or two nodes would become level 0 leafs. If we want
all the remaining nodes to become leafs, then either the nodes must
be explicitly configured as LEAF_ONLY, or the links from the two
"aggregation nodes" to the leaf nodes must be configured to
explicitly tell the attached nodes that they are leafs, which those
leaf nodes must continue propagating. This may require creation of
another flag used in adjacency formation.
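The claim that one or two nodes end up at level 0 can be illustrated
with a short Python sketch. It assigns levels purely by hop distance
from the nearest of the two SUPER_SPINE nodes; this is a simplified
stand-in for illustration and is not the level derivation procedure
of the RIFT specification. The node names are hypothetical.

```python
def ring_levels(ring, top):
    """Assign a level to every node of a ring.

    ring: node names in ring order (last node links back to the first).
    top:  the two adjacent nodes configured as SUPER_SPINE.
    The level is the highest distance found minus the node's own hop
    distance to the nearest top node, so the farthest nodes get level 0.
    """
    n = len(ring)
    top_idx = [ring.index(name) for name in top]

    def hops(i, j):
        d = abs(i - j)
        return min(d, n - d)

    if hops(*top_idx) != 1:
        raise ValueError("the two top nodes must be adjacent on the ring")

    dist = {node: min(hops(i, t) for t in top_idx)
            for i, node in enumerate(ring)}
    highest = max(dist.values())
    return {node: highest - d for node, d in dist.items()}


# Five-node ring: one node lands on level 0.
print(ring_levels(["S1", "S2", "A", "B", "C"], ("S1", "S2")))
# Six-node ring: two nodes land on level 0.
print(ring_levels(["S1", "S2", "A", "B", "C", "D"], ("S1", "S2")))
```

With an odd number of nodes in the half-ring a single middle node is
farthest from both top nodes; with an even number there are two.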
Assume the correct adjacencies have been formed and we have a set of
two nodes, Node1 and Node2, at level 1 and a set of leafs, Leaf1,
Leaf2 and Leaf3, where Leaf1 connects to Node1 and Leaf2, Leaf2
connects to Leaf1 and Leaf3, and Leaf3 connects to Leaf2 and Node2.
Node1 and Node2 can either have a direct E/W link or just links to
nodes at level 2.
The first design violation is breaking one of the Leaf-to-Leaf rules,
which states that only the N-TIEs originated by a particular leaf are
sent over an East/West Leaf-to-Leaf link. Since the leaf devices in
a daisy chain are part of the same level, this rule could be relaxed,
as N-TIEs from leafs in the chain can be propagated to higher levels,
which run N-SPF and deal with a partitioned leaf network. The
condition for this relaxation can be that devices in the daisy chain
ultimately rely on S-SPF only, based on what is propagated in S-TIEs.
S-TIEs in turn get propagated in both directions along the chain
without being sent Northbound.
Endpoints on devices in the half-ring rely on S-TIEs both to reach
other endpoints in this sub-topology and to reach endpoints outside
the half-ring. N-TIEs are propagated to allow endpoints outside the
half-ring to reach endpoints in the half-ring. A partition within
the half-ring would have to trigger the reflooding of the N-TIEs, as
well as propagation of the S-TIEs. This may be the only possible
situation in which a purge-like Southbound mechanism is used, but
ultimately the direction is not Southbound, but East-West.
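The effect of a partition can be illustrated on the example topology
of Node1, Leaf1, Leaf2, Leaf3 and Node2. The sketch below is a plain
breadth-first search over the half-ring links and is not a RIFT
procedure; the constant and function names are hypothetical.

```python
from collections import deque

# Links of the example half-ring described in this section.
HALF_RING = [
    ("Node1", "Leaf1"),
    ("Leaf1", "Leaf2"),
    ("Leaf2", "Leaf3"),
    ("Leaf3", "Node2"),
]
LEAFS = {"Leaf1", "Leaf2", "Leaf3"}


def leafs_reachable_from(start, links):
    """Return the leafs reachable from `start` over the given links."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for peer in neighbors.get(node, ()):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return seen & LEAFS


# Fail the Leaf1-Leaf2 link and see what each aggregation node reaches.
broken = [link for link in HALF_RING if link != ("Leaf1", "Leaf2")]
print(sorted(leafs_reachable_from("Node1", broken)))
print(sorted(leafs_reachable_from("Node2", broken)))
```

With the Leaf1-Leaf2 link down, Node1 reaches only Leaf1 and Node2
reaches only Leaf2 and Leaf3, so each aggregation node needs refreshed
N-TIEs to learn which leafs are still behind it.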
10. Security Considerations
Access environments are less trusted environments. RIFT is designed
to make it possible for a device to join the network without much
extra configuration. The protocol was designed to simplify
operations, but at the price of making it a lot easier for devices to
become part of the network. In many MSDC environments the devices
are deployed to come online with special interfaces that connect to a
dedicated Out-of-Band (OOB) management network. Not only does this
process preconfigure these devices, the system also ensures that the
initial configuration, as well as the software and firmware, is of
the version that a particular enterprise considers secure. Devices
deployed in more remote locations, or just those without out-of-band
management network connectivity, may not go through this initial
configuration, and their general physical security is lower.
Security procedures for neighbor authentication therefore become a
lot more critical.
Implementing Secure Neighbor Discovery [RFC3971] would make
unauthorized attachment to the network more difficult, and
implementing a protocol that supports encryption could keep protocol
communications secure.
A number of activities in the 6lo working group have developed ideas
that could create a more secure way for RIFT neighbors to
authenticate each other before forming an adjacency. Of note is the
work done in RFC6775 [3] as well as its extension that addresses
security.
11. Conclusions
RIFT started out as a Data Center protocol, and will evolve in that
direction, allowing greater scalability in building multi-tier
fabrics. As the requirements of MSDCs and larger access environments
start to look very similar, and as end user compute and server
compute start performing very similar functions, there will be more
similarities between end user mobility and workload mobility than
there are differences. RIFT and its enhancements, which combine many
aspects of control and management planes, can become the IGP these
environments have been waiting for.
12. IANA Considerations
At this point there is no need for any allocations.
13. Acknowledgments
The author would like to acknowledge Antoni Przygienda for the
original feedback and, hopefully, for steering this document in the
right direction.
14. References
14.1. Normative References
[I-D.ietf-rift-rift]
Przygienda, T., Sharma, A., Thubert, P., Atlas, A., and J.
Drake, "RIFT: Routing in Fat Trees", draft-ietf-rift-
rift-01 (work in progress), April 2018.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
14.2. Informative References
[RFC3971] Arkko, J., Ed., Kempf, J., Zill, B., and P. Nikander,
"SEcure Neighbor Discovery (SEND)", RFC 3971,
DOI 10.17487/RFC3971, March 2005,
<https://www.rfc-editor.org/info/rfc3971>.
[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
"Neighbor Discovery for IP version 6 (IPv6)", RFC 4861,
DOI 10.17487/RFC4861, September 2007,
<https://www.rfc-editor.org/info/rfc4861>.
[RFC5120] Przygienda, T., Shen, N., and N. Sheth, "M-ISIS: Multi
Topology (MT) Routing in Intermediate System to
Intermediate Systems (IS-ISs)", RFC 5120,
DOI 10.17487/RFC5120, February 2008,
<https://www.rfc-editor.org/info/rfc5120>.
[RFC6550] Winter, T., Ed., Thubert, P., Ed., Brandt, A., Hui, J.,
Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur,
JP., and R. Alexander, "RPL: IPv6 Routing Protocol for
Low-Power and Lossy Networks", RFC 6550,
DOI 10.17487/RFC6550, March 2012,
<https://www.rfc-editor.org/info/rfc6550>.
[RFC8202] Ginsberg, L., Previdi, S., and W. Henderickx, "IS-IS
Multi-Instance", RFC 8202, DOI 10.17487/RFC8202, June
2017, <https://www.rfc-editor.org/info/rfc8202>.
14.3. URIs
[1] https://tools.ietf.org/html/rfc5120
[2] https://tools.ietf.org/html/rfc6822
[3] https://tools.ietf.org/html/rfc6775
Author's Address
Yan Filyurin (editor)
Bloomberg LP
731 Lexington Ave.
New York, NY 10022
US
EMail: yfilyurin@bloomberg.net