SPRING Working Group | P. Sarkar, Ed. |
Internet-Draft | Arrcus, Inc. |
Intended status: Standards Track | H. Gredler |
Expires: October 13, 2017 | RtBrick Inc. |
| C. Filsfils |
| S. Previdi |
| Cisco Systems, Inc. |
| B. Decraene |
| Orange |
| M. Horneffer |
| Deutsche Telekom |
| April 11, 2017 |
Anycast Segments in MPLS based Segment Routing
draft-ietf-spring-mpls-anycast-segments-01
Instead of forwarding to a specific device or to all devices in a group, anycast addresses let network devices forward a packet to (or steer it through) one or more topologically nearest devices in a specific group of network devices. The use of anycast addresses has been extended to Segment Routing (SR) networks, wherein a group of SR-capable devices can represent an anycast address by having the same Segment Routing Global Block (SRGB) provisioned on all the devices, with each of them advertising the same anycast prefix segment (or Anycast SID).
This document describes a proposal for implementing anycast prefix segments in an MPLS-based SR network without the need to have the same SRGB (label ranges) provisioned across all the member devices in the group. Each node can be provisioned with a separate SRGB from the label range supported by the specific hardware platform.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on October 13, 2017.
Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Anycast is a network addressing scheme and routing methodology in which packets from a single source device are forwarded to the topologically nearest node in a group of potential receiving devices, all identified by the same anycast address. There are various useful use cases for anycast addresses; discussion of them is outside the scope of this document.
[I-D.ietf-spring-segment-routing] extended the use of anycast addresses to SR networks. An operator may combine a group of SR-enabled nodes to form an anycast group by picking an anycast address and a segment identifier (hereafter referred to as SID) to represent the group, and then provisioning all the nodes with the same address and SID. Once provisioned, each device in the group advertises the corresponding anycast address in its IGP link-state advertisements along with the provisioned SID. Source devices, on receiving such anycast prefix segment advertisements, determine the topologically nearest device that originated the anycast segment and forward packets destined to it on the shortest path to that device.
[I-D.ietf-spring-segment-routing] requires all devices in a given anycast group to implement the exact same SRGB(s). This requirement will always be met in SR networks deployed over the IPv6 forwarding plane [I-D.previdi-6man-segment-routing-header]. For SR over the MPLS dataplane [I-D.ietf-spring-segment-routing-mpls], while this is a simple (and hence more desirable) solution, it may not be possible in multi-vendor networks that deploy devices with varying hardware capabilities.
In MPLS-based SR deployments, the segments on a given source router are actually mapped to MPLS labels allocated from the local label pool carved out by the device for accommodating the SRGB. In multi-vendor deployments with various types of devices deployed in the same network topology, such an anycast group may contain a mix of devices from different vendors with different internal hardware capabilities. In such environments it cannot be assumed that all the devices in an anycast group will be able to allocate exactly the same range of labels for implementing the SRGB. In reality, getting a common range of labels among all the various vendors may not be feasible.
This document provides mechanisms to implement anycast segments with any kind of device in a multi-vendor network deployment without requiring the exact same range of labels to be provisioned for the SRGB on all the devices.
To better illustrate the problem let us consider an example topology using anycast segments as shown in Figure 1 below.
                     +--------------+
                     |   Group A    |
                     | 192.1.1.1/32 |
                     |   SID:100    |
                     |              |
             +-----------A1---A3----------+
             |       |   | \ / |    |     |
   SID:10    |       |   |  /  |    |     |   SID:30
 1.1.1.1/32  |       |   | / \ |    |     |  1.1.1.3/32
    PE1------R1----------A2---A4---------R3------PE3
      \     /|       |              |     |\     /
       \   / |       +--------------+     | \   /
        \ /  |                            |  \ /
         /   |                            |   /
        / \  |                            |  / \
       /   \ |       +--------------+     | /   \
      /     \|       |              |     |/     \
    PE2------R2----------B1---B3----+----R4------PE4
 1.1.1.2/32  |       |   | \ / |    |     |  1.1.1.4/32
   SID:20    |       |   |  /  |    |     |   SID:40
             |       |   | / \ |    |     |
             +-------+---B2---B4----+-----+
                     |              |
                     |   Group B    |
                     | 192.1.1.2/32 |
                     |   SID:200    |
                     +--------------+
Figure 1: Topology 1
In Figure 1 above, there are two groups of transit devices. Group A consists of devices {A1, A2, A3 and A4}. They are all provisioned with the anycast address 192.1.1.1/32 and the anycast SID 100. Similarly, group B consists of devices {B1, B2, B3 and B4}, which are all provisioned with the anycast address 192.1.1.2/32 and the anycast SID 200. In the above network topology, each PE device is connected to two routers in each of the groups A and B.
Following are all the possible ECMP paths between the various pairs of PE devices.
As seen above, there are always eight ECMP paths between each pair of PE devices. The network operator may not wish to utilize all possible ECMP paths for all possible types of traffic flowing between a given pair of PE devices. It may be more useful to use paths P1, P2, P3 and P4 for certain types of traffic and paths P5, P6, P7 and P8 for all other types of traffic between the same PE devices. If so desired, operators may use the anycast groups A and B and the corresponding anycast segments to impose a segment-list (refer to [I-D.ietf-spring-segment-routing]) that forwards the respective traffic flows over the desired paths. Figure 2 below depicts an expanded view of the paths via group A. The label ranges allocated for the SRGB on each of the devices in group A are also shown in this diagram.
                    +-------------------------+
                    |         Group A         |
                    |      192.1.1.1/32       |
                    |         SID:100         |
                    |-------------------------|
                    |                         |
                    |   SRGB:         SRGB:   |
   SID:10           |(1000-2000)   (3000-4000)|           SID:30
   PE1---+       +-------A1-------------A3-------+       +---PE3
          \     /   |    | \           / |    |   \     /
           \   /    |    |  +-----+   /  |    |    \   /
    SRGB:   \ /     |    |         \ /   |    |     \ /   SRGB:
 (7000-8000) R1     |    |          \    |    |      R3 (6000-7000)
            / \     |    |         / \   |    |     / \
           /   \    |    |  +-----+   \  |    |    /   \
          /     \   |    | /           \ |    |   /     \
   PE2---+       +-------A2-------------A4-------+       +---PE4
   SID:20           |   SRGB:         SRGB:   |           SID:40
                    |(2000-3000)   (4000-5000)|
                    |                         |
                    +-------------------------+
Figure 2: Transit paths via anycast group A
In the above topology, if device PE1 (or PE2) needs to send a packet to device PE3 (or PE4), it needs to encapsulate the packet in an MPLS payload with the following stack of labels.
While the first label is easy to compute, in this case there is more than one topologically nearest device (A1 and A2), so unless A1 and A2 implement the exact same SRGB, determining the second label is impossible. In all likelihood, devices A1 and A2 may come from different hardware vendors and may not implement the exact same SRGB label ranges. In such cases, separate labels are allocated by A1 and A2 (1030 and 2030 respectively, in the above example). Hence, PE1 (or PE2) cannot compute an appropriate label stack to steer the packet exclusively through the group A devices. The same holds true for devices PE3 and PE4 when trying to send a packet to PE1 or PE2.
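The ambiguity described above can be made concrete with a short sketch. It is illustrative only: the SRGB bases come from Figure 2, while the `SRGB_BASE` dictionary and the `local_label` helper are hypothetical names introduced here and are not part of the proposal.

```python
# SRGB base labels for A1 and A2, taken from Figure 2.
SRGB_BASE = {"A1": 1000, "A2": 2000}


def local_label(device: str, sid_index: int) -> int:
    """Label a device maps to a SID index: its SRGB base plus the index."""
    return SRGB_BASE[device] + sid_index


# Label that each nearest group A member expects for PE3's prefix SID 30.
candidates = {device: local_label(device, 30) for device in ("A1", "A2")}

# More than one distinct value means the ingress has no single valid
# second label to push behind the anycast segment label.
ambiguous = len(set(candidates.values())) > 1
print(candidates, ambiguous)
```

Because the two candidate labels differ, whichever value the ingress picks would be misinterpreted by the other group member.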
This document introduces the term 'Common-Anycast SRGB' (hereafter referred to as the CA-SRGB) to define the SRGB implemented by the majority of the devices in the network that participate in one or more anycast segments. Each device MUST implement provisions to let operators assign the CA-SRGB on the device. Each vendor implementation MUST implement provisions to configure the CA-SRGB at all configuration levels (per-routing-instance/per-protocol/per-topology etc.) at which provisions to configure the local SRGB label ranges have also been implemented. Essentially, for each SRGB configured on the device, vendor implementations MUST allow configuring a corresponding CA-SRGB value.
For each configuration level (per-routing-instance/per-protocol/per-topology etc.) supported, the operator MUST set the same exact CA-SRGB on all devices across the entire IGP domain (including different IS-IS levels and OSPF areas). This ensures the proposal specified in Section 3.2.1 works flawlessly across all devices in any multi-vendor network deployment.
However, assigning the CA-SRGB (for a given routing-instance/protocol/topology etc.) on the device does not mean that the label ranges allocated by the device for the corresponding SRGB have to belong to the CA-SRGB defined. The device may dynamically allocate the corresponding SRGB label ranges, or allocate the range provisioned by the operator, through an appropriate separate configuration (please refer to [I-D.ietf-spring-sr-yang] for more details).
For devices whose local SRGB is exactly the same as the CA-SRGB applicable to the entire network, operators need not explicitly set the corresponding CA-SRGB values. In such a case, vendor implementations MUST assume the local SRGB values to be the corresponding CA-SRGB values defined on the specific device.
If the CA-SRGB defined on a device does not exactly match the corresponding SRGB label ranges allocated (or provisioned) on the same device (i.e. the CA-SRGB is not an exact copy of the corresponding SRGB label ranges), and the device is provisioned with one or more anycast prefix segments, the device MUST implement all the additional functionalities specified in Section 3.2.2, Section 3.2.3 and Section 3.2.4. On devices where the SRGB label ranges are an exact copy of the corresponding CA-SRGB defined, the device need not implement these additional functionalities (Section 3.2.2, Section 3.2.3 and Section 3.2.4).
For each anycast prefix segment, this document also defines a 'Common Anycast Prefix Segment Label' (hereafter referred to as CAPSL). The value of this label is derived by applying the SID index associated with the prefix segment as an offset to the CA-SRGB configured on the specific device. Since the operator MUST configure the same CA-SRGB values on all devices in the IGP domain, all devices shall associate the same CAPSL label value with a given anycast prefix segment. Table 1 below shows the CAPSL labels allocated by any device for the various prefix segments found in Figure 2, with CA-SRGB set to 2000-3000 on all devices.
SID | CA-SRGB Range | CAPSL Value |
---|---|---|
10 | 2000-3000 | 2010 |
20 | 2000-3000 | 2020 |
30 | 2000-3000 | 2030 |
40 | 2000-3000 | 2040 |
100 | 2000-3000 | 2100 |
This document also introduces the term 'Anycast Prefix Segment Label' (hereafter referred to as APSL) to define the label allocated by a device to advertise reachability for the specific anycast prefix segment. The value of this label is derived by applying the SID index associated with the anycast prefix segment as an offset to the SRGB of the specific device. Table 2 below shows the labels allocated by the various devices in Figure 2 for the anycast prefix segment with SID 100.
Anycast-SID | Device | SRGB | APSL-Label |
---|---|---|---|
100 | R1 | 7000-8000 | 7100 |
100 | A1 | 1000-2000 | 1100 |
100 | A2 | 2000-3000 | 2100 |
100 | A3 | 3000-4000 | 3100 |
100 | A4 | 4000-5000 | 4100 |
100 | R3 | 6000-7000 | 6100 |
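The derivations behind Table 1 and Table 2 can be sketched as follows. This is a minimal illustration, not a normative procedure: the ranges are copied from Figure 2 and Table 1, the names `CA_SRGB`, `SRGB` and `label_from` are hypothetical, and treating the upper bound of a range as inclusive is an assumption made here.

```python
from typing import Dict, Tuple

Range = Tuple[int, int]

# CA-SRGB used in Table 1 and local SRGBs used in Table 2.
CA_SRGB: Range = (2000, 3000)
SRGB: Dict[str, Range] = {
    "R1": (7000, 8000),
    "A1": (1000, 2000),
    "A2": (2000, 3000),
    "A3": (3000, 4000),
    "A4": (4000, 5000),
    "R3": (6000, 7000),
}


def label_from(block: Range, sid_index: int) -> int:
    """Apply a SID index as an offset to the start of a label block."""
    low, high = block
    label = low + sid_index
    if label > high:  # assumption: the upper bound is inclusive
        raise ValueError(f"SID index {sid_index} does not fit in {block}")
    return label


# CAPSL: the same value on every device, because the CA-SRGB is domain-wide.
capsl = {sid: label_from(CA_SRGB, sid) for sid in (10, 20, 30, 40, 100)}

# APSL for anycast SID 100: differs per device, because each SRGB differs.
apsl = {device: label_from(block, 100) for device, block in SRGB.items()}
```

The contrast between the two dictionaries is the point of the proposal: `capsl` depends only on the SID index, while `apsl` also depends on which device allocated the label.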
An MPLS device that tries to encapsulate any kind of traffic into an SR-based MPLS payload (hereafter referred to as the ingress device) and steer it through a series of SR adjacency and/or unicast/anycast prefix segments needs to compute an appropriate stack of MPLS labels and put it in the outgoing packet. Alternatively, in an SDN environment, the SDN controller may need to compute the label stack and install it on the ingress device.
However, in both cases, as illustrated in Section 2, for a given ingress device (e.g. PE1 or PE2), there may be multiple topologically nearest devices in a specific anycast group (e.g. A1 and A2), even though there is only one outgoing link from the source device (e.g. PE1->R1 or PE2->R1). In such a case, when the ingress device (or the SDN controller) wants to steer a packet through the anycast group A, it can use the anycast segment label advertised by the downstream neighbor of the ingress device for the specific anycast prefix segment. Since the packet may reach any one of the multiple devices in the group, and each of them may have a separate SRGB label range, choosing the MPLS label for the next segment providing reachability to the final destination is not straightforward. Also, since a packet steered through an anycast segment can reach any of the member devices in the anycast group, it follows that the ingress (or the controller) cannot place an adjacency segment immediately after an anycast segment in the outgoing packet.
This document proposes that the ingress device (or the SDN controller) derive the label for a prefix segment that immediately follows a given anycast segment as the CAPSL label associated with the corresponding SID index (refer to Section 3.1.2). Note that the prefix segment immediately following the given anycast segment may itself be another anycast segment.
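The rule stated in the paragraph above can be sketched in code. This is only an illustration of that one rule and does not reproduce the normative steps of this section. The `Segment` type and `compute_label_stack` function are hypothetical; taking the top label from the downstream neighbor's SRGB follows the text above, while taking the label after a unicast prefix segment from the SRGB of the node that owns it is an assumption based on conventional SR-MPLS behavior.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

CA_SRGB_BASE = 2000  # start of the CA-SRGB {2000-3000} used in Figure 3


@dataclass(frozen=True)
class Segment:
    sid_index: int
    owner: Optional[str]  # node owning a unicast prefix SID; None for an anycast SID


def compute_label_stack(
    segments: List[Segment], next_hop: str, srgb_base: Dict[str, int]
) -> List[int]:
    """Return the label stack (top label first) for a list of prefix segments."""
    stack: List[int] = []
    previous: Optional[Segment] = None
    for segment in segments:
        if previous is None:
            # Top label: read by the ingress device's downstream neighbor.
            base = srgb_base[next_hop]
        elif previous.owner is None:
            # Follows an anycast segment: use the CAPSL (CA-SRGB based) label.
            base = CA_SRGB_BASE
        else:
            # Assumption: follows a unicast segment, so the owner's SRGB applies.
            base = srgb_base[previous.owner]
        stack.append(base + segment.sid_index)
        previous = segment
    return stack


# PE1 steering traffic to PE3 (SID 30) through anycast group A (SID 100) via R1.
stack = compute_label_stack(
    [Segment(100, None), Segment(30, "PE3")], "R1", {"R1": 7000}
)
print(stack)
```

For the example in Figure 2 this yields the labels 7100 and 2030, the same stack that PE1 imposes in Figure 4.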
The ingress (or the SDN controller) MUST follow the algorithm below to compute the label-stack, that it must use to steer a packet through a list of SR segments.
When an MPLS packet on the wire first hits a device, the forwarding hardware reads the topmost label in the MPLS header and looks it up in the default label lookup table associated with the interface on which the packet has been received. This table is generally called the LFIB. The range of labels found in the LFIB constitutes the default label space.
This document introduces a separate virtual label lookup table (hereafter referred to as Virtual LFIB or V-LFIB) that represents a label space separate from the actual label space represented by the default LFIB. A label value may be present in both the default LFIB and the Virtual LFIB. However, the forwarding semantics associated with the label under the default and Virtual LFIB may not be the same. Following are the fields of a typical entry of this table.
This document proposes that any device that is provisioned with one or more anycast prefix segments (address and SID), and on which the CA-SRGB defined by the operator is not an exact copy of the corresponding SRGB label ranges allocated by the device, MUST create a Virtual LFIB table.
Such a device MUST add an entry in the Virtual LFIB for each unicast and anycast prefix segment learnt from a remote device, if and only if the same prefix has not been provisioned on the device. The device SHOULD NOT add an entry for any of the Anycast or Node prefix segments that it has advertised itself. However, if the device has learnt any anycast prefix segment from a remote device, and the same is not provisioned on this device, the device MUST include it in the Virtual LFIB table.
In cases where a prefix segment is reachable via multiple shortest paths on a given device, the corresponding entry for the prefix SID MUST have as many forwarding entries in the Virtual LFIB table as the number of shortest-paths found for the corresponding prefix on the device.
Figure 3 below shows what the Virtual LFIB table on each of the devices in group A should look like. Please note that some of the prefix segments have multiple forwarding semantics associated with them. For example, on device A1, the prefix SID 30 (originated by PE3) is reachable through its neighbors A3 and A4. As per the SRGBs advertised by A3 and A4, the labels allocated by A3 and A4 are 3030 and 4030 respectively. Hence A1 has added two forwarding entries for the prefix SID 30 in its Virtual LFIB table.
CA-SRGB configured on all devices: {2000-3000}

+========+=============+================================+
|        |             |      Forwarding Semantics      |
| Device | CAPSL-Label |--------------------------------|
|        |             | Outgoing-Label | Outgoing-Link |
+========+=============+================+===============+
|   A1   |    2010     |      7010      |    A1->R1     |
|        +-------------+----------------+---------------+
|        |    2020     |      7020      |    A1->R1     |
|        +-------------+----------------+---------------+
|        |    2030     |      3030      |    A1->A3     |
|        |             |      4030      |    A1->A4     |
|        +-------------+----------------+---------------+
|        |    2040     |      3040      |    A1->A3     |
|        |             |      4040      |    A1->A4     |
+========+=============+================+===============+
|   A2   | No V-LFIB Table created since CA-SRGB is     |
|        | identical to SRGB allocated locally          |
+========+=============+================+===============+
|   A3   |    2010     |      1010      |    A3->A1     |
|        |             |      2010      |    A3->A2     |
|        +-------------+----------------+---------------+
|        |    2020     |      1020      |    A3->A1     |
|        |             |      2020      |    A3->A2     |
|        +-------------+----------------+---------------+
|        |    2030     |      6030      |    A3->R3     |
|        +-------------+----------------+---------------+
|        |    2040     |      6040      |    A3->R3     |
+========+=============+================+===============+
|   A4   |    2010     |      1010      |    A4->A1     |
|        |             |      2010      |    A4->A2     |
|        +-------------+----------------+---------------+
|        |    2020     |      1020      |    A4->A1     |
|        |             |      2020      |    A4->A2     |
|        +-------------+----------------+---------------+
|        |    2030     |      6030      |    A4->R3     |
|        +-------------+----------------+---------------+
|        |    2040     |      6040      |    A4->R3     |
+========+=============+================+===============+
Figure 3: Virtual LFIB Table Setup
Please note that node A2 has not created a Virtual LFIB table since the CA-SRGB (2000-3000) is identical to the SRGB provisioned on it.
Also please note that none of the devices in the anycast group have included the anycast SID 100 in the Virtual LFIB table, since the same has already been provisioned on these devices.
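How a device could populate its Virtual LFIB is sketched below for device A1. The sketch is an assumption-laden illustration: the `build_vlfib` function is hypothetical, the shortest-path next hops are supplied by hand (a real device would take them from its SPF computation), and SRGBs are compared by their base label only, which is sufficient for the equally sized ranges of Figure 2.

```python
from typing import Dict, List, Optional, Set, Tuple

CA_SRGB_BASE = 2000  # CA-SRGB {2000-3000}
SRGB_BASE = {"R1": 7000, "A1": 1000, "A2": 2000,
             "A3": 3000, "A4": 4000, "R3": 6000}

# CAPSL label -> list of (outgoing label, outgoing link)
VLfib = Dict[int, List[Tuple[int, str]]]


def build_vlfib(
    device: str, next_hops: Dict[int, List[str]], local_sids: Set[int]
) -> Optional[VLfib]:
    """Build the V-LFIB of a device, or return None when none is needed."""
    if SRGB_BASE[device] == CA_SRGB_BASE:
        return None  # local SRGB equals the CA-SRGB: no V-LFIB is created
    table: VLfib = {}
    for sid, neighbors in next_hops.items():
        if sid in local_sids:
            continue  # prefix segments provisioned locally are not installed
        # One forwarding entry per shortest-path next hop, using the label
        # derived from that neighbor's own SRGB.
        table[CA_SRGB_BASE + sid] = [
            (SRGB_BASE[n] + sid, f"{device}->{n}") for n in neighbors
        ]
    return table


# Hand-written next hops for A1, matching the topology of Figure 2.
a1_next_hops = {10: ["R1"], 20: ["R1"], 30: ["A3", "A4"],
                40: ["A3", "A4"], 100: []}
a1_vlfib = build_vlfib("A1", a1_next_hops, local_sids={100})
a2_vlfib = build_vlfib("A2", {30: ["A3", "A4"]}, local_sids={100})
```

With these inputs `a1_vlfib` reproduces the A1 rows of Figure 3, and `a2_vlfib` is `None` because A2's SRGB is identical to the CA-SRGB.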
When a device receives an MPLS packet with the anycast segment label associated with one of the anycast prefix segments provisioned on the same device, and the CA-SRGB defined by the operator is not an exact copy of the corresponding SRGB label ranges allocated by it, it MUST use the Virtual LFIB table to look up the next label that follows the anycast segment label in the stack of labels found in the MPLS header. Refer to Section 3.2.4 for more details.
The following forwarding instructions MUST be installed in the MPLS data-plane for each entry in the Virtual LFIB table.
Like unicast prefix segments, anycast prefix segments SHOULD be advertised in IGP link-state advertisements using the IGP protocol extensions for SR specified in [I-D.ietf-isis-segment-routing-extensions], [I-D.ietf-ospf-segment-routing-extensions] and [I-D.ietf-ospf-ospfv3-segment-routing-extensions]. This document does not propose any protocol extension for advertising anycast prefix segments.
However, when advertising an anycast segment for which the CA-SRGB defined by the operator is not an exact copy of the corresponding SRGB label ranges allocated by the originating device, the device MUST set the corresponding P-Flag (No-PHP) in the IS-IS Prefix-SID SubTLV and/or the NP-Flag (No-PHP) in the OSPFv2 and OSPFv3 Prefix-SID SubTLVs to 1, and the E-Flag in the same SubTLVs to 0. Please refer to the following for more details on usage of these flags.
The proposal above ensures that an MPLS packet sent to (or taking transit through) a given anycast group, when arriving at a topologically nearest device in the group where the CA-SRGB does not match the SRGB provisioned on it, always arrives with the APSL label that is derived from the device's SRGB and the SID associated with the corresponding anycast prefix segment. Note that in the above topology, assuming the domain-wide CA-SRGB is set to (2000-3000) on all nodes, nodes A1, A3 and A4 will advertise the SID 100 with the P-Flag (No-PHP) set to 1, while node A2 will advertise the same anycast prefix SID with the P-Flag unset. This is because on node A2 the domain-wide CA-SRGB is identical to the local SRGB provisioned on A2.
In Figure 2, when PE1 or PE2 intends to steer a packet destined for PE3 or PE4 through the anycast group A (SID 100), it needs to forward the packet to R1 (SRGB: 7000-8000) after putting the label 7100 (derived from R1's SRGB) at the top of the label stack in the MPLS header. However, when the same packet is forwarded to A1 (one of the topologically nearest devices in group A), R1 shall not POP (or remove) the label 7100. Instead R1 shall replace it with the label 1100 while forwarding to A1. While forwarding to A2, since A2 would have advertised the anycast SID 100 with the P-Flag (No-PHP) unset, R1 shall POP the incoming label 7100 before forwarding the packet to A2.
The proposal specified in Section 3.2.3 ensures that an MPLS packet destined to (or steered via) an anycast prefix segment always arrives at the nearest device in the anycast group with a label derived from the device's SRGB and the SID associated with the corresponding anycast prefix segment as the top-most label of the label stack in its MPLS header. If this label is also the bottom-most label (S=1), it means the packet has been destined to the anycast segment and should be consumed by the local device. If the label is not the bottom-most label (S=0), the packet must be forwarded to the next segment, for which the next label in the stack should be consulted. However, Section 3.2.1 specifies that the next label in such a case shall be a label belonging to the CA-SRGB defined by the operator, derived from the SID associated with the next segment. Since the CAPSL label for the SID index associated with a prefix segment may directly collide with another label in the default LFIB table, Section 3.2.2 also proposes a Virtual LFIB table to provide a separate label space for looking up the next label.
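The behavior of R1 in the example above can be sketched as a small decision function. The function names and the tuple-based return value are hypothetical, introduced only for illustration; the rule itself (No-PHP is advertised exactly when the local SRGB differs from the CA-SRGB) is taken from the text above.

```python
from typing import Optional, Tuple

Range = Tuple[int, int]

CA_SRGB: Range = (2000, 3000)
SRGB = {"A1": (1000, 2000), "A2": (2000, 3000)}


def advertises_no_php(local_srgb: Range, ca_srgb: Range) -> bool:
    """A member sets the No-PHP flag only if its SRGB differs from the CA-SRGB."""
    return local_srgb != ca_srgb


def penultimate_hop_action(member: str, anycast_sid: int) -> Tuple[str, Optional[int]]:
    """Action the upstream neighbor applies to the anycast label toward a member."""
    member_srgb = SRGB[member]
    if advertises_no_php(member_srgb, CA_SRGB):
        # No-PHP: swap to the member's own anycast segment label (APSL).
        return ("swap", member_srgb[0] + anycast_sid)
    # PHP allowed: the anycast label is popped before forwarding.
    return ("pop", None)


toward_a1 = penultimate_hop_action("A1", 100)
toward_a2 = penultimate_hop_action("A2", 100)
print(toward_a1, toward_a2)
```

Here R1 swaps the incoming label 7100 to 1100 when forwarding to A1, and pops it when forwarding to A2, so that A2 sees the CAPSL label 2030 as its top label, which is valid in its default label space.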
This document specifies that a device provisioned with a given prefix segment index MUST implement the following forwarding semantics for the anycast segment label (refer to Section 3.1.3) associated with the anycast prefix segment, if the CA-SRGB label ranges defined are not an exact copy of the corresponding SRGB label range(s) locally allocated/provisioned on the device.
Figure 4 below illustrates how SR-based MPLS packets destined for PE3 and sourced by PE1 are expected to flow when PE1 encapsulates the packet with an appropriate label stack to steer it through group A devices only.
                    +-------------------------+
                    |         Group A         |
                    |      192.1.1.1/32       |   CA-SRGB: {2000-3000}
                    |         SID:100         |
                    |-------------------------|
                    |                         |
                    |                         |
        --->           --->        --->      --->    ---> --->
   +----+----+--+ +----+----+--+ +----+--+ +----+--+ +--+ +--+
   |7100|2030|..| |1100|2030|..| |3030|..| |6030|..| |..| |..|
   +----+----+--+ +----+----+--+ +----+--+ +----+--+ +--+ +--+
                    |                         |
                    |   SRGB:         SRGB:   |
   SID:10           |(1000-2000)   (3000-4000)|           SID:30
---PE1---+       +-------A1-------------A3-------+       +---PE3---
          \     /   |    | \           / |    |   \     /
           \   /    |    |  +-----+   /  |    |    \   /
    SRGB:   \ /     |    |         \ /   |    |     \ /   SRGB:
 (7000-8000) R1     |    |          \    |    |      R3 (6000-7000)
            / \     |    |         / \   |    |     / \
           /   \    |    |  +-----+   \  |    |    /   \
          /     \   |    | /           \ |    |   /     \
---PE2---+       +-------A2-------------A4-------+       +---PE4---
   SID:20           |   SRGB:         SRGB:   |           SID:40
                    |(2000-3000)   (4000-5000)|
                    |                         |
                       --->        --->      --->
                     +----+--+   +----+--+ +----+--+
                     |2030|..|   |4030|..| |6030|..|
                     +----+--+   +----+--+ +----+--+
                    |                         |
                    |                         |
                    +-------------------------+
Figure 4: Packet Flow through MPLS-based SR Anycast Segments
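The processing that A1 performs on the packet shown in Figure 4 can be sketched as follows. This is a reading of Section 3.2.4 and Figure 4 rather than a normative procedure: the function name, the list-based label stack (top label first) and the `pick` parameter that stands in for ECMP hashing are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple

# CAPSL label -> list of (outgoing label, outgoing link), as in Figure 3.
VLfib = Dict[int, List[Tuple[int, str]]]


def process_at_anycast_member(
    stack: List[int], apsl: int, vlfib: VLfib, pick: int = 0
) -> Optional[Tuple[List[int], str]]:
    """Handle a packet whose top label is this device's anycast segment label.

    Returns None when the packet is consumed locally, otherwise the new
    label stack and the outgoing link.
    """
    top, rest = stack[0], stack[1:]
    if top != apsl:
        raise ValueError("top label is not this device's anycast segment label")
    if not rest:
        return None  # bottom of stack (S=1): the packet is for this device
    # S=0: pop the anycast label and look up the next label in the V-LFIB
    # instead of the default LFIB, then swap it for the outgoing label.
    out_label, link = vlfib[rest[0]][pick]
    return [out_label] + rest[1:], link


a1_vlfib = {2030: [(3030, "A1->A3"), (4030, "A1->A4")]}
via_a3 = process_at_anycast_member([1100, 2030], 1100, a1_vlfib)
via_a4 = process_at_anycast_member([1100, 2030], 1100, a1_vlfib, pick=1)
print(via_a3, via_a4)
```

The first result corresponds to the upper path of Figure 4, where the packet leaves A1 toward A3 carrying label 3030.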
Many thanks to Shraddha Hegde, Eric Rosen, Chris Bowers and Stephane Litkowski for their valuable inputs.
N/A. - No protocol changes are proposed in this document.
This document does not introduce any change in any of the protocol specifications.
[RFC2119] | Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997. |
[I-D.ietf-isis-segment-routing-extensions] | Previdi, S., Filsfils, C., Bashandy, A., Gredler, H., Litkowski, S., Decraene, B. and J. Tantsura, "IS-IS Extensions for Segment Routing", Internet-Draft draft-ietf-isis-segment-routing-extensions-04, May 2015. |
[I-D.ietf-ospf-ospfv3-segment-routing-extensions] | Psenak, P., Previdi, S., Filsfils, C., Gredler, H., Shakir, R., Henderickx, W. and J. Tantsura, "OSPFv3 Extensions for Segment Routing", Internet-Draft draft-ietf-ospf-ospfv3-segment-routing-extensions-02, February 2015. |
[I-D.ietf-ospf-segment-routing-extensions] | Psenak, P., Previdi, S., Filsfils, C., Gredler, H., Shakir, R., Henderickx, W. and J. Tantsura, "OSPF Extensions for Segment Routing", Internet-Draft draft-ietf-ospf-segment-routing-extensions-04, February 2015. |
[I-D.ietf-spring-segment-routing] | Filsfils, C., Previdi, S., Decraene, B., Litkowski, S. and R. Shakir, "Segment Routing Architecture", Internet-Draft draft-ietf-spring-segment-routing-03, May 2015. |
[I-D.ietf-spring-segment-routing-mpls] | Filsfils, C., Previdi, S., Bashandy, A., Decraene, B., Litkowski, S., Horneffer, M., Shakir, R., Tantsura, J. and E. Crabbe, "Segment Routing with MPLS data plane", Internet-Draft draft-ietf-spring-segment-routing-mpls-01, May 2015. |
[I-D.ietf-spring-sr-yang] | Litkowski, S., Qu, Y., Sarkar, P. and J. Tantsura, "YANG Data Model for Segment Routing", Internet-Draft draft-ietf-spring-sr-yang-00, July 2015. |
[I-D.previdi-6man-segment-routing-header] | Previdi, S., Filsfils, C., Field, B. and I. Leung, "IPv6 Segment Routing Header (SRH)", Internet-Draft draft-previdi-6man-segment-routing-header-06, May 2015. |