BESS Working Group                                          J. Drake
Internet-Draft                                      Juniper Networks
Intended status: Standards Track                           A. Farrel
Expires: May 7, 2020                              Old Dog Consulting
                                                            L. Jalil
                                                             Verizon
                                                          A. Lingala
                                                                AT&T
                                                    November 4, 2019
BGP-LS Filters: A Framework for Network Slicing and Enhanced VPNs
draft-drake-bess-enhanced-vpn-02
Future networks that support advanced services, such as those enabled by 5G mobile networks, envision a set of overlay networks each with different performance and scaling properties. These overlays are known as network slices and are realized over a common underlay network.
In order to support network slicing, as well as to offer enhanced VPN services in general, it is necessary to define a mechanism by which specific resources (links and/or nodes) of an underlay network can be used by a specific network slice, VPN, or set of VPNs. This document sets out such a mechanism for use in Segment Routing networks.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 7, 2020.
Copyright (c) 2019 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Network slicing is an approach to network operations that builds on the concept of network abstraction to provide programmability, flexibility, and modularity. Driven largely by needs surfacing from 5G, the concept of network slicing has gained traction, for example in [TS23501] and [TS28530]. Network slicing requires the underlying network to support partitioning the network resources to provide the client with dedicated (private) networking, computing, and storage resources drawn from a shared pool. The slices may be seen as (and operated as) virtual networks.
Advanced services drive a need to create virtual networks with enhanced characteristics. The tenant of such a virtual network can require a degree of isolation and performance that previously could only be satisfied by dedicated networks. Additionally, the tenant may ask for some level of control over its virtual networks, e.g., to customize the service forwarding paths in the underlying network.
The concepts of "enhanced VPNs" and "network slicing" are introduced in [I-D.ietf-teas-enhanced-vpn].
In order to support network slicing, as well as to offer enhanced VPN services in general, it is necessary to define a mechanism by which specific resources (links and/or nodes) of an underlay network can be used by a specific network slice, VPN, or set of VPNs. This document sets out such a mechanism for use in Segment Routing networks [RFC8402] and builds on the ideas introduced in [I-D.ietf-idr-segment-routing-te-policy]. That is, it generalizes that work to support multipoint-to-multipoint (MP2MP), point-to-multipoint (P2MP), and bidirectional point-to-point (P2P) topologies; it integrates BGP-based VPN support ([RFC4364], [RFC7432]); it supports DSCP-based as well as Color-based forwarding; and it uses BGP Link-State (BGP-LS) [RFC7752] to distribute topology information.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
The approach is based on a network controller that uses the {source, destination} traffic matrix and the performance and scaling properties of each network slice, VPN, or set of VPNs in conjunction with the topology of the underlay network to assign each network slice, VPN, or set of VPNs a set of underlay links and nodes that it can use. That is, each network slice, VPN, or set of VPNs gets a subset, either dedicated or shared, of the resources in the underlay network.
It should be noted that resources can be assigned at any of the following granularities:
Once the network controller has determined the resource assignments, it distributes this information to the PEs that participate in each VPN using the usual VPN information dissemination tools, e.g., route targets (RT) [RFC4360], route reflectors (RR) [RFC4456], and RT constraints [RFC4684].
This information is distributed to the PEs by giving them a customized and limited view of the underlay network on the basis of a network slice, a VPN, or a set of VPNs. Each PE will have a complete view of the underlay network, and this customized and limited view acts as a filter on the underlay network, telling the PE which underlay network resources it can use to direct the traffic of a given network slice, VPN, or set of VPNs to best deliver end-to-end services.
The resource allocation information is encoded using BGP-LS. This approach is chosen for the following reasons:
It should be noted that this mechanism also follows the scalability model of the existing BGP-based VPN infrastructure, which is that the per-VPN information is restricted to only those PE routers that are supporting that VPN and that the P routers have no per-VPN state.
The PEs in non-enhanced VPNs do not receive this resource allocation information and would not confine their usage of the underlay network resources. In order to ensure that the underlay network resources allocated to enhanced VPNs are not inadvertently used by the PEs in non-enhanced VPNs, the network controller SHOULD ensure that the IGP and TE metrics for these resources are higher than the metrics for the underlay network resources allocated to non-enhanced VPNs. In certain situations, detailed in Section 4, PEs in enhanced VPNs will use the underlay network resources allocated to non-enhanced VPNs.
In addition to programming the PEs and computing and assigning resources for use by network slices, VPNs, or sets of VPNs, the network controller also instructs the P routers to make the actual allocation of these resources by assigning link bandwidth to a specific DSCP or adjacency SID.
We define a BGP-LS Filter to be a BGP-LS encoded description of a subset of the links and nodes in the underlay network. A BGP-LS Filter defines the topology for a network slice or a set of one or more VPNs. The topology defined by a BGP-LS Filter needs to provide connectivity between the PEs in a given network slice, VPN or set of VPNs. I.e., it connects the PEs in these VPNs and is used by them to send packets to each other. A given filter is tagged with the route targets of the VPNs whose PEs are to import the filter. A BGP-LS Filter is pushed southbound to those PEs by the network controller and SHOULD provide multiple paths between a given ingress/egress PE pair.
Note that there will be multiple BGP-LS Filters in a given network deployment and that a given underlay network link or node may appear in more than one of them. To disambiguate them, AFI 16388 (BGP-LS) and SAFI 72 (BGP-LS-VPN) are used in BGP-LS UPDATE messages, and the network controller SHOULD allocate a different route distinguisher (RD) to each BGP-LS Filter.
Within a given VPN, when an ingress PE needs to send a packet to an egress PE it selects a path to that egress PE from the topology defined by the BGP-LS Filters it has imported for that VPN. It then either adds a segment routing label stack specifying that path to the packet or places the packet in an RSVP-TE LSP which uses that path. The ingress PE may use any path computation it wishes if that path computation confines the path to the topology defined by the relevant set of BGP-LS Filters.
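The confinement rule above can be illustrated with a short sketch. It is not part of the mechanism defined in this document. It assumes a hypothetical representation in which an imported BGP-LS Filter has already been reduced to a set of bidirectional links between named nodes; the function name confined_path and the sample link set are illustrative only. A breadth-first search is used purely for brevity, and any other path computation could be substituted provided it never leaves the filter topology.

```python
from collections import deque


def confined_path(filter_links, src, dst):
    """Breadth-first search that only walks links present in the filter."""
    adjacency = {}
    for a, b in filter_links:
        # Assumption of this sketch: every link can be used in both directions.
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the predecessor chain back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None  # dst cannot be reached inside the filter topology


# Hypothetical filter content, given as pairs of node names.
FILTER_LINKS = {("PE1", "P7"), ("P7", "P4"), ("P4", "P1"), ("P1", "PE3")}

print(confined_path(FILTER_LINKS, "PE1", "PE3"))
print(confined_path(FILTER_LINKS, "PE1", "PE6"))
```

A result of None indicates that the egress PE is not reachable within the filter topology; the list of nodes returned otherwise would then be translated into a segment routing label stack or an explicit route for an RSVP-TE LSP.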
If Segment Routing is used and a nodal SID is placed in the segment routing label stack, then when that segment is active the P routers will forward the packet using the underlay network resources allocated to non-enhanced VPNs. Similarly, if the RSVP-TE LSP was established using a loose source route to the subject node, the path to that node was selected using the underlay network resources allocated to non-enhanced VPNs.
Because the BGP-LS UPDATE messages specifying a BGP-LS Filter may arrive in any order and the BGP-LS UPDATE messages of multiple BGP-LS Filters may be interleaved, there is a need for a new attribute that is attached to a BGP-LS UPDATE. This attribute contains a Filter ID, a Filter version number, a Filter type (MP2MP, P2MP, or P2P), the total number of fragments in the filter, and the specific fragment number of the piece in hand. I.e., it is assumed that a PE may import more than one BGP-LS Filter, that a given BGP-LS Filter may change over time, and that a given BGP-LS Filter may span multiple BGP-LS UPDATE messages. The Filter ID needs to be unique across the set of VPNs into which the BGP-LS Filter is to be imported.
A BGP-LS Filter that is created for a set of VPNs will contain a set of network resources sufficient to connect the PEs in each VPN in the set and each of the BGP-LS UPDATE messages for the filter MUST be tagged with the RT for each VPN in the set.
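As an illustration only, the tagging and import behavior might be modeled as follows. The route target strings, the dictionary layout used for an UPDATE, and the function names are assumptions made for this sketch and are not defined by this document.

```python
def tag_fragments(fragments, vpn_route_targets):
    """Controller side: attach the RT of every VPN in the set to every fragment."""
    all_rts = frozenset(vpn_route_targets.values())
    return [{"payload": fragment, "route_targets": all_rts} for fragment in fragments]


def should_import(update, pe_import_rts):
    """PE side: import the UPDATE if it carries at least one RT the PE imports."""
    return not update["route_targets"].isdisjoint(pe_import_rts)


# Hypothetical VPN set and RT values.
VPN_RTS = {"vpn-red": "target:65000:100", "vpn-blue": "target:65000:200"}
updates = tag_fragments(["fragment-1", "fragment-2"], VPN_RTS)

# A PE in vpn-blue imports every fragment; a PE in an unrelated VPN imports none.
print(all(should_import(u, {"target:65000:200"}) for u in updates))
print(any(should_import(u, {"target:65000:300"}) for u in updates))
```

Because every fragment carries the full RT set, a PE that belongs to any VPN in the set receives all of the UPDATE messages that make up the filter.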
If a PE imports more than one BGP-LS Filter it may use the union of the links and nodes specified in each filter when selecting a path. A PE should give precedence to BGP-LS Filters of type P2MP and P2P when selecting a path. Route targets specific to a given VPN/PE pair are needed for BGP-LS Filters of type P2MP and P2P.
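One possible reading of this precedence rule is sketched below. The document does not specify how precedence interacts with the union, so the ordering shown (first the union of the P2MP and P2P filters, then the union of all imported filters) is an assumption, as are the function name and the representation of a filter as a (type, link set) pair.

```python
def candidate_topologies(filters):
    """Return link sets to try in order: P2MP/P2P filters first, then all filters."""
    specific = set()
    everything = set()
    for topology_type, links in filters:
        everything |= links
        if topology_type in ("P2MP", "P2P"):
            specific |= links

    ordered = []
    if specific:
        ordered.append(specific)
    if everything != specific:
        # Only add the wider union when it offers links the first set lacks.
        ordered.append(everything)
    return ordered


# Hypothetical filters imported by one PE.
imported = [
    ("MP2MP", {("A", "B"), ("B", "C")}),
    ("P2P", {("A", "C")}),
]
for link_set in candidate_topologies(imported):
    print(sorted(link_set))
```

A PE following this reading would run its path computation over each link set in turn and stop at the first one that yields a path.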
A given BGP-LS Filter may change in response to updates to the PE membership in a VPN to which the BGP-LS Filter applies or to updates to the underlay network. When this occurs, the network controller should push a new version of the affected BGP-LS Filters. That is, it increments the version number of each BGP-LS Filter. Note that a network controller does not need to compute new BGP-LS Filters in response to an individual link or node failure in the underlay network if connectivity still exists among the PEs in the network slice, VPN, or set of VPNs with the existing BGP-LS Filters.
A BGP-LS Filter cannot be used by a PE until it is completely assembled. If the BGP-LS Filter that is being assembled is a newer version of a BGP-LS Filter that the PE is currently using, the PE should continue to use its current version of the BGP-LS Filter until the newer version is completely assembled.
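The assembly behavior described above might be sketched as follows. This is a simplified model under stated assumptions: the class and method names are hypothetical, fragments are identified only by their fragment number (the sketch does not assume whether numbering starts at 0 or 1), version numbers are compared as plain integers with no wrap-around handling, and a mismatch in the advertised number of fragments simply discards the partial state.

```python
class FilterAssembler:
    """Keeps the version in use until a newer version is completely assembled."""

    def __init__(self):
        self.active = {}   # filter_id -> (version, ordered list of payloads)
        self.pending = {}  # (filter_id, version) -> {"total": n, "fragments": {...}}

    def receive(self, filter_id, version, total, number, payload):
        current = self.active.get(filter_id)
        if current is not None and version <= current[0]:
            return  # not newer than the version already in use

        key = (filter_id, version)
        entry = self.pending.setdefault(key, {"total": total, "fragments": {}})
        if entry["total"] != total:
            # Fragments of one version disagree: treat the partial filter as unusable.
            del self.pending[key]
            return

        entry["fragments"][number] = payload
        if len(entry["fragments"]) == total:
            ordered = [entry["fragments"][n] for n in sorted(entry["fragments"])]
            self.active[filter_id] = (version, ordered)  # switch over atomically
            del self.pending[key]


assembler = FilterAssembler()
assembler.receive(filter_id=7, version=1, total=2, number=1, payload="x")
assembler.receive(filter_id=7, version=1, total=2, number=2, payload="y")
assembler.receive(filter_id=7, version=2, total=2, number=2, payload="b")
print(assembler.active[7])  # version 1 is still in use
assembler.receive(filter_id=7, version=2, total=2, number=1, payload="a")
print(assembler.active[7])  # version 2 has replaced it
```

Keying the pending state on both the Filter ID and the version number lets fragments of several filters, or of several versions of one filter, arrive interleaved and in any order.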
When selecting a path using one or more BGP-LS Filters, an ingress PE can use a link or node only if it is active in the underlay network. If this precludes connectivity to the egress PE it may use the underlay network resources allocated to non-enhanced VPNs to reach the egress PE.
Additionally, a newly activated PE will not be present in any of the BGP-LS Filters used by the other PEs. Until a new BGP-LS Filter or Filters containing that PE have been distributed, other PEs will use the underlay network resources allocated to non-enhanced VPNs to reach the newly activated PE, and it will use these resources to reach other PEs.
[RFC4271] defines the BGP Path attribute. This document introduces a new Optional Transitive Path attribute called the BGP-LS Filter attribute with value TBD1 to be assigned by IANA.
The first BGP-LS Filter attribute MUST be processed and subsequent instances MUST be ignored.
The common fields of the BGP-LS Filter attribute are set as follows:
The content of the BGP-LS Filter attribute is a series of Type-Length-Value (TLV) constructs. Each TLV may include sub-TLVs. All TLVs and sub-TLVs have a common format that is:
The formats of the TLVs defined in this document are shown in the following sections. The presence rules and meanings are as follows.
The BGP-LS Filter attribute MUST contain exactly one Filter TLV. Its format is shown in Figure 1. Note that a given BGP-LS Filter may span multiple UPDATE messages and the Topology, Version Number, and the Number of Fragments fields in the BGP-LS Filter attribute contained in each UPDATE message MUST be set to the same value or the BGP-LS Filter is unusable.
+--------------------------------------------+
|             Type = 1 (1 octet)             |
+--------------------------------------------+
|             Length (2 octets)              |
+--------------------------------------------+
|             Topology (1 Octet)             |
+--------------------------------------------+
|               ID (4 Octets)                |
+--------------------------------------------+
|         Version Number (4 Octets)          |
+--------------------------------------------+
|       Number of Fragments (4 Octets)       |
+--------------------------------------------+
|         Fragment Number (4 Octets)         |
+--------------------------------------------+
Figure 1: The Filter TLV Format
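As a non-normative illustration, the layout in Figure 1 could be packed and unpacked as shown below. The field widths are taken from the figure and network byte order is used as elsewhere in BGP. Two points are assumptions of this sketch: that the Length field counts only the octets following it (17 here), and the Topology value 0 used in the example, since no Topology code points are given in this text. The function names are hypothetical.

```python
import struct

FILTER_TLV_TYPE = 1

# Type (1), Length (2), Topology (1), ID (4), Version Number (4),
# Number of Fragments (4), Fragment Number (4), all in network byte order.
_FILTER_TLV = struct.Struct("!BHBIIII")
_HEADER_OCTETS = 3  # Type + Length
_VALUE_OCTETS = _FILTER_TLV.size - _HEADER_OCTETS  # assumption: Length covers the value only


def encode_filter_tlv(topology, filter_id, version, total_fragments, fragment_number):
    return _FILTER_TLV.pack(
        FILTER_TLV_TYPE, _VALUE_OCTETS, topology,
        filter_id, version, total_fragments, fragment_number,
    )


def decode_filter_tlv(data):
    fields = _FILTER_TLV.unpack(data[:_FILTER_TLV.size])
    tlv_type, length, topology, filter_id, version, total, number = fields
    if tlv_type != FILTER_TLV_TYPE or length != _VALUE_OCTETS:
        raise ValueError("not a well-formed Filter TLV")
    return {
        "topology": topology,
        "id": filter_id,
        "version": version,
        "total_fragments": total,
        "fragment_number": number,
    }


raw = encode_filter_tlv(topology=0, filter_id=42, version=3,
                        total_fragments=2, fragment_number=1)
print(len(raw), raw.hex())
print(decode_filter_tlv(raw))
```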
The fields are as follows:
The DSCP List TLV MAY be included in the BGP-LS Filter attribute. If included, a packet whose DSCP matches a DSCP in the DSCP list is to be forwarded using the BGP-LS Filter defined by the containing BGP-LS Filter attribute. The first DSCP List TLV MUST be processed and subsequent instances MUST be ignored. The format of the DSCP List TLV is shown in Figure 2.
+--------------------------------------------+
|             Type = 2 (1 octet)             |
+--------------------------------------------+
|             Length (2 octets)              |
+--------------------------------------------+
|            DSCP List (variable)            |
+--------------------------------------------+
Figure 2: The DSCP List TLV Format
The fields are as follows:
The Color List TLV MAY be included in the BGP-LS Filter attribute. If a BGP UPDATE contains a Color extended community with a color (as defined by [RFC5512]) that matches an entry in the Color List, then a packet whose destination is covered by one of the routes in that UPDATE is to be forwarded using the BGP-LS Filter defined by the containing BGP-LS Filter attribute. The first Color List TLV MUST be processed and subsequent instances MUST be ignored. The format of the Color List TLV is shown in Figure 3.
Note that if both a DSCP List and a Color List TLV are included in a BGP-LS Filter attribute, packets matching an entry in either list are to be forwarded using the BGP-LS Filter defined by the containing BGP-LS Filter attribute. If neither list is included then all packets for that network slice, VPN, or set of VPNs can be forwarded using the BGP-LS Filter defined by the containing BGP-LS Filter attribute.
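The selection rule for the DSCP List and Color List TLVs can be summarized in a few lines. In this sketch, which is illustrative only, an absent TLV is represented by None, route_colors stands for the colors carried in Color extended communities on the route that covers the packet's destination, and the function name is hypothetical.

```python
def uses_filter(dscp_list, color_list, packet_dscp, route_colors):
    """Decide whether a packet is forwarded using the containing BGP-LS Filter."""
    if dscp_list is None and color_list is None:
        return True  # neither list present: all packets may use the filter
    if dscp_list is not None and packet_dscp in dscp_list:
        return True
    if color_list is not None and any(color in color_list for color in route_colors):
        return True
    return False


# Hypothetical DSCP and color values.
print(uses_filter(None, None, packet_dscp=0, route_colors=[]))
print(uses_filter([46], [100], packet_dscp=0, route_colors=[100]))
print(uses_filter([46], None, packet_dscp=0, route_colors=[100]))
```

The two lists are combined with a logical OR: a match in either list is sufficient, and a packet that matches neither of the lists that are present is not steered by this filter.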
+--------------------------------------------+
|             Type = 3 (1 octet)             |
+--------------------------------------------+
|             Length (2 octets)              |
+--------------------------------------------+
|           Color List (variable)            |
+--------------------------------------------+
Figure 3: The Color List TLV Format
The fields are as follows:
The Root TLV MUST be included in the BGP-LS Filter attribute if its topology is of type P2MP or P2P unidirectional. It defines the root node for that topology and if it is not present the BGP-LS Filter is unusable. The TLV, if present, MUST be ignored if the topology is of type MP2MP or P2P bidirectional.
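The presence rules for the Root TLV might be checked as sketched below. The string labels for the topology types and the function name are assumptions of this sketch; the actual Topology code points are not given here. An absent Root TLV is represented by None.

```python
ROOTED = {"P2MP", "P2P-unidirectional"}
UNROOTED = {"MP2MP", "P2P-bidirectional"}


def resolve_root(topology, root_tlv):
    """Return (usable, effective_root) for a filter of the given topology type."""
    if topology in ROOTED:
        if root_tlv is None:
            return False, None  # required Root TLV missing: filter is unusable
        return True, root_tlv
    if topology in UNROOTED:
        return True, None  # any Root TLV that is present is ignored
    raise ValueError("unknown topology type: %r" % (topology,))


print(resolve_root("P2MP", None))
print(resolve_root("P2MP", "PE3"))
print(resolve_root("MP2MP", "PE3"))
```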
The Root TLV is structured as shown in Figure 4 and MAY contain any of the sub-TLVs defined in section 3.2.1.4 of [RFC7752].
+--------------------------------------------+
|             Type = 4 (1 octet)             |
+--------------------------------------------+
|             Length (2 octets)              |
+--------------------------------------------+
|            Sub-TLVs (variable)             |
+--------------------------------------------+
Figure 4: The Root TLV Format
The fields are as follows:
Section 6 of [RFC4271] describes the handling of malformed BGP attributes, or those that are in error in some way. [RFC7606] revises BGP error handling specifically for the UPDATE message, provides guidelines for the authors of documents defining new attributes, and revises the error handling procedures for a number of existing attributes. This document introduces the BGP-LS Filter attribute and so defines error handling as follows:
TBD
Figure 5 shows a sample underlay topology. Six PEs (PE1 through PE6) are connected across a network of twelve P nodes (P1 through P12). Each PE is dual-homed, and the P nodes are variously connected so that there are multiple routes between PEs.
         PE3      PE4
          |\      /|
          | \    / |
          |  \  /  |
          |   \/   |
          |   /\   |
          |  /  \  |
          | /    \ |
          |/      \|
         P1--------P2
        / |\      /| \
      /   | \    / |   \
    /     |  \  /  |     \
  /       |   \/   |       \
P3-------P4--------P5-------P6
|      /  |   /\   |  \      |
|    /    |  /  \  |    \    |
|  /      | /    \ |      \  |
|/        |/      \|        \|
P7---P8--P9--------P10-P11-P12
|\  /|                  |\  /|
| \/ |                  | \/ |
| /\ |                  | /\ |
|/  \|                  |/  \|
PE1 PE2                PE5 PE6
Figure 5: Underlay Network Topology
Figure 6 shows how a multipoint-to-multipoint (MP2MP) service that connects PE1, PE3, and PE6 can be installed over the underlay network. Paths have been computed so that, for example, PE1 is connected to both PE3 and PE6 via a pair of redundant paths. Similarly, PE3 is connected to PE1 and PE6, and PE6 is connected to PE1 and PE3.
         PE3       PE4
          | \
          |  \
          |   \
          |    \
          |     \
          |      \
          |       \
          |        \
         P1         P2
        /  \       /|
      /     \     / |
    /        \   /  |
  /           \ /   |
P3       P4    X    P5       P6
|             / \      \
|            /   \       \
|           /     \        \
|          /       \         \
P7   P8--P9---------P10-P11 P12
|   /                     \   |
|  /                       \  |
| /                         \ |
|/                           \|
PE1 PE2                 PE5 PE6
Figure 6: An MP2MP Service Installed at PE1, PE3, and PE6
Figure 7 shows the provision of a Point-to-Multipoint (P2MP) service rooted at PE3 and connected to PE1 and PE6. As in the previous example, a redundant pair of paths is established between PE3 and each of PE1 and PE6. Thus, the two paths from PE3 to PE1 are PE3-P1-P4-P7-PE1 and PE3-P2-P9-P8-PE1.
         PE3       PE4
          | \
          |  \
          |   \
          |    \
          |     \
          |      \
          |       \
          |        \
         P1         P2
          |\       /  \
          | \     /     \
          |  \   /        \
          |   \ /           \
P3       P4    X    P5       P6
       /      / \             |
     /       /   \            |
   /        /     \           |
 /         /       \          |
P7---P8--P9         P10-P11 P12
|   /                     \   |
|  /                       \  |
| /                         \ |
|/                           \|
PE1 PE2                 PE5 PE6
Figure 7: A P2MP Unidirectional Service Installed at PE3
Figure 8 shows a Point-to-Point (P2P) service rooted at PE1 and connected to PE3. This is equivalent to a Segment Routing Traffic Engineering (SR TE) Policy [I-D.ietf-idr-segment-routing-te-policy] installed at PE1.
As in the previous examples, a pair of redundant paths are computed.
         PE3      PE4
          |\
          | \
          |  \
          |   \
          |    \
          |     \
          |      \
          |       \
         P1        P2
          |        |
          |        |
          |        |
          |        |
P3       P4        P5       P6
       /           |
     /             |
   /               |
 /                 |
P7   P8--P9--------P10 P11 P12
|   /
|  /
| /
|/
PE1 PE2                PE5 PE6
Figure 8: A P2P Unidirectional Service (SR TE Policy) Installed at PE1
Figure 9 shows a bidirectional P2P service connecting PE1 and PE6. This is equivalent to a Segment Routing Traffic Engineering (SR TE) Policy [I-D.ietf-idr-segment-routing-te-policy] installed at PE1 and PE6.
         PE3      PE4
         P1        P2
P3       P4--------P5       P6
       /              \
     /                  \
   /                      \
 /                          \
P7   P8--P9--------P10-P11 P12
|   /                    \   |
|  /                      \  |
| /                        \ |
|/                          \|
PE1 PE2                PE5 PE6
Figure 9: A P2P Bidirectional Service Installed at PE1 and PE6
TBD
Per-VPN OAM and telemetry will be required in order to monitor and verify the performance of network slices. This is particularly important when the performance of a network slice has been committed to a customer through a Service Level Agreement.
TBD
IANA maintains a registry of "Border Gateway Protocol (BGP) Parameters" with a subregistry of "BGP Path Attributes". IANA is requested to assign a new Path attribute called "BGP-LS Filter attribute" (TBD1 in this document) with this document as a reference.
IANA maintains a registry of "Border Gateway Protocol (BGP) Parameters". IANA is requested to create a new subregistry called the "BGP-LS Filter attribute TLVs" registry.
Valid values are in the range 0 to 255.
This document should be given as a reference for this registry. The new registry should track:
The registry should initially be populated as follows:
 Type | Name                    | Reference     | Date
------+-------------------------+---------------+---------------
   1  | Filter TLV              | [This.I-D]    | Date-to-be-set
   2  | DSCP List TLV           | [This.I-D]    | Date-to-be-set
   3  | Color List TLV          | [This.I-D]    | Date-to-be-set
   4  | Root TLV                | [This.I-D]    | Date-to-be-set
The authors are grateful to all those who contributed to the discussions that led to this work: Ron Bonica, Stewart Bryant, Jie Dong, Keyur Patel, and Colby Barth.