Internet DRAFT - draft-malyushkin-bess-ip-vpn-abstract-next-hops
BGP Enabled ServiceS I. Malyushkin
Internet-Draft Independent Contributor
Intended status: Standards Track 17 August 2022
Expires: 18 February 2023
Abstract next-hop addresses in IP VPNs
draft-malyushkin-bess-ip-vpn-abstract-next-hops-00
Abstract
This document discusses the IP VPN convergence aspects and specifies
procedures for IP VPN to signal the attachment circuit failure. The
specified procedures help significantly improve the IP VPN
convergence.
About This Document
This note is to be removed before publishing as an RFC.
Status information for this document may be found at
https://datatracker.ietf.org/doc/draft-malyushkin-bess-ip-vpn-
abstract-next-hops/.
Discussion of this document takes place on the BGP Enabled ServiceS
Working Group mailing list (mailto:bess@ietf.org), which is archived
at https://mailarchive.ietf.org/arch/browse/bess/. Subscribe at
https://www.ietf.org/mailman/listinfo/bess/.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 18 February 2023.
Malyushkin Expires 18 February 2023 [Page 1]
Internet-Draft IP VPNs abstract next-hops August 2022
Copyright Notice
Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction
2. Conventions and Definitions
3. Terminology
4. Solution Description
5. Abstract Next-Hop Address
5.1. Status of the Abstract Next-Hop
5.2. Distribution of the Abstract Next-Hop
5.3. Tunnels to the Abstract Next-Hop
6. Distribution of VPN Routes
7. Forwarding
8. Failure Detection
8.1. Egress PE
8.2. Ingress PE
9. Deployment Considerations
9.1. Scalability
9.2. Using the Abstract Next-Hops
9.3. Failure Detection
9.4. Routes Aggregation
10. Multicast Considerations
11. Security Considerations
12. IANA Considerations
13. References
13.1. Normative References
13.2. Informative References
Acknowledgments
Author's Address
1. Introduction
Neither IP VPN [RFC4364] nor IPv6 VPN [RFC4659] has a mechanism for
withdrawing routes in bulk. When a connection to a CE fails, the PE
must withdraw every affected VPN route individually; it cannot simply
notify other PE routers of the attachment circuit failure. These
withdrawals may be packed into one or more BGP UPDATE messages and
then disseminated through the network. Depending on the BGP topology,
these messages may be further processed and replicated by
intermediate nodes (e.g., route reflectors). In general, every
affected route must be withdrawn from all interested parties. The
convergence time grows with the number of failed routes, and a
complex intermediate BGP topology may increase it further.
The convergence speed of the network is important. Traffic may be
lost until the failure notification (BGP UPDATE messages) reaches the
other members participating in the affected VPN service (i.e., the
routers using the affected VPN routes for traffic forwarding).
Moreover, this loss happens at the egress point where the failed CE
router is connected to the network, after the traffic has traversed
the whole path.
BGP PIC edge [I-D.ietf-rtgwg-bgp-pic] is a mechanism that avoids this
traffic loss while the network is converging. It depends on the
availability of an extra exit point for every affected route. When a
CE router is connected to a pair of PE routers (i.e., it is
multihomed) and the link between the CE and one of these PEs fails,
that PE can redirect all affected traffic toward the other PE. On
the other hand, when BGP PIC edge is active at egress, it results in
sub-optimal routing. Traffic from an ingress PE must follow the path
toward the egress PE to which the failed link with the CE is
attached. This egress PE then redirects the traffic toward another
PE using pre-installed backup entries. Such tromboning can negatively
affect traffic characteristics (delay, loss rate, etc.).
Another problem with BGP PIC edge at egress is a possible routing
loop. Suppose a CE router is connected to a pair of PE routers and
advertises a set of routes to them. These PEs install the routes and
propagate them via internal BGP VPN sessions. Both PEs thus receive
the routes via a PE-CE protocol and via the internal BGP VPN sessions.
The routes received via the internal BGP VPN sessions are used as
backups for the routes received via the PE-CE protocol. When the CE
fails, both PE routers activate their backups and send traffic to
each other until the TTL reaches zero.
The BGP PIC edge mechanism is a transient solution. As soon as all
VPN members are notified about the unreachability of all affected VPN
routes, traffic is either sent to the extra exit point in an optimal
way or dropped at ingress. The goal of the solution described in
this document is to decrease the time required for all VPN members to
become aware of the failure and thus to reduce the time during which
BGP PIC edge is in effect. This solution does not replace BGP PIC
edge and can be deployed in parallel with it. It is recommended to
combine them.
Even if the destinations advertised by a failed CE lack alternatives,
the reaction time of the network may be important. Imagine that the
CE advertises a huge number of routes and attracts a considerable
amount of traffic, but for some reason these routes do not have an
alternative exit point. Until all other members of the VPN service
are aware of the failure, traffic flows through the network in vain.
The solution described in this document significantly reduces this
time.
This document refers to [RFC4364] wherever its logic applies to both
address families. [RFC4659] is referred to explicitly where it
introduces new logic.
2. Conventions and Definitions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
3. Terminology
The document uses the terminology defined in [RFC4271], [RFC4364],
[RFC6513], and [RFC3031].
AC
Attachment Circuit.
GRT
Global Routing Table.
EC
BGP Extended Community [RFC4360].
BGP VPN route
Either a VPN-IPv4 or VPN-IPv6 route.
Abstract Next-Hop (ANH) address
An artificial IPv4 or IPv6 address in the GRT that represents an
address of a CE in a VRF.
Linked Address (LA)
An actual IPv4 or IPv6 address of a CE that is bound (linked) to
an ANH.
ANH proxying
A unidirectional dependency between the statuses of ANH and LA.
4. Solution Description
Consider the topology in Figure 1. CE1 and CE2 maintain external BGP
sessions with PE1 for the IPv4 and IPv6 unicast address families.
Both CEs advertise routes over these sessions, and these routes must
be reachable by CE3 through the VPN service. PE1 exports the routes
installed into VRF1 and sends them as VPN routes to PE2.
    +-----+
    | CE3 |
    +-----+
       |
       |
+-------------+
|             |
|     PE2     |
|  192.0.2.2  |
+-------------+
       |
       |
+-------------+
|             |     +-----------+
|             |     |           |.0    198.51.100.0/31     .1 +-----+
|             |     |  VRF1  AC1------------------------------| CE1 |
|             |     |           |::1      FE80::/64       ::2 +-----+
|   IP/MPLS   |-----|    PE1    |
|   network   |     | 192.0.2.1 |
|             |     |           |.2    198.51.100.2/31     .3 +-----+
|             |     |  VRF1  AC2------------------------------| CE2 |
|             |     |           |::0    2001:DB8::/127    ::1 +-----+
|             |     +-----------+
+-------------+
Figure 1: IP/MPLS network with IP VPN
Figure 2 shows the routes received from the CEs and installed into
VRF1 by PE1 on the left and VPN routes advertised by PE1 on the
right. The most interesting column here is "VPN Next-Hops". The
address of 192.0.2.1 is a primary address of PE1 which is used as a
default VPN next-hop address and as a source address of internal BGP
sessions. All routes of CE2 use the default address of PE1 as the
next-hop when exported as VPN routes. There is a special export
policy on PE1 for the internal BGP sessions that modifies next-hops
for the VPN-IPv4 routes of CE1 to the address of 192.0.2.100 and for
the VPN-IPv6 routes of CE1 to the address of ::ffff:192.0.2.200.
+-------------------+----------------+----------+
| VRF1 Routes       | VRF1 Next-Hops | VRF1 ACs |
+-------------------+----------------+----------+

+-------------------+----------------+----------+
| 203.0.113.0/25    | 198.51.100.1   | AC1      |
+-------------------+----------------+----------+
| 203.0.113.128/25  | 198.51.100.3   | AC2      |
+-------------------+----------------+----------+

+-------------------+----------------+----------+
| 2001:DB8:100::/64 | FE80::2        | AC1      |
+-------------------+----------------+----------+
| 2001:DB8:200::/64 | 2001:DB8::1    | AC2      |
+-------------------+----------------+----------+
                        |
                        |
                        v
+---------------------+----------------------+-----------+
| VPN Routes          | VPN Next-Hops        | VPN Label |
+---------------------+----------------------+-----------+

+---------------------+----------------------+-----------+
| RD:203.0.113.0/25   | 0:192.0.2.100        | 100       |
+---------------------+----------------------+-----------+
| RD:203.0.113.128/25 | 0:192.0.2.1          | 100       |
+---------------------+----------------------+-----------+

+---------------------+----------------------+-----------+
| RD:2001:DB8:100::/64| 0:::ffff:192.0.2.200 | 100       |
+---------------------+----------------------+-----------+
| RD:2001:DB8:200::/64| 0:::ffff:192.0.2.1   | 100       |
+---------------------+----------------------+-----------+
Figure 2: Export routes into VPN by PE1
PE1 advertises unicast host-specific routes for the addresses
192.0.2.1, 192.0.2.100, and 192.0.2.200 via a routing protocol. PE2
receives these routes and installs them into the GRT. PE1 also
allocates an MPLS label for the addresses mentioned above and
distributes the bindings of this label via a label distribution
protocol. PE2 receives these bindings and installs them into its
tunnel table. Thus, PE2 can resolve all VPN routes that PE1 has
sent.
Suppose that AC2 between PE1 and CE2 has failed for some reason.
When PE1 detects the failure, it invalidates all routes inside VRF1
that were used to reach the CE2 addresses, 198.51.100.2/31 and
2001:DB8::/127. Other routes that recursively use these routes to
resolve their next-hops become inactive too. As a result, PE1 starts
withdrawing the corresponding VPN routes, 203.0.113.128/25 and
2001:DB8:200::/64 (the RD is omitted). PE2 must wait for these
withdrawals before it stops sending traffic toward PE1 (traffic from
CE3 to CE2).
In another scenario, AC1 fails instead of AC2. Suppose PE1 is
configured to monitor the status of AC1 and, when AC1 fails, to
immediately update the routing protocol and the label distribution
protocol. These updates include the withdrawal of the unicast routes
for the addresses 192.0.2.100 and 192.0.2.200 (but not for 192.0.2.1)
and of the label bindings for them. In parallel, PE1 performs the
same steps as described above for the AC2 failure. PE2 eventually
receives the updates via the routing protocol, the label distribution
protocol, or both. Thanks to a hierarchical FIB, PE2 invalidates at
once all VPN routes that use the failed routes (and tunnels) to reach
their VPN next-hops. PE2 stops sending traffic to PE1 (traffic from
CE3 to CE1) even if it has not yet received any withdrawals for the
corresponding VPN routes.
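The effect of a hierarchical FIB in the AC1 scenario can be
illustrated with a minimal Python sketch. The class and method names
(NextHopEntry, HierarchicalFib, next_hop_down) are hypothetical and
are not taken from any implementation; the addresses are those of
Figure 2. The point is only that many VPN prefixes share one next-hop
object, so a single update of that object affects all of them.

```python
from dataclasses import dataclass, field


@dataclass
class NextHopEntry:
    """Shared FIB object for one BGP next-hop (hypothetical structure)."""
    address: str
    active: bool = True


@dataclass
class HierarchicalFib:
    """VPN prefixes point to a shared next-hop entry instead of copying it."""
    next_hops: dict = field(default_factory=dict)
    vpn_routes: dict = field(default_factory=dict)

    def add_vpn_route(self, prefix: str, next_hop: str) -> None:
        entry = self.next_hops.setdefault(next_hop, NextHopEntry(next_hop))
        self.vpn_routes[prefix] = entry

    def next_hop_down(self, next_hop: str) -> None:
        # One update of the shared entry; no per-prefix work is needed.
        if next_hop in self.next_hops:
            self.next_hops[next_hop].active = False

    def usable(self, prefix: str) -> bool:
        entry = self.vpn_routes.get(prefix)
        return entry is not None and entry.active


# PE2's view of the IPv4 routes from Figure 2.
fib = HierarchicalFib()
fib.add_vpn_route("203.0.113.0/25", "192.0.2.100")    # CE1 route, ANH
fib.add_vpn_route("203.0.113.128/25", "192.0.2.1")    # CE2 route, default

# AC1 fails: PE1 withdraws the host route to 192.0.2.100.
fib.next_hop_down("192.0.2.100")
```

After the call to next_hop_down, the route of CE1 is unusable on PE2
although no VPN route withdrawal has been processed, while the route
of CE2 is unaffected.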
For the sake of brevity, both scenarios are discussed without
alternative exit points for the routes inside VRF1 and with only a
couple of such routes. In real deployments, a CE can advertise many
more routes to more than one PE. In that case, the mechanism
described above can significantly improve the network convergence
time.
This document introduces a mechanism that helps notify VPN members
about an AC failure that has happened at one of these members. The
described solution expects the following:
* A hierarchical FIB MUST be supported and used by all VPN members.
* Any VPN member acting as an ingress PE MUST consider the status of
a unicast route in the GRT toward the BGP next-hop (BGP next-hop
tracking). This status MUST be considered during the BGP route
resolution and after the route is placed into the appropriate
routing table.
* It is RECOMMENDED for any VPN member acting as an ingress PE to
consider the status of a tunnel toward the BGP next-hop also
during the BGP VPN route resolution and after the route is placed
into the appropriate routing table.
The solution described in this document modifies the behavior of
egress PE routers only and can be deployed incrementally.
5. Abstract Next-Hop Address
Section 4.3.2 of [RFC4364] states:
When a PE router distributes a VPN-IPv4 route via BGP, it uses its
own address as the "BGP next hop".
In most cases, the "own address" is the address of a virtual
interface (e.g., a loopback). This address usually acts as a tunnel
endpoint for labeled traffic. The tunnel using it may be
instantiated by different mechanisms and must be capable of
forwarding MPLS traffic. The PE also uses this address as the source
address of internal BGP sessions. Due to the virtual nature of the
interface owning this address, the interface practically never fails
(unless it is brought down deliberately). Only the failure of the
whole PE makes this address unreachable.
The solution described in this document proposes using additional
next-hops for VPN routes advertised by a single PE. This alters the
behavior described in [RFC4364] and [RFC4659], which presupposes the
advertising of a single next-hop address for all VPN routes of a PE.
In the described solution, the additional next-hops advertised by a
PE are named ANH addresses. An ANH address is an artificial IPv4 or
IPv6 address that belongs to the GRT. An ANH acts as a proxy address
for an actual address of a CE residing in a VRF. The status of the
latter address influences the status of its ANH. A CE's address
selected for an ANH is named an LA. An LA may belong to the common
subnet of a PE-CE pair in a VRF or can be any other address of the
CE, but it cannot belong to the GRT. An ANH and its LA do not
necessarily belong to the same address family. For example, an IPv4
ANH may have an IPv6 LA, and vice versa. An LA can be a link-local
IPv6 address; in this case, its ANH MUST be a proxy to a triplet (LA,
AC, VRF) instead of a pair (LA, VRF), where the LA belongs to the AC
from the triplet.
Addresses installed in different VRFs may overlap. Thus, values of
ANHs may be arbitrary and do not have to be the same as their LAs.
An operator is free to choose these values according to the network
address plan. To achieve the goals stated in Section 1, values of
ANHs MUST be unique throughout the GRTs of the network. The case
when several PE routers advertise the same value for an ANH (e.g.,
anycast) is out of the scope of this document.
The ANH proxying is not a route leaking mechanism; it cannot be used
for traffic forwarding between the GRT and a VRF in any direction.
The ANH proxying creates a dependency between the statuses of an ANH
and an LA (Section 5.1). An ANH MUST be bound to only one LA. An LA
in turn MUST be bound to only one ANH. There is a strict one-to-one
mapping between them. An operator may create the ANH proxying for
any address in a VRF, but the solution expects that this address is
used as a next-hop for routes in this VRF. These routes do not
necessarily belong to the same CE that owns the LA (i.e., the LA may
be a third-party next-hop).
The ANH proxying can be described as a static host-specific route
that is installed in the GRT. The destination address of this route
is the value selected for an ANH by an operator. The next-hop
address of the static route is equal to the LA selected for the ANH.
Additionally, the operator configures a VRF (its name or index)
directly for this static route. The VRF indicates where the next-hop
of the route must be resolved. For unlabeled traffic coming to a PE
via the GRT, the static route acts as a route to the bit bucket.
This document does not restrict implementations to this mechanism.
5.1. Status of the Abstract Next-Hop
The status of an ANH depends on the existence of a route to its LA.
This route MUST be present in a VRF associated with the ANH. The ANH
is considered active if and only if the route to its LA is active and
is available for traffic forwarding. The proposed solution does not
restrict the type of this route, but it MUST support at least direct
routes and static routes. An implementation MAY filter the protocols
used for resolution of LAs by a configuration policy.
The dependency is unidirectional: only the status of an LA defines
the status of its ANH.
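The status rule can be sketched in Python as follows. This is an
illustration under simplifying assumptions, not a definitive
implementation: the types VrfRoute and AnhBinding, the function
anh_is_active, and the per-VRF route lists are hypothetical, and any
active covering route is accepted instead of performing a real
longest-prefix lookup. For a link-local LA, the route must also
belong to the AC of the (LA, AC, VRF) triplet.

```python
import ipaddress
from typing import NamedTuple, Optional


class VrfRoute(NamedTuple):
    """A route in a VRF (hypothetical, simplified)."""
    prefix: str
    active: bool
    ac: Optional[str] = None   # attachment circuit of the route, if any


class AnhBinding(NamedTuple):
    """One-to-one binding between an ANH in the GRT and an LA in a VRF."""
    anh: str
    la: str
    vrf: str
    ac: Optional[str] = None   # needed when the LA is link-local


def anh_is_active(binding, vrf_tables, manually_disabled=False):
    """Return True if an active route in the VRF covers the LA."""
    if manually_disabled:
        return False
    la = ipaddress.ip_address(binding.la)
    for route in vrf_tables.get(binding.vrf, []):
        net = ipaddress.ip_network(route.prefix)
        if net.version != la.version or la not in net:
            continue
        if la.is_link_local and route.ac != binding.ac:
            continue   # link-local LA: the AC must match as well
        if route.active:
            return True
    return False


# Direct routes of VRF1 on PE1, following Figure 1.
tables = {"VRF1": [
    VrfRoute("198.51.100.0/31", True, "AC1"),
    VrfRoute("fe80::/64", True, "AC1"),
    VrfRoute("198.51.100.2/31", True, "AC2"),
    VrfRoute("2001:db8::/127", True, "AC2"),
]}
v4 = AnhBinding("192.0.2.100", "198.51.100.1", "VRF1")
v6 = AnhBinding("192.0.2.200", "fe80::2", "VRF1", "AC1")
before = (anh_is_active(v4, tables), anh_is_active(v6, tables))

# AC1 fails: its direct routes become inactive.
tables["VRF1"] = [r._replace(active=False) if r.ac == "AC1" else r
                  for r in tables["VRF1"]]
after = (anh_is_active(v4, tables), anh_is_active(v6, tables))
```

Nothing in the sketch propagates a status from the ANH back to the LA,
which reflects the unidirectional nature of the dependency.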
An implementation MAY support the option of deactivation of an ANH
manually by an operator.
Besides the dependency on a route toward an LA, an ANH MAY be a
client of any mechanism of active monitoring of the LA. This can be
any next-hop tracking mechanism (ARP or ICMP probes, if they are
applicable to the LA) or a BFD [RFC5880] session between the CE that
owns the LA and the PE that owns the ANH.
5.2. Distribution of the Abstract Next-Hop
In general, the distribution of ANHs by means of a routing protocol
does not differ from the distribution of any other addresses that are
considered to be BGP next-hops.
An ANH SHOULD be advertised by a routing protocol. In this case, the
following conditions MUST be met:
* The status of the ANH is active (Section 5.1).
* The ANH is advertised as a host-specific route.
* This route is reachable by at least a subset of PEs via the
routing protocol.
* These PEs import VPN routes with a BGP next-hop address equal to
the ANH (covered by the route) in appropriate VRFs and use these
VPN routes for traffic forwarding.
This solution does not restrict the type of the routing protocol for
ANH routes distribution.
When the status of an ANH changes from active to inactive, the PE
MUST notify the other PEs receiving a route to this ANH. The speed
of origination and propagation of this notification is crucial.
PEs that received a route to an ANH act according to the standard
procedures that are applicable to the routing protocol. This
solution does not modify this behavior.
5.3. Tunnels to the Abstract Next-Hop
A PE may have one or several ANHs and distributes them as per
Section 5.2. In that case, according to Section 5 of [RFC4364] there
MUST be a tunnel for every such address of the PE.
This solution does not restrict the type of tunnels that point to
ANHs, but these tunnels MUST forward MPLS traffic. However, an
implementation of an egress tunnel endpoint may require some changes
to support a point-to-point tunnel (e.g., RSVP-TE LSP [RFC3209] or IP
GRE [RFC4032]) to an ANH. These changes are out of the scope of this
document.
The solution does not consider in detail using tunneling technologies
other than MPLS LSPs for transport of labeled VPN traffic. The rest
of the section is applicable to MPLS LSPs only.
For all ANHs with LAs that belong to the same VRF, a PE MUST allocate
the same label. A PE MAY allocate a single label for all ANHs (e.g.,
implicit label).
When a PE allocates a label for an ANH, it MUST associate a release
timer with this label. If the status of the ANH changes to inactive,
the PE starts the release timer for the label. While the timer is
running, if the PE receives traffic with this label, it MUST continue
to handle this traffic as if the failure had not happened. When the
timer expires, the PE starts freeing the resources associated with
the label. This timer does not influence the generation and
advertising of the failure notification via the label distribution
protocol. An implementation SHOULD support manual configuration of
the release timer (including a value of zero). If a label is
allocated for a group of ANHs, a PE starts the timer if and only if
the last active address of the group becomes inactive.
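The release timer logic for a label shared by a group of ANHs can be
sketched as follows. The class LabelGroup, the label value 3001, and
the 30-second timer are hypothetical; time is passed in explicitly so
that the example needs no real clock. Cancelling the timer when an
ANH becomes active again is an assumption of this sketch and is not
specified by this document.

```python
class LabelGroup:
    """One label shared by a group of ANHs, with a release timer (sketch)."""

    def __init__(self, label, anhs, release_seconds):
        self.label = label
        self.active = {anh: True for anh in anhs}
        self.release_seconds = release_seconds
        self.release_deadline = None   # None means the timer is not running
        self.released = False

    def set_status(self, anh, is_active, now):
        self.active[anh] = is_active
        if any(self.active.values()):
            # Assumption: an active ANH in the group cancels the timer.
            self.release_deadline = None
        elif self.release_deadline is None:
            # The last active ANH of the group has become inactive.
            self.release_deadline = now + self.release_seconds

    def accepts_traffic(self, now):
        """Traffic is handled as before until the timer expires."""
        if self.release_deadline is not None and now >= self.release_deadline:
            self.released = True   # resources may now be freed
        return not self.released


group = LabelGroup(3001, ["192.0.2.100", "192.0.2.200"], release_seconds=30)
group.set_status("192.0.2.100", False, now=0)
deadline_after_first = group.release_deadline    # timer not started yet
group.set_status("192.0.2.200", False, now=5)
deadline_after_last = group.release_deadline     # 5 + 30
```

With these values, traffic carrying the label is still handled at
time 20 and is no longer handled from time 35 onward.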
When a PE advertises a label binding to an ANH, this MUST be done
either by a label distribution protocol in parallel with the
advertising of the ANH via a routing protocol (Section 5.2), or by
sending the ANH as a labeled route (e.g., BGP-LU [RFC8277]).
When the status of an ANH (Section 5.1) changes to inactive and a
label binding to this ANH was advertised by a PE via the label
distribution protocol, the PE MUST notify the other routers receiving
the label for this ANH. The speed of origination of such a
notification and of its propagation is also important: this
notification may be received before the notification via the routing
protocol, or it may be the only notification channel (Section 9.4).
Routers that received a label binding for an ANH act according to the
standard procedures that are applicable to the label distribution
protocol. The proposed solution does not change the behavior of
ingress LERs or LSRs.
6. Distribution of VPN Routes
The solution that is proposed in this document is only applicable to
VPN-IPv4 and VPN-IPv6 routes (i.e., SAFI 128). Using any other
routes with ANHs is out of the scope of this document.
For a group of routes installed in a VRF and sharing a common
next-hop address, an operator MAY set up an ANH as the next-hop of
the corresponding VPN routes. The remaining routes from the same VRF
(if any) MUST be advertised according to the procedures of [RFC4364]
or [RFC4659], if they are to be advertised at all.
An ANH for VPN-IPv4 routes is encoded according to Section 4.3.2 of
[RFC4364] as a VPN-IPv4 address with an RD of 0.
An ANH for VPN-IPv6 routes is encoded according to Section 3.2.1 of
[RFC4659] as a VPN-IPv6 address. This VPN-IPv6 address contains an
RD of 0 and an IPv6 address that is equal to the ANH. When the ANH
is an IPv4 address, it is carried in the VPN-IPv6 address as an
IPv4-mapped IPv6 address. The procedures for including a link-local
address are not altered by this solution.
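The two encodings can be illustrated with a short Python sketch that
uses only the standard ipaddress module. The function names are
hypothetical. The sketch covers only the RD of 0 followed by the
address and omits the optional link-local address of [RFC4659].

```python
import ipaddress

ZERO_RD = bytes(8)   # an RD of 0 occupies eight octets


def encode_vpn_ipv4_next_hop(anh: str) -> bytes:
    """VPN-IPv4 next-hop: RD of 0 followed by the IPv4 ANH (12 octets)."""
    return ZERO_RD + ipaddress.IPv4Address(anh).packed


def encode_vpn_ipv6_next_hop(anh: str) -> bytes:
    """VPN-IPv6 next-hop: RD of 0 followed by an IPv6 address (24 octets).

    An IPv4 ANH is carried as an IPv4-mapped IPv6 address (::ffff:a.b.c.d).
    """
    address = ipaddress.ip_address(anh)
    if address.version == 4:
        address = ipaddress.IPv6Address("::ffff:" + anh)
    return ZERO_RD + address.packed


# The ANHs of CE1 from Figure 2.
nh_v4 = encode_vpn_ipv4_next_hop("192.0.2.100")
nh_v6 = encode_vpn_ipv6_next_hop("192.0.2.200")
```

Here nh_v4 corresponds to the next-hop shown as 0:192.0.2.100 in
Figure 2, and nh_v6 to 0:::ffff:192.0.2.200.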
According to Section 4.3 of [RFC4364], routes that are installed in a
VRF are converted to VPN routes (this statement is applicable to both
address families), and "exported" to BGP. This solution assumes that
all VPN routes are installed into the VPN Loc-RIB with a next-hop
address that is equal to the own address of the PE where this VRF is
configured. In other words, the solution does not modify the
procedures for converting routes from VRFs to VPN routes.
All routes in a Loc-RIB are processed into the appropriate
Adj-RIBs-Out according to configured policies (Section 9.1.3 of
[RFC4271]). The solution expects that there MUST be a special export
policy that applies to routes passing from the VPN Loc-RIB to the VPN
Adj-RIBs-Out and that is processed in the chain before all policies
configured by an operator (if there are any). This special export
policy modifies next-hop addresses only for those routes that are
configured to be sent with ANHs (or a single ANH).
A PE does not check the presence of a route to an ANH in the GRT
before copying VPN routes from a Loc-RIB into the corresponding
Adj-RIB-Out or during the Update-Send Process (Section 9.2 of
[RFC4271]). When the status of an ANH (Section 5.1) changes to
inactive, a PE does not start withdrawing the VPN routes that use
this ANH as their next-hop. This prevents churn when an operator
performs network maintenance and manually disables the ANH. On the
other hand, deleting a binding between an ANH and its LA MUST trigger
a change of the corresponding next-hop addresses in the Adj-RIBs-Out
to the default value (the value from the Loc-RIB).
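The behavior of the special export policy can be sketched as a pure
function. The function export_next_hop, the dictionary-based route
representation, and the binding table keyed by (VRF, LA) are
hypothetical choices made for this illustration only.

```python
def export_next_hop(route, anh_bindings, default_next_hop):
    """Return the next-hop to place in the Adj-RIB-Out for a VPN route.

    route: dict with the VRF name and the next-hop the route had in the VRF.
    anh_bindings: maps (vrf, linked address) to the configured ANH.
    """
    key = (route["vrf"], route["vrf_next_hop"])
    # The status of the ANH is deliberately not checked here: an inactive
    # ANH triggers neither a withdrawal nor a next-hop change.
    return anh_bindings.get(key, default_next_hop)


DEFAULT = "192.0.2.1"   # the own address of PE1 (value from the Loc-RIB)
bindings = {("VRF1", "198.51.100.1"): "192.0.2.100"}

ce1_route = {"prefix": "203.0.113.0/25", "vrf": "VRF1",
             "vrf_next_hop": "198.51.100.1"}
ce2_route = {"prefix": "203.0.113.128/25", "vrf": "VRF1",
             "vrf_next_hop": "198.51.100.3"}

with_binding = (export_next_hop(ce1_route, bindings, DEFAULT),
                export_next_hop(ce2_route, bindings, DEFAULT))

# Deleting the binding reverts the next-hop to the default value.
del bindings[("VRF1", "198.51.100.1")]
without_binding = export_next_hop(ce1_route, bindings, DEFAULT)
```

Only the presence of a binding matters in this sketch, which mirrors
the rule that a status change of the ANH alone does not alter the
Adj-RIBs-Out.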
An implementation MAY support an option of selecting distinct Adj-
RIBs-Out where VPN routes will be placed with ANHs.
7. Forwarding
For an ingress PE, it is impossible to determine whether the next-hop
address of a received VPN route is a regular address or an abstract
one. The ingress PE treats every VPN next-hop address as the address
of a standalone egress router even if a group of VPN next-hop
addresses belongs to the same device. When it has an active route
and a tunnel to a BGP next-hop address, the ingress PE encapsulates
and sends traffic via the tunnel according to Section 5 of [RFC4364].
If an egress PE receives MPLS traffic with a label that was allocated
for one of its ANHs, the solution expects the following (other cases
are out of the scope of this document):
* This label MUST NOT have the Bottom of Stack bit [RFC3032] set.
* At the bottom of the received stack there MUST be a label that was
allocated by the egress PE.
* There MAY be other labels between these two labels (e.g., entropy
labels [RFC6790]).
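The expectations above can be sketched as a check on a received label
stack. The 32-bit label stack entry layout (20-bit label, 3-bit
traffic class, Bottom of Stack bit, 8-bit TTL) follows [RFC3032]; the
function names, the helper entry(), and the ANH label value 3001 are
hypothetical, while 100 is the VPN label from Figure 2.

```python
import struct


def parse_label_stack(data: bytes):
    """Decode label stack entries up to and including the bottom entry."""
    entries = []
    for offset in range(0, len(data) - 3, 4):
        (word,) = struct.unpack_from("!I", data, offset)
        label = word >> 12
        bottom = bool((word >> 8) & 1)
        entries.append((label, bottom))
        if bottom:
            break
    return entries


def stack_is_acceptable(entries, anh_labels, service_labels):
    """Check the expectations for traffic received with an ANH label."""
    if not entries or entries[0][0] not in anh_labels:
        return False      # other cases are out of scope here
    if entries[0][1]:
        return False      # the ANH label must not be the bottom of stack
    last_label, last_is_bottom = entries[-1]
    return last_is_bottom and last_label in service_labels


def entry(label, bottom, ttl=64):
    """Build one label stack entry (traffic class bits left at zero)."""
    return struct.pack("!I", (label << 12) | (int(bottom) << 8) | ttl)


good = parse_label_stack(entry(3001, False) + entry(100, True))
bad = parse_label_stack(entry(3001, True))
```

Labels between the top and bottom entries are not inspected, which
leaves room for entries such as entropy labels.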
8. Failure Detection
8.1. Egress PE
A PE detects the failure of a connected CE by different mechanisms.
These mechanisms are not considered in this document. The net effect
of the failure is the unreachability of routes to addresses (or of a
route to a single address) of the failed CE in the VRF where an AC to
the CE resides. The PE usually uses these routes to recursively
resolve next-hops for other routes in the VRF (which are also usually
advertised by the CE). The PE looks for other ways to resolve the
next-hops of all failed routes; if there are none, it starts deleting
the failed routes from the VRF.
Once the PE has detected that the CE has failed, and if one of the
addresses of the CE is an LA, the PE immediately deactivates the ANH
of this LA. If a route to the ANH was distributed as per Section 5.2,
the PE notifies all neighbors of the routing protocol. If the ANH
was also bound to a label and this label was distributed via a label
distribution protocol, the PE notifies all neighbors of the label
distribution protocol.
The PE may start distributing updates via BGP VPN sessions notifying
its peers that the routes in the VRF are no longer reachable. This
process is independent of the process described above, and the
solution does not modify it. However, if the route to the ANH was
distributed by BGP via the same set of sessions that are used for VPN
route distribution, an implementation SHOULD schedule sending of the
UPDATE message with the ANH's withdrawal prior to the UPDATE messages
with the failed VPN routes.
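One possible way to realize this ordering is a stable sort of the
pending messages, sketched below. The tuple representation and the
tags "anh_withdraw" and "vpn_withdraw" are hypothetical; a real BGP
implementation would schedule its output queues in its own way.

```python
def schedule_updates(pending):
    """Order pending UPDATE messages so that ANH withdrawals go first.

    pending: list of (kind, payload) tuples, where kind is a hypothetical
    tag. sorted() is stable, so the relative order within each class of
    messages is preserved.
    """
    return sorted(pending, key=lambda item: item[0] != "anh_withdraw")


pending = [
    ("vpn_withdraw", "203.0.113.0/25"),
    ("anh_withdraw", "192.0.2.100/32"),
    ("vpn_withdraw", "2001:DB8:100::/64"),
]
ordered = schedule_updates(pending)
```

The withdrawal of the ANH is moved to the head of the queue, and the
VPN route withdrawals follow in their original order.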
8.2. Ingress PE
As stated in Section 4, the proposed solution expects an ingress PE
to consider the status of a tunnel toward a BGP VPN next-hop. Thus,
when the status of the tunnel changes to inactive, the ingress PE
simultaneously deactivates all VPN routes with a next-hop equal to
the address of the tunnel's endpoint. If the ingress PE does not
follow this logic, the solution expects that the status of a route
toward a BGP VPN next-hop in the GRT is used in the same way. The
ingress PE can apply both procedures. In any case, the ingress PE
can react to the failure of a remote CE (a CE connected to a remote
PE in the same VPN) or of an AC to the CE without waiting for the BGP
UPDATE messages that withdraw the VPN routes pointing to this CE.
The ingress PE may activate backups for these routes and redirect
traffic to them.
9. Deployment Considerations
9.1. Scalability
The requirement to have a tunnel to every next-hop address that a PE
uses to advertise VPN routes may pose scalability concerns. The
following suggestions may help to deploy the described solution in a
scalable way:
* Use multipoint-to-point tunnels to reach VPN next-hop addresses.
This reduces state on a PE that advertises VPN routes with these
next-hops, because the state no longer depends on the number of
upstream routers. It also reduces state on intermediate routers
when the tunnels are LSPs.
* Selective installation of routes and tunnels helps spend resources
only for VPN next-hop addresses that an ingress PE will use. If
the ingress PE does not import a VPN route it is not always
necessary to have a tunnel towards a next-hop address of this
route. In the case of ANHs, having such tunnels is even more
questionable.
* Do not use the ANH proxying for every possible CE address in every
possible VRF. There are a lot of cases when the traditional VPN
convergence is good enough.
* A PE may advertise a common label for all its ANHs if tunnels
towards them are supposed to be LSPs. In this case, it is
expected that the ANHs also share a common value for the release
timer (Section 5.3). Using different values for the release timer
may require label deaggregation.
9.2. Using the Abstract Next-Hops
Deploying of the ANH solution should be considered on a per-service
basis. The following points may help to decide whether an ANH is
appropriate:
* Most VPN services exchange a small number of routes: hundreds or
several thousand. Usually, any single CE contributes a small
portion of them, and its failure can be noticed and repaired by the
network in a reasonable time. If, however, a CE advertises a
significant number of routes (tens of thousands or more), the
restoration after the failure of the CE may exceed the expected
convergence time.
* Independently of the number of routes that a CE advertises, the
existence of an extra exit point should be considered. Improving
the switchover time requires a point to switch over to.
* Sometimes it may be desirable to stop traffic flowing through a
network after a destination CE fails, even if there is no extra
exit point. This is the case, for example, when a considerable
amount of traffic is destined for the failed CE.
* Some services require special treatment due to stricter SLAs;
using an ANH may help to achieve these SLAs.
9.3. Failure Detection
The detection time of a CE failure or of a failure of the link (or
AC) to the CE deserves attention because it contributes significantly
to the overall convergence time. Sometimes a link failure cannot be
noticed by a loss of signal, and extra mechanisms are required for
this task. Some of these mechanisms interact with a session of a
PE-CE protocol. When such a mechanism detects the failure, the
routes distributed by the associated PE-CE protocol session become
inactive. At the same time, the routes (or a single route) toward
the addresses of the CE are usually not distributed by the PE-CE
protocol session. They may be direct or static routes. Thus, the
routes toward the addresses of the CE are not affected by the
detection mechanism described above and stay active. If one of the
CE's addresses is an LA, the ANH that is proxying to the LA also
stays active. To prevent such behavior, it is recommended to detect
the failure of an address of the CE or of a route to this address
(on both the PE and the CE, especially if the CE is multihomed). In
other words, in the described case it is better to monitor the
next-hop of the routes distributed by the PE-CE protocol than the
session of the PE-CE protocol.
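The difference between the two monitoring targets can be shown with a
small sketch. It assumes a deliberately simplified model in which the
status of an ANH follows only the route to its LA. The names PeView
and report_failure and the "session"/"address" labels are hypothetical
and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass
class PeView:
    session_up: bool = True       # PE-CE protocol session state
    la_route_active: bool = True  # direct or static route to the LA

    @property
    def anh_active(self) -> bool:
        # Assumption: the ANH status simply follows the route to its LA.
        return self.la_route_active


def report_failure(view: PeView, monitored: str) -> None:
    """Apply a detected CE or AC failure to the PE state.

    monitored == "session": the detector is bound to the PE-CE session.
    monitored == "address": the detector tracks the CE address (the LA).
    """
    if monitored == "session":
        view.session_up = False
    elif monitored == "address":
        view.la_route_active = False
    else:
        raise ValueError(f"unknown monitor target: {monitored}")


# Detector bound to the session: the session goes down, but the route
# to the LA is untouched, so the ANH stays active.
session_bound = PeView()
report_failure(session_bound, "session")

# Detector tracking the CE address: the route to the LA becomes
# inactive, and so does the ANH.
address_bound = PeView()
report_failure(address_bound, "address")
```

In the first case the ingress PEs keep sending traffic towards the
ANH until the VPN routes are withdrawn, which is the behavior the
recommendation above is meant to avoid.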
9.4. Routes Aggregation
Routes advertised by a routing protocol can be aggregated at some
points of a network (e.g., ABRs). Such aggregation may hide the
deactivation of an ANH from the rest of the network because it
removes the notification channel that the routing protocol supplies.
A label distribution protocol can provide this notification channel
when it is used for the distribution of labels for ANHs. In this
case, however, all VPN members are required to consider the status
of tunnels toward ANHs as described in Section 4.
If LDP [RFC5036] is used as the label distribution protocol for ANHs,
the following steps should be considered:
* The LDP extension for Inter-Area LSPs [RFC5283] must be used
throughout the network.
* If LDP is configured in Downstream Unsolicited mode, it must also
be configured in Ordered Control mode. An LDP LSR in Independent
Control mode on a path to an ANH will break the notification
channel.
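The two LDP conditions above can be expressed as a simple check over
the LSRs on a path to an ANH. This is only a sketch: the LdpLsr
record, its field names, and the violations function are hypothetical,
and the check covers only the cases that the list above states
explicitly (nothing is assumed about Downstream on Demand mode).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LdpLsr:
    name: str
    advertisement: str  # "DU" (Downstream Unsolicited) or "DoD"
    control: str        # "ordered" or "independent"
    inter_area: bool    # RFC 5283 extension enabled


def violations(path):
    """List the LSRs on the path that break the notification channel."""
    problems = []
    for lsr in path:
        if not lsr.inter_area:
            problems.append(f"{lsr.name}: RFC 5283 extension is not enabled")
        if lsr.advertisement == "DU" and lsr.control != "ordered":
            problems.append(
                f"{lsr.name}: Downstream Unsolicited without Ordered Control"
            )
    return problems


# Hypothetical path from an ingress PE towards an ANH.
path = [
    LdpLsr("P1", "DU", "ordered", True),
    LdpLsr("ABR1", "DU", "independent", True),
    LdpLsr("P2", "DU", "ordered", True),
]
problems = violations(path)
```

In this example only ABR1 is reported, because it runs Independent
Control mode together with Downstream Unsolicited mode.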
10. Multicast Considerations
Section 7 of [RFC6514] introduces the VRF Route Import EC.
Section 5.1.3 of [RFC6513] describes how the Upstream PE is selected
when a unicast VPN route does not carry this community:
If a route does not have a VRF Route Import Extended Community,
the route's Upstream PE is determined from the route's BGP Next
Hop.
The solution described in this document expects that unicast VPN
routes of a VPN service may be sent by a PE with different BGP
next-hop addresses. This may create an issue with the importing of
C-Multicast routes if this VPN service also acts as an MVPN and does
not mark its VPN routes with the VRF Route Import EC. It is not
recommended to configure a new import Route Target EC for every ANH.
Instead, there are two possible ways to mitigate the described
problem:
* Use the procedures described in Section 10 of [RFC6514].
* Do not use ANHs for unicast VPN routes that cover multicast
sources or RP addresses.
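The following sketch illustrates why routes without the VRF Route
Import EC are affected. It follows the selection rule of
Section 5.1.3 of [RFC6513]: the Upstream PE is taken from the VRF
Route Import EC when the route carries one, and from the BGP next-hop
otherwise. The UnicastVpnRoute record, the upstream_pe function, the
"address:number" text form of the EC, and all addresses are
assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UnicastVpnRoute:
    prefix: str
    bgp_next_hop: str
    # Assumed text form of the VRF Route Import EC: "<PE address>:<number>".
    vrf_route_import: Optional[str] = None


def upstream_pe(route: UnicastVpnRoute) -> str:
    """Select the Upstream PE address for a unicast VPN route."""
    if route.vrf_route_import is not None:
        # The PE address is the part before the last colon.
        return route.vrf_route_import.rsplit(":", 1)[0]
    return route.bgp_next_hop


# The same PE (192.0.2.1) advertises one route with its own address
# and one route with an ANH (192.0.2.101) as the BGP next-hop.
plain = UnicastVpnRoute("10.1.0.0/16", "192.0.2.1")
via_anh = UnicastVpnRoute("10.2.0.0/16", "192.0.2.101")
via_anh_marked = UnicastVpnRoute("10.2.0.0/16", "192.0.2.101", "192.0.2.1:7")
```

Without the EC, upstream_pe(via_anh) yields the ANH rather than the
address of the PE, while the marked route still resolves to the PE
itself.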
11. Security Considerations
This document specifies extensions for the advertisement of VPN
routes with different next-hops by a single PE. From this point of
view, the security considerations described in [RFC4364] and
[RFC4659] are equally applicable to the extensions described in this
document.
12. IANA Considerations
This document has no IANA actions.
13. References
13.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC3031] Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol
Label Switching Architecture", RFC 3031,
DOI 10.17487/RFC3031, January 2001,
<https://www.rfc-editor.org/info/rfc3031>.
[RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
Encoding", RFC 3032, DOI 10.17487/RFC3032, January 2001,
<https://www.rfc-editor.org/info/rfc3032>.
[RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
Border Gateway Protocol 4 (BGP-4)", RFC 4271,
DOI 10.17487/RFC4271, January 2006,
<https://www.rfc-editor.org/info/rfc4271>.
[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
2006, <https://www.rfc-editor.org/info/rfc4364>.
[RFC4659] De Clercq, J., Ooms, D., Carugi, M., and F. Le Faucheur,
"BGP-MPLS IP Virtual Private Network (VPN) Extension for
IPv6 VPN", RFC 4659, DOI 10.17487/RFC4659, September 2006,
<https://www.rfc-editor.org/info/rfc4659>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
13.2. Informative References
[I-D.ietf-rtgwg-bgp-pic]
Bashandy, A., Filsfils, C., and P. Mohapatra, "BGP Prefix
Independent Convergence", Work in Progress, Internet-
Draft, draft-ietf-rtgwg-bgp-pic-18, 9 April 2022,
<https://www.ietf.org/archive/id/draft-ietf-rtgwg-bgp-pic-
18.txt>.
[RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001,
<https://www.rfc-editor.org/info/rfc3209>.
[RFC4032] Camarillo, G. and P. Kyzivat, "Update to the Session
Initiation Protocol (SIP) Preconditions Framework",
RFC 4032, DOI 10.17487/RFC4032, March 2005,
<https://www.rfc-editor.org/info/rfc4032>.
[RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,
February 2006, <https://www.rfc-editor.org/info/rfc4360>.
[RFC5036] Andersson, L., Ed., Minei, I., Ed., and B. Thomas, Ed.,
"LDP Specification", RFC 5036, DOI 10.17487/RFC5036,
October 2007, <https://www.rfc-editor.org/info/rfc5036>.
[RFC5283] Decraene, B., Le Roux, JL., and I. Minei, "LDP Extension
for Inter-Area Label Switched Paths (LSPs)", RFC 5283,
DOI 10.17487/RFC5283, July 2008,
<https://www.rfc-editor.org/info/rfc5283>.
[RFC5880] Katz, D. and D. Ward, "Bidirectional Forwarding Detection
(BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010,
<https://www.rfc-editor.org/info/rfc5880>.
[RFC6513] Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/
BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February
2012, <https://www.rfc-editor.org/info/rfc6513>.
[RFC6514] Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
Encodings and Procedures for Multicast in MPLS/BGP IP
VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012,
<https://www.rfc-editor.org/info/rfc6514>.
[RFC6790] Kompella, K., Drake, J., Amante, S., Henderickx, W., and
L. Yong, "The Use of Entropy Labels in MPLS Forwarding",
RFC 6790, DOI 10.17487/RFC6790, November 2012,
<https://www.rfc-editor.org/info/rfc6790>.
[RFC8277] Rosen, E., "Using BGP to Bind MPLS Labels to Address
Prefixes", RFC 8277, DOI 10.17487/RFC8277, October 2017,
<https://www.rfc-editor.org/info/rfc8277>.
Acknowledgments
The author would like to thank Roman Peshekhonov for his review and
valuable input.
Author's Address
I. Malyushkin
Independent Contributor
Email: gmalyushkin@gmail.com