BGP Enabled ServiceS                                       I. Malyushkin
Internet-Draft                                   Independent Contributor
Intended status: Standards Track                          17 August 2022
Expires: 18 February 2023


                 Abstract next-hop addresses in IP VPNs
           draft-malyushkin-bess-ip-vpn-abstract-next-hops-00

Abstract

   This document discusses the IP VPN convergence aspects and specifies
   procedures for IP VPN to signal the attachment circuit failure.  The
   specified procedures help significantly improve the IP VPN
   convergence.

About This Document

   This note is to be removed before publishing as an RFC.

   Status information for this document may be found at
   https://datatracker.ietf.org/doc/draft-malyushkin-bess-ip-vpn-
   abstract-next-hops/.

   Discussion of this document takes place on the BGP Enabled ServiceS
   Working Group mailing list (mailto:bess@ietf.org), which is archived
   at https://mailarchive.ietf.org/arch/browse/bess/.  Subscribe at
   https://www.ietf.org/mailman/listinfo/bess/.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 18 February 2023.






Malyushkin              Expires 18 February 2023                [Page 1]

Internet-Draft         IP VPNs abstract next-hops            August 2022


Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
   2.  Conventions and Definitions . . . . . . . . . . . . . . . . .   4
   3.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   4
   4.  Solution Description  . . . . . . . . . . . . . . . . . . . .   5
   5.  Abstract Next-Hop Address . . . . . . . . . . . . . . . . . .   8
     5.1.  Status of the Abstract Next-Hop . . . . . . . . . . . . .   9
     5.2.  Distribution of the Abstract Next-Hop . . . . . . . . . .  10
     5.3.  Tunnels to the Abstract Next-Hop  . . . . . . . . . . . .  10
   6.  Distribution of VPN Routes  . . . . . . . . . . . . . . . . .  12
   7.  Forwarding  . . . . . . . . . . . . . . . . . . . . . . . . .  13
   8.  Failure Detection . . . . . . . . . . . . . . . . . . . . . .  13
     8.1.  Egress PE . . . . . . . . . . . . . . . . . . . . . . . .  13
     8.2.  Ingress PE  . . . . . . . . . . . . . . . . . . . . . . .  14
   9.  Deployment Considerations . . . . . . . . . . . . . . . . . .  14
     9.1.  Scalability . . . . . . . . . . . . . . . . . . . . . . .  14
     9.2.  Using the Abstract Next-Hops  . . . . . . . . . . . . . .  15
     9.3.  Failure Detection . . . . . . . . . . . . . . . . . . . .  16
     9.4.  Routes Aggregation  . . . . . . . . . . . . . . . . . . .  16
   10. Multicast Considerations  . . . . . . . . . . . . . . . . . .  17
   11. Security Considerations . . . . . . . . . . . . . . . . . . .  17
   12. IANA Considerations . . . . . . . . . . . . . . . . . . . . .  17
   13. References  . . . . . . . . . . . . . . . . . . . . . . . . .  17
     13.1.  Normative References . . . . . . . . . . . . . . . . . .  17
     13.2.  Informative References . . . . . . . . . . . . . . . . .  18
   Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . .  19
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  19

1.  Introduction

   Neither IP VPN [RFC4364] nor IPv6 VPN [RFC4659] has a mass route
   withdrawal mechanism.  The failure of a connection to a CE forces a
   PE to withdraw all affected VPN routes instead of notifying other PE
   routers about the attachment circuit failure.  The withdrawals may be
   packed into one or more BGP UPDATE messages and then disseminated
   through the network.  Depending on the BGP topology, these messages
   may be further processed and replicated by intermediate nodes (e.g.,
   route reflectors).  In general, every affected route must be
   withdrawn from all interested parties.  The number of failed routes
   impacts the convergence time: more routes require more time.  A
   sophisticated intermediate BGP topology may also negatively affect
   this time.

   Network convergence speed is important.  There is a potential
   traffic loss that lasts until the failure notification (BGP UPDATE
   messages) reaches other members participating in the affected VPN
   service (i.e., routers using the affected VPN routes for traffic
   forwarding).  Moreover, this loss happens at the egress point where
   the failed CE router is connected to the network, after the traffic
   has traversed the whole path.

   The BGP PIC edge mechanism [I-D.ietf-rtgwg-bgp-pic] avoids this
   traffic loss while the network is converging.  It depends on the
   availability of an extra exit point for every affected route.  When
   the CE router is connected to a pair of PE routers (i.e., it is
   multihomed) and a link between the CE and one of these PEs fails,
   all affected traffic can be redirected by this PE toward the other
   one.  On the other hand, the BGP PIC edge, when it is active at
   egress, is associated with sub-optimal routing.  Traffic from an
   ingress PE must follow the path toward the egress PE where the
   failed link with the CE is attached.  Then this egress PE redirects
   the traffic toward another PE thanks to the pre-installed backup
   records.  Such tromboning can negatively influence traffic
   characteristics (delay, loss rate, etc.).

   Another problem with the BGP PIC edge at egress is a possible routing
   loop.  Suppose a CE router is connected to a pair of PE routers and
   advertises a set of routes to them.  These PEs install the routes
   and propagate them via internal BGP VPN sessions.  Both PEs receive
   these routes via a PE-CE protocol and the internal BGP VPN sessions.
   The routes received via the internal BGP VPN sessions are used as
   backups for the routes received via the PE-CE protocol.  When the CE
   fails, the PE routers activate their backups and send traffic to
   each other until the TTL reaches zero.

   The BGP PIC edge mechanism is a transient solution.  As soon as all
   VPN members are notified about the unreachability of all affected VPN
   routes, traffic will be sent to the extra exit point in an optimal
   way or it will be dropped at ingress.  The goal of the solution
   described in this document is to decrease the time required for all
   VPN members to become aware of the failure and thus to reduce the
   time during which the BGP PIC edge is active.  This solution does
   not replace the BGP PIC edge and can be applied to networks in
   parallel with it.  It is recommended to combine them.

   Even if destinations that were advertised by a failed CE lack
   alternatives, the reaction time of the network may be important.
   Imagine that the CE advertises a huge number of routes and attracts a
   considerable amount of traffic, but for some reason these routes do
   not have an alternative exit point.  Until all other members of the
   VPN service are aware of the failure, traffic will flow through the
   network in vain.  The solution described in this document will
   significantly reduce this time.

   This document refers to [RFC4364] whenever the logic of the latter
   is applicable to both address families.  [RFC4659] is referred to
   explicitly only where it introduces new logic.

2.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Terminology

   The document uses the terminology defined in [RFC4271], [RFC4364],
   [RFC6513], and [RFC3031].

   AC
      Attachment Circuit.

   GRT
      Global Routing Table.

   EC
      BGP Extended Community [RFC4360].

   BGP VPN route
      Either a VPN-IPv4 or VPN-IPv6 route.

   Abstract Next-Hop (ANH) address
      An artificial IPv4 or IPv6 address in the GRT that represents an
      address of a CE in a VRF.

   Linked Address (LA)
      An actual IPv4 or IPv6 address of a CE that is bound (linked) to
      an ANH.

   ANH proxying
      A unidirectional dependency between the statuses of an ANH and
      its LA.

4.  Solution Description

   Consider the topology in Figure 1.  CE1 and CE2 maintain external BGP
   sessions with PE1 for IPv4 and IPv6 unicast address families.  Both
   CEs send routes via these sessions, and these routes must be
   reachable by CE3 through the VPN service.  PE1 exports routes
   installed into VRF1 and sends them as VPN routes to PE2.

      +-----+
      | CE3 |
      +-----+
         |
         |
  +-------------+
  |             |
  |     PE2     |
  |  192.0.2.2  |
  +-------------+
         |
         |
  +-------------+
  |             |     +-----------+
  |             |     |           |.0     198.51.100.0/31     .1 +-----+
  |             |     |    VRF1 AC1------------------------------| CE1 |
  |             |     |           |::1    FE80::/64          ::2 +-----+
  |   IP/MPLS   |-----|    PE1    |
  |   network   |     | 192.0.2.1 |
  |             |     |           |.2     198.51.100.2/31     .3 +-----+
  |             |     |    VRF1 AC2------------------------------| CE2 |
  |             |     |           |::0    2001:DB8::/127     ::1 +-----+
  |             |     +-----------+
  +-------------+

                  Figure 1: IP/MPLS network with IP VPN

   Figure 2 shows the routes received from the CEs and installed into
   VRF1 by PE1 at the top and the VPN routes advertised by PE1 at the
   bottom.  The most interesting column here is "VPN Next-Hops".  The
   address 192.0.2.1 is the primary address of PE1, which is used as
   the default VPN next-hop address and as the source address of
   internal BGP sessions.  All routes of CE2 use the default address of
   PE1 as the next-hop when exported as VPN routes.  A special export
   policy on PE1 for the internal BGP sessions changes the next-hops of
   the VPN-IPv4 routes of CE1 to 192.0.2.100 and the next-hops of the
   VPN-IPv6 routes of CE1 to ::ffff:192.0.2.200.

        +-------------------+----------------+----------+
        |    VRF1 Routes    | VRF1 Next-Hops | VRF1 ACs |
        +-------------------+----------------+----------+
        +-------------------+----------------+----------+
        | 203.0.113.0/25    | 198.51.100.1   | AC1      |
        +-------------------+----------------+----------+
        | 203.0.113.128/25  | 198.51.100.3   | AC2      |
        +-------------------+----------------+----------+
        +-------------------+----------------+----------+
        | 2001:DB8:100::/64 | FE80::2        | AC1      |
        +-------------------+----------------+----------+
        | 2001:DB8:200::/64 | 2001:DB8::1    | AC2      |
        +-------------------+----------------+----------+
                               |
                               |
                               v
   +---------------------+----------------------+-----------+
   |      VPN Routes     |    VPN Next-Hops     | VPN Label |
   +---------------------+----------------------+-----------+
   +---------------------+----------------------+-----------+
   | RD:203.0.113.0/25   | 0:192.0.2.100        | 100       |
   +---------------------+----------------------+-----------+
   | RD:203.0.113.128/25 | 0:192.0.2.1          | 100       |
   +---------------------+----------------------+-----------+
   +---------------------+----------------------+-----------+
   | RD:2001:DB8:100::/64| 0:::ffff:192.0.2.200 | 100       |
   +---------------------+----------------------+-----------+
   | RD:2001:DB8:200::/64| 0:::ffff:192.0.2.1   | 100       |
   +---------------------+----------------------+-----------+

                  Figure 2: Export routes into VPN by PE1

   PE1 advertises unicast host-specific routes for the addresses
   192.0.2.1, 192.0.2.100, and 192.0.2.200 via a routing protocol.  PE2
   receives these routes and installs them into the GRT.  PE1 also
   allocates an MPLS label for the addresses mentioned above and
   distributes the bindings of this label via a label distribution
   protocol.  PE2 receives these bindings and installs them into its
   tunnel table.  Thus, PE2 can resolve all VPN routes that PE1 has
   sent.

   Suppose that AC2 between PE1 and CE2 has failed for some reason.
   When PE1 notices the failure, it invalidates all routes inside VRF1
   that were used to reach the CE2 addresses, 198.51.100.2/31 and
   2001:DB8::/127.  Other routes that recursively use the routes to the
   addresses of CE2 to look up their next-hops become inactive too.
   Because of that, PE1 starts withdrawing the corresponding VPN routes,
   203.0.113.128/25 and 2001:DB8:200::/64 (RD is omitted).  PE2 must
   wait for these withdrawals before it stops sending traffic toward PE1
   (traffic from CE3 to CE2).

   In another scenario, AC1 fails instead of AC2.  Suppose PE1 is
   configured to monitor the status of AC1 and, when AC1 fails, PE1
   immediately updates the routing protocol and the label distribution
   protocol.  These updates include the withdrawal of the unicast
   routes for the addresses 192.0.2.100 and 192.0.2.200 (but not for
   192.0.2.1) and of the label bindings for them.  In parallel, PE1
   proceeds with steps similar to those described previously for the
   AC2 failure.  PE2 eventually receives the updates via the routing
   protocol, the label distribution protocol, or both.  Thanks to a
   hierarchical FIB, it invalidates at once all VPN routes that use the
   failed routes (and tunnels) to reach their VPN next-hops.  PE2 stops
   sending traffic to PE1 (traffic from CE3 to CE1) even if it has not
   yet received any withdrawals for the corresponding VPN routes.
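
   The effect of the hierarchical FIB in the second scenario can be
   illustrated with a minimal Python sketch.  It is not taken from any
   implementation: the names NextHop and HierarchicalFib are
   hypothetical, and the model assumes that every VPN route merely
   references a shared next-hop object.  The VPN-IPv6 routes are keyed
   here by the IPv4 address that their IPv4-mapped next-hop resolves to.

```python
from dataclasses import dataclass, field


@dataclass
class NextHop:
    """Shared FIB object for one BGP next-hop (hypothetical name)."""
    address: str
    active: bool = True


@dataclass
class HierarchicalFib:
    """VPN prefixes reference shared next-hop objects instead of copies."""
    next_hops: dict = field(default_factory=dict)
    routes: dict = field(default_factory=dict)

    def add_route(self, prefix: str, next_hop: str) -> None:
        nh = self.next_hops.setdefault(next_hop, NextHop(next_hop))
        self.routes[prefix] = nh

    def next_hop_down(self, next_hop: str) -> None:
        # One update, regardless of how many prefixes share the next-hop.
        if next_hop in self.next_hops:
            self.next_hops[next_hop].active = False

    def usable(self, prefix: str) -> bool:
        nh = self.routes.get(prefix)
        return nh is not None and nh.active


# PE2's view of the VPN routes from Figure 2 (RD omitted).
fib = HierarchicalFib()
fib.add_route("203.0.113.0/25", "192.0.2.100")
fib.add_route("203.0.113.128/25", "192.0.2.1")
fib.add_route("2001:DB8:100::/64", "192.0.2.200")
fib.add_route("2001:DB8:200::/64", "192.0.2.1")

# AC1 fails: PE1 withdraws the host routes of both ANHs used for CE1.
fib.next_hop_down("192.0.2.100")
fib.next_hop_down("192.0.2.200")
```

   A single status change per next-hop is enough to stop forwarding
   for all dependent VPN routes, regardless of their number, while the
   routes of CE2 that use 192.0.2.1 stay usable.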

   For the sake of brevity, both scenarios are discussed without
   alternative exit points for the routes inside VRF1 and with just a
   couple of such routes.  In real deployments, a CE can distribute
   many more routes to more than one PE.  In that case, the mechanism
   described above can significantly improve the network convergence
   time.

   This document introduces a mechanism that helps notify VPN members
   about an AC failure that has happened at one of these members.  The
   described solution expects the following:

   *  Hierarchical FIB MUST be supported and used among all VPN members.

   *  Any VPN member acting as an ingress PE MUST consider the status of
      a unicast route in the GRT toward the BGP next-hop (BGP next-hop
      tracking).  This status MUST be considered during the BGP route
      resolution and after the route is placed into the appropriate
      routing table.

   *  It is RECOMMENDED for any VPN member acting as an ingress PE to
      consider the status of a tunnel toward the BGP next-hop also
      during the BGP VPN route resolution and after the route is placed
      into the appropriate routing table.

   The solution described in this document modifies the behavior of
   egress PE routers only and can be deployed incrementally.

5.  Abstract Next-Hop Address

   Section 4.3.2 of [RFC4364] states:

      When a PE router distributes a VPN-IPv4 route via BGP, it uses its
      own address as the "BGP next hop".

   In most cases, the "own address" is the address of a virtual
   interface (e.g., a loopback).  This address usually acts as a tunnel
   endpoint for labeled traffic.  The tunnel using it may be
   instantiated by different mechanisms and must be capable of
   forwarding MPLS traffic.  The PE also uses this address as a source
   address of internal BGP sessions.  Due to the virtual nature of the
   interface owning this address, a failure of this interface is very
   unlikely (unless it is caused artificially).  Only the failure of
   the whole PE leads to it.

   The solution described in this document proposes using additional
   next-hops for VPN routes advertised by a single PE.  This alters the
   behavior described in [RFC4364] and [RFC4659], which presupposes the
   advertising of a single next-hop address for all VPN routes of a PE.
   In the described solution, the additional next-hops advertised by a
   PE are named ANH addresses.  An ANH address is an artificial IPv4 or
   IPv6 address that belongs to the GRT.  An ANH acts as a proxy
   address for an actual address of a CE residing in a VRF.  The status
   of the latter address influences the status of its ANH.  A CE's
   address selected for an ANH is named an LA.  An LA may belong to a
   common subnet of a PE-CE pair in a VRF or can be any other address
   of the CE, but it cannot belong to the GRT.  An ANH and its LA do
   not necessarily belong to the same address family.  For example, an
   IPv4 ANH may be bound to an IPv6 LA, and vice versa.  An LA can be a
   link-local IPv6 address; in this case, its ANH MUST be a proxy to a
   triplet (LA, AC, VRF) instead of a pair (LA, VRF), where the LA
   belongs to the AC from the triplet.

   Addresses installed in different VRFs may overlap.  Thus, values of
   ANHs may be arbitrary and do not have to be the same as their LAs.
   An operator is free to choose these values according to network
   address plans.  To achieve the goals stated in Section 1, values of
   ANHs MUST be unique throughout the GRTs of the network.  The case
   when several PE routers advertise the same value for the ANH (e.g.,
   anycast) is out of the scope of this document.

   The ANH proxying is not a route leaking mechanism; it cannot be used
   for traffic forwarding between the GRT and a VRF in any direction.
   The ANH proxying creates a dependency between the statuses of an ANH
   and an LA (Section 5.1).  An ANH MUST be bound to only one LA.  An LA
   in turn MUST be bound to only one ANH.  There is a strict one-to-one
   mapping between them.  An operator may create the ANH proxying for
   any address in a VRF, but the solution expects that this address is
   used as a next-hop for routes in this VRF.  These routes do not
   necessarily belong to the same CE that owns the LA (i.e., third-party
   next-hop).
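
   The binding rules above can be sketched as a small registry.  This
   is a minimal illustration in Python, not a definitive
   implementation: the class AnhBindings and its methods are
   hypothetical, and the two example bindings are inferred from
   Figure 2 rather than stated explicitly by this document.

```python
import ipaddress


class AnhBindings:
    """Keeps ANH <-> LA bindings strictly one-to-one (hypothetical)."""

    def __init__(self):
        self._la_by_anh = {}
        self._anh_by_la = {}

    def bind(self, anh, la, vrf, ac=None):
        la_addr = ipaddress.ip_address(la)
        if la_addr.version == 6 and la_addr.is_link_local:
            # A link-local LA is identified by the triplet (LA, AC, VRF).
            if ac is None:
                raise ValueError("a link-local LA requires an AC")
            la_key = (la_addr, ac, vrf)
        else:
            la_key = (la_addr, None, vrf)
        if anh in self._la_by_anh:
            raise ValueError(f"ANH {anh} is already bound to an LA")
        if la_key in self._anh_by_la:
            raise ValueError(f"LA {la} is already bound to an ANH")
        self._la_by_anh[anh] = la_key
        self._anh_by_la[la_key] = anh

    def unbind(self, anh):
        # Remove both directions of the mapping.
        del self._anh_by_la[self._la_by_anh.pop(anh)]


bindings = AnhBindings()
bindings.bind("192.0.2.100", "198.51.100.1", "VRF1")
bindings.bind("192.0.2.200", "FE80::2", "VRF1", ac="AC1")
```

   An attempt to bind a second LA to 192.0.2.100, or a second ANH to
   198.51.100.1 in VRF1, raises an error in this sketch.  Uniqueness of
   ANH values across the GRTs of the whole network is an operational
   matter and is not checked here.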

   The ANH proxying can be described as a static host-specific route
   that is installed in the GRT.  The destination address of this route
   is the value selected for an ANH by an operator.  The next-hop
   address of the static route is equal to the LA selected for the ANH.
   Additionally, the operator configures a VRF (its name or index)
   directly for this static route; the VRF indicates where the next-hop
   of the route must be resolved.  For unlabeled traffic coming to a PE
   via the GRT, the static route acts as a route to the bit bucket.
   This document does not restrict implementations to this mechanism.

5.1.  Status of the Abstract Next-Hop

   The status of an ANH depends on the existence of a route to its LA.
   This route MUST be present in the VRF associated with the ANH.  The
   ANH is considered active if and only if the route to its LA is
   active and is available for traffic forwarding.  The proposed
   solution does not restrict the type of this route, but an
   implementation MUST support at least direct routes and static
   routes.  An implementation MAY filter the protocols used for
   resolution of LAs by a configuration policy.

   The dependency is unidirectional: only the status of an LA defines
   the status of its ANH.

   An implementation MAY support manual deactivation of an ANH by an
   operator.
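
   The status rules of this section can be summarized in a short
   Python sketch.  It assumes that the VRF routing table is available
   as a list of records; VrfRoute, anh_is_active, and the parameter
   names are hypothetical, and the default protocol filter merely
   illustrates the optional configuration policy.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class VrfRoute:
    prefix: str
    protocol: str          # e.g., "direct", "static", "bgp"
    active: bool = True


def anh_is_active(vrf_routes, la, allowed_protocols=("direct", "static"),
                  admin_down=False):
    """True if an active VRF route of an allowed protocol covers the LA."""
    if admin_down:
        # Manual deactivation by an operator overrides the route status.
        return False
    la_addr = ipaddress.ip_address(la)
    for route in vrf_routes:
        if not route.active or route.protocol not in allowed_protocols:
            continue
        # Membership is False when the address families differ.
        if la_addr in ipaddress.ip_network(route.prefix):
            return True
    return False


# Direct routes of VRF1 on PE1 for AC1 and AC2 (Figure 1).
vrf1 = [
    VrfRoute("198.51.100.0/31", "direct"),
    VrfRoute("198.51.100.2/31", "direct"),
]
before_failure = anh_is_active(vrf1, "198.51.100.1")

vrf1[0].active = False     # AC1 fails, its direct route goes away
after_failure = anh_is_active(vrf1, "198.51.100.1")
```

   The function reads the VRF table only and never changes it, which
   reflects the unidirectional nature of the dependency.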

   Besides the dependency on a route toward an LA, an ANH MAY be a
   client of any mechanism that actively monitors the LA.  It can be
   any next-hop tracking mechanism (ARP or ICMP probes, if they are
   applicable to the LA) or a BFD [RFC5880] session between the CE that
   owns the LA and the PE that owns the ANH.

5.2.  Distribution of the Abstract Next-Hop

   In general, the distribution of ANHs by means of a routing protocol
   does not differ from the distribution of any other addresses that are
   considered to be BGP next-hops.

   An ANH SHOULD be advertised by a routing protocol.  In this case,
   the following conditions MUST be met:

   *  The status of the ANH is active (Section 5.1).

   *  The ANH is advertised as a host-specific route.

   *  This route is reachable by at least a subset of PEs via the
      routing protocol.

   *  These PEs import VPN routes with a BGP next-hop address equal to
      the ANH (covered by the route) in appropriate VRFs and use these
      VPN routes for traffic forwarding.

   This solution does not restrict the type of the routing protocol for
   ANH routes distribution.

   When the status of an ANH changes from active to inactive a PE MUST
   notify the other PEs receiving a route to this ANH.  The speed of
   origination of such notification and its propagation is crucial.

   PEs that received a route to an ANH act according to the standard
   procedures that are applicable to the routing protocol.  This
   solution does not modify this behavior.

5.3.  Tunnels to the Abstract Next-Hop

   A PE may have one or several ANHs and distribute them as per
   Section 5.2.  In that case, according to Section 5 of [RFC4364],
   there MUST be a tunnel to every such address of the PE.

   This solution does not restrict the type of tunnels that point to
   ANHs, but these tunnels MUST forward MPLS traffic.  However, an
   implementation of an egress tunnel endpoint may require some changes
   to support a point-to-point tunnel (e.g., RSVP-TE LSP [RFC3209] or IP
   GRE [RFC4032]) to an ANH.  These changes are out of the scope of this
   document.

   The solution does not consider in detail using tunneling technologies
   other than MPLS LSPs for transport of labeled VPN traffic.  The rest
   of the section is applicable to MPLS LSPs only.

   For all ANHs with LAs that belong to the same VRF, a PE MUST allocate
   the same label.  A PE MAY allocate a single label for all ANHs (e.g.,
   implicit label).

   When a PE allocates a label for an ANH, it MUST associate a release
   timer with this label.  If the status of the ANH changes to
   inactive, the PE starts the release timer for the label.  While the
   timer is running, if the PE receives traffic with this label, it
   MUST continue to handle this traffic as if the failure had not
   happened.  When the timer expires, the PE starts freeing the
   resources associated with the label.  This timer does not influence
   the generation and advertising of the failure notification via the
   label distribution protocol.  An implementation SHOULD support
   manual setting of the release timer (including zero).  If a label is
   allocated for a group of ANHs, a PE starts the timer if and only if
   the last active address of the group becomes inactive.
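
   The release timer logic can be sketched as follows.  This is an
   illustration under stated assumptions rather than a prescribed
   implementation: the class AnhLabel, the label value 2001, and the
   5-second timer are hypothetical, time is passed in explicitly
   instead of being read from a clock, and cancelling the timer when an
   ANH becomes active again is an assumption that this document does
   not make.

```python
class AnhLabel:
    """One label shared by a group of ANHs, with a release timer."""

    def __init__(self, label, anhs, release_seconds):
        self.label = label
        self.active = {anh: True for anh in anhs}
        self.release_seconds = release_seconds
        self.release_deadline = None   # None: the timer is not running

    def set_status(self, anh, is_active, now):
        self.active[anh] = is_active
        if any(self.active.values()):
            # Assumption: a reactivated ANH cancels a running timer.
            self.release_deadline = None
        elif self.release_deadline is None:
            # The last active ANH of the group has just become inactive.
            self.release_deadline = now + self.release_seconds

    def handles_traffic(self, now):
        """True while labeled traffic is handled as before the failure."""
        return self.release_deadline is None or now < self.release_deadline


label = AnhLabel(2001, ["192.0.2.100", "192.0.2.200"], release_seconds=5.0)
label.set_status("192.0.2.100", False, now=0.0)    # timer not started yet
timer_running_after_first = label.release_deadline is not None
label.set_status("192.0.2.200", False, now=10.0)   # last ANH: timer starts
```

   With these values the label keeps forwarding until 15 seconds on
   the sketch's time axis.  A release timer of zero makes the PE stop
   handling the traffic as soon as the last ANH becomes inactive.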

   When a PE advertises a label binding to an ANH, either the binding
   MUST be advertised by a label distribution protocol in parallel with
   the advertising of the ANH via a routing protocol (Section 5.2), or
   the ANH MUST be sent as a labeled route (e.g., BGP-LU [RFC8277]).

   When the status of an ANH (Section 5.1) changes to inactive and a
   label binding to this ANH was advertised by a PE via the label
   distribution protocol, the PE MUST notify other routers receiving
   the label for this ANH.  The speed of origination and propagation of
   this notification is also important: it may be received before the
   notification via the routing protocol, or it may be the only
   notification channel (Section 9.4).

   Routers that received a label binding for an ANH act according to the
   standard procedures that are applicable to the label distribution
   protocol.  The proposed solution does not change the behavior of
   ingress LERs or LSRs.

6.  Distribution of VPN Routes

   The solution that is proposed in this document is only applicable to
   VPN-IPv4 and VPN-IPv6 routes (i.e., SAFI 128).  Using any other
   routes with ANHs is out of the scope of this document.

   For a group of routes installed in a VRF and united by a common
   next-hop address, an operator MAY set up an ANH as a next-hop of the
   corresponding VPN routes.  The remaining routes from the same VRF
   (if any) MUST be advertised according to the procedures of [RFC4364]
   or [RFC4659] if they are to be advertised at all.

   An ANH for VPN-IPv4 routes is encoded according to Section 4.3.2 of
   [RFC4364] as a VPN-IPv4 address with an RD of 0.

   An ANH for VPN-IPv6 routes is encoded according to Section 3.2.1 of
   [RFC4659] as a VPN-IPv6 address.  This VPN-IPv6 address contains an
   RD of 0 and an IPv6 address that is equal to the ANH.  When the ANH
   is an IPv4 address, it is carried in the VPN-IPv6 address as an
   IPv4-mapped IPv6 address.  The procedures for including a link-local
   address are not altered by this solution.
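
   The two encodings can be illustrated with a short Python sketch
   that uses only the standard ipaddress module.  The function names
   are hypothetical, and the optional link-local part of a VPN-IPv6
   next-hop is deliberately left out.

```python
import ipaddress

ZERO_RD = bytes(8)   # Route Distinguisher of 0 (8 octets)


def encode_vpn_ipv4_next_hop(anh):
    """VPN-IPv4 next-hop: RD of 0 followed by the 4-octet IPv4 ANH."""
    return ZERO_RD + ipaddress.IPv4Address(anh).packed


def encode_vpn_ipv6_next_hop(anh):
    """VPN-IPv6 next-hop: RD of 0 followed by a 16-octet IPv6 address."""
    addr = ipaddress.ip_address(anh)
    if addr.version == 4:
        # An IPv4 ANH is carried as an IPv4-mapped address ::ffff:a.b.c.d.
        addr = ipaddress.IPv6Address(bytes(10) + b"\xff\xff" + addr.packed)
    return ZERO_RD + addr.packed


nh_v4 = encode_vpn_ipv4_next_hop("192.0.2.100")
nh_v6 = encode_vpn_ipv6_next_hop("192.0.2.200")
```

   For the ANHs of Figure 2, the first call yields 12 octets and the
   second one 24 octets, whose last 16 octets are the binary form of
   ::ffff:192.0.2.200.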

   According to Section 4.3 of [RFC4364], routes that are installed in a
   VRF are converted to VPN routes (this statement is applicable to both
   address families), and "exported" to BGP.  This solution assumes that
   all VPN routes are installed into the VPN Loc-RIB with a next-hop
   address that is equal to the own address of the PE where this VRF is
   configured.  In other words, the solution does not modify the
   procedures for converting routes from VRFs to VPN routes.

   All routes in a Loc-RIB are processed into appropriate Adj-RIBs-Out
   according to configured policies (Section 9.1.3 of [RFC4271]).  The
   solution expects that there MUST be a special export policy that is
   applied to routes passing from the VPN Loc-RIB to VPN Adj-RIBs-Out
   and is processed in the chain before all policies that are
   configured by an operator (if there are such policies).  This
   special export policy modifies next-hop addresses only for those
   routes that are configured to be sent with ANHs (or a single ANH).

   A PE does not check the presence of a route to an ANH in the GRT
   before copying VPN routes from a Loc-RIB into a corresponding
   Adj-RIB-Out and during the Update-send process (Section 9.2 of
   [RFC4271]).  When the status of an ANH (Section 5.1) changes to
   inactive, a PE does not start withdrawing VPN routes that use this
   ANH as their next-hop.  This prevents churn when an operator
   performs network maintenance and manually disables the ANH.  On the
   other hand, deleting a binding between an ANH and its LA MUST
   trigger changing the corresponding next-hop addresses in
   Adj-RIBs-Out to the default value (the value from the Loc-RIB).
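
   The export behavior described in this section can be sketched in
   Python with the data of Figure 2.  The sketch rests on assumptions
   that go beyond this document: the function export and the
   dictionary layout are hypothetical, routes are selected for an ANH
   by their next-hop inside the VRF, and the Loc-RIB is assumed to keep
   that VRF next-hop next to the default one.

```python
DEFAULT_NEXT_HOP = "192.0.2.1"   # own address of PE1

# VPN Loc-RIB of PE1 (RD omitted):
# prefix -> (default next-hop, next-hop inside VRF1)
loc_rib = {
    "203.0.113.0/25": (DEFAULT_NEXT_HOP, "198.51.100.1"),
    "203.0.113.128/25": (DEFAULT_NEXT_HOP, "198.51.100.3"),
    "2001:DB8:100::/64": (DEFAULT_NEXT_HOP, "FE80::2"),
    "2001:DB8:200::/64": (DEFAULT_NEXT_HOP, "2001:DB8::1"),
}


def export(loc_rib, anh_by_la, operator_policies=()):
    """Copy Loc-RIB routes into an Adj-RIB-Out: prefix -> next-hop."""
    adj_rib_out = {}
    for prefix, (next_hop, vrf_next_hop) in loc_rib.items():
        # Special export policy: runs first and ignores the ANH status.
        next_hop = anh_by_la.get(vrf_next_hop, next_hop)
        # Operator policies follow in the chain.
        for policy in operator_policies:
            next_hop = policy(prefix, next_hop)
        adj_rib_out[prefix] = next_hop
    return adj_rib_out


anh_by_la = {"198.51.100.1": "192.0.2.100", "FE80::2": "192.0.2.200"}
with_anh = export(loc_rib, anh_by_la)

# Deleting a binding reverts the next-hop to the value from the Loc-RIB.
del anh_by_la["FE80::2"]
after_unbind = export(loc_rib, anh_by_la)
```

   Since the status of an ANH is never consulted, a deactivated ANH
   does not change the content of the Adj-RIB-Out; only the removal of
   a binding does.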

   An implementation MAY support an option of selecting distinct Adj-
   RIBs-Out where VPN routes will be placed with ANHs.

7.  Forwarding

   For an ingress PE, it is impossible to determine whether a next-hop
   address of received VPN routes is a regular address or an abstract
   one.  The ingress PE considers every VPN next-hop address as the
   address of a standalone egress router even if a group of VPN next-hop
   addresses belongs to the same device.  When it has an active route
   and a tunnel to a BGP next-hop address, the ingress PE encapsulates
   and sends traffic via the tunnel according to Section 5 of
   [RFC4364].

   If an egress PE receives MPLS traffic with a label that was allocated
   for one of its ANHs, the solution expects the following (other cases
   are out of the scope of this document):

   *  This label MUST NOT have the Bottom of Stack bit [RFC3032] set.

   *  At the bottom of the received stack there MUST be a label that
      was allocated by the egress PE.

   *  There MAY be other labels between these two labels (e.g., entropy
      labels [RFC6790]).
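
   The expectations listed above can be checked with a few lines of
   Python.  The sketch assumes the label stack entry layout of
   [RFC3032] (20-bit label, 3-bit traffic class, Bottom of Stack bit,
   8-bit TTL); the function names and the label values 2001 and 100 are
   hypothetical.

```python
import struct


def encode_entry(label, bottom, ttl=64):
    """Build one 4-octet label stack entry (traffic class left at 0)."""
    return struct.pack("!I", (label << 12) | (int(bottom) << 8) | ttl)


def parse_label_stack(data):
    """Return [(label, bottom_of_stack), ...] up to the bottom entry."""
    stack = []
    for offset in range(0, len(data) - 3, 4):
        (entry,) = struct.unpack_from("!I", data, offset)
        bottom = bool((entry >> 8) & 1)
        stack.append((entry >> 12, bottom))
        if bottom:
            return stack
    raise ValueError("no Bottom of Stack bit found")


def acceptable_at_egress(stack, anh_labels, vpn_labels):
    """Check the expectations for traffic received with an ANH label."""
    top_label, top_is_bottom = stack[0]
    bottom_label, _ = stack[-1]
    return (top_label in anh_labels
            and not top_is_bottom
            and bottom_label in vpn_labels)


packet = encode_entry(2001, False) + encode_entry(100, True)
stack = parse_label_stack(packet)
ok = acceptable_at_egress(stack, anh_labels={2001}, vpn_labels={100})
```

   Labels between the top and the bottom entries, such as entropy
   labels, are parsed but otherwise ignored by this check.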

8.  Failure Detection

8.1.  Egress PE

   A PE detects the failure of a connected CE by different mechanisms.
   These mechanisms are not considered in this document.  The net effect
   of the failure is the unreachability of routes to addresses (or a
   route to a single address) of the failed CE in a VRF where an AC to
   the CE resides.  The PE usually uses these routes to recursively
   resolve next-hops for other routes in the VRF (which are also usually
   distributed by the CE).  All failed routes try to find new options to
   resolve their next-hops; if there are no such options, the PE starts
   deleting the failed routes from the VRF.

   After the PE has detected that the CE has failed, and if one of the
   addresses of the CE is an LA, the PE immediately deactivates the ANH
   of this LA.  If a route to the ANH was distributed as per
   Section 5.2, the PE notifies all neighbors of the routing protocol.
   If the ANH was also bound to a label and this label was distributed
   via a label distribution protocol, the PE notifies all neighbors of
   the label distribution protocol.

   The PE may start distributing updates via BGP VPN sessions notifying
   its peers that the routes in the VRF are no longer reachable.  This
   process is independent of the process described above, and the
   solution does not modify it.  However, if the route to the ANH was
   distributed by BGP via the same set of sessions that are used for VPN
   route distribution, an implementation SHOULD schedule sending of the
   UPDATE message with the ANH's withdrawal prior to UPDATE messages
   with the failed VPN routes.
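
   The scheduling recommendation can be sketched as a stable reordering
   of the pending withdrawals.  This is only an illustration under the
   assumption that an implementation keeps a single queue per set of
   sessions; PendingWithdrawal and schedule are hypothetical names, and
   real implementations may organize their output queues differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PendingWithdrawal:
    prefix: str   # route being withdrawn
    is_anh: bool  # True if the prefix is a route to an ANH


def schedule(pending: List[PendingWithdrawal]) -> List[PendingWithdrawal]:
    """Place ANH withdrawals ahead of VPN route withdrawals.

    sorted() is stable, so the relative order inside each group is
    preserved.
    """
    return sorted(pending, key=lambda w: not w.is_anh)
```

   Sending the ANH withdrawal first lets ingress PEs react once to the
   loss of the next-hop instead of processing every failed VPN route
   individually.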

8.2.  Ingress PE

   As stated in Section 4, the proposed solution expects an ingress PE
   to consider the status of a tunnel toward a BGP VPN next-hop.  Thus,
   when the status of the tunnel changes to inactive, the ingress PE
   simultaneously deactivates all VPN routes with a next-hop equal to
   the address of the tunnel's endpoint.  If the ingress PE does not
   follow this logic, the solution expects that the status of a route
   toward a BGP VPN next-hop in the GRT is used the same way.  The
   ingress PE can apply both procedures.  In any case, the ingress PE
   can react to the failure of a remote CE (the CE connected to a remote
   PE in the same VPN) or an AC to the CE independently of the receipt
   of BGP UPDATE messages that withdraw VPN routes pointing to this CE.
   The ingress PE may activate backups for these routes and redirect
   traffic to them.
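
   A minimal sketch of this ingress behavior follows.  It assumes a
   simplified VRF model in which routes are stored in order of
   preference; VpnRoute, IngressVrf, on_tunnel_down, and best are
   hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class VpnRoute:
    prefix: str
    next_hop: str        # BGP VPN next-hop (regular address or ANH)
    active: bool = True


class IngressVrf:
    def __init__(self, routes: Iterable[VpnRoute]) -> None:
        # Routes are assumed to be listed in order of preference.
        self.routes: List[VpnRoute] = list(routes)

    def on_tunnel_down(self, endpoint: str) -> List[str]:
        """Deactivate every route whose next-hop is the tunnel endpoint."""
        affected = []
        for route in self.routes:
            if route.next_hop == endpoint and route.active:
                route.active = False
                affected.append(route.prefix)
        return affected

    def best(self, prefix: str) -> Optional[VpnRoute]:
        """Return the most preferred active route, e.g. a backup."""
        for route in self.routes:
            if route.prefix == prefix and route.active:
                return route
        return None
```

   In this model a single tunnel event deactivates all dependent routes
   at once, and a lookup afterwards returns the backup path if one
   exists, without waiting for BGP UPDATE messages.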

9.  Deployment Considerations

9.1.  Scalability

   A requirement to have a tunnel to every next-hop address that a PE
   uses to advertise VPN routes may pose scalability concerns.  The
   following considerations may help to deploy the described solution
   with scalability in mind:

   *  Use multipoint-to-point tunnels to reach VPN next-hop addresses.
      This helps to reduce state on a PE that advertises VPN routes with
      these next-hops, avoiding dependence on the number of upstream
      routers.  It also reduces state on intermediate routers when the
      tunnels are LSPs.

   *  Selective installation of routes and tunnels helps to spend
      resources only on VPN next-hop addresses that an ingress PE will
      use.  If the ingress PE does not import a VPN route, it is not
      always necessary to have a tunnel towards the next-hop address of
      this route.  In the case of ANHs, having such tunnels is even more
      questionable.

   *  Do not use the ANH proxying for every possible CE address in every
      possible VRF.  There are a lot of cases when the traditional VPN
      convergence is good enough.

   *  A PE may advertise a common label for all its ANHs if tunnels
      towards them are supposed to be LSPs.  In this case, it is
      expected that the ANHs also share a common value for the release
      timer (Section 5.3).  Using different values for the release timer
      may require deaggregating the labels.

9.2.  Using the Abstract Next-Hops

   Deployment of the ANH solution should be considered on a per-service
   basis.  The following points may help to decide whether an ANH is
   appropriate:

   *  Most VPN services exchange a small number of routes: hundreds or
      several thousand.  Usually, any CE contributes a small portion of
      them, and its failure can be noticed and repaired by the network
      in a reasonable time.  Imagine a situation when a CE advertises a
      significant number of routes, tens of thousands or more.  In this
      case, the restoration after the failure of the CE may exceed the
      expected convergence time.

   *  Independently of the number of routes that are advertised by a CE,
      the existence of an extra exit point should be considered.
      Improving the switchover time requires a point to switch to.

   *  Sometimes it may be desirable to stop traffic flowing through a
      network after a destination CE fails, even if there is no extra
      exit point.  For example, there may be a considerable amount of
      traffic towards a failed CE.

   *  Some services require special treatment due to stricter SLAs;
      using ANHs may help to achieve these SLAs.

9.3.  Failure Detection

   It is worth noting the detection time of a CE failure or a failure of
   a link (or an AC) to the CE.  This time contributes much to overall
   convergence.  For example, sometimes it is not possible to notice the
   link failure by a loss of signal, and extra mechanisms are required
   for this task.  Some of these mechanisms interact with a session of a
   PE-CE protocol.  When these mechanisms detect the failure, routes
   distributed by the associated PE-CE protocol session become inactive.
   At the same time, routes toward addresses of the CE (or a single
   route) are usually not distributed by the PE-CE protocol session.
   They may be direct or static routes.  Thus, the routes toward the
   addresses of the CE are not affected by the detection mechanism
   described above and stay active.  If one of the CE's addresses is an
   LA, the status of an ANH that is proxying to the LA will also remain
   active.  To prevent such behavior, it is recommended to detect the
   failure of an address of the CE or of a route to this address (both
   on the PE and the CE, especially if the CE is multihomed).  In other
   words, it is better (in the described case) to monitor the next-hop
   of the routes distributed by the PE-CE protocol rather than the
   session of the PE-CE protocol.

9.4.  Routes Aggregation

   Routes advertised by a routing protocol can be aggregated at some
   points of a network (e.g., ABRs).  Such aggregation may lead to
   obscurity issues in the event of an ANH deactivation.  The
   aggregation of routes removes a notification channel that is supplied
   by the routing protocol.  A label distribution protocol can provide
   this notification channel when it is used for the distribution of
   labels for ANHs.  In this case, however, all VPN members are required
   to consider the status of tunnels toward ANHs as described in
   Section 4.

   If LDP [RFC5036] is used as the label distribution protocol for ANHs,
   the following steps should be considered:

   *  The LDP extension for Inter-Area LSPs [RFC5283] must be used
      throughout the network.

   *  If LDP is configured in Downstream Unsolicited mode, it must also
      be configured in Ordered Control mode.  An LDP LSR in Independent
      Control mode on a path to an ANH will break the notification
      channel.

10.  Multicast Considerations

   Section 7 of [RFC6514] introduces the VRF Route Import EC.
   Section 5.1.3 of [RFC6513] describes how the Upstream PE is selected
   when unicast VPN routes do not carry this community:

      If a route does not have a VRF Route Import Extended Community,
      the route's Upstream PE is determined from the route's BGP Next
      Hop.

   The solution described in this document expects that unicast VPN
   routes of a VPN service may be sent by a PE with different BGP
   next-hop addresses.  This may create an issue with the importing of
   C-Multicast routes if this VPN service also acts as an MVPN and does
   not mark its VPN routes with the VRF Route Import EC.  It is not
   recommended to configure a new import Route Target EC for every ANH.
   Instead, there are two possible ways to mitigate the described
   problem:

   *  Use the procedures described in Section 10 of [RFC6514].

   *  Do not use ANHs for unicast VPN routes that cover multicast
      sources or RP addresses.
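
   The issue can be illustrated with a simplified sketch of the Upstream
   PE selection quoted above.  The names UnicastVpnRoute and upstream_pe
   are hypothetical, and the VRF Route Import EC is modeled only by the
   PE address carried in its Global Administrator field; the complete
   rules are given in [RFC6513] and [RFC6514].

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UnicastVpnRoute:
    prefix: str
    bgp_next_hop: str
    # PE address from the VRF Route Import EC, or None if the route
    # does not carry this community (simplified model).
    vrf_route_import: Optional[str] = None


def upstream_pe(route: UnicastVpnRoute) -> str:
    """Select the Upstream PE for a route, as in the quoted rule."""
    if route.vrf_route_import is not None:
        return route.vrf_route_import
    return route.bgp_next_hop
```

   In this model, two routes advertised by the same PE with different
   ANHs and without the community resolve to two different Upstream PE
   addresses, while routes carrying the community resolve to the same
   PE regardless of their next-hops.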

11.  Security Considerations

   This document specifies extensions for the advertisement of VPN
   routes with different next-hops by a single PE.  From this point of
   view, the security considerations described in [RFC4364] and
   [RFC4659] are equally applicable to the extensions described in this
   document.

12.  IANA Considerations

   This document has no IANA actions.

13.  References

13.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC3031]  Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol
              Label Switching Architecture", RFC 3031,
              DOI 10.17487/RFC3031, January 2001,
              <https://www.rfc-editor.org/info/rfc3031>.

   [RFC3032]  Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
              Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
              Encoding", RFC 3032, DOI 10.17487/RFC3032, January 2001,
              <https://www.rfc-editor.org/info/rfc3032>.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006,
              <https://www.rfc-editor.org/info/rfc4271>.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
              2006, <https://www.rfc-editor.org/info/rfc4364>.

   [RFC4659]  De Clercq, J., Ooms, D., Carugi, M., and F. Le Faucheur,
              "BGP-MPLS IP Virtual Private Network (VPN) Extension for
              IPv6 VPN", RFC 4659, DOI 10.17487/RFC4659, September 2006,
              <https://www.rfc-editor.org/info/rfc4659>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

13.2.  Informative References

   [I-D.ietf-rtgwg-bgp-pic]
              Bashandy, A., Filsfils, C., and P. Mohapatra, "BGP Prefix
              Independent Convergence", Work in Progress, Internet-
              Draft, draft-ietf-rtgwg-bgp-pic-18, 9 April 2022,
              <https://www.ietf.org/archive/id/draft-ietf-rtgwg-bgp-pic-
              18.txt>.

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
              and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001,
              <https://www.rfc-editor.org/info/rfc3209>.

   [RFC4032]  Camarillo, G. and P. Kyzivat, "Update to the Session
              Initiation Protocol (SIP) Preconditions Framework",
              RFC 4032, DOI 10.17487/RFC4032, March 2005,
              <https://www.rfc-editor.org/info/rfc4032>.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,
              February 2006, <https://www.rfc-editor.org/info/rfc4360>.

   [RFC5036]  Andersson, L., Ed., Minei, I., Ed., and B. Thomas, Ed.,
              "LDP Specification", RFC 5036, DOI 10.17487/RFC5036,
              October 2007, <https://www.rfc-editor.org/info/rfc5036>.

   [RFC5283]  Decraene, B., Le Roux, JL., and I. Minei, "LDP Extension
              for Inter-Area Label Switched Paths (LSPs)", RFC 5283,
              DOI 10.17487/RFC5283, July 2008,
              <https://www.rfc-editor.org/info/rfc5283>.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010,
              <https://www.rfc-editor.org/info/rfc5880>.

   [RFC6513]  Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/
              BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February
              2012, <https://www.rfc-editor.org/info/rfc6513>.

   [RFC6514]  Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
              Encodings and Procedures for Multicast in MPLS/BGP IP
              VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012,
              <https://www.rfc-editor.org/info/rfc6514>.

   [RFC6790]  Kompella, K., Drake, J., Amante, S., Henderickx, W., and
              L. Yong, "The Use of Entropy Labels in MPLS Forwarding",
              RFC 6790, DOI 10.17487/RFC6790, November 2012,
              <https://www.rfc-editor.org/info/rfc6790>.

   [RFC8277]  Rosen, E., "Using BGP to Bind MPLS Labels to Address
              Prefixes", RFC 8277, DOI 10.17487/RFC8277, October 2017,
              <https://www.rfc-editor.org/info/rfc8277>.

Acknowledgments

   The author would like to thank Roman Peshekhonov for his review and
   valuable input.

Author's Address

   I. Malyushkin
   Independent Contributor
   Email: gmalyushkin@gmail.com