rtgwg                                                            Y. Wang
Internet-Draft                                             China Telecom
Intended status: Standards Track                                  C. Lin
Expires: 8 January 2024                             New H3C Technologies
                                                                 A. Wang
                                                           China Telecom
                                                             7 July 2023


                   IGP Prefix Independent Convergence
                      draft-wang-rtgwg-igp-pic-01

Abstract

   In many cases, a large number of routes can be reached by multiple
   next hops.  When a link fails, route calculation must be performed
   again and a new reachable path must be found.  If all routes are
   recalculated and refreshed, the calculation time increases linearly
   with the number of routes, resulting in slow route convergence.
   This document describes an architecture in which convergence time is
   independent of the number of prefixes.  When paths change, this
   architecture allows routes to be recalculated in a time that does
   not depend on the number of IGP routes.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 8 January 2024.

Copyright Notice

   Copyright (c) 2023 IETF Trust and the persons identified as the
   document authors.  All rights reserved.






Wang, et al.             Expires 8 January 2024                 [Page 1]

Internet-Draft                    rtgwg                        July 2023


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions used in this document
   3.  Terminology
   4.  Overview
     4.1.  Dependency
     4.2.  FRR Consideration
     4.3.  IGP-PIC Illustration
   5.  ISIS PIC
     5.1.  Maintenance of ISIS IGP-nodes
     5.2.  PIC Route Compute
   6.  OSPF PIC
     6.1.  Maintenance of OSPF IGP-nodes
     6.2.  PIC Route Compute
   7.  Example
     7.1.  ISIS PIC Route
     7.2.  OSPF PIC Route
   8.  Normative References
   Authors' Addresses

1.  Introduction

   In modern networks, it is not uncommon for a prefix to be reachable
   via multiple paths.  When the primary link fails, routes must
   reconverge as quickly as possible.

   The OSPF route calculation process (see [RFC2328]) consists of two
   steps:

   1) Calculate the shortest path tree from the root node to all router
   nodes using the Shortest Path First (SPF) algorithm, based on the
   link state.

   2) Calculate the cost of each prefix according to the distance
   between the root node and the router node that advertises it in the
   shortest path tree.

   Because the second step is performed for every prefix, route
   convergence slows down as the number of prefixes increases.

   This document proposes a hierarchical, shared forwarding chain
   organization that allows traffic to be restored in a time that is
   independent of the number of prefixes.  This technology relies on
   internal router behavior that is completely transparent to operators
   and can be deployed and enabled progressively without operator
   intervention.

2.  Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Terminology

   The following terms are defined in this draft:

   *  IGP prefix: A prefix P/m (of any AFI/SAFI) that an Interior
      Gateway Protocol (IGP), such as OSPF or ISIS, has a path for.  The
      prefix may be learnt directly through the IGP or redistributed
      from other protocol(s).

   *  OSPF ABR Node: OSPF Area Border Router, an OSPF router attached to
      multiple areas.

   *  OSPF ASBR Node: OSPF AS Boundary Router, an OSPF router that
      exchanges routing information with routers in other ASes.

   *  OSPF Node: A node associated with a real OSPF router or with the
      combination of multiple OSPF routers that advertise the same
      prefix.  Real OSPF routers include OSPF ABR Nodes, OSPF ASBR
      Nodes, and ordinary OSPF Nodes.

   *  ISIS Node: A node associated with a real ISIS router or with the
      combination of multiple ISIS routers that advertise the same
      prefix.

   *  IGP Node: An OSPF Node or an ISIS Node.

4.  Overview

   The idea of IGP-PIC is based on two pillars:

   1) A shared forwarding chain: Instead of having a separate list of
   next hops for each destination, all destinations sharing the same
   list of next hops point to a single copy of that list.  This allows
   fast convergence, because a change is made to a single shared list
   of next hops rather than to a possibly large number of destinations.

   2) A forwarding plane that supports multiple levels of indirection:
   A forwarding chain that starts with a destination and ends with an
   outgoing interface is not a simple flat structure.  Instead, a
   forwarding entry is constructed via multiple levels of dependency.

   Designing a forwarding plane that constructs multi-level forwarding
   chains with maximal sharing of forwarding objects allows rerouting a
   large number of destinations by modifying a small number of objects
   thereby achieving convergence in a time frame that does not depend on
   the number of destinations.

   This is similar to the implementation of BGP-PIC; see Section 2 of
   [I-D.ietf-rtgwg-bgp-pic] for details.

4.1.  Dependency

   This section describes the functionality required in the forwarding
   and control planes to support IGP-PIC as described in this document.

   IGP-PIC requires hierarchical hardware FIB support: for each
   IGP-forwarded packet, the destination is looked up first, then the
   IGP Node, then the Adjacency.
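
   The lookup order above can be illustrated with a minimal Python
   sketch.  It is a software model only, not a hardware FIB design; the
   class names (Adjacency, IgpNode, Fib) are hypothetical, and the
   addresses are borrowed from the example in Section 7.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Adjacency:
    """Last level of the chain: outgoing interface and neighbor address."""
    interface: str
    neighbor: str


@dataclass
class IgpNode:
    """Middle level: one object shared by every prefix bound to the node."""
    name: str
    adjacencies: list = field(default_factory=list)


class Fib:
    """First level: maps destination prefixes to a shared IgpNode."""

    def __init__(self):
        self.routes = {}  # ip_network -> IgpNode

    def add_route(self, prefix, node):
        self.routes[ipaddress.ip_network(prefix)] = node

    def lookup(self, destination):
        """Resolve destination -> IGP Node -> list of Adjacency objects."""
        addr = ipaddress.ip_address(destination)
        # Longest-prefix match by linear scan; a real FIB would use a
        # trie or TCAM instead (assumption made for brevity).
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best].adjacencies


r4 = IgpNode("R4", [Adjacency("if1", "12.1.1.2"),
                    Adjacency("if2", "13.1.1.3")])
fib = Fib()
fib.add_route("192.0.0.1/32", r4)
fib.add_route("192.0.0.2/32", r4)
print(fib.lookup("192.0.0.1"))
```

   Both prefixes resolve through the same IgpNode object, so a change
   to its adjacency list is visible to every prefix bound to it.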

4.2.  FRR Consideration

   As per [RFC5286], rapid failure repair is achieved through the use
   of precalculated backup next hops that are loop-free and safe to use
   until the distributed network convergence process completes.  By
   backing up the next hop of the current route in advance, FRR can
   switch traffic away from a failed link rapidly.

                +-----+
           /----|  S  |----\
          /     +-----+     \
         / 5               8 \
        /                     \
     +-----+                +-----+
     |  E  |                | N_1 |
     +-----+                +-----+
        \                     /
    \    \  4              3 /  /
     \|   \                 / |/
     -+    \    +-----+    /  +-
            \---|  D  |---/
                +-----+
    Figure 1: Node Protection Topology

   As shown in Figure 1, the optimal next hop from the source S to D is
   E.  If N_1 is used as the backup next hop for the link from S to E,
   then when the link between S and E fails, packets destined for D are
   handed over to N_1.  N_1 forwards them to D normally, so N_1
   qualifies as a backup next hop.  However, if the cost of the direct
   link from N_1 to D were greater than 17 (the cost of the path
   N_1->S->E->D, 8 + 5 + 4), then until the route on N_1 converges
   again, the next hop from N_1 to D would be S instead of D, forming a
   temporary loop.  Therefore, as per [RFC5286]:

   A neighbor N_1 can provide a loop-free alternate (LFA) if and only if

      Distance_opt(N_1, D) < Distance_opt(N_1, S) + Distance_opt(S, D)
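
   The inequality can be checked numerically against Figure 1.  The
   following Python sketch is illustrative only: it assumes symmetric
   link costs, which the figure does not state explicitly, and the
   function names are hypothetical.

```python
import heapq


def shortest_distances(graph, source):
    """Dijkstra: return {node: Distance_opt(source, node)}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, cost in graph[node].items():
            candidate = d + cost
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


def is_loop_free_alternate(graph, s, n, d):
    """Loop-free condition for neighbor n of s toward destination d."""
    from_n = shortest_distances(graph, n)
    from_s = shortest_distances(graph, s)
    return from_n[d] < from_n[s] + from_s[d]


def make_figure1(n1_d_cost):
    """Figure 1 topology; the N_1-D link cost is left as a parameter."""
    links = [("S", "E", 5), ("S", "N_1", 8),
             ("E", "D", 4), ("N_1", "D", n1_d_cost)]
    graph = {}
    for a, b, cost in links:
        # Assumption: each link has the same cost in both directions.
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


print(is_loop_free_alternate(make_figure1(3), "S", "N_1", "D"))
print(is_loop_free_alternate(make_figure1(18), "S", "N_1", "D"))
```

   With the N_1-D cost of 3 shown in Figure 1, the condition holds
   (3 < 8 + 9).  With a cost of 18 it fails, because the shortest path
   from N_1 to D then runs back through S.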

         +-----+       +-----+
         |  S  |-------|  N  |
         +-+---+   6   +-----+
           |              |
           | 5          2 |
           |              |
           |    +-----+   |
           +----|  E  |---+
                +--+--+
                   |
                   | 3
                   |
                +--+--+
                |  D  |
                +-----+
    Figure 2: Link Protection Topology

   Another typical scenario is shown in Figure 2.  When both S and N
   have IP FRR enabled, each uses the other as the backup next hop for
   its primary path to D, which runs through E.  If the downstream node
   E then fails, S and N forward packets destined for D to each other,
   resulting in a micro-loop.  For this reason, node protection takes
   priority over link protection.
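
   The situation in Figure 2 can be worked through numerically.  In
   addition to the basic loop-free condition, [RFC5286] defines a
   node-protection condition for a primary next hop E:
   Distance_opt(N, D) < Distance_opt(N, E) + Distance_opt(E, D).  The
   Python sketch below evaluates both conditions; it assumes symmetric
   link costs, and the function names are hypothetical.

```python
from itertools import product


def all_pairs_distances(links):
    """Floyd-Warshall over undirected links given as (a, b, cost)."""
    nodes = sorted({n for a, b, _ in links for n in (a, b)})
    dist = {(a, b): (0 if a == b else float("inf"))
            for a, b in product(nodes, repeat=2)}
    for a, b, cost in links:
        # Assumption: each link has the same cost in both directions.
        dist[a, b] = dist[b, a] = cost
    # product() varies the last element fastest, so k is the outer loop.
    for k, i, j in product(nodes, repeat=3):
        dist[i, j] = min(dist[i, j], dist[i, k] + dist[k, j])
    return dist


def link_protecting(dist, s, n, d):
    """Basic loop-free condition for s using neighbor n toward d."""
    return dist[n, d] < dist[n, s] + dist[s, d]


def node_protecting(dist, n, e, d):
    """Node-protection condition: n does not reach d through e."""
    return dist[n, d] < dist[n, e] + dist[e, d]


FIGURE2 = [("S", "N", 6), ("S", "E", 5), ("N", "E", 2), ("E", "D", 3)]
dist = all_pairs_distances(FIGURE2)

for s, n in (("S", "N"), ("N", "S")):
    print(s, "backs up via", n,
          "| link-protecting:", link_protecting(dist, s, n, "D"),
          "| node-protecting:", node_protecting(dist, n, "E", "D"))
```

   Under these costs, S and N are link-protecting alternates for each
   other, but neither is node-protecting (5 < 2 + 3 and 8 < 5 + 3 both
   fail), which is why a failure of E leads to the micro-loop.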

4.3.  IGP-PIC Illustration

                      +---+
               +------|R2 |-------+
               |      +---+       |
               |                  |
             +---+              +---+   Prefix-1
             |R1 |              |R4 |   Prefix-2
             +---+              +---+    ...
               |                  |     Prefix-n
               |      +---+       |
               +------|R3 |-------+
                      +---+

   Figure 3: Single source PIC network diagram

   As shown in Figure 3, R4 advertises n prefixes, and R1 reaches R4
   over two equal-cost paths, R1->R2->R4 and R1->R3->R4.  When the link
   between R1 and R2 fails, route calculation is performed again.  The
   topology calculation runs first and changes the path to R4 from the
   original equal-cost paths to the single path R1->R3->R4.  The routes
   for Prefix-1 to Prefix-n are then recalculated, and the forwarding
   entries are updated for all of them.

   As n grows, the time needed for route calculation and forwarding
   table updates grows with it, which slows route convergence.

   Because Prefix-1 to Prefix-n are all advertised by R4, their paths
   after the switchover are identical.  The change of the route to R4
   therefore needs to be calculated only once, and only the forwarding
   entry for R4 needs to be updated to the new forwarding path.  The
   routes for Prefix-1 to Prefix-n are updated as a consequence.  This
   is prefix-independent convergence.

   Before PIC route calculation, the prefix needs to be associated with
   the IGP Node.  In the current example, the IGP node is the real
   router R4.

        Prefix        IGP Node     NextHop
       +--------+    +----------+
       |Prefix-1|    |R4        | ---->R2
       |Prefix-2|--->|          | ---->R3
       |...     |    +----------+
       |Prefix-n|
       +--------+
      Figure 4: Single source PIC Forward

   When path switching occurs, only the forwarding path of the IGP node
   needs to be updated, from the ECMP path R2+R3 to the single path R3,
   without recalculating and updating all prefixes.  This saves the
   time of route calculation and forwarding table updates, and improves
   the speed of route convergence.  In other words, PIC route
   calculation updates the next hop information of the corresponding
   IGP node, regardless of the specific prefixes.
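
   The effect of the shared IGP node in Figure 4 can be sketched in
   Python as follows.  This is an illustrative model rather than an
   implementation: the class names, the "writes" counter, and the
   choice of 1000 prefixes are assumptions made for the example.

```python
class IgpNode:
    """Shared object holding the next hops toward one IGP node."""

    def __init__(self, name, next_hops):
        self.name = name
        self.next_hops = list(next_hops)


class PicTable:
    """Prefix -> IGP node bindings, counting forwarding objects touched."""

    def __init__(self):
        self.prefix_to_node = {}
        self.writes = 0

    def bind(self, prefix, node):
        """Associate a prefix with its IGP node (one write per prefix)."""
        self.prefix_to_node[prefix] = node
        self.writes += 1

    def update_node(self, node, next_hops):
        """Repair a path change with a single write to the shared node."""
        node.next_hops = list(next_hops)
        self.writes += 1

    def next_hops(self, prefix):
        return self.prefix_to_node[prefix].next_hops


table = PicTable()
r4 = IgpNode("R4", ["R2", "R3"])
for i in range(1, 1001):
    table.bind(f"Prefix-{i}", r4)

table.writes = 0                 # count only the convergence event
table.update_node(r4, ["R3"])    # the link between R1 and R2 fails
print(table.writes, table.next_hops("Prefix-1000"))
```

   After the single update, every prefix bound to R4 resolves to R3
   without any per-prefix write.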

               +---+       +---+   Prefix-1
        +------|R2 |-------|R4 |   Prefix-2
        |      +---+       +---+   ...
        |                          Prefix-n
      +---+
      |R1 |
      +---+
        |
        |      +---+       +---+   Prefix-1
        +------|R3 |-------|R5 |   Prefix-2
               +---+       +---+   ...
                                   Prefix-n

     Figure 5: Multi-source PIC network diagram

   In the case of multiple sources, the multiple destination nodes are
   combined into a single combined IGP node, and the path is calculated
   for this combined node.

         Prefix        IGP Node     NextHop
       +--------+    +----------+
       |Prefix-1|    |R4,R5     | ---->R2
       |Prefix-2|--->|          | ---->R3
       |...     |    +----------+
       |Prefix-n|
       +--------+
     Figure 6: Multi-source PIC Forward

   When the path changes, route calculation is performed again for the
   combined node (R4,R5), and the forwarding path is updated from the
   original R2+R3 to R3, without recalculating routes for all prefixes
   or refreshing their forwarding table entries.
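
   One possible way to derive combined IGP nodes is to group prefixes
   by the set of routers that advertise them.  The Python sketch below
   shows this under stated assumptions: the function name and the
   input format are hypothetical, and Prefix-3 is an invented
   single-source prefix added only for contrast.

```python
from collections import defaultdict


def group_prefixes(advertisements):
    """Group prefixes by the set of routers advertising them.

    advertisements: iterable of (router, prefix) pairs.
    Returns {frozenset of routers: sorted list of prefixes}; each key
    plays the role of one (possibly combined) IGP node.
    """
    advertisers = defaultdict(set)
    for router, prefix in advertisements:
        advertisers[prefix].add(router)

    groups = defaultdict(list)
    for prefix, routers in advertisers.items():
        groups[frozenset(routers)].append(prefix)
    return {key: sorted(prefixes) for key, prefixes in groups.items()}


ads = [("R4", "Prefix-1"), ("R5", "Prefix-1"),
       ("R4", "Prefix-2"), ("R5", "Prefix-2"),
       ("R4", "Prefix-3")]  # Prefix-3: hypothetical, advertised by R4 only
groups = group_prefixes(ads)
for routers, prefixes in groups.items():
    print(sorted(routers), prefixes)
```

   Prefixes advertised by both R4 and R5 share one combined node, so a
   path change is repaired once for the whole group.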

5.  ISIS PIC

5.1.  Maintenance of ISIS IGP-nodes

   For single-source prefixes, when a received ISIS LSP carries a
   prefix TLV, an ISIS IGP Node is created and associated with the
   prefix.  The key of the ISIS IGP Node is system-id, level, and topo.

   If the prefix is advertised by the LSP of a pseudonode, the key of
   the ISIS IGP Node is system-id, pseudonode ID, level, and topo.

   For multi-source prefixes, where multiple ISIS routers advertise the
   same prefix through LSPs, a combined ISIS IGP node is created and
   associated with the prefix.  The key of the combined ISIS IGP node
   is the set of (system-id, level, topo) tuples of the advertising
   routers.
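
   The keys described above might be modelled as follows.  This Python
   sketch is illustrative: the type and function names are
   hypothetical, and the system-id of R5, the level, and the topology
   value are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IsisNodeKey:
    """Key of a single-source ISIS IGP node."""
    system_id: str
    level: int
    topo: int
    pseudonode_id: Optional[int] = None  # set only for pseudonode LSPs


def igp_node_key(advertisers):
    """Return the IGP node key for a prefix.

    One advertiser yields that node's own key; several advertisers
    yield a frozenset of keys, standing for the combined IGP node.
    """
    keys = frozenset(advertisers)
    if not keys:
        raise ValueError("prefix has no advertiser")
    if len(keys) == 1:
        return next(iter(keys))
    return keys


r4 = IsisNodeKey("0000.0000.0004", level=2, topo=0)
r5 = IsisNodeKey("0000.0000.0005", level=2, topo=0)  # hypothetical R5
print(igp_node_key([r4]))
print(igp_node_key([r4, r5]) == igp_node_key([r5, r4]))
```

   Using an unordered set for the combined key means the order in
   which LSPs arrive does not change which combined node a prefix maps
   to.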

5.2.  PIC Route Compute

   The conventional procedure for route calculation is as follows:

   (1) Calculate the shortest-path tree for Level-1 and Level-2.

   (2) Calculate each route for Level-1 and Level-2.

   When PIC Route Compute is supported, the procedure for route
   calculation is as follows:

   (1) Calculate the shortest-path tree for Level-1 and Level-2.

   (2) Instead of calculating routes for each prefix, update the next
   hop information per IGP-node.
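
   The two PIC steps can be sketched in Python using the topology of
   Figure 3.  The sketch is a simplification under stated assumptions:
   it models a single level, every link cost is assumed to be 10, the
   function names are hypothetical, and nodes that become unreachable
   are not handled.

```python
import heapq


def make_graph(links):
    """Build an undirected adjacency map from (a, b, cost) tuples."""
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def spf_first_hops(graph, root):
    """Step (1): Dijkstra that records the ECMP first hops per node."""
    dist = {root: 0}
    hops = {root: set()}
    heap = [(0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            cand = d + cost
            via = {nbr} if node == root else hops[node]
            best = dist.get(nbr, float("inf"))
            if cand < best:
                dist[nbr] = cand
                hops[nbr] = set(via)
                heapq.heappush(heap, (cand, nbr))
            elif cand == best:
                hops[nbr] |= via  # equal-cost path: merge first hops
    return hops


def pic_update(node_next_hops, graph, root):
    """Step (2): refresh next hops per IGP node, not per prefix."""
    changed = []
    for node, first_hops in spf_first_hops(graph, root).items():
        if node != root and node_next_hops.get(node) != first_hops:
            node_next_hops[node] = first_hops
            changed.append(node)
    return sorted(changed)


# Assumption: all four links in Figure 3 have cost 10.
LINKS = [("R1", "R2", 10), ("R1", "R3", 10),
         ("R2", "R4", 10), ("R3", "R4", 10)]
node_next_hops = {}
pic_update(node_next_hops, make_graph(LINKS), "R1")
before = set(node_next_hops["R4"])
changed = pic_update(node_next_hops, make_graph(LINKS[1:]), "R1")
after = set(node_next_hops["R4"])
print(before, changed, after)
```

   The number of objects touched after the failure depends on the
   number of affected nodes, not on the number of prefixes that those
   nodes advertise.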

6.  OSPF PIC

6.1.  Maintenance of OSPF IGP-nodes

   The key of an OSPF IGP-node is router-id, area, and topo.

   When the prefix is advertised through a router-LSA, the OSPF
   IGP-node is created and the key is router-id, area, and topo.

   When the prefix is advertised through a network-LSA, the key of the
   OSPF IGP-node is router-id, DR IP-Address, area, and topo.

   When the prefix is advertised through a Type-3 summary-LSA, the key
   of the OSPF IGP-node is ABR router-id, area, and topo.

   When the prefix is advertised through a Type-5 AS-external-LSA, the
   key of the OSPF IGP-node is ASBR router-id, Forwarding Address, and
   topo.
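
   The per-LSA-type keys listed above could be built as in the
   following Python sketch.  It is illustrative only: the type and
   function names are hypothetical, and the area and topology values
   in the usage lines are assumptions.  The LSA type numbers are those
   of [RFC2328].

```python
from enum import IntEnum
from typing import NamedTuple, Optional


class LsaType(IntEnum):
    ROUTER = 1
    NETWORK = 2
    SUMMARY = 3
    AS_EXTERNAL = 5


class OspfNodeKey(NamedTuple):
    """Key of a single-source OSPF IGP-node; unused fields stay None."""
    router_id: str
    topo: int
    area: Optional[str] = None
    dr_address: Optional[str] = None
    forwarding_address: Optional[str] = None


def ospf_node_key(lsa_type, router_id, topo, area=None,
                  dr_address=None, forwarding_address=None):
    """Select the key fields according to the advertising LSA type."""
    if lsa_type in (LsaType.ROUTER, LsaType.SUMMARY):
        # router-id (or ABR router-id), area, topo
        return OspfNodeKey(router_id, topo, area=area)
    if lsa_type == LsaType.NETWORK:
        # router-id, DR IP address, area, topo
        return OspfNodeKey(router_id, topo, area=area,
                           dr_address=dr_address)
    if lsa_type == LsaType.AS_EXTERNAL:
        # ASBR router-id, forwarding address, topo (no area)
        return OspfNodeKey(router_id, topo,
                           forwarding_address=forwarding_address)
    raise ValueError(f"unsupported LSA type: {lsa_type}")


intra = ospf_node_key(LsaType.ROUTER, "44.44.44.44", 0, area="0.0.0.0")
ext = ospf_node_key(LsaType.AS_EXTERNAL, "44.44.44.44", 0,
                    area="0.0.0.0", forwarding_address="0.0.0.0")
print(intra)
print(ext)
```

   Keeping the unused fields as None lets keys built from different
   LSA types live in one table without colliding.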

   For multi-source prefixes, where multiple OSPF routers advertise the
   same prefix through LSAs, a combined OSPF IGP-node is created and
   associated with the prefix.  The key of the combined OSPF IGP-node
   is the set of (router-id, area, topo) tuples of the advertising
   routers.

6.2.  PIC Route Compute

   For OSPF route calculation, see Section 16 of [RFC2328],
   "Calculation of the routing table".  The procedure for route
   calculation is as follows:

   (1) Calculate the shortest-path tree for an area, and then calculate
   the intra-area routes.

   (2) Calculate the inter-area routes by examining summary-LSAs.

   (3) Examine transit areas' summary-LSAs.

   (4) Calculate AS external routes.

   When PIC Route Compute is supported, the procedure for route
   calculation is as follows:

   (1) Calculate the shortest-path tree for an area, and then calculate
   the intra-area routes.  Instead of calculating intra-area routes for
   each prefix, the next hop information is updated per IGP-node.

   (2) Calculate the inter-area routes by examining summary-LSAs.  If
   the ABR IGP-node has been updated, the inter-area routes do not need
   to be recalculated.

   (3) Examine transit areas' summary-LSAs.  Instead of calculating
   routes for each prefix, the next hop information is updated per
   intra-area IGP-node and ABR IGP-node.

   (4) Calculate AS external routes.  If the ASBR IGP-node has been
   updated, the AS external routes do not need to be recalculated.

7.  Example

7.1.  ISIS PIC Route

   When the link to the IGP node changes, the topology is re-calculated
   and the corresponding next hop list is updated, without updating the
   forwarding table for each prefix.

                0000.0000.0002
             12.1.1.2+---+
             +-------|R2 |-----------+
             |       +---+           |
             |if1,12.1.1.1           |
           +---+                   +---+                 192.0.0.1/32
           |R1 | 0000.0000.0001    |R4 |  0000.0000.0004 192.0.0.2/32
           +---+                   +---+                  ...
             |if2,13.1.1.1           |                   192.0.0.10/32
             |       +---+           |
             +-------|R3 |-----------+
             13.1.1.3+---+
                0000.0000.0003

   Figure 7: Single source ISIS PIC network diagram

     Prefix        IGP Node     NextHop
   +-------------+    +----------------+
   |192.0.0.1/32 |    |0000.0000.0004  | ---->R2(Via 12.1.1.2,if1)
   |192.0.0.2/32 |--->|                | ---->R3(Via 13.1.1.3,if2)
   |...          |    +----------------+
   |192.0.0.10/32|
   +-------------+
       Figure 8: Single source ISIS PIC Forward (before the failure)

     Prefix        IGP Node     NextHop
   +-------------+    +----------------+
   |192.0.0.1/32 |    |0000.0000.0004  |
   |192.0.0.2/32 |--->|                |  ---->R3(Via 13.1.1.3,if2)
   |...          |    +----------------+
   |192.0.0.10/32|
   +-------------+
       Figure 9: Single source ISIS PIC Forward (after the failure)

   If the path through R2 fails, the route is recalculated and the next
   hop information of the IGP node associated with R4 is updated.

7.2.  OSPF PIC Route

   When the link to the IGP node changes, the topology is recalculated
   and the corresponding next hop list is updated, without updating the
   forwarding table for each prefix.

             22.22.22.22
             12.1.1.2+---+
             +-------|R2 |-----------+
             |       +---+           |
             |if1,12.1.1.1           |
           +---+                   +---+                 192.0.0.1/32
           |R1 | 11.11.11.11       |R4 |  44.44.44.44    192.0.0.2/32
           +---+                   +---+                  ...
             |if2,13.1.1.1           |                   192.0.0.10/32
             |       +---+           |
             +-------|R3 |-----------+
             13.1.1.3+---+
             33.33.33.33
   Figure 10: Single source OSPF PIC network diagram

     Prefix             IGP Node        NextHop
   +-------------+    +----------------+
   |192.0.0.1/32 |    |44.44.44.44     | ---->R2(Via 12.1.1.2,if1)
   |192.0.0.2/32 |--->|                | ---->R3(Via 13.1.1.3,if2)
   |...          |    +----------------+
   |192.0.0.10/32|
   +-------------+
       Figure 11: Single source OSPF PIC Forward (before the failure)

     Prefix             IGP Node        NextHop
   +-------------+    +----------------+
   |192.0.0.1/32 |    |44.44.44.44     |
   |192.0.0.2/32 |--->|                | ---->R3(Via 13.1.1.3,if2)
   |...          |    +----------------+
   |192.0.0.10/32|
   +-------------+
       Figure 12: Single source OSPF PIC Forward (after the failure)

   If the path through R2 fails, the route is recalculated and the next
   hop information of the IGP node associated with R4 is updated.

8.  Normative References

   [I-D.ietf-rtgwg-bgp-pic]
              Bashandy, A., Filsfils, C., and P. Mohapatra, "BGP Prefix
              Independent Convergence", Work in Progress, Internet-
              Draft, draft-ietf-rtgwg-bgp-pic-19, 1 April 2023,
              <https://datatracker.ietf.org/doc/html/draft-ietf-rtgwg-
              bgp-pic-19>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC2328]  Moy, J., "OSPF Version 2", STD 54, RFC 2328,
              DOI 10.17487/RFC2328, April 1998,
              <https://www.rfc-editor.org/info/rfc2328>.

   [RFC5286]  Atlas, A., Ed. and A. Zinin, Ed., "Basic Specification for
              IP Fast Reroute: Loop-Free Alternates", RFC 5286,
              DOI 10.17487/RFC5286, September 2008,
              <https://www.rfc-editor.org/info/rfc5286>.

Authors' Addresses

   Yue Wang
   China Telecom
   Beiqijia Town, Changping District
   Beijing
   Beijing, 102209
   China
   Email: wangy73@chinatelecom.cn


   Changwang Lin
   New H3C Technologies
   China
   Email: linchangwang.04414@h3c.com


   Aijun Wang
   China Telecom
   Beiqijia Town, Changping District
   Beijing
   Beijing, 102209
   China
   Email: wangaj3@chinatelecom.cn