Network Working Group Y. Liu
Internet Draft China Mobile
Intended status: Informational C. Lin
Expires: August 30, 2024 M. Chen
New H3C Technologies
Z. Zhang
ZTE Corporation
March 3, 2024
Path-aware Remote Protection Framework
draft-liu-rtgwg-path-aware-remote-protection-01
Abstract
This document describes the framework of path-aware remote
protection.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
This Internet-Draft will expire on August 30, 2024.
Copyright Notice
Copyright (c) 2024 IETF Trust and the persons identified as the
document authors. All rights reserved.
Liu & Lin, et al. Expire August 30, 2024 [Page 1]
Internet-Draft Path-aware Remote Protection March 2024
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Simplified BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Simplified BSD License.
Table of Contents
1. Introduction
1.1. Requirements Language
2. Use Case
2.1. Spine-leaf Network
2.2. Dragonfly Network
3. Framework
3.1. Remote Failure Detection
3.2. Path-Aware Forwarding Plane
3.3. Path-Aware Routing Plane
4. Role Types
5. Protection Scope
6. Security Considerations
7. IANA Considerations
8. References
8.1. Normative References
8.2. Informational References
Authors' Addresses
1. Introduction
Current IP network protection mechanisms can be mainly divided into
local protection and end-to-end protection. Local protection
technologies, such as ECMP, LFA [RFC5714], and TI-LFA
[I-D.ietf-rtgwg-segment-routing-ti-lfa], can only perceive local failures and
perform fast reroute. End-to-end protection technologies are usually
targeted at end-to-end TE paths, where the head-end detects TE path
failures and performs rapid switchover.
There is no mechanism to quickly detect remote failures and invoke
repairs for non-TE paths. In addition, local protection such as
TI-LFA relies on an IGP being deployed. For certain networks,
current protection mechanisms may not meet the requirements. A
typical scenario is a spine-leaf network, such as an AI data center
(AI-DC) network, which usually has a two-tier architecture. Detecting remote
failures and invoking fast repairs can provide protection against
link or node failure and reduce the disruption time.
This document proposes a path-aware remote protection mechanism and
describes its framework.
1.1. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
2. Use Case
2.1. Spine-leaf Network
+--+ +--+
Spine |R1| |R2|
+--+ +--+
| \ / |
| \ / |
| \/ |
| /\ X <- Fault
| / \ |
| / \ |
+--+ +--+
Leaf |R3| |R4|
+--+ +--+
^ |
| v
Source Destination
Figure 1
In the network shown in Figure 1, if the R2-R4 link fails, R3
continues to send traffic to both R1 and R2, and the share of the
traffic sent to R2 (half, under an even ECMP split) is dropped at R2.
Traffic is not fully restored until R2 withdraws the affected BGP
routes from R3 and the control plane converges. Convergence can be
slow when there is a large number of BGP routes.
In some spine-leaf networks, such as DC networks, only BGP is
deployed, without an IGP, and thus TI-LFA cannot be applied. Even
where TI-LFA can be used, the traffic path during the protection
period will be R3->R2->R3->R1->R4, which additionally
increases the traffic in the direction of R2->R3 and may cause
congestion.
The objective of path-aware remote protection is for R3 to detect
the R2-R4 link failure and then quickly adjust its ECMP set.
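As a rough illustration of the problem and of the objective for
Figure 1, the following Python sketch hashes flows over the ECMP
next-hops at R3 and lists the flows that would be sent toward the
failed link. The hash function (CRC32), the flow identifiers, and
all function names are assumptions made for illustration only; this
document does not specify them.

```python
import zlib

def pick_next_hop(flow_id, next_hops):
    """Pick an ECMP member by hashing the flow identifier.

    CRC32 is an assumption; real devices use their own hash."""
    return next_hops[zlib.crc32(flow_id.encode()) % len(next_hops)]

def dropped_flows(flows, next_hops, failed_links, destination="R4"):
    """Return the flows whose chosen spine lost its link to the leaf."""
    return [f for f in flows
            if (pick_next_hop(f, next_hops), destination) in failed_links]

flows = [f"flow-{i}" for i in range(1000)]   # hypothetical flow names
failed_links = {("R2", "R4")}                # the fault in Figure 1

# Before R3 learns of the remote failure, it balances over R1 and R2,
# so every flow hashed to R2 is dropped there.
before = dropped_flows(flows, ["R1", "R2"], failed_links)

# After R3 removes R2 from its ECMP set, no flow reaches the failed link.
after = dropped_flows(flows, ["R1"], failed_links)

print(len(before), len(after))
```

The sketch only shows why pruning the ECMP set at R3 stops the loss;
how R3 learns of the failure is discussed in Section 3.1.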
2.2. Dragonfly Network
Source
|
v
+---------+
| |
| Group 1 |------------+
| | |
+---------+ |
| +---------+
| | |
X<- Fault | Group 3 |
| | |
| +---------+
+---------+ |
| | |
| Group 2 |------------+
| |
+---------+
|
v
Destination
Figure 2
In the network shown in Figure 2, the primary path for the traffic
is from Group 1 to Group 2, while the backup path detours from
Group 1 through Group 3 and then to Group 2.
The objective of path-aware remote protection is for the routers in
Group 1 to detect the link failure between Group 1 and Group 2 and
then switch to the backup path quickly.
3. Framework
+-------------+
|Routing Plane|
+-------------+
|
| Path Info
v
+----------------+
|Forwarding Plane|
+----------------+
^
| Element Failure in Path
|
+------------------------+
|Remote Failure Detection|
+------------------------+
Figure 3
The framework of path-aware remote protection is shown in Figure 3.
On the routing plane, route calculation is not limited to the next
hop but also requires path awareness. The resulting path information
is then downloaded to the forwarding plane. When a failure occurs in
any component along the path, the failure needs to be detected
quickly so that repairs can be invoked.
3.1. Remote Failure Detection
When a failure occurs, it is first detected by the router adjacent
to it. The local failure detection may be based on existing
techniques such as BFD. Then, that router notifies its neighbors of
the failure, especially the upstream neighbors. After the remote
repair node (see Section 4) receives the failure notification,
remote protection is invoked.
The failure notification between neighboring routers has the
following requirements:
o Independent of routing protocols.
o Avoiding broadcast flooding.
As one example, in a two-level spine-leaf network, a spine router
can use BFD to monitor the adjacent links. When a link fails, the
spine router can use a BGP-independent protocol to notify
neighboring leaf routers. The failure notification is limited to one
hop.
As another example, a flow-based mechanism can be used to detect
failures. When traffic packets are dropped, a notification is
triggered and sent to the neighbors from which the traffic arrived.
The failure notification is limited to the upstream direction.
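The two requirements above (independence from routing protocols and
no broadcast flooding) can be sketched as follows. The message
fields, the hop budget, and all names are hypothetical; this
document does not define a notification format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FailureNotification:
    """Hypothetical message; no wire format is defined by this document."""
    reporter: str      # router that detected the failure locally
    failed_link: tuple # (near end, far end) of the failed link
    hops_left: int     # remaining hops the message may travel

def notify_neighbors(reporter, failed_link, neighbors, scope_hops=1):
    """Build one unicast notification per neighbor instead of flooding.

    The far end of the failed link is skipped, since it is unreachable
    over that link and is not upstream of the failure."""
    far_end = failed_link[1]
    return {n: FailureNotification(reporter, failed_link, scope_hops)
            for n in neighbors if n != far_end}

def relay(msg, upstream_neighbors):
    """Relay toward upstream neighbors only while the hop budget allows."""
    if msg.hops_left <= 1:
        return {}
    fwd = FailureNotification(msg.reporter, msg.failed_link,
                              msg.hops_left - 1)
    return {n: fwd for n in upstream_neighbors}

# Spine R2 detects (for example, via BFD) that its link to leaf R4 is down.
sent = notify_neighbors("R2", ("R2", "R4"), neighbors=["R3", "R4"])
print(sent)
```

With the default budget of one hop, a receiving leaf does not relay
the message further, which matches the one-hop example above.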
The detailed mechanisms are out of the scope of this document.
3.2. Path-Aware Forwarding Plane
In the forwarding table, each next-hop is associated with a path.
When a failure is detected anywhere on that path, the protection for
the corresponding next-hop is invoked.
Figure 4 shows the forwarding entries for ECMP next-hops.
+------+ +---------------+
|Prefix|---+-->|Next-hop: to R1|
+------+ | +---------------+
| | +----------------+
| +---------->|Path: R3->R1->R4|
| +----------------+
| +---------------+
+-->|Next-hop: to R2|
+---------------+
| +----------------+
+---------->|Path: R3->R2->R4|
+----------------+
Figure 4
Figure 5 shows the forwarding entries for primary and backup
next-hops.
+------+ +-----------------------+
|Prefix|---+-->|Primary Next-hop: to G2|
+------+ | +-----------------------+
| | +------------+
| +---------->|Path: G1->G2|
| +------------+
| +----------------------+
+-->|Backup Next-hop: to G3|
+----------------------+
| +----------------+
+---------->|Path: G1->G3->G2|
+----------------+
Figure 5
When a failure notification is received from a neighbor, the
next-hop entries corresponding to that neighbor are checked to
determine whether the associated path information contains the
failed component. If it does, the corresponding next-hop is regarded
as failed. A failed ECMP next-hop is removed from the ECMP set, and
its traffic is switched to the remaining ECMP next-hops. For a
failed primary next-hop, the traffic is switched to the backup
next-hop.
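The handling described above can be sketched in Python as follows.
This is a minimal model under stated assumptions, not a definitive
implementation: the class names, the prefixes (taken from the
documentation ranges), and the representation of a failed component
as a (router, router) link are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class NextHop:
    neighbor: str          # neighbor the traffic is sent to
    path: list             # routers on the path, e.g. ["R3", "R2", "R4"]
    backup: bool = False   # True for a backup next-hop (Figure 5)
    failed: bool = False

    def uses_link(self, link):
        """True if the link's two ends are consecutive hops on the path."""
        return link in zip(self.path, self.path[1:])

@dataclass
class Route:
    prefix: str
    next_hops: list = field(default_factory=list)

    def active(self):
        """Usable primaries (the ECMP set), else the usable backups."""
        alive = [nh for nh in self.next_hops if not nh.failed]
        primaries = [nh.neighbor for nh in alive if not nh.backup]
        return primaries or [nh.neighbor for nh in alive if nh.backup]

def on_failure_notification(routes, neighbor, failed_link):
    """Mark next-hops via `neighbor` whose path contains the failed link."""
    for route in routes:
        for nh in route.next_hops:
            if nh.neighbor == neighbor and nh.uses_link(failed_link):
                nh.failed = True

# Figure 4: ECMP next-hops at R3 toward a prefix behind R4.
ecmp = Route("192.0.2.0/24", [NextHop("R1", ["R3", "R1", "R4"]),
                              NextHop("R2", ["R3", "R2", "R4"])])
# Figure 5: primary and backup next-hops. Groups are treated as single
# nodes, and the reporting neighbor is identified by the figure's
# next-hop label (an assumption of this sketch).
frr = Route("198.51.100.0/24",
            [NextHop("G2", ["G1", "G2"]),
             NextHop("G3", ["G1", "G3", "G2"], backup=True)])

on_failure_notification([ecmp], "R2", ("R2", "R4"))
on_failure_notification([frr], "G2", ("G1", "G2"))
print(ecmp.active(), frr.active())
```

Only entries that point at the reporting neighbor are inspected, so
the cost of processing a notification is bounded by the number of
next-hops using that neighbor rather than by the size of the table.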
3.3. Path-Aware Routing Plane
When routes are calculated, the path needs to be determined, and the
path information is attached to the next hop.
In a BGP-based network, a BGP route may carry the router-id of the
peer from which the route was received, and that router-id is added
to the path information when the route is calculated. BGP may need
extensions to support this feature.
For an EBGP-based DC network, a router may use the AS_PATH attribute
(AS_SEQUENCE segments) in the BGP route as the path information,
without any protocol extensions.
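As a small sketch of the EBGP case, the following Python function
derives path information from an AS_SEQUENCE. It assumes that each
router in Figure 1 has its own AS number (R1 to R4 mapped to the
private-use ASNs 65001 to 65004); that numbering and the function
name are hypothetical and are not mandated by this document.

```python
def path_from_as_path(local_as, as_sequence):
    """Build path info from an AS_SEQUENCE.

    The leftmost AS in the sequence is the most recently added one,
    that is, the neighbor the route was received from. Consecutive
    duplicates (AS-path prepending) are collapsed."""
    path = [local_as]
    for asn in as_sequence:
        if asn != path[-1]:
            path.append(asn)
    return path

# At R3 (AS 65003), a route originated by R4 and received via R2.
via_r2 = path_from_as_path(65003, [65002, 65004])
# The same route received via R1, with R1 prepending its AS once.
via_r1 = path_from_as_path(65003, [65001, 65001, 65004])
print(via_r2, via_r1)
```

If several routers share one AS number, the AS_PATH alone does not
identify the individual routers on the path, and this simple mapping
no longer applies.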
In an IGP-based network, a router may compute the path information
based on the SPF tree and attach it to the next hop.
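For the IGP case, one possible way to obtain the path information is
to keep the predecessors found during the SPF run and to expand them
into full paths. The sketch below uses a plain Dijkstra computation
over the topology of Figure 1 with unit link costs; the graph
encoding, the costs, and the function name are assumptions for
illustration, and positive link costs are assumed.

```python
import heapq

def shortest_paths(graph, src, dst):
    """Return every equal-cost shortest path from src to dst."""
    dist = {src: 0}
    preds = {src: []}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v], preds[v] = nd, [u]
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v]:
                preds[v].append(u)        # another equal-cost predecessor

    def walk(node):
        """Expand predecessor lists into full paths ending at node."""
        if node == src:
            return [[src]]
        return [p + [node]
                for pred in preds.get(node, [])
                for p in walk(pred)]

    return walk(dst)

topology = {                              # Figure 1, unit costs
    "R1": {"R3": 1, "R4": 1},
    "R2": {"R3": 1, "R4": 1},
    "R3": {"R1": 1, "R2": 1},
    "R4": {"R1": 1, "R2": 1},
}
paths = shortest_paths(topology, "R3", "R4")
# Attach each path to its first hop, as in Figure 4. If several paths
# share a first hop, this simple mapping keeps only one of them.
next_hop_paths = {p[1]: p for p in paths}
print(next_hop_paths)
```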
The detailed mechanisms are out of the scope of this document.
4. Role Types
******** Notification *******
* * Fault
v * |
+------+ +-------+ +---------+ |
|Remote| | Inter-| | Local | V
|Repair|-----|mediate|-----|Detection|---X---Destination
| Node | | Node | | Node | |
+------+ +-------+ +---------+ |
| |
| Repair Path |
+---------------------------------------------+
Figure 6
In path-aware remote protection, a router can take one of three
roles:
o Remote repair node: It has the repair path(s) and provides the
remote protection function.
o Local detection node: It is adjacent to the failure and detects
the failure first. Then, it sends failure notification messages
to the remote repair node.
o Intermediate node: It exists only if there are multiple hops
between the remote repair node and the local detection node. It
helps deliver the failure notification messages from the local
detection node to the remote repair node.
5. Protection Scope
The scope of remote protection covers at least two hops from the
remote repair node to the failure.
As the protection scope increases, the number of intermediate nodes
increases, which may slow down the fault notification and widen its
propagation. It is therefore beneficial to limit the scope of remote
protection to a reasonable range.
One recommendation is that the node that is closest to the failure
and has a repair path should provide the protection function.
For example, in a spine-leaf network with multiple levels, there are
usually ECMP paths between every two adjacent levels, so remote
protection only needs to cover two hops.
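The recommendation above can be sketched as a walk upstream from the
local detection node that stops at the first router with a repair
path, optionally bounded by a maximum scope. The function name, the
router names in the multi-level example, and the `max_hops`
parameter are hypothetical.

```python
def select_repair_node(path, failed_link, has_repair_path, max_hops=None):
    """Pick the upstream router nearest the failure with a repair path.

    path: routers on the primary path, from source to destination.
    failed_link: (upstream end, downstream end), consecutive on path.
    has_repair_path: set of routers that hold a repair path.
    Returns (repair_node, hops_to_failure), or None if none qualifies.
    """
    detector = failed_link[0]             # local detection node
    idx = path.index(detector)
    # The detector is one hop from the failure, so its immediate
    # upstream neighbor is two hops away, and so on.
    for hops, node in enumerate(reversed(path[:idx]), start=2):
        if max_hops is not None and hops > max_hops:
            return None
        if node in has_repair_path:
            return node, hops
    return None

# Figure 1: R3 is the nearest upstream router with a repair path.
print(select_repair_node(["R3", "R2", "R4"], ("R2", "R4"), {"R3"}))

# A hypothetical three-level fabric: leaf L1, aggregation A1, spine S1.
fabric_path = ["L1", "A1", "S1", "A2", "L2"]
print(select_repair_node(fabric_path, ("S1", "A2"), {"A1", "L1"}))
```

When both A1 and L1 hold repair paths, A1 is chosen because it is
closer to the failure, which keeps the notification within two hops.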
6. Security Considerations
TBD.
7. IANA Considerations
TBD.
8. References
8.1. Normative References
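[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119,
March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.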
8.2. Informational References
[RFC5714] Shand, M. and S. Bryant, "IP Fast Reroute Framework", RFC
5714, DOI 10.17487/RFC5714, January 2010,
<https://www.rfc-editor.org/info/rfc5714>.
[I-D.ietf-rtgwg-segment-routing-ti-lfa] Litkowski, S., Bashandy, A.,
Filsfils, C., Francois, P., Decraene, B., and D. Voyer,
"Topology Independent Fast Reroute using Segment Routing",
draft-ietf-rtgwg-segment-routing-ti-lfa-13 (work in
progress), January 2024.
Authors' Addresses
Yisong Liu
China Mobile
China
Email: liuyisong@chinamobile.com
Changwang Lin
New H3C Technologies
China
Email: linchangwang.04414@h3c.com
Mengxiao Chen
New H3C Technologies
China
Email: chen.mengxiao@h3c.com
Zheng Zhang
ZTE Corporation
China
Email: zhang.zheng@zte.com.cn