Internet DRAFT - draft-mohanty-idr-secondary-label
draft-mohanty-idr-secondary-label
Network Working Group S R. Mohanty
Internet-Draft Cisco Systems, Inc.
Intended status: Standards Track I. Means
Expires: 11 January 2024 P. Ramadenu
AT&T Labs, Inc.
10 July 2023
The Secondary Label and its applications
draft-mohanty-idr-secondary-label-00
Abstract
This draft utilizes the concept of a secondary label to solve few
cases in L3VPN Deployments.In BGP VPN networks, BGP speakers
associate a local MPLS label when the next-hop is reset and advertise
that label to other peers. The receiving peer installs this
"received" label in the forwarding and forwards traffic to the
sending router using this label. In some deployments, there arises
need where a different label is required to be sent. We illustrate
with two use-cases.
This draft presents a method where this label is encoded in a newly
defined attribute that is advertised with the BGP updates targeting
these specified use-cases
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 11 January 2024.
Copyright Notice
Copyright (c) 2023 IETF Trust and the persons identified as the
document authors. All rights reserved.
Mohanty, et al. Expires 11 January 2024 [Page 1]
Internet-Draft Secondary-Label-applications July 2023
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Requirements Language . . . . . . . . . . . . . . . . . . . . 2
2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
3. The per-Nexthop-received-label Mode . . . . . . . . . . . . . 3
4. Problem Description . . . . . . . . . . . . . . . . . . . . . 3
4.1. Problem Description 1 (PD#1) . . . . . . . . . . . . . . 3
4.2. Problem Description 2 (PD#2) . . . . . . . . . . . . . . 5
5. Proposed Solutions . . . . . . . . . . . . . . . . . . . . . 7
5.1. Proposed Solution PS#1 . . . . . . . . . . . . . . . . . 7
5.2. Proposed Solution PS#2 . . . . . . . . . . . . . . . . . 8
6. Secondary Label Attribute . . . . . . . . . . . . . . . . . . 9
7. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . 10
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10
9. Operational Considerations . . . . . . . . . . . . . . . . . 10
10. Security Considerations . . . . . . . . . . . . . . . . . . . 10
11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 10
12. Normative References . . . . . . . . . . . . . . . . . . . . 10
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 11
1. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
2. Introduction
In BGP L3VPN VPN networks, BGP speakers associate a local MPLS label
with the next-hop is reset and advertise that label to other peers.
The receiving peer installs this "received" label in the forwarding
and directs traffic to the sending router using this label. This
local label allocation is governed by the configured label allocation
mode. Broadly, most vendors already offer different allocation modes
like per-vhf, per-prefix, per-next-hop and per-nexthop-per-received-
label.
Mohanty, et al. Expires 11 January 2024 [Page 2]
Internet-Draft Secondary-Label-applications July 2023
In certain cases, the exclusive allocation of the local label is not
sufficient. In this draft, we outline use-cases where the allocation
of an additional label, hereby referred to as the secondary label, is
necessary to be communicated to the BGP peer. Using this secondary
label, the peer can impose forwarding decisions and solve some use-
cases that are significantly non-trivial to achieve with the standard
local-label allocation alone.
3. The per-Nexthop-received-label Mode
The standard behavior in case of option-B ASBR [RFC4364] is to
allocate a per-prefix label for vpn prefixes. To conserve label
space at the ASBR, many vendors implement a label allocation mode
called per-nexthop-received- label. With per-nexthop-received-label,
all prefixes received with the same next-hop and same received-label
(both together constitute the label context) will be assigned the
same local label. This approach conserves label space by avoiding
the allocation is a unique label for each prefix. In case of
Primary/backup, the context of the label allocation is the set of
tuples {(Nexthop, recvd-label)} The above implementation (originally
meant for the ASBR) also applies to RR with next-hop-self. In the
below topology (representative of a tier 1 provider topology), RR1
and RR2 have the per-nexthop-received-label mode configuration and
have next-hop-self towards each other. Both RRs receive the VPN
prefix (RD 1:1: 2.2.2.2/32) from R1 with its connected address as the
next-hop and advertise to the other RR on the cross-link after
resetting the next-hop to self.
Although we will not explain here, a similar topology can be thought
of in an dual Option-B deployment where the ASBRs will have each
other as backup [RFC2119].
4. Problem Description
4.1. Problem Description 1 (PD#1)
Mohanty, et al. Expires 11 January 2024 [Page 3]
Internet-Draft Secondary-Label-applications July 2023
+---------------------------------+
| +----+ |
| |PE1 | |
| /+----+\ |
| / \ |
| / \ |
| +----+ +----+ |
| |RR1 |-------|RR2 | |
| +----+ +----+ |
| \ / |
| \ / |
| \ / |
| +----+ |
| |PE2 | |
| +----+ |
| | |
+-----------------|---------------+
Figure 1 RR deployed with nexthop-self in a symmetric PIC Configuration
Figure 1
Figure 1 represents an all-IBGP network. PE1 is originating VPN
routes and advertising them to RR1 and RR2. Both these RRs are also
clients of each other and advertise VPN routes to each other with the
next-hop set to the peering address. Each RR considers the path from
PE1 as the best and the backup from the other RR (BGP PIC for VPNV4
and VPNV6 is configured). Label mode per-nexthop-received-label is
configured.
a. This is how the issue gets manifested.
b. Initially, RR1 receives the primary path from PE1. Local Label
allocation at RR1 has context [(PE1, LabelPE1)] and local label,
LabelRR1 is allocated. This label is advertised to RR2.
c. Similarly, RR2 receives the primary path from PE1. Local Label
allocation at RR2 has context [(PE1, LabelPE1)] and local label,
LabelRR2 is allocated. This label is advertised to RR1
Mohanty, et al. Expires 11 January 2024 [Page 4]
Internet-Draft Secondary-Label-applications July 2023
d. RR1 gets the update from RR2. It now sees the label context as
[(PE1, LabelPE1),(RR2, LabelRR2)] and allocates local label
LabelRR11. This label now becomes the received label at RR2.
e. RR2 now sees the label context as [(PE1, LabelPE1),(RR1,
LabelRR11)] and allocates local label LabelRR21. This label now
becomes the received label at RR1.
f. RR1 gets the update from RR2. It now sees the label context as
[(PE1, LabelPE1),(RR2, LabelRR21)] and allocates local label
LabelRR12. This label now becomes the received label at RR2 and
this process continues.
The root cause of the label churn in is because the local label in
RR1 (same for RR2) is an input to the label allocation context at
RR2, and the resulting allocated local label at RR2 now serves as an
input into the label allocation context at RR1. Because of this
feedback loop the situation quickly results in the RRs getting out of
label space very quickly.
Notice that if the RRs have the per-prefix label allocation mode
configured, then this sort of oscillation will not happen. However,
the per-prefix label allocation in an RR with next-hop-self
configured will also mean a unique label for every unique prefix and
that is not scalable.
4.2. Problem Description 2 (PD#2)
ISP1 and ISP2 are CE devices that establish an EBGP session with PE1
and PE2 respectively. Both ISPs advertise the same 700k prefixes/
routes to PE1 and PE2. Both PE1 and PE2 only send the default route
to the remote PE, PE0.
Mohanty, et al. Expires 11 January 2024 [Page 5]
Internet-Draft Secondary-Label-applications July 2023
+-----------------------------------------+
| +-------+ +-----+ |
| |--| PE1 |----|ISP1 | |
| | +-------+ +-----+ |
| | | \ / |
| | | \ / |
| | | \ / |
| +-----+ +--+ | \/ |
| |PE0 |---|BR| | /\ |
| +-----+ +--+ | / \ |
| | | / \ |
| | | / \ |
| | | / \ |
| | +-------+ +----+ |
| |--| PE2 |----|ISP2| |
| +-------+ +----+ |
| |
+-----------------------------------------+
Figure 2 Dual Homed CE Setup
Figure 2
a. The PE devices have VPNv4 peering between them. BR is a P
router.
b. Traffic from the remote PE, PE0, does ECMP forwarding to both PE1
and PE2.
c. The 400K routes prefer ISP1 as the egress NH and 300K routes
prefer ISP2 as the egress NH by the policy configured on PE
devices.
d. The Policy is a simple BGP policy that associates the highest
Local Preference (LP) with the preferred EBGP path, the next
highest local preference with the IBGP path and the lowest local
preference with the least preferred EBGP path.
Failure scenario 1 (FS#1) When ISP1-PE1 link goes down , after BGP at
PE1 converges, traffic will traverse the link through PE2, and then
reach ISP1.
Mohanty, et al. Expires 11 January 2024 [Page 6]
Internet-Draft Secondary-Label-applications July 2023
Failure scenario 2 (FS#2) The links from ISP1 to PE1 and PE2 are down
at the same time; traffic will go to ISP2 after the BGP convergence
at PE1 and PE2
FS#1 is a classic example when BGP PIC is appropriate . It is no
wonder therefore that the convergence is good. However, in case of
FS#2, with BGP PIC in place, this is what happens:
1. when the link between PE1 and ISP1 went down, the traffic that
ingresses on PE1 gets diverted to PE2
2. Because the link between PE2 and ISP1 was torn down and the BGP
withdraw from PE1 has not yet been received at PE2, this diverted
traffic suffers a routing lookup at PE2 and will be sent back to
PE1.
3. On PE1 it suffers a routing lookup again and diverted again to
PE2.
4. This process repeats until the BGP withdraws corresponding to
link breakages are received at the peer PEs
5. It is important to observe here that the particular label
allocation mode (per-prefix or per-next-hop)has no bearing on the
loop, it will still happen regardless.
6. FIB performance gets impacted due to the loop and new control
plane state after convergence takes more time to be installed in
the FIB.
The conclusion is BGP PIC by itself is not adequate to handle these
kinds of convergence issues arising from double link-failures.
5. Proposed Solutions
5.1. Proposed Solution PS#1
To solve the issue of Section 4.1 the concept of a secondary label is
introduced. At both the RRs, in addition to the local label another
label, hereby referred to as the secondary label, is also allocated.
This secondary label depends on the primary path exclusively i.e.
only the path from PE1 and not on the path from RR2. This secondary
label is encoded inside an attribute called the secondary label
attribute that is advertised along with the the BGP best-path
advertisement to RR2 and PE2. The format of the secondary label
attribute is described in Section 6. Similar concept is described in
[I-D.kaliraj-idr-multinexthop-attribute] but the next-hop and other
fields etc. is not included in the secondary label attribute.
Mohanty, et al. Expires 11 January 2024 [Page 7]
Internet-Draft Secondary-Label-applications July 2023
When RR2 receives the update from RR1 that it selects as its backup
path and finds the secondary label attribute, it will only consider
the label encoded in the secondary label attribute and ignore the
received label in its local-label allocation decision. It will also
program the label encoded in the secondary label attribute instead of
the received label in the forwarding imposition. As the secondary
label only depends on the primary path from PE1, it is unaffected by
the advertisement from the other RR, and the continuous label churn
is arrested immediately.
5.2. Proposed Solution PS#2
Without loss of generality, considering PE2 as the DUT, the main
reason about the inability of BGP PIC (as described above) to help in
this case is that the status of the primary link on the peer PE, PE1,
the PE1-ISP1 link, is unknown to PE2 until it receives the
corresponding BGP withdraw. Following is the main underlying idea of
our proposed solution.
1. Allocate a Primary label with the primary path pointing to the
directly connected preferred CE (best EBGP path) and the backup
to the less preferred PE (IBGP Path).
2. Allocate a 2nd label with primary path to directly connected
preferred CE (best EBGP path) and backup to the less preferred
EBGP path. This second label is advertised in the Control Plane
along with the primary label leveraging the idea of the secondary
label. But the notion that this second (backup) label is also
associated with a primary path and in case of failure also points
to another backup path is what distinguishes this from the PD#1.
Accordingly, secondary label needs to have a context
3. With the help of the Figure below, we explain our scheme with
respect to the traffic of 400K, that prefers ISP1 (The
explanation for the other 300k follows a symmetric reasoning).
4. Consider 10.10.1.0.0/24 as one such VPN prefix in the group of
400k.
5. Accordingly, PE1 allocates a primary label of 100 that points to
the primary next-hop (NH), ISP1, and to the backup NH, PE2; and,
a backup label of 200 (pointing to primary NH ISP1 and backup NH
ISP2). Similarly, PE2 allocates a primary label of 300 (this
primary label points to primary NH ISP1 and backup NH PE1) and a
backup label of 400 (pointing to primary NH ISP1 and backup NH
ISP2)
Mohanty, et al. Expires 11 January 2024 [Page 8]
Internet-Draft Secondary-Label-applications July 2023
6. Traffic from the remote PEs always uses the primary label.
Traffic sent from one peer PE to another is always sent using the
backup label.
7. Therefore, traffic to 10.10.0.0/24 from the PE0 is received on
PE1 with label 100. In the normal case, this traffic will be
sent on the direct PE1-ISP1 link.
8. Now, if link PE1-IPS1 breaks, this traffic is diverted to PE2
with label 400.
9. When this traffic is received at PE2, if the PE2-ISP1 link is up,
traffic will be forwarded to ISP1 on that link. But, now if the
PE2-ISP1 goes down, the backup path for the label 400 which
points to the NH ISP2, is activated immediately and the traffic
is directed to ISP2 on the PE2-ISP2 link.
6. Secondary Label Attribute
A new Optional Transitive Attribute will be created for carrying the
secondary label. This attribute will be referred as the secondary
label attribute. The format is as specified below.
+-----------------------------+
| Attr Flags| Attr Code = 71|
+-----------------------------+
| Length | Flags |
+-----------------------------+
| Type | Label |
+------------------------------
| Type | |
+-----------------------------+
Figure 3 Secondary Label Attribute
Mohanty, et al. Expires 11 January 2024 [Page 9]
Internet-Draft Secondary-Label-applications July 2023
Figure 3
The Secondary label attribute contains a flags field (1-byte) and a
set of Type (1 byte) and Label (3 bytes). The flag bits will be
specified in the future. The label type will denote the context, for
PS1#, the type is 0, for PS#2, the type is 1. As we find more and
more use-cases, types will be assigned appropriately.
We will request IANA assignment for the secondary label attribute
7. Conclusion
We have described two use-cases where the concept of a second label
greatly helps in optimizing network resources and improve convergence
at the potential cost of increasing the label allocation resources.
However, the advantages of the solutions with the secondary label are
the simplicity, the optimization and convergence improvements that it
provides to the network. There can be many potential use-cases for
this secondary label concept.
We will request IANA assignment for the secondary label attribute
8. IANA Considerations
Request IANA assignment for the secondary label attribute with code-
type 71
9. Operational Considerations
TBD.
10. Security Considerations
This document raises no new security issues for RT Constraints.
11. Acknowledgements
TBD.
12. Normative References
[I-D.kaliraj-idr-multinexthop-attribute]
Vairavakkalai, K. and J. M. Jeganathan, "BGP MultiNexthop
Attribute", Work in Progress, Internet-Draft, draft-
kaliraj-idr-multinexthop-attribute-07, 5 July 2023,
<https://datatracker.ietf.org/doc/html/draft-kaliraj-idr-
multinexthop-attribute-07>.
Mohanty, et al. Expires 11 January 2024 [Page 10]
Internet-Draft Secondary-Label-applications July 2023
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
2006, <https://www.rfc-editor.org/info/rfc4364>.
[RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route
Reflection: An Alternative to Full Mesh Internal BGP
(IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006,
<https://www.rfc-editor.org/info/rfc4456>.
Authors' Addresses
Satya Ranjan Mohanty
Cisco Systems, Inc.
225 West Tasman Drive
San Jose, CA 95134
United States of America
Email: satyamoh@cisco.com
Israel Means
AT&T Labs, Inc.
7337 Trade St
San Diego, CA 92121
United States of America
Email: im8327@att.com
Praveen Ramadenu
AT&T Labs, Inc.
3538 Torrance Blvd Unit 124
Torrance, CA 90503
United States of America
Email: pr9637@att.com
Mohanty, et al. Expires 11 January 2024 [Page 11]