Network Working Group S R. Mohanty
Internet-Draft J. Alcaide
Intended status: Standards Track M. Ghosh
Expires: 13 May 2024 Cisco Systems, Inc.
10 November 2023
A solution to the Hierarchical Route Reflector issue in RT Constraints
draft-mohanty-idr-rtc-hierarchical-rr-02
Abstract
Route Target Constraints (RTC) is used to build a VPN route
distribution graph such that routers only receive VPN routes
corresponding to specified route-targets (RT) that they are
interested in. This is done by exchanging the route-targets as
routes in the RTC address-family and a corresponding "RT filter" is
installed that influences the VPN route advertisement. In networks
employing hierarchical Route Reflectors (RR) the use of RTC can lead
to incorrect VPN route distribution and loss of connectivity, as
detailed in an earlier draft. Two solutions were provided there to
overcome the problem.
This draft presents a method with suggested modifications to the RTC
RFC in order to solve the hierarchical RR RTC problem in an efficient
manner.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 13 May 2024.
Copyright Notice
Copyright (c) 2023 IETF Trust and the persons identified as the
document authors. All rights reserved.
Mohanty, et al. Expires 13 May 2024 [Page 1]
Internet-Draft Hierarchical RR RT-Constraints November 2023
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Requirements Language . . . . . . . . . . . . . . . . . . . . 2
2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
3. RTC and RR Rules . . . . . . . . . . . . . . . . . . . . . . 3
4. Problem Definition . . . . . . . . . . . . . . . . . . . . . 3
5. Topology Considerations . . . . . . . . . . . . . . . . . . . 6
6. Proposed Solution . . . . . . . . . . . . . . . . . . . . . . 7
6.1. Overwriting of Attributes . . . . . . . . . . . . . . . . 7
6.2. Receiver Acceptance Rule . . . . . . . . . . . . . . . . 11
6.3. Optimization when only one Client advertises RTC . . . . 11
7. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . 12
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12
9. Operational Considerations . . . . . . . . . . . . . . . . . 12
10. Security Considerations . . . . . . . . . . . . . . . . . . . 12
11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 12
12. Normative References . . . . . . . . . . . . . . . . . . . . 12
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13
1. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
2. Introduction
Hierarchical RR [RFC4456] deployments with VPN [RFC4364] working in
conjunction with RTC [RFC4684] may result in sub-optimal and
incorrect VPN route distribution, as described in
[I-D.ietf-idr-rtc-hierarchical-rr]. The root cause is the
way the RR rules for RTC are defined in [RFC4684]. The authors of
[I-D.ietf-idr-rtc-hierarchical-rr] furnish two solutions for the
problem, one based on add-paths and the other based on diverse-paths
constructs. In this memo, we present another solution to the
same problem.
3. RTC and RR Rules
When advertising RT membership NLRI to a route-reflector client,
Section 3.2 of [RFC4684] specifies that the advertising RR sets the
ORIGINATOR_ID attribute [RFC4456] to its own router-id, and the
NEXT_HOP attribute to the local address for that session.
However, this creates an issue in hierarchical RR setups, as
explained in [I-D.ietf-idr-rtc-hierarchical-rr]. Figure 1 reproduces
the figure used in [I-D.ietf-idr-rtc-hierarchical-rr]. When RR-2
and RR-3 advertise RT-1 to RR-1, the latter will choose one of the
routes as best and will advertise it to both RR-2 and RR-3
after setting the ORIGINATOR_ID and next-hop to itself.
Note that RR-1 will also add its own CLUSTER_ID [RFC4456] to the
CLUSTER_LIST but, importantly, will not overwrite the CLUSTER_ID of
the sender. This leads to the issue explained in
[I-D.ietf-idr-rtc-hierarchical-rr].
4. Problem Definition
In Figure 1, when RR-1 chooses the route from RR-2 as the best
route, formats the next-hop and ORIGINATOR_ID as explained above,
and then advertises the route to RR-2, RR-2 will drop the route
reflected from RR-1 because of the CLUSTER_ID check: the
CLUSTER_LIST still contains RR-2's own CLUSTER_ID.
+---------------------------------+
| +----+ |
| Clu-1 |RR-1| |
| /+----+\ |
| / \ |
| +----+ +----+ |
| Clu-2 |RR-2| |RR-3| Clu-3 |
| +-/--+ +/--\+ |
| / / \ |
| +----+ +----+ +----+ |
| |PE-1| |PE-2| |PE-3| |
| +----+ +----+ +----+ |
| | | | |
+-------|----------|---------|----+
RT-1 | RT-1 | | RT-1
+--------+ +--------+ +--------+
| VPN-1 | | VPN-1 | | VPN-1 |
+--------+ +--------+ +--------+
Figure 1: Hierarchical RR Setup with RTC
RR-2 will therefore not form the outbound filter of RT-1 towards RR-1
which means that after convergence RR-2 will not advertise VPN routes
to RR-1 anymore. This leads to an incorrect VPN route distribution
across the network.
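The drop at RR-2 can be reproduced with a minimal Python sketch of
the reflection behavior described in Section 3 and of the [RFC4456]
loop checks. The dictionary-based route model and the function names
(reflect_rfc4684, accepts) are illustrative assumptions only; they do
not come from any specification or implementation.

```python
# Hypothetical model: a route is a dict; router-ids, addresses and
# cluster-ids are plain strings.

def reflect_rfc4684(route, local_cluster_id, local_router_id, local_addr):
    """Reflect an RTC route to a client as described in Section 3.

    ORIGINATOR_ID and NEXT_HOP are set to the reflector itself and the
    local CLUSTER_ID is prepended; CLUSTER_IDs already present in the
    CLUSTER_LIST are left untouched.
    """
    return {
        "originator_id": local_router_id,
        "next_hop": local_addr,
        "cluster_list": [local_cluster_id] + list(route["cluster_list"]),
    }


def accepts(route, local_cluster_id, local_router_id):
    """Loop checks: reject a route carrying our own ORIGINATOR_ID or CLUSTER_ID."""
    if route["originator_id"] == local_router_id:
        return False
    return local_cluster_id not in route["cluster_list"]


# RT-1 as received by RR-1 from RR-2 (originated by PE-1, reflected by RR-2).
from_rr2 = {"originator_id": "PE-1", "next_hop": "PE-1", "cluster_list": ["Clu-2"]}
to_clients = reflect_rfc4684(from_rr2, "Clu-1", "RR-1", "RR-1")

print(to_clients["cluster_list"])            # ['Clu-1', 'Clu-2']
print(accepts(to_clients, "Clu-2", "RR-2"))  # False: Clu-2 is still in the list
print(accepts(to_clients, "Clu-3", "RR-3"))  # True: RR-3 installs the filter
```

In this sketch only RR-3 accepts the reflected route, which matches
the outcome described above: RR-2 never installs the RT-1 filter
towards RR-1.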
In the scenario of Figure 2, CE-1 is multi-homed to PE-1 and PE-2 and
wants to communicate with CE-2, which is behind PE-4. As explained
earlier, because RR-1 chooses the RR-2 path as best in the RTC family,
RR-1 only receives the VPN route from RR-3 (and not from RR-2) in the
steady state.
+---------------------------------+
| +----+ +-----+ | +------------+
| Clu-1 |RR-1| ---|PE-4 |- - -| VPN-1 (CE2)|
| /+----+\ +-----+ | +------------+
| / \ |
| +----+ +----+ |
| Clu-2 |RR-2| |RR-3| Clu-3 |
| +-/--+ +--\-+ |
| / \ |
| +----+ +----+ |
| |PE-1| |PE-2| |
| +----+ +--/-+ |
| \ / |
+----------\-----------/----------+
\ RT-1 /
+--------+ ----|
| VPN-1 (CE1) |
--------------|
Figure 2: Hierarchical RR Setup with RTC with dual-homed CE
Notice that even if the link between RR-3 and RR-1 goes
down, the RR-2 path still remains best in the RTC address-family
at RR-1, and the VPN route advertisements from RR-2 to RR-1
continue to be blocked. Thus, even though there is alternative
connectivity from CE-1 to PE-4 via PE-1, RR-2 and RR-1, the BGP VPN
routes cannot be sent. In fact, CE-1 is completely cut off from the
rest of the network. Generalizing, this means that for a hierarchical
RR with only a single first-level RR as its client, the solution is
completely broken. Notice that without RTC, RR-1 would have both VPN
paths, and the loss of connectivity to RR-3 would just result in local
convergence at RR-1, subject to the time it takes for the path from
RR-2 to become best.
The solutions presented in [I-D.ietf-idr-rtc-hierarchical-rr] are
based on:
a. Add-path: RR-1 advertises both the paths from RR-2 and RR-3 to
RR-2 and RR-3, so that each of the first-level RRs will accept at
least one of the routes and install the filter.
b. Diverse path: when RR-1 advertises the best path to a client or
non-client speaker, and that speaker is the one whose path is the
best, the advertising router uses the most "diverse" path
(different next-hop and ORIGINATOR_ID than the best path) to
accomplish the same goal, i.e. the path will be accepted at the
receiving speaker.
The problems of solution (a) are a higher management burden
(higher-level RRs need to be identified, add-paths need to be
configured) and an increase in the number of paths to be
advertised. The decision on which paths to advertise also
increases the management burden (one extra path, as suggested, may
not be enough: there are scenarios where the CLUSTER_LIST of the
second best path will contain the cluster-id of the peer). Even when
advertising all the paths, a NPR scheme cannot be guaranteed, as can
be inferred from some of the examples presented below.
For solution (b), a measure of how disjoint the paths are is not well
defined, and it suffers from the same problems as solution (a). In
addition, it requires sending a different update to every
client. This effectively breaks the shared peer update-formatting
implementation that most vendors use.
In Section 6, we provide a solution that does not require
add-path and also improves upon [RFC4684] while solving this
hierarchical RR issue in RTC.
5. Topology Considerations
By the rules of [RFC4456], route-reflector client is a property
that a given BGP speaker assigns to each of its peering sessions
(independently of whether the BGP peer assigns it as well or not).
This flexible definition can be used to configure non-canonical RR
networks (for instance, two peer BGP speakers defining each other as
route-reflector clients). Regardless of whether using such
non-canonical networks is recommended, they can be used in a RR
network without loss of connectivity.
Within the scope of RTC, only RR canonical networks are supported.
We define a RR canonical network as a network where each speaker
has the role of a given level within the hierarchy (e.g. RR 1st
tier, RR 2nd tier, client), and a higher level can only have as a
client a speaker of a lower level. In a RR canonical network, a
speaker advertising a route to a client will never receive this
route back. The requirement for a canonical network to propagate RTC
routes is implicit in [RFC4684], but is hereby formalized.
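As an illustration of the canonical-network definition, the following
Python sketch checks that every route-reflector only has clients of a
strictly lower level. The `level` and `clients` mappings and the
function name is_canonical are hypothetical; a real deployment would
derive this information from configuration.

```python
def is_canonical(level, clients):
    """Return True if every speaker only has clients of a strictly lower level.

    level   -- hypothetical mapping: speaker name -> tier (0 = plain client)
    clients -- hypothetical mapping: speaker name -> list of its RR clients
    """
    return all(
        level[client] < level[rr]
        for rr, its_clients in clients.items()
        for client in its_clients
    )


# Figure 1: RR-1 sits above RR-2 and RR-3, which sit above the PEs.
level = {"RR-1": 2, "RR-2": 1, "RR-3": 1, "PE-1": 0, "PE-2": 0, "PE-3": 0}
clients = {"RR-1": ["RR-2", "RR-3"], "RR-2": ["PE-1"], "RR-3": ["PE-2", "PE-3"]}
print(is_canonical(level, clients))  # True

# Non-canonical: two speakers defining each other as clients.
mutual = {"A": ["B"], "B": ["A"]}
print(is_canonical({"A": 1, "B": 1}, mutual))  # False
```

For the mutual-client case no assignment of levels can satisfy the
check, which is why such a network falls outside the scope of RTC.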
As an additional consideration, as we will see in some of the
examples below, it is also desirable for VPN routes to fully
propagate (i.e. equivalent to not having RTC routes at all).
6. Proposed Solution
To solve the problem described, a given client needs to use the RTC
route to create a VPN filter towards the RR, even when the RR is
sending back the RTC route advertised by that client. In [RFC4684],
loop detection is avoided by overwriting the attributes that could
trigger it. But, as described, this overwriting is only effective
when there is only one level of RRs.
Two solutions are proposed: one for the sender of RTC routes, which
generalizes [RFC4684], and one for the receiver of RTC routes, which
uses a different paradigm than the one described in [RFC4684]. Only
one needs to be implemented. Implementing both, one at the receiver
and one at the sender, allows easier interoperability with
non-compliant implementations. If the sender option is implemented,
it takes precedence over the receiver option (which becomes a NOOP).
6.1. Overwriting of Attributes
This rule is to be used by the sender of RTC routes.
When a RR reflects an RTC route from RR client to RR client, some
attributes of the route may be overwritten when advertising the best
RTC route. This overwrite is particular to the RTC address family and
will not happen for other address-families. It disables loop
detection via those attributes when the best RTC routes are
advertised back to their originators. This is needed in case there
are other non-best RTC routes; it allows the originator of the best
RTC route to receive an RTC route for the route-target of interest
and to create its own VPN RT filter towards the RR.
The above is described in [RFC4684], which overwrites the
ORIGINATOR_ID and NEXT_HOP attributes (Section 3.2, rule (i)). The
proposed new rules generalize this concept by overwriting the
CLUSTER_LIST as well. This new behavior allows the correct
propagation of RTC routes at higher-level RRs.
When reflecting the (best-path) RTC route from RR client to RR
client, the following rules will apply:
* When the RTC route has a CLUSTER_LIST, overwrite every CLUSTER_ID
in the CLUSTER_LIST with the local CLUSTER_ID. Note that when
advertising that RTC route, the local CLUSTER_ID will still be
prepended per the usual rules.
* ORIGINATOR_ID is set or overwritten with the local router-id.
* NEXT_HOP is overwritten with the local peering address
(next-hop-self).
* An RTC route will always be advertised to the client it was
received from.
Note that the rules above only expose RTC routes to routing loops
(by overwriting attributes) when reflecting from client to client.
Thus, this draft restricts [RFC4684] by disallowing attribute
overwrite in the non-client to client direction.
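The sender rules can be sketched in Python as follows. This is a
minimal illustration under stated assumptions, not a definitive
implementation: the dictionary-based route model, the function names
(reflect_rtc, passes_loop_checks) and the boolean direction flags are
hypothetical, and the received route is assumed to already carry an
ORIGINATOR_ID.

```python
def reflect_rtc(route, from_client, to_client,
                local_cluster_id, local_router_id, local_addr):
    """Build the RTC route a RR advertises when reflecting `route`."""
    cluster_list = list(route.get("cluster_list", []))
    if from_client and to_client:
        # Sender rules of this section: overwrite every CLUSTER_ID,
        # the ORIGINATOR_ID and the NEXT_HOP with local values.
        cluster_list = [local_cluster_id] * len(cluster_list)
        originator_id = local_router_id
        next_hop = local_addr
    else:
        # Any other direction: no attribute overwrite (assumption: the
        # received route already carries an ORIGINATOR_ID).
        originator_id = route["originator_id"]
        next_hop = route["next_hop"]
    # The local CLUSTER_ID is prepended per the usual rules in both cases.
    return {
        "originator_id": originator_id,
        "next_hop": next_hop,
        "cluster_list": [local_cluster_id] + cluster_list,
    }


def passes_loop_checks(route, local_cluster_id, local_router_id):
    """ORIGINATOR_ID and CLUSTER_LIST checks performed by the receiver."""
    return (route["originator_id"] != local_router_id
            and local_cluster_id not in route["cluster_list"])


# Figure 1: RR-1 reflects its best route (learned from client RR-2)
# back to RR-2.
best = {"originator_id": "PE-1", "next_hop": "PE-1", "cluster_list": ["Clu-2"]}
back = reflect_rtc(best, True, True, "Clu-1", "RR-1", "RR-1")
print(back["cluster_list"])                       # ['Clu-1', 'Clu-1']
print(passes_loop_checks(back, "Clu-2", "RR-2"))  # True
```

With the CLUSTER_LIST overwritten, RR-2 no longer finds its own
CLUSTER_ID in the reflected route and can install the VPN filter
towards RR-1.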
In Figure 3, consider a case similar to the one in Figure 1 but with
3 levels of RR. Assume there is one physical link for each BGP
peering, each with the same IGP cost. Both PE-4 and PE-5 originate an
RTC route. Propagation of RTC routes is PE-4->RR-4->RR-2->RR-1 and
PE-5->RR-5->RR-3->RR-1. RR-1 chooses as best the RTC route from RR-2.
It reflects it back to RR-2 and RR-3 with
ORIGINATOR_ID = router-id-of-RR-1 and
CLUSTER_LIST = {Clu-1, Clu-1, Clu-1}. RR-2 still prefers the
route from RR-4, but it accepts the route received from RR-1. Thus
RR-2 creates a VPN filter towards RR-1 to propagate the VPN route.
In this case, the RTC route received from RR-1 stops at RR-2, so only
the overwriting of the first cluster-id of the CLUSTER_LIST was
strictly necessary.
+---------------------------------+
| +----+ |
| |RR-1| Clu-1 |
| /+----+\ |
| / \ |
| +----+ +----+ |
|Clu-2 |RR-2| |RR-3| Clu-3|
| +-|--+ +--|-+ |
| | | |
| +----+ +----+ |
|Clu-4 |RR-4| |RR-5| Clu-5|
| +--|-+ +--|-+ |
| | | |
| +--|-+ +--|-+ |
| |PE-4| |PE-5| |
| +----+ +----+ |
| | | |
+----------|-------------|--------+
RT-1 | | RT-1
+--------+ +--------+
| VPN-1 | | VPN-1 |
+--------+ +--------+
Figure 3: Example of overwriting CLUSTER_LIST with different cluster-ids
Consider a similar scenario in Figure 4. In this case, the tier II
RRs share one cluster-id and the tier III RRs share another. IGP
costs are not exactly defined, but assume that they cause the route
propagation that follows. Both PE-4 and PE-5 originate an RTC route.
One propagation is PE-5->RR-5->RR-3->RR-1. The IGP costs are such
that RR-2 prefers the route received from RR-1. RR-2 reflects the
route from RR-1 to RR-4, and RR-4 accepts it because it receives
CLUSTER_LIST = {Clu-2, Clu-1, Clu-1, Clu-1} (after RR-1 overwrote and
RR-2 prepended). Similarly, RR-3 reflects the route received from
RR-5 to RR-4, and RR-4 accepts it because it receives CLUSTER_LIST =
{Clu-2, Clu-2} (after RR-3 overwrote it).
Consider now a different rule: only the first cluster-id of the
CLUSTER_LIST is overwritten. In this case, RR-4 would have
received CLUSTER_LIST = {Clu-2, Clu-1, Clu-1, Clu-3} and would have
discarded the update. The end result is that RR-4 would not install
the VPN filter towards RR-2 and would not advertise VPN routes
towards RR-2. This becomes a network where the VPN routes are not
fully propagated (i.e. the propagation of VPN routes is different
than if there were no RTC routes at all). In this kind of network,
VPN routes still reach PE-6. However, if RR-3/RR-5 went down, VPN
routes would not immediately reach RT-1. RTC routes would have to
reconverge and then a filter would be installed to allow RR-4 to
advertise routes to RR-2. Thus, convergence would suffer.
It can be seen that, in the general case, it is necessary to
overwrite all the cluster-ids of the CLUSTER_LIST.
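The two CLUSTER_LIST values discussed for Figure 4 can be recomputed
with a short Python sketch that compares the "overwrite all" rule
with the "overwrite first only" alternative. The helper names
(plain, downward, seen_by_rr4) and the list-of-strings model of the
CLUSTER_LIST are assumptions made for illustration.

```python
def plain(cluster_list, local_id):
    """Ordinary reflection: only prepend the local CLUSTER_ID."""
    return [local_id] + cluster_list


def downward(cluster_list, local_id, policy):
    """Client-to-client reflection: overwrite per `policy`, then prepend."""
    if policy == "all":
        rewritten = [local_id] * len(cluster_list)
    elif policy == "first":
        rewritten = ([local_id] + cluster_list[1:]) if cluster_list else []
    else:
        raise ValueError("unknown policy: " + policy)
    return [local_id] + rewritten


def seen_by_rr4(policy):
    """CLUSTER_LIST received by RR-4 for PE-5->RR-5->RR-3->RR-1->RR-2->RR-4."""
    at_rr3 = plain([], "Clu-3")                 # RR-5 reflects PE-5's route
    at_rr1 = plain(at_rr3, "Clu-2")             # RR-3 reflects client to non-client
    at_rr2 = downward(at_rr1, "Clu-1", policy)  # RR-1 reflects client to client
    return plain(at_rr2, "Clu-2")               # RR-2 reflects non-client to client


for policy in ("all", "first"):
    received = seen_by_rr4(policy)
    accepted = "Clu-3" not in received          # RR-4 is in cluster Clu-3
    print(policy, received, "accepted" if accepted else "discarded")
```

Under these assumptions the "all" policy yields
{Clu-2, Clu-1, Clu-1, Clu-1}, which RR-4 accepts, while the "first"
policy yields {Clu-2, Clu-1, Clu-1, Clu-3}, which RR-4 discards.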
+----------------------------------+
| +----+ +-----+ | RT-1 +-------+
| Clu-1 |RR-1| ---|PE-6 |----------| VPN-1 |
| /+----+\ +-----+ | +-------+
| / \ |
| +----+ +----+ |
|Clu-2 |RR-2| |RR-3| Clu-2|
| +-|--+\ / +--|-+ |
| | \ / | |
| +----+ X +--|-+ |
|Clu-3 |RR-4| -/ \ -|RR-5| Clu-3|
| +--|-+ +--|-+ |
| | \ / | |
| +--|-+ \ / +--|-+ |
| |PE-4| X |PE-5| |
| +----+ -- / \--+----+ |
| | | |
+----------|-------------|---------+
RT-1 | | RT-1
+--------+ +--------+
| VPN-1 | | VPN-1 |
+--------+ +--------+
Figure 4: Example of overwriting CLUSTER_LIST with same cluster-ids
[RFC4684] is not explicit about it, but the underlying assumption is
that a route received from a route-reflector client MUST be reflected
back to that client. This is hereby made explicit.
The following recommended (NEXT_HOP-IGNORE) rules can be implemented:
* When reflecting an RTC route, NEXT_HOP overwrite is disabled.
* When receiving an RTC route, it is not discarded even if the
received NEXT_HOP is one of the IP addresses of the speaker.
The NEXT_HOP-IGNORE rules effectively allow using the same
NEXT_HOP across the network. They are a change with respect to
[RFC4684] even for a single level of RR. Note that disabling the
NEXT_HOP check does not create any more loop conditions in a
canonical network.
An advantage of using the NEXT_HOP-IGNORE rules is that the selection
of the best-path RTC route is now determined by the IGP cost to the
original next-hop. Otherwise, propagation of RTC routes is less
predictable, as it depends on the IGP costs towards the peering
address of each individual peer.
6.2. Receiver Acceptance Rule
This rule is to be used by the receiver of RTC routes.
When receiving an RTC route, the following rules will apply:
1. CLUSTER_ID, ORIGINATOR_ID and NEXT_HOP checks will be performed,
but instead of being discarded on failure, the route will be kept
in Adj-RIB-In as a Received-only route.
2. A route in Received-only state will not be considered for
best-path nor advertised to any peer.
3. A route in Received-only state will be considered when installing
a VPN filter.
The rules above also apply to a single level of RR, and they are a
solution not contemplated in [RFC4684].
The rules above propagate RTC routes differently from the sender
option rules (with the sender option, non-client to client
propagation will not be stopped). However, the creation of VPN
filters will be the same in a standard RR topology.
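The receiver acceptance rules can be sketched in Python as below.
The RtcRoute class, its field names and the helper functions
(receive, best_path_candidates, vpn_filter_peers) are hypothetical
and only illustrate the three rules; Adj-RIB-In is modelled as a
plain list.

```python
from dataclasses import dataclass, field


@dataclass
class RtcRoute:
    rt: str                      # route-target carried by the RTC route
    peer: str                    # peer the route was received from
    originator_id: str
    next_hop: str
    cluster_list: list = field(default_factory=list)
    received_only: bool = False


def receive(route, local_router_id, local_cluster_id, local_addrs):
    """Rule 1: run the checks, but mark the route instead of discarding it."""
    failed = (route.originator_id == local_router_id
              or local_cluster_id in route.cluster_list
              or route.next_hop in local_addrs)
    route.received_only = failed
    return route


def best_path_candidates(adj_rib_in):
    """Rule 2: Received-only routes never take part in best-path selection."""
    return [r for r in adj_rib_in if not r.received_only]


def vpn_filter_peers(adj_rib_in, rt):
    """Rule 3: every stored route, Received-only or not, installs a filter."""
    return {r.peer for r in adj_rib_in if r.rt == rt}


# RR-2 (cluster Clu-2) receives its own route reflected back by RR-1
# without any CLUSTER_LIST overwrite.
reflected = RtcRoute("RT-1", "RR-1", "RR-1", "10.0.0.1", ["Clu-1", "Clu-2"])
adj_rib_in = [receive(reflected, "RR-2", "Clu-2", {"10.0.0.2"})]

print(adj_rib_in[0].received_only)             # True
print(best_path_candidates(adj_rib_in))        # []
print(vpn_filter_peers(adj_rib_in, "RT-1"))    # {'RR-1'}
```

In this sketch the route fails the CLUSTER_LIST check at RR-2, yet
the RT-1 filter towards RR-1 is still installed.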
6.3. Optimization when only one Client advertises RTC
An additional optional rule is defined to optimize the propagation
of RTC routes by the RR, skipping advertisements that are
unnecessary.
When reflecting the (best-path) RTC route from RR client to RR
client, the following rule will apply:
1. When the RR's best RTC route is from a client and that RTC route
is not being received from any other peer, the RR MAY skip the
advertisement towards that client.
The rule above can be used as an optimization even if only the
receiver rule is implemented.
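A minimal Python sketch of this optional rule follows. The function
name should_advertise_to_client and its parameters are hypothetical;
the set of peers currently sending the RTC route is assumed to be
available from Adj-RIB-In.

```python
def should_advertise_to_client(client, best_from, best_from_is_client,
                               received_from):
    """Decide whether the RR reflects its best RTC route towards `client`.

    best_from           -- peer the best RTC route was learned from
    best_from_is_client -- True if that peer is a route-reflector client
    received_from       -- set of all peers currently sending this RTC route
    """
    only_source = set(received_from) == {best_from}
    if client == best_from and best_from_is_client and only_source:
        return False  # optional optimization: the RR MAY skip this advertisement
    return True


# Only client RR-2 advertises RT-1 to the RR: reflecting it back is optional.
print(should_advertise_to_client("RR-2", "RR-2", True, {"RR-2"}))          # False

# RR-3 also advertises RT-1: RR-2 must receive the route to build its filter.
print(should_advertise_to_client("RR-2", "RR-2", True, {"RR-2", "RR-3"}))  # True
```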
7. Conclusion
With the procedures above, it is not necessary for the RR to know at
which level it is operating. The above rules are compatible with each
other. The best path is always advertised under any rule, and it is
easily seen that RR-2 will accept the RT Constraint path advertised
from RR-1. Since the path is accepted, the RT filter at RR-2 will
pass the VPN routes, and the problem scenarios are resolved
accordingly.
With this specification in the RT-Constraint address-family, we solve
both the incorrect and the sub-optimal route distribution issues
mentioned above. There
is no need for add-paths. We can also optimize over [RFC4684] on RTC
advertisements based on diversity of ORIGINATOR_ID and CLUSTER_ID so
that a higher level RR does not have to be populated with VPN routes
with a specific RT if that RT is not present in other clusters.
8. IANA Considerations
None.
9. Operational Considerations
TBD.
10. Security Considerations
This document raises no new security issues for RT Constraints.
11. Acknowledgements
The authors would like to thank Swadesh Agrawal and M. Mirza for
useful discussions related to hierarchical RR RTC.
12. Normative References
[I-D.ietf-idr-rtc-hierarchical-rr]
Dong, J., Chen, M., and R. Raszuk, "Extensions to RT-
Constrain in Hierarchical Route Reflection Scenarios",
Work in Progress, Internet-Draft, draft-ietf-idr-rtc-
hierarchical-rr-03, 3 July 2017,
<https://datatracker.ietf.org/doc/html/draft-ietf-idr-rtc-
hierarchical-rr-03>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
2006, <https://www.rfc-editor.org/info/rfc4364>.
[RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route
Reflection: An Alternative to Full Mesh Internal BGP
(IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006,
<https://www.rfc-editor.org/info/rfc4456>.
[RFC4684] Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk,
R., Patel, K., and J. Guichard, "Constrained Route
Distribution for Border Gateway Protocol/MultiProtocol
Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual
Private Networks (VPNs)", RFC 4684, DOI 10.17487/RFC4684,
November 2006, <https://www.rfc-editor.org/info/rfc4684>.
Authors' Addresses
Satya Ranjan Mohanty
Cisco Systems, Inc.
225 West Tasman Drive
San Jose, CA 95134
United States of America
Email: satyamoh@cisco.com
Juan Alcaide
Cisco Systems, Inc.
225 West Tasman Drive
San Jose, CA 95134
United States of America
Email: jalcaide@cisco.com
Mrinmoy Ghosh
Cisco Systems, Inc.
225 West Tasman Drive
San Jose, CA 95134
United States of America
Email: mrghosh@cisco.com