BESS WG Y. Wang
Internet-Draft Z. Zhang
Intended status: Standards Track ZTE Corporation
Expires: 1 March 2022 28 August 2021
EVPN VPWS as VRF Attachment Circuit
draft-wz-bess-evpn-vpws-as-vrf-ac-02
Abstract
When a VRF Attachment Circuit (VRF-AC) is far away from its IP-VRF
instance, we can deploy an EVPN VPWS ([RFC8214]) between that VRF-AC
and its IP-VRF instance. From the viewpoint of the IP-VRF instance,
a local virtual interface takes the place of that remote "VRF-AC".
The IP address for that VRF-AC is now configured on the virtual
interface; in other words, the virtual interface is the actual VRF-AC
of the IP-VRF instance. The virtual interface is also the AC of that
VPWS instance; in other words, the virtual interface is cross-
connected to that remote "VRF-AC" by the VPWS instance.
This document proposes an extension to
[I-D.ietf-bess-evpn-inter-subnet-forwarding] to support this
scenario.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 1 March 2022.
Copyright Notice
Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved.
Wang & Zhang Expires 1 March 2022 [Page 1]
Internet-Draft EVPN VPWS as VRF-AC August 2021
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Simplified BSD License text
as described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Simplified BSD License.
Table of Contents
1. Introduction
1.1. Integrated Routing and Cross-connecting
1.2. Terminology
2. ARP/ND Synching and IP Prefix Synching
2.1. Constructing MAC/IP Advertisement Route
2.1.1. When CEs are Hosts
2.1.2. When CEs are Routers
2.2. Constructing Ethernet A-D Route
2.3. Constructing IP Prefix Advertisement Route
2.3.1. Direct-Prefixes Advertisement
2.3.2. Exclusive CE-Prefixes of Each CE
3. Packet Walk Through
3.1. When CEs are Hosts
3.2. When CEs are Routers
4. Fast Convergence for Routed Traffic
5. Considerations on ABRs and Route Reflectors
6. For Common CE-prefixes behind R1 and R2
6.1. Solution 1: Independent CE-BGP sessions
6.2. Solution 2: ECMP-Merging for RT-5G routes
6.2.1. ECMP-Merging by RT-5L
6.2.2. ECMP-Merging by RT-2R
6.3. Solution 3: RT-5E Routes Advertisement
6.3.1. CE-Prefix Advertisement by RT-5E Routes
6.3.1.1. When Internal Remote PEs Receive the RT-5E
6.3.1.2. When External Remote PEs Receive the RT-5E
6.3.1.3. Packet Walk Through
6.3.2. The Advertisement of SOI-mapping Routes
6.3.3. IP-mapping SOI Extended Community
7. Security Considerations
8. IANA Considerations
9. Normative References
10. Informative References
Authors' Addresses
1. Introduction
When a VRF Attachment Circuit (VRF-AC) is far away from its IP-VRF
instance, we can deploy an EVPN VPWS ([RFC8214]) between that VRF-AC
and its IP-VRF instance. From the viewpoint of the IP-VRF instance,
a local virtual interface takes the place of that remote "VRF-AC".
The IP address for that VRF-AC is now configured on the virtual
interface; in other words, the virtual interface is the actual VRF-AC
of the IP-VRF instance. The virtual interface is also the AC of that
VPWS instance; in other words, the virtual interface is cross-
connected to that remote "VRF-AC" by the VPWS instance.
The requirements of this scenario are described in Section 1.1.
1.1. Integrated Routing and Cross-connecting
When an IP-VRF instance and an EVPN VPWS instance are connected by a
virtual interface, we call the scenario an Integrated Routing and
Cross-connecting (IRC) use case, and the virtual interface connecting
the EVPN VPWS and the IP-VRF is called an IRC interface. The name
reflects that packets received from the virtual interface are routed
in the IP-VRF, while data packets sent to the virtual interface are
cross-connected to the remote AC of that EVPN VPWS.
The IRC use case is illustrated by the following figure:
PE1
+---------------------+
| IRC1=10.9 |
| +-----+ +------+ |.
.| |VPWS1|---|IPVRF1| | .
. | +-----+ +------+ | .
PE4 . | | . PE3
+--------+. +---------------------+ +---------+
| | | | |
|+-----+ | | RT-2 |+------+ |
||VPWS1| | | <10.2, M1> ||IPVRF1| |
|+-----+ | | label2=IPVRF1 |+------+ |
| | | | label1=VPWS1 | | |
+---|----+. | RT=VPWS1 .+---|-----+
| . PE2 V . |
| . +---------------------+ . |
| .| IRC1=10.9 |. |
N1=10.2 | +-----+ +------+ | N3=30.2
| |VPWS1|---|IPVRF1| |
Behind N1: | +-----+ +------+ |
60.0/24 | |
70.0/24 +---------------------+
Figure 1: ARP/ND Synchronizing for IRC Interfaces
There are four PE nodes, named PE1/PE2/PE3/PE4, in the above network.
PE4 is a pure EVPN VPWS PE; there may be no IP-VRFs on it. PE3 is a
pure L3 EVPN PE; there may be no VPWSes or MAC-VRFs on it. PE1 and
PE2 sit on the border between the EVPN VPWS domain and the L3 EVPN
domain, so each is both an EVPN VPWS PE and an L3 EVPN PE, and each
hosts both EVPN IP-VRFs and EVPN VPWSes.
N1/N2/N3/N1b may be a host or an IP router. N1/N1b and IRC1 are in
the subnet 10.0.0.0/24, where N1's IP is 10.0.0.2, N1b's IP is
10.0.0.3 and IRC1's IP is 10.0.0.9 (10.9). N2 and IRC2 (see
Figure 3) are in the subnet 20.0.0.0/24, where N2's IP is 20.0.0.2 and
IRC2's IP is 20.0.0.9 (20.9). N3 is in the subnet 30.0.0.0/24. When
N1/N2/N3/N1b is a host, it is also called H1/H2/H3/H1b in this
document. When N1/N2/N3/N1b is a router, it is also called R1/R2/R3/
R1b in this document. The MAC addresses of N1/N2/N3/N1b are
M1/M2/M3/M1b respectively.
When N1 is a router, there are two subnets behind N1: 60.0/24 and
70.0/24.
Note that there may be L2 switches between N1/N2/N3/N4 and their PEs.
These switches are not shown in Figure 1.
Note that the IRC interfaces are considered as AC interfaces in EVPN
VPWS instances. At the same time, they are considered as VRF-ACs in
IP-VRF instances.
When N1 sends an ARP Request REQ_P1, PE4 forwards REQ_P1 to either
PE1 or PE2, but not to both. The IRC1 interfaces on PE1 and PE2 are
both N1's subnet gateway (SNGW). Likewise, when N3 sends an ARP Reply
REP_P2 to N1, PE3 may load-balance REP_P2 to either PE1 or PE2, but
not to both.
When REQ_P1 is load-balanced to PE1 but PE3 load-balances REP_P2 to
PE2, the ARP entry of N1 is not yet available on PE2 when REP_P2
arrives. The forwarding of REP_P2 is therefore delayed by the ARP
miss.
We use RT-2 routes to advertise the ARP entry of N1 from PE1 to PE2.
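The ARP-miss case above can be illustrated with a toy model. This is
only a sketch for intuition, not part of the procedures of this
document: the Pe class, its arp dictionary and the string "M1" used as
a MAC value are hypothetical names chosen for the example.

```python
# Toy model of the ARP-miss case: only the PE that saw the ARP Request
# learns N1's binding, so traffic hashed to the other PE misses.

class Pe:
    def __init__(self, name):
        self.name = name
        self.arp = {}  # IP -> MAC on IRC1 (only IRC1 is modelled)

    def learn(self, ip, mac):
        # Used both for local ARP learning and for a synced entry.
        self.arp[ip] = mac

    def can_forward(self, dst_ip):
        # Forwarding over IRC1 needs a resolved ARP entry.
        return dst_ip in self.arp


pe1, pe2 = Pe("PE1"), Pe("PE2")

# PE4 load-balances N1's ARP Request REQ_P1 to PE1 only.
pe1.learn("10.0.0.2", "M1")

# PE3 load-balances REP_P2 (destined to N1) to PE2: ARP miss.
miss_before_sync = not pe2.can_forward("10.0.0.2")

# Once the RT-2 route carrying <10.0.0.2, M1> is imported,
# PE2 holds a synced entry and can forward immediately.
pe2.learn("10.0.0.2", "M1")
hit_after_sync = pe2.can_forward("10.0.0.2")
```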
Note that an ESI may be assigned to IRC1 and IRC2, but in some
scenarios it is not necessary to advertise that ESI in the L3 EVPN
domain. In such scenarios the ESI may be advertised in the EVPN VPWS
domain only.
1.2. Terminology
Most of the terminology used in this document comes from [RFC7432]
and [I-D.ietf-bess-evpn-prefix-advertisement] except for the
following:
* VRF AC: VRF Attachment Circuit, an Attachment Circuit (AC) that
attaches a CE to an IP-VRF. It is defined in [RFC4364].
* IRC: Integrated Routing and Cross-connecting. An IRC interface
is the virtual interface connecting an IP-VRF and an EVPN VPWS.
* L3 EVI: An EVPN instance spanning the Provider Edge (PE) devices
participating in that EVPN, which contains VRF ACs and may also
contain IRB interfaces or IRC interfaces.
* IP-AD/EVI: Ethernet Auto-Discovery route per EVI, and the EVI here
is an IP-VRF.
* IP-AD/ES: Ethernet Auto-Discovery route per ES, and the EVI for
one of its route targets is an IP-VRF.
* CE-BGP: The BGP session between PE and CE. Note that a CE-BGP
route does not have an RD or Route Target.
* RMAC: Router's MAC, which is signaled in the Router's MAC extended
community.
* RT-2R: When a MAC/IP Advertisement Route is used in the context of
an IP-VRF, it is called an RT-2R in this draft.
* RT-5E: An EVPN Prefix Advertisement Route with a non-reserved ESI.
* RT-5G: An EVPN Prefix Advertisement Route with a zero ESI and a
non-zero GW-IP.
* RT-5L: An EVPN Prefix Advertisement Route with both zero ESI and
zero GW-IP, but a valid MPLS label.
* SOI: Supplementary Overlay Index (see Section 6.3.3), the SOI is
used together with an ESI to select IP A-D per EVI routes.
* Internal Remote PE: PEx is called an internal remote PE of an EVPN
route ERy when PEx is on the ES identified by ERy's ESI field. When
ERy's SOI is not zero, PEx must also be attached to the Ethernet Tag
identified by the <ESI, SOI>.
* External Remote PE: PEx is called an external remote PE of an EVPN
route ERy when PEx is not on the ES identified by ERy's ESI field.
When ERy's SOI is not zero, PEx may also be a PE that is not attached
to the Ethernet Tag identified by the <ESI, SOI>.
* CE-Prefix: When an IP prefix can be reached through CEx from PEy,
that IP prefix is called PEy's CE-prefix behind CEx in this draft,
or PEy's CE-prefix for short.
* Common CE-Prefix: When a CE-Prefix can be reached through either
CEy or CEz from PEy, it is called a common CE-Prefix of CEy and CEz
from the viewpoint of PEy.
* Exclusive CE-Prefix: When a CE-Prefix of PEy can be reached
through CEy and cannot be reached through other CEs of PEy, it is
called an exclusive CE-Prefix of CEy from the viewpoint of PEy.
* SNGW: Subnet-specific Gateway IP address. The SNGW of a subnet
is the IP address that the hosts of that subnet use as the nexthop
of their default route.
* Intermediate subnet: The subnet that connects a PE and a CE of an
L3 EVI.
* Intermediate SNGW: The SNGW of an intermediate subnet. It will be
the IP address of an IRC interface in this draft.
* Intermediate nexthop: The CE's IP address in the intermediate
subnet.
* Overlay nexthop: The CE-Prefix's nexthop IP address which is in
the address-space of the L3 EVI.
* Original Overlay nexthop: The overlay nexthop which is advertised
by the CE through a PE-CE routing protocol.
2. ARP/ND Synching and IP Prefix Synching
IP-MAC relations of hosts are learnt by PEs on the access side via a
control plane protocol like ARP. When N1 is multihomed to multiple
L3 EVPN PE nodes by an All-Active EVPN VPWS, N1's host IP/MAC will be
learnt and advertised in the MAC/IP Advertisement Route only by the
PE that receives the ARP packet. The MAC/IP Advertisement with a
non-zero ESI will be received by the other multihomed PEs.
As a result, after PE2 receives the MAC/IP Advertisement and imports
it into the VPWS Service Instance, PE2 installs an ARP entry on the
VPWS Service Instance's IRC interface. Such an ARP entry is called a
remote synched ARP entry in this document.
Note that the PE3 follows the DGW1 behavior of
[I-D.ietf-bess-evpn-prefix-advertisement]'s section 4.1 to achieve
the load balancing procedures based on the recursive route resolution
by the GW-IP Overlay Index.
When PE3 load-balances the traffic towards PE1/PE2, both PE1 and PE2
will already hold the corresponding ARP entry because of the
following ARP synching procedures.
2.1. Constructing MAC/IP Advertisement Route
The CEs may be hosts or routers, and this affects how the MAC/IPs of
these CEs should be advertised.
* The CEs are Hosts - In this case, there may be many hosts in the
subnet of an IRC interface.
It is not necessary for the MAC/IP routes of these hosts to be
imported by their external remote PEs (e.g. PE3). These MAC/IP
routes just need to be imported by their internal remote PEs (e.g.
PE1/PE2).
* The CEs are Routers - In this case, there may be few Routers in
the subnet of an IRC interface.
The MAC/IP routes of these routers should be imported by their
external remote PEs (e.g. PE3), because the GW-IP of the RT-5G
routes (see Section 2.3) of the CE-prefixes behind these routers
should be resolved to these MAC/IP routes.
This draft introduces a new usage/construction of MAC/IP
Advertisement route to enable ARP/ND synching for IP addresses in
EVPN IRC use-cases. The usage/construction of this route remains
similar to that described in
[I-D.ietf-bess-evpn-inter-subnet-forwarding] with a few notable
exceptions as below.
2.1.1. When CEs are Hosts
* The Route-Distinguisher should be set to the corresponding EVPN-
VPWS context.
* The Ethernet Tag should be set to the VPWS Service Instance
Identifier of the IRC interface.
* The MAC/IP Advertisement SHOULD carry one EVI-RT (for the EVPN
VPWS instance) and one ES-Import RT (for the ESI of the IRC
interface).
* The ESI can be set to the ESI of the IRC interface or the I-ESI of
VPWS1's L2 EVI.
Note that the receiver uses the ESI and Ethernet Tag ID to
determine the VPWS Service Instance whose IRC interface is the
interface on which the synced ARP entry will be installed.
Note that VPWS1 and VPWS2 are two VPWS Service Instances of the
same L2 EVPN Instance, thus they have different VPWS Service
Instance Identifiers. Then we can assign an I-ESI to that L2 EVI.
The ESI of the Ethernet A-D per EVI routes for these two VPWS
Service Instances will be set to this I-ESI. The Ethernet Tag ID
of each of these Ethernet A-D per EVI routes (for EVPN VPWS
domain) will be set to its VPWS Service Instance ID.
* The MPLS Label1 should be set to the label of the <ESI,VPWS
service instance identifier>.
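The receiver-side matching described in this section can be sketched
as follows. This is a minimal illustration under stated assumptions,
not a normative encoding: the Rt2 and IrcInterface classes, the
install_synced_arp helper and all RD, RT, ESI and label values are
hypothetical, and only the fields discussed above are modelled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rt2:
    rd: str
    esi: str
    eth_tag: int              # VPWS Service Instance Identifier
    mac: str
    ip: str
    label1: int
    route_targets: frozenset  # one EVI-RT and one ES-Import RT


@dataclass
class IrcInterface:
    name: str
    esi: str
    vpws_instance_id: int
    evi_rt: str
    arp: dict = field(default_factory=dict)


def install_synced_arp(route, irc_interfaces):
    """Install the route's IP/MAC on the IRC interface whose EVI-RT,
    ESI and VPWS Service Instance ID all match; return that interface
    (or None when nothing matches)."""
    for irc in irc_interfaces:
        if (irc.evi_rt in route.route_targets
                and irc.esi == route.esi
                and irc.vpws_instance_id == route.eth_tag):
            irc.arp[route.ip] = route.mac
            return irc
    return None


# Two VPWS Service Instances of the same L2 EVI sharing one I-ESI.
irc1 = IrcInterface("IRC1", esi="I-ESI", vpws_instance_id=100,
                    evi_rt="RT-L2EVI")
irc2 = IrcInterface("IRC2", esi="I-ESI", vpws_instance_id=200,
                    evi_rt="RT-L2EVI")

route = Rt2(rd="RD-VPWS", esi="I-ESI", eth_tag=100, mac="M1",
            ip="10.0.0.2", label1=1001,
            route_targets=frozenset({"RT-L2EVI", "ES-Import-1"}))
target = install_synced_arp(route, [irc1, irc2])
```

Because both interfaces share the I-ESI in this example, it is the
Ethernet Tag ID that selects IRC1 rather than IRC2.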
2.1.2. When CEs are Routers
* Route-Distinguisher: The RD of VPWS1's EVI.
* Ethernet Tag ID: The same as Section 2.1.1.
* SOI: The same as the ET-ID of Section 2.1.1.
* Route Target: IPVRF1's export RTs and EVPN VPWS's export RTs.
* ESI: The same as Section 2.1.1.
* MPLS Label1: The same as Section 2.1.1.
* MPLS Label2: The MPLS Label2 should be set to IPVRF1's EVPN label.
* RMAC: The Router's MAC Extended Community attribute SHOULD be
carried in VXLAN EVPN.
2.2. Constructing Ethernet A-D Route
When CEs are hosts, the ESI of the IRC interface is mainly used in
the EVPN VPWS domain. That ESI typically has nothing to do with the
fundamental function of the L3 EVPN domain.
Note that PE3 or PE4 will not import the RT-2 route with an ES-import
RT it doesn't recognize.
Note that the Ethernet A-D route advertisement in the EVPN VPWS
domain still follows [RFC8214]. The IRC interface is considered as
an ordinary AC in the EVPN VPWS domain.
When CEs are routers, the <ESI, SOI> of the RT-2R route for the GW-IP
of the RT-5G routes will be used for recursive resolution. Thus a
corresponding IP A-D per EVI route should be advertised for the IRC1
interface in the context of IPVRF1.
* Route-Distinguisher: IPVRF1's RD.
* Ethernet Tag ID: IRC1 interface's local VPWS service instance ID.
* Route Target: IPVRF1's export RT.
* ESI: IRC1's ESI or the I-ESI of VPWS1's L2 EVI.
* MPLS Label: IPVRF1's EVPN label.
* RMAC: The Router's MAC Extended Community should be set as per
[I-D.sajassi-bess-evpn-ip-aliasing].
2.3. Constructing IP Prefix Advertisement Route
There may be two types of IP prefixes on PE1/PE2, direct-prefixes
(e.g. intermediate subnet of IRC interface) and CE-prefixes. The
direct-prefixes are the subnets of the PE's own interfaces (e.g. the
IRC interface). The CE-prefixes are the prefixes behind the CE node
N1 (especially when N1 is a router).
2.3.1. Direct-Prefixes Advertisement
PE1/PE2 can install synced ARP entries on the proper IRC interface
thanks to the RT-2 route of Section 2. This ensures that both PE1
and PE2 know all hosts of the IRC interface's own subnet. So it is
not necessary for PE1/PE2 to advertise per-host IP prefixes of that
subnet to PE3 by RT-2 routes. It is recommended that PE1/PE2
advertise a single RT-5L route for that subnet to PE3 instead. The
ESI of these RT-5 routes can simply be set to zero, because when PE3
receives such RT-5 routes from both PE1 and PE2, PE3 can consider
them as ECMP or FRR even when their ESI is zero.
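As an illustration of the last point, the sketch below shows how a
receiver such as PE3 might group RT-5L routes for the same prefix into
one ECMP set even though their ESI is zero. The dictionary layout, the
label values and the helper names build_rib and lookup are assumptions
made for this example only.

```python
import ipaddress
from collections import defaultdict

ZERO_ESI = "0"


def build_rib(rt5l_routes):
    """Group RT-5L routes by prefix. Each prefix maps to the set of
    (advertising PE, label) paths; the zero ESI plays no role."""
    rib = defaultdict(set)
    for r in rt5l_routes:
        rib[ipaddress.ip_network(r["prefix"])].add((r["pe"], r["label"]))
    return rib


def lookup(rib, dst):
    """Longest-prefix match; returns the sorted list of ECMP paths."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in rib if addr in net]
    if not matches:
        return []
    best = max(matches, key=lambda net: net.prefixlen)
    return sorted(rib[best])


routes = [
    {"pe": "PE1", "prefix": "10.0.0.0/24", "esi": ZERO_ESI, "label": 3001},
    {"pe": "PE2", "prefix": "10.0.0.0/24", "esi": ZERO_ESI, "label": 3002},
]
rib = build_rib(routes)
paths = lookup(rib, "10.0.0.2")
```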
2.3.2. Exclusive CE-Prefixes of Each CE
There may be two types of CE-Prefixes on PE1/PE2: the common
CE-prefixes (e.g. SN9) of R1 and R2, and the exclusive CE-prefixes
(which can only be reached through a specific CE) of R1 or R2. The
exclusive CE-Prefixes are discussed first; the common CE-prefixes
are discussed in Section 6.
Note that N1 may be a host or a router. When it is a router, there
may be some prefixes behind N1 on PE1. Those prefixes will be learnt
via a PE-CE routing protocol (e.g. CE-BGP). N1's IP address may be
considered as the overlay nexthop of those prefixes. The overlay
nexthop of those prefixes will be carried in the RT-5 route's GW-IP
field. Those RT-5 routes are called RT-5G routes because their
Overlay Indexes are their GW-IPs (and their ESI and label are zero).
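The three RT-5 flavours named in Section 1.2 can be told apart from
the ESI, GW-IP and label fields alone. The following is a minimal
sketch of that classification; the function name classify_rt5 and the
"unclassified" fallback are hypothetical, and "non-reserved ESI" is
assumed here to mean neither the all-zero ESI nor the MAX-ESI of
[RFC7432].

```python
import ipaddress

ZERO_ESI = bytes(10)      # all-zero 10-octet ESI
MAX_ESI = b"\xff" * 10    # MAX-ESI, reserved by RFC 7432


def classify_rt5(esi, gw_ip, label):
    """Classify an IP Prefix route as RT-5E, RT-5G or RT-5L."""
    gw_is_zero = ipaddress.ip_address(gw_ip).is_unspecified
    if esi not in (ZERO_ESI, MAX_ESI):
        return "RT-5E"    # non-reserved ESI is the Overlay Index
    if esi == ZERO_ESI and not gw_is_zero:
        return "RT-5G"    # GW-IP is the Overlay Index
    if esi == ZERO_ESI and gw_is_zero and label != 0:
        return "RT-5L"    # no Overlay Index, forwarded by label
    return "unclassified"


kind_ce_prefix = classify_rt5(ZERO_ESI, "10.0.0.2", 0)
kind_direct = classify_rt5(ZERO_ESI, "0.0.0.0", 3001)
kind_ves = classify_rt5(b"\x00" + b"\x01" * 9, "0.0.0.0", 3001)
```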
Note that these RT-5G routes are advertised by PE1 to both PE2 and
PE3. If the IRC1 interface of PE1 fails, these CE-prefixes will
converge faster on PE3 through the withdrawal (by PE1) of the
corresponding IP A-D per EVI route.
Note that when PE3 receives the withdrawal of the RT-2R of 10.2 from
PE1, and that RT-2R is the only RT-2R of 10.2, and the <ESI, SOI> of
the RT-2R can be resolved to an IP A-D per EVI route from another PE
(e.g. PE2), PE3 should trigger a delayed deletion of that RT-2R, so
that ARP/ND refresh can happen on PE2 before the deletion.
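The delayed-deletion condition above can be written as a small
decision function. This is a sketch only: the function name, the
dictionary keys and the two return strings are hypothetical, and the
length of the delay is left out because this document does not
specify it.

```python
def on_rt2r_withdraw(withdrawn, other_rt2rs, ip_ad_routes):
    """Decide how PE3 treats a withdrawn RT-2R.

    withdrawn    -- the RT-2R being withdrawn (ip, esi, soi, pe)
    other_rt2rs  -- RT-2R routes still held after the withdrawal
    ip_ad_routes -- IP A-D per EVI routes currently held
    """
    # If another RT-2R still covers the IP, nothing needs to be kept.
    if any(r["ip"] == withdrawn["ip"] for r in other_rt2rs):
        return "delete-now"
    # Otherwise keep the entry for a while if <ESI, SOI> still
    # resolves to an IP A-D per EVI route from a different PE.
    key = (withdrawn["esi"], withdrawn["soi"])
    for ad in ip_ad_routes:
        if (ad["esi"], ad["et_id"]) == key and ad["pe"] != withdrawn["pe"]:
            return "delay-delete"
    return "delete-now"


rt2r = {"ip": "10.0.0.2", "esi": "I-ESI", "soi": 100, "pe": "PE1"}
ads = [{"pe": "PE2", "esi": "I-ESI", "et_id": 100}]
decision = on_rt2r_withdraw(rt2r, [], ads)
```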
3. Packet Walk Through
The procedures for local/remote host learning and MAC/IP
Advertisement route constructing are described above.
3.1. When CEs are Hosts
When N3 sends a data packet P301 to 10.2 which is a host of the
subnet of IRC1, P301 will match prefix 10.0/24 on PE3.
Both PE1 and PE2 have advertised the RT-5L route of 10.0/24 to PE3.
PE3 may consider them as ECMP or FRR, depending on their route
attributes. Then PE3 should forward P301 to PE1 or PE2, depending on
the ECMP/FRR procedures.
We can assume that it is PE2 that will receive P301 from PE3. The
outgoing interface for P301 (whose destination IP is 10.2) is IRC1
interface. The destination MAC should be found from the ARP entries
on IRC1.
The ARP entry for 10.2 is a synched ARP entry, because N1 sent the
ARP Request only to PE1. It is installed on the IRC1 interface
because the RT-2 route's Route Target matches VPWS1's L2 EVI and the
RT-2 route's <ESI, Ethernet Tag ID> matches the IRC1 interface's ESI
and VPWS Service Instance ID.
Then P301 is encapsulated with an Ethernet header and becomes an
Ethernet packet P301E. The destination MAC address of P301E is N1's
MAC address, which is determined by that ARP entry. The source MAC
address of P301E is IRC1's MAC address. Then P301E is sent over the
IRC1 interface.
After P301E is sent over the IRC1 interface, it will be forwarded to
PE4 in the EVPN VPWS instance according to [RFC8214].
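The encapsulation step on PE2 can be sketched as follows. The
dictionary-based packet and interface representations, the
forward_over_irc helper and the MAC strings are hypothetical; the
sketch only shows that the destination MAC comes from the (synced)
ARP entry and the source MAC from the IRC interface.

```python
def forward_over_irc(packet, irc):
    """Return the Ethernet frame sent over the IRC interface,
    or None when there is no ARP entry for the destination."""
    dst_mac = irc["arp"].get(packet["dst_ip"])
    if dst_mac is None:
        return None  # ARP miss: resolution would be needed first
    return {"dst_mac": dst_mac, "src_mac": irc["mac"], "payload": packet}


# IRC1 on PE2 with the synced ARP entry for N1.
irc1 = {"name": "IRC1", "mac": "MAC-IRC1", "arp": {"10.0.0.2": "M1"}}

p301 = {"src_ip": "30.0.0.2", "dst_ip": "10.0.0.2"}
p301e = forward_over_irc(p301, irc1)
```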
3.2. When CEs are Routers
When N3 sends a data packet P301b to a host 60.1 whose location is
behind R1(N1), P301b will match prefix 60.0/24 on PE3. The RT-5G
route for 60.0/24 will be used to forward P301b. The GW-IP of that
RT-5G route is 10.2 (R1). So PE3 uses 10.2 to do recursive route
resolution and matches the RT-2R route of 10.2.
Note that the recursive route resolution follows the DGW1 behavior of
[I-D.ietf-bess-evpn-prefix-advertisement]'s section 4.1.
Both PE1 and PE2 have advertised the IP A-D per EVI route for the
<ESI, ET-ID> of the RT-2R route of 10.2. PE3 may consider them as
ECMP or FRR, depending on whether the ESI is all-active or
single-active. Then PE3 can forward P301b to PE1 or PE2, depending
on the ECMP/FRR procedures.
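The resolution chain on PE3 (RT-5G, then RT-2R, then IP A-D per EVI)
can be sketched as three table lookups. The table layouts, the
resolve_paths helper and the "I-ESI" / 100 / 200 values are
assumptions made for the example.

```python
def resolve_paths(prefix, rt5g_table, rt2r_table, ip_ad_routes):
    """Resolve a CE-prefix to the PEs that can reach its overlay
    nexthop: prefix -> GW-IP -> <ESI, ET-ID> -> advertising PEs."""
    gw_ip = rt5g_table[prefix]          # RT-5G: prefix -> GW-IP
    esi, et_id = rt2r_table[gw_ip]      # RT-2R: IP -> <ESI, ET-ID>
    return sorted(ad["pe"] for ad in ip_ad_routes
                  if (ad["esi"], ad["et_id"]) == (esi, et_id))


rt5g_table = {"60.0.0.0/24": "10.0.0.2"}
rt2r_table = {"10.0.0.2": ("I-ESI", 100)}
ip_ad_routes = [
    {"pe": "PE1", "esi": "I-ESI", "et_id": 100},
    {"pe": "PE2", "esi": "I-ESI", "et_id": 100},
    {"pe": "PE2", "esi": "I-ESI", "et_id": 200},  # a different tag
]
paths = resolve_paths("60.0.0.0/24", rt5g_table, rt2r_table, ip_ad_routes)
```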
We can assume that it is PE2 that will receive P301b from PE3. The
destination IP of P301b is in prefix 60.0/24. That prefix has been
installed into IPVRF1 on PE2. PE2 previously received that prefix
either from a PE-CE routing protocol or from an RT-5G route from PE1.
The overlay nexthop or GW-IP of prefix 60.0/24 is 10.2, which is a
host of IRC1's subnet. The outgoing interface for P301b is IRC1
interface.
The ARP entry for 10.2 will be found in the same way as in Section 3.1.
The Ethernet header is then added and the packet is forwarded to PE4,
also in the same way as in Section 3.1.
4. Fast Convergence for Routed Traffic
When the IRC1 interface goes down, PE1 will withdraw the RT-5L route
of 10.0/24, while the RT-5G routes of 60.0/24 and 70.0/24 are merely
changed to the stale state. When PE3 receives the withdrawal of that
RT-5L route, it stops forwarding the data packets of those two
subnets to PE1, but continues to forward them to PE2.
5. Considerations on ABRs and Route Reflectors
When an ABR or ASBR receives a MAC/IP Advertisement Route that
contains both an EVI-RT and an ES-Import RT, it should re-advertise
that route even if that route's MPLS label1 is null (it should not
consider that route as malformed). When it changes that route's
nexthop to itself, it does not have to allocate a new label for each
RT-2 route's MPLS label1 field separately. That field can be
rewritten to a single preconfigured MPLS label that blackholes the
data packets received with it. But the MPLS label2 field (if it is
not null) should be rewritten normally along with the nexthop
rewriting.
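The label handling described above can be sketched as follows. The
route dictionary, the readvertise_rt2 helper and the BLACKHOLE_LABEL
value are hypothetical; in particular, how the new label2 value is
allocated is outside the scope of this sketch and is passed in by the
caller.

```python
BLACKHOLE_LABEL = 100000  # hypothetical preconfigured label value


def readvertise_rt2(route, self_addr, new_label2):
    """Re-advertise an RT-2 route with nexthop-self on an ABR/ASBR.

    label1 is rewritten to one shared blackhole label, even when the
    received label1 is null; label2 is rewritten only when present."""
    if route["evi_rt"] is None or route["es_import_rt"] is None:
        raise ValueError("sketch covers only routes with both RTs")
    out = dict(route)              # leave the received route untouched
    out["next_hop"] = self_addr
    out["label1"] = BLACKHOLE_LABEL
    if route["label2"] is not None:
        out["label2"] = new_label2
    return out


received = {"evi_rt": "RT-L2EVI", "es_import_rt": "ES-Import-1",
            "next_hop": "PE1", "label1": None, "label2": 2001}
sent = readvertise_rt2(received, "ABR1", new_label2=5001)
```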
6. For Common CE-prefixes behind R1 and R2
We can assume that there is a common prefix (SN9) and two exclusive
prefixes (SN7 and SN8). SN9 is behind both R1 and R2, while SN7 is
specific to R1 and SN8 is specific to R2. That is to say, PE5 can
reach SN9 through either R1 or R2.
6.1. Solution 1: Independent CE-BGP sessions
R1 and R2 don't know which prefixes are their common prefixes and
which are their exclusive prefixes. So R1 establishes its own CE-BGP
session S1 to PE1, and R2 establishes its own CE-BGP session S2 to
PE2. When R1 (or R2) advertises IP prefixes to PE1 (or PE2), the BGP
next hop of these prefixes is set to R1's (or R2's) IP address in
IRC1's (or IRC2's) subnet.
PE4 +-----------------------+
+-------------+ PE1 | |
SN7 | VPWS1 | +------+------+ ----------> |
+ | +---------+ | | (VPN1) | RT5(SN9) |
| | | _|_|_________|__ / IRC1 | GW-IP=R1 |
| | | PW1 / | | P | (VPWS1) | RT2(R1,M1) |
+-R1----O=====< | | +-----+-------+ |PE3
| | | \_|_|___ | +--+---+
| | | | | B \ +-----+-------+ | |
| | +---------+ | \____|__(VPWS1) | | |
| | | | \ IRC1 | ----------> | |
SN9 | | |PE5 (VPN1) | RT2(R1,M1) |(VPN1)--R3
| | VPWS2 | ____|__ / IRC2 | RT2(R2,M2) | |
| | +---------+ | / | (VPWS2) | | |
| | | _|_|___/ +-----+-------+ | |
| | | PW2 / | | B | +--+---+
+-R2----O=====< | | +-----+-------+ |
| | | \_|_|_________|__(VPWS2) | ----------> |
| | | | | P | \ IRC2 | RT5(SN9) |
+ | +---------+ | | (VPN1) | GW-IP=R2 |
SN8 | | +------+------+ RT2(R2,M2) |
+-------------+ PE2 | |
+-----------------------+
Figure 2: Common CE-Prefixes and Exclusive CE-Prefixes
In such case, the route advertisement is just the same as Section 2
(on the condition that the CEs are routers).
Note that according to the recursive route resolution behavior of
[I-D.ietf-bess-evpn-prefix-advertisement]'s section 4.1, if both
RT-5G routes of SN9 are equally preferable and ECMP is enabled, SN9
is added to the routing table with both Overlay Index 10.2 and
Overlay Index 20.2.
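A minimal sketch of that behavior is shown below. It assumes that all
RT-5G routes passed in are already equally preferable (best-path
selection is not modelled), and the overlay_indexes helper and the
route dictionaries are hypothetical.

```python
def overlay_indexes(rt5g_routes, prefix, ecmp_enabled=True):
    """Collect the GW-IP Overlay Indexes installed for a prefix.
    All candidate routes are assumed to be equally preferable."""
    gws = []
    for r in rt5g_routes:
        if r["prefix"] == prefix and r["gw_ip"] not in gws:
            gws.append(r["gw_ip"])
    return gws if ecmp_enabled else gws[:1]


sn9_routes = [
    {"pe": "PE1", "prefix": "SN9", "gw_ip": "10.0.0.2"},  # via R1
    {"pe": "PE2", "prefix": "SN9", "gw_ip": "20.0.0.2"},  # via R2
]
with_ecmp = overlay_indexes(sn9_routes, "SN9")
without_ecmp = overlay_indexes(sn9_routes, "SN9", ecmp_enabled=False)
```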
6.2. Solution 2: ECMP-Merging for RT-5G routes
In some scenarios, R1 and R2 will not have any exclusive prefixes
(e.g. SN7 or SN8 in Figure 2) at all; in other words, all of their
prefixes are common prefixes. In such a case, when R1 advertises SN9
to PE1 over the CE-BGP session S1, 10.2 may not be the best choice
for SN9's BGP next hop.
+-----------------------+
PE1 | |
+------+------+ ----------> |
| (VPN1) | RT5(SN9) |
_____________|__ / IRC1 | GW-IP=IP201 |
PW1 / P | (VPWS1) | RT2(IP201) |
+-R1----O=====< +-----+-------+ MAC201 |PE3
| VPWS1 \_______ | +--+---+
| B \ +-----+-------+ | |
| \____|__(VPWS1) | | |
| | \ IRC1 | ----------> | |
SN9 |PE5 (VPN1) | RT2(IP201) |(VPN1)--R3
| ____|__ / IRC2 | MAC201 | |
| / | (VPWS2) | | |
| _______/ +-----+-------+ | |
| PW2 / B | +--+---+
+-R2----O=====< +-----+-------+ |
VPWS2 \_____________|__(VPWS2) | ----------> |
P | \ IRC2 | RT2(IP201) |
| (VPN1) | MAC201 |
+------+------+ |
PE2 | |
+-----------------------+
Figure 3: IP Aliasing of Common CE-Prefixes
In such a case, we can configure a common anycast loopback address
(say IP201, whose value is 7.7.7.7) on R1 and R2. Then, when R1
advertises SN9 to PE1, R1 chooses IP201 as the BGP next hop of the
advertisement. Thus the RT-5G of SN9 from PE1 will be advertised
with GW-IP=IP201.
We can then configure a static route in VPN1 for IP201 on PE1, PE2
and PE5. The static route on PE1 (called SRE1) uses NH1 as its
overlay next hop. The static route on PE2 (called SRE2) uses NH2 as
its overlay next hop. The static route on PE5 (called SRE5) uses
both NH1 and NH2 as its overlay next hops.
If SRE1, SRE2 and SRE5 are advertised by RT-5G routes too, the
recursive resolution will be complicated. There are two ways to
simplify the recursive resolution.
6.2.1. ECMP-Merging by RT-5L
Note that IRC1 and IRC2 are on the same I-ES (say ESI512). Thus 10.2
(say NH1) and 20.2 (say NH2) are behind different Ethernet Tags of
the same I-ESI. We can assume that the ET-ID of IRC1 is ETI100,
while the ET-ID of IRC2 is ETI200. Thus 10.2 is behind
<ESI512, ETI100>, while 20.2 is behind <ESI512, ETI200>.
Then each of the three PEs separately advertises an RT-5L route (say
RT5L_201, whose ESI is zero) for IP201 (in fact it is for SRE1, SRE2
or SRE5).
Then we advertise an RT-5G route for SN9 (say RT5G_SN9). RT5G_SN9's
GW-IP is IP201, its ESI is 0 and its ET-ID is 0.
When PE3 receives RT5G_SN9 and RT5L_201, the GW-IP of RT5G_SN9 can be
resolved to RT5L_201. Then the corresponding data packets of
RT5G_SN9 will be forwarded according to IP201's ECMP paths formed by
the corresponding RT-5L routes.
Note that we can use this approach to merge the two ECMP path
collections (e.g. <ESI512, ETI100>s and <ESI512, ETI200>s) for the
CE-prefixes (e.g. SN9) behind a specified anycast IP address (e.g.
7.7.7.7 or IP201, which is the IP address of a loopback interface).
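The merging effect can be sketched from PE3's point of view: the
GW-IP of RT5G_SN9 is resolved against the RT-5L routes for IP201, and
every PE that advertised one becomes an ECMP path. The
resolve_gw_ip helper and the route dictionaries are hypothetical, and
the /32 prefix length for IP201 is an assumption of this example.

```python
import ipaddress


def resolve_gw_ip(gw_ip, rt5l_routes):
    """Resolve a GW-IP against RT-5L routes by longest-prefix match
    and return the advertising PEs of the best match as ECMP paths."""
    addr = ipaddress.ip_address(gw_ip)
    covering = [r for r in rt5l_routes
                if addr in ipaddress.ip_network(r["prefix"])]
    if not covering:
        return []
    longest = max(ipaddress.ip_network(r["prefix"]).prefixlen
                  for r in covering)
    return sorted(r["pe"] for r in covering
                  if ipaddress.ip_network(r["prefix"]).prefixlen == longest)


# RT5L_201 as advertised separately by PE1, PE2 and PE5 (ESI is zero).
rt5l_201 = [
    {"pe": "PE1", "prefix": "7.7.7.7/32"},
    {"pe": "PE2", "prefix": "7.7.7.7/32"},
    {"pe": "PE5", "prefix": "7.7.7.7/32"},
]
rt5g_sn9 = {"prefix": "SN9", "gw_ip": "7.7.7.7", "esi": 0, "et_id": 0}
sn9_paths = resolve_gw_ip(rt5g_sn9["gw_ip"], rt5l_201)
```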
6.2.2. ECMP-Merging by RT-2R
We can substitute an RT-2 route (say RT2R_201) for RT5L_201
(Section 6.2.1). RT2R_201's IP address is IP201, its MAC address is
MAC201, its RD is VPN1's RD, its ESI is 0 and its ET-ID is 0.
Such RT-2 routes MUST NOT carry any Route Targets of a Broadcast
Domain. Its MPLS Label2 field should be set to VPN1's EVPN label,
and its RMAC should accordingly be set to the PE's MAC address in
VXLAN EVPN. Its MPLS Label1 field should be set to a pre-configured
value (shared by all such RT-2 routes).
Note that MAC201 is a pre-configured MAC address for IP201. MAC201
MUST be advertised along with the Sticky flag.
Note that the differences between RT2R_201 and RT5L_201 exist only
in the control plane; when they are installed into the FIB of VPN1
in the data plane, they are the same.
6.3. Solution 3: RT-5E Routes Advertisement
For direct-prefixes and exclusive CE-prefixes behind each CE, no ESIs
need to be advertised along with them, but for the common CE-prefixes
behind R1 and R2, a virtual ESI can be used to achieve the ECMP-
merging.
6.3.1. CE-Prefix Advertisement by RT-5E Routes
This use case differs from Section 6.2 in the following ways:
* There are common prefixes behind R1 and R2, but there are also
other prefixes which can only be reached through R1 or R2.
* For the common prefixes behind R1 and R2, the combination of R1
and R2 can be considered as a vRouter whose two LPUs are R1 and R2.
Note that the vRouter is a logical entity only for the common
prefixes behind R1 and R2; it should not be used for other
prefixes.
* The CE-prefixes are IPv6 prefixes with an IPv6 nexthop (NH21).
* The vRouter is identified by VR621 (Virtual Router-ID 621).
* VR621 can be mapped to an IPv6 address VRID_IP. The VRID_IP is
selected from a 96-bit IPv6 prefix VRID_Prefix, and the VRID_IP's
lowest 32 bits may be set to a constant X.
The VRID_Prefix's lowest 32 bits (of those 96 bits) should be set
to VR621.
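The VRID-to-address mapping in the last bullet can be worked through
with a short sketch. The 64-bit base prefix 2001:db8:0:1::/64 (from
the documentation range) and the value X = 1 are assumptions chosen
for the example, as are the helper names vrid_prefix and vrid_ip;
this document does not fix either value.

```python
import ipaddress


def vrid_prefix(base64_prefix, vrid):
    """Build the 96-bit VRID_Prefix: a 64-bit base followed by the
    32-bit VRID in the lowest 32 bits of the 96-bit prefix."""
    base = ipaddress.ip_network(base64_prefix)
    if base.version != 6 or base.prefixlen != 64:
        raise ValueError("expected an IPv6 /64 base prefix")
    if not 0 <= vrid < 2 ** 32:
        raise ValueError("VRID must fit in 32 bits")
    value = int(base.network_address) | (vrid << 32)
    return ipaddress.IPv6Network((value, 96))


def vrid_ip(prefix96, x):
    """Build VRID_IP: the VRID_Prefix with constant X in the low 32 bits."""
    if not 0 <= x < 2 ** 32:
        raise ValueError("X must fit in 32 bits")
    return ipaddress.IPv6Address(int(prefix96.network_address) | x)


prefix = vrid_prefix("2001:db8:0:1::/64", 621)  # 621 == 0x26d
addr = vrid_ip(prefix, 1)
```

With these assumed values, VRID_Prefix is 2001:db8:0:1:0:26d::/96 and
VRID_IP is 2001:db8:0:1:0:26d:0:1.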
+-------------------------------+
PE1 | |
+------+--------+ ----------------> |
vRouter | (VRF1:vES) | RT5(SN9) |
+---------+ _______|__ / IRC1 | ESI=vES251 |
| | VPWS1 / P | (VPWS1) | SOI=VR612 |
| R1---+---O==< +-----+---------+ RT1(vES251,VR612) |PE3
| | \___ | Label=VRF1 +--+--+
| | B \ +-----+---------+ | |
| VR612 | \__|__(VPWS1) | | |
| | | \ IRC1 | ----------------> | |
| | PE5 | (VRF1:vES) | RT1(vES251,VR612) | |
| | __|__ / IRC2 | Label=VRF1 | |
| (NH612) | / | (VPWS2) | | |
| | ___/ +-----+---------+ | |
| | VPWS2 / B | +--+--+
| R2---+---O==< +-----+---------+ |
| | \_______|__(VPWS2) | ----------------> |
+---+-----+ P | \ IRC2 | RT1(vES251,VR612) |
| | (VRF1:vES) | Label=VRF1 |
| +------+--------+ |
+ PE2 | |
SN9(Common Prefix) +-------------------------------+
Figure 4: VRID as ET-ID
* SOI-mapping Route per each VRID
A special static route (called an SOI-mapping route) is configured
for prefix VRID_Prefix on each of PE1, PE2 and PE5; these routes are
VRID_MR1 (VRID Mapping Route 1), VRID_MR2 and VRID_MR5 respectively.
VRID_MR1's nexthop is IP102 of R1, which is allocated from IRC1's
subnet. VRID_MR2's nexthop is IP202 of R2, which is allocated from
IRC2's subnet. VRID_MR5's nexthops are both IP102 and IP202.
* I-ESI per L3 EVPN Instance
Then we can assign an I-ESI (illustrated as vES251 in the figure)
to that L3 EVI.
Note that a single RT-1 per ES route will be advertised for vES251,
because vES251 is dedicated to that L3 EVI.
The RT-4 route will be advertised for DF-Election of vES251. AC-DF
mode should be used for vES251.
* Ethernet Tag per each vRouter
Each vRouter of that L3 EVI is considered to be attached to an
Ethernet Tag of vES251. The ET-ID of that Ethernet Tag is the
vRouter's VRID. On each PE, the advertisement of the Ethernet A-D
per EVI route is triggered by the SOI-mapping route (which
represents the vRouter), where:
- RD: VRF1's RD.
- ESI: VRF1's I-ESI (vES251).
- ET-ID: The vRouter's VRID.
- MPLS Label: VRF1's EVPN label.
- Route Target: VRF1's eRT (export Route Target).
* RT-5E Route per each CE-Prefix
The CE-Prefixes are advertised using RT-5E routes instead of RT-5G
routes.
When PE1 learns a CE-prefix SN9 from the CE-BGP session between PE1
and the vRouter, PE1 advertises an RT-5E route, RT5E_SN9, where:
- RD: VRF1's RD.
- Ethernet Tag ID: The ET-ID should be set to 0.
- ESI: VRF1's I-ESI (vES251).
- Supplementary Overlay Index: The VRID of the CE-Prefix's
advertising vRouter.
The SOI can be carried in the IP-mapping SOI extended community
(Section 6.3.3).
- MPLS Label: VRF1's EVPN label.
- Route Target: VRF1's eRT (export Route Target).
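The field assignments listed above for the Ethernet A-D per EVI route
and the RT-5E route can be summarised in a short, non-normative Python
sketch. The class names, the RD, label and Route Target values, and
the concrete prefix used for SN9 are hypothetical; only the
relationship between the fields is taken from the text above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class IpVrf:
    rd: str
    i_esi: str                    # I-ESI assigned to the L3 EVI
    evpn_label: int
    export_rts: Tuple[str, ...]   # eRT


@dataclass(frozen=True)
class EthernetAdPerEvi:
    rd: str
    esi: str
    et_id: int
    mpls_label: int
    route_targets: Tuple[str, ...]


@dataclass(frozen=True)
class Rt5e:
    rd: str
    et_id: int
    esi: str
    ip_prefix: str
    soi: int                      # Supplementary Overlay Index
    mpls_label: int
    route_targets: Tuple[str, ...]


def ad_per_evi_for_vrouter(vrf: IpVrf, vrid: int) -> EthernetAdPerEvi:
    """A-D per EVI route triggered by the SOI-mapping route of one vRouter."""
    return EthernetAdPerEvi(vrf.rd, vrf.i_esi, vrid, vrf.evpn_label,
                            vrf.export_rts)


def rt5e_for_ce_prefix(vrf: IpVrf, ip_prefix: str, vrid: int) -> Rt5e:
    """RT-5E route for a CE-prefix learnt from the vRouter 'vrid'."""
    # The ET-ID is 0; the vRouter is identified by the SOI instead.
    return Rt5e(vrf.rd, 0, vrf.i_esi, ip_prefix, vrid, vrf.evpn_label,
                vrf.export_rts)


# Hypothetical VRF1 parameters and a hypothetical value for SN9.
vrf1 = IpVrf(rd="192.0.2.1:1", i_esi="vES251", evpn_label=1001,
             export_rts=("target:65000:1",))
ad_route = ad_per_evi_for_vrouter(vrf1, 621)
rt5e_sn9 = rt5e_for_ce_prefix(vrf1, "2001:db8:9::/64", 621)
print(ad_route)
print(rt5e_sn9)
```

Note that the VRID appears as the ET-ID of the A-D per EVI route but
as the SOI of the RT-5E route, whose own ET-ID remains 0.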
6.3.1.1. When Internal Remote PEs Receive the RT-5E
PE5 receives RT5E_SN9, whose VRID_IP matches the local SOI-mapping
route VRID_MR5. VRID_MR5 indicates that RT5E_SN9 should be installed
as if its overlay nexthop were the VRID_IP. The VRID_IP can be
inferred from the SOI, VRID_MR5 and the constant X.
6.3.1.2. When External Remote PEs Receive the RT-5E
PE3 receives RT5E_SN9, whose SOI does not match any local SOI-mapping
route. RT5E_SN9 should therefore be installed (as FIB_Entry_6) with
<vES251, SOI> as its Overlay Index.
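The receive-side behaviour of Sections 6.3.1.1 and 6.3.1.2 can be
sketched, non-normatively, as a single decision in Python. The sketch
assumes that a PE keeps its local SOI-mapping routes as a list of
96-bit VRID_Prefix values, that the VRID sits in the lowest 32 bits
of each prefix, and that X is 1. All names and addresses are
hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional, Tuple

X = 1  # assumed value of the constant X


@dataclass(frozen=True)
class FibEntry:
    prefix: str
    overlay_nexthop: Optional[ipaddress.IPv6Address] = None
    overlay_index: Optional[Tuple[str, int]] = None  # <ESI, SOI>


def vrid_of_prefix(vrid_prefix: ipaddress.IPv6Network) -> int:
    """The VRID is carried in the lowest 32 bits of the 96-bit prefix."""
    return (int(vrid_prefix.network_address) >> 32) & 0xFFFFFFFF


def install_rt5e(prefix: str, esi: str, soi: int,
                 soi_mapping_routes: List[ipaddress.IPv6Network]) -> FibEntry:
    """Install a received RT-5E route on a remote PE."""
    for vrid_prefix in soi_mapping_routes:
        if vrid_of_prefix(vrid_prefix) == soi:
            # Internal remote PE: infer the VRID_IP and use it as the
            # overlay nexthop, resolved through the SOI-mapping route.
            vrid_ip = ipaddress.IPv6Address(
                int(vrid_prefix.network_address) | X)
            return FibEntry(prefix, overlay_nexthop=vrid_ip)
    # External remote PE: no local SOI-mapping route matches the SOI.
    return FibEntry(prefix, overlay_index=(esi, soi))


# PE5 holds VRID_MR5 for VRID 621; PE3 holds no SOI-mapping route.
pe5_routes = [ipaddress.IPv6Network("2001:db8:0:0:0:26d::/96")]
pe3_routes: List[ipaddress.IPv6Network] = []

on_pe5 = install_rt5e("2001:db8:9::/64", "vES251", 621, pe5_routes)
on_pe3 = install_rt5e("2001:db8:9::/64", "vES251", 621, pe3_routes)
print(on_pe5)
print(on_pe3)
```

The same RT-5E route therefore yields a nexthop-based entry on PE5 and
an Overlay-Index-based entry on PE3, depending only on whether a
matching SOI-mapping route is present locally.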
6.3.1.3. Packet Walk Through
When PE3 uses that RT-5E route to forward a data packet DP6, it
follows [I-D.wang-bess-evpn-ether-tag-id-usage].
When PE2 receives DP6 from PE3, it forwards DP6 according to
FIB_Entry_6.
6.3.2. The Advertisement of SOI-mapping Routes
VRID_MR1, VRID_MR2, VRID_MR3 can be advertised using RT-5L routes
along with an EVI-RT and an ES-Import RT, which prevents the external
remote PEs from importing these routes into their IP-VRFs. The
external remote PEs do not need these routes.
6.3.3. IP-mapping SOI Extended Community
The IP-mapping SOI extended community is an extension of the
Supplementary Overlay Index extended community.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type=0x06 | Sub-Type=TBD |Type=4 |O|Z|F=1| Flags |V|G|Rsv|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| IP-mapping SOI |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5: IP-mapping SOI Extended Community
Where:
IP-mapping SOI: A SOI that is derived from or mapped to an IP
address, Router ID, static route, etc.
V Flag: IPv6 Flag. When it is set to 1, the SOI should be mapped to
an overlay IPv6 nexthop on internal remote PEs. Otherwise, the SOI
should be mapped to an overlay IPv4 nexthop (whose value is the
same as the IP-mapping SOI field) on internal remote PEs.
When the V Flag is 1, on the internal remote PEs, the IP-mapping SOI
is mapped to an IPv6 address (like the VRID_IP in
Section 6.3.1, Paragraph 2, Item 5) in the address space of the
IP-VRF, and that address is then used in the same way as the IPv4
nexthop in the case above.
When the V Flag is zero, on the internal remote PEs, the IP-mapping
SOI does not need to be mapped to an IPv6 address.
F: The Format Indicator is set to 1 to indicate that this is a
type-specific SOI.
Type: Type code is 4, to indicate that it is an IP-mapping SOI.
Rsv: Reserved for future use.
G Flag: When the G Flag is zero, on the external remote PEs, the
SOI-mapped IP address can be used as if it were the GW-IP field of
the RT-5 route it belongs to, except that no RT-2 route (as
discussed in Appendix B.2 of
[I-D.wang-bess-evpn-arp-nd-synch-without-irb]) needs to be found
before the recursive resolution.
When the V Flag is 1, the SOI-mapped IP address is an IPv6 address
like the VRID_IP in Section 6.3.1, Paragraph 2, Item 5. When the V
Flag is 0, the SOI-mapped IP address is the SOI itself.
When the G Flag is set to 1, the advertising PE should advertise
an RT-5L route for that SOI-mapped IP address, and the RT-5L
route should not use an EVI-RT or an ES-Import RT.
Other fields: The same as [I-D.wang-bess-evpn-ether-tag-id-usage].
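The following non-normative Python sketch packs and unpacks the eight
octets of Figure 5. The bit widths are an assumption read off the
figure: a 4-bit SOI Type, 1-bit O and Z flags and a 2-bit F field in
the third octet, then a 4-bit Flags field, the V and G flags and a
2-bit Rsv field in the fourth octet. The Sub-Type is still TBD, so it
is passed in by the caller; the value 0xFF used below is a placeholder
and not an assigned code point. The O and Z flags are defined in
[I-D.wang-bess-evpn-ether-tag-id-usage] and are simply left at zero
here.

```python
import ipaddress
import struct

EVPN_EXT_COMMUNITY_TYPE = 0x06
SOI_TYPE_IP_MAPPING = 4
FORMAT_TYPE_SPECIFIC = 1


def encode_ip_mapping_soi(sub_type: int, soi: int, v_flag: bool,
                          g_flag: bool, o_flag: bool = False,
                          z_flag: bool = False) -> bytes:
    """Pack the community; the bit positions are assumed from Figure 5."""
    octet2 = ((SOI_TYPE_IP_MAPPING << 4) | (int(o_flag) << 3)
              | (int(z_flag) << 2) | FORMAT_TYPE_SPECIFIC)
    # Flags (upper 4 bits) and Rsv (lower 2 bits) are sent as zero.
    octet3 = (int(v_flag) << 3) | (int(g_flag) << 2)
    return struct.pack("!BBBBI", EVPN_EXT_COMMUNITY_TYPE, sub_type,
                       octet2, octet3, soi)


def decode_ip_mapping_soi(data: bytes) -> dict:
    """Unpack the eight octets produced by encode_ip_mapping_soi()."""
    if len(data) != 8:
        raise ValueError("an extended community is 8 octets long")
    ec_type, sub_type, octet2, octet3, soi = struct.unpack("!BBBBI", data)
    return {
        "type": ec_type,
        "sub_type": sub_type,
        "soi_type": octet2 >> 4,
        "o": bool(octet2 & 0x08),
        "z": bool(octet2 & 0x04),
        "f": octet2 & 0x03,
        "v": bool(octet3 & 0x08),
        "g": bool(octet3 & 0x04),
        "soi": soi,
    }


def ipv4_nexthop_from_soi(soi: int) -> ipaddress.IPv4Address:
    """With V = 0 the overlay IPv4 nexthop equals the SOI field itself."""
    return ipaddress.IPv4Address(soi)


PLACEHOLDER_SUB_TYPE = 0xFF  # hypothetical; the real Sub-Type is TBD

# V = 1: the SOI is a VRID that internal remote PEs map to a VRID_IP.
encoded_v6 = encode_ip_mapping_soi(PLACEHOLDER_SUB_TYPE, 621,
                                   v_flag=True, g_flag=False)
decoded_v6 = decode_ip_mapping_soi(encoded_v6)

# V = 0: the SOI field carries the IPv4 nexthop directly.
soi_v4 = int(ipaddress.IPv4Address("192.0.2.102"))
encoded_v4 = encode_ip_mapping_soi(PLACEHOLDER_SUB_TYPE, soi_v4,
                                   v_flag=False, g_flag=False)
nexthop_v4 = ipv4_nexthop_from_soi(decode_ip_mapping_soi(encoded_v4)["soi"])
print(encoded_v6.hex(), decoded_v6, nexthop_v4)
```

If the final field widths differ from those assumed here, only the
shift and mask constants need to change.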
7. Security Considerations
TBD.
8. IANA Considerations
There is no IANA consideration needed.
9. Normative References
[I-D.ietf-bess-srv6-services]
Dawra, G., Filsfils, C., Talaulikar, K., Raszuk, R.,
Decraene, B., Zhuang, S., and J. Rabadan, "SRv6 BGP based
Overlay Services", Work in Progress, Internet-Draft,
draft-ietf-bess-srv6-services-07, 11 April 2021,
<https://datatracker.ietf.org/doc/html/draft-ietf-bess-
srv6-services-07>.
[I-D.ietf-bess-evpn-prefix-advertisement]
Rabadan, J., Henderickx, W., Drake, J., Lin, W., and A.
Sajassi, "IP Prefix Advertisement in EVPN", Work in
Progress, Internet-Draft, draft-ietf-bess-evpn-prefix-
advertisement-11, 18 May 2018,
<https://datatracker.ietf.org/doc/html/draft-ietf-bess-
evpn-prefix-advertisement-11>.
[I-D.ietf-bess-evpn-inter-subnet-forwarding]
Sajassi, A., Salam, S., Thoria, S., Drake, J., and J.
Rabadan, "Integrated Routing and Bridging in EVPN", Work
in Progress, Internet-Draft, draft-ietf-bess-evpn-inter-
subnet-forwarding-15, 26 July 2021,
<https://datatracker.ietf.org/doc/html/draft-ietf-bess-
evpn-inter-subnet-forwarding-15>.
[RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
2015, <https://www.rfc-editor.org/info/rfc7432>.
[RFC8214] Boutros, S., Sajassi, A., Salam, S., Drake, J., and J.
Rabadan, "Virtual Private Wire Service Support in Ethernet
VPN", RFC 8214, DOI 10.17487/RFC8214, August 2017,
<https://www.rfc-editor.org/info/rfc8214>.
[RFC8365] Sajassi, A., Ed., Drake, J., Ed., Bitar, N., Shekhar, R.,
Uttaro, J., and W. Henderickx, "A Network Virtualization
Overlay Solution Using Ethernet VPN (EVPN)", RFC 8365,
DOI 10.17487/RFC8365, March 2018,
<https://www.rfc-editor.org/info/rfc8365>.
[I-D.wang-bess-evpn-ether-tag-id-usage]
Wang, Y., "Ethernet Tag ID Usage Update for Ethernet A-D
per EVI Route", Work in Progress, Internet-Draft, draft-
wang-bess-evpn-ether-tag-id-usage-03, 26 August 2021,
<https://datatracker.ietf.org/doc/html/draft-wang-bess-
evpn-ether-tag-id-usage-03>.
[I-D.sajassi-bess-evpn-ip-aliasing]
Sajassi, A., Badoni, G., Warade, P., Pasupula, S., Drake,
J., and J. Rabadan, "EVPN Support for L3 Fast Convergence
and Aliasing/Backup Path", Work in Progress, Internet-
Draft, draft-sajassi-bess-evpn-ip-aliasing-02, 8 June
2021, <https://datatracker.ietf.org/doc/html/draft-
sajassi-bess-evpn-ip-aliasing-02>.
10. Informative References
[I-D.wang-bess-evpn-arp-nd-synch-without-irb]
Wang, Y. and Z. Zhang, "ARP/ND Synching And IP Aliasing
without IRB", Work in Progress, Internet-Draft, draft-
wang-bess-evpn-arp-nd-synch-without-irb-07, 9 August 2021,
<https://datatracker.ietf.org/doc/html/draft-wang-bess-
evpn-arp-nd-synch-without-irb-07>.
Authors' Addresses
Yubao Wang
ZTE Corporation
No. 68 of Zijinghua Road, Yuhuatai District
Nanjing
China
Email: wang.yubao2@zte.com.cn
Zheng(Sandy) Zhang
ZTE Corporation
No. 50 Software Ave, Yuhuatai District
Nanjing
China
Email: zhang.zheng@zte.com.cn