BESS J. Heitz
Internet-Draft A. Sajassi
Intended status: Standards Track Cisco
Expires: May 17, 2018 J. Drake
Juniper
J. Rabadan
Nokia
November 13, 2017
Multi-homing and E-Tree in EVPN with Inter-AS Option B
draft-heitz-bess-evpn-option-b-01
Abstract
The BGP speaker that originates an EVPN Ethernet A-D per ES route is
identified by the next-hop of the route. When the route is
propagated by an ASBR as an Inter-AS Option B route, the ASBR
overwrites the next-hop. This document describes a method to
identify the originator of the route.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 17, 2018.
Heitz, et al. Expires May 17, 2018 [Page 1]
Internet-Draft EVPN Inter-AS Option B November 2017
Copyright Notice
Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Terminology
2. Introduction
2.1. EVPN multi-homing and Inter-AS Option B issue
2.2. EVPN E-tree and Inter-AS Option B issue
3. Solution using the Tunnel Encapsulation Attribute
4. Operation
5. Procedures at the Imposition PE
5.1. Primer for subsequent sections
5.2. OPE exists on all Type 2/5 and EAD Routes
5.3. Some routes do not contain OPE
5.4. OPE exists on EAD routes, but not on Type 2/5 routes
6. Security Considerations
7. IANA Considerations
8. Acknowledgements
9. Appendix
9.1. Alternative Ways to Signal OPE
9.1.1. Extended Community holding the IP address
9.1.2. Large Community holding the BGP Identifier
9.2. Considerations
10. Normative References
Authors' Addresses
1. Terminology
Inter-AS Option B: This is described in Section 10.b of [RFC4364].
EAD-per-ES: Ethernet A-D per Ethernet Segment Route.
EAD-per-EVI: Ethernet A-D per EVPN Instance Route.
EAD: EVPN Type 1 route: Ethernet Auto-discovery Route. Either an
EAD-per-ES or an EAD-per-EVI route.
Type 2/5: either the EVPN Type 2 route: MAC/IP Advertisement Route or
the EVPN Type 5 route: IP Prefix Route described in
[I-D.ietf-bess-evpn-prefix-advertisement].
Mass Withdraw: To withdraw a route from the forwarding table without
removing it from the BGP table. For example, a MAC route that is mass
withdrawn remains in the BGP table.
The MAC route is required for directing packets with the specified
MAC destination address to a matching backup or alias route. When a
MAC route is completely withdrawn, then the matching backup or alias
routes can no longer be used for the given MAC address. The
withdrawal of an EAD-per-ES route will cause the mass withdrawal of
associated Type 2/5 routes as well as associated EAD-per-EVI routes.
2. Introduction
Inter-AS Option B is illustrated in Figure 1.
        CE3
         |
        PE1
       /   \
    CE1     ASBR1---ASBR2---PE3--CE2
       \   /
        PE2
Figure 1: Inter-AS Option B
Traffic flow is from CE2 to CE1 where PE3 is an imposition PE, and
PE1 and PE2 are disposition PEs. The following sections describe the
issues that EVPN multi-homing and EVPN E-tree services have in these
types of scenarios.
2.1. EVPN multi-homing and Inter-AS Option B issue
In a multi-homing scenario, the router that performs the redundancy
switchover or the load balancing (e.g., PE3) must know which router
originated the Ethernet A-D routes. These redundancy functions are
normally implemented on a PE, but not on an ASBR.
Quote from [RFC7432]:
"A remote PE that receives a MAC/IP Advertisement route with a
non-reserved ESI SHOULD consider the advertised MAC address to be
reachable via all PEs that have advertised reachability to that
MAC address's EVI/ES via the combination of an Ethernet A-D per
EVI route for that EVI/ES (and Ethernet tag, if applicable) AND
Ethernet A-D per ES routes for that ES."
In the Intra-AS case, the remote PE identifies the "PEs that have
advertised reachability" by the next-hops of the Ethernet A-D routes.
In the Inter-AS Option B case, ASBR1 and ASBR2 rewrite the next-hops
to themselves on all EVPN route advertisements, thus losing the
identity of the PE that originated an advertisement.
As a result, PE3 is unable to distinguish an EAD-per-ES route that
originated at PE1 from one that originated at PE2.
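The loss of identity can be illustrated with a minimal sketch. The Route tuple and the rewrite_next_hop helper are hypothetical names chosen for this illustration, and the addresses are taken from documentation ranges; none of them come from [RFC7432].

```python
from typing import NamedTuple

class Route(NamedTuple):
    esi: str        # Ethernet Segment Identifier (symbolic here)
    rd: str         # Route Distinguisher
    next_hop: str   # BGP next-hop

def rewrite_next_hop(route: Route, asbr: str) -> Route:
    """Model an Option B ASBR setting the next-hop to itself."""
    return route._replace(next_hop=asbr)

# EAD-per-ES routes for the same ES, originated by PE1 and PE2.
from_pe1 = Route(esi="ES-1", rd="192.0.2.1:100", next_hop="192.0.2.1")
from_pe2 = Route(esi="ES-1", rd="192.0.2.2:100", next_hop="192.0.2.2")

# Within the AS, the next-hops distinguish the two originators.
intra_as = {r.next_hop for r in (from_pe1, from_pe2)}

# After the ASBR rewrites the next-hop, PE3 sees a single next-hop.
asbr2 = "198.51.100.2"
inter_as = {rewrite_next_hop(r, asbr2).next_hop for r in (from_pe1, from_pe2)}
```

In this model, intra_as holds two distinct next-hops while inter_as holds only the ASBR address, which is the ambiguity that PE3 faces.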
2.2. EVPN E-tree and Inter-AS Option B issue
As described in [EVPN-Etree], leaf-to-leaf BUM traffic filtering is
always performed at the disposition PE and based on the Leaf Label.
The Leaf Label can be downstream allocated (ingress replication) or
upstream allocated (p2mp tunnels) and is advertised in an EAD-per-ES
route with ESI-0. As in the multi-homing case, the PEs must identify
the PE that originated a given EAD-per-ES route, for both cases,
ingress replication or p2mp tunnels, so that the leaf-to-leaf BUM
filtering can be successful.
If ingress-replication is used for BUM traffic, the ingress PE must
identify the originator of the ESI-0 EAD-per-ES route, program the
Leaf Label and push it on the stack when sending BUM Leaf traffic to
the egress PE. However, this identification of the originating PE is
not possible in Inter-AS Option B scenarios where ASBRs rewrite the
next-hops. For instance, assuming CE2 and CE3 (Figure 1) are
connected to Leaf ACs, PE1 will advertise a Leaf Label in an EAD-per-
ES route for ESI-0. When CE2 sends BUM traffic, PE3 will not know
what Leaf Label to use for sending traffic to PE1.
Similarly, when PE3 uses non-segmented p2mp tunnels for BUM traffic,
PE3 will upstream allocate a Leaf Label and advertise it in an
EAD-per-ES route, so that when PE3 sends BUM traffic with a Leaf
Label, PE1 can identify that the traffic is coming from a Leaf and
not forward it to CE3.
In both cases, the current Intra-AS procedures do not allow the
originator of the EAD-per-ES routes to be identified, and therefore
egress BUM filtering for leaf-to-leaf traffic is not possible when
the Leaf ACs are located in different ASes.
3. Solution using the Tunnel Encapsulation Attribute
The Tunnel Encapsulation Attribute is specified in
[I-D.ietf-idr-tunnel-encaps]. A new TLV to identify the Originating
PE is specified here. It is called OPE. The tunnel type for the OPE
(suggested value 15) is to be assigned by IANA. The OPE MUST contain
the Remote Endpoint Sub-TLV. The OPE must be able to uniquely
identify the PE of origin within all ASes that participate in an EVPN
instance.
If a BGP speaker, such as a route reflector or an ASBR, is about to
re-advertise a Type 2/5 or EAD route that does not have an OPE, and
will change the next-hop of that route, then it MUST add one by
putting the received next-hop into the Remote Endpoint Sub-TLV of the
OPE. This ensures that all such EVPN routes carry the information
that imposition PEs need in order to perform aliasing and mass
withdraw.
Any router that re-advertises a route that contains an OPE may modify
other TLVs in the Tunnel Encapsulation Attribute. However, it MUST
keep the OPE unchanged. Examples are ASBR1 and ASBR2 in Figure 1.
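The re-advertisement rule above can be sketched as follows. This is an illustration under stated assumptions, not an implementation of the Tunnel Encapsulation Attribute: the EvpnRoute class and the readvertise function are hypothetical, and the OPE is reduced to the address held in its Remote Endpoint Sub-TLV.

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class EvpnRoute:
    route_type: str            # "EAD" or "Type2/5" (labels are illustrative)
    next_hop: str
    ope: Optional[str] = None  # Remote Endpoint carried in the OPE, if any

def readvertise(route: EvpnRoute, new_next_hop: str) -> EvpnRoute:
    """Re-advertise a route, applying the OPE rule when the next-hop changes."""
    ope = route.ope
    if ope is None and new_next_hop != route.next_hop:
        # No OPE yet and the next-hop is about to change:
        # record the received next-hop as the originating PE.
        ope = route.next_hop
    # An existing OPE is carried through unchanged.
    return replace(route, next_hop=new_next_hop, ope=ope)

pe1_route = EvpnRoute("EAD", next_hop="192.0.2.1")
at_asbr1 = readvertise(pe1_route, "198.51.100.1")
at_asbr2 = readvertise(at_asbr1, "198.51.100.2")
```

In this sketch ASBR1 adds the OPE because PE1 did not send one, and ASBR2 changes only the next-hop, so PE3 still learns 192.0.2.1 as the originator.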
4. Operation
In an Inter-AS Option B scenario, when a PE receives EVPN routes with
an OPE from an ASBR, the procedures of [RFC7432] work as specified,
including both aliasing and mass withdraw. That is, the imposition
PE (e.g., PE3) can process mass withdraw messages (Ethernet A-D per
ES routes). However, if a PE receives EVPN routes without an OPE
from an ASBR, then the mass withdraw function operates in a
degenerate mode in which only the Ethernet A-D per EVI route can be
processed (for its corresponding MAC-VRF), but not the Ethernet A-D
per ES route (which corresponds to all the impacted MAC-VRFs). The
following sections detail the procedures associated with OPE
processing.
5. Procedures at the Imposition PE
5.1. Primer for subsequent sections
When routes are being compared, they must exist in the same MAC-VRF
and have the same non-reserved ESI. In addition, when Type 2/5
routes and EAD-per-EVI routes are being compared, they must have the
same Ethernet Tag. Type 2/5 routes with ESI==0 do not use mass
withdrawal or aliasing.
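As a rough illustration, the comparison preconditions above can be written as a predicate. The Route tuple and its field names are assumptions made for this sketch; the set of reserved ESI values (all zeros and all ones) follows [RFC7432].

```python
from typing import NamedTuple

# ESI 0 and MAX-ESI (all ones) are the reserved values in RFC 7432.
RESERVED_ESIS = {bytes(10), b"\xff" * 10}

class Route(NamedTuple):
    kind: str          # "Type2/5", "EAD-per-EVI" or "EAD-per-ES"
    mac_vrf: str
    esi: bytes         # 10-octet Ethernet Segment Identifier
    ethernet_tag: int

def comparable(a: Route, b: Route) -> bool:
    """Return True if the two routes may be compared under Section 5.1."""
    if a.mac_vrf != b.mac_vrf or a.esi != b.esi or a.esi in RESERVED_ESIS:
        return False
    if {a.kind, b.kind} == {"Type2/5", "EAD-per-EVI"}:
        # The Ethernet Tag must match only for this pairing.
        return a.ethernet_tag == b.ethernet_tag
    return True

esi = bytes.fromhex("00112233445566778899")
mac = Route("Type2/5", "vrf-a", esi, 10)
ead_evi = Route("EAD-per-EVI", "vrf-a", esi, 10)
ead_es = Route("EAD-per-ES", "vrf-a", esi, 0xFFFFFFFF)
single_homed = Route("Type2/5", "vrf-a", bytes(10), 10)
```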
5.2. OPE exists on all Type 2/5 and EAD Routes
If all Type 2/5 and EAD routes have an OPE, then "PEs that have
advertised reachability" can be identified by the OPE and the
procedures of [RFC7432] can be applied without modification.
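A minimal sketch of this case follows, assuming a single MAC-VRF and Ethernet Tag so that only the ESI and the OPE matter. The Ead tuple and the reachable_pes function are hypothetical names used for illustration.

```python
from typing import NamedTuple

class Ead(NamedTuple):
    kind: str   # "per-ES" or "per-EVI"
    esi: str
    ope: str    # originating PE taken from the OPE

def reachable_pes(eads: list[Ead], esi: str) -> set[str]:
    """PEs that advertised both an EAD-per-EVI and an EAD-per-ES route for esi."""
    per_es = {e.ope for e in eads if e.kind == "per-ES" and e.esi == esi}
    per_evi = {e.ope for e in eads if e.kind == "per-EVI" and e.esi == esi}
    return per_es & per_evi

eads = [
    Ead("per-ES", "ES-1", "192.0.2.1"),
    Ead("per-EVI", "ES-1", "192.0.2.1"),
    Ead("per-ES", "ES-1", "192.0.2.2"),
    Ead("per-EVI", "ES-1", "192.0.2.2"),
]
before = reachable_pes(eads, "ES-1")

# PE1 withdraws its EAD-per-ES route: PE1 drops out of the set.
pe1_per_es = Ead("per-ES", "ES-1", "192.0.2.1")
after = reachable_pes([e for e in eads if e != pe1_per_es], "ES-1")
```

Because the set is keyed by the OPE rather than by the next-hop, the result is the same whether or not an ASBR has rewritten the next-hop.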
5.3. Some routes do not contain OPE
The routes that have an OPE are handled as per the previous section.
The routes that do not have an OPE need the following procedures.
Type 2/5 routes without an OPE and EAD-per-EVI routes without an OPE
are valid if at least one EAD-per-ES route without an OPE exists with
the same next-hop. In other words: if multiple EAD-per-ES routes
with the same next-hop as a Type 2/5 route exist, then the Type 2/5
route will only be mass withdrawn once all of the EAD-per-ES routes
are withdrawn. This rule is necessary, because a BGP speaker may
serve dual roles as ASBR and PE.
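The validity rule for routes without an OPE can be sketched as below. The function name and the flat list of next-hops are simplifications assumed for this example; a real implementation would also apply the matching conditions of Section 5.1.

```python
def is_valid_without_ope(route_next_hop: str,
                         ead_per_es_next_hops: list[str]) -> bool:
    """A route without an OPE is valid while at least one EAD-per-ES route
    without an OPE shares its next-hop."""
    return route_next_hop in ead_per_es_next_hops

asbr = "198.51.100.2"
# Two EAD-per-ES routes (e.g., from PE1 and PE2) both arrive with the
# ASBR as their next-hop.
ead_per_es = [asbr, asbr]
valid_two = is_valid_without_ope(asbr, ead_per_es)

ead_per_es.pop()   # one EAD-per-ES route is withdrawn
valid_one = is_valid_without_ope(asbr, ead_per_es)

ead_per_es.pop()   # the last EAD-per-ES route is withdrawn
valid_none = is_valid_without_ope(asbr, ead_per_es)
```

Only the final withdrawal invalidates the Type 2/5 route, which matches the "once all of the EAD-per-ES routes are withdrawn" wording above.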
[Editorial note: If it is determined that no BGP speakers exist that
do not normally follow the procedures in this document (Legacy
speakers), then the following subsections may be omitted.]
If an EAD-per-EVI route without an OPE is withdrawn, it will mass
withdraw all Type 2/5 routes without an OPE that have the same
next-hop and the same RD as the EAD-per-EVI route. This is called
mass-withdraw per EVI. Note that it is not the absence of the
EAD-per-EVI route that causes mass-withdrawal, but the actual
withdrawal itself. If the route was never there to begin with, then
no withdrawal took place.
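Mass-withdraw per EVI is driven by the withdrawal event rather than by a lookup, which the following sketch tries to make explicit. The Type25 tuple, its active flag and the handler name are assumptions for illustration only.

```python
from typing import NamedTuple

class Type25(NamedTuple):
    prefix: str
    next_hop: str
    rd: str
    active: bool = True   # False once mass withdrawn from forwarding

def on_ead_per_evi_withdraw(routes: list[Type25],
                            next_hop: str, rd: str) -> list[Type25]:
    """Handle the withdrawal event of an EAD-per-EVI route without an OPE."""
    return [
        r._replace(active=False) if (r.next_hop, r.rd) == (next_hop, rd) else r
        for r in routes
    ]

table = [
    Type25("00:00:5e:00:53:01", "198.51.100.2", "192.0.2.1:100"),
    Type25("00:00:5e:00:53:02", "198.51.100.2", "192.0.2.2:100"),
]
# If no EAD-per-EVI route was ever received, the handler is never called
# and every route stays active.
# A withdrawal for RD 192.0.2.1:100 deactivates only the matching route.
table = on_ead_per_evi_withdraw(table, "198.51.100.2", "192.0.2.1:100")
```

The mass withdrawn route stays in the table with active set to False, in line with the definition of Mass Withdraw in Section 1.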
If any entity in the network rewrites an RD, then all entities must
rewrite the RD in a consistent manner, such that routes with the same
RD continue to have the same RD and routes with different RDs
continue to have different RDs. Note that if this condition is
violated, then other network functions would also break.
5.4. OPE exists on EAD routes, but not on Type 2/5 routes
If a Type 2/5 route exists without an OPE and an EAD-per-EVI route
exists with an OPE and it has the same next-hop and the same RD as
the Type 2/5 route, then the Type 2/5 route shall inherit the OPE
from the EAD-per-EVI route. Thereafter, Section 5.2 applies.
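The inheritance step can be sketched as follows. The Route class and the inherit_ope function are hypothetical, and the OPE is again reduced to a single address for brevity.

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class Route:
    kind: str                  # "Type2/5" or "EAD-per-EVI"
    next_hop: str
    rd: str
    ope: Optional[str] = None

def inherit_ope(route: Route, ead_per_evi: list[Route]) -> Route:
    """Copy the OPE from an EAD-per-EVI route with the same next-hop and RD."""
    if route.ope is not None:
        return route
    for ead in ead_per_evi:
        same_key = (ead.next_hop, ead.rd) == (route.next_hop, route.rd)
        if ead.ope is not None and same_key:
            return replace(route, ope=ead.ope)
    return route

mac = Route("Type2/5", "198.51.100.2", "192.0.2.1:100")
ead = Route("EAD-per-EVI", "198.51.100.2", "192.0.2.1:100", ope="192.0.2.1")
resolved = inherit_ope(mac, [ead])
```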
6. Security Considerations
TBD
7. IANA Considerations
A Tunnel Encapsulation Attribute Tunnel Type for the OPE is required.
8. Acknowledgements
Thanks to Kiran Pillai, Patrice Brissette, Satya Mohanty and Keyur
Patel for careful review and suggestions.
9. Appendix
9.1. Alternative Ways to Signal OPE
[Note to RFC editor: This appendix to be removed before publication]
9.1.1. Extended Community holding the IP address
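The Extended Community to use must be transitive and either IPv4
Address Specific or IPv6 Address Specific, the latter as described in
[RFC5701]. Thus, if it is IPv4 Address Specific, it will be of type
0x01, and if IPv6 Address Specific, it will be of type 0x00 (types
0x41 and 0x40 are the corresponding non-transitive forms).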
The Extended Community will hold the IP address of the PE that
originates the EVPN routes.
9.1.2. Large Community holding the BGP Identifier
A PE can be uniquely identified by its BGP identifier (also called
Router ID) and its AS number (ASN). A Large Community [RFC8092] can
be used to carry the BGP identifier and the ASN. A well-known Large
Community needs to be allocated for this. This allocation is for the
Global Administrator field. The Local Data Part 1 field should carry
the ASN and the Local Data Part 2 field should carry the BGP
identifier.
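A possible encoding is sketched below, using the 12-octet layout of [RFC8092] (three 4-octet fields). The Global Administrator value is an arbitrary placeholder, since no well-known value has been allocated, and the function names are hypothetical.

```python
import ipaddress
import struct

# Arbitrary placeholder: a real value would have to be allocated.
HYPOTHETICAL_OPE_GLOBAL_ADMIN = 0xFFFFFFFF

def encode_ope_large_community(asn: int, bgp_id: str) -> bytes:
    """Pack Global Administrator, ASN and BGP identifier into 12 octets."""
    router_id = int(ipaddress.IPv4Address(bgp_id))
    return struct.pack("!III", HYPOTHETICAL_OPE_GLOBAL_ADMIN, asn, router_id)

def decode_ope_large_community(data: bytes) -> tuple[int, str]:
    """Return (ASN, BGP identifier) from a 12-octet Large Community."""
    _, asn, router_id = struct.unpack("!III", data)
    return asn, str(ipaddress.IPv4Address(router_id))

encoded = encode_ope_large_community(64496, "192.0.2.1")
decoded = decode_ope_large_community(encoded)
```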
9.2. Considerations
It may be possible to associate the EAD-per-ES route with the Type
2/5 route by matching the Administrator Subfield of the RD. However,
there are too many constraints that need to be met to make this
method reliable. The RD was designed to distinguish routes, not to
identify their originator. The constraints that need
to be met are:
o The RD MUST be of Type 1. [RFC7432] recommends Type 1, but does
not mandate it.
o The Administrator subfield of the RD MUST be the same for each of
these routes originated by one PE. [RFC7432] does not require
this. It just says "The value field comprises an IP address of
the PE", but does not say that it must be the same IP address for
all. In an IPv6 only scenario, other ways will be used to assign
RD.
o The Administrator subfield of the RD MUST be unique among all PEs
participating in the Inter-AS EVPN. This is likely, but not
guaranteed.
o If RDs are rewritten at AS boundaries, then the Administrator
subfield MUST be rewritten in a consistent way such as to preserve
the above properties.
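For illustration only, RD-based matching could look like the sketch below, assuming the 8-octet RD layout of [RFC4364] (2-octet Type, then for Type 1 a 4-octet IPv4 Administrator subfield and a 2-octet Assigned Number). The function names are hypothetical, and the result is only meaningful if every constraint in the list above holds.

```python
import ipaddress
import struct
from typing import Optional

def rd_type1_administrator(rd: bytes) -> Optional[str]:
    """Return the Administrator subfield of a Type 1 RD, or None otherwise."""
    rd_type, admin, _assigned = struct.unpack("!HIH", rd)
    if rd_type != 1:
        return None
    return str(ipaddress.IPv4Address(admin))

def same_originator(rd_a: bytes, rd_b: bytes) -> bool:
    """Guess whether two routes share an originator from their RDs."""
    admin_a = rd_type1_administrator(rd_a)
    return admin_a is not None and admin_a == rd_type1_administrator(rd_b)

pe1 = int(ipaddress.IPv4Address("192.0.2.1"))
ead_rd = struct.pack("!HIH", 1, pe1, 100)      # Type 1, 192.0.2.1:100
mac_rd = struct.pack("!HIH", 1, pe1, 200)      # Type 1, 192.0.2.1:200
type0_rd = struct.pack("!HHI", 0, 64496, 100)  # Type 0, 64496:100
```

A Type 0 RD yields no usable Administrator address, so same_originator returns False for it, which shows how quickly the method stops working when the first constraint is not met.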
By allowing a single EAD-per-ES route to validate all EAD-per-EVI
routes and all Type 2/5 routes, some of those routes may be falsely
validated. However, that is the best possible outcome without an OPE.
It is transient until the Type 2/5 route can be withdrawn.
The possibility of the address space of PE next-hops in one AS
overlapping that of another AS was raised. In such a case, the IP
address of a PE in one AS may be the same as the IP address of a
different PE in another AS. Because an ASBR overwrites next-hops,
this can work. The OPE contains both the ASN and the IP address of
the originating PE, so this works too. However, EVPN route types 3
and 4 contain only the originating router's IP address, but not the
originating router's ASN. Therefore, EVPN route types 3 and 4 may
also need an OPE.
The possibility of making the EAD-per-EVI route mandatory was raised.
This would make some of the procedures easier, because the RD of the
EAD-per-EVI route can be matched with the RD of the Type 2/5 route.
10. Normative References
[I-D.ietf-bess-evpn-prefix-advertisement]
Rabadan, J., Henderickx, W., Palislamovic, S., and A.
Isaac, "IP Prefix Advertisement in EVPN", draft-ietf-bess-
evpn-prefix-advertisement-02 (work in progress), September
2015.
[I-D.ietf-idr-tunnel-encaps]
Rosen, E., Patel, K., and G. Velde, "The BGP Tunnel
Encapsulation Attribute", draft-ietf-idr-tunnel-encaps-02
(work in progress), May 2016.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
2006, <https://www.rfc-editor.org/info/rfc4364>.
[RFC5701] Rekhter, Y., "IPv6 Address Specific BGP Extended Community
Attribute", RFC 5701, DOI 10.17487/RFC5701, November 2009,
<https://www.rfc-editor.org/info/rfc5701>.
[RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
2015, <https://www.rfc-editor.org/info/rfc7432>.
[RFC8092] Heitz, J., Ed., Snijders, J., Ed., Patel, K., Bagdonas,
I., and N. Hilliard, "BGP Large Communities Attribute",
RFC 8092, DOI 10.17487/RFC8092, February 2017,
<https://www.rfc-editor.org/info/rfc8092>.
Authors' Addresses
Jakob Heitz
Cisco
170 West Tasman Drive
San Jose, CA 95134
USA
Email: jheitz@cisco.com
Ali Sajassi
Cisco
170 West Tasman Drive
San Jose, CA 95134
USA
Email: sajassi@cisco.com
John Drake
Juniper
Email: jdrake@juniper.net
Jorge Rabadan
Nokia
Email: jorge.rabadan@nokia.com