BESS Workgroup Ali Sajassi
Internet Draft Samer Salam
Intended Status: Standards Track Cisco
Luay Jalil
Verizon
John Drake
Tapraj Singh
Juniper
Expires: April 19, 2016 October 19, 2015
PBB-EVPN with Anycast IP Tunnels
draft-sajassi-bess-pbb-evpn-anycast-ip-tunnels-00
Abstract
This document describes how PBB-EVPN can be combined with Anycast IP
Tunnels to provide resilient Layer 2 services with simplified
operations on Access Nodes (ANs).
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
Copyright and License Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
Sajassi et al. Expires April 19, 2016 [Page 1]
INTERNET DRAFT PBB-EVPN with Anycast IP Tunnels October 19, 2015
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1 Introduction
  1.1 Terminology
2 Requirements
3 Current Challenges with PBB-EVPN
  3.1 All-Active Redundancy Mode
  3.2 Single-Active Redundancy Mode
4 Solution
  4.1 Overview
  4.2 Operation
    4.2.1 Known Unicast Traffic
    4.2.2 BUM Traffic
  4.3 Optimizing PE B-MAC Address Allocation
    4.3.1 Known Unicast Traffic
    4.3.2 BUM Traffic
      4.3.2.1 BUM Traffic from Ethernet Segment
      4.3.2.2 BUM Traffic from PE in Redundancy Group
      4.3.2.3 BUM Traffic from Remote PE
5 Failure Scenarios
  5.1 Link/Node Failure in Access Network
  5.2 PE Node Failure
  5.3 IP Tunnel Failure
  5.4 AN Node Failure
6 Security Considerations
7 IANA Considerations
8 References
  8.1 Normative References
  8.2 Informative References
Authors' Addresses
1 Introduction
[RFC7623] defines PBB-EVPN, a solution that provides scalable MPLS
Layer 2 VPN services using multi-protocol BGP combined with Provider
Backbone Bridging (PBB) [PBB].
In this document, we describe a solution that uses PBB-EVPN to
aggregate IP access networks over an MPLS backbone, while offering
Layer 2 VPN services end to end. The network topology in question
comprises three domains: the customer network, the IP access
network, and the MPLS backbone, as shown in the figure below.
Customer  IP Access      MPLS
Network   Network        Backbone
                     +--------------+
     +---------+     |              | +---------------+
     |+---+    |    +----+          | |               |
     ||   | +---+   |    |   +-+    | |    +---+      |
     -|AN |-|IPc|---| PE |---|P|    | |    |IPc|      |
    /|+---+ +---+  /+----+  /+-+    | |   /+---+      |
   / |+---+    |  / +----+ /      +----+ /       +---+|
+--+ ||   | +---+/  |    |/  +-+  |    |/  +---+ |AN || +--+
|CE|--|AN |-|IPc|---| PE |---|P|--| PE |---|IPc|-|   |--|CE|
+--+ |+---+ +---+   +----+   +-+  +----+   +---+ +---+| +--+
     |         |     |              | |               |
     +---------+     |              | +---------------+
                     +--------------+
Figure 1: Target Topology
The customer network connects via Customer Edge (CE) devices to the
IP Access Network. The IP Access Network includes Access Nodes (ANs)
and IP core nodes (IPc). The ANs perform tunneling of Ethernet
packets using some IP tunneling mechanism (e.g. GRE or VXLAN). The
MPLS Backbone comprises PBB-EVPN PEs as well as MPLS core nodes
(P). The PBB-EVPN PEs terminate the IP tunnels which originate from
the ANs in their local IP Access Network.
To simplify the operations and reduce the provisioning overhead on
the ANs, as well as to provide resiliency, the PEs will use Anycast
IP addresses as the tunnel destination, for tunnels originating from
the ANs. We will refer to this setup as PBB-EVPN with Anycast IP
tunnels.
1.1 Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [KEYWORDS].
2 Requirements
The solution MUST support multipoint Layer 2 VPN services end to end
(i.e. CE to CE), with the following requirements:
- Support for IP as well as non-IP payloads. Hence, the solution
should not rely on any ARP or ND snooping mechanism, but rather rely
on MAC learning in the data plane.
- Use IP as the underlying transport in the access network, with
support for IPv6.
- Support VLAN multiplexing with Customer VLAN (C-Tag) transparency
in the IP tunnels.
- Support for VLAN aware service bundling over the IP tunnels on the
PBB-EVPN PEs. This means that the PE needs to identify the L2 Bridge
Domain based on the combination of the IP tunnel identifier AND the
C-Tag (and/or S-tag).
- Support for local switching between IP tunnels on the PBB-EVPN PEs.
This is due to the assumption that the ANs may not support any local
switching/bridging functionality between their access ports.
- Support for hierarchical QoS with two levels in the hierarchy: per
IP tunnel and per C-Tag (and/or S-tag).
- Provide resilient interconnect with protection against: PE node
failure, path failures in the IP access network, and IP tunnel
failure.
- Simplify the provisioning of the ANs by using Anycast IP addresses
as the tunnel destination on the PEs. This eliminates the need to
explicitly provision the ANs with the unicast IP addresses of the
redundant PEs.
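As an illustration of the VLAN-aware bundling requirement above, the
following minimal sketch shows a PE selecting a bridge domain from the
combination of the IP tunnel identifier and the C-Tag. The table
contents, the tunnel names, and the function name are hypothetical and
are not defined by this document.

```python
from typing import Dict, Optional, Tuple

# Hypothetical service table on a PE: the key is the pair
# (IP tunnel identifier, C-Tag); the value is a bridge domain name.
BRIDGE_DOMAINS: Dict[Tuple[str, int], str] = {
    ("tunnel-an1", 100): "bd-blue",
    ("tunnel-an1", 200): "bd-red",
    ("tunnel-an2", 100): "bd-green",
}


def lookup_bridge_domain(tunnel_id: str, c_tag: int) -> Optional[str]:
    """Return the bridge domain for a frame, or None if not provisioned."""
    return BRIDGE_DOMAINS.get((tunnel_id, c_tag))
```

Because the tunnel identifier is part of the key, the same C-Tag value
can be reused on different tunnels without colliding.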
3 Current Challenges with PBB-EVPN
PBB-EVPN currently can operate in two different redundancy modes:
All-Active redundancy and Single-Active redundancy. Neither of these
modes can satisfy the requirements set forth in section 2 above. We
will discuss this in detail in this section.
3.1 All-Active Redundancy Mode
The challenge with the All-Active redundancy mode is that the traffic
arriving from the MPLS backbone gets load-balanced among the PEs in
the redundancy group. As such, it is not possible to enforce the QoS
policies reliably for traffic in the backbone-to-access direction.
Note that one cannot assume that the traffic will get distributed
evenly between the PEs due to the possibility of "elephant" flows
(i.e. flows with considerable bandwidth demands), especially when
load-balancing algorithms on the PEs do not take bandwidth into
account in the hashing decision.
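The effect of an "elephant" flow on hash-based load-balancing can be
sketched as follows. The flow set, the bandwidth figures, and the use
of CRC32 as the hash are assumptions chosen for illustration only;
real PEs use their own hashing inputs and algorithms.

```python
import zlib
from collections import defaultdict

PES = ["PE1", "PE2"]

# Hypothetical flows: (flow identifier, offered load in Mb/s).
FLOWS = [("elephant", 900)] + [(f"mouse-{i}", 10) for i in range(9)]


def load_per_pe(flows, pes):
    """Assign each flow to a PE by hash, ignoring its bandwidth."""
    load = defaultdict(int)
    for flow_id, mbps in flows:
        pe = pes[zlib.crc32(flow_id.encode()) % len(pes)]
        load[pe] += mbps
    return dict(load)


loads = load_per_pe(FLOWS, PES)
```

Whichever PE the elephant flow hashes to carries at least 900 of the
990 Mb/s offered, so neither PE can police the aggregate per-tunnel
rate on its own.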
3.2 Single-Active Redundancy Mode
The challenge with the Single-Active redundancy mode is that, based
on the DF election, only one of the PBB-EVPN PEs will be forwarding
traffic from access to the backbone. At the same time, depending on
the IGP distance between the Access Node (AN), and the PEs, traffic
over the Anycast IP tunnel may be delivered to the non-DF PE, where
it is permanently dropped. This is because the DF election
procedures, which are triggered by BGP exchanges, are independent of
the IGP path calculations.
Relaxing the DF filtering rules will result in duplicate packets, and
hence is not an option.
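The mismatch between the IGP and the DF election can be sketched as
follows. The IGP metrics and the DF election result are hypothetical
inputs; the point is only that they are computed independently.

```python
from typing import Dict


def nearest_pe(igp_cost: Dict[str, int]) -> str:
    """PE that the IGP selects for the Anycast address (lowest cost)."""
    return min(igp_cost, key=igp_cost.get)


def forwarded_to_backbone(landing_pe: str, df_pe: str) -> bool:
    """Single-Active rule: only the DF forwards access traffic."""
    return landing_pe == df_pe


igp_cost_from_an1 = {"PE1": 20, "PE2": 10}  # hypothetical IGP metrics
df_from_bgp = "PE1"                         # hypothetical DF election result
landing = nearest_pe(igp_cost_from_an1)
```

In this example the Anycast traffic lands on PE2 while PE1 is the DF,
so forwarded_to_backbone(landing, df_from_bgp) is False and the
traffic is dropped.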
4 Solution
4.1 Overview
The solution involves defining a new asymmetric redundancy scheme for
PBB-EVPN, which behaves similarly to All-Active redundancy for traffic
destined to the MPLS backbone from the Ethernet Segment and behaves
similarly to Single-Active redundancy for traffic destined to the
Ethernet Segment from the MPLS backbone.
Since the mechanism behaves like All-Active redundancy for traffic
destined towards the MPLS backbone, the PEs in the redundancy group
can all receive traffic from the ANs, irrespective of how the IGP
steers the Anycast traffic. The segment split horizon filtering on
the PE prevents looping of BUM traffic.
Moreover, since the mechanism behaves like Single-Active redundancy
for traffic destined towards the Ethernet Segment, the active PE can
reliably enforce the QoS policies.
4.2 Operation
Each IP tunnel will be treated as a virtual Ethernet Segment (vES)
[Virtual-ES] on the PBB-EVPN PEs. However, instead of having a single
B-MAC address associated with each vES across all PEs in the RG, each
PE will assign a different B-MAC for each vES, because Single-Active
redundancy is used for traffic from the MPLS backbone. This ensures
that remote PEs learn the C-MAC addresses against the unique B-MAC of
the active PE only. Note that the active PE here is actually
determined based on where the Anycast traffic lands, according to the
IGP distance between the ANs and the PEs in the IP access network.
Traffic filtering based on DF election applies only to BUM traffic,
and only for traffic in the direction towards the Ethernet Segment.
                       +--------------+
     +---------+       |              | +---------------+
     |+---+    |    +----+            | |               |
+--+ ||   |---------|    |            | |               |
|CE|--|AN1|-----\   |PE1 |            | |               |
+--+ |+---+    | \ /+----+            | |               |
     |+---+    |  / +----+          +----+         +---+|
+--+ ||   |------/ \|    |          |    |         |AN || +--+
|CE|--|AN2|---------|PE2 |          |PEr |---------|   |--|CE|
+--+ |+---+    |    +----+          +----+         +---+| +--+
     |         |       |              | |               |
     +---------+       |              | +---------------+
                       +--------------+
Figure 2: Example Network
For what follows, please refer to the example network in Figure 2
above.
4.2.1 Known Unicast Traffic
When an access node (e.g. AN1) forwards known unicast traffic over
the IP access network, the traffic will be steered towards one of the
PEs depending on the IGP distance between the AN and the PEs. In the
case where both PEs are equidistant from the AN, i.e. ECMP, the load-
balancing hash on the AN and IP core nodes of the access network
determines which PE receives the traffic. Assume the traffic is
delivered to PE1 in this example. PE1 would then encapsulate the
traffic in the PBB header, using its B-MAC address that is associated
with the vES corresponding to the IP tunnel on which the traffic was
received. The PE then adds the MPLS encapsulation and forwards the
packets towards the destination remote PE (PEr in this example). PEr
would then learn the C-MAC against the B-MAC address associated with
PE1. Hence, for the reverse traffic PEr will forward the traffic
destined to that C-MAC to PE1. This follows normal PBB-EVPN
operation.
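The data-plane learning step on the remote PE can be sketched as
follows. The class, the method names, and the address strings are
hypothetical; the sketch models only the C-MAC to B-MAC binding and
omits the PBB and MPLS encapsulations.

```python
from typing import Dict, Optional


class RemotePE:
    """Minimal model of C-MAC learning against a source B-MAC."""

    def __init__(self) -> None:
        self.cmac_table: Dict[str, str] = {}

    def learn(self, src_cmac: str, src_bmac: str) -> None:
        # Bind (or re-bind) the customer MAC to the B-MAC it arrived from.
        self.cmac_table[src_cmac] = src_bmac

    def lookup(self, dst_cmac: str) -> Optional[str]:
        # None means the C-MAC is unknown and the frame would be flooded.
        return self.cmac_table.get(dst_cmac)


per = RemotePE()
per.learn("cmac-host-a", "bmac-pe1-ves-an1")
```

Reverse traffic to "cmac-host-a" is then sent to the B-MAC of PE1,
which is the PE on which the Anycast traffic landed.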
The only issue to consider is how the PEs enforce the QoS policies
when multiple equal-cost paths exist between the AN and the PEs.
Since the QoS policies need to be enforced per IP
tunnel and per C-Tag (and/or S-tag) on the PE, the AN needs to ensure
that all traffic of a given tunnel lands on the same PE. This can be
done by encoding the tunnel identifier in the entropy field of the
packet. For example, in the case of VXLAN, the source VTEP address
can be hashed into the source UDP port. Similarly, in the case of
GRE, the source tunnel address can be hashed into the GRE key. This
guarantees that all traffic associated with a given IP tunnel from an
AN is destined to a single PE, even in the presence of ECMP.
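A minimal sketch of the entropy encoding described above follows. The
choice of CRC32 as the hash and the restriction of the UDP source port
to the dynamic range 49152-65535 are assumptions made for this
example; this document does not mandate a particular hash function or
port range.

```python
import ipaddress
import zlib

PORT_MIN, PORT_MAX = 49152, 65535  # assumed dynamic/private port range


def vxlan_source_port(src_vtep: str) -> int:
    """Derive a stable UDP source port from the source VTEP address."""
    packed = ipaddress.ip_address(src_vtep).packed
    span = PORT_MAX - PORT_MIN + 1
    return PORT_MIN + zlib.crc32(packed) % span


def gre_key(src_tunnel_addr: str) -> int:
    """Derive a stable 32-bit GRE key from the source tunnel address."""
    return zlib.crc32(ipaddress.ip_address(src_tunnel_addr).packed)
```

Since the value depends only on the tunnel source address, every
packet of a given tunnel carries the same entropy value and is hashed
onto the same equal-cost path. Both IPv4 and IPv6 addresses are
accepted.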
4.2.2 BUM Traffic
The PEs in a redundancy group perform the standard DF election
procedures described in RFC7623 (i.e. with per I-SID load-balancing).
As a result, only one of the PEs will be responsible for forwarding
BUM traffic received from the MPLS backbone towards the Ethernet
Segment (i.e. IP access network). This is per standard PBB-EVPN
operation.
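The per I-SID DF election can be sketched as follows, assuming the
modulo-based default procedure in which the PEs of the redundancy
group are ordered by IP address and the I-SID selects an ordinal in
that list. This assumption, the function name, and the addresses are
illustrative; the normative procedure is the one in RFC7623.

```python
import ipaddress
from typing import List


def elect_df(pe_addresses: List[str], isid: int) -> str:
    """Return the DF for an I-SID among the PEs attached to a segment."""
    # Order numerically, not lexically, so 192.0.2.9 sorts before 192.0.2.10.
    ordered = sorted(pe_addresses, key=lambda a: int(ipaddress.ip_address(a)))
    return ordered[isid % len(ordered)]
```

With two PEs, even and odd I-SID values are assigned to different
PEs, which spreads the backbone-to-access BUM load across the group.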
For BUM traffic from the IP access network destined towards the MPLS
backbone, the traffic will be sent as a unicast from the AN to one of
the PEs (depending on the IGP distance, or the ECMP hash in case of
equidistant PEs). This is because the AN does not support any
bridging function. The ingress PE, which receives the BUM traffic,
will flood it following the standard procedures of PBB-EVPN. This
includes applying any DF filtering on locally attached Ethernet
Segments. For example, assume that for a given I-SID (ISID1), PE1 is
the DF for the vES associated with the IP tunnel to AN1, whereas PE2
is the DF for the vES associated with the IP tunnel to AN2. For BUM
traffic sent by AN1 to PE1, PE1 will not flood the traffic
towards AN2 since it is not the DF for that vES.
The ingress PE would flood the BUM traffic to other PEs in the
redundancy group. In our example, this means that PE1 will flood the
BUM traffic to PE2. These PEs need to apply the segment split-horizon
filtering function to guarantee that the BUM traffic does not loop
back into the originating AN. To achieve this, a new procedure is
added to standard PBB-EVPN whereby the PE examines the source B-MAC
address on the BUM traffic, and recognizes that this B-MAC is
associated with a local vES. As such, the PE does not flood the BUM
traffic over that specific vES. This is effectively an extension of
the PBB-EVPN split-horizon functionality: in the standard, a single
B-MAC is associated with a given ESI, whereas here multiple B-MACs
are associated with the same ESI. Note that the PE would forward the
BUM traffic on other virtual Ethernet Segments in the same bridge
domain, and for which it is the assigned DF. Going back to our
example, PE2 in this case would forward the BUM traffic received from
PE1 towards AN2 but not towards AN1.
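The extended split-horizon check on a PE that receives BUM traffic
from another PE in the redundancy group can be sketched as follows.
The data model, the B-MAC strings, and the function name are
hypothetical; the DF state is taken as a given input for one I-SID.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class VES:
    name: str
    bmacs: FrozenSet[str]  # B-MACs the PEs in the RG assigned to this vES
    is_df: bool            # True if this PE is the DF on this vES


def flood_from_peer(local_ves: List[VES], src_bmac: str) -> List[str]:
    """Return the local vES's on which the BUM frame is flooded."""
    out = []
    for ves in local_ves:
        if src_bmac in ves.bmacs:
            continue  # split-horizon: frame originated on this vES
        if not ves.is_df:
            continue  # DF filtering towards the Ethernet Segment
        out.append(ves.name)
    return out


# State of PE2 in the example: DF for the vES of AN2 only.
pe2_ves = [
    VES("vES-AN1", frozenset({"bmac-pe1-an1", "bmac-pe2-an1"}), is_df=False),
    VES("vES-AN2", frozenset({"bmac-pe1-an2", "bmac-pe2-an2"}), is_df=True),
]
```

For a frame that PE1 received from AN1, flood_from_peer(pe2_ves,
"bmac-pe1-an1") yields only "vES-AN2". The split-horizon check
suppresses "vES-AN1" even if PE2 were the DF for it.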
4.3 Optimizing PE B-MAC Address Allocation
The mechanisms described thus far require the allocation of a B-MAC
address per IP tunnel (i.e. per CE). For high-scale deployments, this
allocation scheme defeats the scaling benefits of PBB-EVPN. Hence, a
more optimal B-MAC address allocation mechanism is needed. This
section describes how this can be achieved.
In order to reduce the number of B-MAC addresses required per PE, the
approach is to adopt the "local bias" mechanism defined in [Overlay]
for PBB-EVPN. This allows only a single B-MAC address to be assigned
to each PE for all its vES's (instead of one B-MAC per vES).
4.3.1 Known Unicast Traffic
The operation for known unicast traffic is similar to what was
described in section 4.2.1, albeit with a single source B-MAC address
being used for all traffic arriving at the PE from the virtual
Ethernet Segments associated with the IP tunnels.
4.3.2 BUM Traffic
The operation for BUM traffic can be broken down into three different
scenarios, as discussed next.
4.3.2.1 BUM Traffic from Ethernet Segment
For BUM traffic arriving from the (virtual) Ethernet Segment, the PE
follows the local bias mechanisms. That is, it floods the traffic
over all other local (virtual) Ethernet Segments in the same bridge-
domain irrespective of DF election state. The PE also floods the
traffic over the MPLS backbone.
4.3.2.2 BUM Traffic from PE in Redundancy Group
For BUM traffic arriving from another ingress PE in the same
Redundancy Group, the PE inspects the source B-MAC address in the
packets to identify the right replication list for the I-SID in
question. This replication list would exclude all (virtual) Ethernet
Segments that are common with the ingress PE (to prevent loops). The
PE would flood the BUM traffic over all (virtual) Ethernet Segments
in that specific replication list, subject to the DF filtering
constraints, i.e., only over segments for which it is the DF.
4.3.2.3 BUM Traffic from Remote PE
For BUM traffic arriving from a remote ingress PE that has no
segments in common with the egress PE, the latter would again examine
the source B-MAC address in the packets to identify the replication
list for the I-SID in question. This replication list would include
all (virtual) Ethernet Segments that are in the associated bridge-
domain. The PE would flood the traffic over all those segments while
observing the DF filtering rules.
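The three BUM scenarios of section 4.3.2 can be sketched with a single
helper, assuming one B-MAC per PE. The data model and all names are
hypothetical. A frame from a local segment is modeled with src_bmac
set to None; a frame from the MPLS backbone carries the B-MAC of the
ingress PE.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Segment:
    name: str
    peer_bmacs: FrozenSet[str]  # B-MACs of the other PEs on this vES
    is_df: bool                 # True if this PE is the DF on this vES


def replication_list(segments: List[Segment],
                     src_bmac: Optional[str],
                     ingress_segment: Optional[str] = None) -> List[str]:
    if src_bmac is None:
        # From a local segment: local bias, DF state is ignored.
        return [s.name for s in segments if s.name != ingress_segment]
    # From the backbone: skip segments shared with the ingress PE,
    # then apply DF filtering. A remote PE shares no segment, so
    # only the DF check applies to it.
    return [s.name for s in segments
            if src_bmac not in s.peer_bmacs and s.is_df]


# Hypothetical state of PE2: two vES's shared with PE1, one local only.
pe2 = [
    Segment("vES-AN1", frozenset({"bmac-pe1"}), is_df=False),
    Segment("vES-AN2", frozenset({"bmac-pe1"}), is_df=True),
    Segment("vES-AN3", frozenset(), is_df=True),
]
```

The exclusion of shared segments is safe because the ingress PE has
already flooded them itself under the local bias rule.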
5 Failure Scenarios
In this section we will discuss the operation of the solution in case
of various failure scenarios.
5.1 Link/Node Failure in Access Network
For link or node failures in the access network, which do not cause
the ANs to completely lose connectivity to the PEs, the failure will
trigger the IGP to recalculate the shortest path. This may cause the
Anycast traffic from some ANs to be steered towards new PEs. However,
no action will be required in the control plane on the PEs. Remote
PEs will automatically update their C-MAC/B-MAC associations through
data-plane learning. There are no transient loops, and the
convergence time is a function of how quickly the IGP can re-
converge.
5.2 PE Node Failure
Failure of the PE node is addressed by the fact that the IP tunnels
use Anycast addresses. The ANs do not need to explicitly react to the
failure in any special way, as the IGP re-convergence takes care of
the fail-over procedures. The B-MAC routes advertised by the failed
PE are withdrawn by the Route Reflector, which in turn causes the
remote PEs to flush all the C-MACs associated with those B-MACs. If this
happens before any new traffic arrives from the CE via the new PE,
temporary flooding would occur as expected. However, as soon as those
remote PEs start receiving traffic from the associated C-MACs over
the new PE, the remote PEs will update their C-MAC/B-MAC bindings
through normal data-plane learning.
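The withdraw, flush, and relearn sequence described above can be
sketched from the point of view of a remote PE. The class, its method
names, and the address strings are hypothetical; BGP signaling is
reduced to a single call that represents the B-MAC route withdrawal.

```python
from typing import Dict, Optional


class RemotePE:
    """Minimal model of a remote PE's C-MAC to B-MAC bindings."""

    def __init__(self) -> None:
        self.cmac_table: Dict[str, str] = {}

    def learn(self, src_cmac: str, src_bmac: str) -> None:
        self.cmac_table[src_cmac] = src_bmac

    def on_bmac_withdrawn(self, bmac: str) -> None:
        # Flush every C-MAC that was learned against the withdrawn B-MAC.
        self.cmac_table = {c: b for c, b in self.cmac_table.items()
                           if b != bmac}

    def lookup(self, dst_cmac: str) -> Optional[str]:
        # None means unknown unicast, i.e. temporary flooding.
        return self.cmac_table.get(dst_cmac)


per = RemotePE()
per.learn("cmac-host-a", "bmac-pe1")
per.learn("cmac-host-b", "bmac-pe2")
per.on_bmac_withdrawn("bmac-pe1")       # PE1 fails
flooding = per.lookup("cmac-host-a") is None
per.learn("cmac-host-a", "bmac-pe2")    # traffic now arrives via PE2
```

Between the withdrawal and the first frame received via PE2, traffic
to "cmac-host-a" is flooded; bindings learned against other B-MACs are
not affected.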
5.3 IP Tunnel Failure
IP tunnel failure occurs when there is no viable path, within the
access network, between the AN and a PE. The PE can detect such a
condition by using IGP route watch mechanisms. Upon detecting this
failure, the PE reacts with two actions:
- It withdraws all the Ethernet Segment routes associated with the
unreachable AN.
- It withdraws all the MAC Advertisement routes associated with the
B-MAC(s) of the affected vES's.
Similar to the PE node failure scenario above, this may result in
temporary flooding from remote PEs until traffic starts flowing in
the other direction from the CEs.
5.4 AN Node Failure
From the perspective of the PE, this failure is handled the same way
as the IP tunnel failure scenario. If the CE connected to the failed
AN is single-homed, then no redundancy would be available.
6 Security Considerations
There are no special security considerations beyond those of PBB-
EVPN.
7 IANA Considerations
TBD
8 References
8.1 Normative References
[KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC7623] Sajassi et al., "Provider Backbone Bridging Combined with
Ethernet VPN (PBB-EVPN)", RFC 7623, September 2015.
[PBB] IEEE, "IEEE Standard for Local and metropolitan area
networks - Media Access Control (MAC) Bridges and Virtual
Bridged Local Area Networks", Clauses 25 and 26, IEEE Std
802.1Q, DOI 10.1109/IEEESTD.2011.6009146.
[Overlay] Sajassi et al., "A Network Virtualization Overlay Solution
using EVPN", draft-ietf-bess-evpn-overlay-01, work in
progress, February 2015.
8.2 Informative References
Authors' Addresses
Ali Sajassi
Cisco
EMail: sajassi@cisco.com
Samer Salam
Cisco
EMail: ssalam@cisco.com
Luay Jalil
Verizon
EMail: luay.jalil@verizon.com
John Drake
Juniper
EMail: jdrake@juniper.net
Tapraj Singh
Juniper
EMail: tsingh@juniper.net