Network Working Group                                              X. Xu
Internet Draft                                                    Huawei
Category: Standards Track                                        H. Shah
                                                              Ciena Corp
                                                                 L. Yong
                                                                  Huawei
                                                                  Y. Fan
                                                           China Telecom
Expires: January 2013                                      July 13, 2012
Virtual Private LAN Service (VPLS) Using IS-IS
draft-xu-l2vpn-vpls-isis-04
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with
the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on January 13, 2013.
Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
Xu, et al. Expires January 13, 2013 [Page 1]
Internet-Draft VPLS Using IS-IS July 2012
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document.
Abstract
This document describes a light-weight Virtual Private LAN Service
(VPLS), referred to as IS-IS VPLS, which uses IS-IS for auto-
discovery and signaling. IS-IS VPLS is intended to be used as a
scalable cloud data center network solution.
Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC-2119 [RFC2119].
Table of Contents
1. Introduction
2. Terminology
3. Control Plane
   3.1. VPLS Info TLV
   3.2. Auto-discovery
   3.3. Signaling
4. Data Plane
   4.1. Data Encapsulation and Forwarding
      4.1.1. Unicast
      4.1.2. Multicast/Broadcast
   4.2. MAC Address Learning
      4.2.1. Data-plane based MAC Learning
      4.2.2. Control-plane based MAC Learning
5. ARP Broadcast Reduction
6. Security Considerations
7. IANA Considerations
8. Acknowledgements
9. References
   9.1. Normative References
   9.2. Informative References
10. Authors' Addresses
1. Introduction
To leverage economies of scale, today's cloud data centers tend to
contain tens to hundreds of thousands of servers. Moreover,
distributed computing and server virtualization are increasingly
adopted in these data centers. These factors lead to significant
scaling, performance, and operational challenges for cloud data
center networks. Therefore, to design a scalable and sustainable
cloud data center network solution, the following requirements must
be taken into account:
1) LAN Extension
To achieve service agility and business continuance, Virtual
Machine (VM) migration and High-Availability (HA) clusters are
commonly used in today's cloud data centers. These two
applications introduce special requirements on data center
networks. For instance, to allow a VM to be freely migrated from
one server to another within data centers while retaining its IP
address and MAC address, the LAN where the VM is located needs to
be extended across multiple racks or pods within data centers. In
addition, as some HA cluster applications rely on link-local
multicast for cluster member discovery and heartbeat, cluster
member servers are usually required to reside within the same
Layer2 domain. As a result, LAN extension becomes a fundamental
requirement for cloud data center networks.
2) VPN Instance Space Scalability
In modern cloud data centers, tens of thousands of tenants (e.g.,
enterprises or governments that consume computing resources such
as Infrastructure-as-a-Service (IaaS) offered by cloud service
providers) could be hosted over a shared network infrastructure.
For security and performance isolation, these tenants SHOULD be
isolated from one another. Hence, cloud data center networks
SHOULD provide a large enough VPN instance space for tenant
isolation.
3) Forwarding Table Scalability
In a highly virtualized cloud data center environment, it is not
uncommon for millions of VMs to be hosted on a common network
infrastructure. Therefore, the forwarding table of each network
device within cloud data center networks SHOULD be scalable
enough to keep up with that number of VMs.
4) Bandwidth Utilization Maximization
In modern cloud data centers, distributed computing is making
server-to-server traffic dominant over client-to-server traffic.
To meet the growing capacity demands for server-to-server
connectivity, shortest path forwarding and Equal Cost Multi-Path
(ECMP) have already become basic capabilities of cloud data
center networks.
5) ARP/Unknown Unicast Flood Suppression
It is well known that the flooding of ARP broadcast and unknown
unicast traffic within large Layer2 networks degrades the
performance of both networks and hosts. As the Layer2 domain is
extended across multiple racks or pods within data centers, this
problem becomes even worse. As such, suppressing the flooding of
ARP broadcast and unknown unicast traffic within cloud data
centers becomes increasingly desirable.
6) Flexibility for Tradeoffs between Bandwidth and State
Tenants within a cloud data center may differ greatly in terms of
VM numbers or multicast/broadcast traffic volume. For example,
some tenants may have few VMs while others may have many, and
some tenants may have a high volume of multicast/broadcast
traffic while others may have little or none. As such, there is
no "one size fits all" VPN multicast/broadcast delivery procedure
for these tenants. Hence, cloud data center networks SHOULD
support both the ingress replication procedure and the multicast
tree procedure for delivering VPN multicast/broadcast traffic, so
as to allow for an effective tradeoff between bandwidth usage and
state maintenance on a per-tenant basis according to the
particular conditions of each tenant.
7) Simplified Provisioning and Operation
A single cloud data center may well have thousands of physical
switches (e.g., ToR switches). A network of such scale poses a
big challenge for data center operators. Therefore, simplifying
and even automating network provisioning and operation becomes
very important for cloud data center networks.
8) Reuse Existing Operation Experiences
IP, as a proven routing technology, is already used in most of
today's data centers. Furthermore, those service providers who
are planning to build cloud data centers have years of experience
in operating MPLS-based L2VPN and/or L3VPN services which can be
transported over MPLS or IP-enabled Packet Switching Networks
(PSNs). To allow data center operators to reuse their existing
network operation experience, cloud data center network
solutions SHOULD reuse existing technologies and protocols where
appropriate, rather than reinventing the wheel. That is why
there is increasing interest in the industry in how to adopt or
adapt the existing L2VPN and/or L3VPN technologies for cloud data
center networks.
Although the existing VPLS solutions [RFC4761, RFC4762] can meet
most of the above requirements, there is still room for
improvement. First, the existing VPLS solutions require
establishing a full mesh of pseudo-wires (PWs) between PE routers,
which implies a significant scaling challenge in the cloud data
center environment, especially when thousands of ToR switches are
configured as PE routers and tens of thousands of VPLS instances
are provisioned on them. Second, the ingress replication procedure
used for delivering multicast and broadcast traffic in existing VPLS
solutions is not optimal from the bandwidth utilization perspective.
Third, existing VPLS solutions require running one or more separate
protocols (e.g., LDP and/or BGP) besides the IGP within data centers
for VPLS auto-discovery and signaling, which adds complexity to
network management and operation, especially when BGP and VPLS
parameters have to be configured on thousands of ToR switches and
thousands of BGP peers have to be configured on aggregation or core
switches.
Hence, this document describes a light-weight VPLS solution,
referred to as IS-IS VPLS, which uses IS-IS [IS-IS][RFC1195] for
VPLS auto-discovery and signaling. IS-IS VPLS retains almost all
advantages of existing VPLS solutions (e.g., split-horizon
forwarding) while overcoming their shortcomings as mentioned above.
For example, there is no need for full-mesh PWs between PE routers
in IS-IS VPLS. Furthermore, it allows data center operators to
flexibly make tradeoffs between bandwidth and state on a per-tenant
basis. Finally, an already deployed IGP within cloud data centers
(i.e., IS-IS), rather than one or more dedicated protocols, is used
for providing VPLS services.
2. Terminology
This memo makes use of the terms defined in [RFC4664], [VPLS-MCAST],
[RFC4761] and [RFC4762].
3. Control Plane
There are two primary functions of the VPLS control plane: auto-
discovery and signaling. In IS-IS VPLS, these two functions are
accomplished by using a single extended IS-IS TLV, referred to as
the VPLS Info TLV (see section 3.1). By propagating VPLS Info TLVs
that contain VPLS information within data center networks, PE
routers automatically discover which other PE routers are part of a
given VPLS instance and their assigned VPLS labels for that VPLS
instance. Furthermore, according to the IS-IS specification defined
in [IS-IS] and [RFC1195], IS-IS routers ignore unknown TLVs in
the LSP and pass them on to other neighbors unchanged. Therefore, P
routers do not need to process the VPLS Info TLV; they simply
synchronize the Link State PDUs (LSPs) containing such TLVs with
their adjacent IS-IS neighbors as normal. In addition, to overcome
the 255-byte TLV limit, IS-IS allows multiple TLVs of a given type
to be interpreted as additive rather than mutually exclusive (see
section 6.4 in [RFC5311] for more details). Therefore, there is no
scaling issue in using IS-IS for propagating a large amount of VPLS
information.
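
As a rough illustration of the additive-TLV point, the following
sketch estimates how many VPLS Info TLVs a PE router would need for a
given number of attached VPLS instances. It assumes, which the text
does not state explicitly, that every TLV of this type repeats the
16-octet PE address and then carries 8-octet (VPLS ID, VPLS Label)
pairs as drawn in section 3.1. The function names are illustrative
only.

```python
import math

MAX_TLV_VALUE = 255   # the IS-IS TLV length field is a single octet
PE_ADDRESS_LEN = 16   # 128-bit PE address field
ENTRY_LEN = 8         # 4-octet VPLS ID word + 4-octet VPLS Label word


def entries_per_tlv() -> int:
    """Number of (VPLS ID, label) pairs that fit in one TLV value field."""
    return (MAX_TLV_VALUE - PE_ADDRESS_LEN) // ENTRY_LEN


def tlvs_needed(instance_count: int) -> int:
    """Number of TLVs needed, assuming each TLV repeats the PE address."""
    if instance_count <= 0:
        return 0
    return math.ceil(instance_count / entries_per_tlv())
```

Under these assumptions a single TLV holds 29 instances, so a PE
router attached to 1000 VPLS instances would originate 35 TLVs,
possibly spread over several LSP fragments.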
3.1. VPLS Info TLV
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+
|Type=VPLS Info |
+-+-+-+-+-+-+-+-+
| Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
| PE's IPv4 or IPv6 Address |
| (128 bits) |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Resv (12 bits) | VPLS ID (20 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Resv (12 bits) | VPLS Label (20 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Resv (12 bits) | VPLS ID (20 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Resv (12 bits) | VPLS Label (20 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type
Type code for VPLS Info TLV: TBD.
Length
Total number of bytes contained in the value field.
PE's IPv4 or IPv6 Address
This 128-bit field is filled with one of the originating PE
router's IPv4 or IPv6 addresses that are reachable across
the IP backbone. The address in this field SHOULD be used as
the tunnel destination address by remote PE routers when,
acting as ingress PE routers, they tunnel a customer Ethernet
frame to the originating PE router. If the IP address is
IPv4, the last four octets of this field are filled with the
IPv4 address while the remaining part is set to zero. In
other words, it is filled with an IPv4-mapped IPv6 address.
VPLS ID
This field is filled with a 20-bit globally unique VPLS ID
for a particular attached VPLS instance. If a larger VPLS ID
space is required, the leftmost 12-bit reserved field could
be used together with the VPLS ID field as an extended VPLS
ID field. That is to say, the whole 32 bits are filled with
a 32-bit extended VPLS ID value.
VPLS Label
This field is filled with a 20-bit MPLS label corresponding
to the VPLS instance which is identified by the VPLS ID.
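
The field layout above can be sketched as a small encoder and decoder.
This is a minimal illustration and not a normative encoding: the type
code 250 is a hypothetical placeholder (the real code point is TBD),
the function names are invented for this example, and the IPv4 case
follows the literal wording "the remaining part is set to zero". Note
that the conventional IPv4-mapped IPv6 form instead places 0xFFFF
immediately before the IPv4 address; which of the two is intended is
left open here.

```python
import ipaddress
import struct

VPLS_INFO_TLV_TYPE = 250  # hypothetical placeholder; real value is TBD


def pe_address_field(addr: str) -> bytes:
    """Return the 16-octet PE address field for an IPv4 or IPv6 address."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        # Literal reading: IPv4 in the last four octets, the rest zero.
        return bytes(12) + ip.packed
    return ip.packed


def encode_vpls_info(pe_addr: str, entries) -> bytes:
    """Encode one TLV carrying (vpls_id, label) pairs for one PE router."""
    value = pe_address_field(pe_addr)
    for vpls_id, label in entries:
        if not (0 <= vpls_id < 2**20 and 0 <= label < 2**20):
            raise ValueError("VPLS ID and label must fit in 20 bits")
        # Each value sits in the low 20 bits of a 32-bit word, so the
        # 12 reserved high-order bits are sent as zero.
        value += struct.pack("!II", vpls_id, label)
    if len(value) > 255:
        raise ValueError("value field exceeds 255 octets")
    return struct.pack("!BB", VPLS_INFO_TLV_TYPE, len(value)) + value


def decode_vpls_info(tlv: bytes):
    """Decode one TLV into (pe_address_bytes, [(vpls_id, label), ...])."""
    tlv_type, length = struct.unpack_from("!BB", tlv)
    value = tlv[2:2 + length]
    if (tlv_type != VPLS_INFO_TLV_TYPE or len(value) != length
            or length < 16 or (length - 16) % 8 != 0):
        raise ValueError("malformed VPLS Info TLV")
    entries = []
    for offset in range(16, length, 8):
        vpls_id, label = struct.unpack_from("!II", value, offset)
        entries.append((vpls_id & 0xFFFFF, label & 0xFFFFF))
    return value[:16], entries
```

For example, encoding PE address 192.0.2.1 with two instances yields a
34-octet TLV (2 octets of header, 16 of address, 16 of entries), and
decoding it returns the same two (VPLS ID, label) pairs.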
3.2. Auto-discovery
In IS-IS VPLS, each PE router can automatically discover which
other PE routers are part of a given VPLS instance that is
identified by the globally unique VPLS ID. This allows each PE
router to be configured with only the identities of the attached
VPLS instances, and not with the identities of all the other PE
routers belonging to these VPLS instances.
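
The discovery step can be pictured as building a per-instance
membership table from the VPLS information advertised by other PE
routers. The sketch below assumes the advertisements have already been
parsed into (PE address, list of (VPLS ID, label)) tuples; the function
name and data shapes are illustrative and not defined by this document.

```python
from collections import defaultdict


def build_membership(advertisements, local_vpls_ids):
    """Map each locally attached VPLS ID to {remote PE address: label}.

    advertisements: iterable of (pe_address, [(vpls_id, label), ...]).
    local_vpls_ids: set of VPLS IDs configured on this PE router.
    """
    members = defaultdict(dict)
    for pe_address, entries in advertisements:
        for vpls_id, label in entries:
            # Instances not attached locally are of no interest here.
            if vpls_id in local_vpls_ids:
                members[vpls_id][pe_address] = label
    return dict(members)
```

A PE router configured only with VPLS ID 100 would thus end up with the
addresses and labels of every remote PE router that advertised instance
100, without any per-peer configuration.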
3.3. Signaling
In IS-IS VPLS, a PE router advertises the same VPLS label for a
given VPLS instance to all other PE routers. As such, this VPLS
label only identifies a particular VPLS instance, rather than
identifying both a particular VPLS instance and the corresponding
ingress PE router as a PW label does.
4. Data Plane
4.1. Data Encapsulation and Forwarding
Since the VPLS label in IS-IS VPLS is only used for identifying a
particular VPLS instance, in the data-plane based MAC learning case
(see section 4.2.1), an IP-based tunnel (e.g., GRE (Generic Routing
Encapsulation)/IP [RFC4023] or UDP [MPLS-in-UDP]) is RECOMMENDED
as the PE-to-PE tunnel technology. As such, during the MAC
learning process, the egress PE router can easily determine the
ingress PE router of a received VPLS packet from the tunnel source
address of that packet. Note that, in the control-plane based MAC
learning case (see section 4.2.2), there is no special requirement
for PE-to-PE tunnel technology in comparison with existing VPLS
solutions. The following sub-sections are based on the assumption
that IP tunnels are used between PE routers.
4.1.1. Unicast
For known unicast, MAC-in-MPLS-in-IP encapsulation [RFC4448] is used.
For unknown unicast, the encapsulation and forwarding procedures are
the same as those for multicast/broadcast described in the following
section.
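
To make the MAC-in-MPLS-in-IP layering concrete, the sketch below
builds the tunnel payload for the GRE variant: a basic 4-octet GRE
header whose protocol type is the MPLS unicast EtherType (0x8847),
followed by one MPLS label stack entry carrying the VPLS label, followed
by the customer Ethernet frame. The outer IP header is left to the
sending stack, the optional control word of [RFC4448] is omitted, and
the function names are illustrative assumptions.

```python
import struct

GRE_PROTO_MPLS_UNICAST = 0x8847  # EtherType for MPLS unicast


def mpls_label_entry(label: int, ttl: int = 255,
                     bottom: bool = True, tc: int = 0) -> bytes:
    """Pack one 32-bit MPLS label stack entry: label, TC, S bit, TTL."""
    if not 0 <= label < 2**20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8 or not 0 <= ttl < 256:
        raise ValueError("TC must fit in 3 bits and TTL in 8 bits")
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


def encapsulate(frame: bytes, vpls_label: int) -> bytes:
    """Return GRE header + MPLS label entry + customer Ethernet frame."""
    # Basic GRE header: no checksum/key/sequence flags, version 0.
    gre_header = struct.pack("!HH", 0, GRE_PROTO_MPLS_UNICAST)
    return gre_header + mpls_label_entry(vpls_label) + frame
```

The ingress PE router would use the VPLS label advertised by the egress
PE router for the instance in question and send the result to the
address taken from that PE router's VPLS Info TLV.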
4.1.2. Multicast/Broadcast
There are two major modes for delivering multicast and broadcast in
IS-IS VPLS: ingress replication mode and P-Multicast tree mode. P-
Multicast tree mode further includes two sub-options: non-
aggregative P-Multicast tree mode where one P-Multicast distribution
tree in the IP backbone is exclusively used by a single VPLS
instance, and aggregative P-Multicast tree mode in which one P-
Multicast tree is shared by more than one VPLS instance. The
corresponding encapsulation for each mode is described in the
following sub-sections.
4.1.2.1. Ingress Replication Mode
In the ingress replication mode, an ingress PE router forwards the
received customer multicast/broadcast frames towards remote PE
routers in separate tunnels. Hence, the encapsulation in this mode
is the same as that for unicast.
4.1.2.2. Non-aggregative P-Multicast Tree Mode
In the non-aggregative P-Multicast tree mode, MAC-in-IP
encapsulation is used directly since the destination IP address
(i.e., multicast address) contained in the IP-based tunnel header is
enough for egress PE routers to determine which VPLS instance the
received VPLS packet belongs to.
4.1.2.3. Aggregative P-Multicast Tree Mode
For the aggregative P-Multicast tree mode, MAC-in-MPLS-in-IP
encapsulation SHOULD be used. Furthermore, the MPLS label here
SHOULD be treated as an upstream-assigned label [RFC5331]. For
example, assume a PE router has assigned a local label L for a given
VPLS instance and has already advertised that VPLS information using
the VPLS Info TLV. When this PE router wants to send a multicast
VPLS packet of that VPLS instance through the corresponding
aggregative P-Multicast tree, label L is carried in that VPLS packet
as an upstream-assigned label.
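
The three delivery modes differ in how an egress PE router maps a
received packet to a VPLS instance. The sketch below is one reading of
the text above, not a specified procedure: with ingress replication the
label is the one the egress PE router itself advertised, in the
non-aggregative tree mode the multicast group address identifies the
instance, and in the aggregative tree mode the label is interpreted in
the context of the sending PE router. The table names, the mode strings,
and the way the group-to-instance mapping is obtained are all
assumptions made for illustration.

```python
def vpls_for_packet(mode, src_pe, dst_ip, label,
                    local_labels, group_to_vpls, advertised):
    """Return the VPLS ID for a received packet, or None if unknown.

    local_labels:  {label: vpls_id} assigned by this (egress) PE router.
    group_to_vpls: {multicast group address: vpls_id}.
    advertised:    {(remote PE address, label): vpls_id}, built from the
                   VPLS Info TLVs received from remote PE routers.
    """
    if mode == "ingress-replication":
        # Same encapsulation as unicast: our own label names the instance.
        return local_labels.get(label)
    if mode == "non-aggregative":
        # MAC-in-IP: no label, the destination group names the instance.
        return group_to_vpls.get(dst_ip)
    if mode == "aggregative":
        # Upstream-assigned label: look it up in the sender's context.
        return advertised.get((src_pe, label))
    raise ValueError(f"unknown mode: {mode}")
```

The lookup for the aggregative mode relies on the tunnel source address
to select the label context, which is one more reason an IP-based
tunnel is convenient here.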
4.2. MAC Address Learning
MAC addresses of local CE hosts are still learnt by PE routers
in the same way as by normal bridges.
As for learning MAC addresses of remote CE hosts, IS-IS VPLS
provides two options: data-plane based MAC learning and control-
plane based MAC learning. If unknown unicast flood suppression is
required even at the cost of consuming more forwarding table
resources, the control-plane based MAC learning option can be
considered. Otherwise, the data-plane based MAC learning option
SHOULD be preferred.
4.2.1. Data-plane based MAC Learning
Upon receiving a VPLS packet from a remote PE router, the MPLS
label contained in the packet (or the tunnel destination IP address
in the non-aggregative P-Multicast tree case) is used to determine
the particular VPLS instance that the packet belongs to, while the
tunnel source IP address is used to tell from which ingress PE
router the packet was sent. The source MAC address of the customer
frame is then associated with that VPLS instance and ingress PE
router.
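
A minimal sketch of this learning step follows, assuming the labelled
(unicast or ingress replication) case in which the label was assigned
by the receiving PE router. The class and method names are illustrative,
and aging of entries is omitted.

```python
class MacTable:
    """Per-PE table of remote MAC addresses learnt from the data plane."""

    def __init__(self, label_to_vpls):
        # {locally assigned VPLS label: VPLS ID}
        self.label_to_vpls = label_to_vpls
        # {(VPLS ID, MAC address): remote PE tunnel address}
        self.entries = {}

    def learn(self, label, tunnel_src, inner_src_mac):
        """Record that inner_src_mac sits behind the PE at tunnel_src."""
        vpls_id = self.label_to_vpls[label]
        self.entries[(vpls_id, inner_src_mac)] = tunnel_src
        return vpls_id

    def lookup(self, vpls_id, dst_mac):
        """Return the remote PE address for dst_mac, or None if unknown."""
        return self.entries.get((vpls_id, dst_mac))
```

A lookup miss corresponds to the unknown unicast case, in which the
frame is delivered using the multicast/broadcast procedures.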
4.2.2. Control-plane based MAC Learning
In IS-IS VPLS, MAC addresses of remote CE hosts can also be learnt
on the control plane by using the MAC-Reachability TLV defined in
[RFC6165].
Upon learning the MAC addresses of their local CE hosts, PE routers
immediately advertise these MAC addresses to other PE routers
of the same VPLS instance by using the MAC-Reachability TLV defined
in [RFC6165]. One or more MAC-Reachability TLVs are carried in an LSP
which in turn is encapsulated with an Ethernet header. The source
MAC address is the originating PE router's MAC address whereas the
destination MAC address is a to-be-defined multicast MAC address
specifically identifying IS-IS VPLS PE routers. Such LSPs are
forwarded towards remote PE routers as customer Ethernet frames by
ingress PE routers. Egress PE routers receiving the above packets
SHOULD intercept them and process them accordingly. The IP address
of the PE router originating these MAC routes can be derived either
from the "IP Interface Address" field contained in the corresponding
LSPs (note that the IP address here SHOULD be identical to that
contained in the VPLS Info TLV) or from the tunnel source IP address
of the VPLS packet containing such MAC routes.
Since these LSPs are fully transparent to P routers, there is no
impact on the control plane of P routers. More details about the
control-plane based MAC learning procedure are for further study.
5. ARP Broadcast Reduction
To suppress ARP broadcast flooding within a given VPLS instance, an
ARP cache mechanism can be enabled on PE routers. For more details
about the ARP cache mechanism, please refer to [ARP-Reduction].
6. Security Considerations
This document doesn't introduce additional security risk to IS-IS
and VPLS, nor does it provide any additional security feature for
IS-IS and VPLS.
7. IANA Considerations
The IS-IS TLV type code for the VPLS Info TLV needs to be assigned
by IANA.
8. Acknowledgements
Thanks to Tony Li, Peter Ashwood-Smith, Phil Bedard, Kris Price,
Shahram Davari, Adrian Farrel, Giles Heron and Christian Jacquenet
for their valuable comments on this proposal.
9. References
9.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
9.2. Informative References
[IS-IS] ISO/IEC 10589, "Intermediate System to Intermediate System
Intra-Domain Routing Exchange Protocol for use in
Conjunction with the Protocol for Providing the
Connectionless-mode Network Service (ISO 8473)", 2005.
[RFC1195] Callon, R., "Use of OSI IS-IS for Routing in TCP/IP and
Dual Environments", RFC 1195, December 1990.
[RFC5311] McPherson, D., Ginsberg, L., Previdi, S., and M. Shand,
"Simplified Extension of Link State PDU (LSP) Space for
IS-IS", RFC 5311, February 2009.
[RFC4448] Martini, L., Rosen, E., El-Aawar, N., and G. Heron,
"Encapsulation Methods for Transport of Ethernet over MPLS
Networks", RFC 4448, April 2006.
[RFC4023] Worster, T., Rekhter, Y., and E. Rosen, "Encapsulating
MPLS in IP or Generic Routing Encapsulation (GRE)", RFC
4023, March 2005.
[MPLS-in-UDP] Xu, X., et al., "Encapsulating MPLS in UDP",
draft-xu-mpls-in-udp-01.txt (work in progress), May 2012.
[RFC6165] Banerjee, A. and D. Ward, "Extensions to IS-IS for Layer-2
Systems", RFC 6165, February 2011.
[VPLS-MCAST] Aggarwal, R., Kamite, Y., and L. Fang, "Multicast in
VPLS", draft-ietf-l2vpn-vpls-mcast-08.txt (work in progress),
October 2010.
[ARP-Reduction] Shah, H., Ghanwani, A., and N. Bitar, "ARP Broadcast
Reduction for Large Data Centers",
draft-shah-armd-arp-reduction-02.txt (work in progress),
October 2011.
[RFC5331] Aggarwal, R., Rekhter, Y., and E. Rosen, "MPLS Upstream
Label Assignment and Context-Specific Label Space", RFC 5331,
August 2008.
[RFC4664] Andersson, L., Ed., and E. Rosen, Ed., "Framework for Layer
2 Virtual Private Networks (L2VPNs)", RFC 4664, September 2006.
[RFC4761] Kompella, K. and Y. Rekhter, "Virtual Private LAN Service
(VPLS) Using BGP for Auto-Discovery and Signaling", RFC
4761, January 2007.
[RFC4762] Lasserre, M. and V. Kompella, "Virtual Private LAN Service
(VPLS) Using Label Distribution Protocol (LDP) Signaling",
RFC 4762, January 2007.
10. Authors' Addresses
Xiaohu Xu
Huawei Technologies,
Beijing, China
Email: xuxiaohu@huawei.com
Himanshu Shah
Ciena Corp
Email: hshah@ciena.com
Lucy Yong
Huawei USA
1700 Alma Dr. Suite 500
Plano, TX 75075, US
Email: lucyyong@huawei.com
Yongbing Fan
China Telecom
Guangzhou, China.
Phone: +86 20 38639121
Email: fanyb@gsta.com