Network Virtualization Overlays Working Group                     L. Xia
Internet-Draft                                                     Q. Wu
Intended status: Standards Track                                  Huawei
Expires: December 30, 2013                                 June 28, 2013
Tenant system information discovery approaches Gap analysis
draft-wu-nvo3-mac-learning-arp-02
This document analyzes various protocol solutions for the discovery of tenant system information (e.g., MAC and IP addresses) in virtualization environments (e.g., MAC in MAC, MAC in IP, IP in IP) and identifies the gaps against NVO3 control plane and data plane requirements.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on December 30, 2013.
Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
In this document, tenant system information refers to the L2 and L3 addresses of a VM. As described in [I.D-ietf-nvo3-framework], an L2 NVE needs to be able to determine the MAC addresses of the tenant systems, and an L3 NVE needs to be able to determine their IP addresses.
This can be achieved mainly in three ways: data plane learning, ARP, and control plane distribution (e.g., by BGP or IS-IS). This document analyzes various protocol solutions for the discovery of tenant system information (e.g., MAC and IP addresses) in virtualization environments (e.g., MAC in MAC, MAC in IP, IP in IP) and identifies the gaps against NVO3 control plane and data plane requirements.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC2119 [RFC2119].
Tenant system information discovery can be achieved using dynamic data plane learning, ARP, or control plane distribution. This document addresses how tenant system information discovery works in the overlay network environment. Figure 1 shows the NVO3 reference architecture for tenant system information discovery. The reference architecture assumes that:
                       ,---------.
                     ,'  Backend  `.
                    (      NVA      )
                     `.           ,'
                       `-+-----+-'
                         |     |
                   .--..--. .--. ..
                  (    '          '.--.
               .-.'        L3          '
              (         Overlay         )
               (                     '-'
                '--'._.'.-._.'.-._.'
                 //              \\
  NVE X =       //                \\       NVE Y =
  (MAC_X,IP_X) +-------+      +-------+    (MAC_Y,IP_Y)
               | NVE X |      | NVE Y |
               +-------+      +-------+
            .--.  |  .--.   .--.  |  .--.
           (   DC Site X )  (   DC Site Y )
            '--'.__.'--'    '--'.__.'--'
              /      \        /      \
     '--------' '--------' '--------' '--------'
     : Tenant : : Tenant : : Tenant : : Tenant :
     : SystemA: : SystemC: : SystemD: : SystemB:
     '--------' '--------' '--------' '--------'
      TSID=                             TSID=
      (VNID,MAC_A,IP_A)                 (VNID,MAC_B,IP_B)
Figure 1: Example of NVO3 reference architecture for tenant system information discovery
Here we give an example of tenant system information discovery in a large layer 2 domain using NVO3 with the traditional approach to MAC address learning. The packet flow and control plane operation are as follows:
The issues with tenant system information discovery are as follows:
Currently, three main solutions, or a combination of them, can be used to perform tenant system information discovery: dynamic data plane learning, ARP, and control plane distribution (with two options: centralized or distributed). Additionally, the ARP proxy [RFC1027] mechanism can be used to prevent ARP flooding in the core network and to limit the MAC table size of NVEs and hosts. A brief analysis of these solutions and the associated protocols follows.
Shortest Path Bridging (SPB) [SPB] and TRILL [TRILL] are two different IS-IS based overlay methods that operate over L2 Ethernet networks. Both use MAC in MAC encapsulation and have the same default MAC address learning method:
In the centralized approach, TRILL may use TRILL ESADI to distribute inner MAC addresses among all the RBridges; SPB, however, does not support the ESADI distribution mechanism. In the distributed approach, SPB and TRILL may use a combination of the above three methods.
The ARMD WG examined data center scaling issues with a focus on address resolution and developed a problem statement document [RFC6820]. In that document, the scaling issues of MAC address learning related to the overlay-based approach are listed as follows:
In order to tackle the above problems, SARP [SARP] seamlessly supports Layer 2 network virtualization services over the overlay network and significantly reduces their complexity in terms of table size and performance. The overlay networks are only required to map the MAC addresses of the SARP proxies, instead of the MAC addresses of the destination end hosts, to the correct tunnel.
BGP/MPLS IP VPNs [RFC4364] provide IP Virtual Private Networks (VPNs) for customers and support VPN traffic isolation, overlapping addresses, and separation between customer networks. The BGP/MPLS control plane is used to distribute both the VPN labels and the tenant system IP addresses that are used to identify the customer. However, BGP/MPLS IP VPN neither supports interconnection with Data Center (DC) overlay networks nor provides a virtual end-to-end tenant network service to tenant systems in the BGP/MPLS IP VPN. It also has scalability problems when the IP addresses of a large number of VMs need to be propagated in the control plane in virtualized data center environments.
For an L3 overlay node, the overlay node only needs to determine the IP addresses of the tenant systems; it does not need to know the MAC address of the destination system. The overlay tunnels the L3 traffic from the tenant system in encapsulated form to the final destination, so the MAC address of the destination end system is irrelevant for the inner L3 packet. The overlay node can therefore answer any address resolution query with its own MAC address or with a virtual MAC address. In [I.D-ietf-l3vpn-end-system], the NVE uses XMPP to exchange information with the tenant system and answers address resolution queries from the tenant system with a virtual router MAC address.
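The address resolution behavior of an L3 overlay node described above can be sketched as follows. This is a minimal illustration, not taken from [I.D-ietf-l3vpn-end-system]: the `ArpRequest` and `ArpReply` types, the `answer_arp` function, and the MAC value are hypothetical names and values introduced here. The point is only that the reply carries the same virtual MAC regardless of which IP address was queried.

```python
from dataclasses import dataclass

# Hypothetical virtual router MAC (a locally administered address chosen
# for this sketch; the draft does not specify a value).
VIRTUAL_ROUTER_MAC = "02:00:00:00:00:01"


@dataclass(frozen=True)
class ArpRequest:
    sender_mac: str   # MAC of the querying tenant system
    sender_ip: str    # IP of the querying tenant system
    target_ip: str    # IP address being resolved


@dataclass(frozen=True)
class ArpReply:
    sender_mac: str   # MAC offered as the owner of sender_ip
    sender_ip: str    # the IP address that was resolved
    target_mac: str   # MAC of the tenant system that asked
    target_ip: str    # IP of the tenant system that asked


def answer_arp(request: ArpRequest,
               nve_mac: str = VIRTUAL_ROUTER_MAC) -> ArpReply:
    """Answer any query with the overlay node's (virtual) MAC address.

    No per-destination MAC state is consulted, because the L3 overlay
    node forwards on the inner IP address only.
    """
    return ArpReply(
        sender_mac=nve_mac,
        sender_ip=request.target_ip,
        target_mac=request.sender_mac,
        target_ip=request.sender_ip,
    )


if __name__ == "__main__":
    query = ArpRequest("02:00:00:00:00:0a", "10.0.0.1", "10.0.0.2")
    print(answer_arp(query))
```

Because the answer never depends on the queried address, the overlay node keeps no MAC table for remote tenant systems; only IP reachability has to be discovered.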
In order to propagate tenant system information across the whole overlay network environment, [I.D-ietf-l3vpn-end-system] uses a Route Server to gather the VPN membership on each Forwarder and the IP addresses currently associated with each virtual interface of a tenant system, and to advertise them to the BGP speaker. In addition, the BGP speaker can interact with the Route Server to generate tenant system information updates to the upstream end systems.
Ethernet Virtual Private Networks (E-VPNs) [I-D.ietf-l2vpn-evpn] provide an emulated L2 service in which each tenant has its own Ethernet network over a common IP or MPLS infrastructure. PBB-EVPN [I-D.ietf-l2vpn-pbb-evpn] combines PBB and E-VPN. Both use BGP for MAC address distribution over the core MPLS/IP network, and use ARP or data plane snooping to learn the MAC addresses of locally attached hosts. In other words, the mapping table information <VNID, IP_A, NVE_X> should be distributed to all the remote overlay nodes that belong to the same VN. After that, the tenant system information <VNID, IP_A, MAC_X> is distributed from the remote overlay nodes to all the remote tenant systems. Once all the tenant system information is populated, an overlay node processes a packet from a tenant system by looking up the destination TSID=<VNID, IP_B> in its mapping table to determine which tunnel the packet needs to be sent to.
The analysis of their MAC address learning methods is as follows:
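The mapping table lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `MappingTable` class and its `install`, `withdraw`, and `select_tunnel` methods are hypothetical names introduced here, the addresses are documentation examples, and the way entries arrive (BGP or otherwise) is left out entirely.

```python
from typing import Dict, Optional, Tuple

# A TSID as used in the text: (VNID, tenant system IP address).
TSID = Tuple[int, str]


class MappingTable:
    """Per-NVE table mapping a destination TSID to the remote NVE."""

    def __init__(self) -> None:
        self._entries: Dict[TSID, str] = {}

    def install(self, vnid: int, ip: str, nve: str) -> None:
        # Called when a <VNID, IP, NVE> mapping is received from the
        # control plane (the distribution protocol is not modeled).
        self._entries[(vnid, ip)] = nve

    def withdraw(self, vnid: int, ip: str) -> None:
        # Remove a mapping, e.g. when a tenant system goes away.
        self._entries.pop((vnid, ip), None)

    def select_tunnel(self, vnid: int, dst_ip: str) -> Optional[str]:
        # Return the NVE to tunnel to, or None if the TSID is unknown.
        return self._entries.get((vnid, dst_ip))


table = MappingTable()
table.install(10, "192.0.2.1", "NVE_X")   # Tenant System A behind NVE X
table.install(10, "192.0.2.2", "NVE_Y")   # Tenant System B behind NVE Y
table.install(20, "192.0.2.2", "NVE_X")   # same IP, different VN

if __name__ == "__main__":
    print(table.select_tunnel(10, "192.0.2.2"))
    print(table.select_tunnel(20, "192.0.2.2"))
```

Keying the table on the (VNID, IP) pair rather than on the IP address alone is what allows overlapping tenant address spaces: the same IP address resolves to different NVEs in different VNs.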
Pros:
Con: An E-VPN PE sends one BGP MAC Advertisement Route per customer/client MAC (C-MAC) address. This raises scalability problems in virtualized data center environments, where the number of virtual machines (VMs) is very large.
VPLS is an L2 VPN technology. VPLS uses ARP and data plane learning for L2 tenant system information discovery; this information is not advertised or distributed via a BGP/LDP control plane. The analysis of this method is as follows:
Pros:
Cons:
LISP [RFC6830] essentially provides an IP over IP overlay in which the inner addresses are end station identifiers and the outer IP addresses represent the location of the end station within the core IP network topology. [draft-maino-nvo3-lisp-cp-02] discusses L2 over L3 LISP encapsulation and proposes a LISP Mapping System for ARP resolution, in order to eliminate the flooding of ARP traffic and further reduce the need for multicast in the underlay network. This approach relies on the mapping system for tenant system information distribution and involves a Map-Request/Map-Reply message exchange between the overlay node and the mapping system. With the LISP Mapping System, the scalability of tenant system information discovery is improved. The packet flow and control plane operation are as follows:
The following table compares several tenant system information discovery methods from different aspects under the same network topology and scale.
+-----------+------------+----------+---------------+--------------+
| TS        | Forwarding | Packet   | Control plane | Directory    |
| discovery | table      | flooding | distribution  | support      |
| method    | size       | impact   | support       |              |
+-----------+------------+----------+---------------+--------------+
| SPB &     | Medium     | Medium   | Yes           | TRILL: Yes   |
| TRILL     |            |          |               | SPB: No      |
+-----------+------------+----------+---------------+--------------+
| ARMD &    | Small      | Medium   | No            | No           |
| SARP      |            |          |               |              |
+-----------+------------+----------+---------------+--------------+
| LISP +    | Medium     | Medium   | Yes           | LISP Mapping |
| ARP proxy |            |          |               | System       |
+-----------+------------+----------+---------------+--------------+
| BGP/MPLS  | Large      | Large    | Yes           | No           |
| IP VPN    |            |          |               |              |
+-----------+------------+----------+---------------+--------------+
| BGP/MPLS  | Large      | Large    | Yes           | No           |
| Ethernet  |            |          |               |              |
| VPN       |            |          |               |              |
+-----------+------------+----------+---------------+--------------+
| VPLS +    | Medium     | Small    | Yes           | No           |
| ARP proxy |            |          |               |              |
+-----------+------------+----------+---------------+--------------+

Table 1: Comparison of several tenant system information discovery methods
There are three ways to perform tenant system information discovery: data plane learning, control plane ARP learning, and control plane distribution. In a large layer 2 domain, the MAC address cannot be learned simply by looking at the outer layer 2 header; instead, deeper parsing of the inner Ethernet header is required, which introduces considerable processing overhead. To address this issue, control plane distribution has been proposed: it carries both MAC addresses and IP addresses and eliminates the data plane learning issue described above. However, a distribution protocol is needed. How such a protocol can propagate tenant system information and mapping table information at large scale and more efficiently is still under study.
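The extra parsing that data plane learning requires can be illustrated with a short sketch. The encapsulation layout used here (a 14-byte outer Ethernet header, a 4-byte VNID field, then the inner Ethernet frame) is a simplifying assumption made for this example only; it is not the PBB, TRILL, or any other real header format. The function and variable names are likewise hypothetical.

```python
import struct
from typing import Dict, Tuple

ETH_HDR_LEN = 14   # dst MAC (6) + src MAC (6) + EtherType (2)
VNID_LEN = 4       # hypothetical 4-byte VN identifier field

# Learned state: (VNID, inner source MAC) -> outer source MAC (remote NVE).
Fdb = Dict[Tuple[int, str], str]


def fmt_mac(raw: bytes) -> str:
    return ":".join(f"{b:02x}" for b in raw)


def mac_bytes(mac: str) -> bytes:
    return bytes.fromhex(mac.replace(":", ""))


def learn(fdb: Fdb, frame: bytes) -> None:
    """Learn which remote NVE an inner source MAC sits behind.

    The outer header alone only identifies the sending NVE; the inner
    Ethernet header has to be parsed to find the tenant system's MAC.
    """
    if len(frame) < ETH_HDR_LEN + VNID_LEN + ETH_HDR_LEN:
        raise ValueError("frame too short to hold an inner Ethernet header")
    outer_src = fmt_mac(frame[6:12])
    (vnid,) = struct.unpack("!I", frame[ETH_HDR_LEN:ETH_HDR_LEN + VNID_LEN])
    inner = frame[ETH_HDR_LEN + VNID_LEN:]
    inner_src = fmt_mac(inner[6:12])
    fdb[(vnid, inner_src)] = outer_src


# Example frame: Tenant System A (behind NVE X) sends to Tenant System B.
NVE_X, NVE_Y = "02:00:00:00:01:01", "02:00:00:00:01:02"
MAC_A, MAC_B = "02:00:00:00:00:0a", "02:00:00:00:00:0b"
outer = mac_bytes(NVE_Y) + mac_bytes(NVE_X) + b"\x88\xb5"
inner = mac_bytes(MAC_B) + mac_bytes(MAC_A) + b"\x08\x00" + b"payload"
frame = outer + struct.pack("!I", 10) + inner

fdb: Fdb = {}
learn(fdb, frame)

if __name__ == "__main__":
    print(fdb)
```

Every received frame has to go through this second parse before the table can be updated, which is the per-packet overhead that control plane distribution avoids.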
This document has no actions for IANA.
TBC.