Network Working Group                                           L. Yong
Internet Draft                                                    L. Xia
Category: Standards Track                                         Huawei
                                                                   Q. Zu
                                                                 Ericsson
Expires: December 2014                                     June 18, 2014
Network Virtualization Edge (NVE)
draft-yong-nvo3-nve-04
Abstract
This document specifies Network Virtualization Edge (NVE) data plane
interoperability functionality for Network Virtualization Overlays
(NVO3). These specifications are necessary for interoperability
between an NVE and its attached tenant systems, and between NVEs.
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with
the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on December 18, 2014.
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document.
Table of Contents
1. Introduction...................................................3
1.1. Conventions used in this document.........................3
1.2. Terminology...............................................3
2. NVE Design Principles..........................................3
3. Interconnecting Tenant Systems.................................4
3.1. Virtual Machines and Physical Servers.....................4
3.2. Network Service Appliances................................5
3.3. Gateways..................................................6
4. Network Virtualization Edge (NVE)..............................6
4.1. NVE Forwarding............................................6
4.1.1. L2 NVE...............................................6
4.1.2. L3 NVE...............................................8
4.1.3. L2/L3 NVE............................................9
4.2. Overlay Tunnel between NVEs...............................9
4.3. Multi-Tenancy Support....................................10
4.4. Distributed Gateway (dGW)................................10
4.5. Route Path Control.......................................13
4.6. Split-NVE................................................13
4.7. Multi-Homing Support.....................................14
4.8. OAM Tools on NVE.........................................15
5. Operation Considerations......................................16
5.1. VM Mobility..............................................16
5.2. Gateway vs. Distributed Gateway..........................16
6. Security Considerations.......................................18
7. Acknowledgements..............................................18
8. IANA Considerations...........................................19
9. References....................................................19
9.1. Normative References.....................................19
9.2. Informative References...................................19
1. Introduction
Network Virtualization Edge (NVE) is a component of the Network
Virtualization Overlays (NVO3) technology. This component is
described in the NVO3 framework [NVO3FRWK] and architecture
[NVO3ARCH] documents. This document specifies NVE data plane
interoperability functionality. These functional specifications are
necessary for interoperability between an NVE and its attached tenant
systems, and between NVEs.
The data plane functionality described in this document is
independent of NVO3 control plane functionality. Thus, the control
plane functionality is outside the scope of this document. However,
the specifications in this document can support any control plane
implementation and are helpful in control plane protocol development.
The essential NVE data plane function is packet forwarding. An NVE
receives a packet from a tenant system via a virtual access point
(VAP), processes it, and either sends it to a peer NVE via an overlay
tunnel or forwards it to a local VAP; it receives a packet from a
peer NVE via an overlay tunnel, processes it, and sends it to a
tenant system via a VAP. In this process, an NVE performs a table
lookup and may modify the packet header and/or insert/remove a tunnel
header prior to sending the packet. The forwarding table is
manipulated by the control plane. This document does not address
forwarding table update/lookup and does not specify a tunnel
encapsulation protocol, but describes their usage at an NVE.
In order for the NVO3 data plane to work properly, some configuration
on NVEs is necessary. It can be performed manually or automatically.
How this configuration is done is outside the scope of this document.
1.1. Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC-2119 [RFC2119].
1.2. Terminology
This document uses the terms defined in NVO3 framework [NVO3FRWK]
and architecture [NVO3ARCH] documents.
2. NVE Design Principles
NVE design principles are:
1. The solution supports multi-tenancy in a common underlying
   network.
2. The solution supports different types of tenant systems and
   requires no change to tenant system configuration and behavior.
3. The solution is agnostic to NVE location, i.e., it works
   regardless of whether an NVE is co-located with tenant systems on
   a server/device or physically separated from them.
4. No change to tenant system configuration and behavior is required
   whether a gateway or a distributed gateway is used.
5. The solution must support tenant system, i.e., virtual machine
   (VM), mobility.
6. The solution must be scalable in supporting a VN having many
   NVEs, each of which may have many attached tenant systems, and in
   supporting an NVE being a member of several VNs and having
   attached tenant systems that belong to the same or different VNs.
Note that the NVO3 architecture [NVO3ARCH] defines NVE and NVA
entities; achieving items 5 and 6 depends on both the NVE and the
NVA. This document only focuses on NVE data plane functionality. The
interactions between NVE and NVA, between NVAs, and between
hypervisor and NVE are outside the scope of this document.
3. Interconnecting Tenant Systems
NVO3 provides network connectivity between the tenant systems locally
attached to an NVE, and between local and remote tenant systems,
i.e., tenant systems attached to different NVEs. An NVE MUST be able
to interwork with a tenant system. The NVO3 architecture [NVO3ARCH]
defines several types of tenant systems. The following sections
describe these tenant systems in terms of their role and networking
behavior.
3.1. Virtual Machines and Physical Servers
A tenant system may be a virtual machine (VM) on a server or a
physical server. For a virtual machine, a guest OS runs on the tenant
system and application software runs on top of the guest OS. For a
physical server, the host OS runs on the server and application
software runs on top of it. The networking behavior of such a tenant
system is summarized below:
. A tenant system (TS) is configured with a subnet (such as
  10.1.1.0/24), a default GW IP address, and a TS MAC and IP address
  (manually or automatically, e.g., via [DHCP]). Note that the TS IP
  address is an address in the tenant subnet. How these addresses
  are assigned and configured on the tenant system is outside the
  scope of this document.
. A tenant system learns the MAC address of a peer in the same
  subnet by using a protocol (e.g., ARP, NDP) and may also learn the
  peer IP and MAC addresses from the source addresses on incoming
  packets.
. A tenant system learns the GW MAC address via the ARP or NDP
  protocol. The GW entity needs to support the ARP or NDP protocol.
. A tenant system may cache the destination IP and MAC addresses of
  interest for packet forwarding.
. For intra-subnet forwarding, a tenant system inserts the
  destination MAC address on the packet.
. For inter-subnet forwarding, a tenant system inserts the GW MAC
  address on the packet.
. A tenant system may filter received packets and only accept
  packets whose destination MAC address is the same as its own MAC
  address.
The NVE's ability to support such host behavior is described in
Section 4.1.
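As a non-normative illustration of the intra-subnet and inter-subnet
forwarding behavior above (addresses and helper names are
hypothetical), the following Python sketch shows how a tenant system
chooses the destination MAC address:

    import ipaddress

    def choose_dest_mac(ts_subnet, gw_mac, dst_ip, arp_cache):
        """Pick the destination MAC a tenant system puts on a packet.
        arp_cache maps peer IP -> peer MAC, learned via ARP/NDP."""
        if ipaddress.ip_address(dst_ip) in ipaddress.ip_network(ts_subnet):
            # Intra-subnet: use the peer's MAC learned via ARP/NDP.
            return arp_cache[dst_ip]
        # Inter-subnet: use the default gateway's MAC.
        return gw_mac

    cache = {"10.1.1.5": "00:00:5e:00:53:05"}
    print(choose_dest_mac("10.1.1.0/24", "00:00:5e:00:53:01",
                          "10.1.1.5", cache))  # intra-subnet: peer MAC
    print(choose_dest_mac("10.1.1.0/24", "00:00:5e:00:53:01",
                          "10.2.2.7", cache))  # inter-subnet: GW MAC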
3.2. Network Service Appliances
A network service appliance such as a load balancer or firewall can
act as a tenant system and provide a service to one or more VNs (via
distinct VAPs). A network service appliance may be implemented on a
physical device, a bare metal server, or a virtual machine on a
server. A tenant system acting as a network service appliance may
have a different configuration and behavior from a host as described
in Section 3.1. Such a tenant system attaches to an NVE via a VAP and
acts as a middle box or service function in a VN. Typically, the
configuration or policy on the VN determines which traffic or traffic
flows in the VN are forwarded to this tenant system. In other words,
the NVE may not forward packets toward such a tenant system based on
the destination address on the packets. If a network service
appliance provides a service for interconnecting two VNs, such as a
GW or NAT, it may even modify the received packet headers prior to
sending the packets. See the next section.
The NVE's ability to support such tenant system behavior is described
in Section 4.5.
3.3. Gateways
A gateway may be used to interconnect two VNs implemented by NVO3
(referred to as NVO3 VNs), an NVO3 VN and other networks that may be
virtual or physical, an NVO3 VN and the Internet, or a combination of
these. A gateway may also interconnect two NVO3 VNs that are
implemented with the same or different NVE service types. A gateway
may be implemented on a physical network device, a physical server,
or a VM. If it is a physical network device, the device often
supports embedded NVE functions and acts as a network element in the
IGP.
Note that a distributed gateway may be implemented for NVO3 VN
interworking. A distributed gateway means that the gateway function
is implemented on the NVEs so that the traffic between the VNs can be
forwarded at the local NVE directly; as a result, path optimization
is gained. See Section 4.4.
A gateway often integrates with several other network service
appliances such as NAT, firewall, and policy based forwarding for the
interconnection needs, which means that inter-VN traffic gets these
special treatments.
4. Network Virtualization Edge (NVE)
The NVO3 framework [NVO3FRWK] defines three NVE service types: L2
service, L3 service, and L2/L3 service. A tenant network that is
implemented by NVO3 can be built with one of these NVE service types
or a combination of them together with a gateway.
Note that this document uses the ARP protocol to describe the
interoperability function between an NVE and a tenant system. The use
of the NDP protocol is left for a future version.
4.1. NVE Forwarding
4.1.1. L2 NVE
An L2 VN is implemented with the L2 NVE service type and provides an
L2 broadcast domain to the tenant systems on the VN. A tenant system
attaches to the NVE via a VLAN, a directly attached port, or a
virtual port. In the NVO3 architecture, upon receiving an ARP request
from a tenant system, an NVE should not forward the ARP request
message. If the tenant system of interest is at a remote NVE or on an
associated VAP, the local NVE sends back an ARP response with the
requested MAC address. An NVE can obtain the remote tenant system MAC
address via the NVA [NVO3ARCH]. A local tenant MAC address is
obtained from data plane learning, configuration, or ARP
announcement. If the NVE does not have information about the MAC
address requested in a received ARP request, it should query the NVA.
If a packet with an unknown destination MAC address is received at a
VAP, the NVE should resolve it into a known MAC address (e.g., by
querying the NVA) prior to forwarding.
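As a non-normative illustration (table contents and action names are
hypothetical), the following Python sketch outlines the VAP ingress
behavior described above: ARP requests are answered locally rather
than flooded, and unicast frames are either delivered to a local VAP
or tunneled to the remote NVE known to hold the destination MAC:

    # Per-VN tables, populated from the NVA, configuration, or data
    # plane learning (values below are hypothetical).
    arp_table = {"10.1.1.5": "00:00:5e:00:53:05"}       # tenant IP -> MAC
    mac_table = {"00:00:5e:00:53:05": ("remote", "192.0.2.2"),
                 "00:00:5e:00:53:07": ("local", "vap7")} # MAC -> location

    def l2_vap_ingress(frame):
        """Return the forwarding action for a frame received on a VAP."""
        if frame.get("arp_request_for"):
            # Do not flood the ARP request; answer it from known
            # mappings, or query the NVA if the mapping is missing.
            mac = arp_table.get(frame["arp_request_for"])
            if mac:
                return ("arp_reply", mac)
            return ("query_nva", frame["arp_request_for"])
        entry = mac_table.get(frame["dst_mac"])
        if entry is None:
            return ("query_nva", frame["dst_mac"])  # resolve before forwarding
        kind, target = entry
        if kind == "local":
            return ("deliver_to_vap", target)
        return ("encapsulate_to_nve", target)       # tunnel to the remote NVE

    print(l2_vap_ingress({"arp_request_for": "10.1.1.5"}))
    print(l2_vap_ingress({"dst_mac": "00:00:5e:00:53:05"}))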
The L2 NVE service supports a broadcast domain in the VN.
Transporting tenant broadcast/multicast traffic among NVEs requires a
way to map the VN broadcast/multicast traffic onto the underlying
transport. The underlying IP network may not offer a multicast
service, or may rely on Protocol Independent Multicast (PIM) to
support multicast transport. An NVE can send the VN
broadcast/multicast traffic to the remote NVEs by using unicast outer
IP addresses on the packets, i.e., replicating the packet toward each
remote NVE in the VN. This method does not require the underlying
network to support a multicast transport. How the NVA conveys the
necessary mapping information to NVEs is outside the scope of this
document.
If the underlying network supports PIM, the mapping between a
broadcast/multicast group in the VN and an underlying IP multicast
group needs to be configured at the NVE, manually or automatically.
In the case where the NVE is on a server, the NVE uses IGMP [RFC3376]
to join an IP multicast group in the underlying IP network. Upon
receiving a multicast MAC packet, the NVE encapsulates the packet and
inserts the underlying IP multicast address as the outer IP address
on the packet.
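A non-normative Python sketch of the two delivery options described
above (the group address and NVE addresses are hypothetical):

    # Remote NVEs that are members of this VN, and an optional mapping
    # of the VN broadcast/multicast traffic to an underlay group.
    remote_nves = ["192.0.2.2", "192.0.2.3"]
    underlay_group = None     # e.g. "239.1.1.1" if the underlay runs PIM

    def outer_destinations_for_bum():
        """Return the outer IP destination(s) for broadcast/multicast."""
        if underlay_group:
            # Underlay multicast: one copy, outer dst = multicast group.
            return [underlay_group]
        # Ingress replication: one unicast copy per remote NVE in the VN.
        return list(remote_nves)

    print(outer_destinations_for_bum())   # ['192.0.2.2', '192.0.2.3']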
To interconnect with external virtual or physical networks, or with
an overlay or non-overlay virtual network, a gateway is necessary.
The gateway, as a tenant system, attaches to an NVE and performs
traffic enforcement based on the policy between an L2 VN and the
external networks.
The L2 NVE service may apply to both non-IP and IP applications;
however, IP based applications may also be implemented in other ways
(see below).
Whether an NVE can learn the mapping between a remote tenant system
MAC address and the remote NVE IP address purely via data plane
learning, i.e., without snooping and terminating ARP, is for further
evaluation.
4.1.2. L3 NVE
An L3 VN is implemented with the L3 NVE service type and provides an
L3 routing domain for the tenant systems in the VN. The VAP can be an
IP interface, an attached port/vport, or a VLAN interface. An NVE
acts as the first hop router (or next hop router if the TS has a
virtual router configured) for the attached tenant systems. For a
VLAN interface, i.e., an Ethernet access interface, the NVE needs to
support the ARP protocol, terminate all ARP request messages, and
reply with its own MAC address to the tenant system. An NVE tracks
the tenant system IP and MAC address mapping if the VAP is an
Ethernet interface, unless a special configuration is done on the NVE
(see Section 3.2).
When a tenant system forwards packets to its attached NVE, the NVE
receives either IP packets or Ethernet frames with the NVE MAC
address as the destination MAC, depending on the VAP type. The NVE
performs an IP table lookup based on the destination IP address on
the packets. If a packet needs to be forwarded to another NVE, the
NVE sends it via the L3 overlay (the NVE obtains the inner/outer
address mapping from the NVA). If the packet needs to be forwarded to
a local VAP and the VAP is an Ethernet access interface, the NVE
inserts the destination MAC address on the packet and its own MAC
address as the source MAC prior to sending it to the tenant system;
if the VAP is an IP interface, the NVE sends the IP packet directly.
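A non-normative Python sketch of this forwarding decision (the table
contents are hypothetical; in practice the routes are installed by
the control plane/NVA, and longest-prefix matching is omitted here):

    import ipaddress

    # prefix -> ('local', vap, dst_mac) or ('remote', nve_ip, vn_id)
    ip_table = {
        "10.1.1.0/24": ("local", "vap1", "00:00:5e:00:53:05"),
        "10.1.2.0/24": ("remote", "192.0.2.2", 5001),
    }
    nve_mac = "00:00:5e:00:53:01"

    def l3_forward(dst_ip):
        """Return the forwarding action for a packet destined to dst_ip."""
        addr = ipaddress.ip_address(dst_ip)
        for prefix, entry in ip_table.items():
            if addr in ipaddress.ip_network(prefix):
                if entry[0] == "remote":
                    _, outer_ip, vn_id = entry
                    return ("encapsulate", outer_ip, vn_id)
                _, vap, dst_mac = entry
                # Ethernet VAP: rewrite MACs before delivery; an IP VAP
                # would deliver the IP packet directly.
                return ("deliver", vap, nve_mac, dst_mac)
        return ("drop",)

    print(l3_forward("10.1.2.7"))   # tunneled to the remote NVE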
An NVE may learn local TS IP/MAC addresses via ARP, via data plane
learning, or from the NVA. The rule of thumb is that such a method
MUST NOT require any change on the tenant system side.
The tenant systems connecting to an L3 VN can be on the same or
different subnets. Typically, subnet or flow based policies may be
configured for route constraint, or route path control policies may
be configured, depending on the tenant network requirements. For
example, one tenant system on an L3 VN may be an inter-subnet
gateway. Packets from one subnet to another on the L3 VN may be sent
to this gateway prior to reaching the destination tenant system.
If an L3 VN needs to interconnect with external networks, the
Internet, or another VN, a gateway MUST be used. To avoid address
collision, either IP address space partitioning or IP address
translation between the two VNs is applied at the gateway. The
former, in effect, makes the two virtual networks look like one
routing domain sharing one IP address space.
Figure 1 illustrates an example in which a gateway (GW), as a tenant
system, is used for interconnecting VNx, VNy, and an external
network. The GW attaches to NVE2. The VN interconnection policy may
be configured on the GW. A distributed gateway may also be used for
VN interconnection; see Section 4.4.
+-----------+ +----------+
| +------+ |-~-~-~-~-~-~-~-| +------+ | +----+
TS1---+-|L3VNIx+--{L3 Overlay(VNx)}-+L3VNIx|-+--+ |
| +------+ |-~-~-~-~-~-~-~-| +------+ | | GW |
| | | | | |
| +------+ |-~-~-~-~-~-~-~-| +------+ | +--+-+
TS2---+-|L3VNIy+--{L3 Overlay(VNy)}-+L3VNIy|-+-/ |
| +------+ |-~-~-~-~-~-~-~-| +------+ | ..+...
+-----------+ +----------+ / \
NVE1 NVE2 | Ext. |
| net. |
\....../
Figure 1 Two VN interconnection via a GW
Note: an L3 VN, as a routing domain, does not support broadcast and
multicast functions. The applicability of MVPN [RFC4834] to NVO3 is
for further study. It is obvious that the L3 NVE service only
supports IP based applications.
4.1.3. L2/L3 NVE
L2/L3 NVE service type is used when an NVE supports distributed L3
gateway function and multiple L2 VNs on the NVE are instantiated.
See section 4.4.
4.2. Overlay Tunnel between NVEs
An NVE may implement an L2 overlay or an L3 overlay depending on the
NVE service type. A tunnel between two NVEs may be over one
underlying network segment/domain or span multiple network domains.
Both NVEs need to use the same encapsulation protocol to
encapsulate/decapsulate packets to/from the tunnel in between. There
are several encapsulation methods in the industry, such as VXLAN
[VXLAN] and NVGRE [NVGRE]. If two NVEs do not support the same
encapsulation method, an interworking gateway is needed for the
encapsulation translation. An NVE may support more than one
encapsulation method; in this case, the two NVEs need to select the
same encapsulation method, which can be done manually or via control
plane negotiation. In the NVO3 architecture, an NVE relies on the NVA
to obtain the inner/outer address mapping, and the underlying network
supports the IP connectivity between the two NVEs. How the NVA
obtains the mapping information is outside the scope of this
document.
The overlay mechanism requires the NVE to encapsulate the packets
from a tenant system, which adds overhead. To avoid packet
fragmentation at the NVE, the tenant system packet size MUST NOT
exceed the underlying MTU size minus the bytes of the outer and
overlay headers added by the NVE. The NVE should drop a tenant packet
whose size exceeds the allowed MTU size and raise an alert. The
tenant system MUST support MTU discovery [RFC4821].
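For illustration only, assuming a VXLAN-style encapsulation (the byte
counts below are the commonly cited VXLAN values and are not mandated
by this document), the maximum tenant packet size can be computed as:

    underlay_mtu = 1500   # IP MTU of the underlying network (bytes)
    outer_ip     = 20     # outer IPv4 header
    outer_udp    = 8      # outer UDP header
    vxlan_hdr    = 8      # VXLAN overlay header
    inner_eth    = 14     # inner Ethernet header carried as payload

    # Maximum tenant IP packet the NVE can carry without fragmentation:
    max_tenant_ip_packet = (underlay_mtu - outer_ip - outer_udp
                            - vxlan_hdr - inner_eth)
    print(max_tenant_ip_packet)   # 1450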
The encapsulation process on an NVE also needs to convey the packet
characteristics in a VN to the underlying network, i.e., encode or
translate the packet parameters in the inner header into the outer
header, so that the packet can get the same treatment in the
underlying network. Examples are the CoS value and the entropy
information calculation. The details will be provided in a future
version.
4.3. Multi-Tenancy Support
It is very important for an NVO3 solution to support multi-tenancy
over a common physical infrastructure: independent address spaces in
individual tenant networks that do not communicate directly, i.e.,
that only communicate via address translation or via the Internet,
and traffic isolation among them. An NVE MUST maintain separate
forwarding tables to support overlapping addresses. Since a tenant
network may have one or more virtual networks, it is important for a
tenant or DC operator to manage the address allocation for the
virtual networks in a tenant network so as to avoid address
collision. A tunnel between NVEs may carry the traffic belonging to
different virtual networks; the VN ID in the overlay header serves to
segregate that traffic.
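A non-normative sketch of per-VN forwarding state with overlapping
tenant addresses (VN IDs and addresses are hypothetical):

    # One forwarding table per VN ID; the same tenant address may
    # appear in several VNs without colliding, because lookups are
    # always scoped by the VN ID carried in the overlay header.
    forwarding_tables = {
        5001: {"10.1.1.5": ("remote", "192.0.2.2")},   # tenant A's VN
        5002: {"10.1.1.5": ("local", "vap9")},         # tenant B's VN, same IP
    }

    def lookup(vn_id, dst_ip):
        return forwarding_tables[vn_id].get(dst_ip)

    print(lookup(5001, "10.1.1.5"))   # ('remote', '192.0.2.2')
    print(lookup(5002, "10.1.1.5"))   # ('local', 'vap9')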
4.4. Distributed Gateway (dGW)
The distributed gateway function may be implemented on an NVE for
inter-subnet, inter-VN gateway/forwarding of local traffic. This
gains path optimization, i.e., traffic in one VN can be routed to
another VN at the local NVE. The distributed gateway implementation
requires the inter-VN forwarding policy to be configured at each NVE.
To support TS mobility, all NVEs use the same dGW address.
Figure 2 illustrates one example using a dGW. As shown, VNx and VNy
are present on NVE1 and NVE2; TS1 and TS3 connect to the L3 VNx, and
TS2 and TS4 to the L3 VNy. The L3 overlay is used between NVE1 and
NVE2. Both NVE1 and NVE2 support the distributed gateway (dGW). A
packet from TS1 to TS2 is processed at the dGW on NVE1. A packet from
TS1 to TS4 is processed at the dGW on NVE1 and forwarded over the L3
Overlay (VNy) tunnel. A packet from TS1 to TS3 is forwarded over the
L3 Overlay (VNx) tunnel without processing at any dGW.
+------------+ +-----------+
| +------+ |-~-~-~-~-~-~-~-~-~| +------+ |
TS1---+-|L3VNIx+---{ L3 Overlay(VNx) }--+L3VNIx|-+---TS3
| +--+---+ |-~-~-~-~-~-~-~-~-~| +--+---+ |
| +-+-+ | | +-+-+ |
| |dGW| | | |dGW| |
| +-+-+ | | +-+-+ |
| +--+---+ |-~-~-~-~-~-~-~-~-~| +--+---+ |
TS2---+ |L3VNIy+---{ L3 Overlay(VNy) }--+L3VNIy|-+---TS4
| +--+---+ |-~-~-~-~-~-~-~-~-~| +---+--+ |
+------------+ +-----------+
NVE1 NVE2
Figure 2 L3 NVE Service w/dGW Model
Figure 3 illustrates two other examples where the L2/L3 NVE service
type is used. Figure 3(a) shows two L2 VNs with the distributed L3
gateway function on the NVEs; Figure 3(b) shows L2 VNs and L3 VNs
interconnected with the distributed L3 gateway function on the NVEs.
In case (a), the VAP may be VLAN or port/vport based. Tenant systems
on the same L2 VN or different L2 VNs may be on the same or different
NVEs. The tenant systems on the same L2 VN are in a broadcast domain
and can communicate without constraint. The implementation looks like
the L2 NVE described above. For the traffic across L2 VNs, i.e., from
one L2 VN to another, the tenant system sends the packet with the GW
MAC address that is configured on the local NVE. The same GW MAC
address for the same VN should be used on all NVEs in the ARP
responses to the TSs. (The use of different GW MAC addresses on
different NVEs is for further study.)
When an NVE receives a packet from a TS on a VN, say VNx, and the
destination MAC on the packet is not the L3dGW MAC address, the NVE
performs a MAC table lookup and forwards the packet to the tenant
system on the same VN as described in Section 4.1.1. If the
destination MAC on the packet is the L3dGW MAC address, the NVE
performs an IP table lookup and obtains the destination VN, say VNy,
and the destination NVE. If the destination NVE is itself, the NVE
sends the packet to the tenant system via the associated VAP; if the
result is a remote NVE, the NVE encapsulates the packet (the IP
packet) with the destination VN ID prior to sending it to the remote
NVE. The remote NVE decapsulates the packet and performs an IP
lookup. For case (a), the remote NVE inserts the L3dGW MAC as the
source MAC and the found MAC address as the destination MAC on the
packet prior to sending it to the tenant system. For case (b), the
remote NVE forwards the IP packet to the tenant system directly.
+----------+ +----------+
| +------+ |-~-~-~-~-~-~-~-~-~| +------+ |
TSs---+-|L2VNIx+-{ L2 Overlay(VNx) }-+L2VNIx|-+---TSs
| +---+--+ |-~-~-~-~-~-~-~-~-~| +---+--+ |
| +--|--+ | | +--|--+ |
| |L3dGW|-{L3 Overlay(VNx|VNz)}-|L3dGW| |
| +--|--+ | | +--|--+ |
| +---+--+ |-~-~-~-~-~-~-~-~-~| +----+-+ |
TSs---+-|L2VNIz+-{ L2 Overlay(VNz) }-+L2VNIz| +---TSs
| +------+ |-~-~-~-~-~-~-~-~-~| +------+ |
+----------+ +----------+
NVE1 NVE2
(a)
+----------+ +----------+
| +------+ |-~-~-~-~-~-~-~-~-~| +------+ |
TSs---+-|L2VNIx+-{ L2 Overlay(VNx) }-+L2VNIx|-+---TSs
| +---+--+ |-~-~-~-~-~-~-~-~-~| +--+---+ |
| +--+--+ | | +-+---+ |
| |L3dGW|-{L3 Overlay(VNx|VNz)}-|L3dGW| |
| +--+--+ | | +-+---+ |
| +---+--+ |-~-~-~-~-~-~-~-~-~| +--+---+ |
TSs---+-|L3VNIz+-{ L3 Overlay(VNz) }-+L3VNIz| +---TSs
| +------+ |-~-~-~-~-~-~-~-~-~| +------+ |
+----------+ +----------+
NVE1 NVE2
(b)
Figure 3 L3 distributed GW Examples
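A non-normative Python sketch of the dGW ingress decision described
above (table contents are hypothetical; only the ingress side is
shown):

    import ipaddress

    L3DGW_MAC = "00:00:5e:00:53:ff"   # same dGW MAC used on all NVEs

    mac_tables = {"VNx": {"00:00:5e:00:53:11": ("local", "vap1")}}
    ip_tables  = {"VNx": {"10.2.0.0/16": ("VNy", "192.0.2.2")}}

    def dgw_ingress(vn, frame):
        """Bridge within the VN, or route via the distributed gateway."""
        if frame["dst_mac"] != L3DGW_MAC:
            # Intra-VN traffic: ordinary L2 NVE forwarding (Section 4.1.1).
            return ("l2_forward", mac_tables[vn].get(frame["dst_mac"]))
        # Inter-VN traffic: IP lookup gives the destination VN and NVE.
        dst = ipaddress.ip_address(frame["dst_ip"])
        for prefix, (dst_vn, dst_nve) in ip_tables[vn].items():
            if dst in ipaddress.ip_network(prefix):
                return ("encapsulate", dst_vn, dst_nve)
        return ("drop",)

    print(dgw_ingress("VNx", {"dst_mac": L3DGW_MAC, "dst_ip": "10.2.1.9"}))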
A tenant network with multiple L2 VNs and L3 VNs interconnected with
the distributed L3 gateway function MUST share the same IP and MAC
address space. If the tenant network further interconnects with other
tenant networks, external networks, or the Internet via a gateway,
either IP and/or MAC address partitioning or network address
translation (NAT) MUST be used.
4.5. Route Path Control
A tenant network may be implemented as a full mesh among NVEs and not
have any route path policy at all. As a result, there is a single
overlay hop between a sender TS and a destination TS. A tenant
network may also contain some tenant systems that are designated as
network service appliances. In that case, the tenant may want some
tenant traffic to pass through a network service appliance prior to
delivery to the destination tenant system, and other traffic not to.
For example, a tenant network is implemented with three VNs in a DC:
one L2 VN for the Web Tier, a second L2 VN for the application tier,
and a third L2 VN for the database tier. The policies are illustrated
in the following figure. Traffic from the Web Tier to the App Tier
MUST pass through the tenant firewall on a tenant system, say A;
traffic from the App Tier to the Web Tier can be routed directly;
traffic between the App Tier and the DB Tier MUST pass through the
tenant firewall on another tenant system, say B. No communication is
allowed between the Web Tier and the DB Tier.
Web Tier ----FW A----> App Tier ----FW B----> DB Tier
<----------- <----FW B-----
Figure 4 Policies on a Three-Tier Tenant Network
In this case, the NVE to which the firewall tenant system A attaches
is configured to forward all the traffic from the Web Tier to system
A via a VAP. System A processes the packets and may forward them to
the App Tier via another VAP that is associated with the App Tier on
the NVE. The NVE forwards the packets to the tenant systems in the
App Tier on behalf of system A. The traffic from the App Tier can be
forwarded to the Web Tier directly; the NVE simply obtains the
inner/outer address mapping and translates the VN IDs on the packets
prior to forwarding them to the peer NVEs. Thus, in the pass-through
firewall case, the inner/outer address mapping that an NVE gets from
the NVA does not pair the destination tenant address (inner) with
that tenant system's NVE address; instead, the outer address is the
address of the NVE to which system A attaches. In the NVO3
architecture, such route path control can be implemented in the NVA,
the NVE, or both.
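A non-normative sketch of how such a policy-steered inner/outer
mapping might look from the ingress NVE's point of view (all names
and addresses are hypothetical):

    FW_A_NVE = "192.0.2.9"   # NVE to which firewall tenant system A attaches
    normal = {"10.10.0.5": "192.0.2.4",   # Web Tier TS -> its NVE
              "10.20.0.8": "192.0.2.3"}   # App Tier TS -> its NVE

    # The mapping handed to the ingress NVE for Web Tier sources points
    # the outer address at the firewall's NVE rather than at the
    # destination tenant system's NVE.
    def outer_destination(src_tier, dst_ip):
        if src_tier == "web":
            return FW_A_NVE          # steer Web -> App traffic via FW A
        return normal[dst_ip]        # otherwise, the destination TS's NVE

    print(outer_destination("web", "10.20.0.8"))   # 192.0.2.9 (via FW A)
    print(outer_destination("app", "10.10.0.5"))   # 192.0.2.4 (direct)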
4.6. Split-NVE
Split-NVE may be used in several use cases [NVO3ARCH]. Some NVE
functions may reside on the NVE spoke and some on the NVE hub. An
overlay tunnel is used between the NVE spoke and the hub. One useful
splitting structure in the data plane is to simplify the forwarding
table on the NVE spoke, i.e., it only maintains local forwarding
entries, and to let the NVE hub maintain the complete forwarding
table. An NVE spoke simply sends the packets to the NVE hub if the
destination is not local. It is possible that an NVE hub does not
have any directly attached TS but connects to many NVE spokes. In
this case, all the NVE spokes and NVE hubs are members of one VN.
Another useful structure is intra-NVE and inter-NVE splitting. The
intra NVEs forward the packets if the sender TS and receiver TS are
in the same VN, and forward the packets to the inter NVE if not. The
inter NVE forwards the packets between the different VNs and is often
seen as a gateway. Note that this design may cause packet
hair-pinning if the sender and receiver TSs in two different VNs are
on the same intra NVE (see Section 5.2).
Split-NVE applies to all NVE service types. There is no configuration
or behavior change between a TS and its attached NVE regardless of
whether split-NVE is used or not. However, the network performance
and the tenant network cost may differ. The splitting of control
plane functionality on an NVE is outside the scope of this document.
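A non-normative sketch of the simplified spoke forwarding table
described above (entries and addresses are hypothetical):

    HUB_NVE = "192.0.2.100"                      # NVE hub for this VN
    local_table = {"00:00:5e:00:53:21": "vap3"}  # spoke only knows local TSs

    def spoke_forward(dst_mac):
        """Deliver locally if known; otherwise default to the NVE hub."""
        vap = local_table.get(dst_mac)
        if vap:
            return ("deliver_to_vap", vap)
        return ("tunnel_to", HUB_NVE)            # hub holds the full table

    print(spoke_forward("00:00:5e:00:53:21"))    # local delivery
    print(spoke_forward("00:00:5e:00:53:99"))    # sent to the hub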
4.7. Multi-Homing Support
Two NVE multi-homing scenarios are described in the NVO3 architecture
document [NVO3ARCH]: 1) an NVE may have more than one overlay path,
in terms of more than one reachable IP address; 2) when an NVE is
physically separated from its attached tenant systems, a tenant
system may attach to more than one NVE via VAPs. A design may use one
of them or both together.
For case 1), the NVA may provide more than one inner/outer mapping to
an NVE; the NVE may support some ECMP capability to distribute the
traffic among the paths.
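For case 1), a non-normative sketch of per-flow path selection over
multiple outer addresses (a simple hash is used here; real
implementations typically also encode flow entropy in the outer
header):

    import hashlib

    outer_addrs = ["192.0.2.2", "198.51.100.2"]  # addresses of the peer NVE

    def pick_outer(flow):
        """Hash the 5-tuple so all packets of a flow take the same path."""
        fields = ("src", "dst", "proto", "sport", "dport")
        key = "|".join(str(flow[k]) for k in fields)
        digest = hashlib.sha256(key.encode()).digest()
        return outer_addrs[digest[0] % len(outer_addrs)]

    print(pick_outer({"src": "10.1.1.5", "dst": "10.1.2.7",
                      "proto": 6, "sport": 49152, "dport": 443}))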
For the L2 NVE service, in case 2) multi-homing may be configured for
a tenant system as either active/active or active/standby. In the
active/active mode, the tenant system may distribute traffic per flow
or per VLAN.
For the L3 NVE service, the NVEs are the first hop routers for local
tenant systems regardless of inter-subnet or intra-subnet traffic.
Link/node redundancy mechanisms (e.g., ECMP, VRRP) can provide
various modes (i.e., active/active, active/standby) of multi-homing
access for tenant systems.
For the L2/L3 NVE service, the main extension to the above two types
is the internal distributed gateway function on the NVEs. Since the
gateway function is used to process the ingress traffic on individual
NVEs, the multi-homing implementation is the same as that of the L2
NVE or L3 NVE service type.
4.8. OAM Tools on NVE
It is necessary for an NVE to support some OAM tools. A tool can be
turned on when a tunnel or VN is set up, or dynamically turned on/off
according to operational needs. The NVE implementation SHALL fulfill
the OAM requirements described in [NVO3OAM]. The following items are
under consideration; this section is for further study.
OAM tools on an NVE should be operated under the following
conditions:
. Run the various OAM tools along the same path as the data frames
  of the overlay network between a pair of NVEs;
. Run OAM tools between per-tenant NVEs to probe the status of
  tunnel or NVE entities;
. Send fault notifications from the underlay network to the overlay
  network for its fault handling and alarm suppression.
An NVE may support the following tools, but is not limited to them:
. Connectivity Fault Detection: detect a tunnel connectivity fault
  between two or more NVEs that support the same virtual network;
. Overlay Path Traceroute: trace the overlay path hops between two
  NVEs;
. Underlay Path Traceroute: trace the underlay path hops between
  two NVEs;
. Performance Monitoring: monitor various performance metrics such
  as packet loss, packet delay, packet delay variation, and packet
  throughput;
. NVE Auto Discovery: dynamically discover other NVEs that support
  the same virtual network.
5. Operation Considerations
5.1. VM Mobility
VM mobility provides some benefits for DC operators in terms of
resource optimization and performance tuning. If a tenant system on a
VM runs a guest OS and application software, it can be moved from one
NVE to another NVE without impacting the live application. When a VM
moves, the NVA sends the new inner/outer mapping to the NVEs. If the
tenant system runs network service appliance software, then besides
the mapping changes, some settings on the old NVE also need to be
configured on the new NVE (see Section 4.5).
A VN ID is used to segregate the traffic for different VNs in the
data plane, or, say, on the "wire". An NVE implementation may use a
domain-wide global VN ID or an egress-NVE-assigned local VN ID in the
data plane. If local VN IDs are used, when a VM moves from one NVE to
another, a sender NVE not only has to obtain the address of the new
NVE the VM moves to, i.e., the outer address, but also has to obtain
the new VN ID that the new NVE allocates for the VN. In other words,
the ingress NVE has to modify both the overlay header and the outer
header when a VM moves from one NVE to another. If a domain-wide VN
ID is used on the NVEs, the ingress NVE only needs to modify the
outer header when a VM moves. Although locally allocated VN IDs add
implementation complexity for VM mobility, they have the advantage of
associating the VN ID with other context at the egress NVE to
facilitate egress NVE packet processing. Thus, either may be
implemented for different use cases.
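A non-normative sketch of the ingress NVE state update on a VM move,
contrasting global and local VN IDs (all values are hypothetical):

    # inner address -> (outer NVE address, VN ID placed on the wire)
    mapping = {"10.1.1.5": ("192.0.2.2", 5001)}

    def vm_moved(inner_ip, new_nve, new_local_vn_id=None, global_vn_id=5001):
        """Update the ingress mapping when the NVA reports a VM move."""
        if new_local_vn_id is not None:
            # Egress-assigned (local) VN ID: outer address and VN ID change.
            mapping[inner_ip] = (new_nve, new_local_vn_id)
        else:
            # Domain-wide (global) VN ID: only the outer address changes.
            mapping[inner_ip] = (new_nve, global_vn_id)

    vm_moved("10.1.1.5", "192.0.2.7")                        # global VN ID
    print(mapping["10.1.1.5"])                               # ('192.0.2.7', 5001)
    vm_moved("10.1.1.5", "192.0.2.8", new_local_vn_id=7044)  # local VN ID
    print(mapping["10.1.1.5"])                               # ('192.0.2.8', 7044)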
5.2. Gateway vs. Distributed Gateway
A gateway is used, in general, to interconnect two networks. A
gateway in NVO3 interconnects a virtual network overlay with other
networks. The other network can be a physical network, a virtual
network, a virtual network overlay, or the Internet, and is often
called an external network. A distributed gateway is a gateway
function that is implemented on the NVEs so that the traffic between
two tenant systems in different virtual networks can be routed on the
local NVE directly. The main benefit of using a distributed gateway
is path optimization.
To interconnect two networks, a gateway may integrate with other
network service appliances such as NAT and firewall, and handle
policy enforcement. Note that a tenant often uses this as a rule in
an application networking design. Implementing a distributed gateway
means that the NVEs also need to support such policy enforcement,
which sometimes may become complex.
Should a tenant network use a gateway or a distributed gateway for
interconnecting two VNs? Here are the general recommendations.
. If a VN overlay interconnects to an external network that is a
  physical network, a virtual network, or the Internet, using a
  gateway is practical. In this case, it is easy to place a gateway
  on the traffic path.
. If a VN overlay in a DC interconnects to another VN overlay in
  another DC, the inter-VN traffic passes through the DC GWs, i.e.,
  the traffic pattern is north-south, and it is good to use a
  gateway.
. If a VN overlay interconnects to another VN overlay within a DC,
  i.e., the traffic pattern is east-west, and there are only light
  policies for the inter-VN traffic, using a distributed gateway is
  better; otherwise, use a gateway.
A gateway and a distributed gateway function on NVEs can further work
together to provide the inter-VN connection in a tenant network.
Figure 5 gives an example. A distributed GW is implemented on NVE1
for forwarding traffic between VNx and VNy on NVE1. The L2GW is used
to interconnect VNx and an L2 bridged network; VNx, VNy, and the
bridged network further connect to the WAN via the DCGW. The DCGW is
a member of VNx and VNy, and connects to the WAN.
+----------+ +--------+ +--+.
| +------+ |-~-~-~-~-~-~-~-~-| | / \ Physical
TSs---+-|L2VNIx+-{ L2 Overlay(VNx) } L2GW |-|Bridge|-Servers
| +--+---+ |-~-~-~-~-~-~-~-~-+----+---+ \ Net./
| | | {L2 Overlay} +---+
| +-+---+ |-~-~-~-~-~-~-~-~-+----+---+ .'^^.\
| |L3dGW+-{L3 Overlay(VNx/y)} DCGW +----->{ WAN }
| +-+---+ |-~-~-~-~-~-~-~-~-+----+---+ .v.v./
| | | {L2 Overlay}
| | | +----+-----+
| +--+---+ |-~-~-~-~-~-~-~-~-| +--+----+|
TSs---+-|L2VNIy+-{ L2 Overlay(VNy) }-+L2VNIy |+--TSs
| +--+---+ |-~-~-~-~-~-~-~-~-| +-------+|
+----------+ +----------+
NVE1 NVE2
Figure 5 Example of Gateways and Distributed GW Usage
6. Security Considerations
NVO3 networks may be deployed in various use cases [NVO3CASE].
Different cases may have different levels of security requirements
for NVO3 networks. The NVE is a key element for the security of an
NVO3 network. An NVE should support mutual or automated
authentication with NVAs, other NVEs, and tenant systems, to
guarantee that its peer has a valid identity and the privilege to
communicate. NVEs should also provide integrity, confidentiality, and
origin authentication protection for both control and data traffic
against an insecure underlay network. Per-tunnel signatures or
digests may provide data origin authentication, non-repudiation, and
integrity protection. In addition, an NVE itself needs to tolerate
DoS attacks.
In the split-NVE case, there is a security risk that the NVE may be
polluted by a compromised hypervisor with incorrect network update
information. However, in this circumstance, the security damage can
be limited to the hypervisor and the VNs attached to the compromised
hypervisor. There are still ways to protect the attached NVE itself
and mitigate the damage.
When an NVE is in the hypervisor, there are additional security risks
on the NVE if the hypervisor is compromised. A compromised NVE may
send data traffic of a VN which it is not supposed to send. It is
very important for an NVE to prevent any security risk initiated from
a compromised remote NVE. The NVE may use the inner-to-outer address
mapping table to filter incoming data traffic, to ensure that a
packet with a given inner source address originated from the correct
participating NVE address.
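A non-normative sketch of such a mapping-based ingress filter (the
mapping contents are hypothetical):

    # inner tenant address -> outer NVE address expected to source it
    inner_to_outer = {"10.1.1.5": "192.0.2.2", "10.1.2.7": "192.0.2.3"}

    def accept(outer_src, inner_src):
        """Accept a decapsulated packet only if the inner source address
        is registered behind the NVE that sent the tunnel packet."""
        return inner_to_outer.get(inner_src) == outer_src

    print(accept("192.0.2.2", "10.1.1.5"))   # True: consistent mapping
    print(accept("192.0.2.9", "10.1.1.5"))   # False: likely spoofed, drop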
If tenant traffic privacy is a concern, cryptographic measures must
be applied in addition. Confidentiality and integrity protection on
the tenant data plane traffic can prevent the tenant traffic from
being redirected, intercepted, or modified by a compromised underlay
network component.
In addition, the NVE implementation shall fulfill the security
requirements described in [nvo3-security-requirements].
7. Acknowledgements
The authors would like to thank Qin Wu for the review and valuable
comments.
8. IANA Considerations
The document does not require any IANA action.
9. References
9.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC2119, March 1997.
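[RFC3376] Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A.
Thyagarajan, "Internet Group Management Protocol, Version 3",
RFC3376, October 2002.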
[RFC4821] Mathis, M. and Heffner, J., "Packetization Layer Path MTU
Discovery", RFC4821, March 2007
[RFC4834] Morin, T., "Requirements for Multicast in Layer 3
Provider-Provisioned Virtual Private Networks (PPVPNs)", RFC4834,
April 2007
9.2. Informative References
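[DHCP] Droms, R., "Dynamic Host Configuration Protocol", RFC2131,
March 1997.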
[NVO3ARCH] Black, D., Narten, T., et al., "An Architecture for
Overlay Networks (NVO3)", draft-narten-nvo3-arch, work in progress.
[NVO3CASE] Yong, L., et al., "Use Cases for DC Network Virtualization
Overlays", draft-ietf-nvo3-use-case, work in progress.
[NVO3DPREQ] Bitar, N., et al., "NVO3 Data Plane Requirements",
draft-ietf-nvo3-dataplane-requirements, work in progress.
[NVO3FRWK] Lasserre, M., Morin, T., et al., "Framework for DC Network
Virtualization", draft-ietf-nvo3-framework, work in progress.
[NVO3OAM] Ashwood-Smith, P., et al., "NVO3 Operational Requirements",
draft-ashwood-nvo3-operational-requirement, work in progress.
[NVGRE] Sridharan, M., et al., "NVGRE: Network Virtualization using
Generic Routing Encapsulation", draft-sridharan-virtualization-nvgre,
work in progress.
[VXLAN] Mahalingam, M., Dutt, D., et al., "VXLAN: A Framework for
Overlaying Virtualized Layer 2 Networks over Layer 3 Networks",
draft-mahalingam-dutt-dcops-vxlan, work in progress.
Authors' Addresses
Lucy Yong
Huawei Technologies, USA
Email: lucy.yong@huawei.com
Frank Liang Xia
Huawei Technologies
Email: frank.xialiang@huawei.com
Qiang Zu
Ericsson
Email: zu.qiang@ericsson.com