Network Working Group Y. Gu
Internet-Draft Y. Li
Intended status: Standards Track Huawei
Expires: April 22, 2013 Oct 19, 2012
The mechanism and signalling between TES and NVE
draft-gu-nvo3-tes-nve-mechanism-01
Abstract
This draft introduces the interaction required between a TES and an
NVE when the NVE is located in a box external to the TES. The
signaling between TES and NVE has to be designed carefully to reflect
all the interaction requirements. This document describes the
relevant considerations for such a design and also provides a basic
analysis of potentially reusable protocols. Currently this draft
focuses on the general interaction procedures, with their relevant
parameters, and on the signaling design considerations. It may be
extended with more detailed signalling design and/or solution
recommendations in the future as the NVO3 work progresses.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 22, 2013.
Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
Gu & Li Expires April 22, 2013 [Page 1]
Internet-Draft NVO3 TES to NVE mechanism Oct 2012
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Terminologies and concepts
3. TES to NVE Interaction
3.1. Interaction Intentions
3.2. VM Lifetime Events
3.2.1. VM Creation
3.2.2. VM Pre-associate with NVE
3.2.3. VM Associate with NVE
3.2.4. VM Suspension
3.2.5. VM Resume
3.2.6. VM Migration
3.2.7. VM Termination
3.2.8. VM Full Lifecycle Sketch
3.3. Events, Interaction and Parameters
3.3.1. VM Pre-association
3.3.2. VM Association
3.3.3. VM Suspension
3.3.4. VM Resume
3.3.5. VM Emigration
3.3.6. VM Immigration
3.3.7. VM Termination
3.3.8. Keep-alive
3.3.9. NVE Local Changes
3.4. Signalling Design Considerations
3.4.1. General Requirements
3.4.2. Consideration
3.4.3. Signalling States Machine
4. Security Considerations
5. Appendix 1: Mechanism Analysis
5.1. IEEE 802.1Qbg
5.1.1. Brief Introduction
5.2. BGP
5.3. External Controller
6. References
6.1. Normative Reference
6.2. Informative Reference
Authors' Addresses
1. Introduction
A Tenant End System (TES) is the physical host on which a tenant
deploys its applications. A tenant's applications can be deployed
directly on a physical server or on a virtual machine residing on a
physical server. A tenant's virtual network, or virtual data center,
is an overlay network that is built on the underlying network but is
logically independent of it. A Network Virtualization Edge (NVE)
implements virtualization functions that encapsulate and decapsulate
a tenant's packets, allowing for L2 and/or L3 tenant separation and
for hiding tenant addressing information (MAC and IP addresses). A
Tenant End System attaches to an NVE node, either directly or via a
switched network (typically Ethernet). TES and NVE can be on the
same physical server or on separate devices. Figures 1 to 3 show the
different NVE location cases. When TES and NVE are on the same
physical server, the interaction between them takes place over some
proprietary internal interface that does not require a standard
signaling protocol. Therefore that scenario is not the target of
this document. For all other scenarios, as long as the signaling
between TES and NVE is visible to network developers, it is in the
scope of this draft. We examined the different locations of the NVE
to make sure the signaling interaction between NVE and TES covers as
many scenarios as possible.
o (NVE Location 1) NVE and TES are co-located in a physical server.
The VM connects to the NVE on the Hypervisor. In this case, there
should be some mechanism that lets the Hypervisor learn of VM
changes, including addition, deletion and migration. The VM and the
Hypervisor, as well as network service appliances, are controlled by
the VM Manager. The VM Manager is aware of every VM identity and
event, hence it can easily notify the NVE of this information through
some internal interface. A publicly available standard protocol is
not necessary in this case. Refer to Fig1.
+-------------+------------+
| +--------------------+ |
| | +--------------+ | |
| | |Overlay Module| | |
| | +----+---------+ | |
| | | VN context| |
| | +-----+-------+ | |
| | | VNI | | |
| | +-+---------+-+ | |
| | | VAPs | | |
| +----+---------+-----+ |
| | | |
| +--+---------+---+ |
| | VM | |
| +----------------+ |
| |
+--------------------------+
Tenant End Systems
Figure 1
o (NVE Location 2) TES connects to an NVE on an external network
entity next to it. The VM is controlled by the VM Manager, while the
NVE is controlled by some other management entity such as a network
management system. Hence a proprietary protocol between TES and NVE
may not fit all scenarios. A standard protocol for signaling between
TES and NVE is mandatory in this case. Refer to Fig2.
+------- L3 Network --------+
| |
| Tunnel Overlay |
+------------+---------+ +---------+------------+
| +----------+-------+ | | +---------+--------+ |
| | Overlay Module | | | | Overlay Module | |
| +---------+--------+ | | +---------+--------+ |
| |VN context| | VN context| |
| | | | | |
| +--------+-------+ | | +--------+-------+ |
| | VNI | | | | VNI | |
NVE1 | +-+------------+-+ | | +-+-----------+--+ | NVE2
| | VAPs | | | | VAPs | |
+----+------------+----+ +----+-----------+-----+
| | | |
-------+------------+-----------------+-----------+-------
| | Tenant | |
| | Service IF | |
+----+------------+--------+ +---+-----------+-------+
| +----------------+ | | +---------------+ |
| | Hypervisor | | | | Hypervisor | |
| +--------+-------+ | | +-------+-------+ |
| | | | | |
| +-------+------+ | | +------+------+ |
| | VM | | | | VM | |
| +--------------+ | | +-------------+ |
| | | |
+--------------------------+ +-----------------------+
Tenant End Systems Tenant End Systems
Figure 2: NVE Location 2: VM connects to NVE on external network
entity
o (NVE Location 3) TES and NVE are indirectly connected. Refer to
Fig3.
+------- L3 Network ------+
| |
| Tunnel Overlay |
+------------+--------+ +--------+------------+
| +----------+------+ | | +------+----------+ |
| | Overlay Module | | | | Overlay Module | |
| +--------+--------+ | | +--------+--------+ |
| |VN Context| | |VN Context|
| | | | | |
| +-------+-------+ | | +------+-------+ |
| | VNI | | | | VNI | |
NVE1 | +-+-----------+-+ | | +-+----------+-+ | NVE2
| | VAPs | | | | VAPs | |
+----+-----------+----+ +----+-----------+----+ /\
| | | | |
................... ................... |
-----: switched network: : switched network: |signalling
................... ................... |
| | Tenant | | |
| | Service IF | | \/
Tenant End Systems Tenant End Systems
Figure 3: Reference model when TES and NVE are indirectly
connected
In the mailing list discussion, more than one mechanism for use
between TES and NVE was discussed, including VDP (VSI Discovery and
Configuration Protocol), BGP and others. This draft does not assert
which protocol is better. We believe that each candidate protocol
can, with some revision or updating, be used to exchange the
necessary events and information between TES and NVE. The final
decision on which one to use depends not only on functionality but
also on other aspects, e.g. being lightweight enough to implement on
a server, wide deployment in the industry, efficiency and
performance.
This draft first presents the recommended procedures for TES and NVE
signalling, the key parameters of each step, and the issues that need
to be addressed. Then a set of signaling design considerations is
provided, which can be used as design requirements for a future
signalling definition. In the appendix, we give a brief analysis of
two existing protocols and show how they can be revised to suit TES
and NVE signaling.
2. Terminologies and concepts
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
The document uses terms defined in [framework].
VN: Virtual Network. This is a virtual L2 or L3 domain that belongs
to a tenant.
VNI: Virtual Network Instance. This is one instance of a virtual
overlay network. Two Virtual Networks are isolated from one another
and may use overlapping addresses.
Virtual Network Context or VN Context: Field that is part of the
overlay encapsulation header which allows the encapsulated frame to
be delivered to the appropriate virtual network endpoint by the
egress NVE. The egress NVE uses this field to determine the
appropriate virtual network context in which to process the packet.
This field MAY be an explicit, unique (to the administrative domain)
virtual network identifier (VNID) or MAY express the necessary
context information in other ways (e.g. a locally significant
identifier).
VNID: Virtual Network Identifier. In the case where the VN context
has global significance, this is the ID value that is carried in each
data packet in the overlay encapsulation that identifies the Virtual
Network the packet belongs to.
NVE: Network Virtualization Edge. It is a network entity that sits
on the edge of the NVO3 network. It implements network
virtualization functions that allow for L2 and/or L3 tenant
separation and for hiding tenant addressing information (MAC and IP
addresses). An NVE could be implemented as part of a virtual switch
within a hypervisor, a physical switch or router, a Network Service
Appliance or even be embedded within an End Station.
Underlay or Underlying Network: This is the network that provides the
connectivity between NVEs. The Underlying Network can be completely
unaware of the overlay packets. Addresses within the Underlying
Network are also referred to as "outer addresses" because they exist
in the outer encapsulation. The Underlying Network can use a
completely different protocol (and address family) from that of the
overlay.
Data Center (DC): A physical complex housing physical servers,
network switches and routers, Network Service Appliances and
networked storage. The purpose of a Data Center is to provide
application and/or compute and/or storage services. One such service
is virtualized data center services, also known as Infrastructure as
a Service.
VM: Virtual Machine. Several Virtual Machines can share the
resources of a single physical computer server using the services of
a Hypervisor (see the definition below).
Hypervisor: Server virtualization software running on a physical
compute server that hosts Virtual Machines. The hypervisor provides
shared compute/memory/storage and network connectivity to the VMs
that it hosts. Hypervisors often embed a Virtual Switch (see below).
Virtual Switch: A function within a Hypervisor (typically implemented
in software) that provides similar services to a physical Ethernet
switch. It switches Ethernet frames between VMs' virtual NICs within
the same physical server, or between a VM and a physical NIC card
connecting the server to a physical Ethernet switch. It also
enforces network isolation between VMs that should not communicate
with each other.
Tenant: A customer who consumes virtualized data center services
offered by a cloud service provider. A single tenant may consume one
or more Virtual Data Centers hosted by the same cloud service
provider.
Tenant End System: It defines an end system of a particular tenant,
which can be for instance a virtual machine (VM), a non-virtualized
server, or a physical appliance.
Virtual Access Points (VAPs): Tenant End Systems are connected to the
Tenant Instance through Virtual Access Points (VAPs). The VAPs can
be in reality physical ports on a ToR or virtual ports identified
through logical interface identifiers (VLANs, internal VSwitch
Interface ID leading to a VM).
VN Name: A globally unique name for a VN. The VN Name is not carried
in data packets originating from End Stations, but must be mapped
into an appropriate VN-ID for a particular encapsulating technology.
Using VN Names rather than VN-IDs to identify VNs in configuration
files and control protocols increases the portability of a VDC and
its associated VNs when moving among different administrative domains
(e.g. switching to a different cloud service provider).
VSI: Virtual Station Interface. Typically, a VSI is a virtual NIC
connected directly with a VM. [Qbg]
3. TES to NVE Interaction
3.1. Interaction Intentions
When the TES is a non-virtualized physical server, a single physical
interface on the NVE is attached exclusively to a single tenant and
the attachment does not change very frequently. In this case, the
NVE can be pre-configured with the tenant's network properties and
policies so that it performs the appropriate packet processing. When
a physical server moves, that is, when the server changes its
attachment point to the network, the new NVE to which the server will
attach in the new location can also be pre-configured. In this case,
there is no need for signalling between TES and NVE.
When the TES is a virtualized server with multiple VMs, interaction
between TES and NVE becomes necessary. A physical interface on the
NVE can be attached to multiple VMs, which may belong to the same or
different tenants, and VMs can be moved to new locations without a
physical shutdown, which means the NVE cannot detect a VM's
attachment or detachment by checking the physical port. As described
in [framework], an NVE needs to establish a Virtual Network Instance
for each tenant virtual network attached to it through a physical
interface, so the NVE must know which tenants are attached to it and
which VMs belong to each tenant. The NVE must therefore be able to
1) identify and distinguish the VMs attached to it through the same
physical interface; 2) identify which tenant each VM belongs to; 3)
obtain the network policies that are associated with the tenant.
That is why interaction signalling between TES and NVE is needed. Of
course, the signalling between TES and NVE is not limited to the
purposes above. When we look into the detailed processing of VM
events, we will find more signalling functions and processing on TES
and NVE.
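As a non-normative illustration of the three capabilities listed
above, the following Python sketch shows one way an NVE could resolve
a frame to a VM, a tenant virtual network and its policies. The table
names, the port name, the MAC addresses, the VNID values and the use
of the source MAC address to tell VMs apart on one port are all
assumptions made for this example; the draft does not prescribe them.

```python
# Illustrative NVE-side tables; all names and values are hypothetical.

# (physical port, source MAC) -> VMID: distinguishes VMs behind one port.
ATTACHMENTS = {
    ("port1", "00:00:5e:00:53:01"): "vm-a",
    ("port1", "00:00:5e:00:53:02"): "vm-b",
}

# VMID -> VNID of the tenant virtual network the VM belongs to.
VM_VNID = {"vm-a": 100, "vm-b": 200}

# VNID -> policies associated with that tenant virtual network.
VN_POLICIES = {100: {"qos": "gold"}, 200: {"qos": "bronze"}}


def classify(port, src_mac):
    """Return (VMID, VNID, policies) for a frame, or None if unknown."""
    vmid = ATTACHMENTS.get((port, src_mac))
    if vmid is None:
        return None
    vnid = VM_VNID[vmid]
    return vmid, vnid, VN_POLICIES.get(vnid, {})
```

Without signalling from the TES, the NVE has no way to populate
tables of this kind when VMs appear on, or leave, a physical port.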
3.2. VM Lifetime Events
Not every VM has to pass through all of the VM lifetime events listed
below. A VM goes through at least two of the following events, in
some combination.
3.2.1. VM Creation
The VM Manager instructs the hypervisor to schedule resources on the
server for a particular VM, including CPU, memory, storage and
network resources. After the VM is created on the server, it has the
necessary resources and is ready to be launched. Creating a VM does
not necessarily mean the VM is running. A VM can be created but not
launched for as long as the manager wishes, or it can be created and
launched at once. Launching a VM is like starting up a physical
computer.
Although VM creation is a very important event for the VM, the
attached NVE need not be aware of it.
3.2.2. VM Pre-associate with NVE
The VM Manager decides when to launch a VM and connect it to the
network. Before the VM connects to the network, the operator needs
to provision the VM's network properties and policies on the NVE to
which the VM is attached. Examples of network properties are the VM
MAC address and the tenant virtual network identifier. Examples of
policies are ACL and QoS. These properties and policies are not
activated on the NVE until the VM Manager instructs the VM to connect
to the network. This is called Pre-association. Pre-association is
an optional event.
3.2.3. VM Associate with NVE
This event means the VM is going to connect to the network. The NVE
has to obtain the VM's network properties and policies, assign
resources and install these properties and policies. Once the VM is
associated, it can use network resources as a physical server does.
Association can happen with or without Pre-association. If there is
a Pre-association before Association, the NVE already has the network
properties and policies stored, or even installed, which reduces the
time needed for Association. If the network properties and policies
in the Association message are the same as those in the
Pre-association, the NVE can activate the installed network
properties and policies. If they are different, the previously
reserved resources should be released and the new network properties
and policies installed and activated.
3.2.4. VM Suspension
Creating and terminating a VM may take a considerable amount of time.
Instead of performing these operations, operators can suspend a
virtual machine for the required time and quickly resume it later.
Suspending a VM is similar to putting a real computer into sleep
mode. When a VM is suspended, its current state (including the state
of all applications and processes running in the VM) is stored. When
the suspended virtual machine is resumed, it continues operating from
the point it was at when it was suspended.
3.2.5. VM Resume
This event activates a suspended VM. The suspended applications
start again from the state in which the VM was suspended. It is not
always predictable when a suspended VM will be resumed.
3.2.6. VM Migration
There are two kinds of VM migration: hot migration (or live
migration) and offline migration. Offline migration is processed
much like terminating the VM on one server and creating it on another
server. The applications running on the VM are interrupted and then
restarted at the new location. In live migration, the VM is migrated
from one location to another while it is running, and the running
applications should not be visibly disrupted. There is no
termination or creation during live migration, so it is highly
important to make the NVE aware of the migration so that the
corresponding network properties and policies can be correctly
obtained, installed and activated at the new location, and removed
from the old location. Otherwise, there may be a security risk, and
running applications may be affected or even interrupted.
There are two sub-types of VM migration: VM emigration and VM
immigration.
o VM Emigrating: The VM is emigrating from this server. Hence, all
the relevant resources on the server and the attached NVE are
disabled but not yet removed, and are ready to be removed once the VM
has successfully migrated. If the VM fails to immigrate at the new
location, it has to be resumed at the old location with the states
and policies that were disabled by the old NVE.
o VM Immigrating: The VM is immigrating to this server. The server
and the attached NVE have prepared the necessary resources and are
ready to enable the VM's properties and policies once the VM has
successfully migrated.
3.2.7. VM Termination
All applications and processing on the VM are terminated. All of the
VM's resources on the server, including CPU, memory, storage and
network resources, are released. The VM no longer exists.
3.2.8. VM Full Lifecycle Sketch
Not every VM has to pass through all the lifetime events enumerated
above. The simplest VM life has only VM Creation, VM Associate with
NVE and VM Termination. The most complex VM life has all the events
listed above. In this section, we show a sketch of a VM's full
lifecycle with all the listed events. This is helpful for the future
signalling design.
/~~~~~~~~~~~~\ /~~~~~\
|VM Terminate|--Aged out-->|NULL |
\~~~~~~~~~~~~/ \~~~~~/
^ |
VM Terminate v
| /~~~~~~~~~~~\
+-----------------|VM Creation|<---------.
| \~~~~~~~~~~~/ |
| | Fail
| v |
| /~~~~~~~~~~~~~~~~\ |
+--------------|VM Pre-Associate|--------.
| |with NVE |<-------.
| \~~~~~~~~~~~~~~~~/ |
| | Fail
| v |
+----------------/~~~~~~~~~~~~~\<--------|-----------------.
| .----------->|VM Associate |---------. |
| | |with NVE |<--------. |
| | \~~~~~~~~~~~~~/ | Successful Immigration
|VM Resume | or | or | | to this server
| | | .---. .---. | |
| | v | | | /~~~~~~~~~~~~~~\
+---|-----/~~~~~~~~~~~~~\ | .------|---------->|VM Immigrating|
| .-----|VM Suspension| | | \~~~~~~~~~~~~~~/
| \~~~~~~~~~~~~~/ | | |
| | Failed Immigration |
| | to other server |
| v | |
| /~~~~~~~~~~~~~\ | Failed Immigration
+--------------------|VM Emigrating|-----. to this server
| \~~~~~~~~~~~~~/ |
| | |
| Successful Immigration to other server |
| | |
+---------------------------. |
| |
+-----------------------------------------------------------.
Figure 4: VM Full Lifecycle Sketch
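As an informal aid, the lifecycle in Figure 4 can be written down as
a transition table. The Python sketch below is one possible reading
of the figure, not a normative state machine: the state and event
names are invented for this example, the direct Creation to
Associate transition is taken from the text (Pre-association is
optional), and entering Immigrating from NULL is an assumption,
because the figure does not make the entry point unambiguous.

```python
from enum import Enum, auto


class State(Enum):
    NULL = auto()
    CREATED = auto()
    PRE_ASSOCIATED = auto()
    ASSOCIATED = auto()
    SUSPENDED = auto()
    EMIGRATING = auto()
    IMMIGRATING = auto()
    TERMINATED = auto()


# (current state, event) -> next state. Event names are illustrative.
TRANSITIONS = {
    (State.NULL, "create"): State.CREATED,
    (State.CREATED, "pre_associate"): State.PRE_ASSOCIATED,
    (State.CREATED, "associate"): State.ASSOCIATED,  # pre-association optional
    (State.PRE_ASSOCIATED, "fail"): State.CREATED,
    (State.PRE_ASSOCIATED, "associate"): State.ASSOCIATED,
    (State.ASSOCIATED, "fail"): State.PRE_ASSOCIATED,
    (State.ASSOCIATED, "suspend"): State.SUSPENDED,
    (State.SUSPENDED, "resume"): State.ASSOCIATED,
    (State.ASSOCIATED, "emigrate"): State.EMIGRATING,
    (State.SUSPENDED, "emigrate"): State.EMIGRATING,
    (State.EMIGRATING, "migration_failed"): State.ASSOCIATED,
    (State.EMIGRATING, "migration_succeeded"): State.TERMINATED,
    (State.NULL, "immigrate"): State.IMMIGRATING,  # assumed entry point
    (State.IMMIGRATING, "migration_succeeded"): State.ASSOCIATED,
    (State.IMMIGRATING, "migration_failed"): State.TERMINATED,
    (State.TERMINATED, "age_out"): State.NULL,
}

# VM Terminate is reachable from every state in which the VM exists locally.
for _s in (State.CREATED, State.PRE_ASSOCIATED, State.ASSOCIATED,
           State.SUSPENDED):
    TRANSITIONS[(_s, "terminate")] = State.TERMINATED


def run(events, state=State.NULL):
    """Apply events in order and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not allowed in {state.name}")
        state = TRANSITIONS[key]
    return state
```

For example, run(["create", "associate", "terminate"]) walks the
simplest VM life described in this section.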
3.3. Events, Interaction and Parameters
In this section, we describe the interaction, parameters and special
concerns for each VM event. The interaction is strongly related to
the VM lifetime events, but the mapping is not one-to-one; for
example, there is no interaction for VM Creation. For VM events, the
interaction is initiated by the hypervisor on behalf of a VM and sent
to the VNI on the attached NVE. This is not always the case,
however, since the NVE may also initiate interaction if changes occur
on the NVE that must be learned by particular VMs.
3.3.1. VM Pre-association
o Interaction: This event triggers the Hypervisor to compose a
Pre-association message and send it to the NVE. On receiving the
Pre-association message, the NVE needs to authorize the VM and/or
Hypervisor, obtain the VM's network properties and policies, and
install the properties and policies on the NVE.
o Parameters: The signalling from TES to NVE should include at least
the following mandatory parameters.
* Operation, i.e. Pre-association.
* VMID, a globally unique ID in the Data Center for a VM. A VM can
have more than one MAC address and belong to more than one VNID, so
a VMID is necessary for the NVE to associate the VNIDs and MACs with
the particular VM.
* VNID(s), a globally unique ID in the Data Center for a tenant's
virtual network.
* MAC addresses. A VM may have more than one MAC address and may
also belong to more than one virtual network, so the MAC address(es)
and VNID(s) should be presented in a way that lets the NVE identify
which MAC addresses belong to which VNID.
* Policies, including ACL, QoS, Priority, etc. If more than one
VNID is associated with the VM, the message should explicitly
indicate which VNID each policy belongs to.
o Response: After the NVE processes the Pre-association message, it
responds to the TES with the processing result. The response can be
SUCCESS or FAIL with an indicated reason such as FAILED
AUTHORIZATION, CONFLICT POLICIES (e.g. the provisioned policies
conflict with other existing policies on the NVE) or NON-SUFFICIENT
RESOURCES (e.g. the NVE does not have enough resources to install the
provisioned policies).
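To make the parameter list and the response values above concrete,
the following Python sketch models a Pre-association message and the
checks an NVE might run on it. It is an illustration under stated
assumptions, not a message format: the class and field names, the
grouping of MAC addresses and policies per VNID, and the way
authorization, policy conflict and capacity are checked are all
hypothetical. Only the validation step is shown; installing the
properties and policies is omitted.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Result(Enum):
    SUCCESS = "SUCCESS"
    FAILED_AUTHORIZATION = "FAILED AUTHORIZATION"
    CONFLICT_POLICIES = "CONFLICT POLICIES"
    NON_SUFFICIENT_RESOURCES = "NON-SUFFICIENT RESOURCES"


@dataclass
class VnBinding:
    """MAC addresses and policies of the VM within one virtual network."""
    vnid: int
    macs: List[str]
    policies: Dict[str, str] = field(default_factory=dict)


@dataclass
class PreAssociation:
    vmid: str
    bindings: List[VnBinding]  # one entry per VNID the VM belongs to
    operation: str = "Pre-association"


def handle_pre_association(msg, authorized_vmids, installed, free_bindings):
    """Validate a Pre-association request and return a Result.

    installed maps VNID -> policies already present on the NVE, and
    free_bindings is the number of further bindings the NVE can hold.
    Both are assumptions of this sketch.
    """
    if msg.vmid not in authorized_vmids:
        return Result.FAILED_AUTHORIZATION
    for b in msg.bindings:
        current = installed.get(b.vnid, {})
        # A conflict here means: same policy name, different value.
        if any(k in current and current[k] != v for k, v in b.policies.items()):
            return Result.CONFLICT_POLICIES
    if len(msg.bindings) > free_bindings:
        return Result.NON_SUFFICIENT_RESOURCES
    return Result.SUCCESS
```

Grouping MAC addresses and policies under each VNID is one simple way
to let the NVE see which MAC addresses and policies belong to which
virtual network.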
3.3.2. VM Association
o Interaction: This event triggers the Hypervisor to compose an
Association message and send it to the NVE. Association can happen
with or without a preceding Pre-association message.
* If there is a Pre-association message before Association, the NVE
needs to compare the information provided by the Pre-association and
the Association. If they are the same, the NVE can activate the
pre-installed resources. If they are different, the NVE needs to do
some additional work, depending on what information has changed from
Pre-association to Association. For example, if a policy or VNID
has changed, the NVE needs to update its memory.
* If there is no Pre-association message before Association, the NVE
needs to perform authorization, obtain the VM's network properties
and policies, and install and activate the properties and policies
on the NVE.
* If there is an earlier successful Association message before this
Association, the NVE needs to compare the information provided by
the previously provisioned Association with this Association. If
everything is the same, the NVE does nothing except update the VM's
timer. If the comparison shows a difference, the NVE needs to do
some additional work, depending on what information has changed.
For example, if policies or VNID have changed, the NVE needs to
update its memory.
o Parameters: The signalling from TES to NVE should include at least
the following mandatory parameters.
* Operation, i.e. Association.
* VMID
* VNID(s)
* MAC addresses
* Policies
o Response: After the NVE processes the Association message, it
responds to the TES with the processing result. The response can be
SUCCESS or FAIL with an indicated reason such as FAILED
AUTHORIZATION, CONFLICT POLICIES (e.g. the provisioned policies
conflict with other existing policies on the NVE) or NON-SUFFICIENT
RESOURCES (e.g. the NVE does not have enough resources to install the
provisioned policies).
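The three Association cases above can be sketched as a single
handler. The Python below is a minimal illustration that assumes the
NVE keeps one table entry per VMID; the Profile and Entry types, the
returned labels and the timestamp argument are invented for this
example, and authorization is left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Profile:
    """Network properties and policies carried in a (Pre-)Association."""
    vnids: Tuple[int, ...]
    macs: Tuple[str, ...]
    policies: Tuple[Tuple[str, str], ...]


@dataclass
class Entry:
    profile: Profile
    active: bool       # False while only pre-associated
    last_seen: float   # the VM's timer


def on_association(table: Dict[str, Entry], vmid: str,
                   profile: Profile, now: float) -> str:
    entry = table.get(vmid)
    if entry is None:
        # No Pre-association: install and activate in one step.
        table[vmid] = Entry(profile, True, now)
        return "installed"
    if entry.profile != profile:
        # Information changed: release the old state, install the new one.
        table[vmid] = Entry(profile, True, now)
        return "replaced"
    if entry.active:
        # Repeated identical Association: only refresh the VM's timer.
        entry.last_seen = now
        return "refreshed"
    # Matching Pre-association: activate the pre-installed resources.
    entry.active = True
    entry.last_seen = now
    return "activated"
```

Keeping the comparison in one place makes it clear that a repeated,
unchanged Association costs the NVE no more than a timer update.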
3.3.3. VM Suspension
o Interaction: This event triggers the Hypervisor to compose a
Suspension message, or an Association message with a Suspension
indication, and send it to the NVE. Suspension must happen after a
successful Association. On receiving a Suspension message, the NVE
deactivates, but does not remove, the VM's resources and prepares for
the next Resume message. In the suspension state, the NVE acts much
as it does in the Pre-association state. The FDB entries can be aged
out during VM suspension.
o Parameters: The signalling from TES to NVE should include at least
the following mandatory parameters.
* Operation, i.e. Suspension, or Association with a Suspension
indication.
* VMID
o Response: After the NVE processes the Suspension message, it
responds to the TES with the processing result. The response can be
SUCCESS or FAIL. A FAIL may occur, for example, because the NVE is
too busy to process the message.
3.3.4. VM Resume
o Interaction: This event triggers the Hypervisor to compose a Resume
message, or an Association message with a Resume indication, and send
it to the NVE. Resume is supposed to happen after a successful
Suspension message; otherwise, the NVE responds with SUCCESS and
takes no further action on the message. On receiving a valid Resume
message, the NVE activates the VM's resources.
o Parameters: The signalling from TES to NVE should include at least
the following mandatory parameters.
* Operation, i.e. Resume, or Association with a Resume indication.
* VMID
o Response: After the NVE processes the Resume message, it responds
to the TES with the processing result. The response can be SUCCESS
or FAIL. A FAIL may occur, for example, because the NVE is too busy
to process the message.
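The Suspension and Resume behaviour can be sketched as two small
handlers over a per-VM state table. This Python is illustrative
only: the state labels and function names are hypothetical, and
answering FAIL to a Suspension for a VM that is not associated is an
assumption of this sketch, since the draft only states that
Suspension must follow a successful Association.

```python
def on_suspension(states, vmid):
    """Deactivate, but keep, the VM's resources."""
    if states.get(vmid) != "associated":
        # Assumption: a Suspension without a prior Association is refused.
        return "FAIL"
    states[vmid] = "suspended"
    return "SUCCESS"


def on_resume(states, vmid):
    """Reactivate a suspended VM; otherwise acknowledge and do nothing."""
    if states.get(vmid) == "suspended":
        states[vmid] = "associated"
    # A Resume that does not follow a Suspension is still answered SUCCESS.
    return "SUCCESS"
```

Because a stray Resume is acknowledged without any change, a
Hypervisor can safely retransmit it if the first response is lost.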
3.3.5. VM Emigration
o Interaction: This event triggers the Hypervisor to compose an
Emigration message, or an Association message with an Emigration
indication, and send it to the NVE. Emigration can happen after
Pre-association, Association, Suspension or Resume.
o On receiving a VM Emigration message or indication, the NVE
deactivates the VM's resources. The NVE does not immediately remove
the VM's resources and states, because an emigration may fail if the
immigration on the remote server or NVE fails. In that case, the
emigrating VM may need to continue its work on the current server.
The NVE waits for a subsequent Termination message before removing
the VM's resources and states.
o Parameters: The signalling from TES to NVE should include at least
the following mandatory parameters.
* Operation, i.e. Emigration, or Association with an Emigration
indication.
* VMID
o Response: After the NVE processes the VM Emigration, it responds to
the TES with the processing result. The response can be SUCCESS or
FAIL. A FAIL may occur, for example, because the NVE is too busy to
process the message.
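The deferred removal described above can be illustrated with a short
Python sketch. The table layout and function names are hypothetical.
Two points are assumptions rather than statements of the draft: an
Emigration for an unknown VM is answered with FAIL, and a failed
migration is signalled to the old NVE by a fresh Association, which
the sketch calls on_reassociation.

```python
def on_emigration(vms, vmid):
    """Deactivate the VM's resources but keep them until Termination."""
    entry = vms.get(vmid)
    if entry is None:
        return "FAIL"  # assumption: unknown VMs are refused
    entry["active"] = False
    entry["emigrating"] = True
    return "SUCCESS"


def on_reassociation(vms, vmid):
    """Migration failed elsewhere: re-enable the retained resources."""
    entry = vms.get(vmid)
    if entry is None:
        return "FAIL"
    entry["active"] = True
    entry["emigrating"] = False
    return "SUCCESS"


def on_termination(vms, vmid):
    """Migration succeeded (or the VM ended): remove all state for the VM."""
    vms.pop(vmid, None)
    return "SUCCESS"
```

Retaining the deactivated entry is what allows the VM to continue on
the current server without a full re-installation if the immigration
at the remote side fails.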
3.3.6. VM Immigration
o Interaction: This event will trigger Hypervisor to compose an
Immigration message, or an Pre-association/Association message
with Immigration indication, call them immigration(Pre-asso) and
Immigration(Asso). NVE's reaction to VM Immigration is silimar to
its reaction to Pre-association or Association. If the result of
Immigration processing is FAIL, the VM will not migrate to the new
location and continue its work on old server. VM Manger may have
to find another new location for the VM to migrate to.
o To distinguish Immigration from Pre-association and Association is
meaningful, [statemigration-framework]shows the problem of VM's
flow-coupled state migration in case of VM live migration. The
Immigration message can be a indication or trigger for the flow-
coupled state migration on middleboxes.
o Parameters: The signalling from TES to NVE should at least include
the following mandatory parameters.
* Operation, i.e. Immigration or a (Pre-)Association message
with Immigration indication.
* VMID
* VNID(s)
* MAC addresses
* Policies
o Response: After NVE processes the Immigration message, it responds
to TES with the processing result. The response can be SUCCESS or
FAIL with an indicated reason such as FAILED AUTHORIZATION,
CONFLICT POLICIES (e.g. the provisioned policies conflict with
other policies that already exist on NVE), or NON-SUFFICIENT
RESOURCES (e.g. the NVE does not have enough resources to install
the provisioned policies).
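The response reasons listed for Immigration can be illustrated with a
small Python sketch. Everything below is an assumption made for
illustration: policies are modelled as a mapping from rule name to
action, a conflict means the same rule name with a different action,
and resources are modelled as a count of free policy slots. The draft
does not define any of these details.

```python
from enum import Enum


class Result(Enum):
    SUCCESS = "SUCCESS"
    FAILED_AUTHORIZATION = "FAILED AUTHORIZATION"
    CONFLICT_POLICIES = "CONFLICT POLICIES"
    NON_SUFFICIENT_RESOURCES = "NON-SUFFICIENT RESOURCES"


def process_immigration(authorized, new_policies, installed_policies,
                        free_slots):
    """Return the response for an Immigration request (hypothetical model)."""
    if not authorized:
        return Result.FAILED_AUTHORIZATION
    # Assumed conflict rule: same rule name, different action.
    for name, action in new_policies.items():
        if name in installed_policies and installed_policies[name] != action:
            return Result.CONFLICT_POLICIES
    # Only policies that are not installed yet consume a free slot.
    needed = sum(1 for name in new_policies if name not in installed_policies)
    if needed > free_slots:
        return Result.NON_SUFFICIENT_RESOURCES
    return Result.SUCCESS
```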
3.3.7. VM Termination
o Interaction: This event will trigger Hypervisor to compose a
Termination message. NVE will release the VM's resources on NVE
and remove all state about this VM.
o Parameters: The signalling from TES to NVE should at least include
the following mandatory parameters.
* Operation, i.e. Termination
* VMID
o Response: After NVE processes the Termination message, it responds
to TES with the processing result. The response can be SUCCESS or
FAIL. If it's FAIL, it may be because NVE is too busy to process
the Termination message; however, the VM can be terminated on the
server anyway.
3.3.8. Keep-alive
This is not a VM lifetime event. Since resources on NVE are
precious, if an associated, pre-associated or suspended VM stays idle
for a pre-defined time, NVE will remove the VM's resources so that it
can serve other active VMs. To keep a VM's resources on NVE,
Hypervisor has to send a Keep-alive message, or a
Pre-association/Association message with Keep-alive indication. NVE
will reset the VM's timer upon receiving the Keep-alive message.
Parameters: The signalling from TES to NVE should at least include
the following mandatory parameters.
o Operation, i.e. Keep-alive or a (Pre-)Association message with
Keep-alive indication.
o VMID
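The aging behaviour can be sketched in Python as below. The class
AgingTable and its method names are hypothetical, and time is passed
in explicitly (in seconds) so that the example stays deterministic.
The draft does not specify the timer value or how it is configured.

```python
class AgingTable:
    """Hypothetical per-VM idle timers kept on the NVE."""

    def __init__(self, max_idle):
        self.max_idle = max_idle   # the pre-defined idle time
        self.last_seen = {}        # VMID -> time of the last message

    def touch(self, vmid, now):
        # Called for a Keep-alive message, or for a (Pre-)Association
        # message with Keep-alive indication.
        self.last_seen[vmid] = now

    def age_out(self, now):
        # Remove and return the VMIDs that stayed idle for too long.
        expired = [vmid for vmid, seen in self.last_seen.items()
                   if now - seen > self.max_idle]
        for vmid in expired:
            del self.last_seen[vmid]
        return expired
```

A VM that keeps sending Keep-alive messages within the idle time is
never returned by age_out(), so its resources stay on the NVE.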
3.3.9. NVE Local Changes
When a VM associates with a VNID on NVE, NVE will generate locally
significant indicators for the VM and VNIDs, e.g. VID. If the
indicators were sent to Hypervisor in a previous response and they
change later on, NVE needs to create an Association message or a
dedicated message carrying the changed indicators and send it to
Hypervisor, and Hypervisor will respond with the processing result.
Note: Although we use the names of VM lifetime events as the names of
messages in this section, it does not mean that there should be a
dedicated message for each event in the future signalling. Some of
the events can be carried in one signalled message with different
operation types, for example, an Association message with Immigration
indication or an Association message with Suspension indication.
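One way to picture a single message carrying several events is a
message with an operation plus an optional indication. The Python
sketch below is only an illustration: TnsMessage, its field names and
the parameter keys are hypothetical, while the mandatory parameter
sets follow the lists given for each event earlier in this section.

```python
from dataclasses import dataclass, field
from typing import Optional

# Mandatory parameters per event, besides the operation itself.
MANDATORY = {
    "Emigration": {"vmid"},
    "Immigration": {"vmid", "vnids", "macs", "policies"},
    "Termination": {"vmid"},
    "Keep-alive": {"vmid"},
}


@dataclass
class TnsMessage:
    operation: str                     # e.g. "Association"
    indication: Optional[str] = None   # e.g. "Immigration"
    params: dict = field(default_factory=dict)

    def event(self):
        # The indication, when present, names the event being signalled.
        return self.indication or self.operation

    def missing(self):
        # Mandatory parameters that the message does not carry.
        required = MANDATORY.get(self.event(), set())
        return sorted(required - set(self.params))
```

For example, an Association message with Immigration indication that
carries only a VMID and VNIDs would report "macs" and "policies" as
missing.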
3.4. Signalling Design Considerations
3.4.1. General Requirements
3.4.1.1. Basic Requirements
REQUIREMENT-1: The TNS (TES to NVE Signalling) MUST support TES to
notify NVE about the VM's events, including but not limited to
Pre-Association, Association, Emigration, Immigration and
Termination.
REQUIREMENT-2: The TNS MUST support TES to notify NVE about the VM's
VNID, which can be one identifier or a combination of several
identifiers.
REQUIREMENT-3: The TNS MUST support TES to notify NVE about the VM's
address. The address MUST include one or both of the MAC address
of the VM's virtual NIC and the VM's IP address. It SHOULD be
extensible to carry new address types.
REQUIREMENT-4: The TNS MUST support NVE to notify TES about the VM's
local tag. The local tag types supported by TNS MUST include the
IEEE 802.1Q tag. It SHOULD be extensible to carry other types of
local tag.
3.4.1.2. Extension Requirements
REQUIREMENT-5: The TNS SHOULD support NVE to notify TES about the
VM's traffic PCP value.
In a typical DC, where a physical server connects to an adjacent
bridge, a data frame from the server can be tagged with PCP or
untagged. If a data frame is untagged, it can be tagged with PCP on
the adjacent bridge. In a virtualized DC, the adjacent bridge is the
Hypervisor. There are three options for handling the PCP tag: 1) the
data frame is tagged with PCP by the VM, 2) the data frame is tagged
with PCP by the Hypervisor, or 3) the data frame is tagged with PCP
by NVE.
In a cloud service, a VM can belong to anybody, and it may want a
higher priority than it is entitled to. The VM can tag its data
frames with a higher PCP value to get better service. On the
assumption that the PCP provided by a VM is not reliable, it is more
reasonable to let the network define the PCP value based on the VM's
priority and have bridges apply the PCP tag, as in 2) or 3).
This problem is similar to that of the local VID, which can be tagged
either by Hypervisor or by NVE. The benefit of tagging PCP on the
Hypervisor is that it reduces the load on NVE.
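Overriding an unreliable PCP value amounts to rewriting three bits of
the 802.1Q Tag Control Information (TCI). The Python sketch below
shows that operation on a 16-bit TCI value (PCP 3 bits, DEI 1 bit,
VID 12 bits). The function names and the idea of a per-VM priority
supplied by the network are assumptions made for illustration.

```python
def set_pcp(tci, pcp):
    """Return the 16-bit 802.1Q TCI with its PCP field replaced.

    TCI layout: PCP (3 bits) | DEI (1 bit) | VID (12 bits).
    """
    if not 0 <= pcp <= 7:
        raise ValueError("PCP is a 3-bit value")
    # Keep DEI and VID (low 13 bits), then write the new PCP on top.
    return (tci & 0x1FFF) | (pcp << 13)


def enforce_pcp(tci, vm_priority):
    # Ignore whatever PCP the VM put in the frame and apply the
    # network-assigned priority (hypothetical per-VM value).
    return set_pcp(tci, vm_priority)
```

The same function applies whether the rewrite is done on Hypervisor
or on NVE; only the place where it runs differs.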
3.4.2. Consideration
To be added.
3.4.3. Signalling States Machine
The interaction should be stateful. Both Hypervisor and NVE need to
record their signalling state. The main states are Pre-association,
Association, Suspension, and Termination. The following diagram
shows the state machine of TES to NVE signalling. Only reasonable
situations are listed in the diagram. In the future, more
situations will be added to the state machine.
         |------------------->/```\----------------------|
         |                    \~~~/                      |
         |                      |Pre-Asso                |
         |                      |or                      |
         |                      |Immigration(Pre-Asso)   |
  /~~~~~~~~~~~\  Aged out       v                        |
  |Termination|<----|  /~~~~~~~~~~~~~~~~\               Asso
  \~~~~~~~~~~~/<-\  ---|Pre-Association |                or
        ^         \    \~~~~~~~~~~~~~~~~/        Immigration(Asso)
        |          \            |                        |
    Aged out    Aged out        |Asso                    |
       or          or           |or                      |
   Termination Termination      |Immigration(Asso)       |
        |            \----|     v                        |
  /~~~~~~~~~~~\Suspension/~~~~~~~~~~~~~\                 |
  |Suspension |<---------| Association |<----------------|
  \~~~~~~~~~~~/--------->\~~~~~~~~~~~~~/
                 Resume     /        ^
                           /          \
  /~~~\                   |            |
  \~~~/ States            |-Emigration-|
                                or
                        Immigration(Asso)
  ------ Message
Figure 5: TES to NVE signalling State Machine
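The state machine in Figure 5 can also be written as a transition
table. The Python sketch below is one reading of the figure, not a
normative definition: "Init" is a hypothetical name for the unnamed
start node, and the function next_state() is introduced here only for
illustration.

```python
# (current state, message) -> next state, as read from Figure 5.
TRANSITIONS = {
    ("Init", "Pre-Asso"): "Pre-Association",
    ("Init", "Immigration(Pre-Asso)"): "Pre-Association",
    ("Init", "Asso"): "Association",
    ("Init", "Immigration(Asso)"): "Association",
    ("Pre-Association", "Asso"): "Association",
    ("Pre-Association", "Immigration(Asso)"): "Association",
    ("Pre-Association", "Aged out"): "Termination",
    ("Association", "Suspension"): "Suspension",
    ("Association", "Emigration"): "Association",
    ("Association", "Immigration(Asso)"): "Association",
    ("Association", "Aged out"): "Termination",
    ("Association", "Termination"): "Termination",
    ("Suspension", "Resume"): "Association",
    ("Suspension", "Aged out"): "Termination",
    ("Suspension", "Termination"): "Termination",
}


def next_state(state, message):
    """Return the next state, or raise if the message is not allowed."""
    try:
        return TRANSITIONS[(state, message)]
    except KeyError:
        raise ValueError(
            f"{message!r} is not valid in state {state!r}") from None
```

Keeping the same table on both Hypervisor and NVE gives each side a
simple way to reject messages that do not fit the recorded state.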
4. Security Considerations
There are some considerations on security in [overlay-cp]. Most of
them concern the mechanism between NVE and an external controller
and attacks on the underlying network, which cannot be resolved by
the mechanism between TES and NVE alone. One security issue that
does relate to the mechanism between TES and NVE is the
authentication of a VM that announces its association with a
particular VN. There is a hypervisor between VMs and NVEs, and
neither VMs nor hypervisors are always reliable. For example, a
compromised hypervisor may modify the VN Name, or an identifier used
for a similar purpose, in order to associate with a VN that it
doesn't belong to.
5. Appendix 1: Mechanism Analysis
5.1. IEEE 802.1Qbg
5.1.1. Brief Introduction
VDP has four basic TLV types.
o Pre-Associate: Pre-Associate is used to pre-associate a VSI
instance with a bridge port. The bridge validates the request and
returns a failure Status in case of errors. Successful pre-
association does not imply that the indicated VSI Type will be
applied to any traffic flowing through the VSI. The pre-associate
enables faster response to an associate, by allowing the bridge to
obtain the VSI Type prior to an association.
o Pre-Associate with resource reservation: Pre-Associate with
Resource Reservation involves the same steps as Pre-Associate, but
on successful pre-association also reserves resources in the
Bridge to prepare for a subsequent Associate request.
o Associate: The Associate TLV Type creates and activates an
association between a VSI instance and a bridge port. The Bridge
allocates any required bridge resources for the referenced VSI.
The Bridge activates the configuration for the VSI Type ID. This
association is then applied to the traffic flow to/from the VSI
instance.
o Deassociate: The de-associate TLV Type is used to remove an
association between a VSI instance and a bridge port. Pre-
Associated and Associated VSIs can be de-associated. De-associate
releases any resources that were reserved as a result of prior
Associate or Pre-Associate operations for that VSI instance.
|1       |2        |3       |4       |7       |8     |9      |25         |26          |25+M
|--------+---------+--------+--------+--------+------+-------+-----------+------------|
|TLV type|TLV info | Status |VSI Type|VSI Type|VSIID |VSIID  |Filter Info|Filter Info |
|(7bits) |strlength|(1octet)|   ID   |version |format|(16oct)|  format   | (M octets) |
|        | (9bits) |        | (3oct) | (1oct) |(1oct)|       | (1 octet) |            |
|--------+---------+--------+--------+--------+------+-------+-----------+------------|
|                           |<-------VSI type&instance------>|<-------Filter----------|
|                           |<--------------------VSI attributes--------------------->|
|<----TLV header--><--------------TLV information string = 23+M octets--------------->|
Figure 6: VDP TLV definitions
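The two-octet TLV header in Figure 6 packs a 7-bit TLV type and a
9-bit information string length. A minimal Python sketch of that
packing is given below. The function names are hypothetical, and the
TLV type value used in the usage note is an arbitrary placeholder,
not a value taken from [Qbg].

```python
import struct


def pack_tlv_header(tlv_type, length):
    """Pack a 7-bit TLV type and 9-bit length into two octets."""
    if not 0 <= tlv_type < 2 ** 7:
        raise ValueError("TLV type must fit in 7 bits")
    if not 0 <= length < 2 ** 9:
        raise ValueError("TLV information string length must fit in 9 bits")
    return struct.pack("!H", (tlv_type << 9) | length)


def unpack_tlv_header(data):
    """Return (tlv_type, length) from the first two octets of data."""
    (word,) = struct.unpack("!H", data[:2])
    return word >> 9, word & 0x1FF


def vdp_info_length(filter_octets):
    # Per Figure 6, the information string is 23 + M octets long,
    # where M is the size of the filter information field.
    return 23 + filter_octets
```

For instance, with a placeholder type of 2 and a 4-octet filter
field, pack_tlv_header(2, vdp_info_length(4)) yields the two header
octets that precede the Status field.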
Some important flag values in VDP request:
o M-bit (Bit 5): Indicates that the user of the VSI (e.g., the VM)
is migrating (M-bit = 1) or provides no guidance on the migration
of the user of the VSI (M-bit = 0). The M-bit is used as an
indicator relative to the VSI that the user is migrating to.
o S-bit (Bit 6): Indicates that the VSI user (e.g., the VM) is
suspended (S-bit = 1) or provides no guidance as to whether the
user of the VSI is suspended (S-bit = 0). A keep-alive Associate
request with S-bit = 1 can be sent when the VSI user is suspended.
The S-bit is used as an indicator relative to the VSI that the
user is migrating from.
The filter information field supports the following format:
o VID
+---------+------+-------+--------+
|  #of    |  PS  |  PCP  |  VID   |
|entries  |(1bit)|(3bits)|(12bits)|
|(2octets)|      |       |        |
+---------+------+-------+--------+
          |<--Repeated per entry->|
Figure 7
o MAC/VID
+---------+--------------+------+-------+--------+
|  #of    | MAC address  |  PS  |  PCP  |  VID   |
|entries  |  (6 octets)  |(1bit)|(3bits)|(12bits)|
|(2octets)|              |      |       |        |
+---------+--------------+------+-------+--------+
          |<--------Repeated per entry---------->|
Figure 8
o GroupID/VID
+---------+--------------+------+-------+--------+
|  #of    |   GroupID    |  PS  |  PCP  |  VID   |
|entries  |  (4 octets)  |(1bit)|(3bits)|(12bits)|
|(2octets)|              |      |       |        |
+---------+--------------+------+-------+--------+
          |<--------Repeated per entry---------->|
Figure 9
o GroupID/MAC/VID
+---------+-----------+-------------+------+-------+--------+
|  #of    |  GroupID  | MAC address |  PS  |  PCP  |  VID   |
|entries  |(4 octets) | (6 octets)  |(1bit)|(3bits)|(12bits)|
|(2octets)|           |             |      |       |        |
+---------+-----------+-------------+------+-------+--------+
          |<--------------Repeated per entry--------------->|
Figure 10
In each format, the null VID can be used in the VDP Request. In this
case, the Bridge is expected to supply the corresponding local VID
value in the VDP Response.
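As an illustration of the simplest filter format (VID only), the
Python sketch below encodes and decodes the filter information field:
a 2-octet entry count followed by one 16-bit word per entry. It
assumes the fields are laid out most significant bit first in the
order drawn (PS, PCP, VID) and uses 0 as the null VID; the function
names are hypothetical.

```python
import struct

NULL_VID = 0  # null VID: asks the Bridge to supply the local VID


def pack_vid_filter(entries):
    """Encode a list of (ps, pcp, vid) tuples in the VID filter format."""
    out = struct.pack("!H", len(entries))
    for ps, pcp, vid in entries:
        word = ((ps & 0x1) << 15) | ((pcp & 0x7) << 12) | (vid & 0xFFF)
        out += struct.pack("!H", word)
    return out


def unpack_vid_filter(data):
    """Decode the VID filter format back into (ps, pcp, vid) tuples."""
    (count,) = struct.unpack_from("!H", data, 0)
    entries = []
    for i in range(count):
        (word,) = struct.unpack_from("!H", data, 2 + 2 * i)
        entries.append((word >> 15, (word >> 12) & 0x7, word & 0xFFF))
    return entries
```

A request would carry an entry such as (0, 0, NULL_VID), and the
response would carry the same entry with the VID filled in by the
Bridge.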
The VSIID in a VDP request, which identifies a VM, can be in one of
the following formats: IPv4 address, IPv6 address, MAC address, UUID,
or locally defined.
+--------------------------------------------------+----------------+
| VDP features                                     | Requirements   |
|                                                  | Matching       |
+--------------------------------------------------+----------------+
| Pre-Associate/ Pre-Associate with resource       | Requirement-1  |
| reservation/ Associate/ Deassociate              |                |
| M-bit/S-bit                                      | Requirement-1  |
| VSI type&instance in VDP request                 | Requirement-2  |
| Filter Info                                      | Requirement-3  |
| VID info in VDP response                         | Requirement-4  |
| PCP in VDP response                              | Requirement-5  |
+--------------------------------------------------+----------------+
VDP TLV types
5.2. BGP
[server2nve] gives a brief analysis of how BGP can be reused for TES
to NVE signalling. Please refer to it for more information.
5.3. External Controller
6. References
6.1. Normative Reference
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", March 1997.
[Qbg] "IEEE P802.1Qbg Edge Virtual Bridging".
6.2. Informative Reference
[framework]
Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y.
Rekhter, "draft-ietf-nvo3-framework-00", September 2012.
[overlay-cp]
Kreeger, L., Dutt, D., Narten, T., Black, D., and M.
Sridharan, "draft-kreeger-nvo3-overlay-cp-00", Jan 2012.
[server2nve]
Kompella, K.,
"draft-dunbar-nvo3-overlay-mobility-issues-00", July 2012.
[statemigration-framework]
Gu, Y., Shore, M., and S. Sivakumar, "A Framework and
Problem Statement for Flow-associated Middlebox State
Migration", October 2012.
Authors' Addresses
Gu Yingjie
Huawei
No. 101 Software Avenue
Nanjing, Jiangsu Province 210001
P.R.China
Phone: +86-25-56625392
Email: guyingjie@huawei.com
Yizhou Li
Huawei
No. 101 Software Avenue
Nanjing, Jiangsu Province 210001
P.R.China
Phone:
Email: liyizhou@huawei.com