Internet DRAFT - draft-rekhter-nvo3-vm-mobility-issues

draft-rekhter-nvo3-vm-mobility-issues









Network Working Group                                         Y. Rekhter
Internet Draft                                          Juniper Networks
Category: Standards Track
Expiration Date: April 2013
                                                           W. Henderickx
                                                          Alcatel-Lucent

                                                              R. Shekhar
                                                        Juniper Networks

                                                             Luyuan Fang
                                                           Cisco Systems

                                                            Linda Dunbar
                                                                  Huawei

                                                             Ali Sajassi
                                                           Cisco Systems

                                                          October 7 2012


                   Network-related VM Mobility Issues


              draft-rekhter-nvo3-vm-mobility-issues-03.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.





Rekhter                                                         [Page 1]


Internet Draftdraft-rekhter-nvo3-vm-mobility-issues-03.txt  October 2012


Copyright and License Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.



Abstract

   This document describes a set of network-related issues presented by
   the desire to support seamless Virtual Machine mobility in the data
   center and between data centers. In particular, it looks at the
   implications of meeting the requirements for "seamless mobility".
















Rekhter                                                         [Page 2]


Internet Draftdraft-rekhter-nvo3-vm-mobility-issues-03.txt  October 2012


Table of Contents

 1          Specification of requirements  .........................   3
 2          Introduction  ..........................................   3
 2.1        Terminology  ...........................................   4
 3          Problem Statement  .....................................   7
 3.1        Usage of VLAN-IDs  .....................................   7
 3.2        Maintaining Connectivity in the Presence of VM Mobility  ...8
 3.3        Layer 2 Extension  .....................................   8
 3.4        Optimal IP Routing  ....................................   9
 3.5        Preserving Policies  ...................................  10
 4          IANA Considerations  ...................................  10
 5          Security Considerations  ...............................  10
 6          Acknowledgements  ......................................  10
 7          References  ............................................  10
 8          Author's Address  ......................................  11






1. Specification of requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].


2. Introduction

   An important feature of data centers identified in [nvo3-problem] is
   the support of Virtual Machine (VM) mobility within the data center
   and between data centers. This document describes a set of network-
   related issues presented by the desire to support seamless Virtual
   Machine mobility in the data center, where seamless mobility is
   defined as the ability to move a VM from one server in the data
   center to another server in the same or different data center, while
   retaining the IP and MAC address of the VM. In the context of this
   document the term mobility, or a reference to moving a VM should be
   considered to imply seamless mobility, unless otherwise stated.

   Note that in the scenario where a VM is moved between servers located



Rekhter                                                         [Page 3]


Internet Draftdraft-rekhter-nvo3-vm-mobility-issues-03.txt  October 2012


   in different data centers, there are certain issues related to the
   current state of the art of the Virtual Machine technology, the
   bandwidth that may be available between the data centers, the
   distance between the data centers, the ability to manage and operate
   such VM mobility, storage-related issues (the moved VM has to have
   access to the same virtual disk), etc.  Discussion of these issues is
   outside the scope of this document.


2.1. Terminology

   In this document the term "Top of Rack Switch (ToR)" is used to refer
   to a switch in a data center that is connected to the servers that
   host VMs. A data center may have multiple ToRs. When External Bridge
   Port Extenders (as defined by 802.1BR) are used to connect the
   servers to the data center network, the ToR switch is the Controlling
   Bridge.

   Several data centers could be connected by a network. In addition to
   providing interconnect among the data centers, such a network could
   provide connectivity between the VMs hosted in these data centers and
   the sites that contain hosts communicating with such VMs. Each data
   center has one or more Data Center Border Router (DCBR) that connects
   the data center to the network, and provides (a) connectivity between
   VMs hosted in the data center and VMs hosted in other data centers,
   and (b) connectivity between VMs hosted in the data center and hosts
   communicating with these VMs.

   The following figure illustrates the above:


                   __________
                  (          )
                 ( Data Center)
                ( Interconnect )-------------------------
                 (  Network   )                          |
                  (__________)                           |
                     |    |                              |
                 ----      ----                          |
                |              |                         |
        --------+--------------+---------------        -------------
       |        |              |       Data     |     |             |
       |     ------          ------    Center   |     | Data Center |
       |    | DBCR |        | DBCR |            |     |             |
       |     ------          ------             |      -------------
       |        |              |                |
       |         ---        ---                 |
       |         ___|______|__                  |



Rekhter                                                         [Page 4]


Internet Draftdraft-rekhter-nvo3-vm-mobility-issues-03.txt  October 2012


       |        (             )                 |
       |       (  Data Center  )                |
       |        (   Network   )                 |
       |         (___________)                  |
       |            |      |                    |
       |        ----        ----                |
       |       |                |               |
       |  ------------        -----             |
       | | ToR Switch |      | ToR |            |
       |  ------------        -----             |
       |   |                    |               |
       |   |   ----------       |   ----------  |
       |   |--| Server   |      |--| Server   | |
       |   |  |          |      |   ----------  |
       |   |  |  ----    |      |               |
       |   |  | | VM |   |      |   ----------  |
       |   |  |  -----   |       --| Server   | |
       |   |  |  | VM |  |          ----------  |
       |   |  |   -----  |                      |
       |   |  |   | VM | |                      |
       |   |  |    ----  |                      |
       |   |   ----------                       |
       |   |                                    |
       |   |   ----------                       |
       |   |--| Server   |                      |
       |   |   ----------                       |
       |   |                                    |
       |   |   ----------                       |
       |    --| Server   |                      |
       |       ----------                       |
       |                                        |
        ----------------------------------------






   The data centers and the network that interconnects them may be
   either (a) under the same administrative control, or (b) controlled
   by different administrations.

   Consider a set of VMs that (as a matter of policy) are allowed to
   communicate with each other, and a collection of devices that
   interconnect these VMs. If communication among any VMs in that set
   could be accomplished in such a way as to preserve MAC source and
   destination addresses in the Ethernet header of the packets exchanged
   among these VMs (as these packets traverse from their sources to



Rekhter                                                         [Page 5]


Internet Draftdraft-rekhter-nvo3-vm-mobility-issues-03.txt  October 2012


   their destinations), we will refer to such set of VMs as an Layer 2
   based Closed User Group (L2-based CUG).

   A given VM may be a member of more than one L2-based CUG.

   In terms of IP address assignment this document assumes that all VMs
   of a given L2-based CUG have their IP addresses assigned out of a
   single IP prefix. Thus, in the context of this document a single IP
   subnet corresponds to a single L2-based CUG.  If a given VM is a
   member of more than one L2-based CUG, this VM would have multiple IP
   addresses and multiple logical interface, one IP address and one
   logical interface per each such CUG.

   A VM that is a member of a given L2-based CUG may (as a matter of
   policy) be allowed to communicate with VMs that belong to other
   L2-based CUGs, or with other hosts. Such communication involves IP
   forwarding, and thus would result in changing MAC source and
   destination addresses in the Ethernet header of the packets being
   exchanged.

   In this document the term "L2 physical domain" refers to a collection
   of interconnected devices that perform forwarding based on the
   information carried in the Ethernet header. A trivial L2 physical
   domain consists of just one server. In a non-trivial L2 physical
   domain (domain that contains multiple forwarding entities) forwarding
   could be provided by such layer 2 technologies as Spanning Tree
   Protocol (STP), etc...  Note that any multi-chassis LAG can not span
   more than one L2 physical domain. This document assumes that a layer
   2 access domain is an L2 physical domain.

   A physical server connected to a given L2 physical domain may host
   VMs that belong to different L2-based CUGs (while each of these CUGs
   may span multiple L2 physical domains). If an L2 physical domain
   contains servers that host VMs belonging to different L2-based CUGs,
   then enforcing L2-based CUGs boundaries among these VMs within that
   domain is accomplished by relying on Layer 2 mechanisms (e.g.,
   VLANs).

   We say that an L2 physical domain contains a given VM (or that a
   given VM is in a given L2 physical domain), if the server presently
   hosting this VM is part of that domain, or the server is connected to
   a ToR that is part of that domain.

   We say that a given L2-based CUG is present within a given data
   center if one or more VMs that are part of that CUG are presently
   hosted by the servers located in that data center.

   In the context of this document when we talk about VLAN-ID used by a



Rekhter                                                         [Page 6]


Internet Draftdraft-rekhter-nvo3-vm-mobility-issues-03.txt  October 2012


   given VM, we refer to the VLAN-ID carried by the traffic that is
   within the same L2 physical domain as the VM, and that is either
   originated or destined to that VM - e.g., VLAN-ID only has local
   significance within the L2 physical domain, unless it is stated
   otherwise.


3. Problem Statement

   This section describes the specific problems/issues that need to be
   addressed to enable seamless VM mobility.


3.1. Usage of VLAN-IDs

   This document assumes that within a given non-trivial L2 physical
   domain traffic from/to VMs that are in that domain, and belong to the
   same L2-based CUG MUST have the same VLAN-ID. This document assumes
   that in different non-trivial L2 physical domains traffic from/to VMs
   that are in these domains and belong to the same L2-based CUG MAY
   have either the same or different VLAN-IDs.  Thus when a given VM
   moves from one non-trivial L2 physical domain to another, the VLAN-ID
   of the traffic from/to VM in the former may be different than in the
   latter, and thus can not assume to stay the same.

   This document assumes that within a trivial L2 physical domain
   traffic from/to VMs that are in this domain may not have VLAN-IDs at
   all.

   If a given VM's Guest OS sends packets that carry VLAN-ID, then when
   the VM moves from one L2 physical domain to another the VLAN-ID used
   by the Guest OS can not change (this is irrespective of whether L2
   physical domains are trivial or non-trivial). In other words, the
   VLAN-IDs used by a tagged VM network interface are part of the VM's
   state and cannot be changed when the VM moves from one L2 physical
   domain to another, even though it is possible for an entity, such as
   hypervisor virtual switch, to change the VLAN-ID from the value used
   by NVE to the value expected by the VM (in contrast, a VLAN tag
   assigned by a hypervisor for use with an untagged VM network
   interface can change). If the L2 physical domain is extended to
   include VM tagged interfaces, the hypervisor virtual switch, and the
   DC bridged network, then special consideration is needed in
   assignment of VLAN tags for the VMs, the L2 physical domain and other
   domains into which the VM may move.

   This document assumes that within a given non-trivial L2 physical
   domain traffic from/to VMs that are in that domain, and belong to
   different L2-based CUG MUST have different VLAN-IDs.



Rekhter                                                         [Page 7]


Internet Draftdraft-rekhter-nvo3-vm-mobility-issues-03.txt  October 2012


   The above assumptions about VLAN-IDs are driven by (a) the assumption
   that within a given L2 physical domain VLANs are used to identify
   individual L2-based CUGs, and (b) the need to overcome the limitation
   on the number of different VLAN-IDs.


3.2. Maintaining Connectivity in the Presence of VM Mobility

   In the context of this document the ability to maintain connectivity
   in the presence of VM mobility means the ability to exchange traffic
   between a VM and its peer(s), as the VM moves from one server to
   another, where the peer(s) may be either other VM(s) or hosts.
   Furthermore, the peer(s) need not be within the same data center as
   the VM itself.

   A given VM could be moved from one server to another in stopped or
   suspended state ("cold" VM mobility), or the hypervisors might move a
   running VM ("hot" VM mobility). IP address preservation is sometimes
   highly desired for cold VM mobility; it's mandatory to preserve
   transport connections when a running VM is moved.

   VM mobility may result in transient loss of IP connectivity between
   VM and its peers. In the case of hot VM mobility the upper bound on
   the duration of such transients is (much) lower than in the case of
   cold VM mobility (due to the requirement of preserving transport
   connections and potential additional application requirements).

   Furthermore, while with cold VM mobility one may assume that VM's ARP
   cache gets flushed once VM moves to another server, one can not make
   such an assumption with hot VM mobility.


3.3. Layer 2 Extension

   Consider a scenario where a VM that is a member of a given L2-based
   CUG moves from one server to another, and these two servers are in
   different L2 physical domains, where these domains may be located in
   the same or different data centers. In order to enable communication
   between this VM and other VMs of that L2-based CUG, the new L2
   physical domain must become interconnected with the other L2 physical
   domain(s) that presently contain the rest of the VMs of that CUG, and
   the interconnect must not violate the L2-based CUG requirement to
   preserve source and destination MAC addresses in the Ethernet header
   of the packets exchange between this VM and other members of that
   CUG.

   Moreover, if the previous L2 physical domain no longer contains any
   VMs of that CUG, the previous domain no longer needs to be



Rekhter                                                         [Page 8]


Internet Draftdraft-rekhter-nvo3-vm-mobility-issues-03.txt  October 2012


   interconnected with the other L2 physical domains(s) that contain the
   rest of the VMs of that CUG.

   Note that supporting VM mobility implies that the set of L2 physical
   domains that contain VMs that belong to a given L2-based CUG may
   change over time (new domains added, old domains deleted).

   We will refer to this as the "layer 2 extension problem".

   Note that the layer 2 extension problem is a special case of
   maintaining connectivity in the presence of VM mobility, as the
   former restricts communicating VMs to a single/common L2-based CUG,
   while the latter does not.


3.4. Optimal IP Routing

   In the context of this document optimal IP routing, or just optimal
   routing, in the presence of VM mobility could be partitioned into two
   problems:

     + Optimal routing of a VM's outbound traffic. This means that as a
       given VM moves from one server to another, the VM's default
       gateway should be in a close topological proximity to the ToR
       that connects the server presently hosting that VM. Note that
       when we talk about optimal routing of the VM's outbound traffic,
       we mean traffic from that VM to the destinations that are outside
       of the VM's L2-based CUG. This document refers to this problem as
       the VM default gateway problem.

     + Optimal routing of VM's inbound traffic. This means that as a
       given VM moves from one server to another, the (inbound) traffic
       originated outside of the VM's L2-based CUG, and destined to that
       VM be routed via the router of the VM's L2-based CUG that is in a
       close topological proximity to the ToR that connects the server
       presently hosting that VM, without first traversing some other
       router of that L2-based CUG (the router of the VM's L2-based CUG
       may be either DCBR or ToR itself). This is also known as avoiding
       "triangular routing". This document refers to this problem as the
       triangular routing problem.

   Note that optimal routing is a special case of maintaining
   connectivity in the presence of VM mobility, as the former assumes
   not only the ability to maintain connectivity, but also that this
   connectivity is maintained using optimal routing. On the other hand,
   maintaining connectivity does not make optimal routing a pre-
   requisite.




Rekhter                                                         [Page 9]


Internet Draftdraft-rekhter-nvo3-vm-mobility-issues-03.txt  October 2012


   The ability to deliver optimal routing (as defined above) in the
   presence of stateful devices is outside the scope of this document.


3.5. Preserving Policies

   Moving VM from one L2 physical domain to another means (among other
   things) that the NVE in the new domain that provides connectivity
   between this VM and VMs in other L2 physical domains must be able to
   implement the policies that control connectivity between this VM and
   VMs in other L2 physical domains. In other words, the policies that
   control connectivity between a given VM and its peers MUST NOT change
   as the VM moves from one L2 physical domain to another.  Moreover,
   policies, if any, within the L2 physical domain that contain a given
   VM MUST NOT preclude realization of the policies that control
   connectivity between this VM and its peers. All of the above is
   irrespective of whether the L2 physical domains are trivial or not.


4. IANA Considerations

   This document introduces no new IANA Considerations.


5. Security Considerations

   TBD.


6. Acknowledgements

   The authors would like to thank Adrian Farrel for his review and
   comments. The authors would also like to thank Ivan Pepelnjak and
   David Black for their contributions to this document.



7. References

   [nvo3-problem] Narten T.et al., "Overlays for Network
   Virtualization", draft-narten-nvo3-overlay-problem-statement, work in
   progress.









Rekhter                                                        [Page 10]


Internet Draftdraft-rekhter-nvo3-vm-mobility-issues-03.txt  October 2012


8. Author's Address

   Yakov Rekhter
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   Email: yakov@juniper.net

   Wim Henderickx
   Alcatel-Lucent
   Email: wim.henderickx@alcatel-lucent.com

   Ravi Shekhar
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   Email: rshekhar@juniper.net

   Luyuan Fang
   Cisco Systems
   111 Wood Avenue South
   Iselin, NJ 08830
   Email: lufang@cisco.com

   Linda Dunbar
   Huawei Technologies
   5340 Legacy Drive, Suite 175
   Plano, TX 75024, USA
   Phone: (469) 277 5840
   Email: ldunbar@huawei.com

   Ali Sajassi
   Cisco Systems
   Email: sajassi@cisco.com

   Rahul Aggarwal
   Arktan, Inc
   Email: raggarwa_1@yahoo.com













Rekhter                                                        [Page 11]