Network Working Group                                          L. Dunbar
Internet Draft                                                 Futurewei
Intended status: Informational                              B. Sarikaya
Expires: August 21, 2020                            Denpel Informatique
                                                           B. Khasnabish
                                                             Independent
                                                              T. Herbert
                                                                   Intel
                                                              S. Dikshit
                                                               Aruba-HPE
                                                       February 21, 2020

   Virtual Machine Mobility Solutions for L2 and L3 Overlay Networks
                         draft-ietf-nvo3-vmm-07
Abstract

This document describes virtual machine (VM) mobility solutions
commonly used in data centers built with overlay-based networks. The
objective is to describe the solutions and the impact of moving VMs
(or applications) from one rack to another when the racks are
connected by overlay networks.

For Layer 2, the solution uses an NVA (Network Virtualization
Authority) to NVE (Network Virtualization Edge) protocol to update
the ARP (Address Resolution Protocol) table or neighbor cache entries
after a VM moves from an Old NVE to a New NVE. For Layer 3, it is
based on address and connection migration after the move.
Status of this Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. This document may not be modified,
and derivative works of it may not be created, except to publish it
as an RFC and to translate it into languages other than English.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html

This Internet-Draft will expire on August 21, 2020.
Copyright Notice

Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Simplified BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Simplified BSD License.
Table of Contents

1. Introduction
2. Conventions used in this document
3. Requirements
4. Overview of the VM Mobility Solutions
   4.1. VM Migration in Layer 2 Network
   4.2. Task Migration in Layer-3 Network
      4.2.1. Address and Connection Migration in Task Migration
5. Handling Packets in Flight
6. Moving Local State of VM
7. Handling of Hot, Warm and Cold VM Mobility
8. Other VM Mobility Options
9. VM Lifecycle Management
10. Security Considerations
11. IANA Considerations
12. Acknowledgments
13. Change Log
14. References
   14.1. Normative References
   14.2. Informative References
1. Introduction
This document describes overlay-based data center network solutions
that support multi-tenancy and VM (Virtual Machine) mobility. Many
large DCs (Data Centers), especially Cloud DCs, host tasks (or
workloads) for multiple tenants. A tenant can be a department of an
organization or an entire organization. There is communication among
tasks belonging to one tenant, as well as communication among tasks
belonging to different tenants or with external entities.
Server virtualization, which is used in almost all of today's data
centers, enables many VMs to run on a single physical computer or
server while sharing its processor, memory, and storage. Network
connectivity among VMs is provided by the Network Virtualization
Edge (NVE) [RFC8014]. It is highly desirable [RFC7364] to allow VMs
to be moved dynamically (live, hot, or cold move) from one server to
another for dynamic load balancing or optimized workload
distribution.
There are many challenges and requirements related to VM mobility in
large data centers, including dynamically attaching VMs to, and
detaching VMs from, Network Virtualization Edges (NVEs). In
addition, retaining IP addresses after a move is a key requirement
[RFC7364]; it is needed in order to maintain existing transport
connections.
In traditional Layer-3 based networks, retaining IP addresses after
a move is generally not recommended, because frequent moves cause
fragmented IP addresses, which introduces complexity in IP address
management.
In view of the many VM mobility schemes that exist today, there is a
need to document comprehensive VM mobility solutions that cover both
IPv4 and IPv6. Large data center networks can be organized as (a)
one large Layer-2 network geographically distributed across several
buildings or cities, or (b) Layer-3 networks with a large number of
host routes that cannot be aggregated, as a result of VMs moving
from one location to another without changing their IP addresses.

The connectivity across Layer-2 boundaries can be achieved by the
NVE functioning as a Layer-3 gateway that routes across bridging
domains, such as in Warehouse Scale Computers (WSC).
2. Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119]
and [RFC8014].

This document uses the terminology defined in [RFC7364]. In
addition, we make the following definitions:
Warm VM Mobility: In the case of warm VM mobility, the VM states are
          mirrored to the secondary server (or domain) at predefined
          (configurable) regular intervals. This reduces the
          overhead and complexity, but it may also lead to a
          situation where the two servers do not contain exactly the
          same data (state information).
Cold VM Mobility: A given VM could be moved from one server to
          another in a stopped or suspended state.
Old NVE: The NVE to which packets destined to the VM were forwarded
          before the migration.

New NVE: The NVE serving the VM after the migration.
Packets in flight: Packets sent by correspondents that still hold
          the old ARP or neighbor cache entry (learned before the VM
          or task migration) and are therefore received by the Old
          NVE.
End user clients: Users of VMs in diskless systems, or in systems
          that do not use configuration files.
Cloud DC: A third-party data center that hosts applications, tasks,
          or workloads owned by different organizations or tenants.
3. Requirements

This section states the requirements for VM mobility in data center
networks.
Data center networks should support both IPv4 and IPv6 VM mobility.

VM mobility should not require changing the VMs' IP addresses after
the move.
There exist "Hot Migration" where transport service continuity is There is "Hot Migration" with transport service continuing, and
maintained, and "Cold Migration" where the transport service needs "Cold Migration" with transport service restarted, i.e. the task
to be restarted, i.e., execution of the tasks is stopped on the running is stopped on the Old NVE, moved to the New NVE and the task
"Old" NVE, moved to the "New" NVE and the task is restarted. is restarted. Not all DCs support "Hot Migration. DCs that only
support Cold Migration should make their customers aware of the
potential service interruption during the Cold Migration.
VM mobility solutions/procedures should minimize triangular routing
except for handling packets in flight.

VM mobility solutions/procedures should not need to use tunneling
except for handling packets in flight.
4. Overview of the VM Mobility Solutions

Layer 2 and Layer 3 mobility solutions are described respectively in
the following sections.

4.1. VM Migration in Layer 2 Network
Being able to move VMs dynamically from one server to another makes
dynamic load balancing and workload distribution possible.
Therefore, dynamic VM mobility is highly desirable for large-scale
multi-tenant DCs.
In a Layer-2 based approach, a VM that moves to another server does
not change its IP address. However, because the VM is now under a
new NVE, the NVEs it was previously communicating with will continue
sending their packets to the Old NVE. To solve this problem, the
Address Resolution Protocol (ARP) cache in IPv4 [RFC0826] or the
neighbor cache in IPv6 [RFC4861] in the NVEs needs to be updated
promptly. All NVEs need to change their caches so that the VM's
Layer-2, i.e., Media Access Control (MAC), address is associated
with the new NVE's IP address as soon as the VM is moved. Such a
change enables all NVEs to encapsulate the outgoing MAC frames with
the current target NVE IP address. It may take some time to refresh
the ARP/ND caches after a VM has moved to a New NVE. During this
period, a tunnel is needed so that the Old NVE can forward packets
destined to the VM to the New NVE.
In IPv4, immediately after the move, the VM should send a gratuitous
ARP request message containing its IPv4 address and Layer-2 MAC
address to its new NVE. This message's destination address is the
broadcast address. Upon receiving this message, both the Old and New
NVEs should update the VM's ARP entry in the central directory at
the NVA, so that the mapping records the IPv4 address and MAC
address of the moving VM along with the new NVE's IPv4 address. An
NVE-to-NVA protocol is used for this purpose [RFC8014].
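The following Python sketch illustrates, at a high level, how an NVE
might process such a gratuitous ARP and push the updated mapping to
the NVA. The class and method names (Nve, Nva, update_mapping) are
illustrative assumptions; the actual NVE-to-NVA protocol and data
model are not specified here.

   # Illustrative sketch only: the Nve/Nva classes and update_mapping()
   # are hypothetical; the real NVE-to-NVA protocol is out of scope.

   from dataclasses import dataclass

   @dataclass
   class ArpEntry:
       vm_ip: str       # IPv4 address of the VM
       vm_mac: str      # MAC address of the VM
       nve_ip: str      # IPv4 address of the NVE currently serving the VM

   class Nva:
       """Central directory mapping VM MAC addresses to ARP entries."""
       def __init__(self):
           self.directory = {}

       def update_mapping(self, entry):
           self.directory[entry.vm_mac] = entry

   class Nve:
       def __init__(self, my_ip, nva):
           self.my_ip = my_ip
           self.nva = nva

       def on_gratuitous_arp(self, vm_ip, vm_mac):
           # Gratuitous ARP from a VM that just moved under this NVE:
           # record the (VM IP, VM MAC) -> new NVE IP mapping at the NVA
           # so that other NVEs can refresh their caches.
           self.nva.update_mapping(ArpEntry(vm_ip, vm_mac, self.my_ip))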
Reverse ARP (RARP), which enables a host to discover its IPv4
address when it boots from a local server [RFC0903], is not used by
VMs when the VM already knows its IPv4 address (the most common
scenario). Next, we describe a case where RARP is used.
There are some vendor deployments (e.g., diskless systems or systems
without configuration files) in which the VM's user, i.e., the
end-user client, asks for the same MAC address upon migration. This
can be achieved by the client sending a RARP request message that
carries the MAC address and asks for an IP address allocation. The
server, in this case the new NVE, needs to communicate with the NVA,
just as in the gratuitous ARP case, to ensure that the same IPv4
address is assigned to the VM. The NVA uses the MAC address as the
key to search its ARP cache for the corresponding IP address and
informs the new NVE, which in turn sends the RARP reply message.
This completes the IP address assignment to the migrating VM.
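A minimal sketch of this RARP handling, reusing the hypothetical Nva
directory from the previous sketch (the MAC address is the lookup
key); the function name and return value are assumptions made for
illustration:

   # Hypothetical continuation of the previous sketch: resolving a RARP
   # request at the New NVE via the NVA, keyed by the VM's MAC address.

   def on_rarp_request(new_nve, vm_mac):
       entry = new_nve.nva.directory.get(vm_mac)
       if entry is None:
           return None                      # unknown MAC: no RARP reply
       entry.nve_ip = new_nve.my_ip         # re-home the mapping to the New NVE
       new_nve.nva.update_mapping(entry)
       return ("RARP-REPLY", vm_mac, entry.vm_ip)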
Other NVEs communicating with this VM could still hold the old ARP
entry. If any VMs attached to those NVEs need to communicate with
the VM now attached to the New NVE, the old ARP entries might be
used, and the packets are therefore delivered to the Old NVE. The
Old NVE MUST tunnel these in-flight packets to the New NVE.

When an ARP entry for those VMs times out, their corresponding NVEs
should contact the NVA for an update.
IPv6 operation is slightly different:
In IPv6, immediately after the move, the VM sends an unsolicited
Neighbor Advertisement message containing its IPv6 address and
Layer-2 MAC address to its new NVE. This message is sent to the IPv6
Solicited-Node Multicast Address corresponding to the target
address, which is the VM's IPv6 address. The NVE receiving this
message should send a request to update the VM's neighbor cache
entry in the central directory of the NVA. The NVA's neighbor cache
entry should include the IPv6 address of the VM, the MAC address of
the VM, and the IPv6 address of the NVE. An NVE-to-NVA protocol is
used for this purpose [RFC8014].
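For reference, the Solicited-Node Multicast Address used as the
destination above is formed by appending the low-order 24 bits of
the target IPv6 address to the well-known prefix ff02::1:ff00:0/104;
a small Python illustration using the standard ipaddress module:

   # Compute the IPv6 Solicited-Node Multicast Address for a target
   # address: the low-order 24 bits of the target are appended to the
   # well-known prefix ff02::1:ff00:0/104.

   import ipaddress

   def solicited_node_multicast(target: str) -> str:
       low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF
       base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
       return str(ipaddress.IPv6Address(base | low24))

   # Example: solicited_node_multicast("2001:db8::1") returns "ff02::1:ff00:1"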
Other NVEs communicating with this VM might still use the old
neighbor cache entry. If any VM attached to those NVEs needs to
communicate with the VM attached to the New NVE, it could use the
old neighbor cache entry, and the packets are therefore delivered to
the Old NVE. The Old NVE MUST tunnel these in-flight packets to the
New NVE.

When a neighbor cache entry in those VMs times out, their
corresponding NVEs should contact the NVA for an update.
4.2. Task Migration in Layer-3 Network
ARP/neighbor cache scalability considerations can limit the size of
Layer-2 based DC networks. Scaling can be accomplished seamlessly in
Layer-3 data center networks by simply giving each virtual network
an IP subnet and a default route that points to its NVE. This avoids
any explosion of the ARP/neighbor caches in VMs and NVEs (just one
ARP/neighbor cache entry for the default route), and there is no
need to carry an Ethernet header in the encapsulation [RFC7348],
which saves at least 16 bytes.
Even though the terms VM and Task are used interchangeably in this
document, the term Task is used in the context of Layer-3 migration
mainly to put a slight emphasis on moving an entity (a Task) that is
instantiated on a VM or in a container.
Traditional Layer-3 based data center networks require the IP
address of a task to change after the move, because the prefix of
the IP address usually reflects its location. It is therefore
necessary to have an IP-based VM migration solution that allows IP
addresses to stay the same after the VMs move to different
locations. Identifier-Locator Addressing (ILA)
[I-D.herbert-nvo3-ila] is one such solution.
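As a rough, hypothetical illustration of the identifier/locator
separation that such solutions rely on (this is not the ILA encoding
itself), a mapping system might keep an identifier-to-locator table
that is updated when a task moves:

   # Hypothetical identifier-to-locator table; not the ILA wire encoding.

   class LocatorMap:
       def __init__(self):
           self.by_identifier = {}      # stable task identifier -> locator

       def move(self, identifier, new_locator):
           # The identifier (and the address the task keeps using) is
           # stable; only the locator, which reflects the task's current
           # location, changes on migration.
           self.by_identifier[identifier] = new_locator

       def lookup(self, identifier):
           return self.by_identifier.get(identifier)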
Because broadcasting is not available in Layer-3 based networks,
multicast of neighbor solicitations in IPv6 would need to be
emulated.
Cold task migration, which is a common practice in many data
centers, involves the following steps (a sketch follows the list):
- Stop running the task.

- Package the runtime state of the task.

- Send the runtime state of the task to the New NVE where the task
  is to run.

- Instantiate the task's state on the new machine.

- Start the task, continuing it from the point at which it was
  stopped.
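A minimal orchestration sketch of these steps; every helper call
(stop_task, package_state, etc.) is a hypothetical placeholder for a
hypervisor or orchestrator operation:

   # Minimal cold-migration sketch; every helper is a hypothetical
   # placeholder for a hypervisor or orchestrator operation.

   def cold_migrate(task, old_nve, new_nve):
       old_nve.stop_task(task)               # 1. stop running the task
       state = old_nve.package_state(task)   # 2. package its runtime state
       new_nve.receive_state(task, state)    # 3. send the state to the New NVE
       new_nve.instantiate(task, state)      # 4. instantiate on the new machine
       new_nve.start_task(task)              # 5. resume from the stop point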
Address migration and connection migration in moving tasks or VMs
are addressed next.
4.2.1. Address and Connection Migration in Task Migration

Address migration is achieved as follows:
- Configure the IPv4/IPv6 address on the target Task.

- Suspend use of the address on the old Task. This includes handling
  established connections. A state may be established to drop
  packets or to send ICMPv4 or ICMPv6 Destination Unreachable
  messages when packets to the migrated address are received.
5. Handling Packets in Flight
The Old NVE may receive packets from the VM's ongoing
communications. These packets should not be lost; they should be
sent to the New NVE to be delivered to the VM. The steps involved in
handling packets in flight are as follows:
Preparation Step: It takes some time, possibly a few seconds, for a
VM to move from its Old NVE to a New NVE. During this period, a
tunnel needs to be established so that the Old NVE can forward
packets to the New NVE. The Old NVE obtains the New NVE's address
from its NVA, assuming that the NVA is notified when a VM is moved
from one NVE to another. Which entity manages the VM move, and how
the NVA is notified of the move, are out of the scope of this
document. The Old NVE can store the New NVE address for the VM
together with a timer; when the timer expires, the entry recording
the VM's New NVE can be deleted.
Tunnel Establishment - IPv6: In-flight packets are tunneled to the
New NVE using an encapsulation protocol such as VXLAN in IPv6.

Tunnel Establishment - IPv4: In-flight packets are tunneled to the
New NVE using an encapsulation protocol such as VXLAN in IPv4.
Tunneling Packets - IPv6: IPv6 packets received for the migrating VM
are encapsulated in an IPv6 header at the Old NVE. The New NVE
decapsulates the packet and sends the IPv6 packet to the migrating
VM.
Tunneling Packets - IPv4: IPv4 packets received for the migrating VM
are encapsulated in an IPv4 header at the Old NVE. The New NVE
decapsulates the packet and sends the IPv4 packet to the migrating
VM.
Stop Tunneling Packets: Tunneling stops when the timer associated
with the stored New NVE address for the VM expires. The timer should
be long enough for all other NVEs that need to communicate with the
VM to get their NVE-VM cache entries updated.
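The following sketch shows one way the Old NVE's forwarding decision
and the associated timer could be modeled; the table layout, the
60-second hold time, and the tunnel_to()/deliver_locally_or_drop()
helpers are assumptions made for illustration only:

   # Sketch of in-flight packet handling at the Old NVE. The hold time,
   # table layout, and helper functions are illustrative assumptions.

   import time

   def tunnel_to(nve_addr, packet):
       # Placeholder for VXLAN (or other) encapsulation toward nve_addr.
       return ("tunneled", nve_addr, packet)

   def deliver_locally_or_drop(packet):
       # Placeholder for normal local delivery (or drop if the VM is gone).
       return ("local", packet)

   class OldNveForwarder:
       def __init__(self, hold_time_sec=60):
           self.moved_vms = {}      # VM address -> (New NVE address, expiry)
           self.hold_time = hold_time_sec

       def vm_moved(self, vm_addr, new_nve_addr):
           # Installed when the NVA tells the Old NVE where the VM went.
           self.moved_vms[vm_addr] = (new_nve_addr,
                                      time.time() + self.hold_time)

       def forward(self, packet, dst_vm_addr):
           entry = self.moved_vms.get(dst_vm_addr)
           if entry:
               new_nve_addr, expiry = entry
               if time.time() < expiry:
                   return tunnel_to(new_nve_addr, packet)
               del self.moved_vms[dst_vm_addr]   # timer expired: stop tunneling
           return deliver_locally_or_drop(packet)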
6. Moving Local State of VM
In addition to the VM mobility related signaling (VM Mobility
Registration Request/Reply), the VM state needs to be transferred to
the New NVE. The state includes the VM's memory and file system if
the VM cannot access them after moving to the New NVE.

The mechanism for transferring the VM state and file system is out
of the scope of this document.
7. Handling of Hot, Warm and Cold VM Mobility
Both Cold and Warm VM mobility (or migration) refer to the VM being
completely shut down at the Old NVE before being restarted at the
New NVE. Therefore, all transport services to the VM have to be
restarted.
Upon starting at the New NVE, the VM should send an ARP or Neighbor
Discovery message. Cold VM mobility also allows the Old NVE and all
communicating NVEs to time out the ARP/neighbor cache entries of the
VM. It is necessary for the NVA to push the updated ARP/neighbor
cache entry to the NVEs, or for the NVEs to pull the updated
ARP/neighbor cache entry from the NVA.
Cold VM mobility can be facilitated by a cold standby entity that
receives scheduled backup information. The cold standby entity can
be a VM or another form factor, which is beyond the scope of this
document. The cold mobility option can be used for non-critical
applications and services that can tolerate interrupted TCP
connections.
Warm VM mobility refers to the backup entities receiving backup
information at more frequent intervals. The duration of the interval
determines the warmth of the option: the longer the interval, the
less warm (and hence the colder) the Warm VM mobility option
becomes.
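A rough sketch of such interval-based mirroring; is_running(),
capture_state(), sync_state(), and the interval value are
hypothetical placeholders, not specified behavior:

   # Illustrative warm-standby mirroring loop; the primary_vm and
   # standby methods are hypothetical placeholders.

   import time

   def warm_mirror(primary_vm, standby, interval_sec=30):
       # A smaller interval_sec means a "warmer" standby (less divergence
       # between primary and standby state); a larger one means colder.
       while primary_vm.is_running():
           standby.sync_state(primary_vm.capture_state())
           time.sleep(interval_sec)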
For Hot VM Mobility, once a VM moves to a New NVE, the VM's IP
address does not change, and the VM should be able to continue to
receive packets sent to its address(es). The VM needs to send a
gratuitous ARP message or an unsolicited Neighbor Advertisement
message upstream after each move.
8. Other VM Mobility Options
There is also a Hot Standby option in addition to Hot Mobility,
where VMs are present on both the primary and the secondary NVEs.
They have identical information and can provide services
simultaneously, as in a load-sharing mode of operation. If the VM in
the primary NVE fails, there is no need to actively move the VM to
the secondary NVE, because the VM in the secondary NVE already
contains identical information. The Hot Standby option is the
costliest mechanism, and hence it is utilized only for
mission-critical applications and services. In the Hot Standby
option, regarding TCP connections, one option is to establish and
maintain TCP connections to two different VMs at the same time. The
least loaded VM responds first and starts providing service, while
the sender (origin) continues to receive ACKs from the more heavily
loaded (secondary) VM but chooses not to use the service of that
secondary responding VM. If the situation (the loading condition of
the primary responding VM) changes, the secondary responding VM may
start providing service to the sender (origin).
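A simplified client-side sketch of this dual-connection idea: open
TCP connections toward both VMs, send the same request on each, and
use whichever responds first. The endpoints, timeouts, and buffer
size below are placeholders, not prescribed values:

   # Simplified client-side sketch: send the same request to both VMs
   # and use whichever answers first.

   import select
   import socket

   def first_responder(request, endpoints, timeout=5):
       socks = []
       try:
           for host, port in endpoints:        # e.g., primary and standby VM
               s = socket.create_connection((host, port), timeout=timeout)
               s.sendall(request)
               socks.append(s)
           readable, _, _ = select.select(socks, [], [], timeout)
           if not readable:
               return None
           return readable[0].recv(4096)       # first (least loaded) responder
       finally:
           for s in socks:
               s.close()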
9. VM Lifecycle Management
VM lifecycle management is a complicated task, which is beyond the
scope of this document. Not only does it involve monitoring server
utilization and balancing the distribution of workloads, it also
needs to manage VM migration from one server to another seamlessly.
10. Security Considerations
Security threats for the data and control planes of overlay networks
are discussed in [RFC8014]. There are several issues in a
multi-tenant environment that create problems. In Layer-2 based
overlay data center networks, the lack of security in VXLAN and the
corruption of a VNI can lead to delivery of information to the wrong
tenant. Also, ARP in IPv4 and ND in IPv6 are not secure, especially
if gratuitous versions are accepted. When these are carried over a
UDP encapsulation such as VXLAN, the problem is worse, since it is
trivial for a non-trusted entity to spoof UDP packets.
In Layer-3 based overlay data center networks, the problem of
address spoofing may arise. An NVE may have untrusted tasks attached
to it. This usually happens when the VMs (tasks) run third-party
applications. This requires the use of stronger security mechanisms.
11. IANA Considerations

This document makes no request to IANA.

12. Acknowledgments
The authors are grateful to Bob Briscoe, David Black, Dave R.
Worley, Qiang Zu, and Andrew Malis for helpful comments.

13. Change Log
. submitted version -00 as a working group draft after adoption

. submitted version -01 with these changes: references are updated,

  o added packets in flight definition to Section 2

. submitted version -02 with updated address.

. submitted version -03 to fix the nits.

. submitted version -04 in reference to the WG Last call comments.

. Submitted version -05 to address IETF LC comments from TSV area.
14. References

14.1. Normative References
[RFC0826] Plummer, D., "An Ethernet Address Resolution Protocol: Or
          Converting Network Protocol Addresses to 48.bit Ethernet
          Address for Transmission on Ethernet Hardware", STD 37,
          RFC 826, DOI 10.17487/RFC0826, November 1982,
          <https://www.rfc-editor.org/info/rfc826>.
[RFC0903] Finlayson, R., Mann, T., Mogul, J., and M. Theimer, "A
          Reverse Address Resolution Protocol", STD 38, RFC 903,
          DOI 10.17487/RFC0903, June 1984,
          <https://www.rfc-editor.org/info/rfc903>.

[RFC7364] Narten, T., Ed., Gray, E., Ed., Black, D., Fang, L.,
          Kreeger, L., and M. Napierala, "Problem Statement:
          Overlays for Network Virtualization", RFC 7364,
          DOI 10.17487/RFC7364, October 2014,
          <https://www.rfc-editor.org/info/rfc7364>.
[RFC8014] Black, D., Hudson, J., Kreeger, L., Lasserre, M., and T.
          Narten, "An Architecture for Data-Center Network
          Virtualization over Layer 3 (NVO3)", RFC 8014,
          DOI 10.17487/RFC8014, December 2016,
          <https://www.rfc-editor.org/info/rfc8014>.
14.2. Informative References
[I-D.herbert-nvo3-ila] Herbert, T. and P. Lapukhov, "Identifier-
          locator addressing for IPv6", draft-herbert-nvo3-ila-04
          (work in progress), March 2017.
Authors' Addresses

Linda Dunbar
Futurewei

Email: ldunbar@futurewei.com