draft-ietf-nvo3-vmm-07.txt   draft-ietf-nvo3-vmm-08.txt 
Network Working Group L. Dunbar Network Working Group L. Dunbar
Internet Draft Futurewei Internet Draft Futurewei
Intended status: Informational B. Sarikaya Intended status: Informational B. Sarikaya
Expires: August 21, 2020 Denpel Informatique Expires: September 25, 2020 Denpel Informatique
B.Khasnabish B.Khasnabish
Independent Independent
T. Herbert T. Herbert
Intel Intel
S. Dikshit S. Dikshit
Aruba-HPE Aruba-HPE
February 21, 2020 March 25, 2020
Virtual Machine Mobility Solutions for L2 and L3 Overlay Networks Virtual Machine Mobility Solutions for L2 and L3 Overlay Networks
draft-ietf-nvo3-vmm-07 draft-ietf-nvo3-vmm-08
Abstract Abstract
This document describes virtual machine mobility solutions commonly This document describes virtual machine mobility solutions commonly
used in data centers built with overlay-based network. This document used in data centers built with overlay-based network. This document
is intended for describing the solutions and the impact of moving is intended for describing the solutions and the impact of moving
VMs (or applications) from one Rack to another connected by the VMs (or applications) from one Rack to another connected by the
Overlay networks. Overlay networks.
For layer 2, it is based on using an NVA (Network Virtualization For layer 2, it is based on using an NVA (Network Virtualization
skipping to change at page 2, line 21 skipping to change at page 2, line 21
months and may be updated, replaced, or obsoleted by other documents months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress." reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html http://www.ietf.org/shadow.html
This Internet-Draft will expire on August 21, 2020. This Internet-Draft will expire on September 24, 2020.
Copyright Notice Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 4 skipping to change at page 3, line 4
Section 4.e of the Trust Legal Provisions and are provided without Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Simplified BSD License. warranty as described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction...................................................3 1. Introduction...................................................3
2. Conventions used in this document..............................4 2. Conventions used in this document..............................4
3. Requirements...................................................5 3. Requirements...................................................5
4. Overview of the VM Mobility Solutions..........................6 4. Overview of the VM Mobility Solutions..........................6
4.1. VM Migration in Layer 2 Network...........................6 4.1. VM Migration in Layer 2 Network...........................6
4.2. Task Migration in Layer-3 Network.........................7 4.2. VM Migration in Layer-3 Network...........................8
4.2.1. Address and Connection Migration in Task Migration...8 4.3. Address and Connection Migration in Task Migration........9
5. Handling Packets in Flight.....................................9 5. Handling Packets in Flight....................................10
6. Moving Local State of VM......................................10 6. Moving Local State of VM......................................10
7. Handling of Hot, Warm and Cold VM Mobility....................10 7. Handling of Hot, Warm and Cold VM Mobility....................11
8. Other VM Mobility Options.....................................11 8. Other VM Mobility Options.....................................11
9. VM Lifecycle Management.......................................11 9. VM Lifecycle Management.......................................12
10. Security Considerations......................................11 10. Security Considerations......................................12
11. IANA Considerations..........................................12 11. IANA Considerations..........................................12
12. Acknowledgments..............................................12 12. Acknowledgments..............................................12
13. Change Log...................................................12 13. Change Log...................................................13
14. References...................................................12 14. References...................................................13
14.1. Normative References....................................13 14.1. Normative References....................................13
14.2. Informative References..................................14 14.2. Informative References..................................14
1. Introduction 1. Introduction
This document describes the overlay-based data center networks This document describes the overlay-based data center networks
solutions in supporting multitenancy and VM (Virtual Machine) solutions in supporting multitenancy and VM (Virtual Machine)
mobility. Many large DCs (Data Centers), especially Cloud DCs, mobility. This document is strictly within the DCVPN, as defined
host tasks (or workloads) for multiple tenants. A tenant can be a by the NVO3 Framework [RFC 7365]. The intent is to describe Layer
department of one organization or an organization. There are 2 and Layer 3 Network behavior when VMs are moved from one NVE to
communication among tasks belonging to one tenant and another. This document assumes that the VMs move is initiated by
communication among tasks belonging to different tenants or with VM management system, i.e. planned move. How and when to move VM
external entities. are out of the scope of this document. RFC7666 already has the
description of the MIB for VMs controlled by Hypervisor. The
impact of VM mobility on higher layer protocols and applications
is outside its scope.
Many large DCs (Data Centers), especially Cloud DCs, host tasks
(or workloads) for multiple tenants. A tenant can be a department
of one organization or an organization. There are communications
among tasks belonging to one tenant and communications among tasks
belonging to different tenants or with external entities.
Server Virtualization, which is being used in almost all of Server Virtualization, which is being used in almost all of
today's data centers, enables many VMs to run on a single physical today's data centers, enables many VMs to run on a single physical
computer or server sharing the processor/memory/storage. Network computer or server sharing the processor/memory/storage. Network
connectivity among VMs is provided by the network virtualization connectivity among VMs is provided by the network virtualization
edge (NVE) [RFC8014]. It is highly desirable [RFC7364] to allow edge (NVE) [RFC8014]. It is highly desirable [RFC7364] to allow
VMs to be moved dynamically (live, hot, or cold move) from one VMs to be moved dynamically (live, hot, or cold move) from one
server to another for dynamic load balancing or optimized work server to another for dynamic load balancing or optimized work
distribution. distribution.
There are many challenges and requirements related to VM mobility There are many challenges and requirements related to VM mobility
in large data centers, including dynamic attaching/detaching VMs in large data centers, including dynamic attaching/detaching VMs
skipping to change at page 6, line 10 skipping to change at page 6, line 16
except for handling packets in flight. except for handling packets in flight.
VM mobility solutions/procedures should not need to use tunneling VM mobility solutions/procedures should not need to use tunneling
except for handling packets in flight. except for handling packets in flight.
4. Overview of the VM Mobility Solutions 4. Overview of the VM Mobility Solutions
Layer 2 and Layer 3 mobility solutions are described respectively Layer 2 and Layer 3 mobility solutions are described respectively
in the following sections. in the following sections.
This document assumes that communication with external
entities is via the NVO3 Gateway as described in RFC8014 (NVO3
Architecture). RFC 8014 (Section 5.3) discusses whether a VM move
may or may not result in a change to the network node providing
the NVO3 Gateway functionality. If such a change is not possible,
then the path to the external entity may be hair-pinned to the
NVO3 Gateway used prior to the VM move.
4.1. VM Migration in Layer 2 Network 4.1. VM Migration in Layer 2 Network
Being able to move VMs dynamically, from one server to another, Being able to move VMs dynamically, from one server to another,
makes it possible for dynamic load balancing or work distribution. makes it possible for dynamic load balancing or work distribution.
Therefore, dynamic VM Mobility is highly desirable for large scale Therefore, dynamic VM Mobility is highly desirable for large scale
multi-tenant DCs. multi-tenant DCs.
In a Layer-2 based approach, VM moving to another server does not In a Layer-2 based approach, VM moving to another server does not
change its IP address. But this VM is now under a new NVE, change its IP address. But this VM is now under a new NVE,
previously communicating NVEs will continue sending their packets previously communicating NVEs will continue sending their packets
skipping to change at page 7, line 12 skipping to change at page 7, line 28
in this case the new NVE needs to communicate with NVA, just like in this case the new NVE needs to communicate with NVA, just like
in the gratuitous ARP case to ensure that the same IPv4 address is in the gratuitous ARP case to ensure that the same IPv4 address is
assigned to the VM. NVA uses the MAC address as the key in the assigned to the VM. NVA uses the MAC address as the key in the
search of ARP cache to find the IP address and informs this to the search of ARP cache to find the IP address and informs this to the
new NVE which in turn sends RARP reply message. This completes IP new NVE which in turn sends RARP reply message. This completes IP
address assignment to the migrating VM. address assignment to the migrating VM.
Other NVEs communicating with this VM could have the old ARP Other NVEs communicating with this VM could have the old ARP
entry. If any VMs in those NVEs need to communicate with the VM entry. If any VMs in those NVEs need to communicate with the VM
attached to the New NVE, old ARP entries might be used. Thus, the attached to the New NVE, old ARP entries might be used. Thus, the
packets are delivered to the Old NVE. The Old NVE MUST tunnel packets are delivered to the Old NVE. The Old NVE needs to tunnel
these in-flight packets to the New NVE. these in-flight packets to the New NVE to avoid packet loss.
When an ARP entry for those VMs times out, their corresponding When an ARP entry for those VMs times out, their corresponding
NVEs should access the NVA for an update. NVEs should access the NVA for an update.
IPv6 operation is slightly different: IPv6 operation is slightly different:
In IPv6, after the move, the VM immediately sends an unsolicited In IPv6, after the move, the VM immediately sends an unsolicited
neighbor advertisement message containing its IPv6 address and neighbor advertisement message containing its IPv6 address and
Layer-2 MAC address to its new NVE. This message is sent to the Layer-2 MAC address to its new NVE. This message is sent to the
IPv6 Solicited Node Multicast Address corresponding to the target IPv6 Solicited Node Multicast Address corresponding to the target
skipping to change at page 7, line 35 skipping to change at page 8, line 5
message should send request to update VM's neighbor cache entry in message should send request to update VM's neighbor cache entry in
the central directory of the NVA. The NVA's neighbor cache entry the central directory of the NVA. The NVA's neighbor cache entry
should include IPv6 address of the VM, MAC address of the VM and should include IPv6 address of the VM, MAC address of the VM and
the NVE IPv6 address. An NVE-to-NVA protocol is used for this the NVE IPv6 address. An NVE-to-NVA protocol is used for this
purpose [RFC8014]. purpose [RFC8014].
Other NVEs communicating with this VM might still use the old Other NVEs communicating with this VM might still use the old
neighbor cache entry. If any VM in those NVEs need to communicate neighbor cache entry. If any VM in those NVEs need to communicate
with the VM attached to the New NVE, it could use the old neighbor with the VM attached to the New NVE, it could use the old neighbor
cache entry. Thus, the packets are delivered to the Old NVE. The cache entry. Thus, the packets are delivered to the Old NVE. The
Old NVE MUST tunnel these in-flight packets to the New NVE. Old NVE needs to tunnel these in-flight packets to the New NVE.
When a neighbor cache entry in those VMs times out, their When a neighbor cache entry in those VMs times out, their
corresponding NVEs should access the NVA for an update. corresponding NVEs should access the NVA for an update.
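The Layer-2 behavior described above (the New NVE updating the NVA, peers holding stale ARP or neighbor cache entries, and the Old NVE tunneling packets in flight) can be sketched as a small simulation. This is an illustrative model only, not an NVE-to-NVA protocol: the class names, method names, NVE names and addresses below are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NVA:
    """Hypothetical central directory: VM address -> (VM MAC, NVE address)."""
    directory: dict = field(default_factory=dict)

    def update(self, vm_ip, vm_mac, nve_addr):
        self.directory[vm_ip] = (vm_mac, nve_addr)

    def lookup(self, vm_ip):
        return self.directory[vm_ip]


@dataclass
class NVE:
    addr: str
    nva: NVA
    local_vms: set = field(default_factory=set)
    cache: dict = field(default_factory=dict)       # VM address -> remote NVE address
    forward_to: dict = field(default_factory=dict)  # moved VM address -> New NVE address

    def attach(self, vm_ip, vm_mac):
        """VM announced itself (gratuitous ARP or unsolicited NA): update the NVA."""
        self.local_vms.add(vm_ip)
        self.nva.update(vm_ip, vm_mac, self.addr)

    def detach(self, vm_ip, new_nve_addr):
        """VM moved away: remember where to tunnel packets still in flight."""
        self.local_vms.discard(vm_ip)
        self.forward_to[vm_ip] = new_nve_addr

    def next_hop(self, vm_ip):
        """NVE a sender uses for vm_ip; the cached entry may be stale."""
        if vm_ip not in self.cache:
            self.cache[vm_ip] = self.nva.lookup(vm_ip)[1]
        return self.cache[vm_ip]

    def expire(self, vm_ip):
        """Cache entry timed out; the next lookup goes back to the NVA."""
        self.cache.pop(vm_ip, None)

    def receive(self, vm_ip):
        """Return 'deliver', the NVE address to tunnel to, or 'drop'."""
        if vm_ip in self.local_vms:
            return "deliver"
        return self.forward_to.get(vm_ip, "drop")
```

In this sketch a peer NVE that resolved the VM before the move keeps returning the Old NVE from `next_hop` until `expire` is called, and during that window `receive` on the Old NVE returns the New NVE address, which stands in for tunneling the in-flight packet.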
4.2. Task Migration in Layer-3 Network 4.2. VM Migration in Layer-3 Network
ARP/neighbor cache scalability considerations can limit the size
of Layer-2 based DC networks. Scaling can be accomplished
seamlessly in Layer-3 data center networks by just giving each
virtual network an IP subnet and a default route that points to
its NVE. This means no explosion of ARP/ neighbor cache in VMs
and NVEs (just one ARP/ neighbor cache entry for the default
route) and there is no need to have Ethernet header in
encapsulation [RFC7348] which saves at least 16 bytes.
Even though the term VM and Task are used interchangeably in this Traditional Layer-3 based data center networks usually have all
document, the term Task is used in the context of Layer-3 hosts (tasks) within one subnet attached to one NVE. By this
migration mainly to have slight emphasis on the moving an entity design, the NVE becomes the default route for all hosts (tasks)
(Task) that is instantiated on a VM or a container. within the subnet. But this design requires IP address of a host
(task) to change after the move to comply with the prefixes of the
IP address under the new NVE.
Traditional Layer-3 based data center networks require IP address A VM migration in Layer 3 Network solution is to allow IP
of the task to change after moving because the prefixes of the IP addresses staying the same after moving to different locations.
address usually reflect the locations. It is necessary to have an The Identifier Locator Addressing or ILA [I-D.herbert-nvo3-ila] is
IP based VM migration solution that can allow IP addresses staying one of such solutions.
the same after moving to different locations. The Identifier
Locator Addressing or ILA [I-D.herbert-nvo3-ila] is one of such
solutions.
Because broadcasting is not available in Layer-3 based networks, Because broadcasting is not available in Layer-3 based networks,
multicast of neighbor solicitations in IPv6 would need to be multicast of neighbor solicitations in IPv6 would need to be
emulated. emulated.
Cold task migration, which is a common practice in many data Hot VM Migration in Layer 3 involves coordination among many
centers, involves the following steps: entities, such as VM management system and NVA. Cold task
migration, which is a common practice in many data centers,
involves the following steps:
- Stop running the task. - Stop running the task.
- Package the runtime state of the job. - Package the runtime state of the job.
- Send the runtime state of the task to the New NVE where the - Send the runtime state of the task to the New NVE where the
task is to run. task is to run.
- Instantiate the task's state on the new machine. - Instantiate the task's state on the new machine.
- Start the tasks for the task continuing from the point at which - Start the tasks for the task continuing from the point at which
it was stopped. it was stopped.
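The cold migration steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions: `Task`, `Host` and `cold_migrate` are hypothetical names, the runtime state is assumed to be JSON-serializable, and the network transfer is replaced by an in-memory copy.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    state: dict = field(default_factory=dict)
    running: bool = False


@dataclass
class Host:
    name: str
    tasks: dict = field(default_factory=dict)


def cold_migrate(task_name, old_host, new_host):
    """Move a task between hosts following the five cold-migration steps."""
    task = old_host.tasks.pop(task_name)
    task.running = False                                  # stop running the task
    blob = json.dumps({"name": task.name,
                       "state": task.state}).encode()     # package the runtime state
    received = bytes(blob)                                # stand-in for sending it to the new location
    data = json.loads(received.decode())
    restored = Task(data["name"], data["state"])          # instantiate the state on the new machine
    restored.running = True                               # resume from the point where it stopped
    new_host.tasks[restored.name] = restored
    return restored
```

Because the task is stopped before its state is packaged, nothing changes between the snapshot and the restart, which is what distinguishes this from hot migration.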
4.2.1. Address and Connection Migration in Task Migration RFC7666 has the more detailed description of the State Machine of
VMs controlled by Hypervisor.
Address migration is achieved as follows: 4.3. Address and Connection Migration in Task Migration
- Configure IPv4/v6 address on the target Task. The term "Task" is referring to an entity (Task) that is
- Suspend use of the address on the old Task. This includes instantiated on a VM or a container; in other words, a Task can
be an "Application" or a "workload" running on a VM or a
Container.
Moving a Task running on a VM attached to one NVE to another VM
attached to a New NVE is the same as moving the VM from one NVE to
the New NVE. The VM attached to the New NVE needs to be assigned
the same address as the VM attached to the Old NVE, which is called
Address Migration in this document. Here is an example of the
steps involved in Address Migration:
- Configure IPv4/v6 address on the target VM/NVE.
- Suspend use of the address on the old NVE. This includes
handling established connections. A state may be established handling established connections. A state may be established
to drop packets or send ICMPv4 or ICMPv6 destination to drop packets or send ICMPv4 or ICMPv6 destination
unreachable message when packets to the migrated address are unreachable message when packets to the migrated address are
received. received. Referring to the VM State Machine described in
- Push the new mapping to VM. Communicating VMs will learn of RFC7666.
the new mapping via a control plane either by participating in - Push the new NVE-VM mapping to other NVEs which have the
a protocol for mapping propagation or by getting the new attached VMs communicating with the VM being moved. All
mapping from a central database such as Domain Name System relevant NVEs will learn the new mapping via their
(DNS). corresponding NVA.
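The three Address Migration steps above, in their -08 wording, can be sketched as below. The `Nve` class, the `migrate_address` function and the string results are assumptions chosen for illustration, and pushing the mapping through the NVA is reduced to a direct update of each peer's table.

```python
from dataclasses import dataclass, field


@dataclass
class Nve:
    name: str
    suspend_action: str = "drop"                  # or "icmp-unreachable"
    active: set = field(default_factory=set)      # addresses served at this NVE
    suspended: set = field(default_factory=set)   # addresses that migrated away
    mapping: dict = field(default_factory=dict)   # VM address -> NVE name

    def handle(self, dst):
        """Decide what happens to a packet addressed to dst arriving here."""
        if dst in self.active:
            return "deliver"
        if dst in self.suspended:
            return self.suspend_action
        return "unknown"


def migrate_address(addr, old, new, peers):
    """Apply the three Address Migration steps for one address."""
    new.active.add(addr)              # configure the address on the target VM/NVE
    old.active.discard(addr)          # suspend use of the address on the old NVE
    old.suspended.add(addr)
    for nve in peers:                 # push the new NVE-VM mapping to relevant NVEs
        nve.mapping[addr] = new.name
```

Whether a suspended address results in a silent drop or an ICMP destination unreachable message is modeled here as a per-NVE setting, since the text allows either.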
Connection migration involves reestablishing existing TCP Connection migration involves reestablishing existing TCP
connections of the task in the new place. connections of the task in the new place.
The simplest course of action is to drop all TCP connections to The simplest course of action is to drop all TCP connections to
the VM across a migration. If the migrations are relatively rare the VM across a migration. If the migrations are relatively rare
events in a data center, impact is relatively small when TCP events in a data center, impact is relatively small when TCP
connections are automatically closed in the network stack during a connections are automatically closed in the network stack during a
migration event. If the applications running are known to handle migration event. If the applications running are known to handle
this gracefully (i.e. reopen dropped connections) then this this gracefully (i.e. reopen dropped connections) then this
skipping to change at page 11, line 7 skipping to change at page 11, line 30
The Cold VM mobility can be facilitated by cold standby entity The Cold VM mobility can be facilitated by cold standby entity
receiving scheduled backup information. The cold standby entity receiving scheduled backup information. The cold standby entity
can be a VM or can be other form factors which is beyond the scope can be a VM or can be other form factors which is beyond the scope
of this document. The cold mobility option can be used for non- of this document. The cold mobility option can be used for non-
critical applications and services that can tolerate interrupted critical applications and services that can tolerate interrupted
TCP connections. TCP connections.
The Warm VM mobility refers to backup entities receiving backup The Warm VM mobility refers to backup entities receiving backup
information at more frequent intervals. The duration of the information at more frequent intervals. The duration of the
interval determines the warmth of the option. The larger the interval determines the effectiveness (or benefit) of Warm VM
duration, the less warm (and hence cold) the Warm VM mobility mobility. The larger the duration, the less effective the Warm VM
option becomes. mobility option becomes.
For Hot VM Mobility, once a VM moves to a New NVE, the VM IP For Hot VM Mobility, once a VM moves to a New NVE, the VM IP
address does not change and the VM should be able to continue to address does not change and the VM should be able to continue to
receive packets to its address(es). The VM needs to send a receive packets to its address(es). The VM needs to send a
gratuitous Address Resolution message or unsolicited Neighbor gratuitous Address Resolution message or unsolicited Neighbor
Advertisement message upstream after each move. Advertisement message upstream after each move.
8. Other VM Mobility Options 8. Other VM Mobility Options
There is also a Hot Standby option in addition to the Hot There is also a Hot Standby option in addition to the Hot
Mobility, where there are VMs in both primary and secondary NVEs. Mobility, where there are VMs in both primary and secondary NVEs.
skipping to change at page 12, line 37 skipping to change at page 13, line 18
. submitted version -01 with these changes: references are updated, . submitted version -01 with these changes: references are updated,
o added packets in flight definition to Section 2 o added packets in flight definition to Section 2
. submitted version -02 with updated address. . submitted version -02 with updated address.
. submitted version -03 to fix the nits. . submitted version -03 to fix the nits.
. submitted version -04 in reference to the WG Last call comments. . submitted version -04 in reference to the WG Last call comments.
. Submitted version - 05 to address IETF LC comments from TSV area. . Submitted version - 05, 06, 07, and 08 to address IETF LC comments
from TSV area.
14. References 14. References
14.1. Normative References 14.1. Normative References
[RFC0826] Plummer, D., "An Ethernet Address Resolution Protocol: Or [RFC0826] Plummer, D., "An Ethernet Address Resolution Protocol: Or
Converting Network Protocol Addresses to 48.bit Ethernet Converting Network Protocol Addresses to 48.bit Ethernet
Address for Transmission on Ethernet Hardware", STD 37, Address for Transmission on Ethernet Hardware", STD 37,
RFC 826, DOI 10.17487/RFC0826, November 1982, RFC 826, DOI 10.17487/RFC0826, November 1982,
<https://www.rfc-editor.org/info/rfc826>. <https://www.rfc-editor.org/info/rfc826>.
[RFC0903] Finlayson, R., Mann, T., Mogul, J., and M. Theimer, "A [RFC0903] Finlayson, R., Mann, T., Mogul, J., and M. Theimer, "A
Reverse Address Resolution Protocol", STD 38, RFC 903, Reverse Address Resolution Protocol", STD 38, RFC 903,
skipping to change at page 14, line 5 skipping to change at page 14, line 27
Framework for Overlaying Virtualized Layer 2 Networks over Framework for Overlaying Virtualized Layer 2 Networks over
Layer 3 Networks", RFC 7348, DOI 10.17487/RFC7348, August Layer 3 Networks", RFC 7348, DOI 10.17487/RFC7348, August
2014, <https://www.rfc-editor.org/info/rfc7348>. 2014, <https://www.rfc-editor.org/info/rfc7348>.
[RFC7364] Narten, T., Ed., Gray, E., Ed., Black, D., Fang, L., [RFC7364] Narten, T., Ed., Gray, E., Ed., Black, D., Fang, L.,
Kreeger, L., and M. Napierala, "Problem Statement: Kreeger, L., and M. Napierala, "Problem Statement:
Overlays for Network Virtualization", RFC 7364, DOI Overlays for Network Virtualization", RFC 7364, DOI
10.17487/RFC7364, October 2014, <https://www.rfc- 10.17487/RFC7364, October 2014, <https://www.rfc-
editor.org/info/rfc7364>. editor.org/info/rfc7364>.
[RFC8014] Black, D., Hudson, J., Kreeger, L., Lasserre, M., and T. [RFC7666] H. Asai, et al, "Management Information Base for Virtual
Machines Controlled by a Hypervisor", RFC7666, Oct 2015.
[RFC8014] Black, D., Hudson, J., Kreeger, L., Lasserre, M., and T.
Narten, "An Architecture for Data-Center Network Narten, "An Architecture for Data-Center Network
Virtualization over Layer 3 (NVO3)", RFC 8014, DOI Virtualization over Layer 3 (NVO3)", RFC 8014, DOI
10.17487/RFC8014, December 2016, <https://www.rfc- 10.17487/RFC8014, December 2016, <https://www.rfc-
editor.org/info/rfc8014>. editor.org/info/rfc8014>.
14.2. Informative References 14.2. Informative References
[I-D.herbert-nvo3-ila] Herbert, T. and P. Lapukhov, "Identifier- [I-D.herbert-nvo3-ila] Herbert, T. and P. Lapukhov, "Identifier-
locator addressing for IPv6", draft-herbert-nvo3-ila-04 locator addressing for IPv6", draft-herbert-nvo3-ila-04
(work in progress), March 2017. (work in progress), March 2017.
 End of changes. 24 change blocks. 
59 lines changed or deleted 85 lines changed or added
