draft-ietf-nvo3-vmm-08.txt | draft-ietf-nvo3-vmm-09.txt | |||
---|---|---|---|---|
Network Working Group L. Dunbar | Network Working Group L. Dunbar | |||
Internet Draft Futurewei | Internet Draft Futurewei | |||
Intended status: Informational B. Sarikaya | Intended status: Informational B. Sarikaya | |||
Expires: September 25, 2020 Denpel Informatique | Expires: September 26, 2020 Denpel Informatique | |||
B.Khasnabish | B.Khasnabish | |||
Independent | Independent | |||
T. Herbert | T. Herbert | |||
Intel | Intel | |||
S. Dikshit | S. Dikshit | |||
Aruba-HPE | Aruba-HPE | |||
March 25, 2020 | March 26, 2020 | |||
Virtual Machine Mobility Solutions for L2 and L3 Overlay Networks | Virtual Machine Mobility Solutions for L2 and L3 Overlay Networks | |||
draft-ietf-nvo3-vmm-08 | draft-ietf-nvo3-vmm-09 | |||
Abstract | Abstract | |||
This document describes virtual machine mobility solutions commonly | This document describes virtual machine mobility solutions commonly | |||
used in data centers built with overlay-based networks. This document | used in data centers built with overlay-based networks. This document | |||
is intended for describing the solutions and the impact of moving | is intended for describing the solutions and the impact of moving | |||
VMs (or applications) from one Rack to another connected by the | VMs (or applications) from one Rack to another connected by the | |||
Overlay networks. | Overlay networks. | |||
For layer 2, it is based on using an NVA (Network Virtualization | For layer 2, it is based on using an NVA (Network Virtualization | |||
skipping to change at page 2, line 21 ¶ | skipping to change at page 2, line 21 ¶ | |||
months and may be updated, replaced, or obsoleted by other documents | months and may be updated, replaced, or obsoleted by other documents | |||
at any time. It is inappropriate to use Internet-Drafts as | at any time. It is inappropriate to use Internet-Drafts as | |||
reference material or to cite them other than as "work in progress." | reference material or to cite them other than as "work in progress." | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html | http://www.ietf.org/shadow.html | |||
This Internet-Draft will expire on September 24, 2020. | This Internet-Draft will expire on September 25, 2020. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2020 IETF Trust and the persons identified as the | Copyright (c) 2020 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at page 2, line 44 ¶ | skipping to change at page 2, line 44 ¶ | |||
document must include Simplified BSD License text as described in | document must include Simplified BSD License text as described in | |||
Section 4.e of the Trust Legal Provisions and are provided without | Section 4.e of the Trust Legal Provisions and are provided without | |||
warranty as described in the Simplified BSD License. | warranty as described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction...................................................3 | 1. Introduction...................................................3 | |||
2. Conventions used in this document..............................4 | 2. Conventions used in this document..............................4 | |||
3. Requirements...................................................5 | 3. Requirements...................................................5 | |||
4. Overview of the VM Mobility Solutions..........................6 | 4. Overview of the VM Mobility Solutions..........................6 | |||
4.1. VM Migration in Layer 2 Network...........................6 | 4.1. Inter-VNs communication...................................6 | |||
4.2. VM Migration in Layer-3 Network...........................8 | 4.2. VM Migration in Layer 2 Network...........................6 | |||
4.3. Address and Connection Migration in Task Migration........9 | 4.3. VM Migration in Layer-3 Network...........................8 | |||
4.4. Address and Connection Management in VM Migration.........9 | ||||
5. Handling Packets in Flight....................................10 | 5. Handling Packets in Flight....................................10 | |||
6. Moving Local State of VM......................................10 | 6. Moving Local State of VM......................................11 | |||
7. Handling of Hot, Warm and Cold VM Mobility....................11 | 7. Handling of Hot, Warm and Cold VM Mobility....................11 | |||
8. Other VM Mobility Options.....................................11 | 8. Other Options.................................................12 | |||
9. VM Lifecycle Management.......................................12 | 9. VM Lifecycle Management.......................................12 | |||
10. Security Considerations......................................12 | 10. Security Considerations......................................12 | |||
11. IANA Considerations..........................................12 | 11. IANA Considerations..........................................13 | |||
12. Acknowledgments..............................................12 | 12. Acknowledgments..............................................13 | |||
13. Change Log...................................................13 | 13. Change Log...................................................13 | |||
14. References...................................................13 | 14. References...................................................13 | |||
14.1. Normative References....................................13 | 14.1. Normative References....................................14 | |||
14.2. Informative References..................................14 | 14.2. Informative References..................................15 | |||
1. Introduction | 1. Introduction | |||
This document describes overlay-based data center network | This document describes overlay-based data center network | |||
solutions for supporting multitenancy and VM (Virtual Machine) | solutions for supporting multitenancy and VM (Virtual Machine) | |||
mobility. This document is strictly within the DCVPN, as defined | mobility. This document is strictly within the DCVPN, as defined | |||
by the NVO3 Framework [RFC 7365]. The intent is to describe Layer | by the NVO3 Framework [RFC 7365]. The intent is to describe Layer | |||
2 and Layer 3 Network behavior when VMs are moved from one NVE to | 2 and Layer 3 Network behavior when VMs are moved from one NVE to | |||
another. This document assumes that a VM's move is initiated by | another. This document assumes that a VM's move is initiated by | |||
the VM management system, i.e., a planned move. How and when to move | the VM management system, i.e., a planned move. How and when to move | |||
VMs is out of the scope of this document. RFC7666 already has the | VMs is out of the scope of this document. RFC7666 already has the | |||
skipping to change at page 4, line 24 ¶ | skipping to change at page 4, line 25 ¶ | |||
in several buildings/cities or Layer-3 networks with large number | in several buildings/cities or Layer-3 networks with large number | |||
of host routes that cannot be aggregated as the result of frequent | of host routes that cannot be aggregated as the result of frequent | |||
moves from one location to another without changing their IP | moves from one location to another without changing their IP | |||
addresses. The connectivity between Layer 2 boundaries can be | addresses. The connectivity between Layer 2 boundaries can be | |||
achieved by the network virtualization edge (NVE) functioning as | achieved by the network virtualization edge (NVE) functioning as | |||
Layer 3 gateway routing across bridging domain such as in | Layer 3 gateway routing across bridging domain such as in | |||
Warehouse Scale Computers (WSC). | Warehouse Scale Computers (WSC). | |||
2. Conventions used in this document | 2. Conventions used in this document | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL | ||||
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and | ||||
"OPTIONAL" in this document are to be interpreted as described in | ||||
RFC 2119 [RFC2119] and [RFC8014]. | ||||
This document uses the terminology defined in [RFC7364]. In | This document uses the terminology defined in [RFC7364]. In | |||
addition, we make the following definitions: | addition, we make the following definitions: | |||
VM: Virtual Machine | VM: Virtual Machine | |||
Tasks: Task is a program instantiated or running on a virtual | Tasks: Task is a program instantiated or running on a virtual | |||
machine or container. Tasks in virtual machines or | machine or container. Tasks in virtual machines or | |||
containers can be migrated from one server to another. | containers can be migrated from one server to another. | |||
We use task, workload and virtual machine | We use task, workload and virtual machine | |||
interchangeably in this document. | interchangeably in this document. | |||
skipping to change at page 6, line 16 ¶ | skipping to change at page 6, line 13 ¶ | |||
except for handling packets in flight. | except for handling packets in flight. | |||
VM mobility solutions/procedures should not need to use tunneling | VM mobility solutions/procedures should not need to use tunneling | |||
except for handling packets in flight. | except for handling packets in flight. | |||
4. Overview of the VM Mobility Solutions | 4. Overview of the VM Mobility Solutions | |||
Layer 2 and Layer 3 mobility solutions are described respectively | Layer 2 and Layer 3 mobility solutions are described respectively | |||
in the following sections. | in the following sections. | |||
This document assumes that the communication with external | 4.1. Inter-VNs communication | |||
entities are via the NVO3 Gateway as described in RFC8014 (NVO3 | ||||
Architecture). RFC 8014 (Section 5.3) has the discussion whether a | ||||
VM move may result in or cannot result in a change to the network | ||||
node providing the NVO3 Gateway functionality - if such a change | ||||
is not possible, then the path to the external entity may be hair- | ||||
pinned to the NVO3 Gateway used prior to the VM move. | ||||
4.1. VM Migration in Layer 2 Network | Inter VNs (Virtual Networks) communication refers to communication | |||
among tenants (or hosts) belonging to different VNs. Those tenants | ||||
can be attached to the NVEs co-located in the same Data Center or | ||||
in different Data centers. This document assumes that the inter- | ||||
VNs communication is via the NVO3 Gateway as described in RFC8014 | ||||
(NVO3 Architecture). RFC 8014 (Section 5.3) describes the NVO3 | ||||
Gateway function which is to relay traffic onto and off of a | ||||
virtual network, i.e. among different VNs. | ||||
After a VM is moved to a new NVE, the VM's corresponding Gateway | ||||
may need to change as well. If such a change is not possible, then | ||||
the path to the external entity needs to be hair-pinned to the NVO3 | |||
Gateway used prior to the VM move. | ||||
4.2. VM Migration in Layer 2 Network | ||||
Being able to move VMs dynamically, from one server to another, | Being able to move VMs dynamically, from one server to another, | |||
makes it possible for dynamic load balancing or work distribution. | makes it possible for dynamic load balancing or work distribution. | |||
Therefore, dynamic VM Mobility is highly desirable for large scale | Therefore, dynamic VM Mobility is highly desirable for large scale | |||
multi-tenant DCs. | multi-tenant DCs. | |||
In a Layer-2 based approach, VM moving to another server does not | In a Layer-2 based approach, VM moving to another NVE does not | |||
change its IP address. But this VM is now under a new NVE, | change its IP address. But this VM is now under a new NVE, | |||
previously communicating NVEs will continue sending their packets | previously communicating NVEs will continue sending their packets | |||
to the Old NVE. To solve this problem, Address Resolution | to the Old NVE. Therefore, Address Resolution Protocol (ARP) | |||
Protocol (ARP) cache in IPv4 [RFC0826] or neighbor cache in IPv6 | cache in IPv4 [RFC0826] or neighbor cache in IPv6 [RFC4861] in the | |||
[RFC4861] in the NVEs need to be updated promptly. All NVEs need | NVEs need to be updated promptly, especially for the NVEs that | |||
to change their caches associating the VM Layer-2 or Medium Access | have attached VMs communicating with the VM being moved. If the VM | |||
Control (MAC) address with the new NVE's IP address as soon as the | being moved has communication with external entities, the NVO3 | |||
VM is moved. Such a change enables all NVEs to encapsulate the | gateway needs to be notified of the new NVE where the VM is moved | |||
outgoing MAC frames with the current target NVE IP address. It may | to. | |||
take some time to refresh ARP/ND cache when a VM is moved to a New | ||||
NVE. During this period, a tunnel is needed for that Old NVE to | Such a change enables those NVEs to encapsulate the outgoing MAC | |||
forward packets destined to the VM to the New NVE. | frames with the current target NVE IP address. | |||
In IPv4, the VM immediately after the move should send a | In IPv4, the VM immediately after the move should send a | |||
gratuitous ARP request message containing its IPv4 and Layer 2 MAC | gratuitous ARP request message containing its IPv4 and Layer 2 MAC | |||
address in its new NVE. This message's destination address is the | address in its new NVE. This message's destination address is the | |||
broadcast address. Upon receiving this message, both Old and New | broadcast address. Upon receiving this message, the New NVE can | |||
NVEs should update the VM's ARP entry in the central directory at | update its cache of mapping MAC with IP. The New NVE should send a | |||
the NVA, to update its mappings to record the IPv4 address & MAC | notification of the newly attached VM to the central directory | |||
address of the moving VM along with the new NVE IPv4 address. An | [RFC7067] embedded in the NVA to update the mapping of the IPv4 | |||
NVE-to-NVA protocol is used for this purpose [RFC8014]. | address & MAC address of the moving VM along with the new NVE | |||
address. An NVE-to-NVA protocol is used for this purpose | ||||
[RFC8014]. If an NVE has a VM being moved away or detached, the | ||||
NVE should send an ARP scan to all its attached VMs to refresh its | ||||
ARP Cache. | ||||
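The directory update triggered by a gratuitous ARP can be sketched as follows. This is a minimal illustration only: the class NvaDirectory, the function on_gratuitous_arp, and the in-memory dictionary are hypothetical stand-ins for the NVA's central directory and the NVE-to-NVA protocol, neither of which this document specifies. Addresses are documentation examples.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Mapping:
    """One directory entry: a VM's addresses and the NVE it is attached to."""
    vm_ip: str
    vm_mac: str
    nve_ip: str


class NvaDirectory:
    """Hypothetical in-memory stand-in for the NVA's central directory."""

    def __init__(self) -> None:
        self._by_mac: Dict[str, Mapping] = {}

    def register_attach(self, vm_ip: str, vm_mac: str,
                        nve_ip: str) -> Optional[Mapping]:
        """Record the VM under nve_ip and return the previous entry, if any."""
        previous = self._by_mac.get(vm_mac)
        self._by_mac[vm_mac] = Mapping(vm_ip, vm_mac, nve_ip)
        return previous

    def lookup(self, vm_mac: str) -> Optional[Mapping]:
        """Return the current entry for a VM MAC address, or None."""
        return self._by_mac.get(vm_mac)


def on_gratuitous_arp(directory: NvaDirectory, local_nve_ip: str,
                      sender_ip: str, sender_mac: str) -> Optional[Mapping]:
    """Assumed hook run by the New NVE when an attached VM sends a gratuitous ARP.

    The NVE reports the (IP, MAC, NVE) triple to the directory. The returned
    previous entry identifies the Old NVE, if the VM was known before.
    """
    return directory.register_attach(sender_ip, sender_mac, local_nve_ip)
```

Keying the directory on the MAC address mirrors the RARP case described later in this section, where the NVA searches by MAC address to find the IP address.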
Reverse ARP (RARP) which enables the host to discover its IPv4 | Reverse ARP (RARP) which enables the host to discover its IPv4 | |||
address when it boots from a local server [RFC0903], is not used | address when it boots from a local server [RFC0903], is not used | |||
by VMs if the VM already knows its IPv4 address (most common | by VMs if the VM already knows its IPv4 address (most common | |||
scenario). Next, we describe a case where RARP is used. | scenario). Next, we describe a case where RARP is used. | |||
There are some vendor deployments (diskless systems or systems | There are some vendor deployments (diskless systems or systems | |||
without configuration files) wherein the VM's user, i.e. end-user | without configuration files) wherein the VM's user, i.e. end-user | |||
client asks for the same MAC address upon migration. This can be | client asks for the same MAC address upon migration. This can be | |||
achieved by the clients sending RARP request message which carries | achieved by the clients sending RARP request message which carries | |||
the MAC address looking for an IP address allocation. The server, | the MAC address looking for an IP address allocation. The server, | |||
in this case the new NVE needs to communicate with NVA, just like | in this case the new NVE needs to communicate with NVA, just like | |||
in the gratuitous ARP case to ensure that the same IPv4 address is | in the gratuitous ARP case to ensure that the same IPv4 address is | |||
assigned to the VM. NVA uses the MAC address as the key in the | assigned to the VM. NVA uses the MAC address as the key in the | |||
search of ARP cache to find the IP address and informs this to the | search of ARP cache to find the IP address and informs this to the | |||
new NVE which in turn sends RARP reply message. This completes IP | new NVE which in turn sends RARP reply message. This completes IP | |||
address assignment to the migrating VM. | address assignment to the migrating VM. | |||
Other NVEs communicating with this VM could have the old ARP | Other NVEs communicating with this VM could have the old ARP | |||
entry. If any VMs in those NVEs need to communicate with the VM | entry. To avoid old ARP entries being used by other NVEs, the old | |||
attached to the New NVE, old ARP entries might be used. Thus, the | NVE upon discovering a VM is detached should send a notification | |||
packets are delivered to the Old NVE. The Old NVE needs to tunnel | to all other NVEs and its NVO3 Gateway to time out the ARP cache | |||
these in-flight packets to the New NVE to avoid packets loss. | for the VM [RFC8171]. When an NVE (including the old NVE) receives | |||
a packet or ARP request destined towards a VM (its MAC or IP | |||
When an ARP entry for those VMs times out, their corresponding | address) that is not in the NVE's ARP cache, the NVE should send | |||
NVEs should access the NVA for an update. | a query to the NVA's Directory Service to get the associated NVE address | |||
for the VM. This is how the Old NVE tunnels these in-flight | |||
packets to the New NVE to avoid packet loss. | |||
IPv6 operation is slightly different: | IPv6 operation is slightly different: | |||
In IPv6, after the move, the VM immediately sends an unsolicited | In IPv6, after the move, the VM immediately sends an unsolicited | |||
neighbor advertisement message containing its IPv6 address and | neighbor advertisement message containing its IPv6 address and | |||
Layer-2 MAC address to its new NVE. This message is sent to the | Layer-2 MAC address to its new NVE. This message is sent to the | |||
IPv6 Solicited Node Multicast Address corresponding to the target | IPv6 Solicited Node Multicast Address corresponding to the target | |||
address which is the VM's IPv6 address. The NVE receiving this | address which is the VM's IPv6 address. The NVE receiving this | |||
message should send request to update VM's neighbor cache entry in | message should send request to update VM's neighbor cache entry in | |||
the central directory of the NVA. The NVA's neighbor cache entry | the central directory of the NVA. The NVA's neighbor cache entry | |||
should include IPv6 address of the VM, MAC address of the VM and | should include IPv6 address of the VM, MAC address of the VM and | |||
the NVE IPv6 address. An NVE-to-NVA protocol is used for this | the NVE IPv6 address. An NVE-to-NVA protocol is used for this | |||
purpose [RFC8014]. | purpose [RFC8014]. | |||
Other NVEs communicating with this VM might still use the old | To avoid other NVEs communicating with this VM using the old | |||
neighbor cache entry. If any VM in those NVEs need to communicate | neighbor cache entry, the old NVE upon discovering a VM being | |||
with the VM attached to the New NVE, it could use the old neighbor | moved or VM management system which initiates the VM move should | |||
cache entry. Thus, the packets are delivered to the Old NVE. The | send a notification to all NVEs to time out the ND cache for the VM | |||
Old NVE needs to tunnel these in-flight packets to the New NVE. | being moved. When an ND cache entry for those VMs times out, their | |||
corresponding NVEs should send query to the NVA for an update. | ||||
When a neighbor cache entry in those VMs times out, their | ||||
corresponding NVEs should access the NVA for an update. | ||||
4.2. VM Migration in Layer-3 Network | 4.3. VM Migration in Layer-3 Network | |||
Traditional Layer-3 based data center networks usually have all | Traditional Layer-3 based data center networks usually have all | |||
hosts (tasks) within one subnet attached to one NVE. By this | hosts (tasks) within one subnet attached to one NVE. By this | |||
design, the NVE becomes the default route for all hosts (tasks) | design, the NVE becomes the default route for all hosts (tasks) | |||
within the subnet. But this design requires IP address of a host | within the subnet. But this design requires IP address of a host | |||
(task) to change after the move to comply with the prefixes of the | (task) to change after the move to comply with the prefixes of the | |||
IP address under the new NVE. | IP address under the new NVE. | |||
A Layer 3 VM migration solution allows IP | A Layer 3 VM migration solution allows IP | |||
addresses to stay the same after moving to different locations. | addresses to stay the same after moving to different locations. | |||
Identifier Locator Addressing (ILA) [I-D.herbert-nvo3-ila] is | Identifier Locator Addressing (ILA) [I-D.herbert-nvo3-ila] is | |||
one such solution. | one such solution. | |||
Because broadcasting is not available in Layer-3 based networks, | Because broadcasting is not available in Layer-3 based networks, | |||
multicast of neighbor solicitations in IPv6 would need to be | multicast of neighbor solicitations in IPv6 and ARP for IPv4 would | |||
emulated. | need to be emulated. Scalability of the multicast (such as IPv6 ND | |||
and IPv4 ARP) can become problematic because the hosts belonging | ||||
to one subnet (or one VLAN) can span across many NVEs. Sending | ||||
broadcast traffic to all NVEs can cause unnecessary traffic in the | ||||
DCN if the hosts belonging to one subnet are only attached to a | ||||
very small number of NVEs. It is preferable to have a directory | ||||
[RFC7067] or NVA to manage the updates to an NVE of the potential | ||||
other NVEs a specific subnet may be attached and get periodic | ||||
reports from an NVE of all the subnets being attached/detached, as | ||||
described by RFC8171. | ||||
Hot VM Migration in Layer 3 involves coordination among many | Hot VM Migration in Layer 3 involves coordination among many | |||
entities, such as VM management system and NVA. Cold task | entities, such as VM management system and NVA. Cold task | |||
migration, which is a common practice in many data centers, | migration, which is a common practice in many data centers, | |||
involves the following steps: | involves the following steps: | |||
- Stop running the task. | - Stop running the task. | |||
- Package the runtime state of the job. | - Package the runtime state of the job. | |||
- Send the runtime state of the task to the New NVE where the | - Send the runtime state of the task to the New NVE where the | |||
task is to run. | task is to run. | |||
- Instantiate the task's state on the new machine. | - Instantiate the task's state on the new machine. | |||
- Start the task, continuing from the point at which | - Start the task, continuing from the point at which | |||
it was stopped. | it was stopped. | |||
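The cold task migration steps listed above can be sketched as follows. This is an illustration under stated assumptions: Task, Host, and cold_migrate are hypothetical names, a single counter stands in for arbitrary runtime state, and JSON serialization stands in for whatever state transfer mechanism a real VM management system uses.

```python
import json
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Task:
    name: str
    counter: int = 0       # stands in for arbitrary runtime state
    running: bool = False

    def step(self) -> None:
        """Advance the task by one unit of work; only legal while running."""
        if not self.running:
            raise RuntimeError("task is stopped")
        self.counter += 1


@dataclass
class Host:
    """Hypothetical server attached to one NVE."""
    nve: str
    tasks: Dict[str, Task] = field(default_factory=dict)


def cold_migrate(name: str, old: Host, new: Host) -> Task:
    """Move a task from old to new following the five steps in the list."""
    task = old.tasks.pop(name)
    task.running = False                                   # stop the task
    blob = json.dumps({"name": task.name,
                       "counter": task.counter})           # package its state
    received = json.loads(blob)                            # "send" to new host
    moved = Task(received["name"], received["counter"])    # instantiate state
    new.tasks[name] = moved
    moved.running = True                                   # resume where it stopped
    return moved
```

The task is stopped before its state is packaged, so no work can be lost between the snapshot and the restart. That ordering is what distinguishes this cold procedure from hot migration.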
RFC7666 has a more detailed description of the State Machine of | RFC7666 has a more detailed description of the State Machine of | |||
VMs controlled by a Hypervisor. | VMs controlled by a Hypervisor. | |||
4.3. Address and Connection Migration in Task Migration | 4.4. Address and Connection Management in VM Migration | |||
The term "Task" is referring to an entity (Task) that is | ||||
instantiated on a VM or a container, in another word, a Task can | ||||
be an "Application" or a "workload" running on a VM or a | ||||
Container. | ||||
Moving a Task running on a VM attached to one NVE to another VM | Since the VM attached to the New NVE needs to be assigned with the | |||
attached to a New NVE is same as moving the VM from one NVE to the | same address as VM attached to the Old NVE, extra processing or | |||
New NVE. The VM attached to the New NVE needs to be assigned with | configuration is needed, such as: | |||
the same address as VM attached to the Old NVE, which is called | ||||
Address Migration in this document. Here is an example of the | ||||
steps involved in Address Migration: | ||||
- Configure IPv4/v6 address on the target VM/NVE. | - Configure IPv4/v6 address on the target VM/NVE. | |||
- Suspend use of the address on the old NVE. This includes | - Suspend use of the address on the old NVE. This includes the | |||
handling established connections. A state may be established | old NVE sending query to NVA upon receiving packets destined | |||
to drop packets or send ICMPv4 or ICMPv6 destination | towards the VM being moved away. If there is no response from | |||
unreachable message when packets to the migrated address are | NVA for the new NVE for the VM, the old NVE can only drop the | |||
received. Refer to the VM State Machine described in | packets. Refer to the VM State Machine described in | |||
RFC7666. | RFC7666. | |||
- Push the new NVE-VM mapping to other NVEs which have the | - Trigger NVA to push the new NVE-VM mapping to other NVEs which | |||
attached VMs communicating with the VM being moved. All | have the attached VMs communicating with the VM being moved. | |||
relevant NVEs will learn the new mapping via their | ||||
corresponding NVA. | ||||
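The old NVE's handling of packets for a VM that has moved away, and the push of the new NVE-VM mapping, can be sketched as follows. The sketch follows the behavior in the -09 text (query the NVA, drop if there is no answer). The names Nve, forward, detach, and push_mapping are hypothetical, and the NVA is reduced to a lookup callable that returns the NVE address for a VM IP address or None.

```python
from typing import Callable, Dict, Iterable, Optional, Set

# Hypothetical NVA query: VM IP -> IP of the NVE the VM is attached to, or None.
NvaLookup = Callable[[str], Optional[str]]


class Nve:
    def __init__(self, ip: str, nva_lookup: NvaLookup) -> None:
        self.ip = ip
        self.local_vms: Set[str] = set()    # IPs of VMs attached to this NVE
        self.remote: Dict[str, str] = {}    # cached VM IP -> remote NVE IP
        self._nva_lookup = nva_lookup

    def forward(self, dst_vm_ip: str) -> str:
        """Return the action taken for a packet destined to dst_vm_ip."""
        if dst_vm_ip in self.local_vms:
            return "deliver"
        nve_ip = self.remote.get(dst_vm_ip)
        if nve_ip is None:
            nve_ip = self._nva_lookup(dst_vm_ip)    # cache miss: query the NVA
            if nve_ip is None or nve_ip == self.ip:
                return "drop"                       # no usable answer from the NVA
            self.remote[dst_vm_ip] = nve_ip
        return f"tunnel:{nve_ip}"

    def detach(self, vm_ip: str) -> None:
        """Suspend use of the address on this NVE after the VM moves away."""
        self.local_vms.discard(vm_ip)
        self.remote.pop(vm_ip, None)


def push_mapping(nves: Iterable[Nve], vm_ip: str, new_nve_ip: str) -> None:
    """Install the new NVE-VM mapping on the given NVEs.

    The caller is assumed to pass only the NVEs that have attached VMs
    communicating with the VM being moved.
    """
    for nve in nves:
        if nve.ip != new_nve_ip:
            nve.remote[vm_ip] = new_nve_ip
```

A peer NVE that has not yet received the pushed mapping keeps tunneling to the old NVE, which is why the old NVE must be able to resolve the new location itself.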
Connection migration involves reestablishing existing TCP | Connection management for the VM being moved involves | |||
connections of the task in the new place. | reestablishing existing TCP connections in the new place. | |||
The simplest course of action is to drop all TCP connections to | The simplest course of action is to drop all TCP connections to | |||
the VM across a migration. If the migrations are relatively rare | the VM across a migration. If the migrations are relatively rare | |||
events in a data center, impact is relatively small when TCP | events in a data center, impact is relatively small when TCP | |||
connections are automatically closed in the network stack during a | connections are automatically closed in the network stack during a | |||
migration event. If the applications running are known to handle | migration event. If the applications running are known to handle | |||
this gracefully (i.e. reopen dropped connections) then this | this gracefully (i.e. reopen dropped connections) then this | |||
approach may be viable. | approach may be viable. | |||
A more involved approach to connection migration entails pausing the | A more involved approach to connection migration entails pausing the | |||
skipping to change at page 11, line 6 ¶ | skipping to change at page 11, line 13 ¶ | |||
NVE-VM cache entries updated. | NVE-VM cache entries updated. | |||
6. Moving Local State of VM | 6. Moving Local State of VM | |||
In addition to the VM mobility related signaling (VM Mobility | In addition to the VM mobility related signaling (VM Mobility | |||
Registration Request/Reply), the VM state needs to be transferred | Registration Request/Reply), the VM state needs to be transferred | |||
to the New NVE. The state includes its memory and file system if | to the New NVE. The state includes its memory and file system if | |||
the VM cannot access the memory and the file system after moving | the VM cannot access the memory and the file system after moving | |||
to the New NVE. | to the New NVE. | |||
The mechanism of transferring VM States and file system is out of | The mechanism of transferring VM States and file system is out of | |||
the scope of this document. | the scope of this document. Refer to RFC7666 for detailed | |||
information. | ||||
7. Handling of Hot, Warm and Cold VM Mobility | 7. Handling of Hot, Warm and Cold VM Mobility | |||
Both Cold and Warm VM mobility (or migration) refer to the VM | Both Cold and Warm VM mobility (or migration) refer to the VM | |||
being completely shut down at the Old NVE before being restarted at the | being completely shut down at the Old NVE before being restarted at the | |||
New NVE. Therefore, all transport services to the VM are | New NVE. Therefore, all transport services to the VM are | |||
restarted. | restarted. | |||
Upon starting at the New NVE, the VM should send an ARP or | Upon starting at the New NVE, the VM should send an ARP or | |||
Neighbor Discovery message. Cold VM mobility also allows the Old | Neighbor Discovery message. Cold VM mobility also allows the Old | |||
NVE and all communicating NVEs to time out ARP/neighbor cache | NVE and all communicating NVEs to time out ARP/neighbor cache | |||
entries of the VM. It is necessary for the NVA to push the | entries of the VM. It is necessary for the NVA to push the | |||
updated ARP/neighbor cache entry to NVEs or for NVEs to pull the | updated ARP/neighbor cache entry to NVEs or for NVEs to pull the | |||
updated ARP/neighbor cache entry from NVA. | updated ARP/neighbor cache entry from NVA. | |||
The Cold VM mobility can be facilitated by cold standby entity | The Cold VM mobility can be facilitated by VM management system to | |||
receiving scheduled backup information. The cold standby entity | exchange the needed states between the Old NVE and the New NVE. | |||
can be a VM or can be other form factors which is beyond the scope | The cold mobility option can be used for non-critical applications | |||
of this document. The cold mobility option can be used for non- | and services that can tolerate interrupted TCP connections. | |||
critical applications and services that can tolerate interrupted | ||||
TCP connections. | ||||
The Warm VM mobility refers the backup entities receive backup | The Warm VM mobility refers to having the backup entities receive | |||
information at more frequent intervals. The duration of the | backup information at more frequent intervals, so that it can take | |||
less time to launch the VM under the new NVE. The duration of the | ||||
interval determines the effectiveness (or benefit) of Warm VM | interval determines the effectiveness (or benefit) of Warm VM | |||
mobility. The larger the duration, the less effective the Warm VM | mobility. The larger the duration, the less effective the Warm VM | |||
mobility option becomes. | mobility option becomes. | |||
For Hot VM Mobility, once a VM moves to a New NVE, the VM IP | For Hot VM Mobility, once a VM moves to a New NVE, the VM IP | |||
address does not change and the VM should be able to continue to | address does not change and the VM should be able to continue to | |||
receive packets to its address(es). The VM needs to send a | receive packets to its address(es). The VM needs to send a | |||
gratuitous Address Resolution message or unsolicited Neighbor | gratuitous Address Resolution message or unsolicited Neighbor | |||
Advertisement message upstream after each move. | Advertisement message upstream after each move. | |||
8. Other VM Mobility Options | 8. Other Options | |||
There is also a Hot Standby option in addition to the Hot | There is also a Hot Standby option in addition to the Hot | |||
Mobility, where there are VMs in both primary and secondary NVEs. | Mobility, where there are VMs in both primary and secondary NVEs. | |||
They have identical information and can provide services | They have identical information and can provide services | |||
simultaneously as in load-share mode of operation. If the VM in | simultaneously as in load-share mode of operation. If the VM in | |||
the primary NVE fails, there is no need to actively move the VM to | the primary NVE fails, there is no need to actively move the VM to | |||
the secondary NVE because the VM in the secondary NVE already | the secondary NVE because the VM in the secondary NVE already | |||
contains identical information. The Hot Standby option is the | contains identical information. The Hot Standby option is the | |||
costliest mechanism, and hence this option is utilized only for | costliest mechanism, and hence this option is utilized only for | |||
mission-critical applications and services. In Hot Standby | mission-critical applications and services. In Hot Standby | |||
option, regarding TCP connections, one option is to start with and | option, regarding TCP connections, one option is to start with and | |||
skipping to change at page 12, line 22 ¶ | skipping to change at page 12, line 33 ¶ | |||
9. VM Lifecycle Management | 9. VM Lifecycle Management | |||
The VM lifecycle management is a complicated task, which is beyond | The VM lifecycle management is a complicated task, which is beyond | |||
the scope of this document. Not only does it involve monitoring server | the scope of this document. Not only does it involve monitoring server | |||
utilization, balanced distribution of workload, etc., but it also | utilization, balanced distribution of workload, etc., but it also | |||
needs to manage seamless VM migration from one server to | needs to manage seamless VM migration from one server to | |||
another. | another. | |||
10. Security Considerations | 10. Security Considerations | |||
Security threats for the data and control plane for overlay | Security threats for the data and control plane for overlay | |||
networks are discussed in [RFC8014]. There are several issues in | networks are discussed in [RFC8014]. ARP (IPv4) and ND (IPv6) are | |||
a multi-tenant environment that create problems. In Layer-2 based | not secure, especially if we accept gratuitous versions in a multi- | |||
overlay data center networks, lack of security in VXLAN, | tenant environment. | |||
corruption of VNI can lead to delivery to wrong tenant. Also, ARP | ||||
in IPv4 and ND in IPv6 are not secure, especially if we accept | ||||
gratuitous versions. When these are done over a UDP | ||||
encapsulation, like VXLAN, the problem is worse since it is | ||||
trivial for a non-trusted entity to spoof UDP packets. | ||||
In Layer-3 based overlay data center networks, the problem of | In Layer-3 based overlay data center networks, the problem of | |||
address spoofing may arise. An NVE may have untrusted tasks | address spoofing may arise. An NVE may have untrusted VMs | |||
attached. This usually happens in cases like the VMs (tasks) | attached. This usually happens in cases like the VMs running third | |||
running third party applications. This requires the usage of | party applications. Those untrusted VMs can send falsified ARP | |||
stronger security mechanisms. | (IPv4) and ND (IPv6) messages, causing the NVE, NVO3 Gateway, and NVA | |||
to be overwhelmed and unable to perform legitimate functions. | ||||
An attacker can intercept, modify, or even drop in-transit | ||||
ARP/ND messages intended for other VNs and initiate DDoS attacks | ||||
against other VMs attached to the same NVE. | ||||
This requires the VM management system to apply stronger security | ||||
mechanisms when adding a VM to an NVE. The VM management system is out | ||||
of scope of this document. | ||||
11. IANA Considerations | 11. IANA Considerations | |||
This document makes no request to IANA. | This document makes no request to IANA. | |||
12. Acknowledgments | 12. Acknowledgments | |||
The authors are grateful to Bob Briscoe, David Black, Dave R. | The authors are grateful to Bob Briscoe, David Black, Dave R. | |||
Worley, Qiang Zu, and Andrew Malis for helpful comments. | Worley, Qiang Zu, and Andrew Malis for helpful comments. | |||
skipping to change at page 14, line 14 ¶ | skipping to change at page 14, line 29 ¶ | |||
[RFC2629] Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629, | [RFC2629] Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629, | |||
DOI 10.17487/RFC2629, June 1999, <https://www.rfc- | DOI 10.17487/RFC2629, June 1999, <https://www.rfc- | |||
editor.org/info/rfc2629>. | editor.org/info/rfc2629>. | |||
[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, | [RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, | |||
"Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, | "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, | |||
DOI 10.17487/RFC4861, September 2007, <https://www.rfc- | DOI 10.17487/RFC4861, September 2007, <https://www.rfc- | |||
editor.org/info/rfc4861>. | editor.org/info/rfc4861>. | |||
[RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., | [RFC7067] L. Dunbar, D. Eastlake, R. Perlman, I. Gashinsky, | |||
Kreeger, L., Sridhar, T., Bursell, M., and C. Wright, | "Directory Assistance Problem and High Level Design | |||
"Virtual eXtensible Local Area Network (VXLAN): A | Proposal", RFC7067, Nov. 2013 | |||
Framework for Overlaying Virtualized Layer 2 Networks over | ||||
Layer 3 Networks", RFC 7348, DOI 10.17487/RFC7348, August | [RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, | |||
2014, <https://www.rfc-editor.org/info/rfc7348>. | L., Sridhar, T., Bursell, M., and C. Wright, "Virtual | |||
eXtensible Local Area Network (VXLAN): A Framework for | ||||
Overlaying Virtualized Layer 2 Networks over Layer 3 | ||||
Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014, | ||||
<https://www.rfc-editor.org/info/rfc7348>. | ||||
[RFC7364] Narten, T., Ed., Gray, E., Ed., Black, D., Fang, L., | [RFC7364] Narten, T., Ed., Gray, E., Ed., Black, D., Fang, L., | |||
Kreeger, L., and M. Napierala, "Problem Statement: | Kreeger, L., and M. Napierala, "Problem Statement: | |||
Overlays for Network Virtualization", RFC 7364, DOI | Overlays for Network Virtualization", RFC 7364, DOI | |||
10.17487/RFC7364, October 2014, <https://www.rfc- | 10.17487/RFC7364, October 2014, <https://www.rfc- | |||
editor.org/info/rfc7364>. | editor.org/info/rfc7364>. | |||
[RFC7666] H. Asai, et al, "Management Information Base for Virtual | [RFC7666] H. Asai, et al, "Management Information Base for Virtual | |||
Machines Controlled by a Hypervisor", RFC7666, Oct 2015. | Machines Controlled by a Hypervisor", RFC7666, Oct 2015. | |||
[RFC8014] Black, D., Hudson, J., Kreeger, L., Lasserre, M., and T. | [RFC8014] Black, D., Hudson, J., Kreeger, L., Lasserre, M., and T. | |||
Narten, "An Architecture for Data-Center Network | Narten, "An Architecture for Data-Center Network | |||
Virtualization over Layer 3 (NVO3)", RFC 8014, DOI | Virtualization over Layer 3 (NVO3)", RFC 8014, DOI | |||
10.17487/RFC8014, December 2016, <https://www.rfc- | 10.17487/RFC8014, December 2016, <https://www.rfc- | |||
editor.org/info/rfc8014>. | editor.org/info/rfc8014>. | |||
[RFC8171] D. Eastlake, L. Dunbar, R. Perlman, Y. Li, "Edge Directory | ||||
Assistance Mechanisms", RFC 8171, June 2017 | ||||
14.2. Informative References | 14.2. Informative References | |||
[I-D.herbert-nvo3-ila] Herbert, T. and P. Lapukhov, "Identifier- | [I-D.herbert-nvo3-ila] Herbert, T. and P. Lapukhov, "Identifier- | |||
locator addressing for IPv6", draft-herbert-nvo3-ila-04 | locator addressing for IPv6", draft-herbert-nvo3-ila-04 | |||
(work in progress), March 2017. | (work in progress), March 2017. | |||
Authors' Addresses | Authors' Addresses | |||
Linda Dunbar | Linda Dunbar | |||
Futurewei | Futurewei | |||
End of changes. 32 change blocks. | ||||
111 lines changed or deleted | 127 lines changed or added | |||