draft-ietf-nvo3-vmm-10.txt   draft-ietf-nvo3-vmm-11.txt 
Network Working Group L. Dunbar Network Working Group L. Dunbar
Internet Draft Futurewei Internet Draft Futurewei
Intended status: Informational B. Sarikaya Intended status: Informational B. Sarikaya
Expires: September 27, 2020 Denpel Informatique Expires: September 30, 2020 Denpel Informatique
B.Khasnabish B.Khasnabish
Independent Independent
T. Herbert T. Herbert
Intel Intel
S. Dikshit S. Dikshit
Aruba-HPE Aruba-HPE
March 27, 2020 March 30, 2020
Virtual Machine Mobility Solutions for L2 and L3 Overlay Networks Virtual Machine Mobility Solutions for L2 and L3 Overlay Networks
draft-ietf-nvo3-vmm-10 draft-ietf-nvo3-vmm-11
Abstract Abstract
This document describes virtual machine mobility solutions commonly This document describes virtual machine mobility solutions commonly
used in data centers built with overlay-based network. This document used in data centers built with overlay-based network. This document
is intended for describing the solutions and the impact of moving is intended for describing the solutions and the impact of moving
VMs (or applications) from one Rack to another connected by the VMs (or applications) from one Rack to another connected by the
Overlay networks. Overlay networks.
For layer 2, it is based on using an NVA (Network Virtualization For layer 2, it is based on using an NVA (Network Virtualization
skipping to change at page 2, line 21 skipping to change at page 2, line 21
months and may be updated, replaced, or obsoleted by other documents months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress." reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html http://www.ietf.org/shadow.html
This Internet-Draft will expire on September 26, 2020. This Internet-Draft will expire on September 27, 2020.
Copyright Notice Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 8 skipping to change at page 3, line 8
1. Introduction...................................................3 1. Introduction...................................................3
2. Conventions used in this document..............................4 2. Conventions used in this document..............................4
3. Requirements...................................................5 3. Requirements...................................................5
4. Overview of the VM Mobility Solutions..........................6 4. Overview of the VM Mobility Solutions..........................6
4.1. Inter-VNs communication...................................6 4.1. Inter-VNs communication...................................6
4.2. VM Migration in Layer 2 Network...........................6 4.2. VM Migration in Layer 2 Network...........................6
4.3. VM Migration in Layer-3 Network...........................8 4.3. VM Migration in Layer-3 Network...........................8
4.4. Address and Connection Management in VM Migration.........9 4.4. Address and Connection Management in VM Migration.........9
5. Handling Packets in Flight....................................10 5. Handling Packets in Flight....................................10
6. Moving Local State of VM......................................10 6. Moving Local State of VM......................................11
7. Handling of Hot, Warm and Cold VM Mobility....................11 7. Handling of Hot, Warm and Cold VM Mobility....................11
8. Other Options.................................................11 8. Other Options.................................................12
9. VM Lifecycle Management.......................................12 9. VM Lifecycle Management.......................................13
10. Security Considerations......................................12 10. Security Considerations......................................13
11. IANA Considerations..........................................13 11. IANA Considerations..........................................13
12. Acknowledgments..............................................13 12. Acknowledgments..............................................14
13. Change Log...................................................13 13. Change Log...................................................14
14. References...................................................13 14. References...................................................14
14.1. Normative References....................................14 14.1. Normative References....................................14
14.2. Informative References..................................15 14.2. Informative References..................................16
1. Introduction 1. Introduction
This document describes the overlay-based data center networks This document describes the overlay-based data center networks
solutions in supporting multitenancy and VM (Virtual Machine) solutions in supporting multitenancy and VM (Virtual Machine)
mobility. Being able to move VMs dynamically, from one server to mobility. Being able to move VMs dynamically, from one server to
another, makes it possible for dynamic load balancing or work another, makes it possible for dynamic load balancing or work
distribution. Therefore, dynamic VM Mobility is highly desirable distribution. Therefore, dynamic VM Mobility is highly desirable
for large scale multi-tenant DCs. for large scale multi-tenant DCs.
This document is strictly within the DCVPN, as defined by the NVO3 This document is strictly within the DCVPN, as defined by the NVO3
Framework [RFC 7365]. The intent is to describe Layer 2 and Layer Framework [RFC 7365]. The intent is to describe Layer 2 and Layer
skipping to change at page 6, line 27 skipping to change at page 6, line 27
Inter VNs (Virtual Networks) communication refers to communication Inter VNs (Virtual Networks) communication refers to communication
among tenants (or hosts) belonging to different VNs. Those tenants among tenants (or hosts) belonging to different VNs. Those tenants
can be attached to the NVEs co-located in the same Data Center or can be attached to the NVEs co-located in the same Data Center or
in different Data centers. This document assumes that the inter- in different Data centers. This document assumes that the inter-
VNs communication is via the NVO3 Gateway as described in RFC8014 VNs communication is via the NVO3 Gateway as described in RFC8014
(NVO3 Architecture). RFC 8014 (Section 5.3) describes the NVO3 (NVO3 Architecture). RFC 8014 (Section 5.3) describes the NVO3
Gateway function which is to relay traffic onto and off of a Gateway function which is to relay traffic onto and off of a
virtual network, i.e. among different VNs. virtual network, i.e. among different VNs.
draft-ietf-nvo3-vmm-11 (paragraph added in this version):
When a VM communicates with an external entity, the VM is
effectively communicating with a peer in a different network or a
globally reachable host. Communicating with hosts in other VNs
and external hosts are all through the NVO3 Gateway. There are
different policies on the NVo3 Gateway to govern the communication
among VNs and with external hosts.
After a VM is moved to a new NVE, the VM's corresponding Gateway After a VM is moved to a new NVE, the VM's corresponding Gateway
may need to change as well. If such a change is not possible, then may need to change as well. If such a change is not possible, then
the path to the external entity need to be hair-pinned to the NVO3 the path to the external entity need to be hair-pinned to the NVO3
Gateway used prior to the VM move. Gateway used prior to the VM move.
4.2. VM Migration in Layer 2 Network 4.2. VM Migration in Layer 2 Network
In a Layer-2 based approach, VM moving to another NVE does not In a Layer-2 based approach, VM moving to another NVE does not
change its IP address. But this VM is now under a new NVE, change its IP address. But this VM is now under a new NVE,
previously communicating NVEs may continue sending their packets previously communicating NVEs may continue sending their packets
skipping to change at page 11, line 25 skipping to change at page 11, line 32
restarted. restarted.
In this document, all VM mobility is initiated by VM Management In this document, all VM mobility is initiated by VM Management
System. The Cold VM mobility only exchange the needed states System. The Cold VM mobility only exchange the needed states
between the Old NVE and the New NVE after the VM attached to the between the Old NVE and the New NVE after the VM attached to the
Old NVE is completely shut down. There is time delay before the Old NVE is completely shut down. There is time delay before the
new VM is launched. The cold mobility option can be used for non- new VM is launched. The cold mobility option can be used for non-
critical applications and services that can tolerate interrupted critical applications and services that can tolerate interrupted
TCP connections. TCP connections.
draft-ietf-nvo3-vmm-10:
The Warm VM mobility refers to having the backup entities receive
backup information at more frequent intervals, so that it can take
less time to launch the VM under the new NVE and other NVEs that
communicate with the VM can be notified prior to the VM move. The
duration of the interval determines the effectiveness (or benefit)
of Warm VM mobility. The larger the duration, the less effective
the Warm VM mobility option becomes.

draft-ietf-nvo3-vmm-11:
The Warm VM mobility refers to having the functional components
under the new NVE to receive running status of the VM at frequent
intervals, so that it can take less time to launch the VM under
the new NVE and other NVEs that communicate with the VM can be
notified of the VM move more promptly. The duration of the
interval determines the effectiveness (or benefit) of Warm VM
mobility. The larger the duration, the less effective the Warm VM
mobility option becomes.
For Hot VM Mobility, once a VM moves to a New NVE, the VM IP For Hot VM Mobility, once a VM moves to a New NVE, the VM IP
address does not change and the VM should be able to continue to address does not change and the VM should be able to continue to
receive packets to its address(es). The VM needs to send a receive packets to its address(es). The VM needs to send a
gratuitous Address Resolution message or unsolicited Neighbor gratuitous Address Resolution message or unsolicited Neighbor
Advertisement message upstream after each move. Advertisement message upstream after each move.
Upon starting at the New NVE, the VM should send an ARP or Upon starting at the New NVE, the VM should send an ARP or
Neighbor Discovery message. Cold VM mobility also allows the Old Neighbor Discovery message. Cold VM mobility also allows the Old
NVE and all communicating NVEs to time out ARP/neighbor cache NVE and all communicating NVEs to time out ARP/neighbor cache
entries of the VM. It is necessary for the NVA to push the entries of the VM. It is necessary for the NVA to push the
updated ARP/neighbor cache entry to NVEs or for NVEs to pull the updated ARP/neighbor cache entry to NVEs or for NVEs to pull the
updated ARP/neighbor cache entry from NVA. updated ARP/neighbor cache entry from NVA.
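The gratuitous Address Resolution message mentioned for Hot VM Mobility can be illustrated for the IPv4 case. The following is a minimal sketch, not taken from either draft version: the function name `build_gratuitous_arp` and the example addresses are assumptions, and the frame layout follows the common convention of an ARP request whose sender and target IP addresses are both the VM's own address.

```python
import socket
import struct


def build_gratuitous_arp(mac: bytes, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for `ip`.

    Hypothetical helper for illustration only; it builds the bytes and
    does not send anything on the wire.
    """
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    broadcast = b"\xff" * 6
    ip_bytes = socket.inet_aton(ip)
    # Ethernet header: destination, source, EtherType 0x0806 (ARP)
    eth = broadcast + mac + struct.pack("!H", 0x0806)
    # ARP fixed part: htype=1 (Ethernet), ptype=0x0800 (IPv4),
    # hlen=6, plen=4, oper=1 (request)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # Sender MAC/IP are the VM's own; target MAC is all zeros and the
    # target IP repeats the sender IP, which makes the request gratuitous.
    arp += mac + ip_bytes + b"\x00" * 6 + ip_bytes
    return eth + arp


frame = build_gratuitous_arp(bytes.fromhex("020000000001"), "192.0.2.10")
```

In this sketch the sender and target protocol address fields carry the same value, which is what lets upstream devices refresh their cache entry for the moved VM. The IPv6 equivalent, an unsolicited Neighbor Advertisement, is not shown.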
8. Other Options 8. Other Options
draft-ietf-nvo3-vmm-10:
VM Hot mobility is to enable uninterrupted running of the
application or workload instantiated on the VM when the VM running
conditions changes, such as utilization overload, hardware running
condition changes, or others.
There is also a Hot Standby option to prevent unexpected failure
conditions, where there are VMs in both primary and secondary
NVEs. They have identical information and can provide services
simultaneously as in load-share mode of operation. If the VM in
the primary NVE fails, there is no need to actively move the VM to
the secondary NVE because the VM in the secondary NVE already
contain identical information. The Hot Standby option is the
costliest mechanism, and hence this option is utilized only for
mission-critical applications and services. In Hot Standby
option, regarding TCP connections, one option is to start with and
maintain TCP connections to two different VMs at the same time.
The least loaded VM responds first and pickup providing service
while the sender (origin) still continues to receive Ack from the
heavily loaded (secondary) VM and chooses not to use the service
of the secondary responding VM. If the situation (loading
condition of the primary responding VM) changes the secondary
responding VM may start providing service to the sender (origin).

draft-ietf-nvo3-vmm-11:
VM Hot mobility is to enable uninterrupted running of the
application or workload instantiated on the VM when the VM running
conditions changes, such as utilization overload, hardware running
condition changes, or others. Hot, Warm and Cold mobility are
planned activities which are managed by VM management system.
For unexpected events, such as unexpected failure, a VM might need
to move to a new NVE, which is called Hot VM Failover in this
document. For Hot VM Failover, there are VMs in both primary and
secondary NVEs. They can provide services simultaneously as in
load-share mode of operation. If the VM in the primary NVE fails,
there is no need to actively move the VM to the secondary NVE
because the VM in the secondary NVE can immediately pick up the
processing. It is out of the scope of this document on how and
what information are exchange between the two VMs under two
different NVE.
The VM Failover to the new NVE is transparent to the peers that
communicate with this VM. This can be achieved by both active VM
and standby VM share the same TCP port and same IP address. There
must be a load balancer that can distribute the packets to the VM
under the new NVE. The new VM can pick up providing service while
the sender (peer) still continues to receive Ack from the old VM
and chooses not to use the service of the secondary responding VM.
If the situation (loading condition of the primary responding VM)
changes the secondary responding VM may start providing service to
the sender (peers).
If TCP states are not properly synchronized among the two VMs, the
VM under the New NVE after failover can force the peers to re-
establish a new TCP connection by stopping the previous TCP
connection. As most TCP connections are short lived, re-
establishing a new one is not a big problem.
The Hot VM Failover option is the costliest mechanism, and hence
this option is utilized only for mission-critical applications and
services.
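The load balancer role described for Hot VM Failover can be sketched as a simple selection rule. This is an illustration under stated assumptions, not a mechanism defined by the draft: the class names `VmInstance` and `FailoverBalancer`, the `healthy` flag, and the NVE labels are all hypothetical, and how health is detected is out of scope here.

```python
from dataclasses import dataclass


@dataclass
class VmInstance:
    """One of the two VMs sharing the same IP address and TCP port."""
    name: str             # hypothetical label for the VM
    nve: str              # hypothetical label for the NVE it is attached to
    healthy: bool = True  # assumed to be set by some external health check


class FailoverBalancer:
    """Hypothetical load balancer in front of the shared address."""

    def __init__(self, primary: VmInstance, secondary: VmInstance) -> None:
        self.primary = primary
        self.secondary = secondary

    def pick(self) -> VmInstance:
        # Prefer the primary VM while it is healthy; otherwise hand the
        # packets to the VM under the other NVE without any active move.
        if self.primary.healthy:
            return self.primary
        if self.secondary.healthy:
            return self.secondary
        raise RuntimeError("no healthy VM behind the shared address")


primary = VmInstance("vm-a", "nve-1")
secondary = VmInstance("vm-b", "nve-2")
balancer = FailoverBalancer(primary, secondary)
before = balancer.pick().nve
primary.healthy = False
after = balancer.pick().nve
```

Peers keep using the same address and port throughout; only the balancer's choice changes. If TCP state is not synchronized between the two VMs, the peers would still need to re-establish their connections, as the text notes.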
9. VM Lifecycle Management 9. VM Lifecycle Management
The VM lifecycle management is a complicated task, which is beyond The VM lifecycle management is a complicated task, which is beyond
the scope of this document. Not only it involves monitoring server the scope of this document. Not only it involves monitoring server
utilization, balanced distribution of workload, etc., but also utilization, balanced distribution of workload, etc., but also
needs to manage seamlessly VM migration from one server to needs to manage seamlessly VM migration from one server to
another. another.
10. Security Considerations 10. Security Considerations
Security threats for the data and control plane for overlay Security threats for the data and control plane for overlay
networks are discussed in [RFC8014]. ARP (IPv4) and ND (IPv6) are networks are discussed in [RFC8014]. ARP (IPv4) and ND (IPv6) are
not secure, especially if we accept gratuitous versions in multi- not secure, especially if we accept gratuitous versions in multi-
tenant environment. tenant environment.
draft-ietf-nvo3-vmm-10:
In Layer-3 based overlay data center networks, the problem of
address spoofing may arise. An NVE may have untrusted VMs
attached. This usually happens in cases like the VMs running third
party applications. Those untrusted VMs can send falsified ARP
(IPv4) and ND (IPv6) messages, causing NVE, NVO3 Gateway, and NVA
to be overwhelmed and not able to perform legitimate functions.
The attacker can intercept, modify, or even stop data in-transit
ARP/ND messages intended for other VNs and initiate DDOS attacks
to other VMs attached to the same NVE.

draft-ietf-nvo3-vmm-11:
In Layer-3 based overlay data center networks, ARP and ND messages
can be used to mount address spoofing attacks. An NVE may have
untrusted VMs attached. This usually happens in cases like the VMs
running third party applications. Those untrusted VMs can send
falsified ARP (IPv4) and ND (IPv6) messages, causing NVE, NVO3
Gateway, and NVA to be overwhelmed and not able to perform
legitimate functions. The attacker can intercept, modify, or even
stop data in-transit ARP/ND messages intended for other VNs and
initiate DDOS attacks to other VMs attached to the same NVE. A
simple black-hole attacks can be mounted by sending a falsified
ARP/ND message to indicate that the victim's IP address has moved
to the attacker's VM. That technique can also be used to mount
man-in-the-middle attacks with some more effort to ensure that the
intercepted traffic is eventually delivered to the victim.
The locator-identifier mechanism given as an example (ILA) doesn't The locator-identifier mechanism given as an example (ILA) doesn't
include secure binding. It doesn't discuss how to securely bind include secure binding. It doesn't discuss how to securely bind
the new locator to the identifier. the new locator to the identifier.
draft-ietf-nvo3-vmm-10:
This requires VM management system to apply stronger security
mechanisms when add a VM to an NVE. VM Management system is out of
scope of this document.

draft-ietf-nvo3-vmm-11:
Because of those threats, VM management system needs to apply
stronger security mechanisms when add a VM to an NVE. Some tenants
may have requirement that prohibit their VMs to be co-attached to
the NVEs with other tenants. Some Data Centers have their NVO3
Gateways to be equipped with capability to mitigate ARP/ND
threats, such as periodically exchanging its ARP/ND cache with
NVA's central control system.
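The mitigation of comparing a gateway's ARP/ND cache with the NVA's view can be sketched as a consistency check. This is only an illustration under assumptions: the function name `find_conflicting_entries`, the dictionary layout (IP address mapped to MAC address), and the example addresses are hypothetical, and the draft does not specify how the exchange or the comparison is performed.

```python
def find_conflicting_entries(gateway_cache, nva_bindings):
    """Return (ip, cached_mac, expected_mac) for entries that disagree.

    gateway_cache: IP -> MAC as learned from ARP/ND messages (untrusted).
    nva_bindings:  IP -> MAC as recorded by the NVA (assumed authoritative).
    Both layouts are assumptions made for this sketch.
    """
    conflicts = []
    for ip, cached_mac in sorted(gateway_cache.items()):
        expected_mac = nva_bindings.get(ip)
        # Entries unknown to the NVA are skipped here; a stricter policy
        # could flag them as well.
        if expected_mac is not None and expected_mac.lower() != cached_mac.lower():
            conflicts.append((ip, cached_mac, expected_mac))
    return conflicts


gateway_cache = {
    "192.0.2.10": "02:00:00:00:00:99",  # claimed by a falsified ARP message
    "192.0.2.11": "02:00:00:00:00:0B",
}
nva_bindings = {
    "192.0.2.10": "02:00:00:00:00:0a",
    "192.0.2.11": "02:00:00:00:00:0b",
}
conflicts = find_conflicting_entries(gateway_cache, nva_bindings)
```

A conflicting entry such as the one for 192.0.2.10 corresponds to the black-hole case described above, where a falsified message claims that the victim's IP address has moved to the attacker's VM.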
11. IANA Considerations 11. IANA Considerations
This document makes no request to IANA. This document makes no request to IANA.
12. Acknowledgments 12. Acknowledgments
The authors are grateful to Bob Briscoe, David Black, Dave R. The authors are grateful to Bob Briscoe, David Black, Dave R.
Worley, Qiang Zu, Andrew Malis for helpful comments. Worley, Qiang Zu, Andrew Malis for helpful comments.
 End of changes. 14 change blocks. 
49 lines changed or deleted 81 lines changed or added
