draft-ietf-nvo3-vmm-03.txt   draft-ietf-nvo3-vmm-04.txt 
Network Working Group B. Sarikaya Network Working Group B. Sarikaya
Internet-Draft Independent Internet-Draft Denpel Informatique
Intended status: Best Current Practice L. Dunbar Intended status: Best Current Practice L. Dunbar
Expires: November 3, 2018 Huawei USA Expires: February 10, 2019 Huawei USA
B. Khasnabish B. Khasnabish
ZTE (TX) Inc. ZTE (TX) Inc.
T. Herbert T. Herbert
Quantonium Quantonium
S. Dikshit S. Dikshit
Cisco Systems Cisco Systems
May 25, 2018 August 9, 2018
Virtual Machine Mobility Protocol for L2 and L3 Overlay Networks Virtual Machine Mobility Protocol for L2 and L3 Overlay Networks
draft-ietf-nvo3-vmm-03.txt draft-ietf-nvo3-vmm-04.txt
Abstract Abstract
This document describes a virtual machine mobility protocol commonly This document describes a virtual machine mobility protocol commonly
used in data centers built with overlay-based network virtualization used in data centers built with overlay-based network virtualization
approach. For layer 2, it is based on using a Network Virtualization approach. For layer 2, it is based on using a Network Virtualization
Authority (NVA)-Network Virtualization Edge (NVE) protocol to update Authority (NVA)-Network Virtualization Edge (NVE) protocol to update
Address Resolution Protocol (ARP) table or neighbor cache entries at Address Resolution Protocol (ARP) table or neighbor cache entries at
the NVA and the source NVEs tunneling in-flight packets to the the NVA and the source NVEs tunneling in-flight packets to the
destination NVE after the virtual machine moves from source NVE to destination NVE after the virtual machine moves from source NVE to
skipping to change at page 1, line 45 skipping to change at page 1, line 45
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 3, 2018. This Internet-Draft will expire on February 10, 2019.
Copyright Notice Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 26 skipping to change at page 2, line 26
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Conventions and Terminology . . . . . . . . . . . . . . . . . 3 2. Conventions and Terminology . . . . . . . . . . . . . . . . . 3
3. Requirements . . . . . . . . . . . . . . . . . . . . . . . . 4 3. Requirements . . . . . . . . . . . . . . . . . . . . . . . . 4
4. Overview of the protocol . . . . . . . . . . . . . . . . . . 4 4. Overview of the protocol . . . . . . . . . . . . . . . . . . 4
4.1. VM Migration . . . . . . . . . . . . . . . . . . . . . . 4 4.1. VM Migration . . . . . . . . . . . . . . . . . . . . . . 5
4.2. Task Migration . . . . . . . . . . . . . . . . . . . . . 6 4.2. Task Migration . . . . . . . . . . . . . . . . . . . . . 6
4.2.1. Address and Connection Migration in Task Migration . 7 4.2.1. Address and Connection Migration in Task Migration . 7
5. Handling Packets in Flight . . . . . . . . . . . . . . . . . 8 5. Handling Packets in Flight . . . . . . . . . . . . . . . . . 8
6. Moving Local State of VM . . . . . . . . . . . . . . . . . . 8 6. Moving Local State of VM . . . . . . . . . . . . . . . . . . 9
7. Handling of Hot, Warm and Cold Virtual Machine Mobility . . . 9 7. Handling of Hot, Warm and Cold Virtual Machine Mobility . . . 9
8. Virtual Machine Operation . . . . . . . . . . . . . . . . . . 9 8. Virtual Machine Operation . . . . . . . . . . . . . . . . . . 10
8.1. Virtual Machine Lifecycle Management . . . . . . . . . . 10 8.1. Virtual Machine Lifecycle Management . . . . . . . . . . 10
9. Security Considerations . . . . . . . . . . . . . . . . . . . 10 9. Security Considerations . . . . . . . . . . . . . . . . . . . 10
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11
11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 10 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 11
12. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . 11 12. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . 11
13. References . . . . . . . . . . . . . . . . . . . . . . . . . 11 13. References . . . . . . . . . . . . . . . . . . . . . . . . . 11
13.1. Normative References . . . . . . . . . . . . . . . . . . 11 13.1. Normative References . . . . . . . . . . . . . . . . . . 11
13.2. Informative references . . . . . . . . . . . . . . . . . 12 13.2. Informative references . . . . . . . . . . . . . . . . . 12
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 12 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 12
1. Introduction 1. Introduction
Data center networks are being increasingly used by telecom operators Data center networks are being increasingly used by telecom operators
as well as by enterprises. In this document we are interested in as well as by enterprises. In this document we are interested in
skipping to change at page 4, line 23 skipping to change at page 4, line 23
Source NVE refers to the old NVE where packets were forwarded to Source NVE refers to the old NVE where packets were forwarded to
before migration. before migration.
Destination NVE refers to the new NVE after migration. Destination NVE refers to the new NVE after migration.
Packets in flight refers to the packets received by the source NVE Packets in flight refers to the packets received by the source NVE
sent by the correspondents that have old ARP or neighbor cache entry sent by the correspondents that have old ARP or neighbor cache entry
before VM or task migration. before VM or task migration.
Users of VMs in diskless systems or systems not using configuration
files are called end user clients.
3. Requirements 3. Requirements
This section states requirements on data center network virtual This section states requirements on data center network virtual
machine mobility. machine mobility.
Data center network SHOULD support virtual machine mobility in IPv6. Data center network SHOULD support virtual machine mobility in IPv6.
IPv4 SHOULD also be supported in virtual machine mobility. IPv4 SHOULD also be supported in virtual machine mobility.
Virtual machine mobility protocol MAY support host routes to Virtual machine mobility protocol MAY support host routes to
skipping to change at page 5, line 13 skipping to change at page 5, line 21
change its IP address. Because of this an IP based virtual machine change its IP address. Because of this an IP based virtual machine
mobility protocol is not needed. However, when a virtual machine mobility protocol is not needed. However, when a virtual machine
moves, NVEs need to change their caches associating VM Layer 2 or moves, NVEs need to change their caches associating VM Layer 2 or
Medium Access Control (MAC) address with NVE's IP address. Such a Medium Access Control (MAC) address with NVE's IP address. Such a
change enables NVE to send outgoing MAC frames addressed to the change enables NVE to send outgoing MAC frames addressed to the
virtual machine. VM movement across Layer 3 boundaries is not virtual machine. VM movement across Layer 3 boundaries is not
typical but the same solution applies if the VM moves in the same typical but the same solution applies if the VM moves in the same
link such as in WSCs. link such as in WSCs.
Virtual machine moves from its source NVE to a new, destination NVE. Virtual machine moves from its source NVE to a new, destination NVE.
After the move After the move the virtual machine IP address(es) do not change but
the virtual machine IP address(es) do not change but this virtual this virtual machine is now under a new NVE, previously communicating
machine is now under a new NVE, previously communicating NVEs will NVEs will continue to send their packets to the source NVE. Address
continue to send their packets to the source NVE. Address Resolution Resolution Protocol (ARP) cache in IPv4 [RFC0826] or neighbor cache
Protocol (ARP) cache in IPv4 [RFC0826] or neighbor cache in IPv6 in IPv6 [RFC4861] in the NVEs need to be updated.
[RFC4861] in the NVEs need to be updated.
It may take some time to refresh ARP/ND cache when a VM is moved to a It may take some time to refresh ARP/ND cache when a VM is moved to a
new destination NVE. During this period, a tunnel is needed so that new destination NVE. During this period, a tunnel is needed so that
source NVE forwards packets to the destination NVE. source NVE forwards packets to the destination NVE.
In IPv4, the virtual machine immediately after the move should send a In IPv4, the virtual machine immediately after the move should send a
gratuitous ARP request message containing its IPv4 and Layer 2 or MAC gratuitous ARP request message containing its IPv4 and Layer 2 or MAC
address in its new NVE, destination NVE. This message's destination address in its new NVE, destination NVE. This message's destination
address is the broadcast address. NVE receives this message. NVE address is the broadcast address. Source NVE receives this message.
should update VM's ARP entry in the central directory at the NVA. source NVE should update VM's ARP entry in the central directory at
NVE asks NVA to update its mappings to record IPv4 address of VM the NVA. Source NVE asks NVA to update its mappings to record IPv4
along with MAC address of VM, and NVE IPv4 address. An NVE-to-NVA address of the moving VM along with MAC address of VM, and NVE IPv4
protocol is used for this purpose [RFC8014]. address. An NVE-to-NVA protocol is used for this purpose [RFC8014].
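The mapping update described above can be sketched as follows. This is a minimal illustration, not the NVE-to-NVA protocol of [RFC8014]: the names `Mapping`, `NvaDirectory`, `update` and `lookup` are hypothetical, and the directory is assumed to be keyed by the VM's IPv4 address.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Mapping:
    """One directory entry: VM IPv4 address, VM MAC address, NVE IPv4 address."""
    vm_ip: str
    vm_mac: str
    nve_ip: str


class NvaDirectory:
    """Hypothetical central directory held at the NVA, keyed by VM IPv4 address."""

    def __init__(self) -> None:
        self._by_ip: Dict[str, Mapping] = {}

    def update(self, vm_ip: str, vm_mac: str, nve_ip: str) -> Optional[Mapping]:
        """Record the VM under the reporting NVE; return the entry it replaced, if any."""
        previous = self._by_ip.get(vm_ip)
        self._by_ip[vm_ip] = Mapping(vm_ip, vm_mac, nve_ip)
        return previous

    def lookup(self, vm_ip: str) -> Optional[Mapping]:
        """Return the current entry for a VM IPv4 address, or None if unknown."""
        return self._by_ip.get(vm_ip)
```

Returning the replaced entry is a design choice of this sketch: it lets the caller learn which NVE previously hosted the VM, which is the NVE that would have to tunnel packets in flight.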
Reverse ARP (RARP) which enables the host to discover its IPv4 Reverse ARP (RARP) which enables the host to discover its IPv4
address when it boots from a local server [RFC0903] is not used by address when it boots from a local server [RFC0903] is not used by
VMs because the VM already knows its IPv4 address. IPv4/v6 address VMs because the VM already knows its IPv4 address. IPv4/v6 address
is assigned to a newly created VM, possibly using Dynamic Host is assigned to a newly created VM, possibly using Dynamic Host
Configuration Protocol (DHCP). There are some vendor deployments Configuration Protocol (DHCP). Next, we describe a case where RARP
(diskless systems or systems without configuration files) wherein VM is used.
users, i.e. end-user clients ask for the same MAC address upon
migration. This can be achieved by the clients sending RARP request There are some vendor deployments (diskless systems or systems
reverse message which carries the old MAC address looking for an IP without configuration files) wherein VM users, i.e. end-user clients
address allocation. The server, in this case the new NVE needs to ask for the same MAC address upon migration. This can be achieved by
communicate with NVA, just like in the gratuitous ARP case to ensure the clients sending RARP request reverse message which carries the
that the same IPv4 address is assigned to the VM. NVA uses the MAC old MAC address looking for an IP address allocation. The server, in
address as the key in the search of ARP cache to find the IP address this case the new NVE needs to communicate with NVA, just like in the
and informs this to the new NVE which in turns sends RARP reply gratuitous ARP case to ensure that the same IPv4 address is assigned
reverse message. This completes IP address assignment to the to the VM. NVA uses the MAC address as the key in the search of ARP
migrating VM. cache to find the IP address and informs this to the new NVE which in
turn sends RARP reply reverse message. This completes IP address
assignment to the migrating VM.
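The MAC-keyed search described for the RARP case might look like the sketch below. It assumes the NVA's ARP cache can be viewed as a simple MAC-to-IPv4 table; the function name `resolve_rarp` and the case-insensitive MAC comparison are assumptions made for illustration only.

```python
from typing import Dict, Optional


def resolve_rarp(arp_cache: Dict[str, str], requested_mac: str) -> Optional[str]:
    """Return the IPv4 address previously bound to requested_mac, or None.

    arp_cache maps MAC address strings to IPv4 address strings.
    MAC addresses are compared case-insensitively (an assumption of this sketch).
    """
    wanted = requested_mac.lower()
    for mac, ip in arp_cache.items():
        if mac.lower() == wanted:
            return ip
    return None
```

A result of None would mean the NVA has no earlier binding for that MAC address, so the same IPv4 address cannot be guaranteed.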
All NVEs communicating with this virtual machine uses the old ARP All NVEs communicating with this virtual machine uses the old ARP
entry. If any VM in those NVEs need to talk to the new VM in the entry. If any VM in those NVEs need to talk to the new VM in the
destination NVE, it uses the old ARP entry. Thus the packets are destination NVE, it uses the old ARP entry. Thus the packets are
delivered to the source NVE. The source NVE MUST tunnel these in- delivered to the source NVE. The source NVE MUST tunnel these in-
flight packets to the destination NVE. flight packets to the destination NVE.
When an ARP entry in those VMs times out, their corresponding NVEs When an ARP entry in those VMs times out, their corresponding NVEs
should access the NVA for an update. should access the NVA for an update.
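The forwarding behavior required of the source NVE can be sketched as a three-way decision. The class `SourceNve`, its method names and the "drop" outcome for unknown destinations are hypothetical; the draft only states that in-flight packets for a moved VM MUST be tunneled to the destination NVE.

```python
from typing import Dict, Iterable, Optional, Set, Tuple


class SourceNve:
    """Hypothetical forwarding state of an NVE that a VM may move away from."""

    def __init__(self, local_vm_macs: Iterable[str]) -> None:
        self.local: Set[str] = set(local_vm_macs)   # VMs still attached here
        self.moved: Dict[str, str] = {}             # VM MAC -> destination NVE IP

    def vm_moved(self, vm_mac: str, dest_nve_ip: str) -> None:
        """Record that a VM left this NVE and is now behind dest_nve_ip."""
        self.local.discard(vm_mac)
        self.moved[vm_mac] = dest_nve_ip

    def forward(self, dst_mac: str) -> Tuple[str, Optional[str]]:
        """Decide what to do with a frame addressed to dst_mac."""
        if dst_mac in self.local:
            return ("deliver", None)
        if dst_mac in self.moved:
            # Packet in flight: the sender still holds the old cache entry.
            return ("tunnel", self.moved[dst_mac])
        return ("drop", None)  # assumption: unknown destinations are not forwarded
```

In practice the `moved` entry would only be needed until the correspondents' cache entries time out and are refreshed from the NVA.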
IPv6 operation is slightly different: IPv6 operation is slightly different:
In IPv6, the virtual machine immediately after the move sends an In IPv6, the virtual machine immediately after the move sends an
unsolicited neighbor advertisement message containing its IPv6 unsolicited neighbor advertisement message containing its IPv6
address and Layer-2 MAC address in its new NVE, the destination NVE. address and Layer-2 MAC address in its new NVE, the destination NVE.
This message is sent to the IPv6 Solicited Node Multicast Address This message is sent to the IPv6 Solicited Node Multicast Address
corresponding to the target address which is VM's IPv6 address. NVE corresponding to the target address which is VM's IPv6 address. NVE
receives this message. NVE should update VM's neighbor cache entry receives this message. NVE should update VM's neighbor cache entry
in the central directory at the NVA. IPv6 address of VM, MAC address in the central directory of the NVA. IPv6 address of VM, MAC address
of VM and NVE IPv6 address are recorded to the entry. An NVE-to-NVA of VM and NVE IPv6 address are recorded in the entry. An NVE-to-NVA
protocol is used for this purpose [RFC8014]. protocol is used for this purpose [RFC8014].
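For reference, the Solicited Node Multicast Address named above is derived from the target address by appending its low-order 24 bits to the prefix ff02::1:ff00:0/104, as defined in RFC 4291. A short sketch using Python's standard `ipaddress` module (the function name is hypothetical):

```python
import ipaddress


def solicited_node_multicast(target: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast address for an IPv6 unicast address."""
    low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF   # low-order 24 bits
    prefix = int(ipaddress.IPv6Address("ff02::1:ff00:0"))   # ff02::1:ff00:0/104
    return ipaddress.IPv6Address(prefix | low24)
```

For example, the target address 2001:db8::1:800:200e:8c6c maps to ff02::1:ff0e:8c6c.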
All NVEs communicating with this virtual machine uses the old All NVEs communicating with this virtual machine uses the old
neighbor cache entry. If any VM in those NVEs need to talk to the neighbor cache entry. If any VM in those NVEs need to talk to the
new VM in the destination NVE, it uses the old neighbor cache entry. new VM in the destination NVE, it uses the old neighbor cache entry.
Thus the packets are delivered to the source NVE. The source NVE Thus the packets are delivered to the source NVE. The source NVE
MUST tunnel these in-flight packets to the destination NVE. MUST tunnel these in-flight packets to the destination NVE.
When a neighbor cache entry in those VMs times out, their When a neighbor cache entry in those VMs times out, their
corresponding NVEs should access the NVA for an update. corresponding NVEs should access the NVA for an update.
4.2. Task Migration 4.2. Task Migration
Virtualization in L2 based data center networks becomes quickly Virtualization in L2 based data center networks becomes quickly
prohibitive because ARP/neighbor caches don't scale. Scaling can be prohibitive because ARP/neighbor caches don't scale. Scaling can be
accomplished seamlessly in L3 data center networks by just giving accomplished seamlessly in L3 data center networks by just giving
each virtual network an IP subnet and a default route that points to each virtual network an IP subnet and a default route that points to
NVE. This means no explosion of ARP/ neighbor cache in guests (just NVE. This means no explosion of ARP/ neighbor cache in VMs and NVEs
one ARP/ neighbor cache entry for default route) and we do not need (just one ARP/ neighbor cache entry for default route) and there is
to have Ethernet header in encapsulation [RFC7348] which saves at no need to have Ethernet header in encapsulation [RFC7348] which
least 16 bytes. saves at least 16 bytes.
In L3 based data center networks, since IP address of the task has to In L3 based data center networks, since IP address of the task has to
change after move, an IP based task migration protocol is needed. change after move, an IP based task migration protocol is needed.
The protocol mostly used is the identifier locator addressing or ILA The protocol mostly used is the identifier locator addressing or ILA
[I-D.herbert-nvo3-ila]. Address and connection migration introduce [I-D.herbert-nvo3-ila]. Address and connection migration introduce
complications in task migration protocol as we discuss below. complications in task migration protocol as we discuss below.
Especially informing the communicating hosts of the migration becomes Especially informing the communicating hosts of the migration becomes
a major issue. Also, in L3 based networks, because broadcasting is a major issue. Also, in L3 based networks, because broadcasting is
not available, multicast of neighbor solicitations in IPv6 would need not available, multicast of neighbor solicitations in IPv6 would need
to be emulated. to be emulated.
skipping to change at page 9, line 41 skipping to change at page 9, line 51
In cases of warm standby option, the backup VMs receive backup In cases of warm standby option, the backup VMs receive backup
information at regular intervals. The duration of the interval information at regular intervals. The duration of the interval
determines the warmth of the standby option. The larger the determines the warmth of the standby option. The larger the
duration, the less warm (and hence cold) the standby option becomes. duration, the less warm (and hence cold) the standby option becomes.
In case of hot standby option, the VMs in both primary and secondary In case of hot standby option, the VMs in both primary and secondary
domains have identical information and can provide services domains have identical information and can provide services
simultaneously as in load-share mode of operation. If the VMs in the simultaneously as in load-share mode of operation. If the VMs in the
primary domain fails, there is no need to actively move the VMs to primary domain fails, there is no need to actively move the VMs to
the secondary domain because the VMs in the secondary domain already the secondary domain because the VMs in the secondary domain already
contains identical information. The hot standby option is the most contain identical information. The hot standby option is the most
costly mechanism for providing redundancy, and hence this option is costly mechanism for providing redundancy, and hence this option is
utilized only for mission-critical applications and services. utilized only for mission-critical applications and services. In hot
standby option, regarding TCP connections, one option is to start
with and maintain TCP connections to two different VMs at the same
time. The least loaded VM responds first and pickup providing
service while the sender (origin) still continues to receive Ack from
the heavily loaded (secondary) VM and chooses not use the service of
the secondary responding VM. If the situation (loading condition of
the primary responding VM) changes the secondary responding VM may
start providing service to the sender (origin).
8. Virtual Machine Operation 8. Virtual Machine Operation
Virtual machines are not involved in any mobility signalling. Once Virtual machines are not involved in any mobility signalling. Once
VM moves to the destination NVE, VM IP address does not change and VM VM moves to the destination NVE, VM IP address does not change and VM
should be able to continue to receive packets to its address(es). should be able to continue to receive packets to its address(es).
This happens in hot VM mobility scenarios. This happens in hot VM mobility scenarios.
Virtual machine sends a gratuitous Address Resolution Protocol or Virtual machine sends a gratuitous Address Resolution Protocol or
unsolicited Neighbor Advertisement message upstream after each move. unsolicited Neighbor Advertisement message upstream after each move.
skipping to change at page 10, line 50 skipping to change at page 11, line 19
This usually happens in cases like the virtual machines running third This usually happens in cases like the virtual machines running third
part applications. This requires the usage of stronger security part applications. This requires the usage of stronger security
mechanisms. mechanisms.
10. IANA Considerations 10. IANA Considerations
This document makes no request to IANA. This document makes no request to IANA.
11. Acknowledgements 11. Acknowledgements
The authors are grateful to Qiang Zu, Andrew Malis for helpful The authors are grateful to Dave R. Worley, Qiang Zu, Andrew Malis
comments. for helpful comments.
12. Change Log 12. Change Log
o submitted version -00 as a working group draft after adoption. o submitted version -00 as a working group draft after adoption
o submitted version -01 with these changes: references are updated, o submitted version -01 with these changes: references are updated,
added packets in flight definition to Section 2 added packets in flight definition to Section 2
o submitted version -02 with updated address. o submitted version -02 with updated address.
o submitted version -03 to fix the nits.
o submitted version -04 in reference to the WG Last call comments.
13. References 13. References
13.1. Normative References 13.1. Normative References
[RFC0826] Plummer, D., "An Ethernet Address Resolution Protocol: Or [RFC0826] Plummer, D., "An Ethernet Address Resolution Protocol: Or
Converting Network Protocol Addresses to 48.bit Ethernet Converting Network Protocol Addresses to 48.bit Ethernet
Address for Transmission on Ethernet Hardware", STD 37, Address for Transmission on Ethernet Hardware", STD 37,
RFC 826, DOI 10.17487/RFC0826, November 1982, RFC 826, DOI 10.17487/RFC0826, November 1982,
<https://www.rfc-editor.org/info/rfc826>. <https://www.rfc-editor.org/info/rfc826>.
skipping to change at page 12, line 27 skipping to change at page 12, line 48
13.2. Informative references 13.2. Informative references
[I-D.herbert-nvo3-ila] [I-D.herbert-nvo3-ila]
Herbert, T. and P. Lapukhov, "Identifier-locator Herbert, T. and P. Lapukhov, "Identifier-locator
addressing for IPv6", draft-herbert-nvo3-ila-04 (work in addressing for IPv6", draft-herbert-nvo3-ila-04 (work in
progress), March 2017. progress), March 2017.
Authors' Addresses Authors' Addresses
Behcet Sarikaya Behcet Sarikaya
Plano, TX, USA Denpel Informatique
Email: sarikaya@ieee.org Email: sarikaya@ieee.org
Linda Dunbar Linda Dunbar
Huawei USA Huawei USA
5340 Legacy Dr. Building 3 5340 Legacy Dr. Building 3
Plano, TX 75024 Plano, TX 75024
Email: linda.dunbar@huawei.com Email: linda.dunbar@huawei.com
Bhumip Khasnabish Bhumip Khasnabish
ZTE (TX) Inc. ZTE (TX) Inc.
1900 McCarthy Blvd., Suite 205 55 Madison Avenue, Suite 160
Milpitas, CA 95035 Morristown, NJ 07960
Email: vumip1@gmail.com, bhumip.khasnabish@ztetx.com Email: vumip1@gmail.com, bhumip.khasnabish@ztetx.com
Tom Herbert Tom Herbert
Quantonium Quantonium
Email: tom@herbertland.com Email: tom@herbertland.com
Saumya Dikshit Saumya Dikshit
Cisco Systems Cisco Systems
Cessna Business Park Cessna Business Park
Bangalore, Karnataka, India 560 087 Bangalore, Karnataka, India 560 087
Email: sadikshi@cisco.com Email: sadikshi@cisco.com
 End of changes. 24 change blocks. 
48 lines changed or deleted 64 lines changed or added
