draft-ietf-mboned-dc-deploy-08.txt | draft-ietf-mboned-dc-deploy-09.txt | |||
---|---|---|---|---|
MBONED M. McBride | MBONED M. McBride | |||
Internet-Draft Futurewei | Internet-Draft Futurewei | |||
Intended status: Informational O. Komolafe | Intended status: Informational O. Komolafe | |||
Expires: August 7, 2020 Arista Networks | Expires: August 7, 2020 Arista Networks | |||
February 4, 2020 | February 4, 2020 | |||
Multicast in the Data Center Overview | Multicast in the Data Center Overview | |||
draft-ietf-mboned-dc-deploy-08 | draft-ietf-mboned-dc-deploy-09 | |||
Abstract | Abstract | |||
The volume and importance of one-to-many traffic patterns in data | The volume and importance of one-to-many traffic patterns in data | |||
centers is likely to increase significantly in the future. Reasons | centers is likely to increase significantly in the future. Reasons | |||
for this increase are discussed and then attention is paid to the | for this increase are discussed and then attention is paid to the | |||
manner in which this traffic pattern may be judiously handled in data | manner in which this traffic pattern may be judiciously handled in | |||
centers. The intuitive solution of deploying conventional IP | data centers. The intuitive solution of deploying conventional IP | |||
multicast within data centers is explored and evaluated. Thereafter, | multicast within data centers is explored and evaluated. Thereafter, | |||
a number of emerging innovative approaches are described before a | a number of emerging innovative approaches are described before a | |||
number of recommendations are made. | number of recommendations are made. | |||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
skipping to change at page 2, line 41 ¶ | skipping to change at page 2, line 41 ¶ | |||
7. Security Considerations . . . . . . . . . . . . . . . . . . . 15 | 7. Security Considerations . . . . . . . . . . . . . . . . . . . 15 | |||
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 15 | 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 15 | |||
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 15 | 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 15 | |||
9.1. Normative References . . . . . . . . . . . . . . . . . . 15 | 9.1. Normative References . . . . . . . . . . . . . . . . . . 15 | |||
9.2. Informative References . . . . . . . . . . . . . . . . . 16 | 9.2. Informative References . . . . . . . . . . . . . . . . . 16 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 18 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 18 | |||
1. Introduction | 1. Introduction | |||
The volume and importance of one-to-many traffic patterns in data | The volume and importance of one-to-many traffic patterns in data | |||
centers is likely to increase significantly in the future. Reasons | centers will likely continue to increase. Reasons for this increase | |||
for this increase include the nature of the traffic generated by | include the nature of the traffic generated by applications hosted in | |||
applications hosted in the data center, the need to handle broadcast, | the data center, the need to handle broadcast, unknown unicast and | |||
unknown unicast and multicast (BUM) traffic within the overlay | multicast (BUM) traffic within the overlay technologies used to | |||
technologies used to support multi-tenancy at scale, and the use of | support multi-tenancy at scale, and the use of certain protocols that | |||
certain protocols that traditionally require one-to-many control | traditionally require one-to-many control message exchanges. | |||
message exchanges. | ||||
These trends, allied with the expectation that future highly | These trends, allied with the expectation that highly virtualized | |||
virtualized large-scale data centers must support communication | large-scale data centers must support communication between | |||
between potentially thousands of participants, may lead to the | potentially thousands of participants, may lead to the natural | |||
natural assumption that IP multicast will be widely used in data | assumption that IP multicast will be widely used in data centers, | |||
centers, specifically given the bandwidth savings it potentially | specifically given the bandwidth savings it potentially offers. | |||
offers. However, such an assumption would be wrong. In fact, there | However, such an assumption would be wrong. In fact, there is | |||
is widespread reluctance to enable conventional IP multicast in data | widespread reluctance to enable conventional IP multicast in data | |||
centers for a number of reasons, mostly pertaining to concerns about | centers for a number of reasons, mostly pertaining to concerns about | |||
its scalability and reliability. | its scalability and reliability. | |||
This draft discusses some of the main drivers for the increasing | This draft discusses some of the main drivers for the increasing | |||
volume and importance of one-to-many traffic patterns in data | volume and importance of one-to-many traffic patterns in data | |||
centers. Thereafter, the manner in which conventional IP multicast | centers. Thereafter, the manner in which conventional IP multicast | |||
may be used to handle this traffic pattern is discussed and some of | may be used to handle this traffic pattern is discussed and some of | |||
the associated challenges highlighted. Following this discussion, a | the associated challenges highlighted. Following this discussion, a | |||
number of alternative emerging approaches are introduced, before | number of alternative emerging approaches are introduced, before | |||
concluding by discussing key trends and making a number of | concluding by discussing key trends and making a number of | |||
skipping to change at page 5, line 38 ¶ | skipping to change at page 5, line 37 ¶ | |||
Another key contributor to the rise in one-to-many traffic patterns | Another key contributor to the rise in one-to-many traffic patterns | |||
is the proposed architecture for supporting large-scale multi-tenancy | is the proposed architecture for supporting large-scale multi-tenancy | |||
in highly virtualized data centers [RFC8014]. In this architecture, | in highly virtualized data centers [RFC8014]. In this architecture, | |||
a tenant's VMs are distributed across the data center and are | a tenant's VMs are distributed across the data center and are | |||
connected by a virtual network known as the overlay network. A | connected by a virtual network known as the overlay network. A | |||
number of different technologies have been proposed for realizing the | number of different technologies have been proposed for realizing the | |||
overlay network, including VXLAN [RFC7348], VXLAN-GPE [I-D.ietf-nvo3- | overlay network, including VXLAN [RFC7348], VXLAN-GPE [I-D.ietf-nvo3- | |||
vxlan-gpe], NVGRE [RFC7637] and GENEVE [I-D.ietf-nvo3-geneve]. The | vxlan-gpe], NVGRE [RFC7637] and GENEVE [I-D.ietf-nvo3-geneve]. The | |||
often fervent and arguably partisan debate about the relative merits | often fervent and arguably partisan debate about the relative merits | |||
of these overlay technologies belies the fact that, conceptually, it | of these overlay technologies belies the fact that, conceptually, it | |||
may be said that these overlays mainly simply provide a means to | may be said that these overlays simply provide a means to encapsulate | |||
encapsulate and tunnel Ethernet frames from the VMs over the data | and tunnel Ethernet frames from the VMs over the data center IP | |||
center IP fabric, thus emulating a Layer 2 segment between the VMs. | fabric, thus emulating a Layer 2 segment between the VMs. | |||
Consequently, the VMs believe and behave as if they are connected to | Consequently, the VMs believe and behave as if they are connected to | |||
the tenant's other VMs by a conventional Layer 2 segment, regardless | the tenant's other VMs by a conventional Layer 2 segment, regardless | |||
of their physical location within the data center. | of their physical location within the data center. | |||
Naturally, in a Layer 2 segment, point to multi-point traffic can | Naturally, in a Layer 2 segment, point to multi-point traffic can | |||
result from handling BUM (broadcast, unknown unicast and multicast) | result from handling BUM (broadcast, unknown unicast and multicast) | |||
traffic. And, compounding this issue within data centers, since the | traffic. And, compounding this issue within data centers, since the | |||
tenant's VMs attached to the emulated segment may be dispersed | tenant's VMs attached to the emulated segment may be dispersed | |||
throughout the data center, the BUM traffic may need to traverse the | throughout the data center, the BUM traffic may need to traverse the | |||
data center fabric. | data center fabric. | |||
skipping to change at page 7, line 7 ¶ | skipping to change at page 7, line 7 ¶ | |||
Section 2.1, Section 2.2 and Section 2.3 have discussed how the | Section 2.1, Section 2.2 and Section 2.3 have discussed how the | |||
trends in the types of applications, the overlay technologies used | trends in the types of applications, the overlay technologies used | |||
and some of the essential networking protocols results in an increase | and some of the essential networking protocols results in an increase | |||
in the volume of one-to-many traffic patterns in modern highly- | in the volume of one-to-many traffic patterns in modern highly- | |||
virtualized data centers. Section 3 explores how such traffic flows | virtualized data centers. Section 3 explores how such traffic flows | |||
may be handled using conventional IP multicast. | may be handled using conventional IP multicast. | |||
3. Handling one-to-many traffic using conventional multicast | 3. Handling one-to-many traffic using conventional multicast | |||
Faced with ever increasing volumes of one-to-many traffic flows for | Faced with ever increasing volumes of one-to-many traffic flows, for | |||
the reasons presented in Section 2, arguably the intuitive initial | the reasons presented in Section 2, it makes sense for a data center | |||
course of action for a data center operator is to explore if and how | operator to explore if and how conventional IP multicast could be | |||
conventional IP multicast could be deployed within the data center. | deployed within the data center. This section introduces the key | |||
This section introduces the key protocols, discusses some example use | protocols, discusses some example use cases where they are deployed | |||
cases where they are deployed in data centers and discusses some of | in data centers and discusses some of the advantages and | |||
the advantages and disadvantages of such deployments. | disadvantages of such deployments. | |||
3.1. Layer 3 multicast | 3.1. Layer 3 multicast | |||
PIM is the most widely deployed multicast routing protocol and so, | PIM is the most widely deployed multicast routing protocol and so, | |||
unsurprisingly, is the primary multicast routing protocol considered | unsurprisingly, is the primary multicast routing protocol considered | |||
for use in the data center. There are three potential popular modes | for use in the data center. There are three potential popular modes | |||
of PIM that may be used: PIM-SM [RFC4601], PIM-SSM [RFC4607] or PIM- | of PIM that may be used: PIM-SM [RFC4601], PIM-SSM [RFC4607] or PIM- | |||
BIDIR [RFC5015]. It may be said that these different modes of PIM | BIDIR [RFC5015]. It may be said that these different modes of PIM | |||
tradeoff the optimality of the multicast forwarding tree for the | tradeoff the optimality of the multicast forwarding tree for the | |||
amount of multicast forwarding state that must be maintained at | amount of multicast forwarding state that must be maintained at | |||
skipping to change at page 9, line 21 ¶ | skipping to change at page 9, line 21 ¶ | |||
specification [RFC7348], a data-driven flood and learn control plane | specification [RFC7348], a data-driven flood and learn control plane | |||
was proposed, requiring the data center IP fabric to support | was proposed, requiring the data center IP fabric to support | |||
multicast routing. A multicast group is associated with each virtual | multicast routing. A multicast group is associated with each virtual | |||
network, each uniquely identified by its VXLAN network identifiers | network, each uniquely identified by its VXLAN network identifiers | |||
(VNI). VXLAN tunnel endpoints (VTEPs), typically located in the | (VNI). VXLAN tunnel endpoints (VTEPs), typically located in the | |||
hypervisor or ToR switch, with local VMs that belong to this VNI | hypervisor or ToR switch, with local VMs that belong to this VNI | |||
would join the multicast group and use it for the exchange of BUM | would join the multicast group and use it for the exchange of BUM | |||
traffic with the other VTEPs. Essentially, the VTEP would | traffic with the other VTEPs. Essentially, the VTEP would | |||
encapsulate any BUM traffic from attached VMs in an IP multicast | encapsulate any BUM traffic from attached VMs in an IP multicast | |||
packet, whose destination address is the associated multicast group | packet, whose destination address is the associated multicast group | |||
address, and transmit the packet to the data center fabric. Thus, | address, and transmit the packet to the data center fabric. Thus, a | |||
PIM must be running in the fabric to maintain a multicast | multicast routing protocol (typically PIM) must be running in the | |||
distribution tree per VNI. | fabric to maintain a multicast distribution tree per VNI. | |||
Alternatively, rather than setting up a multicast distribution tree | Alternatively, rather than setting up a multicast distribution tree | |||
per VNI, a tree can be set up whenever hosts within the VNI wish to | per VNI, a tree can be set up whenever hosts within the VNI wish to | |||
exchange multicast traffic. For example, whenever a VTEP receives an | exchange multicast traffic. For example, whenever a VTEP receives an | |||
IGMP report from a locally connected host, it would translate this | IGMP report from a locally connected host, it would translate this | |||
into a PIM join message which will be propagated into the IP fabric. | into a PIM join message which will be propagated into the IP fabric. | |||
In order to ensure this join message is sent to the IP fabric rather | In order to ensure this join message is sent to the IP fabric rather | |||
than over the VXLAN interface (since the VTEP will have a route back | than over the VXLAN interface (since the VTEP will have a route back | |||
to the source of the multicast packet over the VXLAN interface and so | to the source of the multicast packet over the VXLAN interface and so | |||
would naturally attempt to send the join over this interface) a more | would naturally attempt to send the join over this interface) a more | |||
skipping to change at page 14, line 17 ¶ | skipping to change at page 14, line 17 ¶ | |||
to the packet. This header contains a bit string in which each bit | to the packet. This header contains a bit string in which each bit | |||
maps to an egress router, known as Bit-Forwarding Egress Router | maps to an egress router, known as Bit-Forwarding Egress Router | |||
(BFER). If a bit is set, then the packet should be forwarded to the | (BFER). If a bit is set, then the packet should be forwarded to the | |||
associated BFER. The routers within the BIER domain, Bit-Forwarding | associated BFER. The routers within the BIER domain, Bit-Forwarding | |||
Routers (BFRs), use the BIER header in the packet and information in | Routers (BFRs), use the BIER header in the packet and information in | |||
the Bit Index Forwarding Table (BIFT) to carry out simple bit-wise | the Bit Index Forwarding Table (BIFT) to carry out simple bit-wise | |||
operations to determine how the packet should be replicated optimally | operations to determine how the packet should be replicated optimally | |||
so it reaches all the appropriate BFERs. | so it reaches all the appropriate BFERs. | |||
BIER is deemed to be attractive for facilitating one-to-many | BIER is deemed to be attractive for facilitating one-to-many | |||
communications in data centers [I-D.ietf-bier-use-cases]. The | communications in data centers [I-D.ietf-bier-use-cases]. The BFIRs | |||
deployment envisioned with overlay networks is that the the | are the encapsulation endpoints in the deployment envisioned with | |||
encapsulation endpoints would be the BFIR. So knowledge about the | overlay networks. So knowledge about the actual multicast groups | |||
actual multicast groups does not reside in the data center fabric, | does not reside in the data center fabric, improving the scalability | |||
improving the scalability compared to conventional IP multicast. | compared to conventional IP multicast. Additionally, a centralized | |||
Additionally, a centralized controller or a BGP-EVPN control plane | controller or a BGP-EVPN control plane may be used with BIER to | |||
may be used with BIER to ensure the BFIR have the required | ensure the BFIR have the required information. A challenge | |||
information. A challenge associated with using BIER is that it | associated with using BIER is that it requires changes to the | |||
requires changes to the forwarding behaviour of the routers used in | forwarding behaviour of the routers used in the data center IP | |||
the data center IP fabric. | fabric. | |||
4.5. Segment Routing | 4.5. Segment Routing | |||
Segment Routing (SR) [RFC8402] is a manifestation of the source | Segment Routing (SR) [RFC8402] is a manifestation of the source | |||
routing paradigm, so called as the path a packet takes through a | routing paradigm, so called as the path a packet takes through a | |||
network is determined at the source. The source encodes this | network is determined at the source. The source encodes this | |||
information in the packet header as a sequence of instructions. | information in the packet header as a sequence of instructions. | |||
These instructions are followed by intermediate routers, ultimately | These instructions are followed by intermediate routers, ultimately | |||
resulting in the delivery of the packet to the desired destination. | resulting in the delivery of the packet to the desired destination. | |||
In SR, the instructions are known as segments and a number of | In SR, the instructions are known as segments and a number of | |||
End of changes. 8 change blocks. | ||||
40 lines changed or deleted | 39 lines changed or added | |||
This html diff was produced by rfcdiff 1.47. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ |
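
The BIER text in the last change block above describes a header bit string in which each bit maps to a BFER, and a BFR that consults its BIFT and uses simple bit-wise operations to decide how to replicate the packet. A minimal sketch of that replication step follows, loosely modelled on the forwarding procedure in RFC 8279. It is an illustration under stated assumptions, not text from either draft revision: the function name `bier_forward`, the dictionary layout of the BIFT, the neighbor names "A" and "B", and the example bit positions are all hypothetical, and local delivery on a BFR that is itself a BFER is omitted.

```python
from typing import Dict, List, Tuple

# Hypothetical BIFT layout (an assumption for illustration):
#   bit position (1-based, one per BFER) -> (forwarding bit mask, neighbor)
# The forwarding bit mask has a bit set for every BFER reached via that neighbor.
Bift = Dict[int, Tuple[int, str]]


def bier_forward(bitstring: int, bift: Bift) -> List[Tuple[str, int]]:
    """Return (neighbor, bitstring) pairs, one per packet copy to send."""
    copies: List[Tuple[str, int]] = []
    remaining = bitstring
    while remaining:
        # Position (1-based) of the lowest bit that is still set.
        pos = (remaining & -remaining).bit_length()
        this_bit = 1 << (pos - 1)
        entry = bift.get(pos)
        if entry is None:
            # No route to this BFER: drop the bit and carry on.
            remaining &= ~this_bit
            continue
        mask, neighbor = entry
        # The copy carries only the BFERs reachable via this neighbor,
        # so downstream routers do not create duplicates.
        copies.append((neighbor, remaining & mask))
        # Clear every bit just served (and this bit, in case the
        # table entry is malformed) so the loop always terminates.
        remaining &= ~(mask | this_bit)
    return copies


# Example: BFERs 1 and 2 sit behind neighbor "A", BFERs 3 and 4 behind "B".
example_bift: Bift = {
    1: (0b0011, "A"),
    2: (0b0011, "A"),
    3: (0b1100, "B"),
    4: (0b1100, "B"),
}

if __name__ == "__main__":
    # Packet addressed to BFERs 1, 2 and 4.
    for neighbor, bits in bier_forward(0b1011, example_bift):
        print(neighbor, format(bits, "04b"))
```

In this sketch the router holds one table entry per BFER and no per-group entries, which is the property the draft text points to when it says that knowledge of the actual multicast groups does not reside in the data center fabric.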