--- 1/draft-ietf-rtcweb-rtp-usage-05.txt 2013-02-26 03:31:11.492124379 +0100 +++ 2/draft-ietf-rtcweb-rtp-usage-06.txt 2013-02-26 03:31:11.620170330 +0100 @@ -1,21 +1,21 @@ Network Working Group C. Perkins Internet-Draft University of Glasgow Intended status: Standards Track M. Westerlund -Expires: April 25, 2013 Ericsson +Expires: August 29, 2013 Ericsson J. Ott Aalto University - October 22, 2012 + February 25, 2013 Web Real-Time Communication (WebRTC): Media Transport and Use of RTP - draft-ietf-rtcweb-rtp-usage-05 + draft-ietf-rtcweb-rtp-usage-06 Abstract The Web Real-Time Communication (WebRTC) framework provides support for direct interactive rich communication using audio, video, text, collaboration, games, etc. between two peers' web-browsers. This memo describes the media transport aspects of the WebRTC framework. It specifies how the Real-time Transport Protocol (RTP) is used in the WebRTC context, and gives requirements for which RTP features, profiles, and extensions need to be supported. @@ -28,25 +28,25 @@ Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on April 25, 2013. + This Internet-Draft will expire on August 29, 2013. Copyright Notice - Copyright (c) 2012 IETF Trust and the persons identified as the + Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved. 
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as @@ -63,73 +63,73 @@ 4.3. Choice of RTP Payload Formats . . . . . . . . . . . . . . 8 4.4. RTP Session Multiplexing . . . . . . . . . . . . . . . . . 8 4.5. RTP and RTCP Multiplexing . . . . . . . . . . . . . . . . 9 4.6. Reduced Size RTCP . . . . . . . . . . . . . . . . . . . . 10 4.7. Symmetric RTP/RTCP . . . . . . . . . . . . . . . . . . . . 10 4.8. Choice of RTP Synchronisation Source (SSRC) . . . . . . . 10 4.9. Generation of the RTCP Canonical Name (CNAME) . . . . . . 11 5. WebRTC Use of RTP: Extensions . . . . . . . . . . . . . . . . 11 5.1. Conferencing Extensions . . . . . . . . . . . . . . . . . 11 5.1.1. Full Intra Request (FIR) . . . . . . . . . . . . . . . 12 - 5.1.2. Picture Loss Indication (PLI) . . . . . . . . . . . . 12 + 5.1.2. Picture Loss Indication (PLI) . . . . . . . . . . . . 13 5.1.3. Slice Loss Indication (SLI) . . . . . . . . . . . . . 13 5.1.4. Reference Picture Selection Indication (RPSI) . . . . 13 5.1.5. Temporal-Spatial Trade-off Request (TSTR) . . . . . . 13 5.1.6. Temporary Maximum Media Stream Bit Rate Request (TMMBR) . . . . . . . . . . . . . . . . . . . . . . . 13 5.2. Header Extensions . . . . . . . . . . . . . . . . . . . . 14 5.2.1. Rapid Synchronisation . . . . . . . . . . . . . . . . 14 5.2.2. Client-to-Mixer Audio Level . . . . . . . . . . . . . 14 5.2.3. Mixer-to-Client Audio Level . . . . . . . . . . . . . 15 6. WebRTC Use of RTP: Improving Transport Robustness . . . . . . 15 6.1. Negative Acknowledgements and RTP Retransmission . . . . . 
15 6.2. Forward Error Correction (FEC) . . . . . . . . . . . . . . 16 - 7. WebRTC Use of RTP: Rate Control and Media Adaptation . . . . . 16 + 7. WebRTC Use of RTP: Rate Control and Media Adaptation . . . . . 17 7.1. Boundary Conditions and Circuit Breakers . . . . . . . . . 17 - 7.2. RTCP Limitations for Congestion Control . . . . . . . . . 18 - 7.3. Congestion Control Interoperability With Legacy Systems . 19 - 8. WebRTC Use of RTP: Performance Monitoring . . . . . . . . . . 19 + 7.2. RTCP Extensions for Congestion Control . . . . . . . . . . 18 + 7.3. RTCP Limitations for Congestion Control . . . . . . . . . 18 + 7.4. Congestion Control Interoperability With Legacy Systems . 19 + 8. WebRTC Use of RTP: Performance Monitoring . . . . . . . . . . 20 9. WebRTC Use of RTP: Future Extensions . . . . . . . . . . . . . 20 10. Signalling Considerations . . . . . . . . . . . . . . . . . . 20 - 11. WebRTC API Considerations . . . . . . . . . . . . . . . . . . 21 - 11.1. API MediaStream to RTP Mapping . . . . . . . . . . . . . . 21 - 12. RTP Implementation Considerations . . . . . . . . . . . . . . 22 - 12.1. RTP Sessions and PeerConnection . . . . . . . . . . . . . 22 + 11. WebRTC API Considerations . . . . . . . . . . . . . . . . . . 22 + 12. RTP Implementation Considerations . . . . . . . . . . . . . . 23 + 12.1. RTP Sessions and PeerConnections . . . . . . . . . . . . . 23 12.2. Multiple Sources . . . . . . . . . . . . . . . . . . . . . 24 - 12.3. Multiparty . . . . . . . . . . . . . . . . . . . . . . . . 24 - 12.4. SSRC Collision Detection . . . . . . . . . . . . . . . . . 25 - 12.5. Contributing Sources . . . . . . . . . . . . . . . . . . . 26 + 12.3. Multiparty . . . . . . . . . . . . . . . . . . . . . . . . 25 + 12.4. SSRC Collision Detection . . . . . . . . . . . . . . . . . 26 + 12.5. Contributing Sources and the CSRC List . . . . . . . . . . 27 12.6. Media Synchronization . . . . . . . . . . . . . . . . . . 27 - 12.7. Multiple RTP End-points . . . . . . . . . . . 
. . . . . . 27 - 12.8. Simulcast . . . . . . . . . . . . . . . . . . . . . . . . 28 + 12.7. Multiple RTP End-points . . . . . . . . . . . . . . . . . 28 + 12.8. Simulcast . . . . . . . . . . . . . . . . . . . . . . . . 29 12.9. Differentiated Treatment of Flows . . . . . . . . . . . . 29 - 13. Open Issues . . . . . . . . . . . . . . . . . . . . . . . . . 30 - 14. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 31 - 15. Security Considerations . . . . . . . . . . . . . . . . . . . 31 - 16. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 32 - 17. References . . . . . . . . . . . . . . . . . . . . . . . . . . 32 - 17.1. Normative References . . . . . . . . . . . . . . . . . . . 32 - 17.2. Informative References . . . . . . . . . . . . . . . . . . 35 - Appendix A. Supported RTP Topologies . . . . . . . . . . . . . . 36 - A.1. Point to Point . . . . . . . . . . . . . . . . . . . . . . 37 - A.2. Multi-Unicast (Mesh) . . . . . . . . . . . . . . . . . . . 40 - A.3. Mixer Based . . . . . . . . . . . . . . . . . . . . . . . 43 - A.3.1. Media Mixing . . . . . . . . . . . . . . . . . . . . . 43 - A.3.2. Media Switching . . . . . . . . . . . . . . . . . . . 46 - A.3.3. Media Projecting . . . . . . . . . . . . . . . . . . . 49 - A.4. Translator Based . . . . . . . . . . . . . . . . . . . . . 52 - A.4.1. Transcoder . . . . . . . . . . . . . . . . . . . . . . 52 - A.4.2. Gateway / Protocol Translator . . . . . . . . . . . . 53 - A.4.3. Relay . . . . . . . . . . . . . . . . . . . . . . . . 55 - A.5. End-point Forwarding . . . . . . . . . . . . . . . . . . . 59 - A.6. Simulcast . . . . . . . . . . . . . . . . . . . . . . . . 60 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 61 + 13. Open Issues . . . . . . . . . . . . . . . . . . . . . . . . . 31 + 14. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 32 + 15. Security Considerations . . . . . . . . . . . . . . . . . . . 32 + 16. Acknowledgements . . . . . . . . . . . . 
. . . . . . . . . . . 33 + 17. References . . . . . . . . . . . . . . . . . . . . . . . . . . 33 + 17.1. Normative References . . . . . . . . . . . . . . . . . . . 33 + 17.2. Informative References . . . . . . . . . . . . . . . . . . 36 + Appendix A. Supported RTP Topologies . . . . . . . . . . . . . . 38 + A.1. Point to Point . . . . . . . . . . . . . . . . . . . . . . 38 + A.2. Multi-Unicast (Mesh) . . . . . . . . . . . . . . . . . . . 41 + A.3. Mixer Based . . . . . . . . . . . . . . . . . . . . . . . 44 + A.3.1. Media Mixing . . . . . . . . . . . . . . . . . . . . . 44 + A.3.2. Media Switching . . . . . . . . . . . . . . . . . . . 47 + A.3.3. Media Projecting . . . . . . . . . . . . . . . . . . . 50 + A.4. Translator Based . . . . . . . . . . . . . . . . . . . . . 53 + A.4.1. Transcoder . . . . . . . . . . . . . . . . . . . . . . 53 + A.4.2. Gateway / Protocol Translator . . . . . . . . . . . . 54 + A.4.3. Relay . . . . . . . . . . . . . . . . . . . . . . . . 56 + A.5. End-point Forwarding . . . . . . . . . . . . . . . . . . . 60 + A.6. Simulcast . . . . . . . . . . . . . . . . . . . . . . . . 61 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 62 1. Introduction The Real-time Transport Protocol (RTP) [RFC3550] provides a framework for delivery of audio and video teleconferencing data and other real- time media applications. Previous work has defined the RTP protocol, along with numerous profiles, payload formats, and other extensions. When combined with appropriate signalling, these form the basis for many teleconferencing systems. @@ -283,22 +283,22 @@ degradation when interoperating with legacy implementations. Other implementation considerations are discussed in Section 12. 4.2. Choice of the RTP Profile The complete specification of RTP for a particular application domain requires the choice of an RTP Profile. 
For WebRTC use, the "Extended Secure RTP Profile for Real-time Transport Control Protocol (RTCP)- Based Feedback (RTP/SAVPF)" [RFC5124] as extended by - [I-D.terriberry-avp-codecs] MUST be implemented. This builds on the - basic RTP/AVP profile [RFC3551], the RTP profile for RTCP-based + [I-D.ietf-avtcore-avp-codecs] MUST be implemented. This builds on + the basic RTP/AVP profile [RFC3551], the RTP profile for RTCP-based feedback (RTP/AVPF) [RFC4585], and the secure RTP profile (RTP/SAVP) [RFC3711]. The RTCP-based feedback extensions [RFC4585] are needed for the improved RTCP timer model, that allows more flexible transmission of RTCP packets in response to events, rather than strictly according to bandwidth. This is vital for being able to report congestion events. These extensions also save RTCP bandwidth, and will commonly only use the full RTCP bandwidth allocation if there are many events that require feedback. They are also needed to make use of the RTP @@ -321,21 +321,21 @@ transforms listed in Section 5 of [RFC3711] SHALL apply. Implementations MUST support DTLS-SRTP [RFC5764] for key-management. Other key management schemes MAY be supported. 4.3. Choice of RTP Payload Formats Implementations MUST follow the WebRTC Audio Codec and Processing Requirements [I-D.ietf-rtcweb-audio] and SHOULD follow the updated recommendations for audio codecs in the RTP/AVP Profile - [I-D.terriberry-avp-codecs]. Support for other audio codecs is + [I-D.ietf-avtcore-avp-codecs]. Support for other audio codecs is OPTIONAL. (tbd: the mandatory to implement video codec is not yet decided) Endpoints MAY signal support for multiple RTP payload formats, or multiple configurations of a single RTP payload format, provided each payload format uses a different RTP payload type number. 
An endpoint that has signalled support for multiple RTP payload formats SHOULD accept data in any of those payload formats at any time, unless it has previously signalled limitations on its decoding capability. @@ -365,39 +365,43 @@ its own RTCP packets (i.e., one RTP session for the audio, with a separate RTP session using a different transport address for the video; if SDP is used, this corresponds to one RTP session for each "m=" line in the SDP). WebRTC implementations of RTP are REQUIRED to implement support for multimedia sessions in this way, for compatibility with legacy systems. In today's networks, however, with the widespread use of Network Address/Port Translators (NAT/NAPT) and Firewalls (FW), it is desirable to reduce the number of transport addresses used by real- - time media applications using RTP by combining multimedia traffic in - a single RTP session. (Details of how this is to be done are tbd, - but see [I-D.lennox-rtcweb-rtp-media-type-mux], - [I-D.holmberg-mmusic-sdp-bundle-negotiation] and - [I-D.westerlund-avtcore-multiplex-architecture].) Using a single RTP - session also effects the possibility for differentiated treatment of - media flows. This is further discussed in Section 12.9. + time media applications using RTP by combining all RTP media streams + in a single RTP session. Using a single RTP session also effects the + possibility for differentiated treatment of media flows. This is + further discussed in Section 12.9. WebRTC implementations of RTP are + REQUIRED to support transport of all RTP media streams, independent + of media type, in a single RTP session according to + [I-D.ietf-avtcore-multi-media-rtp-session]. If such RTP session + set-up is to be used, this MUST be negotiated during the signalling + phase [I-D.ietf-mmusic-sdp-bundle-negotiation]. - WebRTC implementations of RTP are REQUIRED to support multiplexing of - a multimedia session onto a single RTP session according to (tbd). 
- If such RTP session multiplexing is to be used, this MUST be - negotiated during the signalling phase. Support for multiple RTP - sessions over a single UDP flow as defined by - [I-D.westerlund-avtcore-transport-multiplexing] is RECOMMENDED/ - OPTIONAL. + Support for multiple RTP sessions over a single UDP flow as defined + by [I-D.westerlund-avtcore-transport-multiplexing] is RECOMMENDED/ + OPTIONAL. If multiple RTP sessions are to be multiplexed onto a + single UDP flow, this MUST be negotiated during the signalling phase. - (tbd: No consensus on the level of including support of Multiple RTP + (tbd: No consensus on the level of support of Multiple RTP sessions over a single UDP flow.) + Further discussion about when different RTP session structures and + multiplexing methods are suitable can be found in the memo on + Guidelines for using the Multiplexing Features of RTP + [I-D.westerlund-avtcore-multiplex-architecture]. + 4.5. RTP and RTCP Multiplexing Historically, RTP and RTCP have been run on separate transport layer addresses (e.g., two UDP ports for each RTP session, one port for RTP and one port for RTCP). With the increased use of Network Address/ Port Translation (NAPT) this has become problematic, since maintaining multiple NAT bindings can be costly. It also complicates firewall administration, since multiple ports need to be opened to allow RTP traffic. To reduce these costs and session set-up times, support for multiplexing RTP data packets and RTCP control packets on @@ -467,21 +471,21 @@ meant to stay unchanged, so that RTP endpoints can be uniquely identified and associated with their RTP media streams within a set of related RTP sessions. For proper functionality, each RTP endpoint needs to have a unique RTCP CNAME value. The RTP specification [RFC3550] includes guidelines for choosing a unique RTP CNAME, but these are not sufficient in the presence of NAT devices. 
In addition, long-term persistent identifiers can be problematic from a privacy viewpoint. Accordingly, support for generating short-term persistent RTCP CNAMEs following - [I-D.rescorla-avtcore-6222bis] is RECOMMENDED. + [I-D.ietf-avtcore-6222bis] is RECOMMENDED. A WebRTC end-point MUST support reception of any CNAME that matches the syntax limitations specified by the RTP specification [RFC3550] and cannot assume that any CNAME will be chosen according to the form suggested above. 5. WebRTC Use of RTP: Extensions There are a number of RTP extensions that are either needed to obtain full functionality, or extremely useful to improve on the baseline @@ -546,24 +550,24 @@ understand and react to this feedback message since it greatly improves the user experience when using centralised mixer-based conferencing; support for sending the FIR message is OPTIONAL. 5.1.2. Picture Loss Indication (PLI) The Picture Loss Indication is defined in Section 6.3.1 of the RTP/AVPF profile [RFC4585]. It is used by a receiver to tell the sending encoder that it lost the decoder context and would like to have it repaired somehow. This is semantically different from the Full Intra - Request above as there there could be multiple ways to fulfil the - request. It is REQUIRED that WebRTC senders understand and react to - this feedback message as a loss tolerance mechanism; receivers MAY - send PLI messages. + Request above as there could be multiple ways to fulfil the request. + It is REQUIRED that WebRTC senders understand and react to this + feedback message as a loss tolerance mechanism; receivers MAY send + PLI messages. 5.1.3. Slice Loss Indication (SLI) The Slice Loss Indication is defined in Section 6.3.2 of the RTP/AVPF profile [RFC4585]. It is used by a receiver to tell the encoder that it has detected the loss or corruption of one or more consecutive macro blocks, and would like to have these repaired somehow. 
The use of this feedback message is OPTIONAL as a loss tolerance mechanism. 5.1.4. Reference Picture Selection Indication (RPSI) @@ -793,21 +797,42 @@ media codecs provides upper- and lower-bounds on the supported bit- rates that the application can utilise to provide useful quality, and the packetization choices that exist. In addition, the signalling channel can establish maximum media bit-rate boundaries using the SDP "b=AS:" or "b=CT:" lines, and the RTP/AVPF Temporary Maximum Media Stream Bit Rate (TMMBR) Requests (see Section 5.1.6 of this memo). The combination of media codec choice and signalled bandwidth limits SHOULD be used to limit traffic based on known bandwidth limitations, for example the capacity of the edge links, to the extent possible. -7.2. RTCP Limitations for Congestion Control +7.2. RTCP Extensions for Congestion Control + + As described in Section 5.1.6, the Temporary Maximum Media Stream Bit + Rate (TMMBR) request is supported by WebRTC senders. This request + can be used by a media receiver to impose limitations on the media + sender based on the receiver's determined bit-rate limitations, to + provide a limited means of congestion control. + + (tbd: What other RTP/RTCP extensions are needed?) + + With proprietary congestion control algorithms issues can arise when + different algorithms and implementations interact in a communication + session. If the different implementations have made different + choices in regards to the type of adaptation, for example one sender + based, and one receiver based, then one could end up in situation + where one direction is dual controlled, when the other direction is + not controlled. + + (tbd: How to ensure that both paths and sender and receiver based + solutions can interact) + +7.3. 
RTCP Limitations for Congestion Control Experience with the congestion control algorithms of TCP [RFC5681], TFRC [RFC5348], and DCCP [RFC4341], [RFC4342], [RFC4828], has shown that feedback on packet arrivals needs to be sent roughly once per round trip time. We note that the real-time media traffic might not have to adapt to changing path conditions as rapidly as needed for the elastic applications TCP was designed for, but frequent feedback is still needed to allow the congestion control algorithm to track the path dynamics. @@ -830,21 +855,21 @@ In group communication, the share of RTCP bandwidth needs to be shared by all group members, reducing the capacity and thus the reporting frequency per node. Example: assuming 512 kbit/s video yields 3200 bytes/s RTCP bandwidth, split across two entities in a point-to-point session. An endpoint could thus send a report of 100 bytes about every 70ms or for every other frame in a 30 fps video. -7.3. Congestion Control Interoperability With Legacy Systems +7.4. Congestion Control Interoperability With Legacy Systems There are legacy implementations that do not implement RTCP, and hence do not provide any congestion feedback. Congestion control cannot be performed with these end-points. WebRTC implementations that need to interwork with such end-points MUST limit their transmission to a low rate, equivalent to a VoIP call using a low bandwidth codec, that is unlikely to cause any significant congestion. When interworking with legacy implementations that support RTCP using @@ -959,25 +984,20 @@ These parameters are often expressed in SDP messages conveyed within an offer/answer exchange. RTP does not depend on SDP or on the offer/answer model, but does require all the necessary parameters to be agreed upon, and provided to the RTP implementation. 
We note that in the WebRTC context it will depend on the signalling model and API how these parameters need to be configured but they will need to be either set in the API or explicitly signalled between the peers. 11. WebRTC API Considerations - The following sections describe how the WebRTC API features map onto - the RTP mechanisms described in this memo. - -11.1. API MediaStream to RTP Mapping - The WebRTC API and its media function have the concept of a WebRTC MediaStream that consists of zero or more tracks. A track is an individual stream of media from any type of media source like a microphone or a camera, but also conceptual sources, like an audio mix or a video composition, are possible. The tracks within a WebRTC MediaStream are expected to be synchronized. A track corresponds to the media received with one particular SSRC. There might be additional SSRCs associated with that SSRC, like for RTP retransmission or Forward Error Correction. However, one SSRC @@ -993,63 +1013,56 @@ which WebRTC MediaStreams a given SSRC is associated with at the signalling level. A proposal for how the binding between WebRTC MediaStreams and SSRC can be done is specified in "Cross Session Stream Identification in the Session Description Protocol" [I-D.alvestrand-rtcweb-msid]. (tbd: This text needs to be improved and achieved consensus on. Interim meeting in June 2012 shows large differences in opinions.) -12. RTP Implementation Considerations + (tbd: It is an open question whether these considerations are best + discussed in this draft, in the W3C WebRTC API spec, or elsewhere. - The following provide some guidance on the implementation of the RTP - features described in this memo. +12. RTP Implementation Considerations - This section discusses RTP functionality that is part of the RTP - standard, needed by decisions made, or to enable use cases raised and - their motivations. This discussion is from an WebRTC end-point - perspective. 
It will occasionally talk about central nodes, but as - this specification is for an end-point, this is where the focus lies. - For more discussion on the central nodes and details about RTP - topologies please see Appendix A. + The following discussion provides some guidance on the implementation + of the RTP features described in this memo. The focus is on a WebRTC + end-point implementation perspective, and while some mention is made + of the behaviour of middleboxes, that is not the focus of this memo. - The section will touch on the relation with certain RTP/RTCP - extensions, but will focus on the RTP core functionality. The - definition of what functionalities and the level of requirement on - implementing it is defined in Section 2. +12.1. RTP Sessions and PeerConnections -12.1. RTP Sessions and PeerConnection + An RTP session is an association among RTP nodes, which have a single + shared SSRC space. An RTP session can include a large number of end- + points and nodes, each sourcing, sinking, manipulating, or reporting + on the RTP media streams being sent within the RTP session. - An RTP session is an association among RTP nodes, which have one - common SSRC space. An RTP session can include any number of end- - points and nodes sourcing, sinking, manipulating or reporting on the - RTP media streams being sent within the RTP session. A - PeerConnection being a point-to-point association between an end- - point and another node. That peer node can be both an end-point or - centralized processing node of some type; thus, the RTP session can - terminate immediately on the far end of the PeerConnection, but it - might also continue as further discussed below in Multiparty - (Section 12.3) and Multiple RTP End-points (Section 12.7). + A PeerConnection is a point-to-point association between an end-point + and some other peer node. That peer node can be either an end-point + or a centralized processing node of some type. 
Hence, an RTP session + can terminate immediately at the far end of a PeerConnection, or it + might continue as further discussed below for multiparty sessions + (Section 12.3) and sessions with multiple end points (Section 12.7). - A PeerConnection can contain one or more RTP session depending on how - it is setup and how many UDP flows it uses. A common usage has been - to have one RTP session per media type, e.g. one for audio and one - for video, each sent over different UDP flows. However, the default - usage in WebRTC will be to use one RTP session for all media types. - This usage then uses only one UDP flow, as also RTP and RTCP - multiplexing is mandated (Section 4.5). However, for legacy - interworking and network prioritization (Section 12.9) based on - flows, a WebRTC end-point needs to support a mode of operation where - one RTP session per media type is used. Currently, each RTP session - has to use its own UDP flow. Discussions are ongoing if a solution - enabling multiple RTP sessions over a single UDP flow, see + A PeerConnection can contain one or more RTP sessions, depending on + how it is set up, and how many UDP flows it uses. A common usage has + been to have one RTP session per media type, e.g. one for audio and + one for video, each sent over a different UDP flow. However, the + default usage in WebRTC will be to use one RTP session for all media + types, with RTP and RTCP multiplexing (Section 4.5) also mandated. + This RTP session then uses only one UDP flow. However, for legacy + interworking and flow-based network prioritization (Section 12.9), a + WebRTC end-point needs to support a mode of operation where one RTP + session per media type is used. Currently, each RTP session has to + use its own UDP flow in this case, however it might be possible to + multiplex several RTP sessions over a single UDP flow, see Section 4.4. 
The multi-unicast- or mesh-based multi-party topology (Figure 1) is a good example for this section as it concerns the relation between RTP sessions and PeerConnections. In this topology, each participant sends individual unicast RTP/UDP/IP flows to each of the other participants using independent PeerConnections in a full mesh. This topology has the benefit of not requiring central nodes. The downside is that it increases the used bandwidth at each sender by requiring one copy of the RTP media streams for each participant that @@ -1187,42 +1200,64 @@ in a multiparty conference create new sources and signals those towards the central server. In cases where the SSRC/CSRC are propagated between the different end-points from the central node collisions can occur. Another scenario is when the central node manages to connect an end-point's PeerConnection to another PeerConnection the end-point already has, thus forming a loop where the end-point will receive its own traffic. While this is clearly considered a bug, it is important that the end-point is able to recognise and handle the case when it - occurs. + occurs. This case becomes even more problematic when media mixers, + and so on, are involved, where the stream received is a different + stream but still contains this client's input. -12.5. Contributing Sources + These SSRC/CSRC collisions can only be handled on RTP level as long + as the same RTP session is extended across multiple PeerConnections + by a RTP middlebox. To resolve the more generic case where multiple + PeerConnections are interconnected, then identification of the media + source(s) part of a MediaStreamTrack being propagated across multiple + interconnected PeerConnection needs to be preserved across these + interconnections. - Contributing Sources (CSRC) is a functionality in the RTP header that - allows an RTP node to combine media packets from multiple sources - into one and to identify which sources yielded the result. 
For - WebRTC end-points, supporting contributing sources is trivial. The - set of CSRCs is provided in a given RTP packet. This information can - then be exposed to the applications using some form of API, possibly - a mapping back into WebRTC MediaStream identities to avoid having to - expose two name spaces and the handling of SSRC collision handling to - the JavaScript. +12.5. Contributing Sources and the CSRC List - (tbd: does the API need to provide the ability to add a CSRC list to - an outgoing packet? this is only useful if the sender is mixing - content) + RTP allows a mixer, or other RTP-layer middlebox, to combine media + flows from multiple sources to form a new media flow. The RTP data + packets in that new flow will include a Contributing Source (CSRC) + list, indicating which original SSRCs contributed to the combined + packet. As described in Section 4.1, implementations need to support + reception of RTP data packets containing a CSRC list and RTCP packets + that relate to sources present in the CSRC list. - There are also at least one extension that depends on the CSRC list - being used: the Mixer-to-client audio level [RFC6465], which enhances - the information provided by the CSRC to actual energy levels for - audio for each contributing source. + The CSRC list can change on a packet-by-packet basis, depending on + the mixing operation being performed. Knowledge of what sources + contributed to a particular RTP packet can be important if the user + interface indicates which participants are active in the session. + Changes in the CSRC list included in packets needs to be exposed to + the WebRTC application using some API, if the application is to be + able to track changes in session participation. It is desirable to + map CSRC values back into WebRTC MediaStream identities as they cross + this API, to avoid exposing the SSRC/CSRC name space to JavaScript + applications. 
+ + If the mixer-to-client audio level extension [RFC6465] is being used + in the session (see Section 5.2.3), the information in the CSRC list + is augmented by audio level information for each contributing source. + This information can usefully be exposed in the user interface. + + This memo does not require implementations to be able to add a CSRC + list to outgoing RTP packets. It is expected that the any CSRC list + will be added by a mixer or other middlebox that performs in-network + processing of RTP streams. If there is a desire to allow end-system + mixing, the requirement in Section 4.1 will need to be updated to + support setting the CSRC list in outgoing RTP data packets. 12.6. Media Synchronization When an end-point sends media from more than one media source, it needs to consider if (and which of) these media sources are to be synchronized. In RTP/RTCP, synchronisation is provided by having a set of RTP media streams be indicated as coming from the same synchronisation context and logical end-point by using the same CNAME identifier. @@ -1311,221 +1346,244 @@ 12.9. Differentiated Treatment of Flows There are use cases for differentiated treatment of RTP media streams. Such differentiation can happen at several places in the system. First of all is the prioritization within the end-point sending the media, which controls, both which RTP media streams that will be sent, and their allocation of bit-rate out of the current available aggregate as determined by the congestion control. + It is expected that the WebRTC API will allow the application to + indicate relative priorities for different MediaStreamTracks. These + priorities can then be used to influence the local RTP processing, + especially when it comes to congestion control response in how to + divide the available bandwidth between the RTP flows. 
Any changes in + relative priority will also need to be considered for RTP flows that + are associated with the main RTP flows, such as RTP retransmission + streams and FEC. The importance of such associated RTP traffic flows + is dependent on the media type and codec used, with regard to how + robust that codec is to packet loss. However, a default policy might + be to use the same priority for associated RTP flows as for the + primary RTP flow. + Secondly, the network can prioritize packet flows, including RTP media streams. Typically, differential treatment includes two steps, the first being identifying whether an IP packet belongs to a class that has to be treated differently, the second the actual mechanism to prioritize packets. This is done according to three methods: DiffServ: The end-point marks a packet with a DiffServ code point to indicate to the network that the packet belongs to a particular class. Flow based: Packets that need to be given a particular treatment are identified using a combination of IP address and port. Deep Packet Inspection: A network classifier (DPI) inspects the packet and tries to determine if the packet represents a particular application and type that is to be prioritized. - With the exception of DiffServ both flow based and DPI have issues - with running multiple media types and flows on a single UDP flow, - especially when combined with data transport (SCTP/DTLS). DPI has - issues because multiple types of flows are aggregated and thus it - becomes more difficult to analyse them. The flow-based - differentiation will provide the same treatment to all packets within - the flow, i.e., relative prioritization is not possible. Moreover, - if the resources are limited it might not be possible to provide - differential treatment compared to best-effort for all the flows in a - WebRTC application.
- - When flow-based differentiation is available the WebRTC application - needs to know about it so that it can provide the separation of the - RTP media streams onto different UDP flows to enable a more granular - usage of flow based differentiation. + Flow-based differentiation will provide the same treatment to all + packets within a flow, i.e., relative prioritization is not possible. + Moreover, if the resources are limited it might not be possible to + provide differential treatment compared to best-effort for all the + flows in a WebRTC application. When flow-based differentiation is + available the WebRTC application needs to know about it so that it + can provide the separation of the RTP media streams onto different + UDP flows to enable a more granular usage of flow-based + differentiation. That way, it can at least provide different + prioritization of audio and video, if desired by the application. DiffServ assumes that either the end-point or a classifier can mark the packets with an appropriate DSCP so that the packets are treated according to that marking. If the end-point is to mark the traffic, two requirements arise in the WebRTC context: 1) The WebRTC application or browser has to know which DSCP to use and that it can use them on some set of RTP media streams. 2) The information needs to be propagated to the operating system when transmitting the packet. These issues are discussed in DSCP and other packet markings for RTCWeb QoS [I-D.ietf-rtcweb-qos]. - tbd: The model for providing differentiated treatment needs to be - evolved. Most of this is not the responsibility of this memo. - However, this memo could include: - - 1. How can the application can prioritize MediaStreamTracks - differently in the API? + For packet-based marking schemes it would be possible in this context + to mark individual RTP packets differently based on the relative + priority of the RTP payload.
For example, video codecs that have I, P, + and B pictures could prioritise any payloads carrying only B frames + less, as these are less damaging to lose. But as a default policy, all + RTP packets related to a media stream ought to be provided with the + same prioritization. - 2. How MediaStreamTrack prioritization maps to the RTP level, and - what type of marking behaviour can occur on the RTP media stream - and its datagram? + It is also important to consider how RTCP packets associated with a + particular RTP media flow need to be marked. RTCP compound packets + with Sender Reports (SR) ought to be marked with the same priority + as the RTP media flow itself, so the RTCP-based round-trip time (RTT) + measurements are done using the same flow priority as the media flow + experiences. RTCP compound packets containing RR packets ought to be + sent with the priority used by the majority of the RTP media flows + reported on. RTCP packets containing time-critical feedback packets + can use higher priority to improve the timeliness and likelihood of + delivery of such feedback. 13. Open Issues This section contains a summary of the open issues and to-do items noted in the document: 1. Need to add references to the RTP payload format for the Video Codec chosen in Section 4.3. 2. The methods and solutions for RTP multiplexing over a single transport are not yet finalized in Section 4.4. 3. RTP congestion control algorithms will probably require some feedback information to be conveyed in RTCP. Are the tools that are mandated by this memo sufficient, or do we need additional - information? + information (see Section 7.2)? 4. RTP congestion control could be implemented using either a sender-based algorithm or a receiver-based algorithm. To ensure interoperability, does this memo need to mandate which end is in - charge of congestion control for a path? + charge of congestion control for a path (see Section 7.2)? 5.
Still open if any RTCP XR performance metrics are needed, as discussed in Section 8. 6. The API mapping to RTP level concepts has to be agreed and documented in Section 11. 7. An open question if any requirements are needed to agree and limit the number of simultaneously used media sources (SSRCs) within an RTP session. See Section 12.2. - 8. Is an API needed for expressing any application level media - mixing of an RTP media stream so that the correct CSRC list can - be set as discussed in Section 12.5? - - 9. The method for achieving simulcast of a media source has to be + 8. The method for achieving simulcast of a media source has to be decided as discussed in Section 12.8. - 10. Possible documentation of what support for differentiated + 9. Possible documentation of what support for differentiated treatment that are needed on RTP level as the API and the network level specification matures as discussed in Section 12.9. - 11. Editing of Appendix A to remove redundancy between this and the + 10. Editing of Appendix A to remove redundancy between this and the update of RTP Topologies [I-D.westerlund-avtcore-rtp-topologies-update]. 14. IANA Considerations This memo makes no request of IANA. Note to RFC Editor: this section is to be removed on publication as an RFC. 15. Security Considerations - The security considerations for the WebRTC framework are described in - [I-D.ietf-rtcweb-security]. The overall security architecture for - WebRTC is described in [I-D.ietf-rtcweb-security-arch]. + The overall security architecture for WebRTC is described in + [I-D.ietf-rtcweb-security-arch], and security considerations for the + WebRTC framework are described in [I-D.ietf-rtcweb-security]. These + considerations apply to this memo also. The security considerations of the RTP specification, the RTP/SAVPF profile, and the various RTP/RTCP extensions and RTP payload formats that form the complete protocol suite described in this memo apply. 
We do not believe there are any new security considerations resulting from the combination of these various protocol extensions. The Extended Secure RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback [RFC5124] (RTP/SAVPF) provides handling of fundamental issues by offering confidentiality, integrity and partial source authentication. A mandatory to implement media security solution is (tbd). - tbd: Privacy concerns, and the generation of untraceable CNAMEs, are - under discussion. + RTCP packets convey a Canonical Name (CNAME) identifier that is used + to associate media flows that need to be synchronised across related + RTP sessions. Inappropriate choice of CNAME values can be a privacy + concern, since long-term persistent CNAME identifiers can be used to + track users across multiple WebRTC calls. Section 4.9 of this memo + provides guidelines for generation of untraceable CNAME values that + alleviate this risk. The guidelines in [RFC6562] apply when using variable bit rate (VBR) - audio codecs, e.g., Opus or the Mixer audio level header extensions. + audio codecs such as Opus (see Section 4.3 for discussion of mandated + audio codecs). These guidelines in [RFC6562] also apply, but are of + lesser importance, when using the client-to-mixer audio level header + extensions (Section 5.2.2) or the mixer-to-client audio level header + extensions (Section 5.2.3). 16. Acknowledgements The authors would like to thank Harald Alvestrand, Cary Bran, Charles Eckel and Cullen Jennings for valuable feedback. 17. References 17.1. Normative References - [I-D.holmberg-mmusic-sdp-bundle-negotiation] - Holmberg, C. and H. Alvestrand, "Multiplexing Negotiation - Using Session Description Protocol (SDP) Port Numbers", - draft-holmberg-mmusic-sdp-bundle-negotiation-00 (work in - progress), October 2011. + [I-D.ietf-avtcore-6222bis] + Rescorla, E. and A. 
Begen, "Guidelines for Choosing RTP + Control Protocol (RTCP) Canonical Names (CNAMEs)", + draft-ietf-avtcore-6222bis-00 (work in progress), + December 2012. - [I-D.ietf-avtcore-rtp-circuit-breakers] - Perkins, C. and V. Singh, "RTP Congestion Control: Circuit - Breakers for Unicast Sessions", - draft-ietf-avtcore-rtp-circuit-breakers-00 (work in + [I-D.ietf-avtcore-avp-codecs] + Terriberry, T., "Update to Recommended Codecs for the AVP + RTP Profile", draft-ietf-avtcore-avp-codecs-00 (work in + progress), January 2013. + + [I-D.ietf-avtcore-multi-media-rtp-session] + Westerlund, M., Perkins, C., and J. Lennox, "Multiple + Media Types in an RTP Session", + draft-ietf-avtcore-multi-media-rtp-session-01 (work in progress), October 2012. + [I-D.ietf-avtcore-rtp-circuit-breakers] + Perkins, C. and V. Singh, "Multimedia Congestion Control: + Circuit Breakers for Unicast RTP Sessions", + draft-ietf-avtcore-rtp-circuit-breakers-02 (work in + progress), February 2013. + [I-D.ietf-avtcore-srtp-encrypted-header-ext] Lennox, J., "Encryption of Header Extensions in the Secure Real-Time Transport Protocol (SRTP)", - draft-ietf-avtcore-srtp-encrypted-header-ext-02 (work in - progress), July 2012. + draft-ietf-avtcore-srtp-encrypted-header-ext-05 (work in + progress), February 2013. [I-D.ietf-avtext-multiple-clock-rates] Petit-Huguenin, M. and G. Zorn, "Support for Multiple Clock Rates in an RTP Session", - draft-ietf-avtext-multiple-clock-rates-06 (work in - progress), October 2012. + draft-ietf-avtext-multiple-clock-rates-08 (work in + progress), November 2012. + + [I-D.ietf-mmusic-sdp-bundle-negotiation] + Holmberg, C., Alvestrand, H., and C. Jennings, + "Multiplexing Negotiation Using Session Description + Protocol (SDP) Port Numbers", + draft-ietf-mmusic-sdp-bundle-negotiation-03 (work in + progress), February 2013. [I-D.ietf-rtcweb-audio] Valin, J. and C. 
Bran, "WebRTC Audio Codec and Processing - Requirements", draft-ietf-rtcweb-audio-00 (work in - progress), September 2012. + Requirements", draft-ietf-rtcweb-audio-01 (work in + progress), November 2012. [I-D.ietf-rtcweb-overview] Alvestrand, H., "Overview: Real Time Protocols for Brower- - based Applications", draft-ietf-rtcweb-overview-04 (work - in progress), June 2012. + based Applications", draft-ietf-rtcweb-overview-06 (work + in progress), February 2013. [I-D.ietf-rtcweb-security] Rescorla, E., "Security Considerations for RTC-Web", - draft-ietf-rtcweb-security-03 (work in progress), - June 2012. + draft-ietf-rtcweb-security-04 (work in progress), + January 2013. [I-D.ietf-rtcweb-security-arch] Rescorla, E., "RTCWEB Security Architecture", - draft-ietf-rtcweb-security-arch-05 (work in progress), - October 2012. - - [I-D.lennox-rtcweb-rtp-media-type-mux] - Rosenberg, J. and J. Lennox, "Multiplexing Multiple Media - Types In a Single Real-Time Transport Protocol (RTP) - Session", draft-lennox-rtcweb-rtp-media-type-mux-00 (work - in progress), October 2011. - - [I-D.rescorla-avtcore-6222bis] - Rescorla, E. and A. Begen, "Guidelines for Choosing RTP - Control Protocol (RTCP) Canonical Names (CNAMEs)", - draft-rescorla-avtcore-6222bis-00 (work in progress), - October 2012. - - [I-D.terriberry-avp-codecs] - Terriberry, T., "Update to Recommended Codecs for the AVP - RTP Profile", draft-terriberry-avp-codecs-00 (work in - progress), August 2012. + draft-ietf-rtcweb-security-arch-06 (work in progress), + January 2013. [I-D.westerlund-avtcore-transport-multiplexing] Westerlund, M. and C. Perkins, "Multiple RTP Sessions on a Single Lower-Layer Transport", draft-westerlund-avtcore-transport-multiplexing-04 (work in progress), October 2012. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. @@ -1613,40 +1670,40 @@ (work in progress), October 2011. 
[I-D.ietf-rtcweb-qos] Dhesikan, S., Druta, D., Jones, P., and J. Polk, "DSCP and other packet markings for RTCWeb QoS", draft-ietf-rtcweb-qos-00 (work in progress), October 2012. [I-D.ietf-rtcweb-use-cases-and-requirements] Holmberg, C., Hakansson, S., and G. Eriksson, "Web Real- Time Communication Use-cases and Requirements", - draft-ietf-rtcweb-use-cases-and-requirements-09 (work in - progress), June 2012. + draft-ietf-rtcweb-use-cases-and-requirements-10 (work in + progress), December 2012. [I-D.jesup-rtp-congestion-reqs] Jesup, R. and H. Alvestrand, "Congestion Control Requirements For Real Time Media", draft-jesup-rtp-congestion-reqs-00 (work in progress), March 2012. [I-D.westerlund-avtcore-multiplex-architecture] Westerlund, M., Burman, B., Perkins, C., and H. Alvestrand, "Guidelines for using the Multiplexing Features of RTP", draft-westerlund-avtcore-multiplex-architecture-02 (work in progress), July 2012. [I-D.westerlund-avtcore-rtp-topologies-update] Westerlund, M. and S. Wenger, "RTP Topologies", - draft-westerlund-avtcore-rtp-topologies-update-01 (work in - progress), October 2012. + draft-westerlund-avtcore-rtp-topologies-update-02 (work in + progress), February 2013. [RFC4341] Floyd, S. and E. Kohler, "Profile for Datagram Congestion Control Protocol (DCCP) Congestion Control ID 2: TCP-like Congestion Control", RFC 4341, March 2006. [RFC4342] Floyd, S., Kohler, E., and J. Padhye, "Profile for Datagram Congestion Control Protocol (DCCP) Congestion Control ID 3: TCP-Friendly Rate Control (TFRC)", RFC 4342, March 2006.