--- 1/draft-ietf-rtcweb-rtp-usage-01.txt 2012-03-12 23:14:02.591087972 +0100 +++ 2/draft-ietf-rtcweb-rtp-usage-02.txt 2012-03-12 23:14:02.651087416 +0100 @@ -1,26 +1,26 @@ Network Working Group C. Perkins Internet-Draft University of Glasgow Intended status: Standards Track M. Westerlund -Expires: May 3, 2012 Ericsson +Expires: September 13, 2012 Ericsson J. Ott Aalto University - October 31, 2011 + March 12, 2012 Web Real-Time Communication (WebRTC): Media Transport and Use of RTP - draft-ietf-rtcweb-rtp-usage-01 + draft-ietf-rtcweb-rtp-usage-02 Abstract - The Web Real-Time Communication (WebRTC) framework aims to provide - support for direct interactive rich communication using audio, video, + The Web Real-Time Communication (WebRTC) framework provides support + for direct interactive rich communication using audio, video, text, collaboration, games, etc. between two peers' web-browsers. This memo describes the media transport aspects of the WebRTC framework. It specifies how the Real-time Transport Protocol (RTP) is used in the WebRTC context, and gives requirements for which RTP features, profiles, and extensions need to be supported. Status of this Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. @@ -28,136 +28,131 @@ Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on May 3, 2012. + This Internet-Draft will expire on September 13, 2012. 
Copyright Notice - Copyright (c) 2011 IETF Trust and the persons identified as the + Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 3. Media Transport in WebRTC . . . . . . . . . . . . . . . . . . 4 - 3.1. Expected Topologies . . . . . . . . . . . . . . . . . . . 4 - 3.2. Requirements from RTP . . . . . . . . . . . . . . . . . . 7 - 3.2.1. Signalling for RTP sessions . . . . . . . . . . . . . 7 - 3.2.2. (Lack of) Signalling for Payload Format Changes . . . 8 + 3. Discussion and Rationale . . . . . . . . . . . . . . . . . . . 4 + 3.1. Supported RTP Topologies . . . . . . . . . . . . . . . . . 4 + 3.2. Signalling Requirements . . . . . . . . . . . . . . . . . 7 4. WebRTC Use of RTP: Core Protocols . . . . . . . . . . . . . . 8 4.1. RTP and RTCP . . . . . . . . . . . . . . . . . . . . . . . 8 - 4.2. Choice of RTP Profile . . . . . . . . . . . . . . . . . . 9 - 4.3. Choice of RTP Payload Formats . . . . . . . . . . . . . . 10 + 4.2. Choice of RTP Profile . . . . . . . . . . . . . . . . . . 8 + 4.3. Choice of RTP Payload Formats . . . . . . . . . . . . . . 9 4.4. RTP Session Multiplexing . . . . . . . . . . . . . . . . . 10 - 5. WebRTC Use of RTP: Optimisations . . . . . . . . . . . . . . . 10 - 5.1. RTP and RTCP Multiplexing . . . . . . . 
. . . . . . . . . 10 - 5.2. Reduced Size RTCP . . . . . . . . . . . . . . . . . . . . 11 - 5.3. Symmetric RTP/RTCP . . . . . . . . . . . . . . . . . . . . 11 - 5.4. Generation of the RTCP Canonical Name (CNAME) . . . . . . 12 - 6. WebRTC Use of RTP: Extensions . . . . . . . . . . . . . . . . 12 - 6.1. Conferencing Extensions . . . . . . . . . . . . . . . . . 12 - 6.1.1. Full Intra Request . . . . . . . . . . . . . . . . . . 13 - 6.1.2. Picture Loss Indication . . . . . . . . . . . . . . . 13 - 6.1.3. Slice Loss Indication . . . . . . . . . . . . . . . . 13 - 6.1.4. Reference Picture Selection Indication . . . . . . . . 14 - 6.1.5. Temporary Maximum Media Stream Bit Rate Request . . . 14 - 6.2. Header Extensions . . . . . . . . . . . . . . . . . . . . 14 - 6.3. Rapid Synchronisation Extensions . . . . . . . . . . . . . 15 - 6.4. Mixer Audio Level Extensions . . . . . . . . . . . . . . . 15 - 6.4.1. Client to Mixer Audio Level . . . . . . . . . . . . . 15 - 6.4.2. Mixer to Client Audio Level . . . . . . . . . . . . . 15 - 7. WebRTC Use of RTP: Improving Transport Robustness . . . . . . 16 - 7.1. Retransmission . . . . . . . . . . . . . . . . . . . . . . 16 - 7.2. Forward Error Correction (FEC) . . . . . . . . . . . . . . 17 - 7.2.1. Basic Redundancy . . . . . . . . . . . . . . . . . . . 17 - 7.2.2. Block Based FEC . . . . . . . . . . . . . . . . . . . 19 - 7.2.3. Recommendations for FEC . . . . . . . . . . . . . . . 20 - 8. WebRTC Use of RTP: Rate Control and Media Adaptation . . . . . 20 - 8.1. Rate Control Requirements . . . . . . . . . . . . . . . . 21 - 8.2. RTCP Limiations . . . . . . . . . . . . . . . . . . . . . 21 - 8.3. Legacy Interop Limitations . . . . . . . . . . . . . . . . 22 - - 9. WebRTC Use of RTP: Performance Monitoring . . . . . . . . . . 23 - 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 23 - 11. Security Considerations . . . . . . . . . . . . . . . . . . . 24 - 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 
24 - 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 24 - 13.1. Normative References . . . . . . . . . . . . . . . . . . . 24 - 13.2. Informative References . . . . . . . . . . . . . . . . . . 27 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 28 + 4.5. RTP and RTCP Multiplexing . . . . . . . . . . . . . . . . 10 + 4.6. Reduced Size RTCP . . . . . . . . . . . . . . . . . . . . 10 + 4.7. Symmetric RTP/RTCP . . . . . . . . . . . . . . . . . . . . 11 + 4.8. Generation of the RTCP Canonical Name (CNAME) . . . . . . 11 + 5. WebRTC Use of RTP: Extensions . . . . . . . . . . . . . . . . 12 + 5.1. Conferencing Extensions . . . . . . . . . . . . . . . . . 12 + 5.1.1. Full Intra Request . . . . . . . . . . . . . . . . . . 13 + 5.1.2. Picture Loss Indication . . . . . . . . . . . . . . . 13 + 5.1.3. Slice Loss Indication . . . . . . . . . . . . . . . . 13 + 5.1.4. Reference Picture Selection Indication . . . . . . . . 13 + 5.1.5. Temporary Maximum Media Stream Bit Rate Request . . . 13 + 5.2. Header Extensions . . . . . . . . . . . . . . . . . . . . 14 + 5.2.1. Rapid Synchronisation . . . . . . . . . . . . . . . . 14 + 5.2.2. Client to Mixer Audio Level . . . . . . . . . . . . . 14 + 5.2.3. Mixer to Client Audio Level . . . . . . . . . . . . . 15 + 6. WebRTC Use of RTP: Improving Transport Robustness . . . . . . 15 + 6.1. Retransmission . . . . . . . . . . . . . . . . . . . . . . 15 + 6.2. Forward Error Correction (FEC) . . . . . . . . . . . . . . 16 + 6.2.1. Basic Redundancy . . . . . . . . . . . . . . . . . . . 17 + 6.2.2. Block Based FEC . . . . . . . . . . . . . . . . . . . 18 + 6.2.3. Recommendations for FEC . . . . . . . . . . . . . . . 19 + 7. WebRTC Use of RTP: Rate Control and Media Adaptation . . . . . 19 + 7.1. Congestion Control Requirements . . . . . . . . . . . . . 20 + 7.2. RTCP Limiations . . . . . . . . . . . . . . . . . . . . . 20 + 7.3. Legacy Interop Limitations . . . . . . . . . . . . . . . . 21 + 8. 
WebRTC Use of RTP: Performance Monitoring . . . . . . . . . . 22 + 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22 + 10. Security Considerations . . . . . . . . . . . . . . . . . . . 22 + 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 23 + 12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 23 + 12.1. Normative References . . . . . . . . . . . . . . . . . . . 23 + 12.2. Informative References . . . . . . . . . . . . . . . . . . 26 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 27 1. Introduction - The Real-time Transport Protocol (RTP) [RFC3550] was designed to - provide a framework for delivery of audio and video teleconferencing - data and other real-time media applications. This memo describes how - RTP is to be used in the context of the Web Real-Time Communication - (WebRTC) framework, a new activity that aims to provide support for - direct, interactive, and rich communication using audio, video, - collaboration, games, etc. between two peers' web-browsers. + The Real-time Transport Protocol (RTP) [RFC3550] provides a framework + for delivery of audio and video teleconferencing data and other real- + time media applications. Previous work has defined the RTP protocol, + plus numerous profiles, payload formats, and other extensions. When + combined with appropriate signalling, these form the basis for many + teleconferencing systems. - Previous work in the IETF Audio/Video Transport Working Group, and - it's successors, has been about providing a framework for real-time - multimedia transport, but has not specified how the pieces of this - framework should be combined. This is because the choice of building - blocks and protocol features can really only be done in the context - of some application. This memo proposes a set of RTP features and - extensions to be implemented by applications that fit within the - WebRTC application context. 
This includes applications such as voice - over IP (VoIP), video teleconferencing, and on-demand multimedia - streaming, delivered in the context of the WebRTC browser-based - infrastructure. + The Web Real-Time Communication (WebRTC) framework is a new activity + that provides support for direct, interactive, and rich communication + using audio, video, collaboration, games, etc., between two peers' + web-browsers. This memo describes how the RTP framework is used in + the WebRTC context. It proposes a baseline set of RTP features that + should be implemented by all WebRTC-aware browsers, along with some + suggested extensions for enhanced functionality. + + The WebRTC overview [I-D.ietf-rtcweb-overview] outlines the complete + WebRTC framework, of which this memo is a part. 2. Terminology This memo is structured into different topics. For each topic, one or several recommendations from the authors are given. When it comes to the importance of extensions, or the need for implementation support, we use three requirement levels to indicate the importance of the feature to the WebRTC specification: REQUIRED: Functionality that is absolutely needed to make the WebRTC solution work well, or functionality of low complexity that provides high value. RECOMMENDED: Should be included as it brings significant benefit, but the solution can potentially work without it. OPTIONAL: Something that is useful in some cases, but not always a benefit. -3. Media Transport in WebRTC +3. Discussion and Rationale -3.1. Expected Topologies +3.1. Supported RTP Topologies - As WebRTC is focused on peer to peer connections established from - clients in web browsers the following topologies further discussed in - RTP Topologies [RFC5117] are primarily considered. The topologies - are depicted and briefly explained here for ease of the reader. + RTP supports both unicast and group communication, with participants + being connected using a wide range of transport-layer topologies.
Some + of these topologies involve only the end-points, while others use RTP + translators and mixers to provide in-network processing. Properties + of some RTP topologies are discussed in [RFC5117], and we further + describe those expected to be useful for WebRTC in the following. +---+ +---+ | A |<------->| B | +---+ +---+ Figure 1: Point to Point The point to point topology (Figure 1) is going to be very common in single user to single user applications. @@ -243,54 +238,49 @@ To support a legacy end-point (B) that doesn't fulfil the requirements of WebRTC it is possible to insert a Translator (Figure 5) that takes on the role to ensure that from A's perspective B looks like a fully compliant end-point. Thus it is the combination of the Translator and B that looks like the end-point B. The intention is that the presence of the translator is transparent to A, however it is not certain that this is possible. Thus this case is included so that it can be discussed if any mechanism specified to be used for WebRTC results in such issues and how to handle them. -3.2. Requirements from RTP - - This section discusses some requirements RTP and RTCP [RFC3550] place - on their underlying transport protocol, the signalling channel, etc. - -3.2.1. Signalling for RTP sessions +3.2. Signalling Requirements - RTP is built with the assumption of an external to RTP/RTCP - signalling channel to configure the RTP sessions and its functions. + RTP is built with the assumption of an external signalling channel + that can be used to configure the RTP sessions and their features. The basic configuration of an RTP session consists of the following parameters: RTP Profile: The name of the RTP profile to be used in the session. The RTP/AVP [RFC3551] and RTP/AVPF [RFC4585] profiles can interoperate on a basic level, as can their secure variants RTP/SAVP [RFC3711] and RTP/SAVPF [RFC5124].
The secure variants of the profiles do not directly interoperate with the non-secure variants, due to the presence of additional header fields in addition to any cryptographic transformation of the packet content. Transport Information: Source and destination address(es) and ports for RTP and RTCP MUST be signalled for each RTP session. If RTP and RTCP multiplexing [RFC5761] is to be used, such that a single port is used for RTP and RTCP flows, this MUST be signalled (see - Section 5.1). If several RTP sessions are to be multiplexed onto + Section 4.5). If several RTP sessions are to be multiplexed onto a single transport layer flow, this MUST also be signalled (see Section 4.4). - RTP Payload Types, media formats, and media format parameters: The - mapping between media type names (and hence the RTP payload - formats to be used) and the RTP payload type numbers must be - signalled. Each media type may also have a number of media type - parameters that must also be signalled to configure the codec and - RTP payload format (the "a=fmtp:" line from SDP). + RTP Payload Types, media formats, and media format + parameters: The mapping between media type names (and hence the RTP + payload formats to be used) and the RTP payload type numbers must + be signalled. Each media type may also have a number of media + type parameters that must also be signalled to configure the codec + and RTP payload format (the "a=fmtp:" line from SDP). RTP Extensions: The RTP extensions one intends to use need to be agreed upon, including any parameters for each respective extension. At the very least, this will help avoid using bandwidth for features that the other end-point will ignore. But for certain mechanisms there is a requirement for this to happen, as interoperability failure otherwise results.
RTCP Bandwidth: Support for exchanging RTCP Bandwidth values to the end-points will be necessary, as described in "Session Description @@ -301,102 +291,96 @@ bandwidths may lead to failure to interoperate. These parameters are often expressed in SDP messages conveyed within an offer/answer exchange. RTP does not depend on SDP or on the offer/answer model, but does require all the necessary parameters to be agreed somehow, and provided to the RTP implementation. We note that in the RTCWEB context it will depend on the signalling model and API how these parameters need to be configured, but they will need to be either set in the API or explicitly signalled between the peers. -3.2.2. (Lack of) Signalling for Payload Format Changes - - As discussed in Section 3.2.1, the mapping between media type name, - and its associated RTP payload format, and the RTP payload type - number to be used for that format must be signalled as part of the - session setup. An endpoint may signal support for multiple media - formats, or multiple configurations of a single format, each using a - different RTP payload type number. If multiple formats are signalled - by an endpoint, that endpoint is REQUIRED to be prepared to receive - data encoded in any of those formats at any time (this is slightly - modified if several RTP sessions are multiplexed onto one transport - layer connection, such that an endpoint must be prepared for a source - to switch between formats of the same media type at any time; see - Section 4.4). RTP does not require advance signalling for changes - between formats that were signalled during the session setup. This - is needed for rapid rate adaptation. - 4. WebRTC Use of RTP: Core Protocols -4.1. RTP and RTCP - - The Real-time Transport Protocol (RTP) [RFC3550] is REQUIRED to be - implemented as the media transport protocol for WebRTC. RTP itself - comprises two parts: the RTP data transfer protocol, and the RTP - control protocol (RTCP).
RTCP is a fundamental and integral part of - the RTP protocol, and is REQUIRED to be implemented. - RTP and RTCP are flexible and extensible protocols that allow, on the one hand, choosing from a variety of building blocks and combining those to meet application needs, but on the other hand, offer the ability to create extensions where existing mechanisms are not sufficient. This memo requires that a number of RTP and RTCP extensions that have been shown to provide important functionality in the WebRTC context be implemented. It is possible that future extensions will be needed: several documents provide guidelines for the use and extension of RTP and RTCP, including Guidelines for Writers of RTP Payload Format Specifications [RFC2736] and Guidelines for Extending the RTP Control Protocol [RFC5968], and should be consulted before extending this memo. + The following sections describe core features of RTP, and core extensions, + that must be supported in all WebRTC implementations. + +4.1. RTP and RTCP + + The Real-time Transport Protocol (RTP) [RFC3550] is REQUIRED to be + implemented as the media transport protocol for WebRTC. RTP itself + comprises two parts: the RTP data transfer protocol, and the RTP + control protocol (RTCP). RTCP is a fundamental and integral part of + the RTP protocol, and is REQUIRED to be implemented. + 4.2. Choice of RTP Profile The complete specification of RTP for a particular application domain requires the choice of an RTP Profile. For WebRTC use, the "Extended Secure RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/SAVPF)" [RFC5124] is REQUIRED to be implemented. This builds on the basic RTP/AVP profile [RFC3551], the RTP/AVPF feedback profile [RFC4585], and the secure RTP/SAVP profile [RFC3711]. The RTP/AVPF part of RTP/SAVPF is required to get the improved RTCP timer model, which allows more flexible transmission of RTCP packets in response to events, rather than strictly according to bandwidth.
This also saves RTCP bandwidth and will commonly only use the full amount when there are a lot of events on which to send feedback. This functionality is needed to make use of the RTP conferencing - extensions discussed in Section 6.1. The improved RTCP timer model + extensions discussed in Section 5.1. The improved RTCP timer model defined by RTP/AVPF is backwards compatible with legacy systems that implement only the RTP/AVP profile, given some constraints on parameter configuration such as RTCP bandwidth and "trr-int". The RTP/SAVP part of RTP/SAVPF provides support for Secure RTP (SRTP) [RFC3711]. This provides media encryption, integrity protection, - replay protection and a limited form of source authentication. It - does not contain a specific keying mechanism, so that, and the set of - security transforms, will be required to be chosen. It is possible - that a security mechanism operating on a lower layer than RTP can be - used instead and that should be evaluated. However, the reasons for - the design of SRTP should be taken into consideration in that - discussion. A mandatory to implement media security mechanism - including keying must be required so that confidentiality, integrity - protection and source authentication of the media stream can be - provided when desired by the user. + replay protection and a limited form of source authentication. + + (tbd: There is ongoing discussion on what keying mechanism is to be + required, what are the mandated cryptographic transforms, and whether + fallback to plain RTP should be supported. This section needs to be + updated based on the results of that discussion.) 4.3. Choice of RTP Payload Formats (tbd: say something about the choice of RTP Payload Format for WebRTC. If there is a mandatory to implement set of codecs, this should reference them. In any case, it should reference a discussion of signalling for the choice of codec, once that discussion reaches closure.)
+ Endpoints may signal support for multiple media formats, or multiple + configurations of a single format, provided each uses a different RTP + payload type number. An endpoint that has signalled its support for + multiple formats is REQUIRED to accept data in any of those formats + at any time, unless it has previously signalled limitations on its + decoding capability (this is modified if several RTP sessions are to + be multiplexed onto one transport layer connection, such that an + endpoint must be prepared for a source to switch between formats of + the same media type at any time; see Section 4.4). To support rapid + rate adaptation, RTP does not require advance signalling for changes + between payload formats that were signalled during session setup. + 4.4. RTP Session Multiplexing An association amongst a set of participants communicating with RTP is known as an RTP session. A participant may be involved in multiple RTP sessions at the same time. In a multimedia session, each medium has typically been carried in a separate RTP session with its own RTCP packets (i.e., one RTP session for the audio, with a separate RTP session running on a different transport connection for the video; if SDP is used, this corresponds to one RTP session for each "m=" line in the SDP). WebRTC implementations of RTP are @@ -409,109 +393,100 @@ applications using RTP by combining multimedia traffic in a single RTP session. (Details of how this is to be done are tbd, but see [I-D.lennox-rtcweb-rtp-media-type-mux], [I-D.holmberg-mmusic-sdp-bundle-negotiation] and [I-D.westerlund-avtcore-multiplex-architecture].) WebRTC implementations of RTP are REQUIRED to support multiplexing of multimedia sessions onto a single RTP session according to (tbd). If such RTP session multiplexing is to be used, this MUST be negotiated during the signalling phase. -5.
WebRTC Use of RTP: Optimisations - - This section discusses some optimisations that make RTP/RTCP work - better and more efficient and therefore are considered. - -5.1. RTP and RTCP Multiplexing +4.5. RTP and RTCP Multiplexing Historically, RTP and RTCP have been run on separate UDP ports. With the increased use of Network Address/Port Translation (NAPT) this has become problematic, since maintaining multiple NAT bindings can be costly. It also complicates firewall administration, since multiple ports must be opened to allow RTP traffic. To reduce these costs and session setup times, support for multiplexing RTP data packets and RTCP control packets on a single port [RFC5761] is REQUIRED. - Supporting this specification is generally a simplification in code, - since it relaxes the tests in [RFC3550]. Note that the use of RTP and RTCP multiplexed on a single port ensures that there is occasional traffic sent on that port, even if there is no active media traffic. This may be useful to keep NAT bindings alive and is the recommended method for application level keep-alives of RTP sessions [RFC6263]. -5.2. Reduced Size RTCP - - RTCP packets are usually sent as compound RTCP packets; and RFC 3550 - demands that those compound packets always start with an SR or RR - packet. However, especially when using frequent feedback messages, - these general statistics are not needed in every packet and - unnecessarily increase the mean RTCP packet size and thus limit the - frequency at which RTCP packets can be sent within the RTCP bandwidth - share. +4.6. Reduced Size RTCP - RFC5506 "Support for Reduced-Size Real-Time Transport Control - Protocol (RTCP): Opportunities and Consequences" [RFC5506] specifies - how to reduce the mean RTCP message size and allow for more frequent - feedback. Frequent feedback, in turn, is essential to make real-time - applications quickly aware of changing network conditions and allow - them to adapt their transmission and encoding behaviour.
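The compound-packet rule discussed in this section (a compound RTCP packet starts with an SR or RR, while a reduced-size packet need not) can be illustrated with a minimal sketch. It assumes the fixed RTCP header layout of RFC 3550 and the SR/RR packet type values 200 and 201; the function names are hypothetical and are not part of either draft revision.

```python
import struct

RTCP_SR = 200  # Sender Report packet type (RFC 3550)
RTCP_RR = 201  # Receiver Report packet type (RFC 3550)


def first_rtcp_packet_type(datagram: bytes) -> int:
    """Return the packet type of the first RTCP packet in a datagram."""
    if len(datagram) < 4:
        raise ValueError("too short to hold an RTCP header")
    # Fixed header: V(2) P(1) count(5) | packet type | length (32-bit words - 1)
    first_octet, packet_type, _length = struct.unpack("!BBH", datagram[:4])
    if first_octet >> 6 != 2:
        raise ValueError("not RTP/RTCP version 2")
    return packet_type


def is_reduced_size(datagram: bytes) -> bool:
    """True if the datagram does not begin with an SR or RR packet."""
    return first_rtcp_packet_type(datagram) not in (RTCP_SR, RTCP_RR)
```

A receiver that supports only regular compound packets would treat a datagram for which `is_reduced_size` returns true as invalid, which is why the use of reduced-size RTCP has to be agreed in signalling.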
+ RTCP packets are usually sent as compound RTCP packets, and RFC 3550 + requires that those compound packets start with an SR or RR packet. + When using frequent feedback messages, these general statistics are + not needed in every packet and unnecessarily increase the mean RTCP + packet size. This can limit the frequency at which RTCP packets can + be sent within the RTCP bandwidth share. - Support for RFC5506 is REQUIRED. + To avoid this problem, [RFC5506] specifies how to reduce the mean + RTCP message size and allow for more frequent feedback. Frequent + feedback, in turn, is essential to make real-time applications quickly + aware of changing network conditions and allow them to adapt their + transmission and encoding behaviour. Support for RFC5506 is + REQUIRED. -5.3. Symmetric RTP/RTCP +4.7. Symmetric RTP/RTCP RTP entities choose the RTP and RTCP transport addresses, i.e., IP addresses and port numbers, to receive packets on and bind their respective sockets to those. When sending RTP packets, however, they may use a different IP address or port number for RTP, RTCP, or both; e.g., when using a different socket instance for sending and for receiving. Symmetric RTP/RTCP requires that the IP address and port number for sending and receiving RTP/RTCP packets are identical. The reason for using symmetric RTP is primarily to avoid issues with NAT and Firewalls by ensuring that the flow is actually bi-directional and thus kept alive and registered as a flow the intended recipient actually wants. In addition, it saves resources, both in the form of ports at the end-points and in the network, as NAT mappings and firewall state are not unnecessarily bloated. The amount of QoS state is also reduced. Using Symmetric RTP and RTCP [RFC4961] is REQUIRED. -5.4. Generation of the RTCP Canonical Name (CNAME) +4.8. Generation of the RTCP Canonical Name (CNAME) The RTCP Canonical Name (CNAME) provides a persistent transport-level identifier for an RTP endpoint.
While the Synchronisation Source (SSRC) identifier for an RTP endpoint may change if a collision is detected, or when the RTP application is restarted, its RTCP CNAME is meant to stay unchanged, so that RTP endpoints can be uniquely identified and associated with their RTP media streams. For proper functionality, RTCP CNAMEs should be unique among the participants of an RTP session. The RTP specification [RFC3550] includes guidelines for choosing a unique RTCP CNAME, but these are not sufficient in the presence of NAT devices. In addition, some may find long-term persistent identifiers problematic from a privacy viewpoint. Accordingly, support for generating short-term persistent RTCP CNAMEs following method (b) as specified in Section 4.2 of "Guidelines for Choosing RTP Control - Protocol (RTCP) Canonical Names (CNAMEs)" [RFC6222] is RECOMMENDED, + Protocol (RTCP) Canonical Names (CNAMEs)" [RFC6222] is REQUIRED, since this addresses both concerns. -6. WebRTC Use of RTP: Extensions +5. WebRTC Use of RTP: Extensions There are a number of RTP extensions that could be very useful in the WebRTC context. One set is related to conferencing, others are more - generic in nature. + generic in nature. None of these are mandatory to implement, but + they are expected to be generally useful. -6.1. Conferencing Extensions +5.1. Conferencing Extensions RTP is inherently a group communication protocol. Groups can be implemented using a centralised server, multi-unicast, or IP multicast. While IP multicast was popular in early deployments, in today's practice, overlay-based conferencing dominates, typically using one or more central servers to connect endpoints in a star or flat tree topology. These central servers can be implemented in a number of ways [RFC5117], of which the following are the most common: 1.
RTP Translator (Relay) with Only Unicast Paths ([RFC5117], @@ -536,156 +511,146 @@ RTP protocol extensions to be used with conferencing are included because they are important in the context of centralised conferencing, where one RTP Mixer (Conference Focus) receives the participants' media streams and distributes them to the other participants. These messages are defined in the Extended RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/AVPF) [RFC4585] and the "Codec Control Messages in the RTP Audio-Visual Profile with Feedback (AVPF)" (CCM) [RFC5104] and are fully usable by the Secure variant of this profile (RTP/SAVPF) [RFC5124]. -6.1.1. Full Intra Request +5.1.1. Full Intra Request The Full Intra Request is defined in Sections 3.5.1 and 4.3.1 of CCM [RFC5104]. It is used by the mixer to request a new Intra picture from a session participant. This is used when switching between sources to ensure that the receivers can decode the video or other predicted media encoding with long prediction chains. It is RECOMMENDED that this feedback message is supported. -6.1.2. Picture Loss Indication +5.1.2. Picture Loss Indication The Picture Loss Indication is defined in Section 6.3.1 of the RTP/AVPF profile [RFC4585]. It is used by a receiver to tell the encoder that it lost the decoder context and would like to have it repaired somehow. This is semantically different from the Full Intra Request above. It is RECOMMENDED that this feedback message is supported as a loss tolerance mechanism. -6.1.3. Slice Loss Indication +5.1.3. Slice Loss Indication The Slice Loss Indication is defined in Section 6.3.2 of the RTP/AVPF profile [RFC4585]. It is used by a receiver to tell the encoder that it has detected the loss or corruption of one or more consecutive macroblocks, and would like to have these repaired somehow. The use of this feedback message is OPTIONAL as a loss tolerance mechanism. -6.1.4.
Reference Picture Selection Indication Reference Picture Selection Indication (RPSI) is defined in Section 6.3.3 of the RTP/AVPF profile [RFC4585]. Some video coding standards allow the use of older reference pictures than the most recent one for predictive coding. If such a codec is in use, and if the encoder has learned about a loss of encoder-decoder synchronicity, a known-as-correct reference picture can be used for future coding. The RPSI message allows this to be signalled. The use of this RTCP feedback message is OPTIONAL as a loss tolerance mechanism. -6.1.5. Temporary Maximum Media Stream Bit Rate Request +5.1.5. Temporary Maximum Media Stream Bit Rate Request This feedback message is defined in Sections 3.5.4 and 4.2.1 of CCM [RFC5104]. This message and its notification message are used by a media receiver to inform the sending party that there is a current limitation on the amount of bandwidth available to this receiver. This can be for various reasons, and can for example be used by an RTP mixer to limit the media sender being forwarded by the mixer (without doing media transcoding) to fit the bottlenecks existing towards the other session participants. It is RECOMMENDED that this feedback message is supported. -6.2. Header Extensions +5.2. Header Extensions The RTP specification [RFC3550] provides a capability to extend the RTP header with in-band data, but the format and semantics of the - extensions are poorly specified. Accordingly, if header extensions - are to be used, it is REQUIRED that they be formatted and signalled - according to the general mechanism of RTP header extensions defined - in [RFC5285]. + extensions are poorly specified. The use of header extensions is + OPTIONAL, but if they are used, it is REQUIRED that they be formatted and + signalled according to the general mechanism of RTP header extensions + defined in [RFC5285].
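As an illustration of the general mechanism referenced here, the following is a minimal sketch of a parser for the one-byte header form of RFC 5285. It assumes the input starts at the 16-bit "defined by profile" field of the RTP header extension and that this field carries 0xBEDE; the function name and the choice to treat any element with ID 0 as a single padding byte are assumptions made for the example, not text from either draft revision.

```python
import struct

ONE_BYTE_FORM = 0xBEDE  # "defined by profile" value for the one-byte form


def parse_one_byte_extensions(block: bytes) -> dict:
    """Map local extension ID -> data for a one-byte-form extension block."""
    if len(block) < 4:
        raise ValueError("truncated header extension")
    profile, words = struct.unpack("!HH", block[:4])
    if profile != ONE_BYTE_FORM:
        raise ValueError("not the one-byte header form")
    body = block[4:4 + 4 * words]  # length field counts 32-bit words
    if len(body) != 4 * words:
        raise ValueError("extension shorter than its length field")
    elements = {}
    i = 0
    while i < len(body):
        ext_id = body[i] >> 4
        if ext_id == 0:       # padding (assumption: skip exactly one byte)
            i += 1
            continue
        if ext_id == 15:      # reserved ID: stop processing the block
            break
        size = (body[i] & 0x0F) + 1  # 4-bit field stores length minus one
        data = body[i + 1:i + 1 + size]
        if len(data) != size:
            raise ValueError("element runs past end of extension")
        elements[ext_id] = data
        i += 1 + size
    return elements
```

For example, `parse_one_byte_extensions(bytes.fromhex("bede000110850000"))` yields a single element with local ID 1 and one byte of data. Which extension a local ID refers to is not visible in the packet; it comes from the signalled mapping, so the ID 1 used here is arbitrary. A receiver can skip any ID it did not negotiate, which matches the requirement that header extensions be safely ignorable.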
As noted in [RFC5285], the requirement from the RTP specification that header extensions are "designed so that the header extension may be ignored" [RFC3550] stands. To be specific, header extensions must only be used for data that can safely be ignored by the recipient without affecting interoperability, and must not be used when the presence of the extension has changed the form or nature of the rest of the packet in a way that is not compatible with the way the stream is signalled (e.g., as defined by the payload type). Valid examples might include metadata that is additional to the usual RTP information. The RTP rapid synchronisation header extension [RFC6051] is - recommended, as discussed in Section 6.3 we also recommend the client - to mixer audio level [I-D.ietf-avtext-client-to-mixer-audio-level], - and consider the mixer to client audio level - [I-D.ietf-avtext-mixer-to-client-audio-level] as optional feature. + recommended, as discussed in Section 5.2.1 we also recommend the + client to mixer audio level [RFC6464], and consider the mixer to + client audio level [RFC6465] as optional feature. - It is REQUIRED that the mechanism to encrypt header extensions - [I-D.ietf-avtcore-srtp-encrypted-header-ext] is implemented when the - client-to-mixer or mixer-to-client audio level indications are in use - in SRTP encrypted sessions, since the information contained in these - header extensions may be considered sensitive. + If the client-to-mixer or mixer-to-client audio level indications are + in use in SRTP encrypted sessions, it is REQUIRED that the extensions + are encrypted according to + [I-D.ietf-avtcore-srtp-encrypted-header-ext] since the information + contained in these header extensions may be considered sensitive. -6.3. Rapid Synchronisation Extensions +5.2.1. Rapid Synchronisation Many RTP sessions require synchronisation between audio, video, and other content. 
This synchronisation is performed by receivers, using information contained in RTCP SR packets, as described in the RTP specification [RFC3550]. This basic mechanism can be slow, however, so it is RECOMMENDED that the rapid RTP synchronisation extensions described in [RFC6051] be implemented. The rapid synchronisation extensions use the general RTP header extension mechanism [RFC5285], which requires signalling, but are otherwise backwards compatible. -6.4. Mixer Audio Level Extensions - -6.4.1. Client to Mixer Audio Level +5.2.2. Client to Mixer Audio Level - The Client to Mixer Audio Level - [I-D.ietf-avtext-client-to-mixer-audio-level] is an RTP header - extension used by a client to inform a mixer about the level of audio - activity in the packet the header is attached to. This enables a - central node to make mixing or selection decisions without decoding - or detailed inspection of the payload. Thus reducing the needed - complexity in some types of central RTP nodes. + The Client to Mixer Audio Level [RFC6464] is an RTP header extension + used by a client to inform a mixer about the level of audio activity + in the packet the header is attached to. This enables a central node + to make mixing or selection decisions without decoding or detailed + inspection of the payload. Thus reducing the needed complexity in + some types of central RTP nodes. - Assuming that the Client to Mixer Audio Level - [I-D.ietf-avtext-client-to-mixer-audio-level] is published as a - finished specification prior to RTCWEB's first RTP specification then - it is RECOMMENDED that this extension is included. + The Client to Mixer Audio Level [RFC6464] is RECOMMENDED to be + implemented. -6.4.2. Mixer to Client Audio Level +5.2.3. Mixer to Client Audio Level - The Mixer to Client Audio Level header extension - [I-D.ietf-avtext-mixer-to-client-audio-level] provides the client - with the audio level of the different sources mixed into a common mix - from the RTP mixer. 
Thus enabling a user interface to indicate the - relative activity level of a session participant, rather than just - being included or not based on the CSRC field. This is a pure - optimisations of non critical functions and thus optional + The Mixer to Client Audio Level header extension [RFC6465] provides + the client with the audio level of the different sources mixed into a + common mix from the RTP mixer. Thus enabling a user interface to + indicate the relative activity level of a session participant, rather + than just being included or not based on the CSRC field. This is a + pure optimisations of non critical functions and thus optional functionality. - Assuming that the Mixer to Client Audio Level - [I-D.ietf-avtext-client-to-mixer-audio-level] is published as a - finished specification prior to RTCWEB's first RTP specification then - it is OPTIONAL that this extension is included. + The Mixer to Client Audio Level [RFC6465] is OPTIONAL to implement. -7. WebRTC Use of RTP: Improving Transport Robustness +6. WebRTC Use of RTP: Improving Transport Robustness There are some tools that can make RTP flows robust against Packet loss and reduce the impact on media quality. However they all add extra bits compared to a non-robust stream. These extra bits needs to be considered and the aggregate bit-rate needs to be rate controlled. Thus improving robustness might require a lower base encoding quality but has the potential to give that quality with fewer errors in it. -7.1. Retransmission +6.1. Retransmission Support for RTP retransmission as defined by "RTP Retransmission Payload Format" [RFC4588] is RECOMMENDED. The retransmission scheme in RTP allows flexible application of retransmissions. Only selected missing packets can be requested by the receiver. It also allows for the sender to prioritise between missing packets based on senders knowledge about their content. 
Compared to TCP, RTP retransmission also allows one to give up on a packet that despite retransmission(s) still has not been received @@ -729,39 +694,39 @@ the importance of the requested media, the probability that the packet will reach the receiver in time for being usable, the consumption of available bit-rate and the impact on the media quality of new encodings. To conclude, the issues raised are implementation concerns that an implementation needs to take into consideration; they are not arguments against including a highly versatile and efficient packet loss repair mechanism. -7.2. Forward Error Correction (FEC) +6.2. Forward Error Correction (FEC) Support of some type of FEC to combat the effects of packet loss is beneficial, but is heavily application dependent. However, some FEC mechanisms are encumbered. The main benefit from FEC is the relatively low additional delay needed to protect against packet losses. The transmission of any repair packets should preferably be done with a time delay that is just longer than any loss events normally encountered. That way the repair packet isn't also lost in the same event as the source data. The number of repair packets needed varies depending on the amount and pattern of packet loss to be recovered, and on the mechanism used to derive repair data. The latter choice also affects the additional delay required both to encode the repair packets at the sender and to recover the lost packet(s) at the receiver. -7.2.1. Basic Redundancy +6.2.1. Basic Redundancy The method for providing basic redundancy is to simply retransmit a packet that was sent some time earlier. This is relatively simple in theory: one saves any outgoing source (original) packet in a buffer marked with a timestamp of actual transmission, and some X ms later one transmits this packet again, where X is selected to be longer than the common loss events.
Thus any loss events shorter than X can be recovered assuming that one doesn't get another loss event before all the packets lost in the first event have been received. @@ -802,21 +767,21 @@ reports will be correct. The retransmission payload format is used to recover the packet's original data, thus enabling a perfect recovery. Duplication Grouping Semantics in the Session Description Protocol: This [I-D.begen-mmusic-redundancy-grouping] is a proposal for new SDP signalling to indicate media stream duplication using different RTP sessions, or different SSRCs to separate the source and the redundant copy of the stream. -7.2.2. Block Based FEC +6.2.2. Block Based FEC Block based redundancy collects a number of source packets into a data block for processing. The processing results in some number of repair packets that are then transmitted to the other end allowing the receiver to attempt to recover some number of lost packets in the block. The benefit of block based approaches is that the overhead can be lower than 100% while still recovering one or more lost source packets from the block. Optimal block codes allow for each received repair packet to repair a single loss within the block. Thus 3 repair packets that are received should allow for any set of 3 @@ -848,25 +813,25 @@ multiple close losses a scheme of hierarchical encodings is needed, thus increasing the overhead significantly. Forward Error Correction (FEC) Framework: This framework [I-D.ietf-fecframe-framework] defines how not only RTP packets but arbitrary packet flows can be protected. Some solutions produced or under development in the FECFRAME WG are RTP specific. There exist alternatives supporting block codes such as Reed-Solomon and Raptor. -7.2.3. Recommendations for FEC +6.2.3. Recommendations for FEC (tbd) -8. WebRTC Use of RTP: Rate Control and Media Adaptation +7.
WebRTC Use of RTP: Rate Control and Media Adaptation WebRTC will be used in widely varying network environments with a heterogeneous set of link technologies, including wired and wireless, interconnecting peers at different topological locations resulting in network paths with widely varying one way delays, bit-rate capacity, load levels and traffic mixes. In addition, individual end-points will open one or more WebRTC sessions with one or more peers. Each of these sessions may contain different mixes of media and data flows. Asymmetric usage of media bit-rates and number of media streams is also to be expected. A single end-point may receive zero @@ -905,53 +870,29 @@ The biggest issue is that there is no standardised, ready-to-use mechanism that can simply be included in WebRTC. Thus there will be a need for the IETF to produce such a specification. Therefore the suggested way forward is to specify requirements on any solution for the media adaptation. These requirements are for now proposed to be documented in this specification. In addition, a proposed detailed solution will be developed, but is expected to take longer to finalize than this document. -8.1. Rate Control Requirements - - Note: This section does not yet have WG consensus. - - This section provides a number of requirements on an media - adaptation/congestion control solution for WebRTC. - - 1. All WebRTC media streams MUST be congestion-controlled. (The - same requirement apply to data streams) - - 2. The congestion algorithms used MUST cause WebRTC streams to act - reasonably fairly with TCP and other congestion-controlled flows, - such as DCCP and TFRC, and other WebRTC flows. Note that WebRTC - involves multiple data flows which "normally" would be separately - congestion-controlled. - - 3. The congestion control mechanism MUST be possible to realize - between two indendently implemented WebRTC end-points. - - 4.
The congestion control algorithm SHOULD attempt to minimize the - media-stream end-to-end delays between the participants, by - controlling bandwidth appropriately. +7.1. Congestion Control Requirements - 5. The congestion control SHOULD allow for prioritization and - shifting of banwidth between media flows. In other words, if one - flow on the same path as another has to adjust its bit-rate the - other flow can perform that adjustment instead, or divided - between the flows. + Requirements for congestion control of WebRTC sessions are discussed + in [I-D.jesup-rtp-congestion-reqs]. - Thus it is REQUIRED to have an implementation of an RTP Rate Control - mechanism fulfilling the above requirements. + Implementations are REQUIRED to implement the RTP circuit breakers + described in [I-D.perkins-avtcore-rtp-circuit-breakers]. -8.2. RTCP Limiations +7.2. RTCP Limiations Experience with the congestion control algorithms of TCP [RFC5681], TFRC [RFC5348], and DCCP [RFC4341], [RFC4342], [RFC4828], has shown that feedback on packet arrivals needs to be sent roughly once per round trip time. We note that the capabilities of real-time media traffic to adapt to changing path conditions may be less rapid than for the elastic applications TCP was designed for, but frequent feedback is still required to allow the congestion control algorithm to track the path dynamics. @@ -974,21 +915,21 @@ In group communication, the share of RTCP bandwidth needs to be shared by all group members, reducing the capacity and thus the reporting frequency per node. Example: assuming 512 kbit/s video yields 3200 bytes/s RTCP bandwidth, split across two entities in a point-to-point session. An endpoint could thus send a report of 100 bytes about every 70ms or for every other frame in a 30 fps video. -8.3. Legacy Interop Limitations +7.3. Legacy Interop Limitations Congestion control interoperability with most type of legacy devices, even using an translator could be difficult. 
There are numerous reasons for this: No RTCP Support: There exist legacy implementations that do not implement RTCP at all. Thus no feedback at all is provided. RTP/AVP Minimal RTCP Interval of 5s: RTP [RFC3550] under the RTP/AVP profile specifies a recommended minimal fixed interval of 5 @@ -1016,114 +957,117 @@ It has been suggested on the RTCWEB mailing list that if interoperating with really limited legacy devices a WebRTC end-point may not send more than 64 kbps of media streams, to avoid it causing massive congestion on most paths in the Internet when communicating with a legacy node not providing sufficient feedback for effective congestion control. This warrants further discussion as there are clearly a number of link layers that don't even provide that amount of bit-rate consistently, and that assumes no competing traffic. -9. WebRTC Use of RTP: Performance Monitoring +8. WebRTC Use of RTP: Performance Monitoring RTCP contains a basic set of RTP flow monitoring points like packet loss and jitter. There exist a number of extensions that could be included in the set to be supported. However, in most cases the RTP monitoring that is needed depends on the application, which makes it difficult to select which to include when the set of applications is very large. Exposing some metrics in the WebRTC API should be considered, allowing the application to gather the measurements of interest. However, security implications for the different data sets exposed will need to be considered in this. -10. IANA Considerations +9. IANA Considerations This memo makes no request of IANA. Note to RFC Editor: this section may be removed on publication as an RFC. -11. Security Considerations +10. Security Considerations RTP and its various extensions each have their own security considerations. These should be taken into account when considering the security properties of the complete suite.
We currently don't think this suite creates any additional security issues or properties. The use of SRTP [RFC3711] will provide protection or mitigation against all the fundamental issues by offering confidentiality, integrity and partial source authentication. A mandatory to implement media security solution will be required to be picked. We currently don't discuss the key-management aspect of SRTP in this memo, that needs to be done taking the WebRTC communication model into account. The guidelines in [I-D.ietf-avtcore-srtp-vbr-audio] apply when using variable bit rate (VBR) audio codecs, for example Opus or the Mixer audio level header extensions. Security considerations for the WebRTC work are discussed in [I-D.ietf-rtcweb-security]. -12. Acknowledgements +11. Acknowledgements The authors would like to thank Harald Alvestrand, Cary Bran, and Cullen Jennings for valuable feedback. -13. References +12. References -13.1. Normative References +12.1. Normative References [I-D.holmberg-mmusic-sdp-bundle-negotiation] Holmberg, C. and H. Alvestrand, "Multiplexing Negotiation Using Session Description Protocol (SDP) Port Numbers", draft-holmberg-mmusic-sdp-bundle-negotiation-00 (work in progress), October 2011. [I-D.ietf-avtcore-srtp-encrypted-header-ext] Lennox, J., "Encryption of Header Extensions in the Secure Real-Time Transport Protocol (SRTP)", draft-ietf-avtcore-srtp-encrypted-header-ext-00 (work in progress), June 2011. [I-D.ietf-avtcore-srtp-vbr-audio] Perkins, C. and J. Valin, "Guidelines for the use of Variable Bit Rate Audio with Secure RTP", draft-ietf-avtcore-srtp-vbr-audio-03 (work in progress), July 2011. - [I-D.ietf-avtext-client-to-mixer-audio-level] - Lennox, J., Ivov, E., and E. Marocco, "A Real-Time - Transport Protocol (RTP) Header Extension for Client-to- - Mixer Audio Level Indication", - draft-ietf-avtext-client-to-mixer-audio-level-03 (work in - progress), July 2011. 
- - [I-D.ietf-avtext-mixer-to-client-audio-level] - Ivov, E., Marocco, E., and J. Lennox, "A Real-Time - Transport Protocol (RTP) Header Extension for Mixer-to- - Client Audio Level Indication", - draft-ietf-avtext-mixer-to-client-audio-level-03 (work in - progress), July 2011. + [I-D.ietf-rtcweb-overview] + Alvestrand, H., "Overview: Real Time Protocols for Brower- + based Applications", draft-ietf-rtcweb-overview-02 (work + in progress), September 2011. [I-D.ietf-rtcweb-security] Rescorla, E., "Security Considerations for RTC-Web", draft-ietf-rtcweb-security-01 (work in progress), October 2011. + [I-D.jesup-rtp-congestion-reqs] + Jesup, R. and H. Alvestrand, "Congestion Control + Requirements For Real Time Media", + draft-jesup-rtp-congestion-reqs-00 (work in progress), + March 2012. + [I-D.lennox-rtcweb-rtp-media-type-mux] Lennox, J. and J. Rosenberg, "Multiplexing Multiple Media Types In a Single Real-Time Transport Protocol (RTP) Session", draft-lennox-rtcweb-rtp-media-type-mux-00 (work in progress), October 2011. + [I-D.perkins-avtcore-rtp-circuit-breakers] + Perkins, C. and V. Singh, "RTP Congestion Control: Circuit + Breakers for Unicast Sessions", + draft-perkins-avtcore-rtp-circuit-breakers-00 (work in + progress), March 2012. + [I-D.westerlund-avtcore-multiplex-architecture] Westerlund, M., Burman, B., and C. Perkins, "RTP Multiplexing Architecture", draft-westerlund-avtcore-multiplex-architecture-00 (work in progress), October 2011. [RFC2736] Handley, M. and C. Perkins, "Guidelines for Writers of RTP Payload Format Specifications", BCP 36, RFC 2736, December 1999. @@ -1176,21 +1120,29 @@ [RFC5761] Perkins, C. and M. Westerlund, "Multiplexing RTP Data and Control Packets on a Single Port", RFC 5761, April 2010. [RFC6051] Perkins, C. and T. Schierl, "Rapid Synchronisation of RTP Flows", RFC 6051, November 2010. [RFC6222] Begen, A., Perkins, C., and D. 
Wing, "Guidelines for Choosing RTP Control Protocol (RTCP) Canonical Names (CNAMEs)", RFC 6222, April 2011. -13.2. Informative References + [RFC6464] Lennox, J., Ivov, E., and E. Marocco, "A Real-time + Transport Protocol (RTP) Header Extension for Client-to- + Mixer Audio Level Indication", RFC 6464, December 2011. + + [RFC6465] Ivov, E., Marocco, E., and J. Lennox, "A Real-time + Transport Protocol (RTP) Header Extension for Mixer-to- + Client Audio Level Indication", RFC 6465, December 2011. + +12.2. Informative References [I-D.begen-mmusic-redundancy-grouping] Begen, A., Cai, Y., and H. Ou, "Duplication Grouping Semantics in the Session Description Protocol", draft-begen-mmusic-redundancy-grouping-01 (work in progress), June 2011. [I-D.cbran-rtcweb-data] Bran, C. and C. Jennings, "RTC-Web Non-Media Data Transport Requirements", draft-cbran-rtcweb-data-00 (work