--- 1/draft-ietf-rtcweb-jsep-21.txt 2017-08-26 11:13:13.652313635 -0700 +++ 2/draft-ietf-rtcweb-jsep-22.txt 2017-08-26 11:13:13.852318417 -0700 @@ -1,21 +1,21 @@ Network Working Group J. Uberti Internet-Draft Google Intended status: Standards Track C. Jennings -Expires: January 4, 2018 Cisco +Expires: February 26, 2018 Cisco E. Rescorla, Ed. Mozilla - July 3, 2017 + August 25, 2017 JavaScript Session Establishment Protocol - draft-ietf-rtcweb-jsep-21 + draft-ietf-rtcweb-jsep-22 Abstract This document describes the mechanisms for allowing a JavaScript application to control the signaling plane of a multimedia session via the interface specified in the W3C RTCPeerConnection API, and discusses how this relates to existing signaling protocols. Status of This Memo @@ -25,21 +25,21 @@ Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on January 4, 2018. + This Internet-Draft will expire on February 26, 2018. Copyright Notice Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents @@ -66,109 +66,108 @@ 3.5. ICE . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 3.5.1. ICE Gathering Overview . . . . . . . . . . . . . . . 12 3.5.2. ICE Candidate Trickling . . . . . . . . . . . . . . . 13 3.5.2.1. 
ICE Candidate Format . . . . . . . . . . . . . . 13 3.5.3. ICE Candidate Policy . . . . . . . . . . . . . . . . 14 3.5.4. ICE Candidate Pool . . . . . . . . . . . . . . . . . 15 3.6. Video Size Negotiation . . . . . . . . . . . . . . . . . 16 3.6.1. Creating an imageattr Attribute . . . . . . . . . . . 16 3.6.2. Interpreting an imageattr Attribute . . . . . . . . . 17 3.7. Simulcast . . . . . . . . . . . . . . . . . . . . . . . . 18 - 3.8. Interactions With Forking . . . . . . . . . . . . . . . . 19 + 3.8. Interactions With Forking . . . . . . . . . . . . . . . . 20 3.8.1. Sequential Forking . . . . . . . . . . . . . . . . . 20 - 3.8.2. Parallel Forking . . . . . . . . . . . . . . . . . . 20 + 3.8.2. Parallel Forking . . . . . . . . . . . . . . . . . . 21 4. Interface . . . . . . . . . . . . . . . . . . . . . . . . . . 21 - 4.1. PeerConnection . . . . . . . . . . . . . . . . . . . . . 21 - 4.1.1. Constructor . . . . . . . . . . . . . . . . . . . . . 21 + 4.1. PeerConnection . . . . . . . . . . . . . . . . . . . . . 22 + 4.1.1. Constructor . . . . . . . . . . . . . . . . . . . . . 22 4.1.2. addTrack . . . . . . . . . . . . . . . . . . . . . . 24 4.1.3. removeTrack . . . . . . . . . . . . . . . . . . . . . 24 4.1.4. addTransceiver . . . . . . . . . . . . . . . . . . . 24 - 4.1.5. createDataChannel . . . . . . . . . . . . . . . . . . 24 + 4.1.5. createDataChannel . . . . . . . . . . . . . . . . . . 25 4.1.6. createOffer . . . . . . . . . . . . . . . . . . . . . 25 4.1.7. createAnswer . . . . . . . . . . . . . . . . . . . . 26 - 4.1.8. SessionDescriptionType . . . . . . . . . . . . . . . 26 + 4.1.8. SessionDescriptionType . . . . . . . . . . . . . . . 27 4.1.8.1. Use of Provisional Answers . . . . . . . . . . . 27 4.1.8.2. Rollback . . . . . . . . . . . . . . . . . . . . 28 4.1.9. setLocalDescription . . . . . . . . . . . . . . . . . 29 4.1.10. setRemoteDescription . . . . . . . . . . . . . . . . 30 4.1.11. currentLocalDescription . . . . . . . . . . . . . . . 
30 - 4.1.12. pendingLocalDescription . . . . . . . . . . . . . . . 30 - 4.1.13. currentRemoteDescription . . . . . . . . . . . . . . 30 + 4.1.12. pendingLocalDescription . . . . . . . . . . . . . . . 31 + 4.1.13. currentRemoteDescription . . . . . . . . . . . . . . 31 4.1.14. pendingRemoteDescription . . . . . . . . . . . . . . 31 4.1.15. canTrickleIceCandidates . . . . . . . . . . . . . . . 31 - 4.1.16. setConfiguration . . . . . . . . . . . . . . . . . . 31 - 4.1.17. addIceCandidate . . . . . . . . . . . . . . . . . . . 32 + 4.1.16. setConfiguration . . . . . . . . . . . . . . . . . . 32 + 4.1.17. addIceCandidate . . . . . . . . . . . . . . . . . . . 33 4.2. RtpTransceiver . . . . . . . . . . . . . . . . . . . . . 33 4.2.1. stop . . . . . . . . . . . . . . . . . . . . . . . . 33 4.2.2. stopped . . . . . . . . . . . . . . . . . . . . . . . 33 - 4.2.3. setDirection . . . . . . . . . . . . . . . . . . . . 33 + 4.2.3. setDirection . . . . . . . . . . . . . . . . . . . . 34 4.2.4. direction . . . . . . . . . . . . . . . . . . . . . . 34 4.2.5. currentDirection . . . . . . . . . . . . . . . . . . 34 4.2.6. setCodecPreferences . . . . . . . . . . . . . . . . . 34 5. SDP Interaction Procedures . . . . . . . . . . . . . . . . . 35 5.1. Requirements Overview . . . . . . . . . . . . . . . . . . 35 5.1.1. Usage Requirements . . . . . . . . . . . . . . . . . 35 - 5.1.2. Profile Names and Interoperability . . . . . . . . . 35 - 5.2. Constructing an Offer . . . . . . . . . . . . . . . . . . 36 + 5.1.2. Profile Names and Interoperability . . . . . . . . . 36 + 5.2. Constructing an Offer . . . . . . . . . . . . . . . . . . 37 5.2.1. Initial Offers . . . . . . . . . . . . . . . . . . . 37 - 5.2.2. Subsequent Offers . . . . . . . . . . . . . . . . . . 43 + 5.2.2. Subsequent Offers . . . . . . . . . . . . . . . . . . 44 5.2.3. Options Handling . . . . . . . . . . . . . . . . . . 47 - 5.2.3.1. IceRestart . . . . . . . . . . . . . . . . . . . 47 - 5.2.3.2. VoiceActivityDetection . . . 
. . . . . . . . . . 47 - 5.3. Generating an Answer . . . . . . . . . . . . . . . . . . 48 - 5.3.1. Initial Answers . . . . . . . . . . . . . . . . . . . 48 + 5.2.3.1. IceRestart . . . . . . . . . . . . . . . . . . . 48 + 5.2.3.2. VoiceActivityDetection . . . . . . . . . . . . . 48 + 5.3. Generating an Answer . . . . . . . . . . . . . . . . . . 49 + 5.3.1. Initial Answers . . . . . . . . . . . . . . . . . . . 49 5.3.2. Subsequent Answers . . . . . . . . . . . . . . . . . 55 5.3.3. Options Handling . . . . . . . . . . . . . . . . . . 56 - 5.3.3.1. VoiceActivityDetection . . . . . . . . . . . . . 56 - 5.4. Modifying an Offer or Answer . . . . . . . . . . . . . . 56 + 5.3.3.1. VoiceActivityDetection . . . . . . . . . . . . . 57 + 5.4. Modifying an Offer or Answer . . . . . . . . . . . . . . 57 5.5. Processing a Local Description . . . . . . . . . . . . . 57 5.6. Processing a Remote Description . . . . . . . . . . . . . 58 - 5.7. Parsing a Session Description . . . . . . . . . . . . . . 58 + 5.7. Parsing a Session Description . . . . . . . . . . . . . . 59 5.7.1. Session-Level Parsing . . . . . . . . . . . . . . . . 59 - 5.7.2. Media Section Parsing . . . . . . . . . . . . . . . . 60 + 5.7.2. Media Section Parsing . . . . . . . . . . . . . . . . 61 5.7.3. Semantics Verification . . . . . . . . . . . . . . . 63 - 5.8. Applying a Local Description . . . . . . . . . . . . . . 64 + 5.8. Applying a Local Description . . . . . . . . . . . . . . 65 5.9. Applying a Remote Description . . . . . . . . . . . . . . 66 - 5.10. Applying an Answer . . . . . . . . . . . . . . . . . . . 69 - 6. Processing RTP/RTCP . . . . . . . . . . . . . . . . . . . . . 72 - 7. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 72 + 5.10. Applying an Answer . . . . . . . . . . . . . . . . . . . 70 + 6. Processing RTP/RTCP . . . . . . . . . . . . . . . . . . . . . 73 + 7. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 73 7.1. Simple Example . . . . . . . . . . . . . . . . . . . 
. . 73 7.2. Detailed Example . . . . . . . . . . . . . . . . . . . . 77 7.3. Early Transport Warmup Example . . . . . . . . . . . . . 87 8. Security Considerations . . . . . . . . . . . . . . . . . . . 94 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 95 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 95 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 95 11.1. Normative References . . . . . . . . . . . . . . . . . . 95 11.2. Informative References . . . . . . . . . . . . . . . . . 100 Appendix A. Appendix A . . . . . . . . . . . . . . . . . . . . . 102 Appendix B. Change log . . . . . . . . . . . . . . . . . . . . . 103 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 112 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 113 1. Introduction This document describes how the W3C WEBRTC RTCPeerConnection interface [W3C.webrtc] is used to control the setup, management and teardown of a multimedia session. 1.1. General Design of JSEP - The thinking behind WebRTC call setup has been to fully specify and - control the media plane, but to leave the signaling plane up to the - application as much as possible. The rationale is that different - applications may prefer to use different protocols, such as the - existing SIP call signaling protocol, or something custom to the - particular application, perhaps for a novel use case. In this - approach, the key information that needs to be exchanged is the - multimedia session description, which specifies the necessary - transport and media configuration information necessary to establish - the media plane. + WebRTC call setup has been designed to focus on controlling the media + plane, leaving signaling plane behavior up to the application as much + as possible. 
The rationale is that different applications may prefer + to use different protocols, such as the existing SIP call signaling + protocol, or something custom to the particular application, perhaps + for a novel use case. In this approach, the key information that + needs to be exchanged is the multimedia session description, which + specifies the necessary transport and media configuration information + necessary to establish the media plane. With these considerations in mind, this document describes the JavaScript Session Establishment Protocol (JSEP) that allows for full control of the signaling state machine from JavaScript. As described above, JSEP assumes a model in which a JavaScript application executes inside a runtime containing WebRTC APIs (the "JSEP implementation"). The JSEP implementation is almost entirely divorced from the core signaling flow, which is instead handled by the JavaScript making use of two interfaces: (1) passing in local and remote session descriptions and (2) interacting with the ICE state @@ -219,43 +218,44 @@ Through its abstraction of signaling, the JSEP approach does require the application to be aware of the signaling process. While the application does not need to understand the contents of session descriptions to set up a call, the application must call the right APIs at the right times, convert the session descriptions and ICE information into the defined messages of its chosen signaling protocol, and perform the reverse conversion on the messages it receives from the other side. - One way to mitigate this is to provide a JavaScript library that - hides this complexity from the developer; said library would - implement a given signaling protocol along with its state machine and - serialization code, presenting a higher level call-oriented interface - to the application developer. For example, libraries exist to adapt - the JSEP API into an API suitable for a SIP or XMPP. 
Thus, JSEP - provides greater control for the experienced developer without - forcing any additional complexity on the novice developer. + One way to make life easier for the application is to provide a + JavaScript library that hides this complexity from the developer; + said library would implement a given signaling protocol along with + its state machine and serialization code, presenting a higher level + call-oriented interface to the application developer. For example, + libraries exist to adapt the JSEP API into an API suitable for a SIP + or XMPP. Thus, JSEP provides greater control for the experienced + developer without forcing any additional complexity on the novice + developer. 1.2. Other Approaches Considered One approach that was considered instead of JSEP was to include a lightweight signaling protocol. Instead of providing session descriptions to the API, the API would produce and consume messages from this protocol. While providing a more high-level API, this put more control of signaling within the JSEP implementation, forcing it to have to understand and handle concepts like signaling glare (see [RFC3264], Section 4). A second approach that was considered but not chosen was to decouple the management of the media control objects from session descriptions, instead offering APIs that would control each component - directly. This was rejected based on a feeling that requiring + directly. This was rejected based on the argument that requiring exposure of this level of complexity to the application programmer would not be beneficial; it would result in an API where even a simple example would require a significant amount of code to orchestrate all the needed interactions, as well as creating a large API surface that needed to be agreed upon and documented. 
In addition, these API points could be called in any order, resulting in a more complex set of interactions with the media subsystem than the JSEP approach, which specifies how session descriptions are to be evaluated and applied. @@ -317,25 +317,25 @@ as well as how to handle the media that is received. These parameters are determined by the exchange of session descriptions in offers and answers, and there are certain details to this process that must be handled in the JSEP APIs. Whether a session description applies to the local side or the remote side affects the meaning of that description. For example, the list of codecs sent to a remote party indicates what the local side is willing to receive, which, when intersected with the set of codecs the remote side supports, specifies what the remote side should send. - However, not all parameters follow this rule; for example, the - fingerprints [RFC8122] sent to a remote party are calculated based on - the local certificate(s) offered; the remote party MUST either accept - these parameters or reject them altogether, with no option to choose - different values. + However, not all parameters follow this rule; some parameters are + declarative and the remote side MUST either accept them or reject + them altogether. An example of such a parameter is the DTLS + fingerprints [RFC8122], which are calculated based on the local + certificate(s) offered, and are not subject to negotiation. In addition, various RFCs put different conditions on the format of offers versus answers. For example, an offer may propose an arbitrary number of m= sections (i.e., media descriptions as described in [RFC4566], Section 5.14), but an answer must contain the exact same number as the offer. 
Lastly, while the exact media parameters are known only after an offer and an answer have been exchanged, the offerer may receive ICE checks, and possibly media (e.g., in the case of a re-offer after a @@ -356,20 +356,28 @@ JSEP addresses this by adding both setLocalDescription and setRemoteDescription methods and having session description objects contain a type field indicating the type of session description being supplied. This satisfies the requirements listed above for both the offerer, who first calls setLocalDescription(sdp [offer]) and then later setRemoteDescription(sdp [answer]), as well as for the answerer, who first calls setRemoteDescription(sdp [offer]) and then later setLocalDescription(sdp [answer]). + During the offer/answer exchange, the outstanding offer is considered + to be "pending" at the offerer and the answerer, as it may either be + accepted or rejected. If this is a re-offer, each side will also + have "current" local and remote descriptions, which reflect the + result of the last offer/answer exchange. Sections 4.1.12, + 4.1.14, 4.1.11, and 4.1.13 provide more + detail on pending and current descriptions. + JSEP also allows for an answer to be treated as provisional by the application. Provisional answers provide a way for an answerer to communicate initial session parameters back to the offerer, in order to allow the session to begin, while allowing a final answer to be specified later. This concept of a final answer is important to the offer/answer model; when such an answer is received, any extra resources allocated by the caller can be released, now that the exact session configuration is known. These "resources" can include things like extra ICE components, TURN candidates, or video decoders. 
Provisional answers, on the other hand, do no such deallocation; as a @@ -600,21 +608,23 @@ is a "media stream identification" value, as defined in [RFC5888], Section 4, which provides a more robust way to identify the m= section in the session description, using the MID of the associated RtpTransceiver object (which may have been locally generated by the answerer when interacting with a non-JSEP endpoint that does not support the MID attribute, as discussed in Section 5.9 below). If the MID field is present in a received IceCandidate, it MUST be used for identification; otherwise, the m= section index is used instead. When creating an IceCandidate object, JSEP implementations MUST - populate all of these fields. + populate each of the candidate, ufrag, m= section index, and MID + fields. Implementations MUST also be prepared to receive objects + with some fields missing, as mentioned above. 3.5.3. ICE Candidate Policy Typically, when gathering ICE candidates, the JSEP implementation will gather all possible forms of initial candidates - host, server reflexive, and relay. However, in certain cases, applications may want to have more specific control over the gathering process, due to privacy or related concerns. For example, one may want to only use relay candidates, to leak as little location information as possible (keeping in mind that this choice comes with corresponding @@ -771,87 +781,89 @@ attributes have the same "q=" value, the one that appears first in the m= section is used. Note that while JSEP endpoints will include at most one "a=imageattr recv" attribute per media format, JSEP endpoints may receive session descriptions from non-JSEP endpoints with m= sections that contain multiple such attributes. o If there is an applicable "a=imageattr recv" attribute for the encoding, the limits from the attribute are then compared to the encoder resolution. 
Only the specific limits mentioned below are considered; any other values, such as picture aspect ratio, MUST - be ignored. Note that when considering a MediaStreamTrack that is - producing rotated video, the unrotated resolution MUST be used for - the checks. This is required regardless of whether the receiver + be ignored. When considering a MediaStreamTrack that is producing + rotated video, the unrotated resolution MUST be used for the + checks. This is required regardless of whether the receiver supports performing receive-side rotation (e.g., through CVO [TS26.114]), as it significantly simplifies the matching logic. o If the attribute includes a "sar=" (sample aspect ratio) value set to something other than "1.0", indicating the receiver wants to receive non-square pixels, this cannot be satisfied and the sender MUST NOT transmit the encoding. o If the encoder resolution exceeds the maximum size permitted by the attribute, and the encoder is allowed to adjust its resolution, the encoder SHOULD apply downscaling in order to satisfy the limits, although the downscaling MUST NOT change the picture aspect ratio of the encoding. For example, if the encoder resolution is 1280x720, and the attribute specified a maximum of 640x480, the expected output resolution would be 640x360. If downscaling cannot be applied, the encoding MUST NOT be - transmitted, and an error SHOULD be surfaced to the application. + transmitted, and an error SHOULD be raised to the application. o If the encoder resolution is less than the minimum size permitted by the attribute, the encoding MUST NOT be transmitted, and an - error SHOULD be surfaced to the application; the encoder MUST NOT + error SHOULD be raised to the application; the encoder MUST NOT apply upscaling. JSEP implementations SHOULD avoid this situation by allowing receipt of arbitrarily small resolutions, perhaps via fallback to a software decoder. 3.7. 
Simulcast JSEP supports simulcast transmission of a MediaStreamTrack, where multiple encodings of the source media can be transmitted within the context of a single m= section. The current JSEP API is designed to allow applications to send simulcasted media but only to receive a single encoding. This allows for multi-user scenarios where each sending client sends multiple encodings to a server, which then, for each receiving client, chooses the appropriate encoding to forward. Applications request support for simulcast by configuring multiple - encodings on an RtpSender, which, upon generation of an offer or - answer, are indicated in SDP markings on the corresponding m= - section, as described below. Receivers that understand simulcast and - are willing to receive it will also include SDP markings to indicate - their support, and JSEP endpoints will use these markings to + encodings on an RtpSender. Upon generation of an offer or answer, + these encodings are indicated via SDP markings on the corresponding + m= section, as described below. Receivers that understand simulcast + and are willing to receive it will also include SDP markings to + indicate their support, and JSEP endpoints will use these markings to determine whether simulcast is permitted for a given RtpSender. If simulcast support is not negotiated, the RtpSender will only use the first configured encoding. Note that the exact simulcast parameters are up to the sending application. While the aforementioned SDP markings are provided to ensure the remote side can receive and demux multiple simulcast encodings, the specific resolutions and bitrates to be used for each encoding are purely a send-side decision in JSEP. JSEP currently does not provide a mechanism to configure receipt of simulcast. 
This means that if simulcast is offered by the remote endpoint, the answer generated by a JSEP endpoint will not indicate support for receipt of simulcast, and as such the remote endpoint will only send a single encoding per m= section. In addition, JSEP does not provide a mechanism to handle an incoming offer requesting simulcast from the JSEP endpoint. This means that - established simulcast streams will continue to work through a - received re-offer, but setting up initial simulcast by way of a - received offer requires out-of-band signaling or SDP inspection. - Future versions of this specification may add additional APIs to - provide direct control. + setting up simulcast in the case where the JSEP endpoint receives the + initial offer requires out-of-band signaling or SDP inspection. + However, in the case where the JSEP endpoint sets up simulcast in its + initial offer, any established simulcast streams will continue to + work upon receipt of an incoming re-offer. Future versions of this + specification may add additional APIs to handle the incoming initial + offer scenario. When using JSEP to transmit multiple encodings from an RtpSender, the techniques from [I-D.ietf-mmusic-sdp-simulcast] and [I-D.ietf-mmusic-rid] are used. Specifically, when multiple encodings have been configured for an RtpSender, the m= section for the RtpSender will include an "a=simulcast" attribute, as defined in [I-D.ietf-mmusic-sdp-simulcast], Section 6.2, with a "send" simulcast stream description that lists each desired encoding, and no "recv" simulcast stream description. The m= section will also include an "a=rid" attribute for each encoding, as specified in @@ -923,24 +934,24 @@ endpoints at a time, then from a media engine point of view, this is exactly like the sequential forking case. 
In the parallel forking case where the JavaScript application wishes to simultaneously exchange media with multiple peers, the flow is slightly more complex, but the JavaScript application can follow the strategy that [RFC3960] describes using UPDATE. The UPDATE approach allows the signaling to set up a separate media flow for each peer that it wishes to exchange media with. In JSEP, this offer used in the UPDATE would be formed by simply creating a new PeerConnection - and making sure that the same local media streams have been added - into this new PeerConnection. Then the new PeerConnection object - would produce a SDP offer that could be used by the signaling to - perform the UPDATE strategy discussed in [RFC3960]. + (see Section 4.1) and making sure that the same local media streams + have been added into this new PeerConnection. Then the new + PeerConnection object would produce a SDP offer that could be used by + the signaling to perform the UPDATE strategy discussed in [RFC3960]. As a result of sharing the media streams, the application will end up with N parallel PeerConnection sessions, each with a local and remote description and their own local and remote addresses. The media flow from these sessions can be managed using setDirection (see Section 4.2.3), or the application can choose to play out the media from all sessions mixed together. Of course, if the application wants to only keep a single session, it can simply terminate the sessions that it no longer needs. @@ -1048,37 +1059,39 @@ The default multiplexing policy MUST be set to "require". Implementations MAY choose to reject attempts by the application to set the multiplexing policy to "negotiate". 4.1.2. addTrack The addTrack method adds a MediaStreamTrack to the PeerConnection, using the MediaStream argument to associate the track with other tracks in the same MediaStream, so that they can be added to the same - "LS" group when creating an offer or answer. 
addTrack attempts to - minimize the number of transceivers as follows: If the PeerConnection - is in the "have-remote-offer" state, the track will be attached to - the first compatible transceiver that was created by the most recent - call to setRemoteDescription() and does not have a local track. - Otherwise, a new transceiver will be created, as described in - Section 4.1.4. + "LS" group when creating an offer or answer. Adding tracks to the + same "LS" group indicates that the playback of these tracks should be + synchronized for proper lip sync, as described in [RFC5888], + Section 7. addTrack attempts to minimize the number of transceivers + as follows: If the PeerConnection is in the "have-remote-offer" + state, the track will be attached to the first compatible transceiver + that was created by the most recent call to setRemoteDescription() + and does not have a local track. Otherwise, a new transceiver will + be created, as described in Section 4.1.4. 4.1.3. removeTrack The removeTrack method removes a MediaStreamTrack from the PeerConnection, using the RtpSender argument to indicate which sender should have its track removed. The sender's track is cleared, and the sender stops sending. Future calls to createOffer will mark the m= section associated with the sender as recvonly (if - transceiver.currentDirection is sendrecv) or as inactive (if - transceiver.currentDirection is sendonly). + transceiver.direction is sendrecv) or as inactive (if + transceiver.direction is sendonly). 4.1.4. addTransceiver The addTransceiver method adds a new RtpTransceiver to the PeerConnection. If a MediaStreamTrack argument is provided, then the transceiver will be configured with that media type and the track will be attached to the transceiver. Otherwise, the application MUST explicitly specify the type; this mode is useful for creating recvonly transceivers as well as for creating transceivers to which a track can be attached at some later point. 
@@ -1137,24 +1150,24 @@ Section 5.2.2. below. Session descriptions generated by createOffer must be immediately usable by setLocalDescription; if a system has limited resources (e.g. a finite number of decoders), createOffer should return an offer that reflects the current state of the system, so that setLocalDescription will succeed when it attempts to acquire those resources. Calling this method may do things such as generating new ICE - credentials, but does not result in candidate gathering, or cause - media to start or stop flowing. Specifically, the offer is not - applied, and does not become the pending local description, until - setLocalDescription is called. + credentials, but does not change the PeerConnection state, trigger + candidate gathering, or cause media to start or stop flowing. + Specifically, the offer is not applied, and does not become the + pending local description, until setLocalDescription is called. 4.1.7. createAnswer The createAnswer method generates a blob of SDP that contains a [RFC3264] SDP answer with the supported configuration for the session that is compatible with the parameters supplied in the most recent call to setRemoteDescription, which MUST have been called prior to calling createAnswer. Like createOffer, the returned blob contains descriptions of the media added to this PeerConnection, the codec/RTP/RTCP options negotiated for this session, and any @@ -1167,24 +1180,24 @@ SDP line, the generation of the SDP must follow the process defined for generating an answer from the document that specifies the given SDP line. The exact handling of answer generation is detailed in Section 5.3. below. Session descriptions generated by createAnswer must be immediately usable by setLocalDescription; like createOffer, the returned description should reflect the current state of the system. Calling this method may do things such as generating new ICE - credentials, but does not trigger candidate gathering or cause a - media state change. 
Specifically, the answer is not applied, and - does not become the pending local description, until - setLocalDescription is called. + credentials, but does not change the PeerConnection state, trigger + candidate gathering, or cause a media state change. Specifically, + the answer is not applied, and does not become the current local + description, until setLocalDescription is called. 4.1.8. SessionDescriptionType Session description objects (RTCSessionDescription) may be of type "offer", "pranswer", "answer" or "rollback". These types provide information as to how the description parameter should be parsed, and how the media state should be changed. "offer" indicates that a description should be parsed as an offer; said description may include many possible media configurations. A @@ -1296,23 +1309,27 @@ subsequently rolled back MUST be stopped and removed from the PeerConnection. However, an RtpTransceiver MUST NOT be removed if a track was attached to the RtpTransceiver via the addTrack method. This is so that an application may call addTrack, then call setRemoteDescription with an offer, then roll back that offer, then call createOffer and have a m= section for the added track appear in the generated offer. A rollback is performed by supplying a session description of type "rollback" with empty contents to either setLocalDescription or - setRemoteDescription, depending on which was most recently used (i.e. - if the new offer was supplied to setLocalDescription, the rollback - should be done using setLocalDescription as well). + setRemoteDescription. The effect MUST be the same regardless of + whether setLocalDescription or setRemoteDescription is called. + + A rollback may be performed if the PeerConnection is in any state + except for "stable". This means that both offers and provisional + answers can be rolled back. If a rollback is attempted in the + "stable" state, processing MUST stop and an error MUST be returned. 4.1.9. 
setLocalDescription The setLocalDescription method instructs the PeerConnection to apply the supplied session description as its local configuration. The type field indicates whether the description should be processed as an offer, provisional answer, final answer, or rollback; offers and answers are checked differently, using the various rules that exist for each SDP line. @@ -1444,66 +1461,62 @@ restart and kicking off a new gathering phase, in which the new servers will be used. If the ICE candidate pool has a nonzero size, and a local description has not yet been applied, any existing candidates will be discarded, and new candidates will be gathered from the new servers. o Any change to the ICE candidate policy affects the next gathering phase. If an ICE gathering phase has already started or completed, the 'needs-ice-restart' bit will be set. Either way, changes to the policy have no effect on the candidate pool, - because pooled candidates are not surfaced to the application - until a gathering phase occurs, and so any necessary filtering can - still be done on any pooled candidates. + because pooled candidates are not made available to the + application until a gathering phase occurs, and so any necessary + filtering can still be done on any pooled candidates. o The ICE candidate pool size MUST NOT be changed after applying a local description. If a local description has not yet been applied, any changes to the ICE candidate pool size take effect immediately; if increased, additional candidates are pre-gathered; if decreased, the now-superfluous candidates are discarded. o The bundle and RTCP-multiplexing policies MUST NOT be changed after the construction of the PeerConnection. This call may result in a change to the state of the ICE Agent. 4.1.17. 
addIceCandidate - The addIceCandidate method provides a remote candidate to the ICE - agent, which, if parsed successfully, will be added to the current - and/or pending remote description according to the rules defined for - Trickle ICE. The pair of MID and ufrag is used to determine the m= - section and ICE candidate generation to which the candidate belongs. - If the MID is not present, the m= section index is used to look up - the locally generated MID (see Section 5.9), which is used in place - of a supplied MID. If these values or the candidate string are - invalid, an error is generated. + The addIceCandidate method provides an update to the ICE agent via an + IceCandidate object (Section 3.5.2.1). If the IceCandidate's candidate + field is filled in, the IceCandidate is treated as a new remote ICE + candidate, which will be added to the current and/or pending remote + description according to the rules defined for Trickle ICE. + Otherwise, the IceCandidate is treated as an end-of-candidates + indication, as defined in [I-D.ietf-ice-trickle]. - The purpose of the ufrag is to resolve ambiguities when trickle ICE - is in progress during an ICE restart. If the ufrag is absent, the - candidate MUST be assumed to belong to the most recently applied - remote description. Connectivity checks will be sent to the new - candidate. + In either case, the m= section index, MID, and ufrag fields from the + supplied IceCandidate are used to determine which m= section and ICE + candidate generation the IceCandidate belongs to, as described in + Section 3.5.2.1 above. In the case of an end-of-candidates + indication, the absence of both the m= section index and MID fields + is interpreted to mean that the indication applies to all m= sections + in the specified ICE candidate generation. However, if both fields + are absent for a new remote candidate, this MUST be treated as an + invalid condition, as specified below.
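The classification rules in the new addIceCandidate text above can be sketched in TypeScript. This is only an illustration under stated assumptions: the interface, the field names (candidate, sdpMid, sdpMLineIndex, ufrag), and the function classifyIceCandidate are hypothetical and are not defined by this draft, and "filled in" is assumed here to mean a non-empty string. The lookup of the ICE candidate generation from the ufrag is not modeled.

```typescript
// Hypothetical input shape; field names are assumptions for illustration.
interface IceCandidateInput {
  candidate?: string | null;     // candidate string; absent or empty is assumed to mean "not filled in"
  sdpMid?: string | null;        // MID of the target m= section, if supplied
  sdpMLineIndex?: number | null; // m= section index, if supplied
  ufrag?: string | null;         // ICE ufrag (generation lookup is not modeled here)
}

// Hypothetical result type covering the three outcomes described in the text.
type IceUpdate =
  | { kind: "new-candidate"; candidate: string }
  | { kind: "end-of-candidates"; appliesToAllSections: boolean }
  | { kind: "invalid"; reason: string };

function classifyIceCandidate(input: IceCandidateInput): IceUpdate {
  const hasMid = input.sdpMid !== undefined && input.sdpMid !== null;
  const hasIndex =
    input.sdpMLineIndex !== undefined && input.sdpMLineIndex !== null;
  const candidate = input.candidate;

  if (typeof candidate === "string" && candidate.length > 0) {
    // A new remote candidate must identify its m= section by MID or index.
    if (!hasMid && !hasIndex) {
      return {
        kind: "invalid",
        reason: "new candidate supplied without MID or m= section index",
      };
    }
    return { kind: "new-candidate", candidate };
  }

  // No candidate string: treated as an end-of-candidates indication.
  // With neither MID nor index, it applies to all m= sections in the
  // specified ICE candidate generation.
  return {
    kind: "end-of-candidates",
    appliesToAllSections: !hasMid && !hasIndex,
  };
}
```

A caller would reject the "invalid" case with an error and pass the other two outcomes on to the ICE agent.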
- This method can also be used to provide an end-of-candidates - indication to the ICE agent, as defined in [I-D.ietf-ice-trickle]). - The MID and ufrag are used as described above to determine the m= - section and ICE generation for which candidate gathering is complete. - If the ufrag is not present, then the end-of-candidates indication - MUST be assumed to apply to the relevant m= section in the most - recently applied remote description. If neither the MID nor the m= - index is present, then the indication MUST be assumed to apply to all - m= sections in the most recently applied remote description. + If any IceCandidate fields contain invalid values, or an error occurs + during the processing of the IceCandidate object, the supplied + IceCandidate MUST be ignored and an error MUST be returned. - This call will result in a change to the state of the ICE Agent, and - may result in a change to media state if it results in connectivity - being established. + Otherwise, the new remote candidate or end-of-candidates indication + is supplied to the ICE agent. In the case of a new remote candidate, + connectivity checks will be sent to the new candidate. 4.2. RtpTransceiver 4.2.1. stop The stop method stops an RtpTransceiver. This will cause future calls to createOffer to generate a zero port for the associated m= section. See below for more details. 4.2.2. stopped @@ -1513,21 +1526,24 @@ that rejects the associated m= section. In either of these cases, it is set to "true", and otherwise will be set to "false". A stopped RtpTransceiver does not send any outgoing RTP or RTCP or process any incoming RTP or RTCP. It cannot be restarted. 4.2.3. setDirection The setDirection method sets the direction of a transceiver, which affects the direction property of the associated m= section on future - calls to createOffer and createAnswer. + calls to createOffer and createAnswer. 
The permitted values for + direction are "recvonly", "sendrecv", "sendonly", and "inactive", + mirroring the identically-named directional attributes defined in + [RFC4566], Section 6. When creating offers, the transceiver direction is directly reflected in the output, even for re-offers. When creating answers, the transceiver direction is intersected with the offered direction, as explained in Section 5.3 below. Note that while setDirection sets the direction property of the transceiver immediately (Section 4.2.4), this property does not immediately affect whether the transceiver's RtpSender will send or its RtpReceiver will receive. The direction in effect is represented @@ -1538,24 +1554,25 @@ The direction property indicates the last value passed into setDirection. If setDirection has never been called, it is set to the direction the transceiver was initialized with. 4.2.5. currentDirection The currentDirection property indicates the last negotiated direction for the transceiver's associated m= section. More specifically, it indicates the [RFC3264] directional attribute of the associated m= - section in the last applied answer, with "send" and "recv" directions - reversed if it was a remote answer. For example, if the directional - attribute for the associated m= section in a remote answer is - "recvonly", currentDirection is set to "sendonly". + section in the last applied answer (including provisional answers), + with "send" and "recv" directions reversed if it was a remote answer. + For example, if the directional attribute for the associated m= + section in a remote answer is "recvonly", currentDirection is set to + "sendonly". If an answer that references this transceiver has not yet been applied, or if the transceiver is stopped, currentDirection is set to null. 4.2.6. 
setCodecPreferences The setCodecPreferences method sets the codec preferences of a transceiver, which in turn affect the presence and order of codecs of the associated m= section on future calls to createOffer and @@ -1601,28 +1618,28 @@ o DTLS [RFC6347] or DTLS-SRTP [RFC5763], MUST be used, as appropriate for the media type, as specified in [I-D.ietf-rtcweb-security-arch] The SDES SRTP keying mechanism from [RFC4568] MUST NOT be used, as discussed in [I-D.ietf-rtcweb-security-arch]. 5.1.2. Profile Names and Interoperability For media m= sections, JSEP implementations MUST support the - "UDP/TLS/RTP/SAVPF" profile specified in [RFC7850], and MUST indicate + "UDP/TLS/RTP/SAVPF" profile specified in [RFC5764], and MUST indicate this profile for each media m= line they produce in an offer. For data m= sections, implementations MUST support the "UDP/DTLS/SCTP" profile and MUST indicate this profile for each data m= line they - produce in an offer. Because ICE can select either UDP [RFC5245] or - TCP [RFC6544] transport depending on network conditions, this - advertisement is consistent with ICE eventually selecting either - either UDP or TCP. + produce in an offer. Although these profiles are formally associated + with UDP, ICE can select either UDP [RFC5245] or TCP [RFC6544] + transport depending on network conditions, even when advertising a + UDP profile. Unfortunately, in an attempt at compatibility, some endpoints generate other profile strings even when they mean to support one of these profiles. For instance, an endpoint might generate "RTP/AVP" but supply "a=fingerprint" and "a=rtcp-fb" attributes, indicating its willingness to support "UDP/TLS/RTP/SAVPF" or "TCP/TLS/RTP/SAVPF". In order to simplify compatibility with such endpoints, JSEP implementations MUST follow the following rules when processing the media m= sections in a received offer: @@ -1658,21 +1675,24 @@ is present, assume AVPF timing, i.e., a default value of "trr- int=0". 
Otherwise, assume that AVPF is being used in an AVP compatible mode and use a value of "trr-int=4000". o For data m= sections, implementations MUST support receiving the "UDP/DTLS/SCTP", "TCP/DTLS/SCTP", or "DTLS/SCTP" (for backwards compatibility) profiles. Note that re-offers by JSEP implementations MUST use the correct profile strings even if the initial offer/answer exchange used an - (incorrect) older profile string. + (incorrect) older profile string. This simplifies JSEP behavior, + with minimal downside, as any remote endpoint that fails to handle + such a re-offer will also fail to handle a JSEP endpoint's initial + offer. 5.2. Constructing an Offer When createOffer is called, a new SDP description must be created that includes the functionality specified in [I-D.ietf-rtcweb-rtp-usage]. The exact details of this process are explained below. 5.2.1. Initial Offers @@ -1681,29 +1701,31 @@ The first step in generating an initial offer is to generate session- level attributes, as specified in [RFC4566], Section 5. Specifically: o The first SDP line MUST be "v=0", as specified in [RFC4566], Section 5.1 o The second SDP line MUST be an "o=" line, as specified in [RFC4566], Section 5.2. The value of the field SHOULD - be "-". [RFC3264] requires that the be representable as - a 64-bit signed integer. It is RECOMMENDED that the be - generated as a 64-bit quantity with the high bit being sent to - zero and the remaining 63 bits being cryptographically random. - The value of the tuple - SHOULD be set to a non-meaningful address, such as IN IP4 0.0.0.0, - to prevent leaking the local address in this field. As mentioned - in [RFC4566], the entire o= line needs to be unique, but selecting - a random number for is sufficient to accomplish this. + be "-". The sess-id MUST be representable by a 64-bit signed + integer, and the initial value MUST be less than (2**62)-1, as + required by [RFC3264]. 
It is RECOMMENDED that the sess-id be + constructed by generating a 64-bit quantity with the two highest + bits being set to zero and the remaining 62 bits being + cryptographically random. The value of the + tuple SHOULD be set to a non-meaningful address, + such as IN IP4 0.0.0.0, to prevent leaking the local address in + this field. As mentioned in [RFC4566], the entire o= line needs + to be unique, but selecting a random number for is + sufficient to accomplish this. o The third SDP line MUST be a "s=" line, as specified in [RFC4566], Section 5.3; to match the "o=" line, a single dash SHOULD be used as the session name, e.g. "s=-". Note that this differs from the advice in [RFC4566] which proposes a single space, but as both "o=" and "s=" are meaningless in JSEP, having the same meaningless value seems clearer. o Session Information ("i="), URI ("u="), Email Address ("e="), Phone Number ("p="), Repeat Times ("r="), and Time Zones ("z=") @@ -2309,27 +2329,25 @@ a=mid:v1 a=recvonly The Section 7.2 example later in this document shows a more involved case of "LS" group generation. The next step is to generate m= sections for each m= section that is present in the remote offer, as specified in [RFC3264], Section 6. For the purposes of this discussion, any session-level attributes in the offer that are also valid as media-level attributes are - considered to be present in each m= section. - - The next step is to go through each offered m= section. Each offered - m= section will have an associated RtpTransceiver, as described in - Section 5.9. If there are more RtpTransceivers than there are m= - sections, the unmatched RtpTransceivers will need to be associated in - a subsequent offer. + considered to be present in each m= section. Each offered m= section + will have an associated RtpTransceiver, as described in Section 5.9. 
+ If there are more RtpTransceivers than there are m= sections, the + unmatched RtpTransceivers will need to be associated in a subsequent + offer. For each offered m= section, if any of the following conditions are true, the corresponding m= section in the answer MUST be marked as rejected by setting the port in the m= line to zero, as indicated in [RFC3264], Section 6, and further processing for this m= section can be skipped: o The associated RtpTransceiver has been stopped. o None of the offered media formats are supported and, if @@ -2611,32 +2628,27 @@ not offer it, making silence suppression support bilateral even with non-JSEP endpoints. 5.4. Modifying an Offer or Answer The SDP returned from createOffer or createAnswer MUST NOT be changed before passing it to setLocalDescription. If precise control over the SDP is needed, the aforementioned createOffer/createAnswer options or RtpTransceiver APIs MUST be used. - Note that the application MAY modify the SDP to reduce the - capabilities in the offer it sends to the far side (post- - setLocalDescription) or the offer that it installs from the far side - (pre-setRemoteDescription), as long as it remains a valid SDP offer - and specifies a subset of what was in the original offer. This is - safe because the answer is not permitted to expand capabilities, and - therefore will just respond to what is present in the offer. - - The application SHOULD NOT modify the SDP in the answer it transmits, - as the answer contains the negotiated capabilities, and this can - cause the two sides to have different ideas about what exactly was - negotiated. + After calling setLocalDescription with an offer or answer, the + application MAY modify the SDP to reduce its capabilities before + sending it to the far side, as long as it follows the rules above + that define a valid JSEP offer or answer. 
Likewise, an application + that has received an offer or answer from a peer MAY modify the + received SDP, subject to the same constraints, before calling + setRemoteDescription. As always, the application is solely responsible for what it sends to the other party, and all incoming SDP will be processed by the JSEP implementation to the extent of its capabilities. It is an error to assume that all SDP is well-formed; however, one should be able to assume that any implementation of this specification will be able to process, as a remote offer or answer, unmodified SDP coming from any other implementation of this specification. 5.5. Processing a Local Description @@ -2952,26 +2964,25 @@ present. o All RID values referenced in an "a=simulcast" line MUST exist as "a=rid" lines. o Each m= section is also checked to ensure prohibited features are not used. o If the RTP/RTCP multiplexing policy is "require", each m= section MUST contain an "a=rtcp-mux" attribute. If an m= section contains - an "a=rtcp-mux-only" attribute then that section MUST also contain - an "a=rtcp-mux" attribute. + an "a=rtcp-mux-only" attribute, that section MUST also contain an + "a=rtcp-mux" attribute. - o If this m= section was present in the previous answer then the - state of RTP/RTCP multiplexing MUST match what was previously - negotiated. + o If an m= section was present in the previous answer, the state of + RTP/RTCP multiplexing MUST match what was previously negotiated. If this session description is of type "pranswer" or "answer", the following additional checks are applied: o The session description must follow the rules defined in [RFC3264], Section 6, including the requirement that the number of m= sections MUST exactly match the number of m= sections in the associated offer. o For each m= section, the media type and protocol values MUST @@ -2980,21 +2991,21 @@ If any of the preceding checks failed, processing MUST stop and an error MUST be returned. 5.8. 
Applying a Local Description The following steps are performed at the media engine level to apply a local description. If an error is returned, the session MUST be restored to the state it was in before performing these steps. - Next, m= sections are processed. For each m= section, the following + First, m= sections are processed. For each m= section, the following steps MUST be performed; if any parameters are out of bounds, or cannot be applied, processing MUST stop and an error MUST be returned. o If this m= section is new, begin gathering candidates for it, as defined in [RFC5245], Section 4.1.1, unless it is definitively being bundled (either this is an offer and the m= section is marked bundle-only, or it is an answer and the m= section is bundled into another m= section). @@ -3076,26 +3087,27 @@ o Any "AS" bandwidth value MUST be ignored, as the meaning of this construct at the session level is not well defined. For each m= section, the following steps MUST be performed; if any parameters are out of bounds, or cannot be applied, processing MUST stop and an error MUST be returned. o If the ICE ufrag or password changed from the previous remote description: [RFC5245]. - * If the description is of type "offer", note that an ICE restart - is needed, as described in [RFC5245], Section 9.1.1.1 . + * If the description is of type "offer", the implementation MUST + note that an ICE restart is needed, as described in [RFC5245], + Section 9.1.1.1. * If the description is of type "answer" or "pranswer", then check to see if the current local description is an ICE - restart, and if not, generate an error. It the PeerConnection + restart, and if not, generate an error. If the PeerConnection state is "have-remote-pranswer", and the ICE ufrag or password changed from the previous provisional answer, then signal the ICE agent to discard any previous ICE check list state for the m= section.
Finally, signal the ICE agent to begin checks as described in [RFC5245], Section 9.3.1.1. o If the current local description indicates an ICE restart, and either the ICE ufrag or password has not changed from the previous remote description, as prescribed by [RFC5245], Section 9.2.1.1, generate an error. @@ -3211,23 +3224,23 @@ suppression for all supported media formats with the same clockrate, as described in [RFC3389], Section 5, except for formats that have their own internal silence suppression mechanisms. Silence suppression for such formats (e.g., Opus) is controlled via fmtp parameters, as discussed in Section 5.2.3.2. + For each specified "telephone-event" media format, enable DTMF transmission for all supported media formats with the same clockrate, as described in [RFC4733], Section 2.5.1.2. - If the application attempts to transmit DTMF when using a - media format that does not have a corresponding telephone- - event format, this MUST result in an error. + If there are any supported media formats that do not have a + corresponding telephone-event format, disable DTMF + transmission for those formats. + For any specified "ptime" value, configure the available media formats to use the specified packet size when sending. If the specified size is not supported for a media format, use the next closest value instead. Finally, if this description is of type "pranswer" or "answer", follow the processing defined in Section 5.10 below. 5.10. Applying an Answer @@ -4345,91 +4358,93 @@ ac701365-eb06-42df-cc93-7f22bc308789 8. Security Considerations The IETF has published separate documents [I-D.ietf-rtcweb-security-arch] [I-D.ietf-rtcweb-security] describing the security architecture for WebRTC as a whole. The remainder of this section describes security considerations for this document. While formally the JSEP interface is an API, it is better to think of - it is an Internet protocol, with the JS being untrustworthy from the - perspective of the endpoint. 
Thus, the threat model of [RFC3552] - applies. In particular, JS can call the API in any order and with - any inputs, including malicious ones. This is particularly relevant - when we consider the SDP which is passed to setLocalDescription(). - While correct API usage requires that the application pass in SDP - which was derived from createOffer() or createAnswer(), there is no - guarantee that applications do so. The JSEP implementation MUST be - prepared for the JS to pass in bogus data instead. + it as an Internet protocol, with the application JavaScript being + untrustworthy from the perspective of the JSEP implementation. Thus, + the threat model of [RFC3552] applies. In particular, JavaScript can + call the API in any order and with any inputs, including malicious + ones. This is particularly relevant when we consider the SDP which + is passed to setLocalDescription(). While correct API usage requires + that the application pass in SDP which was derived from createOffer() + or createAnswer(), there is no guarantee that applications do so. + The JSEP implementation MUST be prepared for the JavaScript to pass + in bogus data instead. - Conversely, the application programmer MUST recognize that the JS - does not have complete control of endpoint behavior. One case that - bears particular mention is that editing ICE candidates out of the - SDP or suppressing trickled candidates does not have the expected - behavior: implementations will still perform checks from those - candidates even if they are not sent to the other side. Thus, for - instance, it is not possible to prevent the remote peer from learning - your public IP address by removing server reflexive candidates. - Applications which wish to conceal their public IP address should - instead configure the ICE agent to use only relay candidates. + Conversely, the application programmer needs to be aware that the + JavaScript does not have complete control of endpoint behavior.
One + case that bears particular mention is that editing ICE candidates out + of the SDP or suppressing trickled candidates does not have the + expected behavior: implementations will still perform checks from + those candidates even if they are not sent to the other side. Thus, + for instance, it is not possible to prevent the remote peer from + learning your public IP address by removing server reflexive + candidates. Applications which wish to conceal their public IP + address should instead configure the ICE agent to use only relay + candidates. 9. IANA Considerations This document requires no actions from IANA. 10. Acknowledgements Harald Alvestrand, Taylor Brandstetter, Suhas Nandakumar, and Peter Thatcher provided significant text for this draft. Bernard Aboba, Adam Bergkvist, Dan Burnett, Ben Campbell, Alissa Cooper, Richard Ejzak, Stefan Hakansson, Ted Hardie, Christer Holmberg, Andrew Hutton, - Randell Jesup, Matthew Kaufman, Anant Narayanan, Adam Roach, Neil - Stratford, Martin Thomson, Sean Turner, and Magnus Westerlund all - provided valuable feedback on this proposal. + Randell Jesup, Matthew Kaufman, Anant Narayanan, Adam Roach, Robert + Sparks, Neil Stratford, Martin Thomson, Sean Turner, and Magnus + Westerlund all provided valuable feedback on this proposal. 11. References 11.1. Normative References [I-D.ietf-avtext-rid] Roach, A., Nandakumar, S., and P. Thatcher, "RTP Stream Identifier Source Description (SDES)", draft-ietf-avtext- rid-09 (work in progress), October 2016. [I-D.ietf-ice-trickle] Ivov, E., Rescorla, E., Uberti, J., and P. Saint-Andre, "Trickle ICE: Incremental Provisioning of Candidates for the Interactive Connectivity Establishment (ICE) - Protocol", draft-ietf-ice-trickle-12 (work in progress), - June 2017. + Protocol", draft-ietf-ice-trickle-13 (work in progress), + July 2017. [I-D.ietf-mmusic-dtls-sdp] Holmberg, C. and R.
Shpount, "Using the SDP Offer/Answer - Mechanism for DTLS", draft-ietf-mmusic-dtls-sdp-26 (work - in progress), June 2017. + Mechanism for DTLS", draft-ietf-mmusic-dtls-sdp-28 (work + in progress), August 2017. [I-D.ietf-mmusic-msid] Alvestrand, H., "WebRTC MediaStream Identification in the Session Description Protocol", draft-ietf-mmusic-msid-16 (work in progress), February 2017. [I-D.ietf-mmusic-mux-exclusive] Holmberg, C., "Indicating Exclusive Support of RTP/RTCP Multiplexing using SDP", draft-ietf-mmusic-mux- exclusive-12 (work in progress), May 2017. [I-D.ietf-mmusic-rid] Thatcher, P., Zanaty, M., Nandakumar, S., Burman, B., Roach, A., and B. Campen, "RTP Payload Format - Restrictions", draft-ietf-mmusic-rid-10 (work in - progress), March 2017. + Restrictions", draft-ietf-mmusic-rid-11 (work in + progress), July 2017. [I-D.ietf-mmusic-sctp-sdp] Holmberg, C., Shpount, R., Loreto, S., and G. Camarillo, "Session Description Protocol (SDP) Offer/Answer Procedures For Stream Control Transmission Protocol (SCTP) over Datagram Transport Layer Security (DTLS) Transport.", draft-ietf-mmusic-sctp-sdp-26 (work in progress), April 2017. [I-D.ietf-mmusic-sdp-bundle-negotiation] @@ -4439,26 +4454,26 @@ negotiation-38 (work in progress), April 2017. [I-D.ietf-mmusic-sdp-mux-attributes] Nandakumar, S., "A Framework for SDP Attributes when Multiplexing", draft-ietf-mmusic-sdp-mux-attributes-16 (work in progress), December 2016. [I-D.ietf-mmusic-sdp-simulcast] Burman, B., Westerlund, M., Nandakumar, S., and M. Zanaty, "Using Simulcast in SDP and RTP Sessions", draft-ietf- - mmusic-sdp-simulcast-08 (work in progress), March 2017. + mmusic-sdp-simulcast-10 (work in progress), July 2017. [I-D.ietf-rtcweb-fec] Uberti, J., "WebRTC Forward Error Correction - Requirements", draft-ietf-rtcweb-fec-05 (work in - progress), May 2017. + Requirements", draft-ietf-rtcweb-fec-06 (work in + progress), July 2017. [I-D.ietf-rtcweb-rtp-usage] Perkins, C., Westerlund, M., and J. 
Ott, "Web Real-Time Communication (WebRTC): Media Transport and Use of RTP", draft-ietf-rtcweb-rtp-usage-26 (work in progress), March 2016. [I-D.ietf-rtcweb-security] Rescorla, E., "Security Considerations for WebRTC", draft- ietf-rtcweb-security-08 (work in progress), February 2015. @@ -4468,219 +4483,219 @@ rtcweb-security-arch-12 (work in progress), June 2016. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, . [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, - DOI 10.17487/RFC3261, June 2002, - . + DOI 10.17487/RFC3261, June 2002, . [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, - DOI 10.17487/RFC3264, June 2002, - . + DOI 10.17487/RFC3264, June 2002, . [RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC Text on Security Considerations", BCP 72, RFC 3552, - DOI 10.17487/RFC3552, July 2003, - . + DOI 10.17487/RFC3552, July 2003, . [RFC3605] Huitema, C., "Real Time Control Protocol (RTCP) attribute in Session Description Protocol (SDP)", RFC 3605, - DOI 10.17487/RFC3605, October 2003, - . + DOI 10.17487/RFC3605, October 2003, . [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC 3711, DOI 10.17487/RFC3711, March 2004, - . + . [RFC3890] Westerlund, M., "A Transport Independent Bandwidth Modifier for the Session Description Protocol (SDP)", RFC 3890, DOI 10.17487/RFC3890, September 2004, - . + . [RFC4145] Yon, D. and G. Camarillo, "TCP-Based Media Transport in the Session Description Protocol (SDP)", RFC 4145, - DOI 10.17487/RFC4145, September 2005, - . + DOI 10.17487/RFC4145, September 2005, . [RFC4566] Handley, M., Jacobson, V., and C. 
Perkins, "SDP: Session Description Protocol", RFC 4566, DOI 10.17487/RFC4566, - July 2006, . + July 2006, . [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, "Extended RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, - DOI 10.17487/RFC4585, July 2006, - . + DOI 10.17487/RFC4585, July 2006, . [RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/SAVPF)", RFC 5124, DOI 10.17487/RFC5124, February - 2008, . + 2008, . [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols", RFC 5245, - DOI 10.17487/RFC5245, April 2010, - . + DOI 10.17487/RFC5245, April 2010, . [RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP Header Extensions", RFC 5285, DOI 10.17487/RFC5285, July - 2008, . + 2008, . [RFC5761] Perkins, C. and M. Westerlund, "Multiplexing RTP Data and Control Packets on a Single Port", RFC 5761, - DOI 10.17487/RFC5761, April 2010, - . + DOI 10.17487/RFC5761, April 2010, . [RFC5888] Camarillo, G. and H. Schulzrinne, "The Session Description Protocol (SDP) Grouping Framework", RFC 5888, - DOI 10.17487/RFC5888, June 2010, - . + DOI 10.17487/RFC5888, June 2010, . [RFC6236] Johansson, I. and K. Jung, "Negotiation of Generic Image Attributes in the Session Description Protocol (SDP)", RFC 6236, DOI 10.17487/RFC6236, May 2011, - . + . [RFC6347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer Security Version 1.2", RFC 6347, DOI 10.17487/RFC6347, - January 2012, . + January 2012, . [RFC6716] Valin, JM., Vos, K., and T. Terriberry, "Definition of the Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716, - September 2012, . + September 2012, . [RFC6904] Lennox, J., "Encryption of Header Extensions in the Secure Real-time Transport Protocol (SRTP)", RFC 6904, - DOI 10.17487/RFC6904, April 2013, - . 
+ DOI 10.17487/RFC6904, April 2013, . [RFC7160] Petit-Huguenin, M. and G. Zorn, Ed., "Support for Multiple Clock Rates in an RTP Session", RFC 7160, - DOI 10.17487/RFC7160, April 2014, - . + DOI 10.17487/RFC7160, April 2014, . [RFC7587] Spittka, J., Vos, K., and JM. Valin, "RTP Payload Format for the Opus Speech and Audio Codec", RFC 7587, - DOI 10.17487/RFC7587, June 2015, - . + DOI 10.17487/RFC7587, June 2015, . [RFC7742] Roach, A., "WebRTC Video Processing and Codec Requirements", RFC 7742, DOI 10.17487/RFC7742, March 2016, - . + . [RFC7850] Nandakumar, S., "Registering Values of the SDP 'proto' Field for Transporting RTP Media over TCP under Various RTP Profiles", RFC 7850, DOI 10.17487/RFC7850, April 2016, - . + . [RFC7874] Valin, JM. and C. Bran, "WebRTC Audio Codec and Processing Requirements", RFC 7874, DOI 10.17487/RFC7874, May 2016, - . + . [RFC8108] Lennox, J., Westerlund, M., Wu, Q., and C. Perkins, "Sending Multiple RTP Streams in a Single RTP Session", RFC 8108, DOI 10.17487/RFC8108, March 2017, - . + . [RFC8122] Lennox, J. and C. Holmberg, "Connection-Oriented Media Transport over the Transport Layer Security (TLS) Protocol in the Session Description Protocol (SDP)", RFC 8122, - DOI 10.17487/RFC8122, March 2017, - . + DOI 10.17487/RFC8122, March 2017, . 11.2. Informative References [I-D.ietf-mmusic-trickle-ice-sip] Ivov, E., Stach, T., Marocco, E., and C. Holmberg, "A Session Initiation Protocol (SIP) usage for Trickle ICE", - draft-ietf-mmusic-trickle-ice-sip-07 (work in progress), - March 2017. + draft-ietf-mmusic-trickle-ice-sip-08 (work in progress), + July 2017. [I-D.ietf-rtcweb-ip-handling] Uberti, J. and G. Shieh, "WebRTC IP Address Handling - Requirements", draft-ietf-rtcweb-ip-handling-03 (work in - progress), January 2017. + Requirements", draft-ietf-rtcweb-ip-handling-04 (work in + progress), July 2017. [I-D.ietf-rtcweb-sdp] Nandakumar, S. and C. 
               Jennings, "Annotated Example SDP for WebRTC",
               draft-ietf-rtcweb-sdp-06 (work in progress), April 2017.

    [RFC3389]  Zopf, R., "Real-time Transport Protocol (RTP) Payload for
               Comfort Noise (CN)", RFC 3389, DOI 10.17487/RFC3389,
-              September 2002, .
+              September 2002, .

    [RFC3556]  Casner, S., "Session Description Protocol (SDP) Bandwidth
               Modifiers for RTP Control Protocol (RTCP) Bandwidth",
               RFC 3556, DOI 10.17487/RFC3556, July 2003,
-              .
+              .

    [RFC3960]  Camarillo, G. and H. Schulzrinne, "Early Media and Ringing
               Tone Generation in the Session Initiation Protocol (SIP)",
               RFC 3960, DOI 10.17487/RFC3960, December 2004,
-              .
+              .

    [RFC4568]  Andreasen, F., Baugher, M., and D. Wing, "Session
               Description Protocol (SDP) Security Descriptions for Media
               Streams", RFC 4568, DOI 10.17487/RFC4568, July 2006,
-              .
+              .

    [RFC4588]  Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
               Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
-              DOI 10.17487/RFC4588, July 2006,
-              .
+              DOI 10.17487/RFC4588, July 2006, .

    [RFC4733]  Schulzrinne, H. and T. Taylor, "RTP Payload for DTMF
               Digits, Telephony Tones, and Telephony Signals", RFC 4733,
-              DOI 10.17487/RFC4733, December 2006,
-              .
+              DOI 10.17487/RFC4733, December 2006, .

    [RFC5506]  Johansson, I. and M. Westerlund, "Support for Reduced-Size
               Real-Time Transport Control Protocol (RTCP): Opportunities
               and Consequences", RFC 5506, DOI 10.17487/RFC5506, April
-              2009, .
+              2009, .

    [RFC5576]  Lennox, J., Ott, J., and T. Schierl, "Source-Specific
               Media Attributes in the Session Description Protocol
               (SDP)", RFC 5576, DOI 10.17487/RFC5576, June 2009,
-              .
+              .

    [RFC5763]  Fischl, J., Tschofenig, H., and E. Rescorla, "Framework
               for Establishing a Secure Real-time Transport Protocol
               (SRTP) Security Context Using Datagram Transport Layer
               Security (DTLS)", RFC 5763, DOI 10.17487/RFC5763, May
-              2010, .
+              2010, .

    [RFC5764]  McGrew, D. and E.
               Rescorla, "Datagram Transport Layer Security (DTLS)
               Extension to Establish Keys for the Secure Real-time
               Transport Protocol (SRTP)", RFC 5764,
-              DOI 10.17487/RFC5764, May 2010,
-              .
+              DOI 10.17487/RFC5764, May 2010, .

    [RFC6464]  Lennox, J., Ed., Ivov, E., and E. Marocco, "A Real-time
               Transport Protocol (RTP) Header Extension for Client-to-
               Mixer Audio Level Indication", RFC 6464,
-              DOI 10.17487/RFC6464, December 2011,
-              .
+              DOI 10.17487/RFC6464, December 2011, .

    [RFC6544]  Rosenberg, J., Keranen, A., Lowekamp, B., and A. Roach,
               "TCP Candidates with Interactive Connectivity
               Establishment (ICE)", RFC 6544, DOI 10.17487/RFC6544,
-              March 2012, .
+              March 2012, .

    [TS26.114]
               3GPP TS 26.114 V12.8.0, "3rd Generation Partnership
               Project; Technical Specification Group Services and System
               Aspects; IP Multimedia Subsystem (IMS); Multimedia
               Telephony; Media handling and interaction (Release 12)",
               December 2014, .

    [W3C.webrtc]
               Bergkvist, A., Burnett, D., Jennings, C., Narayanan, A.,
@@ -4728,21 +4743,38 @@
    | rid                    | [I-D.ietf-mmusic-rid] Section 10         |
    | simulcast              | [I-D.ietf-mmusic-sdp-simulcast] Section  |
    |                        | 6.1                                      |
    | tls-id                 | [I-D.ietf-mmusic-dtls-sdp] Section 4     |
    +------------------------+------------------------------------------+

                       Table 1: SDP ABNF References

 Appendix B.  Change log

-   Note: This section will be removed by RFC Editor before publication.
+   Note to RFC Editor: Please remove this section before publication.
+
+   Changes in draft-22:
+
+   o  Clarify currentDirection versus direction.
+
+   o  Correct session-id text so that it aligns with RFC 3264.
+
+   o  Clarify that generated ICE candidate objects must have all four
+      fields.
+
+   o  Make rollback work from any state besides stable and regardless of
+      whether setLocalDescription or setRemoteDescription is used.
+
+   o  Allow modifying SDP before sending or after receiving either
+      offers or answers (previously this was forbidden for answers).
+
+   o  Provide rationale for several design choices.

    Changes in draft-21:

    o  Change dtls-id to tls-id to match MMUSIC draft.

    o  Replace regular expression for proto field with a list and clarify
       that the answer must exactly match the offer.

    o  Remove text about how to error check on setLocal because local
       descriptions cannot be changed.