--- 1/draft-ietf-rtcweb-security-arch-01.txt 2012-06-05 19:14:11.561341769 +0200
+++ 2/draft-ietf-rtcweb-security-arch-02.txt 2012-06-05 19:14:11.597342957 +0200
@@ -1,35 +1,35 @@
  RTCWEB E. Rescorla
  Internet-Draft RTFM, Inc.
-Intended status: Standards Track March 12, 2012
-Expires: September 13, 2012
+Intended status: Standards Track June 5, 2012
+Expires: December 7, 2012
  RTCWEB Security Architecture
- draft-ietf-rtcweb-security-arch-01
+ draft-ietf-rtcweb-security-arch-02
  Abstract
  The Real-Time Communications on the Web (RTCWEB) working group is
- tasked with standardizing protocols for real-time communications
- between Web browsers. The major use cases for RTCWEB technology are
- real-time audio and/or video calls, Web conferencing, and direct data
- transfer. Unlike most conventional real-time systems (e.g., SIP-
- based soft phones) RTCWEB communications are directly controlled by
- some Web server, which poses new security challenges. For instance,
- a Web browser might expose a JavaScript API which allows a server to
- place a video call. Unrestricted access to such an API would allow
- any site which a user visited to "bug" a user's computer, capturing
- any activity which passed in front of their camera. [I-D.ietf-
- rtcweb-security] defines the RTCWEB threat model. This document
- defines an architecture which provides security within that threat
- model.
+ tasked with standardizing protocols for enabling real-time
+ communications within user-agents using web technologies (e.g
+ JavaScript). The major use cases for RTCWEB technology are real-time
+ audio and/or video calls, Web conferencing, and direct data transfer.
+ Unlike most conventional real-time systems (e.g., SIP-based soft
+ phones) RTCWEB communications are directly controlled by some Web
+ server, which poses new security challenges. For instance, a Web
+ browser might expose a JavaScript API which allows a server to place
+ a video call.
Unrestricted access to such an API would allow any
+ site which a user visited to "bug" a user's computer, capturing any
+ activity which passed in front of their camera. [I-D.ietf-rtcweb-
+ security] defines the RTCWEB threat model. This document defines an
+ architecture which provides security within that threat model.
  Legal
  THIS DOCUMENT AND THE INFORMATION CONTAINED THEREIN ARE PROVIDED ON
  AN "AS IS" BASIS AND THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
  REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
  IETF TRUST, AND THE INTERNET ENGINEERING TASK FORCE, DISCLAIM ALL
  WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
  WARRANTY THAT THE USE OF THE INFORMATION THEREIN WILL NOT INFRINGE
  ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
@@ -43,21 +43,21 @@
  Internet-Drafts are working documents of the Internet Engineering
  Task Force (IETF). Note that other groups may also distribute
  working documents as Internet-Drafts. The list of current Internet-
  Drafts is at http://datatracker.ietf.org/drafts/current/.
  Internet-Drafts are draft documents valid for a maximum of six months
  and may be updated, replaced, or obsoleted by other documents at any
  time. It is inappropriate to use Internet-Drafts as reference
  material or to cite them other than as "work in progress."
- This Internet-Draft will expire on September 13, 2012.
+ This Internet-Draft will expire on December 7, 2012.
  Copyright Notice
  Copyright (c) 2012 IETF Trust and the persons identified as the
  document authors. All rights reserved.
  This document is subject to BCP 78 and the IETF Trust's Legal
  Provisions Relating to IETF Documents
  (http://trustee.ietf.org/license-info) in effect on the date of
  publication of this document. Please review these documents
@@ -93,27 +93,27 @@
  4.4. Communications and Consent Freshness . . . . . . . . . . . 10
  5. Detailed Technical Description . . . . . . . . . . . . . . . . 10
  5.1. Origin and Web Security Issues . . . . . . . . . . . . .
. 10
  5.2. Device Permissions Model . . . . . . . . . . . . . . . . . 11
  5.3. Communications Consent . . . . . . . . . . . . . . . . . . 12
  5.4. IP Location Privacy . . . . . . . . . . . . . . . . . . . 13
  5.5. Communications Security . . . . . . . . . . . . . . . . . 13
  5.6. Web-Based Peer Authentication . . . . . . . . . . . . . . 15
  6. Security Considerations . . . . . . . . . . . . . . . . . . . 16
  6.1. Communications Security . . . . . . . . . . . . . . . . . 16
- 6.2. Privacy . . . . . . . . . . . . . . . . . . . . . . . . . 16
+ 6.2. Privacy . . . . . . . . . . . . . . . . . . . . . . . . . 17
  6.3. Denial of Service . . . . . . . . . . . . . . . . . . . . 17
  7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 18
  8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 18
  8.1. Normative References . . . . . . . . . . . . . . . . . . . 18
  8.2. Informative References . . . . . . . . . . . . . . . . . . 19
- Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 19
+ Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 20
  1. Introduction
  The Real-Time Communications on the Web (RTCWEB) working group is
  tasked with standardizing protocols for real-time communications
  between Web browsers. The major use cases for RTCWEB technology are
  real-time audio and/or video calls, Web conferencing, and direct data
  transfer. Unlike most conventional real-time systems, (e.g., SIP-
  based[RFC3261] soft phones) RTCWEB communications are directly
  controlled by some Web server, as shown in Figure 1.
@@ -436,21 +436,22 @@
  functionality from mixed content pages. It is RECOMMENDED that
  browsers which allow active mixed content nevertheless disable RTCWEB
  functionality in mixed content settings. [[ OPEN ISSUE: Should this
  be a 2119 MUST?
It's not clear what set of conditions would make this OK, other than
  that browser manufacturers have traditionally been permissive here
  here.]]
  Note that it is possible for a page which was not mixed content to
  become mixed content during the duration of the call.
  Implementations MAY choose to terminate the call or display a warning
  at that point, but it is also permissible to ignore this condition.
  This is a deliberate implementation
- complexity versus security tradeoff.
+ complexity versus security tradeoff. [[ OPEN ISSUE:: Should we be
+ more aggressive about this?]]
  5.2. Device Permissions Model
  Implementations MUST obtain explicit user consent prior to providing
  access to the camera and/or microphone. Implementations MUST at
  minimum support the following two permissions models:
  o Requests for one-time camera/microphone access.
  o Requests for permanent access.
@@ -512,25 +513,27 @@
  directly verifiable. Section 5.5 provides more on this.
  5.3. Communications Consent
  Browser client implementations of RTCWEB MUST implement ICE. Server
  gateway implementations which operate only at public IP addresses may
  implement ICE-Lite.
  Browser implementations MUST verify reachability via ICE prior to
  sending any non-ICE packets to a given destination. Implementations
- MUST NOT provide the ICE transaction ID to JavaScript. [Note: this
- document takes no position on the split between ICE in JS and ICE in
- the browser. The above text is written the way it is for editorial
- convenience and will be modified appropriately if the WG decides on
- ICE in the JS.]
+ MUST NOT provide the ICE transaction ID to JavaScript during the
+ lifetime of the transaction (i.e., during the period when the ICE
+ stack would accept a new response for that transaction). [Note:
+ this document takes no position on the split between ICE in JS and
+ ICE in the browser.
The above text is written the way it is for
+ editorial convenience and will be modified appropriately if the WG
+ decides on ICE in the JS.]
  Implementations MUST send keepalives no less frequently than every 30
  seconds regardless of whether traffic is flowing or not. If a
  keepalive fails then the implementation MUST either attempt to find a
  new valid path via ICE or terminate media for that ICE component.
  Note that ICE [RFC5245]; Section 10 keepalives use STUN Binding
  Indications which are one-way and therefore not sufficient. Instead,
  the consent freshness mechanism [I-D.muthu-behave-consent-freshness]
  MUST be used.
@@ -559,21 +562,22 @@
  immediately on incoming call notification, thus reducing post-dial
  delay, but also to avoid disclosing the user's IP address until they
  have decided to answer.
  5.5. Communications Security
  Implementations MUST implement DTLS [RFC4347] and DTLS-SRTP
  [RFC5763][RFC5764]. All data channels MUST be secured via DTLS.
  DTLS-SRTP MUST be offered for every media channel and MUST be the
  default; i.e., if an implementation receives an offer for DTLS-SRTP
- and SDES and/or plain RTP, DTLS-SRTP MUST be selected.
+ and SDES, DTLS-SRTP MUST be selected. Media traffic MUST NOT be sent
+ over plain (unencrypted) RTP.
  [OPEN ISSUE: What should the settings be here? MUST?]
  Implementations MAY support SDES and RTP for media traffic for
  backward compatibility purposes.
  API Requirement: The API MUST provide a mechanism to indicate that a
  fresh DTLS key pair is to be generated for a specific call. This is
  intended to allow for unlinkability.
Note that there are also settings where it is attractive to use the
  same keying material repeatedly, especially those with key
  continuity-based
@@ -594,40 +598,37 @@
  The following properties SHOULD be displayed "up-front" in the
  browser chrome, i.e., without requiring the user to ask for them:
  * A client MUST provide a user interface through which a user may
  determine the security characteristics for currently-displayed
  audio and video stream(s)
  * A client MUST provide a user interface through which a user may
  determine the security characteristics for transmissions of their
  microphone audio and camera video.
  * The "security characteristics" MUST include an indication as to
- whether or not the transmission is cryptographically protected
- and whether that protection is based on a key that was
- delivered out-of-band (from a server) or was generated as a
- result of a pairwise negotiation.
- * If the far endpoint was directly verified (see Section 5.6) the
- "security characteristics" MUST include the verified
- information.
+ whether the cryptographic keys were delivered out-of-band (from
+ a server) or were generated as a result of a pairwise
+ negotiation.
+ * If the far endpoint was directly verified, either via a third-
+ party verifiable X.509 certificate or via a Web IdP mechanism
+ (see Section 5.6) the "security characteristics" MUST include
+ the verified information.
  The following properties are more likely to require some "drill-
  down" from the user:
- * If the transmission is cryptographically protected, the The
- algorithms in use (For example: "AES-CBC" or "Null Cipher".)
- * If the transmission is cryptographically protected, the
- "security characteristics" MUST indicate whether PFS is
+ * The cryptographic algorithms in use (For example: "AES-CBC" or
+ "Null Cipher".)
+ * The "security characteristics" MUST indicate whether PFS is
  provided.
-
- * If the transmission is cryptographically protected via an end-
- to-end mechanism the "security characteristics" MUST include
- some mechanism to allow an out-of-band verification of the
- peer, such as a certificate fingerprint or an SAS.
+ * The "security characteristics" MUST include some mechanism to
+ allow an out-of-band verification of the peer, such as a
+ certificate fingerprint or an SAS.
  5.6. Web-Based Peer Authentication
  In a number of cases, it is desirable for the endpoint (i.e., the
  browser) to be able to directly identity the endpoint on the other
  side without trusting only the signaling service to which they are
  connected. For instance, users may be making a call via a federated
  system where they wish to get direct authentication of the other
  side. Alternately, they may be making a call on a site which they
  minimally trust (such as a poker site) but to someone who has an
@@ -692,21 +693,27 @@
  DTLS-SRTP devices will need to happen through gateways.
  Even if only DTLS/DTLS-SRTP are used, the signaling server can
  potentially mount a man-in-the-middle attack unless implementations
  have some mechanism for independently verifying keys. The UI
  requirements in Section 5.5 are designed to provide such a mechanism
  for motivated/security conscious users, but are not suitable for
  general use. The identity service mechanisms in Section 5.6 are more
  suitable for general use. Note, however, that a malicious signaling
  service can strip off any such identity assertions, though it cannot
- forge new ones.
+ forge new ones. Note that all of the third-party security mechanisms
+ available (whether X.509 certificates or a third-party IdP) rely on
+ the security of the third party--this is of course also true of your
+ connection to the Web site itself. Users who wish to assure
+ themselves of security against a malicious identity provider MUST
+ verify peer credentials directly, e.g., by checking the peer's
+ fingerprint against a value delivered out of band.
  6.2.
Privacy
  The requirements in this document are intended to allow:
  o Users to participate in calls without revealing their location.
  o Potential callees to avoid revealing their location and even
  presence status prior to agreeing to answer a call.
  However, these privacy protections come at a performance cost in
@@ -747,43 +754,52 @@
  (which the attacker cannot control).
  Another related attack is for the signaling service to swap the ICE
  candidates for the audio and video streams, thus forcing a browser to
  send video to the sink that the other victim expects will contain
  audio (perhaps it is only expecting audio!) potentially causing
  overload. Muxing multiple media flows over a single transport makes
  it harder to individually suppress a single flow by denying ICE
  keepalives. Media-level (RTCP) mechanisms must be used in this case.
- [TODO: Write up Magnus's ICE forking attack when we get some clarity
- on it.]
+ Yet another attack, suggested by Magnus Westerlund, is for the
+ attacker to cross-connect offers and answers as follows. It induces
+ the victim to make a call and then uses its control of other users
+ browsers to get them to attempt a call to someone. It then
+ translates their offers into apparent answers to the victim, which
+ looks like large-scale parallel forking. The victim still responds
+ to ICE responses and now the browsers all try to send media to the
+ victim. [[ OPEN ISSUE: How do we address this? ]]
+
+ [TODO: Should we have a mechanism for verifying total expected
+ bandwidth]
  Note that attacks based on confusing one end or the other about
  consent are possible primarily even in the face of the third-party
  identity mechanism as long as major parts of the signaling messages
  are not signed. On the other hand, signing the entire message
  severely restricts the capabilities of the calling application, so
  there are difficult tradeoffs here.
  7.
Acknowledgements
- Bernard Aboba, Harald Alvestrand, Cullen Jennings, Hadriel Kaplan,
- Matthew Kaufman, Magnus Westerland.
+ Bernard Aboba, Harald Alvestrand, Dan Druta, Cullen Jennings, Hadriel
+ Kaplan, Matthew Kaufman, Martin Thomson, Magnus Westerland.
  8. References
  8.1. Normative References
  [I-D.ietf-rtcweb-security]
  Rescorla, E., "Security Considerations for RTC-Web",
- draft-ietf-rtcweb-security-01 (work in progress),
- October 2011.
+ draft-ietf-rtcweb-security-02 (work in progress),
+ March 2012.
  [I-D.muthu-behave-consent-freshness]
  Perumal, M., Wing, D., and H. Kaplan, "STUN Usage for
  Consent Freshness",
  draft-muthu-behave-consent-freshness-00 (work in
  progress), March 2012.
  [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
  Requirement Levels", BCP 14, RFC 2119, March 1997.
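Reviewer's illustration (not part of either draft revision): the Section 5.5 change in this diff tightens the selection rule so that DTLS-SRTP is chosen whenever it is offered and media is never sent over plain RTP. The following is a minimal Python sketch of that rule under stated assumptions. The function name `select_media_security`, the string labels, and the `allow_sdes` flag are all hypothetical; the flag stands in for the draft's still-open question of whether SDES may be supported for backward compatibility.

```python
from typing import Iterable, Optional

# Hypothetical labels for the keying mechanisms named in Section 5.5.
DTLS_SRTP = "dtls-srtp"
SDES = "sdes"
PLAIN_RTP = "rtp"


def select_media_security(offered: Iterable[str],
                          allow_sdes: bool = False) -> Optional[str]:
    """Pick a media security mechanism from an offer.

    Returns the chosen mechanism, or None if nothing acceptable was offered.
    """
    mechanisms = set(offered)

    # -02 text: if DTLS-SRTP is offered alongside SDES, DTLS-SRTP MUST be
    # selected. This sketch prefers it whenever it is present at all.
    if DTLS_SRTP in mechanisms:
        return DTLS_SRTP

    # The draft says implementations MAY support SDES for backward
    # compatibility (still marked as an open issue), so this is opt-in here.
    if allow_sdes and SDES in mechanisms:
        return SDES

    # -02 text: media traffic MUST NOT be sent over plain (unencrypted) RTP,
    # so PLAIN_RTP is never returned.
    return None


if __name__ == "__main__":
    print(select_media_security([SDES, DTLS_SRTP]))
    print(select_media_security([PLAIN_RTP]))
```

Returning `None` rather than raising leaves it to the caller to decide how to reject the offer; that choice is a convenience of the sketch, not something the draft specifies.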