draft-ietf-rtcweb-security-01.txt | draft-ietf-rtcweb-security-02.txt | |||
---|---|---|---|---|
RTC-Web E. Rescorla | RTC-Web E. Rescorla | |||
Internet-Draft RTFM, Inc. | Internet-Draft RTFM, Inc. | |||
Intended status: Standards Track October 30, 2011 | Intended status: Standards Track March 12, 2012 | |||
Expires: May 2, 2012 | Expires: September 13, 2012 | |||
Security Considerations for RTC-Web | Security Considerations for RTC-Web | |||
draft-ietf-rtcweb-security-01 | draft-ietf-rtcweb-security-02 | |||
Abstract | Abstract | |||
The Real-Time Communications on the Web (RTC-Web) working group is | The Real-Time Communications on the Web (RTC-Web) working group is | |||
tasked with standardizing protocols for real-time communications | tasked with standardizing protocols for real-time communications | |||
between Web browsers. The major use cases for RTC-Web technology are | between Web browsers. The major use cases for RTC-Web technology are | |||
real-time audio and/or video calls, Web conferencing, and direct data | real-time audio and/or video calls, Web conferencing, and direct data | |||
transfer. Unlike most conventional real-time systems (e.g., SIP- | transfer. Unlike most conventional real-time systems (e.g., SIP- | |||
based soft phones) RTC-Web communications are directly controlled by | based soft phones) RTC-Web communications are directly controlled by | |||
some Web server, which poses new security challenges. For instance, | some Web server, which poses new security challenges. For instance, | |||
skipping to change at page 2, line 7 | skipping to change at page 2, line 7 | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on May 2, 2012. | This Internet-Draft will expire on September 13, 2012. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2011 IETF Trust and the persons identified as the | Copyright (c) 2012 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
skipping to change at page 3, line 7 | skipping to change at page 3, line 7 | |||
modifications of such material outside the IETF Standards Process. | modifications of such material outside the IETF Standards Process. | |||
Without obtaining an adequate license from the person(s) controlling | Without obtaining an adequate license from the person(s) controlling | |||
the copyright in such materials, this document may not be modified | the copyright in such materials, this document may not be modified | |||
outside the IETF Standards Process, and derivative works of it may | outside the IETF Standards Process, and derivative works of it may | |||
not be created outside the IETF Standards Process, except to format | not be created outside the IETF Standards Process, except to format | |||
it for publication as an RFC or to translate it into languages other | it for publication as an RFC or to translate it into languages other | |||
than English. | than English. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
3. The Browser Threat Model . . . . . . . . . . . . . . . . . . . 6 | 3. The Browser Threat Model . . . . . . . . . . . . . . . . . . . 5 | |||
3.1. Access to Local Resources . . . . . . . . . . . . . . . . 7 | 3.1. Access to Local Resources . . . . . . . . . . . . . . . . 6 | |||
3.2. Same Origin Policy . . . . . . . . . . . . . . . . . . . . 7 | 3.2. Same Origin Policy . . . . . . . . . . . . . . . . . . . . 6 | |||
3.3. Bypassing SOP: CORS, WebSockets, and consent to | 3.3. Bypassing SOP: CORS, WebSockets, and consent to | |||
communicate . . . . . . . . . . . . . . . . . . . . . . . 8 | communicate . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
4. Security for RTC-Web Applications . . . . . . . . . . . . . . 8 | 4. Security for RTC-Web Applications . . . . . . . . . . . . . . 7 | |||
4.1. Access to Local Devices . . . . . . . . . . . . . . . . . 8 | 4.1. Access to Local Devices . . . . . . . . . . . . . . . . . 8 | |||
4.1.1. Calling Scenarios and User Expectations . . . . . . . 9 | 4.1.1. Calling Scenarios and User Expectations . . . . . . . 8 | |||
4.1.1.1. Dedicated Calling Services . . . . . . . . . . . . 9 | 4.1.1.1. Dedicated Calling Services . . . . . . . . . . . . 8 | |||
4.1.1.2. Calling the Site You're On . . . . . . . . . . . . 9 | 4.1.1.2. Calling the Site You're On . . . . . . . . . . . . 9 | |||
4.1.1.3. Calling to an Ad Target . . . . . . . . . . . . . 10 | 4.1.1.3. Calling to an Ad Target . . . . . . . . . . . . . 9 | |||
4.1.2. Origin-Based Security . . . . . . . . . . . . . . . . 10 | 4.1.2. Origin-Based Security . . . . . . . . . . . . . . . . 10 | |||
4.1.3. Security Properties of the Calling Page . . . . . . . 12 | 4.1.3. Security Properties of the Calling Page . . . . . . . 11 | |||
4.2. Communications Consent Verification . . . . . . . . . . . 13 | 4.2. Communications Consent Verification . . . . . . . . . . . 12 | |||
4.2.1. ICE . . . . . . . . . . . . . . . . . . . . . . . . . 13 | 4.2.1. ICE . . . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
4.2.2. Masking . . . . . . . . . . . . . . . . . . . . . . . 14 | 4.2.2. Masking . . . . . . . . . . . . . . . . . . . . . . . 13 | |||
4.2.3. Backward Compatibility . . . . . . . . . . . . . . . . 14 | 4.2.3. Backward Compatibility . . . . . . . . . . . . . . . . 13 | |||
4.2.4. IP Location Privacy . . . . . . . . . . . . . . . . . 15 | 4.2.4. IP Location Privacy . . . . . . . . . . . . . . . . . 14 | |||
4.3. Communications Security . . . . . . . . . . . . . . . . . 15 | 4.3. Communications Security . . . . . . . . . . . . . . . . . 14 | |||
4.3.1. Protecting Against Retrospective Compromise . . . . . 16 | 4.3.1. Protecting Against Retrospective Compromise . . . . . 15 | |||
4.3.2. Protecting Against During-Call Attack . . . . . . . . 17 | 4.3.2. Protecting Against During-Call Attack . . . . . . . . 16 | |||
4.3.2.1. Key Continuity . . . . . . . . . . . . . . . . . . 17 | 4.3.2.1. Key Continuity . . . . . . . . . . . . . . . . . . 16 | |||
4.3.2.2. Short Authentication Strings . . . . . . . . . . . 18 | 4.3.2.2. Short Authentication Strings . . . . . . . . . . . 17 | |||
4.3.2.3. Recommendations . . . . . . . . . . . . . . . . . 19 | 4.3.2.3. Third Party Identity . . . . . . . . . . . . . . . 18 | |||
5. Security Considerations . . . . . . . . . . . . . . . . . . . 19 | 5. Security Considerations . . . . . . . . . . . . . . . . . . . 18 | |||
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 19 | 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 18 | |||
7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | 7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
7.1. Normative References . . . . . . . . . . . . . . . . . . . 19 | 7.1. Normative References . . . . . . . . . . . . . . . . . . . 19 | |||
7.2. Informative References . . . . . . . . . . . . . . . . . . 20 | 7.2. Informative References . . . . . . . . . . . . . . . . . . 19 | |||
Appendix A. A Proposed Security Architecture [No Consensus on | Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
This] . . . . . . . . . . . . . . . . . . . . . . . . 22 | ||||
A.1. Trust Hierarchy . . . . . . . . . . . . . . . . . . . . . 22 | ||||
A.1.1. Authenticated Entities . . . . . . . . . . . . . . . . 22 | ||||
A.1.2. Unauthenticated Entities . . . . . . . . . . . . . . . 23 | ||||
A.2. Overview . . . . . . . . . . . . . . . . . . . . . . . . . 23 | ||||
A.2.1. Initial Signaling . . . . . . . . . . . . . . . . . . 24 | ||||
A.2.2. Media Consent Verification . . . . . . . . . . . . . . 26 | ||||
A.2.3. DTLS Handshake . . . . . . . . . . . . . . . . . . . . 26 | ||||
A.2.4. Communications and Consent Freshness . . . . . . . . . 27 | ||||
A.3. Detailed Technical Description . . . . . . . . . . . . . . 27 | ||||
A.3.1. Origin and Web Security Issues . . . . . . . . . . . . 27 | ||||
A.3.2. Device Permissions Model . . . . . . . . . . . . . . . 28 | ||||
A.3.3. Communications Consent . . . . . . . . . . . . . . . . 29 | ||||
A.3.4. IP Location Privacy . . . . . . . . . . . . . . . . . 29 | ||||
A.3.5. Communications Security . . . . . . . . . . . . . . . 30 | ||||
A.3.6. Web-Based Peer Authentication . . . . . . . . . . . . 31 | ||||
A.3.6.1. Generic Concepts . . . . . . . . . . . . . . . . . 31 | ||||
A.3.6.2. BrowserID . . . . . . . . . . . . . . . . . . . . 32 | ||||
A.3.6.3. OAuth . . . . . . . . . . . . . . . . . . . . . . 35 | ||||
A.3.6.4. Generic Identity Support . . . . . . . . . . . . . 36 | ||||
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 36 | ||||
1. Introduction | 1. Introduction | |||
The Real-Time Communications on the Web (RTC-Web) working group is | The Real-Time Communications on the Web (RTC-Web) working group is | |||
tasked with standardizing protocols for real-time communications | tasked with standardizing protocols for real-time communications | |||
between Web browsers. The major use cases for RTC-Web technology are | between Web browsers. The major use cases for RTC-Web technology are | |||
real-time audio and/or video calls, Web conferencing, and direct data | real-time audio and/or video calls, Web conferencing, and direct data | |||
transfer. Unlike most conventional real-time systems (e.g., SIP- | transfer. Unlike most conventional real-time systems (e.g., SIP- | |||
based [RFC3261] soft phones), RTC-Web communications are directly | based [RFC3261] soft phones), RTC-Web communications are directly | |||
controlled by some Web server. A simple case is shown below. | controlled by some Web server. A simple case is shown below. | |||
skipping to change at page 6, line 11 | skipping to change at page 5, line 11 | |||
particular, it needs to contend with malicious calling services. For | particular, it needs to contend with malicious calling services. For | |||
example, if the calling service can cause the browser to make a call | example, if the calling service can cause the browser to make a call | |||
at any time to any callee of its choice, then this facility can be | at any time to any callee of its choice, then this facility can be | |||
used to bug a user's computer without their knowledge, simply by | used to bug a user's computer without their knowledge, simply by | |||
placing a call to some recording service. More subtly, if the | placing a call to some recording service. More subtly, if the | |||
exposed APIs allow the server to instruct the browser to send | exposed APIs allow the server to instruct the browser to send | |||
arbitrary content, then they can be used to bypass firewalls or mount | arbitrary content, then they can be used to bypass firewalls or mount | |||
denial of service attacks. Any successful system will need to be | denial of service attacks. Any successful system will need to be | |||
resistant to this and other attacks. | resistant to this and other attacks. | |||
A companion document [I-D.ietf-rtcweb-security-arch] describes a | ||||
security architecture intended to address the issues raised in this | ||||
document. | ||||
2. Terminology | 2. Terminology | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
3. The Browser Threat Model | 3. The Browser Threat Model | |||
The security requirements for RTC-Web follow directly from the | The security requirements for RTC-Web follow directly from the | |||
requirement that the browser's job is to protect the user. Huang et | requirement that the browser's job is to protect the user. Huang et | |||
skipping to change at page 7, line 38 | skipping to change at page 6, line 43 | |||
at all. For instance, there is no real way to run specific | at all. For instance, there is no real way to run specific | |||
executables directly from a script (though the user can of course be | executables directly from a script (though the user can of course be | |||
induced to download executable files and run them). | induced to download executable files and run them). | |||
3.2. Same Origin Policy | 3.2. Same Origin Policy | |||
Many other resources are accessible but isolated. For instance, | Many other resources are accessible but isolated. For instance, | |||
while scripts are allowed to make HTTP requests via the | while scripts are allowed to make HTTP requests via the | |||
XMLHttpRequest() API those requests are not allowed to be made to any | XMLHttpRequest() API those requests are not allowed to be made to any | |||
server, but rather solely to the same ORIGIN from whence the script | server, but rather solely to the same ORIGIN from whence the script | |||
came.[I-D.abarth-origin] (although CORS [CORS] and WebSockets | came.[RFC6454] (although CORS [CORS] and WebSockets [RFC6455] | |||
[I-D.ietf-hybi-thewebsocketprotocol] provide an escape hatch from | provide an escape hatch from this restriction, as described below.) | |||
this restriction, as described below.) This SAME ORIGIN POLICY (SOP) | This SAME ORIGIN POLICY (SOP) prevents server A from mounting attacks | |||
prevents server A from mounting attacks on server B via the user's | on server B via the user's browser, which protects both the user | |||
browser, which protects both the user (e.g., from misuse of his | (e.g., from misuse of his credentials) and the server (e.g., from DoS | |||
credentials) and the server (e.g., from DoS attack). | attack). | |||
More generally, SOP forces scripts from each site to run in their | More generally, SOP forces scripts from each site to run in their | |||
own, isolated, sandboxes. While there are techniques to allow them | own, isolated, sandboxes. While there are techniques to allow them | |||
to interact, those interactions generally must be mutually consensual | to interact, those interactions generally must be mutually consensual | |||
(by each site) and are limited to certain channels. For instance, | (by each site) and are limited to certain channels. For instance, | |||
multiple pages/browser panes from the same origin can read each | multiple pages/browser panes from the same origin can read each | |||
other's JS variables, but pages from different origins--or even | other's JS variables, but pages from different origins--or even | |||
iframes from different origins on the same page--cannot. | iframes from different origins on the same page--cannot. | |||
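As an illustration of the origin comparison that underlies SOP, the following is a minimal sketch, assuming that an origin is the (scheme, host, port) triple described in [RFC6454]. The helper names origin() and same_origin() are hypothetical, and the sketch ignores details a real browser must handle, such as opaque origins and internationalized host names.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes this sketch knows about (an assumption;
# a browser handles more schemes than these).
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fill in the scheme's default port when the URL does not name one.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """True when both URLs share scheme, host, and port."""
    return origin(url_a) == origin(url_b)
```

Under this sketch, http://example.com/a and http://example.com:80/b share an origin, while the HTTP and HTTPS versions of the same host do not.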
3.3. Bypassing SOP: CORS, WebSockets, and consent to communicate | 3.3. Bypassing SOP: CORS, WebSockets, and consent to communicate | |||
skipping to change at page 8, line 21 | skipping to change at page 7, line 25 | |||
The W3C Cross-Origin Resource Sharing (CORS) spec [CORS] is a | The W3C Cross-Origin Resource Sharing (CORS) spec [CORS] is a | |||
response to this demand. In CORS, when a script from origin A | response to this demand. In CORS, when a script from origin A | |||
executes what would otherwise be a forbidden cross-origin request, | executes what would otherwise be a forbidden cross-origin request, | |||
the browser instead contacts the target server to determine whether | the browser instead contacts the target server to determine whether | |||
it is willing to allow cross-origin requests from A. If it is so | it is willing to allow cross-origin requests from A. If it is so | |||
willing, the browser then allows the request. This consent | willing, the browser then allows the request. This consent | |||
verification process is designed to safely allow cross-origin | verification process is designed to safely allow cross-origin | |||
requests. | requests. | |||
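The browser-side decision described above can be sketched as follows. This is a deliberately simplified model, assuming that the target server signals its consent through the Access-Control-Allow-Origin response header; it omits preflight requests and credential handling, and the function name cors_allows() is hypothetical.

```python
def cors_allows(request_origin, response_headers):
    """Decide whether a cross-origin response may be exposed to the script.

    request_origin: serialized origin of the calling script,
        e.g. "https://a.example".
    response_headers: dict of header name -> value from the target server.
    """
    # Header names are case-insensitive, so normalize them before lookup.
    headers = {name.lower(): value for name, value in response_headers.items()}
    allowed = headers.get("access-control-allow-origin")
    if allowed is None:
        # No consent signal from the server: the request stays forbidden.
        return False
    return allowed == "*" or allowed == request_origin
```

The point of the sketch is that the target server, not the calling script, supplies the value that the browser checks.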
While CORS is designed to allow cross-origin HTTP requests, | While CORS is designed to allow cross-origin HTTP requests, | |||
WebSockets [I-D.ietf-hybi-thewebsocketprotocol] allows cross-origin | WebSockets [RFC6455] allows cross-origin establishment of transparent | |||
establishment of transparent channels. Once a WebSockets connection | channels. Once a WebSockets connection has been established from a | |||
has been established from a script to a site, the script can exchange | script to a site, the script can exchange any traffic it likes | |||
any traffic it likes without being required to frame it as a series | without being required to frame it as a series of HTTP request/ | |||
of HTTP request/response transactions. As with CORS, a WebSockets | response transactions. As with CORS, a WebSockets transaction starts | |||
transaction starts with a consent verification stage to avoid | with a consent verification stage to avoid allowing scripts to simply | |||
allowing scripts to simply send arbitrary data to another origin. | send arbitrary data to another origin. | |||
While consent verification is conceptually simple--just do a | While consent verification is conceptually simple--just do a | |||
handshake before you start exchanging the real data--experience has | handshake before you start exchanging the real data--experience has | |||
shown that designing a correct consent verification system is | shown that designing a correct consent verification system is | |||
difficult. In particular, Huang et al. [huang-w2sp] have shown | difficult. In particular, Huang et al. [huang-w2sp] have shown | |||
vulnerabilities in the existing Java and Flash consent verification | vulnerabilities in the existing Java and Flash consent verification | |||
techniques and in a simplified version of the WebSockets handshake. | techniques and in a simplified version of the WebSockets handshake. | |||
In particular, it is important to be wary of CROSS-PROTOCOL attacks | In particular, it is important to be wary of CROSS-PROTOCOL attacks | |||
in which the attacking script generates traffic which is acceptable | in which the attacking script generates traffic which is acceptable | |||
to some non-Web protocol state machine. In order to resist this form | to some non-Web protocol state machine. In order to resist this form | |||
skipping to change at page 11, line 40 | skipping to change at page 10, line 48 | |||
cases. As discussed above, individual consent puts the user's | cases. As discussed above, individual consent puts the user's | |||
approval in the UI flow for every call. Not only does this quickly | approval in the UI flow for every call. Not only does this quickly | |||
become annoying but it can train the user to simply click "OK", at | become annoying but it can train the user to simply click "OK", at | |||
which point the consent becomes useless. Thus, while it may be | which point the consent becomes useless. Thus, while it may be | |||
necessary to have individual consent in some cases, this is not a | necessary to have individual consent in some cases, this is not a | |||
suitable solution for (for instance) the calling service case. Where | suitable solution for (for instance) the calling service case. Where | |||
necessary, in-flow user interfaces must be carefully designed to | necessary, in-flow user interfaces must be carefully designed to | |||
avoid the risk of the user blindly clicking through. | avoid the risk of the user blindly clicking through. | |||
The other two options are designed to restrict calls to a given | The other two options are designed to restrict calls to a given | |||
target. Unfortunately, Callee-oriented consent does not work well | target. Callee-oriented consent provided by the calling site does not | |||
because a malicious site can claim that the user is calling any user | work well because a malicious site can claim that the user is calling | |||
of his choice. One fix for this is to tie calls to a | any user of his choice. One fix for this is to tie calls to a | |||
cryptographically established identity. While not suitable for all | cryptographically established identity. While not suitable for all | |||
cases, this approach may be useful for some. If we consider the | cases, this approach may be useful for some. If we consider the | |||
advertising case described in Section 4.1.1.3, it's not particularly | advertising case described in Section 4.1.1.3, it's not particularly | |||
convenient to require the advertiser to instantiate an iframe on the | convenient to require the advertiser to instantiate an iframe on the | |||
hosting site just to get permission; a more convenient approach is to | hosting site just to get permission; a more convenient approach is to | |||
cryptographically tie the advertiser's certificate to the | cryptographically tie the advertiser's certificate to the | |||
communication directly. We're still tying permissions to origin | communication directly. We're still tying permissions to origin | |||
here, but to the media origin (and/or destination) rather than to the | here, but to the media origin (and/or destination) rather than to the | |||
Web origin. | Web origin. [I-D.ietf-rtcweb-security-arch] and | |||
[I-D.rescorla-rtcweb-generic-idp] describe mechanisms which | ||||
facilitate this sort of consent. | ||||
Another case where media-level cryptographic identity makes sense is | Another case where media-level cryptographic identity makes sense is | |||
when a user really does not trust the calling site. For instance, I | when a user really does not trust the calling site. For instance, I | |||
might be worried that the calling service will attempt to bug my | might be worried that the calling service will attempt to bug my | |||
computer, but I also want to be able to conveniently call my friends. | computer, but I also want to be able to conveniently call my friends. | |||
If consent is tied to particular communications endpoints, then my | If consent is tied to particular communications endpoints, then my | |||
risk is limited. However, this is also not that convenient an | risk is limited. Naturally, it is somewhat challenging to design UI | |||
interface, since managing individual user permissions can be painful. | primitives which express this sort of policy. | |||
While this is primarily a question not for IETF, it should be clear | ||||
that there is no really good answer. In general, if you cannot trust | ||||
the site which you have authorized for calling not to bug you then | ||||
your security situation is not really ideal. It is RECOMMENDED that | ||||
browsers have explicit (and obvious) indicators that they are in a | ||||
call in order to mitigate this risk. | ||||
4.1.3. Security Properties of the Calling Page | 4.1.3. Security Properties of the Calling Page | |||
Origin-based security is intended to secure against web attackers. | Origin-based security is intended to secure against web attackers. | |||
However, we must also consider the case of network attackers. | However, we must also consider the case of network attackers. | |||
Consider the case where I have granted permission to a calling | Consider the case where I have granted permission to a calling | |||
service by an origin that has the HTTP scheme, e.g., | service by an origin that has the HTTP scheme, e.g., | |||
http://calling-service.example.com. If I ever use my computer on an | http://calling-service.example.com. If I ever use my computer on an | |||
unsecured network (e.g., a hotspot or if my own home wireless network | unsecured network (e.g., a hotspot or if my own home wireless network | |||
is insecure), and browse any HTTP site, then an attacker can bug my | is insecure), and browse any HTTP site, then an attacker can bug my | |||
skipping to change at page 13, line 8 | skipping to change at page 12, line 12 | |||
Even if calls are only possible from HTTPS sites, if the site embeds | Even if calls are only possible from HTTPS sites, if the site embeds | |||
active content (e.g., JavaScript) that is fetched over HTTP or from | active content (e.g., JavaScript) that is fetched over HTTP or from | |||
an untrusted site, that JavaScript is executed in the | an untrusted site, that JavaScript is executed in the | |||
security context of the page [finer-grained]. Thus, it is also | security context of the page [finer-grained]. Thus, it is also | |||
dangerous to allow RTC-Web functionality from HTTPS origins that | dangerous to allow RTC-Web functionality from HTTPS origins that | |||
embed mixed content. Note: this issue is not restricted to PAGES | embed mixed content. Note: this issue is not restricted to PAGES | |||
which contain mixed content. If a page from a given origin ever | which contain mixed content. If a page from a given origin ever | |||
loads mixed content then it is possible for a network attacker to | loads mixed content then it is possible for a network attacker to | |||
infect the browser's notion of that origin semi-permanently. | infect the browser's notion of that origin semi-permanently. | |||
[[ OPEN ISSUE: What recommendation should IETF make about (a) | ||||
whether RTCWeb long-term consent should be available over HTTP pages | ||||
and (b) How to handle origins where the consent is to an HTTPS URL | ||||
but the page contains active mixed content? ]] | ||||
4.2. Communications Consent Verification | 4.2. Communications Consent Verification | |||
As discussed in Section 3.3, allowing web applications unrestricted | As discussed in Section 3.3, allowing web applications unrestricted | |||
network access via the browser introduces the risk of using the | network access via the browser introduces the risk of using the | |||
browser as an attack platform against machines which would not | browser as an attack platform against machines which would not | |||
otherwise be accessible to the malicious site, for instance because | otherwise be accessible to the malicious site, for instance because | |||
they are topologically restricted (e.g., behind a firewall or NAT). | they are topologically restricted (e.g., behind a firewall or NAT). | |||
In order to prevent this form of attack as well as cross-protocol | In order to prevent this form of attack as well as cross-protocol | |||
attacks it is important to require that the target of traffic | attacks it is important to require that the target of traffic | |||
explicitly consent to receiving the traffic in question. Until that | explicitly consent to receiving the traffic in question. Until that | |||
skipping to change at page 14, line 20 | skipping to change at page 13, line 19 | |||
As long as communication is limited to UDP, then this risk is | As long as communication is limited to UDP, then this risk is | |||
probably limited, thus masking is not required for UDP. I.e., once | probably limited, thus masking is not required for UDP. I.e., once | |||
communications consent has been verified, it is most likely safe to | communications consent has been verified, it is most likely safe to | |||
allow the implementation to send arbitrary UDP traffic to the chosen | allow the implementation to send arbitrary UDP traffic to the chosen | |||
destination, provided that the STUN keepalives continue to succeed. | destination, provided that the STUN keepalives continue to succeed. | |||
In particular, this is true for the data channel if DTLS is used | In particular, this is true for the data channel if DTLS is used | |||
because DTLS (with the anti-chosen plaintext mechanisms required by | because DTLS (with the anti-chosen plaintext mechanisms required by | |||
TLS 1.1) does not allow the attacker to generate predictable | TLS 1.1) does not allow the attacker to generate predictable | |||
ciphertext. However, with TCP the risk of transparent proxies | ciphertext. However, with TCP the risk of transparent proxies | |||
becomes much more severe. If TCP is to be used, then WebSockets | becomes much more severe. If TCP is to be used, then WebSockets | |||
style masking MUST be employed. | style masking MUST be employed. [Note: current thinking in the | |||
RTCWEB WG is not to support TCP and to support SCTP over DTLS, thus | ||||
removing the need for masking.] | ||||
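For reference, WebSockets-style masking as defined in [RFC6455] XORs each payload octet with one octet of a 4-octet masking key. The following is a minimal sketch of that transform only; the function names are hypothetical, and frame construction is omitted.

```python
import os


def new_masking_key():
    """Return a fresh, unpredictable 4-octet masking key."""
    return os.urandom(4)


def mask_payload(payload, key):
    """XOR each payload octet with key[i mod 4].

    The transform is its own inverse, so the same call also unmasks.
    """
    if len(key) != 4:
        raise ValueError("masking key must be exactly 4 octets")
    return bytes(octet ^ key[i % 4] for i, octet in enumerate(payload))
```

Because the key is chosen per frame and is not under the script's control, the script cannot predict the octets that appear on the wire, which is the property that masking is meant to provide against transparent proxies.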
4.2.3. Backward Compatibility | 4.2.3. Backward Compatibility | |||
A requirement to use ICE limits compatibility with legacy non-ICE | A requirement to use ICE limits compatibility with legacy non-ICE | |||
clients. It seems unsafe to completely remove the requirement for | clients. It seems unsafe to completely remove the requirement for | |||
some check. All proposed checks have the common feature that the | some check. All proposed checks have the common feature that the | |||
browser sends some message to the candidate traffic recipient and | browser sends some message to the candidate traffic recipient and | |||
refuses to send other traffic until that message has been replied to. | refuses to send other traffic until that message has been replied to. | |||
The message/reply pair must be generated in such a way that an | The message/reply pair must be generated in such a way that an | |||
attacker who controls the Web application cannot forge them, | attacker who controls the Web application cannot forge them, | |||
skipping to change at page 15, line 28 | skipping to change at page 14, line 28 | |||
probably the best choice. | probably the best choice. | |||
Once initial consent is verified, we also need to verify continuing | Once initial consent is verified, we also need to verify continuing | |||
consent, in order to avoid attacks where two people briefly share an | consent, in order to avoid attacks where two people briefly share an | |||
IP (e.g., behind a NAT in an Internet cafe) and the attacker arranges | IP (e.g., behind a NAT in an Internet cafe) and the attacker arranges | |||
for a large, unstoppable, traffic flow to the network and then | for a large, unstoppable, traffic flow to the network and then | |||
leaves. The appropriate technologies here are fairly similar to | leaves. The appropriate technologies here are fairly similar to | |||
those for initial consent, though are perhaps weaker since the | those for initial consent, though are perhaps weaker since the | |||
threat is less severe. | threat is less severe. | |||
[[ OPEN ISSUE: Exactly what should be the requirements here? | ||||
Proposals include ICE all the time or ICE but with allowing one of | ||||
these non-ICE things for legacy. ]] | ||||
4.2.4. IP Location Privacy | 4.2.4. IP Location Privacy | |||
Note that as soon as the callee sends their ICE candidates, the | Note that as soon as the callee sends their ICE candidates, the | |||
caller learns the callee's IP addresses. The callee's server | caller learns the callee's IP addresses. The callee's server | |||
reflexive address reveals a lot of information about the callee's | reflexive address reveals a lot of information about the callee's | |||
location. In order to avoid tracking, implementations may wish to | location. In order to avoid tracking, implementations may wish to | |||
suppress the start of ICE negotiation until the callee has answered. | suppress the start of ICE negotiation until the callee has answered. | |||
In addition, either side may wish to hide their location entirely by | In addition, either side may wish to hide their location entirely by | |||
forcing all traffic through a TURN server. | forcing all traffic through a TURN server. | |||
skipping to change at page 19, line 22 | skipping to change at page 18, line 19 | |||
avoid them needing to check it on every call. However, this is | avoid them needing to check it on every call. However, this is | |||
problematic for reasons indicated in Section 4.3.2.1. In principle | problematic for reasons indicated in Section 4.3.2.1. In principle | |||
it is of course possible to render a different UI element to indicate | it is of course possible to render a different UI element to indicate | |||
that calls are using an unauthenticated set of keying material | that calls are using an unauthenticated set of keying material | |||
(recall that the attacker can just present a slightly different name | (recall that the attacker can just present a slightly different name | |||
so that the attack shows the same UI as a call to a new device or to | so that the attack shows the same UI as a call to a new device or to | |||
someone you haven't called before) but as a practical matter, users | someone you haven't called before) but as a practical matter, users | |||
simply ignore such indicators even in the rather more dire case of | simply ignore such indicators even in the rather more dire case of | |||
mixed content warnings. | mixed content warnings. | |||
4.3.2.3. Recommendations | 4.3.2.3. Third Party Identity | |||
[[ OPEN ISSUE: What are the best UI recommendations to make? | ||||
Proposal: take the text from [I-D.kaufman-rtcweb-security-ui] | ||||
Section 2]] | ||||
[[ OPEN ISSUE: Exactly what combination of media security primitives | The conventional approach to providing communications identity has of | |||
should be specified and/or mandatory to implement? In particular, | course been to have some third party identity system (e.g., PKI) to | |||
should we allow DTLS-SRTP only, or both DTLS-SRTP and SDES. Should | authenticate the endpoints. Such mechanisms have proven to be too | |||
we allow RTP for backward compatibility? ]] | cumbersome for use by typical users (and nearly too cumbersome for | |||
administrators). However, a new generation of Web-based identity | ||||
providers (BrowserID, Federated Google Login, Facebook Connect, | ||||
OAuth, OpenID, WebFinger) has recently been developed and uses Web | ||||
technologies to provide lightweight (from the user's perspective) | ||||
third-party authenticated transactions. It is possible (see | ||||
[I-D.rescorla-rtcweb-generic-idp]) to use systems of this type to | ||||
authenticate RTCWEB calls, linking them to existing user notions of | ||||
identity (e.g., Facebook adjacencies). Calls which are authenticated | ||||
in this fashion are naturally resistant even to active MITM attack by | ||||
the calling site. | ||||
5. Security Considerations | 5. Security Considerations | |||
This entire document is about security. | This entire document is about security. | |||
6. Acknowledgements | 6. Acknowledgements | |||
Bernard Aboba, Harald Alvestrand, Cullen Jennings, Hadriel Kaplan (S | Bernard Aboba, Harald Alvestrand, Cullen Jennings, Hadriel Kaplan (S | |||
4.2.1), Matthew Kaufman, Magnus Westerlund. | 4.2.1), Matthew Kaufman, Magnus Westerlund. | |||
7. References | 7. References | |||
7.1. Normative References | 7.1. Normative References | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
7.2. Informative References | 7.2. Informative References | |||
[CORS] van Kesteren, A., "Cross-Origin Resource Sharing". | [CORS] van Kesteren, A., "Cross-Origin Resource Sharing". | |||
[I-D.abarth-origin] | [I-D.ietf-rtcweb-security-arch] | |||
Barth, A., "The Web Origin Concept", | Rescorla, E., "RTCWEB Security Architecture", | |||
draft-abarth-origin-09 (work in progress), November 2010. | draft-ietf-rtcweb-security-arch-00 (work in progress), | |||
January 2012. | ||||
[I-D.ietf-hybi-thewebsocketprotocol] | ||||
Fette, I. and A. Melnikov, "The WebSocket protocol", | ||||
draft-ietf-hybi-thewebsocketprotocol-17 (work in | ||||
progress), September 2011. | ||||
[I-D.kaufman-rtcweb-security-ui] | [I-D.kaufman-rtcweb-security-ui] | |||
Kaufman, M., "Client Security User Interface Requirements | Kaufman, M., "Client Security User Interface Requirements | |||
for RTCWEB", draft-kaufman-rtcweb-security-ui-00 (work in | for RTCWEB", draft-kaufman-rtcweb-security-ui-00 (work in | |||
progress), June 2011. | progress), June 2011. | |||
[I-D.rescorla-rtcweb-generic-idp] | ||||
Rescorla, E., "RTCWEB Generic Identity Provider | ||||
Interface", draft-rescorla-rtcweb-generic-idp-00 (work in | ||||
progress), January 2012. | ||||
[RFC2818] Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000. | [RFC2818] Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000. | |||
[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, | [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, | |||
A., Peterson, J., Sparks, R., Handley, M., and E. | A., Peterson, J., Sparks, R., Handley, M., and E. | |||
Schooler, "SIP: Session Initiation Protocol", RFC 3261, | Schooler, "SIP: Session Initiation Protocol", RFC 3261, | |||
June 2002. | June 2002. | |||
[RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC | [RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC | |||
Text on Security Considerations", BCP 72, RFC 3552, | Text on Security Considerations", BCP 72, RFC 3552, | |||
July 2003. | July 2003. | |||
skipping to change at page 21, line 18 | skipping to change at page 20, line 23 | |||
[RFC5763] Fischl, J., Tschofenig, H., and E. Rescorla, "Framework | [RFC5763] Fischl, J., Tschofenig, H., and E. Rescorla, "Framework | |||
for Establishing a Secure Real-time Transport Protocol | for Establishing a Secure Real-time Transport Protocol | |||
(SRTP) Security Context Using Datagram Transport Layer | (SRTP) Security Context Using Datagram Transport Layer | |||
Security (DTLS)", RFC 5763, May 2010. | Security (DTLS)", RFC 5763, May 2010. | |||
[RFC6189] Zimmermann, P., Johnston, A., and J. Callas, "ZRTP: Media | [RFC6189] Zimmermann, P., Johnston, A., and J. Callas, "ZRTP: Media | |||
Path Key Agreement for Unicast Secure RTP", RFC 6189, | Path Key Agreement for Unicast Secure RTP", RFC 6189, | |||
April 2011. | April 2011. | |||
[RFC6454] Barth, A., "The Web Origin Concept", RFC 6454, | ||||
December 2011. | ||||
[RFC6455] Fette, I. and A. Melnikov, "The WebSocket Protocol", | ||||
RFC 6455, December 2011. | ||||
[abarth-rtcweb] | [abarth-rtcweb] | |||
Barth, A., "Prompting the user is security failure", RTC- | Barth, A., "Prompting the user is security failure", RTC- | |||
Web Workshop. | Web Workshop. | |||
[cranor-wolf] | [cranor-wolf] | |||
Sunshine, J., Egelman, S., Almuhimedi, H., Atri, N., and | Sunshine, J., Egelman, S., Almuhimedi, H., Atri, N., and | |||
L. Cranor, "Crying Wolf: An Empirical Study of SSL Warning | L. Cranor, "Crying Wolf: An Empirical Study of SSL Warning | |||
Effectiveness", Proceedings of the 18th USENIX Security | Effectiveness", Proceedings of the 18th USENIX Security | |||
Symposium, 2009. | Symposium, 2009. | |||
skipping to change at page 22, line 5 | skipping to change at page 21, line 14 | |||
Kain, A. and M. Macon, "Design and Evaluation of a Voice | Kain, A. and M. Macon, "Design and Evaluation of a Voice | |||
Conversion Algorithm based on Spectral Envelope Mapping | Conversion Algorithm based on Spectral Envelope Mapping | |||
and Residual Prediction", Proceedings of ICASSP, May | and Residual Prediction", Proceedings of ICASSP, May | |||
2001. | 2001. | |||
[whitten-johnny] | [whitten-johnny] | |||
Whitten, A. and J. Tygar, "Why Johnny Can't Encrypt: A | Whitten, A. and J. Tygar, "Why Johnny Can't Encrypt: A | |||
Usability Evaluation of PGP 5.0", Proceedings of the 8th | Usability Evaluation of PGP 5.0", Proceedings of the 8th | |||
USENIX Security Symposium, 1999. | USENIX Security Symposium, 1999. | |||
Appendix A. A Proposed Security Architecture [No Consensus on This] | ||||
This section contains a proposed security architecture, based on the | ||||
considerations discussed in the main body of this memo. This section | ||||
is currently the opinion of the author and does not have consensus | ||||
though some (many?) elements of this proposal do seem to have general | ||||
consensus. | ||||
A.1. Trust Hierarchy | ||||
The basic assumption of this proposal is that network resources exist | ||||
in a hierarchy of trust, rooted in the browser, which serves as the | ||||
user's TRUSTED COMPUTING BASE (TCB). Any security property which the | ||||
user wishes to have enforced must be ultimately guaranteed by the | ||||
browser (or transitively by some property the browser verifies). | ||||
Conversely, if the browser is compromised, then no security | ||||
guarantees are possible. Note that there are cases (e.g., Internet | ||||
kiosks) where the user can't really trust the browser that much. In | ||||
these cases, the level of security provided is limited by how much | ||||
they trust the browser. | ||||
Optimally, we would not rely on trust in any entities other than the | ||||
browser. However, this is unfortunately not possible if we wish to | ||||
have a functional system. Other network elements fall into two | ||||
categories: those which can be authenticated by the browser and thus | ||||
are partly trusted--though to the minimum extent necessary--and those | ||||
which cannot be authenticated and thus are untrusted. This is a | ||||
natural extension of the end-to-end principle. | ||||
A.1.1. Authenticated Entities | ||||
There are two major classes of authenticated entities in the system: | ||||
o Calling services: Web sites whose origin we can verify (optimally | ||||
via HTTPS). | ||||
o Other users: RTC-Web peers whose origin we can verify | ||||
cryptographically (optimally via DTLS-SRTP). | ||||
Note that merely being authenticated does not make these entities | ||||
trusted. For instance, just because we can verify that | ||||
https://www.evil.org/ is owned by Dr. Evil does not mean that we can | ||||
trust Dr. Evil to access our camera and microphone. However, it gives | ||||
the user an opportunity to determine whether he wishes to trust Dr. | ||||
Evil or not; after all, if he desires to contact Dr. Evil, it's safe | ||||
to temporarily give him access to the camera and microphone for the | ||||
purpose of the call. The point here is that we must first identify | ||||
other elements before we can determine whether to trust them. | ||||
It's also worth noting that there are settings where authentication | ||||
is non-cryptographic, such as other machines behind a firewall. | ||||
Naturally, the level of trust one can have in identities verified in | ||||
this way depends on how strong the topology enforcement is. | ||||
A.1.2. Unauthenticated Entities | ||||
Other than the above entities, we are not generally able to identify | ||||
other network elements, thus we cannot trust them. This does not | ||||
mean that it is not possible to have any interaction with them, but | ||||
it means that we must assume that they will behave maliciously and | ||||
design a system which is secure even if they do so. | ||||
A.2. Overview | ||||
This section describes a typical RTCWeb session and shows how the | ||||
various security elements interact and what guarantees are provided | ||||
to the user. The example in this section is a "best case" scenario | ||||
in which we provide the maximal amount of user authentication and | ||||
media privacy with the minimal level of trust in the calling service. | ||||
Simpler versions with lower levels of security are also possible and | ||||
are noted in the text where applicable. It's also important to | ||||
recognize the tension between security (or performance) and privacy. | ||||
The example shown here is aimed towards settings where we are more | ||||
concerned about secure calling than about privacy, but as we shall | ||||
see, there are settings where one might wish to make different | ||||
tradeoffs--this architecture is still compatible with those settings. | ||||
For the purposes of this example, we assume the topology shown in the | ||||
figure below. This topology is derived from the topology shown in | ||||
Figure 1, but separates Alice and Bob's identities from the process | ||||
of signaling. Specifically, Alice and Bob have relationships with | ||||
some Identity Provider (IDP) that supports a protocol (such as OpenID or | ||||
BrowserID) that can be used to attest to their identity. This | ||||
separation isn't particularly important in "closed world" cases where | ||||
Alice and Bob are users on the same social network and have | ||||
identities based on that network. However, there are important | ||||
settings where that is not the case, such as federation (calls from | ||||
one network to another) and calling on untrusted sites, such as where | ||||
two users who have a relationship via a given social network want to | ||||
call each other on another, untrusted, site, such as a poker site. | ||||
+----------------+ | ||||
| | | ||||
| Signaling | | ||||
| Server | | ||||
| | | ||||
+----------------+ | ||||
^ ^ | ||||
/ \ | ||||
HTTPS / \ HTTPS | ||||
/ \ | ||||
/ \ | ||||
v v | ||||
JS API JS API | ||||
+-----------+ +-----------+ | ||||
| | Media | | | ||||
Alice | Browser |<---------->| Browser | Bob | ||||
| | (DTLS-SRTP)| | | ||||
+-----------+ +-----------+ | ||||
^ ^--+ +--^ ^ | ||||
| | | | | ||||
v | | v | ||||
+-----------+ | | +-----------+ | ||||
| |<--------+ | | | ||||
| IDP | | | IDP | | ||||
| | +------->| | | ||||
+-----------+ +-----------+ | ||||
Figure 2: A call with IDP-based identity | ||||
A.2.1. Initial Signaling | ||||
Alice and Bob are both users of a common calling service; they both | ||||
have approved the calling service to make calls (we defer the | ||||
discussion of device access permissions till later). They are both | ||||
connected to the calling service via HTTPS and so know the origin | ||||
with some level of confidence. They also have accounts with some | ||||
identity provider. This sort of identity service is becoming | ||||
increasingly common in the Web environment (in technologies such as | ||||
BrowserID, Federated Google Login, Facebook Connect, OAuth, OpenID, | ||||
WebFinger), and is often provided as a side effect service of your | ||||
ordinary accounts with some service. In this example, we show Alice | ||||
and Bob using a separate identity service, though they may actually | ||||
be using the same identity service as calling service or have no | ||||
identity service at all. | ||||
Alice is logged onto the calling service and decides to call Bob. She | ||||
can see from the calling service that he is online and the calling | ||||
service presents a JS UI in the form of a button next to Bob's name | ||||
which says "Call". Alice clicks the button, which initiates a JS | ||||
callback that instantiates a PeerConnection object. This does not | ||||
require a security check: JS from any origin is allowed to get this | ||||
far. | ||||
Once the PeerConnection is created, the calling service JS needs to | ||||
set up some media. Because this is an audio/video call, it creates | ||||
two MediaStreams, one connected to an audio input and one connected | ||||
to a video input. At this point the first security check is | ||||
required: untrusted origins are not allowed to access the camera and | ||||
microphone. In this case, because Alice is a long-term user of the | ||||
calling service, she has made a permissions grant (i.e., a setting in | ||||
the browser) to allow the calling service to access her camera and | ||||
microphone any time it wants. The browser checks this setting when | ||||
the camera and microphone requests are made and thus allows them. | ||||
In the current W3C API, once some streams have been added, Alice's | ||||
browser + JS generates a signaling message. The format of this data is | ||||
currently undefined. It may be a complete message as defined by ROAP | ||||
[REF] or may be assembled piecemeal by the JS. In either case, it | ||||
will contain: | ||||
o Media channel information | ||||
o ICE candidates | ||||
o A fingerprint attribute binding the message to Alice's public key | ||||
[RFC5763] | ||||
Prior to sending out the signaling message, the PeerConnection code | ||||
contacts the identity service and obtains an assertion binding | ||||
Alice's identity to her fingerprint. The exact details depend on the | ||||
identity service (though as discussed in Appendix A.3.6.4 I believe | ||||
PeerConnection can be agnostic to them), but for now it's easiest to | ||||
think of as a BrowserID assertion. | ||||
This message is sent to the signaling server, e.g., by XMLHttpRequest | ||||
[REF] or by WebSockets [I-D.ietf-hybi-thewebsocketprotocol]. The | ||||
signaling server processes the message from Alice's browser, | ||||
determines that this is a call to Bob and sends a signaling message | ||||
to Bob's browser (again, the format is currently undefined). The JS | ||||
on Bob's browser processes it, and alerts Bob to the incoming call | ||||
and to Alice's identity. In this case, Alice has provided an | ||||
identity assertion and so Bob's browser contacts Alice's identity | ||||
provider (again, this is done in a generic way so the browser has no | ||||
specific knowledge of the IDP) to verify the assertion. This allows | ||||
the browser to display a trusted element indicating that a call is | ||||
coming in from Alice. If Alice is in Bob's address book, then this | ||||
interface might also include her real name, a picture, etc. The | ||||
calling site will also provide some user interface element (e.g., a | ||||
button) to allow Bob to answer the call, though this is most likely | ||||
not part of the trusted UI. | ||||
If Bob agrees [I am ignoring early media for now], a PeerConnection | ||||
is instantiated with the message from Alice's side. Then, a similar | ||||
process occurs as on Alice's browser: Bob's browser verifies that | ||||
the calling service is approved, the media streams are created, and a | ||||
return signaling message containing media information, ICE | ||||
candidates, and a fingerprint is sent back to Alice via the signaling | ||||
service. If Bob has a relationship with an IDP, the message will | ||||
also come with an identity assertion. | ||||
At this point, Alice and Bob each know that the other party wants to | ||||
have a secure call with them. Based purely on the interface provided | ||||
by the signaling server, they know that the signaling server claims | ||||
that the call is from Alice to Bob. Because the far end sent an | ||||
identity assertion along with their message, they know that this is | ||||
verifiable from the IDP as well. Of course, the call works perfectly | ||||
well if either Alice or Bob doesn't have a relationship with an IDP; | ||||
they just get a lower level of assurance. Moreover, Alice might wish | ||||
to make an anonymous call through an anonymous calling site, in which | ||||
case she would of course just not provide any identity assertion and | ||||
the calling site would mask her identity from Bob. | ||||
A.2.2. Media Consent Verification | ||||
As described in Section 4.2, media consent verification is required. | ||||
This proposal specifies that it be performed via ICE. Thus, Alice and | ||||
Bob perform ICE checks with each | ||||
other. At the completion of these checks, they are ready to send | ||||
non-ICE data. | ||||
At this point, Alice knows that (a) Bob (assuming he is verified via | ||||
his IDP) or someone else who the signaling service is claiming is Bob | ||||
is willing to exchange traffic with her and (b) that either Bob is at | ||||
the IP address which she has verified via ICE or there is an attacker | ||||
who is on-path to that IP address detouring the traffic. Note that | ||||
it is not possible for an attacker who is on-path but not attached to | ||||
the signaling service to spoof these checks because they do not have | ||||
the ICE credentials. Bob's security guarantees with respect to Alice | ||||
are the converse of this. | ||||
A.2.3. DTLS Handshake | ||||
Once the ICE checks have completed [more specifically, once some ICE | ||||
checks have completed], Alice and Bob can set up a secure channel. | ||||
This is performed via DTLS [RFC4347] (for the data channel) and DTLS- | ||||
SRTP [RFC5763] for the media channel. Specifically, Alice and Bob | ||||
perform a DTLS handshake on every channel which has been established | ||||
by ICE. The total number of channels depends on the amount of | ||||
muxing; in the most likely case we are using both RTP/RTCP mux and | ||||
muxing multiple media streams on the same channel, in which case | ||||
there is only one DTLS handshake. Once the DTLS handshake has | ||||
completed, the keys are extracted and used to key SRTP for the media | ||||
channels. | ||||
At this point, Alice and Bob know that they share a set of secure | ||||
data and/or media channels with keys which are not known to any | ||||
third-party attacker. If Alice and Bob authenticated via their IDPs, | ||||
then they also know that the signaling service is not attacking them. | ||||
Even if they do not use an IDP, as long as they have minimal trust in | ||||
the signaling service not to perform a man-in-the-middle attack, they | ||||
know that their communications are secure against the signaling | ||||
service as well. | ||||
A.2.4. Communications and Consent Freshness | ||||
From a security perspective, everything from here on in is a little | ||||
anticlimactic: Alice and Bob exchange data protected by the keys | ||||
negotiated by DTLS. Because of the security guarantees discussed in | ||||
the previous sections, they know that the communications are | ||||
encrypted and authenticated. | ||||
The one remaining security property we need to establish is "consent | ||||
freshness", i.e., allowing Alice to verify that Bob is still prepared | ||||
to receive her communications. ICE specifies periodic STUN | ||||
keepalives but only if media is not flowing. Because the consent | ||||
issue is more difficult here, we require RTCWeb implementations to | ||||
periodically send keepalives. If a keepalive fails and no new ICE | ||||
channels can be established, then the session is terminated. | ||||
A.3. Detailed Technical Description | ||||
A.3.1. Origin and Web Security Issues | ||||
The basic unit of permissions for RTC-Web is the origin | ||||
[I-D.abarth-origin]. Because the security of the origin depends on | ||||
being able to authenticate content from that origin, the origin can | ||||
only be securely established if data is transferred over HTTPS. | ||||
Thus, clients MUST treat HTTP and HTTPS origins as different | ||||
permissions domains and SHOULD NOT permit access to any RTC-Web | ||||
functionality from scripts fetched over non-secure (HTTP) origins. | ||||
If an HTTPS origin contains mixed active content (regardless of | ||||
whether it is present on the specific page attempting to access RTC- | ||||
Web functionality), any access MUST be treated as if it came from the | ||||
HTTP origin. For instance, if a https://www.example.com/example.html | ||||
loads https://www.example.com/example.js and | ||||
http://www.example.org/jquery.js, any attempt by example.js to access | ||||
RTCWeb functionality MUST be treated as if it came from | ||||
http://www.example.com/. Note that many browsers already track mixed | ||||
content and either forbid it by default or display a warning. | ||||
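The mixed-content rule above can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, deciding which loaded resources count as "active content" is left to the caller, and default-port normalization of origins is ignored.

```python
from typing import Iterable
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Return scheme://host[:port] for a URL (no port normalization)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def effective_origin(page_url: str, active_content_urls: Iterable[str]) -> str:
    """Origin to use for RTC-Web permission checks.

    An HTTPS page that loaded any active content over plain HTTP is
    treated as if it came from the HTTP origin of the same host.
    """
    page = urlsplit(page_url)
    if page.scheme != "https":
        return origin_of(page_url)
    if any(urlsplit(u).scheme == "http" for u in active_content_urls):
        return f"http://{page.netloc}"
    return origin_of(page_url)
```

Applied to the example in the text, a page at https://www.example.com/example.html that loads http://www.example.org/jquery.js is evaluated as http://www.example.com.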
A.3.2. Device Permissions Model | ||||
Implementations MUST obtain explicit user consent prior to providing | ||||
access to the camera and/or microphone. Implementations MUST at | ||||
minimum support the following two permissions models: | ||||
o Requests for one-time camera/microphone access. | ||||
o Requests for permanent access. | ||||
In addition, they SHOULD support requests for access to a single | ||||
communicating peer. E.g., "Call customerservice@ford.com". Browsers | ||||
servicing such requests SHOULD clearly indicate that identity to the | ||||
user when asking for permission. | ||||
API Requirement: The API MUST provide a mechanism for the requesting | ||||
JS to indicate which of these forms of permissions it is | ||||
requesting. This allows the client to know what sort of user | ||||
interface experience to provide. In particular, browsers might | ||||
display a non-invasive door hanger ("some features of this site | ||||
may not work...") when asking for long-term permissions but a more | ||||
invasive UI ("here is your own video") for single-call | ||||
permissions. The API MAY grant weaker permissions than the JS | ||||
asked for if the user chooses to authorize only those permissions, | ||||
but if it intends to grant stronger ones it SHOULD display the | ||||
appropriate UI for those permissions. | ||||
API Requirement: The API MUST provide a mechanism for the requesting | ||||
JS to relinquish the ability to see or modify the media (e.g., via | ||||
MediaStream.record()). Combined with secure authentication of the | ||||
communicating peer, this allows a user to be sure that the calling | ||||
site is not accessing or modifying their conversation. | ||||
UI Requirement: The UI MUST clearly indicate when the user's camera | ||||
and microphone are in use. This indication MUST NOT be | ||||
suppressable by the JS and MUST clearly indicate how to terminate | ||||
a call, and provide a UI means to immediately stop camera/ | ||||
microphone input without the JS being able to prevent it. | ||||
UI Requirement: If the UI indication of camera/microphone use is | ||||
displayed in the browser such that minimizing the browser window | ||||
would hide the indication, or the JS creating an overlapping | ||||
window would hide the indication, then the browser SHOULD stop | ||||
camera and microphone input. | ||||
Clients MAY permit the formation of data channels without any direct | ||||
user approval. Because sites can always tunnel data through the | ||||
server, further restrictions on the data channel do not provide any | ||||
additional security (though see Appendix A.3.3 for a related issue). | ||||
Implementations which support some form of direct user authentication | ||||
SHOULD also provide a policy by which a user can authorize calls only | ||||
to specific counterparties. Specifically, the implementation SHOULD | ||||
provide the following interfaces/controls: | ||||
o Allow future calls to this verified user. | ||||
o Allow future calls to any verified user who is in my system | ||||
address book (this only works with address book integration, of | ||||
course). | ||||
Implementations SHOULD also provide a different user interface | ||||
indication when calls are in progress to users whose identities are | ||||
directly verifiable. Appendix A.3.5 provides more on this. | ||||
A.3.3. Communications Consent | ||||
Browser client implementations of RTC-Web MUST implement ICE. Server | ||||
gateway implementations which operate only at public IP addresses may | ||||
implement ICE-Lite. | ||||
Browser implementations MUST verify reachability via ICE prior to | ||||
sending any non-ICE packets to a given destination. Implementations | ||||
MUST NOT provide the ICE transaction ID to JavaScript. [Note: this | ||||
document takes no position on the split between ICE in JS and ICE in | ||||
the browser. The above text is written the way it is for editorial | ||||
convenience and will be modified appropriately if the WG decides on | ||||
ICE in the JS.] | ||||
Implementations MUST send keepalives no less frequently than every 30 | ||||
seconds regardless of whether traffic is flowing or not. If a | ||||
keepalive fails then the implementation MUST either attempt to find a | ||||
new valid path via ICE or terminate media for that ICE component. | ||||
Note that ICE [RFC5245], Section 10, keepalives use STUN Binding | ||||
Indications which are one-way and therefore not sufficient. We will | ||||
need to define a new mechanism for this. [OPEN ISSUE: what to do | ||||
here.] | ||||
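The keepalive rule above can be sketched as a small piece of state in Python. This sketch assumes some acknowledged keepalive exists, which the text notes is still to be defined. All names here, including the `find_new_path` callback, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# Upper bound on the gap between keepalives, taken from the text above.
KEEPALIVE_INTERVAL_SECONDS = 30.0


@dataclass
class ConsentState:
    last_sent: float            # time the last keepalive was sent
    media_allowed: bool = True  # whether media may flow on this component


def keepalive_due(state: ConsentState, now: float) -> bool:
    """True once the interval has elapsed, whether or not media is flowing."""
    return now - state.last_sent >= KEEPALIVE_INTERVAL_SECONDS


def handle_keepalive_failure(state: ConsentState,
                             find_new_path: Callable[[], bool]) -> None:
    """On failure, try ICE for a new valid path; otherwise stop media."""
    if find_new_path():
        return  # media continues on the newly validated path
    state.media_allowed = False
```

The point of the design is that the sender stops on its own when consent lapses, so a receiver never has to act to end an unwanted flow.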
A.3.4. IP Location Privacy | ||||
As mentioned in Section 4.2.4 above, a side effect of the default ICE | ||||
behavior is that the peer learns one's IP address, which leaks large | ||||
amounts of location information, especially for mobile devices. This | ||||
has negative privacy consequences in some circumstances. The | ||||
following two API requirements are intended to mitigate this issue: | ||||
API Requirement: The API MUST provide a mechanism to suppress ICE | ||||
negotiation (though perhaps to allow candidate gathering) until | ||||
the user has decided to answer the call [note: determining when | ||||
the call has been answered is a question for the JS.] This | ||||
enables a user to prevent a peer from learning their IP address if | ||||
they elect not to answer a call. | ||||
API Requirement: The API MUST provide a mechanism for the calling | ||||
application to indicate that only TURN candidates are to be used. | ||||
This prevents the peer from learning one's IP address at all. | ||||
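The TURN-only requirement above amounts to filtering the candidate list before it is signaled. The following Python sketch assumes candidates are represented as dictionaries with a "type" key holding one of the ICE candidate types (host, srflx, prflx, relay). That representation and the function name are illustrative only.

```python
from typing import Dict, Iterable, List


def filter_candidates(candidates: Iterable[Dict[str, str]],
                      relay_only: bool) -> List[Dict[str, str]]:
    """Drop every non-relayed candidate when the application asks for TURN only."""
    if not relay_only:
        return list(candidates)
    return [c for c in candidates if c["type"] == "relay"]
```

With `relay_only` set, host and server reflexive candidates never reach the peer, so the peer sees only the TURN server's address.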
A.3.5. Communications Security | ||||
Implementations MUST implement DTLS and DTLS-SRTP. All data channels | ||||
MUST be secured via DTLS. DTLS-SRTP MUST be offered for every media | ||||
channel and MUST be the default; i.e., if an implementation receives | ||||
an offer for DTLS-SRTP and SDES and/or plain RTP, DTLS-SRTP MUST be | ||||
selected. | ||||
[OPEN ISSUE: What should the settings be here? MUST?] | ||||
Implementations MAY support SDES and RTP for media traffic for | ||||
backward compatibility purposes. | ||||
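The selection rule above can be sketched as follows. The mechanism labels ("dtls-srtp", "sdes", "rtp") and the allow_legacy flag are illustrative assumptions; the draft does not define how an implementation represents offered keying mechanisms.

```python
from typing import Iterable, Optional


def select_keying(offered: Iterable[str], allow_legacy: bool = False) -> Optional[str]:
    """Pick the media keying mechanism for one media channel.

    DTLS-SRTP wins whenever it is offered. SDES and plain RTP are
    considered only if the implementation opted into backward
    compatibility (allow_legacy), and only when DTLS-SRTP is absent.
    """
    offered_set = {m.lower() for m in offered}
    if "dtls-srtp" in offered_set:
        return "dtls-srtp"
    if allow_legacy:
        # Prefer encrypted SDES over unprotected RTP (a choice of this sketch).
        for mechanism in ("sdes", "rtp"):
            if mechanism in offered_set:
                return mechanism
    return None  # nothing acceptable was offered
```

For example, select_keying(["sdes", "dtls-srtp", "rtp"]) yields "dtls-srtp" regardless of the order in which the mechanisms were offered.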
API Requirement: The API MUST provide a mechanism to indicate that a | ||||
fresh DTLS key pair is to be generated for a specific call. This | ||||
is intended to allow for unlinkability. Note that there are also | ||||
settings where it is attractive to use the same keying material | ||||
repeatedly, especially those with key continuity-based | ||||
authentication. | ||||
API Requirement: When DTLS-SRTP is used, the API MUST NOT permit the | ||||
JS to obtain the negotiated keying material. This requirement | ||||
preserves the end-to-end security of the media. | ||||
UI Requirements: A user-oriented client MUST provide an | ||||
"inspector" interface which allows the user to determine the | ||||
security characteristics of the media. [Largely derived from
[I-D.kaufman-rtcweb-security-ui].]
The following properties SHOULD be displayed "up-front" in the | ||||
browser chrome, i.e., without requiring the user to ask for them: | ||||
* A client MUST provide a user interface through which a user may | ||||
determine the security characteristics for currently-displayed | ||||
audio and video stream(s).
* A client MUST provide a user interface through which a user may | ||||
determine the security characteristics for transmissions of | ||||
their microphone audio and camera video. | ||||
* The "security characteristics" MUST include an indication as to | ||||
whether or not the transmission is cryptographically protected | ||||
and whether that protection is based on a key that was | ||||
delivered out-of-band (from a server) or was generated as a | ||||
result of a pairwise negotiation. | ||||
* If the far endpoint was directly verified (see Appendix A.3.6), the
"security characteristics" MUST include the verified | ||||
information. | ||||
The following properties are more likely to require some "drill- | ||||
down" from the user: | ||||
* If the transmission is cryptographically protected, the
algorithms in use (for example, "AES-CBC" or "Null Cipher").
* If the transmission is cryptographically protected, the | ||||
"security characteristics" MUST indicate whether PFS is | ||||
provided. | ||||
* If the transmission is cryptographically protected via an end- | ||||
to-end mechanism the "security characteristics" MUST include | ||||
some mechanism to allow an out-of-band verification of the | ||||
peer, such as a certificate fingerprint or an SAS. | ||||
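One way to picture the "security characteristics" listed above is as a small record that the inspector UI renders in two tiers. This is only a sketch: the class name, field names, and the two accessor methods are hypothetical and are not part of any proposed API.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SecurityCharacteristics:
    encrypted: bool
    # "out-of-band" (key delivered by a server) or "pairwise"
    # (negotiated between the endpoints); None when not encrypted.
    key_origin: Optional[str] = None
    verified_identity: Optional[str] = None  # set if the peer was verified
    algorithms: Tuple[str, ...] = ()         # e.g. ("AES-CBC",)
    pfs: Optional[bool] = None               # perfect forward secrecy
    peer_fingerprint: Optional[str] = None   # for out-of-band verification

    def up_front(self) -> Dict[str, object]:
        """Properties shown without the user asking for them."""
        return {
            "encrypted": self.encrypted,
            "key_origin": self.key_origin,
            "verified_identity": self.verified_identity,
        }

    def drill_down(self) -> Dict[str, object]:
        """Properties that may require drill-down; empty if unprotected."""
        if not self.encrypted:
            return {}
        return {
            "algorithms": self.algorithms,
            "pfs": self.pfs,
            "peer_fingerprint": self.peer_fingerprint,
        }
```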
A.3.6. Web-Based Peer Authentication | ||||
A.3.6.1. Generic Concepts | ||||
In a number of cases, it is desirable for the endpoint (i.e., the | ||||
browser) to be able to directly identify the endpoint on the other
side without trusting only the signaling service to which they are | ||||
connected. For instance, users may be making a call via a federated | ||||
system where they wish to get direct authentication of the other | ||||
side. Alternately, they may be making a call on a site which they | ||||
minimally trust (such as a poker site) but to someone who has an | ||||
identity on a site they do trust (such as a social network.) | ||||
Recently, a number of Web-based identity technologies (OAuth,
BrowserID, Facebook Connect, etc.) have been developed. While the
details vary, what these technologies share is that they have a
Web-based (i.e., HTTP/HTTPS) identity provider which attests to your
identity. For instance, if I have an account at example.org, I could | ||||
use the example.org identity provider to prove to others that I was | ||||
alice@example.org. The development of these technologies allows us | ||||
to separate calling from identity provision: I could call you on | ||||
Poker Galaxy but identify myself as alice@example.org. | ||||
Whatever the underlying technology, the general principle is that the | ||||
party which is being authenticated is NOT the signaling site but | ||||
rather the user (and their browser). Similarly, the relying party is | ||||
the browser and not the signaling site. This means that the | ||||
PeerConnection API MUST arrange to talk directly to the identity | ||||
provider in a way that cannot be impersonated by the calling site. | ||||
The following sections provide two examples of this. | ||||
A.3.6.2. BrowserID | ||||
BrowserID [https://browserid.org/] is a technology which allows a | ||||
user with a verified email address to generate an assertion | ||||
(authenticated by their identity provider) attesting to their | ||||
identity (phrased as an email address). The way that this is used in | ||||
practice is that the relying party embeds JS in their site which | ||||
talks to the BrowserID code (either hosted on a trusted intermediary | ||||
or embedded in the browser). That code generates the assertion which | ||||
is passed back to the relying party for verification. The assertion | ||||
can be verified directly or with a Web service provided by the | ||||
identity provider. It's relatively easy to extend this functionality | ||||
to authenticate RTC-Web calls, as shown below. | ||||
+----------------------+                     +----------------------+
|                      |                     |                      |
|    Alice's Browser   |                     |     Bob's Browser    |
|                      | OFFER ------------> |                      |
|    Calling JS Code   |                     |    Calling JS Code   |
|          ^           |                     |          ^           |
|          |           |                     |          |           |
|          v           |                     |          v           |
|    PeerConnection    |                     |    PeerConnection    |
|       |      ^       |                     |       |      ^       |
| Finger|      |Signed |                     |Signed |      |       |
| print |      |Finger |                     |Finger |      |"Alice"|
|       |      |print  |                     |print  |      |       |
|       v      |       |                     |       v      |       |
|   +--------------+   |                     |   +---------------+  |
|   |  BrowserID   |   |                     |   |  BrowserID    |  |
|   |  Signer      |   |                     |   |  Verifier     |  |
|   +--------------+   |                     |   +---------------+  |
|           ^          |                     |          ^           |
+-----------|----------+                     +----------|-----------+
            |                                           |
            | Get certificate                           |
            v                                           | Check
+----------------------+                                | certificate
|                      |                                |
|       Identity       |/-------------------------------+
|       Provider       |
|                      |
+----------------------+
The way this mechanism works is as follows. On Alice's side, Alice | ||||
goes to initiate a call. | ||||
1. The calling JS instantiates a PeerConnection and tells it that it | ||||
is interested in having it authenticated via BrowserID. | ||||
2. The PeerConnection instantiates the BrowserID signer in an | ||||
invisible IFRAME. The IFRAME is tagged with an origin that | ||||
indicates that it was generated by the PeerConnection (this | ||||
prevents ordinary JS from implementing it). The BrowserID signer | ||||
is provided with Alice's fingerprint. Note that the IFRAME here | ||||
does not render any UI. It is being used solely to allow the | ||||
browser to load the BrowserID signer in isolation, especially | ||||
from the calling site. | ||||
3. The BrowserID signer contacts Alice's identity provider, | ||||
authenticating as Alice (likely via a cookie). | ||||
4. The identity provider returns a short-term certificate attesting | ||||
to Alice's identity and her short-term public key. | ||||
5. The BrowserID code signs the fingerprint and returns the signed
assertion + certificate to the PeerConnection. [Note: there are | ||||
well-understood Web mechanisms for this that I am excluding here | ||||
for simplicity.] | ||||
6. The PeerConnection returns the signed information to the calling | ||||
JS code. | ||||
7. The signed assertion gets sent over the wire to Bob's browser | ||||
(via the signaling service) as part of the call setup. | ||||
Obviously, the format of the signed assertion varies depending on | ||||
what signaling style the WG ultimately adopts. However, for | ||||
concreteness, if something like ROAP were adopted, then the entire | ||||
message might look like: | ||||
{
  "messageType":"OFFER",
  "callerSessionId":"13456789ABCDEF",
  "seq": 1,
  "sdp":"
    v=0\n
    o=- 2890844526 2890842807 IN IP4 192.0.2.1\n
    s= \n
    c=IN IP4 192.0.2.1\n
    t=2873397496 2873404696\n
    m=audio 49170 RTP/AVP 0\n
    a=fingerprint: SHA-1 \
    4A:AD:B9:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB\n",
  "identity":{
    "identityType":"browserid",
    "assertion": {
      "digest":"<hash of fingerprint and session IDs>",
      "audience": "[TBD]",
      "valid-until": 1308859352261
    }, // signed using user's key
    "certificate": {
      "email": "rescorla@gmail.com",
      "public-key": "<ekrs-public-key>",
      "valid-until": 1308860561861
    } // certificate is signed by gmail.com
  }
}
Note that we only expect to sign the fingerprint values and the | ||||
session IDs, in order to allow the JS or calling service to modify | ||||
the rest of the SDP, while protecting the identity binding. [OPEN | ||||
ISSUE: should we sign seq too?] | ||||
[TODO: Need to talk about Audience a bit.]
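To make concrete why signing only the fingerprint values and session IDs leaves the rest of the SDP free to change, here is a minimal Python sketch of one possible digest computation. The draft does not define the digest input or hash function; the use of SHA-256, the newline-joined canonical form, and the function names are all assumptions made for illustration.

```python
import hashlib
import re
from typing import List, Sequence, Tuple

# Matches "a=fingerprint:<hash-func> <value>" lines in an SDP body.
FINGERPRINT_RE = re.compile(
    r"^a=fingerprint:\s*(\S+)\s+([0-9A-Fa-f:]+)\s*$", re.MULTILINE
)


def extract_fingerprints(sdp: str) -> List[Tuple[str, str]]:
    """Return (hash function, fingerprint) pairs, normalised to upper case."""
    return [(alg.upper(), value.upper()) for alg, value in FINGERPRINT_RE.findall(sdp)]


def identity_digest(sdp: str, session_ids: Sequence[str]) -> str:
    """Hash only the fingerprints and session IDs (hypothetical encoding)."""
    parts = [f"{alg} {value}" for alg, value in extract_fingerprints(sdp)]
    parts.extend(session_ids)
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


FP = "4A:AD:B9:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB"
original = f"v=0\nm=audio 49170 RTP/AVP 0\na=fingerprint:SHA-1 {FP}\n"
# The calling service rewrites the media port; the fingerprint is untouched.
rewritten = f"v=0\nm=audio 50000 RTP/AVP 0\na=fingerprint:SHA-1 {FP}\n"

same = identity_digest(original, ["13456789ABCDEF"]) == identity_digest(
    rewritten, ["13456789ABCDEF"]
)
```

Here `same` is True because only the m= line differs, whereas any change to the fingerprint or to a session ID would produce a different digest and break the identity binding.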
On Bob's side, he receives the signed assertion as part of the call | ||||
setup message and a similar procedure happens to verify it. | ||||
1. The calling JS instantiates a PeerConnection and provides it the | ||||
relevant signaling information, including the signed assertion. | ||||
2. The PeerConnection instantiates a BrowserID verifier in an IFRAME | ||||
and provides it the signed assertion. | ||||
3. The BrowserID verifier contacts the identity provider to verify | ||||
the certificate and then uses the key to verify the signed | ||||
fingerprint. | ||||
4. Alice's verified identity is returned to the PeerConnection (it | ||||
already has the fingerprint). | ||||
5. At this point, Bob's browser can display a trusted UI indication | ||||
that Alice is on the other end of the call. | ||||
When Bob returns his answer, he follows the converse procedure, which | ||||
provides Alice with a signed assertion of Bob's identity and keying | ||||
material. | ||||
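The non-cryptographic part of Bob's verification can be sketched as below, using the field names from the example message. Signature verification of the assertion (against the user's key) and of the certificate (against the identity provider) is deliberately omitted and would be required in practice; the function name check_assertion and its parameters are hypothetical.

```python
from typing import Any, Dict, Optional


def check_assertion(
    identity: Dict[str, Any], expected_digest: str, now_ms: int
) -> Optional[str]:
    """Return the asserted email if the freshness and digest checks pass.

    NOTE: this sketch performs no signature checks. A real verifier must
    also validate the certificate with the identity provider and verify
    the assertion signature with the certified public key.
    """
    assertion = identity["assertion"]
    certificate = identity["certificate"]
    # Both the assertion and the short-term certificate must be unexpired.
    if now_ms > assertion["valid-until"] or now_ms > certificate["valid-until"]:
        return None
    # The digest must match the fingerprint and session IDs Bob actually saw.
    if assertion["digest"] != expected_digest:
        return None
    return certificate["email"]


example_identity = {
    "identityType": "browserid",
    "assertion": {"digest": "abc123", "audience": "[TBD]", "valid-until": 2000},
    "certificate": {
        "email": "alice@example.org",
        "public-key": "<public-key>",
        "valid-until": 3000,
    },
}
```

With this input, check_assertion(example_identity, "abc123", 1000) returns "alice@example.org", while an expired assertion or a mismatched digest returns None.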
A.3.6.3. OAuth | ||||
While OAuth is not directly designed for user-to-user authentication, | ||||
with a little lateral thinking it can be made to serve. We use the | ||||
following mapping of OAuth concepts to RTC-Web concepts: | ||||
+----------------------+----------------------+
| OAuth                | RTCWeb               |
+----------------------+----------------------+
| Client               | Relying party        |
| Resource owner       | Authenticating party |
| Authorization server | Identity service     |
| Resource server      | Identity service     |
+----------------------+----------------------+

                   Table 1
The idea here is that when Alice wants to authenticate to Bob (i.e.,
for Bob to be aware that she is calling), she
allows Bob to see a resource on the identity provider that is bound | ||||
to the call, her identity, and her public key. Then Bob retrieves | ||||
the resource from the identity provider, thus verifying the binding | ||||
between Alice and the call. | ||||
Alice                       IDP                       Bob
---------------------------------------------------------
Call-Id, Fingerprint  ------->
<------------------- Auth Code
Auth Code ---------------------------------------------->
                             <----- Get Token + Auth Code
                             Token --------------------->
                             <------------- Get call-info
                             Call-Id, Fingerprint ------>
This is a modified version of a common OAuth flow, but omits the | ||||
redirects required to have the client point the resource owner to the | ||||
IDP, which is acting as both the resource server and the | ||||
authorization server, since Alice already has a handle to the IDP. | ||||
Above, we have referred to "Alice", but really what we mean is the | ||||
PeerConnection. Specifically, the PeerConnection will instantiate an | ||||
IFRAME with JS from the IDP and will use that IFRAME to communicate | ||||
with the IDP, authenticating with Alice's identity (e.g., cookie). | ||||
Similarly, Bob's PeerConnection instantiates an IFRAME to talk to the | ||||
IDP. | ||||
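The message flow above can be simulated with an in-memory stand-in for the IDP. This is a toy model under stated assumptions: ToyIdp and its methods are hypothetical, authentication of Alice to the IDP (e.g., by cookie) is not modelled, and no real OAuth library or endpoint is involved.

```python
import secrets
from typing import Dict, Optional


class ToyIdp:
    """In-memory identity provider acting as authorization and resource server."""

    def __init__(self) -> None:
        self._codes: Dict[str, Dict[str, str]] = {}   # auth code -> call info
        self._tokens: Dict[str, Dict[str, str]] = {}  # token -> call info

    def register_call(self, user: str, call_id: str, fingerprint: str) -> str:
        """Alice -> IDP: bind her identity to the call; returns an auth code."""
        code = secrets.token_urlsafe(16)
        self._codes[code] = {
            "user": user, "call_id": call_id, "fingerprint": fingerprint
        }
        return code

    def exchange_code(self, code: str) -> Optional[str]:
        """Bob -> IDP: trade the auth code for a token (single use)."""
        info = self._codes.pop(code, None)
        if info is None:
            return None
        token = secrets.token_urlsafe(16)
        self._tokens[token] = info
        return token

    def get_call_info(self, token: str) -> Optional[Dict[str, str]]:
        """Bob -> IDP: fetch the resource bound to the call."""
        return self._tokens.get(token)


idp = ToyIdp()
auth_code = idp.register_call("alice@example.org", "call-1", "4A:AD:B9")
# The auth code travels to Bob via the signaling service.
token = idp.exchange_code(auth_code)
call_info = idp.get_call_info(token)
# Bob compares the IDP's fingerprint with the one seen in call setup.
binding_ok = call_info is not None and call_info["fingerprint"] == "4A:AD:B9"
```

Making the auth code single use is a design choice of this sketch, intended to limit replay by the signaling service.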
A.3.6.4. Generic Identity Support | ||||
I believe it's possible to build a generic interface between the | ||||
PeerConnection and any identity sub-module so that the PeerConnection | ||||
just gets pointed to the IDP (which the relying party either trusts | ||||
or not) and JS from the IDP provides the concrete interfaces. | ||||
However, I need to work out the details, so I'm not specifying this | ||||
yet. If it works, the previous two sections will just be examples. | ||||
Author's Address
Eric Rescorla
RTFM, Inc.
2064 Edgewood Drive
Palo Alto, CA 94303
USA
Phone: +1 650 678 2350
Email: ekr@rtfm.com