draft-ietf-rtcweb-security-00.txt | draft-ietf-rtcweb-security-01.txt | |||
---|---|---|---|---|
RTC-Web E. Rescorla | RTC-Web E. Rescorla | |||
Internet-Draft RTFM, Inc. | Internet-Draft RTFM, Inc. | |||
Intended status: Standards Track September 21, 2011 | Intended status: Standards Track October 30, 2011 | |||
Expires: March 24, 2012 | Expires: May 2, 2012 | |||
Security Considerations for RTC-Web | Security Considerations for RTC-Web | |||
draft-ietf-rtcweb-security-00 | draft-ietf-rtcweb-security-01 | |||
Abstract | Abstract | |||
The Real-Time Communications on the Web (RTC-Web) working group is | The Real-Time Communications on the Web (RTC-Web) working group is | |||
tasked with standardizing protocols for real-time communications | tasked with standardizing protocols for real-time communications | |||
between Web browsers. The major use cases for RTC-Web technology are | between Web browsers. The major use cases for RTC-Web technology are | |||
real-time audio and/or video calls, Web conferencing, and direct data | real-time audio and/or video calls, Web conferencing, and direct data | |||
transfer. Unlike most conventional real-time systems (e.g., SIP- | transfer. Unlike most conventional real-time systems (e.g., SIP- | |||
based soft phones) RTC-Web communications are directly controlled by | based soft phones) RTC-Web communications are directly controlled by | |||
some Web server, which poses new security challenges. For instance, | some Web server, which poses new security challenges. For instance, | |||
skipping to change at page 2, line 7 | skipping to change at page 2, line 7 | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on March 24, 2012. | This Internet-Draft will expire on May 2, 2012. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2011 IETF Trust and the persons identified as the | Copyright (c) 2011 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at page 3, line 7 | skipping to change at page 3, line 7 | |||
modifications of such material outside the IETF Standards Process. | modifications of such material outside the IETF Standards Process. | |||
Without obtaining an adequate license from the person(s) controlling | Without obtaining an adequate license from the person(s) controlling | |||
the copyright in such materials, this document may not be modified | the copyright in such materials, this document may not be modified | |||
outside the IETF Standards Process, and derivative works of it may | outside the IETF Standards Process, and derivative works of it may | |||
not be created outside the IETF Standards Process, except to format | not be created outside the IETF Standards Process, except to format | |||
it for publication as an RFC or to translate it into languages other | it for publication as an RFC or to translate it into languages other | |||
than English. | than English. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
3. The Browser Threat Model . . . . . . . . . . . . . . . . . . . 5 | 3. The Browser Threat Model . . . . . . . . . . . . . . . . . . . 6 | |||
3.1. Access to Local Resources . . . . . . . . . . . . . . . . 6 | 3.1. Access to Local Resources . . . . . . . . . . . . . . . . 7 | |||
3.2. Same Origin Policy . . . . . . . . . . . . . . . . . . . . 6 | 3.2. Same Origin Policy . . . . . . . . . . . . . . . . . . . . 7 | |||
3.3. Bypassing SOP: CORS, WebSockets, and consent to | 3.3. Bypassing SOP: CORS, WebSockets, and consent to | |||
communicate . . . . . . . . . . . . . . . . . . . . . . . 7 | communicate . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
4. Security for RTC-Web Applications . . . . . . . . . . . . . . 7 | 4. Security for RTC-Web Applications . . . . . . . . . . . . . . 8 | |||
4.1. Access to Local Devices . . . . . . . . . . . . . . . . . 7 | 4.1. Access to Local Devices . . . . . . . . . . . . . . . . . 8 | |||
4.1.1. Calling Scenarios and User Expectations . . . . . . . 8 | 4.1.1. Calling Scenarios and User Expectations . . . . . . . 9 | |||
4.1.1.1. Dedicated Calling Services . . . . . . . . . . . . 8 | 4.1.1.1. Dedicated Calling Services . . . . . . . . . . . . 9 | |||
4.1.1.2. Calling the Site You're On . . . . . . . . . . . . 8 | 4.1.1.2. Calling the Site You're On . . . . . . . . . . . . 9 | |||
4.1.1.3. Calling to an Ad Target . . . . . . . . . . . . . 9 | 4.1.1.3. Calling to an Ad Target . . . . . . . . . . . . . 10 | |||
4.1.2. Origin-Based Security . . . . . . . . . . . . . . . . 9 | 4.1.2. Origin-Based Security . . . . . . . . . . . . . . . . 10 | |||
4.1.3. Security Properties of the Calling Page . . . . . . . 11 | 4.1.3. Security Properties of the Calling Page . . . . . . . 12 | |||
4.2. Communications Consent Verification . . . . . . . . . . . 12 | 4.2. Communications Consent Verification . . . . . . . . . . . 13 | |||
4.2.1. ICE . . . . . . . . . . . . . . . . . . . . . . . . . 12 | 4.2.1. ICE . . . . . . . . . . . . . . . . . . . . . . . . . 13 | |||
4.2.2. Masking . . . . . . . . . . . . . . . . . . . . . . . 12 | 4.2.2. Masking . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
4.2.3. Backward Compatibility . . . . . . . . . . . . . . . . 13 | 4.2.3. Backward Compatibility . . . . . . . . . . . . . . . . 14 | |||
4.3. Communications Security . . . . . . . . . . . . . . . . . 14 | 4.2.4. IP Location Privacy . . . . . . . . . . . . . . . . . 15 | |||
4.3.1. Protecting Against Retrospective Compromise . . . . . 15 | 4.3. Communications Security . . . . . . . . . . . . . . . . . 15 | |||
4.3.2. Protecting Against During-Call Attack . . . . . . . . 15 | 4.3.1. Protecting Against Retrospective Compromise . . . . . 16 | |||
4.3.2.1. Key Continuity . . . . . . . . . . . . . . . . . . 16 | 4.3.2. Protecting Against During-Call Attack . . . . . . . . 17 | |||
4.3.2.2. Short Authentication Strings . . . . . . . . . . . 16 | 4.3.2.1. Key Continuity . . . . . . . . . . . . . . . . . . 17 | |||
4.3.2.3. Recommendations . . . . . . . . . . . . . . . . . 17 | 4.3.2.2. Short Authentication Strings . . . . . . . . . . . 18 | |||
5. Security Considerations . . . . . . . . . . . . . . . . . . . 17 | 4.3.2.3. Recommendations . . . . . . . . . . . . . . . . . 19 | |||
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 18 | 5. Security Considerations . . . . . . . . . . . . . . . . . . . 19 | |||
7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 18 | 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
7.1. Normative References . . . . . . . . . . . . . . . . . . . 18 | 7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
7.2. Informative References . . . . . . . . . . . . . . . . . . 18 | 7.1. Normative References . . . . . . . . . . . . . . . . . . . 19 | |||
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 20 | 7.2. Informative References . . . . . . . . . . . . . . . . . . 20 | |||
| | Appendix A. A Proposed Security Architecture [No Consensus on | |||
| | This] . . . . . . . . . . . . . . . . . . . . . . . . 22 | |||
| | A.1. Trust Hierarchy . . . . . . . . . . . . . . . . . . . . . 22 | |||
| | A.1.1. Authenticated Entities . . . . . . . . . . . . . . . . 22 | |||
| | A.1.2. Unauthenticated Entities . . . . . . . . . . . . . . . 23 | |||
| | A.2. Overview . . . . . . . . . . . . . . . . . . . . . . . . . 23 | |||
| | A.2.1. Initial Signaling . . . . . . . . . . . . . . . . . . 24 | |||
| | A.2.2. Media Consent Verification . . . . . . . . . . . . . . 26 | |||
| | A.2.3. DTLS Handshake . . . . . . . . . . . . . . . . . . . . 26 | |||
| | A.2.4. Communications and Consent Freshness . . . . . . . . . 27 | |||
| | A.3. Detailed Technical Description . . . . . . . . . . . . . . 27 | |||
| | A.3.1. Origin and Web Security Issues . . . . . . . . . . . . 27 | |||
| | A.3.2. Device Permissions Model . . . . . . . . . . . . . . . 28 | |||
| | A.3.3. Communications Consent . . . . . . . . . . . . . . . . 29 | |||
| | A.3.4. IP Location Privacy . . . . . . . . . . . . . . . . . 29 | |||
| | A.3.5. Communications Security . . . . . . . . . . . . . . . 30 | |||
| | A.3.6. Web-Based Peer Authentication . . . . . . . . . . . . 31 | |||
| | A.3.6.1. Generic Concepts . . . . . . . . . . . . . . . . . 31 | |||
| | A.3.6.2. BrowserID . . . . . . . . . . . . . . . . . . . . 32 | |||
| | A.3.6.3. OAuth . . . . . . . . . . . . . . . . . . . . . . 35 | |||
| | A.3.6.4. Generic Identity Support . . . . . . . . . . . . . 36 | |||
| | Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 36 | |||
1. Introduction | 1. Introduction | |||
The Real-Time Communications on the Web (RTC-Web) working group is | The Real-Time Communications on the Web (RTC-Web) working group is | |||
tasked with standardizing protocols for real-time communications | tasked with standardizing protocols for real-time communications | |||
between Web browsers. The major use cases for RTC-Web technology are | between Web browsers. The major use cases for RTC-Web technology are | |||
real-time audio and/or video calls, Web conferencing, and direct data | real-time audio and/or video calls, Web conferencing, and direct data | |||
transfer. Unlike most conventional real-time systems, (e.g., SIP- | transfer. Unlike most conventional real-time systems, (e.g., SIP- | |||
based[RFC3261] soft phones) RTC-Web communications are directly | based[RFC3261] soft phones) RTC-Web communications are directly | |||
controlled by some Web server. A simple case is shown below. | controlled by some Web server. A simple case is shown below. | |||
skipping to change at page 6, line 40 | skipping to change at page 7, line 40 | |||
induced to download executable files and run them). | induced to download executable files and run them). | |||
3.2. Same Origin Policy | 3.2. Same Origin Policy | |||
Many other resources are accessible but isolated. For instance, | Many other resources are accessible but isolated. For instance, | |||
while scripts are allowed to make HTTP requests via the | while scripts are allowed to make HTTP requests via the | |||
XMLHttpRequest() API those requests are not allowed to be made to any | XMLHttpRequest() API those requests are not allowed to be made to any | |||
server, but rather solely to the same ORIGIN from whence the script | server, but rather solely to the same ORIGIN from whence the script | |||
came.[I-D.abarth-origin] (although CORS [CORS] and WebSockets | came.[I-D.abarth-origin] (although CORS [CORS] and WebSockets | |||
[I-D.ietf-hybi-thewebsocketprotocol] provides a escape hatch from | [I-D.ietf-hybi-thewebsocketprotocol] provides a escape hatch from | |||
this restriction, as described below. This SAME ORIGIN POLICY (SOP) | this restriction, as described below.) This SAME ORIGIN POLICY (SOP) | |||
prevents server A from mounting attacks on server B via the user's | prevents server A from mounting attacks on server B via the user's | |||
browser, which protects both the user (e.g., from misuse of his | browser, which protects both the user (e.g., from misuse of his | |||
credentials) and the server (e.g., from DoS attack). | credentials) and the server (e.g., from DoS attack). | |||
More generally, SOP forces scripts from each site to run in their | More generally, SOP forces scripts from each site to run in their | |||
own, isolated, sandboxes. While there are techniques to allow them | own, isolated, sandboxes. While there are techniques to allow them | |||
to interact, those interactions generally must be mutually consensual | to interact, those interactions generally must be mutually consensual | |||
(by each site) and are limited to certain channels. For instance, | (by each site) and are limited to certain channels. For instance, | |||
multiple pages/browser panes from the same origin can read each | multiple pages/browser panes from the same origin can read each | |||
other's JS variables, but pages from the different origins--or even | other's JS variables, but pages from the different origins--or even | |||
skipping to change at page 7, line 50 | skipping to change at page 8, line 50 | |||
generate traffic which resembles a given protocol. | generate traffic which resembles a given protocol. | |||
4. Security for RTC-Web Applications | 4. Security for RTC-Web Applications | |||
4.1. Access to Local Devices | 4.1. Access to Local Devices | |||
As discussed in Section 1, allowing arbitrary sites to initiate calls | As discussed in Section 1, allowing arbitrary sites to initiate calls | |||
violates the core Web security guarantee; without some access | violates the core Web security guarantee; without some access | |||
restrictions on local devices, any malicious site could simply bug a | restrictions on local devices, any malicious site could simply bug a | |||
user. At minimum, then, it MUST NOT be possible for arbitrary sites | user. At minimum, then, it MUST NOT be possible for arbitrary sites | |||
to initiate calls to arbitrary location without user consent. This | to initiate calls to arbitrary locations without user consent. This | |||
immediately raises the question, however, of what should be the scope | immediately raises the question, however, of what should be the scope | |||
of user consent. | of user consent. | |||
For the rest of this discussion we assume that the user is somehow | For the rest of this discussion we assume that the user is somehow | |||
going to grant consent to some entity (e.g., a social networking | going to grant consent to some entity (e.g., a social networking | |||
site) to initiate a call on his behalf. This consent may be limited | site) to initiate a call on his behalf. This consent may be limited | |||
to a single call or may be a general consent. In order for the user | to a single call or may be a general consent. In order for the user | |||
to make an intelligent decision about whether to allow a call (and | to make an intelligent decision about whether to allow a call (and | |||
hence his camera and microphone input to be routed somewhere), he | hence his camera and microphone input to be routed somewhere), he | |||
must understand either who is request access, where the media is | must understand either who is requesting access, where the media is | |||
going, or both. So, for instance, one might imagine that at the time | going, or both. So, for instance, one might imagine that at the time | |||
access to camera and microphone is requested, the user is shown a | access to camera and microphone is requested, the user is shown a | |||
dialog that says "site X has requested access to camera and | dialog that says "site X has requested access to camera and | |||
microphone, yes or no" (though note that this type of in-flow | microphone, yes or no" (though note that this type of in-flow | |||
interface violates one of the guidelines in Section Section 3). The | interface violates one of the guidelines in Section 3). The user's | |||
user's decision will of course be based on his opinion of Site X. | decision will of course be based on his opinion of Site X. However, | |||
However, as discussed below, this is a complicated concept. | as discussed below, this is a complicated concept. | |||
4.1.1. Calling Scenarios and User Expectations | 4.1.1. Calling Scenarios and User Expectations | |||
While a large number of possible calling scenarios are possible, the | While a large number of possible calling scenarios are possible, the | |||
scenarios discussed in this section illustrate many of the | scenarios discussed in this section illustrate many of the | |||
difficulties of identifying the relevant scope of consent. | difficulties of identifying the relevant scope of consent. | |||
4.1.1.1. Dedicated Calling Services | 4.1.1.1. Dedicated Calling Services | |||
The first scenario we consider is a dedicated calling service. In | The first scenario we consider is a dedicated calling service. In | |||
skipping to change at page 9, line 12 | skipping to change at page 10, line 12 | |||
The paradigmatic case here is the "click here to talk to a | The paradigmatic case here is the "click here to talk to a | |||
representative" windows that appear on many shopping sites. In this | representative" windows that appear on many shopping sites. In this | |||
case, the user's expectation is that they are calling the site | case, the user's expectation is that they are calling the site | |||
they're actually visiting. However, it is unlikely that they want to | they're actually visiting. However, it is unlikely that they want to | |||
provide a general consent to such a site; just because I want some | provide a general consent to such a site; just because I want some | |||
information on a car doesn't mean that I want the car manufacturer to | information on a car doesn't mean that I want the car manufacturer to | |||
be able to activate my microphone whenever they please. Thus, this | be able to activate my microphone whenever they please. Thus, this | |||
suggests the need for a second consent mechanism where I only grant | suggests the need for a second consent mechanism where I only grant | |||
consent for the duration of a given call. As described in | consent for the duration of a given call. As described in | |||
Section 3.1, great care must be taken in the design of this interface | Section 3.1, great care must be taken in the design of this interface | |||
to avoid the users just clicking through. | to avoid the users just clicking through. Note also that the user | |||
| | interface chrome must clearly display elements showing that the call | |||
| | is continuing in order to avoid attacks where the calling site just | |||
| | leaves it up indefinitely but shows a Web UI that implies otherwise. | |||
4.1.1.3. Calling to an Ad Target | 4.1.1.3. Calling to an Ad Target | |||
In both of the previous cases, the user has a direct relationship | In both of the previous cases, the user has a direct relationship | |||
(though perhaps a transient one) with the target of the call. | (though perhaps a transient one) with the target of the call. | |||
Moreover, in both cases he is actually visiting the site of the | Moreover, in both cases he is actually visiting the site of the | |||
person he is being asked to trust. However, this is not always so. | person he is being asked to trust. However, this is not always so. | |||
Consider the case where a user is a visiting a content site which | Consider the case where a user is a visiting a content site which | |||
hosts an advertisement with an invitation to call for more | hosts an advertisement with an invitation to call for more | |||
information. When the user clicks the ad, they are connected with | information. When the user clicks the ad, they are connected with | |||
skipping to change at page 10, line 31 | skipping to change at page 11, line 34 | |||
Cryptographic Consent | Cryptographic Consent | |||
Only allow calls to a given set of peer keying material or to a | Only allow calls to a given set of peer keying material or to a | |||
cryptographically established identity. | cryptographically established identity. | |||
Unfortunately, none of these approaches is satisfactory for all | Unfortunately, none of these approaches is satisfactory for all | |||
cases. As discussed above, individual consent puts the user's | cases. As discussed above, individual consent puts the user's | |||
approval in the UI flow for every call. Not only does this quickly | approval in the UI flow for every call. Not only does this quickly | |||
become annoying but it can train the user to simply click "OK", at | become annoying but it can train the user to simply click "OK", at | |||
which point the consent becomes useless. Thus, while it may be | which point the consent becomes useless. Thus, while it may be | |||
necssary to have individual consent in some case, this is not a | necessary to have individual consent in some case, this is not a | |||
suitable solution for (for instance) the calling service case. Where | suitable solution for (for instance) the calling service case. Where | |||
necessary, in-flow user interfaces must be carefully designed to | necessary, in-flow user interfaces must be carefully designed to | |||
avoid the risk of the user blindly clicking through. | avoid the risk of the user blindly clicking through. | |||
The other two options are designed to restrict calls to a given | The other two options are designed to restrict calls to a given | |||
target. Unfortunately, Callee-oriented consent does not work well | target. Unfortunately, Callee-oriented consent does not work well | |||
because a malicious site can claim that the user is calling any user | because a malicious site can claim that the user is calling any user | |||
of his choice. One fix for this is to tie calls to a | of his choice. One fix for this is to tie calls to a | |||
cryptographically established identity. While not suitable for all | cryptographically established identity. While not suitable for all | |||
cases, this approach may be useful for some. If we consider the | cases, this approach may be useful for some. If we consider the | |||
advertising case described in Section 4.1.1.3, it's not particularly | advertising case described in Section 4.1.1.3, it's not particularly | |||
convenient to require the advertiser to instantiate an iframe on the | convenient to require the advertiser to instantiate an iframe on the | |||
hosting site just to get permission; a more convenient approach is to | hosting site just to get permission; a more convenient approach is to | |||
cryptographically tie the advertiser's certificate to the | cryptographically tie the advertiser's certificate to the | |||
communication directly. We're still tieing permissions to origin | communication directly. We're still tying permissions to origin | |||
here, but to the media origin (and-or destination) rather than to the | here, but to the media origin (and-or destination) rather than to the | |||
Web origin. | Web origin. | |||
Another case where media-level cryptographic identity makes sense is | Another case where media-level cryptographic identity makes sense is | |||
when a user really does not trust the calling site. For instance, I | when a user really does not trust the calling site. For instance, I | |||
might be worried that the calling service will attempt to bug my | might be worried that the calling service will attempt to bug my | |||
computer, but I also want to be able to conveniently call my friends. | computer, but I also want to be able to conveniently call my friends. | |||
If consent is tied to particular communications endpoints, then my | If consent is tied to particular communications endpoints, then my | |||
risk is limited. However, this is also not that convenient an | risk is limited. However, this is also not that convenient an | |||
interface, since managing individual user permissions can be painful. | interface, since managing individual user permissions can be painful. | |||
While this is primarily a question not for IETF, it should be clear | While this is primarily a question not for IETF, it should be clear | |||
that there is no really good answer. In general, if you cannot trust | that there is no really good answer. In general, if you cannot trust | |||
the site which you have authorized for calling not to bug you then | the site which you have authorized for calling not to bug you then | |||
your security situation is not really ideal. It is RECOMMENDED that | your security situation is not really ideal. It is RECOMMENDED that | |||
browsers have explicit (and obvious) indicators that they are in a | browsers have explicit (and obvious) indicators that they are in a | |||
call in order to mitigate this risk. | call in order to mitigate this risk. | |||
4.1.3. Security Properties of the Calling Page | 4.1.3. Security Properties of the Calling Page | |||
Origin-based security is intended to security against web attackers. | Origin-based security is intended to secure against web attackers. | |||
However, we must also consider the case of network attackers. | However, we must also consider the case of network attackers. | |||
Consider the case where I have granted permission to a calling | Consider the case where I have granted permission to a calling | |||
service by an origin that has the HTTP scheme, e.g., | service by an origin that has the HTTP scheme, e.g., | |||
http://calling-service.example.com. If I ever use my computer on an | http://calling-service.example.com. If I ever use my computer on an | |||
unsecured network (e.g., a hotspot or if my own home wireless network | unsecured network (e.g., a hotspot or if my own home wireless network | |||
is insecure), and browse any HTTP site, then an attacker can bug my | is insecure), and browse any HTTP site, then an attacker can bug my | |||
computer. The attack proceeds like this: | computer. The attack proceeds like this: | |||
1. I connect to http://anything.example.org/. Note that this site | 1. I connect to http://anything.example.org/. Note that this site | |||
is unaffiliated with the calling service. | is unaffiliated with the calling service. | |||
skipping to change at page 12, line 14 | skipping to change at page 13, line 16 | |||
infect the browser's notion of that origin semi-permanently. | infect the browser's notion of that origin semi-permanently. | |||
[[ OPEN ISSUE: What recommendation should IETF make about (a) | [[ OPEN ISSUE: What recommendation should IETF make about (a) | |||
whether RTCWeb long-term consent should be available over HTTP pages | whether RTCWeb long-term consent should be available over HTTP pages | |||
and (b) How to handle origins where the consent is to an HTTPS URL | and (b) How to handle origins where the consent is to an HTTPS URL | |||
but the page contains active mixed content? ]] | but the page contains active mixed content? ]] | |||
4.2. Communications Consent Verification | 4.2. Communications Consent Verification | |||
As discussed in Section 3.3, allowing web applications unrestricted | As discussed in Section 3.3, allowing web applications unrestricted | |||
access to the via the browser network introduces the risk of using | network access via the browser introduces the risk of using the | |||
the browser as an attack platform against machines which would not | browser as an attack platform against machines which would not | |||
otherwise be accessible to the malicious site, for instance because | otherwise be accessible to the malicious site, for instance because | |||
they are topologically restricted (e.g., behind a firewall or NAT). | they are topologically restricted (e.g., behind a firewall or NAT). | |||
In order to prevent this form of attack as well as cross-protocol | In order to prevent this form of attack as well as cross-protocol | |||
attacks it is important to require that the target of traffic | attacks it is important to require that the target of traffic | |||
explicitly consent to receiving the traffic in question. Until that | explicitly consent to receiving the traffic in question. Until that | |||
consent has been verified for a given endpoint, traffic other than | consent has been verified for a given endpoint, traffic other than | |||
the consent handshake MUST NOT be sent to that endpoint. | the consent handshake MUST NOT be sent to that endpoint. | |||
4.2.1. ICE | 4.2.1. ICE | |||
Verifying receiver consent requires some sort of explicit handshake, | Verifying receiver consent requires some sort of explicit handshake, | |||
but conveniently we already need one in order to do NAT hole- | but conveniently we already need one in order to do NAT hole- | |||
punching. ICE [RFC5245] includes a handshake designed to verify that | punching. ICE [RFC5245] includes a handshake designed to verify that | |||
the receiving element wishes to receive traffic from the sender. It | the receiving element wishes to receive traffic from the sender. It | |||
is important to remember here that the site initiating ICE is | is important to remember here that the site initiating ICE is | |||
presumed malicious; in order for the handshake to be secure the | presumed malicious; in order for the handshake to be secure the | |||
receiving element MUST demonstrate receipt/knowledge of some value | receiving element MUST demonstrate receipt/knowledge of some value | |||
not available to the site (thus preventing it from forging | not available to the site (thus preventing the site from forging | |||
responses). In order to achieve this objective with ICE, the STUN | responses). In order to achieve this objective with ICE, the STUN | |||
transaction IDs must be generated by the browser and MUST NOT be made | transaction IDs must be generated by the browser and MUST NOT be made | |||
available to the initiating script, even via a diagnostic interface. | available to the initiating script, even via a diagnostic interface. | |||
| | Verifying receiver consent also requires verifying the receiver wants | |||
| | to receive traffic from a particular sender, and at this time; for | |||
| | example a malicious site may simply attempt ICE to known servers that | |||
| | are using ICE for other sessions. ICE provides this verification as | |||
| | well, by using the STUN credentials as a form of per-session shared | |||
| | secret. Those credentials are known to the Web application, but | |||
| | would need to also be known and used by the STUN-receiving element to | |||
| | be useful. | |||
| | There also needs to be some mechanism for the browser to verify that | |||
| | the target of the traffic continues to wish to receive it. | |||
| | Obviously, some ICE-based mechanism will work here, but it has been | |||
| | observed that because ICE keepalives are indications, they will not | |||
| | work here, so some other mechanism is needed. | |||
4.2.2. Masking | 4.2.2. Masking | |||
Once consent is verified, there still is some concern about | Once consent is verified, there still is some concern about | |||
misinterpretation attacks as described by Huang et al.[huang-w2sp]. | misinterpretation attacks as described by Huang et al.[huang-w2sp]. | |||
As long as communication is limited to UDP, then this risk is | As long as communication is limited to UDP, then this risk is | |||
probably limited, thus masking is not required for UDP. I.e., once | probably limited, thus masking is not required for UDP. I.e., once | |||
communications consent has been verified, it is most likely safe to | communications consent has been verified, it is most likely safe to | |||
allow the implementation to send arbitrary UDP traffic to the chosen | allow the implementation to send arbitrary UDP traffic to the chosen | |||
destination, provided that the STUN keepalives continue to succeed. | destination, provided that the STUN keepalives continue to succeed. | |||
However, with TCP the risk of transparent proxies becomes much more | In particular, this is true for the data channel if DTLS is used | |||
severe. If TCP is to be used, then WebSockets style masking MUST be | because DTLS (with the anti-chosen plaintext mechanisms required by | |||
employed. | TLS 1.1) does not allow the attacker to generate predictable | |||
ciphertext. However, with TCP the risk of transparent proxies | ||||
becomes much more severe. If TCP is to be used, then WebSockets | ||||
style masking MUST be employed. | ||||
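As an illustration of what WebSockets style masking involves, the following sketch XORs each payload byte with a 4-byte key, cycling through the key, as the WebSocket protocol does. The function names are hypothetical; how such masking would be framed for RTC-Web over TCP is not specified here.

```python
import secrets


def new_masking_key() -> bytes:
    """Fresh, unpredictable 4-byte key, chosen by the browser and not the script."""
    return secrets.token_bytes(4)


def mask_payload(payload: bytes, masking_key: bytes) -> bytes:
    """XOR each payload byte with the key byte at index i mod 4.

    Applying the same function again with the same key unmasks the data.
    """
    if len(masking_key) != 4:
        raise ValueError("masking key must be exactly 4 bytes")
    return bytes(b ^ masking_key[i % 4] for i, b in enumerate(payload))
```

Because the key is chosen per frame by the browser, the script cannot predict the bytes that appear on the wire, which is what frustrates misinterpretation by transparent proxies.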
4.2.3. Backward Compatibility | 4.2.3. Backward Compatibility | |||
A requirement to use ICE limits compatibility with legacy non-ICE | A requirement to use ICE limits compatibility with legacy non-ICE | |||
clients. It seems unsafe to completely remove the requirement for | clients. It seems unsafe to completely remove the requirement for | |||
some check. All proposed checks have the common feature that the | some check. All proposed checks have the common feature that the | |||
browser sends some message to the candidate traffic recipient and | browser sends some message to the candidate traffic recipient and | |||
refuses to send other traffic until that message has been replied to. | refuses to send other traffic until that message has been replied to. | |||
The message/reply pair must be generated in such a way that an | The message/reply pair must be generated in such a way that an | |||
attacker who controls the Web application cannot forge them, | attacker who controls the Web application cannot forge them, | |||
skipping to change at page 13, line 51 | skipping to change at page 15, line 20 | |||
responds with responses authenticated with ICE credentials) then this | responds with responses authenticated with ICE credentials) then this | |||
issue does not exist. However, such a design is very close to an | issue does not exist. However, such a design is very close to an | |||
ICE-Lite implementation (indeed, arguably is one). An intermediate | ICE-Lite implementation (indeed, arguably is one). An intermediate | |||
approach would be to have a STUN extension that indicated that one | approach would be to have a STUN extension that indicated that one | |||
was responding to RTC-Web checks but not computing integrity checks | was responding to RTC-Web checks but not computing integrity checks | |||
based on the ICE credentials. This would allow the use of standalone | based on the ICE credentials. This would allow the use of standalone | |||
STUN servers without the risk of confusing them with legacy STUN | STUN servers without the risk of confusing them with legacy STUN | |||
servers. If a non-ICE legacy solution is needed, then this is | servers. If a non-ICE legacy solution is needed, then this is | |||
probably the best choice. | probably the best choice. | |||
[TODO: Write something about consent freshness and RTCP]. | Once initial consent is verified, we also need to verify continuing | |||
consent, in order to avoid attacks where two people briefly share an | ||||
IP (e.g., behind a NAT in an Internet cafe) and the attacker arranges | ||||
for a large, unstoppable, traffic flow to the network and then | ||||
leaves. The appropriate technologies here are fairly similar to | ||||
those for initial consent, though are perhaps weaker since the | ||||
threat is less severe. | ||||
[[ OPEN ISSUE: Exactly what should be the requirements here? | [[ OPEN ISSUE: Exactly what should be the requirements here? | |||
Proposals include ICE all the time or ICE but with allowing one of | Proposals include ICE all the time or ICE but with allowing one of | |||
these non-ICE things for legacy. ]] | these non-ICE things for legacy. ]] | |||
4.2.4. IP Location Privacy | ||||
Note that as soon as the callee sends their ICE candidates, the | ||||
caller learns the callee's IP addresses. The callee's server | ||||
reflexive address reveals a lot of information about the callee's | ||||
location. In order to avoid tracking, implementations may wish to | ||||
suppress the start of ICE negotiation until the callee has answered. | ||||
In addition, either side may wish to hide their location entirely by | ||||
forcing all traffic through a TURN server. | ||||
4.3. Communications Security | 4.3. Communications Security | |||
Finally, we consider a problem familiar from the SIP world: | Finally, we consider a problem familiar from the SIP world: | |||
communications security. For obvious reasons, it MUST be possible | communications security. For obvious reasons, it MUST be possible | |||
for the communicating parties to establish a channel which is secure | for the communicating parties to establish a channel which is secure | |||
against both message recovery and message modification. (See | against both message recovery and message modification. (See | |||
[RFC5479] for more details.) This service must be provided for both | [RFC5479] for more details.) This service must be provided for both | |||
data and voice/video. Ideally the same security mechanisms would be | data and voice/video. Ideally the same security mechanisms would be | |||
used for both types of content. Technology for providing this | used for both types of content. Technology for providing this | |||
service (for instance, DTLS [RFC4347] and DTLS-SRTP [RFC5763]) is | service (for instance, DTLS [RFC4347] and DTLS-SRTP [RFC5763]) is | |||
skipping to change at page 18, line 7 | skipping to change at page 19, line 39 | |||
should be specified and/or mandatory to implement? In particular, | should be specified and/or mandatory to implement? In particular, | |||
should we allow DTLS-SRTP only, or both DTLS-SRTP and SDES. Should | should we allow DTLS-SRTP only, or both DTLS-SRTP and SDES. Should | |||
we allow RTP for backward compatibility? ]] | we allow RTP for backward compatibility? ]] | |||
5. Security Considerations | 5. Security Considerations | |||
This entire document is about security. | This entire document is about security. | |||
6. Acknowledgements | 6. Acknowledgements | |||
Bernard Aboba, Harald Alvestrand, Cullen Jennings, Hadriel Kaplan (S | ||||
4.2.1), Matthew Kaufman, Magnus Westerlund. | ||||
7. References | 7. References | |||
7.1. Normative References | 7.1. Normative References | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
7.2. Informative References | 7.2. Informative References | |||
[CORS] van Kesteren, A., "Cross-Origin Resource Sharing". | [CORS] van Kesteren, A., "Cross-Origin Resource Sharing". | |||
[I-D.abarth-origin] | [I-D.abarth-origin] | |||
Barth, A., "The Web Origin Concept", | Barth, A., "The Web Origin Concept", | |||
draft-abarth-origin-09 (work in progress), November 2010. | draft-abarth-origin-09 (work in progress), November 2010. | |||
[I-D.ietf-hybi-thewebsocketprotocol] | [I-D.ietf-hybi-thewebsocketprotocol] | |||
Fette, I. and A. Melnikov, "The WebSocket protocol", | Fette, I. and A. Melnikov, "The WebSocket protocol", | |||
draft-ietf-hybi-thewebsocketprotocol-15 (work in | draft-ietf-hybi-thewebsocketprotocol-17 (work in | |||
progress), September 2011. | progress), September 2011. | |||
[I-D.kaufman-rtcweb-security-ui] | [I-D.kaufman-rtcweb-security-ui] | |||
Kaufman, M., "Client Security User Interface Requirements | Kaufman, M., "Client Security User Interface Requirements | |||
for RTCWEB", draft-kaufman-rtcweb-security-ui-00 (work in | for RTCWEB", draft-kaufman-rtcweb-security-ui-00 (work in | |||
progress), June 2011. | progress), June 2011. | |||
[RFC2818] Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000. | [RFC2818] Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000. | |||
[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, | [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, | |||
skipping to change at page 20, line 12 | skipping to change at page 22, line 5 | |||
Kain, A. and M. Macon, "Design and Evaluation of a Voice | Kain, A. and M. Macon, "Design and Evaluation of a Voice | |||
Conversion Algorithm based on Spectral Envelope Mapping | Conversion Algorithm based on Spectral Envelope Mapping | |||
and Residual Prediction", Proceedings of ICASSP, May | and Residual Prediction", Proceedings of ICASSP, May | |||
2001. | 2001. | |||
[whitten-johnny] | [whitten-johnny] | |||
Whitten, A. and J. Tygar, "Why Johnny Can't Encrypt: A | Whitten, A. and J. Tygar, "Why Johnny Can't Encrypt: A | |||
Usability Evaluation of PGP 5.0", Proceedings of the 8th | Usability Evaluation of PGP 5.0", Proceedings of the 8th | |||
USENIX Security Symposium, 1999. | USENIX Security Symposium, 1999. | |||
Appendix A. A Proposed Security Architecture [No Consensus on This] | ||||
This section contains a proposed security architecture, based on the | ||||
considerations discussed in the main body of this memo. This section | ||||
is currently the opinion of the author and does not have consensus | ||||
though some (many?) elements of this proposal do seem to have general | ||||
consensus. | ||||
A.1. Trust Hierarchy | ||||
The basic assumption of this proposal is that network resources exist | ||||
in a hierarchy of trust, rooted in the browser, which serves as the | ||||
user's TRUSTED COMPUTING BASE (TCB). Any security property which the | ||||
user wishes to have enforced must be ultimately guaranteed by the | ||||
browser (or transitively by some property the browser verifies). | ||||
Conversely, if the browser is compromised, then no security | ||||
guarantees are possible. Note that there are cases (e.g., Internet | ||||
kiosks) where the user can't really trust the browser that much. In | ||||
these cases, the level of security provided is limited by how much | ||||
they trust the browser. | ||||
Optimally, we would not rely on trust in any entities other than the | ||||
browser. However, this is unfortunately not possible if we wish to | ||||
have a functional system. Other network elements fall into two | ||||
categories: those which can be authenticated by the browser and thus | ||||
are partly trusted--though to the minimum extent necessary--and those | ||||
which cannot be authenticated and thus are untrusted. This is a | ||||
natural extension of the end-to-end principle. | ||||
A.1.1. Authenticated Entities | ||||
There are two major classes of authenticated entities in the system: | ||||
o Calling services: Web sites whose origin we can verify (optimally | ||||
via HTTPS). | ||||
o Other users: RTC-Web peers whose origin we can verify | ||||
cryptographically (optimally via DTLS-SRTP). | ||||
Note that merely being authenticated does not make these entities | ||||
trusted. For instance, just because we can verify that | ||||
https://www.evil.org/ is owned by Dr. Evil does not mean that we can | ||||
trust Dr. Evil to access our camera and microphone. However, it gives | ||||
the user an opportunity to determine whether he wishes to trust Dr. | ||||
Evil or not; after all, if he desires to contact Dr. Evil, it's safe | ||||
to temporarily give him access to the camera and microphone for the | ||||
purpose of the call. The point here is that we must first identify | ||||
other elements before we can determine whether to trust them. | ||||
It's also worth noting that there are settings where authentication | ||||
is non-cryptographic, such as other machines behind a firewall. | ||||
Naturally, the level of trust one can have in identities verified in | ||||
this way depends on how strong the topology enforcement is. | ||||
A.1.2. Unauthenticated Entities | ||||
Other than the above entities, we are not generally able to identify | ||||
other network elements, thus we cannot trust them. This does not | ||||
mean that it is not possible to have any interaction with them, but | ||||
it means that we must assume that they will behave maliciously and | ||||
design a system which is secure even if they do so. | ||||
A.2. Overview | ||||
This section describes a typical RTCWeb session and shows how the | ||||
various security elements interact and what guarantees are provided | ||||
to the user. The example in this section is a "best case" scenario | ||||
in which we provide the maximal amount of user authentication and | ||||
media privacy with the minimal level of trust in the calling service. | ||||
Simpler versions with lower levels of security are also possible and | ||||
are noted in the text where applicable. It's also important to | ||||
recognize the tension between security (or performance) and privacy. | ||||
The example shown here is aimed towards settings where we are more | ||||
concerned about secure calling than about privacy, but as we shall | ||||
see, there are settings where one might wish to make different | ||||
tradeoffs--this architecture is still compatible with those settings. | ||||
For the purposes of this example, we assume the topology shown in the | ||||
figure below. This topology is derived from the topology shown in | ||||
Figure 1, but separates Alice and Bob's identities from the process | ||||
of signaling. Specifically, Alice and Bob have relationships with | ||||
some Identity Provider (IDP) that supports a protocol (such as OpenID | ||||
or BrowserID) that can be used to attest to their identity. This | ||||
separation isn't particularly important in "closed world" cases where | ||||
Alice and Bob are users on the same social network and have | ||||
identities based on that network. However, there are important | ||||
settings where that is not the case, such as federation (calls from | ||||
one network to another) and calling on untrusted sites, such as where | ||||
two users who have a relationship via a given social network want to | ||||
call each other on another, untrusted, site, such as a poker site. | ||||
+----------------+ | ||||
| | | ||||
| Signaling | | ||||
| Server | | ||||
| | | ||||
+----------------+ | ||||
^ ^ | ||||
/ \ | ||||
HTTPS / \ HTTPS | ||||
/ \ | ||||
/ \ | ||||
v v | ||||
JS API JS API | ||||
+-----------+ +-----------+ | ||||
| | Media | | | ||||
Alice | Browser |<---------->| Browser | Bob | ||||
| | (DTLS-SRTP)| | | ||||
+-----------+ +-----------+ | ||||
^ ^--+ +--^ ^ | ||||
| | | | | ||||
v | | v | ||||
+-----------+ | | +-----------+ | ||||
| |<--------+ | | | ||||
| IDP | | | IDP | | ||||
| | +------->| | | ||||
+-----------+ +-----------+ | ||||
Figure 2: A call with IDP-based identity | ||||
A.2.1. Initial Signaling | ||||
Alice and Bob are both users of a common calling service; they both | ||||
have approved the calling service to make calls (we defer the | ||||
discussion of device access permissions till later). They are both | ||||
connected to the calling service via HTTPS and so know the origin | ||||
with some level of confidence. They also have accounts with some | ||||
identity provider. This sort of identity service is becoming | ||||
increasingly common in the Web environment in technologies such as | ||||
BrowserID, Federated Google Login, Facebook Connect, OAuth, OpenID, | ||||
and WebFinger, and is often provided as a side effect of having an | ||||
ordinary account with some service. In this example, we show Alice | ||||
and Bob using a separate identity service, though they may actually | ||||
be using the same identity service as calling service or have no | ||||
identity service at all. | ||||
Alice is logged onto the calling service and decides to call Bob. She | ||||
can see from the calling service that he is online and the calling | ||||
service presents a JS UI in the form of a button next to Bob's name | ||||
which says "Call". Alice clicks the button, which initiates a JS | ||||
callback that instantiates a PeerConnection object. This does not | ||||
require a security check: JS from any origin is allowed to get this | ||||
far. | ||||
Once the PeerConnection is created, the calling service JS needs to | ||||
set up some media. Because this is an audio/video call, it creates | ||||
two MediaStreams, one connected to an audio input and one connected | ||||
to a video input. At this point the first security check is | ||||
required: untrusted origins are not allowed to access the camera and | ||||
microphone. In this case, because Alice is a long-term user of the | ||||
calling service, she has made a permissions grant (i.e., a setting in | ||||
the browser) to allow the calling service to access her camera and | ||||
microphone any time it wants. The browser checks this setting when | ||||
the camera and microphone requests are made and thus allows them. | ||||
In the current W3C API, once some streams have been added, Alice's | ||||
browser + JS generates a signaling message. The format of this data is | ||||
currently undefined. It may be a complete message as defined by ROAP | ||||
[REF] or may be assembled piecemeal by the JS. In either case, it | ||||
will contain: | ||||
o Media channel information | ||||
o ICE candidates | ||||
o A fingerprint attribute binding the message to Alice's public key | ||||
[RFC5763] | ||||
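As a rough illustration of the fingerprint item in the list above, the sketch below hashes a DER-encoded certificate and formats the result as an SDP fingerprint attribute of the kind used with DTLS-SRTP. The choice of SHA-256 and the function name are assumptions for illustration; the signaling format itself is still undefined.

```python
import hashlib


def fingerprint_attribute(cert_der: bytes) -> str:
    """Format a certificate hash as an SDP-style fingerprint attribute.

    cert_der is assumed to be the DER encoding of Alice's certificate.
    The digest is written as upper-case hex bytes separated by colons.
    """
    digest = hashlib.sha256(cert_der).digest()
    return "a=fingerprint:sha-256 " + ":".join(f"{b:02X}" for b in digest)
```

The peer later compares this value against the certificate actually presented in the DTLS handshake, so a signaling message carrying it is bound to Alice's key.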
Prior to sending out the signaling message, the PeerConnection code | ||||
contacts the identity service and obtains an assertion binding | ||||
Alice's identity to her fingerprint. The exact details depend on the | ||||
identity service (though as discussed in Appendix A.3.6.4 I believe | ||||
PeerConnection can be agnostic to them), but for now it's easiest to | ||||
think of as a BrowserID assertion. | ||||
This message is sent to the signaling server, e.g., by XMLHttpRequest | ||||
[REF] or by WebSockets [I-D.ietf-hybi-thewebsocketprotocol]. The | ||||
signaling server processes the message from Alice's browser, | ||||
determines that this is a call to Bob and sends a signaling message | ||||
to Bob's browser (again, the format is currently undefined). The JS | ||||
on Bob's browser processes it, and alerts Bob to the incoming call | ||||
and to Alice's identity. In this case, Alice has provided an | ||||
identity assertion and so Bob's browser contacts Alice's identity | ||||
provider (again, this is done in a generic way so the browser has no | ||||
specific knowledge of the IDP) to verify the assertion. This allows | ||||
the browser to display a trusted element indicating that a call is | ||||
coming in from Alice. If Alice is in Bob's address book, then this | ||||
interface might also include her real name, a picture, etc. The | ||||
calling site will also provide some user interface element (e.g., a | ||||
button) to allow Bob to answer the call, though this is most likely | ||||
not part of the trusted UI. | ||||
If Bob agrees [I am ignoring early media for now], a PeerConnection | ||||
is instantiated with the message from Alice's side. Then, a similar | ||||
process occurs as on Alice's browser: Bob's browser verifies that | ||||
the calling service is approved, the media streams are created, and a | ||||
return signaling message containing media information, ICE | ||||
candidates, and a fingerprint is sent back to Alice via the signaling | ||||
service. If Bob has a relationship with an IDP, the message will | ||||
also come with an identity assertion. | ||||
At this point, Alice and Bob each know that the other party wants to | ||||
have a secure call with them. Based purely on the interface provided | ||||
by the signaling server, they know that the signaling server claims | ||||
that the call is from Alice to Bob. Because the far end sent an | ||||
identity assertion along with their message, they know that this is | ||||
verifiable from the IDP as well. Of course, the call works perfectly | ||||
well if either Alice or Bob doesn't have a relationship with an IDP; | ||||
they just get a lower level of assurance. Moreover, Alice might wish | ||||
to make an anonymous call through an anonymous calling site, in which | ||||
case she would of course just not provide any identity assertion and | ||||
the calling site would mask her identity from Bob. | ||||
A.2.2. Media Consent Verification | ||||
As described in Section 4.2, consent to receive media must be | ||||
verified. This proposal specifies that this verification be | ||||
performed via ICE. Thus, Alice and Bob perform ICE checks with each | ||||
other. At the completion of these checks, they are ready to send | ||||
non-ICE data. | ||||
At this point, Alice knows that (a) Bob (assuming he is verified via | ||||
his IDP) or someone else who the signaling service is claiming is Bob | ||||
is willing to exchange traffic with her and (b) that either Bob is at | ||||
the IP address which she has verified via ICE or there is an attacker | ||||
who is on-path to that IP address detouring the traffic. Note that | ||||
it is not possible for an attacker who is on-path but not attached to | ||||
the signaling service to spoof these checks because they do not have | ||||
the ICE credentials. Bob's security guarantees with respect to Alice | ||||
are the converse of this. | ||||
A.2.3. DTLS Handshake | ||||
Once the ICE checks have completed [more specifically, once some ICE | ||||
checks have completed], Alice and Bob can set up a secure channel. | ||||
This is performed via DTLS [RFC4347] (for the data channel) and DTLS- | ||||
SRTP [RFC5763] for the media channel. Specifically, Alice and Bob | ||||
perform a DTLS handshake on every channel which has been established | ||||
by ICE. The total number of channels depends on the amount of | ||||
muxing; in the most likely case we are using both RTP/RTCP mux and | ||||
muxing multiple media streams on the same channel, in which case | ||||
there is only one DTLS handshake. Once the DTLS handshake has | ||||
completed, the keys are extracted and used to key SRTP for the media | ||||
channels. | ||||
At this point, Alice and Bob know that they share a set of secure | ||||
data and/or media channels with keys which are not known to any | ||||
third-party attacker. If Alice and Bob authenticated via their IDPs, | ||||
then they also know that the signaling service is not attacking them. | ||||
Even if they do not use an IDP, as long as they have minimal trust in | ||||
the signaling service not to perform a man-in-the-middle attack, they | ||||
know that their communications are secure against the signaling | ||||
service as well. | ||||
A.2.4. Communications and Consent Freshness | ||||
From a security perspective, everything from here on in is a little | ||||
anticlimactic: Alice and Bob exchange data protected by the keys | ||||
negotiated by DTLS. Because of the security guarantees discussed in | ||||
the previous sections, they know that the communications are | ||||
encrypted and authenticated. | ||||
The one remaining security property we need to establish is "consent | ||||
freshness", i.e., allowing Alice to verify that Bob is still prepared | ||||
to receive her communications. ICE specifies periodic STUN | ||||
keepalives but only if media is not flowing. Because the consent | ||||
issue is more difficult here, we require RTCWeb implementations to | ||||
periodically send keepalives. If a keepalive fails and no new ICE | ||||
channels can be established, then the session is terminated. | ||||
A.3. Detailed Technical Description | ||||
A.3.1. Origin and Web Security Issues | ||||
The basic unit of permissions for RTC-Web is the origin | ||||
[I-D.abarth-origin]. Because the security of the origin depends on | ||||
being able to authenticate content from that origin, the origin can | ||||
only be securely established if data is transferred over HTTPS. | ||||
Thus, clients MUST treat HTTP and HTTPS origins as different | ||||
permissions domains and SHOULD NOT permit access to any RTC-Web | ||||
functionality from scripts fetched over non-secure (HTTP) origins. | ||||
If an HTTPS origin contains mixed active content (regardless of | ||||
whether it is present on the specific page attempting to access RTC- | ||||
Web functionality), any access MUST be treated as if it came from the | ||||
HTTP origin. For instance, if https://www.example.com/example.html | ||||
loads https://www.example.com/example.js and | ||||
http://www.example.org/jquery.js, any attempt by example.js to access | ||||
RTCWeb functionality MUST be treated as if it came from | ||||
http://www.example.com/. Note that many browsers already track mixed | ||||
content and either forbid it by default or display a warning. | ||||
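The mixed content rule above can be sketched as a small function that computes the origin under which RTC-Web access would be evaluated. This is only an illustration: the function name is hypothetical, the list of active content URLs is assumed to be tracked by the browser, and real origin handling (default ports, non-network schemes) is more involved.

```python
from typing import Iterable
from urllib.parse import urlsplit


def effective_origin(page_url: str, active_content_urls: Iterable[str]) -> str:
    """Return the origin to use for RTC-Web permission checks.

    An HTTPS page that has loaded any active content over HTTP is treated
    as if it had been fetched from the corresponding HTTP origin.
    """
    page = urlsplit(page_url)
    mixed = page.scheme == "https" and any(
        urlsplit(url).scheme == "http" for url in active_content_urls
    )
    scheme = "http" if mixed else page.scheme
    return f"{scheme}://{page.netloc}/"
```

Applied to the example in the text, a page at https://www.example.com/example.html that loads http://www.example.org/jquery.js is evaluated as http://www.example.com/.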
A.3.2. Device Permissions Model | ||||
Implementations MUST obtain explicit user consent prior to providing | ||||
access to the camera and/or microphone. Implementations MUST at | ||||
minimum support the following two permissions models: | ||||
o Requests for one-time camera/microphone access. | ||||
o Requests for permanent access. | ||||
In addition, they SHOULD support requests for access to a single | ||||
communicating peer. E.g., "Call customerservice@ford.com". Browsers | ||||
servicing such requests SHOULD clearly indicate that identity to the | ||||
user when asking for permission. | ||||
API Requirement: The API MUST provide a mechanism for the requesting | ||||
JS to indicate which of these forms of permissions it is | ||||
requesting. This allows the client to know what sort of user | ||||
interface experience to provide. In particular, browsers might | ||||
display a non-invasive door hanger ("some features of this site | ||||
may not work...") when asking for long-term permissions but a more | ||||
invasive UI ("here is your own video") for single-call | ||||
permissions. The API MAY grant weaker permissions than the JS | ||||
asked for if the user chooses to authorize only those permissions, | ||||
but if it intends to grant stronger ones, it SHOULD display the | ||||
appropriate UI for those permissions. | ||||
API Requirement: The API MUST provide a mechanism for the requesting | ||||
JS to relinquish the ability to see or modify the media (e.g., via | ||||
MediaStream.record()). Combined with secure authentication of the | ||||
communicating peer, this allows a user to be sure that the calling | ||||
site is not accessing or modifying their conversation. | ||||
UI Requirement: The UI MUST clearly indicate when the user's camera | ||||
and microphone are in use. This indication MUST NOT be | ||||
suppressable by the JS and MUST clearly indicate how to terminate | ||||
a call, and provide a UI means to immediately stop camera/ | ||||
microphone input without the JS being able to prevent it. | ||||
UI Requirement: If the UI indication of camera/microphone use is | ||||
displayed in the browser such that minimizing the browser window | ||||
would hide the indication, or the JS creating an overlapping | ||||
window would hide the indication, then the browser SHOULD stop | ||||
camera and microphone input. | ||||
Clients MAY permit the formation of data channels without any direct | ||||
user approval. Because sites can always tunnel data through the | ||||
server, further restrictions on the data channel do not provide any | ||||
additional security (though see Appendix A.3.3 for a related issue). | ||||
Implementations which support some form of direct user authentication | ||||
SHOULD also provide a policy by which a user can authorize calls only | ||||
to specific counterparties. Specifically, the implementation SHOULD | ||||
provide the following interfaces/controls: | ||||
o Allow future calls to this verified user. | ||||
o Allow future calls to any verified user who is in my system | ||||
address book (this only works with address book integration, of | ||||
course). | ||||
Implementations SHOULD also provide a different user interface | ||||
indication when calls are in progress to users whose identities are | ||||
directly verifiable. Appendix A.3.5 provides more on this. | ||||
A.3.3. Communications Consent | ||||
Browser client implementations of RTC-Web MUST implement ICE. Server | ||||
gateway implementations which operate only at public IP addresses may | ||||
implement ICE-Lite. | ||||
Browser implementations MUST verify reachability via ICE prior to | ||||
sending any non-ICE packets to a given destination. Implementations | ||||
MUST NOT provide the ICE transaction ID to JavaScript. [Note: this | ||||
document takes no position on the split between ICE in JS and ICE in | ||||
the browser. The above text is written the way it is for editorial | ||||
convenience and will be modified appropriately if the WG decides on | ||||
ICE in the JS.] | ||||
Implementations MUST send keepalives no less frequently than every 30 | ||||
seconds regardless of whether traffic is flowing or not. If a | ||||
keepalive fails then the implementation MUST either attempt to find a | ||||
new valid path via ICE or terminate media for that ICE component. | ||||
Note that ICE keepalives ([RFC5245], Section 10) use STUN Binding | ||||
Indications, which are one-way and therefore not sufficient. We will | ||||
need to define a new mechanism for this. [OPEN ISSUE: what to do | ||||
here.] | ||||
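The timing and failure rules above might be sketched as follows. This covers only the 30 second bound and the two permitted reactions to a failed keepalive; the request/response mechanism that would actually carry the keepalive is the open issue noted above and is not modeled. All names are hypothetical.

```python
MAX_KEEPALIVE_INTERVAL_SECONDS = 30.0  # bound taken from the requirement above


def keepalive_deadline(last_sent: float) -> float:
    """Latest time at which the next keepalive must be sent.

    Applies whether or not media traffic is flowing.
    """
    return last_sent + MAX_KEEPALIVE_INTERVAL_SECONDS


def after_keepalive_failure(new_path_found: bool) -> str:
    """Either continue on a newly validated ICE path or stop media."""
    return "use-new-path" if new_path_found else "terminate-media"
```

An implementation would typically schedule the next keepalive somewhat before the deadline so that jitter does not push it past 30 seconds.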
A.3.4. IP Location Privacy | ||||
As mentioned in Section 4.2.4 above, a side effect of the default ICE | ||||
behavior is that the peer learns one's IP address, which leaks large | ||||
amounts of location information, especially for mobile devices. This | ||||
has negative privacy consequences in some circumstances. The | ||||
following two API requirements are intended to mitigate this issue: | ||||
API Requirement: The API MUST provide a mechanism to suppress ICE | ||||
negotiation (though perhaps to allow candidate gathering) until | ||||
the user has decided to answer the call [note: determining when | ||||
the call has been answered is a question for the JS.] This | ||||
enables a user to prevent a peer from learning their IP address if | ||||
they elect not to answer a call. | ||||
API Requirement: The API MUST provide a mechanism for the calling | ||||
application to indicate that only TURN candidates are to be used. | ||||
This prevents the peer from learning one's IP address at all. | ||||
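The TURN-only requirement can be illustrated by filtering gathered candidates down to relayed ones before they are signaled. The sketch assumes candidates are available as ICE candidate attribute strings containing a "typ" token; the function names and the example addresses are hypothetical.

```python
from typing import Iterable, List


def candidate_type(candidate_line: str) -> str:
    """Return the token following 'typ' (host, srflx, prflx, or relay)."""
    tokens = candidate_line.split()
    return tokens[tokens.index("typ") + 1]


def relay_only(candidates: Iterable[str]) -> List[str]:
    """Keep only relayed (TURN) candidates so no local or reflexive
    address is disclosed to the peer."""
    return [c for c in candidates if candidate_type(c) == "relay"]
```

With only relayed candidates offered, the peer sees the TURN server's address rather than the user's host or server reflexive address.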
A.3.5. Communications Security | ||||
Implementations MUST implement DTLS and DTLS-SRTP. All data channels | ||||
MUST be secured via DTLS. DTLS-SRTP MUST be offered for every media | ||||
channel and MUST be the default; i.e., if an implementation receives | ||||
an offer for DTLS-SRTP and SDES and/or plain RTP, DTLS-SRTP MUST be | ||||
selected. | ||||
[OPEN ISSUE: What should the settings be here? MUST?] | ||||
Implementations MAY support SDES and RTP for media traffic for | ||||
backward compatibility purposes. | ||||
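The selection rule above might look like the following sketch. The mechanism labels are illustrative strings rather than protocol tokens, and preferring SDES over plain RTP when DTLS-SRTP is absent is an assumption of this sketch, not something the text requires.

```python
from typing import Iterable, Optional


def select_media_security(offered: Iterable[str],
                          allow_legacy: bool = False) -> Optional[str]:
    """Pick the keying mechanism for a media channel.

    DTLS-SRTP always wins when offered. SDES and plain RTP are considered
    only if the implementation opted into legacy support.
    """
    offered_set = set(offered)
    if "dtls-srtp" in offered_set:
        return "dtls-srtp"
    if allow_legacy:
        # Assumed ordering: keyed SRTP via SDES before unprotected RTP.
        for mechanism in ("sdes", "rtp"):
            if mechanism in offered_set:
                return mechanism
    return None  # nothing acceptable was offered
```

Returning None for an offer without DTLS-SRTP, when legacy support is off, corresponds to rejecting the media channel.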
API Requirement: The API MUST provide a mechanism to indicate that a | ||||
fresh DTLS key pair is to be generated for a specific call. This | ||||
is intended to allow for unlinkability. Note that there are also | ||||
settings where it is attractive to use the same keying material | ||||
repeatedly, especially those with key continuity-based | ||||
authentication. | ||||
API Requirement: When DTLS-SRTP is used, the API MUST NOT permit the | ||||
JS to obtain the negotiated keying material. This requirement | ||||
preserves the end-to-end security of the media. | ||||
UI Requirements: A user-oriented client MUST provide an
"inspector" interface which allows the user to determine the
security characteristics of the media [largely derived from
[I-D.kaufman-rtcweb-security-ui]].
The following properties SHOULD be displayed "up-front" in the | ||||
browser chrome, i.e., without requiring the user to ask for them: | ||||
* A client MUST provide a user interface through which a user may | ||||
determine the security characteristics for currently-displayed | ||||
audio and video stream(s).
* A client MUST provide a user interface through which a user may | ||||
determine the security characteristics for transmissions of | ||||
their microphone audio and camera video. | ||||
* The "security characteristics" MUST include an indication as to | ||||
whether or not the transmission is cryptographically protected | ||||
and whether that protection is based on a key that was | ||||
delivered out-of-band (from a server) or was generated as a | ||||
result of a pairwise negotiation. | ||||
* If the far endpoint was directly verified (see Appendix A.3.6),
the "security characteristics" MUST include the verified
information.
The following properties are more likely to require some "drill- | ||||
down" from the user: | ||||
* If the transmission is cryptographically protected, the
algorithms in use (for example, "AES-CBC" or "Null Cipher").
* If the transmission is cryptographically protected, the | ||||
"security characteristics" MUST indicate whether PFS is | ||||
provided. | ||||
* If the transmission is cryptographically protected via an end- | ||||
to-end mechanism the "security characteristics" MUST include | ||||
some mechanism to allow an out-of-band verification of the | ||||
peer, such as a certificate fingerprint or an SAS. | ||||
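One way to picture the split between "up-front" and "drill-down" properties is as a single record with two views. The sketch below is purely illustrative; the class name, field names, and the `"out-of-band"` / `"pairwise"` strings are hypothetical and are not drawn from any existing API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SecurityCharacteristics:
    """Hypothetical record backing the "inspector" interface."""
    protected: bool                          # cryptographically protected?
    key_origin: Optional[str] = None         # "out-of-band" or "pairwise"
    verified_identity: Optional[str] = None  # set if far end was verified
    algorithms: Optional[str] = None         # e.g. "AES-CBC"
    pfs: Optional[bool] = None               # is PFS provided?
    peer_fingerprint: Optional[str] = None   # for out-of-band verification

    def up_front(self) -> Dict[str, object]:
        # Properties shown in the browser chrome without user action.
        info: Dict[str, object] = {"protected": self.protected}
        if self.protected:
            info["key_origin"] = self.key_origin
        if self.verified_identity is not None:
            info["verified_identity"] = self.verified_identity
        return info

    def drill_down(self) -> Dict[str, object]:
        # Properties that are only meaningful for protected transmissions.
        if not self.protected:
            return {}
        return {
            "algorithms": self.algorithms,
            "pfs": self.pfs,
            "peer_fingerprint": self.peer_fingerprint,
        }
```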
A.3.6. Web-Based Peer Authentication | ||||
A.3.6.1. Generic Concepts | ||||
In a number of cases, it is desirable for the endpoint (i.e., the | ||||
browser) to be able to directly identify the endpoint on the other
side without trusting only the signaling service to which they are | ||||
connected. For instance, users may be making a call via a federated | ||||
system where they wish to get direct authentication of the other | ||||
side. Alternately, they may be making a call on a site which they | ||||
minimally trust (such as a poker site) but to someone who has an | ||||
identity on a site they do trust (such as a social network.) | ||||
Recently, a number of Web-based identity technologies (OAuth,
BrowserID, Facebook Connect, etc.) have been developed. While the
details vary, what these technologies share is that they have a
Web-based (i.e., HTTP/HTTPS) identity provider which attests to your
identity. For instance, if I have an account at example.org, I could | ||||
use the example.org identity provider to prove to others that I was | ||||
alice@example.org. The development of these technologies allows us | ||||
to separate calling from identity provision: I could call you on | ||||
Poker Galaxy but identify myself as alice@example.org. | ||||
Whatever the underlying technology, the general principle is that the | ||||
party which is being authenticated is NOT the signaling site but | ||||
rather the user (and their browser). Similarly, the relying party is | ||||
the browser and not the signaling site. This means that the | ||||
PeerConnection API MUST arrange to talk directly to the identity | ||||
provider in a way that cannot be impersonated by the calling site. | ||||
The following sections provide two examples of this. | ||||
A.3.6.2. BrowserID | ||||
BrowserID [https://browserid.org/] is a technology which allows a | ||||
user with a verified email address to generate an assertion | ||||
(authenticated by their identity provider) attesting to their | ||||
identity (phrased as an email address). The way that this is used in | ||||
practice is that the relying party embeds JS in their site which | ||||
talks to the BrowserID code (either hosted on a trusted intermediary | ||||
or embedded in the browser). That code generates the assertion which | ||||
is passed back to the relying party for verification. The assertion | ||||
can be verified directly or with a Web service provided by the | ||||
identity provider. It's relatively easy to extend this functionality | ||||
to authenticate RTC-Web calls, as shown below. | ||||
+----------------------+                     +----------------------+
|                      |                     |                      |
|    Alice's Browser   |                     |     Bob's Browser    |
|                      | OFFER ------------> |                      |
|   Calling JS Code    |                     |    Calling JS Code   |
|          ^           |                     |          ^           |
|          |           |                     |          |           |
|          v           |                     |          v           |
|    PeerConnection    |                     |    PeerConnection    |
|       |      ^       |                     |       |      ^       |
| Finger|      |Signed |                     |Signed |      |       |
| print |      |Finger |                     |Finger |      |"Alice"|
|       |      |print  |                     |print  |      |       |
|       v      |       |                     |       v      |       |
|   +--------------+   |                     |   +---------------+  |
|   |  BrowserID   |   |                     |   |  BrowserID    |  |
|   |  Signer      |   |                     |   |  Verifier     |  |
|   +--------------+   |                     |   +---------------+  |
|           ^          |                     |          ^           |
+-----------|----------+                     +----------|-----------+
            |                                           |
            | Get certificate                           |
            v                                           | Check
+----------------------+                                | certificate
|                      |                                |
|       Identity       |/-------------------------------+
|       Provider       |
|                      |
+----------------------+
The way this mechanism works is as follows. On Alice's side, Alice | ||||
goes to initiate a call. | ||||
1. The calling JS instantiates a PeerConnection and tells it that it | ||||
is interested in having it authenticated via BrowserID. | ||||
2. The PeerConnection instantiates the BrowserID signer in an | ||||
invisible IFRAME. The IFRAME is tagged with an origin that | ||||
indicates that it was generated by the PeerConnection (this | ||||
prevents ordinary JS from implementing it). The BrowserID signer | ||||
is provided with Alice's fingerprint. Note that the IFRAME here | ||||
does not render any UI. It is being used solely to allow the | ||||
browser to load the BrowserID signer in isolation, especially | ||||
from the calling site. | ||||
3. The BrowserID signer contacts Alice's identity provider, | ||||
authenticating as Alice (likely via a cookie). | ||||
4. The identity provider returns a short-term certificate attesting | ||||
to Alice's identity and her short-term public key. | ||||
5. The BrowserID code signs the fingerprint and returns the signed
assertion + certificate to the PeerConnection. [Note: there are | ||||
well-understood Web mechanisms for this that I am excluding here | ||||
for simplicity.] | ||||
6. The PeerConnection returns the signed information to the calling | ||||
JS code. | ||||
7. The signed assertion gets sent over the wire to Bob's browser | ||||
(via the signaling service) as part of the call setup. | ||||
Obviously, the format of the signed assertion varies depending on | ||||
what signaling style the WG ultimately adopts. However, for | ||||
concreteness, if something like ROAP were adopted, then the entire | ||||
message might look like: | ||||
{
  "messageType":"OFFER",
  "callerSessionId":"13456789ABCDEF",
  "seq": 1,
  "sdp":"
    v=0\n
    o=- 2890844526 2890842807 IN IP4 192.0.2.1\n
    s= \n
    c=IN IP4 192.0.2.1\n
    t=2873397496 2873404696\n
    m=audio 49170 RTP/AVP 0\n
    a=fingerprint: SHA-1 \
    4A:AD:B9:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB\n",
  "identity":{
    "identityType":"browserid",
    "assertion": {
      "digest":"<hash of fingerprint and session IDs>",
      "audience": "[TBD]",
      "valid-until": 1308859352261
    }, // signed using user's key
    "certificate": {
      "email": "rescorla@gmail.com",
      "public-key": "<ekrs-public-key>",
      "valid-until": 1308860561861
    } // certificate is signed by gmail.com
  }
}
Note that we only expect to sign the fingerprint values and the | ||||
session IDs, in order to allow the JS or calling service to modify | ||||
the rest of the SDP, while protecting the identity binding. [OPEN | ||||
ISSUE: should we sign seq too?] | ||||
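To make the "sign only the fingerprints and session IDs" idea concrete, here is a minimal Python sketch of how such a digest might be computed. The function names, the use of SHA-256, and the length-prefixed encoding are assumptions for illustration only; this document does not define the digest format.

```python
import hashlib
import re
from typing import List


def extract_fingerprints(sdp: str) -> List[str]:
    """Return the values of all a=fingerprint lines, in order."""
    matches = re.findall(r"^a=fingerprint:(.*)$", sdp, flags=re.MULTILINE)
    return [m.strip() for m in matches]


def identity_digest(sdp: str, session_ids: List[str]) -> str:
    """Hash only the fingerprints and session IDs (SHA-256 is assumed)."""
    h = hashlib.sha256()
    for value in extract_fingerprints(sdp) + list(session_ids):
        data = value.encode("utf-8")
        # Length-prefix each field so field boundaries are unambiguous.
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.hexdigest()
```

Because the rest of the SDP is not an input, the JS or calling service can rewrite, say, the m= line without invalidating the digest, whereas any change to a fingerprint or session ID produces a different value.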
[TODO: Need to talk about Audience a bit.]
On Bob's side, he receives the signed assertion as part of the call | ||||
setup message and a similar procedure happens to verify it. | ||||
1. The calling JS instantiates a PeerConnection and provides it the | ||||
relevant signaling information, including the signed assertion. | ||||
2. The PeerConnection instantiates a BrowserID verifier in an IFRAME | ||||
and provides it the signed assertion. | ||||
3. The BrowserID verifier contacts the identity provider to verify | ||||
the certificate and then uses the key to verify the signed | ||||
fingerprint. | ||||
4. Alice's verified identity is returned to the PeerConnection (it | ||||
already has the fingerprint). | ||||
5. At this point, Bob's browser can display a trusted UI indication | ||||
that Alice is on the other end of the call. | ||||
When Bob returns his answer, he follows the converse procedure, which | ||||
provides Alice with a signed assertion of Bob's identity and keying | ||||
material. | ||||
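The verification steps on the receiving side could be sketched roughly as below. The dictionary layout mirrors the example message earlier in this section; everything else is an assumption: the function name is hypothetical, `valid-until` is treated as milliseconds since the epoch (as the example values suggest), and the actual signature checks (assertion by the user's key, certificate by the identity provider) are delegated to a caller-supplied `check_signatures` callable rather than implemented here.

```python
from typing import Any, Callable, Mapping, Optional


def verify_identity(identity: Mapping[str, Any],
                    expected_digest: str,
                    now_ms: int,
                    check_signatures: Callable[[Mapping[str, Any]], bool],
                    ) -> Optional[str]:
    """Return the verified email address, or None if any check fails."""
    assertion = identity["assertion"]
    certificate = identity["certificate"]
    # Both the assertion and the short-term certificate must be unexpired.
    if now_ms > assertion["valid-until"]:
        return None
    if now_ms > certificate["valid-until"]:
        return None
    # The assertion must cover the fingerprint the PeerConnection received.
    if assertion["digest"] != expected_digest:
        return None
    # Cryptographic checks are out of scope for this sketch.
    if not check_signatures(identity):
        return None
    return certificate["email"]
```

Only when a non-None identity comes back would the browser show the trusted UI indication described in step 5.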
A.3.6.3. OAuth | ||||
While OAuth is not directly designed for user-to-user authentication, | ||||
with a little lateral thinking it can be made to serve. We use the | ||||
following mapping of OAuth concepts to RTC-Web concepts: | ||||
+----------------------+----------------------+
| OAuth                | RTCWeb               |
+----------------------+----------------------+
| Client               | Relying party        |
| Resource owner       | Authenticating party |
| Authorization server | Identity service     |
| Resource server      | Identity service     |
+----------------------+----------------------+
Table 1
The idea here is that Alice wants to authenticate to Bob (i.e., for
Bob to be aware that she is calling). In order to do this, she
allows Bob to see a resource on the identity provider that is bound | ||||
to the call, her identity, and her public key. Then Bob retrieves | ||||
the resource from the identity provider, thus verifying the binding | ||||
between Alice and the call. | ||||
Alice                       IDP                       Bob
---------------------------------------------------------
Call-Id, Fingerprint ------->
<------------------- Auth Code
Auth Code ---------------------------------------------->
                             <----- Get Token + Auth Code
                             Token --------------------->
                             <------------- Get call-info
                             Call-Id, Fingerprint ------>
This is a modified version of a common OAuth flow, but omits the | ||||
redirects required to have the client point the resource owner to the | ||||
IDP, which is acting as both the resource server and the | ||||
authorization server, since Alice already has a handle to the IDP. | ||||
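The message flow above can be walked through with a toy, in-memory identity provider. This is a sketch under stated assumptions, not an OAuth implementation: the class `ToyIdp` and its method names are hypothetical, authentication of Alice and Bob to the IDP is omitted, and the single-use treatment of the auth code is a choice made for the example.

```python
import secrets
from typing import Dict, Tuple

# (user identity, call ID, fingerprint) bound together by the IDP.
CallRecord = Tuple[str, str, str]


class ToyIdp:
    """In-memory stand-in for the IDP (resource + authorization server)."""

    def __init__(self) -> None:
        self._by_code: Dict[str, CallRecord] = {}
        self._by_token: Dict[str, CallRecord] = {}

    def register_call(self, user: str, call_id: str, fingerprint: str) -> str:
        # Alice -> IDP: Call-Id, Fingerprint; IDP -> Alice: Auth Code.
        code = secrets.token_urlsafe(16)
        self._by_code[code] = (user, call_id, fingerprint)
        return code

    def exchange_code(self, code: str) -> str:
        # Bob -> IDP: Get Token + Auth Code; IDP -> Bob: Token.
        record = self._by_code.pop(code)  # KeyError if unknown or reused
        token = secrets.token_urlsafe(16)
        self._by_token[token] = record
        return token

    def call_info(self, token: str) -> CallRecord:
        # Bob -> IDP: Get call-info; IDP -> Bob: the bound call record.
        return self._by_token[token]
```

Bob then compares the returned call ID and fingerprint with what he saw in the call setup; a match ties the call to the identity the IDP reports.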
Above, we have referred to "Alice", but really what we mean is the | ||||
PeerConnection. Specifically, the PeerConnection will instantiate an | ||||
IFRAME with JS from the IDP and will use that IFRAME to communicate | ||||
with the IDP, authenticating with Alice's identity (e.g., cookie). | ||||
Similarly, Bob's PeerConnection instantiates an IFRAME to talk to the | ||||
IDP. | ||||
A.3.6.4. Generic Identity Support | ||||
I believe it's possible to build a generic interface between the | ||||
PeerConnection and any identity sub-module so that the PeerConnection | ||||
just gets pointed to the IDP (which the relying party either trusts | ||||
or not) and JS from the IDP provides the concrete interfaces. | ||||
However, I need to work out the details, so I'm not specifying this | ||||
yet. If it works, the previous two sections will just be examples. | ||||
Author's Address
Eric Rescorla
RTFM, Inc.
2064 Edgewood Drive
Palo Alto, CA 94303
USA
Phone: +1 650 678 2350
Email: ekr@rtfm.com