Network Working Group N. Ismail
Internet-Draft Cisco
Intended status: Informational R. Barnes
Expires: January 05, 2015 Mozilla
D. Benham
N. Buckles
Cisco
July 04, 2014
Requirements for Secure RTP Media Switching
draft-ismail-avtcore-media-req-00
Abstract
This draft outlines the requirements for enabling media switches to
form multimedia multi-user conferences without needing to have the
keys used to provide confidentiality and integrity for the media in
the conference.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 05, 2015.
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
Ismail, et al. Expires January 05, 2015 [Page 1]
Internet-Draft Secure Media Switching Requirements July 2014
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Terminology
3. Media Switching/RTFS Architecture
4. RTP header manipulation
5. Requirements
6. Example Scenario
7. Security Considerations
8. IANA Considerations
9. Acknowledgements
10. References
10.1. Normative References
10.2. Informative References
Authors' Addresses
1. Introduction
Modern audio and video conferencing systems include RTP middleboxes
that can often "switch" video and audio streams without mixing them.
When receivers have homogenous coding capabilities and can receive
multiple streams each, such media switchers avoid the need to decode
and re-encode media for the purpose of compositing video or mixing
audio. Instead they can forward encoded media as it was sent by the
transmitter. In this case, a media switching device can behave more
like a media switching RTP Translator
[I-D.ietf-avtcore-rtp-topologies-update], which we will label an RTP
Translator Forwarding Switch (RTFS).
Modern audio and video conferencing systems have also decomposed
switching infrastructure into a) a controller that deals with the
signaling and keeps track of who is in the conference and b) one or
more media switching devices that receive, rewrite headers and
transmit streams to receivers. In scalable systems, media switching
devices may be deployed in many distributed locations to optimize
bandwidth or latency and may be rented on demand from third parties
to meet peak loading needs. Therefore, switching devices may need to
be located in data centers and/or operated by third parties that are
not otherwise trusted with decryption or encryption of audio and
video media.
This draft outlines the requirements for enabling media switching/
RTFS devices to perform only the functions they need to, including
header rewrites and authenticating transmitters and receivers, without
having to acquire or use the keys to provide confidentiality and
integrity for the media in SRTP. This enables deployments where the
privacy of the media can be assured even when a third-party service
is used for switching media.
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
3. Media Switching/RTFS Architecture
In traditional conferencing systems, the conferencing media
infrastructure fully decrypts, decodes and processes RTP media
streams received from one or more transmitters prior to forwarding
the newly encoded (transcoded, composited and/or mixed) and encrypted
RTP media streams to the rest of the receivers. Media Switching Mixers,
which may need to composite or mix media, maintain independent and
persistent SRTP sessions with each endpoint
[I-D.ietf-avtcore-rtp-topologies-update]. More specifically, each
endpoint establishes a point-to-point SRTP session with the conferencing
media infrastructure, which has its own persistent SSRCs, SRTP keys
and SRTP contexts (see the figure below) [RFC7201].
+---+      +--------------------+      +---+
| A |<---- |  Encrypt  Decrypt  |<---- | C |
+---+      |     ^        v     |      +---+
           |  Traditional MCU   |
           |      or Mixer      |
+---+      |     v        v     |      +---+
| B |<---- |  Encrypt  Encrypt  | ---->| D |
+---+      +--------------------+      +---+
Figure 1: Traditional MCU or Mixer
When receivers have homogenous coding capabilities and can receive
multiple streams each, a media switcher can avoid processing media
and (selectively) forward streams while manipulating only the
necessary parts of the RTP headers prior to forwarding to receivers.
The RTP payload part of streams from transmitters is forwarded
without any processing or changes.
In this case, a media switching device can behave more like a
scalable RTP Translator Forwarding Switch (RTFS), maintaining the
SSRCs of the transmitting endpoints rather than generating its own
persistent SSRCs towards every receiving endpoint (see the figure
below). Though this is not the only viable embodiment of a
media switching architecture, this is the most relevant for the
requirements discussed in this document.
+---+      +--------------------+      +---+
| A |<---- |                    |<---- | C |
+---+      |                    |      +---+
           |   RTP Translator   |
           | Forwarding Switch  |
+---+      |                    |      +---+
| B |<---- |                    | ---->| D |
+---+      +--------------------+      +---+
Figure 2: Scalable RTP Translator Forwarding Switch (RTFS)
These media switching/RTFS devices may selectively forward only
certain transmitted stream(s) at any given time, such as the video
and audio stream from the currently active speaker. In this case,
endpoints receive different RTP video streams that are generated by
different transmitters, each with its own SSRC, SRTP key and SRTP
context. All these streams are rendered to the end user as a single
video source representing the most active speaker. Moreover,
endpoints do not receive the same RTP streams all the time. For
example, in the figure below, endpoints A, B and D receive the video
streams from endpoint C, the currently active speaker, which is
actually receiving video from endpoint A, the previous active
speaker. Later, when endpoint B becomes the active speaker, then
endpoints A, C and D will start to receive video from B, which
continues to receive video from endpoint C. In the final time slot,
when Endpoint A becomes the active speaker, the process continues.
Time 1
(Prev Speaker)        ______________      (Active Speaker)
Endpoint A >a>a>a>a>a>|            |>a>a>a>a>a> Endpoint C
           <c<c<c<c<c<|            |<c<c<c<c<c<
                      |    RTP     |
                      | Translator |
                      | Forwarding |
Endpoint B <c<c<c<c<c<|   Switch   |>c>c>c>c>c> Endpoint D
                      |____________|
Time 2                ______________      (Prev Speaker)
Endpoint A <b<b<b<b<b<|            |>b>b>b>b>b> Endpoint C
                      |            |<c<c<c<c<c<
                      |    RTP     |
                      | Translator |
(Active Speaker)      | Forwarding |
Endpoint B <c<c<c<c<c<|   Switch   |>b>b>b>b>b> Endpoint D
           >b>b>b>b>b>|____________|
Time 3
(Active Speaker)      ______________
Endpoint A >a>a>a>a>a>|            |>a>a>a>a>a> Endpoint C
           <b<b<b<b<b<|            |
                      |    RTP     |
                      | Translator |
(Prev Speaker)        | Forwarding |
Endpoint B <a<a<a<a<a<|   Switch   |>a>a>a>a>a> Endpoint D
           >b>b>b>b>b>|____________|
Figure 3: RTFS Media Flow for Active Speakers
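The forwarding decision shown in Figure 3 can be sketched in a few
lines. This is only an illustration under stated assumptions: the
function name forwarding_plan and the use of plain endpoint labels are
hypothetical, and a real switch would also have to handle audio,
multiple simultaneous streams and speaker detection.

```python
from typing import Dict, List


def forwarding_plan(endpoints: List[str], active: str, previous: str) -> Dict[str, str]:
    """Map each receiver to the transmitter whose video it is forwarded."""
    plan = {}
    for receiver in endpoints:
        if receiver == active:
            # The active speaker keeps receiving the previous speaker.
            plan[receiver] = previous
        else:
            # Everyone else receives the currently active speaker.
            plan[receiver] = active
    return plan


ENDPOINTS = ["A", "B", "C", "D"]
time1 = forwarding_plan(ENDPOINTS, active="C", previous="A")
time2 = forwarding_plan(ENDPOINTS, active="B", previous="C")
time3 = forwarding_plan(ENDPOINTS, active="A", previous="B")
```

Under these assumptions, time1, time2 and time3 reproduce the three
time slots of Figure 3; for example, in time2 endpoint B receives the
stream from C while A, C and D receive the stream from B.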
Meeting the objective of scalability and simplicity in this media
switching architecture starts with minimizing/eliminating the media
processing performed by the media switching device, but can also be
extended to cryptography, where crypto processing and crypto state
maintained by the media switching/RTFS devices are minimized. With
the advent of cloud-based services, it is essential to enable
deployments where the privacy of the media can be assured even when a
third-party service is used for conference switching. Enterprises
can then use cloud-based, third-party conferencing services while
preventing those services from accessing or manipulating their media
content. Eliminating the need for media switching/RTFS devices to
decrypt and re-encrypt packets is not merely a scalability
and simplicity requirement, but is also a core security requirement
in cloud-based conferencing services.
4. RTP header manipulation
A media switching/RTFS device might need to modify some of the RTP
header fields to map between different values picked by different
endpoints prior to switching. An example is the RTP payload type
value, which SIP endpoints calling into the conference pick
themselves. Different endpoints are likely to pick different
values for the same media format. The media switching device is
responsible for mapping between such different values. In the case
of RTP payload types, the conference system might be able to send a
SIP re-INVITE to renegotiate the RTP payload type value down to a
shared value, hence avoiding the remapping. This mechanism does not
always work as endpoints can choose to use asymmetric payload types.
Renegotiation also adds complexity and delays to the conferencing
system. Other RTP header fields such as RTP extension headers can
also be modified, deleted or added as they are negotiated separately
with each participant.
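The payload type remapping described above could look roughly like
the following sketch. The function name rewrite_payload_type and the
mapping values (96 and 100) are hypothetical. The sketch only touches
the 7-bit PT field in the second header byte and deliberately ignores
how the rewritten header would be re-authenticated.

```python
def rewrite_payload_type(packet: bytes, pt_map: dict) -> bytes:
    """Return a copy of an RTP packet with its payload type remapped."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header is 12 bytes")
    marker = packet[1] & 0x80        # keep the marker bit as sent
    old_pt = packet[1] & 0x7F        # payload type is the low 7 bits
    new_pt = pt_map.get(old_pt, old_pt)
    return packet[:1] + bytes([marker | new_pt]) + packet[2:]


# Hypothetical mapping: the transmitter picked 96 and the receiver 100
# for the same media format.
PT_MAP_TO_RECEIVER = {96: 100}

packet = bytes([0x80, 0x80 | 96]) + bytes(10) + b"payload"
forwarded = rewrite_payload_type(packet, PT_MAP_TO_RECEIVER)
```

Every byte other than the second header byte, including the sequence
number, SSRC and payload, is forwarded unchanged.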
On the other hand, two of the RTP fields must not be modified by
media switches that do not have access to the media encryption keys.
These two fields are the SSRC and the RTP sequence number. Both
fields are used in the calculation of the SRTP cipher's IV, thus
requiring a total re-encryption upon modification.
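To make the dependency concrete, the sketch below computes the IV as
defined for the AES counter mode transform in RFC 3711, Section 4.1.1:
IV = (k_s * 2^16) XOR (SSRC * 2^64) XOR (i * 2^16), where k_s is the
112-bit session salt and i = 2^16 * ROC + SEQ is the packet index.
The function name srtp_aes_cm_iv and the sample values are
hypothetical; other SRTP transforms derive their IV differently.

```python
def srtp_aes_cm_iv(session_salt: bytes, ssrc: int, roc: int, seq: int) -> bytes:
    """Compute the 128-bit AES-CM IV for one SRTP packet."""
    if len(session_salt) != 14:
        raise ValueError("session salt must be 112 bits (14 bytes)")
    if not (0 <= ssrc < 2**32 and 0 <= roc < 2**32 and 0 <= seq < 2**16):
        raise ValueError("ssrc, roc or seq out of range")
    index = (roc << 16) | seq                   # 48-bit packet index
    k_s = int.from_bytes(session_salt, "big")
    iv = (k_s << 16) ^ (ssrc << 64) ^ (index << 16)
    return iv.to_bytes(16, "big")


SALT = bytes(range(14))                         # hypothetical session salt
iv_original = srtp_aes_cm_iv(SALT, ssrc=0x1234ABCD, roc=0, seq=1000)
iv_new_seq = srtp_aes_cm_iv(SALT, ssrc=0x1234ABCD, roc=0, seq=1001)
iv_new_ssrc = srtp_aes_cm_iv(SALT, ssrc=0x1234ABCE, roc=0, seq=1000)
```

Changing either the sequence number or the SSRC yields a different
IV, so a payload encrypted under the original IV can no longer be
decrypted by a receiver that derives the IV from the rewritten header.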
Below is the set of RTP header fields, along with whether a media
switching/RTFS device might modify them, is unlikely to modify them,
or must not modify them.
o Version (V): This field is unlikely to be modified by the media
switching device
o Padding marker (P): This field is unlikely to be modified by the
media switching device
o Extension (X): The media switching device might modify this field
when it needs to add RTP extension headers where none existed or
if it needs to delete existing RTP extension headers
o Contributing sources count (CC): The media switching device is
unlikely to modify this field
o Marker bit (M): This field is unlikely to be modified by the media
switching device
o Payload Type (PT): The media switching device might modify this
field to map between different RTP payload type values picked by different
endpoints
o Sequence Number (SEQ): The media switching device must not modify
this field
o Timestamp (TS): This field is unlikely to be modified by the media
switching device
o Synchronization Source (SSRC): This field must not be modified by
the media switching device
o Extension Header (ExtHDR): The media switching device is likely to
modify this field either to change its value or to delete it
completely
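The list above can be captured as a small policy table and used to
check a rewrite. The following is a sketch only: the names POLICY,
parse_fixed_header and forbidden_changes are hypothetical, and only
the 12-byte fixed RTP header is parsed, so the extension header entry
is not covered.

```python
import struct

MIGHT, UNLIKELY, MUST_NOT = "might modify", "unlikely to modify", "must not modify"

# One entry per fixed-header field from the list above.
POLICY = {
    "V": UNLIKELY, "P": UNLIKELY, "X": MIGHT, "CC": UNLIKELY,
    "M": UNLIKELY, "PT": MIGHT, "SEQ": MUST_NOT, "TS": UNLIKELY,
    "SSRC": MUST_NOT,
}


def parse_fixed_header(packet: bytes) -> dict:
    """Split the 12-byte fixed RTP header into its fields."""
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "V": b0 >> 6, "P": (b0 >> 5) & 1, "X": (b0 >> 4) & 1, "CC": b0 & 0x0F,
        "M": b1 >> 7, "PT": b1 & 0x7F, "SEQ": seq, "TS": ts, "SSRC": ssrc,
    }


def forbidden_changes(before: bytes, after: bytes) -> list:
    """List the must-not-modify fields that differ between two packets."""
    old, new = parse_fixed_header(before), parse_fixed_header(after)
    return [f for f in old if POLICY[f] == MUST_NOT and old[f] != new[f]]


original = struct.pack("!BBHII", 0x80, 96, 1000, 160, 0x1234ABCD)
pt_rewrite = struct.pack("!BBHII", 0x80, 100, 1000, 160, 0x1234ABCD)
seq_rewrite = struct.pack("!BBHII", 0x80, 96, 1001, 160, 0x1234ABCD)
```

With these sample packets, the payload type rewrite reports no
forbidden changes, while the sequence number rewrite is flagged.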
5. Requirements
The following are the security solution requirements for media
switching/RTFS devices that enable media privacy to be maintained
across participant endpoints.
1. The solution needs to maintain all current SRTP security
properties.
2. The solution needs to extend replay attack protection to cover
cross-participant replay prevention. Packets sent between the media
switching device and participant A cannot be retransmitted to
participant B undetected.
3. Keys used for encryption and authentication of RTP payloads and
other information deemed unsuitable for accessibility by the
media switching device must not be generated by or accessible to
any of the media switching devices.
4. The media switching devices must be capable, if authorized, of
changing any part of an RTP header except for the RTP sequence
number and SSRC. This in turn mandates that the media switching
devices must have access to the keys used for the authentication
of RTP header fields other than SSRC and RTP sequence number when
a proper authorization is in place.
5. The SRTP master keys must not be generated by the media switching
devices.
6. The media switching devices must not be involved in the
distribution of the SRTP master keys to participants nor in the
authentication of the participants' identities for the purpose of
key distribution.
7. The media switching devices must be able to switch an already
active SRTP stream to a new receiver while guaranteeing the
timely synchronization between the SRTP context of the
transmitter and its old and new receivers. Of special interest
is the rollover counter (RoC) part of the SRTP context, due to its
dynamic nature.
It is important to note that media switching devices cannot
change RTP sequence numbers as that would require packet re-
encryption.
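The RoC concern in requirement 7 can be illustrated with the packet
index estimation from RFC 3711 (Section 3.3.1 and Appendix A), where
the receiver guesses the rollover counter from its local RoC and the
highest sequence number seen so far (s_l). The function name
estimate_index and the sample values are hypothetical; the sketch
shows why a receiver that is switched onto an already active stream
needs the transmitter's current RoC.

```python
def estimate_index(roc: int, s_l: int, seq: int) -> tuple:
    """Return (guessed RoC, packet index) for a received sequence number."""
    if s_l < 0x8000:
        # SEQ far ahead of s_l means it belongs to the previous cycle.
        v = (roc - 1) % 2**32 if seq - s_l > 0x8000 else roc
    else:
        # SEQ far behind s_l means the sequence number has wrapped.
        v = (roc + 1) % 2**32 if s_l - 0x8000 > seq else roc
    return v, seq + v * 65536


# Transmitter has already wrapped three times and now sends SEQ 10.
sender_index = 3 * 65536 + 10

# A receiver switched onto the stream late starts with RoC 0.
late_joiner = estimate_index(roc=0, s_l=10, seq=10)

# A receiver whose context is synchronized uses the transmitter's RoC.
synchronized = estimate_index(roc=3, s_l=5, seq=10)

# Normal wrap-around: s_l is near the top and a small SEQ arrives.
wrapped = estimate_index(roc=3, s_l=65530, seq=2)
```

The late joiner computes index 10 instead of 196618, so its IV and
authentication input would not match the transmitter's, whereas the
synchronized receiver recovers the correct index.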
6. Example Scenario
The above requirements (especially 3 and 4) imply that there is a
need for SRTP ciphersuites that allow a split key and split
authentication model. Instead of the current single SRTP master key,
this document requires two independent SRTP master keys. The first
is an end-to-end key that is used for the encryption of the RTP
payload and other information requiring end-to-end encryption. The
end-to-end key is also used for the authentication of the RTP
payload, the RTP sequence number, RoC and SSRC as well as any other
information requiring end-to-end authentication. The second key is
a hop-by-hop key used for the authentication of the RTP packet as well
as any other information requiring hop-by-hop authentication (e.g.,
RTCP packet authentication). The hop-by-hop key can also be used for
encryption of information that the switch is authorized to access and
modify, such as encrypted RTCP packets.
      RTP Packet
-----------------------  ^
| CC M | PT | Seq Num |  |
|     Time Stamp      |  |  Auth( RTP Packet + RoC, HopByHopKey )
|        SSRC         |  |
|        CSRCs        |  |
-----------------------  |  ^
|                     |  |  |  Enc( Payload, End2endKey )
|      Pay Load       |  |  |
|                     |  |  |  Auth( Payload + SSRC + SeqNum + RoC,
-----------------------  V  V        End2endKey )
Figure 4: SRTP Split key-authentication model
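The split authentication in Figure 4 can be sketched with two
independent HMAC computations. Everything specific here is an
assumption made for illustration: the use of HMAC-SHA-256, the key
values, the byte layout of the authenticated data, the choice to have
the hop-by-hop tag cover only header and payload, and the function
names e2e_tag and hbh_tag. Payload encryption is omitted; the payload
is treated as bytes already encrypted under the end-to-end key.

```python
import hashlib
import hmac
import struct


def e2e_tag(e2e_key: bytes, payload: bytes, ssrc: int, seq: int, roc: int) -> bytes:
    # Auth( Payload + SSRC + SeqNum + RoC, End2endKey )
    msg = payload + struct.pack("!IHI", ssrc, seq, roc)
    return hmac.new(e2e_key, msg, hashlib.sha256).digest()


def hbh_tag(hbh_key: bytes, packet: bytes, roc: int) -> bytes:
    # Auth( RTP Packet + RoC, HopByHopKey )
    msg = packet + struct.pack("!I", roc)
    return hmac.new(hbh_key, msg, hashlib.sha256).digest()


E2E_KEY = b"e" * 32    # hypothetical key, never given to the switch
HBH_KEY = b"h" * 32    # hypothetical key shared with the switch

ssrc, seq, roc = 0x1234ABCD, 1000, 0
header = struct.pack("!BBHII", 0x80, 96, seq, 160, ssrc)
payload = b"already-encrypted media bytes"

# Transmitter computes both tags.
sent_e2e = e2e_tag(E2E_KEY, payload, ssrc, seq, roc)
sent_hbh = hbh_tag(HBH_KEY, header + payload, roc)

# The switch verifies the hop-by-hop tag, rewrites the payload type
# and recomputes the hop-by-hop tag using only HBH_KEY.
assert hmac.compare_digest(sent_hbh, hbh_tag(HBH_KEY, header + payload, roc))
rewritten = header[:1] + bytes([100]) + header[2:]
fwd_hbh = hbh_tag(HBH_KEY, rewritten + payload, roc)

# The receiver checks both tags against the rewritten packet.
_, _, rx_seq, _, rx_ssrc = struct.unpack("!BBHII", rewritten)
hop_ok = hmac.compare_digest(fwd_hbh, hbh_tag(HBH_KEY, rewritten + payload, roc))
e2e_ok = hmac.compare_digest(sent_e2e, e2e_tag(E2E_KEY, payload, rx_ssrc, rx_seq, roc))
```

Because the end-to-end tag covers only the payload, SSRC, sequence
number and RoC, it still verifies after the switch rewrites the
payload type, while the hop-by-hop tag has to be recomputed. A single
hop-by-hop key is used on both sides of the switch here only to keep
the sketch short.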
The following figure illustrates how this split-context system could
be used to accomplish the RTP forwarding objectives above. We do not
show the control interactions that would be necessary to distribute
the requisite keys among the participants.
TODO: Flesh out this example case further
Note that media from the endpoints flows in the direction of the arrows.
Time 1
(Prev Speaker)                                (Active Speaker)
C Context Instantiated  ______________  A Context Instantiated
Endpoint A   >a>a>a>a>a>|            |>a>a>a>a>a>   Endpoint C
             <c<c<c<c<c<|            |<c<c<c<c<c<
                        |    RTP     |
                        | Translator |
                        | Forwarding |
Endpoint B   <c<c<c<c<c<|   Switch   |>c>c>c>c>c>   Endpoint D
C Context Instantiated  |____________|  C Context Instantiated
Time 2                                          (Prev Speaker)
C Context Out of Sync                   A Context Out of Sync
B Context Instantiated  ______________  B Context Instantiated
Endpoint A   <b<b<b<b<b<|            |>b>b>b>b>b>   Endpoint C
                        |            |<c<c<c<c<c<
                        |    RTP     |
                        | Translator |
(Active Speaker)        | Forwarding |
Endpoint B   <c<c<c<c<c<|   Switch   |>b>b>b>b>b>   Endpoint D
             >b>b>b>b>b>|____________|  C Context Out of Sync
C Context Up to Date                    B Context Instantiated
Time 3
(Active Speaker)
C Context Out of Sync                   A Context Synchronized
B Context Up to Date    ______________  B Context Out of Sync
Endpoint A   >a>a>a>a>a>|            |>a>a>a>a>a>   Endpoint C
             <b<b<b<b<b<|            |
                        |    RTP     |
                        | Translator |
(Prev Speaker)          | Forwarding |
Endpoint B   <a<a<a<a<a<|   Switch   |>a>a>a>a>a>   Endpoint D
             >b>b>b>b>b>|____________|  C Context Out of Sync
C Context Out of Sync                   B Context Out of Sync
A Context Instantiated                  A Context Instantiated
Figure 5: SRTP context synchronization
7. Security Considerations
This specification is all about new requirements for a system for
securing RTP headers separately from the RTP body.
The requirements discussed above lead to a need for new SRTP cipher
suites that split protection between hop-by-hop and end-to-end
protections. This split may require new models for managing SRTP
keys, e.g., extensions to DTLS-SRTP or EKT. We do not address
requirements for key management in this document, since they would be
accomplished at the control layer, rather than the RTP forwarding
layer.
8. IANA Considerations
This document requires no actions from IANA.
9. Acknowledgements
The authors would like to thank Eric Rescorla and Cullen Jennings for
their input. <GET YOUR NAME HERE - PLEASE SEND COMMENTS>.
10. References
10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
10.2. Informative References
[I-D.ietf-avtcore-rtp-topologies-update]
Westerlund, M. and S. Wenger, "RTP Topologies",
draft-ietf-avtcore-rtp-topologies-update-02 (work in progress),
May 2014.
[I-D.ietf-rtcweb-security-arch]
Rescorla, E., "WebRTC Security Architecture",
draft-ietf-rtcweb-security-arch-09 (work in progress), February 2014.
[I-D.ietf-rtcweb-security]
Rescorla, E., "Security Considerations for WebRTC",
draft-ietf-rtcweb-security-06 (work in progress), January 2014.
[RFC7201] Westerlund, M. and C. Perkins, "Options for Securing RTP
Sessions", RFC 7201, April 2014.
Authors' Addresses
Nermeen Ismail
Cisco
170 W Tasman Dr.
San Jose
US
Email: nermeen@cisco.com
Richard Barnes
Mozilla
331 E Evelyn Ave.
Mountain View
US
Email: rlb@ipv.sx
David Benham
Cisco
170 W Tasman Dr.
San Jose
US
Email: dbenham@cisco.com
Nathan Buckles
Cisco
170 W Tasman Dr.
San Jose
US
Email: nbuckles@cisco.com