Network Working Group | M. Westerlund |
Internet-Draft | B. Burman |
Intended status: Standards Track | Ericsson |
Expires: April 24, 2014 | October 21, 2013 |
RTCP Source Description Item SRCNAME to Label Individual Media Sources
draft-westerlund-avtext-rtcp-sdes-srcname-03
This document defines a new Source Description (SDES) item called SRCNAME, which uniquely identifies a single media source, like a camera or a microphone. It also enables identification of the encoding to support when multiple ones are produced. That way anyone receiving the SDES information from a set of interlinked RTP sessions can determine which SSRCs are logically related to the same media source and encoding. In addition the new SDES item is also defined for usage with both a header extension and with the SDP source specific media attribute ("a=ssrc"). Enabling an end-point to receive the SRCNAME with the relevant RTP packets, as well as RTCP, or learn the source bindings through signalling, ahead of receiving RTP and RTCP packets.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 24, 2014.
Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
This specification defines a new RTP/RTCP [RFC3550] Source Description (SDES) item called Source Name (SRCNAME). There exist different use cases, including simulcast and scalable encoding, where a sender transmit multiple RTP packet streams containing full or partial encodings of the same media source. This include multiple independent encodings, where it is desirable to identify the different encodings. These different packet streams needs to be correctly associated with media sources and encodings in an receiver so that they correctly use the packet streams.
The proposed solution provides the RTP packet streams (SSRCs) with identities for both the media source and the specific encoding. The identification is done by creating a RTCP SDES item, SRCNAME, by combing a media source identifier and an encoding identifier separated by a full stop ("."). The SRCNAME can be sent periodically in RTCP SDES packets to enable joiners to receive the information within some time period from when they join. The SRCNAME is also proposed to be sent in an RTP header extension for SDES items [I-D.westerlund-avtext-sdes-hdr-ext] when it is desirable to speed up reception. For example by transmitting the SRCNAME in the first N RTP packets when a new SSRC joins an RTP session. Finally the SRCNAME can be associated with the SSRC in signalling, and source specific attribute is provided for this purpose.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].
This document uses terminology defined in "A Taxonomy of Grouping Semantics and Mechanisms for Real-Time Transport Protocol (RTP) Sources" [I-D.lennox-raiarea-rtp-grouping-taxonomy] . In particular the following definitions:
In RTP Applications where an end-point has more than one Media Source in a particular RTP session there can exist need to provide these media sources with an identifier. One reason is to be able to explicitly track it across any SSRC collisions with resulting SSRC changes. Another reason is when there exist multiple RTP Packet Streams (SSRC) associated with that particular media source. Especially in RTP sessions where multiple media sources are simultaneously transmitted. This document focus on the cases that results in multiple packet streams due to the encoding process.
Simulcast [I-D.westerlund-avtcore-rtp-simulcast] as referred to in this document is the process when communication participant provides a media source in multiple encodings using multiple media encoders with different configurations. These different encoded streams are then simultaneously transmitted using RTP to a receiver or a group of receivers. The receiver(s) need two things; First to determine which of the received packet streams (SSRCs) carries which media source, and thus determining the different media sources and secondly what alternative representations of each media source that exist. This can be accomplished using an identifier to refer to a particular encoding of a media source.
Scalable encoding is performed by some few media encoders, with the prime example being H.264 Scalable Video Codec [RFC6190]. A scalable media codec produces one or more base layers, i.e. an encoded stream, and additionally one or more enhancement layers that are dependent on the base layer as well as selected other enhancement layers, these called dependent streams. The encoded and dependent streams can be sent using multiple RTP packet streams, called multi-stream transmission (MST). Thus explicit information are required for which media source a particular packet stream (SSRC) are containing, independent if it is the encoded or dependent stream. In cases where one uses multiple base layers, the encoding identifier can be used to provide RTP/RTCP level identification of the sub-groups of packet streams that form an independent dependency tree. The detailed dependency information between the encoded streams and dependent streams are present in meta data information objects (SEI messages) that are included inside the RTP payloads for SVC [RFC6190].
By providing media source and encoding identity information on RTP and RTCP level we enable or improve usages that prior has been impractical or sub-optimal:
It is important to note that a particular RTP packet stream's role in a communication application can be quite independent to which media source and the particular encoding the packet stream is. Although the media source and encoding is sufficient information in some use cases, there are other cases where additional information about the current role of packet stream or set of streams are required. Further discussion of this in Section 7.
SRCNAME extends and complement the existing solutions using SDP Media Description grouping [RFC5888], or SSRC grouping within a Media Description in SDP [RFC5576] or implicit or heuristic based mapping of packet streams between or within RTP sessions. SRCNAME enables explicit identity information at RTP/RTCP level in a form that are unique across the whole communication session, usable to create relationships on RTP/RTCP level independent if one or more RTP session is used, independent on how the packet streams are distributed over those RTP sessions and how many media sources an end-point have.
This section defines the SRCNAME identifier format and its usage as RTCP SDES item [RFC3550], registers it as an SDES item possible to use in the RTP header extension for SDES items [I-D.westerlund-avtext-sdes-hdr-ext], and in a source specific SDP attribute [RFC5576] as well.
The SRCNAME MUST fulfill the requirements Section 6.5 in RTP [RFC3550] puts on SDES item values in general. These requirements is that it is a UTF-8 [RFC3629] text string that have a maximum length of 255 bytes.
In addition, there are format restrictions to accommodate the separation of the Media Source ID and the encoding ID part, as described by the following ABNF [RFC5234]:
media-source-id = 1*(%x01-09 / %x0B-0C / %x0E-1F / %x21-2D / %x2F-FF) encoding-id = media-source-id *(%x2E media-source-id) ; Same as RFC 4566 "byte-string" ; except for space and the "." separator srcname-content = media-source %x2E encoding-id
Figure 1: SRCNAME Format ABNF
Note, the format do allow multiple "." separators, but only as part of the encoding ID.
The media source identifier is identifying a media source (as defined by section 2.1.4 of [I-D.lennox-raiarea-rtp-grouping-taxonomy]). Each media source ID MUST be unique when combined with the CNAME. Note that if one intended to byte compare the combination of CNAME and media-source-id then one need to pad the CNAME to full 255 bytes with a common pattern prior to concatenation and comparison.
The encoding-id identifies a particular media encoder (Section 2.1.6 in [I-D.lennox-raiarea-rtp-grouping-taxonomy]) and its set of produced encoded or dependent streams (as defined per section 2.1.7 and 2.1.8 in [I-D.lennox-raiarea-rtp-grouping-taxonomy] respectively). The encoding-id MUST be unique in the context of the CNAME and the media source ID.
By require uniqueness scoped by CNAME we simplify the creation of unique identifiers and reduce the overhead for the inclusion of SRCNAME. As the CNAME defines the scope of a single synchronization context, commonly a single host will be responsible for assigning media source and encoding ID to media sources and their encodings. A common case will be for having a single character media source ID followed by stop and then another single character encoding ID, e.g. "a.2".
Distributing the SRCNAME using a RTCP Source Descriptions (SDES) item are a method that should work with all RTP topologies (assuming that any intermediary node is supporting this item) and existing RTP extensions. Thus, a new SDES item called SRCNAME are defined. That way, anyone receiving the SDES information from a set of interlinked RTP sessions or SSRCs in a single session can determine the SRCNAME associated with each SSRC.
The SDES SRCNAME item follows the same format as the other SDES items defined in RTP [RFC3550]:
0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | SRCNAME=TBA1 | length | source name ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2: SDES SRCNAME Format
The source name field MUST follow the above [sec-format] srcname-content definition.
When using the SRCNAME SDES item, it is of equally importance with CNAME. Thus, SRCNAME is RECOMMENDED to be included in all full compound RTCP packets being sent. It MAY also be included in non-compound packets in cases where the implementation believes that there might be new receivers needing the information.
"Source-Specific Media Attributes in the Session Description Protocol (SDP)" [RFC5576] defines a way of declaring attributes for SSRC in each RTP session in SDP. With a new SDES item, it is possible to use this framework to define how SRCNAME can also be provided in the SDP for each SSRC in each RTP session, thus enabling an end-point to declare and learn the source bindings ahead of receiving RTP/RTCP packets.
Hence, we define a new SDP source attribute called srcname with the following structure:
a=ssrc:<ssrc-id> srcname:<srcname>
The srcname value MUST be identical to the SRCNAME value the media sender will send in the SDES SRCNAME item in the SDES RTCP packets.
Formal ABNF syntax [RFC5234] for the "srcname" attribute:
srcname-attr = "srcname:" srcname srcname = srcname-content attribute =/ srcname-attr ; The definition of "attribute" is in RFC 4566.
Figure 3: SRCNAME Attribute ABNF
When used in SDP, srcname-content MUST use ISO 10646 in UTF-8 encoding, and MUST be independent of any "a=charset".
In cases when timely deliver of the SRCNAME is required, for example when adding a new SSRC to an RTP session, or when new receiver joins a multiparty RTP session, then the SRCNAME can be included in the RTP header extension for SDES items [I-D.westerlund-avtext-sdes-hdr-ext].
The RTP header extension for SDES items [I-D.westerlund-avtext-sdes-hdr-ext] is functioning for any SDES item, but do require new SDES items to register its URN identifier. This is done below in the IANA section [IANA].
The SDP offer/answer procedures for a=ssrc are specified in Source-Specific Media Attributes in the Session Description Protocol (SDP) [RFC5576]. The SDP offer/answer procedures for a=exthdr are specified in A General Mechanism for RTP Header Extensions [RFC5285].
Clients not supporting SRCNAME will not have the possibility to bind different streams to a specific media source, since they will not understand the SRCNAME SDES item or the RTP header extension. However, sending SRCNAME SDES items to a client not supporting it should not impose any problems since all clients should be prepared that new SDES items may be specified according to RTP [RFC3550].
According to the definition of SDP attributes in SDP: Session Description Protocol [RFC4566], if an attribute is received that is not understood, it MUST be ignored by the receiver. So a receiver not supporting the a=ssrc attribute will simply ignore it.
Source-Specific Media Attributes in the Session Description Protocol (SDP) [RFC5576] defines rules of how new source attributes should be registered, which means that a receiver supporting RFC 5576 should be prepared that new source attributes may be defined. This means that a user supporting some of the source attributes should not have any problems when the user receives an SDP with unknown source attributes.
RTP header extension will only be used when successfully negotiated in SDP, which requires support in both sender and receiver.
There exists a proposal for an application token SDES item [I-D.even-mmusic-application-token], who's purpose is to map SSRCs to application purposes or usages of the RTP packet stream. In this section the similarities and differences are discussed to arrive at the conclusion that for a number of cases both will be required to enable powerful applications.
The APPID is flexible in that it allows applications or specific usage of RTP to define how they map the APPID tokens to particular purpose or usages of the streams. This is clearly intended to provide flexibility. For example one APPID tokens can have meanings such as Presentation stream, main talker, video thumbnail number 3, FEC stream for Audio etc. Such roles can be transient in their behavior. For example main talker is a role that moves around in a multiparty communication session based on who is the current speaker, based on voice activity, or a conference management interface. Thus, the APPID token for this role will be moved between different SSRCs. This is in strong contrast with SRCNAME which identifies a particular media source and encoding. That is not expected to move around, other than in cases of SSRC collisions, when they enable tracking across this event. RTP Mixers that perform mixes or switching between input sources, are them selves having conceptual media sources, which will have stable identities.
A case that makes it clear that SRCNAME identification may benefit from having additional role tokens is the case of having a source projection mixer using simulcast from clients to mixers. From the perspective of a receiver, there will be multiple SSRCs visible for a particular media source, but the source projection mixer will select a sub-set of all potential streams to deliver. A given sub-setting is to only deliver one representation of each media source to the receiver. During a multiparty conference where a main speaker is shown larger at the receiver, and other participants are shown smaller, the mixer may due to congestion be forced to switch representation of the main speaker. If the role would be strictly associated with the encoding representation then main speakers video may for example be reduced in display size. If instead it is explicitly indicated using APPID the receiving application would continue to show the main speaker as a larger display area, despite the reduced quality to ensure the user continuous to understand that this is still the main talker.
Where SRCNAME provides stable identification that a SSRC is associated with a media source and particular encoding of that media source, the APPID can function as a complement when needed to provide explicit indication of the current role and intended of application usage of a SSRC.
Following the guidelines in SDP [RFC4566], in The Session Description Protocol (SDP) Grouping Framework [RFC5888], and in RTP [RFC3550], the IANA is requested to register:
The SDES item or header extension SRCNAMEs being close to opaque identifiers could potentially carry additional meanings or function as overt channel. If the SRCNAME would be permanent between sessions, they have the potential for compromising the users’ privacy as they can be tracked between sessions. See Guidelines for Choosing RTP Control Protocol (RTCP) Canonical Names (CNAMEs) [RFC7022] for more discussion.
A third party modification of the srcname labels either in the RTCP SDES items, in the SDP a=ssrc attribute, or in the RTP header extension can cause service disruption. By modifying labels the wrong streams could be associated, with potentially serious effects including media disruptions. If streams that are to be associated aren't associated, then another type of failures occur. To prevent modification, insertion or deletion of the srcname labels, the carrying channel needs to be protected by integrity protection and source authentication. For RTCP and RTP header extension, various solutions exist, such as SRTP [RFC3711], DTLS [RFC6347], or IPsec [RFC4301]. For protecting the SDP, the signalling channel needs to provide protection. For SIP S/MIME [RFC3261] are the ideal, and hop by hop TLS [RFC5246] provides at least some protection, although not perfect. For SDPs retrieved using RTSP DESCRIBE [RFC2326], TLS would be the RECOMMENDED solution.
[RFC2119] | Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. |
[RFC3629] | Yergeau, F., "UTF-8, a transformation format of ISO 10646", STD 63, RFC 3629, November 2003. |
[RFC3550] | Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003. |
[RFC5234] | Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, January 2008. |
[RFC5576] | Lennox, J., Ott, J. and T. Schierl, "Source-Specific Media Attributes in the Session Description Protocol (SDP)", RFC 5576, June 2009. |
[RFC7022] | Begen, A., Perkins, C., Wing, D. and E. Rescorla, "Guidelines for Choosing RTP Control Protocol (RTCP) Canonical Names (CNAMEs)", RFC 7022, September 2013. |