MMUSIC                                                         R. Gilman
Internet-Draft                                               Independent
Updates: 5939 (if approved)                                      R. Even
Intended status: Standards Track                        Gesher Erove Ltd
Expires: July 09, 2013                                      F. Andreasen
                                                           Cisco Systems
                                                        January 05, 2013
Session Description Protocol (SDP) Media Capabilities Negotiation
draft-ietf-mmusic-sdp-media-capabilities-17
Session Description Protocol (SDP) capability negotiation provides a general framework for indicating and negotiating capabilities in SDP. The base framework defines only capabilities for negotiating transport protocols and attributes. In this document, we extend the framework by defining media capabilities that can be used to negotiate media types and their associated parameters.
This document updates the IANA Considerations of RFC 5939.
The IETF has been notified of intellectual property rights claimed in regard to some or all of the specification contained in this document. For more information consult the online list of claimed rights.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on July 09, 2013.
Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http:/⁠/⁠trustee.ietf.org/⁠license-⁠info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.
Session Description Protocol (SDP) capability negotiation [RFC5939] provides a general framework for indicating and negotiating capabilities in SDP [RFC4566]. The base framework defines only capabilities for negotiating transport protocols and attributes.
RFC 5939 [RFC5939] lists some of the issues with the current SDP capability negotiation process. An additional real-life case is the ability to offer one media stream (e.g., audio) while listing the capability to support another media stream (e.g., video) without actually offering it concurrently.
In this document, we extend the framework by defining media capabilities that can be used to indicate and negotiate media types and their associated format parameters. This document also adds the ability to declare support for media streams, the use of which can be offered and negotiated later, and the ability to specify session configurations as combinations of media stream configurations. The definitions of new attributes for media capability negotiation are chosen to make the translation from these attributes to "conventional" SDP [RFC4566] media attributes as straightforward as possible in order to simplify implementation. This goal is intended to reduce processing in two ways: each proposed configuration in an offer may be easily translated into a conventional SDP media stream record for processing by the receiver; and the construction of an answer based on a selected proposed configuration is straightforward.
This document updates RFC 5939 [RFC5939] by updating the IANA Considerations. All other extensions defined in this document are considered extensions above and beyond RFC 5939 [RFC5939].
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119] and indicate requirement levels for compliant implementations.
"Actual Configuration": An actual configuration specifies which combinations of SDP session parameters and media stream components can be used in the current offer/answer exchange and with what parameters. Use of an actual configuration does not require any further negotiation in the offer/answer exchange. See RFC 5939 [RFC5939] for further details.
"Base Attributes": Conventional SDP attributes appearing in the base configuration of a media block.
"Base Configuration": The media configuration represented by a media block exclusive of all the capability negotiation attributes defined in this document, the base capability negotiation document [RFC5939], or any other capability negotiation document. In an offer SDP, the base configuration corresponds to the actual configuration as defined in RFC 5939 [RFC5939].
"Conventional Attribute": Any SDP attribute other than those defined by the series of capability negotiation specifications.
"Conventional SDP": An SDP record devoid of capability negotiation attributes.
"Media Format Capability": A media format, typically a media subtype such as PCMU, H263-1998, or T38, expressed in the form of a capability.
"Media Format Parameter Capability": A media format parameter ("a=fmtp" in conventional SDP) expressed in the form of a capability. The media format parameter capability is associated with a media format capability.
"Media Capability": The combined set of capabilities associated with expressing a media format and its relevant parameters (e.g. media format parameters and media specific parameters).
"Potential Configuration": A potential configuration indicates which combinations of capabilities can be used for the session and its associated media stream components. Potential configurations are not ready for use; however, they are offered for potential use in the current offer/answer exchange. They provide an alternative that may be used instead of the actual configuration, subject to negotiation in the current offer/answer exchange. See RFC 5939 [RFC5939] for further details.
"Latent Configuration": A latent configuration indicates which combinations of capabilities could be used in a future negotiation for the session and its associated media stream components. Latent configurations are neither ready for use, nor are they offered for actual or potential use in the current offer/answer exchange. Latent configurations merely inform the other side of possible configurations supported by the entity. Those latent configurations may be used to guide subsequent offer/answer exchanges, but they are not offered for use as part of the current offer/answer exchange.
The SDP capability negotiation [RFC5939] discusses the use of any SDP [RFC4566] attribute (a=) under the attribute capability "acap". The limitations of using acap for fmtp and rtpmap in a potential configuration are described in RFC 5939 [RFC5939]; for example, they can be used only at the media level since they are media-level attributes. RFC 5939 [RFC5939] does not provide a way to exchange media-level capabilities prior to the actual offer of the associated media stream. This section provides an overview of extensions providing an SDP Media Capability negotiation solution offering more robust capabilities negotiation. This is followed by definitions of new SDP attributes for the solution and its associated updated offer/answer procedures [RFC3264].
The capability negotiation extensions requirements considered herein are as follows.
Other possible extensions have been discussed, but have not been treated in this document. They may be considered in the future. Three such extensions are:
The solution consists of new capability attributes corresponding to conventional SDP line types, new parameters for the pcfg, acfg, and the new lcfg attributes extending the base attributes from RFC 5939 [RFC5939], and a use of the pcfg attribute to return capability information in the SDP answer.
Several new attributes are defined in a manner that can be related to the capabilities specified in a media line, and its corresponding rtpmap and fmtp attributes.
New parameters are defined for the potential configuration (pcfg), latent configuration (lcfg), and accepted configuration (acfg) attributes to associate the new attributes with particular configurations.
Special processing rules are defined for capability attribute arguments in order to reduce the need to replicate essentially-identical attribute lines for the base configuration and potential configurations.
This document extends the base protocol extensions to the offer/answer model that allow for capabilities and potential configurations to be included in an offer. Media capabilities constitute capabilities that can be used in potential and latent configurations. Whereas potential configurations constitute alternative offers that may be accepted by the answerer instead of the actual configuration(s) included in the "m=" line(s) and associated parameters, latent configurations merely inform the other side of possible configurations supported by the entity. Those latent configurations may be used to guide subsequent offer/answer exchanges, but they are not part of the current offer/answer exchange.
The mechanism is illustrated by the offer/answer exchange below, where Alice sends an offer to Bob:
Alice                             Bob
|  (1) Offer (SRTP and RTP)        |
|--------------------------------->|
|                                  |
|  (2) Answer (RTP)                |
|<---------------------------------|
|                                  |
Alice's offer includes RTP and SRTP as alternatives. RTP is the default, but SRTP is the preferred one (long lines are folded to fit the margins):
The required base and extensions are provided by the "a=creq" attribute defined in RFC 5939 [RFC5939], with the option tag "med-v0", which indicates that the extension framework defined here must be supported. The base level capability negotiation support ("cap-v0" [RFC5939]) is implied since it is required for the extensions.
The "m=" line indicates that Alice is offering to use plain RTP with PCMU or G.729B. The media line implicitly defines the default transport protocol (RTP/AVP in this case) and the default actual configuration.
The "a=tcap:1" line, specified in the SDP Capability Negotiation base protocol [RFC5939], defines transport protocol capabilities, in this case Secure RTP (SAVP profile) as the first option and RTP (AVP profile) as the second option.
The "a=rmcap:1,4" line defines two G.729 RTP-based media format capabilities, numbered 1 and 4, and their encoding rate. The capabilities are of media type "audio" and subtype G729. Note that the media subtype is explicitly specified here, rather than RTP payload type numbers. This permits the assignment of payload type numbers in the media stream configuration specification. In this example, two G.729 subtype capabilities are defined. This permits the declaration of two sets of formatting parameters for G.729.
The "a=rmcap:2" line defines a G.711 mu-law capability, numbered 2.
The "a=rmcap:5" line defines an audio telephone-event capability, numbered 5.
The "a=mfcap:1" line specifies the fmtp formatting parameters for capability 1 (offerer will not accept G.729 Annex B packets).
The "a=mfcap:4" line specifies the fmtp formatting parameters for capability 4 (offerer will accept G.729 Annex B packets).
The "a=mfcap:5" line specifies the fmtp formatting parameters for capability 5 (the DTMF touchtones 0-9,*,#).
The "a=acap:1" line specified in the base protocol provides the "crypto" attribute which provides the keying material for SRTP using SDP security descriptions.
The "a=pcfg:" attributes provide the potential configurations included in the offer by reference to the media capabilities, transport capabilities, attribute capabilities and specified payload type number mappings. Three explicit alternatives are provided; the lowest-numbered one is the preferred one. The "a=pcfg:1 ..." line specifies media capabilities 4 and 5, i.e., G.729B and DTMF (incl. their associated media format parameters), or media capabilities 1 and 5, i.e., G.729 and DTMF (incl. their associated media format parameters). Furthermore, it specifies transport protocol capability 1 (i.e., the RTP/SAVP profile, secure RTP), and attribute capability 1, i.e., the crypto attribute provided. Lastly, it specifies a payload type number mapping for (RTP-based) media capabilities 1, 4, and 5, thereby permitting the offerer to distinguish between encrypted media and unencrypted media received prior to receipt of the answer.
Use of unique payload type numbers in alternative configurations is not required; codecs such as AMR-WB [RFC4867] have the potential for so many combinations of options that it may be impractical to define unique payload type numbers for all supported combinations. If unique payload type numbers cannot be specified, then the offerer will be obliged to wait for the SDP answer before rendering received media. For SRTP using SDES inline keying [RFC4568], the offerer will still need to receive the answer before being able to decrypt the stream.
The second alternative ("a=pcfg:2 ...") specifies media capability 2, i.e., PCMU, under the RTP/SAVP profile, with the same SRTP key material.
The third alternative ("a=pcfg:3 ...") offers G.729B unsecured; its only purpose in this example is to show a preference for G.729B over PCMU.
Per RFC 5939 [RFC5939], the media line, with any qualifying attributes such as fmtp or rtpmap, is itself considered a valid configuration (the current actual configuration); it has the lowest preference (per RFC 5939 [RFC5939]).
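The preference rule described above (lowest-numbered potential configuration first, actual configuration last) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the specification: the function name `select_configuration`, the dictionary layout of `OFFER`, and the `without_srtp` predicate are all hypothetical and simply summarize the three alternatives of the example.

```python
from typing import Callable, Optional


def select_configuration(
    pcfgs: dict[int, dict],
    supported: Callable[[dict], bool],
) -> Optional[int]:
    """Return the number of the most-preferred potential configuration
    that the answerer supports, or None to fall back to the actual
    configuration carried by the "m=" line."""
    for number in sorted(pcfgs):  # lowest number is most preferred
        if supported(pcfgs[number]):
            return number
    return None  # the actual configuration has the lowest preference


# Hypothetical summary of the three alternatives described above.
OFFER = {
    1: {"transport": "RTP/SAVP", "codecs": ["G729", "telephone-event"]},
    2: {"transport": "RTP/SAVP", "codecs": ["PCMU"]},
    3: {"transport": "RTP/AVP", "codecs": ["G729"]},
}


def without_srtp(config: dict) -> bool:
    """Model an answerer that supports every offered codec but not SRTP."""
    return config["transport"] != "RTP/SAVP"


print(select_configuration(OFFER, without_srtp))  # 3
```

An answerer that supports none of the potential configurations receives `None` from this sketch and would process the "m=" line as conventional SDP.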
Bob receives the SDP offer from Alice. Bob supports G.729B, PCMU, and telephone events over RTP, but not SRTP, hence he accepts the potential configuration 3 for RTP provided by Alice. Bob generates the following answer:
Bob includes the "a=csup" and "a=acfg" attributes in the answer to inform Alice that he can support the med-v0 level of capability negotiations. Note that in this particular example, the answerer supported the capability extensions defined here; had he not, he would simply have processed the offer based on the offered PCMU and G.729 codecs under the RTP/AVP profile only. Consequently, the answer would have omitted the "a=csup" attribute line and chosen one or both of the PCMU and G.729 codecs instead. The answer carries the accepted configuration in the "m=" line along with corresponding rtpmap and/or fmtp parameters, as appropriate.
Note that per the base protocol, after the above, Alice MAY generate a new offer with an actual configuration ("m=" line, etc.) corresponding to the actual configuration referenced in Bob's answer (not shown here).
In this section, we present the new attributes associated with indicating the media capabilities for use by the SDP Capability negotiation. The approach taken is to keep things similar to the existing media capabilities defined by the existing media descriptions ("m=" lines) and the associated "rtpmap" and "fmtp" attributes. We use media subtypes and "media capability numbers" to link the relevant media capability parameters. This permits the capabilities to be defined at the session level and be used for multiple streams, if desired. For RTP-based media formats, payload types are then specified at the media level (see Section 3.3.4.2).
A media capability merely indicates possible support for the media type and media format(s) and parameters in question. In order to actually use a media capability in an offer/answer exchange, it MUST be referenced in a potential configuration.
Media capabilities, i.e. the attributes associated with expressing media capability formats, parameters, etc., can be provided at the session-level and/or the media-level. Media capabilities provided at the session level may be referenced in any pcfg or lcfg attribute at the media level (consistent with the media type), whereas media capabilities provided at the media level may be referenced only by the pcfg or lcfg attribute within that media stream. In either case, the scope of the <med-cap-num> is the entire session description. This enables each media capability to be uniquely referenced across the entire session description (e.g. in a potential configuration).
Media subtypes can be expressed as media format capabilities by use of the "a=rmcap" and "a=omcap" attributes. The "a=rmcap" attribute MUST be used for RTP-based media whereas the "a=omcap" attribute MUST be used for non-RTP-based (other) media formats. The two attributes are defined as follows:
a=rmcap:<media-cap-num-list> <encoding-name>/<clock-rate>[/<encoding-parms>]
a=omcap:<media-cap-num-list> <format-name>
where <media-cap-num-list> is a (list of) media capability number(s) used to number a media format capability, the <encoding-name> or <format-name> is the media subtype, e.g., H263-1998, PCMU, or T38, <clock-rate> is the encoding rate, and <encoding-parms> are the media encoding parameters for the media subtype. All media format capabilities in the list are assigned to the same media type/subtype. Each occurrence of the rmcap and omcap attribute MUST use unique values in their <media-cap-num-list>; the media capability numbers are shared between the two attributes and the numbers MUST be unique across the entire SDP session. In short, the rmcap and omcap attributes define media format capabilities and associate them with a media capability number in the same manner as the rtpmap attribute defines them and associates them with a payload type number. Additionally, the attributes allow multiple capability numbers to be defined for the media format in question by specifying a range of media capability numbers. This permits the media format to be associated with different media parameters in different configurations. When a range of capability numbers is specified, the first (leftmost) capability number MUST be strictly smaller than the second (rightmost), i.e., the range increases and covers at least two numbers.
In ABNF [RFC5234], we have:
media-capability-line = rtp-mcap / non-rtp-mcap
rtp-mcap              = "a=rmcap:" media-cap-num-list
                        1*WSP encoding-name "/" clock-rate
                        ["/" encoding-parms]
non-rtp-mcap          = "a=omcap:" media-cap-num-list 1*WSP format-name
media-cap-num-list    = media-cap-num-element
                        *("," media-cap-num-element)
media-cap-num-element = media-cap-num / media-cap-num-range
media-cap-num-range   = media-cap-num "-" media-cap-num
media-cap-num         = NonZeroDigit *9(DIGIT)
encoding-name         = token ; defined in RFC4566
clock-rate            = NonZeroDigit *9(DIGIT)
encoding-parms        = token
format-name           = token ; defined in RFC4566
NonZeroDigit          = %x31-39 ; 1-9
The encoding-name, clock-rate and encoding-parms are as defined to appear in an rtpmap attribute for each media type/subtype. Thus, it is easy to convert an rmcap attribute line into one or more rtpmap attribute lines, once a payload type number is assigned to a media-cap-num (see Section 3.3.5).
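The conversion from an rmcap line to rtpmap lines can be sketched as follows. This is a minimal, non-normative Python illustration: the names `expand_cap_nums` and `rmcap_to_rtpmap` are hypothetical, the regular expression approximates the SDP "token" production by excluding only "/" and whitespace, and the payload type mapping is assumed to have been taken from a "pt=" parameter already.

```python
import re

NUM = r"[1-9]\d{0,9}"                  # media-cap-num / clock-rate, no leading zero
ELEMENT = NUM + r"(?:-" + NUM + r")?"  # a single number or a range
RMCAP_RE = re.compile(
    r"^a=rmcap:(?P<nums>" + ELEMENT + r"(?:," + ELEMENT + r")*)"
    r"[ \t]+(?P<enc>[^/\s]+)/(?P<rate>" + NUM + r")(?:/(?P<parms>\S+))?$"
)


def expand_cap_nums(text: str) -> list[int]:
    """Expand a media-cap-num-list such as "1,4-6" into [1, 4, 5, 6]."""
    nums: list[int] = []
    for element in text.split(","):
        if "-" in element:
            first, last = (int(n) for n in element.split("-"))
            if first >= last:  # the range MUST strictly increase
                raise ValueError(f"invalid range: {element}")
            nums.extend(range(first, last + 1))
        else:
            nums.append(int(element))
    return nums


def rmcap_to_rtpmap(line: str, pt_map: dict[int, int]) -> list[str]:
    """Build one rtpmap line per capability number that has a payload type."""
    match = RMCAP_RE.match(line)
    if match is None:
        raise ValueError(f"not an rmcap line: {line}")
    codec = f"{match['enc']}/{match['rate']}"
    if match["parms"]:
        codec += f"/{match['parms']}"
    return [
        f"a=rtpmap:{pt_map[n]} {codec}"
        for n in expand_cap_nums(match["nums"])
        if n in pt_map  # capabilities without a mapping are skipped
    ]


print(rmcap_to_rtpmap("a=rmcap:1,4 G729/8000", {1: 100, 4: 101}))
# ['a=rtpmap:100 G729/8000', 'a=rtpmap:101 G729/8000']
```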
The format-name is a media format description for non-RTP based media as defined for the <fmt> part of the media description ("m=" line) in SDP [RFC4566]. In simple terms, it is the name of the media format, e.g. "t38". This form can also be used in cases such as BFCP [RFC4583] where the fmt list in the "m=" line is effectively ignored (BFCP uses "*").
The "rmcap" and "omcap" attributes can be provided at the session-level and/or the media-level. There can be more than one rmcap and more than one omcap attribute at both the session and media level (i.e., more than one of each at the session-level and more than one of each in each media description). Media capability numbers cannot include leading zeroes, and each media-cap-num MUST be unique within the entire SDP record; it is used to identify that media capability in potential, latent and actual configurations, and in other attribute lines as explained below. Note that the media-cap-num values are shared between the rmcap and omcap attributes, and hence the uniqueness requirement applies to the union of them. When the media capabilities are used in a potential, latent or actual configuration, the media formats referred by those configurations apply at the media level, irrespective of whether the media capabilities themselves were specified at the session or media level. In other words, the media capability applies to the specific media description associated with the configuration which invokes it.
For example:
This attribute is used to associate media format specific parameters with one or more media format capabilities. The form of the attribute is:

a=mfcap:<media-caps> <list of format parameters>
where <media-caps> permits the list of parameters to be associated with one or more media format capabilities and the format parameters are specific to the type of media format. The mfcap lines map to a single traditional SDP fmtp attribute line (one for each entry in <media-caps>) of the form

a=fmtp:<fmt> <list of format parameters>
where <fmt> is the media format parameter defined in RFC 4566 [RFC4566], as appropriate for the particular media stream. The mfcap attribute MUST be used to encode attributes for media capabilities that would conventionally appear in an fmtp attribute. The existing acap attribute MUST NOT be used to encode fmtp attributes.
The mfcap attribute adheres to SDP [RFC4566] attribute production rules with
media-format-parameter-capability =
    "a=mfcap:" media-cap-num-list 1*WSP fmt-specific-param-list
fmt-specific-param-list = text ; defined in RFC4566
Note that media format parameters can be used with RTP-based and non-RTP based media formats.
The appearance of media subtypes with a large number of formatting options (e.g., AMR-WB [RFC4867]) coupled with the restriction that only a single fmtp attribute can appear per media format, suggests that it is useful to create a combining rule for mfcap parameters which are associated with the same media capability number. Therefore, different mfcap lines MAY include the same media-cap-num in their media-cap-num-list. When a particular media capability is selected for processing, the parameters from each mfcap line which references the particular capability number in its media-cap-num-list are concatenated together via ";", in the order the mfcap attributes appear in the SDP record, to form the equivalent of a single fmtp attribute line. This permits one to define a separate mfcap line for a single parameter and value that is to be applied to each media capability designated in the media-cap-num-list. This provides a compact method to specify multiple combinations of format parameters when using codecs with multiple format options. Note that order-dependent parameters SHOULD be placed in a single mfcap line to avoid possible problems with line rearrangement by a middlebox.
Format parameters are not parsed by SDP; their content is specific to the media type/subtype. When format parameters for a specific media capability are combined from multiple a=mfcap lines which reference that media capability, the format-specific parameters are concatenated together and separated by ";" for construction of the corresponding format attribute (a=fmtp). The resulting format attribute will look something like the following (without line breaks):
a=fmtp:<fmt> <fmt-specific-param-list1>; <fmt-specific-param-list2>; ...
where <fmt> depends on the transport protocol in the manner defined in RFC4566. SDP cannot assess the legality of the resulting parameter list in the "a=fmtp" line; the user must take care to ensure that legal parameter lists are generated.
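The combining rule for mfcap lines can be illustrated with a short Python sketch. It is not a definitive implementation: `build_fmtp` and `expand_cap_nums` are hypothetical helper names, the sample lines in `LINES` are made up for illustration (the parameter names are borrowed from the AMR payload format), and the sketch joins the parameter lists with a bare ";" as the text describes.

```python
from typing import Optional


def expand_cap_nums(text: str) -> list[int]:
    """Expand a media-cap-num-list such as "1,4-6" into [1, 4, 5, 6]."""
    nums: list[int] = []
    for element in text.split(","):
        if "-" in element:
            first, last = (int(n) for n in element.split("-"))
            nums.extend(range(first, last + 1))
        else:
            nums.append(int(element))
    return nums


def build_fmtp(sdp_lines: list[str], cap_num: int, fmt: str) -> Optional[str]:
    """Concatenate, in order of appearance, the parameters of every mfcap
    line that references cap_num, and return the equivalent fmtp line."""
    prefix = "a=mfcap:"
    parts: list[str] = []
    for line in sdp_lines:
        if not line.startswith(prefix):
            continue
        fields = line[len(prefix):].split(None, 1)
        if len(fields) != 2:
            continue  # malformed line: no parameter list
        caps, params = fields
        if cap_num in expand_cap_nums(caps):
            parts.append(params.strip())
    if not parts:
        return None  # no format parameters for this capability
    return f"a=fmtp:{fmt} " + ";".join(parts)


# Hypothetical capability lines, for illustration only.
LINES = [
    "a=rmcap:1-2 AMR/8000/1",
    "a=mfcap:1,2 mode-change-capability=1",
    "a=mfcap:1 octet-align=1",
    "a=mfcap:2 mode-set=0,2,4,7",
]

print(build_fmtp(LINES, 1, "98"))
# a=fmtp:98 mode-change-capability=1;octet-align=1
```

Because the order of concatenation follows the order of the lines in the SDP record, the sketch also shows why order-dependent parameters are safer in a single mfcap line.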
The "mfcap" attribute can be provided at the session-level and the media-level. There can be more than one mfcap attribute at the session or media level. The unique media-cap-num is used to associate the parameters with a media capability.
As a simple example, a G.729 capability is, by default, considered to support comfort noise as defined by Annex B. Capabilities for G.729 with and without comfort noise support may thus be defined by:
Media capability 1 supports G.729 with Annex B, whereas media capability 2 supports G.729 without Annex B.
Example for H.263 video:
Finally, for six format combinations of the Adaptive MultiRate codec:
So that AMR codec #1, when specified in a pcfg attribute within an audio stream block (and assigned payload type number 98) as in
is essentially equivalent to the following
and AMR codec #4 with payload type number 99, depicted by the potential configuration:
is equivalent to the following:
and so on for the other four combinations. SDP could thus convert the media capabilities specifications into one or more alternative media stream specifications, one of which can be chosen for the answer.
Attributes and parameters associated with a media format are typically specified using the "rtpmap" and "fmtp" attributes in SDP, and the similar "rmcap" and "mfcap" attributes in SDP Media Capabilities. Some SDP extensions define other attributes that need to be associated with media formats, for example the "rtcp-fb" attribute defined in RFC 4585 [RFC4585]. Such media-specific attributes, beyond the rtpmap and fmtp attributes, may be associated with media capability numbers via a new media-specific attribute, mscap, of the following form:
a=mscap:<media caps star> <att field> <att value>
where <media caps star> is a (list of) media capability number(s), <att field> is the attribute name, and <att value> is the value field for the named attribute. Note that the media capability numbers refer to media format capabilities specified elsewhere in the SDP ("rmcap" and/or "omcap"). If a range of capability numbers is specified, the first (leftmost) capability number MUST be strictly smaller than the second (rightmost). The media capability numbers may include a wildcard ("*"), which will be used instead of any payload type mappings in the resulting SDP (see, e.g. RFC 4585 [RFC4585] and the example below). In ABNF, we have:
media-specific-capability = "a=mscap:"
                            media-caps-star
                            1*WSP att-field ; from RFC4566
                            1*WSP att-value ; from RFC4566
media-caps-star           = media-cap-star-element
                            *("," media-cap-star-element)
media-cap-star-element    = (media-cap-num [wildcard])
                            / (media-cap-num-range [wildcard])
wildcard                  = "*"
Given an association between a media capability and a payload type number as specified by the pt= parameters in a pcfg attribute line, a mscap line may be translated easily into a conventional SDP attribute line of the form
A resulting attribute that is not a legal SDP attribute as specified by RFC4566 MUST be ignored by the receiver.
If a media capability number (or range) contains a wildcard character at the end, any payload type mapping specified for that media specific capability (or range of capabilities) will use the wildcard character in the resulting SDP instead of the payload type specified in the payload type mapping ("pt" parameter) in the configuration attribute.
A single mscap line may refer to multiple media capabilities by use of a capability number range; this is equivalent to multiple mscap lines, each with the same attribute values (but different media capability numbers), one line per media capability.
Multiple mscap lines may refer to the same media capability, but, unlike the mfcap attribute, no concatenation operation is defined. Hence, multiple mscap lines applied to the same media capability are equivalent to multiple lines of the specified attribute in a conventional media record.
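The translation rules for mscap, including the wildcard and range handling, can be sketched in Python as follows. This is an illustration under assumptions rather than a normative algorithm: `translate_mscap` is a hypothetical name, and the output shape "a=<att field>:<fmt> <att value>" is assumed here, modeled on the way the rtcp-fb attribute of RFC 4585 carries its payload type.

```python
def translate_mscap(line: str, pt_map: dict[int, int]) -> list[str]:
    """Translate one mscap line into conventional attribute lines, one per
    referenced media capability that has a payload type mapping."""
    prefix = "a=mscap:"
    if not line.startswith(prefix):
        raise ValueError(f"not an mscap line: {line}")
    caps, att_field, att_value = line[len(prefix):].split(None, 2)
    result: list[str] = []
    for element in caps.split(","):
        wildcard = element.endswith("*")
        body = element[:-1] if wildcard else element
        if "-" in body:
            first, last = (int(n) for n in body.split("-"))
            if first >= last:  # the range MUST strictly increase
                raise ValueError(f"invalid range: {body}")
            nums = list(range(first, last + 1))
        else:
            nums = [int(body)]
        for num in nums:
            if num not in pt_map:
                continue  # capability not used in this configuration
            # A trailing wildcard replaces the mapped payload type.
            fmt = "*" if wildcard else str(pt_map[num])
            result.append(f"a={att_field}:{fmt} {att_value}")
    return result


print(translate_mscap("a=mscap:1 rtcp-fb ccm fir", {1: 98}))
# ['a=rtcp-fb:98 ccm fir']
print(translate_mscap("a=mscap:1* rtcp-fb ccm tmmbr smaxpr=120", {1: 98}))
# ['a=rtcp-fb:* ccm tmmbr smaxpr=120']
```

Each call handles a single mscap line, so several mscap lines for the same capability simply yield several output lines, with no concatenation.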
Here is an example with the rtcp-fb attribute, modified from an example in RFC 5104 [RFC5104] (with the session-level and audio media omitted). If the offer contains a media block like the following (note the wildcard character),
and if the proposed configuration is chosen, then the equivalent media block would look like
Along with the new attributes for media capabilities, new extension parameters are defined for use in the potential configuration, the actual configuration, and/or the new latent configuration defined in Section 3.3.5.
The media configuration parameter is used to specify the media format(s) and related parameters for a potential, actual, or latent configuration. Adhering to the ABNF for extension-config-list in RFC 5939 [RFC5939] with
ext-cap-name = "m"
ext-cap-list = media-cap-num-list [*(BAR media-cap-num-list)]
we have
media-config-list = ["+"] "m=" media-cap-num-list
                    *(BAR media-cap-num-list)
                    ; BAR is defined in RFC5939
                    ; media-cap-num-list is defined above
Alternative media configurations are separated by a vertical bar ("|"). The alternatives are ordered by preference, most-preferred first. When media capabilities are not included in a potential configuration at the media level, the media type and media format from the associated "m=" line will be used. The use of the plus sign ("+") is described in RFC5939.
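As a rough illustration of the media configuration parameter, the Python sketch below splits an "m=" parameter into its ordered alternatives. The function name `parse_media_config` and its return shape are hypothetical choices made for this example; the sample value is the one used by the first potential configuration discussed earlier.

```python
def parse_media_config(param: str) -> tuple[bool, list[list[int]]]:
    """Parse a media-config-list such as "m=4,5|1,5".

    Returns (mandatory, alternatives): mandatory is True when the
    parameter carries the "+" prefix, and alternatives lists the media
    capability numbers of each alternative, most preferred first."""
    mandatory = param.startswith("+")
    body = param[1:] if mandatory else param
    if not body.startswith("m="):
        raise ValueError(f"not a media configuration parameter: {param}")
    alternatives: list[list[int]] = []
    for alternative in body[2:].split("|"):  # BAR separates alternatives
        nums: list[int] = []
        for element in alternative.split(","):
            if "-" in element:
                first, last = (int(n) for n in element.split("-"))
                if first >= last:  # the range MUST strictly increase
                    raise ValueError(f"invalid range: {element}")
                nums.extend(range(first, last + 1))
            else:
                nums.append(int(element))
        alternatives.append(nums)
    return mandatory, alternatives


print(parse_media_config("m=4,5|1,5"))  # (False, [[4, 5], [1, 5]])
```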
The payload type number mapping parameter is used to specify the payload type number to be associated with each RTP-based media format in a potential, actual, or latent configuration. We define the payload type number mapping parameter, payload-number-config-list, in accordance with the extension-config-list format defined in RFC 5939 [RFC5939]. In ABNF:
payload-number-config-list = ["+"] "pt=" media-map-list
media-map-list             = media-map *("," media-map)
media-map                  = media-cap-num ":" payload-type-number
                             ; media-cap-num is defined in 3.3.1
payload-type-number        = NonZeroDigit *2(DIGIT)
                             ; RTP payload type number
The example in Section 3.3.7 shows how the parameters from the rmcap line are mapped to payload type numbers from the pcfg "pt" parameter. The use of the plus sign ("+") is described in RFC 5939 [RFC5939].
A latent configuration represents a future capability, hence the pt= parameter is not directly meaningful in the lcfg attribute because no actual media session is being offered or accepted; it is permitted in order to tie any payload type number parameters within attributes to the proper media format. A primary example is the case of format parameters for the Redundant Audio Data (RED) payload, which are payload type numbers. Specific payload type numbers used in a latent configuration MAY be interpreted as suggestions to be used in any future offer based on the latent configuration, but they are not binding; the offerer and/or answerer may use any payload type numbers each deems appropriate. The use of explicit payload type numbers for latent configurations can be avoided by use of the parameter substitution rule of Section 3.3.7. Future extensions are also permitted. Note that leading zeroes are not permitted.
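The payload type number mapping parameter can be parsed along the following lines. This Python sketch is only illustrative: `parse_pt_param` is a hypothetical name, the regular expression follows the ABNF above literally (so leading zeroes are rejected), and the upper bound of 127 is an added assumption based on the 7-bit RTP payload type field rather than something the grammar itself enforces.

```python
import re

# media-cap-num ":" payload-type-number, both without leading zeroes.
MEDIA_MAP_RE = re.compile(r"([1-9]\d{0,9}):([1-9]\d{0,2})")


def parse_pt_param(param: str) -> dict[int, int]:
    """Parse a payload-number-config-list such as "pt=1:100,4:101" into a
    mapping from media capability number to RTP payload type number."""
    body = param[1:] if param.startswith("+") else param
    if not body.startswith("pt="):
        raise ValueError(f"not a pt parameter: {param}")
    mapping: dict[int, int] = {}
    for item in body[3:].split(","):
        match = MEDIA_MAP_RE.fullmatch(item)
        if match is None:
            raise ValueError(f"malformed media-map: {item!r}")
        cap_num, payload_type = int(match.group(1)), int(match.group(2))
        if payload_type > 127:  # assumption: 7-bit RTP payload type field
            raise ValueError(f"payload type out of range: {payload_type}")
        mapping[cap_num] = payload_type
    return mapping


print(parse_pt_param("pt=1:100,4:101,5:102"))  # {1: 100, 4: 101, 5: 102}
```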
When a latent configuration is specified (always at the media level), indicating the ability to support an additional media stream, it is necessary to specify the media type (audio, video, etc.) as well as the format and transport type. The media type parameter is defined in ABNF as
media-type = ["+"] "mt=" media ; media defined in RFC4566
At present, the media-type parameter is used only in the latent configuration attribute, so the "+" prefix, which specifies that the entire attribute line is to be ignored if the mt= parameter is not understood, is unnecessary. However, if the media-type parameter is later added to an existing capability attribute such as pcfg, then the "+" would be useful. The media format(s) and transport type(s) are specified using the media configuration parameter ("+m=") defined above, and the transport parameter ("t=") defined in RFC 5939 [RFC5939], respectively.
One of the goals of this work is to permit the exchange of supportable media configurations in addition to those offered or accepted for immediate use. Such configurations are referred to as "latent configurations". For example, a party may offer to establish a session with an audio stream, and, at the same time, announce its ability to support a video stream as part of the same session. The offerer can supply its video capabilities by offering one or more latent video configurations along with the media stream for audio; the responding party may indicate its ability and willingness to support such a video session by returning a corresponding latent configuration.
Latent configurations returned in SDP answers MUST match offered latent configurations (or parameter subsets thereof). Therefore, it is appropriate for the offering party to announce most, if not all, of its capabilities in the initial offer. This choice has been made in order to keep the size of the answer more compact by not requiring acap, rmcap, tcap, etc. lines in the answer.
Latent configurations may be announced by use of the latent configuration attribute, which is defined in a manner very similar to the potential configuration attribute. The latent configuration attribute combines the properties of a media line and a potential configuration. A latent configuration MUST include a media type (mt=) and a transport protocol configuration parameter since the latent configuration is independent of any media line present. In most cases, the media configuration (m=) parameter needs to be present as well (see Section 4 for examples). The lcfg attribute is a media level attribute.
Each media line in an SDP description represents an offered simultaneous media stream, whereas each latent configuration represents an additional stream which may be negotiated in a future offer/answer exchange. Session capability attributes may be used to determine whether a latent configuration may be used to form an offer for an additional simultaneous stream or to reconfigure an existing stream in a subsequent offer/answer exchange.
The latent configuration attribute is of the form:
a=lcfg:<config-number> <latent-cfg-list>
which adheres to the SDP [RFC4566] "attribute" production with att-field and att-value defined as:
   att-field      = "lcfg"
   att-value      = config-number 1*WSP lcfg-cfg-list
   config-number  = NonZeroDigit *9(DIGIT)  ; DIGIT defined in RFC5234
   lcfg-cfg-list  = media-type 1*WSP pot-cfg-list
                                            ; as defined in RFC5939
                                            ; and extended herein
The media-type (mt=) parameter identifies the media type (audio, video, etc.) to be associated with the latent media stream, and MUST be present. The pot-cfg-list MUST contain a transport-protocol-config-list (t=) parameter and a media-config-list (m=) parameter. The pot-cfg-list MUST NOT contain more than one instance of each type of parameter list. As specified in RFC 5939 [RFC5939], the use of the "+" prefix with a parameter indicates that the entire configuration MUST be ignored if the parameter is not understood; otherwise, the parameter itself may be ignored.
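As a rough illustration of the grammar and rules above, the following Python sketch splits an lcfg attribute line into its config-number and parameters and checks the constraints stated in this section. The function name parse_lcfg, the returned data layout, and the sample capability numbers are assumptions made for illustration only; parameter values are kept as opaque strings rather than being parsed according to RFC 5939.

```python
import re

# config-number = NonZeroDigit *9(DIGIT): one to ten digits, no leading zero.
_CONFIG_NUMBER = re.compile(r"[1-9][0-9]{0,9}")


def parse_lcfg(line):
    """Parse 'a=lcfg:<config-number> <latent-cfg-list>' (illustrative only).

    Returns (config_number, params), where params maps each parameter
    name (without the "+" prefix) to a (value, mandatory) tuple.
    Raises ValueError when a rule from this section is violated.
    """
    prefix = "a=lcfg:"
    if not line.startswith(prefix):
        raise ValueError("not an lcfg attribute")
    tokens = line[len(prefix):].split()
    if not tokens or not _CONFIG_NUMBER.fullmatch(tokens[0]):
        raise ValueError("invalid config-number")

    params = {}
    for token in tokens[1:]:
        mandatory = token.startswith("+")
        name, sep, value = token[1 if mandatory else 0:].partition("=")
        if not sep or not name:
            raise ValueError("malformed parameter: " + token)
        if name in params:
            raise ValueError("repeated parameter list: " + name)
        params[name] = (value, mandatory)

    # The grammar places media-type first; dicts preserve insertion order.
    if next(iter(params), None) != "mt":
        raise ValueError("mt= must be the first parameter")
    for required in ("t", "m"):
        if required not in params:
            raise ValueError("missing " + required + "= parameter")
    return int(tokens[0]), params
```

For instance, parse_lcfg("a=lcfg:2 mt=video t=1 m=4|5") yields configuration number 2 with the mt, t, and m parameters, whereas a line with a leading zero in the configuration number, or without mt= as its first parameter, is rejected.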
Media stream payload numbers are not assigned by a latent configuration. Assignment will take place if and when the corresponding stream is actually offered via an m-line in a later exchange. The payload-number-config-list is included as a parameter to the lcfg attribute in case it is necessary to tie payload numbers in attribute capabilities to specific media capabilities.
If an lcfg attribute invokes an acap attribute that appears at the session level, then that attribute will be expected to appear at the session level of a subsequent offer when and if a corresponding media stream is offered. Otherwise, acap attributes which appear at the media level represent media-level attributes. Note, however, that rmcap, omcap, mfcap, mscap, and tcap attributes may appear at the session level because they always result in media-level attributes or m-line parameters.
The configuration numbers for latent configurations do not imply a preference; the offerer will imply a preference when actually offering potential configurations derived from latent configurations negotiated earlier. Note, however, that the offerer of latent configurations MAY specify preferences for combinations of potential and latent configurations by use of the sescap attribute defined in Section 3.3.8. For example, if an SDP offer contains, say, an audio stream with pcfg:1, and two latent video configurations, lcfg:2 and lcfg:3, then a session with one audio stream and one video stream could be specified by including "a=sescap:1 1,2|3". One audio stream and two video streams could be specified by including "a=sescap:2 1,2,3" in the offer. In order to permit combinations of latent and potential configurations in session capabilities, latent configuration numbers MUST be different from those used for potential configurations. This restriction is especially important if the offerer does not require the "med-v0" capability and the recipient of the offer doesn't support it. If the lcfg attribute is not recognized, the capability attributes intended to be associated with it may be confused with those associated with a potential configuration of some other media stream. Note also that leading zeroes are not permitted in configuration numbers.
If a cryptographic attribute, such as the SDES "a=crypto:" attribute [RFC4568], is referenced by a latent configuration through an acap attribute, any keying material required in the conventional attribute, such as the SDES key/salt string, MUST be included in order to satisfy formatting rules for the attribute. Since the keying material will be visible but not actually used at this stage (since it's a latent configuration), the value(s) of the keying material MUST NOT be a real value used for real exchange of media, and the receiver of the lcfg attribute MUST ignore the values.
The present work requires new extensions (parameters) for the pcfg attribute defined in the SDP Capability Negotiation base protocol [RFC5939]. The parameters and their definitions are "borrowed" from the definitions provided for the latent configuration attribute in Section 3.3.5. The expanded ABNF definition of the pcfg attribute is
a=pcfg: <config-number> [<pot-cfg-list>]
where
   config-number = 1*DIGIT                          ; defined in [RFC5234]
   pot-cfg-list  = pot-config *(1*WSP pot-config)
   pot-config    = attribute-config-list /          ; defined in [RFC5939]
                   transport-protocol-config-list / ; defined in [RFC5939]
                   extension-config-list /          ; [RFC5939]
                   media-config-list /              ; Section 3.3.4.1
                   payload-number-config-list       ; Section 3.3.4.2
Except for the extension-config-list, the pot-cfg-list MUST NOT contain more than one instance of each parameter list.
Potential and/or latent configuration attributes may be returned within an answer SDP to indicate the ability of the answerer to support alternative configurations of the corresponding stream(s). For example, an offer may include multiple potential configurations for a media stream and/or latent configurations for additional streams; the corresponding answer will indicate (via an acfg attribute) the configuration accepted and used to construct the base configuration for each active media stream in the reply, but the reply MAY also contain potential and/or latent configuration attributes, with parameters, to indicate which other offered configurations would be acceptable. This information is useful if it becomes desirable to reconfigure a media stream, e.g., to reduce resource consumption.
When potential and/or latent configurations are returned in an answer, all numbering MUST refer to the configuration and capability attribute numbering of the offer. The offered capability attributes need not be returned in the answer. The answer MAY include additional capability attributes and/or configurations (with distinct numbering). The parameter values of any returned pcfg or lcfg attributes MUST be a subset of those included in the offered configurations and/or those added by the answerer; values MAY be omitted only if they were indicated as alternative sets, or optional, in the original offer. The parameter set indicated in the returned acfg attribute need not be repeated in a returned pcfg attribute. The answerer MAY return more than one pcfg attribute with the same configuration number if it is necessary to describe selected combinations of optional or alternative parameters.
Similarly, one or more session capability attributes (a=sescap) MAY be returned to indicate which of the offered session capabilities is/are supportable by the answerer (see Section 3.3.8).
Note that, although the answerer MAY return capabilities beyond those included by the offerer, these capabilities MUST NOT be used to form any base level media description in the answer. For this reason, it is advisable for the offerer to include most, if not all, potential and latent configurations it can support in the initial offer, unless the size of the resulting SDP is a concern. Either party MAY later announce additional capabilities by renegotiating the session in a second offer/answer exchange.
When media format capabilities defined in rmcap attributes are used in potential configuration lines, the transport protocol uses RTP and it is necessary to assign payload type numbers. In some cases, it is desirable to assign different payload type numbers to the same media format capability when used in different potential configurations. One example is when configurations for AVP and SAVP are offered: the offerer would like the answerer to use different payload type numbers for encrypted and unencrypted media, so the offerer can decide whether or not to render early media which arrives before the answer is received.
This association of distinct payload type number(s) with different transport protocols requires a separate pcfg line for each protocol. Clearly, this technique cannot be used if the number of potential configurations exceeds the number of possible payload type numbers.
When media capabilities negotiation is employed, SDP records are likely to contain conventional attributes such as rtpmap, fmtp, and other media-format-related lines, as well as capability attributes such as rmcap, omcap, mfcap, and mscap which map into those conventional attributes when invoked by a potential configuration. In such cases, it MAY be appropriate to employ the delete-attributes option [RFC5939] in the attribute configuration list parameter in order to avoid the generation of conflicting fmtp attributes for a particular configuration. Any media-specific attributes in the media block which refer to media formats not used by the potential configuration MUST be ignored.
For example:
In this example, PCMU is media capability 1, G729 is media capability 2, and telephone-event is media capability 3. The a=pcfg:1 line specifies that the preferred configuration is G.729 with extended dtmf events, second is G.711 mu-law with extended dtmf events, and the base media-level attributes are to be deleted. Intermixing of G.729, G.711, and "commercial" dtmf events is least preferred (the base configuration provided by the "m=" line, which is, by default, the least preferred configuration). The rtpmap and fmtp attributes of the base configuration are replaced by the rmcap and mfcap attributes when invoked by the proposed configuration.
If the preferred configuration is selected, the SDP answer will look like
In some cases, for example, when an RFC 2198 [RFC2198] redundancy audio subtype (RED) capability is defined in an mfcap attribute, the parameters to an attribute may contain payload type numbers. Two options are available for specifying such payload type numbers. They may be expressed explicitly, in which case they are bound to actual payload types by means of the payload type number parameter (pt=) in the appropriate potential or latent configuration. For example, the following SDP fragment defines a potential configuration with redundant G.711 mu-law:
The potential configuration is then equivalent to
A more general mechanism is provided via the parameter substitution rule. When an mfcap, mscap, or acap attribute is processed, its arguments will be scanned for payload type number escape sequences of the following form (in ABNF):
If the sequence is found, the sequence is replaced by the payload type number assigned to the media capability number, as specified by the pt= parameter in the selected potential configuration; only actual payload type numbers are supported - wildcards are excluded. The sequence "%%" (null digit string) is replaced by a single percent sign and processing continues with the next character, if any.
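The substitution rule can be sketched in Python as follows. This is a minimal illustration, not a normative algorithm: it assumes the escape sequence consists of a percent sign, the digits of a media capability number, and a closing percent sign (consistent with "%%" being the null-digit-string case), and the function name and the pt_map dictionary are hypothetical.

```python
import re

# Assumed escape form: "%" <media capability number> "%".
# An empty digit string ("%%") stands for a literal percent sign.
_ESCAPE = re.compile(r"%([0-9]*)%")


def substitute_payload_types(value, pt_map):
    """Replace payload type escapes in an attribute value.

    pt_map maps media capability numbers to the payload type numbers
    assigned by the pt= parameter of the selected configuration.
    """
    def replace(match):
        digits = match.group(1)
        if digits == "":
            return "%"  # "%%" becomes a single percent sign
        cap_num = int(digits)
        if cap_num not in pt_map:
            raise ValueError("no payload type for capability %d" % cap_num)
        return str(pt_map[cap_num])

    return _ESCAPE.sub(replace, value)
```

With a hypothetical mapping such as pt=1:0,2:100, represented here as {1: 0, 2: 100}, the value "%1%/%1%" becomes "0/0", which is the kind of payload type list that a RED format parameter carries.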
For example, the above offer sequence could have been written as
and the equivalent SDP is the same as above.
Potential and latent configurations enable offerers and answerers to express a wide range of alternative configurations for current and future negotiation. However in practice, it may not be possible to support all combinations of these configurations.
The session capability attribute provides a means for the offerer and/or the answerer to specify combinations of specific media stream configurations which it is willing and able to support. Each session capability in an offer or answer MAY be expressed as a list of required potential configurations, and MAY include a list of optional potential and/or latent configurations.
The choices of session capabilities may be based on processing load, total bandwidth, or any other criteria of importance to the communicating parties. If the answerer supports media capabilities negotiation, and session configurations are offered, it MUST accept one of the offered configurations, or it MUST refuse the session. Therefore, if the offer includes any session capabilities, it SHOULD include all the session capabilities the offerer is willing to support.
The session capability attribute is a session-level attribute described by:
"a=sescap:" <session num> <list of configs>
which corresponds to the standard value attribute definition with
   att-field        = "sescap"
   att-value        = session-num 1*WSP list-of-configs
                      [1*WSP optional-configs]
   session-num      = NonZeroDigit *9(DIGIT)  ; DIGIT defined in RFC5234
   list-of-configs  = alt-config *("," alt-config)
   optional-configs = "[" list-of-configs "]"
   alt-config       = config-number *("|" config-number)
The session-num identifies the session: a lower-number session is preferred over a higher-number session, and leading zeroes are not permitted. Each alt-config list specifies alternative media configurations within the session; preference is based on config-num as specified in RFC 5939 [RFC5939]. Note that the session preference order, when present, takes precedence over the individual media stream configuration preference order.
Use of session capability attributes requires that configuration numbers assigned to potential and latent configurations MUST be unique across the entire session; RFC 5939 [RFC5939] requires only that pcfg configuration numbers be unique within a media description. Also, leading zeroes are not permitted.
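The sescap grammar above can be exercised with a short Python sketch. It is illustrative only: the function names are hypothetical, and it assumes that configuration numbers inside a sescap attribute follow the same "no leading zeroes, at most ten digits" form as session-num.

```python
import re

_NUM = r"[1-9][0-9]{0,9}"                      # no leading zeroes
_ALT = _NUM + r"(?:\|" + _NUM + r")*"          # alt-config
_LIST = _ALT + r"(?:," + _ALT + r")*"          # list-of-configs
_SESCAP = re.compile(
    r"a=sescap:(" + _NUM + r")[ \t]+(" + _LIST + r")"
    r"(?:[ \t]+\[(" + _LIST + r")\])?"
)


def _split(config_list):
    """Turn '1,2|3' into [[1], [2, 3]]: one inner list per alt-config."""
    return [[int(n) for n in alt.split("|")] for alt in config_list.split(",")]


def parse_sescap(line):
    """Return (session_num, required, optional) for one sescap line."""
    match = _SESCAP.fullmatch(line)
    if match is None:
        raise ValueError("not a valid sescap attribute: " + line)
    optional = _split(match.group(3)) if match.group(3) else []
    return int(match.group(1)), _split(match.group(2)), optional


def by_preference(lines):
    """Order session capabilities so the lowest session-num comes first."""
    return sorted((parse_sescap(line) for line in lines), key=lambda s: s[0])
```

Applied to "a=sescap:1 1,2|3", the parser returns session number 1 with the required configuration 1 and a choice between configurations 2 and 3; an optional list in square brackets is returned separately.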
As an example, consider an endpoint that is capable of supporting an audio stream with either one H.264 video stream or two H.263 video streams with a floor control stream. In the latter case, the second video stream is optional. The SDP offer might look like the following (offering audio, an H.263 video stream, BFCP, and another optional H.263 video stream); the empty lines are added for readability only and are not part of valid SDP:
If the answerer understands MediaCapNeg, but cannot support the Binary Floor Control Protocol, then it would respond with (invalid empty lines in SDP included again for readability):
An endpoint that doesn't support media capabilities negotiation, but does support H.263 video, would respond with one or two H.263 video streams. In the latter case, the answerer may issue a second offer to reconfigure the session to one audio and one video channel using H.264 or H.263.
Session capabilities can include latent capabilities as well. Here's a similar example in which the offerer wishes to initially establish an audio stream, and prefers to later establish two video streams with chair control. If the answerer doesn't understand Media CapNeg, or cannot support the dual video streams or floor control, then it may support a single H.264 video stream. Note that establishment of the most favored configuration will require two offer/answer exchanges.
In this example, the default offer, as seen by endpoints which do not understand capabilities negotiation, proposes a PCMU audio stream and an H.264 video stream. Note that the offered lcfg lines for the video streams don't carry pt= parameters because they're not needed (payload type numbers will be assigned in the offer/answer exchange that establishes the streams). Note also that the three rmcap, mfcap, and tcap attributes used by lcfg:3 and lcfg:4 are included at the session level so they may be referenced by both latent configurations. As per Section 3.3, the media attributes generated from the rmcap, mfcap, and tcap attributes are always media-level attributes. If the answerer supports Media CapNeg, and supports the most desired configuration, it would return the following SDP:
This exchange supports immediate establishment of an audio stream for preliminary conversation. This exchange would presumably be followed at the appropriate time with a "reconfiguration" offer/answer exchange to add the video and chair control streams.
In this section, we define extensions to the offer/answer model defined in RFC 3264 [RFC3264] and RFC 5939 [RFC5939] to allow for media format and associated parameter capabilities, latent configurations and acceptable combinations of media stream configurations to be used with the SDP Capability Negotiation framework. Note that the procedures defined in this section extend the offer/answer procedures defined in RFC 5939 [RFC5939] Section 6; those procedures form a baseline set of capability negotiation offer/answer procedures that MUST be followed, subject to the extensions defined here.
SDP Capability Negotiation [RFC5939] provides a relatively compact means to offer the equivalent of an ordered list of alternative configurations for offered media streams (as would be described by separate m= lines and associated attributes). The attributes acap, mscap, mfcap, omcap and rmcap are designed to map somewhat straightforwardly into equivalent m= lines and conventional attributes when invoked by a pcfg, lcfg, or acfg attribute with appropriate parameters. The a=pcfg: lines, along with the m= line itself, represent offered media configurations. The a=lcfg: lines represent alternative capabilities for future use.
The Media Capabilities negotiation extensions defined in this document cover the following categories of features:
The high-level description of the operation is as follows:
When an endpoint generates an initial offer and wants to use the functionality described in the current document, it SHOULD identify and define the media formats and associated parameters it can support via the rmcap, omcap, mfcap and mscap attributes. The SDP media line(s) ("m=") should be made up with the actual configuration to be used if the other party does not understand capability negotiations (by default, this is the least preferred configuration). Typically, the media line configuration will contain the minimum acceptable configuration from the offerer's point of view.
Preferred configurations for each media stream are identified following the media line. The present offer may also include latent configuration (lcfg) attributes, at the media level, describing media streams and/or configurations the offerer is not now offering, but which it is willing to support in a future offer/answer exchange. A simple example might be the inclusion of a latent video configuration in an offer for an audio stream.
Lastly, if the offerer wishes to impose restrictions on the combinations of potential configurations to be used, it will include session capability (sescap) attributes indicating those.
If the offerer requires the answerer to understand the media capability extensions, the offerer MUST include a creq attribute containing the value "med-v0". If media capability negotiation is required only for specific media descriptions, the "med-v0" value MUST be provided only in creq attributes within those media descriptions, as described in RFC 5939 [RFC5939].
Below, we provide a more detailed description of how to construct the offer SDP.
For each RTP-based media format the offerer wants to include as a media format capability, the offer MUST include an "rmcap" attribute for the media format as defined in Section 3.3.1.
For each non-RTP-based media format the offerer wants to include as a media format capability, the offer MUST include an "omcap" attribute for the media format as defined in Section 3.3.1.
Since the media capability number space is shared between the rmcap and omcap attributes, each media capability number provided (including ranges) MUST be unique in the entire SDP.
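A possible check for this uniqueness rule is sketched below in Python. It assumes that the number list of an rmcap or omcap attribute is a comma-separated list whose elements are either single numbers or "first-last" ranges; that syntax is defined in Section 3.3.1 and is only approximated here, and the function names are hypothetical.

```python
def expand_cap_nums(num_list):
    """Expand a list such as '1,4-6' into [1, 4, 5, 6] (assumed syntax)."""
    numbers = []
    for element in num_list.split(","):
        first, dash, last = element.partition("-")
        start = int(first)
        end = int(last) if dash else start
        if end < start:
            raise ValueError("descending range: " + element)
        numbers.extend(range(start, end + 1))
    return numbers


def duplicate_cap_nums(num_lists):
    """Return the capability numbers that appear more than once.

    num_lists holds the number-list part of every rmcap and omcap
    attribute in the SDP, since both share one number space.
    """
    seen, duplicates = set(), set()
    for num_list in num_lists:
        for number in expand_cap_nums(num_list):
            if number in seen:
                duplicates.add(number)
            else:
                seen.add(number)
    return sorted(duplicates)
```

An empty result means that no media capability number is reused; for the lists "1,4-6" and "5,7" the function reports 5, because the range 4-6 already covers it.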
If an "fmtp" parameter value is needed for a media format (whether RTP-based or not) in a media capability, then the offer MUST include one or more "mfcap" parameters with the relevant fmtp parameter values for that media format as defined in Section 3.3.2. When multiple "mfcap" parameters are provided for a given media capability, they MUST be provided in accordance with the concatenation rules in Section 3.3.2.1.
For each of the media format capabilities above, the offer MAY include one or more "mscap" parameters with attributes needed for those specific media formats as defined in Section 3.3.3. Such attributes will be instantiated at the media-level, and hence session-level only attributes MUST NOT be used in the "mscap" parameter. The "mscap" parameter MUST NOT include an "rtpmap" or "fmtp" attribute (rmcap and mfcap are used instead).
If the offerer wants to limit the relevance (and use) of a media format capability or parameter to a particular media stream, the media format capability or parameter MUST be provided within the corresponding media description. Otherwise, the media format capabilities and parameters MUST be provided at the session level. Note however, that the attribute or parameter embedded in these will always be instantiated at the media-level.
Inclusion of the above does not constitute an offer to use the capabilities; a potential configuration is needed for that. If the offerer wants to offer one or more of the media capabilities above, they MUST be included as part of a potential configuration (pcfg) attribute as defined in Section 3.3.4. Each potential configuration MUST include a config-number, and each config-number MUST be unique in the entire SDP (note that this differs from RFC 5939 [RFC5939], which only requires uniqueness within a media description). Also, the config-number MUST NOT overlap with any config-number used by a latent configuration in the SDP. As described in RFC 5939 [RFC5939], lower config-numbers indicate a higher preference; however, this ordering still applies only within a given media description.
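The numbering requirement for offers can be checked mechanically. The Python sketch below is one way to do so under simple assumptions: it looks only at lines that begin with "a=pcfg:" or "a=lcfg:" followed immediately by the configuration number, and the function name and the sample offer lines are hypothetical.

```python
import re
from collections import Counter

# Matches the configuration number at the start of each pcfg or lcfg line.
_CFG_LINE = re.compile(r"^a=(?:pcfg|lcfg):([0-9]+)", re.MULTILINE)


def duplicate_config_numbers(offer_sdp):
    """Return config-numbers used by more than one pcfg/lcfg line.

    Potential and latent configurations share a single number space in
    an offer, so any number reported here violates the rule above.
    """
    counts = Counter(int(n) for n in _CFG_LINE.findall(offer_sdp))
    return sorted(n for n, count in counts.items() if count > 1)


# Hypothetical offer fragment: lcfg:1 collides with pcfg:1.
offer = "\r\n".join([
    "m=audio 54322 RTP/AVP 0",
    "a=pcfg:1 m=2 pt=2:18",
    "a=lcfg:2 mt=video t=1 m=4",
    "a=lcfg:1 mt=video t=1 m=5",
])
conflicts = duplicate_config_numbers(offer)
```

The check is meant for offers only, since an answer may legitimately return more than one pcfg attribute with the same configuration number.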
For a media capability to be included in a potential configuration, there MUST be an "m=" parameter in the pcfg attribute referencing the media capability number in question. When one or more media capabilities are included in an offered potential configuration (pcfg), they completely replace the list of media formats offered in the actual configuration (m= line). Any attributes included for those formats remain in the SDP though (e.g., rtpmap, fmtp, etc.). For non-RTP based media formats, the format-name (from the "omcap" media capability) is simply added to the "m=" line as a media format (e.g. t38). For RTP-based media, payload type mappings MUST be provided by use of the "pt" parameter in the potential configuration (see Section 3.3.4.2); payload type escaping may be used in mfcap, mscap, and acap attributes as defined in Section 3.3.7.
Note that the "mt" parameter MUST NOT be used with the pcfg attribute (since it is defined for the lcfg attribute only); the media type in a potential configuration cannot be changed from that of the encompassing media description.
If the offerer wishes to offer one or more latent configurations for future use, the offer MUST include a latent configuration attribute (lcfg) for each as defined in Section 3.3.5.
Each lcfg attribute
Each lcfg attribute MAY include additional capability references, which may refer to capabilities anywhere in the session description, subject to any restrictions normally associated with such capabilities. For example, a media-level attribute capability must be present at the media-level in some media description in the SDP. Note that this differs from the potential configuration attribute, which cannot validly refer to media-level capabilities in another media description (per RFC 5939 [RFC5939], Section 3.5.1).
If the offerer wants to indicate restrictions or preferences among combinations of potential and/or latent configuration, a session capability (sescap) attribute MUST be provided at the session-level for each such combination as described in Section 3.3.8. Each sescap attribute MUST include a session-num that is unique in the entire SDP; the lower the session-num the more preferred that combination is. Furthermore, sescap preference order takes precedence over any order specified in individual pcfg attributes.
When receiving an offer, the answerer MUST check the offer for creq attributes containing the value "med-v0"; answerers compliant with this specification will support this value in accordance with the procedures specified in RFC 5939 [RFC5939].
The SDP MAY contain
The high-level informative description of the operation is as follows:
When the answering party receives the offer and if it supports the required capability negotiation extensions, it should select the most-preferred configuration it can support for each media stream, and build its answer accordingly. The configuration selected for each accepted media stream is placed into the answer as a media line with associated parameters and attributes. If a proposed configuration is chosen for a given media stream, the answer must contain an actual configuration (acfg) attribute for that media stream to indicate which offered pcfg attribute was used to build the answer. The answer should also include any potential or latent configurations the answerer can support, especially any configurations compatible with other potential or latent configurations received in the offer. The answerer should make note of those configurations it might wish to offer in the future.
Below we provide a more detailed normative description of how the answerer processes the offer SDP and generates an answer SDP.
The answerer MUST first determine if it needs to perform media capability negotiation by examining the SDP for valid and preferred potential configuration attributes that include media configuration parameters (i.e., an "m" parameter in the pcfg attribute).
Such a potential configuration is valid if:
Note that, since SDP does not interpret the value of fmtp parameters, any resulting fmtp parameter value will be considered valid.
Secondly, the answerer MUST determine the order in which potential configurations are to be negotiated. In the absence of any Session Capability ("sescap") attributes, this simply follows the rules of RFC 5939 [RFC5939], with a lower config-number within a media description being preferred over a higher one. If a valid "sescap" attribute is present, the preference order provided in the "sescap" attribute MUST take precedence. A "sescap" attribute is considered valid if:
The answerer MUST now process the offer for each media stream based on the most preferred valid potential configuration in accordance with the procedures specified in RFC 5939 [RFC5939], Section 3.6.2, and further extended below:
Once the answerer has selected a valid and supported offered potential configuration for all of the media streams (or has fallen back to the actual configuration plus any added session attributes), the answerer MUST generate a valid answer SDP as described in RFC 5939 [RFC5939], Section 3.6.2, and further extended below:
The answerer MUST determine if it needs to perform any latent configuration processing by examining the SDP for valid latent configuration attributes (lcfg). An lcfg attribute is considered valid if:
For each such valid latent configuration in the offer, the answerer checks to see if it could support the latent configuration in a subsequent offer/answer exchange. If so, it includes the latent configuration with the same configuration number in the answer, similar to the way potential configurations are processed and the selected one returned in an actual configuration attribute (see RFC 5939 [RFC5939]). If the answerer supports only a (non-mandatory) subset of the parameters offered in a latent configuration, the answer latent configuration will include only those parameters supported (similar to "acfg" processing). Note that latent configurations do not constitute an actual offer at this point in time; they merely indicate additional configurations that could be supported.
If a Session Capability ("sescap") attribute is included and it references a latent configuration, then the answerer processing of that latent configuration must be done within the constraints specified by that Session Capability, i.e., it must be possible to support it at the same time as any required (i.e., non-optional) potential configurations in the session capability. The answerer may in turn add its own "sescap" indications in the answer as well.
The offerer MUST process the answer in accordance with RFC 5939 [RFC5939] Section 3.6.3, and further explained below.
When the offerer processes the answer SDP based on a valid actual configuration attribute in the answer, and that valid configuration includes one or more media capabilities, the processing MUST furthermore be done as if the offer was sent using those media capabilities instead of the actual configuration. In particular, the media formats in the "m=" line, and any associated payload type mappings (rtpmap), fmtp parameters (mfcap) and media-specific attributes (mscap) MUST be used. Note that this may involve use of concatenation and substitution rules (see Section 3.3.2.1 and 3.3.7). The actual configuration attribute may also be used to infer the lack of acceptability of higher-preference configurations that were not chosen, subject to any constraints provided by a Session Capability attribute ("sescap") in the offer. Note that the SDP Capability Negotiation base specification [RFC5939] requires the answerer to choose the highest preference configuration it can support, subject to local policies.
When the offerer receives the answer, it SHOULD furthermore make note of any capabilities and/or latent configurations included for future use, and any constraints on how those may be combined.
If, at a later time, one of the parties wishes to modify the operating parameters of a session, e.g., by adding a new media stream, or by changing the properties used on an existing stream, it can do so via the mechanisms defined for offer/answer [RFC3264]. If the initiating party has remembered the codecs, potential configurations, latent configurations and session capabilities provided by the other party in the earlier negotiation, it MAY use this knowledge to maximize the likelihood of a successful modification of the session. Alternatively, the initiator MAY perform a new capabilities exchange as part of the reconfiguration. In such a case, the new capabilities will replace the previously-negotiated capabilities. This may be useful if conditions change on the endpoint.
In this section, we provide examples showing how to use the Media Capabilities with the SDP Capability Negotiation.
This example provides a choice of one of six variations of the adaptive multirate codec. In this example, the default configuration as specified by the media line is the same as the most preferred configuration. Each configuration uses a different payload type number so the offerer can interpret early media.
In the above example, media capability 1 could have been excluded from the first rmcap declaration and from the corresponding mfcap attributes, and the pcfg:1 attribute line could have been simply "pcfg:1".
The next example offers a video stream with three H.264 options and four transports. It also includes an audio stream with different audio qualities: four variations of AMR, or AC3. The offer looks something like:
This offer illustrates the advantage in compactness that arises if one can avoid deleting the base configuration attributes and recreating them in acap attributes for the potential configurations.
If an endpoint has limited signal processing capacity, it might be capable of supporting, say, a G.711 mu-law audio stream in combination with an H.264 video stream, or a G.729B audio stream in combination with an H.263-1998 video stream. It might then issue an offer like the following:
Note that the preferred session configuration (and the default as well) is G.729B with H.263. This overrides the individual media stream preferences, which, by the potential configuration numbering rule, are PCMU and H.264.
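The way a session capability ordering takes precedence over per-stream configuration numbering can be sketched in Python as follows. This is an illustrative sketch only: the configuration numbers, the codec labels attached to them, and the two "sescap" lines are hypothetical and are not taken from the offer in this example. The parser handles only a plain comma-separated list of configuration numbers (no alternatives and no optional configurations), and it assumes that a lower session number indicates a more preferred session configuration.

```python
def parse_sescap(line):
    """Parse 'a=sescap:<session-num> <n>,<n>,...' into (session_num, configs).

    Simplification: alternatives and optional configurations are not handled.
    """
    prefix = "a=sescap:"
    if not line.startswith(prefix):
        raise ValueError("not a sescap attribute: %r" % line)
    session_num, config_list = line[len(prefix):].split(None, 1)
    return int(session_num), [int(n) for n in config_list.split(",")]


def session_preference(sescap_lines):
    """Return lists of configuration numbers, most preferred session first.

    Assumption: lower session numbers are more preferred.
    """
    entries = [parse_sescap(line) for line in sescap_lines]
    entries.sort(key=lambda entry: entry[0])
    return [configs for _, configs in entries]


# Hypothetical configuration numbering, unique across the session:
# audio uses 1 and 2, video uses 3 and 4.
config_names = {1: "PCMU", 2: "G729", 3: "H264", 4: "H263-1998"}

ordered = session_preference(["a=sescap:2 1,3", "a=sescap:1 2,4"])
preferred = [config_names[n] for n in ordered[0]]
```

With these hypothetical inputs, the per-stream numbering alone would favor PCMU (1) and H.264 (3), but the session capability ordering selects G.729 with H.263 first.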
Consider a case in which the offerer can support either G.711 mu-law, or G.729B, along with DTMF telephony events for the 12 common touchtone signals, but is willing to support simple G.711 mu-law audio as a last resort. In addition, the offerer wishes to announce its ability to support video and MSRP in the future, but does not wish to offer a video stream or an MSRP stream at present. The offer might look like the following:
The first lcfg attribute line ("lcfg:2") announces support for H.263 and H.264 video (H.263 preferred) for future negotiation. The second lcfg attribute line ("lcfg:3") announces support for MSRP for future negotiation. The m-line and the rtpmap attribute offer an audio stream and provide the lowest precedence configuration (PCMU without any DTMF encoding). The rmcap lines define the RTP-based media format capabilities (PCMU, G729, telephone-event, H263-1998 and H264) and the omcap line defines the non-RTP based media format capability (wildcard). The mfcap attribute provides the format parameters for telephone-event, specifying the 12 commercial DTMF 'digits'. The pcfg attribute line defines the most-preferred media configuration as PCMU plus DTMF events and the next-most-preferred configuration as G.729B plus DTMF events.
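To make the relationship between the rmcap capability numbers and the pcfg alternatives concrete, the following Python sketch builds a capability table and expands a potential configuration into its ordered alternatives. The capability numbers, payload types, and attribute lines used here are hypothetical stand-ins rather than a reproduction of the offer above, and the helper names are assumptions. Capability number ranges (such as "1-3") are not handled.

```python
def parse_rmcap(line):
    """Parse 'a=rmcap:<num>[,<num>...] <encoding>' into {num: encoding}.

    Simplification: capability number ranges are not handled.
    """
    prefix = "a=rmcap:"
    if not line.startswith(prefix):
        raise ValueError("not an rmcap attribute: %r" % line)
    numbers, encoding = line[len(prefix):].split(None, 1)
    return {int(n): encoding.strip() for n in numbers.split(",")}


def pcfg_alternatives(line, capabilities):
    """Expand the m= parameter of a pcfg line into lists of encodings.

    Alternatives are separated by '|' and listed most preferred first.
    """
    prefix = "a=pcfg:"
    if not line.startswith(prefix):
        raise ValueError("not a pcfg attribute: %r" % line)
    fields = line[len(prefix):].split()
    m_value = next(f[2:] for f in fields[1:] if f.startswith("m="))
    return [
        [capabilities[int(n)] for n in alternative.split(",")]
        for alternative in m_value.split("|")
    ]


# Hypothetical capability declarations and potential configuration.
capabilities = {}
for rmcap_line in (
    "a=rmcap:1 PCMU/8000",
    "a=rmcap:2 G729/8000",
    "a=rmcap:3 telephone-event/8000",
):
    capabilities.update(parse_rmcap(rmcap_line))

alternatives = pcfg_alternatives(
    "a=pcfg:1 m=1,3|2,3 pt=1:0,2:18,3:100", capabilities
)
```

Here the first alternative pairs PCMU with telephone-event and the second pairs G729 with telephone-event, mirroring the preference order described in the text.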
If the answerer is able to support all the potential configurations, and also support H.263 video (but not H.264), it would reply with an answer like:
The lcfg attribute line announces the capability to support H.263 video at a later time. The media line and subsequent rtpmap and fmtp attribute lines present the selected configuration for the media stream. The acfg attribute line identifies the potential configuration from which it was taken, and the pcfg attribute line announces the potential capability to support G.729 with DTMF events as well. If, at some later time, congestion becomes a problem in the network, either party may, with expectation of success, offer a reconfiguration of the media stream to use G.729 in order to reduce packet sizes.
The IANA is hereby requested to register the following new SDP attributes:
The IANA is hereby requested to add the new option tag "med-v0", defined in this document, to the SDP Capability Negotiation Option Capability registry created for RFC 5939 [RFC5939].
The IANA is hereby requested to change the "SDP Capability Negotiation Potential Configuration Parameters" registry currently registered and defined by RFC 5939 [RFC5939] as follows:
The name of the registry should be "SDP Capability Negotiation Configuration Parameters Registry" and it should contain a table with the following column headings:
An IANA SDP Capability Negotiation Configuration registration MUST be documented in an RFC in accordance with the IETF Review policy [RFC5226]. Furthermore:
The IANA is hereby requested to register the following capability negotiation configuration parameters:
The security considerations of RFC 5939 [RFC5939] apply for this document.
In RFC 5939 [RFC5939], it was noted that negotiation of transport protocols (e.g., secure and non-secure) and negotiation of keying methods and material are potential security issues that warrant integrity protection to remedy. Latent configuration support provides hints to the other side about capabilities supported for further offer/answer exchanges, including transport protocols and attribute capabilities, e.g., for keying methods. If an attacker can remove or alter latent configuration information to suggest that only insecure or less secure alternatives are supported, then the attacker may be able to force negotiation of a less secure session than would otherwise have occurred. While the specific attack as described here differs from those described in RFC 5939 [RFC5939], the considerations and mitigation strategies are similar to those described in RFC 5939 [RFC5939].
Another variation on the above attack involves the Session Capability ("sescap") attribute defined in this document. The "sescap" enables a preference order to be specified for all the potential configurations, and that preference will take precedence over any preference indication provided in individual potential configuration attributes. Consequently, an attacker that can insert or modify a "sescap" attribute may be able to force negotiation of an insecure or less secure alternative than would otherwise have occurred. Again, the considerations and mitigation strategies are similar to those described in RFC 5939 [RFC5939].
The addition of negotiable media formats and their associated parameters, defined in this specification, can cause problems for middleboxes that attempt to control bandwidth utilization, media flows, and/or processing resource consumption as part of network policy, but that do not understand the media capability negotiation feature. As for the initial SDP Capability Negotiation work [RFC5939], the SDP answer is formulated in such a way that it always carries the selected media encoding for every media stream selected. Pending an understanding of capabilities negotiation, the middlebox should examine the answer SDP to obtain the best picture of the media streams being established. As always, middleboxes can best do their job if they fully understand media capabilities negotiation.
The major change is in Section 4.3, Latent Media Streams, fixing the syntax of the answer. All the other changes are editorial.
This version contains several detail changes intended to simplify capability processing and mapping into conventional SDP media blocks.
The document adds a new attribute for specifying bandwidth capability and a parameter to list in the potential configuration. Other changes are to align the document with the terminology and attribute names from draft-ietf-mmusic-sdp-capability-negotiation-07. The document also clarifies some previous open issues.
The major changes include taking out the "mcap" and "cptmap" parameters. The mapping of payload type is now in the "pt" parameter of "pcfg". Media subtypes need to be explicitly defined in the "cmed" attribute if referenced in the "pcfg".
This document is heavily influenced by the discussions and work done by the SDP Capability Negotiation Design team. The following people in particular provided useful comments and suggestions to either the document itself or the overall direction of the solution defined herein: Cullen Jennings, Matt Lepinski, Joerg Ott, Colin Perkins, and Thomas Stach.
We thank Ingemar Johansson and Magnus Westerlund for examples that stimulated this work, and for critical reading of the document. We also thank Cullen Jennings, Christer Holmberg, and Miguel Garcia for their review of the document.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.
[RFC4566] Handley, M., Jacobson, V. and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.
[RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 5226, May 2008.
[RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, January 2008.
[RFC5939] Andreasen, F., "Session Description Protocol (SDP) Capability Negotiation", RFC 5939, September 2010.