Network Working Group | R.G. Gellens |
Internet-Draft | QUALCOMM Incorporated |
Obsoletes: 4281 (if approved) | D.S. Singer |
Intended status: Standards Track | Apple Inc. |
Expires: September 06, 2011 | P.F. Frojdh |
Ericsson Research | |
March 05, 2011 |
The Codecs and Profiles Parameters for "Bucket" Media Types
draft-gellens-mime-bucket-bis-03.txt
Several MIME type/subtype combinations exist that can contain different media formats. A receiving agent thus needs to examine the details of such media content to determine if the specific elements can be rendered given an available set of codecs. Especially when the end system has limited resources, or the connection to the end system has limited bandwidth, it is helpful to know from the Content-Type alone if the content can be rendered.
This document specifies two parameters, "codecs" and "profiles", which are used with various MIME types or type/subtype combinations to allow for unambiguous specification of the codecs and/or profiles employed by the media formats contained within.
By labeling content with the specific codecs indicated to render the contained media, receiving systems can determine if the codecs are supported by the end system, and if not, can take appropriate action (such as rejecting the content, sending notification of the situation, transcoding the content to a supported type, fetching and installing the required codecs, further inspection to determine if it will be sufficient to support a subset of the indicated codecs, etc.)
Similarly, the profiles can provide an overall indication, to the receiver, of the specifications with which the content complies. The receiver may be able to work out the extent to which it can handle and render the content by examining to see which of the declared profiles it supports, and what they mean.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 06, 2011.
Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
One of the original motivations for MIME is the ability to identify the specific media type of a message part. However, due to various factors, it is not always possible from looking at the MIME type and subtype to know which specific media formats are contained in the body part, or which codecs are indicated in order to render the content.
There are several media type/subtypes (either currently registered or deployed with registration pending) that contain codecs chosen from a set. In the absence of the parameters described here, it is necessary to examine each media element in order to determine the codecs or other features required to render the content. For example, video/3gpp may contain any of the video formats H.263 Profile 0, H.263 Profile 3, H.264, MPEG-4 Simple Profile, and/or any of the audio formats Adaptive Multi Rate (AMR), Adaptive Multi Rate - WideBand (AMR-WB), Extended AMR-WB, Advanced Audio Coding (AAC), or Enhanced aacPlus, as specified in [3GPP-Formats].
In some cases, the specific codecs can be determined by examining the header information of the media content. While this isn't as bad as examining the entire content, it still requires specialized knowledge of each format and is resource consumptive.
This ambiguity can be a problem for various clients and servers. For example, it presents a significant burden to Multimedia Messaging (MMS) servers, which must examine the media sent in each message in order to determine which codecs are required to render the content. Only then can such a server determine if the content requires transcoding or specialized handling prior to being transmitted to the handset.
Additionally, it presents a challenge to smart clients on devices with constrained memory, processing power, or transmission bandwidth (such as cellular telephones and PDAs). Such clients often need to determine in advance if they are currently capable of rendering the content contained in an MMS or email message.
Ambiguity:
Note that there are additional media types that are ambiguous, but are outside the scope of this document, including:
With each "bucket" type, a receiving agent only knows that it has a container format. It doesn't even know whether content labeled video/3gpp or video/3gpp2 contains video; it might be audio only, audio and video, or video only.
A solution that permits a receiving agent to determine the specific codecs or profiles required to render media content would help provide efficient and scalable servers, especially for Multimedia Messaging (MMS), and aid the growth of multimedia services in wireless networks.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in "Key words for use in RFCs to Indicate Requirement Levels" [RFC2119] .
The syntax in this document uses the BNF rules specified in [RFC2045] and [RFC2231].
This section adds a parameter to allow unambiguous specification of all codecs indicated to render the content in the MIME part. This parameter is optional in all current types to which it is added. Future types that contain ambiguity are strongly encouraged to include this parameter.
This parameter applies to:
This includes the media types:
Note that MPEG-21 files under the type application/mp21 may, but are not required to, contain a top-level 'moov' atom providing a timed, coded, resource. The codecs parameter SHOULD only be used for MPEG-21 files when this timed material is also present in the file.
Parameter name: codecs
Parameter value: A single value, or a comma-separated list of values identifying the codec(s) indicated to render the content in the body part.
Each value consists of one or more dot-separated elements. The name space for the first element is determined by the MIME type. The name space for each subsequent element is determined by the preceding element. The precise syntax is given below in the Generic Syntax [sec-3.2].
Note that, per [RFC2045], some characters (including the comma used to separate multiple values) require that the entire parameter value be enclosed in quotes.
An element MAY include an octet that must be encoded in order to comply with [RFC2045]. In this case, [RFC2231] is used: an asterisk ("*") is placed at the end of the parameter name (becoming "codecs*" instead of "codecs"), the parameter value usually starts with two single quote ("'") characters (indicating that neither character set nor language is specified), and each octet that requires encoding is represented as a percent sign ("%") followed by two hexadecimal digits. Note that, when the [RFC2231] form is used, the percent sign, asterisk, and single quote characters have special meaning and so must themselves be encoded.
Examples of Generic Syntax: codecs=a.bb.ccc.d codecs="a.bb.ccc.d, e.fff" codecs*=''fo%2e codecs*="''%25%20xz, gork"
When the codecs parameter is used, it MUST contain all codecs indicated by the content present in the body part. The codecs parameter MUST NOT include any codecs that are not indicated by any media elements in the body part.
In some cases, not all indicated codecs are absolutely required in order to render the content. Therefore, when a receiver does not support all listed codecs, special handling MAY be required. For example, the media element(s) MAY need to be examined in order to determine if an unsupported codec is actually required (e.g., there may be alternative tracks (such as English and Spanish audio), there may be timed text that can be dropped, etc.)
NOTE: Although the parameter value MUST be complete and accurate in 'breadth' (that is, it MUST report all four-character codes used in all tracks for ISO-family files, for example) systems MUST NOT rely on it being complete in 'depth'. If the hierarchical rules for a given code (e.g., 'qvxy') were written after a server was implemented, for example, that server will not know what elements to place after 'qvxy'.
If a receiver encounters a body part whose codecs parameter contains codecs that are not indicated by any media elements, then the receiver SHOULD process the body part by discarding the information in the codecs parameter.
If a receiver encounters a body part whose codecs parameter does not contain all codecs indicated by the media elements, then the receiver MAY process the body part by discarding the information in the codecs parameter.
The codecs parameter takes either of two forms. The first form is used when the value does not contain any octets that require encoding. The second form uses [RFC2231] to allow arbitrary octets to be encoded. With either form, quotes allow for commas and other characters in <tspecials> (quotes MAY be used even when not required).
This BNF uses the rules specified in [RFC2045] and [RFC2231] .
Implementations MUST NOT add CFWS between the tokens except after ",". TOKEN is defined in [RFC2045], and <ext-octet> and <attribute-char> are defined in [RFC2231].
The BNF syntax is as follows:
codecs := cod-simple / cod-fancy cod-simple := "codecs" "=" unencodedv unencodedv := id-simple / simp-list simp-list := DQUOTE id-simple *( "," id-simple ) DQUOTE id-simple := element ; "." reserved as hierarchy delimiter element := 1*octet-sim octet-sim := <any TOKEN character> ; Within a codecs parameter value, "." is reserved ; as a hierarchy delimiter cod-fancy := "codecs*" ":=" encodedv encodedv := fancy-sing / fancy-list fancy-sing := [charset] "'" [language] "'" id-encoded ; Parsers MAY ignore <language> ; Parsers MAY support only US-ASCII and UTF-8 fancy-list := DQUOTE [charset] "'" [language] "'" id-list DQUOTE ; Parsers MAY ignore <language> ; Parsers MAY support only US-ASCII and UTF-8 id-list := id-encoded *( "," id-encoded ) id-encoded := encoded-elm *( "." encoded-elm ) ; "." reserved as hierarchy delimiter encoded-elm := 1*octet-fancy octet-fancy := ext-octet / attribute-char DQUOTE := %x22 ; " (double quote)
Initial name space: This document only defines values for files in the ISO Base Media File Format, and QuickTime, families. Other file formats may also define codec naming.
For the ISO Base Media File Format, and the QuickTime movie file format, the first element of a codecs parameter value is a sample description entry four-character code as registered by the MP4 Registration Authority [MP4RA]. Values are case sensitive.
Note that there are potentially multiple tracks in a file, each potentially carrying multiple sample entries (some but not all uses of the ISO File Format restrict the number of sample entries in a track to one).
When the first element of a value is 'mp4a' (indicating some kind of MPEG-4 audio), or 'mp4v' (indicating some kind of MPEG-4 part-2 video), or 'mp4s' (indicating some kind of MPEG-4 Systems streams such as MPEG-4 BIFS), the second element is the hexadecimal representation of the MP4 Registration Authority ObjectTypeIndication (OTI), as specified in [MP4RA] and [MP41] (including amendments). Note that [MP4RA] uses a leading "0x" with these values, which is omitted here and hence implied.
One of the OTI values for 'mp4a' is 40 (identifying MPEG-4 audio). For this value, the third element identifies the audio ObjectTypeIndication (OTI) as defined in [MP4A] (including amendments), expressed as a decimal number.
For example, AAC low complexity has the value 2, so a complete string for AAC-LC would be "mp4a.40.2".
One of the OTI values for 'mp4v' is 20 (identifying MPEG-4 part-2 video). For this value, the third element identifies the video ProfileLevelIndication as defined in [MP4V] (including amendments), expressed as a decimal number.
For example, MPEG-4 Visual Simple Profile Level 0 has the value 9, so a complete string for MPEG-4 Visual Simple Profile Level 0 would be "mp4v.20.9".
When the first element of a value is a code indicating a codec from the Advanced Video Coding specification [AVC], specifically one of the sample entries defined in [AVC-Formats] (such as 'avc1', 'avc2', 'svc1', 'mvc1' and 'mvc2') - indicating AVC (H.264), Scalable Video Coding (SVC) or Multiview Video Coding (MVC), the second element (referred to as 'avcoti' in the formal syntax) is the hexadecimal representation of the following three bytes in the (subset) sequence parameter set NAL unit specified in [AVC]:
Note that the sample entries 'avc1' and 'avc2' do not necessarily indicate that the media only contains AVC NAL units. In fact, the media may be encoded as an SVC or MVC profile and thus contain SVC or MVC NAL units. In order to be able to determine which codec is used further information is necessary (profile_idc). Note also that reserved_zero_2bits is required to be equal to 0 in [AVC], but other values for it may be specified in the future by ITU-T or ISO/IEC.
This is as previously defined in the 3GPP File Format specification 3GPP TS 26.244 [3GPP-Formats], section A.2.2.
When SVC or MVC content is coded in an AVC-compatible fashion, the sample description may include both an AVC configuration record and an SVC or MVC configuration record. Under those circumstances, it is recommended that the two configuration records both be reported as they may contain different AVC profile, level, and compatibility indicator values. Thus the codecs reported would include the sample description code (e.g. 'avc1') twice, with the values from one of the configuration records forming the 'avcoti' information in each.
id-simple :=/ id-iso id-encoded :=/ id-iso id-iso := iso-gen / iso-mpega / iso-mpegv / iso-avc iso-gen := cpid *( element / encoded-elm ) ; <element> used with <codecs-simple> ; <encoded-elm> used with <codecs-fancy> ; ; Note that the BNF permits "." within <element> ; and <encoded-elm> but "." is reserved as the ; hierarchy delimiter iso-mpega := mp4a "." oti [ "." aud-oti ] iso-mpegv := mp4v "." oti [ "." vid-pli ] iso-avc := avc1 / avc2 / svc1 / mvc1 / mvc2 [ "." avcoti ] cpid := 4(octet-simple / octet-fancy) ; <octet-simple> used with <codecs-simple> ; <octet-fancy> used with <codecs-fancy> mp4a := %x6d.70.34.61 ; 'mp4a' oti := 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F") ; leading "0x" omitted avc1 := %x61.76.63.31 ; 'avc1' avc2 := %x61.76.63.32 ; 'avc2' svc1 := %x73.76.63.31 ; 'svc1' mvc1 := %x6d.76.63.31 ; 'mvc1' mvc2 := %x6d.76.63.32 ; 'mvc2' avcoti := 6(DIGIT / "A" / "B" / "C" / "D" / "E" / "F") ; leading "0x" omitted aud-oti := 1*DIGIT mp4v := %x6d.70.34.76 ; 'mp4v' vid-pli := 1*DIGIT
This parameter MAY be specified for use with additional MIME media types.
For ISO file formats where the name space as defined here is sufficient, all that needs to be done is to update the media type registration to specify the codecs parameter with a reference to this document. For existing media types, it is generally advisable for the parameter to be optional; for new media types, the parameter MAY be optional or required, as appropriate.
For ISO file formats where the name space as defined here needs to be expanded, a new document MAY update this one by specifying the additional detail.
For non-ISO formats, a new document MAY update this one by specifying the name space for the media type(s).
Content-Type: video/3gpp2; codecs="sevc, s263" (EVRC audio plus H.263 video) Content-Type: audio/3gpp; codecs=samr (AMR audio) Content-Type: video/3gpp; codecs="s263, samr" (H.263 video plus AMR audio) Content-Type: audio/3gpp2; codecs=mp4a.E1 (13k audio) Content-Type: video/3gpp2; codecs="mp4v.20.9, mp4a.E1" (MPEG-4 Visual Simple Profile Level 0 plus 13K voice) Content-Type: video/mp4; codecs="avc1.640028" (H.264/AVC video, High Profile, Level 40, e.g. DVB 720p 50Hz HDTV) Content-Type: video/mp4; codecs="svc1.56401E, avc1.4D401E" (SVC video, Scalable High Profile, Level 30, with a Main Profile AVC base layer, e.g. DVB 25 Hz SDTV) Content-Type: video/mp4; codecs="mvc1.800030, avc1.640030" (MVC video, Stereo High Profile, Level 42, with a High Profile base layer, e.g. as adopted in Blu-ray)
Note: OTI value 20 ("0x20" in [MP4RA]) says "Includes associated Amendment(s) and Corrigendum(a). The actual object types are defined in [MP4V] and are conveyed in the DecoderSpecificInfo as specified in [MP4V], Annex K." (references adjusted).
It is sometimes helpful to provide additional details for a media element (e.g., the number of X and Y pixels, the color depth, etc.). These details are sometimes called "media features" or "media characteristics".
When such additional features are included, the content-features [RFC2912] header provides a handy way to do so.
Just as some codecs have a variety of profiles (subsets of their functionality within which a bitstream can be coded), so also some media files can be profiled, and be associated with one or more profile identifiers of the profiles to which they conform. These profiles can indicate features of the file format itself, which codecs may be present, the profiles of those codecs, and so on. It can be advantageous to a receiving system to know the overall file profile(s) of a file; indeed, under these circumstances it may not be necessary to know the codecs themselves if they are implied by the profile.
This section adds a parameter to allow unambiguous specification of the profiles to which a file claims conformance. This parameter is optional in all current types to which it is added.
This parameter applies to:
This includes the media types:
Parameter name: profiles
Parameter value: A single value, or a comma-separated list of values identifying the profiles(s) to which the file claims conformance.
The name space is determined by the MIME type.
Note that, per [RFC2045], some characters (including the comma used to separate multiple values) require that the entire parameter value be enclosed in quotes.
An element MAY include an octet that must be encoded in order to comply with [RFC2045]. In this case, [RFC2231] is used: an asterisk ("*") is placed at the end of the parameter name (becoming "profiles*" instead of "profiles"), the parameter value usually starts with two single quote ("'") characters(indicating that neither character set nor language is specified), and each octet that requires encoding is represented as a percent sign ("%") followed by two hexadecimal digits. Note that, when the [RFC2231] form is used, the percent sign, asterisk, and single quote characters have special meaning and so must themselves be encoded.
Examples of Generic Syntax: profiles="isom,mp41,qvXt" profiles*="''%25%20xz, gork"
The 'profiles' parameter is an optional parameter that indicates one or more profiles to which the file claims conformance. Like the 'codecs' parameter described above, it may occur as either 'profiles' or 'profiles*', with the same encoding rules. The value is, as for the codecs parameter, a comma-separated list of profile identifiers.
For any file format carrying a brand registered at the MP4 Registration Authority [MP4RA], notably files based on the ISO Base Media File Format ISO/IEC 14496-12 [ISO14496-12] and QuickTime movie files, the profiles parameter MUST list exactly the major brand, followed by the compatible-brands, as listed in the filetype box ('ftyp') or segment-type box ('styp'). The major-brand MUST be first, and MAY be removed from the compatible-brands list. (The file format requires that it be repeated in the compatible brands, but this requirement is relaxed here for compactness.)
An example might be profiles="mp41,isom,qvXt", indicating that MPEG-4 version 1 is the major brand and preferred use, that the file is compatible with the version of the base file format identified by 'isom', and that it is also compatible with the specification/profile 'qvXt' (whatever that may be).
profil := pro-simple / pro-fancy pro-simple := "profiles" "=" unencodedv ; Within a codecs parameter value, "." is reserved ; as a hierarchy delimiter pro-fancy := "profiles*" ":=" encodedv
The IANA has added "codecs" and "profiles" as optional parameters to the media types as listed in Sections 3 and 4, with a reference to this document.
The MPEG4 Registration Authority can be consulted for the most up-to-date registration of sub-parameters for the codecs type, for specific codecs.
The codecs parameter itself does not alter the security considerations of any of the media types with which it is used. Each audio and video media type has its own set of security considerations that continue to apply, regardless of the use of the codecs parameter.
An incorrect codecs parameter might cause media content to be received by a device that is not capable of rendering it, or might cause media content to not be sent to a device that is capable of
receiving it. An incorrect codecs parameter is therefore capable of some types of denial-of-service attacks. However, this is most likely to arise by accident, as an attacker capable of altering media data in transit could cause more harm by altering the media format itself, or even the content type header, rather than just the codecs parameter of the content type header.
Harinath Garudadri provided a great deal of help, which is very much appreciated. Mary Barnes and Bruce Lilly provided detailed and helpful comments. Reviews and comments by Sam Hartman, Russ Housley, and Bert Wijnen were much appreciated. Chris Newman carefully reviewed and improved the BNF.
Christian Timmerer helped with the MPEG-21 material, and Thomas Schierl and Yago Sanchez helped with SVC and MVC.
[RFC3625] | Gellens, R. and H. Garudadri, "The QCP File Format and Media Types for Speech Data", RFC 3625, September 2003. |
[3GPP2-Formats] | Third Generation Partnership Project 2, "3GPP2 File Formats for Multimedia Service", . |
[MP41] | Information technology--Coding of audio-visual objects--Part 1: Systems", ISO/IEC 14496-1:2010, . | , "
[MP4A] | Information technology--Coding of audio-visual objects--3: Audio", ISO/IEC 14496-3:2009, . | , "
[MP4V] | Information technology--Coding of audio-visual objects--Part 2: Visual", ISO/IEC 14496-2:2004, . | , "