Network Working Group M. Zanaty
Internet-Draft S. Nandakumar
Intended status: Informational Cisco
Expires: 5 September 2024 P. Thatcher
Microsoft
4 March 2024
Low Overhead Media Container
draft-mzanaty-moq-loc-03
Abstract
This specification describes a media container format for encoded and
encrypted audio and video media data to be used primarily for
interactive Media over QUIC transport (MOQ), with the goal of it
being a low-overhead format. It also defines the LOC Streaming
Format for the MOQ Common Catalog format for publishers to announce
and describe their LOC tracks and for subscribers to consume them.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 5 September 2024.
Copyright Notice
Copyright (c) 2024 IETF Trust and the persons identified as the
document authors. All rights reserved.
Zanaty, et al. Expires 5 September 2024 [Page 1]
Internet-Draft media container March 2024
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction
  1.1. Requirements Notation and Conventions
  1.2. Terminology
2. Payload Format
  2.1. MOQ Object Mapping
  2.2. LOC Header Metadata
    2.2.1. Common Header Data
    2.2.2. Video Header Data
    2.2.3. Audio Header Data
    2.2.4. Header Data Registration
3. Catalog
  3.1. Catalog Fields
    3.1.1. Optional Extensions for Video
    3.1.2. Selection Parameters for Video
    3.1.3. Optional Extensions for Audio
    3.1.4. Selection Parameters for Audio
  3.2. Catalog Examples
4. Payload Encryption
5. Container Serialization
6. Security Considerations
7. IANA Considerations
8. Normative References
Appendix A. Acknowledgements
Authors' Addresses
1. Introduction
This specification describes a low-overhead media container format
for encoded and encrypted audio and video media data to be used
primarily for interactive Media over QUIC transport (MOQT)
[MoQTransport]. It also defines the LOC Streaming Format for the MOQ
Common Catalog format [MoQCatalog] for publishers to announce and
describe their LOC tracks and for subscribers to consume them.
"Low-overhead" refers to minimal extra encapsulation as well as
minimal application overhead when interfacing with WebCodecs
[WebCodecs].
The container format description is specified for all audio and video
codecs defined in the WebCodecs Codec Registry
[WEBCODECS-CODEC-REGISTRY]. The audio and video payload bitstream is
identical to the "internal data" inside an EncodedAudioChunk and
EncodedVideoChunk, respectively, specified in the registry.
In addition to the media payloads, critical metadata is also
specified for audio and video payloads. (Note: Align with MOQT
terminology of either "metadata" or "header".)
A primary motivation is to align with media formats used in WebCodecs
to minimize extra encapsulation and application overhead when
interfacing with WebCodecs. Other container formats like CMAF or RTP
would require more extensive application overhead in format
conversions, as well as larger encapsulation overhead, which may
burden some use cases like low bitrate audio scenarios.
This specification can also be used by applications outside the
context of WebCodecs or a web browser. While the media payloads are
defined by referring to the "internal data" of an EncodedAudioChunk
or EncodedVideoChunk in the WebCodecs Codec Registry, this "internal
data" is the elementary bitstream format of codecs without any
encapsulation. Referring to the WebCodecs Codec Registry avoids
duplicating it in an identical IANA registry.
* Section 2 defines the core media payload formats.
* Section 2.2 defines the metadata associated with audio and video
payloads.
* Section 3 describes the LOC Streaming Format bindings to the MoQ
Common Catalog format including examples.
1.1. Requirements Notation and Conventions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
1.2. Terminology
TODO
2. Payload Format
The WebCodecs Codec Registry defines the contents of an
EncodedAudioChunk and EncodedVideoChunk for the audio and video codec
formats in the registry. The "internal data" in these chunks is used
directly in this specification as the "LOC Payload" bitstream. This
"internal data" is the elementary bitstream format of each codec
without any encapsulation.
For video formats with multiple bitstream formats in the WebCodecs
Registry, such as H.264/AVC or H.265/HEVC, the LOC Payload uses the
"canonical" format ("avcc" or "hevc", not "annexB") with the
following additions:
* Parameter sets are sent in the bitstream before key frames.
* 4-byte lengths are sent before each NAL Unit.
* No start codes or emulation prevention are used in the bitstream.
* No additional codec configuration information ("extradata") is
needed.
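As an illustration of the length-prefixed layout described above, the
following Python sketch splits an H.264/AVC or H.265/HEVC LOC Payload
into NAL Units. It assumes that each 4-byte length is a big-endian
unsigned integer, as is conventional for the "avcc" format. The
function name split_nal_units is hypothetical and not defined by this
specification.

```python
import struct


def split_nal_units(payload: bytes) -> list[bytes]:
    """Split a length-prefixed bitstream into individual NAL Units.

    Assumption: every NAL Unit is preceded by a 4-byte big-endian length.
    """
    units = []
    offset = 0
    while offset < len(payload):
        if offset + 4 > len(payload):
            raise ValueError("truncated NAL Unit length")
        (length,) = struct.unpack_from(">I", payload, offset)
        offset += 4
        if offset + length > len(payload):
            raise ValueError("truncated NAL Unit")
        units.append(payload[offset:offset + length])
        offset += length
    return units
```

A receiver could pass each returned unit to its decoder, or inspect
the NAL Unit type in the first byte(s) to locate parameter sets.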
2.1. MOQ Object Mapping
An application object when transported as a [MoQTransport] object is
composed of a MOQ Object Header and its Payload. Media objects
encoded using the container format defined in this specification
populate the MOQ Object Payload with a LOC Header and LOC Payload as
shown below.
The LOC Payload is the "internal data" of an EncodedAudioChunk or
EncodedVideoChunk.
+--------------+----------+-----------+
|  MOQ Object  |   LOC    |    LOC    |
|    Header    |  Header  |  Payload  |
+--------------+----------+-----------+
               <---------------------->
                  MOQ Object Payload

     MOQ Object with LOC Container
2.2. LOC Header Metadata
The LOC Header carries metadata for the corresponding LOC Payload.
This metadata provides necessary information for intermediaries such
as media switches to perform their media switching decisions when the
payload is inaccessible due to encryption.
Section 2.2.4 provides a framework for registering new LOC Header
fields that aren't defined by this specification.
2.2.1. Common Header Data
The following metadata MUST be captured for each media frame.
Sequence Number: A variable-length integer that is incremented for
each encoded media frame. The Object Sequence from the MOQ Object
Header may be used instead in cases where a MOQ Object is exactly one
frame.
Capture Timestamp in Microseconds: The wall-clock capture time of the
encoded media frame, in microseconds, as a 64-bit unsigned integer.
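The Capture Timestamp could be produced and packed as in the Python
sketch below. Two details are assumptions, since this document does
not state them: the timestamp is taken relative to the Unix epoch, and
the 64-bit value is packed in big-endian (network) byte order. The
function names are hypothetical.

```python
import struct
import time


def capture_timestamp_us() -> int:
    """Wall-clock time in microseconds (assumed relative to the Unix epoch)."""
    return time.time_ns() // 1000


def pack_timestamp(timestamp_us: int) -> bytes:
    """Pack the timestamp as a 64-bit unsigned integer (assumed big-endian)."""
    return struct.pack(">Q", timestamp_us)


def unpack_timestamp(data: bytes) -> int:
    """Inverse of pack_timestamp()."""
    (timestamp_us,) = struct.unpack(">Q", data)
    return timestamp_us
```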
2.2.2. Video Header Data
The video header carries flags for frames which are independent,
discardable, or base layer sync points, as well as temporal and
spatial layer identification [Framemarking].
2.2.3. Audio Header Data
Audio Level: Captures the magnitude of the audio level of the
corresponding audio frame encoded in 7 bits as defined in section 3
of [RFC6464].
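[RFC6464] expresses the audio level in -dBov, with values from 0 to
127 representing 0 to -127 dBov. The Python sketch below shows one way
to derive that value from a frame of PCM samples. It assumes 16-bit
signed samples with an overload point of 32768; the function name and
the choice of RMS over the whole frame are illustrative and not
mandated by this document.

```python
import math


def audio_level(samples: list[int], overload: float = 32768.0) -> int:
    """Return the audio level of a frame in -dBov, clamped to 0..127 (7 bits).

    Assumption: samples are 16-bit signed PCM and `overload` is full scale.
    """
    if not samples:
        return 127
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return 127  # digital silence maps to the lowest level
    dbov = 20.0 * math.log10(rms / overload)
    return min(127, max(0, round(-dbov)))
```

A larger value therefore means a quieter frame: 0 is the loudest
level and 127 is silence.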
2.2.4. Header Data Registration
This section details the procedures to register header data fields
that might be useful for a particular class of media applications.
Registering a given metadata field requires the following attributes
to be specified.
Shortname: Short name for the metadata. (Not sent on the wire.)
Description: Detailed description for the metadata. (Not sent on the
wire.)
ID: Identifier assigned by the registry. (varint)
Length: Length of metadata value in bytes. (varint)
Value: Value of metadata. (length bytes)
New metadata in the LOC Header is registered under the "Specification
Required" policy.
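The ID, Length, and Value attributes above suggest a simple
type-length-value layout on the wire. The Python sketch below encodes
one such field. It assumes that "varint" means the QUIC
variable-length integer encoding of RFC 9000, Section 16, as used by
MOQT. The field ID 0x21 in the usage note is hypothetical and not
assigned by any registry.

```python
def encode_varint(value: int) -> bytes:
    """Encode an integer as a QUIC variable-length integer (RFC 9000)."""
    if value < 0:
        raise ValueError("varint must be non-negative")
    if value < 1 << 6:
        return value.to_bytes(1, "big")
    if value < 1 << 14:
        return (value | 0x4000).to_bytes(2, "big")
    if value < 1 << 30:
        return (value | 0x8000_0000).to_bytes(4, "big")
    if value < 1 << 62:
        return (value | 0xC000_0000_0000_0000).to_bytes(8, "big")
    raise ValueError("varint out of range")


def encode_header_field(field_id: int, value: bytes) -> bytes:
    """Serialize one LOC Header field as ID (varint), Length (varint), Value."""
    return encode_varint(field_id) + encode_varint(len(value)) + value
```

For example, encode_header_field(0x21, b"\x01\x02") yields the four
bytes 21 02 01 02 (hexadecimal).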
3. Catalog
A catalog is a MOQT Object that provides information about tracks
from a given publisher. A catalog is used by subscribers for
consuming tracks and by publishers to advertise and describe the
tracks. The content of a catalog is opaque to the relays and may be
end to end encrypted. A catalog describes the details of tracks such
as Track IDs and corresponding media configuration details, for
example, audio/video codec details.
The LOC Streaming Format uses the MoQ Common Catalog Format
[MoQCatalog] to describe the content being produced by a publisher.
Per Sect 5.1 of [MoQCatalog], this document registers an entry in the
"MoQ Streaming Format Type" table, with the type value 2, the name
"LOC Streaming Format", and the RFC XXX.
Every LOC catalog track MUST declare a streaming format type (See
Sect 3.2.1 of [MoQCatalog]) value of 2.
Every LOC catalog track MUST declare a streaming format version (See
Sect 3.2.1 of [MoQCatalog]) value of 1, which is the version
described in this document.
Every LOC catalog track MUST declare a packaging type (See Sect 3.2.9
of [MoQCatalog]) of "loc".
The catalog track MUST have a track name of "catalog". A catalog
object MAY be independent of other catalog objects or it MAY
represent a delta update of a prior catalog object. The first
catalog object published within a new group MUST be independent. A
catalog object SHOULD be published only when the availability of
tracks changes.
Each catalog update MUST be mapped to a discrete moq-transport
object.
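A subscriber might check the three declarations required above before
consuming a catalog, as in the Python sketch below. The JSON field
names (streamingFormat, streamingFormatVersion, tracks, name,
packaging) are assumptions taken to follow [MoQCatalog] and should be
verified against that document. The sketch also assumes that the
format type and version appear at the catalog root and that each
track carries its own packaging field. The function name is
hypothetical.

```python
import json

LOC_STREAMING_FORMAT = 2          # type value registered by this document
LOC_STREAMING_FORMAT_VERSION = 1  # version described in this document
LOC_PACKAGING = "loc"


def check_loc_catalog(catalog_json: str) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    catalog = json.loads(catalog_json)
    problems = []
    if catalog.get("streamingFormat") != LOC_STREAMING_FORMAT:
        problems.append("streamingFormat must be 2")
    if catalog.get("streamingFormatVersion") != LOC_STREAMING_FORMAT_VERSION:
        problems.append("streamingFormatVersion must be 1")
    for track in catalog.get("tracks", []):
        if track.get("packaging") != LOC_PACKAGING:
            problems.append(
                f"track {track.get('name')!r}: packaging must be 'loc'"
            )
    return problems
```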
3.1. Catalog Fields
The MOQ Common Catalog defines the required base fields and optional
extensions.
3.1.1. Optional Extensions for Video
The LOC Streaming Format allows the following optional extensions for
video media.
* temporalId: Identifies the temporal layer/sub-layer encoded,
starting with 0 for the base layer, and increasing with higher
temporal fidelity.
* spatialId: Identifies the spatial and quality layer encoded,
starting with 0 for the base layer, and increasing with higher
fidelity.
* depends: Identifies track dependencies for a given track, usually
for video media with scalable layers in separate tracks.
* renderGroup: Identifies a group of time-aligned tracks which
should be rendered simultaneously.
* selectionParams: Selection parameters for media quality, fidelity,
etc.; see next section.
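To illustrate how the depends extension might be used with scalable
layers in separate tracks, the Python sketch below computes the full
set of tracks a subscriber needs in order to decode a chosen layer. It
assumes that depends holds a list of track names and that each track
entry has a name field; both are assumptions about the catalog
encoding and are not stated in this document. The function name is
hypothetical.

```python
def tracks_to_subscribe(tracks: list[dict], target: str) -> list[str]:
    """Return the target track plus all tracks it transitively depends on.

    The result is ordered base layer first, by (spatialId, temporalId).
    """
    by_name = {track["name"]: track for track in tracks}
    needed: set[str] = set()
    pending = [target]
    while pending:
        name = pending.pop()
        if name in needed:
            continue  # already visited; also guards against dependency cycles
        if name not in by_name:
            raise KeyError(f"unknown track: {name}")
        needed.add(name)
        pending.extend(by_name[name].get("depends", []))

    def layer_key(name: str):
        track = by_name[name]
        return (track.get("spatialId", 0), track.get("temporalId", 0), name)

    return sorted(needed, key=layer_key)
```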
3.1.2. Selection Parameters for Video
Each video track can have the following associated Selection
Parameters.
* codec: Codec information (including profile, level, tier, etc.),
as defined by the codec registrations listed in
[WEBCODECS-CODEC-REGISTRY].
* framerate: As defined in section 7.8 of
[WEBCODECS-CODEC-REGISTRY].
* bitrate: As defined in sections 7.7 and 7.8 of
[WEBCODECS-CODEC-REGISTRY].
* width, height: As defined in section 7.8 of
[WEBCODECS-CODEC-REGISTRY].
* displayWidth, displayHeight: As defined in section 7.7 of
[WEBCODECS-CODEC-REGISTRY].
3.1.3. Optional Extensions for Audio
The LOC Streaming Format allows the following optional extensions for
audio media.
* renderGroup: Identifies a group of time-aligned tracks which
should be rendered simultaneously.
* selectionParams: Selection parameters for media quality, fidelity,
etc.; see next section.
3.1.4. Selection Parameters for Audio
Each audio track can have the following associated Selection
Parameters.
* codec: Codec information as defined by the codec registrations
listed in [WEBCODECS-CODEC-REGISTRY].
* bitrate: As defined in sections 7.7 and 7.8 of
[WEBCODECS-CODEC-REGISTRY].
* samplerate: As defined in section 7.7 of
[WEBCODECS-CODEC-REGISTRY].
* channelConfig: As defined in section 7.7 of
[WEBCODECS-CODEC-REGISTRY].
* lang: The primary language of the track, using standard tags from
[RFC5646].
3.2. Catalog Examples
See section 3.4 of the MOQ Common Catalog [MoQCatalog].
4. Payload Encryption
When end to end encryption is supported, the encoded payload is
encrypted with symmetric keys derived from key establishment
mechanisms, such as [MOQ-MLS], and the payload itself is protected
using mechanisms defined in [SecureObjects].
5. Container Serialization
The wire encoding of the payload conforming to this specification is
a set of length-delimited values as shown below.
The Bytes field is the output of the AEAD operation that encrypts the
Payload, with the header data as the additional data input.
+---------+-------+---------+-------+
| Payload | Bytes | Payload | Bytes |
| Len (0) |  (0)  | Len (1) |  (1)  | ...
+---------+-------+---------+-------+
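A receiver could walk the length-delimited layout above as in the
Python sketch below. The encoding of Payload Len is not stated in this
section; the sketch assumes a QUIC variable-length integer (RFC 9000,
Section 16). It returns the Bytes fields as-is and does not perform
AEAD decryption. The function names are hypothetical.

```python
def decode_varint(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode a QUIC varint at `offset`; return (value, next offset)."""
    if offset >= len(data):
        raise ValueError("truncated varint")
    first = data[offset]
    length = 1 << (first >> 6)  # top two bits select 1, 2, 4, or 8 bytes
    if offset + length > len(data):
        raise ValueError("truncated varint")
    value = first & 0x3F
    for byte in data[offset + 1:offset + length]:
        value = (value << 8) | byte
    return value, offset + length


def parse_container(data: bytes) -> list[bytes]:
    """Split serialized data into its Bytes fields (still encrypted)."""
    chunks = []
    offset = 0
    while offset < len(data):
        length, offset = decode_varint(data, offset)
        if offset + length > len(data):
            raise ValueError("truncated Bytes field")
        chunks.append(data[offset:offset + length])
        offset += length
    return chunks
```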
6. Security Considerations
TODO
7. IANA Considerations
A new IANA registry for LOC Header Metadata is defined and populated
with the information in Section 2.2.4. Registration of new metadata
follows the "Specification Required" policy.
This document creates a new entry in the "MoQ Streaming Format"
Registry (see [MoQTransport] Sect 8). The type value is 0x002, the
name is "LOC Streaming Format" and the RFC is XXX.
8. Normative References
[MoQTransport]
Curley, L., Pugin, K., Nandakumar, S., Vasiliev, V., and
I. Swett, "Media over QUIC Transport", Work in Progress,
Internet-Draft, draft-ietf-moq-transport-02, 24 January
2024, <https://datatracker.ietf.org/doc/html/draft-ietf-
moq-transport-02>.
[MoQCatalog]
Nandakumar, S., Law, W., and M. Zanaty, "Common Catalog
Format for moq-transport", Work in Progress, Internet-
Draft, draft-wilaw-moq-catalogformat-02, 30 November 2023,
<https://datatracker.ietf.org/doc/html/draft-wilaw-moq-
catalogformat-02>.
[Framemarking]
Zanaty, M., Berger, E., and S. Nandakumar, "Video Frame
Marking RTP Header Extension", Work in Progress, Internet-
Draft, draft-ietf-avtext-framemarking-15, 26 July 2023,
<https://datatracker.ietf.org/doc/html/draft-ietf-avtext-
framemarking-15>.
[SecureObjects]
"Secure Objects for Media over QUIC", n.d.,
<https://suhashere.github.io/moq-secure-objects/#go.draft-
jennings-moq-secure-objects.html>.
[MOQ-MLS] "Secure Group Key Agreement with MLS over MoQ", n.d.,
<https://suhashere.github.io/moq-e2ee-mls/draft-jennings-
moq-e2ee-mls.html>.
[WebCodecs]
"WebCodecs", July 2023,
<https://www.w3.org/TR/webcodecs/>.
[WEBCODECS-CODEC-REGISTRY]
"WebCodecs Codec Registry", July 2023,
<https://www.w3.org/TR/webcodecs-codec-registry/>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/rfc/rfc2119>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.
[RFC6464] Lennox, J., Ed., Ivov, E., and E. Marocco, "A Real-time
Transport Protocol (RTP) Header Extension for Client-to-
Mixer Audio Level Indication", RFC 6464,
DOI 10.17487/RFC6464, December 2011,
<https://www.rfc-editor.org/rfc/rfc6464>.
[RFC5646] Phillips, A., Ed. and M. Davis, Ed., "Tags for Identifying
Languages", BCP 47, RFC 5646, DOI 10.17487/RFC5646,
September 2009, <https://www.rfc-editor.org/rfc/rfc5646>.
Appendix A. Acknowledgements
Thanks to Cullen Jennings for suggestions and review.
Authors' Addresses
Mo Zanaty
Cisco
Email: mzanaty@cisco.com
Suhas Nandakumar
Cisco
Email: snandaku@cisco.com
Peter Thatcher
Microsoft
Email: pthatcher@microsoft.com