A/V Transport Payloads Workgroup T. Edwards
Internet-Draft FOX
Intended status: Informational January 27, 2015
Expires: July 31, 2015

RTP Payload for SMPTE ST 291 Ancillary Data
draft-edwards-payload-rtp-ancillary-01

Abstract

This memo describes an RTP Payload format for SMPTE Ancillary data, as defined by SMPTE ST 291-1. SMPTE Ancillary data is generally used along with professional video formats to carry a range of ancillary data types, including time code, KLV metadata, Closed Captioning, and the Active Format Description (AFD).

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on July 31, 2015.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


1. Introduction

This memo describes an RTP Payload format for Society of Motion Picture and Television Engineers (SMPTE) Ancillary data (ANC), as defined by SMPTE ST 291-1 [ST291]. ANC can carry a range of data types, including time code, KLV metadata, Closed Captioning, and the Active Format Description (AFD).

ANC is generally associated with the carriage of metadata within the bit stream multiplex of a Serial Digital Interface (SDI) such as SMPTE ST 259 [ST259], the standard definition (SD) Serial Digital Interface (with ANC data inserted as per SMPTE ST 125 [ST125]), or SMPTE ST 292-1 [ST292], the 1.5 Gb/s Serial Digital Interface for high definition (HD) television applications.

ANC data packet payload definitions for a specific application are specified by a SMPTE Standard, Recommended Practice, or Registered Disclosure Document, or by a document generated by another organization, a company, or an individual (an Entity). When a payload format is registered with SMPTE, an application document describing the payload format is required, and the registered ancillary data packet is identified by a registered data identification word.

This RTP payload supports ANC data packets regardless of whether they originate from an SD or HD interface, whether they come from the vertical ancillary space (VANC) or the horizontal ancillary space (HANC), and whether they are located in the luma (Y) or color-difference (C) channel. Sufficient information is provided to enable the ANC packets at the output of the decoder to be restored to their original locations in the serial digital video signal raster, if that is desired. This payload can be used by itself or along with a range of RTP video formats. In particular, it has been designed so that it can be used along with RFC 4175 [RFC4175], "RTP Payload Format for Uncompressed Video", or RFC 5371 [RFC5371], "RTP Payload Format for JPEG 2000 Video Streams".

The data model in this document for the ANC data RTP payload is based on the data model of SMPTE ST 2038 [ST2038], which standardizes the carriage of ANC data packets in an MPEG-2 Transport Stream.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. RTP Payload Format for SMPTE ST 291 Ancillary Data

The format of an RTP packet containing SMPTE ST 291 Ancillary Data is shown below:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X| CC    |M|    PT       |        sequence number        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Extended Sequence Number    |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ANC_Count     |C|   Line_Number       |   Horizontal_Offset   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        DID        |        SDID       |   Data_Count      | R |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        User_Data_Words...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Checksum_Word   |octet_align|    (next ANC data packet)...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 1: SMPTE Ancillary Data RTP Packet Format

RTP packet header fields SHALL be interpreted as per RFC 3550 [RFC3550], with the following specifics:

Timestamp: 32 bits

The timestamp field is interpreted in a similar fashion to RFC 4175 [RFC4175]:
For progressive scan video, the timestamp SHALL denote the sampling instant of the frame to which the ancillary data in the RTP packet belongs. Packets MUST NOT include ANC data from multiple frames, and all packets with ANC data belonging to the same frame MUST have the same timestamp.
For interlaced video, the timestamp SHALL denote the sampling instant of the field to which the ancillary data in the RTP packet belongs. Packets MUST NOT include ANC data from multiple fields, and all packets belonging to the same field MUST have the same timestamp.
A 90-kHz timestamp SHOULD be used in both cases. If the sampling instant does not correspond to an integer value of the clock, the value SHALL be truncated to the next lowest integer, with no ambiguity.
Marker bit (M): 1 bit

The marker bit set to "1" SHALL indicate the last RTP packet containing ANC data for a frame (for progressive scan video) or the last RTP packet containing ANC data for a field (for interlaced video).
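As an illustration of the timestamp rule above, the following is a minimal Python sketch, not a normative procedure. It assumes the sender counts frames (or fields) from the start of the stream and knows the exact rate as a rational number. The names rtp_timestamp, unit_index, units_per_second, and initial_offset are hypothetical and do not appear in this memo.

```python
import math
from fractions import Fraction

RTP_CLOCK_HZ = 90_000  # the 90-kHz clock recommended by the text above


def rtp_timestamp(unit_index, units_per_second, initial_offset=0):
    """Timestamp for frame (progressive) or field (interlaced) number unit_index.

    All parameter names are illustrative assumptions. units_per_second should
    be an exact rational rate such as Fraction(30000, 1001).
    """
    exact_ticks = Fraction(unit_index * RTP_CLOCK_HZ) / units_per_second
    # Truncate to the next lowest integer, then wrap to the 32-bit field.
    return (initial_offset + math.floor(exact_ticks)) % 2**32
```

At 24000/1001 frames per second each frame lasts 3753.75 ticks, so frame 1 maps to 3753 and frame 2 to 7507. Every RTP packet carrying ANC data for a given frame would reuse that frame's value.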

2.1. Payload Header Definitions

The ANC RTP payload header fields are defined as:

Extended Sequence Number: 16 bits

The high order bits of the extended 32-bit sequence number, in network byte order. This is the same as the Extended Sequence Number field in RFC 4175 [RFC4175].
Length: 16 bits

Number of octets of the ANC RTP payload, beginning with the "C" bit of the first ANC data packet.
ANC_Count: 8 bits

This field is the count of the total number of ANC data packets carried in the RTP payload. A single ANC RTP packet payload SHALL NOT carry more than 255 ANC data packets.
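The three payload header fields above occupy five octets. The sketch below shows one way they might be serialized and read back in Python. It assumes the sender tracks a 32-bit extended sequence number whose low 16 bits go into the RTP header; the function names are hypothetical.

```python
import struct


def pack_payload_header(extended_seq32, length, anc_count):
    """Serialize Extended Sequence Number, Length, and ANC_Count (5 octets)."""
    if not 0 <= anc_count <= 255:
        raise ValueError("an RTP payload carries at most 255 ANC data packets")
    high16 = (extended_seq32 >> 16) & 0xFFFF  # low 16 bits live in the RTP header
    return struct.pack("!HHB", high16, length, anc_count)  # network byte order


def unpack_payload_header(data):
    """Return (extended_seq_high, length, anc_count) from the first 5 octets."""
    return struct.unpack("!HHB", data[:5])


def full_sequence(rtp_seq16, extended_seq_high):
    """Rebuild the 32-bit sequence number on the receiving side."""
    return (extended_seq_high << 16) | rtp_seq16
```

For example, an extended sequence number of 0x00012345 places 0x2345 in the RTP header and 0x0001 in the Extended Sequence Number field.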

For each ANC data packet in the payload, the following header fields MUST be present:

C: 1 bit

For HD signals, this flag, when set to "1", indicates that the ANC data corresponds to the color-difference channel (C). When set to "0", this flag indicates that the ANC data corresponds to the luma (Y) channel. For SD signals, this flag SHALL be set to "0".
Line_Number: 11 bits

This field contains the line number (as defined in ITU-R BT.1700 [BT1700] for SD video or ITU-R BT.1120 [BT1120] for HD video) that corresponds to the location of the ANC data packet. The lines that are available to convey ANC data are as defined in the applicable sample structure specification (e.g., SMPTE ST 274 [ST274], SMPTE ST 296 [ST296], ITU-R BT.656 [BT656]) and may be further restricted per SMPTE RP 168 [RP168]. A value of 0x7FF (all bits in the field are "1") SHALL indicate that the ANC data is carried without any specific location within the frame.
Horizontal_Offset: 12 bits

This field defines the location of the ANC packet relative to the start of active video (SAV). A value of 0 means that the Ancillary Data Flag (ADF) of the ANC packet begins immediately following SAV. For HD, this shall be in units of luma sample numbers as specified by the defining document of the particular image (e.g., SMPTE ST 274 [ST274] for 1920 x 1080 active images, or SMPTE ST 296 [ST296] for 1280 x 720 progressive active images). For SD, this is in units of (27 MHz) multiplexed word numbers, as specified in SMPTE ST 125 [ST125]. Note that HANC space in the digital blanking area will generally have higher luma sample numbers than any samples in the active digital line.
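The C, Line_Number, and Horizontal_Offset fields together fill exactly 24 bits (1 + 11 + 12), so each ANC data packet begins with three octets of location information. A minimal Python sketch of packing and unpacking these octets follows; the function and constant names are hypothetical.

```python
NO_SPECIFIC_LINE = 0x7FF  # Line_Number value meaning "no specific location"


def pack_location(c, line_number, horizontal_offset):
    """Pack C (1 bit), Line_Number (11 bits), Horizontal_Offset (12 bits)."""
    if c not in (0, 1) or not 0 <= line_number <= 0x7FF \
            or not 0 <= horizontal_offset <= 0xFFF:
        raise ValueError("field out of range")
    word = (c << 23) | (line_number << 12) | horizontal_offset
    return word.to_bytes(3, "big")


def unpack_location(data):
    """Return (c, line_number, horizontal_offset) from three octets."""
    word = int.from_bytes(data[:3], "big")
    return word >> 23, (word >> 12) & 0x7FF, word & 0xFFF
```

A luma-channel packet on line 9 that starts immediately after SAV would therefore be encoded as the octets 0x00 0x90 0x00.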

The fields DID, SDID, Data_Count, User_Data_Words, and Checksum_Word represent the 10-bit words carried in the ANC data packet, as per SMPTE ST 291 [ST291]:

DID: 10 bits

Data Identification Word
SDID: 10 bits

Secondary Data Identification Word. Used only for a "Type 2" ANC data packet. Note that in a "Type 1" ANC data packet, this word will actually carry the Data Block Number (DBN).
Data_Count: 10 bits

The lower 8 bits of Data_Count, corresponding to bits b7 (MSB) through b0 (LSB) of the 10-bit Data_Count word, contain the actual count of 10-bit words in User_Data_Words. Bit b8 is the even parity for bits b7 through b0, and bit b9 is the inverse (logical NOT) of bit b8.
R: 2 reserved bits

R is a field of two reserved bits that MUST be set to zero.
User_Data_Words: integer number of 10-bit words

User_Data_Words (UDW) are used to convey information of a type as identified by the DID word or the DID and SDID words. The number of 10-bit words in the UDW is defined by the Data_Count field.
Checksum_Word: 10 bits

The Checksum_Word can be used to determine the validity of the ANC data packet from the DID word through the UDW. The lower 9 bits of Checksum_Word, corresponding to bits b8 (MSB) through b0 (LSB) of the 10-bit Checksum_Word, contain the actual checksum value. Bit b9 is the inverse (logical NOT) of bit b8. The checksum value is equal to the nine least significant bits of the sum of the nine least significant bits of the DID word, the SDID word, the Data_Count word, and all User_Data_Words in the ANC data packet. The checksum is initialized to zero before calculation, and any end carry resulting from the checksum calculation is ignored.
octet_align: 0-7 bits as needed to complete octet

The octet_align field contains as many "0" bits as are needed to complete the last octet of an ANC packet's data in the RTP payload. This ensures that the next ANC packet's data in the RTP payload begins octet-aligned, even though ANC packets are made up of 10-bit words. If an ANC data packet in the RTP payload already ends on an octet boundary, no octet alignment bits are added.
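The word-level rules above (Data_Count parity, the 9-bit checksum, and octet alignment) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the caller supplies DID, SDID, and the user data words as complete 10-bit values, and all function names are hypothetical.

```python
def parity_word(value8):
    """Build a 10-bit word: b7..b0 = value, b8 = even parity, b9 = NOT b8."""
    b8 = bin(value8 & 0xFF).count("1") & 1
    return ((b8 ^ 1) << 9) | (b8 << 8) | (value8 & 0xFF)


def checksum_word(words):
    """Sum the nine LSBs of each word, keep nine bits, set b9 = NOT b8."""
    total = sum(w & 0x1FF for w in words) & 0x1FF  # end carry is discarded
    b8 = total >> 8
    return ((b8 ^ 1) << 9) | total


def pack_anc_body(did, sdid, udw):
    """Pack DID through octet_align for one ANC data packet."""
    if len(udw) > 255:
        raise ValueError("Data_Count is limited to 255 words")
    data_count = parity_word(len(udw))
    fields = [(did, 10), (sdid, 10), (data_count, 10), (0, 2)]  # (0, 2) is R
    fields += [(w, 10) for w in udw]
    fields.append((checksum_word([did, sdid, data_count, *udw]), 10))
    acc = nbits = 0
    for value, width in fields:
        acc = (acc << width) | (value & ((1 << width) - 1))
        nbits += width
    pad = -nbits % 8  # octet_align: 0 to 7 zero bits
    return (acc << pad).to_bytes((nbits + pad) // 8, "big")
```

With no user data words, the body is 42 bits long and receives 6 alignment bits. With three user data words, it is 72 bits long and needs no alignment bits.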

3. Payload Format Parameters

This RTP payload format is identified using the video/smpte291 media type, which is registered in accordance with RFC 4855 [RFC4855], and using the template of RFC 4288 [RFC4288].

Note that the Media Type Definition is in the "video" tree due to the expected use of SMPTE ST 291 Ancillary Data with video formats.

3.1. Media Type Definition

Type name: video

Subtype name: smpte291

Required parameters:

Optional parameters:

Encoding considerations: This media type is framed and binary; see Section 4.8 of RFC 4288 [RFC4288].

Security considerations: See Section 5 of [this RFC]

Interoperability considerations: Data items in smpte291 can be very diverse. Receivers might only be capable of interpreting a subset of the possible data items. Some implementations may care about the location of the ANC data packets in the SDI raster, but other implementations may not care.

Published specification: [this RFC]

Applications that use this media type: Devices that stream real-time professional video, especially those that must interoperate with legacy serial digital interfaces (SDI).

Additional Information: none

Person & email address to contact for further information: T. Edwards <thomas.edwards@fox.com>, IETF Payload Working Group <payload@ietf.org>

Intended usage: COMMON

Restrictions on usage: This media type depends on RTP framing, and hence is only defined for transfer via RTP [RFC3550]. Transport within other framing protocols is not defined at this time.

Author: T. Edwards <thomas.edwards@fox.com>

Change controller: IETF Payload working group delegated from the IESG.

3.2. Mapping to SDP

The mapping of the above defined payload format media type and its parameters SHALL be done according to Section 3 of RFC 4855 [RFC4855].

A sample SDP mapping for ancillary data is as follows:

m=video 30000 RTP/AVP 112
a=rtpmap:112 smpte291/90000
a=fmtp:112 DID=0x61; SDID=0x02;

In this example, dynamic payload type 112 is used for ancillary data. The 90 kHz RTP timestamp rate is specified in the "a=rtpmap" line after the subtype. In the "a=fmtp" line, DID 0x61 and SDID 0x02 are specified (these are registered by SMPTE to EIA 608 Closed Caption Data).
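A receiver might extract the DID and SDID values from the sample "a=fmtp" line along the following lines. This Python sketch assumes the parameter syntax shown in the sample (semicolon-separated name=value pairs with hexadecimal values); this memo does not define that grammar formally, and the function name parse_anc_fmtp is hypothetical.

```python
def parse_anc_fmtp(line):
    """Return (payload_type, {name: value}) for an a=fmtp line.

    Assumes hexadecimal parameter values such as 0x61, as in the sample SDP.
    """
    prefix, _, params = line.strip().partition(" ")
    if not prefix.startswith("a=fmtp:"):
        raise ValueError("not an a=fmtp line")
    payload_type = int(prefix[len("a=fmtp:"):])
    values = {}
    for item in params.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate the trailing ";" seen in the sample
        name, _, raw = item.partition("=")
        values[name.strip()] = int(raw.strip(), 16)
    return payload_type, values
```

Applied to the sample line, this yields payload type 112 with DID 0x61 and SDID 0x02.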

3.3. Offer/Answer Model and Declarative Considerations

When offering SMPTE ST 291 Ancillary data over RTP using the Session Description Protocol (SDP) in an Offer/Answer model [RFC3264] or in a declarative manner (e.g., SDP in the Real-Time Streaming Protocol (RTSP) [RFC2326] or the Session Announcement Protocol (SAP) [RFC2974]), the offerer could provide a list of available streams with specific DID and SDID values, and the answerer could specify which of those streams it would like to accept.

4. IANA Considerations

One media type (video/smpte291) has been defined and needs registration in the media types registry. See Section 3.1.

5. Security Considerations

RTP packets using the payload format defined in this specification are subject to the security considerations discussed in the RTP specification [RFC3550] and any applicable RTP profile, e.g., AVP [RFC3551].

To avoid potential buffer overflow attacks, receivers should take care to validate that the ANC packets in the RTP payload are of the appropriate length (using the Data_Count field) for the ANC data type specified by DID and SDID. The Checksum_Word should also be checked against the ANC data packet to ensure that its data has not been damaged in transit.
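The checks described above might look like the following Python sketch for a single ANC data packet. It is illustrative only: the function names and the optional expected_udw_count parameter are hypothetical, and the table of expected lengths per DID and SDID is left to the application.

```python
def read_bits(data, bit_pos, width):
    """Read width bits at bit offset bit_pos, refusing to read past the end."""
    end = bit_pos + width
    if end > len(data) * 8:
        raise ValueError("ANC data packet runs past the end of the payload")
    return (int.from_bytes(data, "big") >> (len(data) * 8 - end)) & ((1 << width) - 1)


def parse_anc_packet(data, offset=0, expected_udw_count=None):
    """Parse and validate one ANC data packet starting at octet offset.

    Returns (fields, next_offset); raises ValueError on any failed check.
    """
    pos = offset * 8

    def take(width):
        nonlocal pos
        value = read_bits(data, pos, width)
        pos += width
        return value

    c, line, h_off = take(1), take(11), take(12)
    did, sdid, data_count = take(10), take(10), take(10)
    take(2)  # reserved bits R
    count = data_count & 0xFF
    parity = bin(count).count("1") & 1
    if data_count >> 8 != (((parity ^ 1) << 1) | parity):
        raise ValueError("Data_Count parity bits are inconsistent")
    if expected_udw_count is not None and count != expected_udw_count:
        raise ValueError("unexpected Data_Count for this DID/SDID")
    udw = [take(10) for _ in range(count)]  # bounded by read_bits
    checksum = take(10)
    total = sum(w & 0x1FF for w in (did, sdid, data_count, *udw)) & 0x1FF
    if checksum != ((((total >> 8) ^ 1) << 9) | total):
        raise ValueError("Checksum_Word mismatch")
    fields = {"c": c, "line": line, "h_off": h_off,
              "did": did, "sdid": sdid, "udw": udw}
    return fields, (pos + 7) // 8  # skip octet_align bits
```

Because every read is bounds-checked against the payload length, a Data_Count that claims more words than the payload holds is rejected rather than read past the buffer.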

Some receivers will simply move the ANC data packet bits from the RTP payload into a serial digital interface (SDI). It may still be a good idea for these "re-embedders" to perform the above-mentioned validity tests to prevent downstream SDI systems from becoming confused by bad ANC packets, which could be used for a denial-of-service attack.

"Re-embedders" into SDI should also double-check that the Line_Number and Horizontal_Offset lead to the ANC data packet being inserted into a legal area to carry ancillary data in the SDI video bit stream of the output video format.

6. References

6.1. Normative References

[BT1120] ITU-R, "BT.1120-8, Digital Interfaces for HDTV Studio Signals", January 2012.
[BT1700] ITU-R, "BT.1700, Characteristics of Composite Video Signals for Conventional Analogue Television Systems", February 2005.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.
[RFC4288] Freed, N. and J. Klensin, "Media Type Specifications and Registration Procedures", RFC 4288, December 2005.
[RFC4855] Casner, S., "Media Type Registration of RTP Payload Formats", RFC 4855, February 2007.
[ST291] SMPTE, "ST 291-1:2011, Ancillary Data Packet and Space Formatting", 2011.

6.2. Informative References

[BT656] ITU-R, "BT.656-5, Interfaces for Digital Component Video Signals in 525-Line and 625-Line Television Systems Operating at the 4:2:2 Level of Recommendation ITU-R BT.601", December 2007.
[RFC2326] Schulzrinne, H., Rao, A. and R. Lanphier, "Real Time Streaming Protocol (RTSP)", RFC 2326, April 1998.
[RFC2974] Handley, M., Perkins, C. and E. Whelan, "Session Announcement Protocol", RFC 2974, October 2000.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.
[RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control", STD 65, RFC 3551, July 2003.
[RFC4175] Gharai, L. and C. Perkins, "RTP Payload Format for Uncompressed Video", RFC 4175, September 2005.
[RFC5371] Futemma, S., Itakura, E. and A. Leung, "RTP Payload Format for JPEG 2000 Video Streams", RFC 5371, October 2008.
[RP168] SMPTE, "RP 168:2009, Definition of Vertical Interval Switching Point for Synchronous Video Switching", 2009.
[ST125] SMPTE, "ST 125:2013, SDTV Component Video Signal Coding 4:4:4 and 4:2:2 for 13.5 MHz and 18 MHz Systems", 2013.
[ST2038] SMPTE, "ST 2038:2008, Carriage of Ancillary Data Packets in an MPEG-2 Transport Stream", 2008.
[ST259] SMPTE, "ST 259:2008, SDTV Digital Signal/Data - Serial Digital Interface", 2008.
[ST274] SMPTE, "ST 274:2008, 1920 x 1080 Image Sample Structure, Digital Representation and Digital Timing Reference Sequences for Multiple Picture Rates", 2008.
[ST292] SMPTE, "ST 292-1:2012, 1.5 Gb/s Signal/Data Serial Interface", 2012.
[ST296] SMPTE, "ST 296:2012, 1280 x 720 Progressive Image 4:2:2 and 4:4:4 Sample Structure - Analog and Digital Representation and Analog Interface", 2012.

Author's Address

Thomas G. Edwards
FOX
10201 W. Pico Blvd.
Los Angeles, CA 90035
USA

Phone: +1 310 369 6696
EMail: thomas.edwards@fox.com