Audio/Video Transport Core Maintenance A.M. Williams
Internet-Draft Audinate
Intended status: Standards Track K. Gross
Expires: September 15, 2014 AVA Networks
R. van Brandenburg
H.M. Stokking
TNO
March 14, 2014

RTP Clock Source Signalling
draft-ietf-avtcore-clksrc-10

Abstract

NTP format timestamps are used by several RTP protocols for synchronisation and statistical measurements. This memo specifies SDP signalling identifying timestamp reference clock sources and SDP signalling identifying the media clock sources in a multimedia session.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on September 15, 2014.

Copyright Notice

Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


Table of Contents

1. Introduction

RTP protocols use NTP format timestamps to facilitate multimedia session synchronisation and for providing estimates of round trip time (RTT) and other statistical parameters.

Information about media clock timing exchanged in NTP format timestamps may come from a clock which is synchronised to a global time reference, but this cannot be assumed nor is there a standardised mechanism available to indicate that timestamps are derived from a common reference clock. Therefore, RTP implementations typically assume that NTP timestamps are taken using unsynchronised clocks and must compensate for absolute time differences and rate differences. Without a shared reference clock, RTP can time align flows from the same source at a given receiver using relative timing, however tight synchronisation between two or more different receivers (possibly with different network paths) or between two or more senders is not possible.

High performance AV systems often use a reference media clock distributed to all devices in the system. The reference media clock is often distinct from the reference clock used to provide timestamps. A reference media clock may be provided along with an audio or video signal interface, or via a dedicated clock signal (e.g. genlock [SMPTE-318-1999] or audio word clock [AES11-2009]). If sending and receiving media clocks are known to be synchronised to a common reference clock, performance can improved by minimising buffering and avoiding rate conversion.

This specification defines SDP signalling of timestamp reference clock sources and media reference clock sources.

2. Applications

Timestamp reference clock source and media clock signalling benefit applications requiring synchronised media capture or playout and low latency operation.

Examples include, but are not limited to:

Social TV
: RTCP for inter-destination media synchronization [I-D.ietf-avtcore-idms] defines social TV as the combination of media content consumption by two or more users at different devices and locations and real-time communication between those users. An example of Social TV, is where two or more users are watching the same television broadcast at different devices and/or locations, while communicating with each other using text, audio and/or video. A skew in the media playout of the two or more users can have adverse effects on their experience. A well-known use case here is one friend experiencing a goal in a football match well before or after other friends.
Video Walls
: A video wall consists of multiple computer monitors, video projectors, or television sets tiled together contiguously or overlapped in order to form one large screen. Each of the screens reproduces a portion of the larger picture. In some implementations, each screen or projector may be individually connected to the network and receive its portion of the overall image from a network-connected video server or video scaler. Screens are refreshed at 50 or 60 hertz or potentially faster. If the refresh is not synchronized, the effect of multiple screens acting as one is broken.
Networked Audio
: Networked loudspeakers, amplifiers and analogue I/O devices transmitting or receiving audio signals via RTP can be connected to various parts of a building or campus network. Such situations can for example be found in large conference rooms, legislative chambers, classrooms (especially those supporting distance learning) and other large-scale environments such as stadiums. Since humans are more susceptible to differences in audio delay, this use case needs even more accuracy than the video wall use case. Depending on the exact application, the need for accuracy can then be in the range of microseconds [Olsen].
Sensor Arrays
: Sensor arrays contain many synchronised measurement elements producing signals which are then combined to form an overall measurement. Accurate capture of the phase relationships between the various signals arriving at each element of the array is critically important for proper operation. Examples include towed or fixed sonar arrays, seismic arrays and phased arrays used in radar applications, for instance.

3. Definitions

The following definitions are used in this draft:

media level
: Media level information applies to a single SDP media stream. In an SDP description, media-level information appears after each "m"-line.
multimedia session
: A set of multimedia senders and receivers as well as the data streams flowing from senders to receivers. The Session Description Protocol (SDP) [RFC4566] describes multimedia sessions.
RTP media stream
: A single stream of RTP packets identified by an RTP SSRC.
RTP media sender
: The device generating an associated RTP media stream
SDP media stream
: An RTP session potentially containing more than one RTP source. SDP media descriptions beginning with an "m"-line define the parameters of an SDP media stream.
session level
: Session level information applies to an entire multimedia session. In an SDP description, session-level information appears before the first "m"-line.
source level
: Source level information applies to a specific RTP media stream. Source-Specific Media Attributes in the Session Description Protocol (SDP) [RFC5576] defines how source-level information is included into an SDP session description.
traceable time
: A clock is considered to provide traceable time if it can be proven to be synchronised to International Atomic Time (TAI). Coordinated Universal Time (UTC) is a time standard synchronized to TAI. UTC is therefore also considered traceable time once leap seconds have been taken unto account. GPS [IS-GPS-200F] is commonly used to provide a TAI traceable time reference. Some network time synchronisation protocols (e.g. PTP [IEEE1588-2008], NTP) can explicitly indicate that the master clock is providing a traceable time reference over the network.

4. Timestamp Reference Clock Source Signalling

The NTP format timestamps used by RTP are taken by reading a local real-time clock at the sender or receiver. This local clock may be synchronised to another clock (time source) by some means or it may be unsynchronised. A variety of methods are available to synchronise local clocks to a reference time source, including network time protocols (e.g. NTP [RFC5905], PTP [IEEE1588-2008]) and radio clocks (e.g. GPS [IS-GPS-200F]).

The following sections describe and define SDP signalling, indicating whether and how the local timestamping clock in an RTP sender/receiver is synchronised to a reference clock.

4.1. Clock synchronization

Two or more local clocks that are sufficiently synchronised will produce timestamps for a given RTP event can be used as if they came from the same clock. Providing they are sufficiently synchronised, timestamps produced in one RTP sender or receiver can be directly compared to a local clock in another RTP sender or receiver.

The accuracy of synchronisation required is application dependent. See Applications [applications] section for a discussion of applications and their corresponding requirements. To serve as a reference clock, clocks must minimally be syntonized (exactly frequency matched) to one another.

Sufficient synchronisation can typically be achieving by using a network time protocol (e.g. NTP, 802.1AS, IEEE 1588-2008) to synchronize all devices to a single master clock.

Another approach is to use clocks providing a global time reference (e.g. GPS, Galileo, GLONASS). This concept may be used in conjunction with network time protocols as some protocols (e.g. PTP, NTP) allow master clocks to indicate explicitly that they are providing traceable time.

4.2. Identifying NTP Reference Clocks

A single NTP server is identified by hostname (or IP address) and an optional port number. If the port number is not indicated, it is assumed to be the standard NTP port (123).

Two or more NTP servers MAY be listed at the same level in the session description to indicate that all of the listed servers deliver the same reference time and may be used interchangeably. RTP senders and receivers are assured proper synchronization regardless of which server they choose and, in support of fault tolerance, may switch servers while streaming.

4.3. Identifying PTP Reference Clocks

The IEEE 1588 Precision Time Protocol (PTP) family of clock synchronisation protocols provides a shared reference clock in an network - typically a LAN. IEEE 1588 provides sub-microsecond synchronisation between devices on a LAN and typically locks within seconds at startup. With support from Ethernet switches, IEEE 1588 protocols can achieve nanosecond timing accuracy in LANs. Network interface chips and cards supporting hardware time-stamping of timing critical protocol messages are also available.

Three flavours of IEEE 1588 are in use today:

Each IEEE 1588 clock is identified by an EUI-64 called a "ClockIdentity". A slave clock using one of the IEEE 1588 family of network time protocols acquires the ClockIdentity/EUI-64 of the grandmaster clock that is the ultimate source of timing information for the network. A boundary clock which is itself slaved to another boundary clock or the grandmaster passes the grandmaster ClockIdentity through to its slaves.

Several instances of the IEEE 1588 protocol may operate independently on a single network, forming distinct PTP domains, each of which may have a different grandmaster clock. As the IEEE 1588 standards have developed, the definition of PTP domains has changed. IEEE 1588-2002 identifies protocol subdomains by a textual name, but IEEE 1588-2008 identifies protocol domains using a numeric domain number. 802.1AS is a Layer-2 profile of IEEE 1588-2008 supporting a single numeric clock domain (0).

When PTP domains are signalled via SDP, senders and receivers SHOULD check that both grandmaster ClockIdentity and PTP domain match when determining clock equivalence.

Two or more IEEE 1588 clocks MAY be listed at the same level in the session description to indicate that all of the listed clocks are candidate grandmaster clocks for the domain or deliver the same reference time and may be used interchangeably. RTP senders and receivers are assured proper synchronization regardless of which synchronization source they choose and, in support of fault tolerance, may switch reference clock source while streaming.

The PTP protocols employ a distributed election protocol called the "Best Master Clock Algorithm" (BMCA) to determine the active clock master. The clock master choices available to BMCA can be restricted or biased by configuration parameters to influence the election process. In some systems it may be desirable to limit the number of possible PTP clock masters to avoid the need to re-signal timestamp reference clock sources when the clock master changes.

4.4. Identifying Global Reference Clocks

Global reference clocks provide a source of traceable time, typically via a hardware radio receiver interface. Examples include GPS, Galileo and GLONASS. Apart from the name of the reference clock system, no further identification is required.

4.5. Private Reference Clocks

In other systems, all RTP senders and receivers may use a timestamp reference clock that is not provided by one of the methods listed above. Examples may include the reference time information provided by digital television or cellular services. These sources are identified as "private" reference clocks. All RTP senders and receivers in a session using a private reference clock are assumed to have a mechanism outside this specification for determining whether their timestamp reference clocks are equivalent.

4.6. Local Reference Clocks

RFC 3550 allows senders and receivers to either use a local wall clock reference for their NTP timestamps or, by setting the timestamp field to 0, to supply no timestamps at all. Both are common practice in embedded RTP implementations. These clocks are identified as "local" and can only be assumed to be equivalent to clocks originating from the same device.

4.7. Traceable Reference Clocks

A timestamp reference clock source may be labelled "traceable" if it is known to be to delivering traceable time. Providing adjustments are made for differing epochs, timezones and leap seconds, timestamps taken using clocks synchronised to a traceable time source can be directly compared even if the clocks are synchronised to different sources or via different mechanisms.

Marking a clock as traceable allows additional information (e.g. IP addresses, PTP master identifiers and the like) to be omitted from the SDP since any traceable clock available at the answerer is considered to be an appropriate timestamp reference clock. For example, an offerer could could specify ts-refclk:ntp=/traceable/ and the answerer could use GPS as a reference clock since GPS is a source of traceable time.

4.8. SDP Signalling of Timestamp Reference Clock Source

Specification of the timestamp reference clock source may be at any or all levels (session, media or source) of an SDP description (see level definitions [definitions] earlier in this document for more information).

Timestamp reference clock source signalling included at session-level provides default parameters for all RTP sessions and sources in the session description. More specific signalling included at the media level overrides default session level signalling. More specific signalling included at the source level overrides default media level signalling.

If timestamp reference clock source signalling is included anywhere in an SDP description, it must be properly defined for all levels in the description. This may simply be achieved by providing default signalling at the session level.

Timestamp reference clock parameters may be repeated at a given level (i.e. for a session or source) to provide information about additional servers or clock sources. If the attribute is repeated at a given level, all clocks described at that level are assumed to be equivalent. Traceable time sources MUST NOT be mixed with non-traceable time sources at any given level.

Note that clock source parameters may change from time to time, for example, as a result of a PTP clock master election. The SIP [RFC3261] protocol supports re-signalling of updated SDP information, however other protocols may require additional notification mechanisms.

General forms of usage:

session level:
a=ts-refclk:<clksrc>
media level:
a=ts-refclk:<clksrc>
source level:
a=ssrc:<ssrc-id> ts-refclk:<clksrc>

ABNF [RFC5234] grammar for the timestamp reference clock attribute:

; external references:
POS-DIGIT   = <See RFC 4566>
token       = <See RFC 4566>
byte-string = <See RFC 4566>
DIGIT       = <See RFC 5324>
HEXDIG      = <See RFC 5324>
CRLF        = <See RFC 5324>
hostport    = <See RFC 3261, with revisions from RFC 5954>

timestamp-refclk = "ts-refclk:" clksrc CRLF

clksrc = ntp / ptp / gps / gal / glonass / local / private / clksrc-ext

clksrc-ext         = clksrc-param-name clksrc-param-value
clksrc-param-name  = token
clksrc-param-value = ["=" byte-string ]

ntp             = "ntp=" ntp-server-addr
ntp-server-addr = hostport / "/traceable/"

ptp             = "ptp=" ptp-version ":" ptp-server
ptp-version     = "IEEE1588-2002"
                / "IEEE1588-2008"
                / "IEEE802.1AS-2011"
                / ptp-version-ext
ptp-version-ext = token

ptp-server      = ptp-gmid [":" ptp-domain]
                / "traceable"
ptp-gmid        = EUI64
ptp-domain      = ptp-domain-name / ptp-domain-nmbr

; PTP domain allowed characters: 0x21-0x7E (IEEE 1588-2002)
ptp-domain-name = "domain-name=" 1*16ptp-domain-char
ptp-domain-char = %x21-7E

; PTP domain allowed number range: 0-127 (IEEE 1588-2008)
ptp-domain-nmbr = "domain-nmbr=" ptp-domain-dgts
ptp-domain-dgts = ptp-domain-n1 / ptp-domain-n2 / ptp-domain-n3
ptp-domain-n1   = DIGIT             ; 0-9
ptp-domain-n2   = POS-DIGIT DIGIT   ; 10-99
ptp-domain-n3   = ("10"/"11") DIGIT ; 100-119
                / "12" %x30-37      ; 120-127

gps      =  "gps"
gal      =  "gal"
glonass  =  "glonass"
local    =  "local"
private  =  "private" [ ":traceable" ]

EUI64 = 7(2HEXDIG "-") 2HEXDIG

Figure 1: Timestamp Reference Clock Source Signalling

4.8.1. Examples

Figure 2 shows an example SDP description with a timestamp reference clock source defined at the session level.

v=0
o=jdoe 2890844526 2890842807 IN IP4 192.0.2.1
s=SDP Seminar
i=A Seminar on the session description protocol
u=http://www.example.com/seminars/sdp.pdf
e=j.doe@example.com (Jane Doe)
c=IN IP4 233.252.0.1/64
t=2873397496 2873404696
a=recvonly
a=ts-refclk:ntp=/traceable/
m=audio 49170 RTP/AVP 0
m=video 51372 RTP/AVP 99
a=rtpmap:99 h263-1998/90000

Figure 2: Timestamp reference clock definition at the session level

Figure 3 shows an example SDP description with timestamp reference clock definitions at the media level overriding the session level defaults.

v=0
o=jdoe 2890844526 2890842807 IN IP4 192.0.2.1
s=SDP Seminar
i=A Seminar on the session description protocol
u=http://www.example.com/seminars/sdp.pdf
e=j.doe@example.com (Jane Doe)
c=IN IP4 233.252.0.1/64
t=2873397496 2873404696
a=recvonly
a=ts-refclk:local
m=audio 49170 RTP/AVP 0
a=ts-refclk:ntp=203.0.113.10
a=ts-refclk:ntp=198.51.100.22
m=video 51372 RTP/AVP 99
a=rtpmap:99 h263-1998/90000
a=ts-refclk:ptp=IEEE802.1AS-2011:39-A7-94-FF-FE-07-CB-D0

Figure 3: Timestamp reference clock definition at the media level

Figure 4 shows an example SDP description with a timestamp reference clock definition at the source level overriding the session level default.

v=0
o=jdoe 2890844526 2890842807 IN IP4 192.0.2.1
s=SDP Seminar
i=A Seminar on the session description protocol
u=http://www.example.com/seminars/sdp.pdf
e=j.doe@example.com (Jane Doe)
c=IN IP4 233.252.0.1/64
t=2873397496 2873404696
a=recvonly
a=ts-refclk:local
m=audio 49170 RTP/AVP 0
m=video 51372 RTP/AVP 99
a=rtpmap:99 h263-1998/90000
a=ssrc:12345 ts-refclk:ptp=IEEE802.1AS-2011:39-A7-94-FF-FE-07-CB-D0

Figure 4: Timestamp reference clock signalling at the source level

5. Media Clock Source Signalling

The media clock source for a stream determines the timebase used to advance the RTP timestamps included in RTP packets. The media clock may be asynchronously generated by the sender, it may be generated in fixed relationship to the reference clock or it may be generated with respect to another stream on the network (which is presumably being received by the sender).

5.1. Asynchronously Generated Media Clock

In the simplest sender implementation, the sender generates media by sampling audio or video according to a free-running local clock. The RTP timestamps in media packets are advanced according to this media clock and packet transmission is typically timed to regular intervals on this timeline. The sender may or may not include an NTP timestamp in sender reports to allow mapping of this asynchronous media clock to a reference clock.

The asynchronously generated media clock is the assumed mode of operation when there is no signalling of media clock source. Alternatively, asynchronous media clock may be explicitly signalled.

  • a=mediaclk:sender

5.2. Direct-Referenced Media Clock

A media clock may be directly derived from a reference clock. For this case it is required that a reference clock be specified with an a=ts-refclk attribute (Section 4.8).

The signalling optionally indicates a media clock offset value. The offset indicates the RTP timestamp value at the epoch (time of origin) of the reference clock. To use the offset, implementations need to compute RTP timestamps from reference clocks. To simplify these calculations, streams utilizing offset signalling SHOULD use a TAI timestamp reference clock to avoid complications introduced by leap seconds. See [I-D.ietf-avtcore-leap-second] for further discussion of leap-second issues in timestamp reference clocks.

To compute the RTP timestamp against an IEEE 1588 (TAI-based) reference, the time elapsed between the 00:00:00 1 January 1970 IEEE 1588 epoch and the current time must be computed. Between the epoch and 1 January 2013, there were 15,706 days (including extra days during leap years). Since there are no leap seconds in a TAI reference, there are exactly 86,400 seconds during each of these days or a total of 1,356,998,400 seconds from the epoch to 00:00:00 1 January 2013. A 90 kHz RTP clock for a video stream would have advanced 122,129,856,000,000 units over this period. With a signalled offset of 0, the RTP clock value modulo the 32-bit unsigned representation in the RTP header would have been 2,460,938,240 at 00:00:00 1 January 2013. If an offset of 23,465 had been signalled, the clock value would have been 2,460,961,705.

In order to use an NTP reference, the actual time elapsed between the 00:00:00, 1 January 1900 NTP epoch to the current time must be computed. 2,208,988,800 seconds elapsed between the NTP epoch and 00:00:00 1 January 1970 [RFC0868]. Between the beginning of 1970 and 2013, there were 15,706 days elapsed (including extra days during leap years) and 25 leap seconds inserted. There is therefore a total of 3,565,987,225 seconds from the NTP epoch to 00:00:00 1 January 2013. A 90 kHz RTP clock for a video stream would have advanced 320,938,850,250,000 units over this period. With a signalled offset of 0, the RTP clock value modulo the 32-bit unsigned representation would have been 1,714,023,696 at 00:00:00 1 January 2013.

If no offset is signalled, the offset can be inferred at the receiver by examining RTCP sender reports which contain NTP and RTP timestamps which combined define a mapping. The NTP/RTP timestamp mapping provided by RTCP SRs takes precedence over that singaled through SDP, however the media clock rate implied by the SRs MUST be consistent with the rate signalled.

A rate modifier may be specified. The modifier is expressed as the ratio of two integers and modifies the rate specified or implied by the media description by this ratio. If omitted, the rate is assumed to be the exact rate specified or implied by the media format. For example, without a rate specification, the RTP clock for an 8 kHz G.711 audio stream will advance exactly 8000 units for each second advance in the reference clock from which it is derived.

The rate modifier is primarily useful for accommodating certain "oddball" audio sample rates associated with NTSC video (see Figure 7). Modified rates are not advised for video streams which generally use a 90 kHz RTP clock regardless of frame rate or sample rate used for embedded audio.

  • a=mediaclk:direct[=<offset>] [rate=<rate numerator>/<rate denominator>]

5.3. Stream-Referenced Media Clock

A common synchronisation architecture for audio/visual systems involves distributing a reference media clock from a master device to a number of slave devices, typically by means of a cable. Examples include audio word clock distribution and video black burst distribution. In this case, the media clock is locally generated, often by a crystal oscillator and is not locked to a timestamp reference clock.

To support this architecture across a network, a master clock identifier is associated with an RTP media stream carrying media clock timing information from a master device. The master clock identifier represents a media clock source in the master device. Slave devices in turn associate the master media clock identifier with streams they transmit, signalling the synchronisation relationship between the master and the transmitter's media clock.

Slave devices recover media clock timing from the clock master stream, using it to synchronise the slave media clock with the master. Timestamps in the master clock RTP media stream are taken using the timestamp reference clock shared by the master and slave devices. The timestamps communicate information about media clock timing (rate, phase) from the master to the slave devices. Timestamps are communicated in the usual RTP fashion via RTCP SRs, or via the RFC6051 [RFC6051] header extension. The stream media format may indicate other clock information, such as the nominal rate.

Note that slaving of a device media clock to a master device does not affect the usual RTP lip sync / time alignment algorithms. Time aligned playout of two or more RTP sources still relies upon NTP timestamps supplied via RTCP SRs or by the RFC6051 timestamp header extension.

In a given system, master clock identifiers must uniquely identify a single media clock source. Such identifiers MAY be manually configured, however identifiers SHOULD be generated according to the "short-term persistent RTCP CNAME" algorithm as described in RFC7022 [RFC7022]. Master clock identifiers not already in base64 format MUST be encoded as a base64 strings when used in SDP. Although the RTCP CNAME algorithm is used to generate the master clock identifier, it is used to tag RTP sources in SDP descriptions and does not appear in RTCP as a CNAME.

A reference stream can be an RTP stream or AVB stream based on the IEEE 1722 [IEEE1722] standard.

An RTP clock master stream SHOULD be identified at the source level by an SSRC [RFC5576] and master clock identifier. An RTP stream that provides media clock timing directly from a reference media clock (e.g. internal crystal, audio word clock or video blackburst signal) SHOULD tag the stream as a master clock source using the "src:" prefix. If master clock identifiers are declared at the media or session level, all RTP sources at or below the level of declaration MUST provide equivalent timing to a slave receiver.

  • a=ssrc:<ssrc> mediaclk:id=src:<media-clktag> sender
  • a=mediaclk:id=src:<media-clktag> sender

A transmitted RTP stream slaved to media clock master is signalled by including master clock identifier:

  • a=mediaclk:id=<media-clktag> sender

An RTP media sender indicates that it is slaved to an IEEE 1722 clock master via a stream identifier (an EUI-64):

  • a=mediaclk:IEEE1722=<StreamID>

An RTP media sender may gateway IEEE 1722 media clock timing to RTP:

  • a=mediaclk:id=src:<media-clktag> IEEE1722=<StreamID>

5.4. SDP Signalling of Media Clock Source

Specification of the media clock source may be at any or all levels (session, media or source) of an SDP description (see level definitions (Section 3) earlier in this document for more information).

Media clock source signalling included at session level provides default parameters for all RTP sessions and sources in the session description. More specific signalling included at the media level overrides default session level signalling. Further, source-level signalling overrides media clock source signalling at the enclosing media level and session level.

Media clock source signalling may be present or absent on a per-stream basis. In the absence of media clock source signals, receivers assume an asynchronous media clock generated by the sender.

Media clock source parameters may be repeated at a given level (i.e. for a session or source) to provide information about additional clock sources. If the attribute is repeated at a given level, all clocks described at that level are comparable clock sources and may be used interchangeably.

General forms of usage:

session level:
a=mediaclk:<mediaclock>
media level:
a=mediaclk:<mediaclock>
source level:
a=ssrc:<ssrc-id> mediaclk:<mediaclock>

ABNF [RFC5234] grammar for the media clock reference attribute:

; external references:
integer     = <See RFC 4566>
token       = <See RFC 4566>
byte-string = <See RFC 4566>
base64      = <See RFC 4566>
SP          = <See RFC 5234>
DIGIT       = <See RFC 5234>
HEXDIG      = <See RFC 5234>

media-clksrc = "mediaclk:" [media-clkid SP] mediaclock

media-clkid  = "id=" [ "src:" ] media-clktag
media-clktag = base64

mediaclock   = sender / direct / ieee1722-streamid / mediaclock-ext

mediaclock-ext         = mediaclock-param-name mediaclock-param-value
mediaclock-param-name  = token
mediaclock-param-value = [ "=" byte-string ]

sender = "sender"
direct = "direct" [ "=" 1*DIGIT ] [SP rate]
rate   = "rate=" integer "/" integer

ieee1722-streamid = "IEEE1722=" avb-stream-id
avb-stream-id     = EUI64
EUI64 = 7(2HEXDIG "-") 2HEXDIG

Figure 5: Media Clock Source Signalling

5.5. Examples

Figure 6 shows an example SDP description 8 channels of 24-bit, 48 kHz audio transmitted as a multicast stream. Media clock is derived directly from an IEEE 1588-2008 reference.

v=0
o=- 1311738121 1311738121 IN IP4 192.0.2.1
c=IN IP4 233.252.0.1/64
s= 
t=0 0
m=audio 5004 RTP/AVP 96
a=rtpmap:96 L24/48000/8
a=sendonly
a=ts-refclk:ptp=IEEE1588-2008:39-A7-94-FF-FE-07-CB-D0:0
a=mediaclk:direct=963214424

Figure 6: Media clock directly referenced to IEEE 1588-2008

Figure 7 shows an example SDP description 2 channels of 24-bit, 44056 kHz NTSC "pull-down" media clock derived directly from an IEEE 1588-2008 reference clock

v=0
o=- 1311738121 1311738121 IN IP4 192.0.2.1
c=IN IP4 233.252.0.1/64
s= 
t=0 0
m=audio 5004 RTP/AVP 96
a=rtpmap:96 L24/44100/2
a=sendonly
a=ts-refclk:ptp=IEEE1588-2008:39-A7-94-FF-FE-07-CB-D0:0
a=mediaclk:direct=963214424 rate=1000/1001

Figure 7: "Oddball" sample rate directly referenced to IEEE 1588-2008

Figure 8 shows the same 48 kHz audio transmission from Figure 6 with media clock derived from another RTP stream.

v=0
o=- 1311738121 1311738121 IN IP4 192.0.2.1
c=IN IP4 233.252.0.1/64
s= 
t=0 0
m=audio 5004 RTP/AVP 96
a=rtpmap:96 L24/48000/2
a=sendonly
a=ts-refclk:ptp=IEEE1588-2008:39-A7-94-FF-FE-07-CB-D0:0
a=mediaclk:id=MDA6NjA6MmI6MjA6MTI6MWY= sender

Figure 8: RTP stream with media clock slaved to a master

Figure 9 shows the same 48 kHz audio transmission from Figure 6 with media clock derived from an IEEE 1722 AVB stream.

v=0
o=- 1311738121 1311738121 IN IP4 192.0.2.1
c=IN IP4 233.252.0.1/64
s= 
t=0 0
m=audio 5004 RTP/AVP 96
a=rtpmap:96 L24/48000/2
a=sendonly
a=ts-refclk:ptp=IEEE1588-2008:39-A7-94-FF-FE-07-CB-D0:0
a=mediaclk:IEEE1722=38-D6-6D-8E-D2-78-13-2F

Figure 9: RTP stream with media clock slaved to an IEEE1722 master device

6. Signalling Considerations

Signalling of timestamp reference clock source [rcgrammar] and media clock source [mcgrammar] is defined to be used either by applications that implement the SDP Offer/Answer model [RFC3264] or by applications that use SDP to describe media and transport configurations.

A description SHOULD include both reference clock signalling and media clock signalling. If no reference clock is available, this SHOULD be signalled as a local reference [localref].

When no media clock signalling is present, an asynchronous media clock (Section 5.1) MUST be assumed. When no reference clock signalling is present, a local reference clock (Section 4.6) MUST be assumed.

If a reference clock is not signalled or a local reference is specified, the corresponding media clock may be established as rate synchronised with no assurance of time synchronisation.

When the description signals a direct-referenced media clock (Section 5.2), reference clock signalling is REQUIRED. Asynchronous and stream-referenced media clocks (Section 5.3) MAY be specified with or without a reference clock signalling.

6.1. Usage in Offer/Answer

During offer/answer, clock source signalling via SDP uses a declarative model. Supported media and/or reference clocks are specified in the offered SDP description. The answerer may accept or reject the offer in an application-specific way depending on the clocks that are available and the clocks that are offered. For example, an answerer may choose to accept an offer that lacks a common clock by falling back to a lower performance mode of operation (e.g. by assuming reference or media clocks are local rather than shared). Conversely, the answerer may choose to reject the offer when the offered clock specifications indicate that the available reference and/or media clocks are incompatible.

While negotiation of reference clock and media clock attributes is not defined in this document, negotiation MAY be accomplished using the capabilities negotiation procedures defined in [RFC5939].

6.1.1. Indicating Support for Clock Source Signalling

An offerer or answerer indicates support for media clock signalling by including a reference or media clock specification in the SDP description. An offerer or answerer without specific reference or media clocks to signal SHOULD indicate support for clock source signalling by including a local reference clock [localref] specification in the SDP description.

6.1.2. Timestamp Reference Clock

If one or more of the reference clocks specified in the offer are usable by the answerer, the answerer SHOULD respond with an answer containing the subset of reference clock specifications in the offer that are usable by the answerer. If the answerer rejects the offer because the available reference clocks are incompatible, the rejection MUST contain at least one timestamp reference clock specification usable by the answerer so that appropriate information is available for debugging. If no external reference clock is available to the answerer a local reference clock [localref] specification SHOULD be included in the rejection.

In both offers and answers, multiple reference clock specifications indicate equivalent clocks from different sources which may be used interchangeably. RTP senders and receivers are assured proper synchronization regardless of which of the specified sources is chosen and, in support of fault tolerance, may switch clock sources while streaming.

6.1.3. Media Clock

If the media clock mode specified in the offer is acceptable to the answerer, the answerer SHOULD respond with an answer containing the same media clock specification as the offer. If the answerer rejects the offer because the available reference clocks are incompatible, the rejection MUST contain a media clock specification supported by the answerer so that appropriate information is available for debugging. If no shared media clocks are available to the answerer an asynchronous media clock [asyncmc] specification SHOULD be included in the rejection.

6.2. Usage Outside of Offer/Answer

SDP can be employed outside of the Offer/Answer context, for instance for multimedia sessions that are announced through the Session Announcement Protocol (SAP) [RFC2974], or streamed through the Real Time Streaming Protocol (RTSP) [RFC2326].

Devices using published descriptions to join sessions SHOULD assess their synchronization compatibility with the described session based on the clock source signalling and SHOULD NOT attempt to join a session with incompatible reference or media clocks.

7. Security Considerations

Entities receiving and acting upon an SDP message should note that a session description cannot be trusted unless it has been obtained by an authenticated transport protocol from a known and trusted source. Many different transport protocols may be used to distribute session description, and the nature of the authentication will differ from transport to transport. For some transports, security features are often not deployed. In case a session description has not been obtained in a trusted manner, the endpoint SHOULD exercise care because, among other attacks, the media sessions received may not be the intended ones, the destination where media is sent to may not be the expected one, any of the parameters of the session may be incorrect.

Incorrect reference or media clock parameters may cause devices or streams to synchronize to unintended clock sources. Normally this simply results in failure to establish a session or failure to synchronize once connected. Enough devices fraudulently assigned to a specific clock source (e.g. a particular IEEE 1588 grandmaster) may, however, constitute a successful denial of service attack on that source. Devices MAY wish to validate the integrity of the clock description through some means before connecting to unfamiliar clock sources.

The timestamp reference clocks negotiated by this protocol are used to provide media timing information to RTP. Negotiated timestamp reference clocks SHOULD NOT be relied upon to provide a secure time reference for security critical operations (e.g. the expiration of public key certificates).

8. IANA Considerations

This document defines two new SDP attributes: 'ts-refclk' and 'mediaclk', within the existing Internet Assigned Numbers Authority (IANA) registry of SDP Parameters.

This document also defines a new IANA registry subordinate to the IANA SDP Parameters registry: the Media Clock Source Parameters Registry. Within this new registry, this document defines an initial set of three media clock source parameters. Further, this document defines a second new IANA registry subordinate to the IANA SDP Parameters registry: the Timestamp Reference Clock Source Parameters Registry. Within this new registry, this document defines an initial six parameters.

8.1. Reference Clock SDP Parameter

The SDP attribute "ts-refclk" defined by this document is registered with the IANA registry of SDP Parameters as follows:

SDP Attributes ( "att-field (both session and media level)" &
                 "att-field (source level)" ):

  Attribute name:     ts-refclk

  Long form:          Timestamp reference clock source

  Type of name:       att-field

  Type of attribute:  Session, media and source level

  Subject to charset: No

  Purpose:            See section 4 of this document

  Reference:          This document

  Values:             See section 8.3 of this document

Figure 10

The attribute has an extensible parameter field and therefore a registry for these parameters is required. This new registry is defined in Section 8.3.

8.2. Media Clock SDP Parameter

The SDP attribute "mediaclk" defined by this document is registered with the IANA registry of SDP Parameters as follows:

SDP Attributes ( "att-field (both session and media level)" &
                 "att-field (source level)" ):

  Attribute name:     mediaclk

  Long form:          Media clock source

  Type of name:       att-field

  Type of attribute:  Session, media and source level

  Subject to charset: No

  Purpose:            See section 5 of this document

  Reference:          This document

  Values:             See section 8.4 of this document

Figure 11

The attribute has an extensible parameter field and therefore a registry for these parameters is required. The new registry is defined in Section 8.4.

8.3. Timestamp Reference Clock Source Parameters Registry

This document creates a new IANA sub-registry called the Timestamp Reference Clock Source Parameters Registry, subordinate to the IANA SDP Parameters registry. Each entry in the Timestamp Reference Clock Source Parameters Registry contains:

Name:
Token used in the SDP description (clksrc-param-name)
Long name:
Descriptive name for the timestamp reference clock source
Reference:
Reference to the document describing the SDP token (clksrc-param-name) and syntax for the optional value associated with the token (mediaclock-param-value)

Initial values for the Timestamp Reference Clock Source Parameters registry are given below.

Future assignments are to be made through the Specification Required policy [RFC5226]. The Name field in the table corresponds to a new value corresponding to clksrc-param-name. The Reference must specify a syntax corresponding to clksrc-param-value.

Name Long Name Reference
ntp Network Time Protocol This document, section 4
ptp Precision Time Protocol This document, section 4
gps Global Position System This document, section 4
gal Galileo This document, section 4
glonass Global Navigation Satellite System This document, section 4
local Local Clock This document, section 4
private Private Clock This document, section 4

8.4. Media Clock Source Parameters Registry

This document creates a new IANA sub-registry called the Media Clock Source Parameters registry, subordinate to the IANA SDP Parameters registry. Each entry in the Media Clock Source Parameters Registry contains:

Name:
Token used in the SDP description (mediaclock-param-name)
Long name:
Descriptive name for the media clock source type
Reference:
Reference to the document describing the SDP token (mediaclock-param-name) and syntax for the optional value associated with the token (mediaclock-param-value)

Initial values for the Media Clock Source Parameters registry are given below.

Future assignments are to be made through the Specification Required policy [RFC5226]. The Name field in the table corresponds to a new value corresponding to mediaclock-param-name. The Reference must specify a syntax corresponding to mediaclock-param-value.

Name Long Name Reference
sender Asynchronously Generated Media Clock This document, section 5
direct Direct-Referenced Media Clock This document, section 5
IEEE1722 IEEE1722 Media Stream Identifier This document, section 5

8.5. Source-level Attributes

[RFC5576] requires new source-level attributes to be registered with the IANA registry named "att-field (source level)".

8.5.1. Source-level Timestamp Reference Clock Attribute

The source-level SDP attribute "ts-refclk" defined by this document is registered with the "att-field (source level)" IANA registry of SDP Parameters according to Figure 10.

8.5.2. Source-level Media Clock Attribute

The source-level SDP attribute "mediaclk" defined by this document is registered with the "att-field (source level)" IANA registry of SDP Parameters according to Figure 11.

9. Acknowledgements

The authors would like to thank Magnus Westerlund and Paul Kyzivat for valuable comments which resulted in important improvements to this document.

10. References

10.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, January 2008.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.
[RFC4566] Handley, M., Jacobson, V. and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.
[RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 5226, May 2008.
[RFC5576] Lennox, J., Ott, J. and T. Schierl, "Source-Specific Media Attributes in the Session Description Protocol (SDP)", RFC 5576, June 2009.
[RFC5905] Mills, D., Martin, J., Burbank, J. and W. Kasch, "Network Time Protocol Version 4: Protocol and Algorithms Specification", RFC 5905, June 2010.
[RFC6051] Perkins, C. and T. Schierl, "Rapid Synchronisation of RTP Flows", RFC 6051, November 2010.
[RFC7022] Begen, A., Perkins, C., Wing, D. and E. Rescorla, "Guidelines for Choosing RTP Control Protocol (RTCP) Canonical Names (CNAMEs)", RFC 7022, September 2013.
[IEEE1588-2002] Institute of Electrical and Electronics Engineers, "1588-2002 - IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems", IEEE Std 1588-2002, 2002.
[IEEE1588-2008] Institute of Electrical and Electronics Engineers, "1588-2008 - IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems", IEEE Std 1588-2008, 2008.
[IEEE802.1AS-2011] Institute of Electrical and Electronics Engineers, "Timing and Synchronization for Time-Sensitive Applications in Bridged Local Area Networks", .
[IEEE1722] Institute of Electrical and Electronics Engineers, "IEEE Standard for Layer 2 Transport Protocol for Time Sensitive Applications in a Bridged Local Area Network", .

10.2. Informative References

[Olsen] Olsen, D., "Time Accuracy Requirements in Audio Networks", April 2007.
[RFC0868] Postel, J. and K. Harrenstien, "Time Protocol", STD 26, RFC 868, May 1983.
[RFC2326] Schulzrinne, H., Rao, A. and R. Lanphier, "Real Time Streaming Protocol (RTSP)", RFC 2326, April 1998.
[RFC2974] Handley, M., Perkins, C. and E. Whelan, "Session Announcement Protocol", RFC 2974, October 2000.
[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.
[RFC5939] Andreasen, F., "Session Description Protocol (SDP) Capability Negotiation", RFC 5939, September 2010.
[I-D.ietf-avtcore-idms] Brandenburg, R., Stokking, H., Deventer, O., Boronat, F., Montagud, M. and K. Gross, "Inter-destination Media Synchronization using the RTP Control Protocol (RTCP)", Internet-Draft draft-ietf-avtcore-idms-13, August 2013.
[I-D.ietf-avtcore-leap-second] Gross, K. and R. Brandenburg, "RTP and Leap Seconds", Internet-Draft draft-ietf-avtcore-leap-second-08, January 2014.
[IEEE802.1BA-2011] Institute of Electrical and Electronics Engineers, "Audio Video Bridging (AVB) Systems", .
[IS-GPS-200F] Global Positioning Systems Directorate, "Navstar GPS Space Segment/Navigation User Segment Interfaces", September 2011.
[SMPTE-318-1999] Society of Motion Picture & Television Engineers, "Television and Audio – Synchronization of 59.94- or 50-Hz Related Video and Audio Systems in Analog and Digital Areas – Reference Signals", .
[AES11-2009] Audio Engineering Society, "AES11-2009: AES recommended practice for digital audio engineering - Synchronization of digital audio equipment in studio operations", .

Authors' Addresses

Aidan Williams Audinate Level 1, 458 Wattle St Ultimo, NSW 2007 Australia Phone: +61 2 8090 1000 Fax: +61 2 8090 1001 EMail: aidan.williams@audinate.com URI: http://www.audinate.com/
Kevin Gross AVA Networks Boulder, CO US EMail: kevin.gross@avanw.com URI: http://www.avanw.com/
Ray van Brandenburg TNO Brassersplein 2 Delft, 2612CT the Netherlands Phone: +31-88-866-7000 EMail: ray.vanbrandenburg@tno.nl
Hans Stokking TNO Brassersplein 2 Delft, 2612CT the Netherlands EMail: hans.stokking@tno.nl