Payload Working Group                                          J. Lennox
Internet-Draft                                                   D. Hong
Intended status: Standards Track                                   Vidyo
Expires: September 10, 2015                                    J. Uberti
                                                               S. Holmer
                                                              M. Flodman
                                                                  Google
                                                           March 9, 2015
The Layer Refresh Request (LRR) RTCP Feedback Message
draft-lennox-avtext-lrr-00
Abstract
This memo describes the RTP Payload-Specific Feedback Message "Layer
Refresh Request" (LRR), which can be used to request a state refresh
of one or more substreams of a layered media stream. It also defines
its use with several scalable media formats.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 10, 2015.
Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
Lennox, et al. Expires September 10, 2015 [Page 1]
Internet-Draft Layer Refresh Request RTCP Feedback March 2015
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Conventions, Definitions and Acronyms
  2.1. Terminology
3. Layer Refresh Request
  3.1. Message Format
4. Usage with specific codecs
  4.1. H264 SVC
  4.2. VP8
  4.3. H265
  4.4. VP9
5. Usage with different scalability transmission mechanisms
6. Security Considerations
7. IANA Considerations
8. References
Authors' Addresses
1. Introduction
This memo describes an RTP Payload-Specific Feedback Message
[RFC4585] "Layer Refresh Request" (LRR). It is designed to allow a
receiver of a layered media stream to request that one or more of its
substreams be refreshed, such that they can then be decoded by an
endpoint which previously was not receiving those layers, without
requiring that the entire stream be refreshed (as it would be if the
receiver sent a Full Intra Request (FIR) [RFC5104]).
The message is designed to be applicable both to temporally and
spatially scaled streams, and to both single-stream and multi-stream
scalability modes.
2. Conventions, Definitions and Acronyms
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
2.1. Terminology
A "Layer Refresh Point" is a point in a scalable stream after which a
decoder, which previously had been able to decode only some (possibly
none) of the available layers of the stream, is able to decode a
greater number of the layers.
For spatial (or quality) layers, layer refresh typically requires
that a spatial layer be encoded in a way that references only lower-
layer subpictures of the current picture, not any earlier pictures of
that spatial layer. Additionally, the encoder must promise that no
earlier pictures of that spatial layer will be used as reference in
the future.
An illustration of spatial layer refresh is shown below.
... <-- S1 <-- S1      S1 <-- S1 <-- ...
        |      |      |      |
        \/     \/     \/     \/
... <-- S0 <-- S0 <-- S0 <-- S0 <-- ...
        1      2      3      4
In this illustration, frame 3 is a layer refresh point for spatial
layer S1; a decoder which had previously only been decoding spatial
layer S0 would be able to decode layer S1 starting at frame 3.
Figure 1
For temporal layers, layer refresh requires that the layer be
"temporally nested", i.e. use as reference only earlier frames of a
lower temporal layer, not any earlier frames of this temporal layer,
and also promise that no future frames of this temporal layer will
reference frames of this temporal layer before the refresh point. In
many cases, the temporal structure of the stream will mean that all
frames are temporally nested, in which case decoders will have no
need to send LRR messages for the stream.
An illustration of temporal layer refresh is shown below.
...    <----- T1  <------ T1          T1  <------ ...
             /           /           /
           |_          |_          |_
... <-- T0  <------ T0  <------ T0  <------ T0  <--- ...
        1     2     3     4     5     6     7
In this illustration, frame 6 is a layer refresh point for temporal
layer T1; a decoder which had previously only been decoding temporal
layer T0 would be able to decode layer T1 starting at frame 6.
Figure 2
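As a non-normative illustration, the two conditions for a temporal
layer refresh point can be checked against a known reference
structure. The Python sketch below models the stream of Figure 2; the
Frame type, the function name, and the use of frame 0 as a stand-in
for the elided earlier T1 frame are assumptions made for this example
only. The check covers only the frames in the window it is given, so
it cannot verify the encoder's promise about frames not yet produced.

```python
from typing import NamedTuple, Tuple


class Frame(NamedTuple):
    layer: int             # temporal layer: 0 for T0, 1 for T1
    refs: Tuple[int, ...]  # numbers of the frames this frame references


def is_temporal_refresh_point(frames, n):
    """Return True if frame n is a refresh point for its own temporal layer."""
    layer = frames[n].layer
    # Condition 1: frame n references only frames of lower temporal layers.
    if any(frames[r].layer >= layer for r in frames[n].refs):
        return False
    # Condition 2: no later frame of this layer (within the window)
    # references a frame of this layer that precedes frame n.
    for m, frame in frames.items():
        if m > n and frame.layer == layer:
            if any(r < n and frames[r].layer == layer for r in frame.refs):
                return False
    return True


# The stream of Figure 2. Frame 0 is a hypothetical stand-in for the
# earlier T1 frame that the figure elides with "...".
figure2 = {
    0: Frame(1, ()),
    1: Frame(0, ()),
    2: Frame(1, (0, 1)),
    3: Frame(0, (1,)),
    4: Frame(1, (2, 3)),
    5: Frame(0, (3,)),
    6: Frame(1, (5,)),
    7: Frame(0, (5,)),
}

refresh_points = [n for n in (2, 4, 6)
                  if is_temporal_refresh_point(figure2, n)]
```

Under these assumptions, only frame 6 qualifies: frames 2 and 4 each
reference an earlier T1 frame.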
An illustration of an inherently temporally nested stream is shown
below.
              T1          T1          T1
             /           /           /
           |_          |_          |_
... <-- T0  <------ T0  <------ T0  <------ T0  <--- ...
        1     2     3     4     5     6     7
In this illustration, the stream is temporally nested in its ordinary
structure; a decoder receiving layer T0 can begin decoding layer T1
at any point.
Figure 3
3. Layer Refresh Request
A layer refresh frame can be requested by sending a Layer Refresh
Request (LRR), which is an RTCP payload-specific feedback message
[RFC4585] asking the encoder to encode a frame which makes it
possible to upgrade to a higher layer. The LRR contains one or two
tuples, indicating the layer the decoder wants to upgrade to, and
(optionally) the currently highest layer the decoder can decode.
The specific format of the tuples, and the mechanism by which a
receiver recognizes a refresh frame, is codec-dependent. Usage for
several codecs is discussed in Section 4.
LRR follows the model of the Full Intra Request (FIR) ([RFC5104],
Section 3.5.1) for its retransmission, reliability, and use in
multipoint conferences. TODO: expand these here.
The LRR message is identified by RTCP packet type value PT=PSFB and
FMT=TBD. The FCI field MUST contain one or more LRR entries. Each
entry applies to a different media sender, identified by its SSRC.
3.1. Message Format
The Feedback Control Information (FCI) for the Layer Refresh Request
consists of one or more FCI entries, the content of which is depicted
in Figure 4. The length of the LRR feedback message MUST be set to
2+3*N, where N is the number of FCI entries.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             SSRC                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Seq nr.       |C| Payload Type|           Reserved            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Target Layer Index       |   Current Layer Index (opt)   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 4
SSRC (32 bits) The SSRC value of the media sender that is requested
to send a layer refresh point.
Seq nr. (8 bits) Command sequence number. The sequence number space
is unique for each pairing of the SSRC of command source and the
SSRC of the command target. The sequence number SHALL be
increased by 1 modulo 256 for each new command. A repetition
SHALL NOT increase the sequence number. The initial value is
arbitrary.
C (1 bit) A flag bit indicating whether the "Current Layer Index"
field is present in the FCI. If this bit is 0, the sender of the
LRR message is requesting refresh of all layers up to and
including the target layer.
Payload Type (7 bits) The RTP payload type for which the LRR is
being requested. This gives the context in which the target layer
index is to be interpreted.
Reserved (16 bits) All bits SHALL be set to 0 by the sender and
SHALL be ignored on reception.
Target Layer Index (16 bits) The target layer for which the receiver
wishes a refresh point. Its format is dependent on the payload
type field.
Current Layer Index (16 bits) If C is 1, the current layer being
decoded by the receiver. This message is not requesting refresh
of layers at or below this layer. If C is 0, this field SHALL be
set to 0 by the sender and SHALL be ignored on reception.
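As a non-normative illustration of the layout in Figure 4, the
following Python sketch packs and parses a single 12-octet FCI entry,
computes the RTCP length field for N entries, and advances the command
sequence number. All function names are hypothetical, and the sketch
deliberately leaves out the common feedback header because the FMT
value is still TBD.

```python
import struct

# SSRC (32), Seq nr. (8), C + Payload Type (8), Reserved (16),
# Target Layer Index (16), Current Layer Index (16); network byte order.
_FCI_FORMAT = "!IBBHHH"


def pack_lrr_fci(ssrc, seq_nr, payload_type, target_layer, current_layer=None):
    """Pack one LRR FCI entry; current_layer=None means C=0."""
    if not 0 <= ssrc <= 0xFFFFFFFF:
        raise ValueError("SSRC is a 32-bit field")
    if not 0 <= seq_nr <= 0xFF:
        raise ValueError("Seq nr. is an 8-bit field")
    if not 0 <= payload_type <= 0x7F:
        raise ValueError("Payload Type is a 7-bit field")
    c = 0 if current_layer is None else 1
    current = 0 if current_layer is None else current_layer
    # Reserved is always sent as 0.
    return struct.pack(_FCI_FORMAT, ssrc, seq_nr, (c << 7) | payload_type,
                       0, target_layer, current)


def unpack_lrr_fci(entry):
    """Parse one 12-octet LRR FCI entry into a dict."""
    ssrc, seq_nr, c_pt, _reserved, target, current = struct.unpack(
        _FCI_FORMAT, entry)
    c = c_pt >> 7
    return {
        "ssrc": ssrc,
        "seq_nr": seq_nr,
        "payload_type": c_pt & 0x7F,
        "target_layer": target,
        # When C is 0 the field is ignored on reception.
        "current_layer": current if c == 1 else None,
    }


def lrr_length_field(num_entries):
    """RTCP length field value: 2 + 3*N for N FCI entries."""
    return 2 + 3 * num_entries


def next_seq_nr(seq_nr):
    """Sequence number for a new command (not for a repetition)."""
    return (seq_nr + 1) % 256
```

Each entry occupies three 32-bit words, which is where the factor of 3
in the length field comes from.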
4. Usage with specific codecs
4.1. H264 SVC
H.264 SVC [RFC6190] defines temporal, dependency (spatial), and
quality scalability modes.
+---------------+---------------+
|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R| DID |  QID  | TID |   RES   |
+---------------+---------------+
Figure 5
Figure 5 shows the format of the layer index field for H.264 SVC
streams. This is designed to follow the same layout as the third and
fourth bytes of the H.264 SVC NAL unit extension, which carry the
stream's layer information. The "R" and "RES" fields MUST be set to
0 on transmission and ignored on reception. See [RFC6190]
Section 1.1.3 for details on the DID, QID, and TID fields.
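As a non-normative illustration, the 16-bit layer index of Figure 5
can be built from and split into its DID, QID, and TID fields as
sketched below in Python. The function names are hypothetical; the
field widths (3, 4, and 3 bits) are those of the corresponding NAL
unit header fields in [RFC6190].

```python
def h264_svc_layer_index(did, qid, tid):
    """Build the layer index of Figure 5; R and RES are sent as 0."""
    if not 0 <= did <= 7:
        raise ValueError("DID is a 3-bit field")
    if not 0 <= qid <= 15:
        raise ValueError("QID is a 4-bit field")
    if not 0 <= tid <= 7:
        raise ValueError("TID is a 3-bit field")
    # Bit 0 (most significant) is R, bits 1-3 DID, bits 4-7 QID,
    # bits 8-10 TID, bits 11-15 RES.
    return (did << 12) | (qid << 8) | (tid << 5)


def parse_h264_svc_layer_index(layer_index):
    """Return (DID, QID, TID), ignoring R and RES as required on reception."""
    return ((layer_index >> 12) & 0x7,
            (layer_index >> 8) & 0xF,
            (layer_index >> 5) & 0x7)
```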
TODO: identifying layer refresh frames in an H.264 bitstream.
4.2. VP8
The VP8 RTP payload format [I-D.ietf-payload-vp8] defines temporal
scalability modes. It does not support spatial scalability.
+---------------+---------------+
|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|TID|            RES            |
+---------------+---------------+
Figure 6
Figure 6 shows the format of the layer index field for VP8 streams.
The "RES" field MUST be set to 0 on transmission and ignored on
reception. See [I-D.ietf-payload-vp8] Section 4.2 for details on the
TID field.
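As a non-normative illustration, the VP8 layer index carries only the
2-bit TID in its two most significant bits. A minimal Python sketch,
with hypothetical function names:

```python
def vp8_layer_index(tid):
    """Build the layer index of Figure 6; RES is sent as 0."""
    if not 0 <= tid <= 3:
        raise ValueError("TID is a 2-bit field")
    return tid << 14


def vp8_tid(layer_index):
    """Extract TID, ignoring the RES bits as required on reception."""
    return (layer_index >> 14) & 0x3
```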
TODO: identifying layer refresh frames in a VP8 bitstream.
4.3. H265
The initial version of the H.265 payload format
[I-D.ietf-payload-rtp-h265] defines temporal scalability, with
protocol elements reserved for spatial or other scalability modes
(which are expected to be defined in a future version of the
specification).
+---------------+---------------+
|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     RES     |  LayerId  | TID |
+-------------+-----------------+
Figure 7
Figure 7 shows the format of the layer index field for H.265 streams.
This is designed to follow the same layout as the first and second
bytes of the H.265 NAL unit header, which carry the stream's layer
information. The "RES" field MUST be set to 0 on transmission and
ignored on reception. See [I-D.ietf-payload-rtp-h265] Section 1.1.3
for details on the LayerId and TID fields.
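Because the layer index shares its layout with the two-byte NAL unit
header, it can be derived from a received NAL unit by clearing the
seven bits that Figure 7 marks as RES. The Python sketch below is a
non-normative illustration with hypothetical function names; it
treats TID as the raw 3-bit header field and makes no assumption
about how that value maps to a temporal layer number.

```python
def h265_layer_index_from_nal_header(header):
    """Derive the layer index of Figure 7 from an H.265 NAL unit header."""
    if len(header) < 2:
        raise ValueError("an H.265 NAL unit header is two bytes long")
    value = int.from_bytes(header[:2], "big")
    # Keep LayerId (6 bits) and TID (3 bits); clear the 7 RES bits.
    return value & 0x01FF


def h265_layer_id_and_tid(layer_index):
    """Return (LayerId, TID), ignoring RES as required on reception."""
    return (layer_index >> 3) & 0x3F, layer_index & 0x7
```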
TODO: identifying layer refresh frames in an H.265 bitstream.
4.4. VP9
The RTP payload format for VP9 [I-D.uberti-payload-vp9] defines how
it can be used for spatial and temporal scalability.
+---------------+---------------+
|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
+-------------+-----------------+
|  T  |R|  S  |       RES       |
+-------------+-----------------+
Figure 8
Figure 8 shows the format of the layer index field for VP9 streams.
This is designed to follow the same layout as the "L" byte of the VP9
payload header, which carries the stream's layer information. The
"R" and "RES" fields MUST be set to 0 on transmission and ignored on
reception. See [I-D.uberti-payload-vp9] for details on the T and S
fields.
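As a non-normative illustration, the VP9 layer index can be built and
parsed as sketched below in Python. The function names are
hypothetical, and the field widths (T: 3 bits, R: 1 bit, S: 3 bits,
RES: 9 bits) are an assumption read off the column boundaries of
Figure 8, since the text above does not state them.

```python
def vp9_layer_index(t, s):
    """Build the layer index of Figure 8; R and RES are sent as 0."""
    if not 0 <= t <= 7:
        raise ValueError("T is assumed to be a 3-bit field")
    if not 0 <= s <= 7:
        raise ValueError("S is assumed to be a 3-bit field")
    # Bits 0-2 (most significant) are T, bit 3 is R, bits 4-6 are S,
    # bits 7-15 are RES.
    return (t << 13) | (s << 9)


def parse_vp9_layer_index(layer_index):
    """Return (T, S), ignoring R and RES as required on reception."""
    return (layer_index >> 13) & 0x7, (layer_index >> 9) & 0x7
```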
Identification of a layer refresh frame can be derived from the
reference IDs of each frame by backtracking the dependency chain
until reaching a point where only decodable frames are being
referenced. It is therefore recommended, for both the flexible and
the non-flexible mode, that packets carrying upgrade frames encoded
in response to an LRR contain the layer indices and the reference
fields, so that the decoder or an MCU can make this derivation.
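The backtracking described above can be sketched as follows. This
Python example is a non-normative illustration under simplifying
assumptions: frames are identified by hypothetical integer IDs,
refs_of maps each frame that is still available to the IDs it
references, and decodable is the set of frames the receiver is
already able to decode. A reference to a frame that is neither
decodable nor available ends the search with a negative result.

```python
def references_only_decodable(frame_id, refs_of, decodable):
    """Backtrack the dependency chain of frame_id.

    Returns True if every chain of references ends at frames in
    `decodable`, i.e. frame_id can serve as an upgrade frame.
    """
    pending = list(refs_of[frame_id])
    visited = set()
    while pending:
        ref = pending.pop()
        if ref in decodable or ref in visited:
            continue
        if ref not in refs_of:
            # The chain leads to a frame that is not available.
            return False
        visited.add(ref)
        pending.extend(refs_of[ref])
    return True


# Hypothetical example: frames 10 and 12 are base-layer frames already
# decoded; frame 9 is an older higher-layer frame that was never received.
refs_of = {10: (), 11: (10, 9), 12: (10,), 13: (12,), 14: (13,)}
decodable = {10, 12}
```

With this data, frame 11 is rejected because it references frame 9,
while frames 13 and 14 are accepted; frame 14 is reached by
backtracking through frame 13 to the decodable frame 12.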
Example:
LRR {1,0}, {2,1} is sent by an MCU that is currently relaying {1,0}
to a receiver and wants to upgrade that receiver to {2,1}. In
response, the encoder should encode the next frames in layers {1,1}
and {2,1} so that they refer only to frames in {1,0} or {0,0}.
In the non-flexible mode, periodic upgrade frames can be defined by
the layer structure of the SS, thus periodic upgrade frames can be
automatically identified by the picture ID.
5. Usage with different scalability transmission mechanisms
Several different mechanisms are defined for how scalable streams can
be transmitted in RTP. The RTP Taxonomy
[I-D.ietf-avtext-rtp-grouping-taxonomy] Section 3.7 defines three
mechanisms: Single RTP Stream on a Single Media Transport (SRST),
Multiple RTP Streams on a Single Media Transport (MRST), and Multiple
RTP Streams on Multiple Media Transports (MRMT).
The LRR message is applicable to all these mechanisms. For MRST and
MRMT mechanisms, the SSRC ("media source") field of the LRR FCI is
set to the SSRC of the RTP stream containing the layer indicated by
the Current Layer Index (if "C" is 1), or the stream containing the
base encoded stream (if "C" is 0). For MRMT, the LRR message is sent
on the RTP session on which this stream is sent. On receipt, the
sender MUST refresh all the layers requested in the stream,
simultaneously in decode order.
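As a non-normative illustration of the SSRC selection rule for MRST
and MRMT, the following Python sketch picks the SSRC to place in the
FCI entry. The function name and the ssrc_by_layer mapping (layer
index to the SSRC of the RTP stream carrying that layer) are
hypothetical.

```python
def fci_ssrc_for_request(ssrc_by_layer, base_ssrc, current_layer=None):
    """Choose the FCI SSRC for an LRR sent over MRST or MRMT.

    current_layer=None corresponds to C=0 (no Current Layer Index).
    """
    if current_layer is None:
        # C = 0: address the stream carrying the base encoded stream.
        return base_ssrc
    # C = 1: address the stream carrying the current layer.
    return ssrc_by_layer[current_layer]
```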
Note: arguably, for the MRST and MRMT mechanisms, FIR feedback
messages could instead be used to refresh specific individual layers.
However, the usage of FIR for MRST/MRMT is not explicitly specified
anywhere, and if FIR is interpreted as refreshing layers, there is no
way to request an actual full, synchronized refresh of all the layers
of an MRST/MRMT layered source. Thus, the authors feel that
interpreting FIR as refreshing the entire source, and using LRR for
the individual layers, would be more useful.
6. Security Considerations
All the security considerations of FIR feedback packets [RFC5104]
apply to LRR feedback packets as well. Additionally, media senders
receiving LRR feedback packets MUST validate that the payload types
and layer indices they are receiving are valid for the stream they
are currently sending, and discard the requests if not.
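As a non-normative illustration of this validation step, a media
sender might check each received request as sketched below in Python.
The function name and the layers_by_payload_type mapping (payload
type to the set of layer indices currently being sent with it) are
hypothetical.

```python
def accept_lrr(payload_type, target_layer, current_layer,
               layers_by_payload_type):
    """Return True if the request is valid for the stream being sent.

    current_layer is None when the C bit of the request is 0.
    """
    layers = layers_by_payload_type.get(payload_type)
    if layers is None:
        return False  # payload type is not in use for this stream
    if target_layer not in layers:
        return False  # target layer is not being sent
    if current_layer is not None and current_layer not in layers:
        return False  # current layer is not being sent
    return True
```

Requests for which this check fails are discarded.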
7. IANA Considerations
The IANA is requested to register the following values:
- TODO: PSFB value for LRR
8. References
[I-D.ietf-avtext-rtp-grouping-taxonomy]
Lennox, J., Gross, K., Nandakumar, S., and G. Salgueiro,
"A Taxonomy of Grouping Semantics and Mechanisms for Real-
Time Transport Protocol (RTP) Sources", draft-ietf-avtext-
rtp-grouping-taxonomy-06 (work in progress), March 2015.
[I-D.ietf-payload-rtp-h265]
Wang, Y., Sanchez, Y., Schierl, T., Wenger, S., and M.
Hannuksela, "RTP Payload Format for High Efficiency Video
Coding", draft-ietf-payload-rtp-h265-07 (work in
progress), December 2014.
[I-D.ietf-payload-vp8]
Westin, P., Lundin, H., Glover, M., Uberti, J., and F.
Galligan, "RTP Payload Format for VP8 Video", draft-ietf-
payload-vp8-14 (work in progress), March 2015.
[I-D.uberti-payload-vp9]
Uberti, J., Holmer, S., Flodman, M., and J. Lennox, "RTP
Payload Format for VP9 Video", draft-uberti-payload-vp9-00
(work in progress), October 2014.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
"Extended RTP Profile for Real-time Transport Control
Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July
2006.
[RFC5104] Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
"Codec Control Messages in the RTP Audio-Visual Profile
with Feedback (AVPF)", RFC 5104, February 2008.
[RFC6190] Wenger, S., Wang, Y., Schierl, T., and A. Eleftheriadis,
"RTP Payload Format for Scalable Video Coding", RFC 6190,
May 2011.
Authors' Addresses
Jonathan Lennox
Vidyo, Inc.
433 Hackensack Avenue
Seventh Floor
Hackensack, NJ 07601
US
Email: jonathan@vidyo.com
Danny Hong
Vidyo, Inc.
433 Hackensack Avenue
Seventh Floor
Hackensack, NJ 07601
US
Email: danny@vidyo.com
Justin Uberti
Google, Inc.
747 6th Street South
Kirkland, WA 98033
USA
Email: justin@uberti.name
Stefan Holmer
Google, Inc.
Kungsbron 2
Stockholm 111 22
Sweden
Magnus Flodman
Google, Inc.
Kungsbron 2
Stockholm 111 22
Sweden