AVTEXT Working Group J. Samuelsson
Internet-Draft Ericsson
Intended status: Standards Track M. Coban
Expires: June 2015 Qualcomm
S. Wenger
Vidyo
December 15, 2014
Reference Picture Verification Information in the
RTP Audio-Visual Profile with Feedback (AVPF)
draft-samuelsson-avtext-rpvi-00.txt
Abstract
This document specifies an extension to the feedback messages defined
in the Audio-Visual Profile with Feedback (AVPF). The new Reference
Picture Verification Information (RPVI) feedback message conveys
information about available reference pictures in the decoded picture
buffer of a video decoder in the receiver of an RTP video stream.
By including information related to Decoded Picture Hash (DPH)
values, media senders and media receivers can verify that reference
pictures used for prediction by the video encoder and the video
decoder are aligned. It is also possible to use the RPVI feedback
message to indicate that a specific reference picture has incorrect
sample values (i.e. a mismatch in the DPH value between encoder and
decoder) or that a specific reference picture has been lost.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
Samuelsson, et al. Expires June 15, 2015 [Page 1]
Internet-Draft Reference Picture Verification Info December 2014
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
This Internet-Draft will expire on June 15, 2015.
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.
Table of Contents
1. Introduction
1.1. Applicability
2. Terminology
2.1. Standards Language
2.2. Glossary
3. Reference Picture Verification Information
3.1. Message Format
4. SDP Signaling
5. Security Considerations
6. IANA Considerations
7. References
7.1. Normative References
7.2. Informative References
8. Acknowledgments
1. Introduction
This document defines a new RTCP feedback message to augment those
defined in [RFC4585], [RFC5104] and [RFC6642], for use together with
video codecs that exploit temporal prediction through the use of one
or more reference pictures, e.g. [H.264], VP8 [RFC6386] and [HEVC].
1.1. Applicability
The video codecs [H.264] and [HEVC] both use temporal prediction in
order to achieve efficient compression without compromising the
visual quality of the compressed video. Video data (frames/pictures)
are encoded together with non-video data (such as parameter sets) and
an abstraction layer is used to structure the encoded bits in a
format suitable for network transportation.
A stream encoded according to H.264 or HEVC, and packetized according
to [RFC6184] and [I-D.ietf-payload-rtp-h265], respectively, is
typically transmitted from a media sender to a media receiver. The
media sender encodes the video and the media receiver decodes the
video. During the entire session (or, more specifically, within a
coded video sequence), it is crucial that the process performed at the
decoder is aligned with the process performed at the encoder. Even
the slightest difference in the sample values of a decoded picture
can result in severe visual degradation when the picture is used for
prediction by following pictures.
There are several factors that can affect the alignment of encoding
and decoding processes:
o Loss of data. In many applications it is possible to detect the
loss of RTP packets and perform appropriate actions for repairing
the loss without delivering corrupt data to the video decoder.
However, in some applications such methods may not be available
(for example due to delay constraints) or they may fail.
o Bit errors. If the receiver does not have means for detecting
individual bit errors, such errors may occur in the data that is
delivered to the video decoder.
o Random access. When performing random access into a stream it
might be difficult for the decoder to deduce if it is operating
with the correct parameters and reference pictures.
o Hardware failure. The hardware in the decoder could be
malfunctioning, for example if it is not able to correctly store
decoded pictures used for prediction.
o Incorrect implementations. Ideally all video encoders and video
decoders would be implemented impeccably according to the codec
specification. However, in practice there is unfortunately the
risk of misinterpretation of the specification as well as the risk
of implementation bugs.
The feedback message specified in this memo can be utilized to
detect misalignment between encoder and decoder reference pictures.
Other mechanisms not specified herein (such as sending IDR pictures)
can be utilized to combat the potential negative effects of
an encoder/decoder misalignment.
2. Terminology
2.1. Standards Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
2.2. Glossary
AVPF - Audio-Visual Profile with Feedback
DPH - Decoded Picture Hash
FCI - Feedback Control Information [RFC4585]
IDR - Instantaneous Decoder Refresh
RPVI - Reference Picture Verification Information
SEI - Supplemental Enhancement Information
3. Reference Picture Verification Information
A Reference Picture Verification Information (RPVI) feedback message
can be sent by media receivers to report which reference pictures are
available in the decoded picture buffer. Along with identifiers of
the available reference pictures it is possible to transmit the
result of verifying the Decoded Picture Hash (DPH) values or to
transmit the actual DPH values (see section 3.1). The feedback
message can be sent at any time during an RTP session. This memo does
not describe the process for handling incorrect DPH values. However,
in order to achieve good media quality and recover from errors in the
sample values of decoded pictures it is strongly recommended that a
media sender (encoder) takes appropriate actions upon the detection
of an incorrect DPH value or negative acknowledgements (NACK). Such
actions could for example include:
o Transmission of data that resets the state of the decoder, e.g. an
Instantaneous Decoder Refresh (IDR) picture. By providing a
refresh-point, the media sender can ensure that errors that have
occurred in decoded reference pictures do not propagate to future
pictures.
o Encoding following pictures using "old" reference pictures that
have been received, decoded and preferably verified to have
correct sample values. Excluding all references to pictures with
incorrect sample values will give the same effect as providing a
refresh-point: errors that are present in decoded reference
pictures do not propagate to future pictures.
o Retransmission of parameter sets. If an update of parameter sets
is lost, there is a risk that the decoder uses some parameters
incorrectly (e.g. too strong deblocking filter) without detectable
errors in the decoding process. By retransmitting the parameter
sets the encoder can make sure that the correct parameters are
used but it is not sufficient on its own for recovering from
errors in sample values of decoded reference pictures. This action
is recommended to be combined with one of the first two actions in
this list.
o Changing encoder settings or parameters to avoid configurations
that cause incorrect decoder state. When errors continuously
appear (even after performing one or both of the first two actions
in this list) a media sender can try to change the configuration
of the encoder in order to find a setting that does not result in
errors in the decoded pictures.
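The first two actions in the list above can be sketched as simple
sender-side bookkeeping. The following Python fragment is only an
illustration, not part of this specification: the function names, the
tuple layout of a feedback entry and the action labels are all
hypothetical, and the MT values are those defined in Section 3.1.
Treating an MT=1 entry for which the encoder has no hash of its own as
unusable is a conservative assumption made here. Pictures that merely
depend on a banned picture are not tracked.

```python
from enum import IntEnum


class MT(IntEnum):
    NO_HASH = 0        # available at the receiver, no hash information
    HASH_INCLUDED = 1  # available, DPH carried in the entry
    LOST = 2           # entirely or partially lost
    HASH_MISMATCH = 3  # receiver verified the picture to be incorrect


def usable_references(entries, encoder_hashes):
    """Return (usable, banned) sets of RefPicId values.

    entries: iterable of (mt, ref_pic_id, dph) tuples (hypothetical layout).
    encoder_hashes: dict mapping RefPicId to the encoder's own DPH bytes.
    """
    usable, banned = set(), set()
    for mt, ref_pic_id, dph in entries:
        if mt in (MT.LOST, MT.HASH_MISMATCH):
            banned.add(ref_pic_id)
        elif mt == MT.HASH_INCLUDED and encoder_hashes.get(ref_pic_id) != dph:
            # Mismatch found by the encoder. A missing encoder-side hash
            # is also treated as a mismatch (assumption of this sketch).
            banned.add(ref_pic_id)
        else:
            usable.add(ref_pic_id)
    return usable - banned, banned


def choose_action(usable, banned):
    """Pick one of the first two recovery actions (hypothetical labels)."""
    if not banned:
        return "continue"
    if usable:
        return "predict-from-usable"  # reference only pictures in `usable`
    return "send-idr"
```

A sender would call usable_references() for each received RPVI
message and pass the result to choose_action() before encoding the
next picture. How the chosen action is carried out is up to the
encoder.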
3.1. Message Format
The RPVI message is identified by RTCP packet type value PT=PSFB and
FMT=TBD. The Feedback Control Information (FCI) for RPVI consists of
one or more FCI entries, the content of which is depicted in Figure
1. Each entry applies to a different reference picture, identified by
its Reference Picture Identifier.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MT| Reserved6 |                   RefPicId                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   RefPicId    |                                               |
+-+-+-+-+-+-+-+-+                                               +
|                                                               |
+              Decoded Picture Hash (conditional)               |
+                                                               +
|                                                               |
+               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               |
+-+-+-+-+-+-+-+-+
Figure 1 Syntax of an FCI Entry in the RPVI message
The semantics of the fields are as follows:
MT: 2 bits
Indicates the picture status information as follows:
0: No hash information regarding the correctness of the
reference picture is available.
1: The Decoded Picture Hash of the reference picture is
included in the Decoded Picture Hash field of this entry.
2: The indicated picture is entirely or partially lost,
hence not fully decodable.
3: The Decoded Picture Hash has been used to verify the
reference picture to be incorrect.
When MT equals 0 or 1, the reference picture identified by the
current entry is indicated as being available in the receiver's
decoded picture buffer. If the picture is also available in the
sender's decoded picture buffer, it may be used for reference
when encoding the next picture after reception of the RPVI
feedback message, with one exception: when MT equals 1 and the
encoder finds that the provided hash of the reference picture
does not match the encoder's own hash value, the encoder MUST
NOT use the reference picture.
Informative note: When a feedback message contains one or
more RPVI entries with MT equal to 0 or 1, the encoder may
select, for use as reference, one or more of the identified
pictures and/or reference pictures whose availability can be
inferred from that of the indicated pictures. The selection
of which picture(s) to use for reference is out of scope of
this memo but may for example be based on maximizing
compression efficiency.
When MT equals 2 or 3 the reference picture identified by the
current entry MUST NOT be used for reference for the next
picture or any picture that follows the next picture. Other
reference pictures that use the reference picture identified by
the current entry SHOULD NOT be used for reference, unless
their Decoded Picture Hash has been verified to be correct.
Reserved6: 6 bits
This field is reserved for future definition. In the absence of
such a definition, the bits in this field MUST be set to zero
and ignored by the receiver of the RPVI feedback message.
RefPicId: 32 bits
If the video codec used for the media stream is HEVC, RefPicId
represents the value of the PicOrderCntVal (in network byte
order) of the reference picture, as defined in [HEVC].
If the video codec used for the media stream is H.264, RefPicId
represents the value of the frame_num (in network byte order)
of the reference picture, as defined in [H.264].
If the video codec used for the media stream is neither HEVC
nor H.264, the picture identifier RefPicId SHOULD be defined
outside of this specification.
Decoded Picture Hash: Variable number of bytes
Present only if MT equals 1. Represents the Decoded Picture Hash
Supplemental Enhancement Information (SEI) data (in network
byte order), see D.2.19 of [HEVC], of the decoded picture. The
Decoded Picture Hash data starts with a one byte type field,
which can be used to calculate the amount of hash data. For
video encoded with three color components, such as YCbCr and
RGB, the total length of the Decoded Picture Hash will be 49
bytes when the first byte equals 0, 7 bytes when the first byte
equals 1 and 13 bytes when the first byte equals 2.
Informative note: At the time of writing this memo, the
Decoded Picture Hash SEI message is only specified for HEVC.
However, the DPH calculations defined in D.3.19 of [HEVC]
operate only on decoded sample values and are therefore codec
agnostic. The DPH SEI message defined in D.2.19 of [HEVC]
does not contain any HEVC specific information and can
therefore easily be replicated in the context of any video
codec that decodes encoded data into arrays of sample values,
such as H.264.
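The entry layout in Figure 1 and the length rule for the Decoded
Picture Hash field can be illustrated with a short Python sketch. It
is a non-normative example under stated assumptions: the function
names are hypothetical, RefPicId is handled as an unsigned 32-bit
value (this memo does not say how a negative PicOrderCntVal would be
mapped), and the per-component hash sizes (16, 2 and 4 bytes) are
derived from the totals of 49, 7 and 13 bytes given above for three
color components.

```python
import struct

# Hash bytes per color component, keyed by the DPH type byte.
# Derived from the totals in the text: (49 - 1) / 3, (7 - 1) / 3, (13 - 1) / 3.
HASH_BYTES_PER_COMPONENT = {0: 16, 1: 2, 2: 4}


def dph_length(hash_type, num_components=3):
    """Total DPH length in bytes, including the one-byte type field."""
    return 1 + num_components * HASH_BYTES_PER_COMPONENT[hash_type]


def pack_entry(mt, ref_pic_id, dph=b""):
    """Serialize one FCI entry: MT (2 bits), Reserved6 (6 zero bits),
    RefPicId (32 bits, network byte order), then the DPH if MT == 1."""
    if not 0 <= mt <= 3:
        raise ValueError("MT is a 2-bit field")
    if (mt == 1) != bool(dph):
        raise ValueError("DPH is present if and only if MT == 1")
    # "!" selects network byte order without alignment padding (5 bytes).
    return struct.pack("!BI", mt << 6, ref_pic_id) + dph


def parse_entry(buf, offset=0, num_components=3):
    """Parse one FCI entry; return (mt, ref_pic_id, dph, next_offset)."""
    first, ref_pic_id = struct.unpack_from("!BI", buf, offset)
    mt = first >> 6  # the low 6 bits (Reserved6) are ignored on receipt
    offset += 5
    dph = b""
    if mt == 1:
        n = dph_length(buf[offset], num_components)
        dph = bytes(buf[offset:offset + n])
        if len(dph) != n:
            raise ValueError("truncated Decoded Picture Hash")
        offset += n
    return mt, ref_pic_id, dph, offset


# Two entries back to back: one carrying a type-1 hash with placeholder
# bytes (not a real hash value), one reporting a lost picture.
fci = pack_entry(1, 42, bytes([1]) + bytes(6)) + pack_entry(2, 41)
```

Because the hash length is implied by the type byte, a parser can walk
consecutive entries by feeding the returned offset back into
parse_entry(). Alignment of the FCI to a 32-bit boundary, as required
for RTCP packets, is not addressed by this sketch.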
4. SDP Signaling
A new "ack" and "nack" feedback parameter "rpvi" is defined to
indicate the usage of the RPVI feedback message.
(In the following ABNF [RFC5234], rtcp-fb-ack-param and
rtcp-fb-nack-param are used as defined in [RFC4585].)
rtcp-fb-ack-param =/ SP "rpvi"
rtcp-fb-nack-param =/ SP "rpvi"
The following parameter is defined in this document for use with
'ack':
o 'rpvi' stands for Reference Picture Verification Information and
indicates the use of RPVI messages as defined in Section 3.
The following parameter is defined in this document for use with
'nack':
o 'rpvi' stands for Reference Picture Verification Information and
indicates the use of RPVI messages as defined in Section 3.
The offer/answer rules for these SDP feedback parameters are
specified in the RTP/AVPF profile [RFC4585].
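As an illustration of the ABNF above, the following Python sketch
builds the corresponding "a=rtcp-fb" attribute lines using the
attribute syntax of [RFC4585]. The helper name, the payload type 96
and the port number are hypothetical and chosen only for the example.

```python
def rtcp_fb_lines(payload_type="*", modes=("ack", "nack")):
    """Build a=rtcp-fb attribute lines advertising RPVI.

    payload_type: an RTP payload type number, or "*" for all payload types.
    """
    for mode in modes:
        if mode not in ("ack", "nack"):
            raise ValueError("rpvi is only defined for ack and nack")
    return ["a=rtcp-fb:%s %s rpvi" % (payload_type, mode) for mode in modes]


# Hypothetical media section: port 49170 and payload type 96 are
# illustrative values, not taken from this memo.
media_section = "\r\n".join(
    ["m=video 49170 RTP/AVPF 96", "a=rtpmap:96 H265/90000"]
    + rtcp_fb_lines(96)
) + "\r\n"
```

With the defaults above, rtcp_fb_lines(96) yields the two lines
"a=rtcp-fb:96 ack rpvi" and "a=rtcp-fb:96 nack rpvi".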
Methods and rules for when to send RPVI messages are out of scope of
this memo. When the RPVI message is used in "ack" mode it may for
example be sent at a regular interval or for all pictures that
fulfill certain requirements (such as being coded as Intra
pictures). However, it is possible in both "ack" mode and "nack" mode
to send the RPVI message in response to a specific event (such as a
picture loss). When the "ack" mode is used for MT equal to 2 or 3, the
message can be said to represent an acknowledgement of having received
enough data to derive the picture identifier (RefPicId) of the
indicated picture, while indicating that some data appears to be
missing (MT equal to 2) or that the sample values seem to be
incorrect (MT equal to 3).
5. Security Considerations
The security considerations documented in [RFC4585] are also
applicable for the RPVI message defined in this document.
More specifically, a malicious group member can report incorrect DPH
values in RPVI feedback messages to make the sender throttle the data
transmission and increase the amount of redundancy information or
take other action to deal with the pretended incorrect DPH value
(e.g. change encoder configuration). This may result in a
degradation of the quality of the reproduced media stream.
A solution to prevent such attacks with maliciously sent RPVI feedback
messages is to apply an authentication and integrity protection
framework for the feedback messages. This can be accomplished using
the RTP profile that combines Secure RTP [RFC3711] and AVPF into
SAVPF [RFC5124].
6. IANA Considerations
A new RPVI Feedback Message Type should be registered with IANA in
"FMT Values for PSFB Payload Types".
7. References
7.1. Normative References
[H.264] ITU-T Recommendation H.264, "Advanced video coding for
generic audiovisual services", February 2014,
<http://www.itu.int/rec/T-REC-H.264-201402-P>.
[HEVC] ITU-T Recommendation H.265, "High Efficiency Video Coding",
April 2013, <http://www.itu.int/rec/T-REC-H.265-201304-I>.
[RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
"Extended RTP Profile for Real-Time Transport Control
Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July
2006.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
Norrman, "The Secure Real-time Transport Protocol (SRTP)",
RFC 3711, March 2004.
[RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for
Real-time Transport Control Protocol (RTCP)-Based Feedback
(RTP/SAVPF)", RFC 5124, February 2008.
7.2. Informative References
[RFC5104] Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
"Codec Control Messages in the RTP Audio-Visual Profile
with Feedback (AVPF)", RFC 5104, February 2008.
[RFC6642] Wu, Q., Xia, F., and R. Even, "RTP Control Protocol (RTCP)
Extension for a Third-Party Loss Report", RFC 6642, June
2012.
[RFC6184] Wang, Y., Even, R., Kristensen, T., and R. Jesup, "RTP
Payload Format for H.264 Video", RFC 6184, May 2011.
[I-D.ietf-payload-rtp-h265]
Wang, Y., Sanchez, Y., Schierl, T., Wenger, S. and M.
Hannuksela, "RTP Payload Format for High Efficiency Video
Coding", draft-ietf-payload-rtp-h265 (work in progress),
August 2014.
8. Acknowledgments
The authors would like to thank Bo Burman, Rickard Sjoberg and Magnus
Westerlund for valuable feedback during the development of this memo.
This document was prepared using 2-Word-v2.0.template.dot.
Authors' Addresses
Jonatan Samuelsson
Ericsson
Farogatan 6, 164 80, Stockholm, Sweden
Phone: +46 761 26 35 91
Email: jonatan.samuelsson@ericsson.com
Muhammed Coban
Qualcomm
Email: mcoban@qti.qualcomm.com
Stephan Wenger
Vidyo
Email: stewe@stewe.org