Network Working Group                                           A. Roach
Internet-Draft                                                   Mozilla
Intended status: Standards Track                           June 12, 2015
Expires: December 14, 2015
WebRTC Video Processing and Codec Requirements
draft-ietf-rtcweb-video-06
This specification provides the requirements and considerations for WebRTC applications to send and receive video across a network. It specifies the video processing that is required, as well as video codecs and their parameters.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on December 14, 2015.
Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
One of the major functions of WebRTC endpoints is the ability to send and receive interactive video. The video might come from a camera, a screen recording, a stored file, or some other source. This specification provides the requirements and considerations for WebRTC applications to send and receive video across a network. It specifies the video processing that is required, as well as video codecs and their parameters.
Note that this document only discusses those issues dealing with video codec handling. Issues that are related to transport of media streams across the network are specified in [I-D.ietf-rtcweb-rtp-usage].
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in [RFC2119].
This section provides guidance on pre- and post-processing of video streams.
Unless specified otherwise by the SDP or codec, the color space SHOULD be sRGB [SRGB]. For clarity, this is the color space indicated by codepoint 1 from “ColourPrimaries” as defined in [IEC23001-8].
Unless specified otherwise by the SDP or codec, the video scan pattern for video codecs is Y’CbCr 4:2:0.
This document imposes no normative requirements on camera capture; however, implementors are encouraged to take advantage of the following features, if feasible for their platform:
If the video source is some portion of a computer screen (e.g., desktop or application sharing), then the considerations in this section also apply.
Because screen-sourced video can change resolution (due to, e.g., window resizing and similar operations), WebRTC video recipients MUST be prepared to handle mid-stream resolution changes in a way that preserves their utility. Precise handling (e.g., resizing the element a video is rendered in versus scaling down the received stream; decisions around letter/pillarboxing) is left to the discretion of the application.
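As one illustration of the handling that is left to the application, the following Python sketch computes a letterboxed or pillarboxed placement for a frame whose resolution has just changed. The function name and the notion of a fixed-size rendering element are assumptions of this example; this specification defines neither.

```python
def fit_frame(frame_w, frame_h, elem_w, elem_h):
    """Scale a frame to fit inside a rendering element, keeping aspect ratio.

    Returns (width, height, x_offset, y_offset) in element pixels.
    Hypothetical helper for illustration only.
    """
    if min(frame_w, frame_h, elem_w, elem_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # Compare aspect ratios by cross-multiplication to stay in integers.
    if frame_w * elem_h >= frame_h * elem_w:
        # Frame is relatively wider than the element: fill the width and
        # leave bars above and below (letterbox).
        out_w = elem_w
        out_h = max(1, frame_h * elem_w // frame_w)
    else:
        # Frame is relatively taller: fill the height and leave bars on
        # the left and right (pillarbox).
        out_h = elem_h
        out_w = max(1, frame_w * elem_h // frame_h)
    return out_w, out_h, (elem_w - out_w) // 2, (elem_h - out_h) // 2
```

A receiver could call such a function again each time a decoded frame arrives with dimensions that differ from the previous frame.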
Note that the default video scan format (Y’CbCr 4:2:0) is known to be less than optimal for the representation of screen content produced by most systems in use at the time of this document’s publication, which generally use RGB with at least 24 bits per sample. In the future, it may be advisable to use video codecs optimized for screen content for the representation of this type of content.
Additionally, attention is drawn to the requirements in [I-D.ietf-rtcweb-security-arch] section 5.2 and the considerations in [I-D.ietf-rtcweb-security] section 4.1.1.
In some circumstances – and notably those involving mobile devices – the orientation of the camera may not match the orientation used by the encoder. Of more importance, the orientation may change over the course of a call, requiring the receiver to change the orientation in which it renders the stream.
While the sender may elect to simply change the pre-encoding orientation of frames, this may not be practical or efficient (in particular, in cases where the interface to the camera returns pre-compressed video frames). Note that the potential for this behavior adds another set of circumstances under which the resolution of the video might change in the middle of a stream, in addition to those mentioned under “Screen Sourced Video,” above.
To accommodate these circumstances, WebRTC implementations that can generate media in orientations other than the default MUST support generating the R0 and R1 bits of the Coordination of Video Orientation (CVO) mechanism described in section 7.4.5 of [TS26.114], and MUST send them for all orientations when the peer indicates support for the mechanism. They MAY support sending the other bits in the CVO extension, including the higher-resolution rotation bits. All implementations SHOULD support interpretation of the R0 and R1 bits, and MAY support the other CVO bits.
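The R0 and R1 bits can be pictured with a short Python sketch. The bit layout assumed here (0 0 0 0 C F R1 R0, with R0 as the least significant bit) is recalled from [TS26.114] rather than stated in this document, and should be checked against that specification before use. The function names are hypothetical. The sketch handles only the basic one-byte form, not the higher-resolution rotation bits, and reports the rotation as a raw two-bit code, because the direction convention for each code is defined in [TS26.114] and is not reproduced here.

```python
def parse_cvo_byte(cvo):
    """Split a one-byte CVO value into its fields (assumed layout: 0000 C F R1 R0)."""
    if not 0 <= cvo <= 0xFF:
        raise ValueError("CVO value must fit in one byte")
    return {
        "rotation_code": cvo & 0x03,   # R1 R0: rotation in 90-degree steps
        "flip": bool(cvo & 0x04),      # F bit
        "camera": bool(cvo & 0x08),    # C bit
    }


def make_cvo_byte(rotation_code, flip=False, camera=False):
    """Build a one-byte CVO value from a two-bit rotation code and the F and C bits."""
    if rotation_code not in (0, 1, 2, 3):
        raise ValueError("rotation_code must be 0, 1, 2, or 3")
    return rotation_code | (0x04 if flip else 0) | (0x08 if camera else 0)
```

An implementation that supports only R0 and R1 would use the rotation code and ignore the other two fields.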
Further, some codecs support in-band signaling of orientation (for example, the SEI “Display Orientation” messages in H.264 and H.265). If CVO has been negotiated, then the sender MUST NOT make use of such codec-specific mechanisms. However, when support for CVO is not signaled in the SDP, then such implementations MAY make use of the codec-specific mechanisms instead.
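Whether CVO is signaled can be checked by looking for the corresponding RTP header extension mapping in the SDP. The Python sketch below assumes that CVO is identified by the URN "urn:3gpp:video-orientation" in an "a=extmap" attribute; that URN comes from [TS26.114] and is not stated in this document. The function names are hypothetical. The sketch inspects a single media section only, whereas actual negotiation requires the extension to appear in both the offer and the answer.

```python
CVO_URN = "urn:3gpp:video-orientation"  # assumed URN, see [TS26.114]


def cvo_extmap_id(sdp_media_section):
    """Return the extmap ID mapped to CVO in one SDP media section, or None."""
    for line in sdp_media_section.splitlines():
        line = line.strip()
        if not line.startswith("a=extmap:"):
            continue
        fields = line[len("a=extmap:"):].split()
        if len(fields) >= 2 and fields[1] == CVO_URN:
            # The ID may carry a direction suffix, e.g. "3/sendonly".
            return int(fields[0].split("/")[0])
    return None


def may_use_codec_orientation(sdp_media_section):
    """Codec-specific orientation signaling is only allowed when CVO is absent."""
    return cvo_extmap_id(sdp_media_section) is None
```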
For the definitions of “WebRTC Browser,” “WebRTC Non-Browser”, and “WebRTC-Compatible Endpoint” as they are used in this section, please refer to [I-D.ietf-rtcweb-overview].
WebRTC Browsers MUST implement the VP8 video codec as described in [RFC6386] and H.264 Constrained Baseline as described in [H264].
WebRTC Non-Browsers that support transmitting and/or receiving video MUST implement the VP8 video codec as described in [RFC6386] and H.264 Constrained Baseline as described in [H264].
“WebRTC-compatible endpoints” are free to implement any video codecs they see fit. This follows logically from the definition of “WebRTC-compatible endpoint.” It is, of course, advisable to implement at least one of the video codecs that is mandated for WebRTC Browsers, and implementors are encouraged to do so.
SDP allows for codec-independent indication of preferred video resolutions using the mechanism described in [RFC6236]. WebRTC endpoints MAY send an “a=imageattr” attribute to indicate the maximum resolution they wish to receive. Senders SHOULD interpret and honor this attribute by limiting the encoded resolution to the indicated maximum size, as the receiver may not be capable of handling higher resolutions.
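A sender that honors this attribute needs to reduce its encoded resolution to fit the indicated maximum. The Python sketch below shows one way to do so while preserving the aspect ratio. It assumes that the maximum width and height have already been extracted from the “a=imageattr” attribute; parsing the [RFC6236] syntax is not shown, and the function name is hypothetical.

```python
def limit_resolution(src_w, src_h, max_w, max_h):
    """Return an encode size no larger than (max_w, max_h), keeping aspect ratio."""
    if min(src_w, src_h, max_w, max_h) <= 0:
        raise ValueError("all dimensions must be positive")
    if src_w <= max_w and src_h <= max_h:
        return src_w, src_h  # already within the receiver's limit
    # Pick the tighter of the two constraints, using integer arithmetic.
    if max_w * src_h <= max_h * src_w:
        # Width is the binding constraint.
        return max_w, max(1, src_h * max_w // src_w)
    # Height is the binding constraint.
    return max(1, src_w * max_h // src_h), max_h
```

A real encoder may impose further constraints on the result (for example, even dimensions), which this sketch does not model.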
Additionally, codecs may include codec-specific means of signaling maximum receiver abilities with regard to resolution, frame rate, and bitrate.
Unless otherwise signaled in SDP, recipients of video streams MUST be able to decode video at a rate of at least 20 fps at a resolution of at least 320 pixels by 240 pixels. These values are selected based on the recommendations in [HSUP1].
Encoders are encouraged to support encoding media with at least the same resolution and frame rates cited above.
For the VP8 codec, defined in [RFC6386], endpoints MUST support the payload formats defined in [I-D.ietf-payload-vp8].
In addition to the [RFC6236] mechanism, VP8 encoders MUST limit the streams they send to conform to the values indicated by receivers in the corresponding max-fr and max-fs SDP attributes.
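The following Python sketch shows how an encoder might test a candidate stream configuration against these two limits. It assumes that max-fs is expressed in macroblocks of 16 by 16 pixels and max-fr in frames per second, as recalled from [I-D.ietf-payload-vp8]; any further constraints defined in that payload format are not modeled. The function names are hypothetical.

```python
def vp8_frame_size_mb(width, height):
    """Frame size in 16x16 macroblocks, rounding partial macroblocks up."""
    return ((width + 15) // 16) * ((height + 15) // 16)


def conforms_to_vp8_limits(width, height, fps, max_fs=None, max_fr=None):
    """Check a candidate stream against a receiver's max-fs and max-fr values.

    A limit of None means the receiver did not signal that attribute.
    """
    if max_fs is not None and vp8_frame_size_mb(width, height) > max_fs:
        return False
    if max_fr is not None and fps > max_fr:
        return False
    return True
```

For instance, a 1280 by 720 frame occupies 80 by 45 macroblocks, or 3600 in total, so it fits a receiver that signals max-fs=3600.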
Unless otherwise signaled, implementations that use VP8 MUST encode and decode pixels with an implied 1:1 (square) aspect ratio.
For the [H264] codec, endpoints MUST support the payload formats defined in [RFC6184]. In addition, they MUST support Constrained Baseline Profile Level 1.2, and they SHOULD support H.264 Constrained High Profile Level 1.3.
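In SDP, the H.264 profile and level are typically carried in the “profile-level-id” parameter defined in [RFC6184], a six-digit hexadecimal string whose three bytes hold profile_idc, the constraint flags (profile-iop), and level_idc. The Python sketch below only splits that string into its three bytes. It does not interpret the constraint flags or handle special cases such as Level 1b, and the function name is hypothetical. The example value "42e00c" is a commonly seen encoding for Constrained Baseline at Level 1.2 and is offered as an assumption, not as text from this document.

```python
def parse_profile_level_id(value):
    """Split a profile-level-id string into (profile_idc, profile_iop, level_idc)."""
    if len(value) != 6:
        raise ValueError("profile-level-id must be six hexadecimal digits")
    raw = bytes.fromhex(value)  # raises ValueError on non-hex input
    return raw[0], raw[1], raw[2]


# profile_idc 66 is Baseline; level_idc 12 corresponds to Level 1.2.
profile_idc, profile_iop, level_idc = parse_profile_level_id("42e00c")
```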
Implementations of the H.264 codec have utilized a wide variety of optional parameters. To improve interoperability the following parameter settings are specified:
H.264 codecs MAY send and MUST support proper interpretation of SEI “filler payload” and “full frame freeze” messages. “Full frame freeze” messages are used in video switching MCUs, to ensure a stable decoded displayed picture while switching among various input streams.
When the use of the video orientation (CVO) RTP header extension is not signaled as part of the SDP, H.264 implementations MAY send and SHOULD support proper interpretation of Display Orientation SEI messages.
Implementations MAY send and act upon “User data registered by Rec. ITU-T T.35” and “User data unregistered” messages. Even if they do not act on them, implementations MUST be prepared to receive such messages without any ill effects.
Unless otherwise signaled, implementations that use H.264 MUST encode and decode pixels with an implied 1:1 (square) aspect ratio.
This specification does not introduce any new mechanisms or security concerns beyond what is in the other documents it references. In WebRTC, video is protected using DTLS/SRTP. A complete discussion of the security considerations can be found in [I-D.ietf-rtcweb-security] and [I-D.ietf-rtcweb-security-arch]. Implementors should consider whether the use of variable bit rate video codecs is appropriate for their application, keeping in mind that the degree of inter-frame change (and, by inference, the amount of motion in the frame) may be deduced by an eavesdropper based on the video stream’s bit rate.
Implementors making use of H.264 are also advised to take careful note of the “Security Considerations” section of [RFC6184], paying special regard to the normative requirement pertaining to SEI messages.
This document requires no actions from IANA.
The author would like to thank Gaelle Martin-Cocher, Stephan Wenger, and Bernard Aboba for their detailed feedback and assistance with this document. Thanks to Cullen Jennings for providing text and review, and to Russ Housley for a careful final review. This draft includes text from draft-cbran-rtcweb-codec.
[I-D.ietf-rtcweb-rtp-usage]  Perkins, C., Westerlund, M., and J. Ott, "Web Real-Time Communication (WebRTC): Media Transport and Use of RTP", Internet-Draft draft-ietf-rtcweb-rtp-usage-24, May 2015.
[I-D.ietf-rtcweb-security]  Rescorla, E., "Security Considerations for WebRTC", Internet-Draft draft-ietf-rtcweb-security-08, February 2015.
[I-D.ietf-rtcweb-security-arch]  Rescorla, E., "WebRTC Security Architecture", Internet-Draft draft-ietf-rtcweb-security-arch-11, March 2015.