Audio/Video Transport Working Group Q. Wu
Internet-Draft R. Even
Intended status: Standards Track Huawei
Expires: April 23, 2013 October 22, 2012

Bandwidth and RTCP timing issues for multi-source endpoint
draft-wu-avtcore-multiplex-multisource-endpoint-01.txt

Abstract

This document discusses bandwidth issues and RTCP timing rule issues that arise when the multi-source endpoint multiplexing all the media type in one RTP session and follows RFC3550 timing rules. It provides recommendations for multi-source host sending multiple media types in the same session.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http:/⁠/⁠datatracker.ietf.org/⁠drafts/⁠current/⁠.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on April 23, 2013.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http:/⁠/⁠trustee.ietf.org/⁠license-⁠info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


Table of Contents

1. Introduction

Multiplexing is a method by which multiple streams are combined into one stream over a shared medium. In RTP, multiplexing is provided by the destination transport address (network address and port number) which is different for each RTP session. Given each media type having different requirement for bandwidth, multiple media types (e.g., Separate audio and video streams) are usually not carried in a single RTP session.

However for some applications that use unicast transport, e.g., in RTCWeb application, multiple-source hosts may use a different SSRC for each medium but sending them in the same RTP session, which reduces communication failure due to NAT and firewall when using multiple RTP sessions or multiple transport flow. If these multi-source hosts still follow RFC3550 timing rules, audio and video are multiplexed onto a single RTP session and share a common session bandwidth, the audio flows sending at much lower rates will waste a large amount of bandwidth.

This document provides recommendations for multi-source host sending multiple media types in the same session.

2. Terminology

2.1. Standards Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

Multi-Source Endpoint


End system with multiple sources generated from one host and running on that host. One example of multi-source endpoint is an RTP endpoint which has multiple capture devices of the same media type and characteristics.
Single-Source Endpoint

End system with single source generated from one host described in RFC3550.

3. Use Cases for multi-source host communications

3.1. One to one Communication between multi-source endpoints

       Multiple-Source              Multiple-Source
           Endpoint1                    Endpoint2
          +--------+                   +--------+
          |        |  Common Transport |        |
          |        |                   |        |
          |        |Transport flow1(v1,a1)      |
SSRC-V1---|        |------------------>|        |--- SSRC-V2
          |        |                   |        |
          |        |                   |        |
          |        |                   |        |
          |        |                   |        |
SSRC-A1---|        |Transport flow1(v2,a2)      |
          |        |<------------------|        +----SSRC-A2
          |        |                   |        |
          |        |                   |        |
          |        |                   |        |
          +--------+                   +--------+   

In the figure 1, one to one communication between multiple-source endpoint1 and multiple-source endpoint2 takes place. In order to reduce unicast transport flow to facilitate NAT/FW traversal, Multiple-Source endpoint 1 and multiple-source endpoint2 in one to one communication share the same transport. Both video stream v1 and audio stream a1 from multiple-source endpoint 1 are multiplexed in the RTP session1 and transmitted over the transport flow1 to multiple-source endpoint 2. Similarly, both video stream v2 and audio stream a2 from multiple-source endpoint2 are multiplexed in the RTP session 2 and transmitted over the same transmitted over the same transport flow 1 as multiple source endpoint1.

Multiple-source endpoint1 sends RTCP SR and RR packets for every active media stream it’s receiving (i.e.,v2,a2) from every local source (i.e.,v1,a1), which wastes a lot of bandwidth for redundant statistics reports. So does multiple-source endpoint2.

Also the bandwidth usage for each stream is limited by its media data rate and audio stream usually consumes lower bandwidth than video stream (e.g.,high quality audio with 80kbps and high quality video with 350 kbps). When audio stream and video stream share the same transport, i.e., sharing the common session bandwidth between audio and video, audio stream will still be sent at its own media data rate far below the common session bandwidth and therefore waste lots of bandwidth provided by the transport. Further when reception reports are used for audio stream and follows RFC3550 timing rule, RTCP SR and RR packets for active audio stream will be reported more frequent than when audio and video are carried in separate transports since more RTCP bandwidth are allocated to audio stream reporting when more session bandwidth are allocated to audio stream.

3.2. Multi-party Communication between multi-source endpoints

                             +---+
                             |   |
                    Transport flow 1(v1,a1)-----------+
                    Transport flow 1(v2,a2)           |
                          +--+ c +-------->           |
                          |  | e |        | Multiple  |-----SSRC-V2
                          |  | n |        |  Source   |
          +-----------+   |  | t |        | Endpoint2 |
          |           <---|  | e |        |           |
SSRC-V1---|           |      | r |        |           +-----SSRC-A2
          | Multiple  |      |   |        +-----------+
          |  Source   |      | s |
          | Endpoint1 |      | e |
SSRC-A1---|           <---|  | r |        +-----------+
          |           |   |  | v |        |           |
          +-----------+   |  | e |        |           |
                          |  | r |        | Multiple  |-----SSRC-V3
                          |  |               Source   |
                    Transport flow 2(v1,a1) Endpoint3 |
                    Transport flow 2(v3,a3)           |
                          ---|   +-------->           +-----SSRC-A3
                             |   |        +-----------+
                             |   |
                             |   |
                             +---+                               

In figure 2, multiple-source endpoint1 communicates with multiple-source endpoint2 and multiple-source endpoint3 in the same session. In order to reduce unicast transport flow to facilitate NAT/FW traversal, multiple-source endpoint 1 shares the same transport flow 1 with multiple-source endpoint2 to send video stream v1 and audio stream a1 and receive video stream v2 and audio stream v2 simultaneously. Similarly, multiple-source endpoint 1 shares the same transport flow 2 with multiple-source endpoint3 to send video stream v1 and audio stream a1 and receive video stream v3 and audio stream a3 simultaneously.

Multiple-source endpoint1 sends RTCP SR and RR packets for every active media stream it’s receiving (i.e.,v2,a2,v3,a3) from every local source (i.e.,v1,a1), which wastes a lot of bandwidth for redundant statistics reports. So does multiple-source endpoint2.

Also the bandwidth usage for each stream is limited by its media data rate and audio stream usually consumes lower bandwidth than video stream (e.g.,high quality audio with 80kbps and high quality video with 350 kbps). When audio stream and video stream share the same transport, i.e., sharing the common session bandwidth between audio and video, audio stream will still be sent at its own media data rate far below the common session bandwidth and therefore waste lots of bandwidth provided by the transport. Further when reception reports are used for audio stream and follows RFC3550 timing rule, RTCP SR and RR packets for active audio stream will be reported more frequent than when audio and video are carried in separate transports since more RTCP bandwidth are allocated to audio stream reporting when more session bandwidth are allocated to audio stream.

3.3. Communication between multi-source endpoint multiplexing each medium in separate RTP session and multiple-source endpoint multiplexing all mediums in one RTP session

       Multiple-Source              Multiple-Source
           Endpoint1                    Endpoint2
          +--------+                   +--------+
          |        | Various Transports|        |
          |        |                   |        |
          |        |Transport flow1(v1,a1)      |
SSRC-V1---|        |------------------>|        |--- SSRC-V2
          |        |                   |        |
          |        |Transport flow2(v2)|        |
          |        |<------------------|        |
          |        |                   |        |
SSRC-A1---|        |Transport flow3(a2)|        |
          |        |<------------------|        +----SSRC-A2
          |        |                   |        |
          |        |                   |        |
          |        |                   |        |
          +--------+                   +--------+                    

In the figure 3, multiple-source endpoint1 residing outside of the firewall communicates with multiple-source endpoint2 residing inside of the firewall. Unlike one to one communication described in section 3.1, multiple-source endpoint2 multiplexes each medium (i.e.,v2,a2) in separate transports (i.e.,transport flow2,transport flow3)to multiple-source endpoint1 while multiple-source endpoint1 multiplexes all mediums(ie.,v1,a1) in one unicast transport(i.e.,transport flow1)to multiple-source endpoint2.

4. Discuss

The RTCP bandwidth fraction is derived from the media data rate. If audio and video are sent as two separate RTP sessions, they would naturally have different RTCP bandwidth fractions, since the two media types have different rates.

However, if audio and video are multiplexed onto a single RTP session, a common session bandwidth would have to be chosen. This common bandwidth will likely be inappropriate for one of the media types, leading to the situation where some media flows use their allocation, while some flows send at a rate that is quite different to the session bandwidth. In the common case, the video sends at the session bandwidth, while the audio flows send at much lower rates. The RTCP bandwidth is derived from the session bandwidth, which means it's appropriate for the video, but is too high for the audio. In the worst case, the audio flows can have more RTCP than video flows, which is very wasteful. Fixing this is potentially difficult, since the session bandwidth concept is baked into all the RTCP timing rules.

5. Recommendations

Senders and receivers which are not multi-source endpoints are not affected by bandwidth issues associated with multi-source endpoint and should follow RFC3550 timing rules[RFC3550], no special accommodation is required.

Senders and receivers which are multi-source endpoints and sending or receiving multiple media types over different transport should be treated in the same way as single source endpoints dealing with multiple streams in the same media type.

Senders and receivers which are multi-source endpoints and sending or receiving multiple media types MAY multiplex/demultiplex each media type in separate RTP session or multiplex/demultiplex all media types in one RTP session. The multiplexing/demultiplexing mode to be employed in two directions between senders and receivers should be configurable. It is RECOMMENDED that

  1. As the default behavior, Senders and receivers use the media bundling mechanism [BUNDLE] in two directions between senders and receivers, i.e., multiplexing/demulplexing separate media types in one RTP session over the same lower layer transport.
  2. Configuration or local policy on the senders or receivers can override the default Mechanism specified in Option 1 above in one or two direction. Therefore senders and receivers MAY be configured with the different multiplexing mode.
  3. Dynamic multiplexing negotiation mechanism [BUNDLE] can be used to signal which multiplexing mechanism is used between senders and receivers and override the default mechanism specified in Option 1 and 2 above. The employed multiplexing negotiation mechanism is outside the scope of this document.

Multi-source endpoints multiplexing each media type in separate RTP session SHOULD be treated as multiple endpoints for each media type.

Editor's Note: It will be useful to discuss why we recommend to have a specific algorithm and see how it works with mixers and translators in the future version.

5.1. Choosing RTCP Bandwidth

Multi-source endpoint multiplexing multiple media types in the same RTP session SHOULD specify RTCP bandwidth using SDP, in a “b=RS” line or a “b=RR” line rather than choosing 5% of session bandwidth for RTCP bandwidth, especially when audio stream and video stream are sent over the same transport and the media data rate of audio stream is far less than video stream.

RTCP bandwidth for video stream MAY still follow RFC3550 recommendation, i.e., the fraction of the session bandwidth added for RTCP be fixed at 5%.

RTCP bandwidth for video = session bandwidth *5%

However given lower bandwidth usage of audio stream, RTCP bandwidth for audio stream is RECOMMENDED to be specified using SDP in a “b=RS” line or a “b=RR” line.

RTCP bandwidth for audio stream SHOULD be calculated based on the following formula:

RTCP bandwidth for audio = ((audio codec maximum bitrate*20%)+ audio codec maximum bitrate)*5%

Here a 20% RTP packet overhead is added to the data rate to calculate the required RTCP bandwidth.

5.2. Maintaining the number of session members and senders

As described in section 6.2.1 of RFC3550, Calculation of the RTCP packet interval mainly depends upon the number of members. For an endpoint with multiple sources multiplexing all media types in the same RTP session, it is more desirable to set the number of sender for such endpoint as one and the number of receiver as one. However according to RFC3550, each source within such endpoint is considered valid until multiple packets carrying the new SSRC have been received, or until an SDES RTCP packet containing a CNAME for that SSRC has been received.

To achieve this, the sender which is multi-source endpoint SHOULD not send Reception reports (e.g., SR/RR RTCP packets) from all local sources to each remote source in the receiving side, Similarly the Receiver which is multi-source endpoint SHOULD not receive reception reports(e.g., SR/RR RTCP packets from each remote source in the sending side to all local sources. Instead, one designated local source from the senders within multi-source endpoint should be chosen as reporting source for other local sources within the sender.

When multi-Source endpoints multiplexing each media type in separated RTP session join the session, they SHOULD treated as multiple single-source endpoint multiplexing for each media type and SHOULD be counted as multiple active senders. The number of active senders within multi-source endpoint is decided by the number of local source which is sending reception reports.

For the number of session members, it depends on the number of multiple-source endpoints E and the number of local source in each multiple-source endpoint S. It can be estimated or calculated according to the rules in RFC 3550 section 6.2.1, based on received valid SSRC in the SDES packet or RTP packets.

5.3. RTCP Reporting Interval calculation

The RTCP packet interval calculation SHOULD consider RTCP bandwidth estimation and signaling in section 5.1 and active sender estimation in section 5.2. The RTCP reporting interval for audio stream and video stream should be calculated according to the rules in RFC3550 section 6.2 respectively. When RTCP reception reports for audio stream and video stream are sent in different intervals, in order to decrease the number of RTCP packet to be sent, it is more desirable to transmit reception report for audio stream and reception report for video stream in the same compound RTCP packet and set RTCP reporting interval for audio stream as a multiple of RTCP reporting intervals for video stream.

For example, if the RTCP Reporting Interval for audio stream is more than one RTCP Reporting Interval for video but less than two RTCP Reporting Intervals for video, it is RECOMMENDED the RTCP Reporting Interval for audio stream be chosen as one RTCP Reporting Interval for video.

If the RTCP Reporting Interval for audio stream is more than two RTCP Reporting Intervals for video but less than three RTCP Reporting Intervals for video, it is RECOMMENDED the RTCP Reporting Interval for audio stream be chosen as two RTCP Reporting Intervals for video.

If the RTCP Reporting Interval for audio stream is more than three RTCP Reporting Intervals for video but less than four RTCP Reporting Intervals for video, it is RECOMMENDED the RTCP Reporting Interval for audio stream be chosen as three RTCP Reporting Intervals for video.

Editor's Note: The problem is not as simple as just varying the RTCP bandwidth fraction for the audio sources in an RTP session. Varying the reporting interval in this way for some participants can lead to timeouts. You'd also need to revise the timeout rules, reconsideration rules, etc.

5.4. RTCP reception report

Multiple-Source Endpoint SHOULD NOT send reception reports from one of its source about all the other local sources of its own. RTP application SHOULD provide a means to identify multiple-source endpoint as in fact being sources from the same RTP node.

Multiple-Source Endpoint SHOULD combine RTCP reception reports into a single compound RTCP packet without exceeding the maximum transmission unit (MTU) of the network path.

6. Security Considerations

TBC.

7. IANA Considerations

This document has no actions for IANA.

8. References

8.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", March 1997.
[RFC3550] Schulzrinne, H., "RTP: A Transport Protocol for Real-Time Applications", RFC 3550, July 2003.
[RFC3556] Casner, S., "Session Description Protocol (SDP) Bandwidth Modifiers for RTP Control Protocol (RTCP) Bandwidth", RFC 3556, July 2003.
[BUNDLE] Holmberg, C. and H. Alvestrand, "Extended RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/AVPF)", ID draft-ietf-mmusic-sdp-bundle-negotiation-01, August 2012.

8.2. Informative References

[I-D.ietf-avtcore-multi-media-rtp-session-00] Westerlund, M., "Multiple Media Types in an RTP Session", ID draft-ietf-avtcore-multi-media-rtp-session-00, October 2012.

Authors' Addresses

Qin Wu Huawei 101 Software Avenue, Yuhua District Nanjing, Jiangsu 210012 China EMail: sunseawq@huawei.com
Roni Even Huawei 14 David Hamelech Tel, Aviv 64953 Israel EMail: ron.even.tlv@gmail.com