STRAW Working Group                                         L. Miniero
Internet-Draft                                                Meetecho
Intended status: Standards Track                     S. Garcia Murillo
Expires: December 22, 2014                                     Medooze
                                                            V. Pascual
                                                                Quobis
                                                         June 20, 2014
Guidelines to support RTCP end-to-end in Back-to-Back User Agents (B2BUAs)
draft-ietf-straw-b2bua-rtcp-01
SIP Back-to-Back User Agents (B2BUAs) are often envisaged to also be on the media path, rather than just intercepting signalling. This means that B2BUAs often implement an RTP/RTCP stack as well, whether to act as media transcoders or simply to pass the media through, thus leading to separate media legs that the B2BUA correlates and bridges together. If not disciplined, though, this behaviour can severely impact the communication experience, especially when statistics and feedback information contained in RTCP packets get lost because of mismatches in the reported data.
This document defines the proper behaviour B2BUAs should follow when also acting on the signalling/media plane in order to preserve the end-to-end functionality of RTCP.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on December 22, 2014.
Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Session Initiation Protocol [RFC3261] Back-to-Back User Agents (B2BUAs) are SIP entities that can act as a logical combination of both a User Agent Server (UAS) and a User Agent Client (UAC). As such, their behaviour is not always completely adherent to the standards, and can lead to unexpected situations the IETF is trying to address. [RFC7092] presents a taxonomy of the most deployed B2BUA implementations, describing how they differ in terms of the functionality and features they provide.
Such components often do not only act on the signalling plane, that is intercepting and possibly modifying SIP messages, but also on the media plane. This means that, when on the signalling path between two or more parties willing to communicate, such components also manipulate the session description [RFC4566] in order to have all RTP and RTCP [RFC3550] pass through it as well within the context of an SDP offer/answer [RFC3264]. The reasons for such a behaviour can be different: the B2BUA may want, for instance, to provide transcoding functionality for peers with incompatible codecs, or it may need the traffic to be directly handled for different reasons like billing, lawful interception, session recording and so on. This can lead to several different topologies for RTP-based communication, as documented in [RFC5117]. These topologies are currently being updated to address new commonly encountered scenarios as well [I-D.ietf-avtcore-rtp-topologies-update].
Whatever the reason, such a behaviour does not come without a cost. In fact, whenever a media-aware component is placed on the path between two peers that want to communicate by means of RTP/RTCP, the end-to-end nature of such protocols is broken, and their effectiveness may be affected as a consequence. While this may not be a problem for RTP packets, which from a protocol point of view just contain opaque media packets and as such can be quite easily relayed, it can definitely cause serious issues for RTCP packets, which carry important information and feedback on the communication quality the peers are experiencing. In fact, RTCP packets make use of specific ways to address the media they are referring to. Consider, for instance, the simple scenario depicted in Figure 1, which involves only two parties and a single media flow:
+--------+              +---------+              +---------+
|        |=== SSRC1 ===>|         |=== SSRC3 ===>|         |
| Alice  |              |  B2BUA  |              |   Bob   |
|        |<=== SSRC2 ===|         |<=== SSRC4 ===|         |
+--------+              +---------+              +---------+
Figure 1: B2BUA modifying RTP headers
In this common scenario, a party (Alice) is communicating with a peer (Bob) as a result of a signalling session managed by a B2BUA: this B2BUA is also on the media path between the two, and is acting as a media relay. It is also, though, rewriting some of the RTP header information on the way, for instance because that's how its RTP relaying stack works: in this example, just the audio SSRC is changed, but more information may be changed as well (e.g., sequence numbers, timestamps, etc.). In particular, whenever Alice sends an audio RTP packet, she adds her SSRC (SSRC1) to the RTP header; the B2BUA rewrites the SSRC (SSRC3) before relaying the packet to Bob. At the same time, RTP packets sent by Bob (SSRC4) get their SSRC rewritten as well (SSRC2) before being relayed to Alice.
Assume now that Alice needs to inform Bob that she has lost several audio packets in the last few seconds, perhaps because of network congestion. To do so, she would place the peer audio SSRC she is aware of (SSRC2), together with her own (SSRC1), in RTCP Reports and/or NACKs, hoping for a retransmission or for Bob to slow down. Since the B2BUA is making use of different SSRCs for the RTP communication with the party and the peer, a blind relaying of the RTCP packets to Bob would in this case result, from his perspective, in unknown SSRCs being addressed, and hence in the precious information being dropped. In fact, Bob is only aware of SSRC4 (the one he's originating) and SSRC3 (the one he's receiving from the B2BUA), and knows nothing about SSRC1 and SSRC2 in the RTCP packets he receives. As a consequence of the feedback being dropped, Bob, unaware of the issue, may continue to flood Alice with even more media packets and/or not send Alice the packets she is missing, which may easily lead to a very bad communication experience, if not eventually to an unwanted termination of the communication itself.
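The mismatch can be illustrated with a minimal sketch. The SSRC values, the `bob_accepts` check and the `alice_to_bob` table are hypothetical names introduced only for illustration; they model Figure 1 and are not part of any specification.

```python
# Hypothetical SSRC values standing in for SSRC1..SSRC4 of Figure 1.
SSRC1, SSRC2, SSRC3, SSRC4 = 0x1111, 0x2222, 0x3333, 0x4444


def bob_accepts(sender_ssrc, source_ssrc):
    """Bob only knows SSRC3 (what he receives) and SSRC4 (what he sends)."""
    return sender_ssrc == SSRC3 and source_ssrc == SSRC4


# Alice's feedback as she emits it: she is SSRC1 and reports on SSRC2.
alice_report = (SSRC1, SSRC2)

# Blind relaying: Bob sees SSRCs he knows nothing about.
blind_relay_ok = bob_accepts(*alice_report)

# Relaying with the same SSRC translation the B2BUA applies to RTP.
alice_to_bob = {SSRC1: SSRC3, SSRC2: SSRC4}
translated = tuple(alice_to_bob[ssrc] for ssrc in alice_report)
translated_relay_ok = bob_accepts(*translated)
```

In this sketch only the translated report refers to streams Bob can recognise.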
This is just a trivial example that, together with additional scenarios, will be addressed in the following sections. Nevertheless, it is a valid example of how such a trivial mishandling of precious information may lead to serious consequences, especially considering that more complex scenarios may involve several parties at the same time and multiple media flows rather than a single one. Considering how common B2BUA deployments are, it is very important for them to properly address such feedback, in order to be sure that their activities on the media plane do not break anything they're not supposed to.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
As anticipated in the introductory section, it's very common for B2BUA deployments to also act on the media plane, rather than on signalling alone. In particular, [RFC7092] describes three different categories of such B2BUAs, according to the level of activities performed on the media plane: a B2BUA, in fact, may act as a simple media relay (1), effectively unaware of anything that is transported; it may be a media-aware relay (2), also inspecting and/or modifying RTP and RTCP packets as they flow by; or it may be a full-fledged media termination entity (3), terminating and generating RTP and RTCP packets as needed.
While [RFC3550] and [RFC5117] already mandate some specific behaviours when specific topologies are deployed, not all deployments strictly adhere to the specifications, and as such it's not rare to encounter issues that may be avoided with a more disciplined behaviour in that regard. For this reason, the following subsections will describe the proper behaviour B2BUAs, whichever of the above categories they fall into, should follow in order to avoid, or at least minimize, any impact on end-to-end RTCP effectiveness.
A media relay as identified in [RFC7092] basically just forwards, from an application level point of view, all RTP and RTCP packets it receives, without either inspecting or modifying them. Using the RTP Topologies terminology, this can be seen as an RTP Transport Translator. As such, B2BUAs acting as media relays are not aware of what traffic they're handling, meaning that not only the packet payloads are opaque to them, but headers as well. Many Session Border Controllers (SBC) implement this kind of behaviour, e.g., when acting as a bridge between an inner and outer network.
Considering all headers and identifiers in both RTP and RTCP are left untouched, issues like the SSRC mismatch described in the previous section would not occur. Similar problems could occur, though, should the session description end up providing incorrect information about the media flowing (e.g., if the SDP on either side contains 'ssrc' [RFC5576] attributes that don't match the actual SSRC being advertised on the media plane) or about the supported RTCP mechanisms (e.g., in case the B2BUA advertised support for NACK because it implements it, but the original INVITE didn't). Such an issue might occur, for instance, in case the B2BUA acting as a media relay is generating a new session description when bridging an incoming call, rather than taking into account the original session description in the first place. This may cause the peers to find a mismatch between the SSRCs advertised in SDP and the ones actually observed in RTP and RTCP packets (which may indeed change during a session anyway, but having them synced during setup would help nonetheless), or cause them to either ignore or generate RTCP feedback packets that were not explicitly advertised as supported.
In order to prevent such an issue, a media-relay B2BUA SHOULD forward all the SSRC- and RTCP-related SDP attributes when handling a session setup between interested parties: this includes attributes like 'ssrc' [RFC5576], 'rtcp-fb' [RFC4585], 'rtcp-xr' [RFC3611] and others. It SHOULD NOT, though, blindly forward all SDP attributes, as some of them (e.g., candidates, fingerprints, crypto, etc.) may lead to call failures for reasons that are out of scope for this document. One notable example is the 'rtcp' [RFC3605] attribute, which a UAC may use to explicitly state the port it is willing to use for RTCP: since the B2BUA relays RTCP packets, the port as seen by the other UAC involved in the communication would differ from the one negotiated originally, and as such it MUST be rewritten accordingly.
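A minimal sketch of this attribute handling is given below. The function name `relay_rtcp_attributes`, the `FORWARDED` tuple and the port value are assumptions made for illustration. The sketch only selects the SSRC- and RTCP-related attribute lines, uses the on-the-wire attribute name `rtcp-xr` for RTCP XR, and rewrites 'rtcp' as a bare port (the optional address part of the attribute is ignored here).

```python
# Attribute names passed through unchanged (an illustrative, non-exhaustive set).
FORWARDED = ("ssrc", "rtcp-fb", "rtcp-xr")


def relay_rtcp_attributes(sdp_lines, relay_rtcp_port):
    """Return the SSRC/RTCP-related attribute lines to place in the new SDP.

    Lines such as candidates, fingerprints or crypto are not returned here;
    a real B2BUA would handle them separately.
    """
    out = []
    for line in sdp_lines:
        if not line.startswith("a="):
            continue
        name = line[2:].split(":", 1)[0]
        if name == "rtcp":
            # The peer must send RTCP to the relay, not to the original port.
            out.append(f"a=rtcp:{relay_rtcp_port}")
        elif name in FORWARDED:
            out.append(line)
    return out


offer = [
    "a=ssrc:1234 cname:alice@example.com",
    "a=rtcp-fb:96 nack",
    "a=rtcp:5005",
    "a=fingerprint:sha-256 AA:BB",
]
relayed = relay_rtcp_attributes(offer, 40001)
```

Here `relayed` keeps the 'ssrc' and 'rtcp-fb' lines, carries the relay's own RTCP port, and omits the fingerprint.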
Besides, it is worth mentioning that, leaving RTCP packets untouched, a media relay may also let through information that, according to policies, may be best left hidden or masqueraded, e.g., domain names in CNAME items. Nevertheless, that information cannot break the end-to-end RTCP behaviour.
A Media-aware relay, unlike the Media Relay addressed in the previous section, is actually aware of the media traffic it is handling. As such, it is able to inspect RTP and RTCP packets flowing by, and may even be able to modify the headers in any of them before forwarding them. Using the RFC3550 terminology, this can be seen as an RTP Translator. A B2BUA implementing this role would typically not, though, inspect the RTP payloads as well, which would be opaque to it: this means that the actual media would not be manipulated (e.g., transcoded).
This makes them quite different from the Media Relay previously discussed, especially in terms of the potential issues that may occur at the RTCP level. In fact, being able to modify the RTP and RTCP headers, such B2BUAs may end up modifying RTP-related information like SSRC (and hence CSRC lists, which must of course be updated accordingly), sequence numbers, timestamps and the like before forwarding packets from one peer to another. This means that, if not properly disciplined, such a behaviour may easily lead to issues like the one described in the introductory section. As such, it is very important for a B2BUA modifying RTP-related information to also modify the same information in RTCP packets, and in a coherent way, so as not to confuse any of the peers involved in a communication.
It is worthwhile to point out that such a B2BUA would not necessarily forward all the packets it is receiving, though: Selective Forwarding Units (SFU) [I-D.ietf-avtcore-rtp-topologies-update], for instance, could aggregate or drop incoming RTCP messages, while at the same time originating new ones on their own. For the messages that are forwarded and/or aggregated, though, it's important to make sure the information is coherent.
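As an illustration of coherent rewriting, the sketch below translates the SSRCs carried in a single RTCP Receiver Report, using the RR layout of [RFC3550] (a 4-byte header, the 4-byte sender SSRC, then 24-byte report blocks that each start with the reported source SSRC). The function name and the `ssrc_map` table are hypothetical. Only SSRCs are handled; a B2BUA that also rewrites sequence numbers or timestamps would have to adjust the corresponding report fields too, which this sketch does not attempt.

```python
import struct

RTCP_RR = 201  # Receiver Report packet type


def rewrite_rr_ssrcs(packet: bytes, ssrc_map: dict) -> bytes:
    """Translate the sender SSRC and each report block's source SSRC."""
    if len(packet) < 8:
        raise ValueError("packet too short")
    first, pt, _length = struct.unpack_from("!BBH", packet, 0)
    if first >> 6 != 2 or pt != RTCP_RR:
        raise ValueError("not an RTCP Receiver Report")
    count = first & 0x1F  # number of report blocks
    if len(packet) < 8 + 24 * count:
        raise ValueError("truncated report blocks")
    out = bytearray(packet)
    # Offset 4 holds the sender SSRC; each block starts with a source SSRC.
    for off in [4] + [8 + 24 * i for i in range(count)]:
        (ssrc,) = struct.unpack_from("!I", out, off)
        struct.pack_into("!I", out, off, ssrc_map.get(ssrc, ssrc))
    return bytes(out)


# One report block: Alice (0x1111) reporting on the stream she sees (0x2222).
rr = struct.pack("!BBHI", 0x81, RTCP_RR, 7, 0x1111)
rr += struct.pack("!IIIIII", 0x2222, 0, 100, 0, 0, 0)
rr_for_bob = rewrite_rr_ssrcs(rr, {0x1111: 0x3333, 0x2222: 0x4444})
```

The translation table is the same one the B2BUA applies to the RTP headers in the Alice-to-Bob direction, so the report Bob receives refers to streams he knows.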
Besides the behaviour already mandated for RTCP translators in Section 7.2 of [RFC3550], a media-aware B2BUA MUST also handle incoming RTCP messages to forward following this guideline:
Apart from the generic guidelines related to Feedback messages, no additional modifications are needed for PLI, SLI and RPSI feedback messages.
Of course, the same considerations about the need for SDP and RTP/RTCP information to be coherent also apply to media-aware B2BUAs. This means that, if a B2BUA is going to change any SSRC, it SHOULD update the related 'ssrc' attributes if they were present in the original description before sending it to the recipient, just as it MUST rewrite the 'rtcp' attribute if provided. At the same time, the ability for a media-aware B2BUA to inspect/modify RTCP packets may also mean such a B2BUA may choose to drop RTCP packets it can't parse: in that case, a media-aware B2BUA SHOULD also advertise its RTCP level of support in the SDP in a coherent way, in order to prevent, for instance, a UAC from making use of NACK messages that would never reach the intended recipients.
A Media Terminator B2BUA, unlike simple relays and media-aware ones, is also able to terminate media itself, that is, taking care of RTP payloads as well and not only headers. This means that such components, for instance, can act as media transcoders and/or originate specific RTP media. Using the RTP Topologies terminology, this can be seen as an RTP Media Translator. Such a capability makes them quite different from the previously introduced B2BUA types, as this means they are going to terminate RTCP as well: in fact, since the media is terminated by the B2BUA itself, the related statistics and feedback functionality can be taken care of directly by the B2BUA, and does not need to be relayed to the logical peer in the multimedia communication.
For this reason, no specific guideline is needed to ensure a proper end-to-end RTCP behaviour in such scenarios, mostly because most of the time there would be no end-to-end RTCP interaction among the involved peers at all, as the B2BUA would terminate them all and take care of them accordingly. Nevertheless, should any RTCP packet actually need to be delivered to the actual peer, the same guidelines provided for the media-aware B2BUA case apply.
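Keeping the SDP coherent with rewritten SSRCs can be sketched as follows, assuming the `a=ssrc:<ssrc-id> <attribute>` form of [RFC5576] with a decimal identifier. The function name and the `ssrc_map` table are hypothetical and only serve the illustration.

```python
def rewrite_ssrc_attributes(sdp_lines, ssrc_map):
    """Replace the identifier in each 'a=ssrc:' line using ssrc_map."""
    prefix = "a=ssrc:"
    out = []
    for line in sdp_lines:
        if line.startswith(prefix):
            ssrc_text, sep, rest = line[len(prefix):].partition(" ")
            if ssrc_text.isdigit():
                old = int(ssrc_text)
                # Unknown SSRCs are left as they are.
                line = f"{prefix}{ssrc_map.get(old, old)}{sep}{rest}"
        out.append(line)
    return out


original = [
    "m=audio 5004 RTP/AVP 0",
    "a=ssrc:1111 cname:alice@example.com",
    "a=rtcp-fb:0 nack",
]
updated = rewrite_ssrc_attributes(original, {1111: 3333})
```

Only the 'ssrc' line changes; every other line of the description is passed through untouched by this function.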
The discussion in the previous sections on the management of RTCP messages by a B2BUA has so far mostly worked under the assumption that the B2BUA actually has access to the RTP/RTCP information itself. This is indeed true if we assume that plain RTP and RTCP are being handled, but this may not be true once any security is enforced on RTP packets and RTCP messages by means of SRTP [RFC3711], whether the keying is done using Security Descriptions [RFC4568] or DTLS-SRTP [RFC5764].
While typically not an issue in the Media Relay case, where RTP and RTCP packets are forwarded without any modification no matter whether security is involved or not, this could definitely have an impact on Media-aware Relays and Media Terminator B2BUAs. As a simple example, consider an SRTP/SRTCP session across a B2BUA where the B2BUA itself has no access to the keys used to secure the session: there would be no way to manipulate SRTP headers without invalidating the authentication tag on the packet; at the same time, there would be no way to rewrite the RTCP information accordingly either, as most of the packet (especially when RTCP compound packets are involved) would be encrypted.
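The effect on integrity protection can be shown with a deliberately simplified sketch. It assumes an HMAC-SHA1 tag truncated to 80 bits computed over the whole packet; real SRTP also covers the rollover counter and uses derived session keys, which are omitted here. The key, packet contents and helper names are hypothetical.

```python
import hashlib
import hmac
import struct


def auth_tag(key: bytes, packet: bytes) -> bytes:
    """Simplified integrity tag: HMAC-SHA1 truncated to 80 bits."""
    return hmac.new(key, packet, hashlib.sha1).digest()[:10]


def rewrite_ssrc(packet: bytes, new_ssrc: int) -> bytes:
    """Overwrite the SSRC field, which sits at byte offset 8 of the RTP header."""
    out = bytearray(packet)
    struct.pack_into("!I", out, 8, new_ssrc)
    return bytes(out)


key = b"hypothetical-session-auth-key"
# Minimal RTP header (version 2, seq 1, timestamp 160, SSRC 0x1111) plus payload.
packet = struct.pack("!BBHII", 0x80, 0, 1, 160, 0x1111) + b"opaque payload"
tag = auth_tag(key, packet)

relayed = rewrite_ssrc(packet, 0x3333)
# The receiver recomputes the tag over what it got and compares.
tag_still_valid = hmac.compare_digest(auth_tag(key, relayed), tag)
```

A B2BUA without the key cannot produce a fresh tag for `relayed`, so under these assumptions the receiver would reject the modified packet.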
For this reason, it is important to point out that the operations described in the previous sections are only possible if the B2BUA has a way to effectively manipulate the packets and messages flowing by. This means that, in case media security is involved, the B2BUA willing to act as either a Media-aware Relay or a Media Terminator must act as an intermediary with respect to the secure sessions. As such, different secure sessions need to be negotiated (either via SDES or DTLS-SRTP) with the involved parties, in order to be able to have access to the unencrypted packets and, if needed, modify them before encrypting them again and forwarding them. It is important to point out that this breaks any end-to-end security mechanism that may be in place, though, as all the involved parties would have a secure communication up to the B2BUA and would have to rely on the B2BUA actually encrypting the communication on the other end as well.
This document makes no request of IANA.
TBD. Not any additional consideration to what the standards already give? Probably this section will need a few words about how NOT following the guidelines can lead to security issues: e.g., not properly translating REMB messages can cause an increasing flow of media packets, that may be seen as attacks to devices that can't handle the amount of data.
Note to RFC Editor: Please remove this whole section.
The following are the major changes between the 00 and the 01 versions of the draft:
TBD.
[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.
[RFC4566]  Handley, M., Jacobson, V. and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.
[RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.
[RFC3550]  Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.
[RFC7092]  Kaplan, H. and V. Pascual, "A Taxonomy of Session Initiation Protocol (SIP) Back-to-Back User Agents", RFC 7092, December 2013.