Network Working Group E. Ivov
Internet-Draft Jitsi
Intended status: Informational P. Saint-Andre
Expires: November 04, 2013 Cisco Systems, Inc.
E. Marocco
Telecom Italia
May 03, 2013

CUSAX: Combined Use of the Session Initiation Protocol (SIP) and the Extensible Messaging and Presence Protocol (XMPP)
draft-ivov-xmpp-cusax-05

Abstract

This document describes suggested practices for combined use of the Session Initiation Protocol (SIP) and the Extensible Messaging and Presence Protocol (XMPP). Such practices aim to provide a single fully featured real-time communication service by using complementary subsets of features from each of the protocols. Typically such subsets would include telephony capabilities from SIP and instant messaging and presence capabilities from XMPP. This specification does not define any new protocols or syntax for either SIP or XMPP. However, implementing it may require modifying or at least reconfiguring existing client and server-side software. Also, it is not the purpose of this document to make recommendations as to whether or not such combined use should be preferred to the mechanisms provided natively by each protocol (for example, SIP's SIMPLE or XMPP's Jingle). It merely aims to provide guidance to those who are interested in such a combined use.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on November 04, 2013.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


Table of Contents

1. Introduction

Historically SIP [RFC3261] and XMPP [RFC6120] have often been implemented and deployed with different purposes: from its very start SIP's primary goal has been to provide a means of conducting "Internet telephone calls". XMPP on the other hand, has, from its Jabber days, been mostly used for instant messaging and presence [RFC6121], as well as related services such as groupchat rooms [XEP-0045].

For various reasons, these trends have continued through the years even after each of the protocols had been equipped to provide the features it was initially lacking:

Despite these advances, SIP remains the protocol of choice for telephony-like services, especially in enterprises where users are accustomed to features such as voice mail, call park, call queues, conference bridges and many others that are rarely (if at all) available in Jingle-based software. XMPP implementations, on the other hand, greatly outnumber and outperform those available for instant messaging and presence extensions developed in the SIMPLE WG, such as MSRP [RFC4975] and XCAP [RFC4825].

For these reasons, in a number of cases adopters have found themselves needing a set of features that are not offered by any single-protocol solution but that separately exist in SIP and XMPP products. The idea of seamlessly using both protocols together would hence often appeal to service providers. Most often, such a service would employ SIP exclusively for audio, video, and telephony services and rely on XMPP for anything else varying from chat, contact list management, and presence to whiteboarding and exchanging files. Because these services and clients involve the combined use of SIP and XMPP, we label them "CUSAX" for short.

+------------+      +-------------+
| SIP Server |      | XMPP Server |
+------------+      +-------------+
         \             /
media     \           /  instant messaging,
signaling  \         /   presence, etc.
            \       /
         +--------------+
         | CUSAX Client |
         +--------------+
        

Figure 1: Division of Responsibilities

This document explains how such hybrid offerings can be achieved with a minimum of modifications to existing software while providing an optimal user experience. It covers server discovery, determining a SIP AOR while using XMPP, and determining an XMPP Jabber Identifier ("JID") from incoming SIP requests. Most of the text here pertains to client behavior but it also recommends certain server-side configurations.

Note that this document is focused on coexistence of SIP and XMPP functionality in end-user-oriented clients. By intent it does not define methods for protocol-level mapping between SIP and XMPP, as might be used within a server-side gateway between a SIP network and an XMPP network (a separate series of documents has been produced that defines such mappings). More generally, this document does not describe service policies for inter-domain communication (often called "federation") between service providers (e.g., how a service provider that offers a combined SIP-XMPP service might communicate with a SIP-only or XMPP-only service), nor does it describe the reasons why a service provider might choose SIP or XMPP for various features.

This document concentrates on use cases where the SIP services and XMPP services are controlled by one and the same provider, since that assumption greatly simplifies both client implementation and server-side deployment (e.g., a single service provider can enforce common or coordinated policies across both the SIP and XMPP aspects of a CUSAX service, which is not possible if a SIP service is offered by one provider and an XMPP service is offered by another). Since this document is of an informational nature, it is not unreasonable for clients to apply some of the guidelines here even in cases where there is no established relationship between the SIP and the XMPP services (for example, it is reasonable for a client to provide a way for its users to easily start a call to a phone number recorded in a vCard). However, the exact set of rules to follow in such cases is left to application developers.

Finally, this document makes a further simplifying assumption by discussing only the use of a single client, not use of and coordination among multiple endpoints controlled by the same user (e.g., user agents running simultaneously on a laptop computer, tablet, and mobile phone).

2. Client Bootstrap

One of the main problems of using two distinct protocols when providing one service is the impact on usability. Email services, for example, have long been affected by the mixed use of SMTP for outgoing mail and POP3 or IMAP for incoming mail. Although standard service discovery methods (such as the proper DNS records) make it possible for a user agent to locate the right host(s) at which to connect, they do not provide the kind of detailed information that is needed to actually configure the user agent for use with the service. As a result, it is rather complicated for inexperienced users to configure a mail client and start using it with a new service, and Internet service providers often need to provide configuration instructions for various mail clients. Client developers and communication device manufacturers on the other hand often ship with a number of wizards that enable users to easily set up a new account for a number of popular email services. While this may improve the situation to some extent, the user experience is still clearly sub-optimal.

While it should be possible for CUSAX users to manually configure their separate SIP and XMPP accounts, dual-stack SIP/XMPP clients ought to provide means of online provisioning, typically by means of a web-based service at an HTTP URI. While the specifics of such mechanisms are outside the scope of this specification, they should make it possible for a service provider to remotely configure the clients based on minimal user input (e.g., only a user ID and password).

Because many of the features that a CUSAX client would privilege in one protocol would also be available in the other, clients should make it possible for such features to be disabled for a specific account. In particular, it is suggested that clients allow for audio and video calling features to be disabled for XMPP accounts, and that instant messaging and presence features should also be made optional for SIP accounts.

The main advantage of this approach is that clients would be able to continue to function properly and use the complete feature set of standalone SIP and XMPP accounts.

Once clients have been provisioned, they need to independently log into the SIP and XMPP accounts that make up the CUSAX "service" and then maintain both these connections as displayed in Figure 2.

+--------------+
| Provisioning |-----------+
|    Server    |           |
+--------------+           v
   |              +----------------+
   |              | vCard Storage/ |
   |              | User Directory |
   |              +----------------+
   |                /            \
   |      +------------+      +-------------+
   |      | SIP Server |      | XMPP Server |
   |      +------------+      +-------------+
   |               \             /
   |      media     \           /  instant messaging,
   |      signaling  \         /   presence, etc.
   |                  \       /
   |               +--------------+
   +---------------| CUSAX Client |
                   +--------------+
        

Figure 2: Example Deployment

In order to improve the user experience, when reporting connection status clients may also wish to present the XMPP connection as an "instant messaging" or a "chat" account. Similarly they could also depict the SIP connection as a "Voice and Video" or a "Telephony" connection. The exact naming is of course entirely up to implementers. The point is that, in cases where SIP and XMPP are components of a service offered by a single provider, such presentation could help users better understand why they are being shown two different connections for what they perceive as a single service. It could alleviate especially situations where one of these connections is disrupted while the other one is still active. Naturally, the developers of a CUSAX client or the providers of a CUSAX service might decide not to accept such situations and force a client to completely disconnect unless both aspects are successfully connected.

Clients may also choose to delay their XMPP connection until they have been successfully registered on SIP. This would help avoid the situation where a user appears online to its contacts but calling it would fail because their clients is still connecting to the SIP aspect of their CUSAX service.

3. Operation

Once a CUSAX client has been provisioned and authorized to connect to the corresponding SIP and XMPP services it would proceed by retrieving its XMPP roster.

The client should use XMPP for all forms of communication with the contacts from this roster, which will occur naturally because they were retrieved through XMPP. Audio/video features however, are disabled in the XMPP stack, so any form of communication based on these features (e.g. direct calls, conferences, desktop streaming, etc.) will happen over SIP. The rest of this section describes deployment, discovery, usability and linking semantics that allow CUSAX clients to fall back and seamlessly use SIP for these features.

3.1. Server-Side Setup

In order for CUSAX to function properly, XMPP service administrators should make sure that at least one of the vCard [RFC6350] "tel" fields for each contact is properly populated with a SIP URI or a phone number when an XMPP protocol for vCard storage is used (e.g., [XEP-0054] or [XEP-0292]). There are no limitations as to the form of that number. For example while it is desirable to maintain a certain consistency between SIP AORs and XMPP JIDs, that is by no means required. It is quite important however that the phone number or SIP AOR stored in the vCard be reachable through the SIP aspect of this CUSAX service.

Administrators may also choose to include the "video" tel type defined in [RFC6350] for accounts that would be capable of handling video communication.

To ensure that the foregoing approach is always respected, service providers might consider (1) preventing clients (and hence users) from modifying the vCard "tel" fields or (2) applying some form of validation before storing changes. Of course such validation would be feasible mostly in cases where a single provider controls both the XMPP and the SIP service since such providers would "know" (e.g., based on use of a common user database for both services) what SIP AOR corresponds to a given XMPP user (as indicated in Figure 2).

3.2. Client-Side Discovery and Usability

When rendering the roster for a particular XMPP account CUSAX clients should make sure that users are presented with a "Call" option for each roster entry that has a properly set "tel" field. This is the case even if calling features have been disabled for that particular XMPP account, as advised by this document. The usefulness of such a feature is not limited to CUSAX. After all, numbers are entered in vCards in order to be dialed and called. Hence, as long as an XMPP client has any means of conducting a call it may wish to make it possible for the user to easily dial any numbers that it learned through whatever means.

Clients that have separate triggers (buttons) for audio and video calls may choose to use the presence or absence of the "video" tel type defined in [RFC6350] and enable or disable the possibility for starting video calls accordingly.

In addition to discovering phone numbers from vCards, clients may also check for alternative communication methods as advertised in XMPP presence broadcasts and Personal Eventing Protocol nodes as described in XEP-0152: Reachability Addresses [XEP-0152]. However, these indications are merely hints, and a receiving client ought not associate a SIP address and an XMPP address unless it has some way to verify the association (e.g., the vCard of the XMPP account lists the SIP address and the vCard of the SIP account lists the XMPP address, or the association is made explicit in a record provided by a trusted directory). Alternatively or in cases where vCard or directory data is not available, a CUSAX client could take the user's own address book as the canonical source for contact addresses.

3.3. Indicating a Relation Between SIP and XMPP Accounts

In order to improve usability, in cases where clients are provisioned with only a single telephony-capable account they ought to initiate calls immediately upon user request without asking users to indicate an account that the call should go through. This way CUSAX users (whose only account with calling capabilities is usually the SIP part of their service) would have a better experience, since from the user's perspective calls "just work at the click of a button".

In some cases however, clients will be configured with more than the two XMPP and SIP accounts provisioned by the CUSAX provider. Users are likely to add additional stand-alone XMPP or SIP accounts (or accounts for other communications protocols), any of which might have both telephony and instant messaging capabilities. Such situations can introduce additional ambiguity since all of the telephony-capable accounts could be used for calling the numbers the client has learned from the vCards.

To avoid such confusion, client implementers and CUSAX service providers may choose to indicate the existence of a special relationship between the SIP and XMPP accounts of a CUSAX service. For example, let's say that Alice's service provider has opened both an XMPP account and a SIP account for her. During or after provisioning, her client could indicate that alice@xmpp.example.com has a CUSAX relation to alice@sip.example.com (i.e., that they are two aspects of the same service). This way whenever Alice triggers a call to a contact in her XMPP roster, the client would preferentially initiate this call through her example.com SIP account even if other possibilities exist (such as the XMPP account where the vCard was obtained or a SIP account with another provider).

If, on the other hand, no relationship has been configured or discovered between a SIP account and an XMPP account, and the client is aware of multiple telephony-capable accounts, it ought to present the user with the choice of reaching the contact through any of those accounts. This includes the source XMPP account where the vCard was obtained (in case its telephony capabilities are not disabled through configuration or provisioning), in order to guarantee proper operation for XMPP accounts that are not part of a CUSAX deployment.

3.4. Matching Incoming SIP Calls to XMPP JIDs

When receiving SIP calls, clients may wish to determine the identity of the caller and a corresponding XMPP roster entry so that users could revert to chatting or other forms of communication that require XMPP. To do so clients could search their roster for an entry whose vCard has a "tel" field matching the originator of the call.

Call-Info: <xmpp:alice@xmpp.example.com> ;purpose=impp
            

In addition, in order to avoid the effort of iterating over an entire roster and retrieving all vCards, CUSAX clients may use a SIP Call-Info header whose 'purpose' header field parameter has a value of "impp" as described in [I-D.saintandre-impp-call-info]. An example follows.

4. Multi-Party Interactions

CUSAX clients that support the SIP conferencing framework [RFC4353] can detect when a call they are participating in is actually a conference and can then subscribe for conference state updates as per [RFC4575]. A regular SIP user agent would also use the same conference URI for text communication with the Message Session Relay Protocol (MSRP). However, given that SIP's instant messaging capabilities would normally be disabled (or simply not supported) in CUSAX deployments, an XMPP Multi-User Chat (MUC) [XEP-0045] associated with the conference can be announced/discovered through <service-uris> bearing the "grouptextchat" purpose [I-D.ivov-grouptextchat-purpose]. Similarly, an XMPP MUC can advertise the SIP URI of an associated service for audio/video interactions using the 'audio-video-uri' field of the "muc#roominfo" data form [XEP-0004] to include extended information [XEP-0128] about the MUC room within XMPP service discovery [XEP-0030]; see [XEP-0045] for an example.

Once a CUSAX client joins the MUC associated with a particular call it should not rely on any synchronization between the two. Both the SIP conference and the XMPP MUC would function independently, each issuing and delivering its own state updates. It is hence possible that that certain peers would temporarily or permanently be reachable in only one of the two conferences. This would typically be the case with single-stack clients that have only joined the SIP call or the XMPP MUC. It is therefore important for CUSAX clients to provide a clear indication to users as to the level of participation of the various participants. In other words, a user needs to be able to easily understand whether a certain participant can receive text messages, audio/video, or both.

Of course, tighter integration between the XMPP MUC and the SIP conference is also possible. Permissions, roles, kicks and bans that are granted and performed in the MUC can easily be imitated by the conference focus/mixer into the SIP call. If for example, a certain MUC member is muted, the conference mixer can choose to also apply the mute on the media stream corresponding to that participant. The details and exact level of such integration is of course entirely up to implementers and service providers.

The approach above describes one relatively lightweight possibility of combining SIP and XMPP multi-party interaction semantics without requiring tight integration between the two. As with the rest of this specification, this approach is by no means normative. Implementation and future specifications may define other methods or provide other suggestions for improving the Unified Communications user experience in cases of multi-user chats in conference calling.

5. Federation

In theory there are no technical reasons why federation would require special behaviour from CUSAX clients. However, it is worth noting that differences in administration policies may sometimes lead to potentially confusing user experiences.

For example, let's say atlanta.example.com observes the CUSAX policies described in this specification. All XMPP users at atlanta.example.com are hence configured to have vCards that match their SIP identities. Alice is therefore used to making free, high-quality SIP calls to all the people in her roster. Alice can also make calls to the PSTN by simply dialing numbers. She may even be used to these calls being billed to her online account so she would careful about how long they last. This is not a problem for her since she can easily distinguish between a free SIP call (one that she made by calling one her roster entries) from a paid PSTN call that she dialed as a number.

Then Alice adds xmpp:bob@biloxi.example.com. The Biloxi domain only has an XMPP service. There is no SIP server and Bob uses a regular, XMPP-only client. Bob has however added his mobile number to his vCard in order to make it easily accessible to his contacts. Alice's client would pick up this number and make it possible for Alice to start a call to Bob's mobile phone number.

This could be a problem because, other than the fact that Bob's address is from a different domain, Alice would have no obvious and straightforward cues telling her that this is in fact a call to the PSTN. In addition to the potentially lower audio quality, Alice may also end up incurring unexpected charges for such calls.

In order to avoid such issues, providers maintaining a CUSAX service for the users in their domain may choose to provide additional cues (e.g., a user interface warning or an an audio tone or message) indicating that a call would incur charges.

A slightly less disturbing scenario, where a SIP service might only allow communication with intra-domain numbers, would simply prevent Alice from establishing a call with Bob's mobile. Providers should hence make sure that calls to extra-domain numbers are flagged with an appropriate audio or textual warning.

6. Security Considerations

Use of the same user agent with two different accounts providing complementary features introduces the possibility of mismatches between the security profiles of those accounts or features. For example, the SIP aspect and XMPP aspect of the CUSAX service might offer different authentication options (e.g., digest authentication for SIP as specified in [RFC3261] and SCRAM authentication [RFC5802] for XMPP as specified in [RFC6120]). Similarly, a CUSAX client might successfully negotiate Transport Layer Security (TLS) [RFC5246] when connecting to the XMPP aspect of the service but not when connecting to the SIP aspect. Such mismatches could introduce the possibility of downgrade attacks. User agent developers and service providers ought to ensure that such mismatches are avoided as much as possible.

Refer to the specifications for the relevant SIP and XMPP features for detailed security considerations applying to each "stack" in a CUSAX client.

7. IANA Considerations

This document has no actions for the IANA.

8. Informative References

[I-D.ivov-grouptextchat-purpose] Ivov, E., "A Group Text Chat Purpose for Conference and Service URIs in the Session Initiation Protocol (SIP) Event Package for Conference State", Internet-Draft draft-ivov-grouptextchat-purpose-00, April 2013.
[I-D.saintandre-impp-call-info] Saint-Andre, P., "Instant Messaging and Presence Purpose for the Call-Info Header in the Session Initiation Protocol (SIP)", Internet-Draft draft-saintandre-impp-call-info-02, April 2013.
[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.
[RFC4353] Rosenberg, J., "A Framework for Conferencing with the Session Initiation Protocol (SIP)", RFC 4353, February 2006.
[RFC4575] Rosenberg, J., Schulzrinne, H. and O. Levin, "A Session Initiation Protocol (SIP) Event Package for Conference State", RFC 4575, August 2006.
[RFC4825] Rosenberg, J., "The Extensible Markup Language (XML) Configuration Access Protocol (XCAP)", RFC 4825, May 2007.
[RFC4975] Campbell, B., Mahy, R. and C. Jennings, "The Message Session Relay Protocol (MSRP)", RFC 4975, September 2007.
[RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security (TLS) Protocol Version 1.2", RFC 5246, August 2008.
[RFC5802] Newman, C., Menon-Sen, A., Melnikov, A. and N. Williams, "Salted Challenge Response Authentication Mechanism (SCRAM) SASL and GSS-API Mechanisms", RFC 5802, July 2010.
[RFC6120] Saint-Andre, P., "Extensible Messaging and Presence Protocol (XMPP): Core", RFC 6120, March 2011.
[RFC6121] Saint-Andre, P., "Extensible Messaging and Presence Protocol (XMPP): Instant Messaging and Presence", RFC 6121, March 2011.
[RFC6350] Perreault, S., "vCard Format Specification", RFC 6350, August 2011.
[XEP-0004] Eatmon, R., Hildebrand, J., Miller, J., Muldowney, T. and P. Saint-Andre, "Data Forms", XSF XEP 0004, August 2007.
[XEP-0030] Hildebrand, J., Millard, P., Eatmon, R. and P. Saint-Andre, "Service Discovery", XSF XEP 0030, June 2008.
[XEP-0045] Saint-Andre, P., "Multi-User Chat", XSF XEP 0045, February 2012.
[XEP-0054] Saint-Andre, P., "vcard-temp", XSF XEP 0054, July 2008.
[XEP-0128] Saint-Andre, P., "Service Discovery Extensions", XSF XEP 0128, October 2004.
[XEP-0152] Hildebrand, J. and P. Saint-Andre, "XEP-0152: Reachability Addresses", XEP XEP-0152, February 2013.
[XEP-0292] Saint-Andre, P. and S. Mizzi, "vCard4 Over XMPP", XSF XEP 0292, October 2011.

Appendix A. Acknowledgements

This draft is inspired by the "SIXPAC" work of Markus Isomaki and Simo Veikkolainen. Markus also provided various suggestions for improving the document.

The authors would also like to thank the following persons for their reviews and suggestions: Sébastien Couture, Richard Brady, Olivier Crête, Aaron Evans, Kevin Gallagher, Adrian Georgescu, Saúl Ibarra Corretgé, David Laban, Murray Mar, Daniel Pocock, Travis Reitter, and Gonzalo Salgueiro.

Authors' Addresses

Emil Ivov Jitsi Strasbourg, 67000 France Phone: +33-672-811-555 EMail: emcho@jitsi.org
Peter Saint-Andre Cisco Systems, Inc. 1899 Wynkoop Street, Suite 600 Denver, CO 80202 USA Phone: +1-303-308-3282 EMail: psaintan@cisco.com
Enrico Marocco Telecom Italia Via G. Reiss Romoli, 274 Turin, 10148 Italy EMail: enrico.marocco@telecomitalia.it