Internet Engineering Task Force                         Yoshihiro Suzuki
Internet-Draft                                         D3 Communications
Intended Status: Informational                            Nobuo Ogashiwa
Expires: July 31, 2012                     Maebashi Kyoai Gakuen College
                                                       February 28, 2012
Real-Time Web Communication by using XMPP Jingle
draft-suzuki-rtcweb-jingle-web-00
Abstract
The XMPP Jingle specification defines an XMPP protocol extension for
managing peer-to-peer media sessions between two users. Jingle can
manage multiple contents simultaneously in one Jingle stream, but
the XMPP world has no common way to lay out a variable number of
contents on each display. This document addresses that issue by
using the Web browser's layout function.
This document describes a new concept for realizing one possible
RTCWeb solution. A Web browser serves as the frontend of the Web
application for real-time communication, and the XMPP Jingle
specification (XEP-0166) serves as the signaling protocol. A simple
mapping between Jingle contents and HTML graphical elements unites
the Web browser's layout function with Jingle's media content
management function to realize RTCWeb functions.
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with
the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as "work in
progress."
Suzuki, et al. Expires July 31, 2012 [Page 1]
Internet-Draft RealTimeWeb Using XMPP Jingle February 2012
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on July 31, 2012.
Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document.
Table of Contents
1. Introduction
2. Architecture and Functional Model
3. Procedures of Real-Time Communication
   3.1 Procedures of Initialization and Negotiation
   3.2 Mapping between Contents in Jingle and HTML/DOM Elements
   3.3 Procedures of Jingle Stream Connection
   3.4 Procedures of Adding or Deleting Contents
   3.5 Procedures of Termination
4. Security Considerations
5. IANA Considerations
6. References
   6.1 Normative References
   6.2 Informative References
1. Introduction
XMPP can signal various kinds of information between users' clients,
and signaling information is easily expressed in XML syntax. As a
result, XMPP now has over 300 extension specifications, published as
the XEP series by the XSF (XMPP Standards Foundation), not counting
the core protocols defined in RFCs such as RFC 6120 (XMPP Core) and
RFC 6121 (XMPP IM).
In the XMPP world, many extensions deal only with how to handle
signaling information. In IM (Instant Messaging), a typical XMPP
service, text messages are sent and received as in-band data,
piggybacked on signaling messages much like SMS in cell phone
services.
In the Jingle specification (XEP-0166) and related specifications,
clients can negotiate the specifications of a stream carried over an
out-of-band media path by using a set of Jingle signaling
procedures. When the negotiation succeeds, the clients set up a
media path, called a Jingle stream, directly between the two
clients. They can then send and receive any number of media contents
in that one Jingle stream without the help of an XMPP server.
As described above, the Jingle specifications define how to manage
a multi-content stream between two clients, but they do not describe
a common way to lay out and render the various contents on each
display. Until now this has been an implementation matter in the
XMPP world. To lay out and render a changing number of contents, it
must be defined how the contents are laid out and rendered whenever
the number of contents changes. This proposal therefore introduces
the Web browser as a flexible frontend for real-time communication
services, and is intended as one possible solution for realizing
RTCWeb.
2. Architecture and Functional Model
In this document, the browser model is basically the same as in the
RTCWeb overview document [I-D.ietf-rtcweb-overview], as shown in
Figure 1, with small changes to simplify the use of XMPP: the
signaling path becomes XMPP signaling, and the RTC media path
becomes the XMPP Jingle media path. "Browser RTC function" is
renamed "Browser Jingle function" to make the use of XMPP features
explicit.
           +-----------+------------+  On-the-wire
           |           |            |  Protocols
           |    Web    |    XMPP    |--------->
           |   Server  |   Server   |  XMPP Federation
           |           |            |  Protocol
           +-----------+------------+  (Jingle Signaling
                       ^                Path)
                       |
                       |
                       | HTTP/XMPP
                       |
                       |
                       |
         +----------------------------+
         |    JavaScript/HTML/CSS     |
         +----------------------------+
      Other  ^                 ^RTC
      APIs   |                 |APIs
         +---|-----------------|------+
         |   |                 |      |
         |                 +---------+|
         |                 | Browser ||  On-the-wire
         | Browser         | Jingle  ||  Protocols
         |                 | Function|----------->
         |                 |         ||  Jingle
         |                 |         ||  Media Path
         |                 +---------+|
         +---------------------|------+
                               |
                               V
                      Native OS Services
            Figure 1: Browser Model by Using XMPP
The overview of the system is likewise basically the same as in the
RTCWeb overview document [I-D.ietf-rtcweb-overview], as shown in
Figure 2. As in Figure 1, there are small changes to make the use of
XMPP features explicit. The XMPP specifications define a "federation
protocol" for signaling between peering servers.
          +------+------+             +------+------+
          |      |      |             |      |      |
          | Web  | XMPP |  Signaling  | Web  | XMPP |
          |      |      |-------------|      |      |
          |Server|Server|    path     |Server|Server|
          |      |      |             |      |      |
          +------+------+             +------+------+
                /                           \
               /                             \
              /                               \  HTTP/XMPP
             /                                 \
            /                                   \
           /  HTTP/XMPP                          \
          /                                       \
    +-----------+                           +-----------+
    |JS/HTML/CSS|                           |JS/HTML/CSS|
    +-----------+                           +-----------+
    +-----------+                           +-----------+
    |           |                           |           |
    |           |                           |           |
    |  Browser  | ------------------------- |  Browser  |
    |           |     Jingle media Path     |           |
    |           |                           |           |
    +-----------+                           +-----------+
          Figure 2: Browser RTC Trapezoid by Using XMPP
3. Procedures of Real-Time Communication
The basic procedure is almost the same as in the Jingle
specification, with some differences that introduce the Web browser
as the frontend of the real-time communication service. XMPP
signaling is carried by presence, message, and IQ stanzas. A
"stanza" is the basic unit of XML statements in XMPP signaling. In
this proposal, to map the contents of a Jingle stream to HTML/DOM
elements, we add some XML elements or attributes to the Jingle
signaling IQ stanzas. The following sections show how to use the
proposed protocol extensions. In the Jingle specifications, the
caller is called the "Jingle Initiator" and the callee the "Jingle
Responder". Figure 3 shows a simplified session flow of XMPP Jingle.
Each single horizontal arrow is one IQ stanza labeled with its
Jingle action name, and the double horizontal arrow is the
out-of-band direct media path between caller and callee. The IQ
stanzas (single horizontal arrows) are relayed by the XMPP servers
along the signaling path. (In the XMPP world, each client may
connect to its own server; the server-to-server signaling protocol
is called the federation protocol.)
   Caller (Initiator)            Callee (Responder)
            |                             |
            |      session-initiate       |
            |---------------------------->|
            |             ack             |
            |<----------------------------|
            |       session-accept        |
            |<----------------------------|
            |             ack             |
            |---------------------------->|
            |      [optional further      |
            |       negotiation]          |
            |<--------------------------->|
            |    Real-Time Call (RTP)     |
            |<===========================>|
            |      session-terminate      |
            |<----------------------------|
            |             ack             |
            |---------------------------->|
            |                             |
          Figure 3: Simplified Session Flow
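The flow in Figure 3 can be read as a small state machine. The
following Python sketch is illustrative only and is not part of the
proposal. It assumes the pending/active/ended session states of
XEP-0166; the "idle" label, the class name and the grouping of
actions under NEGOTIATION_ACTIONS are hypothetical names introduced
here.

```python
# Jingle actions that may occur as "optional further negotiation".
# This is a subset of the XEP-0166 action names, chosen for illustration.
NEGOTIATION_ACTIONS = {
    "content-add", "content-remove", "content-accept", "content-reject",
    "transport-info", "session-info",
}


class JingleSession:
    """Tracks session state as Jingle actions arrive (cf. Figure 3)."""

    def __init__(self):
        # "idle" is a hypothetical label meaning no session exists yet.
        self.state = "idle"

    def handle(self, action):
        """Apply one Jingle action and return the resulting state."""
        live = self.state in ("pending", "active")
        if self.state == "idle" and action == "session-initiate":
            self.state = "pending"
        elif self.state == "pending" and action == "session-accept":
            self.state = "active"
        elif live and action == "session-terminate":
            self.state = "ended"
        elif live and action in NEGOTIATION_ACTIONS:
            pass  # further negotiation leaves the session state unchanged
        else:
            raise ValueError(
                f"action {action!r} not allowed in state {self.state!r}")
        return self.state
```

Feeding the actions of Figure 3 in order moves a session from
"idle" through "pending" and "active" to "ended"; any action that
arrives in a state where it is not expected raises an error instead
of being silently ignored.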
3.1. Procedures of Initialization and Negotiation
In the RTCWeb model, the user obtains the initial HTML content from
a Web server as a Web application. This HTML content must contain
JavaScript statements that set up a signaling session between the
user's client and an XMPP server. Once the HTML is loaded, the Web
browser invokes the corresponding JavaScript event handler (for
example, an "onLoaded" event handler), and the JavaScript statements
first set up an XMPP signaling session.
As the next step, the user chooses a peer to call. This would
typically be triggered by the user selecting a GUI element in some
kind of address book. When this event occurs, the JavaScript
statements begin negotiating the stream specification. The caller
(Jingle initiator) proposes some default contents for the Jingle
stream to the callee (Jingle responder) in a Jingle
"session-initiate" IQ stanza carrying candidate content
specifications. A content specification basically has two XML
elements: one carries media information and the other carries
transport information. The "description" XML element carries the
media type and candidate codec specifications, and the "transport"
XML element carries the transport type and candidate IP address and
port pairs. The caller and callee negotiate by exchanging further
IQ stanzas. When the negotiation finally succeeds, the callee
(Jingle responder) sends a "session-accept" IQ stanza. At that
point the caller's initial candidate specifications have been
narrowed to one content specification acceptable to both caller and
callee.
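The structure of a content specification described above can be
sketched in Python as follows. This is a minimal illustration, not
the stanza format of this proposal. It assumes the namespaces of
XEP-0166, XEP-0167 (RTP) and XEP-0176 (ICE-UDP); the function name,
the JIDs, the "id" value and the dictionary keys are hypothetical,
and the candidate elements are reduced to IP address and port only.

```python
import xml.etree.ElementTree as ET

JINGLE_NS = "urn:xmpp:jingle:1"
RTP_NS = "urn:xmpp:jingle:apps:rtp:1"
ICE_NS = "urn:xmpp:jingle:transports:ice-udp:1"


def build_session_initiate(initiator, responder, sid, contents):
    """Build a simplified session-initiate IQ stanza.

    Each item of `contents` is a dict (hypothetical shape) with keys
    "name", "media", "codecs" [(payload id, codec name)] and
    "candidates" [(ip, port)].
    """
    iq = ET.Element("iq", {"from": initiator, "to": responder,
                           "type": "set", "id": "init-1"})
    jingle = ET.SubElement(iq, f"{{{JINGLE_NS}}}jingle", {
        "action": "session-initiate", "initiator": initiator, "sid": sid})
    for spec in contents:
        content = ET.SubElement(
            jingle, f"{{{JINGLE_NS}}}content",
            {"creator": "initiator", "name": spec["name"]})
        # Media information: media type plus candidate codecs.
        desc = ET.SubElement(content, f"{{{RTP_NS}}}description",
                             {"media": spec["media"]})
        for pt_id, codec in spec["codecs"]:
            ET.SubElement(desc, f"{{{RTP_NS}}}payload-type",
                          {"id": str(pt_id), "name": codec})
        # Transport information: candidate IP address and port pairs.
        transport = ET.SubElement(content, f"{{{ICE_NS}}}transport")
        for ip, port in spec["candidates"]:
            ET.SubElement(transport, f"{{{ICE_NS}}}candidate",
                          {"ip": ip, "port": str(port)})
    return iq
```

A caller proposing one audio content would pass a single dict, for
example with name "voice", media "audio", one codec and one
candidate, and could serialize the result with ET.tostring(). Real
ICE-UDP candidates carry further attributes that are omitted here.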
In this proposal, the session-initiate and session-accept IQ stanzas
should carry one more piece of information: the layout of the
contents of the Jingle stream in each client's Web browser window.
This layout information is used to map each content in a Jingle
stream to an HTML/DOM element. This mapping is the key point of this
proposal: it allows any number of contents in a Jingle stream to be
managed and laid out in a Web browser. The following example IQ
stanza carries layout information together with the usual Jingle
"session-initiate" XML elements as "