Internet Engineering Task Force                         Yoshihiro Suzuki
Internet-Draft                                         D3 Communications
Intended Status: Informational                            Nobuo Ogashiwa
Expires: July 31, 2012                     Maebashi Kyoai Gakuen College
                                                       February 28, 2012
Real-Time Web Communication by using XMPP Jingle
draft-suzuki-web-jingle-00
Abstract
The XMPP Jingle specification defines an XMPP protocol extension for
managing peer-to-peer media sessions between two users. Jingle can
manage multiple contents simultaneously in one Jingle stream, but
the XMPP world has no common way to lay out a varying number of
contents on each user's display. This document proposes a solution
to this issue that uses the Web browser's layout function.
This document describes a new concept for realizing one possible
RTCWeb solution. The Web browser serves as the frontend of a Web
application for real-time communication, and the XMPP Jingle
specification (XEP-0166) serves as the signaling protocol. A simple
mapping between Jingle contents and HTML graphical elements unites
the Web browser's layout function with Jingle's media content
management function to realize RTCWeb functions.
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with
the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as "work in
progress."
Suzuki, et al. Expires July 31, 2012 [Page 1]
Internet-Draft RealTimeWeb Using XMPP Jingle February 2012
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on July 31, 2012.
Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document.
Table of Contents
1. Introduction
2. Architecture and Functional Model
3. Procedures of Real-Time Communication
  3.1 Procedures of Initialization and Negotiation
  3.2 Mapping between Contents in Jingle and HTML/DOM Elements
  3.3 Procedures of Jingle Stream Connection
  3.4 Procedures of Adding or Deleting Contents
  3.5 Procedures of Termination
4. Security Considerations
5. IANA Considerations
6. References
  6.1 Normative References
  6.2 Informative References
1. Introduction
XMPP can carry various kinds of signaling information between
users' clients, and that information is easy to express in XML
syntax. As a result, in addition to the core protocols published as
RFC 6120 (XMPP Core), RFC 6121 (XMPP IM), and related RFCs, the XSF
(XMPP Standards Foundation) now maintains more than 300 extension
specifications in the XEP series.
In the XMPP world, many extensions deal only with how to handle
various kinds of signaling information. In the IM (Instant
Messaging) service, which is the most typical XMPP service, text
messages are sent and received as in-band data, that is,
piggybacked on signaling messages much like SMS in cell phone
services.
With the Jingle specification (XEP-0166) and its related
specifications, clients can negotiate the specifications of a
stream that runs over an out-of-band media path by using a set of
Jingle signaling procedures. When the negotiation succeeds, the
clients set up a media path, the Jingle stream, directly between
the two of them, and can then send and receive any number of media
contents in that one stream without the help of an XMPP server.
As described above, the Jingle specifications define how to manage
a multi-content stream between two clients, but they do not define
a common way to lay out and render the various contents on each
display. Until now this has been an implementation matter in the
XMPP world. To lay out and render a changeable number of contents,
it must be defined how the contents are laid out and rendered
whenever the number of contents changes. This proposal therefore
introduces the Web browser as a flexible frontend for real-time
communication services. The proposal is intended as one possible
solution for realizing RTCWeb.
2. Architecture and functional model
In this document, the browser model is basically the same as the
one in the RTCWeb overview document [I-D.ietf-rtcweb-overview], as
shown in Figure 1. A few small changes simplify the use of XMPP:
the signaling path becomes XMPP signaling, and the RTC media path
becomes the XMPP Jingle media path. In addition, the "Browser RTC
function" is renamed the "Browser Jingle function" to make the use
of XMPP features explicit.
          +-----------+------------+  On-the-wire
          |           |            |  Protocols
          |    Web    |    XMPP    |--------->
          |   Server  |   Server   |  XMPP Federation
          |           |            |  Protocol
          +-----------+------------+  (Jingle Signaling
                      ^                Path)
                      |
                      |
                      | HTTP/XMPP
                      |
                      |
                      |
        +----------------------------+
        |    JavaScript/HTML/CSS     |
        +----------------------------+
     Other  ^                 ^RTC
     APIs   |                 |APIs
        +---|-----------------|------+
        |   |                 |      |
        |                 +---------+|
        |                 | Browser ||  On-the-wire
        | Browser         | Jingle  ||  Protocols
        |                 | Function|----------->
        |                 |         ||  Jingle
        |                 |         ||  Media Path
        |                 +---------+|
        +---------------------|------+
                              |
                              V
                     Native OS Services
Figure 1: Browser Model by Using XMPP
The overview of the system is also basically the same as in the
RTCWeb overview document [I-D.ietf-rtcweb-overview], as shown in
Figure 2. As in Figure 1, a few small changes make the use of XMPP
features explicit. In the XMPP specifications, the "federation
protocol" is defined for signaling between peering servers.
          +------+------+             +------+------+
          |      |      |             |      |      |
          | Web  | XMPP |  Signaling  | Web  | XMPP |
          |      |      |-------------|      |      |
          |Server|Server|    path     |Server|Server|
          |      |      |             |      |      |
          +------+------+             +------+------+
                /                             \
               /                               \
              /                                 \   HTTP/XMPP
             /                                   \
            /                                     \
           /   HTTP/XMPP                           \
          /                                         \
     +-----------+                           +-----------+
     |JS/HTML/CSS|                           |JS/HTML/CSS|
     +-----------+                           +-----------+
     +-----------+                           +-----------+
     |           |                           |           |
     |           |                           |           |
     |  Browser  | ------------------------- |  Browser  |
     |           |     Jingle media Path     |           |
     |           |                           |           |
     +-----------+                           +-----------+
Figure 2: Browser RTC Trapezoid by Using XMPP
3. Procedures of Real-Time Communication
The basic procedure is almost the same as in the Jingle
specification, but it differs in a few points in order to introduce
the Web browser as the frontend of the real-time communication
service. XMPP signaling is carried by presence, message, and IQ
stanzas. A "stanza" is a basic unit of XML in XMPP signaling. In
this proposal, some XML elements or attributes are added to the
Jingle signaling IQ stanzas in order to map the contents of a
Jingle stream to HTML/DOM elements. The following sections show how
to use the proposed protocol extensions. In the Jingle
specifications, the caller is called the "Jingle initiator" and the
callee is called the "Jingle responder". Figure 3 shows a
simplified session flow of XMPP Jingle, in which each single
horizontal arrow is one IQ stanza labeled with its Jingle action
name, and the doubled horizontal arrow is the out-of-band direct
media path between caller and callee. The ordinary IQ stanzas
(single horizontal arrows) are relayed by the XMPP servers along
the signaling path. (In the XMPP world, each client may connect to
its own server; the server-to-server signaling protocol is called
the federation protocol.)
   Caller (Initiator)             Callee (Responder)
           |                             |
           |      session-initiate       |
           |---------------------------->|
           |             ack             |
           |<----------------------------|
           |       session-accept        |
           |<----------------------------|
           |             ack             |
           |---------------------------->|
           |      [optional further      |
           |        negotiation]         |
           |<--------------------------->|
           |    Real-Time Call (RTP)     |
           |<===========================>|
           |      session-terminate      |
           |<----------------------------|
           |             ack             |
           |---------------------------->|
           |                             |
Figure 3: simplified session flow
3.1. Procedures of Initialization and Negotiation
In the RTCWeb model, the user obtains the initial HTML content from
a Web server as a Web application. This HTML content must contain
JavaScript statements that set up a signaling session between the
user's client and an XMPP server. When the HTML is loaded, the Web
browser invokes a JavaScript event handler (perhaps an "onLoaded"
event handler), and the JavaScript statements first set up an XMPP
signaling session.
Next, the user chooses a peer user with whom to make a real-time
communication call, probably by selecting a GUI element in some
kind of address book. When this event occurs, the JavaScript
statements start to manage the negotiation of the stream
specification. The caller (Jingle initiator) proposes some default
contents of a Jingle stream to the callee (Jingle responder) by
sending a Jingle "session-initiate" IQ stanza that carries
candidate content specifications. A content specification basically
has two XML elements: one holds media information and the other
holds transport information. The "description" XML element carries
the media type and candidate codec specifications, and the
"transport" XML element carries the transport type and candidate
sets of IP address and port. Caller and callee negotiate by
exchanging further IQ stanzas. When the negotiation finally
succeeds, the callee (Jingle responder) sends a "session-accept" IQ
stanza; at this point the caller's initial candidate specifications
have been filtered down to one content specification that is
acceptable to both caller and callee.
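The filtering of candidate codec specifications described above can
be sketched as follows. This is only a minimal illustration and is
not taken from the Jingle specifications: the function name
filter_payload_types and the rule of matching on codec name and
clock rate are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

RTP_NS = "urn:xmpp:jingle:apps:rtp:1"


def filter_payload_types(description_xml, supported):
    """Return the ids of offered payload types the responder supports.

    `supported` is a set of (name, clockrate) string pairs. Matching on
    these two fields only is an assumption of this sketch.
    """
    description = ET.fromstring(description_xml.strip())
    accepted = []
    for payload in description.findall(f"{{{RTP_NS}}}payload-type"):
        key = (payload.get("name"), payload.get("clockrate"))
        if key in supported:
            accepted.append(payload.get("id"))
    return accepted


# Offer as it might appear inside a session-initiate <content/>.
OFFER = """
<description xmlns='urn:xmpp:jingle:apps:rtp:1' media='video'>
  <payload-type id='98' name='theora' clockrate='90000'/>
  <payload-type id='28' name='nv' clockrate='90000'/>
  <payload-type id='32' name='MPV' clockrate='90000'/>
</description>
"""

# Hypothetical codec set supported by the callee.
CALLEE_CODECS = {("theora", "90000"), ("MPV", "90000")}

accepted_ids = filter_payload_types(OFFER, CALLEE_CODECS)
```

In a real client the callee would place the surviving payload types
in its session-accept stanza; that step is omitted here.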
In this proposal, the session-initiate and session-accept IQ
stanzas carry one more piece of information: the layout of the
contents of the Jingle stream in each client's Web browser window.
This layout information is used to map each content in a Jingle
stream to an HTML/DOM element, which is the key point of this
proposal. It allows any number of contents to be managed in a
Jingle stream and laid out in a Web browser. The following example
IQ stanza carries the layout information as a "<layout/>" element
alongside the usual Jingle "session-initiate" XML elements.
Example 1. Example of IQ-stanza from caller (session-initiate)
<iq from="alice@woderland.lit/rabbithole"
    id="v73hwcx9"
    to="sister@realworld.lit/home"
    type="set">
  <jingle xmlns="urn:xmpp:jingle:1"
          action="session-initiate"
          initiator="alice@woderland.lit/rabbithole">
    <content creator="initiator" name="Webcam">
      <description xmlns='urn:xmpp:jingle:apps:rtp:1' media='video'>
        <payload-type id='98' name='theora' clockrate='90000'>
          <parameter name='height' value='600'/>
          <parameter name='width' value='800'/>
          <parameter name='delivery-method' value='inline'/>
          <parameter name='configuration' value='somebase16string'/>
          <parameter name='sampling' value='YCbCr-4:2:2'/>
        </payload-type>
      </description>
      <transport>
        <candidate candidate="1"
                   generation="0"
                   id="a9j3mnbtu1"
                   ip="10.1.1.104"
                   port="13450"/>
      </transport>
    </content>
    <layout URL="http://www.woderland.lit/alice-videocall-00.html"/>
  </jingle>
</iq>
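As a rough illustration of how a client could assemble such a
stanza, the sketch below builds a session-initiate IQ that carries
the proposed <layout/> element. The helper name
build_session_initiate is hypothetical, the <description/> and
<transport/> children are omitted for brevity, and placing
<layout/> in the Jingle namespace is an assumption based on the
default namespace shown in Example 1.

```python
import xml.etree.ElementTree as ET

JINGLE_NS = "urn:xmpp:jingle:1"


def build_session_initiate(initiator, responder, iq_id, content_name,
                           layout_url):
    """Build a session-initiate IQ with the proposed <layout/> child."""
    iq = ET.Element("iq", {"from": initiator, "to": responder,
                           "id": iq_id, "type": "set"})
    jingle = ET.SubElement(iq, f"{{{JINGLE_NS}}}jingle",
                           {"action": "session-initiate",
                            "initiator": initiator})
    # A real offer would add <description/> and <transport/> here.
    ET.SubElement(jingle, f"{{{JINGLE_NS}}}content",
                  {"creator": "initiator", "name": content_name})
    # The layout URL points at the HTML page that lays out the contents.
    ET.SubElement(jingle, f"{{{JINGLE_NS}}}layout", {"URL": layout_url})
    return iq


stanza = build_session_initiate(
    "alice@woderland.lit/rabbithole",
    "sister@realworld.lit/home",
    "v73hwcx9",
    "Webcam",
    "http://www.woderland.lit/alice-videocall-00.html",
)
# ElementTree serializes the Jingle namespace with a generated prefix.
print(ET.tostring(stanza, encoding="unicode"))
```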
3.2. Mapping between contents in Jingle and HTML/DOM elements
As mentioned above, the layout element in the session-initiate IQ
stanza is central to this proposal. In Example 1, the layout
information has only one attribute, "URL". Simple HTML could be
written inline in XML syntax, but the HTML of typical Web
applications, together with its related CSS and JavaScript, does
not conform to XML syntax, so the layout is simply referenced by
URL. The Jingle function of the Web browser hands this URL over to
the core of the Web browser, which loads the page at that URL. The
page describes, in HTML syntax, the layout of the initial default
set of contents.
This proposal offers two ways to map a content in a Jingle stream
to an HTML/DOM element. The first is to use a common identifier: a
new "id" attribute on the "<content/>" element carries the same
value as the "id" of the HTML/DOM element. The "id" value must be
unique within one real-time session. This approach introduces a new
"id" attribute on the <content/> element in Jingle signaling. From
the viewpoint of a Web application or Ajax programmer, the same
object must be referred to by the same "id" value across HTML, CSS,
and JavaScript statements, and this approach extends that rule to
the contents of a Jingle stream. It is therefore a natural
extension for Ajax programmers, but it requires adding a new
attribute to the Jingle specification.
Example 2. Mapping by "id" value
--- Jingle stream specification ---
<jingle action="session-initiate">
  <content creator="initiator" name="Webcam" id="webcam-video">
                                             ^^^^^^^^^^^^^^^^^
    <description media='video'>
      <payload-type id='98' name='theora' clockrate='90000'/>
    </description>
    <transport>
      <candidate ip="10.1.1.104" port="13450"/>
    </transport>
  </content>
  <layout URL="http://www.woderland.lit/alice-videocall-00.html"/>
</jingle>
-----------------------------------
--- HTML layout specification ---
<html id="98765">
  <body>
    <div id="webcam-video"
         ^^^^^^^^^^^^^^^^^
         style="position: absolute;
                top: 10px; left: 800px; height: 768px; width: 1024px;
                z-index: 3">
      <video src="rtp://10.1.1.104:13450"/>
    </div>
  </body>
</html>
---------------------------------
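The id-based mapping of Example 2 can be sketched in a few lines.
This is an illustrative sketch under stated assumptions rather than
a defined algorithm: the function name map_contents_by_id is
hypothetical, and the layout page is assumed to be well-formed
XHTML so that a plain XML parser can read it.

```python
import xml.etree.ElementTree as ET

JINGLE = """
<jingle action="session-initiate">
  <content creator="initiator" name="Webcam" id="webcam-video"/>
</jingle>
"""

# Assumed to be well-formed XHTML; real pages may need an HTML parser.
LAYOUT = """
<html>
  <body>
    <div id="webcam-video"><video/></div>
    <div id="chat-log"/>
  </body>
</html>
"""


def map_contents_by_id(jingle_xml, layout_xml):
    """Map each Jingle content name to the layout element sharing its id."""
    jingle = ET.fromstring(jingle_xml.strip())
    layout = ET.fromstring(layout_xml.strip())
    # Index every layout element that carries an id attribute.
    elements_by_id = {el.get("id"): el
                      for el in layout.iter() if el.get("id")}
    mapping = {}
    for content in jingle.findall("content"):
        content_id = content.get("id")
        if content_id in elements_by_id:
            mapping[content.get("name")] = elements_by_id[content_id]
    return mapping


mapping = map_contents_by_id(JINGLE, LAYOUT)
```

Contents without a matching element are simply left out of the
mapping; how a client should react to that case is not addressed
here.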
The second way to map a content in a Jingle stream to an HTML/DOM
element is to use the IP address and port number pair given in the
"<transport/>" information of the "<content/>" description. In the
HTML/DOM element, the same IP address and port number appear in the
"src" attribute of the media tag. For video or image content, this
approach is natural for protocol designers, because mapping a
Jingle content to HTML/CSS/JavaScript elements requires no change
to existing specifications. However, a Web browser with Jingle
functions must maintain the mapping between the <transport/>
information of the Jingle stream specification and the
corresponding objects in the HTML/CSS/JavaScript statements.
Example 3. Mapping by a set of ip address and port number
--- Jingle stream specification ---
<jingle action="session-initiate">
  <content creator="initiator" name="Webcam">
    <description media='video'>
      <payload-type id='98' name='theora' clockrate='90000'/>
    </description>
    <transport>
      <candidate ip="10.1.1.104" port="13450"/>
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    </transport>
  </content>
  <layout URL="http://www.woderland.lit/alice-videocall-00.html"/>
</jingle>
-----------------------------------
--- HTML layout specification ---
<html id="98765">
  <body>
    <div id="webcam-video"
         style="position: absolute;
                top: 10px; left: 800px; height: 768px; width: 1024px;
                z-index: 3">
      <video src="rtp://10.1.1.104:13450"/>
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    </div>
  </body>
</html>
---------------------------------
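The transport-based mapping of Example 3 can be sketched in the
same spirit. The function name map_contents_by_transport is
hypothetical, the layout is again assumed to be well-formed XHTML,
and comparing the candidate's ip and port with the host and port of
the "src" URL is this sketch's reading of the approach described
above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

JINGLE = """
<jingle action="session-initiate">
  <content creator="initiator" name="Webcam">
    <transport>
      <candidate ip="10.1.1.104" port="13450"/>
    </transport>
  </content>
</jingle>
"""

LAYOUT = """
<html>
  <body>
    <div id="webcam-video">
      <video src="rtp://10.1.1.104:13450"/>
    </div>
  </body>
</html>
"""


def map_contents_by_transport(jingle_xml, layout_xml):
    """Map each content name to the element whose src matches a candidate."""
    jingle = ET.fromstring(jingle_xml.strip())
    layout = ET.fromstring(layout_xml.strip())
    # Index layout elements by the (host, port) pair of their src URL.
    by_endpoint = {}
    for element in layout.iter():
        src = element.get("src")
        if src:
            parts = urlsplit(src)
            by_endpoint[(parts.hostname, parts.port)] = element
    mapping = {}
    for content in jingle.findall("content"):
        for candidate in content.findall("transport/candidate"):
            key = (candidate.get("ip"), int(candidate.get("port")))
            if key in by_endpoint:
                mapping[content.get("name")] = by_endpoint[key]
                break
    return mapping


mapping = map_contents_by_transport(JINGLE, LAYOUT)
```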
In this proposal, each content in a Jingle stream is associated
with a corresponding HTML/CSS/JavaScript object, and the Jingle
stream as a whole is associated with the layout description given
by the complete set of HTML/CSS statements. The "<layout/>"
information must therefore be included whenever the stream
specification is described or changed in Jingle signaling messages.
When the set of contents or the Jingle stream specification
changes, the HTML/CSS statements must change as well to keep the
mapping information up to date. (The user's Web browser may need to
upload the new set of HTML/CSS/JavaScript to the Web server before
the new Jingle signaling message is sent to the peer's Web browser.
When the HTML/CSS statements are updated, the Web browser must load
the new page that contains the updated layout information.)
Without any Jingle signaling message, each Web browser can still
change the position or scale of contents on its own side, by using
Ajax techniques or by applying new CSS statements, to satisfy a
local user request.
3.3. Procedures of Jingle stream connection
In Section 3.1, the caller (Jingle initiator) and the callee
(Jingle responder) agreed on the specifications of the contents in
a Jingle stream and on the layout information of those contents.
The Jingle specification defines how to connect Jingle clients, but
not how to lay out contents on the screen. Therefore, the
JavaScript program that performed the negotiation and connection of
the Jingle stream then connects each content in the stream to its
HTML/DOM element by using the mapping information described above.
The HTML rendering engine starts rendering the HTML/DOM, and at the
same time the CSS mechanism can relocate and rescale HTML/DOM
elements. The procedures in the previous section and in this
section run automatically in response to the event that the user
triggers by pressing a GUI element in an address book interface. If
the callee is not online, lacks the ability to handle the contents
of the Jingle stream, or an error occurs during the procedures, the
caller's call fails. The error conditions of the Jingle protocol
are defined in detail in the Jingle specifications (XEP-0166 and
related documents). A JavaScript API for making real-time
communication calls that allows various signaling protocols to be
used may be defined by the W3C WebRTC WG.
3.4. Procedures of adding or deleting contents
The Jingle specifications define in detail how to add contents to
and delete contents from a Jingle stream, so the important point in
this proposal is to maintain the mapping between the contents in
the Jingle stream and the HTML/DOM elements in the HTML statements
or DOM tree. Figure 4 shows the overall session states of a Jingle
session.
          o
          |
          | session-initiate
          |
          |  +---------->--------------+
          |/                           |
  PENDING o-----------------------+    |
          |  | content-accept,    |    |
          |  | content-add,       |    |
          |  | content-modify,    |    |
          |  | content-reject,    |    |
          |  | content-remove,    |    |
          |  | description-info,  |    |
         \|/ | session-info,      |    |
          |  | transport-accept,  |    |
          |  | transport-info,    |    |
          |  | transport-reject,  |    |
          |  | transport-replace  |    |
          |  +--------------------+    |
          |                            |
          | session-accept            \|/
          |                            |
   ACTIVE o-----------------------+    |
          |  | content-accept,    |    |
          |  | content-add,       |    |
          |  | content-modify,    |    |
          |  | content-reject,    |    |
          |  | content-remove,    |    |
          |  | description-info,  |    |
         \|/ | session-info,      |    |
          |  | transport-accept,  |    |
          |  | transport-info,    |    |
          |  | transport-reject,  |    |
          |  | transport-replace  |    |
          |  +--------------------+    |
          |                            |
          +------------->--------------+
          |
          | session-terminate
          |
          o ENDED
Figure 4: overall session status
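The state transitions in Figure 4 can be written down as a small
transition function. This is a sketch of the figure as drawn, not a
normative state machine: the function name next_state and the label
"NEW" for the unlabeled starting point are assumptions introduced
here.

```python
# Actions that keep the session in its current state (PENDING or ACTIVE).
IN_SESSION_ACTIONS = frozenset({
    "content-accept", "content-add", "content-modify", "content-reject",
    "content-remove", "description-info", "session-info",
    "transport-accept", "transport-info", "transport-reject",
    "transport-replace",
})


def next_state(state, action):
    """Return the session state that follows `action` in `state`.

    "NEW" is a hypothetical label for the unlabeled start of Figure 4.
    """
    if state == "NEW" and action == "session-initiate":
        return "PENDING"
    if state == "PENDING" and action == "session-accept":
        return "ACTIVE"
    if state in ("PENDING", "ACTIVE"):
        if action == "session-terminate":
            return "ENDED"
        if action in IN_SESSION_ACTIONS:
            return state
    raise ValueError(f"action {action!r} is not allowed in state {state!r}")


# Walk through one complete call, including a content-add while ACTIVE.
state = "NEW"
for action in ("session-initiate", "transport-info", "session-accept",
               "content-add", "session-terminate"):
    state = next_state(state, action)
```

A client following this proposal would refresh its content-to-DOM
mapping whenever one of the content-related actions is processed.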
The following example shows how to add one content to a Jingle
stream. When the user requests additional contents in a Jingle
stream, the corresponding content specifications must be added to
satisfy the request. Whenever the set of contents or the Jingle
stream specification changes, the mapping information or the
<layout/> information must be updated.
Example 4. Example of IQ-stanza (content-add)
<iq from="alice@woderland.lit/rabbithole"
    id='j6s4198'
    to="sister@realworld.lit/home"
    type="set">
  <jingle xmlns='urn:xmpp:jingle:1'
          action='content-add'
          initiator="alice@woderland.lit/rabbithole"
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='webcam2'>
      <description xmlns='urn:xmpp:jingle:apps:rtp:1' media='video'>
        <payload-type id='98' name='theora' clockrate='90000'>
          <parameter name='height' value='600'/>
          <parameter name='width' value='800'/>
          <parameter name='delivery-method' value='inline'/>
          <parameter name='configuration' value='somebase16string'/>
          <parameter name='sampling' value='YCbCr-4:2:2'/>
        </payload-type>
        <payload-type id='28' name='nv' clockrate='90000'/>
        <payload-type id='25' name='CelB' clockrate='90000'/>
        <payload-type id='32' name='MPV' clockrate='90000'/>
        <bandwidth type='AS'>128</bandwidth>
      </description>
      <transport xmlns='urn:xmpp:jingle:transports:udp:0'/>
    </content>
    <layout URL="http://www.woderland.lit/alice-videocall-01.html"/>
  </jingle>
</iq>
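On the receiving side, a client has to record the new content and
switch to the new layout page. The sketch below shows one possible
way to do this. The function name apply_content_add and the session
record with "contents" and "layout_url" keys are hypothetical and
exist only for this illustration.

```python
import xml.etree.ElementTree as ET

NS = {"j": "urn:xmpp:jingle:1"}

# Shortened content-add stanza in the style of Example 4.
CONTENT_ADD = """
<iq from="alice@woderland.lit/rabbithole" id="j6s4198"
    to="sister@realworld.lit/home" type="set">
  <jingle xmlns="urn:xmpp:jingle:1" action="content-add"
          initiator="alice@woderland.lit/rabbithole"
          sid="a73sjjvkla37jfea">
    <content creator="initiator" name="webcam2"/>
    <layout URL="http://www.woderland.lit/alice-videocall-01.html"/>
  </jingle>
</iq>
"""


def apply_content_add(session, iq_xml):
    """Record added contents and the updated layout URL in `session`."""
    iq = ET.fromstring(iq_xml.strip())
    jingle = iq.find("j:jingle", NS)
    if jingle is None or jingle.get("action") != "content-add":
        raise ValueError("not a content-add stanza")
    for content in jingle.findall("j:content", NS):
        session["contents"].add(content.get("name"))
    layout = jingle.find("j:layout", NS)
    if layout is not None:
        # The browser would load this page to obtain the new layout.
        session["layout_url"] = layout.get("URL")
    return session


# Hypothetical session record as it stood after Example 1.
session = {
    "contents": {"Webcam"},
    "layout_url": "http://www.woderland.lit/alice-videocall-00.html",
}
apply_content_add(session, CONTENT_ADD)
```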
3.5. Procedures of Termination
The Web browser on either side can terminate the real-time
communication call by using the usual Jingle signaling, with one
difference: HTML/CSS/JavaScript statements must be provided to
update the Web browser window so that it shows that the call has
been terminated. Example 5 shows a session-terminate signaling
message.
Example 5. Example of IQ-stanza (session-terminate)
<iq from="alice@woderland.lit/rabbithole"
    id='fl2v387j'
    to="sister@realworld.lit/home"
    type="set">
  <jingle xmlns='urn:xmpp:jingle:1'
          action='session-terminate'
          initiator='alice@woderland.lit/rabbithole'
          sid='a73sjjvkla37jfea'>
    <reason>
      <success/>
      <text>I'm outta here!</text>
    </reason>
    <layout URL="http://www.woderland.lit/alice-videocall-2.html"/>
  </jingle>
</iq>
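A receiving client could extract the termination reason and the
final layout page along the following lines. This is a sketch only:
the function name parse_session_terminate and the shape of the
returned dictionary are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

NS = {"j": "urn:xmpp:jingle:1"}

TERMINATE = """
<iq from="alice@woderland.lit/rabbithole" id="fl2v387j"
    to="sister@realworld.lit/home" type="set">
  <jingle xmlns="urn:xmpp:jingle:1" action="session-terminate"
          sid="a73sjjvkla37jfea">
    <reason>
      <success/>
      <text>I'm outta here!</text>
    </reason>
    <layout URL="http://www.woderland.lit/alice-videocall-2.html"/>
  </jingle>
</iq>
"""


def parse_session_terminate(iq_xml):
    """Return the sid, reason, and closing layout URL of a terminate IQ."""
    iq = ET.fromstring(iq_xml.strip())
    jingle = iq.find("j:jingle", NS)
    if jingle is None or jingle.get("action") != "session-terminate":
        raise ValueError("not a session-terminate stanza")
    condition = None
    text = None
    reason = jingle.find("j:reason", NS)
    if reason is not None:
        for child in reason:
            local_name = child.tag.split("}")[-1]  # drop the namespace
            if local_name == "text":
                text = child.text
            else:
                condition = local_name
    layout = jingle.find("j:layout", NS)
    return {
        "sid": jingle.get("sid"),
        "condition": condition,
        "text": text,
        # Page the browser loads to show that the call has ended.
        "layout_url": layout.get("URL") if layout is not None else None,
    }


result = parse_session_terminate(TERMINATE)
```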
4. Security Considerations
This document has no requirement for a change to the security
models within associated protocols.
5. IANA Considerations
T.B.D.
6. References
6.1. Normative References
[RFC6120]  P. Saint-Andre, "Extensible Messaging and Presence
           Protocol (XMPP): Core", RFC 6120, March 2011.
[RFC6121]  P. Saint-Andre, "Extensible Messaging and Presence
           Protocol (XMPP): Instant Messaging and Presence",
           RFC 6121, March 2011.
[XEP-0166] Scott Ludwig et al., "Jingle", December 2011.
           Available at
           http://xmpp.org/extensions/xep-0166.html
[XEP-0167] Scott Ludwig et al., "Jingle RTP Sessions", December
           2009.
           Available at
           http://xmpp.org/extensions/xep-0167.html
[I-D.ietf-rtcweb-overview]
           H. Alvestrand, "Overview: Real Time Protocols for
           Brower-based Applications",
           draft-ietf-rtcweb-overview-02 (work in progress),
           September 2011.
[I-D.ietf-rtcweb-use-cases-and-requirements]
           C. Holmberg et al., "Web Real-Time Communication
           Use-cases and Requirements",
           draft-ietf-rtcweb-use-cases-and-requirements-05 (work in
           progress), September 2011.
[webrtc-api]
           Adam Bergkvist et al., "WebRTC 1.0: Real-time
           Communication Between Browsers", February 2012.
           Available at
           http://www.w3.org/TR/webrtc/
6.2. Informative references
Authors' Addresses
Yoshihiro Suzuki
D3 Communications
1-18-31 Tamagawa-Gakuen
Machida City, Tokyo
Japan 194-0041
Email: hiro.suzuki@d3communications.jp
Nobuo Ogashiwa
Maebashi Kyoai Gakuen College
1154-4 Koyahara-machi
Maebashi City, Gunma Pref
JAPAN 379-2192
EMail: ogashiwa@c.kyoai.ac.jp
URI: http://www.kyoai.ac.jp/