CLUE Working Group                                             R. Presta
Internet-Draft                                              S. P. Romano
Intended status: Informational                      University of Napoli
Expires: September 02, 2013                                   March 2013
An XML Schema for the CLUE data model
This document provides an XML schema file for the definition of CLUE data model types.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 02, 2013.
Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
This document provides an XML schema file for the definition of CLUE data model types.
The schema is based on information contained in [I-D.ietf-clue-framework] and also relates to the data model sketched in [I-D.romanow-clue-data-model]. It encodes information and constraints defined in the aforementioned documents in order to provide a formal representation of the concepts therein presented. The schema definition is intended to be modified according to changes applied to the above mentioned CLUE documents.
The document currently represents a strawman proposal aimed at defining a coherent structure for all the information associated with the description of a telepresence scenario.
[TBD] Copy text from the framework document.
This section contains the proposed CLUE data model schema definition.
The element and attribute definitions are a formal representation of the concepts needed to describe the capabilities of a media provider and the streams it is currently transmitting within a telepresence session.
The main groups of information are:

o  media captures, i.e., the media flows available on the media provider's side;

o  individual encodings, i.e., the ways the media provider can encode the captures;

o  encoding groups, i.e., sets of individual encodings together with parameters that apply to the group as a whole;

o  capture scenes, i.e., the organization of the captures into scenes and scene entries;

o  simultaneous sets, i.e., the captures that can be transmitted at the same time;

o  capture encodings, i.e., the associations between media captures and individual encodings.
All of the above refers to concepts that have been introduced in [I-D.ietf-clue-framework] and [I-D.romanow-clue-data-model] and further detailed in threads on the mailing list, as well as in the remainder of this document.
<?xml version="1.0" encoding="UTF-8" ?> <xs:schema targetNamespace="urn:ietf:params:xml:ns:clue-info" xmlns:tns="urn:ietf:params:xml:ns:clue-info" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="urn:ietf:params:xml:ns:clue-info" elementFormDefault="qualified" attributeFormDefault="unqualified"> <!-- ELEMENT DEFINITIONS --> <xs:element name="mediaCaptures" type="mediaCapturesType"/> <xs:element name="encodings" type="encodingsType"/> <xs:element name="encodingGroups" type="encodingGroupsType"/> <xs:element name="captureScenes" type="captureScenesType"/> <xs:element name="simultaneousSets" type="simultaneousSetsType"/> <xs:element name="captureEncodings" type="captureEncodingsType"/> <!-- MEDIA CAPTURES TYPE --> <!-- envelope of media captures --> <xs:complexType name="mediaCapturesType"> <xs:sequence> <xs:element name="mediaCapture" type="mediaCaptureType" maxOccurs="unbounded"/> </xs:sequence> </xs:complexType> <!-- DESCRIPTION element --> <xs:element name="description"> <xs:complexType> <xs:simpleContent> <xs:extension base="xs:string"> <xs:attribute name="lang" type="xs:language"/> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> <!-- MEDIA CAPTURE TYPE --> <xs:complexType name="mediaCaptureType" abstract="true"> <xs:sequence> <!-- mandatory fields --> <xs:element name="capturedMedia" type="xs:string"/> <xs:element name="captureSceneIDREF" type="xs:IDREF"/> <xs:element name="encGroupIDREF" type="xs:IDREF"/> <xs:choice> <xs:sequence> <xs:element name="spatialInformation" type="tns:spatialInformationType" maxOccurs="unbounded"/> </xs:sequence> <xs:element name="nonSpatiallyDefinible" type="xs:boolean" fixed="true"/> </xs:choice> <!-- optional fields --> <xs:element ref="description" minOccurs="0" maxOccurs="unbounded"/> <xs:element name="priority" type="xs:integer" minOccurs="0"/> <xs:element name="lang" type="xs:language" minOccurs="0"/> <xs:element name="content" type="xs:string" minOccurs="0"/> <xs:element name="switched" type="xs:boolean" minOccurs="0"/> <xs:element name="dynamic" type="xs:boolean" minOccurs="0"/> <xs:element name="composed" type="xs:boolean" minOccurs="0"/> <xs:element name="maxCaptureEncodings" type="xs:unsignedInt" minOccurs="0"/> <!-- this is in place of "supplementary info": --> <xs:element name="relatedTo" type="xs:IDREF" minOccurs="0"/> <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/> </xs:sequence> <xs:attribute name="captureID" type="xs:ID" use="required"/> <xs:anyAttribute namespace="##other" processContents="lax"/> </xs:complexType> <!-- SPATIAL INFORMATION TYPE --> <xs:complexType name="spatialInformationType"> <xs:sequence> <xs:element name="capturePoint" type="capturePointType"/> <xs:element name="captureArea" type="captureAreaType" minOccurs="0"/> <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/> </xs:sequence> <xs:anyAttribute namespace="##other" processContents="lax"/> </xs:complexType> <!-- TEXT CAPTURE TYPE --> <xs:complexType name="textCaptureType"> <xs:complexContent> <xs:extension base="tns:mediaCaptureType"> </xs:extension> </xs:complexContent> </xs:complexType> <!-- AUDIO CAPTURE TYPE --> <xs:complexType name="audioCaptureType"> <xs:complexContent> <xs:extension base="tns:mediaCaptureType"> <xs:sequence> <xs:element name="audioChannelFormat" type="audioChannelFormatType" minOccurs="0"/> <xs:element name="micPattern" type="tns:micPatternType" minOccurs="0"/> </xs:sequence> </xs:extension> </xs:complexContent> </xs:complexType> <!-- MIC PATTERN TYPE 
--> <xs:simpleType name="micPatternType"> <xs:restriction base="xs:string"> <xs:enumeration value="uni"/> <xs:enumeration value="shotgun"/> <xs:enumeration value="omni"/> <xs:enumeration value="figure8"/> <xs:enumeration value="cardioid"/> <xs:enumeration value="hyper-cardioid"/> </xs:restriction> </xs:simpleType> <!-- AUDIO CHANNEL FORMAT TYPE --> <xs:simpleType name="audioChannelFormatType"> <xs:restriction base="xs:string"> <xs:enumeration value="mono"/> <xs:enumeration value="stereo"/> </xs:restriction> </xs:simpleType> <!-- VIDEO CAPTURE TYPE --> <xs:complexType name="videoCaptureType"> <xs:complexContent> <xs:extension base="tns:mediaCaptureType"> <xs:sequence> <xs:element name="nativeAspectRatio" type="xs:string" minOccurs="0"/> <xs:element ref="embeddedText" minOccurs="0"/> </xs:sequence> </xs:extension> </xs:complexContent> </xs:complexType> <!-- EMBEDDED TEXT ELEMENT --> <xs:element name="embeddedText"> <xs:complexType> <xs:simpleContent> <xs:extension base="xs:boolean"> <xs:attribute name="lang" type="xs:language"/> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> <!-- CAPTURE SCENES TYPE --> <!-- envelope of capture scenes --> <xs:complexType name="captureScenesType"> <xs:sequence> <xs:element name="captureScene" type="captureSceneType" maxOccurs="unbounded"/> </xs:sequence> </xs:complexType> <!-- CAPTURE SCENE TYPE --> <xs:complexType name="captureSceneType"> <xs:sequence> <xs:element ref="description" minOccurs="0" maxOccurs="unbounded"/> <xs:element name="sceneSpace" type="captureSpaceType" minOccurs="0"/> <xs:element name="sceneEntries" type="sceneEntriesType"/> <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/> </xs:sequence> <xs:attribute name="sceneID" type="xs:ID" use="required"/> <xs:attribute name="scale" type="scaleType" use="required"/> <xs:anyAttribute namespace="##other" processContents="lax"/> </xs:complexType> <!-- SCALE TYPE --> <xs:simpleType name="scaleType"> <xs:restriction base="xs:string"> <xs:enumeration value="millimeters"/> <xs:enumeration value="unknown"/> <xs:enumeration value="noscale"/> </xs:restriction> </xs:simpleType> <!-- CAPTURE AREA TYPE --> <xs:complexType name="captureAreaType"> <xs:sequence> <xs:element name="bottomLeft" type="pointType"/> <xs:element name="bottomRight" type="pointType"/> <xs:element name="topLeft" type="pointType"/> <xs:element name="topRight" type="pointType"/> </xs:sequence> </xs:complexType> <!-- CAPTURE SPACE TYPE --> <xs:complexType name="captureSpaceType"> <xs:sequence> <xs:element name="bottomLeftFront" type="pointType"/> <xs:element name="bottomRightFront" type="pointType"/> <xs:element name="topLeftFront" type="pointType"/> <xs:element name="topRightFront" type="pointType"/> <xs:element name="bottomLeftBack" type="pointType"/> <xs:element name="bottomRightBack" type="pointType"/> <xs:element name="topLeftBack" type="pointType"/> <xs:element name="topRightBack" type="pointType"/> </xs:sequence> </xs:complexType> <!-- POINT TYPE --> <xs:complexType name="pointType"> <xs:sequence> <xs:element name="x" type="xs:decimal"/> <xs:element name="y" type="xs:decimal"/> <xs:element name="z" type="xs:decimal"/> </xs:sequence> </xs:complexType> <!-- CAPTURE POINT TYPE --> <xs:complexType name="capturePointType"> <xs:complexContent> <xs:extension base="pointType"> <xs:sequence> <xs:element name="lineOfCapturePoint" type="tns:pointType" minOccurs="0"/> </xs:sequence> <xs:attribute name="pointID" type="xs:ID"/> </xs:extension> </xs:complexContent> </xs:complexType> <!-- SCENE 
ENTRIES TYPE --> <!-- envelope of scene entries of a capture scene --> <xs:complexType name="sceneEntriesType"> <xs:sequence> <xs:element name="sceneEntry" type="sceneEntryType" maxOccurs="unbounded"/> </xs:sequence> </xs:complexType> <!-- SCENE ENTRY TYPE --> <xs:complexType name="sceneEntryType"> <xs:sequence> <xs:element ref="description" minOccurs="0" maxOccurs="unbounded"/> <xs:element name="switchingPolicies" type="switchingPoliciesType" minOccurs="0"/> <xs:element name="mediaCaptureIDs" type="captureIDListType"/> </xs:sequence> <xs:attribute name="sceneEntryID" type="xs:ID" use="required"/> <xs:attribute name="mediaType" type="xs:string" use="required"/> </xs:complexType> <!-- SWITCHING POLICIES TYPE --> <xs:complexType name="switchingPoliciesType"> <xs:sequence> <xs:element name="siteSwitching" type="xs:boolean" minOccurs="0"/> <xs:element name="segmentSwitching" type="xs:boolean" minOccurs="0"/> </xs:sequence> </xs:complexType> <!-- CAPTURE ID LIST TYPE --> <xs:complexType name="captureIDListType"> <xs:sequence> <xs:element name="captureIDREF" type="xs:IDREF" maxOccurs="unbounded"/> </xs:sequence> </xs:complexType> <!-- ENCODINGS TYPE --> <xs:complexType name="encodingsType"> <xs:sequence> <xs:element name="encoding" type="encodingType" maxOccurs="unbounded"/> </xs:sequence> </xs:complexType> <!-- ENCODING TYPE --> <xs:complexType name="encodingType" abstract="true"> <xs:sequence> <xs:element name="encodingName" type="xs:string"/> <xs:element name="maxBandwidth" type="xs:integer"/> <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/> </xs:sequence> <xs:attribute name="encodingID" type="xs:ID" use="required"/> <xs:anyAttribute namespace="##any" processContents="lax"/> </xs:complexType> <!-- AUDIO ENCODING TYPE --> <xs:complexType name="audioEncodingType"> <xs:complexContent> <xs:extension base="tns:encodingType"> <xs:sequence> <xs:element name="encodedMedia" type="xs:string" fixed="audio" minOccurs="0"/> </xs:sequence> </xs:extension> </xs:complexContent> </xs:complexType> <!-- VIDEO ENCODING TYPE --> <xs:complexType name="videoEncodingType"> <xs:complexContent> <xs:extension base="tns:encodingType"> <xs:sequence> <xs:element name="encodedMedia" type="xs:string" fixed="video" minOccurs="0"/> <xs:element name="maxWidth" type="xs:integer" minOccurs="0"/> <xs:element name="maxHeight" type="xs:integer" minOccurs="0"/> <xs:element name="maxFrameRate" type="xs:integer" minOccurs="0"/> </xs:sequence> </xs:extension> </xs:complexContent> </xs:complexType> <!-- H26X ENCODING TYPE --> <xs:complexType name="h26XEncodingType"> <xs:complexContent> <xs:extension base="tns:videoEncodingType"> <xs:sequence> <!-- max number of pixels to be processed per second --> <xs:element name="maxH26Xpps" type="xs:integer" minOccurs="0"/> </xs:sequence> </xs:extension> </xs:complexContent> </xs:complexType> <!-- ENCODING GROUPS TYPE --> <xs:complexType name="encodingGroupsType"> <xs:sequence> <xs:element name="encodingGroup" type="tns:encodingGroupType" maxOccurs="unbounded"/> </xs:sequence> </xs:complexType> <!-- ENCODING GROUP TYPE --> <xs:complexType name="encodingGroupType"> <xs:sequence> <xs:element name="maxGroupBandwidth" type="xs:integer"/> <xs:element name="maxGroupPps" type="xs:integer" minOccurs="0"/> <xs:element name="encodingIDList" type="encodingIDListType"/> <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/> </xs:sequence> <xs:attribute name="encodingGroupID" type="xs:ID" use="required"/> <xs:anyAttribute namespace="##any" 
processContents="lax"/> </xs:complexType> <!-- ENCODING ID LIST TYPE --> <xs:complexType name="encodingIDListType"> <xs:sequence> <xs:element name="encIDREF" type="xs:IDREF" maxOccurs="unbounded"/> </xs:sequence> </xs:complexType> <!-- SIMULTANEOUS SETS TYPE --> <xs:complexType name="simultaneousSetsType"> <xs:sequence> <xs:element name="simultaneousSet" type="simultaneousSetType" maxOccurs="unbounded"/> </xs:sequence> </xs:complexType> <!-- SIMULTANEOUS SET TYPE --> <xs:complexType name="simultaneousSetType"> <xs:sequence> <xs:element name="captureIDREF" type="xs:IDREF" minOccurs="0" maxOccurs="unbounded"/> <xs:element name="sceneEntryIDREF" type="xs:IDREF" minOccurs="0" maxOccurs="unbounded"/> </xs:sequence> </xs:complexType> <!-- CAPTURE ENCODING TYPE --> <xs:complexType name="captureEncodingType"> <xs:sequence> <xs:element name="mediaCaptureID" type="xs:string"/> <xs:element name="encodingID" type="xs:string"/> </xs:sequence> </xs:complexType> <!-- CAPTURE ENCODINGS TYPE --> <xs:complexType name="captureEncodingsType"> <xs:sequence> <xs:element name="captureEncoding" type="captureEncodingType" maxOccurs="unbounded"/> </xs:sequence> </xs:complexType> <!-- CLUE INFO ELEMENT --> <!-- the <clueInfo> envelope can be seen as the ancestor of an <advertisement> envelope --> <xs:element name="clueInfo" type="clueInfoType"/> <!-- CLUE INFO TYPE --> <xs:complexType name="clueInfoType"> <xs:sequence> <xs:element ref="mediaCaptures"/> <xs:element ref="encodings"/> <xs:element ref="encodingGroups"/> <xs:element ref="captureScenes"/> <xs:element ref="simultaneousSets"/> <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/> </xs:sequence> <xs:attribute name="clueInfoID" type="xs:ID" use="required"/> <xs:anyAttribute namespace="##other" processContents="lax"/> </xs:complexType> </xs:schema>
The following sections describe the XML schema in more detail.
<mediaCaptures> represents the list of one or more media captures available on the media provider's side. Each media capture is represented by a <mediaCapture> element (Section 10).
<encodings> represents the list of individual encodings available on the media provider's side. Each individual encoding is represented by an <encoding> element (Section 16).
<encodingGroups> represents the list of the encoding groups organized on the media provider's side. Each encoding group is represented by an <encodingGroup> element (Section 20).
<captureScenes> represents the list of the capture scenes organized on the media provider's side. Each capture scene is represented by a <captureScene> element (Section 14).
<simultaneousSets> contains the simultaneous sets indicated by the media provider. Each simultaneous set is represented by a <simultaneousSet> element (Section 21).
<captureEncodings> is a list of capture encodings. It can represent either the list of the capture encodings desired by the media consumer or the list of the capture encodings instantiated on the provider's side. Each capture encoding is represented by a <captureEncoding> element (Section 22).
According to the CLUE framework, a media capture is the fundamental representation of a media flow that is available on the provider's side. Media captures are characterized by a set of features that are independent of the specific type of medium and by a set of features that are media-specific. We design the media capture type as an abstract type providing all the features common to all media types. Media-specific captures, such as video captures, audio captures and others, are specializations of that media capture type, as in a typical generalization-specialization hierarchy.
The following is the XML Schema definition of the media capture type:
   <!-- MEDIA CAPTURE TYPE -->
   <xs:complexType name="mediaCaptureType" abstract="true">
     <xs:sequence>
       <!-- mandatory fields -->
       <xs:element name="capturedMedia" type="xs:string"/>
       <xs:element name="captureSceneIDREF" type="xs:IDREF"/>
       <xs:element name="encGroupIDREF" type="xs:IDREF"/>
       <xs:choice>
         <xs:sequence>
           <xs:element name="spatialInformation"
                       type="tns:spatialInformationType"
                       maxOccurs="unbounded"/>
         </xs:sequence>
         <xs:element name="nonSpatiallyDefinible" type="xs:boolean"
                     fixed="true"/>
       </xs:choice>
       <!-- optional fields -->
       <xs:element ref="description" minOccurs="0"
                   maxOccurs="unbounded"/>
       <xs:element name="priority" type="xs:integer" minOccurs="0"/>
       <xs:element name="lang" type="xs:language" minOccurs="0"/>
       <xs:element name="content" type="xs:string" minOccurs="0"/>
       <xs:element name="switched" type="xs:boolean" minOccurs="0"/>
       <xs:element name="dynamic" type="xs:boolean" minOccurs="0"/>
       <xs:element name="composed" type="xs:boolean" minOccurs="0"/>
       <xs:element name="maxCaptureEncodings" type="xs:unsignedInt"
                   minOccurs="0"/>
       <!-- this is in place of "supplementary info": -->
       <xs:element name="relatedTo" type="xs:IDREF" minOccurs="0"/>
       <xs:any namespace="##other" processContents="lax"
               minOccurs="0" maxOccurs="unbounded"/>
     </xs:sequence>
     <xs:attribute name="captureID" type="xs:ID" use="required"/>
     <xs:anyAttribute namespace="##other" processContents="lax"/>
   </xs:complexType>
<capturedMedia> is a mandatory field specifying the media type of the capture ("audio", "video", "text",...).
<captureSceneIDREF> is a mandatory field containing the identifier of the capture scene the media capture belongs to. Indeed, each media capture must be associated with one and only one capture scene. When a media capture is spatially definible, some spatial information is provided along with it in the form of point coordinates (see Section 10.4). Such coordinates refer to the coordinate space defined for the capture scene containing the capture.
<encGroupIDREF> is a mandatory field containing the identifier of the encoding group the media capture is associated with.
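As a purely illustrative sketch (the identifier values below are hypothetical), the mandatory portion of a non spatially definible video capture compliant with the above type could look like:

   <mediaCapture xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:type="videoCaptureType" captureID="VC9">
     <capturedMedia>video</capturedMedia>
     <captureSceneIDREF>CS1</captureSceneIDREF>
     <encGroupIDREF>EG0</encGroupIDREF>
     <nonSpatiallyDefinible>true</nonSpatiallyDefinible>
   </mediaCapture>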
Media captures are divided into two categories: non spatially definible captures and spatially definible captures.
Non spatially definible captures are those that do not capture parts of the telepresence room. Examples of such captures are recordings, text captures, DVDs, recorded presentations, or external streams that are played in the telepresence room and transmitted to remote sites.
Spatially definible captures are those that capture part of the telepresence room. The captured part of the telepresence room is described by means of the <spatialInformation> element.
This is the definition of the spatial information type:
   <!-- SPATIAL INFORMATION TYPE -->
   <xs:complexType name="spatialInformationType">
     <xs:sequence>
       <xs:element name="capturePoint" type="capturePointType"/>
       <xs:element name="captureArea" type="captureAreaType"
                   minOccurs="0"/>
       <xs:any namespace="##other" processContents="lax"
               minOccurs="0" maxOccurs="unbounded"/>
     </xs:sequence>
     <xs:anyAttribute namespace="##other" processContents="lax"/>
   </xs:complexType>
The <capturePoint> element contains the coordinates of the capture device that is taking the capture, as well as, optionally, its pointing direction (see Section 10.4.1). It is a mandatory field when the media capture is spatially definible, independently of the media type.
The <captureArea> is an optional field containing four points defining the captured area represented by the capture (see Section 10.4.2).
The <capturePoint> element is used to represent the position and the line of capture of a capture device. The XML Schema definition of the <capturePoint> element type is the following:
   <!-- CAPTURE POINT TYPE -->
   <xs:complexType name="capturePointType">
     <xs:complexContent>
       <xs:extension base="pointType">
         <xs:sequence>
           <xs:element name="lineOfCapturePoint" type="tns:pointType"
                       minOccurs="0"/>
         </xs:sequence>
         <xs:attribute name="pointID" type="xs:ID"/>
       </xs:extension>
     </xs:complexContent>
   </xs:complexType>

   <!-- POINT TYPE -->
   <xs:complexType name="pointType">
     <xs:sequence>
       <xs:element name="x" type="xs:decimal"/>
       <xs:element name="y" type="xs:decimal"/>
       <xs:element name="z" type="xs:decimal"/>
     </xs:sequence>
   </xs:complexType>
The point type contains three spatial coordinates ("x","y","z") representing a point in the space associated with a certain capture scene.
The capture point type extends the point type, i.e., it is represented by three coordinates identifying the position of the capture device, but can add further information. Such further information is conveyed by the <lineOfCapturePoint>, which is another point-type element representing the "point on line of capture", that gives the pointing direction of the capture device.
If the point of capture is not specified, it means the consumer should not assume anything about the spatial location of the capturing device.
The coordinates of the point on line of capture MUST NOT be identical to the capture point coordinates. If the point on line of capture is not specified, no assumptions are made about the axis of the capturing device.
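As an illustrative fragment (the coordinate values below mirror those used in the example at the end of this document), a capture point together with its point on line of capture could be encoded as:

   <capturePoint>
     <x>0.5</x>
     <y>1.0</y>
     <z>0.5</z>
     <lineOfCapturePoint>
       <x>0.5</x>
       <y>0.0</y>
       <z>0.5</z>
     </lineOfCapturePoint>
   </capturePoint>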
<captureArea> is an optional element that can be contained within the spatial information associated with a media capture. It represents the spatial area captured by the media capture.
The XML representation of that area is provided through a set of four point-type elements, <bottomLeft>, <bottomRight>, <topLeft>, and <topRight>, as can be seen from the following definition:
   <!-- CAPTURE AREA TYPE -->
   <xs:complexType name="captureAreaType">
     <xs:sequence>
       <xs:element name="bottomLeft" type="pointType"/>
       <xs:element name="bottomRight" type="pointType"/>
       <xs:element name="topLeft" type="pointType"/>
       <xs:element name="topRight" type="pointType"/>
     </xs:sequence>
   </xs:complexType>
<bottomLeft>, <bottomRight>, <topLeft>, and <topRight> should be co-planar.
For a switched capture that switches between different sections within a larger area, the area of capture should use coordinates for the larger potential area.
By comparing the capture area of different media captures within the same capture scene, a consumer can determine the spatial relationships between them and render them correctly. If the area of capture is not specified, it means the Media Capture is not spatially related to any other media capture.
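As an illustrative fragment (coordinate values mirror the zoomed-out capture in the example at the end of this document), a capture area with its four co-planar points could be encoded as:

   <captureArea>
     <bottomLeft>
       <x>0.0</x> <y>3.0</y> <z>0.0</z>
     </bottomLeft>
     <bottomRight>
       <x>3.0</x> <y>3.0</y> <z>0.0</z>
     </bottomRight>
     <topLeft>
       <x>0.0</x> <y>3.0</y> <z>3.0</z>
     </topLeft>
     <topRight>
       <x>3.0</x> <y>3.0</y> <z>3.0</z>
     </topRight>
   </captureArea>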
When media captures are non spatially definible, they are marked with the boolean <nonSpatiallyDefinible> element set to "true".
<description> is an optional element used to provide human-readable textual information. It is used to describe media captures, capture scenes and capture scene entries. A media capture can be described by using multiple <description> elements, each one providing information in a different language. The <description> element definition is the following:
   <!-- DESCRIPTION element -->
   <xs:element name="description">
     <xs:complexType>
       <xs:simpleContent>
         <xs:extension base="xs:string">
           <xs:attribute name="lang" type="xs:language"/>
         </xs:extension>
       </xs:simpleContent>
     </xs:complexType>
   </xs:element>
As can be seen, <description> is a string element with an attribute ("lang") indicating the language used in the textual description.
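For instance, a capture could carry two <description> elements for two different languages (the values below are purely illustrative):

   <description lang="en">central camera video</description>
   <description lang="it">video della telecamera centrale</description>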
<priority> ([I-D.groves-clue-capture-attr]) is an optional integer field indicating the importance of a media capture according to the media provider's perspective. It can be used on the receiver's side to automatically identify the most "important" contribution available from the media provider.
[edt note: no final consensus has been reached on the adoption of such media capture attribute.]
<lang> is an optional element containing the language used in the capture, if any. The purpose of the element could match that of the "language" attribute proposed in [I-D.groves-clue-capture-attr].
<content> is an optional string element. It contains enumerated values describing the "role" of the media capture according to what is envisioned in [RFC4796] ("slides", "speaker", "sl", "main", "alt"). The values for this element are the same as the mediacnt values for the content attribute in [RFC4796]. This element can list multiple values, for example "main, speaker".
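A possible instance of the element carrying two of the [RFC4796] values would then be:

   <content>main,speaker</content>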
[edt note: a better XML Schema definition for that element will soon be defined.]
<switched> is a boolean element which indicates whether or not the media capture represents the most appropriate subset of a "whole". What is "most appropriate" is up to the provider and could be the active speaker, a lecturer or a VIP.
[edt note: :(]
<dynamic> is an optional boolean element indicating whether or not the capture device originating the capture moves during the telepresence session. This optional boolean element has the same purpose as the dynamic attribute proposed in [I-D.groves-clue-capture-attr].
[edt note: There isn't yet final consensus about that element.]
<composed> is an optional boolean element indicating whether or not the media capture is a mix (audio) or composition (video) of streams. This element is useful, for example, for a media consumer to avoid nesting a composed video capture into another composed capture or rendering.
The optional <maxCaptureEncodings> element contains an unsigned integer indicating the maximum number of capture encodings that can be simultaneously active for the media capture. If absent, this parameter defaults to 1. The minimum value for this element is 1. The number of simultaneous capture encodings is also limited by the restrictions of the encoding group the media capture refers to by means of the <encGroupIDREF> element.
The optional <relatedTo> element contains the value of the ID attribute of the media capture it refers to. The media capture marked with a <relatedTo> element can be, for example, the translation of a main media capture into a different language. The <relatedTo> element could be interpreted in the same manner as the supplementary information attribute proposed in [I-D.groves-clue-capture-attr] and further discussed in http://www.ietf.org/mail-archive/web/clue/current/msg02238.html.
[edt note: There isn't yet final consensus about that element.]
The "captureID" attribute is a mandatory field containing the identifier of the media capture.
Audio captures inherit all the features of a generic media capture and present further audio-specific characteristics. The XML Schema definition of the audio capture type is reported below:
   <!-- AUDIO CAPTURE TYPE -->
   <xs:complexType name="audioCaptureType">
     <xs:complexContent>
       <xs:extension base="tns:mediaCaptureType">
         <xs:sequence>
           <xs:element name="audioChannelFormat"
                       type="audioChannelFormatType" minOccurs="0"/>
           <xs:element name="micPattern" type="tns:micPatternType"
                       minOccurs="0"/>
         </xs:sequence>
       </xs:extension>
     </xs:complexContent>
   </xs:complexType>
Audio-specific information about the audio capture is contained in <audioChannelFormat> (Section 11.1) and in <micPattern> (Section 11.2).
The optional <audioChannelFormat> element is a field with enumerated values ("mono" and "stereo") which describes the method of encoding used for audio. A value of "mono" means the audio capture has one channel. A value of "stereo" means the audio capture has two audio channels, left and right. A single stereo capture is different from two mono captures that have a left-right spatial relationship. A stereo capture maps to a single RTP stream, while each mono audio capture maps to a separate RTP stream.
The XML Schema definition of the <audioChannelFormat> element type is provided below:
   <!-- AUDIO CHANNEL FORMAT TYPE -->
   <xs:simpleType name="audioChannelFormatType">
     <xs:restriction base="xs:string">
       <xs:enumeration value="mono"/>
       <xs:enumeration value="stereo"/>
     </xs:restriction>
   </xs:simpleType>
The <micPattern> element is an optional field describing the characteristics of the microphone capturing the audio signal. It can contain one of the enumerated values listed below:
   <!-- MIC PATTERN TYPE -->
   <xs:simpleType name="micPatternType">
     <xs:restriction base="xs:string">
       <xs:enumeration value="uni"/>
       <xs:enumeration value="shotgun"/>
       <xs:enumeration value="omni"/>
       <xs:enumeration value="figure8"/>
       <xs:enumeration value="cardioid"/>
       <xs:enumeration value="hyper-cardioid"/>
     </xs:restriction>
   </xs:simpleType>
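As an illustrative fragment (values mirror the room audio capture in the example at the end of this document), the audio-specific part of a <mediaCapture> element of type audioCaptureType could be:

   <audioChannelFormat>mono</audioChannelFormat>
   <micPattern>figure8</micPattern>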
Video captures, similarly to audio captures, extend the information of a generic media capture with video-specific features, such as <nativeAspectRatio> (Section 12.1) and <embeddedText> (Section 12.2).
The XML Schema representation of the video capture type is provided in the following:
   <!-- VIDEO CAPTURE TYPE -->
   <xs:complexType name="videoCaptureType">
     <xs:complexContent>
       <xs:extension base="tns:mediaCaptureType">
         <xs:sequence>
           <xs:element name="nativeAspectRatio" type="xs:string"
                       minOccurs="0"/>
           <xs:element ref="embeddedText" minOccurs="0"/>
         </xs:sequence>
       </xs:extension>
     </xs:complexContent>
   </xs:complexType>
If a video capture has a native aspect ratio (for instance, it corresponds to a camera that generates 4:3 video), then it can be supplied as a value of the <nativeAspectRatio> element, in order to help rendering.
The <embeddedText> element is a boolean element indicating that there is text embedded in the video capture. The language used in such embedded text is reported in the "lang" attribute of <embeddedText>.
The XML Schema definition of the <embeddedText> element is:
   <!-- EMBEDDED TEXT ELEMENT -->
   <xs:element name="embeddedText">
     <xs:complexType>
       <xs:simpleContent>
         <xs:extension base="xs:boolean">
           <xs:attribute name="lang" type="xs:language"/>
         </xs:extension>
       </xs:simpleContent>
     </xs:complexType>
   </xs:element>
The <embeddedText> element could correspond to the embedded-text attribute introduced in [I-D.groves-clue-capture-attr].
[edt note: no final consensus has been reached yet about the adoption of such element]
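As an illustrative fragment (the aspect ratio value is hypothetical), the video-specific part of a <mediaCapture> element of type videoCaptureType could be:

   <nativeAspectRatio>4:3</nativeAspectRatio>
   <embeddedText lang="en">true</embeddedText>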
Text captures, too, can be described by extending the generic media capture information, similarly to audio captures and video captures.
The XML Schema representation of the text capture type currently lacks text-specific information, as can be seen from the definition below:
   <!-- TEXT CAPTURE TYPE -->
   <xs:complexType name="textCaptureType">
     <xs:complexContent>
       <xs:extension base="tns:mediaCaptureType">
       </xs:extension>
     </xs:complexContent>
   </xs:complexType>
A media provider organizes the available captures into capture scenes in order to help the receiver both in rendering and in selecting groups of captures. Capture scenes are made of capture scene entries, which are sets of media captures of the same media type. Each capture scene entry represents an alternative way of completely representing a capture scene for a given media type.
The XML Schema representation of a <captureScene> element is the following:
   <!-- CAPTURE SCENE TYPE -->
   <xs:complexType name="captureSceneType">
     <xs:sequence>
       <xs:element ref="description" minOccurs="0"
                   maxOccurs="unbounded"/>
       <xs:element name="sceneSpace" type="captureSpaceType"
                   minOccurs="0"/>
       <xs:element name="sceneEntries" type="sceneEntriesType"/>
       <xs:any namespace="##other" processContents="lax"
               minOccurs="0" maxOccurs="unbounded"/>
     </xs:sequence>
     <xs:attribute name="sceneID" type="xs:ID" use="required"/>
     <xs:attribute name="scale" type="scaleType" use="required"/>
     <xs:anyAttribute namespace="##other" processContents="lax"/>
   </xs:complexType>
The <captureScene> element can contain zero or more textual <description> elements, defined as in Section 10.6. Besides <description>, there are two other fields: <sceneSpace> (Section 14.1), describing the coordinate space which the media captures of the capture scene refer to, and <sceneEntries> (Section 14.2), the list of the capture scene entries.
The <sceneSpace> describes a bounding volume for the spatial information provided alongside the spatially definible media captures associated with the considered capture scene. Such a volume is described as an arbitrary hexahedron with eight points (<bottomLeftFront>, <bottomRightFront>, <topLeftFront>, <topRightFront>, <bottomLeftBack>, <bottomRightBack>, <topLeftBack>, and <topRightBack>). The coordinate system is Cartesian X, Y, Z with the origin at a spatial location of the media provider's choosing. The media provider must use the same coordinate system, with the same scale and origin, for all media capture coordinates within the same capture scene.
   <!-- CAPTURE SPACE TYPE -->
   <xs:complexType name="captureSpaceType">
     <xs:sequence>
       <xs:element name="bottomLeftFront" type="pointType"/>
       <xs:element name="bottomRightFront" type="pointType"/>
       <xs:element name="topLeftFront" type="pointType"/>
       <xs:element name="topRightFront" type="pointType"/>
       <xs:element name="bottomLeftBack" type="pointType"/>
       <xs:element name="bottomRightBack" type="pointType"/>
       <xs:element name="topLeftBack" type="pointType"/>
       <xs:element name="topRightBack" type="pointType"/>
     </xs:sequence>
   </xs:complexType>
[edt note: this is just a place holder, the definition of the bounding volume has to be discussed]
The <sceneEntries> element is a mandatory field of a capture scene containing the list of scene entries. Each scene entry is represented by a <sceneEntry> element (Section 15).
   <!-- SCENE ENTRIES TYPE -->
   <!-- envelope of scene entries of a capture scene -->
   <xs:complexType name="sceneEntriesType">
     <xs:sequence>
       <xs:element name="sceneEntry" type="sceneEntryType"
                   maxOccurs="unbounded"/>
     </xs:sequence>
   </xs:complexType>
The sceneID attribute is a mandatory attribute containing the identifier of the capture scene.
The scale attribute is a mandatory attribute that specifies the scale of the coordinates provided in the capture space and in the spatial information of the media captures belonging to the considered capture scene. The scale attribute can assume three different values ("millimeters" when coordinate values are expressed in millimeters, "unknown" when the scale is consistent within the scene but the measurement unit is unknown, and "noscale" when no physical scale can be inferred):
   <!-- SCALE TYPE -->
   <xs:simpleType name="scaleType">
     <xs:restriction base="xs:string">
       <xs:enumeration value="millimeters"/>
       <xs:enumeration value="unknown"/>
       <xs:enumeration value="noscale"/>
     </xs:restriction>
   </xs:simpleType>
A <sceneEntry> element represents a capture scene entry, which contains a set of media captures of the same media type describing a capture scene.
A <sceneEntry> element is characterized as follows.
   <!-- SCENE ENTRY TYPE -->
   <xs:complexType name="sceneEntryType">
     <xs:sequence>
       <xs:element ref="description" minOccurs="0"
                   maxOccurs="unbounded"/>
       <xs:element name="switchingPolicies"
                   type="switchingPoliciesType" minOccurs="0"/>
       <xs:element name="mediaCaptureIDs" type="captureIDListType"/>
     </xs:sequence>
     <xs:attribute name="sceneEntryID" type="xs:ID" use="required"/>
     <xs:attribute name="mediaType" type="xs:string" use="required"/>
   </xs:complexType>
One or more optional <description> elements provide human-readable information about what the scene entry contains. <description> is defined as already seen in Section 10.6.
The remaining child elements are described in the following subsections.
<switchingPolicies> represents the switching policies the media provider supports for the media captures contained inside a scene entry. The <switchingPolicies> element contains two boolean elements:
The "site-switch" policy means all captures are switched at the same time to keep captures from the same endpoint site together.
The "segment-switch" policy means different captures can switch at different times, and can be coming from different endpoints.
   <!-- SWITCHING POLICIES TYPE -->
   <xs:complexType name="switchingPoliciesType">
     <xs:sequence>
       <xs:element name="siteSwitching" type="xs:boolean"
                   minOccurs="0"/>
       <xs:element name="segmentSwitching" type="xs:boolean"
                   minOccurs="0"/>
     </xs:sequence>
   </xs:complexType>
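As an illustrative fragment (the policy values are hypothetical), a scene entry supporting site switching but not segment switching could advertise:

   <switchingPolicies>
     <siteSwitching>true</siteSwitching>
     <segmentSwitching>false</segmentSwitching>
   </switchingPolicies>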
The <mediaCaptureIDs> element is the list of the identifiers of the media captures included in the scene entry. It is an element of the captureIDListType type, which is defined as a sequence of <captureIDREF> elements, each containing the identifier of a media capture listed within the <mediaCaptures> element:
   <!-- CAPTURE ID LIST TYPE -->
   <xs:complexType name="captureIDListType">
     <xs:sequence>
       <xs:element name="captureIDREF" type="xs:IDREF"
                   maxOccurs="unbounded"/>
     </xs:sequence>
   </xs:complexType>
The sceneEntryID attribute is a mandatory attribute containing the identifier of the capture scene entry represented by the <sceneEntry> element.
The mediaType attribute contains the media type of the media captures included in the scene entry.
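As an illustrative instance (taken from the example at the end of this document), a video scene entry grouping three participant captures could be encoded as:

   <sceneEntry mediaType="video" sceneEntryID="SE1">
     <description lang="en">participants streams</description>
     <mediaCaptureIDs>
       <captureIDREF>VC0</captureIDREF>
       <captureIDREF>VC1</captureIDREF>
       <captureIDREF>VC2</captureIDREF>
     </mediaCaptureIDs>
   </sceneEntry>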
The <encoding> element represents an individual encoding, i.e., a way to encode a media capture. Individual encodings can be characterized by features that are independent of the specific type of medium and by features that are media-specific. We design the individual encoding type as an abstract type providing all the features common to all media types. Media-specific individual encodings, such as video encodings, audio encodings and others, are specializations of that type, as in a typical generalization-specialization hierarchy.
   <!-- ENCODING TYPE -->
   <xs:complexType name="encodingType" abstract="true">
     <xs:sequence>
       <xs:element name="encodingName" type="xs:string"/>
       <xs:element name="maxBandwidth" type="xs:integer"/>
       <xs:any namespace="##other" processContents="lax"
               minOccurs="0" maxOccurs="unbounded"/>
     </xs:sequence>
     <xs:attribute name="encodingID" type="xs:ID" use="required"/>
     <xs:anyAttribute namespace="##any" processContents="lax"/>
   </xs:complexType>
<encodingName> is a mandatory field containing the name of the encoding (e.g., G711, H264, ...).
<maxBandwidth> represents the maximum bitrate the media provider can instantiate for that encoding.
The encodingID attribute is a mandatory attribute containing the identifier of the individual encoding.
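Since the encoding type is abstract, an instance must declare a concrete subtype through the xsi:type attribute. As an illustrative fragment (values mirror those used in the example at the end of this document):

   <encoding xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:type="audioEncodingType" encodingID="ENC3">
     <encodingName>g711</encodingName>
     <maxBandwidth>64000</maxBandwidth>
   </encoding>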
Audio encodings inherit all the features of a generic individual encoding and can present further audio-specific encoding characteristics. The XML Schema definition of the audio encoding type is reported below:
   <!-- AUDIO ENCODING TYPE -->
   <xs:complexType name="audioEncodingType">
     <xs:complexContent>
       <xs:extension base="tns:encodingType">
         <xs:sequence>
           <xs:element name="encodedMedia" type="xs:string"
                       fixed="audio" minOccurs="0"/>
         </xs:sequence>
       </xs:extension>
     </xs:complexContent>
   </xs:complexType>
Up to now the only audio-specific information is the <encodedMedia> element containing the media type of the media captures that can be encoded with the considered individual encoding. In the case of audio encoding, that element is forced to "audio".
Similarly to audio encodings, video encodings can extend the information of a generic individual encoding with video-specific encoding features, such as <maxWidth>, <maxHeight> and <maxFrameRate>.
The <encodedMedia> element contains the media type of the media captures that can be encoded with the considered individual encoding. In the case of video encoding, that element is forced to "video".
   <!-- VIDEO ENCODING TYPE -->
   <xs:complexType name="videoEncodingType">
     <xs:complexContent>
       <xs:extension base="tns:encodingType">
         <xs:sequence>
           <xs:element name="encodedMedia" type="xs:string"
                       fixed="video" minOccurs="0"/>
           <xs:element name="maxWidth" type="xs:integer"
                       minOccurs="0"/>
           <xs:element name="maxHeight" type="xs:integer"
                       minOccurs="0"/>
           <xs:element name="maxFrameRate" type="xs:integer"
                       minOccurs="0"/>
         </xs:sequence>
       </xs:extension>
     </xs:complexContent>
   </xs:complexType>
<maxWidth> represents the video resolution's maximum width supported by the video encoding, expressed in pixels.
[edt note: not present in -09 version of the framework doc]
<maxHeight> represents the video resolution's maximum height supported by the video encoding, expressed in pixels.
[edt note: not present in -09 version of the framework doc]
<maxFrameRate> provides the maximum frame rate supported by the video encoding for the video capture to be encoded.
[edt note: not present in -09 version of the framework doc]
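As an illustrative instance (values mirror those used in the example at the end of this document), a video individual encoding could be encoded as:

   <encoding xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:type="videoEncodingType" encodingID="ENC0">
     <encodingName>h263</encodingName>
     <maxBandwidth>4000000</maxBandwidth>
     <encodedMedia>video</encodedMedia>
     <maxWidth>1920</maxWidth>
     <maxHeight>1088</maxHeight>
   </encoding>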
This is an example of how it is possible to further specialize the definition of a video individual encoding in order to cover encoding-specific information. An H26X video encoding can be represented through an element inheriting the video encoding characteristics described above (Section 18) and adding other information such as <maxH26Xpps>, which represents the maximum number of pixels to be processed per second.
   <!-- H26X ENCODING TYPE -->
   <xs:complexType name="h26XEncodingType">
     <xs:complexContent>
       <xs:extension base="tns:videoEncodingType">
         <xs:sequence>
           <!-- max number of pixels to be processed per second -->
           <xs:element name="maxH26Xpps" type="xs:integer"
                       minOccurs="0"/>
         </xs:sequence>
       </xs:extension>
     </xs:complexContent>
   </xs:complexType>
[edt note: Need to be checked]
The <encodingGroup> element represents an encoding group, which is a set of one or more individual encodings, and parameters that apply to the group as a whole. The definition of the <encodingGroup> element is the following:
   <!-- ENCODING GROUP TYPE -->
   <xs:complexType name="encodingGroupType">
     <xs:sequence>
       <xs:element name="maxGroupBandwidth" type="xs:integer"/>
       <xs:element name="maxGroupPps" type="xs:integer"
                   minOccurs="0"/>
       <xs:element name="encodingIDList" type="encodingIDListType"/>
       <xs:any namespace="##other" processContents="lax"
               minOccurs="0" maxOccurs="unbounded"/>
     </xs:sequence>
     <xs:attribute name="encodingGroupID" type="xs:ID"
                   use="required"/>
     <xs:anyAttribute namespace="##any" processContents="lax"/>
   </xs:complexType>
In the following, the contained elements are further described.
<maxGroupBandwidth> is an optional field containing the maximum bitrate supported for all the individual encodings included in the encoding group.
<maxGroupPps> is an optional field containing the maximum number of pixels per second for all the individual encodings included in the encoding group.
[edt note: Need to be checked]
<encodingIDList> is the list of the individual encodings grouped together. Each individual encoding is represented through its identifier, contained within an <encIDREF> element.
   <!-- ENCODING ID LIST TYPE -->
   <xs:complexType name="encodingIDListType">
     <xs:sequence>
       <xs:element name="encIDREF" type="xs:IDREF"
                   maxOccurs="unbounded"/>
     </xs:sequence>
   </xs:complexType>
The encodingGroupID attribute contains the identifier of the encoding group.
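As an illustrative instance (values mirror the video encoding group in the example at the end of this document):

   <encodingGroup encodingGroupID="EG0">
     <maxGroupBandwidth>12000000</maxGroupBandwidth>
     <encodingIDList>
       <encIDREF>ENC0</encIDREF>
       <encIDREF>ENC1</encIDREF>
       <encIDREF>ENC2</encIDREF>
     </encodingIDList>
   </encodingGroup>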
<simultaneousSet> represents a simultaneous set, i.e., a list of captures of the same media type that can be transmitted at the same time by a media provider. There are different simultaneous transmission sets for each media type.
   <!-- SIMULTANEOUS SET TYPE -->
   <xs:complexType name="simultaneousSetType">
     <xs:sequence>
       <xs:element name="captureIDREF" type="xs:IDREF"
                   minOccurs="0" maxOccurs="unbounded"/>
       <xs:element name="sceneEntryIDREF" type="xs:IDREF"
                   minOccurs="0" maxOccurs="unbounded"/>
     </xs:sequence>
   </xs:complexType>
[edt note: need to be checked]
<captureIDREF> contains the identifier of a media capture that belongs to the simultaneous set.
<sceneEntryIDREF> contains the identifier of a scene entry containing a group of captures that can be sent simultaneously with the other captures of the simultaneous set.
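As an illustrative fragment (the referenced identifiers mirror the first simultaneous set in the example at the end of this document), a simultaneous set listing only media captures could be encoded as:

   <simultaneousSet>
     <captureIDREF>VC0</captureIDREF>
     <captureIDREF>VC1</captureIDREF>
     <captureIDREF>VC2</captureIDREF>
     <captureIDREF>VC4</captureIDREF>
     <captureIDREF>AC0</captureIDREF>
     <captureIDREF>AC1</captureIDREF>
   </simultaneousSet>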
A <captureEncoding> results from the association of a media capture with an individual encoding, forming a capture stream. It is defined as an element of the following type:
   <!-- CAPTURE ENCODING TYPE -->
   <xs:complexType name="captureEncodingType">
     <xs:sequence>
       <xs:element name="mediaCaptureID" type="xs:string"/>
       <xs:element name="encodingID" type="xs:string"/>
     </xs:sequence>
   </xs:complexType>
<mediaCaptureID> contains the identifier of the media capture that has been encoded to form the capture encoding.
<encodingID> contains the identifier of the applied individual encoding.
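As an illustrative fragment (the pairing below is hypothetical), a capture encoding associating a video capture with a video individual encoding could be encoded as:

   <captureEncoding>
     <mediaCaptureID>VC0</mediaCaptureID>
     <encodingID>ENC0</encodingID>
   </captureEncoding>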
The <clueInfo> element has been left within the XML Schema for the sake of convenience when representing a prototype of ADVERTISEMENT message (see the example section).
   <!-- CLUE INFO ELEMENT -->
   <!-- the <clueInfo> envelope can be seen as the ancestor
        of an <advertisement> envelope -->
   <xs:element name="clueInfo" type="clueInfoType"/>

   <!-- CLUE INFO TYPE -->
   <xs:complexType name="clueInfoType">
     <xs:sequence>
       <xs:element ref="mediaCaptures"/>
       <xs:element ref="encodings"/>
       <xs:element ref="encodingGroups"/>
       <xs:element ref="captureScenes"/>
       <xs:element ref="simultaneousSets"/>
       <xs:any namespace="##other" processContents="lax"
               minOccurs="0" maxOccurs="unbounded"/>
     </xs:sequence>
     <xs:attribute name="clueInfoID" type="xs:ID" use="required"/>
     <xs:anyAttribute namespace="##other" processContents="lax"/>
   </xs:complexType>
The following XML document represents a schema compliant example of a CLUE telepresence scenario.
There are 5 video captures:

o  VC0: left camera video

o  VC1: central camera video

o  VC2: right camera video

o  VC3: zoomed-out view of the room (taken from the central camera)

o  VC4: presentation video
There are 2 audio captures:

o  AC0: audio from the central camera microphone

o  AC1: presentation audio
The captures are organized into two capture scenes:

o  CS1: the main scene (the room and its participants)

o  CS2: the presentation
Within the capture scene CS1, there are three scene entries available:

o  SE1 (video): the three participant streams (VC0, VC1, VC2)

o  SE2 (video): the zoomed-out room stream (VC3)

o  SE3 (audio): the room audio (AC0)
On the other hand, capture scene CS2 presents two scene entries:

o  CS2_SE1 (video): the presentation video (VC4)

o  CS2_SE2 (audio): the presentation audio (AC1)
There are two encoding groups:

o  EG0: the video encoding group, containing the individual video encodings ENC0, ENC1, and ENC2

o  EG1: the audio encoding group, containing the individual audio encodings ENC3 and ENC4
As to the simultaneous sets, only VC1 and VC3 cannot be transmitted simultaneously, since they are captured by the same device, i.e., the central camera (VC3 is a zoomed-out view while VC1 is a focused view of the front participants). The simultaneous sets would then be the following:

o  SS1: VC0, VC1, VC2, VC4, AC0, AC1

o  SS2: VC0, VC2, VC3, VC4, AC0, AC1
<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <clueInfo xmlns="urn:ietf:params:xml:ns:clue-info" clueInfoID="prova"> <mediaCaptures> <mediaCapture xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="audioCaptureType" captureID="AC1"> <capturedMedia>audio</capturedMedia> <captureSceneIDREF>CS2</captureSceneIDREF> <encGroupIDREF>EG1</encGroupIDREF> <nonSpatiallyDefinible>true</nonSpatiallyDefinible> <description lang="en">presentation audio</description> <content>slide</content> <audioChannelFormat>mono</audioChannelFormat> </mediaCapture> <mediaCapture xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="videoCaptureType" captureID="VC4"> <capturedMedia>video</capturedMedia> <captureSceneIDREF>CS2</captureSceneIDREF> <encGroupIDREF>EG0</encGroupIDREF> <nonSpatiallyDefinible>true</nonSpatiallyDefinible> <description lang="en">presentation video</description> <content>slides</content> </mediaCapture> <mediaCapture xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="audioCaptureType" captureID="AC0"> <capturedMedia>audio</capturedMedia> <captureSceneIDREF>CS1</captureSceneIDREF> <encGroupIDREF>EG1</encGroupIDREF> <spatialInformation> <capturePoint> <x>0.5</x> <y>1.0</y> <z>0.5</z> <lineOfCapturePoint> <x>0.5</x> <y>0.0</y> <z>0.5</z> </lineOfCapturePoint> </capturePoint> </spatialInformation> <description lang="en"> audio from the central camera mic</description> <audioChannelFormat>mono</audioChannelFormat> <micPattern>figure8</micPattern> </mediaCapture> <mediaCapture xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="videoCaptureType" captureID="VC3"> <capturedMedia>video</capturedMedia> <captureSceneIDREF>CS1</captureSceneIDREF> <encGroupIDREF>EG0</encGroupIDREF> <spatialInformation> <capturePoint> <x>1.5</x> <y>1.0</y> <z>0.5</z> <lineOfCapturePoint> <x>1.5</x> <y>0.0</y> <z>0.5</z> </lineOfCapturePoint> </capturePoint> <captureArea> <bottomLeft> <x>0.0</x> <y>3.0</y> <z>0.0</z> </bottomLeft> <bottomRight> <x>3.0</x> <y>3.0</y> <z>0.0</z> </bottomRight> <topLeft> <x>0.0</x> <y>3.0</y> <z>3.0</z> </topLeft> <topRight> <x>3.0</x> <y>3.0</y> <z>3.0</z> </topRight> </captureArea> </spatialInformation> <description lang="en"> zoomed out view of the room</description> </mediaCapture> <mediaCapture xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="videoCaptureType" captureID="VC2"> <capturedMedia>video</capturedMedia> <captureSceneIDREF>CS1</captureSceneIDREF> <encGroupIDREF>EG0</encGroupIDREF> <spatialInformation> <capturePoint> <x>2.5</x> <y>1.0</y> <z>0.5</z> <lineOfCapturePoint> <x>2.5</x> <y>0.0</y> <z>0.5</z> </lineOfCapturePoint> </capturePoint> <captureArea> <bottomLeft> <x>2.0</x> <y>3.0</y> <z>0.0</z> </bottomLeft> <bottomRight> <x>3.0</x> <y>3.0</y> <z>0.0</z> </bottomRight> <topLeft> <x>2.0</x> <y>3.0</y> <z>3.0</z> </topLeft> <topRight> <x>3.0</x> <y>3.0</y> <z>3.0</z> </topRight> </captureArea> </spatialInformation> <description lang="en">right camera video</description> </mediaCapture> <mediaCapture xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="videoCaptureType" captureID="VC1"> <capturedMedia>video</capturedMedia> <captureSceneIDREF>CS1</captureSceneIDREF> <encGroupIDREF>EG0</encGroupIDREF> <spatialInformation> <capturePoint> <x>1.5</x> <y>1.0</y> <z>0.5</z> <lineOfCapturePoint> <x>1.5</x> <y>0.0</y> <z>0.5</z> </lineOfCapturePoint> </capturePoint> <captureArea> <bottomLeft> <x>1.0</x> <y>3.0</y> <z>0.0</z> </bottomLeft> <bottomRight> <x>2.0</x> <y>3.0</y> <z>0.0</z> </bottomRight> <topLeft> 
<x>1.0</x> <y>3.0</y> <z>3.0</z> </topLeft> <topRight> <x>2.0</x> <y>3.0</y> <z>3.0</z> </topRight> </captureArea> </spatialInformation> <description lang="en">central camera video</description> </mediaCapture> <mediaCapture xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="videoCaptureType" captureID="VC0"> <capturedMedia>video</capturedMedia> <captureSceneIDREF>CS1</captureSceneIDREF> <encGroupIDREF>EG0</encGroupIDREF> <spatialInformation> <capturePoint> <x>0.5</x> <y>1.0</y> <z>0.5</z> <lineOfCapturePoint> <x>0.5</x> <y>0.0</y> <z>0.5</z> </lineOfCapturePoint> </capturePoint> <captureArea> <bottomLeft> <x>0.0</x> <y>3.0</y> <z>0.0</z> </bottomLeft> <bottomRight> <x>1.0</x> <y>3.0</y> <z>0.0</z> </bottomRight> <topLeft> <x>0.0</x> <y>3.0</y> <z>3.0</z> </topLeft> <topRight> <x>1.0</x> <y>3.0</y> <z>3.0</z> </topRight> </captureArea> </spatialInformation> <description lang="en">left camera video</description> </mediaCapture> </mediaCaptures> <encodings> <encoding xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="videoEncodingType" encodingID="ENC0"> <encodingName>h263</encodingName> <maxBandwidth>4000000</maxBandwidth> <encodedMedia>video</encodedMedia> <maxWidth>1920</maxWidth> <maxHeight>1088</maxHeight> </encoding> <encoding xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="videoEncodingType" encodingID="ENC1"> <encodingName>h263</encodingName> <maxBandwidth>4000000</maxBandwidth> <encodedMedia>video</encodedMedia> <maxWidth>1920</maxWidth> <maxHeight>1088</maxHeight> </encoding> <encoding xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="videoEncodingType" encodingID="ENC2"> <encodingName>h263</encodingName> <maxBandwidth>4000000</maxBandwidth> <encodedMedia>video</encodedMedia> <maxWidth>1920</maxWidth> <maxHeight>1088</maxHeight> </encoding> <encoding xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="audioEncodingType" encodingID="ENC3"> <encodingName>g711</encodingName> <maxBandwidth>64000</maxBandwidth> <encodedMedia>audio</encodedMedia> </encoding> <encoding xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="audioEncodingType" encodingID="ENC4"> <encodingName>g711</encodingName> <maxBandwidth>64000</maxBandwidth> <encodedMedia>audio</encodedMedia> </encoding> </encodings> <encodingGroups> <encodingGroup encodingGroupID="EG0"> <maxGroupBandwidth>12000000</maxGroupBandwidth> <encodingIDList> <encIDREF>ENC0</encIDREF> <encIDREF>ENC1</encIDREF> <encIDREF>ENC2</encIDREF> </encodingIDList> </encodingGroup> <encodingGroup encodingGroupID="EG1"> <maxGroupBandwidth>12000000</maxGroupBandwidth> <encodingIDList> <encIDREF>ENC3</encIDREF> <encIDREF>ENC4</encIDREF> </encodingIDList> </encodingGroup> </encodingGroups> <captureScenes> <captureScene scale="unknown" sceneID="CS1"> <description lang="en">main scene</description> <sceneSpace> <bottomLeftFront> <x>0.0</x> <y>3.0</y> <z>0.0</z> </bottomLeftFront> <bottomRightFront> <x>3.0</x> <y>3.0</y> <z>0.0</z> </bottomRightFront> <topLeftFront> <x>0.0</x> <y>3.0</y> <z>2.0</z> </topLeftFront> <topRightFront> <x>3.0</x> <y>3.0</y> <z>2.0</z> </topRightFront> <bottomLeftBack> <x>0.0</x> <y>3.0</y> <z>0.0</z> </bottomLeftBack> <bottomRightBack> <x>3.0</x> <y>3.0</y> <z>0.0</z> </bottomRightBack> <topLeftBack> <x>0.0</x> <y>3.0</y> <z>2.0</z> </topLeftBack> <topRightBack> <x>3.0</x> <y>3.0</y> <z>2.0</z> </topRightBack> </sceneSpace> <sceneEntries> <sceneEntry mediaType="video" sceneEntryID="SE1"> <description lang="en"> participants streams</description> 
<mediaCaptureIDs> <captureIDREF>VC0</captureIDREF> <captureIDREF>VC1</captureIDREF> <captureIDREF>VC2</captureIDREF> </mediaCaptureIDs> </sceneEntry> <sceneEntry mediaType="video" sceneEntryID="SE2"> <description lang="en">room stream</description> <mediaCaptureIDs> <captureIDREF>VC3</captureIDREF> </mediaCaptureIDs> </sceneEntry> <sceneEntry mediaType="audio" sceneEntryID="SE3"> <description lang="en">room audio</description> <mediaCaptureIDs> <captureIDREF>AC0</captureIDREF> </mediaCaptureIDs> </sceneEntry> </sceneEntries> </captureScene> <captureScene scale="noscale" sceneID="CS2"> <description lang="en">presentation</description> <sceneEntries> <sceneEntry mediaType="video" sceneEntryID="CS2_SE1"> <description lang="en"> presentation video</description> <mediaCaptureIDs> <captureIDREF>VC4</captureIDREF> </mediaCaptureIDs> </sceneEntry> <sceneEntry mediaType="audio" sceneEntryID="CS2_SE2"> <description lang="en"> presentation audio</description> <mediaCaptureIDs> <captureIDREF>AC1</captureIDREF> </mediaCaptureIDs> </sceneEntry> </sceneEntries> </captureScene> </captureScenes> <simultaneousSets> <simultaneousSet setID="SS1"> <captureIDREF>VC0</captureIDREF> <captureIDREF>VC1</captureIDREF> <captureIDREF>VC2</captureIDREF> <captureIDREF>VC4</captureIDREF> <captureIDREF>AC0</captureIDREF> <captureIDREF>AC1</captureIDREF> </simultaneousSet> <simultaneousSet setID="SS2"> <captureIDREF>VC0</captureIDREF> <captureIDREF>VC3</captureIDREF> <captureIDREF>VC2</captureIDREF> <captureIDREF>VC4</captureIDREF> <captureIDREF>AC0</captureIDREF> <captureIDREF>AC1</captureIDREF> </simultaneousSet> </simultaneousSets> </clueInfo>
Here is the link to the unofficial -02 version: http://www.grid.unina.it/Didattica/RetiDiCalcolatori/inf/draft-presta-clue-data-model-schema-02.html
[I-D.ietf-clue-framework]  Duckworth, M., Pepperell, A., and S. Wenger, "Framework for Telepresence Multi-Streams", Internet-Draft draft-ietf-clue-framework-09, February 2013.
[I-D.romanow-clue-data-model]  Romanow, A. and A. Pepperell, "Data model for the CLUE Framework", Internet-Draft draft-romanow-clue-data-model-01, June 2012.
[I-D.groves-clue-capture-attr]  Groves, C., Yang, W., and R. Even, "CLUE media capture description", Internet-Draft draft-groves-clue-capture-attr-01, February 2013.
[RFC4796]  Hautakorpi, J. and G. Camarillo, "The Session Description Protocol (SDP) Content Attribute", RFC 4796, February 2007.