CLUE A. Pepperell
Internet-Draft Silverflare
Intended status: Standards Track A. Romanow
Expires: December 2, 2012 R. Hansen
B. Baldino
Cisco Systems
May 31, 2012
Use of switched capture attribute & spatial co-ordinates in advanced
cases
draft-pepperell-clue-switched-attribute-00
Abstract
This draft examines the issues with advertising "switched" captures
in CLUE, and proposes ways to resolve them.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on December 2, 2012.
Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
Pepperell, et al. Expires December 2, 2012 [Page 1]
Internet-Draft Switched attribute & spatial co-ordinates May 2012
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Terminology
3. The need for switched captures in CLUE
4. Issues with switched captures
5. Proposed approach
6. A less minimalist solution
7. Security Considerations
8. References
8.1. Normative References
8.2. Informative References
Authors' Addresses
1. Introduction
This draft states some of the issues involved in using switched
captures in CLUE, and explores the need for a "switched" attribute
and what that attribute means in different contexts.
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119] and
indicate requirement levels for compliant implementations.
3. The need for switched captures in CLUE
The media capture "switched" attribute refers to captures whose
content can change between different provider-chosen possibilities.
A typical case might be a 3 camera system choosing to offer a capture
scene entry comprising a single switched video capture which at any
given time would show one of the 3 camera feeds available (perhaps
based on audio activity within its local scene, or room). The
presence of the "switched" attribute would distinguish such a media
capture from another that, say, provided a fixed, zoomed-out view of
all the available seats, even if both captures used an identical
point of origin and capture co-ordinates.
In common with other capture types, a consumer would only need to set
up decoder state for a switched capture once (when first selecting to
be sent an instantiation of that capture) and not need to modify such
state in response to the provider choosing to change the source of
the switched capture. Note that "switched" here carries no
implications in terms of whether the audio or video in question has
been transcoded or forwarded unmodified.
For an MCU or endpoint providing any number of video captures
with adjacency characteristics (for instance, camera feeds intended
to be shown "in a line" or a transcoded conference view created for
display across multiple monitors), the capture co-ordinates supplied
by the provider give the consumer sufficient information to
render those captures correctly. Specifically, the consumer knows
not only that a group of video captures forms a complete
representation of the capture scene (because together those captures
form a capture scene entry) but also how the captures in that group
should be displayed relative to each other in order to preserve the
integrity of the rendered scene.
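As a minimal sketch of how a consumer might use those co-ordinates,
the Python below orders the captures of one capture scene entry from
left to right. The Capture class and its single x_left field are
assumptions made for illustration only; the CLUE framework defines
richer spatial attributes than this.

```python
from dataclasses import dataclass


@dataclass
class Capture:
    # Hypothetical, simplified model: only the left-edge X co-ordinate
    # of the capture area is kept.
    capture_id: str
    x_left: float


def render_order(entry):
    """Return capture ids ordered left to right by X co-ordinate."""
    return [c.capture_id for c in sorted(entry, key=lambda c: c.x_left)]


# Captures may be advertised in any order; the co-ordinates fix the layout.
entry = [Capture("VC2", 1.0), Capture("VC0", -3.0), Capture("VC1", -1.0)]
print(render_order(entry))
```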
4. Issues with switched captures
CLUE is, however, intended to cover more advanced switching cases,
cases typically (though not necessarily exclusively) involving an MCU
device. For instance, an MCU may choose to forward a selection of
significant participants' audio and video captures to devices
participating in that conference in order for those devices to form
their own appropriate multi-pane layouts. This might be a required
feature of the system (if the MCU in question had no transcoding or
composition capabilities) or simply a desired one (perhaps in order
to reduce latency and media degradation caused by potentially
multiple stages of transcoding). These cases result in MCU middle
box devices wishing, as part of their CLUE provider roles, to
advertise to participating consumer devices the availability of
potentially many switched captures. For example, an MCU might
advertise the availability of up to 20 such switched streams;
possible consumer behaviors in such a case would include:
o a 4 screen endpoint choosing to receive the "top 16" switched
video streams to display a 2 x 2 grid on each of its screens
o a 3 screen endpoint choosing to receive the "top 12" switched
video streams to display the most significant 3 at full screen
size and the next 9 as 3 small PiPs on each screen
o a 1 screen endpoint choosing to receive the "top 10" switched
video streams and forming a "1 big + 9 small" display
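The arithmetic behind these three behaviors can be sketched as
follows. The function name and the idea of describing a layout as
"screens times panes per screen" are assumptions made for
illustration, not part of CLUE.

```python
def streams_needed(screens, panes_per_screen, advertised):
    """Number of top-ranked switched streams a consumer would request.

    A consumer cannot request more streams than the provider advertises.
    """
    return min(screens * panes_per_screen, advertised)


ADVERTISED = 20  # the MCU in the example advertises up to 20 streams

layouts = {
    "4 screens, 2 x 2 grid on each": (4, 4),
    "3 screens, 1 full size + 3 PiPs on each": (3, 4),
    "1 screen, 1 big + 9 small": (1, 10),
}
for name, (screens, panes) in layouts.items():
    print(name, "->", streams_needed(screens, panes, ADVERTISED))
```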
To support such cases, several factors need to be considered in
addition to what has been previously discussed:
o knowledge that the 20 switched streams advertised do not all need
to be sent to the consumer for it to be able to represent the
complete scene to the user (this is not the case for a normal
multi-camera endpoint scenario, for instance, where typically a
consumer would need all captures in a capture scene entry in order
to be able to render that scene)
o ensuring that the spatial characteristics of the systems
contributing to the ordered set are respected when sending the
requested instantiated captures to consumers (for instance, in the
3 screen "top 12" example above, the provider should be able to
take into account the undesirability of splitting the 4
constituent captures of a 4 camera system that was the active
speaker across 3 full screen panes and a single small PiP)
o ensuring that sufficient stream synchronization information is
available at the consumer in order for it to be able to perform
correct lip sync on the dynamically changing set of received audio
and video streams
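The second point can be illustrated with a small sketch. It assumes
a provider that naively fills the consumer's panes strictly in rank
order; the slot labels "full" and "pip" and the (system, capture
index) tuples are hypothetical names chosen for this example.

```python
def naive_fill(ranked, slots):
    """Pair ranked captures with render slots strictly in rank order."""
    return list(zip(ranked, slots))


def split_systems(assignment):
    """Return source systems whose captures span more than one slot kind."""
    kinds = {}
    for (system, _index), kind in assignment:
        kinds.setdefault(system, set()).add(kind)
    return sorted(s for s, k in kinds.items() if len(k) > 1)


# The 3 screen "top 12" layout: 3 full screen panes, then 9 small PiPs.
slots = ["full"] * 3 + ["pip"] * 9
# System A (4 cameras) is the active speaker, followed by B and C.
ranked = [("A", i) for i in range(4)] + [("B", 0), ("C", 0)]
print(split_systems(naive_fill(ranked, slots)))
```

Here system A is reported as split, because its fourth capture falls
into a PiP while the other three occupy the full screen panes.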
5. Proposed approach
The minimalist solution proposed here addresses the above points as
follows:
o Use of the existing "switched" media capture attribute to cover
two subtly-different cases:
* endpoints providing a subset of their available camera feeds /
microphones as one or more switched captures
* an MCU providing a subset of all current participants as a set
to be laid out by a consumer device in a layout of the
consumer's choosing
o Indicating to the consumer that a valid representation of the
scene can be constructed with a subset of the captures that form a
capture scene entry would be accomplished by ensuring that the
captures in that capture scene entry do not have any associated
co-ordinate or point of origin attributes. For instance, if an MCU
were able to send on 100 such streams but a receiving consumer
device could only form a 2 x 2 layout of the 4 most significant,
the consumer would need to be able to determine that the 100-capture
capture scene entry was still of use to it, rather than it being,
say, a strip of 100 video thumbnails that was only a valid
representation of the scene when displayed in a certain order. In
many senses it would not be possible for the provider to supply any
capture co-ordinates in this case because no fixed, pre-determined
set of co-ordinates would be valid.
o In order for the provider device to be able to make correct
choices about which of the ordered set of participants' captures
to send to the consumer device, there is a requirement for some
information about the consumer-side render groupings to be
conveyed from the consumer to the provider. The proposal here is
to be consistent with the provider-side X / Y / Z capture co-
ordinates and for the consumer to be able to signal, when making
its stream choice from the provider, the render co-ordinates of
each instantiated video capture. For example, if the active
speaker was a 3 camera system, all 3 corresponding video captures
might be sent to a consumer that had signalled that the first 3 or
more video captures would be rendered adjacently. A consumer
device with a different rendered layout might only be sent the
"second loudest" participant's video (if, for instance, the
corresponding source system was supplying just a single camera-
sourced video capture).
o In order for the consumer to ascertain the dynamic mapping between
audio and video captures, the proposal is for the RTCP SDES CNAME
item to be the preferred mechanism, and for consumer devices to
monitor which streams carry the same CNAME, and hence share a
clock source, and so can be usefully synchronized.
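The third point above can be sketched as a provider-side selection
routine. This is an illustration under stated assumptions rather than
a proposed algorithm: the consumer's render co-ordinates are reduced
to a list of adjacent pane group sizes, and the function and
parameter names are hypothetical.

```python
def choose_captures(ranked_systems, group_sizes):
    """Fill each render group with the highest-ranked unused system that fits.

    ranked_systems: list of (system_name, camera_count), loudest first.
    group_sizes: sizes of the consumer's groups of adjacent panes, in
                 priority order (a hypothetical simplification of the
                 per-capture render co-ordinates).
    """
    remaining = list(ranked_systems)
    chosen = []
    for size in group_sizes:
        for system in remaining:
            name, cameras = system
            if cameras <= size:
                # Send all of this system's captures together, unsplit.
                chosen.append([f"{name}/{i}" for i in range(cameras)])
                remaining.remove(system)
                break
        else:
            chosen.append([])  # no remaining system fits this group
    return chosen


ranked = [("A", 3), ("B", 1), ("C", 1)]  # A is a 3 camera active speaker
print(choose_captures(ranked, [3, 1]))   # consumer renders 3 panes adjacently
print(choose_captures(ranked, [1, 1]))   # consumer has only isolated panes
```

With adjacent panes signalled, all 3 captures of system A are sent
together; without them, the consumer is sent the single-camera
systems B and C instead, matching the example in the text.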
6. A less minimalist solution
An alternative to the above minimalist solution would be to remove
some of the implicitly signalled elements; specifically:
o a new attribute could be defined at the capture scene entry level
explicitly signalling that a subset of the constituent captures
can be used to produce a valid representation of that scene (this
removes the significance of, and thus the need to observe, the
absence of provider-side capture co-ordinates)
o rather than reusing the "switched" capture attribute for both a
single system switching between its available captures that cover
a single scene and an MCU-style device providing a set of "active
speaker" captures, introduce a new attribute for captures that
represent a provider choice of captures potentially cut down from
a larger list (e.g. the superset of all captures from all
conference participants) ordered by some provider-specific method
(e.g. loudest participants first)
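The difference between the implicit rule of Section 5 and the
explicit attribute suggested in the first point above can be
sketched from the consumer's side. The dictionary keys used here
("subset_valid", "coordinates", "point_of_origin") are hypothetical
names, not attributes defined by CLUE.

```python
def subset_is_valid(entry):
    """Decide whether a subset of an entry's captures can represent the scene."""
    # Less minimalist approach: an explicit entry-level attribute wins.
    explicit = entry.get("subset_valid")
    if explicit is not None:
        return explicit
    # Minimalist approach: infer it from the absence of any co-ordinate
    # or point of origin attributes on the constituent captures.
    return not any(
        "coordinates" in c or "point_of_origin" in c
        for c in entry["captures"]
    )


mcu_entry = {"captures": [{"id": f"VC{i}"} for i in range(100)]}
strip_entry = {"captures": [{"id": "VC0", "coordinates": (0, 0, 0)},
                            {"id": "VC1", "coordinates": (1, 0, 0)}]}
explicit_entry = {"subset_valid": True, "captures": strip_entry["captures"]}

for entry in (mcu_entry, strip_entry, explicit_entry):
    print(subset_is_valid(entry))
```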
7. Security Considerations
This draft involves only the internal nomenclature of the CLUE
framework and data model, and hence has no security considerations.
8. References
8.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
8.2. Informative References
[I-D.ietf-clue-framework]
Romanow, A., Duckworth, M., Pepperell, A., and B. Baldino,
"Framework for Telepresence Multi-Streams",
draft-ietf-clue-framework-05 (work in progress), May 2012.
Authors' Addresses
Andy Pepperell
Silverflare
Email: andy.pepperell@silverflare.com
Allyn Romanow
Cisco Systems
San Jose, CA 95134
USA
Email: allyn@cisco.com
Robert Hansen
Cisco Systems
San Jose, CA 95134
USA
Email: rohanse2@cisco.com
Brian Baldino
Cisco Systems
San Jose, CA 95134
USA
Email: bbaldino@cisco.com