MMUSIC A. B. Roach
Internet-Draft Mozilla
Intended status: Informational January 31, 2013
Expires: August 4, 2013
Thoughts on syntax for representing multiple media streams
draft-roach-mmusic-mlines-00
Abstract
This document briefly explores the ramifications of combining
multiple media streams into one SDP m= section versus expressing each
in its own m= section.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 4, 2013.
Copyright Notice
Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Roach Expires August 4, 2013 [Page 1]
Internet-Draft Media Stream Syntax January 2013
1. Introduction
As part of the ongoing RTCWEB and CLUE work, it has become clear that
the current mechanisms in SDP are insufficient for describing complex
sessions with multiple streams. Two competing schools of thought
have emerged. One holds that the m= lines should apply to RTP
sessions, regardless of how many media streams they contain. Another
holds that m= lines should apply to media streams exclusively, and
that an additional mechanism should be applied to combine multiple
streams into a single RTP session, if necessary.
2. Alternatives
2.1. Alternative 1: Multiple streams per m= section
One approach to specifying multiple streams in a single RTP session
is to put information for several streams into a single m= section;
and, by doing so, implicitly combine them into a single session.
To maintain some level of backwards compatibility with SDP, this
approach might choose to have one m= section for audio and a second
for video (with additional m= sections for other media types if they
are used in the future), combining those sections with a=group:BUNDLE
[I-D.ietf-mmusic-sdp-bundle-negotiation]; we will call this
"Alternative 1a". An alternate approach would be the definition of a
new media type which effectively allows transmission of any kind of
media, thereby avoiding the need to bundle multiple sections together
at all. A syntax for such an approach is proposed by
[I-D.holmberg-mmusic-sdp-mmt-negotiation]. We will call this
"Alternative 1b".
In both of the cases described above, certain SDP attributes might be
targeted at only one of the streams in an RTP session. These
attributes can be matched up with individual streams using the
"a=ssrc" extension defined in [RFC5576].
For "Alternative 1a", we have the additional challenge of specifying
attributes that apply to the entire RTP session, such as a=rtcp-fb
and ICE candidate parameters. One approach would be inclusion of
such parameters only in the first m= section within a bundle, with
the implication that they apply to the entire session.
2.1.1. Alternative 1a: One section per RTP session per type
v=0
o=- 2890844526 2890844526 IN IP4 host.example.com
s=
c=IN IP4 host.example.com
t=0 0
a=group:BUNDLE c1 c2
m=audio 10000 RTP/AVP 0 8 97
a=mid:c1
a=candidate:0 1 UDP 2113601791 192.0.2.240 51091 typ host
a=candidate:1 1 UDP 1694194431 198.51.100.32 51091 typ srflx raddr
192.0.2.240 rport 51091
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:97 iLBC/8000
a=ssrc:11111 label:speaker-audio
a=ssrc:22222 label:floor-mic
m=video 10000 RTP/AVP 31 32
a=mid:c2
a=rtpmap:31 H261/90000
a=rtpmap:32 MPV/90000
a=ssrc:33333 label:speaker-video
a=ssrc:44444 label:slides
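As an illustration only, the following Python sketch shows how a
receiver might recover the per-stream labels from a description shaped
like the example above. The function names are hypothetical and are
not taken from any specification; the sketch assumes that each a=ssrc
line carries a single "label" attribute, as in the example, and it
uses an abbreviated copy of that SDP.

```python
SDP_1A = """\
m=audio 10000 RTP/AVP 0 8 97
a=mid:c1
a=ssrc:11111 label:speaker-audio
a=ssrc:22222 label:floor-mic
m=video 10000 RTP/AVP 31 32
a=mid:c2
a=ssrc:33333 label:speaker-video
a=ssrc:44444 label:slides
"""


def split_sections(sdp):
    """Return a list of m= sections, each a list of lines led by its m= line."""
    sections = []
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            sections.append([line])
        elif sections:
            sections[-1].append(line)
    return sections


def ssrc_labels(section):
    """Collect {ssrc: label} from lines shaped like a=ssrc:<id> label:<value>."""
    labels = {}
    for line in section:
        if line.startswith("a=ssrc:"):
            ssrc, _, attr = line[len("a=ssrc:"):].partition(" ")
            name, _, value = attr.partition(":")
            if name == "label":
                labels[int(ssrc)] = value
    return labels


def labels_by_mid(sdp):
    """Map each section's a=mid value to the labels of the streams it carries."""
    result = {}
    for section in split_sections(sdp):
        mid = next(
            (l[len("a=mid:"):] for l in section if l.startswith("a=mid:")), None
        )
        result[mid] = ssrc_labels(section)
    return result
```

Calling labels_by_mid(SDP_1A) yields one entry per m= section, each
holding two streams, which is the one-to-many relationship that
distinguishes this alternative.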
2.1.2. Alternative 1b: One section per RTP session
v=0
o=- 2890844526 2890844526 IN IP4 host.example.com
s=
c=IN IP4 host.example.com
t=0 0
a=group:MMT foo bar zoe
m=anymedia 10000 RTP/AVP 0 8 97 31 32
a=candidate:0 1 UDP 2113601791 192.0.2.240 51091 typ host
a=candidate:1 1 UDP 1694194431 198.51.100.32 51091 typ srflx raddr
192.0.2.240 rport 51091
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:97 iLBC/8000
a=rtpmap:31 H261/90000
a=rtpmap:32 MPV/90000
a=mmtype:0 audio
a=mmtype:8 audio
a=mmtype:97 audio
a=mmtype:31 video
a=mmtype:32 video
a=ssrc:11111 label:speaker-audio
a=ssrc:22222 label:floor-mic
a=ssrc:33333 label:speaker-video
a=ssrc:44444 label:slides
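With a single m= section, the media type of each payload type must be
recovered from the a=mmtype lines rather than from the m= line. The
following Python sketch, offered only as an illustration, pairs each
payload type with its media type and encoding. The function name is
hypothetical, and the a=mmtype syntax is assumed to be exactly what
the example above shows ("a=mmtype:<payload type> <media type>").

```python
LINES_1B = [
    "m=anymedia 10000 RTP/AVP 0 8 97 31 32",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:8 PCMA/8000",
    "a=rtpmap:97 iLBC/8000",
    "a=rtpmap:31 H261/90000",
    "a=rtpmap:32 MPV/90000",
    "a=mmtype:0 audio",
    "a=mmtype:8 audio",
    "a=mmtype:97 audio",
    "a=mmtype:31 video",
    "a=mmtype:32 video",
]


def payload_media_types(lines):
    """Return {payload_type: (media_type, encoding)} for one m= section."""
    media, encoding = {}, {}
    for line in lines:
        if line.startswith("a=mmtype:"):
            pt, kind = line[len("a=mmtype:"):].split()
            media[int(pt)] = kind
        elif line.startswith("a=rtpmap:"):
            pt, enc = line[len("a=rtpmap:"):].split(None, 1)
            encoding[int(pt)] = enc
    # Payload types without an a=rtpmap line get None as their encoding.
    return {pt: (media[pt], encoding.get(pt)) for pt in media}
```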
2.2. Alternative 2: Single stream per m= section
An alternate proposal constrains each m= section to describe a
single media stream. Like alternative 1a, above, the BUNDLE
extension is used to combine several m= sections into a single RTP
session. Any attributes that are applicable to a single media stream
can be correlated by putting them in the corresponding m= section.
Any attributes that apply to the transport parameters (e.g., rtcp-fb,
ICE parameters) are conveyed in the first m= section within the
bundle (alternate schemes are possible, but this seems the simplest
and most straightforward).
v=0
o=- 2890844526 2890844526 IN IP4 host.example.com
s=
c=IN IP4 host.example.com
t=0 0
a=group:BUNDLE c1 c2 c3 c4
m=audio 10000 RTP/AVP 0 8 97
a=mid:c1
a=label:speaker-audio
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:97 iLBC/8000
a=candidate:0 1 UDP 2113601791 192.0.2.240 51091 typ host
a=candidate:1 1 UDP 1694194431 198.51.100.32 51091 typ srflx raddr
192.0.2.240 rport 51091
m=audio 10000 RTP/AVP 0 8 97
a=mid:c2
a=label:floor-mic
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:97 iLBC/8000
m=video 10000 RTP/AVP 31 32
a=mid:c3
a=label:speaker-video
a=rtpmap:31 H261/90000
a=rtpmap:32 MPV/90000
m=video 10000 RTP/AVP 31 32
a=mid:c4
a=label:slides
a=rtpmap:31 H261/90000
a=rtpmap:32 MPV/90000
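The convention that transport-level attributes live in the first m=
section of a bundle can be sketched in Python as follows. This is an
illustration under stated assumptions, not a definitive
implementation: the function names are hypothetical, "first" is taken
to mean the first bundled section in document order, only a=candidate
and a=rtcp-fb are treated as transport-level, and the SDP is an
abbreviated version of the example above.

```python
TRANSPORT_PREFIXES = ("a=candidate:", "a=rtcp-fb:")  # assumed set

SDP_2 = """\
v=0
a=group:BUNDLE c1 c2 c3
m=audio 10000 RTP/AVP 0
a=mid:c1
a=label:speaker-audio
a=candidate:0 1 UDP 2113601791 192.0.2.240 51091 typ host
m=audio 10000 RTP/AVP 0
a=mid:c2
a=label:floor-mic
m=video 10000 RTP/AVP 31
a=mid:c3
a=label:slides
"""


def parse(sdp):
    """Split SDP into (session-level lines, list of m= sections)."""
    session, sections = [], []
    for line in sdp.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("m="):
            sections.append([line])
        elif sections:
            sections[-1].append(line)
        else:
            session.append(line)
    return session, sections


def mid_of(section):
    """Return the a=mid value of a section, or None if it has none."""
    for line in section:
        if line.startswith("a=mid:"):
            return line[len("a=mid:"):]
    return None


def bundle_transport(sdp):
    """Map every bundled mid to the transport lines of the first bundled section."""
    session, sections = parse(sdp)
    result = {}
    for line in session:
        if not line.startswith("a=group:BUNDLE "):
            continue
        mids = line.split()[1:]
        members = [s for s in sections if mid_of(s) in mids]
        if not members:
            continue
        shared = [l for l in members[0] if l.startswith(TRANSPORT_PREFIXES)]
        for section in members:
            result[mid_of(section)] = shared
    return result
```

Here every stream in the bundle resolves to the candidate line that
appears only in the section with mid "c1".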
2.3. Pros and Cons
2.3.1. Codec Selection
Currently, in SDP and the various documents that rely on it (such as
[RFC3264]), there are certain assumptions made about the cardinality
of streams to m= sections. Consider, for example, wanting to convey
two audio streams with a low-bandwidth voice codec preferred for one,
but a high-quality codec preferred for the other. RFC 3264 has rules
indicating that codecs are conveyed in the order of their preference.
With alternative 2, it is trivial to provide different ordering (or
even a different set) of codecs to achieve such a goal. Alternatives
1a and 1b lack the ability to do so without additional extensions.
This set of facts supports alternative 2 in preference to
alternatives 1a and 1b.
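To make the point concrete, the Python sketch below builds two audio
m= sections that list the same payload types in different preference
orders, which is only possible when each stream has its own m=
section. The helper name is hypothetical. Treating iLBC as the
low-bandwidth choice and PCMU as a stand-in for the higher-quality
codec is an assumption made purely for illustration, using the codecs
from the earlier examples.

```python
RTPMAP = {0: "PCMU/8000", 8: "PCMA/8000", 97: "iLBC/8000"}


def audio_section(mid, label, preferred, port=10000):
    """Build one m=audio section with payload types in preference order."""
    lines = [
        "m=audio %d RTP/AVP %s" % (port, " ".join(str(pt) for pt in preferred)),
        "a=mid:%s" % mid,
        "a=label:%s" % label,
    ]
    lines += ["a=rtpmap:%d %s" % (pt, RTPMAP[pt]) for pt in preferred]
    return lines


# One stream prefers the low-bandwidth codec, the other does not.
speaker = audio_section("c1", "speaker-audio", [97, 0, 8])
floor_mic = audio_section("c2", "floor-mic", [0, 8, 97])
```

The two resulting m= lines differ only in the order of their payload
types, which is how RFC 3264 conveys preference.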
2.3.2. Port Number Handling
When multiple sections are used to represent a single session, we
need to make a decision regarding the port number conveyed in the m=
line itself. One option is to use the same port number in all
related m= sections. According to Cullen Jennings, this interacts
very poorly with existing implementations that use SDP. The other
alternative is to indicate bogus port numbers in all (or all but one)
of the m= lines. According to Hadriel Kaplan, this usage will lead
to certain media intermediaries destroying the session when they
determine that a signaled port is going unused.
Alternative 1b avoids this problem altogether by having only one m=
per IP/port combination, thereby completely sidestepping the question
of what to put in subsequent m= lines.
This set of facts supports alternative 1b in preference to
alternatives 1a and 2.
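The first option (the same port in every related m= section) can be
spotted mechanically. The Python sketch below, an illustration with a
hypothetical function name, groups a=mid values by the port on their
m= line. It assumes the port field is a plain integer, with no
"/<count>" suffix.

```python
from collections import defaultdict

SDP_PORTS = """\
m=audio 10000 RTP/AVP 0 8 97
a=mid:c1
m=audio 10000 RTP/AVP 0 8 97
a=mid:c2
m=video 10000 RTP/AVP 31 32
a=mid:c3
m=video 10000 RTP/AVP 31 32
a=mid:c4
"""


def mids_by_port(sdp):
    """Group a=mid values by the port signaled on their section's m= line."""
    ports = defaultdict(list)
    port = None
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            port = int(line.split()[1])
        elif line.startswith("a=mid:") and port is not None:
            ports[port].append(line[len("a=mid:"):])
    return dict(ports)
```

Any port that maps to more than one mid is shared between m= sections,
which is the situation reported to interact poorly with existing
implementations.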
2.3.3. Attribute handling
Attributes that appear inside m= sections can be generally broken
down into three categories: those intended to apply to a single media
stream (e.g., framerate); those intended to apply to an RTP session
(e.g., rtcp-fb), and those that are explicitly bound to the m= line
itself (e.g., rtpmap). By and large, these attributes have been
defined with an assumption that each RTP session had one stream and
vice-versa.
By specifying a model that breaks this one-to-one correspondence, we
have created the need to be able to designate a specific media stream
within an RTP session (for alternatives 1a and 1b), or the need to be
able to talk about session-level attributes (for alternatives 1a and
2).
Alternatives 1a and 1b can perform stream-level designation through
the use of the ssrc attribute specified in [RFC5576]. Alternatives
1a and 2 can apply a convention that any RTP-session-level attributes
are placed in the first m= section in a bundle (although other, more
complicated approaches may also be possible).
Note, in particular, that alternative 1a inherits both problems: the
need to designate attributes as applying to a single stream, and the
need to talk about session-level attributes when multiple m= lines
are bundled together.
This set of facts supports alternatives 1b and 2 in preference to
alternative 1a.
2.3.4. What We're Unaware of Not Knowing
It is worth noting that the problem described in Section 2.3.1 was
not discovered for quite a long time after the discussion of multiple
media streams had begun. In the characterization of "known knowns,"
"known unknowns," and "unknown unknowns," this issue remained an
unknown unknown for more than a little time.
Generally, addressing these unknown unknowns is likely to be easiest
if we have the highest granularity of control. Alternative 2, by
breaking each stream apart into its own instance of the control
structure that has historically been used to work with media (the m=
section), provides this high granularity where alternatives 1a and 1b
do not.
It is the author's opinion that the probable existence of such
unknown unknowns favors alternative 2 over 1a or 1b.
2.4. Red Herrings
During the course of discussing this topic, several points have been
raised that, while relevant, do not bias the selection of one
solution over another.
One issue that has been brought up is that SDP offer/answer requires
signaling of the number of m= sections in the offer, to allow clear
semantics for negotiation. Some proponents of solutions 1a and 1b
have indicated a belief that allowing multiple streams per m= section
avoids this restriction. This assertion has a number of problems.
First, it assumes that implementations can perform reasonable
operations on dynamically created media streams that begin and end
without any signaling. It further assumes that the problems for
which the offer/answer model imposed the m-line restrictions are no
longer applicable (at least, not on a stream level). Finally, this
assertion assumes that no control surfaces are necessary to talk
about and/or manipulate the individual streams (alternately, if such
control surfaces are introduced, then additional SDP round-trips to
exchange information about those controls is necessary, making them
semantically equivalent to a new offer/answer exchange -- which
eliminates any purported advantage).
It has also been observed that, in addition to being sometimes
applicable to streams and sometimes applicable to sessions, attributes
are also sometimes unidirectional, and sometimes bidirectional.
While an astute observation, this does not appear to have any bearing
on the ultimate solution selected, as all three alternatives face
exactly the same challenges in dealing with issues of directionality.
Finally, it should be noted that any decision to include multiple
streams within a single m= section does little to simplify
implementation. Even if native RTCWEB implementations generate the
fewest m= sections necessary to convey their desired session state,
the selection of alternative 1a or 1b does not obviate the
requirement that implementations must be able to receive SDP with
several m=audio sections (for example). Interoperation with legacy
implementations, even through a gateway, will require that proper
handling of such session descriptions is present in every RTCWEB
implementation.
2.5. Summary
The following table summarizes the pros and cons conveyed in the
preceding sections on a per-solution basis.
+---------------+----+----+---+
| Issue | 1a | 1b | 2 |
+---------------+----+----+---+
| Section 2.3.1 | - | - | + |
| Section 2.3.2 | - | + | - |
| Section 2.3.3 | - | + | + |
| Section 2.3.4 | - | - | + |
+---------------+----+----+---+
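For readers who prefer a tally, the short Python sketch below counts
the "+" marks per alternative. It only restates the table above; the
names are hypothetical, and an unweighted count is an assumption of
this sketch rather than a method the table itself prescribes.

```python
SUMMARY = {
    "2.3.1": {"1a": "-", "1b": "-", "2": "+"},
    "2.3.2": {"1a": "-", "1b": "+", "2": "-"},
    "2.3.3": {"1a": "-", "1b": "+", "2": "+"},
    "2.3.4": {"1a": "-", "1b": "-", "2": "+"},
}


def count_pluses(summary):
    """Count, for each alternative, the sections in which it is marked '+'."""
    totals = {}
    for marks in summary.values():
        for alternative, mark in marks.items():
            totals[alternative] = totals.get(alternative, 0) + (mark == "+")
    return totals
```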
Based on these criteria, it is the author's belief that Alternative 2
provides the most benefit, with Alternative 1b a close second.
Alternative 1a has the remarkable property of combining all of the
drawbacks of solutions 1b and 2, forming a kind of "sweet-spot" of
ill-advisement, and thereby maximizing the amount of work required of
the MMUSIC, RTCWEB, and CLUE working groups.
3. IANA Considerations
This document makes no requests of IANA.
4. Security Considerations
The author does not believe that the syntax under discussion has an
impact on the security properties of those protocols that make use of
SDP.
5. Normative References
[I-D.holmberg-mmusic-sdp-mmt-negotiation]
Holmberg, C., Alvestrand, H., and J. Lennox, "Multiplexed
Media Types (MMT) Using Session Description Protocol (SDP)
Port Numbers",
draft-holmberg-mmusic-sdp-mmt-negotiation-00 (work in
progress), October 2012.
[I-D.ietf-mmusic-sdp-bundle-negotiation]
Holmberg, C. and H. Alvestrand, "Multiplexing Negotiation
Using Session Description Protocol (SDP) Port Numbers",
draft-ietf-mmusic-sdp-bundle-negotiation-01 (work in
progress), August 2012.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
with Session Description Protocol (SDP)", RFC 3264,
June 2002.
[RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific
Media Attributes in the Session Description Protocol
(SDP)", RFC 5576, June 2009.
Author's Address
Adam Roach
Mozilla
Dallas, TX
US
Email: adam@nostrum.com