Internet DRAFT - draft-ietf-avt-reconsider
draft-ietf-avt-reconsider
Internet Engineering Task Force AVT WG
Internet Draft J.Rosenberg,H.Schulzrinne
draft-ietf-avt-reconsider-00.txt Bell Laboratories/Columbia U.
July 1997
Expires: January 1998
Timer Reconsideration for Enhanced RTP Scalability
STATUS OF THIS MEMO
This document is an Internet-Draft. Internet-Drafts are working docu-
ments of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute work-
ing documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference mate-
rial or to cite them other than as ``work in progress''.
To learn the current status of any Internet-Draft, please check the
``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).
Distribution of this document is unlimited.
1 Abstract
RTP, the Real Time Transport Protocol, has gained widespread accep-
tance as the transport protocol for voice and video on the Internet.
It provides services such as timestamping, sequence numbering, and
payload identification. It also contains a control component, the
Real Time Control Protocol (RTCP), which is used for loose session
control, QoS reporting, and media synchronization, among other func-
tions. The RTP specification describes an algorithm for determining
the RTCP packet transmission rate at a host participating in a multi-
cast RTP session. This algorithm was designed to allow RTP to be used
in sessions with anywhere from one to a million members. However, we
have discovered several problems with this algorithm when used with
very large groups with rapidly changing group membership. One problem
is the flood of RTCP packets which occurs when many users join a mul-
ticast RTP session at nearly the same time. To solve this problem, we
present a novel adaptive timer algorithm called reconsideration. We
present a mathematical analysis of this algorithm, and demonstrate
that it performs extremely well, reducing the congestion problem by
several orders of magnitude. We also back up these results with simu-
lation.
J.Rosenberg,H.Schulzrinne [Page 1]
2 Introduction
Internet Draft Reconsideration July 1997
There has recently been a flood of interest in the delivery of multi-
media services on the Internet. The growing popularity of Internet
telephony, streaming audio and video services (such as those provided
by Real Audio) and the Mbone are all indicators of this trend. To
support these applications, standards are being developed to insure
interoperability. The ITU-T H.323 [1] specification for Internet
telephony is gaining widespread acceptance among software vendors.
The IETF is developing protocols such as SIP [2] for multimedia ses-
sion initiation, and RTSP [3] for controlling multimedia servers on
the Internet.
Interwoven with all of the above protocols is the Real Time Transport
Protocol (RTP) [4]. It is used by H.323 terminals as the transport
pprotocol for multimedia; both SIP and RTSP were designed to control
multimedia sessions delivered over RTP. Its main function is to carry
real time services, such as voice and video, over an IP network. It
provides payload type identification so that the receiver can deter-
mine the media type contained in the packet. Sequence numbers and
timestamps are also provided, so that packets can be reordered,
losses can be detected, and data can be played out at the right
speeds. RTP was designed to be easily used in multicast conferences.
To this end, it guarantees that each participant in a session has a
unique identifier called the synchronization source (SSRC). This
identifier is carried in each packet, providing applications a way to
de-multiplex packets from different users.
RTP also contains a control component, called the Real Time Control
Protocol (RTCP). It is multicast to the same multicast group as RTP,
but on a different port number. Both data senders and receivers peri-
odically multicast RTCP messages. RTCP packets provide many services.
First, they are used to identify the users in a session. One RTCP
packet type, the Source Descriptor (SDES) contains the name, email
address, telephone number, fax, and location of the participant.
Another, the receiver report, contains reception quality reporting.
This information can be used by senders to adapt their transmission
rates or encodings dynamically during a session [5]. It can also be
used by network administrators to monitor network quality [6]. It
could potentially be used by receivers to decide which multicast
groups to join in a layered multimedia session; such an application
is similar to RLM [7]. Yet another RTCP packet type, the sender
report (SR), is used to aid receivers in inter-media synchronization
(lip sync), and to indicate transmitted bit rates, among other func-
tions.
Since RTP was designed for multicast, it was engineered to work well
with both large and small sessions. A typical small session might be
a teleconference among five business executives, while a typical
large session might be an Mbone broadcast of a shuttle launch, where
J.Rosenberg,H.Schulzrinne [Page 2]
Internet Draft Reconsideration July 1997
group sizes of two hundred listeners have been reported [8]. As the
demand for multimedia continues to grow, larger and larger group
sizes will become commonplace. It is not difficult to envision Mbone
concert broadcasts with thousands of members. It has even been sug-
gested that RTP might be the transport protocol of choice for multi-
cast distribution of multimedia in future cable networks, where tens
of thousands of users might be the norm.
The principle difficulty in achieving scalability to large group
sizes is the rate of RTCP packet transmissions from a host. If each
host sends packets at some fixed interval, the total packet rate sent
to the multicast group increases linearly with the group size, N.
This traffic would quickly congest the network, and be particularly
problematic for hosts connected through low speed dialup modems. To
counter this, the RTP specification requires that end systems utiliz-
ing RTP listen to the multicast group, and count the number of dis-
tinct RTP end systems which have sent an RTCP packet. This results in
a group size estimate, L computed locally at each host. The interval
between packet transmissions is then set to scale linearly with L.
This has the effect of keeping the total traffic in the multicast
group constant, independent of the number of group members.
The above scaling mechanism works well for small to medium sized
groups (up to perhaps a few hundred members). However, it suffers
from problems when applied to larger groups, particularly ones whose
group membership is dynamic. We have identified three inter-related
problems which arise with large, dynamic multicast groups:
oCongestion: In many cases, the access bandwidths for users will be
small compared to network bandwidths (28.8 kb/s modems, for exam-
ple, can now handle multimedia RTP sessions when RTP header com-
pression [9] is used). We also anticipate that many multicast RTP
sessions will exhibit rapid increases in group membership at cer-
tain points in time. This can happen for a number of reasons. Many
sessions have precise start times. Multimedia tools such as vat
and vic can be programmed to join a session at the instant of its
inception. Even without automation, users are likely to fire up
their applications around the time the session is scheduled to
begin. Such phenomena are common in current cable networks, where
people change channels when shows begin and end. Studies have been
performed to look at the group membership over time of some of the
popular sessions on the Mbone [10][8]. These studies show exactly
this kind of step-join behavior. The result of these step joins
are inaccuracies in the group size estimates obtained by listening
to the group. Each newly joined member believes that they are the
only member, at least initially, and begins to send packets at a
relatively fast rate. Combined with slow access links, the result
is a flood of RTCP reports, causing access link congestion and
J.Rosenberg,H.Schulzrinne [Page 3]
Internet Draft Reconsideration July 1997
loss.
For example, consider an RTP session where the total RTCP rate is
to be limited to 1 kb/s. If all RTCP packets are 1 kbit, packets
should be sent at a total rate of one per second. Under steady
state conditions, if there are 100 group members, each member will
send a packet once every 100 seconds, and everything works. How-
ever, if 100 group members all join the session at about the same
time, each thinks they are initially the only group member. Each
therefore sends packets at a rate of 1 per second, yielding an
aggregate rate of 100 packets per second, or 100 kb/s, into the
group.
oState Storage: In order to estimate the group size, hosts must
listen to the multicast group and count the number of distinct end
systems which send an RTCP packet. To make sure an end system is
counted only once, its unique identifier (SSRC) must be stored.
Clearly, this does not scale well to extremely large groups, which
would require megabytes of memory just to track users. Alternate
solutions must be found, particularly for set top boxes, where
memory is limited.
oDelay: As the group sizes grow, the time between RTCP reports from
any one particular user becomes very large (in the example above,
with 3000 group members, each would get to send an RTCP packet
about once an hour). This interval may easily exceed the duration
of group membership. This means that timely reporting of QoS prob-
lems from a specific user will not occur, and the value of the
actual reports is lost.
In this paper, we consider only the first problem, that of congestion.
It is our aim to solve this problem with a single mechanism, applicable
to large groups and small alike. It is possible to develop solutions
which work well for specific applications. For example, RTCP reporting
can be disabled completely [11]. This, in fact, solves all three of the
above problems. However, many applications require RTCP, and therefore
this approach is not a general one. Another alternative is to use hier-
archical summarizers. This helps improve convergence time, and may
relieve congestion, but it requires special servers to be deployed. It
is also not appropriate for small groups, and therefore does not work
well as a universal solution to the congestion problem.
There has been a small amount of prior work on resolving difficulties
with timers in Internet protocols. Most prominent among this work is
[12] and [13]. Sharma et. al. consider how to scale soft state timers to
varying link capacities and state quantities. Their work considers only
the point to point case. In that scenario, any sharp increases in the
amount of state to send (which is equivalent to the sharp increases of
J.Rosenberg,H.Schulzrinne [Page 4]
Internet Draft Reconsideration July 1997
group membership we consider here) are known instantly by the sender,
since all of the state resides there. The congestion problem which we
treat here arises due to the distributed nature of the system and the
multicast interconnect. In that scenario, a rapid change in group mem-
bership is represented by a change in group state distributed across
many nodes. As such, our work can be viewed as a generalization of
their's to distributed multicast groups.
In fact, our algorithm for controlling the congestion problem in RTP is
applicable to other protocols and systems. An extension to the Service
Location Protocol [14] has been proposed [15] which uses the reconsider-
ation algorithm to control congestion in the multicast group used to
disseminate information on network services. The algorithm is generally
applicable to distributed systems where (1) control of bandwidth is
desirable, (2) the bandwidth is used to transmit state, (3) the state is
used to determine end system transmission rates, and (4) the state is
dynamic. These constraints apply to BGP [16], for example, when a route
server is used and update rates are to be controlled.
The remainder of the paper is organized as follows. In Section 3, we
detail the current RTCP packet transmission algorithm. Section 4
describes the desired ideal behavior. Section 5 describes our solution,
an algorithm called timer reconsideration, and shows its performance.
Section 6 then analyzes the algorithm to provide more insight into its
performance. Section 7 discusses the algorithms performance under steady
state conditions, and Section 8 summarizes our work.
3 Current RTCP Algorithm
Each user i in a multicast group using RTP maintains a single state
variable, the learning curve, which we denote by L(t). This variable
represents the number of other users that have been heard from at
time t. The state is initialized to L(0)=1 when the user joins the
group.
Each user multicasts RTCP reports periodically to the group. In order
to avoid network congestion, the total rate of RTCP reports multicast
to the group, summed across all users, is set at 5% of the total mul-
ticast session bandwidth (it is assumed in RTP that this quantity is
known apriori). We define C as the average interval between arrivals
of RTCP packets, across all users, into the group, so that C is the
average RTCP packet size divided by 5% of the session bandwidth. To
meet this criteria, each user computes a deterministic interval,
which represents the nominal interval between their own packet trans-
missions required to meet the 5% constraint. This interval is given
by :
__________________________
In actuality, the RTP specification allocates 75% of
J.Rosenberg,H.Schulzrinne [Page 5]
Internet Draft Reconsideration July 1997
T_d= (T_ min, C L(t)),
whereTmmin
is2.5sfortheinitialpacketfromtheuser,and5s
for all other packets. To avoid synchronization, the actual interval
is then computed as a random number uniformly distributed between 0.5
and 1.5 times Td.
The algorithm for sending RTCP packets follows directly. Assume a
user joins at time t=0. The first packet from that user is scheduled
at a time uniformly distributed between 1/2 and 3/2 of Tmin (which is
2.5 s for the first packet), putting the first packet transmission
time between 1.25 and 3.75 seconds. We denote this time as t0. All
subsequent packets are sent at a time tn equal to:
t_n = t_n-1 + R(alpha) (5,CL(t_n-1)),
where we have defined R(alpha) as a random variable uniformly dis-
tributed between (1-alpha) and (1+alpha). (A equals 1/2 in the cur-
rent algorithm; we generalize because A has a strong impact on tran-
sient behavior). A pseudo-code algorithm describing the behavior upon
expiration of the interval timer is given in Figure 1.
3.5in
new_interval = C * current_group_size_estimate;
new_interval = max(new_interval, Tmin);
new_interval = new_interval * random_factor;
send_packet();
schedule_timer(current_time + new_interval);
Figure 1: Current RTCP Algorithm
The difficuly arises when a large number (say, N) of users all join
the group at the same time. We call this a step-join. Since all users
start out with L(t)=1, all schedule their first packet to be sent
between t=1.25 and t=3.75, a fixed, 2.5 second interval. The result
is a flood of N packets for 2.5 s, many of which are lost if the
access bandwidth is low. Since group size estimates are based on the
reception of these packets, losing them will continue to cause each
user to have a low estimate of the actual group size. This will cause
__________________________
the RTCP bandwidth to data senders, and the remaining
25% to listeners. In the remainder of the paper, we
assume that everyone is a listener. This simplifies the
analysis and simulations, all of which can be trivially
extended to include the case where there are senders.
J.Rosenberg,H.Schulzrinne [Page 6]
Internet Draft Reconsideration July 1997
continued congestion until enough packets get through to make the
group size estimates correct.
4 Ideal Behavior
The flood of packets caused by the current RTCP algorithm with a step
join has both good and bad consequences. Clearly, the congestion
which results is not desirable. However, the flood allows the end
systems to very rapidly learn about the group sizes and group member-
ship, which is desireable. There is a fundamental and unavoidable
tradeoff between the convergence time (i.e., the time until the
observed group size L(t) equals the actual group size) and the band-
width used to achieve convergence. What, then, represents the behav-
ior which is desirable?
Our approach is to define the ideal behavior as the one where feed-
back into the group never exceeds its specified threshold (5% for
RTCP). This implies that convergence times will grow as the group
sizes grow. However, it is the most social solution, in the sense
that it will never congest the network, no matter how large the group
sizes become. If we define the ideal behavior as convergence within
any amount of time that grows less than linearly with the group size,
the result is a protocol that does not scale and can eventually
result in congestion.
We also consider congestion avoidance to be more important because we
expect many users to be connected via low speed dialup lines. In that
case, bandwidth is at a premium, and it is in the self-interest of
users to make the best use of it. Most users probably consider RTCP
feedback much less important than the video or audio data itself, and
therefore it is important to keep the feedback below the required 5%.
We now state the desired ideal behavior:
1. The learning curve L(t) grows linearly at a rate of C users
per second, until it reaches the group size N, at which point
it becomes flat, and remains at N.
2. The bandwidth used by all feedback is always equal to C pack-
ets per second during the convergence period.
5 Reconsideration
Our contribution is a new solution which we call reconsideration. The
effect of the algorithm is to reduce the initial flood of packets
which occur when a number of users simultaneously join the group.
This algorithm operates in two modes, conditional and unconditional.
We first discuss conditional reconsideration.
J.Rosenberg,H.Schulzrinne [Page 7]
Internet Draft Reconsideration July 1997
At time tn, as defined above, instead of sending the packet, the user
checks if the group size estimate L(t) has changed since tn-1. If it
has, the user reconsiders. This means that the user recomputes the
RTCP interval (including the randomization factor) based on the cur-
rent state (call this new interval Trime),
and adds it to tn-1. If the result is a time before the current time
tn, the packet is sent, else it is rescheduled for tn-1+Trime.
In other words, the state at time tn gives us potentially new infor-
mation about the group size, compared to the state at time tn-1.
Therefore, we redo the interval computation that was done previously
at time tn-1, but using the new state. If the resulting interval
would have caused the packet to be scheduled before the current time,
we know that our interval estimate was not too low. If, however, the
recomputation pushes the timer off into the future, we know that our
initial timer estimate was computed incorrectly, and we delay trans-
mission based on our new timer. A pseudo-code specification of the
algorithm is given in Figure 2.
4.5in
new_interval = C * current_group_size_estimate;
new_interval = max(new_interval, Tmin);
new_interval = new_interval * random_factor;
if ((last_transmission + new_interval < current_time)
(current_group_size_estimate vious_group_size_estimate))
send_packet();
schedule_timer(current_time + new_interval);
last_transmission = current_time;
previous_group_size_estimate = current_group_size_estimate;
else
schedule_timer(last_transmission + new_interval);
previous_group_size_estimate = current_group_size_estimate;
Figure 2: Conditional Reconsideration
Intuitively, this mechanism should help alleviate congestion by
restricting the transmission of packets during the convergence peri-
ods, where the perceived group sizes L(t) are rapidly increasing.
In unconditional reconsideration, the user reconsiders independently
of whether the number of perceived users has changed since the last
report time. Thus, the RTCP interval is always recomputed, added to
the last transmission time tn-1, and the packet is only sent if the
resulting time is before the current time. Clearly, when the group
sizes are increasing, this algorithm behaves identically to condi-
tional reconsideration. However, its behavior differs in two
respects. First, consider the case where we have converged, and group
J.Rosenberg,H.Schulzrinne [Page 8]
Internet Draft Reconsideration July 1997
sizes are no longer changing. In conditional reconsideration, no
timer recomputation is done. But for unconditional, it is redone.
Since group sizes have not changed, the deterministic part of the
interval remains the same. However, the random factor is redrawn each
time. This means that packets will be transmitted when the recomputed
random factor is smaller than the previous factor, and packets will
be delayed when the recomputed random factor is greater than the pre-
vious one. Note that since the random factor is of finite extent
(between 1/2 and 3/2), packets are guaranteed to eventually be sent.
However, the result is an average increase in the interval between
RTCP packets.
4in
new_interval = C * current_group_size_estimate;
new_interval = max(new_interval, Tmin);
new_interval = new_interval * random_factor;
if (last_transmission + new_interval < current_time)
send_packet;
schedule_timer(current_time + new_interval);
last_transmission = current_time;
else
schedule_timer(last_transmission + new_interval);
Figure 3: Unconditional Reconsideration
The behavior of unconditional reconsideration differs during the ini-
tial transient as well. Consider N users who simultaneously join the
group at time 0. They all schedule their first RTCP packets to be
sent between t=1.25 and t=3.75. The users whose packets were sched-
uled earliest (at a time a little bit after t=1.25) will not recon-
sider with conditional reconsideration, and will always send their
packets. This is because no one else has sent any packets yet, and
thus they have not perceived the group size to have changed. In fact,
because of network delays, many users may send packets without recon-
sidering. Once the first transmitted packet has reached the end sys-
tems, conditional reconsideration kicks in, since users will perceive
a change in group size only then. With unconditional reconsideration,
those first few users do not wait for the first packet to arrive
before using the reconsideration algorithm. They will all recompute
the timer. Obviously, the group size estimate hasn't changed, but the
random variable will be redrawn. For the first few users, the random
factor was initially extremely small (that's why they are the first
few users to send). In all likelihood, when the factor is redrawn, it
will be larger than the initial factor, and thus the resulting inter-
val will be larger. This will delay transmission of RTCP packets for
those users. As time goes on, it becomes less likely than the new
J.Rosenberg,H.Schulzrinne [Page 9]
Internet Draft Reconsideration July 1997
random factor will be greater than the initial one. However, by then,
any RTCP packets which may have been sent will begin to arrive,
increasing the group size estimates for each user. In this fashion,
unconditional reconsideration alleviates the initial spike of packets
which are present in conditional reconsideration. These arguments are
all quantified in later sections.
Both modes of the algorithm are advantageous in that they do not
require any modifications to the current RTCP protocol structure. In
fact, they operate properly even when only a subset of the multicast
group utilizes them. As more and more members of a group use the
algorithm, the amount of congestion is lessened in proportion. This
leaves open a smooth migration path which is absent for most of the
other proposed solutions.
5.1 Simulations
We ran a number of simulations to examine the performance of the
reconsideration algorithms.
In our simulation model each user is connected to the network via an
access link of 28.8 kb/s downstream (i.e., from the network to the
user). We assume upstream links are infinitely fast, since congestion
occurs only downstream. This congestion is due to the RTCP reports
from all of the other users being sent to any particular user. Multi-
cast join latencies are ignored; this is reasonable in protocols such
as DVMRP [17] since initial packets are flooded. We assume that the
network introduces a delay of D seconds, where D is uniformly dis-
tributed between 0 and 600 ms. Each user has a 100 kB buffer on the
downstream access link. We assume all RTCP packets are 128 bytes in
size.
Figures only available in postscript version
Figures only available in postscript version
Figure 5.1 and Figure 5.1 depict state evolution for a single user
when 10,000 users simultaneously join a multicast group at t=0. The
figures depict the system with no reconsideration (the current speci-
fication), conditional reconsideration, unconditional reconsidera-
tion, and the ideal behavior. The graphs are plotted on a log-log
scale to emphasize the beginning and complete evolution of the sys-
tem. Figure 5.1 depicts the learning curve, and Figure 5.1 shows the
integral of r(t), i.e., the total number of packets sent, given that
r(t) is the packet transmission rate into the multicast group. Note
the burst of packets sent in the beginning by the current algorithm.
Exactly 10,000 packets are sent out in a 2.5 s interval. This is
almost 3000 times the desired RTCP packet rate. However, this burst
J.Rosenberg,H.Schulzrinne [Page 10]
Internet Draft Reconsideration July 1997
is reduced substantially by the reconsideration mechanisms. Condi-
tional reconsideration causes only 197 packets to be sent over a 210
ms interval, and unconditional reconsideration causes merely 75 pack-
ets to be sent over a 327 ms interval. We also observed similar
improvements with a variety of different link speeds, delays, and
group memberships.
We noted that the startup burst with reconsideration was particularly
disturbing when network delays were deterministic instead of uni-
formly distributed. This is demonstrated in Figure 5.1, which looks
at the cumulative number of packets sent when 10,000 users simultane-
ously join at t=0, using conditional reconsideration. In all cases,
the mean network delay was 300ms, but the distribution varies. Expo-
nentially distributed network delays exhibited nearly identical per-
formance to a uniform distribution. Later sections will demonstrate
that the spike is dependent on the amount of time until the first
packet arrives. As the number of users in the step join becomes
large, the number of users who send their packets within the first S
seconds after t=1.25 grows large for any S. Consider an S much
smaller than typical network delays, say 10 mus. As far as computing
arrival times at end stations, these packets can be treated as though
they were all sent at the same time. The amount of time until the
first of these packets arrives at any end system is thus the minimum
network delay experienced by all of those packets. If the network
delays are exponential, the expected minimum of N exponential random
variables goes to zero as N grows. The same is also true for a uni-
form random variable. For a deterministic variable, this is not the
case; the minimum is always the same. Therefore, the performance is
worse for network delays which are fixed.
Figures only available in postscript version
We have also observed that the reconsideration mechanisms cause a
complete pause in packet transmissions after the initial spike. This
pause (which we call the plateau effect) lasts for a time propor-
tional to the number of packets in the spike. This has both positive
and negative implications. On the plus side, it gives network buffers
time to clear. However, it also causes the send rate to deviate from
our desired fixed 1/C packets per second. The phenomenon occurs
because the spike of packets in the beginning causes the system to
reconsider, and not send, all packets after the spike. A more
detailed explanation of the phenomenon is given in Section 6. How-
ever, after the spike and plateau, the packet rate behaves fairly
well, sending packets at a nearly constant rate.
We also ran simulations to observe performance in linear joins, where
groups of users join the system at time seconds apart, for some
small . The results are shown in Figure 5.1 and Figure 5.1. Both
J.Rosenberg,H.Schulzrinne [Page 11]
Internet Draft Reconsideration July 1997
plots depict the cumulative number of packets sent by all users. The
simulation parameters are identical to the above cases, except net-
work delays are deterministic 300 ms. The first plot depicts condi-
tional reconsideration, and the second, unconditional. In all cases,
2500 users join the system, but the rate that they do so is varied.
Both plots depict the step join, and joins at a rate of 5000, 2500,
and 500 users per second. The plots indicate that linear joins
quickly eliminate the initial transient of packets and the plateau
period, with the reduction being better for unconditional reconsider-
ation.
We have done some analysis to examine how the behavior of reconsider-
ation changes under linear joins. Our analysis has shown that as soon
as the number of users who join, times , exceeds the network delays,
the initial bursts in the reconsideration algorithms begin to disap-
pear, whereas they remain for the current specification. All other
aspects of the system performance (including long term growth of
L(t)) are identical to the step-join case.
Figures only available in postscript version
Figures only available in postscript version
6 Analysis
In this section, we present a mathematical analysis of the reconsid-
eration mechanism. We first consider the case where there are no net-
work delays. This results in a differential equation which describes
the learning curve. The analysis also applies to networks with delay,
but only models the post-transient behavior of the system. However,
this is sufficient to compute the post-transient packet rate and sys-
tem convergence times. We then extend this analysis to the case of
network delays, and derive expressions which describe the transient
spikes and plateaus in the learning curve. We also analytically
demonstrate the reasons for improved performance from unconditional
reconsideration, which only exists in the presence of network delays.
6.1 No Delay
We consider a system where all of the users join the network at the
same time, t=0. It is assumed that the network introduces neither
delay nor loss, and that access links have infinite bandwidth. The
result is that when a user sends an RTCP packet, it is received by
all of the users simultaneously at the time it was transmitted.
In the model considered here, all users will have exactly the same
state (in terms of L(t)) at all times. Thus, we trace state evolution
as seen by a particular user. The user estimate has converged when
J.Rosenberg,H.Schulzrinne [Page 12]
Internet Draft Reconsideration July 1997
L(t)=N, the number of users actually in the group.
Whenever a packet is reconsidered, it is either sent, or it is not,
depending on whether the newly computed send time is before or after
the current time. We can therefore view the reconsideration mechanism
as causing any packet to be sent with some probability P. In the most
general case, P is a function of the current time t, the time of the
last RTCP report, and the number of users observed at t, L(t). For-
tunately, the learning curve is only affected by packets which are
initial, that is, sent by users which have not yet sent a packet. For
all such users, the last report time is initialized to t=0, so that
the send probability is a function of t and L(t) only.
If we consider some small interval of time, the change in L(t) is
equal to the number of initial packets scheduled to be sent during
this interval, times the probability of sending a packet in that
interval. Based on this, we can immediately write the differential
equation:
(dL)/(dt) = d(t) P(t,L(t)),
where d(t) is the rate of packets scheduled for transmission during
some time interval. What remains is the evaluation of the scheduled
rate d(t) and probability P(t,L(t)). We first consider the send prob-
ability.
Consider an initial packet scheduled to be transmitted by a user at
time t. Since the number of perceived users, L(t) has surely changed
over any time interval, this packet is reconsidered
user perceives L(t) other users in the system. It thus calculates a
new packet interval, which is equal to:
T = R(alpha) (T_ min,C L(t))
SinceCL(t)islargerthanTmmin
mostofthetime,weignorethemax
operator. Keeping in mind that the previous report time is always
t=0, we can immediately write the probability of sending a packet
using Equation (1):
ll P_ send = (t-(1- alpha) C L(t))/(2 alpha C L(t))& (1 - alpha) C
L(t) < t < (1 + alpha) C L(t)
__________________________
It is for this reason that we make no distinction
between conditional and unconditional reconsideration
here
J.Rosenberg,H.Schulzrinne [Page 13]
Internet Draft Reconsideration July 1997
Figures only available in postscript version
The numerator represents the range of times in the interval widow
which fall below the current time t, while the denominator represents
the total range over which the times for the interval are selected.
This is illustrated in Figure 6.1. Note that this probability only
makes sense when t is between (1-alpha) and (1+alpha) of CL(t). When
t is to the left of the reconsideration window, the probability is
zero, and when t is to the right of the window, it is one.
An important implication of this equation is that the send probabil-
ity is zero when t < (1 - alpha) C L(t) . This places an upper bound
on the learning curve; if the learning curve should reach this bound,
no initial packets would be sent, and the curve would remain flat
until it fell back below this upper bound. We therefore define the
maximumlearningcurveLmmax
(t)tobe:
L_ max(t) = (1)/((1 - alpha) C) t
TheactuallearningcurveL(t)isalwaysbelowLmmax
(t).
The next step is to compute the scheduled rate, which is signifi-
cantly harder. In the ideal case, the rate that packets have been
scheduled at should equal the number of users in the system, N,
divided by the average RTCP interval size perceived by those users at
time t, namely CL(t). At any point in time the fraction of packets to
be sent which are initial is (N-L)/N. Thus, the scheduled rate of
initial packets is roughly given by:
d(t)=(N - L(t))/(C L(t))
The curves of Figure 5.1 show that the reconsideration algorithms
exhibit linear behavior between roughly t=100 and t=9000 (thus ignor-
ing the transient behavior in the beginning few seconds). We there-
fore attempt to determine the slope a of this line based on the dif-
ferential equation. Substituting L(t)=at into (2):
a = (N - L(t))/(C L(t)) (1-(1- alpha) C a)/(2 alpha C a)
For small t, L(t)<N, so we can ignore the L in the first term's
numerator. Thus:
(2 alpha C^2 L(t))/(N) a^2 + a (1- alpha) C -1 = 0
Thus, for large N and small t, L(t)ect the a2 term, and obtain the
desired result:
a = (1)/((1 - alpha) C)
J.Rosenberg,H.Schulzrinne [Page 14]
Internet Draft Reconsideration July 1997
Not coincidentally, this is also the slope of the maximum learning
curve. The equation indicates, therefore, that L(t) grows at its max-
imum rate until the approximation is no longer valid, at which point
its growth tapers off.
As mentioned previously, the equation for the scheduled rate d(t) is
very approximate. We have done some more extensive analysis, and
found that a slightly better approximation is given by:
d(t) = (N - L(t))/(C (1 - alpha)/(2 - alpha) L(t))
This is of the same form as the previous equation, but tends to model
the nonlinearities of the system better.
Now, with the density and send probabilities computed, we can write
the final differential equation, which is:
(dL)/(dt) = (N - L(t))/(C (1 - alpha)/(2 - alpha) L(t)) (t - (1 -
alpha) C L(t))/(2 alpha C L(t))
This ODE allows us to compute a numerical solution, which can be com-
pared against the simulations. Figure 6.1 shows the learning curve,
with 10,000 users joining at t=0, for both analysis and simulation.
In the simulation, however, we take into account non-zero delays and
finite link speeds; network delays are a deterministic 300 ms, and
link speeds are 28.8 kbps. Note that despite this change in assump-
tions, the analytical expression still comes extremely close to the
experimental for a large portion of the convergence period. In par-
ticular, it is very close during the period of linearity of L(t) and
less accurate afterwards. In addition, the differential equation does
not capture the behavior of L(t) for 0 the experimental curve
exhibits the spike and plateau (this is difficult to see in Figure
6.1 because of the x axis scale).
We believe that network delays only impact the behavior of L(t) when
they are on the order of CL(t). This is somewhat intuitive; the
timescale of transmission events for any particular user is CL(t). If
network delays are much smaller than this, they are almost instanta-
neous as far as sending packets goes, and therefore do not affect the
system behavior. It is for this reason that network delays only
impact the learning curve during the first minute or so.
Figures only available in postscript version
With an understanding of the behavior of L(t), we are now in a posi-
tion to discuss the real quantity of interest; the aggregate bit rate
generated by these sources as they move towards convergence. We call
this quantity r(t). Since the integral of this quantity is the total
J.Rosenberg,H.Schulzrinne [Page 15]
Internet Draft Reconsideration July 1997
number of packets sent, we have, as an immediate consequence:
r(t) >= (d)/(dt)L(t)
Experimentally, we have observed that r(t) is actually equal to the
derivative of L(t) for a large fraction of the time until conver-
gence. The reason for this is that the reconsideration mechanism
favors packets from users who have not yet sent a packet (initial
packets). Consider two packets, both scheduled to be sent at some
time t. One is an initial packet, and the other is from a user who
has sent a packet previously. For the initial packet, the last report
time is at t=0, whereas for the other packet, the last report time is
at some time t*, not equal to zero. In the latter case, the bottom
edge of the interval window is at t*+C(1-alpha)L(t). Thus, the proba-
bility of sending a non-initial packet at time t is:
P_ sendold = (t - t^* - C (1 - alpha) L(t))/(2 alpha C L(t))
This quantity is always less than the send probability for initial
packets as given in (3). In fact, for small t, L(t) is equal to
t/C(1-alpha). Plugging this in to (7), we get that the numerator of
the fraction is negative, so the send probability is exactly zero.
is exactly equal to the derivative of L(t) while L(t) is linear. We
expect it to continue to track the derivative closely even as L(t)
tapers off.
Once L(t) has converged to N, packets are sent at a rate of 1/C with
conditional reconsideration. With unconditional reconsideration, this
rate is somewhat less. Therefore, r(t) exhibits a dual-constant
behavior; it starts at 1/(1-alpha)C, stays there for some time, then
reduces to 1/C, where it remains from then on.
The final step is to approximate the convergence time. Unfortunately,
the precise time depends on the non-linear regime of L(t), which we
cannot capture adequately. However, we can bound the convergence time
by assuming linear behavior until L(t) equals N. Since the actual
L(t) is less than this curve, the convergence time Tc is easily
bounded on the left by:
__________________________
Note that plugging in L(t)=t/C(1-alpha) to equation (3)
yields a numerator of zero, and thus a probability of
zero also. In fact, the send probability is zero only
in the limit for N=infty; it is slightly positive for
all real cases. This is in contrast to the send
probability for non-initial packets, which is exactly
zero for finite N.
J.Rosenberg,H.Schulzrinne [Page 16]
Internet Draft Reconsideration July 1997
T_c >= N C (1 - alpha)
This bound grows linearly with the group size, as expected.
It is possible to compute an upper bound as well. Consider the last
initial packet to be sent. Before it is sent, L(t)=N-1. As long as
the send probability is less than one, it is possible that this last
initial packet will not be sent. But, according to (3), the send
probability is one when t>(1+alpha)CL(t). This means that the last
initial packet must be sent as soon as t=(1+alpha)C(N-1). This gives
us an upper bound of:
T_c <= N C (1 + alpha)
6.2 Modeling Delay and Loss
In this section, we consider the reconsideration algorithm in the
presence of network delay and link bottlenecks. We compute the size
of the spike during the initial transient, and the duration of the
plateau. We also demonstrate the superiority of unconditional recon-
sideration in reducing these startup effects.
The spike and plateau are easily explained. At t=0, all N users join
the system. They schedule their packets to be sent between
(1-alpha)Tmin and (1+alpha)Tmin. At time (1-alpha)Tmin, packets begin
to be sent. Lets say the network introduces a delay of D seconds.
This means that no packets will arrive at any end system until time
(1-alpha)Tmin+D. During these D seconds, many packets will be sent by
end-systems, causing the initial spike of packets. After D seconds,
this burst of packets will arrive. This causes a sharp increase in
the perceived group size L(t). This, in turn, increases the packet
transmission interval, and moves the left hand side of the interval
window well beyond the current time, so that Psend=0. The result is a
complete halt in transmissions until real time catches up with the
left hand side of the reconsideration window.
This qualitative description of the system is easily quantified. For
a large enough N, the flood of packets starting at time (1-alpha)Tmin
will saturate the access links D seconds later, independent of
whether conditional or unconditional reconsideration is used. While
the links remain saturated, packets arrive at a continuous rate at
the link speed, which we denote as m packets per second. We can
therefore express the arrival time of the nth packet as:
t_n = (1 - alpha) T_min + D + (n)/(m)
Since each packet arrival increases L(t) by one, we can set n equal
to L(t) in the above equation and solve for L(t):
J.Rosenberg,H.Schulzrinne [Page 17]
Internet Draft Reconsideration July 1997
L(t) = m(t - (1 - alpha) T_min - D)
This flood of packets will cause the learning curve L(t) to advance
very quickly, beyond its maximum as given in (4). When the learning
curve exceeds this maximum, all sending will stop. Call this stopping
time tstop. It can be obtained as the solution to:
(1 - alpha) C L(t_stop) = t_stop
t_stop = (1 - alpha) T_min + D + ((1 - alpha) T_min + D)/((1 - alpha)
C m - 1)
We can then plug this into (9) and solve for the number of packets
which have arrived at this point, nstop:
n_stop = ((1 - alpha) T_min + D)/((1 - alpha) C - 1/m)
The next step is to determine the number of packets sent up to this
point. This figure differs based on whether the reconsideration mech-
anism is conditional or unconditional. We first look at conditional.
The number of packets sent consists of two terms. Before the arrival
of the first packet (at time (1-alpha)Tmin+D+1/m), all packets sched-
uled to be sent are actually sent, since no users have observed a
change in the group size (which would activate the reconsideration
mechanism). The number of packets sent is then the density of packets
scheduled to be sent (which is N/2ATmin) times the amount of time
until the first packet arrives. We call this quantity nsenta, and it
is:
n_senta = (N)/(2 alpha T_min) (D + (1)/(m) )
Once the first packet arrives, reconsideration kicks in, and not all
packets will be sent. Each will be sent with some probability, P.
Unfortunately, this is not the same probability Psend as defined in
Equation 3. That equation ignored the max operator, assuming L(t) was
large most of the time. This is not true in the very beginning, where
it takes a few packets to increase CL(t) beyond Tmin. We assume that
once enough packets have arrived to do this, the result will be to
move the left hand side of the reconsideration window ahead of the
current time (this is true when D<C). In other words, we assume the
left hand side of the reconsideration window is always at
(1-alpha)CTmin until tstop.
With this in mind, the send probability between the arrival of the
first packet, and the stopping of transmission, is given by:
P_send = (t - (1 - alpha) T_min)/(2 alpha T_min)
J.Rosenberg,H.Schulzrinne [Page 18]
Internet Draft Reconsideration July 1997
The number of packets sent is given by the integral of the scheduled
packet rate times the send probability:
n_sentb = INTEGRAL_(1 - alpha) T_min+ D + 1/m^t_stop d(t) P_send dt
Since the density is N/2ATmin during this time period of interest,
the number of packets sent is obtained by:
n_sentb = INTEGRAL_(1 - alpha) T_min+ D + 1/m^t_stop (N)/(2 alpha
T_min) (t - (1 - alpha) T_min)/(2 alpha T_min) dt
This integral results in a growth in the number of sent packets as t2
until complete cutoff at tstop. The solution to the integral is:
n_sentb = (N)/(8 alpha^2 T_min^2) ( ( ((1 - alpha) T_min + D)/((1 -
alpha) C m - 1) + D)^2 - (D + (1)/(m) )^2 )
And the total number of packets sent, using conditional reconsidera-
tion, during this transient spike is:
n_sent = n_senta + n_sentb
These analytical results are compared with simulation in Figure 6.2.
The figure displays the cumulative number of packets sent for a step
join. For the simulation, 100,000 users join the system at t=0. Net-
work delays are deterministic and equal to 300 ms, and link speeds
are 28.8 kbps. The plot shows only the initial transient. The linear
and then t2 behavior is clear from the simulation. Our approximation
for both nsenta and nsentb is quite good. The analysis also predicts
that sending will stop at tstop=1.72s, which agrees with the simula-
tion. Also note that the number of packets sent is dominated by the
nsenta term.
Figures only available in postscript version
For unconditional reconsideration, the number of packets sent during
the transient is different. In the conditional case, the total con-
sisted of two parts; one before the arrival of the first packet (as
the reconsideration mechanism had not kicked in yet), and one after.
In the case of unconditional, we do not need to wait for the arrival
of a packet for the mechanism to activate. Therefore, the number of
packets sent is given by an equation similar to that for nsentb
above. It is the integral of the scheduled rate, times the send prob-
ability. In this case, the integral is between (1-alpha)CTmin and
tstop, instead of just between the arrival of the first packet and
tstop. The number of packets sent for unconditional is therefore:
n_sent = INTEGRAL_(1 - alpha) T_min^t_1 (N)/(2 alpha T_min) (t - (1 -
J.Rosenberg,H.Schulzrinne [Page 19]
Internet Draft Reconsideration July 1997
alpha) T_min)/(2 alpha T_min) dt
Solving, we obtain:
n_sent = (N)/(8 alpha^2 T_min^2) ( ((1 - alpha) T_min + D)/((1 -
alpha) C m - 1) + D)^2
This quantity is small compared to nsenta for conditional reconsider-
ation, thus the improved performance. These results are compared with
simulation in Figure 6.2. The simulation model is identical to that
in Figure 6.2, except unconditional reconsideration is used. As the
plot indicates, only the t2 behavior is present here. The total num-
ber of packets sent during the transient is much reduced, and reason-
ably well predicted by our analysis.
Figures only available in postscript version
The next step is to determine the duration of the plateau period.
Packet sending will start again when the current time catches up with
the left hand side of the interval window, which will have quickly
advanced to (1-alpha)Cnsent. The time at which this happens, tstart
is:
t_start = (1 - alpha) C n_sent
For conditional reconsideration, if we assume nsent....pproxnsenta,
we obtain:
t_start = (C (1 - alpha) N)/(2 alpha T_min) (D + (1)/(m) )
The duration of the plateau period itself is given by:
T_plat = t_start - t_stop
Table 1 lists the values of the parameters derived above for various
group sizes. In all cases, A=1/2, Tmin=2.5, C=.711s, and D=300ms. The
unconditional mechanism provides clear gains in terms of reducing the
number of packets sent during the transient, and the duration of the
plateau effect.
7 Steady State Behavior
It is important to consider the behavior of the reconsideration algo-
rithms when the learning curve has reached steady state (i.e.,
L(t)=N). The ideal behavior is for the total send rate of the group
to be 1/C RTCP packets per second, equally divided among all users.
J.Rosenberg,H.Schulzrinne [Page 20]
Internet Draft Reconsideration July 1997
_________________________________________________________
Conditional
Unconditional
2-5
Group Size N nsent Tplat
nsent Tplat
_________________________________________________________
1000 143 49 s 18 5 s
10000 1430 506 s 178 61 s
100000 14305 5083 s 1784 632 s
_________________________________________________________
Table 1: Transient Behavior for Various Group Sizes
There are actually two situations which can be reasonably deemed as
steady state. The first of these is a group size which remains
exactly fixed. However, in real systems, users come and go, so a sec-
ond definition of steady state is a group whose membership oscillates
slightly about some large value.
We ran simulations to examine performance of the algorithms under
both of these conditions. In first, fixed-group size scenario, both
conditional reconsideration and the current RTCP algorithm both gen-
erate packets at the desired rate of 1/C per second. We found, as
expected, that the packet rate was reduced in the unconditional case,
and packets were sent at .82/C packets per second, a reduction by
18%.
We also performed a stochastic analysis of the unconditional algo-
rithm in steady state. The analysis demonstrated that the packet
intervals, instead of being uniformly distributed between 1/2 and 3/2
of the deterministic interval, were distributed with a density of
(y-1/2)e
y-1/2dybetween1/2and3/2.Theresultisthatthepacket
rates are reduced by 1-ac1e-3/2, or 18%, matching our simulations
exactly.
In the second, slightly oscillating scenario, unconditional reconsid-
eration and the current algorithm performed identically to their
behavior in the first scenario. Conditional reconsideration, however,
exhibited an average packet rate of .91/C, a reduction by 9%. This
makes sense. When a user is about to send an RTCP packet, half the
time the group size is larger than when the last packet is sent,
activating the reconsideration. The other half of the time, the group
size is slightly less, and the packet is sent, as if there were no
reconsideration. Thus, the packet rate should be halfway between
unconditional and the current algorithm.
We also ran some simulations to investigate the fairness properties
J.Rosenberg,H.Schulzrinne [Page 21]
Internet Draft Reconsideration July 1997
of the algorithm. By fairness, we mean the variation of the number of
packets transmitted per user, across all users. In a perfectly fair
system, all users should have transmitted the same number of packets.
We found all algorithms, including the current RTCP algorithm, to be
extremely fair, with coefficients of variation below 0.005 after
about an hour of running time.
Finally, we investigated the impact of reconsideration on synchro-
nization. The problem of synchronization in the Internet was studied
by Van Jacobson and Sally Floyd in [18]. Their study focused on the
synchronization of periodic routing messages, such as those generated
by RIP or IGRP. However, they generalize their results to any system
which is characterized by their periodic messages model. Fortunately,
the RTCP feedback mechanism fits perfectly into this model, making
their results directly applicable here. Although reconsideration
reduces the randomness of the interval, the reduction is negligible
compared to the amount required to induce synchronization.
8 Summary and Future Work
RTP was meant to support real-time communications ranging from two-
party telephone calls to broadcast applications with very large user
populations. It incorporates an adaptive feedback mechanism that
allows scaling to moderately sized groups, but shows a number of
deficiencies once the group size exceeds on the order of a thousand.
The problems can be summarized as congestion, convergence delays and
state storage problems. We have solved the congestion problem via a
simple algorithm called reconsideration. Both analysis and simulation
have shown that the algorithm reduces the initial congestion by
orders of magnitude under a variety of conditions. Furthermore, the
algorithm is backwards compatible with the existing RTCP algorithm,
allowing for a simple migration path.
The reconsideration algorithm has been implemented as part of a gen-
eric RTP Library, and is now operational in several common Mbone
tools, such as rat and Nevot. It has also been proposed to the IETF
as an improvement to the RTP specification, and is likely to be
incorporated into the next release.
Future work involves considering the problem of simultaneous leaves,
to which reconsideration cannot be directly applied. More work is
also needed to solve the other RTP scalability problems.
9 Acknowledgments
The authors wish to thank Steve Casner, T.V. Lakshman, Sanjay Mithal,
Daniel Rubenstein, Bernhard Suter, and Don Towsley for their insight-
ful comments and discussion.
J.Rosenberg,H.Schulzrinne [Page 22]
Internet Draft Reconsideration July 1997
10 Bibliography
[1] ITU-T, Recommendation H.323 - Visual Telephone Systems and
Equipment for Local Area Networks which Provide Non-Guaranteed Qual-
ity of Service , February 1996.
[2] M. Handley, H. Schulzrinne, and E. Schooler, SIP: Session initia-
tion protocol, Internet Draft, Internet Engineering Task Force, Dec.
1996. Work in progress.
[3] H. Schulzrinne, A real-time stream control protocol (RTSP'),
Internet Draft, Internet Engineering Task Force, Dec. 1996. Work in
progress.
[4] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, RTP: a
transport protocol for real-time applications, Request for Comments
(Proposed Standard) RFC 1889, Internet Engineering Task Force, Jan.
1996.
[5] I. Busse, B. Deffner, and H. Schulzrinne, Dynamic QoS control of
multimedia applications based on RTP, Computer Communications ,
vol. 19, pp. 4958, Jan. 1996.
[6] J. Robinson and J. Stewart, Multimon - an ipmulticast monitor.
The MultiMON web page can be found at
http://www.merci.crc.doc.ca/mbone/MultiMON.
[7] S. McCanne, V. Jacobson, and M. Vetterli, Receiver-driven layered
multicast, in SIGCOMM Symposium on Communications Architectures and
Protocols , (Palo Alto, California), Aug. 1996.
[8] K. Almeroth and M. Ammar, Multicast group behavior in the
internet's multicast backbone (mbone), IEEE Communications Magazine
, vol. 35, June 1997.
[9] S. Casner and V. Jacobson, Compressing IP/UDP/RTP headers for
low-speed serial links, Internet Draft, Internet Engineering Task
Force, Nov. 1996. Work in progress.
[10] K. Almeroth and M. Ammar, Collecting and modeling the join/leave
behavior of multicast group members in the mbone, in Proceedings of
the Symposium on High Performance Distributed Computing , (Syracuse,
NY), pp. 20916, IEEE, Aug. 1996.
[11] B. Aboba, Alternatives for enhancing RTCP scalability, Internet
Draft, Internet Engineering Task Force, Jan. 1997. Work in progress.
[12] P. Sharma, D. Estrin, S. Floyd, and V. Jacobson, Scalable timers
J.Rosenberg,H.Schulzrinne [Page 23]
Internet Draft Reconsideration July 1997
for soft state protocols, in Proc. of IEEE Infocom , 1997.
[13] P. Sharma, D. Estrin, S. Floyd, and V. Jacobson, Scalable timers
for soft state protocols, technical report, University of Southern
California, June 1996.
[14] J. Veizades, E. Guttman, and C. Perkins, Service location proto-
col, Request for Comments (Proposed Standard) RFC 2165, Internet
Engineering Task Force, June 1997.
[15] J. Rosenberg, H. Schulzrinne, and B. Suter, Wide area service
location, (internet draft), Internet Engineering Task Force, July
1997. Work In Progress.
[16] Y. Rekhter and T. Li, A border gateway protocol 4 (BGP-4),
Request for Comments (Draft Standard) RFC 1771, Internet Engineering
Task Force, Mar. 1995. (Obsoletes RFC1654).
[17] T. Pusateri, Distance vector multicast routing protocol, Inter-
net Draft, Internet Engineering Task Force, Sept. 1996. Work in pro-
gress.
[18] S. Floyd and V. Jacobson, The synchronization of periodic rout-
ing messages, IEEE/ACM Transactions on Networking , vol. 2, pp.
122136, Apr. 1994.
J.Rosenberg,H.Schulzrinne [Page 24]