RTP Media Congestion Avoidance Techniques (rmcat) | M.W. Welzl |
Internet-Draft | University of Oslo |
Intended status: Experimental | January 19, 2013 |
Expires: July 23, 2013 |
Coupled congestion control for RTP media
draft-welzl-rmcat-coupled-cc-00
When multiple congestion controlled RTP sessions traverse the same network bottleneck, it can be beneficial to combine their controls such that the total on-the-wire behavior is improved. This document describes such a method for flows that have the same sender, in a way that is as flexible and simple as possible while minimizing the amount of changes needed to existing RTP applications.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http:/⁠/⁠datatracker.ietf.org/⁠drafts/⁠current/⁠.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on July 23, 2013.
Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http:/⁠/⁠trustee.ietf.org/⁠license-⁠info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
When there is enough data to send, a congestion controller must increase its sending rate until the path's available capacity has been reached; depending on the controller, sometimes the rate is increased further, until packets are ECN-marked or dropped. In the public Internet, this is currently the only way to get any feedback from the network that can be used as an indication of congestion. This process inevitably creates undesirable queuing delay -- an effect that is amplified when multiple congestion controlled connections traverse the same network bottleneck. When such connections originate from the same host, it would therefore be ideal to use only one single sender-side congestion controller which determines the overall allowed sending rate, and then use a local scheduler to assign a proportion of this rate to each RTP session. This way, priorities could also be implemented quite easily, as a function of the scheduler; honoring user-specified priorities is, for example, required by rtcweb [rtcweb-usecases].
The Congestion Manager (CM) [RFC3124] provides a single congestion controller with a scheduling function just as described above. It is, however, hard to implement because it requires an additional congestion controller and removes all per-connection congestion control functionality, which is quite a significant change to existing RTP based applications. This document presents a method that is easier to implement than the CM and also requires less significant changes to existing RTP based applications. It attempts to roughly approximate the CM behavior by sharing information between existing congestion controllers, akin to "Ensemble Sharing" in [RFC2140].
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].
Figure 1 shows the elements of the architecture for coupled congestion control: the Flow State Exchange (FSE), Shared Bottleneck Detection (SBD) and Flows. The FSE is a storage element. It is passive in that it does not actively initiate communication with flows and the SBD; its only active role is internal state maintenance (e.g., an implementation could use soft state to remove a flow's data after long periods of inactivity). Every time a flow's congestion control mechanism would normally update its sending rate, the flow instead updates information in the FSE and performs a query on the FSE, leading to a sending rate that is often different from what the congestion controller originally determined. Using information about/from the currently active flows, SBD updates the FSE with the correct Flow Group Identifiers (FGIs).
------- <--- Flow 1 | FSE | <--- Flow 2 .. ------- <--- .. Flow N ^ | | ------- | | SBD | <-------| -------
Figure 1: Coupled congestion control architecture
Since everything shown in Figure 1 is assumed to operate on a single host (the sender) only, this document only describes aspects that have an influence on the resulting on-the-wire behavior. It does, for instance, not define how many bits must be used to represent FGIs, or in which way the entities communicate. Implementations can take various forms: for instance, all the elements in the figure could be implemented within a single application, thereby operating on flows generated by that application only. Another alternative could be to implement both the FSE and SBD together in a separate process which different applications communicate with via some form of Inter-Process Communication (IPC). Such an implementation would extend the scope to flows generated by multiple applications. The FSE and SBD could also be included in the Operating System kernel.
This section gives an overview of the roles of the elements of coupled congestion control, and provides an example of how coupled congestion control can operate.
SBD uses knowledge about the flows to determine which flows belong in the same Flow Group (FG), and assigns FGIs accordingly. This knowledge can be derived from measurements, by considering correlations among measured delay and loss as an indication of a shared bottleneck, or it can be based on the simple assumption that packets sharing the same five-tuple (IP source and destination address, protocol, and transport layer port number pair) are typically routed in the same way. The latter method is the only one specified in this document: SBD MUST consider all flows that use the same five-tuple to belong to the same FG. This classification applies to certain tunnels, or RTP flows that are multiplexed over one transport (cf. [transport-multiplex]). In one way or another, such multiplexing will probably be recommended for use with rtcweb [rtcweb-rtp-usage]. Port numbers are needed as part of the classification due to mechanisms like Equal-Cost Multi-Path (ECMP) routing which use different paths for packets towards the same destination, but are typically configured to keep packets from the same transport connection on the same path.
The FSE contains a list of all flows that have registered with it. For each flow, it stores:
The information listed here is enough to implement the sample flow algorithm given below. FSE implementations could easily be extended to store, e.g., a flow's current sending rate for statistics gathering or future potential optimizations.
Flows register themselves with SBD and FSE when they start, deregister from the FSE when they stop, and carry out an UPDATE function call every time their congestion controller calculates a new sending rate. Via UPDATE, they provide the newly calculated rate and the desired rate (less than the calculated rate in case of non-greedy flows, the same otherwise). UPDATE returns a rate that should be used instead of the rate that the congestion controller has determined.
Below, an example algorithm is described. While other algorithms could be used instead, the same algorithm must be applied to all flows. The way the algorithm is described here, the operations are carried out by the flows, but they are the same for all flows. This means that the algorithm could, for example, be implemented in a library that provides registration, deregistration functions and the UPDATE function. To minimize the number of changes to existing applications, one could, however, also embed this functionality in the FSE element.
The goals of the flow algorithm are to achieve prioritization, improve network utilization in the face of non-greedy flows, and impose limits on the increase behavior such that the negative impact of multiple flows trying to increase their rate together is minimized. It does that by assigning a flow a sending rate that may not be what the flow's congestion controller expected. It therefore builds on the assumption that no significant inefficiencies arise from temporary non-greedy behavior or from quickly jumping to a rate that is higher than the congestion controller intended. How problematic these issues really are depends on the controllers in use and requires careful per-controller experimentation. The coupled congestion control mechanism described here also does not require all controllers to be equal; effects of heterogeneous controllers, or homogeneous controllers being in different states, are also subject to experimentation.
There are more potential issues with the algorithm described here. Rule 3 b) leads to a conservative behavior: it ensures that only one flow at a time can increase the overall sending rate. This rule is probably appropriate for situations where minimizing delay is the major goal, but it may not fit for all purposes; it also does not incorporate the magnitude by which a flow can increase its rate. Notably, despite this limitation on the overall rate of all flows per FGI, immediate rate jumps of single flows could become problematic when the FSE is used in a highly asynchronous manner, e.g. when flows have very different RTTs. Rule 3 e) gives all the leftover rate of non-greedy flows to the first flow that updates its sending rate, provided that this flow needs it all (otherwise, its own leftover rate can be taken by the next flow that updates its rate). Other policies could be applied, e.g. to divide the leftover rate of a flow equally among all other flows in the FGI.
In order to illustrate the operation of the coupled congestion control algorithm, this section presents a toy example of two flows that use it. Let us assume that both flows traverse a common 10 Mbit/s bottleneck and use a simplistic congestion controller that starts out with 1 Mbit/s, increases its rate by 1 Mbit/s in the absence of congestion and decreases it by 2 Mbit/s in the presence of congestion. For simplicity, flows are assumed to always operate in a round-robin fashion. Rate numbers below without units are assumed to be in Mbit/s. For illustration purposes, the actual sending rate is also shown for every flow in FSE diagrams even though it is not really stored in the FSE.
Flow #1 begins. It is greedy and considers itself to have top priority. This is the FSE after the flow algorithm's step 1:
--------------------------------------------- | # | FGI | P | CR | DR | S_CR | Rate | | | | | | | | | | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ---------------------------------------------
Its congestion controller gradually increases its rate. Eventually, at some point, the FSE should look like this:
--------------------------------------------- | # | FGI | P | CR | DR | S_CR | Rate | | | | | | | | | | 1 | 1 | 1 | 10 | 10 | 10 | 10 | ---------------------------------------------
Now another flow joins. It is also greedy, and has a lower priority (0.5):
----------------------------------------------- | # | FGI | P | CR | DR | S_CR | Rate | | | | | | | | | | 1 | 1 | 1 | 10 | 10 | 10 | 10 | | 2 | 1 | 0.5 | 1 | 1 | 11 | 1 | -----------------------------------------------
Now assume that the first flow updates its rate to 8, because the total sending rate of 11 exceeds the total capacity. Let us take a closer look at what happens in step 3 of the flow algorithm.
new_CR = 8. new_DR = infinity. 3 a) S_P = 1.5; S_DR = 11; new_S_CR = 11. 3 b) new_CR < CR, hence CR = 8. 3 c) new_S_CR = 9; S_CR = 9. 3 d) DR = CR = 8; S_DR = 9. 3 e) TLO = 0; there are no other flows with DR < CR. 3 f) new sending rate: min(infinity, 1/1.5 * 9 + 0) = 6. 3 g) does not apply. The resulting FSE looks as follows: ----------------------------------------------- | # | FGI | P | CR | DR | S_CR | Rate | | | | | | | | | | 1 | 1 | 1 | 8 | 8 | 9 | 6 | | 2 | 1 | 0.5 | 1 | 1 | 11 | 1 | -----------------------------------------------
The effect is that flow #1 is sending with 6 Mbit/s instead of the 8 Mbit/s that the congestion controller derived. Let us now assume that flow #2 updates its rate. Its congestion controller detects that the network is not fully saturated (the actual total sending rate is 6+1=7) and increases its rate.
new_CR=2. new_DR = infinity. 3 a) S_P = 1.5; S_DR = 9; new_S_CR = 9. 3 b) new_CR > CR but new_S_CR < S_CR, hence CR = 2. 3 c) new_S_CR = 10; S_CR = 10. 3 d) DR = CR = 2; S_DR = 10. 3 e) TLO = 0; there are no other flows with DR < CR. 3 f) new sending rate: min(infinity, 0.5/1.5 * 10 + 0) = 3.33. 3 g) new sending rate > DR, hence DR = 3.33. The resulting FSE looks as follows: ----------------------------------------------- | # | FGI | P | CR | DR | S_CR | Rate | | | | | | | | | | 1 | 1 | 1 | 8 | 8 | 9 | 6 | | 2 | 1 | 0.5 | 2 | 3.33 | 10 | 3.33 | -----------------------------------------------
The effect is that flow #2 is now sending with 3.33 Mbit/s, which is close to half of the rate of flow #1 and leads to a total utilization of 6(#1) + 3.33(#2) = 9.33 Mbit/s. Flow #2's congestion controller has increased its rate faster than the controller actually expected. Now, flow #1 updates its rate. Its congestion controller detects that the network is not fully saturated and increases its rate. Additionally, the application feeding into flow #1 limits the flow's sending rate to at most 2 Mbit/s.
new_CR=9. new_DR=2. 3 a) S_P = 1.5; S_DR = 11.33; new_S_CR = 10. 3 b) new_CR > CR and new_S_CR > S_CR, hence CR is not updated (since flow #2 has just increased S_CR, flow #1 cannot also increase it in this iteration). 3 c) new_S_CR = 10; S_CR = 10. 3 d) DR = 2; S_DR = 5.33. 3 e) TLO = 0; there are no other flows with DR < CR. 3 f) new sending rate: min(2, 1/1.5 * 10 + 0) = 2. Note that, without the 2 Mbit/s limitation from the application, the new sending rate for flow #1 would now be 6.66 Mbit/s, leading to perfect network saturation (6.66 + 3.33 = approx. 10). 3 g) does not apply. The resulting FSE looks as follows: ----------------------------------------------- | # | FGI | P | CR | DR | S_CR | Rate | | | | | | | | | | 1 | 1 | 1 | 8 | 2 | 10 | 2 | | 2 | 1 | 0.5 | 2 | 3.33 | 10 | 3.33 | -----------------------------------------------
Now, the total rate of the two flows is 2 + 3.33 = 5.33 Mbit/s, i.e. the network is significantly underutilized due to the limitation of flow #1. Flow #2 updates its rate. Its congestion controller detects that the network is not fully saturated and increases its rate.
new_CR=3. new_DR = infinity. 3 a) S_P = 1.5; S_DR = 5.33; new_S_CR = 10. 3 b) new_CR > CR but new_S_CR = S_CR, hence CR = 3. 3 c) new_S_CR = 11; S_CR = 11. 3 d) DR = 3; S_DR = 5. 3 e) TLO = 0; flow #1 has DR < CR, hence TLO += 1/1.5 * 11 - 2 = 5.33. DR of flow #1 is set to 8. Flow #1 does not have a negative P(i) value, so its entry is not deleted. 3 f) new sending rate: min(infinity, 0.5/1.5*11 + 5.33) = 9. 3 g) new sending rate > DR, hence DR = 9. The resulting FSE looks as follows: ----------------------------------------------- | # | FGI | P | CR | DR | S_CR | Rate | | | | | | | | | | 1 | 1 | 1 | 8 | 8 | 10 | 2 | | 2 | 1 | 0.5 | 3 | 9 | 11 | 9 | -----------------------------------------------
Now, the total rate of the two flows is 2 + 9 = 11 Mbit/s, exceeding the total capacity by the 1 Mbit/s by which the congestion controller of flow #2 has increased its rate. Note that, had flow #1 been greedy, the same total rate would have resulted after this iteration. Finally, flow #1 terminates. It sets P to -1 and DR to 0. Let us assume that it terminated late enough for flow #2 to still experience the network in a congested state, i.e. flow #2 decreases its rate in the next iteration.
new_CR = 1. new_DR = infinity. 3 a) S_P = 1.5; S_DR = 9; new_S_CR = 11. 3 b) new_CR < CR hence CR = 1. 3 c) new_S_CR = 9; S_CR = 9. 3 d) DR = 1; S_DR = 1. 3 e) TLO = 0; flow #1 has DR < CR, hence TLO += 1/1.5 * 9 - 0 = 6. DR of flow #1 is set to 8. Flow #1 has a negative P(i) value, so its entry is deleted. 3 f) new sending rate: min(infinity, 0.5/1.5 * 9 + 6) = 9. 3 g) new sending rate > DR, hence DR = 9. The resulting FSE looks as follows: ----------------------------------------------- | # | FGI | P | CR | DR | S_CR | Rate | | | | | | | | | | 1 | 1 | -1 | 8 | 0 | 10 | 2 | (before deletion) | 2 | 1 | 0.5 | 1 | 9 | 9 | 9 | -----------------------------------------------
Now, the total rate, used only by flow #2, is 9 Mbit/s, which is the rate that it would have had alone upon reacting to congestion after a sending rate of 11 Mbit/s.
This document has benefitted from discussions with and feedback from Stein Gjessing, David Hayes, Safiqul Islam, Naeem Khademi, Andreas Petlund, and David Ros (who also gave the FSE its name).
This memo includes no request to IANA.
In scenarios where the architecture described in this document is applied across applications, various cheating possibilities arise: e.g., supporting wrong values for the calculated rate, the desired rate, or the priority of a flow. In the worst case, such cheating could either prevent other flows from sending or make them send at a rate that is unreasonably large. The end result would be unfair behavior at the network bottleneck, akin to what could be achieved with any UDP based application. Hence, since this is no worse than UDP in general, there seems to be no significant harm in using this in the absence of UDP rate limiters.
In the case of a single-user system, it should also be in the interest of any application programmer to give the user the best possible experience by using reasonable flow priorities or even letting the user choose them. In a multi-user system, this interest may not be given, and one could imagine the worst case of an "arms race" situation, where applications end up setting their priorities to the maximum value. If all applications do this, the end result is a fair allocation in which the priority mechanism is implicitly eliminated, and no major harm is done.
[RFC2119] | Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. |
[RFC2140] | Touch, J., "TCP Control Block Interdependence", RFC 2140, April 1997. |
[RFC3124] | Balakrishnan, H. and S. Seshan, "The Congestion Manager", RFC 3124, June 2001. |
[rtcweb-usecases] | Holmberg, C., Hakansson, S. and G. Eriksson, "Web Real-Time Communication Use-cases and Requirements", Internet-draft draft-ietf-rtcweb-use-cases-and-requirements-10.txt, December 2012. |
[transport-multiplex] | Westerlund, M. and C. Perkins, "Multiple RTP Sessions on a Single Lower-Layer Transport", Internet-draft draft-westerlund-avtcore-transport-multiplexing-04.txt, October 2012. |
[rtcweb-rtp-usage] | Perkins, C., Westerlund, M. and J. Ott, "Web Real-Time Communication (WebRTC): Media Transport and Use of RTP", Internet-draft draft-ietf-rtcweb-rtp-usage-05.txt, October 2012. |