Internet Engineering Task Force N. Kuhn, Ed.
Internet-Draft Telecom Bretagne
Intended status: Informational P. Natarajan, Ed.
Expires: August 11, 2014 Cisco Systems
D. Ros
Simula Research Laboratory
N. Khademi
University of Oslo
February 07, 2014

AQM Evaluation Guidelines
draft-kuhn-aqm-eval-guidelines-00

Abstract

Unmanaged large buffers in today's networks have given rise to a slew of performance issues. These performance issues can be addressed by some form of Active Queue Management (AQM), optionally in combination with a packet scheduling scheme such as fair queuing. The IETF AQM working group was recently formed to standardize AQM schemes that are robust, easily implemented, and successfully deployed in today's networks. This document describes various criteria for performing precautionary evaluations of AQM proposals. This document also helps in ascertaining whether any given AQM proposal should be taken up for standardization by the AQM WG.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on August 11, 2014.

Copyright Notice

Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


1. Introduction

Active Queue Management (AQM) addresses the concerns arising from using unnecessarily large and unmanaged buffers in order to improve network and application performance. Several AQM algorithms have been proposed in the past years, the most notable being Random Early Detection (RED), BLUE, and Proportional Integral controller (PI). In general, these algorithms actively interact with the Transmission Control Protocol (TCP) and any other transport protocol that deploys a congestion control scheme to manage the amount of data it keeps in the network. While the available buffer space in the routers and switches remains sufficient to accommodate short-term buffering requirements, this interaction reduces mean buffer occupancy, and therefore both end-to-end delay and jitter. Some of these algorithms, notably RED, have also been widely implemented in network devices. However, the benefits of the RED AQM scheme have not been realized, since it is reported to be usually turned off. The main reason for this reluctance to use RED is the sensitivity of its parameters to the operating conditions in the network and the difficulty of tuning them to realize benefits in today's deployments.

In order to meet mostly throughput-based SLA requirements and to avoid packet drops, many network providers resort to increasing the available buffer space. This increase is also referred to as Bufferbloat [BB2012]. Deploying large unmanaged buffers on the Internet leads to an increase in end-to-end delay, resulting in poor performance for latency-sensitive applications such as real-time multimedia (e.g., voice, video, gaming). The degree to which this affects modern networking equipment, especially consumer-grade equipment, produces problems even with commonly used web services. Active queue management is thus essential to control queuing delay and decrease network latency.

The AQM working group was recently formed within the TSV area to address the problems with large unmanaged buffers in the Internet. Specifically, the AQM WG is tasked with standardizing AQM schemes that not only address concerns with such buffers, but are also robust under a wide variety of operating conditions. In order to ascertain whether the WG should undertake standardizing an AQM proposal, the WG requires guidelines for evaluating AQM proposals. This document provides the necessary guidelines.

1.1. Guidelines for AQM designers

One of the key objectives behind formulating the guidelines is to help ascertain whether a specific AQM is not only better than drop-tail but also safe to deploy. Thus, the evaluation of AQM performance can be divided into two categories: (1) the guidelines to quantify AQM schemes' performance in terms of latency reduction, goodput maximization and the trade-off between the two and (2) the guidelines for safe deployment, including self adaptation, stability analysis, fairness, design/implementation complexity and robustness to different operating conditions.

This memo recognizes that an AQM scheme may not be suitable for all possible network environments relevant to the IETF, such as home networks, data centers, enterprise edge, etc. Therefore, this document considers two different categories of evaluation scenarios: (1) generic scenarios that any AQM proposal SHOULD be evaluated against, and (2) evaluation scenarios specific to a network environment. Irrespective of whether or not an AQM is standardized by the WG, we recommend that the relevant scenarios and metrics discussed in this document be considered. Since a specific AQM scheme may not be applicable to all network environments, the specific evaluation scenarios make it possible to establish the environments where the AQM is applicable. These guidelines do not present every possible scenario and cannot cover every possible aspect of a particular algorithm. In addition, it is worth noting that the proposed criteria are not bound to a particular evaluation toolset.

This document details how an AQM designer can rate the feasibility of their proposal in different types of network devices, given the various architecture possibilities (switches, routers, firewalls, hosts, drivers, etc.) where an AQM may be implemented. To this end, these guidelines state that an AQM's resource requirements SHOULD be measured, considering which parts of the AQM run in real-time on the data versus the components that run at higher levels or larger time-scales.

1.2. Reducing the latency and maximizing the goodput

The trade-off between reducing the latency and maximizing the goodput is intrinsically linked to each AQM scheme and is key to evaluating its performance. This trade-off MUST be considered in various scenarios to ensure the safety of an AQM deployment. Whenever possible, solutions should aim at both maximizing goodput and minimizing latency. This document proposes guidelines that enable the reader to quantify (1) the reduction of latency, (2) the maximization of goodput and (3) the trade-off between the two.

The tester SHOULD discuss the performance and deployment of the proposal in comparison with those of drop-tail: basically, these guidelines provide the tools to understand the cost (in terms of deployment) versus the potential gain in performance of introducing the proposed scheme.

1.3. Organization of the document

This memo is organized as follows:

1.4. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. Metrics of interest

End-to-end delay is the result of propagation delay, serialization delay, service delay in a switch, and queuing delay, summed over the network elements in the path. Among those, only the queuing delay is variable for a certain network path. AQM or scheduling algorithms may reduce this delay by providing signals to the sender on the emergence of congestion, but any impact on the goodput must be carefully considered. This section presents the metrics that MUST be used to better quantify (1) the reduction of latency, (2) maximization of goodput and (3) the trade-off between the two. These metrics MUST be considered to better assess the performance of an AQM scheme.

2.1. Queue-related metrics

The queue-related metrics enable a better understanding of the AQM behavior under tests and the impact of its internal parameters. This section provides details on (1) the metrics that SHOULD be evaluated and (2) how to represent them.

The metrics presented are the link utilization, the queuing delay and the queue size in order to quantify the trade-off between goodput and delay. Considering the fact that AQM schemes may drop packets, the AQM tester SHOULD look carefully at the drops that the scheme provokes.

2.1.1. Link utilization

The definition of the link utilization is given in Section 2.3.5 of RFC 5136 [RFC5136]: "the utilization now represents the fraction of the capacity that is being used and is a value between zero (meaning nothing is used) and one (meaning the link is fully saturated)."

The link utilization is a metric that MUST be measured at the output of the sending device and illustrates the link between queuing delay and packet-dropping rates, which are key elements to understand the internal behavior of the algorithm. The goodput metric for end-to-end performance evaluation will be discussed in Section 2.2.3.

The guidelines advise that the tester SHOULD determine the minimum, average and maximum measurements of the link utilization and the coefficient of variation for the average value as well.
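The summary statistics named above can be computed as in the following minimal sketch. The function name summarize and the use of the population standard deviation are assumptions made for illustration; these guidelines do not prescribe an estimator or a toolset.

```python
import math


def summarize(samples):
    """Return the minimum, average, maximum and coefficient of variation.

    `samples` is a non-empty list of measurements, for example link
    utilization values between 0 and 1 taken at regular intervals.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    n = len(samples)
    avg = sum(samples) / n
    # Population standard deviation (divide by n). This choice is an
    # assumption; the guidelines do not say which estimator to use.
    variance = sum((x - avg) ** 2 for x in samples) / n
    # The coefficient of variation is undefined when the average is zero.
    cov = math.sqrt(variance) / avg if avg != 0 else float("nan")
    return {"min": min(samples), "avg": avg, "max": max(samples), "cov": cov}
```

The same helper could be reused for any other metric for which the minimum, average, maximum and coefficient of variation are requested.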

2.1.2. Queuing delay and queue size

The queuing delay is the time a packet waits in a queue until it can be transmitted to the lower layers. The queue size is the number of bytes which are occupying the queue.

Both queue size and queuing delay are needed because of fluctuating link speeds. Moreover, an AQM algorithm may be based on the length of the queue (such as RED) or the queuing delay (such as CoDel or PIE).
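As a rough illustration of why the two metrics are not interchangeable, the sketch below converts a queue size into the delay needed to drain it at a given link rate. The function name and the example values are assumptions chosen for illustration, and the calculation ignores per-packet overhead.

```python
def queuing_delay_s(queue_bytes, link_rate_bps):
    """Time in seconds needed to drain `queue_bytes` at `link_rate_bps`."""
    if link_rate_bps <= 0:
        raise ValueError("link rate must be positive")
    return queue_bytes * 8 / link_rate_bps


# The same backlog (125000 bytes, an illustrative value) corresponds to
# very different delays when the link speed changes.
delay_at_10mbps = queuing_delay_s(125_000, 10_000_000)
delay_at_100mbps = queuing_delay_s(125_000, 100_000_000)
```

With these example values, the same queue size corresponds to about 100 ms of queuing delay at 10 Mbps but only about 10 ms at 100 Mbps, so reporting one metric without the other can hide the effect of a change in link speed.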

The guidelines advise that the tester SHOULD determine the minimum, average and maximum measurements of these metrics and the coefficient of variation for the average values as well.

2.1.3. Packet loss

Losses can occur for various reasons. The losses under consideration in this section are those that occur in the queue where the AQM scheme operates.

Two classes of loss can be distinguished:

For each of these cases, these guidelines advise one measure of:

2.2. End-to-end Metrics

The impact of the introduction of AQM schemes on the end-to-end performance MUST be evaluated: this section presents the metrics that make it possible to evaluate the benefits provided by the tested AQM.

The metrics presented are the latency, the goodput, the packet loss synchronization, the Quality of Experience (QoE) related metrics and the fairness. The objective of these metrics is to quantify how much the introduction of an AQM:

2.2.1. Flow Completion time

The flow completion time is an important performance metric for the end user. Considering the fact that an AQM scheme may drop packets, the flow completion time is directly linked to its algorithm and this is all the more true when the flows are short.

An AQM evaluation SHOULD measure the distribution of the flow completion time.
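One possible way to report such a distribution is through a few percentiles of the measured completion times. The sketch below is only an illustration: the function names, the nearest-rank percentile method and the chosen percentiles (50, 90, 99) are assumptions, not requirements of these guidelines.

```python
import math


def percentile(sorted_samples, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(p / 100 * len(sorted_samples)))
    return sorted_samples[rank - 1]


def fct_distribution(flows, percentiles=(50, 90, 99)):
    """Summarize flow completion times.

    `flows` is a list of (start_time_s, end_time_s) pairs, one per
    completed flow. Returns a dict mapping each percentile to a time.
    """
    if not flows:
        raise ValueError("at least one completed flow is required")
    fcts = sorted(end - start for start, end in flows)
    return {p: percentile(fcts, p) for p in percentiles}
```

For example, fct_distribution([(0, 1), (0, 2), (0, 3), (0, 4)]) yields a median of 2 seconds, while the 90th and 99th percentiles both equal 4 seconds. A full empirical CDF could be plotted from the same sorted list.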

2.2.2. Packet loss

The packet losses that occur in the queue where the AQM scheme operates affect the end-to-end performance at the receiver's side. This metric may already be covered by Section 2.1.3; however, its end-to-end aspect may ease the understanding of each proposal.

The tester MUST evaluate, at the receiver:

The guidelines advise that the tester SHOULD determine the minimum, average and maximum measurements of these metrics and the coefficient of variation for the average value as well.

2.2.3. Goodput

The goodput may already be included with latency measurements, but measuring the goodput enables an end-to-end appreciation of how well the AQM improves transport and application performance. The measured end-to-end goodput decreases as the AQM scheme's packet drops increase -- the fewer the packet drops, the fewer packets need retransmission, minimizing the AQM's impact on transport and application performance. Additionally, an AQM scheme may resort to Explicit Congestion Notification (ECN) marking as an initial means to control delay. Again, marking packets instead of dropping them reduces the number of packet retransmissions and increases goodput. Overall, end-to-end goodput values help evaluate the AQM scheme's effectiveness in minimizing packet drops that impact application performance and estimate how well the AQM scheme works with ECN.

If scheduling comes into play, a measure of how individual queues are serviced may be necessary: the scheduling introduced on top of the AQM may starve some flows and boost others. The utilization of the link does not cover this, as the utilization would be the same, whereas the goodput lets the tester see whether some flows are starved.

The guidelines advise that the tester SHOULD determine the minimum, average and maximum measurements of the goodput and the coefficient of variation for the average value as well.

2.2.4. Latency and jitter

The end-to-end latency differs from the queuing delay: it is linked to the network topology and the path characteristics. Moreover, the jitter strongly depends on the traffic and the topology as well. The introduction of an AQM scheme would affect these metrics, and the end-to-end evaluation of performance MUST consider them for a better understanding.

The guidelines advise that the tester SHOULD determine the minimum, average and maximum measurements for these metrics and the coefficient of variation for their average values as well.

2.2.5. QoE metrics

An AQM evaluation study MAY measure the quality of experience of end users for selected applications that are sensitive to latency, such as video streaming or VoIP. For these specific applications, one SHOULD estimate the average Mean Opinion Score (MOS). Many AQM proposals attempt to reduce the latency, and these QoE metrics can provide strong arguments for the development of various AQM solutions.

The evaluation of QoE SHOULD consider the end-to-end latency and jitter detailed in Section 2.2.4.

2.3. Discussion on the trade-off between latency and goodput

The metrics presented in this section MUST be considered, in order to discuss and quantify the trade-off between latency and medium utilization wherever present.

This trade-off can also be illustrated with figures following the recommendations of Section IV-B of [TCPEVAL2008]. For each scenario, the output SHOULD be four graphs:

Concerning the drop rate, the AQM tester can distinguish two classes of drops, AQM-induced losses and buffer overflow, resulting in two graphs for the 'drop rate vs. queuing delay' graph. Each of these pairs of graphs contributes to a better understanding (1) of the delay/goodput/drop-rate trade-off for a given congestion control mechanism, and (2) of how the goodput and average queue size vary as a function of the traffic load.

3. Evaluation scenarios

This section presents the set of scenarios that can be considered to evaluate the performance of an AQM scheme: some scenarios MUST be considered, whereas others are optional. One AQM algorithm may not work in all networking environments: this section helps in determining the environments where an AQM proposal is applicable. The performance need not be evaluated for the whole set of scenarios, but for each selected scenario, the metrics presented in Section 2 MUST be considered. Each of the following subsections can be seen as a potential working area for the tested AQM algorithm. The output of these guidelines would be a list of competencies for each AQM, which will give the AQM WG clear criteria to compare the AQMs.

While presenting the performance of an AQM algorithm for the selected scenarios, the tester MUST provide any parameter that had to be set beforehand. Moreover, the values for these parameters MUST be explained and justified as detailed in Section 4.2.

The tester SHOULD compare its proposal's performance and deployment with those of drop-tail: basically, these guidelines provide the tools to understand the cost (in terms of deployment) versus the potential gain in performance of the introduction of the proposed scheme.

This section is organized as follows:

3.1. Topology and notations

This section presents the topology that can be used for each of the following scenarios and the corresponding notations.

    +--------------+                                +--------------+
    |senders A|    |                                |  |receivers B|
    |---------+    |                                |  +-----------|
    |              |                                |              |
    |--------------|                                |--------------|
    |traffic class1|RTTA1.1,                RTTR1.1,|traffic class1|
    |--------------|CA1.1                      CR1.1|--------------|
    | SEN.Flow1.1 +---------+            +-----------+ REC.Flow1.1 |
    |        +     |        |            |          |        +     |
    |        +     |RTTA1.X,|            |  RTTR1.X,|        +     |
    |        +     |CA1.X   |            |     CR1.X|        +     |
    | SEN.Flow1.X +-----+   |            |  +--------+ REC.Flow1.X |
    |--------------|    |   |            |  |       |--------------|
    |    +         |  +-+---+---+     +--+--+---+   |        +     |
    |    |         |  |Central L|     |Central R|   |        |     |
    |    |         |  |---------|RTTLR|---------|   |        |     |
    |    |         |  | AQM     |CLR  |         |   |        |     |
    |    |         |  | BuffSize+-----+         |   |        |     |
    |    +         |  | (Bsize) |     |         |   |        +     |
    |--------------|  +-----+--++     ++-+------+   |--------------|
    |traffic classN|RTTAN.1,|  |       | |  RTTRN.1,|traffic classN|
    |--------------|CAN.1   |  |       | |     CRN.1|--------------|
    | SEN.FlowN.1 +---------+  |       | +-----------+ REC.FlowN.1 |
    |        +     |           |       |            |        +     |
    |        +     |RTTAN.Y,   |       |    RTTRN.Y,|        +     |
    |        +     |CAN.Y      |       |       CRN.Y|        +     |
    | SEN.FlowN.Y +------------+       +-------------+ REC.FlowN.Y |
    +--------------+                                +--------------+
		

Figure 1: Topology and notations

Figure 1 is a generic topology where:

The size of the buffers MUST be carefully set, considering the bandwidth-delay product.
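As a worked illustration, the bandwidth-delay product (BDP) for a given capacity and RTT can be computed as sketched below. The function names and the 1500-byte packet size are assumptions made for this example; the example values (10 Mbps, 100 ms) are those used by several scenarios in this document.

```python
import math


def bdp_bytes(capacity_bps, rtt_ms):
    """Bandwidth-delay product in bytes for a capacity and an RTT."""
    # bits/s * ms / 1000 gives bits; dividing by 8 gives bytes.
    return capacity_bps * rtt_ms / 8000


def bdp_packets(capacity_bps, rtt_ms, packet_bytes=1500):
    """BDP in full-size packets, rounded up.

    The 1500-byte default packet size is an assumption.
    """
    return math.ceil(bdp_bytes(capacity_bps, rtt_ms) / packet_bytes)


# 10 Mbps bottleneck with a 100 ms RTT.
example_bytes = bdp_bytes(10_000_000, 100)
example_packets = bdp_packets(10_000_000, 100)
```

For a 10 Mbps link and a 100 ms RTT, this gives 125000 bytes, or 84 packets of 1500 bytes. How the buffer size is then chosen relative to the BDP is left to the tester.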

3.2. Generic scenarios

The following scenarios are generic and MUST be considered whatever the context is.

3.2.1. Traffic Profiles

Network and end devices need to be configured with reasonable amount of buffers in order to absorb transient bursts. In some situations, network providers configure devices with large buffers to avoid packet drops and increase goodput. Transmission Control Protocol (TCP) fills up these unmanaged buffers until the TCP sender receives a signal (packet drop) to cut down the sending rate. The larger the buffer, the higher the buffer occupancy, and therefore the queuing delay. On the other hand, an efficient AQM scheme sends out early congestion signals to TCP senders so that the queuing delay is brought under control.

Not all applications run over the same flavor of TCP. A variety of senders generates different traffic profiles at the networking device. For example, there could be senders that either do not respond to congestion signals (aka unresponsive flows) or do not cut down their sending rate as expected (aka aggressive flows). An AQM scheme should ensure queuing delay is under control irrespective of these traffic profiles.

This document will evaluate an AQM proposal based on the metrics presented in Section 2 irrespective of traffic profiles involved -- different senders (TCP variants, unresponsive, aggressive), traffic mix with different applications, etc. Additionally, the AQM scheme MUST NOT require operator tuning to work with varying traffic profiles.

3.2.1.1. Topology Description

The topology is presented in Figure 1. For this scenario, the capacities of the links MUST be set to 10Mbps and the RTTs to 100ms.

3.2.1.2. TCP-friendly Sender

This scenario helps evaluate how an AQM scheme adapts to a TCP-friendly transport sender. It consists of a single TCP New Reno flow between sender A and receiver B that transfers a large file for a period of 50s. Other TCP-friendly congestion control schemes, such as TCP-friendly rate control [RFC5348], MAY also be considered.

For each TCP-friendly transport considered, the graphs described in Section 2.3 MUST be generated. We expect that an AQM proposal exhibits similar behavior for all the TCP-friendly transports considered.

3.2.1.3. Aggressive Transport Sender

This scenario helps evaluate how an AQM scheme adapts to a transport sender whose sending rate is more aggressive than a single TCP-friendly sender. It consists of a single TCP Cubic flow between sender A and receiver B that transfers a large file for a period of 50s. Other congestion control schemes such as (ref) MAY also be considered in order to help understand how the AQM scheme adapts to that particular aggressive transport.

For each flavor of aggressive transport, the graphs described in Section 2.3 MUST be generated.

3.2.1.4. Unresponsive Transport Sender

This scenario helps evaluate how an AQM scheme adapts to a transport sender that is not responsive to congestion signals (ECN marks and/or packet drops) from the AQM scheme. In order to create a test environment that results in queue build-up, we consider unresponsive flow(s) whose sending rate is greater than the bottleneck link capacity between nodes L and R. Note that faulty transport implementations on end hosts and/or faulty network elements en route that "hide" congestion signals in packet headers [I-D.ietf-aqm-recommendation] may also lead to a similar situation, such that the AQM scheme needs to adapt to unresponsive traffic.

This scenario consists of UDP flow(s) with an aggregate rate of 12Mbps between sender A and receiver B, transferring a large file for a period of 50s. The graphs described in Section 2.3 MUST be generated.
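To give a feel for the overload that this scenario creates, the sketch below computes how quickly a buffer fills and the smallest drop fraction that keeps the queue bounded when the offered load exceeds the bottleneck capacity. The function name and the 125000-byte buffer (one bandwidth-delay product at 10 Mbps and 100 ms) are assumptions for illustration only.

```python
def overload_summary(offered_bps, capacity_bps, buffer_bytes):
    """Describe the effect of a constant-rate unresponsive overload.

    Returns the time for an initially empty buffer to fill when nothing
    is dropped, and the long-run fraction of packets that cannot be
    forwarded because the offered load exceeds the capacity.
    """
    excess_bps = offered_bps - capacity_bps
    if excess_bps <= 0:
        # No overload: the buffer never fills and no drop is required.
        return {"fill_time_s": None, "min_drop_fraction": 0.0}
    return {
        "fill_time_s": buffer_bytes * 8 / excess_bps,
        "min_drop_fraction": excess_bps / offered_bps,
    }


# 12 Mbps of UDP into a 10 Mbps bottleneck with a 125000-byte buffer
# (the buffer size is an assumption, not a value set by this document).
summary = overload_summary(12_000_000, 10_000_000, 125_000)
```

With these values, the queue grows at 2 Mbps, the assumed buffer fills in 0.5 s, and roughly one packet in six (about 16.7%) has to be dropped in steady state regardless of the AQM scheme, since the flows do not react.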

3.2.1.5. Traffic Mix

This scenario helps evaluate how an AQM scheme adapts to a traffic mix consisting of different applications such as FTP, web, voice and video traffic. The test cases presented in this subsection have been inspired by Table 2 of [DOCSIS2013]:

Figure 2 presents the various cases for the traffic that MUST be generated between sender A and receiver B.

    +----+-------------------+--------+
    |Case|  Number of flows  |Comments|
    |    +----+----+----+----+ on FTP |
    |    |VoIP|Web |CBR |FTP |        |
    +----+----+----+----+----+--------+
    | A  |  1 |  1 |  0 |  0 |        |
    | B  |  1 |  1 |  0 |  1 | cont.  |
    | C  |  1 |  1 |  0 |  5 | repeat |
    |    |    |    |    |    |  (5MB) |
    | D  |  1 |  1 |  1 |  5 | repeat |
    |    |    |    |    |    |  (5MB) |
    +----+----+----+----+----+--------+
		

Figure 2: Traffic Mix scenarios

For each of these scenarios, the graphs described in Section 2.3 MUST be generated. In addition, other metrics such as end-to-end latency, jitter, flow completion time, QoE MUST be generated.

3.2.2. Burst absorption

Packet arrivals can be bursty due to various reasons. Dropping one or more packets from a burst may result in performance penalties for the corresponding flows since the dropped packets have to be retransmitted. Performance penalties may turn into unmet SLAs and be disincentives to AQM adoption. Therefore, an AQM scheme SHOULD be designed to accommodate transient bursts. AQM mechanisms do not present the same tolerance to bursts of packets arriving in the buffer: this tolerance MUST be quantified.

Note that accommodating bursts translates to higher queue length and queuing delay. Naturally, it is important that the AQM scheme brings bursty traffic under control quickly. On the other hand, spiking packet drops in order to bring packet bursts quickly under control could result in multiple drops per flow and severely impact transport and application performance. Therefore, an AQM scheme SHOULD bring bursts under control by balancing both aspects -- (1) queuing delay spikes are minimized and (2) performance penalties for ongoing flows in terms of packet drops are minimized.

AQM schemes maintain short queues so that the remaining space in the queue is available for bursts of packets. The tolerance to bursts of packets depends on the number of packets in the queue, which is directly linked to the AQM policy. Moreover, an AQM scheme may implement a feature controlling the maximum size of accepted bursts, which may depend on the number of packets in the buffer or on the currently estimated queuing delay. Also, the impact of the buffer size on the burst allowance MAY be evaluated, as detailed in Section 3.3.4.

3.2.2.1. Topology Description

The topology is presented in Figure 1. For this scenario, the capacities of the links MUST be set to 10Mbps and the RTTs to 100ms.

3.2.2.2. Generic bursty traffic

The following traffic should be considered from sender A to receiver B:

  • One Constant bit rate UDP traffic: 1Mbps UDP flow;
  • One TCP transfer: repeating 5MB file transmission;
  • Burst of packets: size of the burst from 5 to MAX_BUFFER_SIZE packets.

For each of these scenarios, the graphs described in Section 2.3 MUST be generated. In addition, other metrics such as end-to-end latency, jitter, flow completion time, QoE MUST be generated. Moreover, the tester MUST provide the flow completion time, detailed in Section 2.2.1, for each burst size considered.

3.2.2.3. Realistic bursty traffic

The following bursty traffic SHOULD be considered:

  • IW10: TCP transfer with initial congestion window set to 10 (repeating 5MB file transmission);
  • Bursty video frames (H.264/AVC) (60fps);
  • HTTP web traffic (repeated download of 700kB);
  • Constant bit rate UDP traffic (1Mbps UDP flow).

Figure 3 presents the various cases for the traffic that MUST be generated between sender A and receiver B.

    +----+-----------------------------+
    |Case|       Number of flows       |
    |    +-----+-----+-----+-----------+
    |    |Video| Web | CBR |FTP (IW10) |
    +----+-----+-----+-----+-----------+
    | A  |  0  |  1  |  1  |     0     |
    | B  |  0  |  1  |  1  |     1     |
    | C  |  1  |  1  |  1  |     0     |
    | D  |  1  |  1  |  1  |     0     |
    | E  |  1  |  1  |  1  |     1     |
    +----+-----+-----+-----+-----------+
		

Figure 3: Bursty traffic scenarios

For each of these scenarios, the graphs described in Section 2.3 MUST be generated. In addition, other metrics such as end-to-end latency, jitter, flow completion time, QoE MUST be generated.

3.2.3. Inter-RTT and intra-protocol fairness

TCP dynamics are a driving force for AQM design. It is therefore important to evaluate an AQM scheme against a set of RTTs (e.g., from 5 ms to 200 ms). Also, asymmetry in terms of RTT between various paths SHOULD be considered so that the fairness between the flows can be discussed, as one flow may react faster to congestion than another. The impact of the scheduling and the AQM introduced on this lack of fairness SHOULD be evaluated.

Moreover, introducing an AQM and/or scheduling schemes may result in the absence of fairness between the flows, even when the RTTs are identical. This potential lack of fairness SHOULD be evaluated.

The topology that SHOULD be used is the one of Figure 1:

  • to evaluate the inter-RTT fairness, for each run, two flows (Flow1.1 and Flow1.2) SHOULD be introduced and the set of RTTs SHOULD be: RTTA1.1 in [5ms;100ms] and RTTA1.2 in [100ms;200ms].
  • to evaluate the impact of the RTT value on the AQM performance and the intra-protocol fairness, for each run, two flows (Flow1.1 and Flow1.2) SHOULD be introduced and the set of RTTs SHOULD be: RTTA1.1 in [5ms;200ms] and RTTA1.2 in [5ms;200ms], with (RTTA1.1)=(RTTA1.2).

These flows MUST have the same congestion control algorithm.

The output that MUST be measured is the ratio between the average goodput values of the two flows (Section 2.2.3) and the packet drop rate for each flow (Section 2.2.2).
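The two outputs above can be derived from per-flow counters as in the following sketch. The function names and the counter values in the usage note are assumptions for illustration; how the counters are collected depends on the evaluation toolset.

```python
def goodput_bps(delivered_payload_bytes, duration_s):
    """Average goodput of one flow in bits per second."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return delivered_payload_bytes * 8 / duration_s


def goodput_ratio(bytes_flow1, bytes_flow2, duration_s):
    """Ratio between the average goodput of Flow1.1 and Flow1.2."""
    g2 = goodput_bps(bytes_flow2, duration_s)
    if g2 == 0:
        raise ValueError("the second flow delivered no data")
    return goodput_bps(bytes_flow1, duration_s) / g2


def drop_rate(dropped_packets, sent_packets):
    """Fraction of a flow's packets that were dropped."""
    return dropped_packets / sent_packets if sent_packets else 0.0
```

For instance, if Flow1.1 delivers 37500000 bytes and Flow1.2 delivers 25000000 bytes over a 50s run, the average goodputs are 6 Mbps and 4 Mbps and the ratio is 1.5. A ratio close to 1 indicates that the two flows obtain similar shares of the bottleneck.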

3.2.4. Fluctuating network conditions

Network devices experience varying operating conditions depending on factors such as time of day, deployment scenario etc. For example:

  • Traffic and congestion levels are higher during peak hours than off-peak hours.
  • A queue's draining rate could vary depending on other queues. A low load on high priority queue implies higher draining rate for lower priority queues.

If the target context is a stable environment, the tester MUST illustrate the stability of the AQM scheme over time.

In contexts where the network conditions can vary over time, an AQM scheme MUST be robust enough to control network latencies under fluctuating network conditions, without the need for operator tuning of AQM parameters. This document will evaluate AQM proposals under varying congestion levels and varying draining rates.

3.2.4.1. Topology Description

The topology is presented in Figure 1. For this scenario, the capacities of the links MUST be set to 10Mbps and the RTTs to 100ms.

3.2.4.2. Mild Congestion

This scenario helps evaluate how an AQM scheme adapts to a light load of incoming traffic resulting in mild congestion -- packet drop rates less than 1%. The scenario consists of 4-5 TCP New Reno flows between sender A and receiver B. All TCP flows start at random times during the initial second. Each TCP flow transfers a large file for a period of 50s.

For this scenario, the graphs described in Section 2.3 MUST be generated.

3.2.4.3. Medium Congestion

This scenario helps evaluate how an AQM scheme adapts to incoming traffic resulting in medium congestion -- packet drop rates between 1%-3%. The scenario consists of 10-20 TCP New Reno flows between sender A and receiver B. All TCP flows start at random times during the initial second. Each TCP flow transfers a large file for a period of 50s.

For this scenario, the graphs described in Section 2.3 MUST be generated.

3.2.4.4. Heavy Congestion

This scenario helps evaluate how an AQM scheme adapts to incoming traffic resulting in heavy congestion -- packet drop rates between 5%-10%. The scenario consists of 30-40 TCP New Reno flows between sender A and receiver B. All TCP flows start at random times during the initial second. Each TCP flow transfers a large file for a period of 50s.

For this scenario, the graphs described in Section 2.3 MUST be generated.
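When running the mild, medium and heavy congestion scenarios, it can be useful to check that the measured packet drop rate actually falls in the intended band. The sketch below encodes the three bands given in this section; the function name, the treatment of band boundaries as inclusive, and the "unclassified" label for rates outside the three bands (for example between 3% and 5%) are assumptions.

```python
def congestion_level(drop_rate):
    """Map a packet drop rate (a fraction between 0 and 1) to a band.

    Bands follow this section: mild below 1%, medium from 1% to 3%,
    heavy from 5% to 10%. Rates outside these bands are reported as
    "unclassified" because this document does not name them.
    """
    if not 0.0 <= drop_rate <= 1.0:
        raise ValueError("drop rate must be a fraction between 0 and 1")
    if drop_rate < 0.01:
        return "mild"
    if drop_rate <= 0.03:
        return "medium"
    if 0.05 <= drop_rate <= 0.10:
        return "heavy"
    return "unclassified"
```

A run configured as the heavy congestion scenario that yields, say, a 2% drop rate would be reported as "medium", which suggests that the number of flows needs to be adjusted before the results are interpreted.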

3.2.4.5. Varying Available Bandwidth

This scenario helps evaluate how an AQM scheme adapts to varying available bandwidth on the outgoing link. To simulate varying draining rates, the bottleneck bandwidth between nodes 'Central L' and 'Central R' varies over the course of the experiment as follows -- 100Mbps during 0-50s, 10Mbps during 50-100s, 100Mbps during 100-150s. The scenario consists of 50 TCP New Reno flows between sender A and receiver B. All TCP flows start at random times during the initial second. Each TCP flow transfers a large file for a period of 150s. In order to better assess the impact of draining rates on the AQM behavior, the tester MUST compare the performance of the AQM scheme with that of drop-tail.

For this scenario, the graphs described in Section 2.3 MUST be generated. Moreover, one graph MUST be generated for each of the three phases previously detailed.
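The bandwidth schedule and the per-phase graphs can be sketched as follows. This is an illustrative helper under stated assumptions: the names PHASES, bottleneck_rate_bps and split_by_phase are hypothetical, and each phase is treated as a half-open interval so that the instants 50s and 100s belong to the later phase.

```python
# (start_s, end_s, rate_bps) for the three phases given in the text.
PHASES = [
    (0.0, 50.0, 100e6),
    (50.0, 100.0, 10e6),
    (100.0, 150.0, 100e6),
]


def phase_index(t):
    """Return the index of the phase containing time t (seconds)."""
    for i, (start, end, _rate) in enumerate(PHASES):
        if start <= t < end:  # half-open intervals: an assumption
            return i
    raise ValueError(f"time {t} is outside the 0-150s experiment")


def bottleneck_rate_bps(t):
    """Bottleneck rate between Central L and Central R at time t."""
    return PHASES[phase_index(t)][2]


def split_by_phase(samples):
    """Group (time, value) samples by phase, one bucket per graph."""
    buckets = [[] for _ in PHASES]
    for t, value in samples:
        buckets[phase_index(t)].append((t, value))
    return buckets
```

Splitting the samples in this way yields one data set per phase, from which the three per-phase graphs can be drawn.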

3.3. Diverse Network Environments

This section presents scenarios which are related to specific network environments. These are classical network environments which MAY be considered to evaluate the performance of the AQM under test.

Each subsection presents a generic scenario and a use case. The scenarios are classified according to the relation between the delay (high, medium or low) and the capacity (high, medium or low). One scenario additionally covers the impact of buffer size. These guidelines retain only the combinations that are of interest for evaluating the performance of AQM proposals. On top of these abstract scenarios, the guidelines present a use case for each selected scenario by proposing a carefully dimensioned topology.

3.3.1. Medium bandwidth, medium delay: Wi-Fi

This scenario is introduced to evaluate AQM proposals in a generic context, where the relation between the delay and the bandwidth is not specific to a particular environment, and to assess how AQM proposals can control latency in this context.

We refer to Figure 1 to detail the topology:

  • Sender A to Central L: capacity=100Mbps, RTT=10ms;
  • Central L to Central R: capacity=20Mbps, RTT=10ms;
  • Central R to Receiver B: capacity=100Mbps, RTT=10ms;
  • The tester MAY include a packet loss rate of 1 to 3% on the Wi-Fi link (between Central L and Central R).

The traffic that MUST be generated between the sender A and the receiver B is:

  • Five repeating TCP transfers: repeating 5MB file transmission;
  • One continuous TCP transfer: continuous file transmission;
  • Four HTTP web traffic flows (repeated download of 700kB).

For this scenario, the graphs described in Section 2.3 MUST be generated.

3.3.2. Low bandwidth, high delay: Rural broadband networks and satellite links

In the context of low bandwidth and high delay, the burst absorption capacity of an AQM is seriously challenged. Indeed, due to the large bandwidth-delay product, the sending buffer should be large, resulting in potentially large congestion windows and large burst arrivals at the gateways. The tolerance to large incoming bursts is a key feature of an AQM introduced in this context: this is the reason why this challenging context is detailed in these guidelines.

We refer to Figure 1 to detail the topology:

  • Sender A to Central L: capacity=10Mbps, RTT=10ms;
  • Central L to Central R: capacity=1Mbps, RTT=200ms;
  • Central R to Receiver B: capacity=10Mbps, RTT=10ms;

The traffic that MUST be generated between the sender A and the receiver B is:

  • Five repeating TCP transfers: repeating 5MB file transmission;
  • One continuous TCP transfer: continuous file transmission;
  • Four HTTP web traffic flows (repeated download of 700kB).

For this scenario, the graphs described in Section 2.3 MUST be generated.

3.3.3. High bandwidth, low delay: data centers

In the context of high bandwidth and low delay, the specific characteristics require updated thresholds, which determine the behavior of an AQM. As a result, the auto-tuning of an AQM is seriously challenged. This is the reason why this challenging context is detailed in these guidelines.

We refer to Figure 1 to detail the topology:

  • Sender A to Central L: capacity=1Gbps, RTT=0.1ms;
  • Central L to Central R: capacity=1Gbps, RTT=0.1ms;
  • Central R to Receiver B: capacity=1Gbps, RTT=0.1ms;

The traffic that MUST be generated between the sender A and the receiver B is:

  • Four repeating TCP transfers: repeating 5MB file transmission;

For this scenario, the graphs described in Section 2.3 MUST be generated.
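To give a feel for how different the three environments of Sections 3.3.1 to 3.3.3 are, the sketch below derives the bottleneck capacity, the end-to-end RTT and the ideal time to transfer one 5MB file for each topology. It rests on assumptions that are not stated in the text: the per-segment RTTs are taken to add up along the path, 5MB is read as 5,000,000 bytes, and protocol overhead and slow start are ignored. The names Segment, ENVIRONMENTS and the helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    capacity_bps: float
    rtt_s: float


# Sender A -> Central L -> Central R -> Receiver B, per environment.
ENVIRONMENTS = {
    "wifi": [Segment(100e6, 0.010), Segment(20e6, 0.010), Segment(100e6, 0.010)],
    "satellite": [Segment(10e6, 0.010), Segment(1e6, 0.200), Segment(10e6, 0.010)],
    "datacenter": [Segment(1e9, 0.0001)] * 3,
}

FILE_BYTES = 5_000_000  # assumption: 5MB read as 5 * 10^6 bytes


def bottleneck_bps(path):
    """The slowest segment bounds the end-to-end rate."""
    return min(seg.capacity_bps for seg in path)


def path_rtt_s(path):
    """End-to-end RTT, assuming per-segment RTTs are additive."""
    return sum(seg.rtt_s for seg in path)


def ideal_transfer_s(path, size_bytes=FILE_BYTES):
    """Lower bound on transfer time: serialization at the bottleneck only."""
    return size_bytes * 8 / bottleneck_bps(path)
```

Under these assumptions a single 5MB transfer needs at least about 2s on the Wi-Fi topology, 40s on the satellite topology and 0.04s in the data center, which indicates how often the repeating transfers restart in each environment.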

3.3.4. Low and high buffers

The size of the buffer affects an AQM's performance, whether its algorithm is based on the queue length or the queueing delay. The tester MAY consider cases where the buffer is small (i.e., 1/10 BDP) and where the buffer is large (i.e., 10 BDP).

We refer to Figure 1 to detail the topology:

  • Sender A to Central L: capacity=100Mbps, RTT=10ms;
  • Central L to Central R: capacity=20Mbps, RTT=10ms;
  • Central R to Receiver B: capacity=100Mbps, RTT=10ms;

The traffic that MUST be generated between the sender A and the receiver B is:

  • Five repeating TCP transfers: repeating 5MB file transmission;
  • One continuous TCP transfer: continuous file transmission;
  • Four HTTP web traffic flows (repeated download of 700kB).

For this scenario, the graphs described in Section 2.3 MUST be generated. Moreover, these guidelines advise plotting the characteristics of the queue (such as queue length or queuing delay) as explained in Section 2.1.2.
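The small and large buffer sizes of Section 3.3.4 can be derived from the bandwidth-delay product (BDP) as sketched below. This is a worked example under assumptions that the text does not fix: the BDP is computed from the 20Mbps bottleneck and an end-to-end RTT of 30ms (the three 10ms segments added together), and buffers are counted in 1500-byte packets. The function names are hypothetical. Exact rational arithmetic is used so that rounding up does not depend on floating-point error.

```python
import math
from fractions import Fraction


def bdp_bytes(capacity_bps, rtt_ms):
    """Bandwidth-delay product in bytes, as an exact fraction."""
    return Fraction(capacity_bps * rtt_ms, 8 * 1000)


def buffer_packets(capacity_bps, rtt_ms, bdp_multiple, packet_bytes=1500):
    """Buffer size in packets for a given multiple of the BDP.

    packet_bytes=1500 is an assumption; at least one packet is always
    buffered.
    """
    size = bdp_bytes(capacity_bps, rtt_ms) * Fraction(bdp_multiple) / packet_bytes
    return max(1, math.ceil(size))


# Section 3.3.4 topology, assuming a 30ms end-to-end RTT.
SMALL_BUFFER = buffer_packets(20_000_000, 30, Fraction(1, 10))
LARGE_BUFFER = buffer_packets(20_000_000, 30, 10)
```

With these assumptions the BDP is 75,000 bytes, so the small buffer holds 5 packets and the large buffer holds 500 packets.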

4. Deployment

This section details deployment issues that MUST be discussed, such as stability, implementation cost, implementation feasibility, control knobs, etc.

4.1. Operator control knobs and auto-tuning

One of the biggest hurdles for RED deployment was/is its parameter sensitivity to operating conditions -- how difficult it is to tune important RED parameters for a deployment in order to get maximum benefit from the RED implementation. Fluctuating congestion levels and network conditions add to the complexity. Incorrect parameter values lead to poor performance. Naturally, various network operators felt it unwise to turn on the AQM (RED) implementation.

Any AQM scheme is likely to have parameters whose values affect the AQM's control law and behavior. Exposing all these parameters as control knobs to a network operator (or user) can easily result in an unsafe AQM deployment. Unexpected AQM behavior ensues when parameter values are not set properly. A minimal number of control knobs minimizes the number of ways a possibly naive user can break the AQM system. Fewer control knobs make the AQM scheme more user-friendly and easier to deploy and debug.

We recommend that an AQM scheme SHOULD minimize the control knobs exposed for operator tuning. An AQM scheme SHOULD expose only those knobs that control the "larger" AQM behavior such as queue delay threshold, queue length threshold, etc.

Additionally, an AQM scheme's safety is directly related to its stability under varying operating conditions such as varying traffic profiles and fluctuating network conditions, as described in Section 3.2.4 and in Section 3.2.1. Operating conditions vary often; hence the AQM MUST remain stable under these conditions without the need for additional external tuning. If AQM parameters require tuning under these conditions, then the AQM MUST self-adapt the necessary parameter values by employing auto-tuning techniques.

4.2. Parameter sensitivity and stability analysis

An AQM scheme's control law is the primary means by which the AQM controls queuing delay. Hence understanding the AQM control law is critical to understanding AQM behavior. The AQM's control law may include several input parameters whose values affect the AQM output behavior and stability. Additionally, AQM schemes may auto-tune parameter values in order to maintain stability under different network conditions (such as different congestion levels, draining rates or network environments). The stability of these auto-tuning techniques is also important to understand.

AQM proposals SHOULD provide background material showing control theoretic analysis of the AQM control law and the input parameter space within which the control law operates as expected. For parameters that are auto-tuned, the material SHOULD include stability analysis of the auto-tuning mechanism(s) as well. Such analysis helps the WG understand AQM control law better and the network conditions/deployments under which the AQM is stable.

The impact of every externally tuned parameter MUST be discussed. As an example, if an AQM proposal needs external tuning to work in the different network environments presented in Section 3, these external modifications MUST be clearly documented so that deployment issues can be assessed. Also, the frequency at which some parameters are re-configured MUST be evaluated, as it may impact the capacity of the AQM to absorb incoming bursts of packets.

4.3. Implementation cost

An AQM's successful deployment is directly related to its ease of implementation. Network platforms may need hardware or software implementations of the AQM. Depending on a platform's capabilities and limitations, the platform may or may not be able to implement some or all parts of the AQM logic.

AQM proposals SHOULD provide pseudo-code for the complete AQM scheme, highlighting generic implementation-specific aspects of the scheme such as "drop-tail" vs. "drop-head", inputs from the platform (current queueing delay, queue length), computations involved, need for timers, etc. This helps identify costs associated with implementing the AQM on a particular hardware or software platform. Also, it helps the WG understand which kinds of platforms can easily support the AQM and which cannot.

AQM proposals SHOULD highlight parts of AQM logic that are platform dependent and discuss if and how AQM behavior could be impacted by the platform. For example, a queue-delay based AQM scheme requires the current queuing delay as input from the platform. If the platform already maintains this value, then it is trivial to implement the AQM logic on the platform. On the other hand, if the platform provides only indirect means to estimate queuing delay (e.g., timestamps, dequeue rate, etc.), then the AQM behavior is sensitive to how good the queuing delay estimate turns out to be on that platform. Highlighting the AQM's sensitivity to the queuing delay estimate helps implementers identify optimal means of implementing the AQM on a platform.
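The two indirect means mentioned above can be sketched as follows. This is a minimal illustration, not the method of any particular AQM proposal: the function names are hypothetical, timestamps are assumed to come from one monotonic clock, and the dequeue rate is assumed to be supplied by the platform.

```python
def sojourn_delay_s(enqueue_ts_s, dequeue_ts_s):
    """Timestamp-based estimate: time a packet actually spent in the queue.

    Requires the platform to stamp each packet on enqueue.
    """
    return dequeue_ts_s - enqueue_ts_s


def rate_based_delay_s(queue_bytes, dequeue_rate_bytes_per_s):
    """Rate-based estimate: time needed to drain the current backlog.

    Only as accurate as the dequeue-rate measurement it is given.
    """
    if dequeue_rate_bytes_per_s <= 0:
        raise ValueError("dequeue rate must be positive")
    return queue_bytes / dequeue_rate_bytes_per_s


def relative_error(estimate, reference):
    """Relative deviation of one delay estimate from a reference value."""
    return abs(estimate - reference) / reference
```

For instance, if the platform overestimates the dequeue rate by a factor of two, rate_based_delay_s reports half the real queuing delay. This is the kind of sensitivity that an AQM proposal is asked to highlight.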

4.4. Interaction with packet scheduling

On top of the introduction of an AQM scheme, a router may schedule the transmission of packets in a specific manner by introducing a scheduling scheme. This algorithm may create sub-queues and integrate an AQM scheme on each of these sub-queues. Another scheduling policy may modify the way packets are sequenced, modifying the timestamp of each packet.

Both schedulers and AQMs can be introduced when packets are either enqueued or dequeued. If both the scheduler and the AQM are applied when packets are enqueued, their interaction should not be a major issue. However, if one is applied when packets are enqueued and the other when they are dequeued, there may be destructive interactions.

The scheduling and the AQM schemes jointly impact the end-to-end performance. During the evaluation process of an AQM proposal, the tester MUST discuss the feasibility of adding scheduling on top of its algorithm. This discussion MAY detail whether the AQM is applied when packets are enqueued or dequeued.

4.5. ECN behavior

Apart from packet drops, Explicit Congestion Notification (ECN) is an alternative means to signal data senders about network congestion. A network device explicitly marks specific bit(s) in packet headers to convey congestion information to data senders. A data sender implementing ECN treats the marked packet as if it were a packet drop and reacts the same way as it would to a packet drop. Note that ECN minimizes performance penalties, since packets do not have to be retransmitted.

An AQM scheme SHOULD support ECN and SHOULD leverage ECN as an initial means to control queuing delay before resorting to packet drops. An AQM scheme SHOULD self-adapt and remain stable even with faulty and/or unresponsive ECN implementations en-route.
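The "mark first, drop as a fallback" behavior can be sketched as follows. The ECN codepoint values are those of the two-bit ECN field in the IP header; everything else is an assumption made for illustration. In particular, the function congestion_action and its mark_ceiling parameter are hypothetical: the ceiling stands for one possible safeguard, in which the AQM reverts to dropping once its congestion-signal probability grows beyond a bound, as may happen when senders do not respond to marks.

```python
# Two-bit ECN field codepoints.
NOT_ECT = 0b00  # transport is not ECN-capable
ECT_1 = 0b01    # ECN-capable transport, codepoint ECT(1)
ECT_0 = 0b10    # ECN-capable transport, codepoint ECT(0)
CE = 0b11       # congestion experienced


def congestion_action(ecn_field, signal_prob, mark_ceiling=0.1):
    """Decide how to signal congestion on a packet the AQM has selected.

    Returns ("mark", CE) or ("drop", None). mark_ceiling is a hypothetical
    safeguard: above it, even ECN-capable packets are dropped.
    """
    ecn_capable = ecn_field in (ECT_0, ECT_1, CE)
    if ecn_capable and signal_prob <= mark_ceiling:
        return ("mark", CE)
    return ("drop", None)
```

A packet from a non-ECN sender is always dropped, while an ECN-capable packet is marked as long as the signalling probability stays at or below the chosen ceiling.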

4.6. Packet sizes and congestion notification

An AQM scheme may consider packet sizes while generating congestion signals. [I-D.ietf-tsvwg-byte-pkt-congest] discusses the motivations behind the same. For example, control packets such as DNS requests/responses and TCP SYNs/ACKs are small, and their loss can severely impact application performance. An AQM scheme may therefore be biased towards small packets by dropping them with a smaller probability than larger packets. However, such an AQM scheme is unfair to data senders generating larger packets. Data senders, malicious or otherwise, are motivated to take advantage of the AQM scheme by transmitting smaller packets, which could result in unsafe deployments and unhealthy transport and/or application designs.

An AQM scheme SHOULD adhere to recommendations outlined in [I-D.ietf-tsvwg-byte-pkt-congest], and SHOULD NOT provide undue advantage to flows with smaller packets.
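The bias described above can be made concrete with a small numerical sketch. The size-scaled rule below is only an example of the kind of scheme that favors small packets; it is not taken from any specific AQM proposal, and the function names and the 1500-byte maximum packet size are assumptions.

```python
def packet_mode_drop_prob(base_prob, size_bytes):
    """Size-independent rule: every packet sees the same drop probability."""
    return base_prob


def size_scaled_drop_prob(base_prob, size_bytes, max_size_bytes=1500):
    """Size-scaled rule: drop probability grows with packet size.

    Shown only to illustrate the bias towards small packets.
    """
    scale = min(size_bytes, max_size_bytes) / max_size_bytes
    return base_prob * scale


def bias_ratio(small_bytes, large_bytes, base_prob=0.1):
    """How many times less likely a small packet is to be dropped."""
    return (size_scaled_drop_prob(base_prob, large_bytes)
            / size_scaled_drop_prob(base_prob, small_bytes))
```

Under the size-scaled rule a 64-byte packet is dropped roughly 23 times less often than a 1500-byte packet, which is the incentive to send small packets that the recommendation above seeks to avoid.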

5. Comparing AQMs

This memo recognizes that the guidelines mentioned above may be used for comparing AQMs. This memo recommends that AQM schemes MUST be compared against both performance (Section 3) and deployment (Section 4) categories. In addition, this section details how best to achieve a fair comparison of AQM schemes by avoiding certain pitfalls.

5.1. Performance comparison

AQM schemes MUST be compared against all the generic scenarios discussed in Section 3.2. AQM schemes MAY be compared for specific network environments such as data centers, home networks, etc. For a particular network environment, AQM schemes MUST be compared against all the scenarios listed for that network environment in Section 3.3. For each evaluation scenario, the schemes MUST be compared against the metrics discussed under that scenario. Moreover, if an AQM scheme's parameter(s) were externally tuned for optimization or other purposes, these values MUST be disclosed.

Note that AQM schemes belong to different varieties such as queue-length based schemes (e.g., RED) or queue-delay based schemes (e.g., CoDel, PIE). Also, AQM schemes expose different control knobs associated with different semantics. For example, both PIE and CoDel are queue-delay based schemes and each exposes a knob to control the queueing delay -- PIE's "queueing delay reference" vs. CoDel's "queueing delay target" -- but the two knobs have different semantics, resulting in different control points. Such differences in AQM schemes can be easily overlooked while making comparisons.

This document recommends the following procedures for a fair performance comparison of two AQM schemes:

  1. Comparable control parameters and comparable input values: Carefully identify the set of parameters that control similar behavior between the two AQM schemes and ensure these parameters have comparable input values. For example, while comparing how well a queue-length based AQM X controls queueing delay vs. queue-delay based AQM Y, identify the two schemes' parameters that control queue delay and ensure that their input values are comparable. Similarly, to compare two AQMs on how well they accommodate bursts, identify burst-related control parameters and ensure they are configured with similar values.
  2. Compare over a range of input configurations: There could be situations when the set of control parameters that affect a specific behavior have different semantics between the two AQM schemes. As mentioned above, PIE's knob to control queue delay has different semantics from CoDel's. In such situations, the schemes MUST be compared over a range of input configurations. For example, compare PIE vs. CoDel over the range of delay input configurations -- 5ms, 10ms, 15ms etc.
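The second procedure amounts to a parameter sweep, which could be organized as in the sketch below. The scheme names and the 5ms, 10ms and 15ms values come from the example above; the function names and the run_experiment callback are hypothetical placeholders for whatever simulator or testbed the tester uses.

```python
from itertools import product

SCHEMES = ("PIE", "CoDel")
DELAY_INPUTS_MS = (5, 10, 15)


def comparison_matrix(schemes=SCHEMES, delay_inputs_ms=DELAY_INPUTS_MS):
    """Enumerate every (scheme, delay input) pair to be evaluated."""
    return [
        {"scheme": scheme, "delay_input_ms": delay}
        for scheme, delay in product(schemes, delay_inputs_ms)
    ]


def run_all(run_experiment, configs):
    """Run a tester-supplied experiment once per configuration.

    run_experiment is a hypothetical callable taking scheme and
    delay_input_ms keyword arguments and returning measured metrics.
    """
    return [(cfg, run_experiment(**cfg)) for cfg in configs]
```

Each scheme is then judged on its results across the whole range rather than at a single operating point, which reduces the effect of the differing knob semantics.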

5.2. Deployment comparison

AQMs MUST be compared against the deployment criteria discussed in Section 4.

6. Acknowledgements

This work has been partially supported by the European Community under its Seventh Framework Programme through the Reducing Internet Transport Latency (RITE) project (ICT-317700).

7. Contributors

Many thanks to Gorry Fairhurst, Amadou B. Bagayoko, Chamil Kulatunga, Michael Welzl, Fred Baker, Rong Pan and David Collier-Brown for detailed and wise feedback on this document.

8. IANA Considerations

This memo includes no request to IANA.

9. Security Considerations

All drafts are required to have a security considerations section.

10. References

10.1. Normative References

[I-D.ietf-aqm-recommendation] Baker, F. and G. Fairhurst, "IETF Recommendations Regarding Active Queue Management", Internet-Draft draft-ietf-aqm-recommendation-01, January 2014.
[I-D.ietf-tsvwg-byte-pkt-congest] Briscoe, B. and J. Manner, "Byte and Packet Congestion Notification", Internet-Draft draft-ietf-tsvwg-byte-pkt-congest-12, November 2013.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", RFC 2119, 1997.
[RFC5136] Chimento, P. and J. Ishac, "Defining Network Capacity", RFC 5136, 2008.

10.2. Informative References

[RFC5348] Floyd, S., Handley, M., Padhye, J. and J. Widmer, "TCP Friendly Rate Control (TFRC): Protocol Specification", RFC 5348, September 2008.
[TCPEVAL2008] Andrew, L., Marcondes, C., Floyd, S., Dunn, L., Guillier, R., Gang, W., Eggert, L., Ha, S. and I. Rhee, "Towards a common TCP evaluation suite", PFLDnet 6th, 2008.
[BB2012] CACM Staff, "BufferBloat: what's wrong with the internet?", Commun. ACM vol. 55, 2012.
[DOCSIS2013] White, G. and D. Rice, "Active Queue Management Algorithms for DOCSIS 3.0", Technical report - Cable Television Laboratories, 2013.

Authors' Addresses

Nicolas Kuhn (editor)
Telecom Bretagne
2 rue de la Chataigneraie
Cesson-Sevigne, 35510
France
Phone: +33 2 99 12 70 46
EMail: nicolas.kuhn@telecom-bretagne.eu

Preethi Natarajan (editor)
Cisco Systems
510 McCarthy Blvd
Milpitas, California
United States
EMail: prenatar@cisco.com

David Ros
Simula Research Laboratory
EMail: dros@simula.no

Naeem Khademi
University of Oslo
Department of Informatics, PO Box 1080 Blindern
N-0316 Oslo, Norway
Phone: +47 2285 24 93
EMail: naeemk@ifi.uio.no