Internet DRAFT - draft-bagnulo-congress-cci
draft-bagnulo-congress-cci
Network Working Group M. Bagnulo
Internet-Draft UC3M
Intended status: Informational 6 August 2023
Expires: 7 February 2024
Congestion Control Invariants
draft-bagnulo-congress-cci-01
Abstract
This document initiates the discussion about Congestion Control
Invariants, that is, mechanisms that several CCAs implement and that
would benefit from a common specification for all CCAs to improve
their interoperability
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 7 February 2024.
Copyright Notice
Copyright (c) 2023 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Bagnulo Expires 7 February 2024 [Page 1]
Internet-Draft CCI August 2023
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1. Periodic Slow Down Invariant . . . . . . . . . . . . . . 3
1.1.1. Motivation . . . . . . . . . . . . . . . . . . . . . 3
1.1.2. Proposed invariant . . . . . . . . . . . . . . . . . 5
1.2. Other potential invariants . . . . . . . . . . . . . . . 6
2. Security Considerations . . . . . . . . . . . . . . . . . . . 6
3. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 6
4. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 6
5. Informative References . . . . . . . . . . . . . . . . . . . 6
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 7
1. Introduction
Over the last decade, we have witnessed a refreshing spring in
congestion control research, resulting in a number of novel
congestion control algorithms (CCAs). Indeed, in addition to the
traditional congestion control algorithms such as New Reno and Cubic,
we can now observe in that at least, the following algorithms are
being used in parts of the Internet:
BBR (Bottleneck Bandwidth and Round-trip propagation time)
[I-D.cardwell-iccrg-bbr-congestion-control] is a model-based
congestion control algorithm that attempts to improve the
performance of Internet communications by reducing the delay (when
bottleneck buffers are large) and increase the throughput (when
bottleneck buffers are small).
LEDBAT/LEDBAT++ (Low Extra Delay Background Transport)
)[I-D.irtf-iccrg-ledbat-plus-plus] is a CCA that implements a
less-than-best-effort (LBE) traffic class. When LEDBAT()++)
traffic shares a bottleneck with one or more TCP connections using
Cubic or other loss-based congestion control algorithms, it
reduces its sending rate earlier and more aggressively than
competing flows, allowing Cubic traffic to use more of the
available capacity.
DCTCP (Data-Center TCP) [I-D.ietf-tcpm-dctcp] is a CCA developed
by Microsoft to reduce the latency for data center communications.
DCTCP relies on AccECN to quantify the amount of inflight traffic
that is experiencing congestion and reduced the sending rate
accordingly. This allows DCTCP to operate with small queues,
oscillating around the optimal operation point. while DCTP was
originally designed for its use within data center networks, the
L4S (Low Latency, Low Loss, and Scalable Throughput) architecture
extends the use of DCTCP to the Internet.
Bagnulo Expires 7 February 2024 [Page 2]
Internet-Draft CCI August 2023
MPTCP (MultiPath TCP) [RFC8684] is an extension to TCP to support
multiple concurrent paths in a single TCP connection. MPTCP
includes a novel CCA that allows the coupling of the CCAs used in
the different paths [RFC6356]. Through the coupled CCA, MPTCP
manages to offload traffic from paths that are experiencing
congestion towards path that are less congested.
The adoption of the aforementioned CCA has not been uneventful. The
roll-outs of some CCA have been problematic
[_10.1145_3355369.3355604] than others. Specifically, the wide
deployment of BBR(v1) attracted a fair amount of attention due to the
(un)fairness issues that arise when BBR(v1) competes against legacy
CCAs such as Cubic and New Reno . As it has been repeatedly reported,
BBR(v1) does not react to packet losses, which results in large
packet loss rate for itself and other competing flows using
alternative CCAs. Since other CCAs (such as Cubic) do react to
packet losses, this BBR(v1) behaviour resulted in BBR(v1) seizing
more than its fair share of capacity when competing with CCAs that do
react against packet losses. these fairness issues are now being
corrected with the new version of BBR (BBRv2) and also triggered the
community to re-think the fairness requirements imposed to novel CCAs
in order to be deployed in the public Internet.
In this note, we focus in a different aspect of the interaction
between different CCAs. Specifically, we posit that several of these
CCAs implement similar functionalities in different ways which pose
challenges to the correct interaction between these CCAs. The goal
of this note is to initiate a line of research to identify potential
invariants in CCAs, meaning, mechanisms that several CCAs implement
and that would benefit from a common specification for all CCAs to
improve their interoperability. Such standardised mechanisms could
serve as building blocks for novel CCAs, so that when a new CCA needs
to implement one of such functions, it re-uses the specified building
block, rather than re-inventing it. To bootstrap the proposed work,
we motivate and propose a first Congestion Control algorithm
Invariant (CC), namely, periodic slow downs.
1.1. Periodic Slow Down Invariant
1.1.1. Motivation
Both BBR and LEDBAT++ estimate the base RTT as part of their
operations. The base RTT is the RTT in the absence of queueing
delay, which means it is the minimum RTT observable in a given path.
LEDBAT++ uses the base RTT to determine the current queuing delay,
which is computed as the difference between the current RTT and the
base RTT. BBR uses the base RTT to determine the Bandwidth Delay
Product (BDP) which affects the flight-size a flow is able to inject
Bagnulo Expires 7 February 2024 [Page 3]
Internet-Draft CCI August 2023
in the network.
In order to have visibility of the base RTT, both protocols perform
periodic slow downs as an attempt to empty the queues and expose the
base RTT. Because there may be multiple flows contributing to the
queue, both protocols include some form of synchronisation logic,
that allows multiple competing flows to slow down at the same time,
increasing the chances to empty the queue and expose the base RTT.
While both protocols implement the periodic slow down, the actual
implementation details differ.
In the case of LEDBAT++, it performs a slow-start increase at the
beginning of the connection. Then, LEDBAT++ executes periodic slow-
downs to obtain more accurate measurements of the base RTT.
Specifically LEDBAT++ sets the Congestion Window (CW) to 2 MSS during
2 RTTs and then performs a slow-start increase back to the value that
it was using before the periodic decrease. An initial slow-down is
performed 2 RTTs after exiting the initial slow-start. This process
is performed periodically. If we call Tss the time that it takes for
the slow-start to ramp back up, then LEDBAT++ performs the next
periodic slow down after a period equal to 9Tss.
This mechanism effectively empties the queue when there is a single
LEDBAT++ flow contributing to the queue (i.e. there is no other
traffic, LEDBAT++ or otherwise). If there are other competing
LEDBAT++ flows, this mechanism, albeit counter-intuitively, actually
works. Where there is a single flow int he bottleneck and it is
using LEDBAT++, it will correctly estimate the base RTT. If later
on, another LEDBAT++ joins, the base RTT measured will include the
added queueing delay T generated by the previous flow. This will
trigger than the second flow will attempt to generate an additional
queueing delay T on top of that, outcasting the first flow. This is
called late-comer advantage and has been documented extensively
[_10.1145_3355369.3355604]. At this point, only the second flow
prevails. This is when the initial slow down of the second flow
kicks in. Since the second flow has outcasted the first flow, when
the second flow slows down, it exposes the base RTT.
Bagnulo Expires 7 February 2024 [Page 4]
Internet-Draft CCI August 2023
In the base of BBRv1, if during the last 10s, a BBRv1 flow has not
observed an RTT smaller than its current estimation of the base RTT
(called RTprop), BBRv1 enters in the ProbeRTT state, reducing the
inflight to only 4 packets during at least 200 ms and one RTT.
RTprop is set to the minimum RTT observed during the last 10 s. This
mechanism naturally embeds synchronisation of slow-downs across
multiple flows. Suppose there are N uncoordinated BBRv1 flows
competing in the bottleneck. When the first one of them performs a
slow down, it is likely that the rest of the flows record a minimum
value for the RTT, which would likely cause than the next slow down
will occurs 10 s after this for all flows.
We have described how both LEDBAT++ and BBRv1 periodic slow down
mechanism work when there are multiple LEDBAT++/BBRv1 flows
respectively. We next consider how the slow down mechanism perform
when there is a mix of BBRv1 and LEDBAT++ flows. Based on the logic
of each of the mechanisms, we can easily conclude that will not
synchronise their slow downs. The reason for this is that the period
of the slowdowns does not match. In the case of BBR is a fixed
period of 10 s, while in the LEDBAT++ case, the period depends both
on the RTT and in the targeted CW. This lack of synchronisation has
been verified experimentally in [COMNET].
1.1.2. Proposed invariant
Having two CCAs such as LEDBAT++ and BBR implementing two different
slow down mechanisms is clearly counterproductive, since neither of
them is able to perform concurrently and expose the base RTT when
there is a mix of both types of flows competing in a bottleneck.
Having a single slow down mechanism standardised that should be used
as a building block by every CCA that requires a periodic slow down
mechanism would naturally bring interoperability between the
different CCAs, avoiding interference when they need to expose and
measure the base RTT.
Regarding the specific mechanism, we believe that the one specified
by BBR has merits over the one of LEDBAT++. Specifically, the one
specified by BBR is able to synchronise the slowdowns of multiple
flows, which seems challenging for the LEDBAT++ mechanism, especially
when the different flows have different characteristics. for
instance, if there are different LEDBAT++ flows with different RTTs
competing in the same bottleneck, the periods of the slow downs of
the different flows is likely to be different as the Tss for each
flow will be different (because the RTTs are different).
Bagnulo Expires 7 February 2024 [Page 5]
Internet-Draft CCI August 2023
1.2. Other potential invariants
As next steps, we propose to identify other potential invariants by
identifying basic building blocks used in different CCAs and that if
implemented in different ways would result in interference between
the different flavours.
2. Security Considerations
3. IANA Considerations
4. Acknowledgements
This work was supported by the EU through the StandICT CCI project.
5. Informative References
[COMNET] Bagnulo, M.B. and A.G. Garcia-Martinez, "When less is
more: BBR versus LEDBAT++", Computer Networks Volume 219,
2022.
[I-D.cardwell-iccrg-bbr-congestion-control]
Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V.
Jacobson, "BBR Congestion Control", Work in Progress,
Internet-Draft, draft-cardwell-iccrg-bbr-congestion-
control-02, 7 March 2022,
<https://datatracker.ietf.org/doc/html/draft-cardwell-
iccrg-bbr-congestion-control-02>.
[I-D.ietf-tcpm-dctcp]
Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L.,
and G. Judd, "Data Center TCP (DCTCP): TCP Congestion
Control for Data Centers", Work in Progress, Internet-
Draft, draft-ietf-tcpm-dctcp-10, 28 August 2017,
<https://datatracker.ietf.org/doc/html/draft-ietf-tcpm-
dctcp-10>.
[I-D.irtf-iccrg-ledbat-plus-plus]
Balasubramanian, P., Ertugay, O., and D. Havey, "LEDBAT++:
Congestion Control for Background Traffic", Work in
Progress, Internet-Draft, draft-irtf-iccrg-ledbat-plus-
plus-01, 25 August 2020,
<https://datatracker.ietf.org/doc/html/draft-irtf-iccrg-
ledbat-plus-plus-01>.
Bagnulo Expires 7 February 2024 [Page 6]
Internet-Draft CCI August 2023
[RFC6356] Raiciu, C., Handley, M., and D. Wischik, "Coupled
Congestion Control for Multipath Transport Protocols",
RFC 6356, DOI 10.17487/RFC6356, October 2011,
<https://www.rfc-editor.org/info/rfc6356>.
[RFC6817] Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind,
"Low Extra Delay Background Transport (LEDBAT)", RFC 6817,
DOI 10.17487/RFC6817, December 2012,
<https://www.rfc-editor.org/info/rfc6817>.
[RFC8684] Ford, A., Raiciu, C., Handley, M., Bonaventure, O., and C.
Paasch, "TCP Extensions for Multipath Operation with
Multiple Addresses", RFC 8684, DOI 10.17487/RFC8684, March
2020, <https://www.rfc-editor.org/info/rfc8684>.
[_10.1016_j.comnet.2013.02.020]
Carofiglio, G., Muscariello, L., Rossi, D., Testa, C.,
Valenti, S., and Elsevier BV, "Rethinking the Low Extra
Delay Background Transport (LEDBAT) Protocol", Computer
Networks, vol. 57, no. 8, pp. 1838-1852,
DOI 10.1016/j.comnet.2013.02.020, June 2013,
<http://dx.doi.org/10.1016/j.comnet.2013.02.020>.
[_10.1145_3355369.3355604]
Ware, R., Mukerjee, M. K., Seshan, S., Sherry, J., and
ACM, "Modeling BBR's Interactions with Loss-Based
Congestion Control", Proceedings of the Internet
Measurement Conference, DOI 10.1145/3355369.3355604, 21
October 2019, <http://dx.doi.org/10.1145/3355369.3355604>.
Author's Address
Marcelo Bagnulo
UC3M
Email: marcelo@it.uc3m.es
Bagnulo Expires 7 February 2024 [Page 7]