tsvwg R. Huang
Internet-Draft S. Ren
Intended status: Informational H. Luo
Expires: 15 March 2024 Q. Chen
Huawei
12 September 2023
The Challenges that Current Service Transports are Facing
draft-huang-tsvwg-transport-challenges-00
Abstract
This document discusses the challenges of improving transmission
quality when little information is shared between the network and
applications, and then provides some basic requirements that new
synergy mechanisms should meet.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 15 March 2024.
Copyright Notice
Copyright (c) 2023 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Huang, et al. Expires 15 March 2024 [Page 1]
Internet-Draft transport challenges September 2023
Table of Contents
1. Introduction
2. Conventions and Definitions
3. Challenges of Improving transmission quality in WAN
3.1. Network Undifferentiated Scheduling
3.2. Heuristic Network Conditions
3.2.1. Slow Start
3.2.2. Bandwidth and RTT Probing
3.2.3. Multiple Path
3.3. Heterogeneous Environment
4. Requirements for Synergy Mechanisms between Network and Endpoint
5. Existing Mechanisms
6. Security Considerations
7. IANA Considerations
8. References
8.1. Normative References
8.2. Informative References
Authors' Addresses
1. Introduction
Internet transport protocols are currently evolving rapidly. On the
one hand, user privacy and security considerations are driving
transport protocols towards built-in encryption. On the other hand,
TCP ossification caused by excessive intervention of intermediate
devices is frustrating the industry. As a result, end-to-end (e2e)
built-in encryption has become the most popular design for new
transport protocols. However, the network and the transport are not
independent of each other; they rely closely on each other to work,
so some synergy mechanisms between them are needed to help
transmission work better. In the past, transport protocols like TCP
enabled collaboration between the network and applications through
plaintext message headers. This is no longer possible with
increasingly popular secure transport protocols like QUIC, and the
industry urgently needs a new way to achieve this synergy.
This document discusses the challenges of improving transmission
quality when little information is shared between the network and
applications, and then provides some basic requirements that new
synergy mechanisms should meet.
2. Conventions and Definitions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
3. Challenges of Improving transmission quality in WAN
3.1. Network Undifferentiated Scheduling
DSCP is designed to ensure Quality of Service (QoS) for transmission
in the network: a 6-bit field in the IP packet header classifies
traffic into service categories so that differentiated services can
be applied. However, as the variety of Internet applications
continues to increase, the current differentiated services have
become too coarse-grained. For example, Internet traffic is all
treated as Best Effort, and network devices are unable to obtain
effective and legitimate application information to forward that
traffic with appropriate quality. Service-specific bandwidth,
latency, or jitter requirements therefore cannot be adequately met,
resulting in a relatively poor end-user experience. This is also
pointed out in [I-D.kaippallimalil-tsvwg-media-hdr-wireless]. Even
where DSCP is implemented in real deployments under agreements
between service providers and ISPs, the benefit is quite limited
because DSCP carries so little information. For example, traffic
that pays for a good quality of service still cannot obtain a
satisfactory quality improvement during busy hours.
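As a rough illustration of how little the DSCP field can express,
the following Python sketch splits the 8-bit IPv4 TOS / IPv6 Traffic
Class byte into its 6-bit DSCP part and its 2-bit ECN part. The
function and constant names are hypothetical and chosen for this
example only; the codepoint values for Best Effort (0) and Expedited
Forwarding (46) are the standard ones.

```python
from typing import Tuple

# Illustrative only: these helper names are hypothetical.
DSCP_BEST_EFFORT = 0   # default forwarding
DSCP_EF = 46           # Expedited Forwarding

# A 6-bit field can never signal more than 64 distinct classes.
MAX_DSCP_CLASSES = 2 ** 6


def split_traffic_class(tc_byte: int) -> Tuple[int, int]:
    """Split the 8-bit TOS / Traffic Class byte into (dscp, ecn)."""
    if not 0 <= tc_byte <= 0xFF:
        raise ValueError("traffic class must fit in one byte")
    dscp = tc_byte >> 2     # upper 6 bits
    ecn = tc_byte & 0b11    # lower 2 bits
    return dscp, ecn


def build_traffic_class(dscp: int, ecn: int = 0) -> int:
    """Combine a 6-bit DSCP and a 2-bit ECN value into one byte."""
    if not 0 <= dscp < 64 or not 0 <= ecn < 4:
        raise ValueError("dscp is 6 bits, ecn is 2 bits")
    return (dscp << 2) | ecn
```

Whatever an application would like to tell the network about its
bandwidth, latency, or jitter needs has to be folded into one of
these 64 values, which is the information-density limit discussed
above.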
In addition, undifferentiated network scheduling prevents some
network functions from being fully utilized. One example is CoMP
(Coordinated Multipoint transmission/reception) in LTE scenarios,
which manages interference through collaborative processing among
different cells or base stations, thereby improving network
efficiency. In our experience with intra-eNB deployments, if
additional service-level information, such as the desired completion
time and start/end signals, is provided, the CoMP success rate can be
greatly improved, and so can the network's goodput.
3.2. Heuristic Network Conditions
Application transmissions rely on the network, so network conditions
greatly affect application performance. Because applications and the
network currently cooperate only loosely and share little
information, applications can only passively make speculative
adjustments based on end-to-end feedback. Such adjustments not only
lack precision but also take effect late. This is discussed in the
following sections.
3.2.1. Slow Start
Current transport protocols increase their sending rate gradually
through slow start, usually starting with a small initial window of
around 10 segments, to avoid injecting too many packets into the
network. This has effectively prevented network collapse for
decades. It also means that bandwidth utilization is low during the
slow start phase. The cost becomes significant with the widespread
adoption of technologies such as 5G and fiber-to-the-home (FTTH).
In particular, when the network's BDP (Bandwidth Delay Product)
increases, slow start lasts longer, resulting in poor transmission
efficiency for short flows. The test reports in [_5G] mention that
the BBR slow start phase lasts around 6 s before it converges to the
high network bandwidth in a 5G mobile web browsing scenario. [flash]
highlights that, with a flow duration of 1 s (which transferred over
1 MB of data), the bandwidth efficiencies of Cubic and BBR were only
53% and 48%, respectively. This significantly impacts the
transmission quality of short-flow applications, e.g., mobile app
download/update, cloud albums, or first page loading of apps.
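The relation between BDP and slow start duration can be sketched
with a small calculation. The Python code below is an idealized
model, not a description of any particular stack: it assumes the
congestion window doubles exactly once per RTT, that there is no
loss, and that the segment size is 1460 bytes. The function names
and the example link parameters are assumptions made for
illustration.

```python
def slow_start_rounds(bandwidth_bps: float, rtt_s: float,
                      mss_bytes: int = 1460,
                      initial_window: int = 10) -> int:
    """Count the RTTs needed for the window to reach the path BDP.

    Idealized model: the window doubles once per RTT, without loss.
    """
    bdp_packets = bandwidth_bps * rtt_s / (8 * mss_bytes)
    cwnd = initial_window
    rounds = 0
    while cwnd < bdp_packets:
        cwnd *= 2
        rounds += 1
    return rounds


def slow_start_duration_s(bandwidth_bps: float, rtt_s: float) -> float:
    """Approximate time spent below the BDP, in seconds."""
    return slow_start_rounds(bandwidth_bps, rtt_s) * rtt_s
```

Under these assumptions, a hypothetical 100 Mbps path with a 50 ms
RTT needs 6 round trips to fill the pipe, while a 1 Gbps path with
the same RTT needs 9 (about 450 ms). A flow that finishes within a
few round trips therefore never sees most of the available
bandwidth.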
3.2.2. Bandwidth and RTT Probing
Current congestion control algorithms often rely on E2E feedback to
infer the network state and adjust packet transmission accordingly.
However, when the RTT is relatively large, which is quite common in
WAN scenarios, the longer transmission time in the network lengthens
the E2E feedback cycle, and feedback signals may not reach the sender
in a timely manner. In such situations, the sender is unable to
perceive congestion accurately and adjust in time, so its rate
adjustments become less effective and the effective throughput in
wide-area and long-distance networks is lower.
We tested the throughput of BBR and CUBIC under different network
conditions, with 64 concurrent flows, a 2 Gbps link capacity, and
varying levels of latency and packet loss. With a 5 ms latency and a
0.01% packet loss rate, the total throughput of CUBIC had already
dropped to less than 10% of the total bandwidth. BBR performed
significantly better, achieving a throughput of over 50% even with a
5 ms latency and a 0.1% packet loss rate. However, as the latency
increased to 10 ms, the throughput of BBR decreased to only about
30%, and it decreased further to around 20% with a 15 ms latency.
Although BBR improves overall throughput, it evidently fails to fully
utilize the available network resources as latency increases.
The problem is particularly prominent in heterogeneous environments,
e.g., traffic aggregated across both data centers and the WAN. The
delay inside the data center is short and allows quick convergence,
resulting in significant dynamic changes in bandwidth. The WAN side,
on the other hand, has a longer feedback period and slower
convergence, making it challenging to predict the available bandwidth
accurately. As a result, overall network resources and performance
cannot be well balanced. This is also discussed in [Annulus] and
[Cross-Datacenter].
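The feedback lag described in this section can be made concrete
with simple arithmetic: a sender transmitting at line rate emits
roughly one bandwidth-delay product of data before the first
feedback about that data can return. The Python sketch below uses
the 2 Gbps link capacity and the 5, 10, and 15 ms latencies quoted in
this section; treating those latencies as round-trip times is an
assumption, and the function name is hypothetical.

```python
def bytes_sent_before_feedback(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bytes a line-rate sender emits during one feedback cycle (one RTT)."""
    # bits/s * ms / (1000 ms/s * 8 bits/byte)
    return bandwidth_bps * rtt_ms // 8000


LINK_BPS = 2_000_000_000  # 2 Gbps

for rtt_ms in (5, 10, 15):
    in_flight = bytes_sent_before_feedback(LINK_BPS, rtt_ms)
    print(f"RTT {rtt_ms:>2} ms: {in_flight / 1e6:.2f} MB sent before feedback")
```

Each additional 5 ms of round-trip time adds another 1.25 MB that
is already in the network before the sender can react, which is one
way to see why purely end-to-end probing reacts late on long paths.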
3.2.3. Multiple Path
As network coverage and diversity continue to improve, wide-area
multipath applications are becoming a trend. 3GPP has already
introduced Access Traffic Steering, Switching, and Splitting (ATSSS),
which is one of the prevalent use cases of network-assisted multipath
transport. However, practical multipath deployments often face the
coexistence of high-quality and low-quality links, with different
loss rates and RTTs on different disjoint paths. Relying solely on
e2e per-path congestion control to guess the network conditions on
each path, especially in highly dynamic wireless networks, can easily
lead to unstable traffic scheduling and suboptimal regimes.
Quantifying the network behaviour precisely and taking advantage of
it in multipath mechanisms can be a way to achieve fast convergence
and a better experience.
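To illustrate what "taking advantage of quantified network
behaviour" could look like, the following Python sketch splits
traffic across paths in proportion to each path's expected goodput.
This is not the ATSSS algorithm or any deployed scheduler; the
PathInfo record, its fields, and the proportional rule are all
assumptions made for this example, and RTT is deliberately ignored to
keep the sketch short.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PathInfo:
    """Hypothetical per-path report, e.g. supplied by the network."""
    name: str
    bandwidth_mbps: float
    loss_rate: float  # fraction in [0, 1]


def split_ratios(paths: List[PathInfo]) -> Dict[str, float]:
    """Share of traffic per path, proportional to expected goodput."""
    expected = {p.name: p.bandwidth_mbps * (1.0 - p.loss_rate)
                for p in paths}
    total = sum(expected.values())
    if total <= 0:
        # No usable information: fall back to an equal split.
        return {p.name: 1.0 / len(paths) for p in paths}
    return {name: goodput / total for name, goodput in expected.items()}
```

With explicit path information the split is available immediately,
whereas per-path congestion control has to discover the same ratios
by probing each path separately.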
3.3. Heterogeneous Environment
Real networks exhibit segmented heterogeneity. Examples include the
potential mutual influence between WAN-side traffic and data center
internal traffic, or a flow that traverses both a comparatively less
stable wireless segment inside an enterprise or home broadband
network and a stable fixed cable segment in the WAN. However, a
single set of parameters, or even a single congestion control
algorithm, cannot achieve optimized performance in such a complex
environment. For instance, according to the tests in [Pantheon], BBR
can handle scenarios with random packet loss such as 5G and Wi-Fi,
but its throughput may not be as good as Cubic's in other situations,
while Cubic's throughput is poor in scenarios with random packet
loss. In a multipath scenario, due to the dynamic and diverse nature
of different paths, a fixed set of algorithm parameters may not
achieve optimal performance.
Currently, the work in the IETF is mainly limited to idealized
scenarios that rely only on e2e feedback, which has been used for
decades, and has not extensively considered new approaches, e.g.,
adaptive solutions for transport protocols, for traversing
heterogeneous networks. These new approaches may require good
collaboration and information exchange between the network and
endpoints.
4. Requirements for Synergy Mechanisms between Network and Endpoint
In conclusion, the improvement of transmission quality should not
rely solely on network conditions that endpoints infer passively and
heuristically. Further enhancement should involve synergy between
the network and the client side. Several requirements for this
collaborative mechanism are listed below:
1. There should be two kinds of collaboration: host to network and
network to host. They may be provided by one mechanism each or by
a single mechanism for both.
2. The mechanisms should enhance the corresponding transmission
quality and end-user experience, rather than deteriorating them.
3. The mechanisms must be secure and trustworthy, preventing
malicious attacks and tampering.
4. The mechanisms must effectively prevent deception and abuse.
5. The mechanisms should not cause transport protocols to ossify.
Specifically, the information transmitted through the
collaborative mechanism should be incremental and referential,
rather than decisive or heavily depended upon. Its presence
should result in a better experience, while its absence
SHOULD NOT degrade the experience compared to the current situation.
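One possible reading of requirement 5 is that an endpoint treats
network-provided information as bounded advice with a safe fallback.
The Python sketch below applies that reading to the initial window;
the hint itself, the function name, and the cap of 100 packets are
hypothetical and not defined by this document.

```python
from typing import Optional

DEFAULT_INITIAL_WINDOW = 10  # packets; behaviour without any hint
MAX_HINTED_WINDOW = 100      # hypothetical local safety cap


def choose_initial_window(hinted_window: Optional[int]) -> int:
    """Use a network hint as advice, never as a command."""
    if hinted_window is None:
        # Hint absent: behave exactly as today, so nothing degrades.
        return DEFAULT_INITIAL_WINDOW
    # Hint present: never drop below today's baseline and never
    # exceed the sender's own cap, which also limits the damage
    # from a deceptive or abusive hint (requirement 4).
    return max(DEFAULT_INITIAL_WINDOW,
               min(hinted_window, MAX_HINTED_WINDOW))
```

Because the sender keeps its own bounds, the transport still works
when the hint is missing, stripped, or wrong, which is the property
that keeps the hint from becoming a new ossification point.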
5. Existing Mechanisms
ECN [rfc3168], which is widely deployed in the industry, uses 2 bits
in the IP header to convey congestion information. It works together
with AQM mechanisms in network devices, which set the CE codepoint in
the IP header to indicate congestion before the queue overflows,
thereby notifying endpoints to reduce their sending rate.
Furthermore, L4S [rfc9330] redefines the semantics of the ECT(1)
codepoint and isolates L4S traffic from traditional traffic through
the use of a dual-queue AQM in the middlebox, to achieve low latency.
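The marking behaviour described above can be sketched as follows.
The four codepoint values are those defined in [rfc3168]; the
single step threshold is a deliberate simplification, since real AQMs
typically mark probabilistically, and the function name and return
format are hypothetical.

```python
from typing import Tuple

# ECN field codepoints from RFC 3168 (lower 2 bits of the TOS byte).
NOT_ECT = 0b00  # sender does not support ECN
ECT_1 = 0b01    # ECN-capable transport, codepoint ECT(1)
ECT_0 = 0b10    # ECN-capable transport, codepoint ECT(0)
CE = 0b11       # congestion experienced


def aqm_decision(ecn: int, queue_len: int,
                 mark_threshold: int) -> Tuple[str, int]:
    """Return (action, resulting ECN field) for one arriving packet."""
    if queue_len < mark_threshold:
        return "forward", ecn   # no congestion signal needed
    if ecn == NOT_ECT:
        return "drop", ecn      # loss is the only available signal
    return "forward", CE        # mark instead of dropping
```

The signal reaching the endpoint is a single "congestion
experienced" indication per packet; it carries no information about
available bandwidth, path delay, or the needs of the service.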
ECN and L4S are essentially a form of collaboration between the
network and endpoints to achieve the desired low-loss and low-latency
goals. However, this approach cannot completely address the
challenges described in Section 3. Additionally, as elaborated in
[L4SinCellular], L4S is quite sensitive to time-varying networks,
such as wireless and Wi-Fi networks, which may make it difficult to
achieve high throughput and low latency simultaneously in such
environments. If more information is provided for collaboration,
these issues may be overcome more easily.
6. Security Considerations
This document has no security considerations.
7. IANA Considerations
This document has no IANA actions.
8. References
8.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/rfc/rfc2119>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.
8.2. Informative References
[Annulus] Saeed, A., Gupta, V., Goyal, P., Sharif, M., Pan, R.,
Ammar, M., Zegura, E., Jang, K., Alizadeh, M., Kabbani,
A., and A. Vahdat, "A Dual Congestion Control Loop for
Datacenter and WAN Traffic Aggregates", 2020.
[Cross-Datacenter]
Zeng, G., Bai, W., Chen, G., Chen, K., Han, D., Zhu, Y.,
and L. Cui, "Congestion Control for Cross-Datacenter
Networks", 2019.
[flash] Guo, L. and J. Lee, "TCP-FLASH - A Fast Reacting TCP for
Modern Networks", 2021.
[I-D.kaippallimalil-tsvwg-media-hdr-wireless]
Kaippallimalil, J., Gundavelli, S., and S. Dawkins, "Media
Header Extensions for Wireless Networks", Work in
Progress, Internet-Draft, draft-kaippallimalil-tsvwg-
media-hdr-wireless-02, 5 July 2023,
<https://datatracker.ietf.org/doc/html/draft-
kaippallimalil-tsvwg-media-hdr-wireless-02>.
[L4SinCellular]
Mathieu, B. and S. Tuffin, "Evaluating the L4S
Architecture in Cellular Networks with a Programmable
Switch", 2021.
[Pantheon] Yan, F., Ma, J., Hill, G., Raghavan, D., Wahby, R., Levis,
P., and K. Winstein, "Pantheon: the training ground for
Internet congestion-control research", 2018.
[rfc3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
of Explicit Congestion Notification (ECN) to IP",
RFC 3168, DOI 10.17487/RFC3168, September 2001,
<https://www.rfc-editor.org/rfc/rfc3168>.
[rfc9330] Briscoe, B., Ed., De Schepper, K., Bagnulo, M., and G.
White, "Low Latency, Low Loss, and Scalable Throughput
(L4S) Internet Service: Architecture", RFC 9330,
DOI 10.17487/RFC9330, January 2023,
<https://www.rfc-editor.org/rfc/rfc9330>.
[_5G] Xu, D., Zhou, A., Zhang, X., Wang, G., Liu, X., An, C.,
Shi, Y., Liu, L., and H. Ma, "Understanding Operational
5G: A First Measurement Study on Its Coverage, Performance
and Energy Consumption", 2020.
Authors' Addresses
Rachel Huang
Huawei
Email: rachel.huang@huawei.com
Shoushou Ren
Huawei
Email: renshoushou@huawei.com
Hanlin Luo
Huawei
Email: luohanlin2@huawei.com
Qichang Chen
Huawei
Email: chenqichang1@huawei.com