tsvwg                                                           R. Huang
Internet-Draft                                                    S. Ren
Intended status: Informational                                    H. Luo
Expires: 15 March 2024                                           Q. Chen
                                                                  Huawei
                                                       12 September 2023


       The Challenges that Current Service Transports are Facing
               draft-huang-tsvwg-transport-challenges-00

Abstract

   This document discusses the challenges of improving transmission
   quality when little information is shared between the network and
   applications, and then provides some basic requirements that new
   synergy mechanisms should meet.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 15 March 2024.

Copyright Notice

   Copyright (c) 2023 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.



Huang, et al.             Expires 15 March 2024                 [Page 1]

Internet-Draft            transport challenges            September 2023


Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Conventions and Definitions . . . . . . . . . . . . . . . . .   3
   3.  Challenges of Improving Transmission Quality in WAN . . . . .   3
     3.1.  Network Undifferentiated Scheduling . . . . . . . . . . .   3
     3.2.  Heuristic Network Conditions  . . . . . . . . . . . . . .   4
       3.2.1.  Slow Start  . . . . . . . . . . . . . . . . . . . . .   4
       3.2.2.  Bandwidth and RTT Probing . . . . . . . . . . . . . .   4
       3.2.3.  Multiple Path . . . . . . . . . . . . . . . . . . . .   5
     3.3.  Heterogeneous Environment . . . . . . . . . . . . . . . .   5
   4.  Requirements for Synergy Mechanisms between Network and
           Endpoint  . . . . . . . . . . . . . . . . . . . . . . . .   6
   5.  Existing Mechanisms . . . . . . . . . . . . . . . . . . . . .   6
   6.  Security Considerations . . . . . . . . . . . . . . . . . . .   7
   7.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   7
   8.  References  . . . . . . . . . . . . . . . . . . . . . . . . .   7
     8.1.  Normative References  . . . . . . . . . . . . . . . . . .   7
     8.2.  Informative References  . . . . . . . . . . . . . . . . .   7
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .   8

1.  Introduction

   Internet transport protocols are currently evolving rapidly.  On one
   hand, concerns about user privacy and security are driving transport
   protocols towards built-in encryption.  On the other hand, TCP
   ossification caused by excessive intervention of intermediate devices
   is frustrating the industry.  As a result, built-in end-to-end (e2e)
   encryption has become the most popular design choice for new
   transport protocols.  However, network and transport are neither
   independent nor unrelated; they rely closely on each other, so some
   synergy mechanisms between them are needed to help transmission work
   better.  In the past, transport protocols such as TCP enabled
   collaboration between network and application through plaintext
   message headers.  This is no longer possible with increasingly
   popular secure transport protocols such as QUIC, and the industry
   urgently needs a new way to achieve this synergy.

   This document discusses the challenges of improving transmission
   quality when little information is shared between the network and
   applications, and then provides some basic requirements that new
   synergy mechanisms should meet.

2.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Challenges of Improving Transmission Quality in WAN

3.1.  Network Undifferentiated Scheduling

   The Differentiated Services Code Point (DSCP) is designed to support
   Quality of Service (QoS) in the network: 6 bits in the IP header
   encode a service class so that packets can receive differentiated
   treatment.  However, as the variety of Internet applications
   continues to grow, the current service classes have become too
   coarse-grained.  For example, Internet traffic is typically all
   treated as Best Effort, and network devices are unable to obtain
   effective and legitimate application information to forward that
   traffic with appropriate quality.  Service-specific bandwidth,
   latency, or jitter requirements therefore cannot be adequately met,
   resulting in a relatively poor end-user experience.  This is also
   pointed out in [I-D.kaippallimalil-tsvwg-media-hdr-wireless].  Even
   where DSCP is deployed under agreements between service providers
   and ISPs, the benefit is quite limited due to the low information
   density.  For example, traffic that pays for a higher-quality
   service still cannot obtain a satisfactory improvement in quality
   during busy hours.
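
   As a minimal illustration of how little room this field offers, the
   following Python sketch splits the 8-bit IPv4 ToS / IPv6 Traffic
   Class byte into its 6-bit DSCP and 2-bit ECN parts.  The function
   names are hypothetical and chosen only for this example; Expedited
   Forwarding (DSCP 46) is used as a sample code point.

```python
def split_ds_byte(ds_byte: int) -> tuple[int, int]:
    """Split the ToS / Traffic Class byte into (DSCP, ECN)."""
    if not 0 <= ds_byte <= 0xFF:
        raise ValueError("DS byte must fit in 8 bits")
    dscp = ds_byte >> 2    # upper 6 bits: at most 64 code points
    ecn = ds_byte & 0b11   # lower 2 bits: ECN field
    return dscp, ecn


def make_ds_byte(dscp: int, ecn: int = 0) -> int:
    """Build the byte from a 6-bit DSCP and a 2-bit ECN value."""
    if not 0 <= dscp <= 0x3F or not 0 <= ecn <= 0b11:
        raise ValueError("DSCP is 6 bits and ECN is 2 bits")
    return (dscp << 2) | ecn


EF = 46  # Expedited Forwarding, used here only as a sample value
print(hex(make_ds_byte(EF)))   # byte value carrying EF with ECN cleared
print(split_ds_byte(0xB8))     # recovers the DSCP and ECN parts
```

   Six bits allow at most 64 code points, which bounds how much
   application information a packet can convey in this way.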

   In addition, undifferentiated network scheduling prevents some
   network functions from being fully utilized.  One example is CoMP
   (Coordinated Multipoint transmission/reception) in LTE, which
   manages interference through collaborative processing among
   different cells or base stations, thereby improving network
   efficiency.  In our experience with intra-eNB deployments, if
   additional service-level information, such as the desired completion
   time and start/end signals, is provided, the CoMP success rate can
   be greatly improved, and so can the network's goodput.

3.2.  Heuristic Network Conditions

   Application transmissions rely on the network, so network conditions
   greatly affect application performance.  Because applications and
   the network currently cooperate only loosely and share little
   information, applications can only passively make speculative
   adjustments based on end-to-end feedback.  Such adjustments not only
   lack precision but also take effect late.  This is discussed in the
   following sections.

3.2.1.  Slow Start

   Current transport protocols increase their sending rate gradually
   through slow start, usually starting with a small initial window of
   around 10 segments, to avoid injecting too many packets into the
   network.  This has effectively prevented network collapse for
   decades.  It also means that bandwidth utilization is low during the
   slow start phase, which becomes significant with the widespread
   adoption of technologies such as 5G and fiber-to-the-home (FTTH).
   In particular, as the network's BDP (Bandwidth Delay Product)
   increases, slow start lasts longer, resulting in poor transmission
   efficiency for short flows.  The test report in [_5G] mentions that
   the BBR slow start phase lasts around 6 s before converging to the
   high network bandwidth in a 5G mobile web browsing scenario.
   [flash] highlights that, for a flow duration of 1 s (transferring
   over 1 MB of data), the bandwidth efficiencies of CUBIC and BBR were
   only 53% and 48%, respectively.  This significantly impacts the
   transmission quality of short-flow applications, e.g., mobile app
   download/update, cloud albums, or first-page loading of apps.
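
   The effect of BDP on slow start duration can be sketched with an
   idealized model.  The sketch below assumes no loss, a window that
   doubles every RTT, a 1460-byte MSS, and an initial window of 10
   segments.  These assumptions and the function name are illustrative
   only and are not taken from [_5G] or [flash]; real implementations
   differ in detail.

```python
import math


def slow_start_rounds(bandwidth_bps: float, rtt_s: float,
                      mss_bytes: int = 1460,
                      init_window: int = 10) -> int:
    """Count RTT rounds until the window first covers the BDP.

    Idealized: no loss, no delayed ACKs, window doubles each RTT.
    """
    bdp_segments = math.ceil(bandwidth_bps * rtt_s / (8 * mss_bytes))
    window, rounds = init_window, 0
    while window < bdp_segments:
        window *= 2   # exponential growth during slow start
        rounds += 1
    return rounds


# Hypothetical paths, chosen only to show the trend.
for bw, rtt in ((100e6, 0.020), (1e9, 0.050)):
    n = slow_start_rounds(bw, rtt)
    print(f"{bw / 1e6:.0f} Mbps, {rtt * 1000:.0f} ms RTT: {n} rounds")
```

   Under this model, a 1 Gbps path with a 50 ms RTT needs 9 round trips
   (about 450 ms) before the window covers the BDP, which is a large
   share of a 1 s flow.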

3.2.2.  Bandwidth and RTT Probing

   Current congestion control algorithms often rely on E2E feedback to
   infer the network state and adjust packet transmission accordingly.
   However, when the RTT is relatively large, which is quite common in
   WAN scenarios, the longer transit time in the network lengthens the
   E2E feedback cycle, and feedback signals may not reach the sender in
   a timely manner.  In such situations, the sender cannot accurately
   perceive congestion or adjust in time, which reduces the
   effectiveness of its adjustments and lowers effective throughput in
   wide-area and long-distance networks.

   We tested the throughput of BBR and CUBIC under different network
   conditions, with 64 concurrent flows, a 2 Gbps link capacity, and
   varying levels of latency and packet loss.  With a 5 ms latency and
   a 0.01% packet loss rate, the total throughput of CUBIC had already
   dropped to less than 10% of the total bandwidth.  BBR performed
   significantly better, achieving a throughput of over 50% even with
   a 5 ms latency and a 0.1% packet loss rate.  However, as the latency
   increased to 10 ms, the throughput of BBR decreased to only about
   30%, and it decreased further to around 20% with a 15 ms latency.
   Although BBR evidently improves overall throughput, it fails to
   fully utilize the available network resources as latency increases.
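
   To put these test parameters in perspective, the following sketch
   computes the bandwidth-delay product of the 2 Gbps link and the
   per-flow window that 64 flows would each need to fill it.  It
   assumes that the stated latency is the round-trip time and that the
   MSS is 1460 bytes; neither is stated above, and the function names
   are hypothetical.

```python
def bdp_bytes(link_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight that fill the link."""
    return link_bps * rtt_s / 8


def per_flow_window(link_bps: float, rtt_s: float, flows: int,
                    mss_bytes: int = 1460) -> float:
    """Per-flow window in segments if flows share the BDP evenly."""
    return bdp_bytes(link_bps, rtt_s) / flows / mss_bytes


LINK_BPS = 2e9   # 2 Gbps link, as in the tests above
FLOWS = 64       # concurrent flows, as in the tests above

for rtt_ms in (5, 10, 15):
    rtt_s = rtt_ms / 1000
    bdp_mb = bdp_bytes(LINK_BPS, rtt_s) / 1e6
    window = per_flow_window(LINK_BPS, rtt_s, FLOWS)
    print(f"RTT {rtt_ms} ms: BDP {bdp_mb:.2f} MB, "
          f"{window:.1f} segments per flow")
```

   The data that must be kept in flight grows linearly with the RTT
   (1.25 MB at 5 ms under these assumptions), and so does the volume
   sent before any feedback about it can return to the sender.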

   The problem is particularly prominent in heterogeneous environments,
   e.g., traffic aggregated across both data centers and the WAN.  The
   delay inside a data center is short and allows quick convergence,
   which results in significant dynamic changes in bandwidth.  The WAN
   side, in contrast, has a longer feedback period and slower
   convergence, making it challenging to predict the available
   bandwidth accurately.  As a result, overall network resources and
   performance cannot be well balanced.  This is also discussed in
   [Annulus] and [Cross-Datacenter].

3.2.3.  Multiple Path

   As network coverage and diversity continue to improve, wide-area
   multipath applications are becoming a trend.  3GPP has already
   introduced Access Traffic Steering, Switching, and Splitting
   (ATSSS), which is one of the prevalent use cases of network-assisted
   multipath transport.  However, practical multipath deployments often
   face the coexistence of high-quality and low-quality links, with
   different loss rates and RTTs on different disjoint paths.  Relying
   solely on e2e per-path congestion control to guess the network
   condition on each path, especially in highly dynamic wireless
   networks, can easily lead to unstable traffic scheduling and
   suboptimal operating regimes.  Quantifying the network behaviour
   precisely and exploiting it in multipath mechanisms can be a way to
   achieve fast convergence and a better experience.
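
   As an illustration of an endpoint-only scheduler, the sketch below
   implements a lowest-RTT-first policy, a commonly used baseline for
   multipath transports.  Every input is an end-to-end estimate, which
   may be stale on a highly dynamic wireless path.  The names PathState
   and pick_path are hypothetical and are not taken from ATSSS or any
   other specification.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PathState:
    name: str
    srtt_ms: float   # smoothed RTT estimated from e2e feedback
    cwnd: int        # congestion window, in packets
    in_flight: int   # packets sent but not yet acknowledged

    def has_room(self) -> bool:
        return self.in_flight < self.cwnd


def pick_path(paths: Sequence[PathState]) -> Optional[PathState]:
    """Return the lowest-RTT path with window space, else None."""
    candidates = [p for p in paths if p.has_room()]
    return min(candidates, key=lambda p: p.srtt_ms, default=None)


# Illustrative values only: the faster path is window-limited.
wifi = PathState("wifi", srtt_ms=20.0, cwnd=10, in_flight=10)
cellular = PathState("cellular", srtt_ms=45.0, cwnd=20, in_flight=3)
chosen = pick_path([wifi, cellular])
print(chosen.name if chosen else "all paths are window-limited")
```

   If the RTT or window estimate of a path lags behind its real
   condition, this policy keeps selecting the wrong path until e2e
   feedback catches up, which is the instability described above.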

3.3.  Heterogeneous Environment

   Real networks exhibit segmented heterogeneity.  Examples include the
   potential mutual influence between WAN-side traffic and data center
   internal traffic, or a flow that traverses both a comparatively less
   stable wireless segment in an enterprise or home broadband network
   and a stable fixed cable segment in the WAN.  A single set of
   parameters, or even a single congestion control algorithm, cannot
   achieve optimized performance in such a complex environment.  For
   instance, according to the tests in [Pantheon], BBR can handle
   scenarios with random packet loss, such as 5G and Wi-Fi, but its
   throughput may not be as good as CUBIC's in other situations, while
   CUBIC's throughput is poor in scenarios with random packet loss.  In
   a multipath scenario, due to the dynamic and diverse nature of the
   different paths, a fixed set of algorithm parameters may not achieve
   optimal performance.  Currently, the work in the IETF is mainly
   limited to idealized scenarios that rely only on the e2e feedback
   used for decades, and it has not extensively considered new
   approaches, e.g., adaptive solutions for transport protocols that
   traverse heterogeneous networks.  These new approaches may require
   good collaboration and information exchange between the network and
   endpoints.

4.  Requirements for Synergy Mechanisms between Network and Endpoint

   In conclusion, improving transmission quality should not rely solely
   on endpoints passively and heuristically inferring network
   conditions.  Further enhancement should involve synergy between the
   network and the client side.  Several requirements for such a
   collaborative mechanism are listed below:

   1.  There should be two kinds of collaboration: one from host to
       network, the other from network to host.  These may be provided
       by one mechanism for each direction or by one mechanism for both.

   2.  The mechanisms should enhance the corresponding transmission
       quality and end-user experience, rather than degrade them.

   3.  The mechanisms must be secure and trustworthy, preventing
       malicious attacks and tampering.

   4.  The mechanisms must effectively prevent deception and abuse.

   5.  The mechanisms should not cause transport protocols to become
       ossified.  Specifically, the information transmitted through
       the collaborative mechanism should be incremental and advisory,
       rather than decisive or something endpoints depend on heavily.
       Its presence should result in a better experience, while its
       absence SHOULD NOT degrade the experience compared to the
       current situation.
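
   The fifth requirement can be sketched as follows: an endpoint
   treats network-provided information as an optional hint and falls
   back to its current behaviour when the hint is missing or invalid.
   The hint itself, the constants, and the function name are all
   hypothetical; no such signal is defined by this document.  Letting
   the hint only raise the window is a simplification for this example.

```python
from typing import Optional

DEFAULT_INIT_WINDOW = 10   # segments; behaviour without any hint
MAX_HINTED_WINDOW = 100    # hypothetical local cap on a hinted value


def initial_window(hint_segments: Optional[int] = None) -> int:
    """Choose an initial window, using a network hint as advice only."""
    if hint_segments is None or hint_segments <= 0:
        # No usable hint: behave exactly as today.
        return DEFAULT_INIT_WINDOW
    # The hint never lowers the default and never exceeds the local cap.
    capped = min(hint_segments, MAX_HINTED_WINDOW)
    return max(DEFAULT_INIT_WINDOW, capped)


print(initial_window())        # no hint available
print(initial_window(40))      # hint within the local cap
print(initial_window(1000))    # oversized hint is capped locally
```

   Because the endpoint keeps its own default and cap, the hint stays
   advisory: the transport still works when the information is absent,
   and a misleading hint has a bounded effect.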

5.  Existing Mechanisms

   ECN [rfc3168] is widely deployed in the industry and uses 2 bits in
   the IP header to convey congestion information.  It works together
   with AQM mechanisms in network devices, which set the CE code point
   in the IP header to indicate congestion before the queue overflows,
   thereby notifying endpoints to reduce their sending rate.
   Furthermore, L4S [rfc9330] redefines the semantics of the ECT(1)
   code point and isolates L4S traffic from traditional traffic through
   the use of a dual-queue AQM in the middlebox, to achieve low
   latency.

   ECN and L4S are essentially collaborations between the network and
   endpoints to achieve the desired goals of low loss and low latency.
   However, this approach cannot completely address the challenges
   described in Section 3.  Additionally, as elaborated in
   [L4SinCellular], L4S is quite sensitive to time-varying networks,
   such as wireless and Wi-Fi networks, which may make it difficult to
   simultaneously achieve high throughput and low latency in such
   environments.  If more information is provided for collaboration,
   these issues may be overcome more easily.
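
   As a concrete view of the ECN collaboration described at the start
   of this section, the sketch below lists the four ECN code points and
   applies a simple queue-length threshold to decide when to set CE.
   The threshold rule and the function name are assumptions made for
   brevity; deployed AQMs use more elaborate marking decisions.

```python
from enum import IntEnum


class ECN(IntEnum):
    NOT_ECT = 0b00   # transport is not ECN-capable
    ECT_1 = 0b01     # ECN-capable; code point redefined by L4S
    ECT_0 = 0b10     # ECN-capable
    CE = 0b11        # Congestion Experienced, set by the network


def mark_on_enqueue(ecn: ECN, queue_len: int, threshold: int) -> ECN:
    """Return the ECN field after a threshold-based marking decision."""
    if queue_len >= threshold and ecn in (ECN.ECT_0, ECN.ECT_1):
        return ECN.CE   # signal congestion before the queue overflows
    # Not-ECT packets cannot be marked (an AQM would drop instead),
    # and packets already carrying CE are left unchanged.
    return ecn


print(mark_on_enqueue(ECN.ECT_0, queue_len=50, threshold=20).name)
print(mark_on_enqueue(ECN.ECT_0, queue_len=5, threshold=20).name)
```

   The whole signal is a single code point per packet, which is why
   this form of collaboration conveys congestion but little else.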

6.  Security Considerations

   This document has no security considerations.

7.  IANA Considerations

   This document has no IANA actions.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/rfc/rfc2119>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.

8.2.  Informative References

   [Annulus]  SAEED, A., GUPTA, V., GOYAL, P., SHARIF, M., PAN, R.,
              AMMAR, M., ZEGURA, E., JANG, K., ALIZADEH, M., KABBANI,
              A., and A. VAHDAT, "A Dual Congestion Control Loop for
              Datacenter and WAN Traffic Aggregates", 2020.

   [Cross-Datacenter]
              ZENG, G., BAI, W., CHEN, G., CHEN, K., HAN, D., ZHU, Y.,
              and L. CUI, "Congestion Control for Cross-Datacenter
              Networks", 2019.

   [flash]    GUO, L. and J. LEE, "TCP-FLASH - A Fast Reacting TCP for
              Modern Networks", 2021.

   [I-D.kaippallimalil-tsvwg-media-hdr-wireless]
              Kaippallimalil, J., Gundavelli, S., and S. Dawkins, "Media
              Header Extensions for Wireless Networks", Work in
              Progress, Internet-Draft, draft-kaippallimalil-tsvwg-
              media-hdr-wireless-02, 5 July 2023,
              <https://datatracker.ietf.org/doc/html/draft-
              kaippallimalil-tsvwg-media-hdr-wireless-02>.

   [L4SinCellular]
              MATHIEU, B. and S. TUFFIN, "Evaluating the L4S
              Architecture in Cellular Networks with a Programmable
              Switch", 2021.

   [Pantheon] YAN, F., MA, J., HILL, G., RAGHAVAN, D., WAHBY, R., LEVIS,
              P., and K. WINSTEIN, "Pantheon: the training ground for
              Internet congestion-control research", 2018.

   [rfc3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001,
              <https://www.rfc-editor.org/rfc/rfc3168>.

   [rfc9330]  Briscoe, B., Ed., De Schepper, K., Bagnulo, M., and G.
              White, "Low Latency, Low Loss, and Scalable Throughput
              (L4S) Internet Service: Architecture", RFC 9330,
              DOI 10.17487/RFC9330, January 2023,
              <https://www.rfc-editor.org/rfc/rfc9330>.

   [_5G]      Xu, D., Zhou, A., Zhang, X., Wang, G., Liu, X., An, C.,
              Shi, Y., Liu, L., and H. Ma, "Understanding Operational
              5G: A First Measurement Study on Its Coverage, Performance
              and Energy Consumption", 2020.

Authors' Addresses

   Rachel Huang
   Huawei
   Email: rachel.huang@huawei.com


   Shoushou Ren
   Huawei
   Email: renshoushou@huawei.com


   Hanlin Luo
   Huawei
   Email: luohanlin2@huawei.com


   Qichang Chen
   Huawei
   Email: chenqichang1@huawei.com