Internet DRAFT - draft-zhang-rtgwg-srv6-computing-connect-usecases


Network Working Group                                          X. Zhang
Internet-Draft                                                  F. Yang
Intended status: Standards Track                               W. Cheng
Expires: April 20, 2023                                    China Mobile
                                                                  Z. Fu
                                                   New H3C Technologies
                                                       October 20, 2023

          Usecases of SRv6 Based Computing Interconnection Network


   The requirements of computing interconnection are attracting
   increasing attention from service providers, which are considering
   how to leverage their network advantages to provide integrated
   networking and computing services. This document describes some
   scenarios in which SRv6-based network technology can partially meet
   the service requirements of computing interconnection.

1.  Introduction

   With the advent of new technologies such as cloud computing, big
   data, and artificial intelligence, the demand for computing
   resources is continuously increasing. More and more data centers,
   intelligent computing centers, and supercomputing centers have been
   built to meet this growing demand. Usually, these computing centers
   are centralized (e.g., central cloud). In particular, some emerging
   industries, such as self-driving, cloud AR/VR, and telemedicine,
   require not only computing resources but also quick delivery and
   guaranteed quality. These requirements are usually related to
   network factors such as delay, bandwidth, and jitter. Such services
   can be deployed not only on the central cloud, but also on
   distributed edge nodes.

   In order to uniformly coordinate computing resources at different
   levels (center, edge, and end), and to meet users' requirements for
   computing power and network, a new type of network is proposed that
   combines computing and network information and provides optimal
   allocation, association, and scheduling of computing, storage, and
   network resources. This document calls it the computing
   interconnection network. It is a converged architecture of
   computing and network.

   The computing interconnection network has attracted the attention of
   many service providers. Many of them have proposed their own
   concepts of the computing interconnection network and have released
   relevant technical white papers. Different computing resources are
   interconnected through the network. The goal of the computing
   interconnection network is to achieve "ubiquitous computing
   resource, computing network symbiosis, intelligent orchestration,
   and integrated services", and to gradually develop into an
   infrastructure-level service that can be "once connected, use
   anywhere", like water and electricity. These goals present some
   challenges to the current network architecture.

   The Segment Routing architecture [RFC8402] defines a network
   paradigm based on a source routing mechanism. A segment routing
   network has the notable characteristic of simplifying the control
   plane and network state. In addition, segment routing can program
   the network functions to be performed along the path, so that
   packets are forwarded and processed in the desired way. Segment
   routing applied to an IPv6 network is called SRv6. In addition to
   inheriting the advantages of source routing, SRv6 has other
   advantages. Firstly, IPv6 provides more addresses, which meets the
   needs of the Internet of Things. Secondly, SRv6 has three levels of
   programmability that are highly scalable. Finally, SRv6 also
   supports in-situ OAM (IOAM), service function chaining, slicing,
   and other features. SRv6 is the trend in the evolution of IP
   networks to intelligent IP networks.

   Service providers interested in computing interconnection have also
   deployed new bearer networks with SRv6 to provide better connection
   services. With its flexible scalability, programmability,
   simplicity, and other advantages, SRv6 can meet some requirements
   of the computing interconnection network. This document introduces
   some usage scenarios of the SRv6-based computing interconnection
   network.

2.  Usage Scenarios of SRv6 based computing interconnection network

2.1 SRv6 based computing interconnection network architecture

   The following figure shows a typical architecture of an SRv6-based
   computing interconnection network. It has two layers: the
   infrastructure layer and the control layer.

   *Infrastructure layer:
       Edge: The network edge device of computing interconnection. In
       this document, an Edge is both the edge device for computing
       interconnection and the endpoint of the SRv6 path.

       Computing Resource: The computing resources connected to a
       computing interconnection Edge. They can be cloud, edge, or
       terminal resources, and so on.

       Client: A client requesting computing services.

   *Control layer:
       Computing and Network Controller (CNC): The entity responsible
       for computing resource scheduling, orchestration, and network
       policy distribution.

                             /                                   /
                            /  Computing and Network Controller /
                           /              (CNC)                /
                                      :        :        :
                                      :        :        :
       ...............................:        :        :.............
  +----:-----+                                 :                     :
+----------+ |                                 :                     :
|Computing | |                                 :                     :
|Resource 1|-+   /-----------------------------:-----------------/   :
+----+-----+    /             +--------------+                  /    :
     |         / +---------+  |    SRv6      |  +---------+    /     :
     +--------+--| Edge1   |--|Infrastructure|--|  Edge3  |---+---+  :
             /   +---------+  |              |  +---------+  /    |  :
            /        |        +--------------+      |       /     |  :
           /         |               |              |      /      |  :
          /          |          +----------+        |     / +-----+--:-+
         /           |          |  Edge 2  |        |    /+----------+ |
        /            |          +----------+        |   / |Computing | |
       /             |               |              |  /  |Resource 2|-+
      /--------------+---------------+--------------+-/   +----------+
                     |               |              |
                 +---+--+        +---+--+       +---+--+
               +------+ |      +------+ |     +------+ |
               |client|-+      |client|-+     |client|-+
               +------+        +------+       +------+

      Figure 1: SRv6 based computing interconnection network architecture

2.2 Path scheduling

   When a computing request arrives, it is necessary to decide which
   remote computing node is available to provide the service. Both
   computing power and network conditions need to be considered for
   the decision. After the computing node is determined, an SRv6 path
   fulfilling the SLA requirement can be established to steer packets
   to the destination, i.e., the remote computing node. SRv6 is based
   on a source routing mechanism, in which the path information is
   composed at the ingress of the network, encapsulated in the packet,
   and expressed as a list of SIDs. Each downstream node then only
   needs to process the currently active SID, so the downstream nodes
   on the forwarding path can be stateless. SRv6 paths can be
   established according to default metrics (e.g., cost) or the user's
   policy. These paths can be loose or strict.
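
   As a non-normative illustration of the decision described above,
   the following Python sketch selects a computing node by considering
   both computing load and path delay, and returns the SID list of the
   corresponding path. The Candidate fields, the 0.8 load ceiling, the
   node names, and the SIDs (taken from the 2001:db8::/32
   documentation prefix) are assumptions made for this example; they
   are not defined by this document.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    name: str             # hypothetical computing node name
    cpu_load: float       # fraction of capacity in use, 0.0 to 1.0
    path_delay_ms: float  # delay of the SRv6 path towards the node
    sid_list: list = field(default_factory=list)  # SIDs of that path


def select_node(candidates, max_delay_ms, max_load=0.8):
    """Return the least-loaded node whose path meets the delay bound."""
    feasible = [c for c in candidates
                if c.path_delay_ms <= max_delay_ms
                and c.cpu_load <= max_load]
    if not feasible:
        return None
    # Prefer low load first; break ties with the shorter delay.
    return min(feasible, key=lambda c: (c.cpu_load, c.path_delay_ms))


candidates = [
    Candidate("edge-dc", 0.90, 5.0, ["2001:db8:1::1", "2001:db8:2::1"]),
    Candidate("metro-dc", 0.40, 12.0, ["2001:db8:1::1", "2001:db8:3::1"]),
    Candidate("central-dc", 0.20, 40.0, ["2001:db8:1::1", "2001:db8:4::1"]),
]
chosen = select_node(candidates, max_delay_ms=20.0)
```

   With these inputs, edge-dc is rejected as overloaded and central-dc
   as too distant, so the sketch selects metro-dc and its SID list.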

2.3 Resource isolation

   The different services for computing interconnection will be
   implemented on the same physical network. Some, such as AR/VR, are
   sensitive to network delay. Others, such as storage services, are
   sensitive to packet loss. Therefore, although different services
   share the same physical network, they need to be isolated from each
   other. This requires a network with slicing capabilities. Each
   slice is a logical network. Different slices can serve different
   SLA requirements while remaining isolated from each other. SRv6's
   programmability and protocol simplification can provide slicing
   capabilities. Using the programmability of SRv6, network devices
   can assign a specific SID and reserve hardware resources for each
   slice.
   The device identifies the network slice based on the specific SID, and
   steers the packet according to the topology and resources defined by 
   the slice. Then the packets with different SLA requirements can be 
   forwarded in different slices to meet the requirements of business 
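
   The per-slice SID lookup described in this section could be
   sketched as follows. This is only an illustration: the slice names,
   the reserved bandwidth and queue values, the SIDs, and the fallback
   to a best-effort slice are all assumptions of the example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    name: str
    reserved_mbps: int  # bandwidth reserved for the slice (assumed)
    queue: int          # hardware queue dedicated to the slice (assumed)


# Hypothetical per-device table: one SID per slice.
SLICE_BY_SID = {
    ipaddress.IPv6Address("2001:db8:0:1::"): Slice("low-delay", 2000, 7),
    ipaddress.IPv6Address("2001:db8:0:2::"): Slice("low-loss", 5000, 5),
}
DEFAULT_SLICE = Slice("best-effort", 0, 0)


def classify(active_sid: str) -> Slice:
    """Return the slice identified by the packet's active SID."""
    # Parsing the address makes the lookup independent of how the
    # SID happens to be written (compressed or expanded form).
    key = ipaddress.IPv6Address(active_sid)
    return SLICE_BY_SID.get(key, DEFAULT_SLICE)


ar_vr = classify("2001:db8:0:1::")
storage = classify("2001:db8:0:2::")
other = classify("2001:db8:0:9::")
```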

2.4 Multi segment path orchestration

   As described in Section 1, the computing interconnection network is
   a converged architecture of computing and network. The computing
   resources are interconnected through the wide area network. The
   network may be hierarchical, for example, consisting of access,
   metro, and backbone networks. For each computing request, the CNC
   needs to learn the state of the network and computing resources
   comprehensively and make decisions according to the user's
   constraints. The path to the finally selected computing node may
   cross multiple autonomous domains. Users may want a path with low
   delay, high bandwidth, or high reliability. Therefore, it is
   necessary to consider how to obtain a path that meets the SLA when
   spanning multiple autonomous domains.

   The SRv6 Binding SID (BSID) makes it possible to expose network
   capabilities. Specifically, the SRv6 network identifies a path that
   has specific SLA characteristics, such as low latency, with a
   single SID. This SID is called a BSID, and it can be exposed to the
   CNC. The CNC can select the appropriate BSID according to the
   user's requirements for the network. The BSID hides the complex
   path information, and only one SID is presented externally. The
   path represented by the BSID can be a complete path or a segment of
   a complete path. Using SRv6 BSIDs, underlay and overlay can be
   combined, and multiple domains can also be connected. An SRv6-based
   computing interconnection network can thus orchestrate network
   paths more conveniently and concisely.
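
   The following sketch illustrates, under stated assumptions, how a
   CNC might stitch one BSID per domain into an end-to-end SID list.
   The CATALOGUE structure, its SLA values, the BSIDs, and the simple
   "lowest delay with enough bandwidth" rule are hypothetical; this
   document does not define how BSIDs and their SLA attributes are
   advertised to the CNC.

```python
# Hypothetical catalogue of BSIDs that each domain exposes to the CNC,
# together with the SLA advertised for the path hidden behind each one.
CATALOGUE = {
    "access": [
        {"bsid": "2001:db8:a::100", "delay_ms": 2, "bw_mbps": 1000},
        {"bsid": "2001:db8:a::200", "delay_ms": 5, "bw_mbps": 10000},
    ],
    "metro": [
        {"bsid": "2001:db8:b::100", "delay_ms": 4, "bw_mbps": 10000},
        {"bsid": "2001:db8:b::200", "delay_ms": 3, "bw_mbps": 1000},
    ],
    "backbone": [
        {"bsid": "2001:db8:c::100", "delay_ms": 10, "bw_mbps": 100000},
    ],
}


def stitch(domains, min_bw_mbps, delay_budget_ms):
    """Pick one BSID per domain and return the end-to-end SID list.

    Returns None when a domain has no BSID with enough bandwidth or
    when the summed delay exceeds the budget.
    """
    sid_list, total_delay = [], 0
    for domain in domains:
        usable = [entry for entry in CATALOGUE[domain]
                  if entry["bw_mbps"] >= min_bw_mbps]
        if not usable:
            return None
        best = min(usable, key=lambda entry: entry["delay_ms"])
        sid_list.append(best["bsid"])
        total_delay += best["delay_ms"]
    return sid_list if total_delay <= delay_budget_ms else None


path = stitch(["access", "metro", "backbone"],
              min_bw_mbps=5000, delay_budget_ms=25)
```

   The CNC only handles three SIDs in this example, regardless of how
   many hops each domain uses internally.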

2.5 Multi service orchestration

   In some scenarios, the computing service may need to pass through
   multiple computing service nodes. For example, in order to meet the
   security and stability requirements of a user service, data packets
   often need to pass through various service nodes in sequence, such
   as firewalls, IPSs, and application accelerators. This can be
   achieved through SRv6 service function chaining (SFC). SRv6 SFC is
   realized through the programmability of SRv6, and uses specific
   SIDs to represent the value-added services. The CNC can encode the
   value-added service functions from the service request as SIDs in
   the segment list of the network path, so that packets are forwarded
   to and processed by these service functions along the path. This
   maximizes the ability to integrate computing and network services.
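
   As a minimal, non-normative sketch of this encoding, the following
   Python code turns an ordered list of requested services into a
   segment list. The SERVICE_SID table, the service names, and the
   SIDs are assumptions of the example. The reversed ordering in
   to_srh_order() reflects the Segment Routing Header layout of RFC
   8754, where Segment List[0] holds the last segment of the path.

```python
# Hypothetical mapping from a value-added service to the SID that
# represents it; addresses use the IPv6 documentation prefix.
SERVICE_SID = {
    "firewall": "2001:db8:f::1",
    "ips": "2001:db8:f::2",
    "accelerator": "2001:db8:f::3",
}


def build_segment_list(services, destination_sid):
    """Return SIDs in traversal order: services, then destination."""
    unknown = [s for s in services if s not in SERVICE_SID]
    if unknown:
        raise ValueError(f"no SID registered for: {unknown}")
    return [SERVICE_SID[s] for s in services] + [destination_sid]


def to_srh_order(segment_list):
    """Reverse the list so that index 0 holds the last segment."""
    return list(reversed(segment_list))


path = build_segment_list(["firewall", "ips"], "2001:db8:d::1")
srh_segments = to_srh_order(path)
# Segments Left initially indexes the first segment to be visited.
segments_left = len(srh_segments) - 1
```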

2.6 Network reliability for computing interconnection

   Different applications hosted on the computing interconnection
   network have different requirements for network reliability. A
   network failure usually results in packet loss. Many interactive
   multimedia applications (such as cloud gaming) are very sensitive
   to packet loss: an interruption of dozens of milliseconds leads to
   a rapid decline in the quality of service. The traditional fast
   reroute mechanism of IP networks has some problems, such as complex
   configuration and poor switchover performance. A computing
   interconnection network based on SRv6 can address these problems.
   For example, Topology Independent Loop-Free Alternate (TI-LFA) can
   provide a local protection mechanism that completes path switching
   within a very short time after a network failure. In addition,
   SRv6 also provides a microloop avoidance mechanism, which prevents
   traffic loops during the short period after fault recovery.
   Therefore, computing services can be delivered continuously and
   stably.

2.7 Application-aware networking

   Traditionally, applications and networks are separated, and the
   network can only identify applications by means of the 5-tuple.
   This method is coarse-grained and cannot capture the real needs of
   applications for computing resources and networks. In the computing
   interconnection network, in order to provide efficient and
   quality-guaranteed computing services, the edge of the network is
   required to identify different applications and their needs from
   incoming packets, so as to provide different SLA services.
   Application-aware networking for IPv6/SRv6 can meet this demand: it
   can carry the application identification and the requirements for
   network and computing in the packet, for example by using IPv6
   extension headers. The network edge can then perceive these
   applications and their requirements, and steer the packets to the
   appropriate SRv6 path.
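
   The steering step at the network edge could look like the
   following sketch. It assumes that the application information has
   already been parsed from the packet. The AppInfo fields, the
   POLICIES table, and the fallback to best effort are hypothetical
   and do not correspond to any encoding defined by this document.

```python
from dataclasses import dataclass


@dataclass
class AppInfo:
    app_id: int          # application identification from the packet
    max_delay_ms: float  # delay the application can tolerate
    min_bw_mbps: float   # bandwidth the application needs


# Hypothetical SRv6 policies on the edge, ordered from the cheapest
# to the most premium one.
POLICIES = [
    {"name": "best-effort", "delay_ms": 50.0, "bw_mbps": 100.0},
    {"name": "high-bandwidth", "delay_ms": 30.0, "bw_mbps": 1000.0},
    {"name": "low-delay", "delay_ms": 5.0, "bw_mbps": 200.0},
]


def steer(info: AppInfo) -> str:
    """Return the first policy that satisfies both requirements."""
    for policy in POLICIES:
        if (policy["delay_ms"] <= info.max_delay_ms
                and policy["bw_mbps"] >= info.min_bw_mbps):
            return policy["name"]
    return "best-effort"  # assumed fallback when nothing matches


cloud_vr = steer(AppInfo(app_id=1, max_delay_ms=10.0, min_bw_mbps=150.0))
backup = steer(AppInfo(app_id=2, max_delay_ms=40.0, min_bw_mbps=800.0))
```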

2.8 Operations, Administration and Maintenance

   As an infrastructure that serves various industry customers and
   individual users, the computing interconnection network depends
   heavily on operations, administration, and maintenance. The network
   state changes frequently. When network quality deteriorates, the
   computing interconnection network needs to respond quickly and
   provide a better path. Therefore, real-time monitoring of the
   current network state is required, which can be used as the basis
   for more accurate and reasonable scheduling decisions and for
   guaranteeing the SLA requirements. SRv6 supports in-situ OAM
   (IOAM), which can measure network quality accurately and in real
   time. Based on the real-time network status, an SRv6-based network
   can better serve computing scheduling and network SLA guarantees.
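
   As an illustration only, the following sketch shows how measured
   delay samples could drive a switch to a better path. The samples
   stand in for values that would be obtained through IOAM; the
   PathMonitor class, the window of three samples, and the delay
   values are assumptions of this example.

```python
from collections import deque
from statistics import mean


class PathMonitor:
    """Keeps a sliding window of delay samples for one SRv6 path."""

    def __init__(self, name, sla_delay_ms, window=3):
        self.name = name
        self.sla_delay_ms = sla_delay_ms
        self.samples = deque(maxlen=window)

    def record(self, delay_ms):
        self.samples.append(delay_ms)

    def violates_sla(self):
        # Only judge once the window is full, to ignore single spikes.
        if len(self.samples) < self.samples.maxlen:
            return False
        return mean(self.samples) > self.sla_delay_ms


def choose_path(active, backup):
    """Switch to the backup path when the active one violates its SLA."""
    if active.violates_sla() and not backup.violates_sla():
        return backup
    return active


primary = PathMonitor("path-1", sla_delay_ms=10.0)
alternate = PathMonitor("path-2", sla_delay_ms=10.0)
for delay in (8.0, 9.0, 25.0, 30.0):  # the primary path degrades
    primary.record(delay)
for delay in (7.0, 7.5, 8.0):
    alternate.record(delay)
selected = choose_path(primary, alternate)
```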

3. Collaboration between computing and network

   The core concept of the computing interconnection network is
   collaboration between computing and network. Computing and network
   are no longer isolated entities; they need to cooperate with each
   other. Computing resources and network resources need to be managed
   and allocated from a global perspective. The reference architecture
   given in Figure 1 is one possible implementation. In this
   architecture, the centralized CNC can realize the integrated
   orchestration, control, and management of computing and network.
   This integrated orchestration, which is aimed at the diversified
   and customized requirements of computing and network convergence,
   can design product and service models based on flexible
   combinations of the atomic capabilities of computing resources and
   the network, and realize the unified orchestration, deployment, and
   guarantee of computing and network services. The collaborative
   orchestration and scheduling of computing and network provides a
   new network paradigm to accelerate the digital transformation of
   society.

4.  Best practices

   Based on the above-mentioned role of SRv6 in the computing
   interconnection network, as a typical practice, China Mobile has
   built the infrastructure of an SRv6-based DCI network and smart
   SD-WAN. The SRv6-based DCI network can uniformly access various
   computing resources and provide computing services. Smart SD-WAN is
   a new generation of SD-WAN that integrates overlay and underlay
   networks. The SRv6-based DCI network and smart SD-WAN enhance the
   ability to coordinate computing and network resources. They enable
   full connectivity of end, edge, cloud, and network, and combine the
   user's intent to achieve collaborative scheduling among
   application, computing, and network, and improve service quality
   assurance capabilities, so as to realize end-to-end,
   differentiated, deterministic, and value-added computing network
   services.

   The following is a specific application example. Consider a CDN
   scheduling system, in which CDN applications can be regarded as
   computing services required by users. In the traditional
   scheduling mode, the scheduling system usually allocates a CDN node
   according to the user's geographic location. This leads to a large
   number of users in a hotspot area being assigned to the same CDN
   node, so that some CDN nodes are busy while others are idle. This
   results in low resource utilization, and service quality cannot be
   guaranteed. In the computing interconnection network, the CNC can
   manage computing resources and network resources at the same time.
   It can assign users in the same hotspot area to different CDN nodes
   according to the nodes' computing load obtained in real time.
   Corresponding SRv6 paths are established to steer different users'
   packets to different CDN nodes. By coordinating computing and
   network, the problem of unbalanced resource allocation can be
   solved and the user service experience can be improved.

                             /                                   /
                            /  Computing and Network Controller /
                           /             (CNC)                 /
                                   :       :        :
                                   :       :        :
       ............................:       :        :..................
    +--:----+                              :                          :
  +-------+ |                              :                          :
  |CDN    | |                              :                          :
  |Node 1 |-+    /-------------------------:------------------------/ :
  +--+----+     /              +----------------+                  /  :
     |         / +----------+  |      Cloud     |  +----------+   /   :
     +--------+--|SDWAN CPE1|--|Specific Network|--|SDWAN CPE3|--+--+ :
             /   +----------+  |                |  +----------+ /   | :
            /      *           +----------------+         *    /    | :
           /       *                   |                  *   /     | :
          /        *             +----------+             *  / +----+-:+
         /         **SRv6 path1**|SDWAN CPE2|**SRv6 path2** /+-------+ |
        /                        +----------+              / |CDN    | |
       /                               |                  /  |Node 2 |-+
      /--------------------------------+-----------------/   +-------+
               +---+--+      +---+--+     +--+---+     +--+---+
             +------+ |    +------+ |   +------+ |   +------+ |
             |user 1|-+    |user 2|-+   |user 3|-+   |user 4|-+
             +------+      +------+     +------+     +------+

                   Figure 2: CDN system with SRv6 based network

   Specifically, as shown in Figure 2, CDN Node 1 and CDN Node 2 are
   attached to SD-WAN CPE1 and SD-WAN CPE3, respectively. A large
   number of users in the same area access the network through SD-WAN
   CPE2. It is assumed that CDN Node 1 is closer to this area. In the
   traditional way, user 1 to user 4 all access CDN Node 1. After the
   computing interconnection network is deployed, the CNC considers
   the network and computing load factors at the same time. User 1 and
   user 2 access CDN Node 1, and user 3 and user 4 access CDN Node 2.
   Meanwhile, two SRv6 paths are established on SD-WAN CPE2: path 1
   leads to CDN Node 1 and path 2 leads to CDN Node 2. User 1 and user
   2 use path 1, and user 3 and user 4 use path 2. In this way, better
   resource utilization and service experience can be achieved.
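
   The assignment in this example can be reproduced with the following
   sketch. It assumes that the CNC knows, for each CDN node, a spare
   session capacity and the SRv6 path that reaches it. The "spare"
   values and the nearest-first rule are hypothetical and are chosen
   only so that the result matches the description above.

```python
# Hypothetical state known to the CNC: nodes listed nearest-first for
# the hotspot behind SD-WAN CPE2, each with its spare session capacity
# and the SRv6 path that reaches it.
NODES = [
    {"name": "CDN Node 1", "spare": 2, "path": "SRv6 path1"},
    {"name": "CDN Node 2", "spare": 4, "path": "SRv6 path2"},
]


def assign(users, nodes):
    """Give each user the nearest node that still has spare capacity."""
    spare = {node["name"]: node["spare"] for node in nodes}
    result = {}
    for user in users:
        for node in nodes:
            if spare[node["name"]] > 0:
                spare[node["name"]] -= 1
                result[user] = (node["name"], node["path"])
                break
        else:
            result[user] = None  # no capacity left on any node
    return result


plan = assign(["user 1", "user 2", "user 3", "user 4"], NODES)
```

   Here user 1 and user 2 are mapped to CDN Node 1 over SRv6 path1,
   and user 3 and user 4 to CDN Node 2 over SRv6 path2.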

7.  Informative References

   [RFC8402]  Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L.,
              Decraene, B., Litkowski, S., and R. Shakir, "Segment
              Routing Architecture", RFC 8402,
              DOI 10.17487/RFC8402, July 2018,
              <https://www.rfc-editor.org/info/rfc8402>.

Authors' Addresses

   Xiaoqiu Zhang
   China Mobile

   Feng Yang
   China Mobile

   Weiqiang Cheng
   China Mobile

   Zhihua Fu
   New H3C Technologies


