RTGWG                                                              S. Gu
INTERNET-DRAFT                                                 G. Zhuang
Intended Status: Informational                       Huawei Technologies
                                                                  H. Yao
                                                                   X. Li
                                                            China Mobile

Expires: June 4, 2020                                   December 2, 2019


         A Report on Compute First Networking (CFN) Field Trial
                      draft-gu-rtgwg-cfn-field-trial-01


Abstract

   Compute First Networking (CFN) routes a service request to an
   optimal edge site in order to improve overall load balancing and
   system efficiency. In particular, when an edge site is overloaded,
   other edges offering the equivalent service can serve the request
   dynamically. This document describes a CFN field trial and the
   effect CFN achieved in it. It introduces the edge-to-edge exchange
   of available computing resource information for services and of
   network status, and illustrates a data plane that supports dynamic
   anycast based on late binding. The field trial shows that CFN can
   improve the overall queries per second (qps) served for a service
   hosted on multiple edges while balancing the load more evenly.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html
 


Gu, et al                                                       [Page 1]

INTERNET DRAFT              CFN Field Trial                     Dec 2019


Copyright and License Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.



Table of Contents

   1. Introduction  . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1  Terminology . . . . . . . . . . . . . . . . . . . . . . . .  3
   2. Testbed overview  . . . . . . . . . . . . . . . . . . . . . . .  3
   3. Procedures  . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     3.1 Control Plane  . . . . . . . . . . . . . . . . . . . . . . .  5
     3.2 Data Plane . . . . . . . . . . . . . . . . . . . . . . . . .  6
   4. Preliminary Tests . . . . . . . . . . . . . . . . . . . . . . .  9
     4.1 Requests rush to an edge (no system background load) . . . .  9
     4.2 Requests rush to an edge (system background load exists) . . 10
     4.3 Mixed requests rush to an edge (no system background load) . 11
     4.4 Impact from update frequency . . . . . . . . . . . . . . . . 12
   5. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
   6. Security Considerations . . . . . . . . . . . . . . . . . . . . 13
   7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 13
   8. Acknowledgements  . . . . . . . . . . . . . . . . . . . . . . . 13
   9. References  . . . . . . . . . . . . . . . . . . . . . . . . . . 13
     9.1  Normative References  . . . . . . . . . . . . . . . . . . . 13
     9.2  Informative References  . . . . . . . . . . . . . . . . . . 14
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 14

1. Introduction

   Compute First Networking (CFN) Scenarios and Requirements [CFN-req]
   describes the usage scenarios and requirements for dynamically
   dispatching service requests to multiple edge sites in order to
   overcome computing resource overload in edge computing. The Compute
   First Networking (CFN) framework document [CFN-fmwk] presents the
   basic system framework for dynamically routing a service request to
   a selected edge in real time based on computing load status and
   network conditions. This approach improves load balancing among
   multiple edges with service equivalency in a distributed manner.
   This document describes a concrete CFN field trial and its
   performance.


1.1  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.


2. Testbed overview

   We deployed a CFN node at each of three edge sites in Hangzhou. The
   sites are approximately 30 kilometers apart. Figure 1 shows the
   topology and configuration used for this CFN testbed.

        +-----+            edge site 1
       +-----+|                                        +---+
      +-----+|+           +----------+                 |   |
    +------+|+    ------  |CFN node 1| ----------------|   |
    |client|+             +----------+                 |   |
    +------+           inter-edge itf:10.11.103.1      |   |
                       service ID:SID_S                |   |
                       binding IP BIP1:10.11.102.1     |   |
                                                       |   |
                                                       | n |
       +-----+            edge site 2                  | e |
      +-----+|                                         | t |
     +-----+|+           +----------+                  | w |
   +------+|+    ------  |CFN node 2|------------------| o |
   |client|+             +----------+                  | r |
   +------+           inter-edge itf:10.12.103.1       | k |
                      service ID:SID_S                 |   |
                      binding IP BIP2:10.12.102.1      |   |
                                                       |   |
                                                       |   |
                                                       |   |
       +-----+            edge site 3                  |   |
      +-----+|                                         |   |
     +-----+|+           +----------+                  |   |
   +------+|+    ------  |CFN node 3| -----------------|   |
   |client|+             +----------+                  |   |
   +------+            inter-edge itf:10.13.103.1      +---+
                       service ID:SID_S
                       binding IP BIP3:10.13.102.1



                    Figure 1. CFN testbed overview 

   A matrix multiplication service S is provided by all three edge sites
   (or edges, for simplicity, in this document). The CFN nodes use a
   unique service ID, SID_S, to announce the reachability of service S.
   In our test, we use 200.200.200.201 for SID_S; SID_S can be
   considered an anycast IP address. Although the service is reachable
   through the single SID_S in the network, the three edges serve SID_S
   using three different binding IP (BIP) addresses: BIP1, BIP2 and
   BIP3, with addresses 10.11.102.1, 10.12.102.1 and 10.13.102.1, via
   CFN nodes 1, 2 and 3 respectively. A service node hosted on or
   attached to a CFN node only knows that it uses its BIP to serve
   service S and has no knowledge of SID_S.
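
   For illustration, the addressing in Figure 1 can be written down as
   a small lookup table. The Python sketch below is not taken from the
   trial implementation; the names CfnNode, TESTBED, to_binding_ip and
   to_service_id are hypothetical. It only shows the mapping between
   the shared SID_S and each edge's BIP that a CFN node applies on
   behalf of its service node.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address

SID_S = IPv4Address("200.200.200.201")  # anycast service ID used in the trial


@dataclass(frozen=True)
class CfnNode:
    """Addressing of one CFN node, copied from Figure 1."""
    name: str
    inter_edge_itf: IPv4Address  # inter-edge interface address
    binding_ip: IPv4Address      # BIP the local service node serves S on


TESTBED = {
    1: CfnNode("CFN node 1", IPv4Address("10.11.103.1"), IPv4Address("10.11.102.1")),
    2: CfnNode("CFN node 2", IPv4Address("10.12.103.1"), IPv4Address("10.12.102.1")),
    3: CfnNode("CFN node 3", IPv4Address("10.13.103.1"), IPv4Address("10.13.102.1")),
}


def to_binding_ip(node: CfnNode, dst: IPv4Address) -> IPv4Address:
    """Rewrite SID_S to the node's BIP; leave other destinations untouched."""
    return node.binding_ip if dst == SID_S else dst


def to_service_id(node: CfnNode, src: IPv4Address) -> IPv4Address:
    """Reverse mapping, applied to responses coming from the service node."""
    return SID_S if src == node.binding_ip else src
```

   Because only the CFN node performs this rewrite, the service node
   never needs to be configured with SID_S.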

   Each CFN node has an inter-edge interface IP address, which it uses
   to exchange computing load information with the other CFN nodes.
   About 200 simulated clients connect to each CFN node in the test.

3. Procedures

   The procedures are introduced in [CFN-fmwk]. For ease of reference,
   the control plane and data plane timeline diagrams are also shown
   here.

3.1 Control Plane

   When a service node is initiated for service S, the edge platform
   manager sends registration information to the CFN node that the
   service node attaches to. The registration contains the service ID
   SID_S and the binding IP (BIP) used to reach SID_S.

   Each CFN node regularly obtains computing load information for SID_S
   from the service node attached to it. The computing load information
   can be the CPU consumption for SID_S, the number of current
   connections, the queries per second (qps) processed, the total
   capacity, or other performance metrics. In our test, each type of
   metric is given a weight. CFN nodes distribute this information to
   each other using BGP extensions. Figure 2 shows the CFN control
   plane procedures.

      CFN      CFN                    CFN               Edge Platform
      Node 1   Node 2                 Node 3              Manager

       |        |                      |                   |
       |        |                      |                   |
       |        |                      |<------------------|
       |        |                      | 1.Service info    |
       |        |                      | registration/     |
       |        |                      | update/withdraw   |
       |        |                      | (SID_S, BIP 3)    |
       |        |                      |                   |
       |        |                      |                   |
       |        |                      |<------------------|
       |        |                      | 2.Computing load  |
       |        |                      | update triggering |
       |        |                      | (SID_S,computing  |
       |        |                      | load information) |
       |        |                      |                   |
       |        |                      |                   |
       |        |<---------------------|                   |
       |        |                      |                   |
       |<------------------------------|                   |
       |        |  3.BGP update for    |                   |
       |        |  computing load      |                   |
       |        | (SID_S, CFN node 3,  |                   |
       |        |  computing load info)|                   |
       |        |                      |                   |

                   Figure 2. CFN control plane  
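
   The weighting of metric types mentioned above could be realised as
   a weighted average. The following Python sketch is an assumption
   made for illustration: the metric set, the normalisation to the
   range 0 to 1, and the weight values are hypothetical, since this
   report does not state the values used in the trial.

```python
from dataclasses import dataclass


@dataclass
class LoadSample:
    """Per-service metrics, each normalised to the range 0.0 .. 1.0."""
    cpu: float          # CPU consumption for SID_S
    connections: float  # current connections / maximum connections
    qps: float          # queries per second processed / total capacity


# Hypothetical weights; the trial's actual values are not published here.
WEIGHTS = {"cpu": 0.5, "connections": 0.3, "qps": 0.2}


def load_score(sample: LoadSample, weights=WEIGHTS) -> float:
    """Weighted average of the metrics; a higher score means a busier node."""
    total = sum(weights.values())
    return (
        weights["cpu"] * sample.cpu
        + weights["connections"] * sample.connections
        + weights["qps"] * sample.qps
    ) / total
```

   A single score of this kind would be the "computing load info"
   carried for SID_S in step 3 of Figure 2; a real deployment could
   equally advertise the raw metrics and let each receiver weight them.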



3.2 Data Plane

   When a client sends a service request for service S, it uses SID_S as
   the destination IP address. In the test, SID_S is an anycast
   address. A client can learn the SID_S for a service in various ways,
   such as DNS or static configuration.

   When the CFN ingress, which is CFN node 1 in Figure 3, receives the
   request, it dynamically selects the most appropriate CFN egress based
   on the computing load information it has received. As Figure 3
   shows, CFN node 3 is selected as the CFN egress in this case. The
   CFN ingress then tunnels the data packet to the CFN egress.

   When the CFN egress receives the packet, it decapsulates the packet
   and maps the destination address from SID_S to the binding IP BIP3.
   The service node for service S receives the packet and processes it.
   The service response is returned to CFN node 3, which is
   conceptually the gateway of its attached service nodes for CFN
   services. CFN node 3 maps the source IP address from BIP3 back to
   SID_S and then tunnels the response to CFN node 1. CFN node 1
   decapsulates the packet and sends it to the client.

   For subsequent service request packets of the same flow that arrive
   at CFN node 1, CFN node 1 always uses CFN node 3 as the egress to
   ensure flow affinity.

                  CFN node 1              CFN node 3         Service
     client      (CFN ingress)           (CFN egress)       Node for S

       |              |                       |                   |
       |1.service req |                       |                   |
       |------------->|                       |                   |
       |dst=SID_S     |                       |                   |
       |src=client_IP |                       |                   |
       |              |                       |                   |
       |              |                       |                   |
       |      +----------------+              |                   |
       |      |2.Select CFN    |              |                   |
       |      |egress & save it|              |                   |
       |      +----------------+              |                   |
       |              |                       |                   |
       |              |3. forward service req |                   |
       |              |with encapsulation     |                   |
       |              |---------------------> |                   |
       |              |outer: dst=CFN_Node_3  |                   |
       |              |       src=CFN_Node_1  |                   |
       |              |inner: dst=SID_S       |                   |
       |              |       src=client_IP   |                   |
       |              |                       |                   |
       |              |              +----------------+           |
       |              |              |4.decap & map   |           |
       |              |              |SID_S to binding|           |
       |              |              |IP              |           |
       |              |              +----------------+           |
       |              |                       |                   |
       |              |                       |                   |
       |              |                       |5. forward pkt     |
       |              |                       |------------------>|
       |              |                       |dst=BIP3           |
       |              |                       |                   |
       |              |                       |                   |
       |              |                       |                   |
       |              |                       |  6. service rsp   |
       |              |                       |<----------------- |
       |              |                       |src=BIP3           |
       |              |                       |                   |
       |              |              +----------------+           |
       |              |              |7.map binding IP|           |
       |              |              |back to SID_S & |           |
       |              |              |encap           |           |
       |              |              +----------------+           |
       |              |                       |                   |
       |              |8. forward service rsp |                   |
       |              |with encapsulation     |                   |
       |              |<--------------------- |                   |
       |              |outer: dst=CFN_Node_1  |                   |
       |              |       src=CFN_Node_3  |                   |
       |              |inner: dst=client_IP   |                   |
       |              |       src=SID_S       |                   |
       |              |                       |                   |
       |        +----------+                  |                   |
       |        |9 decap   |                  |                   |
       |        +----------+                  |                   |
       |              |                       |                   |
       | 10. forward  |                       |                   |
       |<------------ |                       |                   |
       |dst=client_IP |                       |                   |
       |src=SID_S     |                       |                   |
       |              |                       |                   |


          Figure 3. CFN data plane for the first request of a flow
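
   Steps 2 and 3 of Figure 3, together with the flow affinity rule,
   can be sketched as follows. This is a minimal Python illustration
   and not the trial's implementation: the class CfnIngress, the
   5-tuple flow key, and the cost function (a plain sum of load and
   network cost on a common scale) are all assumptions.

```python
from typing import Dict, Tuple

# src IP, src port, dst IP, dst port, protocol
FlowKey = Tuple[str, int, str, int, str]


class CfnIngress:
    def __init__(self, load: Dict[str, float], net_cost: Dict[str, float]):
        self.load = load          # latest load score per egress, 0.0 .. 1.0
        self.net_cost = net_cost  # network cost per egress (assumed same scale)
        self.flows: Dict[FlowKey, str] = {}  # saved egress per flow (step 2)

    def select_egress(self, flow: FlowKey) -> str:
        """Return the saved egress for a known flow, else pick the cheapest."""
        if flow not in self.flows:
            # Assumption: total cost is load plus network cost.
            self.flows[flow] = min(
                self.load, key=lambda e: self.load[e] + self.net_cost[e]
            )
        return self.flows[flow]

    def encapsulate(self, flow: FlowKey, local: str) -> dict:
        """Build the outer/inner address pairs shown in step 3."""
        return {
            "outer": {"src": local, "dst": self.select_egress(flow)},
            "inner": {"src": flow[0], "dst": flow[2]},
        }
```

   Because the egress is looked up in the flow table before any
   selection is made, later load updates do not move an existing flow;
   they only influence where new flows are bound.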



4. Preliminary Tests

4.1 Requests rush to an edge (no system background load)

   In this test, we assume that the service nodes attached to all three
   edges have the same capacity and that no background computing tasks
   are running. Together, the service nodes can handle about 670
   queries per second (qps).

   Each client attached to edge 1 sends service requests to it at about
   40 qps, and the number of clients sending requests simultaneously
   varies. When 10 clients send requests, the computing power consumed
   reaches approximately 60% of the overall system maximum. The
   requests are all short-processing tasks; based on observation, each
   request takes roughly 4 ms to complete at the server side.

   CFN uses the computing load reported by the different edges,
   together with the network status, to spread the service requests.
   For comparison, we also dispatch requests by purely random selection
   among the edges.

   We tested with 5, 10 and 15 clients attached to one edge, which
   consume medium low, medium high and high computing resources of the
   whole system respectively. Note that the load exceeds the capacity
   of a single edge in every case. With 15 clients, the load almost
   reaches the maximum system capacity. Figure 4 shows the system qps
   and the average delay between a request being sent and the response
   being received by a client.


    +-------------+--------+----------------+---------+
    |  number of  | system |  average delay |   qps   |
    |  clients    |        |      (ms)      |         |
    +-------------+--------+----------------+---------+
    |             |   CFN  |     3.954      |   208.5 |
    |      5      +--------+----------------+---------+
    | (medium low)| random |     5.316      |   197.7 |
    +-------------+--------+----------------+---------+
    |             |   CFN  |     4.700      |   402.3 |
    |      10     +--------+----------------+---------+
    |(medium high)| random |     5.595      |   302.1 |
    +-------------+--------+----------------+---------+
    |             |   CFN  |     5.506      |   559.3 |
    |      15     +--------+----------------+---------+
    |    (high)   | random |     5.718      |   546.0 |
    +-------------+--------+----------------+---------+


   Figure 4. Test results when service requests rush to 
             a single edge when no system background load


   CFN achieves better results than random-selection-based application
   layer service dispatch. Average delay decreases by 25.62% and
   16.00%, and total qps increases by 5.5% and 33.17%, under medium low
   and medium high computing load respectively. The unbalanced incoming
   traffic is spread across all edges. Unlike random selection, CFN
   dispatches more requests to the local edge because its network cost
   is the lowest. CFN weighs the larger computing resources available at
   remote sites against the lower network cost of the local site when
   making a choice, and hence outperforms random selection. With the
   high number of clients, the maximum system capacity is almost
   reached, so CFN and random selection perform similarly.
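
   The percentages in this section follow directly from Figure 4. The
   short Python sketch below reproduces the arithmetic, taking random
   selection as the baseline; the function names are chosen here for
   illustration only.

```python
# (average delay in ms, qps) per number of clients, copied from Figure 4.
FIGURE_4 = {
    5:  {"cfn": (3.954, 208.5), "random": (5.316, 197.7)},
    10: {"cfn": (4.700, 402.3), "random": (5.595, 302.1)},
    15: {"cfn": (5.506, 559.3), "random": (5.718, 546.0)},
}


def delay_reduction(cfn_ms: float, random_ms: float) -> float:
    """Percentage by which CFN lowers delay relative to random selection."""
    return (random_ms - cfn_ms) / random_ms * 100


def qps_gain(cfn_qps: float, random_qps: float) -> float:
    """Percentage by which CFN raises qps relative to random selection."""
    return (cfn_qps - random_qps) / random_qps * 100


for clients, row in FIGURE_4.items():
    (cfn_ms, cfn_qps), (rnd_ms, rnd_qps) = row["cfn"], row["random"]
    print(f"{clients:2d} clients: "
          f"delay -{delay_reduction(cfn_ms, rnd_ms):.2f}%, "
          f"qps +{qps_gain(cfn_qps, rnd_qps):.2f}%")
```

   Applied to the 15-client row, the same formulas give differences of
   only a few percent, consistent with the observation that the two
   methods perform similarly near the maximum system capacity.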


4.2 Requests rush to an edge (system background load exists)

   In this test, different edges have different background computing
   tasks to handle. We randomly select an edge and subject it to a
   computing-intensive burst that consumes almost 90% of its capacity
   for about 4 seconds. Its background computing load then returns to
   zero for 2 seconds. This creates a scenario with one busy edge and
   idle edges. The other settings are the same as in Section 4.1.

   Figure 5 shows the system qps and the average delay between a
   request being sent and the response being received by a client for
   this case.


    +-------------+--------+----------------+---------+
    |  number of  | system |  average delay |   qps   |
    |  clients    |        |      (ms)      |         |
    +-------------+--------+----------------+---------+
    |             |   CFN  |     6.291      |  185.6  |
    |      5      +--------+----------------+---------+
    | (medium low)| random |     9.630      |  165.3  |
    +-------------+--------+----------------+---------+
    |             |   CFN  |     6.854      |  360.9  |
    |      10     +--------+----------------+---------+
    |(medium high)| random |     10.592     |  316.3  |
    +-------------+--------+----------------+---------+
    |             |   CFN  |     7.987      |  512.4  |
    |      15     +--------+----------------+---------+
    |    (high)   | random |     12.156     |  441.7  |
    +-------------+--------+----------------+---------+


   Figure 5. Test results when service requests rush to 
             a single edge when system background load exists


   The results show that CFN decreases average delay by 34.67%, 35.29%
   and 34.30% under medium low, medium high and high computing load
   respectively. Total qps increases by 12.28%, 14.10% and 16.01%
   under medium low, medium high and high computing load respectively.

   The performance gain of CFN in this test case is much higher than
   that in Section 4.1. The reason is that random service dispatching
   has a more than 20% chance of sending a request to an edge whose
   service node has a very high background computing load, while CFN
   greatly reduces that possibility.
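
   One way to arrive at a figure of this order is sketched below. It is
   offered as an assumption rather than as the authors' derivation: it
   supposes that random dispatch picks each of the three edges with
   equal probability and that exactly one edge at a time follows the
   cycle of a 4 second burst and a 2 second idle period.

```python
from fractions import Fraction

EDGES = 3               # edge sites in the testbed
BURST_S, IDLE_S = 4, 2  # burst and idle durations from this section

# Fraction of time the affected edge is under the burst.
duty_cycle = Fraction(BURST_S, BURST_S + IDLE_S)

# Uniform random dispatch picks the affected edge one time in three.
p_busy = Fraction(1, EDGES) * duty_cycle

print(f"P(request lands on the bursting edge) = {p_busy} "
      f"= {float(p_busy):.1%}")
```

   Under these assumptions the probability is 2/9, or roughly 22%,
   which is in line with the "more than 20%" stated above.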

   In addition, compared with the results in Section 4.1, delay
   increases by 59.10%, 45.83% and 45.06% at the three computing load
   levels with CFN, and by 81.15%, 89.31% and 112.60% with random
   selection. This shows that CFN adapts much better to dynamic changes
   in computing load, especially when the system background load is
   high.

4.3 Mixed requests rush to an edge (no system background load)

   We changed the characteristics of the service requests to reflect
   the coexistence of long-processing and short-processing tasks. A
   short-processing task takes roughly 4 ms to complete and a
   long-processing task takes roughly 400 ms to complete. The ratio of
   long to short tasks is approximately 1:100.

   Figure 6 shows the average delay between a request being sent and the
   response being received by a client and system qps for this case. 

      +-------------+--------+----------------+---------+
      |  number of  | system |  average delay |   qps   |
      |  clients    |        |      (ms)      |         |
      +-------------+--------+----------------+---------+
      |             |   CFN  |     5.205      |  193.5  |
      |      5      +--------+----------------+---------+
      | (medium low)| random |     5.398      |  193.5  |
      +-------------+--------+----------------+---------+
      |             |   CFN  |     5.201      |  393.4  |
      |      10     +--------+----------------+---------+
      |(medium high)| random |     5.985      |  385    |
      +-------------+--------+----------------+---------+
      |             |   CFN  |     6.147      |  559.4  |
      |      15     +--------+----------------+---------+
      |    (high)   | random |     8.499      |  559.4  |
      +-------------+--------+----------------+---------+

   Figure 6. Test results when mixed service requests rush to 
             a single edge when no system background load

   The results show that CFN decreases average delay by 3.58%, 13.10%
   and 27.67% under medium low, medium high and high computing load
   respectively. The qps differs little between CFN and random
   selection at any load level, and is identical in the medium low and
   high cases.

4.4 Impact from update frequency

   The computing load information is updated and distributed when a
   metric has changed by more than some threshold relative to the last
   distributed information. In the test, we used 10% of the maximum
   number of connections allowed and 5% CPU consumption as thresholds.
   The frequency of updates affects system performance, so we tested
   different update intervals to see their impact. The clients keep
   sending requests so that the computing resource consumption on each
   edge is maintained at medium low, which is about 5 connections. The
   update interval was set to 10s, 5s, 1s, 100ms, 10ms and 1ms. Figure
   7 shows the average delay between a request being sent and the
   response being received by a client under the different update
   intervals, and the improvement in delay relative to the 10 second
   interval.

   The results show that the more frequently updates are distributed,
   the better the performance.

      +-------------+--------------+----+----+----+-----+----+----+
      |# of clients |    Interval  | 10s| 5s |1s  |100ms|10ms| 1ms|
      |-------------+--------------+----+----+----+-----+----+----+
      |     5       |   Delay(us)  |6445|6255|5741|5312 |4883|4058|
      |(medium low) |--------------+----+----+----+-----+----+----+
      |             |Improvement(%)| 0  |3.5 |12.3|21.3 |32.3|58.8|
      +-------------+--------------+----+----+----+-----+----+----+

       Figure 7. Test results under different update intervals
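
   The threshold-triggered update described at the start of this
   section can be sketched as follows. The Python code is illustrative
   only: the class names are hypothetical, and treating the two
   thresholds as alternatives (an update is sent when either one is
   exceeded) is an assumption, since the text does not say how they
   are combined.

```python
from dataclasses import dataclass

CONN_THRESHOLD = 0.10  # 10% of the maximum number of connections allowed
CPU_THRESHOLD = 0.05   # 5% CPU consumption


@dataclass
class Load:
    connections: int
    cpu: float  # fraction of CPU in use, 0.0 .. 1.0


class LoadAdvertiser:
    def __init__(self, max_connections: int, initial: Load):
        self.max_connections = max_connections
        self.last_sent = initial  # last distributed information

    def should_update(self, current: Load) -> bool:
        """Compare the current load with the last distributed load."""
        conn_delta = abs(current.connections - self.last_sent.connections)
        cpu_delta = abs(current.cpu - self.last_sent.cpu)
        # Assumption: exceeding either threshold triggers an update.
        return (conn_delta > CONN_THRESHOLD * self.max_connections
                or cpu_delta > CPU_THRESHOLD)

    def on_sample(self, current: Load) -> bool:
        """Called once per update interval; True if an update was sent."""
        if self.should_update(current):
            self.last_sent = current
            return True
        return False
```

   In this sketch the update interval is simply how often on_sample is
   called: a shorter interval lets a threshold crossing be noticed and
   distributed sooner.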



5. Summary

   This draft presents a field trial of a CFN system with three edge
   sites in different locations. CFN enables a network-based,
   fast-reacting system that serves multi-edge computing services in a
   more balanced way. Computing load information is exchanged regularly
   between CFN nodes. The CFN egress that serves a particular service
   request is determined in real time and maintained to ensure flow
   affinity.

   The tests show that the overall request delay seen by clients
   decreases by up to about 35%, and that the system qps improves in
   most cases. CFN is a feasible and efficient way to provide
   multi-edge service balancing in edge computing.

6. Security Considerations

   The security risks mentioned in [CFN-fmwk] apply to the tests. As
   these are preliminary tests, no extra security risk control is
   currently implemented. Mechanisms such as authentication of edge
   nodes and fluctuation avoidance should be considered in deployment.

7. IANA Considerations

   No IANA action is required. 

8. Acknowledgements

   The authors would like to thank Xunwen Li's team members for their
   help in setting up the testbed in Hangzhou. 

9. References

9.1  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, May 2017.

9.2  Informative References

   [CFN-req] Geng, L., et al, "Compute First Networking (CFN) Scenarios
              and Requirements", draft-geng-cfn-req-00, November 2019.

   [CFN-fmwk] Li, Y., et al, "Framework of Compute First Networking
              (CFN)", draft-li-cfn-framework-00, November 2019.


Authors' Addresses


   Shuheng Gu
   Huawei Technologies

   EMail: gushuheng@huawei.com


   Guanhua Zhuang
   Huawei Technologies

   EMail: zhuangguanhua@huawei.com


   Huijuan Yao
   China Mobile

   EMail: yaohuijuan@chinamobile.com


   Xunwen Li
   China Mobile

   EMail: lixunwen@zj.chinamobile.com