ALTO WG | S. Yang |
Internet-Draft | L. Cui |
Intended status: Standards Track | Shenzhen University |
Expires: January 14, 2021 | M. Xu |
| Tsinghua University |
| Y. Yang |
| Tongji/Yale |
| R. Huang |
| Research Institute of Tsinghua University in Shenzhen |
| July 13, 2020 |
Delivering Functions over Networks: Traffic and Performance Optimization for Edge Computing using ALTO
draft-yang-alto-deliver-functions-over-networks-01
With the rapid development of the Internet, huge amounts of data are being generated. To satisfy user demands, service providers deploy services near the network edge. To achieve better performance, computing functions and user traffic need to be scheduled properly. However, it is challenging to schedule resources efficiently among distributed edge servers because the underlying information is lacking, e.g., network topology, traffic distribution, link delay/bandwidth, and the utilization/capability of computing servers. In this document, we employ the ALTO protocol to help deliver functions and schedule traffic within an edge computing platform. The protocol provides information about multiple resources to the distributed edge computing platform. The use of ALTO is intended to improve the efficiency of function delivery in edge computing.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 14, 2021.
Copyright (c) 2020 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
The Internet of Things (IoT), artificial intelligence, virtual reality, and augmented reality (VR/AR) are developing rapidly and hold promise for the future. These new applications generate huge amounts of data that need to be processed efficiently. The processing involves various kinds of functions/services, depending on user demands. For example: 1) surveillance video can be analysed by AI functions; 2) high-definition video or VR/AR video needs to be encoded/decoded; 3) content can be stored in edge networks, which can also be seen as a function/service. Function as a Service (FaaS) is becoming more and more popular among cloud computing providers, e.g., AWS Lambda and IBM OpenWhisk. It is expected that functions/services will be deployed anywhere in networks.
Some of these functions/services place strong requirements on the quality of service provided by the underlying networks, e.g., delay and jitter should be as small as possible to guarantee user experience. Unlike Mesos and Kubernetes, which can schedule computing resources efficiently within a computing cluster, deploying functions across wide area networks is much more complex.
Properly deploying functions over distributed networks requires taking multiple resources into consideration, including network traffic, topology, link delay/bandwidth, and the computing capacity/utilization of each computing cluster. Moreover, resources are usually scheduled across multiple domains to satisfy user demands. Thus, this information needs to be collected through unified interfaces and protocols, and resource scheduling algorithms SHOULD be optimized to improve user experience and network performance, e.g., load balancing. In this document, we deliver functions over edge computing networks to utilize computing and network resources more efficiently.
We use the ALTO (Application-Layer Traffic Optimization) protocol [RFC7285] to optimize network traffic and performance when delivering functions over the edge computing network. ALTO can provide global network information to distributed applications, information that the applications cannot retrieve or compute by themselves [RFC5693]. Generally, the ALTO protocol collects and computes network information for the distributed edge clusters, including link delay, network traffic, and other cost metrics. Based on pre-defined scheduling algorithms, the system then delivers the functions to the most appropriate edge clusters according to the information provided by the ALTO protocol.
For brevity, this document uses the terminology introduced in [RFC7285] and [I-D.ietf-alto-unified-props-new].
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
Edge computing was proposed to improve network performance in terms of latency, security, bandwidth, etc. In an edge computing infrastructure, servers are deployed at the edge to reduce the distance between users and servers. Users submit their tasks to the edge servers, which process the tasks and return the computational results to the users. Compared with traditional centralized computing, edge computing offers better latency, bandwidth, and network traffic performance. Nowadays, edge computing is used in many areas, especially for latency-sensitive applications in IoT, artificial intelligence, 5G, VR/AR, etc.
To improve network performance, we will deliver functions over edge computing, such that computing functions can be dynamically scheduled in a distributed edge computing network. However, when deploying functions to edge servers, multiple resources, including bandwidth, computing and link resources, should be allocated to meet the requirements in terms of latency and throughput.
Application-Layer Traffic Optimization (ALTO) [RFC7285] is designed to provide network information to distributed applications. More specifically, the ALTO server offers the network state and information needed to guide the resource scheduling process of distributed applications, which cannot retrieve this information by themselves. The ALTO protocol provides the essential network information, including network traffic, cost maps, and cost metrics, all of which are needed in the resource selection process. With this information, distributed applications can manage network traffic and select a low-delay path for accessing the network and processing computation tasks.
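As a rough illustration of how an edge platform might consume this information, the following sketch parses a cost map in the JSON layout defined in [RFC7285] and ranks destination PIDs by cost from a given source PID. The PID names, the tag, the cost values, and the helper rank_destinations are hypothetical and chosen only for this example; a real deployment would obtain the cost map from an ALTO server rather than from an embedded string.

```python
import json

# Hypothetical cost map following the RFC 7285 cost-map layout.
# PID names, the tag, and all cost values are invented for illustration.
COST_MAP_JSON = """
{
  "meta": {
    "dependent-vtags": [
      {"resource-id": "example-network-map", "tag": "example-tag"}
    ],
    "cost-type": {"cost-mode": "numerical", "cost-metric": "routingcost"}
  },
  "cost-map": {
    "pid-cameras": {"pid-edge-1": 5, "pid-edge-2": 12, "pid-edge-3": 8}
  }
}
"""


def rank_destinations(cost_map_doc, src_pid):
    """Return (destination PID, cost) pairs for src_pid, cheapest first."""
    costs = cost_map_doc["cost-map"].get(src_pid, {})
    return sorted(costs.items(), key=lambda item: item[1])


doc = json.loads(COST_MAP_JSON)
ranking = rank_destinations(doc, "pid-cameras")
for pid, cost in ranking:
    print(pid, cost)
```

The first entry of the ranking is the candidate with the lowest cost under the chosen metric; whether "routingcost" or a delay-oriented metric is the right input depends on what the ALTO server actually exposes.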
Since the edge computing clusters are distributed throughout the network, they have different network states, including link delay, topology, network traffic, computing capacity/utilization of each cluster, etc. When delivering functions, the scheduling decisions SHOULD be adaptive to the network states in order to achieve better performance. Therefore, the ALTO protocol can help manage the network information and traffic such that the function can be delivered to a proper edge computing cluster.
Network devices, including routers, servers and clients, are able to communicate with each other. In a realistic network, on the one hand, we have several limited resources, including:
On the other hand, with the development of network technology, we have several network services and functions providing efficient computation service for network users, including:
Consider an Internet of Things (IoT) scenario in which surveillance cameras are connected via the Internet and use object detection computing services. When a camera submits a task, the object detection function is delivered to an edge server, which handles the task and returns the results to the camera. The system requests and retrieves the network information, including link delay and other cost metrics, from ALTO servers through ALTO clients using the ALTO protocol. Based on the information provided by ALTO, the function and task are delivered to the edge server that offers the best performance with respect to the cameras. The infrastructure is shown in Figure 1.
+---------------+                  +-------------------+
|               |                  |                   |
|               |                  |                   |
|  ALTO Server  |<---------------->|    ALTO Client    |
|               |                  |                   |
|               |                  |                   |
+---------------+                  +------^-----+------+
                                          |     |
                                          |     |
                                          |     |
                                       +--+-----v--+
                                       |  Cluster  |
                               +-------+  Client   +------+
                               |       +-----------+      |
                               |                          |
                               |                          |
                               |                          |
                        +------v-------+          +-------v------+
                        |Edge Computing|          |Edge Computing|
                        |              |  ......  |              |
                        |  Cluster 1   |          |  Cluster N   |
                        +--------------+          +--------------+

Figure 1. Scenario of delivering function over edge network in IoT
Since many edge clusters and servers are distributed across the network, the system MUST handle a large number of edge devices and their corresponding network traffic. A cluster client is employed to manage the connectivity and traffic information of the distributed edge clusters. The ALTO client communicates with the cluster client and provides the necessary network information. ALTO is used to optimize network traffic and to guide the function delivery process in edge computing. It provides overall network state information for the distributed edge clusters and helps decide which edge cluster is appropriate for deploying the functions.
+---------------+                  +-------------------+
|               |   (1) Network    |                   |
|               |   Information    |                   |
|  ALTO Server  |<---------------->|    ALTO Client    |
|               |                  |                   |
|               |                  |                   |
+---------------+                  +------^-----+------+
                                          |     |
                          (2)Get clusters |     | (3)Select Cluster List
                                          |     |
                                       +--+-----v--+
                                       |  Cluster  |
                               +-------+  Client   +------+
                               |       +-----------+      |
                               |                          |
                               |  (4) Connect to Cluster  |
                               |   and deliver function   |
                        +------v-------+          +-------v------+
                        |Edge Computing|          |Edge Computing|
                        |              |  ......  |              |
                        |  Cluster 1   |          |  Cluster N   |
                        +--------------+          +--------------+

Figure 2. Delivering process in edge computing platform with ALTO
More specifically, the ALTO server collects and computes the network cost metrics, including link delay, availability, network traffic, bandwidth, etc. The information is then sent to the ALTO client. The ALTO client selects the appropriate edge clusters on which to deploy the target function. Finally, the system connects to the target servers and deploys the function there, so that users can submit their computation tasks to the selected edge clusters.
Note that the data transfer process uses the ALTO protocol described in [RFC7285] to support the efficiency and security of the delivery process. In this way, the edge computing clusters can retrieve the network information, so that the function is delivered to suitable clusters to achieve better performance in terms of latency, throughput, etc.
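The four steps can be sketched in code as follows. This is only an illustrative outline under stated assumptions: StubAltoClient and StubClusterClient are hypothetical stand-ins that return fixed data instead of talking to an ALTO server or to real clusters, and choosing the cluster with the minimum cost is one possible selection rule, not one mandated by this document.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    pid: str  # ALTO PID the cluster is assumed to belong to (hypothetical)


class StubAltoClient:
    """Stand-in for an ALTO client; serves fixed costs instead of querying a server."""

    def __init__(self, costs):
        self._costs = costs  # {source PID: {destination PID: cost}}

    def get_costs(self, src_pid):
        # Step (1): network information obtained from the ALTO server.
        return self._costs.get(src_pid, {})


class StubClusterClient:
    """Stand-in for the cluster client; records deployments in memory."""

    def __init__(self, clusters):
        self._clusters = clusters
        self.deployed = []

    def list_clusters(self):
        # Step (2): get the list of known edge clusters.
        return list(self._clusters)

    def deploy(self, function_name, cluster):
        # Step (4): connect to the cluster and deliver the function.
        self.deployed.append((function_name, cluster.name))


def deliver(function_name, user_pid, alto, cluster_client):
    costs = alto.get_costs(user_pid)
    candidates = [c for c in cluster_client.list_clusters() if c.pid in costs]
    if not candidates:
        raise LookupError("no cluster with a known cost from " + user_pid)
    # Step (3): select the cluster with the lowest cost (assumed rule).
    target = min(candidates, key=lambda c: costs[c.pid])
    cluster_client.deploy(function_name, target)
    return target


alto = StubAltoClient({"pid-cameras": {"pid-a": 12, "pid-b": 4}})
cluster_client = StubClusterClient(
    [Cluster("cluster-1", "pid-a"), Cluster("cluster-2", "pid-b")]
)
target = deliver("object-detection", "pid-cameras", alto, cluster_client)
print(target.name, cluster_client.deployed)
```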
We are inspired by the concept of serverless computing, a computing paradigm that provides function-based computing services and uses containerization technology to run functions. The container, which includes the running code, libraries, and data dependencies, is deployed to the target edge servers and clusters and orchestrated there by the container orchestrator Kubernetes (K8S). The container orchestration scheme is computed according to the network information provided by ALTO.
We use IBM OpenWhisk as the FaaS platform in edge clusters, where the resources are managed by K8S. Using containerization technology, functions can be flexibly delivered to the target edge server. When a user requests function-based edge computing services, the request is redirected to a suitable edge server for better performance.
We have implemented a prototype and are deploying it in real networks in Zhejiang Province, China. The initial results indicate that 1) the performance of edge computing is greatly improved when the underlying network information is provided; and 2) the information collection and scheduling policies need to be standardized to achieve coordination among different domains.
T.B.D.
To manage functions more efficiently, we introduce function standardization in our system. More specifically, functions in our system are standardized and expose standard APIs, so that users can easily access and apply for function-based computation services. Behind these APIs, the specific function code and Docker images can be updated and replaced according to standards and user demands, which benefits function management on the platform.
More specifically, function standardization consists of:
Note that function standardization is beneficial to the function delivery. By exposing the standard APIs, users can easily accomplish their tasks by sending requests to the interfaces of the system, bypassing the complicated resource deployment and configuration process. Meanwhile, function standardization is good for system management. Each function in the platform is saved and registered in specific edge servers, such that users can easily locate the target edge servers when applying for functions, and system operators can update or replace the target functions easily.
A function delivery platform can be a multi-domain system. For example, there may be multiple service providers offering the function-based computation service. In this case, we should consider how to collect and manage the network information from different domains, in order to achieve better function delivery performance in networks. Consequently, we SHOULD develop additional designs for our platform.
On the one hand, we introduce a layered design for function delivery. More specifically, we deploy multiple distributed registry servers in the lower layer, each of which handles the function registry of its own domain. We then deploy a centralized registry server in the upper layer to collect information from and manage the distributed registry servers in the lower layer. Each lower-layer server periodically reports the network information of its domain to the centralized server in the upper layer. The centralized server coordinates the domains by sending instructions to the distributed servers in the lower layer, which adjust their behavior accordingly. In this way, the centralized registry server can manage the distributed function and network information easily and efficiently, which benefits multi-domain system management.
On the other hand, we introduce policy management for multiple domains. Different domains MAY have different delivery policies, so we need to provide a policy management tool for multiple domains. When delivering functions in a multi-domain system, the tool provides an overall management policy to synchronize and coordinate the local policies of the individual domains. With the help of the policy management tool, domains with different policies can communicate and coordinate with each other, which allows multiple domains to be managed for efficient function delivery.
Recently, with the development of high-capacity computing devices, the computing power available in networks has improved considerably. However, due to the lack of efficient scheduling strategies, current computing platforms do not achieve high computing throughput, i.e., they cannot effectively schedule distributed computing power over a long period. To improve the scheduling efficiency of computing power, researchers have proposed high-throughput computing scheduling frameworks, for example, HTCondor, PBS, CPUsage, etc., which schedule limited distributed computing power to achieve better long-term throughput. Inspired by these high-throughput computing scheduling frameworks, we develop a scheduling framework for function delivery in order to achieve better network performance.
The objective of our scheduling framework for function delivery is to minimize computational latency. The basic idea is that our platform computes the function scheduling scheme according to the information collected by the ALTO server, including network congestion, resource utilization, etc. Users then access the most appropriate edge server, which provides the function-based computation service and returns the results to the users.
More specifically, when a user applies for the function delivery service, it sends a request to the interface provided by the ALTO server, along with its location and task information. The ALTO server also collects the resource utilization and network information of the decentralized edge servers. Then, according to the collected information, the ALTO server computes the function scheduling scheme to determine the edge server to which the function is delivered. The platform selects the edge server with the lowest computation latency for the user. However, if the selected edge server is overloaded, the platform searches for another edge server that satisfies the load balancing demand while still achieving acceptable latency. Finally, the user establishes a communication channel with the target edge server, which provides the function-based service and returns the results to the user.
By developing the scheduling framework and strategy for function delivery, our platform aims to maintain stable network conditions and load balance over a long period, which benefits the reliability of the system. At the same time, users can obtain a low-latency and high-throughput function delivery service.
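A minimal sketch of this two-layer arrangement is given below. The class names, the report fields, the "accept_new_functions" instruction, and the load threshold are all hypothetical and serve only to make the report-then-instruct cycle concrete; this document does not define the actual report format or instruction set.

```python
class DomainRegistry:
    """Lower-layer registry: tracks the functions registered in one domain."""

    def __init__(self, domain):
        self.domain = domain
        self.functions = {}    # function name -> edge server name
        self.load = 0.0        # domain-level utilization in [0, 1] (assumed metric)
        self.accepting = True  # adjusted by upper-layer instructions

    def register(self, function_name, server):
        self.functions[function_name] = server

    def report(self):
        # Periodic report sent to the centralized registry.
        return {
            "domain": self.domain,
            "functions": sorted(self.functions),
            "load": self.load,
        }

    def apply(self, instruction):
        # Adjust local behavior according to the central instruction.
        self.accepting = instruction["accept_new_functions"]


class CentralRegistry:
    """Upper-layer registry: collects reports and coordinates the domains."""

    def __init__(self, load_limit=0.8):
        self.load_limit = load_limit
        self.reports = {}

    def sync(self, registries):
        for reg in registries:
            report = reg.report()
            self.reports[reg.domain] = report
            # Hypothetical rule: heavily loaded domains stop accepting functions.
            reg.apply({"accept_new_functions": report["load"] <= self.load_limit})

    def locate(self, function_name):
        """Return the domains that reported the given function."""
        return sorted(
            d for d, r in self.reports.items() if function_name in r["functions"]
        )


a = DomainRegistry("domain-a")
a.register("object-detection", "edge-a1")
a.load = 0.9
b = DomainRegistry("domain-b")
b.register("object-detection", "edge-b3")
b.register("video-transcode", "edge-b1")
b.load = 0.3

central = CentralRegistry()
central.sync([a, b])
print(central.locate("object-detection"), a.accepting, b.accepting)
```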
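The selection rule described above can be sketched as follows. The EdgeServer fields, the 0.8 load threshold, and the fallback used when every server is overloaded are assumptions made for this example; the document itself does not specify how overload is defined or what happens when no server meets the load balancing demand.

```python
from dataclasses import dataclass


@dataclass
class EdgeServer:
    name: str
    latency_ms: float  # estimated computation latency for this user (assumed input)
    load: float        # current utilization in [0, 1] (assumed input)


def select_server(servers, max_load=0.8):
    """Pick the lowest-latency server that is not overloaded.

    If every server exceeds max_load, fall back to the least-loaded one
    (an assumption of this sketch, not a rule stated in the document).
    """
    if not servers:
        raise ValueError("no edge servers available")
    by_latency = sorted(servers, key=lambda s: s.latency_ms)
    for server in by_latency:
        if server.load <= max_load:
            return server
    return min(by_latency, key=lambda s: s.load)


servers = [
    EdgeServer("edge-1", latency_ms=5.0, load=0.95),  # fastest but overloaded
    EdgeServer("edge-2", latency_ms=9.0, load=0.40),
    EdgeServer("edge-3", latency_ms=20.0, load=0.10),
]
chosen = select_server(servers)
print(chosen.name)
```

Scanning the candidates in latency order means the first server that passes the load check is also the lowest-latency server among those that satisfy the load constraint.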
T.B.D.
This document includes no requests to IANA.
[RFC2119] | Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997. |
[RFC5693] | Seedorf, J. and E. Burger, "Application-Layer Traffic Optimization (ALTO) Problem Statement", RFC 5693, DOI 10.17487/RFC5693, October 2009. |
[RFC7285] | Alimi, R., Penno, R., Yang, Y., Kiesel, S., Previdi, S., Roome, W., Shalunov, S. and R. Woundy, "Application-Layer Traffic Optimization (ALTO) Protocol", RFC 7285, DOI 10.17487/RFC7285, September 2014. |
[I-D.ietf-alto-unified-props-new] | Roome, W., Randriamasy, S., Yang, Y., Zhang, J. and K. Gao, "Unified Properties for the ALTO Protocol", Internet-Draft draft-ietf-alto-unified-props-new-09, September 2019. |