rtgwg Y. Li
Internet-Draft L. Iannone
Intended status: Informational Huawei Technologies
Expires: May 4, 2021 J. He
City University of Hong Kong
L. Geng
P. Liu
China Mobile
Y. Cui
Tsinghua University
October 31, 2020
Architecture of Dynamic-Anycast in Compute First Networking (CFN-Dyncast)
draft-li-rtgwg-cfn-dyncast-architecture-00
Abstract
Compute First Networking (CFN) Dynamic Anycast refers to in-network
edge computing, where a single service offered by a provider has
multiple instances attached to multiple edge sites. In this
scenario, flows are assigned and consistently forwarded to a specific
instance through an anycast approach based on the network status as
well as the status of the different instances.
This document describes an architecture for the Dynamic Anycast
(Dyncast) in Compute First Networking (CFN). It provides an
overview, a description of the various components, and a workflow
example showing how to provide a balanced multi-edge based service in
terms of both computing and networking resources through dynamic
anycast in real time.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
Li, et al. Expires May 4, 2021 [Page 1]
Internet-Draft CFN-dyncast Architecture October 2020
This Internet-Draft will expire on May 4, 2021.
Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Definition of Terms
3. CFN-Dyncast Architecture Overview
4. Architectural Components and Interactions
4.1. Service Identity and Bindings
4.2. Service Notification between Instances and CFN node
4.3. CFN Dyncast Control Plane
4.4. Service Demand Dispatching
4.5. CFN Dispatcher
5. Summary of the key elements of CFN Dyncast Architecture
6. Conclusion (and call for contributions)
7. Security Considerations
8. IANA Considerations
9. Informative References
Acknowledgements
Authors' Addresses
1. Introduction
The Dynamic Anycast in Compute First Networking (CFN-Dyncast) use
cases and problem statement document
[I-D.geng-rtgwg-cfn-dyncast-ps-usecase] describes usage scenarios
that require an edge to be dynamically selected from multiple edge
sites to serve an edge computing service demand, based on the
computing resources available at the site and the network status in
real time. Multiple edges provide service equivalency and service
dynamism in CFN. The current network architecture in edge computing
provides relatively static service dispatching, for example, to the
closest edge, or to the server with the most computing resources without
considering the network status. Dynamic Anycast takes the dynamic
nature of the computing load as well as the network status as metrics
for deciding a flow's service dispatch, and at the same time
maintains flow affinity throughout the service life cycle.
The CFN-Dyncast architecture presents anycast-based service and
access models. The aim is to solve the problematic aspects of
existing network-layer edge computing service deployment, including
unawareness of the computing resource information of a service,
static edge selection, isolated network and computing metrics, and/or
slow refresh of status.
CFN-Dyncast assumes there are multiple equivalent edge instances
implementing the same single service (for example, the same service
function instantiated on several edge nodes). A single edge node has
limited computing resources attached, and different edge nodes may
have different resources available, such as CPU or GPU. Because
multiple edge nodes are interconnected and can collaborate with each
other, it is possible to balance the service load and network load in
CFN. The computing resources available to serve a request are usually
considered the main metric for assigning a service demand to an
instance of the service. However, the status of the network, in
particular of the paths toward the instances, varies over time, and
paths may become congested; network status is therefore another key
attribute to be considered. CFN-Dyncast aims at providing a layer 3
protocol framework able to dispatch the service demand to the "best"
edge node in terms of both computing resources and network status, in
real time and without application- and/or service-specific
dependencies.
This document describes a general architecture for service
notification, status update, and service dispatch in CFN edge
computing.
2. Definition of Terms
CFN: Compute First Networking
SID: Service ID, an anycast IP address that represents a service and
that clients use to access that service. The SID is independent of
which service instance serves the service demand. Usually multiple
service instances serve a single service.
BID: Binding ID, an address used to reach a service instance for a
given SID. It is usually a unicast IP address. A service can be
provided by multiple service instances with different BIDs.
CFN-Dyncast: as defined in [I-D.geng-rtgwg-cfn-dyncast-ps-usecase].
3. CFN-Dyncast Architecture Overview
Service instances can be hosted on servers, virtual machines, access
routers, or gateways in edge data centers. The CFN node is the glue
of a CFN-Dyncast network: it exchanges the computing resource
information of the service instances attached to it, and it forwards
flows consistently toward such instances.
Figure 1 shows the architecture of CFN-Dyncast. CFN nodes are
usually deployed at the edges of the operator infrastructure, where
clients are connected. As such, we can consider that clients are
logically connected to CFN nodes. The purpose of a CFN node is to
consistently direct flows coming from clients to an instance of the
service the flow is supposed to go through. Service instances are
initiated at different edge sites, where a CFN node is also running.
A single service can have a huge number of instances running on
different CFN nodes. A "Service ID" (SID) is used to uniquely
identify a service, at the same time identifying the whole set of
instances of that specific service, no matter where those instances
are running. Several instances of the service can run on the same
CFN node (e.g., one instance per CPU core), and instances can also
run on several different CFN nodes (e.g., one instance per PGW-U in
a 5G network). Each instance is associated with a "Binding ID" (BID)
indicating where the instance is running. Hence, there is a dynamic
binding between an SID (the service) and a set of BIDs (the instances
of the service) and such bindings are enriched with information
concerning the network state and the available resources so that at
each new service request (a new flow) CFN nodes can decide which
instance is the most appropriate to handle the request. This
highlights the anycast part of CFN-Dyncast, since flows are routed
toward one service end-point among a set of equivalent end-points,
i.e., one-to-one-out-of-many.
When a client sends a service demand, it will be delivered to the
most appropriate instance of the service attached to a CFN node. A
service demand is normally the first packet of a data flow, not
necessarily an explicit out-of-band service request. Once the CFN
node has decided which instance has to serve the flow, flow affinity
must be guaranteed, meaning that all packets belonging to the same
flow have to go through the same service instance.
  edge site 1         edge site 2           edge site 3
                                             +------------+
  +------------+                           +------------+ |
+-+----------+ |                         +------------+ |-+
|  service   | |                         |  service   | |
|  instance  |-+                         |  instance  |-+
+------------+                           +------------+
      |                                        |
      |           +-----------------+          |
 +----------+     |                 |     +----------+
 |CFN node 1| ----| Infrastructure  |---- |CFN node 3|
 +----------+     |                 |     +----------+
      |           +-----------------+
      |                    |
      |                    |
 +----------+         +----------+
 |   CFN    |         |CFN node 2|
 |Dispatcher|         +----------+
 +----------+              |
      |                    |
      |                    |
    +-----+             +------+
  +------+|           +------+ |
  |client|+           |client|-+
  +------+            +------+
Figure 1: CFN-Dyncast Architecture
4. Architectural Components and Interactions
Figure 1 also shows that the local components of the architecture are
the service instance, the CFN node, the CFN dispatcher, and the
client. The following subsections provide an overview of how some of
these architectural components interact. The figures accompanying
the examples do not show the interconnecting infrastructure, to avoid
clutter.
4.1. Service Identity and Bindings
As previously stated, the CFN-Dyncast architecture uses Service ID
(SID) and Binding ID (BID) in order to identify services and their
instances.
Service ID (SID) is an anycast service identifier (which may or may
not be a routable IP address). It is used to access a specific
service no matter which service instance eventually handles the
client's flow. CFN nodes must know SIDs (and their bindings) in
advance and must be able to identify which flow needs which service.
This can be achieved in different ways, for example, by using a
special range or coding of anycast IP addresses as SIDs, or by using
DNS.
Binding ID (BID) is a unicast IP address. It is usually the
interface IP address of a service instance. The mapping and binding
from an SID to a BID is dynamic and depends on the computing
resources and network state at the time the service demand is made.
The CFN node must be able to guarantee flow affinity, i.e., always
steering the flow toward the same instance.
Figure 2 shows an abstract example of the use of SIDs and BIDs.
There are three services, namely SID1, SID2, and SID3. In
particular, SID2 has two instances on different CFN nodes (CFN node 2
and CFN node 3). In this case, the complete list of bindings (only in
terms of SID and BID, without network or resource state) is:
o SID1:BID21
o SID2:BID22,BID32
o SID3:BID33
SID: Service ID
BID: Binding ID
                                                SID1
                                             +--------+   service
                                          +--| BID21  |  instance1
                                          |  +--------+
                       +----------+       |
                +------|CFN node 2|-------|     SID2
                |      +----------+       |  +--------+   service
                |                         +--| BID22  |  instance2
                |                            +--------+
                |
+------+   +----------+
|client|---|CFN node 1|                         SID2
+------+   +----------+                      +--------+   service
                |                         +--| BID32  |  instance3
                |                         |  +--------+
                |      +----------+       |
                +------|CFN node 3|-------|     SID3
                       +----------+       |  +--------+   service
                                          +--| BID33  |  instance4
                                             +--------+
Figure 2: CFN-Dyncast Architectural Concept Example
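The bindings of Figure 2 can be pictured as a small lookup structure.
The following Python sketch is illustrative only: the BindingTable
class and its method names are assumptions made for this example and
are not defined by this document.

```python
from collections import defaultdict


class BindingTable:
    """Hypothetical in-memory map from a SID to the BIDs of its instances."""

    def __init__(self):
        self._bindings = defaultdict(set)

    def bind(self, sid, bid):
        # Record that the instance reachable at `bid` serves `sid`.
        self._bindings[sid].add(bid)

    def unbind(self, sid, bid):
        # Remove one instance; drop the SID once no instance is left.
        self._bindings[sid].discard(bid)
        if not self._bindings[sid]:
            del self._bindings[sid]

    def instances(self, sid):
        # Sorted list of BIDs currently bound to `sid` (empty if unknown).
        return sorted(self._bindings.get(sid, ()))


# The bindings listed for Figure 2.
table = BindingTable()
for sid, bid in [("SID1", "BID21"), ("SID2", "BID22"),
                 ("SID2", "BID32"), ("SID3", "BID33")]:
    table.bind(sid, bid)

print(table.instances("SID2"))
```

A set per SID keeps the binding dynamic: instances can be added or
withdrawn without affecting the SID that clients use.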
4.2. Service Notification between Instances and CFN node
The CFN-Dyncast service side is responsible for notifying the CFN
node it is attached to about the SID-to-BID mapping information when
a service instance is instantiated or terminated, or when its metrics
(e.g., load) change, as shown in Figure 3.
SID: Service ID
BID: Binding ID                         service info
                                   (SID1, BID21, metrics)
                                   (SID2, BID22, metrics)
                                     <--------------->
                                                SID1
                                             +--------+   service
                                          +--| BID21  |  instance1
                                          |  +--------+
                       +----------+       |
                +------|CFN node 2|-------|     SID2
                |      +----------+       |  +--------+   service
                |                         +--| BID22  |  instance2
                |                            +--------+
                |
+------+   +----------+
|client|---|CFN node 1|                         SID2
+------+   +----------+                      +--------+   service
                |                         +--| BID32  |  instance3
                |                         |  +--------+
                |      +----------+       |
                +------|CFN node 3|-------|     SID3
                       +----------+       |  +--------+   service
                                          +--| BID33  |  instance4
                                             +--------+
                                     <---------------->
                                        service info
                                   (SID2, BID32, metrics)
                                   (SID3, BID33, metrics)
Figure 3: CFN-Dyncast Service Notification
The computing resource information of service instances is key
information in CFN-Dyncast. Some of it is relatively static, such as
CPU/GPU capacity, and some is very dynamic, for example, CPU/GPU
utilization, the number of associated sessions, and the number of
queued requests. The service side has to notify this information to
the CFN node it is attached to and keep it refreshed. Various means
can be used, for instance a protocol or an API of the management
system. Conceptually, a CFN node keeps track of the SIDs and
computing metrics of all service instances attached to it in real
time.
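As a rough illustration of the bookkeeping described above, the
sketch below shows how a CFN node might store the (SID, BID, metrics)
notifications of Figure 3. The LocalServiceRegistry class, its method
names, and the "cpu_util" metric name are assumptions for this
example; this document does not define the notification interface.

```python
class LocalServiceRegistry:
    """Hypothetical per-CFN-node store of attached service instances."""

    def __init__(self):
        self._entries = {}  # (sid, bid) -> latest metrics dict

    def notify(self, sid, bid, metrics):
        # Called when an instance is instantiated and on every refresh;
        # the newest metrics replace the previous ones.
        self._entries[(sid, bid)] = dict(metrics)

    def withdraw(self, sid, bid):
        # Called when the instance terminates.
        self._entries.pop((sid, bid), None)

    def metrics(self, sid):
        # Latest metrics of every local instance of `sid`, keyed by BID.
        return {bid: m for (s, bid), m in self._entries.items() if s == sid}


# CFN node 2 in Figure 3 learns about its two attached instances.
registry = LocalServiceRegistry()
registry.notify("SID1", "BID21", {"cpu_util": 0.30})
registry.notify("SID2", "BID22", {"cpu_util": 0.40})
registry.notify("SID2", "BID22", {"cpu_util": 0.70})  # metric refresh
print(registry.metrics("SID2"))
```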
4.3. CFN Dyncast Control Plane
CFN Dyncast needs a control plane that allows CFN nodes to share
information about resources and costs. Through the control plane,
CFN nodes share and update among themselves the service information
and the associated computing metrics for the service instances
attached to them. As a network node, a CFN node also monitors the
network state toward other CFN nodes. In this way, each CFN node is
able to aggregate the information and build a complete view of the
available resources and the cost to reach them. For instance, for
the scenario in Figure 3, the different CFN nodes will learn that
there exist two instances of SID2, each of which has a certain
computational capacity expressed in the metrics. Different
mechanisms can be used to update the status, for instance, BGP
[RFC4760], an IGP, or a controller-based mechanism.
An important question CFN Dyncast raises is how to represent the
computing metrics. A single numeric value calculated from weighted
attributes such as CPU/GPU consumption and/or the number of
associated sessions may be the easiest. However, it may not
accurately reflect the computing resources of interest.
Multi-dimensional variables may give finer information; however, the
structure and the algorithmic processing should be sufficiently
general to accommodate different types of services (i.e., metrics).
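The trade-off between a single value and multi-dimensional metrics
can be made concrete with a small sketch. The composite_load function,
the metric names, and the weights below are assumptions for
illustration; this document does not define a metric format. Metrics
are assumed to be normalized to the range [0, 1].

```python
def composite_load(metrics, weights):
    """Collapse normalized metrics into one weighted average value."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(weights[name] * metrics[name] for name in weights)
    return weighted / total_weight


weights = {"cpu_util": 0.5, "gpu_util": 0.3, "session_ratio": 0.2}
score = composite_load(
    {"cpu_util": 0.5, "gpu_util": 0.2, "session_ratio": 0.8}, weights)

# Two very different instances can collapse to the same single value,
# which is the loss of information mentioned in the text.
equal = {"cpu_util": 1.0, "gpu_util": 1.0}
cpu_bound = composite_load({"cpu_util": 0.9, "gpu_util": 0.1}, equal)
gpu_bound = composite_load({"cpu_util": 0.1, "gpu_util": 0.9}, equal)
print(score, cpu_bound, gpu_bound)
```

A GPU-heavy service would care about the difference between the last
two instances, which the single value hides.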
A second important issue is related to system stability and signaling
overhead. As computing metrics may change very frequently, it needs
to be determined when and how frequently such information should be
exchanged among CFN nodes. A spectrum of approaches can be employed:
interval-based update, threshold-based update, policy-based update,
etc.
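One possible combination of interval-based and threshold-based
updates is sketched below. The UpdatePolicy class and its three
parameters are assumptions for illustration, not a mechanism specified
by this document: an update is sent when the metric has moved by at
least a threshold (but not more often than a minimum interval), or
when a maximum interval has elapsed.

```python
class UpdatePolicy:
    """Hypothetical sender-side filter for metric advertisements."""

    def __init__(self, threshold, min_interval, max_interval):
        self.threshold = threshold        # minimum change worth sending
        self.min_interval = min_interval  # rate limit, in seconds
        self.max_interval = max_interval  # periodic refresh, in seconds
        self._last_value = None
        self._last_time = None

    def should_advertise(self, value, now):
        if self._last_value is None:
            send = True                   # first value is always sent
        else:
            elapsed = now - self._last_time
            changed = abs(value - self._last_value) >= self.threshold
            send = elapsed >= self.max_interval or (
                changed and elapsed >= self.min_interval)
        if send:
            self._last_value, self._last_time = value, now
        return send


policy = UpdatePolicy(threshold=0.2, min_interval=1.0, max_interval=30.0)
samples = [(0.50, 0.0), (0.55, 5.0), (0.80, 5.5), (0.10, 5.9), (0.80, 40.0)]
decisions = [policy.should_advertise(value, now) for value, now in samples]
print(decisions)
```

The minimum interval damps oscillation when the load fluctuates
rapidly, while the maximum interval bounds how stale the information
held by other CFN nodes can become.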
4.4. Service Demand Dispatching
Assuming that the set of metrics is well defined and that the update
rate is tailored so as to keep the system stable, the CFN Dyncast
data plane has the task of dispatching flows to the "best" service
instance. When a new flow arrives at a CFN ingress node, the ingress
node selects the most appropriate CFN egress in terms of the network
status and the computing resources of the attached service instances,
and guarantees flow affinity for the flow from then on.
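How network status and computing resources are combined is left open
by this document. As one possible illustration, the sketch below
picks the egress with the lowest weighted sum of a network cost and a
computing load, both assumed to be normalized to [0, 1] with lower
values being better. The select_egress function and the
network_weight parameter are hypothetical.

```python
def select_egress(candidates, network_weight=0.5):
    """Return the CFN egress with the lowest combined cost.

    `candidates` maps an egress name to a pair
    (network_cost, computing_load); both are assumed normalized.
    """
    if not candidates:
        raise ValueError("no CFN egress advertises this service")

    def combined_cost(item):
        _, (network_cost, computing_load) = item
        return (network_weight * network_cost
                + (1 - network_weight) * computing_load)

    return min(candidates.items(), key=combined_cost)[0]


# SID2 is reachable through two egresses, as in Figure 3.
sid2_candidates = {
    "CFN node 2": (0.2, 0.9),  # close, but heavily loaded
    "CFN node 3": (0.6, 0.3),  # farther away, lightly loaded
}
balanced = select_egress(sid2_candidates)
network_first = select_egress(sid2_candidates, network_weight=0.9)
print(balanced, network_first)
```

With equal weights the lightly loaded instance wins; when the network
cost dominates, the closer egress is preferred instead.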
Flow affinity is one of the critical features that CFN-Dyncast should
support. Flow affinity means that the packets of the same flow for a
service should always be sent to the same CFN egress, to be processed
by the same service instance.
When a new flow arrives and the most appropriate CFN egress and
service instance are determined, a flow binding table should save
this flow binding information, which may include the flow identifier,
the selected CFN node, the affinity timeout value, etc. The
subsequent packets of the flow are forwarded based on the table.
Figure 4 shows an example of what a flow binding table at a CFN
ingress node can look like.
+-----------------------------------------+------------+--------+
|             Flow Identifier             |            |        |
+------+--------+---------+--------+------+ CFN egress | timeout|
|src_IP| dst_IP |src_port |dst_port|proto |            |        |
+------+--------+---------+--------+------+------------+--------+
|  X   |  SID2  |    -    |  8888  | tcp  | CFN node 2 |  xxx   |
+------+--------+---------+--------+------+------------+--------+
|  Y   |  SID2  |    -    |  8888  | tcp  | CFN node 3 |  xxx   |
+------+--------+---------+--------+------+------------+--------+
Figure 4: Example of flow binding table
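A flow binding table like the one in Figure 4 could behave as in the
following sketch. The FlowBindingTable class is hypothetical, and two
behaviours are assumptions of this example rather than requirements of
this document: the timeout is treated as an idle timeout that is
refreshed by every packet, and an expired entry triggers a fresh
egress selection.

```python
class FlowBindingTable:
    """Hypothetical flow binding table kept at a CFN ingress node."""

    def __init__(self, select_egress):
        self._select_egress = select_egress  # callback: sid -> egress
        self._entries = {}                   # flow_id -> (egress, expiry)

    def lookup(self, flow_id, sid, now, timeout):
        entry = self._entries.get(flow_id)
        if entry is not None and entry[1] > now:
            egress = entry[0]                  # existing flow: keep affinity
        else:
            egress = self._select_egress(sid)  # new or expired flow
        self._entries[flow_id] = (egress, now + timeout)
        return egress


# The selection callback is stubbed with a fixed sequence of decisions.
decisions = iter(["CFN node 2", "CFN node 3"])
table = FlowBindingTable(lambda sid: next(decisions))

flow_x = ("X", "SID2", None, 8888, "tcp")
first = table.lookup(flow_x, "SID2", now=0, timeout=10)
second = table.lookup(flow_x, "SID2", now=5, timeout=10)   # still bound
third = table.lookup(flow_x, "SID2", now=20, timeout=10)   # entry expired
print(first, second, third)
```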
A flow entry in the flow binding table can be identified using the
classic 5-tuple value. However, it is worth noting that different
services may have different granularity of flow identification. For
instance, an RTP video stream may use different port numbers for
video and audio, and it would be identified as two flows if a 5-tuple
flow identifier were used, although both should be treated as the
same flow. A 3-tuple based flow identifier is therefore more suitable
for this case. Hence, it is desirable to provide a certain level of
flexibility in identifying flows in order to apply flow affinity.
Flow affinity attributes can be configured per service in advance.
For each service, the information can include the flow identifier
type, the affinity timeout value, etc. The flow identifier type
indicates which values are used as the flow identifier, for instance,
5-tuple, 3-tuple, or anything else. Because we deal with single
services, the matching rules have to be disjoint, meaning that two
different services must have non-overlapping matching flow sets.
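The effect of the flow identifier type can be shown with the RTP
example. In the sketch below, the Packet type, the FLOW_ID_FIELDS
table, the flow_id function, and the port numbers are hypothetical.
This document does not state which three fields form the 3-tuple;
source address, destination address, and protocol are assumed here.

```python
from collections import namedtuple

Packet = namedtuple("Packet", "src_ip dst_ip src_port dst_port proto")

# Assumed mapping from a configured flow identifier type to packet fields.
FLOW_ID_FIELDS = {
    "5-tuple": ("src_ip", "dst_ip", "src_port", "dst_port", "proto"),
    "3-tuple": ("src_ip", "dst_ip", "proto"),
}


def flow_id(packet, id_type):
    """Build the flow identifier of `packet` for the configured type."""
    return tuple(getattr(packet, field) for field in FLOW_ID_FIELDS[id_type])


# Video and audio of one RTP session use different ports.
video = Packet("X", "SID2", 5004, 6000, "udp")
audio = Packet("X", "SID2", 5006, 6002, "udp")

same_with_3_tuple = flow_id(video, "3-tuple") == flow_id(audio, "3-tuple")
same_with_5_tuple = flow_id(video, "5-tuple") == flow_id(audio, "5-tuple")
print(same_with_3_tuple, same_with_5_tuple)
```

With the 3-tuple, both streams map to the same flow binding entry and
therefore reach the same service instance.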
4.5. CFN Dispatcher
When a CFN node maintains the flow binding table, the memory consumed
is determined by the number of flows that the CFN ingress node
handles. The ingress node can be an edge data center gateway; hence,
it may cover hundreds of thousands of users, and each user may have
tens of flows. The memory consumed by the binding table at the CFN
ingress node can be a concern. To alleviate it, a functional entity
called the CFN Dispatcher can help.
The CFN Dispatcher is deployed closer to the clients and normally
handles the flows of a limited number of clients. In this case, the
memory space required by the binding table will be much smaller. The
CFN dispatcher is a client-side entity which directs traffic to a CFN
egress node. It is not a CFN node itself, that is to say, it does
not participate in the status updates about network and computing
metrics among CFN nodes. The CFN dispatcher does not determine by
itself the best CFN egress to which to forward the packets of a new
flow. It has to learn such information from a CFN node and maintain
it to ensure flow affinity for the subsequent packets. In this way,
the CFN node simply selects the most appropriate egress for new flows
and informs the CFN dispatcher in an explicit or implicit way. The
CFN node is thus relieved from flow binding table maintenance.
Figure 5 shows the interaction between a CFN Dispatcher and a CFN
node. After the CFN node makes the service demand dispatch decision,
it informs the CFN dispatcher about the selected CFN egress node for
the flow. The CFN dispatcher then maintains the flow binding table
to ensure flow affinity. The message exchange between the CFN
dispatcher and its corresponding CFN node needs to be defined. The
CFN dispatcher can simply forward the first packet of a flow to the
CFN node, which decides which instance to use and pushes this
information into the flow binding table of the CFN dispatcher.
However, in case of failures, e.g., a CFN egress that is no longer
reachable, further interaction is needed between the CFN dispatcher
and the CFN node.
SID: Service ID
BID: Binding ID
                                                   SID1
                                                +--------+
                                             +--| BID21  |
  binding info                               |  +--------+
 (flow1,egress2)              +----------+   |
 (flow2,egress3)           +--|CFN node 2|---|     SID2
     <-----                |  +----------+   |  +--------+
+------+                   |                 +--| BID22  |
|Client|-+                 |                    +--------+
+------+  \                |
           \               |
     +--------------+     +----------+
     |CFN Dispatcher|-----|CFN Node 1|
     +--------------+     +----------+
           /               |                       SID3
+------+  /                |                    +--------+
|Client|-+                 |                 +--| BID32  |
+------+                   |                 |  +--------+
                           |  +----------+   |
                           +--|CFN node 3|---|     SID3
                              +----------+   |  +--------+
                                             +--| BID33  |
                                                +--------+
Figure 5: Service Demand Dispatch with CFN Dispatcher
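The division of labour shown in Figure 5 can be sketched as follows.
The message exchange between the two entities is not defined by this
document, so the CfnNode.decide call below is a hypothetical stand-in
for it, and the egress decisions are fixed in advance for the
demonstration. Failure handling is deliberately left out.

```python
class CfnNode:
    """Stand-in for a CFN node; decide() is a hypothetical interface."""

    def __init__(self, decisions):
        self._decisions = decisions  # flow_id -> egress, fixed for the demo
        self.queries = 0

    def decide(self, flow_id):
        # Selects the egress for a new flow; keeps no per-flow state.
        self.queries += 1
        return self._decisions[flow_id]


class CfnDispatcher:
    """Client-side entity that holds the flow binding table."""

    def __init__(self, node):
        self._node = node
        self._bindings = {}  # flow_id -> CFN egress

    def forward(self, flow_id):
        if flow_id not in self._bindings:
            # First packet of a flow: learn the egress from the CFN node.
            self._bindings[flow_id] = self._node.decide(flow_id)
        return self._bindings[flow_id]


node = CfnNode({"flow1": "egress2", "flow2": "egress3"})
dispatcher = CfnDispatcher(node)
path = [dispatcher.forward(f) for f in ("flow1", "flow1", "flow2", "flow1")]
print(path, node.queries)
```

The CFN node is consulted once per flow, while every subsequent packet
is handled from the binding table of the dispatcher.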
5. Summary of the key elements of CFN Dyncast Architecture
o CFN Control Plane:
* SID: CFN nodes have to be made aware of existing services through
the existence of the corresponding SID. This can be achieved in
different ways, for example, by using a special range or coding of
anycast IP addresses as service IDs, or by using DNS.
* BID bindings: An SID is bound to a set of BIDs representing the
different instances of the service. Associated with these BIDs,
there is also a set of metrics describing the state of each
instance. These bindings have to be shared among the CFN nodes
so that they are aware of the different instances and their
computing resource status.
* Network state: CFN nodes have to be able to share network
status so as to estimate the impact of the dispatching decision
in terms of link congestion.
* Metric and network status updates need to be sufficiently
sparse to limit the signaling overhead and keep the system
stable, but also sufficiently frequent to make the system
reactive to sudden traffic fluctuations.
o CFN Data Plane:
* In case of a new flow: the CFN ingress node selects the most
appropriate CFN egress in terms of the network status and the
computing resources of the service instances attached to the
egresses.
* Flow affinity: CFN ingress nodes make sure the subsequent
packets of an existing flow are always delivered to the same
CFN egress node so that they can be served by the same service
instance.
6. Conclusion (and call for contributions)
This document introduces an architecture for CFN Dyncast, enabling
a service demand to be sent to an optimal edge to improve the overall
system load balancing. It can dynamically adapt to changes in
computing resource consumption and network status, and avoid
overloading single edges. CFN-Dyncast is a network-based
architecture that supports a large number of edges and is independent
of the applications or services hosted on the edge.
The present document is a strawman for defining the CFN-Dyncast
architecture.
More discussion on control plane and data plane approaches is
welcome.
7. Security Considerations
TBD
8. IANA Considerations
No IANA action is required so far.
9. Informative References
[RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
"Multiprotocol Extensions for BGP-4", RFC 4760,
DOI 10.17487/RFC4760, January 2007,
<https://www.rfc-editor.org/info/rfc4760>.
[I-D.geng-rtgwg-cfn-dyncast-ps-usecase]
Geng, L., Liu, P., and P. Willis, "Dynamic-Anycast in
Compute First Networking (CFN-Dyncast) Use Cases and
Problem Statement", draft-geng-rtgwg-cfn-dyncast-ps-
usecase-00 (work in progress), October 2020.
Acknowledgements
TBD
Authors' Addresses
Yizhou Li
Huawei Technologies
Email: liyizhou@huawei.com
Luigi Iannone
Huawei Technologies
Email: Luigi.iannone@huawei.com
Jianfei He
City University of Hong Kong
Email: jianfeihe2-c@my.cityu.edu.hk
Liang Geng
China Mobile
Email: gengliang@chinamobile.com
Peng Liu
China Mobile
Email: liupengyjy@chinamobile.com
Yong Cui
Tsinghua University
Email: cuiyong@tsinghua.edu.cn