CATS H. Fu
Internet-Draft D. Huang
Intended status: Standards Track L. Ma
Expires: 7 July 2024 W. Duan
ZTE Corporation
4 January 2024
An SR-TE based Solution For Computing-Aware Traffic Steering
draft-fu-cats-sr-te-based-solution-01
Abstract
Computing-aware traffic steering (CATS) is a traffic engineering
approach [I-D.ietf-teas-rfc3272bis] that takes into account the
dynamic nature of computing resources and network state to optimize
service-specific traffic forwarding towards a given service instance.
Various relevant metrics may be used to enforce such computing-aware
traffic steering policies. It is critical to meet different types of
computing-aware traffic steering requirements without disrupting the
existing network architecture. In this context, this document
proposes a computing-aware traffic steering solution based on the
existing SR-TE traffic engineering infrastructure, in order to reduce
device resource consumption and investment while meeting the
computing-aware traffic steering requirements of network devices.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 7 July 2024.
Copyright Notice
Copyright (c) 2024 IETF Trust and the persons identified as the
document authors. All rights reserved.
Fu, et al. Expires 7 July 2024 [Page 1]
Internet-Draft An SR-TE based Solution of CATS January 2024
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction
2. Requirements Language
3. Terminology
4. Requirements and Motivation
5. Background and general scenario
6. Service Flow
6.1. Service Overview
6.2. Work Flow Overview
7. Control Plane
7.1. Considerations
7.2. EGW Processing
7.3. IGW Processing
7.4. Control Plane WorkFlow
8. Data Plane
8.1. IGW Processing
8.2. EGW Processing
8.3. Data Plane WorkFlow
8.4. Flow Affinity Considerations
9. Security Considerations
10. Acknowledgements
11. IANA Considerations
12. References
12.1. Normative References
12.2. Informative References
Authors' Addresses
1. Introduction
Edge computing provides better response time and transmission rate
than cloud computing because of its proximity to end users. Considering
computing resource capacity and locations, peak hours, and economic
factors, traffic steering to the nearest node may not meet service
requirements. To meet the requirements of users, service providers
deploy the same type of service instances at multiple edge sites.
This brings about the key problem of steering the service traffic to
the most suitable computing instances to meet the (service-specific)
requirements of users. When different types of computing services
are accessed, the requirement types for CATS usually differ.
In general, there are three types: 1) Experience, namely, the SLA
indicators related to service access QoS are met. 2) Cost, namely,
the cost/energy consumption of service access resources is optimal.
3) Resource, namely, computing resources are balanced.
For experience-oriented service access, the end-to-end delay is a key
factor that affects user experience. This delay has two parts:
network delay and computing processing delay. CATS cannot select the
best service instance by considering only the compute factor or only
the network factor. As described in [I-D.yao-cats-ps-usecases],
multiple edge sites need to be interconnected and coordinated at the
network layer to meet service requirements and ensure better user
experience. Based on this, a two-level service routing mechanism is
employed to reduce the processing load on the control plane and
forwarding plane, and a virtual node and a virtual link (including
the computing resource status) are created based on a service
identifier and its corresponding service instance. The joint
computing and network decision can thus be reduced to a conventional
network-level traffic engineering problem: level-1 routing selects
the egress gateway, and level-2 routing selects the service instance.
This reduces system complexity and meets the different requirements
for traffic steering. For a cost or resource type service, the
computing resource status is converted into a network factor. Even
if CATS preferentially selects on computing resources, this solution
still applies by increasing the weight of that factor.
2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
3. Terminology
* CATS: Computing-Aware Traffic Steering.
* SID: Service Identifier.
* IGW: Ingress GateWay.
* EGW: Egress GateWay.
* TEDB: Traffic Engineering DataBase.
* CADB: Computing-Aware DataBase.
* SRT: Service Routing Table.
* SFT: Service Forwarding Table.
4. Requirements and Motivation
As described in [I-D.yao-cats-ps-usecases], multiple edge sites need
to be interconnected and coordinated at the network layer to meet
service requirements and ensure better user experience.
Considering the actual deployment and network resource capabilities
of edge computing in MANs, we believe that the CATS framework should
consider the following requirements and motivations:
1) To meet the three types of CATS requirements, two problems must be
solved at the same time: a) the IGW selects a specific service
instance during the user service access process; b) the IGW or the
network controller orchestrates network paths that meet the quality
requirements for the selected service instance. To solve either
problem, the quality of computing resources and the network quality
must be evaluated jointly. For example, the budgets for server
(computing) delay and network delay are almost the same, so it makes
sense to consider both types of delay together. If the computing
domain metric can be converted into the existing network domain
metric in a unified manner, the technical solution will be greatly
simplified by using the existing traffic engineering technology.
REQ 1 It is recommended that the computing status be converted or
mapped into a metric that aligns with the existing network metric
schemes.
2) Considering the service and resource planning of the existing
network, the edge compute nodes need to be deployed in VPNs during
the notification of computing status. As a result, service-layer
routes and transport-layer path decisions are interdependent, which
increases the coupling between the two layers. The transmission
services provided by the network are divided into two layers:
Service and Transport. Service: includes L2VPN, L3VPN, and VXLAN,
which usually use overlay technology. Transport: uses underlay
technologies such as IPv6 and MPLS, with paths controlled by traffic
engineering technologies. Therefore, the CATS framework should
consider the design of an independent service routing layer,
abstraction of computing resources and status, and joint TE
decision-making involving the public network.
REQ 2 An independent computing service based routing layer should be
employed by CATS over the underlying public network to enable joint
traffic steering of computing and networking.
3) To meet the requirements of CATS, the network needs to be aware of
status changes of computing resources (at a granularity of minutes).
The status information of a large number of computing instances puts
great pressure on the control plane and data plane of the network.
The CATS framework should consider reducing this pressure, using
two-level service routes, or even letting egress gateways directly
select the preferred service instance, to limit the scale at which
computing status information spreads.
5. Background and general scenario
The edge computing service is expanding from a single edge site to a
network of interconnected edge sites that coordinate with each other
to address cost, service experience, and resource utilization. CATS
enables large-scale edge interconnection and collaboration, providing
optimal service access and load balancing to adapt to service
dynamics. Based on the computing capability and network conditions,
including the real processing delay, service requests can be
dynamically switched to suitable service nodes, thus improving
resource utilization and user experience.
6. Service Flow
Service Routing Information: +---------------+
<Sid1,EGW01-IP,VPN-SID,RT/RD>, |ServiceID(Sid1)|
<Sid2,EGW01-IP,VPN-SID,RT/RD> | +----+ +----+ |
|<-----------------------------| +-+ |IP11| |IP12| |
| Network | | | +----+ +----+ |
| +-------+ Metric +------+------+ | +---------------+
| | |<-------+ EGW01 +-+ Edge Site 1
| | U I | +--+-------+--+ | +---------------+
| | n n | | | | |ServiceID(Sid2)|
| | d f | IP11| IP22| | | +----+ +----+ |
| | e r | vLink| vLink| +-+ |IP21| |IP22| |
| | r a | | | | +----+ +----+ |
+-+-+ | l s | +--+--+ +--+--+ +---------------+
+------+ | | | a t | |Sid1 | |Sid2 |
|Client+--+IGW|<---+ y r | |vNode| |vNode|
+------+ | |Network u | +--+--+ +--+--+ +---------------+
+-+-+Metric c | | | |ServiceID(Sid1)|
| | t | IP13| IP24| | +----+ +----+ |
| | u | vLink| vLink| +-+ |IP13| |IP14| |
| | r | | | | | +----+ +----+ |
| | e | +--+-------+--+ | +---------------+
| | |<-------+ EGW02 +-+ Edge Site 2
| +-------+ Network+------+------+ | +---------------+
| Metric | | |ServiceID(Sid2)|
|<-----------------------------| | | +----+ +----+ |
Service Routing Information: +-+ |IP23| |IP24| |
<Sid1,EGW02-IP,VPN-SID,RT/RD>, | +----+ +----+ |
<Sid2,EGW02-IP,VPN-SID,RT/RD> +---------------+
Figure 1: Overall Architecture
Figure 1 indicates the network topology and technical architecture of
CATS in terms of service flow. The IGW/EGW node is a functional
entity that provides the switching capability in the CATS network,
and is interconnected by the transport network (Underlay
Infrastructure). The EGW is connected to multiple computing
resources and is aware of the status information of those computing
resources. The EGW provides the CATS service for customers (the EGW
can act as an IGW at the same time). Edge sites often refer to
managed edge computing. IGW/EGW node functions are usually provided
by physical devices, such as routers in the access network or MAN.
The "underlay infrastructure" in Figure 1 indicates an IP/MPLS
network that is not necessarily CATS-aware. The CATS paths that are
computed will be distributed among the overlay IGW/EGW, and will not
affect the underlay nodes.
6.1. Service Overview
CATS uses a Service Identifier (SID) to represent a specific service
provided by service nodes on multiple edge sites. The client device
always uses the SID to initiate service access. The source or
destination IP or IP extension header options can be used to carry
the SID. A CATS request for a single SID can be served by different
edge locations and compute instances. The client device does not
know in advance which edge site will satisfy the request. This
service process is a late binding model that selects the appropriate
edge site (i.e., the egress EGW) and the corresponding service
instance, and provides the network connectivity channel. As shown in
Figure 1, EGW01 is connected to two types of services: Sid1 and Sid2.
Computing nodes that provide the Sid1 service include IP11, IP12, or
more, and nodes that provide the Sid2 service include IP21, IP22, or
more. EGW02 follows the same pattern and is not described again.
6.2. Work Flow Overview
The following is a brief description of the CATS system traffic
steering workflow:
(1) The client initiates a computing service request. The packet
can carry the SID in one of several carrying modes. Whichever mode
is used, the goal is that the request packet is reachable and that
the IGW can identify the SID.
(2) After receiving the request packet from the client, the IGW
identifies the corresponding SID, selects the corresponding EGW, and
applies the specified network path to meet the network quality
requirements for service access.
(3) The EGW receives the service request forwarded by the IGW,
identifies the corresponding SID, selects a proper service instance,
changes the destination address of the packet to that of the service
instance, looks up the VPN FIB, and forwards the packet to the
service instance to implement the service connection.
(4) The service instance responds with a packet. On the EGW, the
source IP of the packet is changed to the destination IP used by the
corresponding service request. The subsequent procedure is a normal
network service procedure.
7. Control Plane
7.1. Considerations
To achieve the goal of computing-aware traffic steering, the general
design idea for the control plane of network devices is to use
network link attribute flooding and IGP/BGP extensions to make the
network computing-aware: computing status is advertised to upstream
nodes, which form the traffic engineering database (TEDB) and the
computing-aware database (CADB). The CADB and TEDB need to be
associated across layers. Joint computation (centralized or
distributed) is performed in accordance with the service access SLA
to obtain the required service instances and network paths.
This brings two issues:
(1) Because the computing speed requirements differ, both centralized
and distributed systems need to be supported. Therefore, an SDN
architecture similar to the PCEP-based solution would have to be
duplicated.
(2) The dimensions of the computing domain parameters (health score,
average processing delay, economic cost, and resource occupation) and
the networking domain parameters (bandwidth, delay, jitter, and
packet loss) are hard to unify. The computation time increases with
the number of constraints, and the CPU resources this consumes
prevent deployment on a large scale.
In addition, computing instances that provide the same service type
can be flexibly deployed to the same EGW and/or different EGWs. If
the status of an EGW computing resource is continuously updated to
the upstream IGWs, the volume of computing status updates would
overwhelm the control plane of network devices and even cause system
breakdown.
7.2. EGW Processing
+------+----------+--------+------------------------------------------+
| | | | Computing Metric |
| |Service |Service +----------+----------+---------+----------+
|VRF-ID|Identifier|Instance|Processing|Processing|Bandwidth|Processing|
| | | |delay |bandwidth |occupancy|costs |
| | | | |capability| | |
+------+----------+--------+----------+----------+---------+----------+
|100 |Sid1 |IP11 |* 1ms |10G |9G |100 |
+------+----------+--------+----------+----------+---------+----------+
|100 |Sid1 |IP12 |2ms |10G |5G |200 |
+------+----------+--------+----------+----------+---------+----------+
|100 |Sid2 |IP21 |10ms |40G |8G |30 |
+------+----------+--------+----------+----------+---------+----------+
|100 |Sid2 |IP22 |20ms |40G |* 5G |30 |
+------+----------+--------+----------+----------+---------+----------+
Figure 2: Local Service Routing Table
The EGW learns the status of each computing instance from the Edge
Manager. The status includes four attributes (referred to as the
computing metric):
(1) Processing delay: the time a service instance takes to process a
single service request.
(2) Processing bandwidth: the physical bandwidth capability of the
service instance, or the network port bandwidth of the computing
resource.
(3) Occupied bandwidth: the portion of the processing bandwidth, or
of the bandwidth of the computing resource network interface, that
the service instance occupies.
(4) Processing cost: the cost of service instance resources. In most
cases, physical costs are related to energy consumption.
For details, see Figure 2. The EGW maintains one SRT entry per
service instance. Each entry is identified by the VRF-ID of the VPN
deployed on the computing resource, the SID, and the service
instance, and carries the latency, bandwidth, and cost attributes of
that instance.
The EGW performs local processing in accordance with the SLA
corresponding to the SID (if the service SLA focuses on latency, the
EGW preferably selects the local service instance in accordance with
the latency of the instance), generates the local SRT, and delivers
the preferred entries to the forwarding plane as the local SFT (see
Figure 6) for service request processing and forwarding.
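The local preference step can be sketched as follows. This is a
minimal illustration, not the draft's specified algorithm: the
SrtEntry type, the SLA_FOCUS mapping, and build_local_sft() are
hypothetical names, and the assumption that Sid1 prefers the lowest
processing delay while Sid2 prefers the lowest bandwidth occupancy is
inferred from the entries marked with an asterisk in Figure 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SrtEntry:
    vrf_id: int
    sid: str
    instance: str
    delay_ms: float        # processing delay
    capacity_gbps: float   # processing bandwidth capability
    occupied_gbps: float   # bandwidth occupancy
    cost: int              # processing cost


# Rows of the local SRT in Figure 2.
LOCAL_SRT = [
    SrtEntry(100, "Sid1", "IP11", 1, 10, 9, 100),
    SrtEntry(100, "Sid1", "IP12", 2, 10, 5, 200),
    SrtEntry(100, "Sid2", "IP21", 10, 40, 8, 30),
    SrtEntry(100, "Sid2", "IP22", 20, 40, 5, 30),
]

# Assumption: the attribute each SID's SLA wants minimized.
SLA_FOCUS = {"Sid1": "delay_ms", "Sid2": "occupied_gbps"}


def build_local_sft(srt, sla_focus):
    """Return {(vrf_id, sid): instance} with one preferred entry per SID."""
    best = {}
    for entry in srt:
        key = (entry.vrf_id, entry.sid)
        attr = sla_focus[entry.sid]
        # Keep the entry with the smallest value of the SLA attribute.
        if key not in best or getattr(entry, attr) < getattr(best[key], attr):
            best[key] = entry
    return {key: entry.instance for key, entry in best.items()}


local_sft = build_local_sft(LOCAL_SRT, SLA_FOCUS)
```

Under these assumptions the result matches the local SFT of Figure 6
(Sid1 to IP11, Sid2 to IP22).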
When a preferred service instance exists for a specific SID in the
local SRT, the EGW advertises a VRF route update message to the IGW.
Once the number of preferred service instances drops to zero due to
resource deterioration, the EGW advertises a VRF route revocation
message to the IGW. The messages are carried by the MP-BGP protocol
suite. The carried elements include the message type, SID, EGW-IP,
VPN-SID, and RT/RD. The EGW advertises a service route to the IGW
instead of the specific service instance information. In this way, a
service routing layer independent of VPN IP routes is formed,
reducing the pressure on the control plane.
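The advertisement rule above can be sketched as a small decision
function. This is an illustrative sketch only: the function name, the
dictionary layout, and the message type strings are hypothetical, and
suppressing a repeated update while the route is already advertised is
an assumption that the text does not state.

```python
from typing import Optional


def service_route_message(sid: str, preferred_count: int, advertised: bool,
                          egw_ip: str, vpn_sid: str,
                          rt_rd: str) -> Optional[dict]:
    """Decide which service route message, if any, the EGW sends to the IGW."""
    if preferred_count > 0 and not advertised:
        msg_type = "update"       # a preferred instance now exists
    elif preferred_count == 0 and advertised:
        msg_type = "revocation"   # no preferred instance is left
    else:
        return None               # nothing changed at the service-route level
    # Only the service route is carried, never per-instance details.
    return {"type": msg_type, "sid": sid, "egw_ip": egw_ip,
            "vpn_sid": vpn_sid, "rt_rd": rt_rd}


first = service_route_message("Sid1", 1, False,
                              "EGW01-IP", "vidx-EGW01-SID", "RT/RD")
steady = service_route_message("Sid1", 2, True,
                               "EGW01-IP", "vidx-EGW01-SID", "RT/RD")
gone = service_route_message("Sid1", 0, True,
                             "EGW01-IP", "vidx-EGW01-SID", "RT/RD")
```

Because instance-level changes that leave a preferred instance in
place produce no message, the IGW sees far fewer updates than it would
if every instance status change were advertised.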
For each SID, the EGW installs in the IGP a virtual node and a
virtual link derived from the preferred entry for that SID in the
local SRT. The virtual node is connected to the EGW by the virtual
link (refer to Figure 1). The delay, bandwidth, and cost of the
preferred service instance are used as the link attributes of the
virtual link and are flooded as network metric values within the IGP
area, which greatly reduces the scale of the control-plane
information that is spread.
7.3. IGW Processing
+----------------------------------+
|TE policy py-Sid1-EGW01 |
| Color color1 end-point EGW01-IP |
| SGLIST:{P1-SID,..., EGW01-SID} |
| |
|TE policy py-Sid2-EGW01 |
| Color color2 end-point EGW01-IP |
| SGLIST:{P1-SID,..., EGW01-SID} |
+----------------------------------+
+------+----------+--------+-------+--------------+
|VRF-ID|Service | EGW IP | COLOR | VPN-SID |
| |Identifier| | | |
+------+----------+--------+-------+--------------+
| | |EGW01-IP|color 1|vidx-EGW01-SID|
|100 |Sid1 +--------+-------+--------------+
| | |EGW02-IP|color 1|vidx-EGW02-SID|
+------+----------+--------+-------+--------------+
| | |EGW01-IP|color 2|vidx-EGW01-SID|
|100 |Sid2 +--------+-------+--------------+
| | |EGW02-IP|color 2|vidx-EGW02-SID|
+------+----------+--------+-------+--------------+
Figure 3: TE Calculation Result and Global Service Routing Table
As shown in Figure 3, steering traffic for a computing service SID
from the IGW to the preferred node is converted into a conventional
network traffic engineering process. Path computation is performed
between the IGW and the virtual node that corresponds to the SID and
is connected to the EGWs, according to the SR-POLICY-1 constraint
corresponding to the service SID. This generates an SR-POLICY-1 path
(color, endpoint: SID), in which the penultimate segment ID (node) in
the segment list indicates the EGW preferred for service access under
current conditions. SR-POLICY-1 is then converted into the required
SR-POLICY-2 path (color, endpoint: EGW-IP).
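The conversion from SR-POLICY-1 to SR-POLICY-2 can be illustrated
with a plain shortest-path computation. This is a sketch under stated
assumptions, not the draft's path computation: delay is taken as the
only metric, the topology and delay values in GRAPH are invented for
illustration, and shortest_path() and to_egw_policy() are hypothetical
helper names. The virtual node vNode-Sid1 is reachable from both EGWs
through virtual links, as in Figure 1.

```python
import heapq


def shortest_path(graph, src, dst):
    """Dijkstra over {node: {neighbor: delay_ms}}; returns the node list."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def to_egw_policy(path, color):
    """Drop the trailing virtual node; the penultimate node is the EGW."""
    return {"color": color, "endpoint": path[-2], "segments": path[1:-1]}


# Hypothetical delays in ms; the last hop of each branch is a virtual link.
GRAPH = {
    "IGW": {"P1": 2},
    "P1": {"EGW01": 3, "EGW02": 1},
    "EGW01": {"vNode-Sid1": 1},
    "EGW02": {"vNode-Sid1": 4},
}

path = shortest_path(GRAPH, "IGW", "vNode-Sid1")   # SR-POLICY-1 path
policy2 = to_egw_policy(path, "color1")            # SR-POLICY-2
```

With these example values the path through EGW01 (6 ms in total) beats
the path through EGW02 (7 ms), so EGW01 becomes the endpoint of the
converted policy even though EGW02 is closer in the network.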
After receiving the routing information advertised by each EGW, the
IGW generates global SRTs for multiple VPNs: the VRF-ID and SID are
used as the key, and the different EGW-IPs are used as multiple next
hops. SR-POLICY-2 is matched against each COLOR and EGW-IP in the
SRT to obtain the preferred global SRT entries and generate the
global SFT (refer to Figure 5), which is delivered to the forwarding
plane to steer service requests.
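The matching step can be sketched as a lookup of each next hop's
(COLOR, EGW-IP) pair in the set of installed SR policies. This is a
minimal sketch: the table layouts and build_global_sft() are
hypothetical, the table contents are taken from Figures 3 and 5, and
picking the first next hop that has a matching policy is an assumption.

```python
# SR-POLICY-2 results from Figure 3: (color, endpoint) -> policy name.
SR_POLICIES = {
    ("color1", "EGW01-IP"): "py-Sid1-EGW01",
    ("color2", "EGW01-IP"): "py-Sid2-EGW01",
}

# Global SRT from Figure 3:
# (VRF-ID, SID) -> list of (EGW IP, color, VPN-SID) next hops.
GLOBAL_SRT = {
    (100, "Sid1"): [("EGW01-IP", "color1", "vidx-EGW01-SID"),
                    ("EGW02-IP", "color1", "vidx-EGW02-SID")],
    (100, "Sid2"): [("EGW01-IP", "color2", "vidx-EGW01-SID"),
                    ("EGW02-IP", "color2", "vidx-EGW02-SID")],
}


def build_global_sft(srt, policies):
    """Return {(vrf_id, sid): (vpn_sid, policy_name)} for matched next hops."""
    sft = {}
    for key, next_hops in srt.items():
        for egw_ip, color, vpn_sid in next_hops:
            policy = policies.get((color, egw_ip))
            if policy is not None:
                # The next hop whose EGW is the SR policy endpoint wins.
                sft[key] = (vpn_sid, policy)
                break
    return sft


global_sft = build_global_sft(GLOBAL_SRT, SR_POLICIES)
```

The resulting table corresponds to the global SFT shown in Figure 5.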
7.4. Control Plane WorkFlow
+-----+ +-----------+ +-------+ +-------+ +------------+
| IGW | |P(underlay)| | EGW02 | | EGW01 | |Edge Manager|
+--+--+ +-----+-----+ +---+---+ +---+---+ +------+-----+
| | | | S1 |
| | | |<-----------|
| | | | S2 |
| | | +--|<-----------|
| | | |S3| |
| | | +->| |
| | | +--| |
| | | |S4| |
| S6 | S5 +->| |
+--+<---------- |<-----------+------------| |
|S7| | | | |
| |<-----------+------------+------------| |
| |---+ | S8 | | |
| |S9 | | | | |
| |<--+ | | | |
| |---| | | | |
| |S10| | | | |
+->|<--+ | | | |
| | | | |
Figure 4: Control Plane Workflow
Figure 4 shows the complete control plane procedure. The related
steps are described as follows:
S1: The Edge Manager sends a registration/update/deregistration
message to EGW01, including the SID and the list of corresponding
instance IPs, such as [Sid1, IP11, IP12], [Sid2, IP21, IP22].
S2: The Edge Manager periodically sends computing resource status
information to EGW01, including the SID, the corresponding instances,
and the computing METRIC information, such as [Sid1, IP11 METRIC,
IP12 METRIC] and [Sid2, IP21 METRIC, IP22 METRIC].
S3: EGW01 generates a local SRT in accordance with the obtained
computing resource status and the deployed VPNs. The entries include
[VRF-ID, Sid1, IP11, METRIC], [VRF-ID, Sid2, IP21, METRIC].
S4: EGW01 selects the preferred entries of the local SRT in
accordance with the SLA of each SID. The preferred entries generate
virtual nodes and links, such as [vNode Sid1, vLink Sid1] and [vNode
Sid2, vLink Sid2].
S5-S6: The information of virtual nodes and links is flooded between
the EGW and the P node, and between the P node and the IGW.
S7: The SR-TE path is calculated between the IGW and the virtual
nodes vNode-Sid1/2 in accordance with the SLA corresponding to each
SID.
S8: EGW01 advertises VPN service routes, such as [Sid1, EGW01-IP,
vidx-EGW01-SID, RT/RD] and [Sid2, EGW01-IP, vidx-EGW01-SID, RT/RD].
S9: The IGW receives the service routes advertised by EGW01/02 and
generates the global SRT entries with multiple egress next hops, such
as {VRF-ID, Sid1, [EGW01-IP, vidx-EGW01-SID], [EGW02-IP,
vidx-EGW02-SID]}.
S10: Combining the results of S7 and S9, the IGW selects the EGW next
hop based on the SR-POLICY and the global SRT.
Because the service and computing instance status have been converted
into virtual network nodes and links, the solution also applies to a
centralized PCE computing architecture, although distributed head-end
computation is used as the example here.
This solution unifies traffic steering in terms of end-to-end access
delay, cost, bandwidth, jitter, and packet loss, in accordance with
the SLA target. The objectives map to indicators as follows: 1)
Experience: focus on delay, jitter, and packet loss; 2) Cost: focus
on costs and energy consumption; 3) Resource: check the resource
usage/status. If the remaining cloud resources are converted into
available bandwidth, check the available bandwidth. In actual
service deployment, one of the five indicators is selected as the
preferred indicator in accordance with the objective, and the other
indicators are used as constraints.
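The "one preferred indicator, others as constraints" rule can be
sketched as a filter followed by a minimization. This is an
illustrative sketch: select_egw(), the candidate values, and the
limits are all hypothetical, and only indicators where lower is better
(delay, cost, jitter, loss) are modeled; available bandwidth would
need a lower bound instead of an upper bound.

```python
def select_egw(candidates, preferred, upper_bounds):
    """Pick the candidate that minimizes `preferred` within all bounds."""
    feasible = [
        c for c in candidates
        if all(c[name] <= limit for name, limit in upper_bounds.items())
    ]
    if not feasible:
        return None  # no candidate satisfies the SLA constraints
    return min(feasible, key=lambda c: c[preferred])["egw"]


# Hypothetical end-to-end values per candidate egress.
CANDIDATES = [
    {"egw": "EGW01", "delay_ms": 6, "cost": 100,
     "jitter_ms": 1, "loss_pct": 0.1},
    {"egw": "EGW02", "delay_ms": 7, "cost": 30,
     "jitter_ms": 2, "loss_pct": 0.1},
]

# Experience objective: minimize delay, keep cost within a budget.
experience_choice = select_egw(CANDIDATES, "delay_ms", {"cost": 150})
# Cost objective: minimize cost, keep delay within the SLA.
cost_choice = select_egw(CANDIDATES, "cost", {"delay_ms": 10})
```

The same candidates yield different egress gateways depending on which
indicator is preferred, which is the behavior the three objective
types rely on.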
8. Data Plane
CATS traffic steering follows the late binding model, and the
forwarding plane has the ability to assign user flows to the "best"
service instance and network path. When new traffic arrives, the
IGW selects the most appropriate egress EGW in accordance with the
network status and computing resources, and ensures flow affinity
(the data packets of the same flow are sent to the same service
instance). As shown in Figure 5 and Figure 6, the data plane uses
two levels of service forwarding tables: global and local.
8.1. IGW Processing
+------+------------------+--------------+------------------+
|VRF-ID|Service Identifier| VPN-SID |Outgoing Interface|
+------+------------------+--------------+------------------+
|100 |Sid1 |vidx-EGW01-SID| py-Sid1-EGW01 |
+------+------------------+--------------+------------------+
|100 |Sid2 |vidx-EGW01-SID| py-Sid2-EGW01 |
+------+------------------+--------------+------------------+
Figure 5: Global Service Forwarding Table
After receiving a packet with a SID from the client, the IGW uses the
VRF-ID bound to the ingress interface of the received packet and the
SID carried in the user packet to look up the global SFT and obtain
the VPN-SID and egress interface (in fact, an SR-POLICY). It then
encapsulates the SRH and segment list, and sends the packet to the
EGW along the path indicated in the SRH. There are multiple
solutions for carrying the SID in the user packet: 1) an anycast IP
address is used to express the SID, so the SID is carried directly in
the destination IP; 2) the SID is expressed as an arbitrary numeric
ID, which is carried in IP extension headers such as the DOH or an
SRH TLV.
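The IGW lookup and encapsulation can be sketched as follows. This is
a simplified model rather than a packet-level implementation: packets
are dictionaries, igw_forward() and INTERFACE_VRF are hypothetical,
the segment lists are illustrative (Figure 3 elides the intermediate
segments), and the segments are shown in traversal order with the
VPN-SID appended last, not in the on-wire SRH encoding.

```python
# Global SFT from Figure 5: (VRF-ID, SID) -> (VPN-SID, SR policy name).
GLOBAL_SFT = {
    (100, "Sid1"): ("vidx-EGW01-SID", "py-Sid1-EGW01"),
    (100, "Sid2"): ("vidx-EGW01-SID", "py-Sid2-EGW01"),
}

# Illustrative segment lists for the SR policies of Figure 3.
SEGMENT_LISTS = {
    "py-Sid1-EGW01": ["P1-SID", "EGW01-SID"],
    "py-Sid2-EGW01": ["P1-SID", "EGW01-SID"],
}

# Hypothetical binding of ingress interfaces to VRFs.
INTERFACE_VRF = {"if-client-1": 100}


def igw_forward(in_interface, packet):
    """Look up the global SFT and wrap the packet with SRH segments."""
    vrf_id = INTERFACE_VRF[in_interface]
    entry = GLOBAL_SFT.get((vrf_id, packet["sid"]))
    if entry is None:
        return None  # no service route for this SID in this VRF
    vpn_sid, policy = entry
    # Path segments first, then the VPN-SID advertised by the EGW.
    segments = SEGMENT_LISTS[policy] + [vpn_sid]
    return {"srh_segments": segments, "inner": packet}


request = {"src": "CLIENT_IP", "sid": "Sid1", "payload": b"hello"}
encapsulated = igw_forward("if-client-1", request)
```

The inner packet is left unchanged, so the EGW can still read the SID
after removing the encapsulation.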
8.2. EGW Processing
+------+------------------+----------------+
|VRF-ID|Service Identifier|Service Instance|
+------+------------------+----------------+
|100 |Sid1 |IP11 |
+------+------------------+----------------+
|100 |Sid2 |IP22 |
+------+------------------+----------------+
Figure 6: Local Service Forwarding Table
When receiving a packet sent by the IGW, the EGW removes the SRH
encapsulation, obtains the SID from the packet, obtains the
corresponding VRF-ID based on the VPN-SID in the SRH, looks up the
local SFT based on the VRF-ID and SID to obtain the service instance
IP, changes the destination address of the packet to that service
instance IP, and forwards the packet to the service instance by
looking up the VPN FIB.
8.3. Data Plane WorkFlow
Figure 7 shows the processing procedure for an example data plane
scenario. Uplink access is the main service procedure of CATS. For
the service instance response packet, apart from the NAT translation
added on EGW01, the steps are the same as in common L3VPN packet
forwarding.
+------+ +-----+ +-------+ +-----+
|CLIENT| | IGW | | EGW01 | | IP1 |
+--+---+ +--+--+ +---+---+ +--+--+
| S1 | | |
+------------------>| | |
| +--| | |
| |S2| S3 | |
| +->+------------------>| |
| | +--| |
| | |S4| S5 |
| | +->+------------------>|
| | | |
| | | S6 |
| | +--+<------------------|
| | |S7| |
| | +->| |
| | S8 | |
| +--+<------------------+ |
| |S9| | |
| +->| | |
| S10 | | |
|<------------------+ | |
| | | |
Figure 7: Main Workflow Of The Forwarding Plane
The related steps are described as follows:
S1: The client initiates a computing service request. The SID can be
carried in multiple ways. In this figure, the client initiates a
computing service request whose SID is Sid1 (SID=Sid1) and sends it
to the IGW.
S2: After receiving the service request packet, the IGW looks up the
SFT in accordance with the Sid1 carried in the packet, and selects
the egress EGW or service instance (mounted with multiple computing
resources). Step S3 uses the EGW case as an example.
S3: The IGW encapsulates the outer tunnel header and SRH (including
the VPN-SID advertised by the EGW) in accordance with the SFT
content, and keeps the inner packet unchanged. The packet is sent to
the egress interface, and finally forwarded to EGW01 through the
intermediate P nodes.
S4: EGW01 receives the tunnel-encapsulated service request from the
IGW, removes the tunnel encapsulation, looks up the VPN SFT according
to the VPN-SID and Sid1 to obtain the instance address IP1, changes
the DA of the packet to IP1, and records the corresponding SNAT
entry.
S5: EGW01 looks up the local SFT for the decapsulated packet and
sends the packet to the service instance node.
S6: The service instance responds to the service request with a
packet whose source IP is IP1 and whose destination IP is CLIENT_IP.
S7: After receiving the response packet, EGW01 looks up the SNAT
table created in S4 in accordance with the source IP (IP1), changes
the packet source IP from IP1 to Sid1, and looks up the VPN FIB in
accordance with the packet destination IP (CLIENT_IP). This VPN FIB
entry comes from the route advertised by the IGW to the EGW, which
depends on network planning and deployment. A specific SR-TE path
can also be planned for the returned packet.
S8: EGW01 encapsulates the outer tunnel header and SRH (including the
VPN-SID advertised by the IGW) in accordance with the table lookup
result, and keeps the inner packet unchanged. The packet is sent to
the egress interface, and finally forwarded to the IGW through the P
node.
S9: The IGW processes the packet received in S8. This is a standard
L3VPN packet processing procedure.
S10: The IGW looks up the local VPN FIB for the service packet
decapsulated in S9, and finally sends the packet to the client. This
completes one round trip of client service request and service
instance response; the S1-S10 procedure is repeated for subsequent
requests.
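The address translation performed on EGW01 in S4 and S7 can be
sketched as a pair of rewrite functions sharing one table. This is a
minimal sketch: the EgwNat class and the "Sid1-addr" placeholder for
the address that expresses Sid1 are hypothetical, and keying the entry
by (client, instance) instead of by the instance address alone is an
assumption made so that several clients can share one instance.

```python
class EgwNat:
    """Destination rewrite on requests, source restore on responses."""

    def __init__(self, local_sft):
        self.local_sft = local_sft   # (vrf_id, sid) -> instance IP
        self.entries = {}            # (client, instance) -> original DA

    def request(self, vrf_id, sid, packet):
        # S4: pick the instance and remember the original destination.
        instance = self.local_sft[(vrf_id, sid)]
        self.entries[(packet["src"], instance)] = packet["dst"]
        return {**packet, "dst": instance}

    def response(self, packet):
        # S7: restore the SID address as the source of the reply.
        original = self.entries.get((packet["dst"], packet["src"]))
        if original is None:
            return packet            # not a CATS flow, forward unchanged
        return {**packet, "src": original}


egw01 = EgwNat({(100, "Sid1"): "IP1"})
uplink = egw01.request(100, "Sid1",
                       {"src": "CLIENT_IP", "dst": "Sid1-addr"})
downlink = egw01.response({"src": "IP1", "dst": "CLIENT_IP"})
```

After the response rewrite, the client sees a reply from the same
address it sent the request to, so the selected instance stays hidden
from the client.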
8.4. Flow Affinity Considerations
Flow affinity, as mentioned above, means that packets of the same
service flow should always be sent to the same EGW egress and
processed by the same service instance. When a new flow arrives,
after the optimal service instance and EGW egress are determined, the
IGW records the flow identifier (5-tuple), the preferred EGW, and the
affinity timeout in the level-1 flow binding table. When the new
flow arrives at the EGW, the EGW records the flow identifier, the
preferred service instance, and the affinity timeout in the level-2
flow binding table. Subsequently, all data packets of the flow are
forwarded according to the two levels of flow binding tables. If no
packet of the service flow is received before the flow affinity
period expires, the IGW and the EGW age out the corresponding flow
binding entries.
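One possible shape of such a flow binding table is sketched below in
Python. It is an illustration under stated assumptions, not part of
the specification. The class name FlowBindingTable, its methods, and
the injected clock are hypothetical, and the sketch assumes an idle
timeout that is refreshed by every packet of the flow. The same
structure could serve as the level-1 table on the IGW (flow to EGW)
and as the level-2 table on the EGW (flow to service instance).

```python
import time


class FlowBindingTable:
    """Maps a flow identifier to a bound target with an idle timeout."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock      # injectable so tests can control time
        self._entries = {}      # flow_id -> (target, expires_at)

    def lookup_or_bind(self, flow_id, select_target):
        """Return the bound target, binding a new one if none is live.

        select_target is called only for a new or expired flow; it
        stands in for the optimal EGW or instance selection.
        """
        now = self.clock()
        entry = self._entries.get(flow_id)
        if entry is not None and entry[1] > now:
            target = entry[0]
        else:
            target = select_target()
        # Assumption: each packet of the flow refreshes the timeout.
        self._entries[flow_id] = (target, now + self.timeout_s)
        return target

    def age(self):
        """Remove expired bindings and return how many were removed."""
        now = self.clock()
        expired = [f for f, (_, exp) in self._entries.items() if exp <= now]
        for flow_id in expired:
            del self._entries[flow_id]
        return len(expired)


# Level-1 table on the IGW: 5-tuple -> preferred EGW.
level1 = FlowBindingTable(timeout_s=30)
flow = ("198.51.100.7", 40000, "Sid1", 443, "tcp")
egw = level1.lookup_or_bind(flow, lambda: "EGW01")
```

With this structure, a later packet of the same flow returns the
existing binding even if the selection function would now prefer a
different EGW, which is the affinity behavior described above.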
9. Security Considerations
(1) There are many computing instances, and their resource
information changes rapidly over time. Because this information is
carried in routing protocols, the network may experience frequent
updates. Section 5.2 provides a solution to avoid sending too many
updates.
(2) Because the two-level service routing model is used, the EGW does
not need to advertise the details of service instances or aggregate
routes to the IGW. Clients can access service instances only by
carrying the SID. In the future, authorization management for SIDs
will be added to further improve system access security.
10. Acknowledgements
To be added upon contributions, comments and suggestions.
11. IANA Considerations
There are no IANA considerations in this document.
12. References
12.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC5440] Vasseur, JP., Ed. and JL. Le Roux, Ed., "Path Computation
Element (PCE) Communication Protocol (PCEP)", RFC 5440,
DOI 10.17487/RFC5440, March 2009,
<https://www.rfc-editor.org/info/rfc5440>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L.,
Decraene, B., Litkowski, S., and R. Shakir, "Segment
Routing Architecture", RFC 8402, DOI 10.17487/RFC8402,
July 2018, <https://www.rfc-editor.org/info/rfc8402>.
[RFC8754] Filsfils, C., Ed., Dukes, D., Ed., Previdi, S., Leddy, J.,
Matsushima, S., and D. Voyer, "IPv6 Segment Routing Header
(SRH)", RFC 8754, DOI 10.17487/RFC8754, March 2020,
<https://www.rfc-editor.org/info/rfc8754>.
12.2. Informative References
[I-D.huang-service-aware-network-framework]
Huang, D., Tan, B., and D. Yang, "Service Aware Network
Framework", Work in Progress, Internet-Draft, draft-huang-
service-aware-network-framework-01, 22 November 2022,
<https://datatracker.ietf.org/doc/html/draft-huang-
service-aware-network-framework-01>.
[I-D.ietf-teas-rfc3272bis]
Farrel, A., "Overview and Principles of Internet Traffic
Engineering", Work in Progress, Internet-Draft, draft-
ietf-teas-rfc3272bis-27, 12 August 2023,
<https://datatracker.ietf.org/doc/html/draft-ietf-teas-
rfc3272bis-27>.
[I-D.li-dyncast-architecture]
Li, Y., Iannone, L., Trossen, D., Liu, P., and C. Li,
"Dynamic-Anycast Architecture", Work in Progress,
Internet-Draft, draft-li-dyncast-architecture-08, 16
January 2023, <https://datatracker.ietf.org/doc/html/
draft-li-dyncast-architecture-08>.
[I-D.yao-cats-ps-usecases]
Yao, K., Trossen, D., Boucadair, M., Contreras, L. M.,
Shi, H., Li, Y., and S. Zhang, "Computing-Aware Traffic
Steering (CATS) Problem Statement, Use Cases, and
Requirements", Work in Progress, Internet-Draft, draft-
yao-cats-ps-usecases-03, 30 June 2023,
<https://datatracker.ietf.org/doc/html/draft-yao-cats-ps-
usecases-03>.
Authors' Addresses
Huakai Fu
ZTE Corporation
Email: fu.huakai@zte.com.cn
Daniel Huang
ZTE Corporation
Email: huang.guangping@zte.com.cn
Liwei Ma
ZTE Corporation
Email: ma.liwei1@zte.com.cn
Wei Duan
ZTE Corporation
Email: duan.wei1@zte.com.cn