Network Working Group | D. Purkayastha |
Internet-Draft | A. Rahman |
Intended status: Informational | D. Trossen |
Expires: April 30, 2018 | InterDigital Communications, LLC |
Z. Despotovic | |
R. Khalili | |
Huawei | |
October 27, 2017 |
Alternative Handling of Dynamic Chaining and Service Indirection
draft-purkayastha-sfc-service-indirection-01
Today’s networks must meet stringent requirements, such as low latency, high availability and reliability, in order to support use cases such as IoT, gaming, content distribution and robotics. Networks need to be flexible and dynamic in their allocation of services and resources. Network operators should be able to reconfigure the composition of a service and steer users towards new service end points as users move or resource availability changes. SFC allows network operators to easily create and reconfigure service function chains dynamically in response to changing network requirements. We discuss a use case where a Service Function Chain can adapt or self-organize as demanded by the network conditions without requiring SPI re-classification. This can be achieved, for example, by decoupling the service consumer and the service endpoint through a new service function proposed in this draft. We describe a few requirements for this service function to enable dynamic switching between consumer and end point.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 30, 2018.
Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
The requirements on today’s networks are very diverse, as networks must enable multiple use cases such as IoT, content distribution and gaming, as well as network functions such as Cloud RAN. Every use case imposes certain requirements on the network. These requirements vary from one extreme to the other and often pull in divergent directions. Network operators and service providers are pushing many functions towards the edge of the network in order to be closer to the users. This reduces latency and backhaul traffic, as user requests can be processed locally.
The situation becomes more challenging when user mobility and the non-deterministic availability of compute and storage resources are considered. The impact is felt most at the edge of the network: as users move, their point of attachment changes frequently, which results in (at least partially) relocating the service as well as the service endpoint. Furthermore, network functions are pushed more and more towards the edge, where compute and storage resources are constrained and availability is non-deterministic. In content delivery applications, storage resources may also need to be moved to where user concentration is highest.
We describe a few use cases in the next section and derive the requirements for composing new services and service paths in a dynamic edge network. We address this dynamicity by introducing a special Service Function, called SRR (service request routing). We describe the problems associated with today’s networks and with Layer 3 based approaches to handling dynamicity in the network. We then discuss how such a new Service Function with certain capabilities can handle the dynamicity better than these conventional methods. Note: State migration is not in the scope of our solution, since it is a general problem pertaining to re-chaining stateful SFs.
The data center use case draft [I-D.ietf-sfc-dc-use-cases] describes an East West traffic use case. This is the predominant traffic in data centers today. Server virtualization has led to the new paradigm where virtual machines can migrate from one server to another across the data center. This explosion in east-west traffic is leading to newer data center network fabric architectures that provide consistent latencies from one point in the fabric to another.
SFCs applied in an enterprise or service provider data center can be broadly categorized into two types: Access SFCs and Application SFCs.
Access SFCs are focused on servicing traffic entering and leaving the data center while Application SFCs are focused on servicing traffic destined to applications. Service providers deploy a single "Access SFC" and multiple "Application SFCs" for each tenant. Enterprise data center operators on the other hand may not have a need for Access SFCs depending on the size and requirements of the enterprise.
In carrier networks, operators may deploy multiple data centers dispersed geographically. Each data center may host different types of service functions. For example, latency sensitive or high usage service functions are deployed in regional data centers while other latency tolerant, low usage service functions are deployed in global or central data centers. In such deployments, SFCs may span multiple data centers and enable operators to deploy services in a flexible and inexpensive way.
Within the data center as well as in inter data center scenarios, users are serviced by multiple SFs distributed inside as well as outside a location. In such scenarios, service function chains should be able to reselect service functions and redirect traffic very quickly. The draft identifies that static service chains do not allow for modifying the SFCs, as modification requires the ability to add or remove SNs to scale the service capacity up and down. Likewise, the ability to dynamically pick one among the many SN instances is not available.
Take the following video orchestration service example from the ETSI MEC Requirements document [ETSI_MEC]. The proposed use case of edge video orchestration suggests a scenario where visual content can be produced and consumed at the same location close to consumers in a densely populated and clearly limited area. Such a case could be a sports event or concert where a large number of consumers are using their handheld devices to access user-selected tailored content. The overall video experience is combined from multiple sources, such as local recording devices, which may be fixed as well as mobile, and master video from a central production server. The user is given an opportunity to select tailored views from a set of local video sources.
3GPP Rel. 15 introduces the notion of the service-based interface (SBI) as an alternative to the traditional call pattern invocation of network functions. This introduction targets the support for replication, e.g., driven by virtualized functions, as well as supporting alternative interactions, e.g., for different vertical market specific control planes, by making the discovery as well as composition of new interactions more flexible.
We believe that SFC is a suitable framework for the interconnection of such network functions through the new SBI. One of the aforementioned driving forces, namely the replication of functions, aligns with our thinking in this draft in that indirections to new vertical instances need to be dynamic, reacting to the appearance of new virtual instances or to changes in policies for the selection of specific instances by specific calling entities.
In such a dynamic network environment, the capability to dynamically compose new services from available services as well as move a service instance in response to user mobility or resource availability is desirable. SFC allows network operators as well as service providers to compose new services by chaining individual service functions towards the composed new service. In a dynamic network environment where service functions move frequently because of user movement, load balancing or resource modification, service function chains and the service end points need to be created and recreated frequently. SFC, as defined in IETF, is capable of modifying the service chain dynamically in response to network conditions.
In order to route the service requests to service end points in a dynamic manner, we identify the following desirable features in a service function chain:
[RFC7498] captures the problems associated with existing service deployments. The problems are described below at a high level:
These factors provide motivation for a simplified and flexible service insertion model that addresses many of the current shortcomings and provides new, much needed functionality to enable service deployments in modern network environments. Service chaining accomplishes this by considering service functions as resources, with associated attributes, available for scheduled consumption. Selective traffic, subject to policy, may then be “steered” to the requisite service resources, along with any “extra” information referred to as metadata. This metadata is used for policy enforcement.
A basic form of service chaining may be realized using existing transport encapsulations. This method of chaining relies upon the tunneling of selected data between service functions. Although this form of service chaining achieves some level of abstraction from the underlying topology, it does not truly create a service plane. NSH [I-D.ietf-sfc-nsh] is a distinct identifiable plane that can be used across all transports to create a service chain and exchange metadata along the chain.
Fundamentally, however, the notion of "services" in SFC is tied into specific service function endpoints, which lie along a well-defined service function path (SFP) where the path is defined through lower layer transport encapsulations. If any such service function endpoint changes, the service chain needs to be adjusted; a procedure we outline in the following sub-section.
We revisit the dynamic service chain creation capability of NSH. NSH defines a new service plane protocol [I-D.ietf-sfc-nsh]. A Network Service Header (NSH) contains service path information and optionally metadata that are added to a packet or frame and used to create a service plane. A control plane is required in order to exchange NSH values with participating nodes, and to provision the same nodes with requisite information such as service path ID to overlay mapping.
The Network Service Header has three parts: a Base Header, a Service Path Header and a Context Header. The 4-byte Service Path Header follows the Base Header and defines two fields used to construct a service path, the Service Path Identifier (SPI) and the Service Index (SI):
The following figure depicts the service path header.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                Service Path ID                | Service Index |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 1: NSH Path Header
The service path identifier (SPI) is used to identify the service path that interconnects the needed service functions. It allows nodes to utilize the identifier to select the appropriate network transport protocol and forwarding techniques. The service index (SI) identifies the location of a packet within a service path. As packets traverse a service path, the SI is decremented post-service.
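The handling of these two fields can be illustrated with a minimal Python sketch. It assumes the 24-bit SPI and 8-bit SI layout of the NSH Service Path Header in network byte order; the function names are hypothetical and chosen only for this illustration, and rejecting a packet whose SI has reached zero is an assumption of the sketch.

```python
import struct
from typing import Tuple

SPI_BITS = 24  # width of the Service Path Identifier
SI_BITS = 8    # width of the Service Index


def pack_service_path_header(spi: int, si: int) -> bytes:
    """Pack SPI and SI into the 4-byte header, network byte order."""
    if not 0 <= spi < (1 << SPI_BITS):
        raise ValueError("SPI must fit in 24 bits")
    if not 0 <= si < (1 << SI_BITS):
        raise ValueError("SI must fit in 8 bits")
    return struct.pack("!I", (spi << SI_BITS) | si)


def unpack_service_path_header(data: bytes) -> Tuple[int, int]:
    """Return (SPI, SI) from the first 4 bytes of data."""
    (word,) = struct.unpack("!I", data[:4])
    return word >> SI_BITS, word & ((1 << SI_BITS) - 1)


def after_service(data: bytes) -> bytes:
    """Decrement the SI once a service function has processed the packet."""
    spi, si = unpack_service_path_header(data)
    if si == 0:
        # Assumption of this sketch: an exhausted SI is treated as an error.
        raise ValueError("SI exhausted")
    return pack_service_path_header(spi, si - 1)


header = pack_service_path_header(spi=42, si=255)
header = after_service(header)
print(unpack_service_path_header(header))  # (42, 254)
```

The SPI stays constant along the path while the SI counts down, which is why a change of path requires rewriting the SPI rather than the SI alone.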
SPI represents the service path and altering the path identifier results in a change of a service path. A change in SPI value is a result of re-classification. It means a node in the service path determined, based on policy, that the initial classification was incorrect or incomplete. If the updated classification results in the necessity of a new service path, the node updates the SPI and SI fields accordingly. The new identifier is then used to select the appropriate overlay topology. This allows service functions to alter the path of a packet without having to participate in the network topology and its associated control plane(s). The method to determine that an existing classification is incorrect and how to determine the new classification is not defined.
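As a rough illustration of re-classification, the following Python sketch lets a node consult a policy and, if the policy selects a different path, overwrite the SPI and SI of the packet. How a node decides that a classification is incorrect is not defined, so the policy function, the metadata key and the SPI values used here are purely hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Packet:
    spi: int
    si: int
    metadata: Dict[str, str] = field(default_factory=dict)


# A policy returns a new (SPI, SI) pair, or None to keep the current path.
Policy = Callable[[Packet], Optional[Tuple[int, int]]]


def reclassify(packet: Packet, policy: Policy) -> bool:
    """Apply the policy; return True if the service path was changed."""
    new_path = policy(packet)
    if new_path is None:
        return False
    packet.spi, packet.si = new_path
    return True


def video_policy(packet: Packet) -> Optional[Tuple[int, int]]:
    # Hypothetical rule: video traffic belongs on service path 200.
    if packet.metadata.get("content") == "video" and packet.spi != 200:
        return (200, 255)
    return None


pkt = Packet(spi=100, si=254, metadata={"content": "video"})
print(reclassify(pkt, video_policy), pkt.spi, pkt.si)  # True 200 255
```

Every change of end point handled this way requires a policy decision and a header rewrite at some node in the path.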
The emerging trend in today’s networks is to deploy network functions, services and applications at the edge of the network to support latency requirements, computational offload, traffic optimization, etc. As users move, the applications or services they use may need to be moved closer to their new location. This implies that another instance of the service function may need to be instantiated close to the user’s new location, which may result in re-establishing the service path from the newly instantiated service function to other service instances. It is also possible that the newly instantiated service function needs to be redirected to a new service end point (e.g., an application server) for various reasons, such as incomplete content, proximity to a data store or load balancing. In another scenario, a single instance of the service function may not be able to handle all users, so the service function may be instantiated more than once to balance user load. As the number of instances increases, and with mobility added, the complexity of service routing increases. It is anticipated that function chaining and re-chaining may occur constantly in the network.
The challenge of dynamic indirection may be better described by analyzing the working of CDNs, which dynamically (re-)direct user-initiated requests towards the most appropriate content instance. This task becomes more difficult as the granularity of instance placement increases. For instance, when a CDN is realized close to end users, specifically at the edge of the network, the specific content instance might need to be selected dynamically. After the initial selection, the instance may change during service execution.
In a conventional network, an instance of a service is found and selected using DNS. The subsequent service request is then routed through the network between the client and the service. If the user performs a DNS lookup to access content served by a CDN, the DNS service maintains a list of IP addresses that can be returned for a given domain name and tries to return the IP address of a node geographically close to the client. Should the service provider want to replace an instance of their service with another one at a different IP address (and potentially a different physical location, for reasons such as load balancing or reliability), the DNS tables must be updated, i.e., the service needs to be (re-)registered quickly. This is done by updating the local authoritative DNS server, which then propagates the new mapping to DNS services across the world. DNS propagation can take up to 48 hours, so fast and dynamic switching from one service instance to another is not possible in conventional networks. When relying on many surrogate service endpoints in the edge network, certain resources may be unavailable in one surrogate instance while available in another, so changes in redirection may be desirable; changes in local load also drive the need for such changes in redirection.
The other issue in conventional networks lies with mobility management procedures. These procedures use an anchor point, which terminates a session at the network edge. As the user moves around, traffic is redirected from the anchor point to the new point of attachment. Relying on typical mobility management approaches found in IP networks usually leads to inefficient ‘triangular’ routing of requests through this common ‘anchor’ point. This triangular routing increases the latency in reaching the new service function or service end points as users move.
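The effect of cached DNS mappings can be illustrated with a small Python simulation. It is only a sketch under stated assumptions: the two classes, the TTL value and the addresses (taken from the 192.0.2.0/24 documentation range) are hypothetical, and time is passed in explicitly rather than read from a clock.

```python
from typing import Dict, Tuple


class AuthoritativeDNS:
    """Holds the current name-to-address mapping and its TTL in seconds."""

    def __init__(self) -> None:
        self.records: Dict[str, Tuple[str, int]] = {}

    def register(self, name: str, ip: str, ttl: int) -> None:
        self.records[name] = (ip, ttl)

    def query(self, name: str) -> Tuple[str, int]:
        return self.records[name]


class CachingResolver:
    """Answers from its cache until the cached entry expires."""

    def __init__(self, upstream: AuthoritativeDNS) -> None:
        self.upstream = upstream
        self.cache: Dict[str, Tuple[str, float]] = {}  # name -> (ip, expiry)

    def resolve(self, name: str, now: float) -> str:
        cached = self.cache.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]
        ip, ttl = self.upstream.query(name)
        self.cache[name] = (ip, now + ttl)
        return ip


auth = AuthoritativeDNS()
resolver = CachingResolver(auth)
auth.register("www.foo.com", "192.0.2.10", ttl=3600)

first = resolver.resolve("www.foo.com", now=0)
auth.register("www.foo.com", "192.0.2.20", ttl=3600)  # service instance moves
stale = resolver.resolve("www.foo.com", now=60)       # cache still valid
fresh = resolver.resolve("www.foo.com", now=3600)     # cache expired
print(first, stale, fresh)
```

In this simulation the client keeps being directed to the old instance until the cached entry expires, even though the authoritative mapping changed after the first lookup.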
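A small worked example makes the cost of the detour concrete. The latency values below are hypothetical and chosen only to illustrate the arithmetic; the sketch compares the path through the anchor with a direct path from the new point of attachment.

```python
from typing import Dict, List, Tuple

# Hypothetical one-way link latencies in milliseconds.
LATENCY_MS: Dict[Tuple[str, str], int] = {
    ("new_attachment", "anchor"): 12,
    ("anchor", "service"): 8,
    ("new_attachment", "service"): 5,
}


def path_latency(path: List[str]) -> int:
    """Sum the latencies of consecutive hops along the path."""
    return sum(LATENCY_MS[(a, b)] for a, b in zip(path, path[1:]))


anchored = path_latency(["new_attachment", "anchor", "service"])
direct = path_latency(["new_attachment", "service"])
print(anchored, direct, anchored - direct)  # 20 5 15
```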
Traffic steering is a common procedure in managed networks, particularly at the edge, due to desired subscriber-centric traffic policies (e.g., related to pricing structures), resource requirements (e.g., related to using particular paths in the network) or mobility (e.g., users moving in a cellular network). Today’s methods for traffic steering include anchor-based mobility management as well as traffic classification, for instance, in packet gateways of cellular systems (using, e.g., deep packet inspection as well as port and address classification). While the former leads to inefficient ‘triangular’ traffic forwarding, the latter often requires additional state in the forwarders to differentiate traffic from one user to another.
The analysis of CDNs shows that dynamic indirection is a necessary requirement that networks need to support. The goal of this indirection is to provide user applications with the lowest possible latency. But as discussed above, relying on today’s techniques does not help in guaranteeing the same latency to user applications. On the contrary, there is a high possibility that latency may increase if we rely on Layer 3 based service redirection techniques.
SFC handles indirection through the use of the SPI: a packet needs to be reclassified, and an intermediate node changes the SPI. The following are the typical steps that happen in order to implement the indirection.
The indirection mechanism in SFC involves certain steps to process policy information and change the SPI in the packet header, making it suitable to handle dynamic indirection requirements. The SF proposed in this document provides an additional method to handle dynamic indirection of service requests that does not rely on the reclassification mechanism. Combining these two techniques may provide flexibility and improvement over a single method.
In order to route the service requests to service end points in a dynamic manner, we identify the following desirable features:
The following diagram shows the application of the newly proposed SRR service function in an example of media clients connecting to media servers. There may be more than one media function to support a CDN-like architecture, as well as surrogate servers to handle mobility and load balancing.
                                                  +--------+
                                                  |        |
     |-------------------------|------------------+  SRR   +
     |                         |                  |        |
     |                         |                  +---/|\--+
     |                         |                       |
+---\|/--+   +---------+   +--\|/--+   +------+   +----+---+
|        |   |         |   |       |   |      |   |        |
+ Client +-->+   IP    +-->+ Media +-->+ SRR  +-->+ Media  +
|        |   | Routing |   |  Fn1  |   |      |   |  Fn2   |
+--------+   +---------+   +-------+   +------+   +--------+
Figure 2: General SFC with SRR Flexible Chaining, Initiated via IP Routed Client Connection
The clients are connected to media functions through a frontend routed network, e.g., relying on standard IP routing, while media functions are chained via the newly proposed service request routing (SRR) function. Alternatively, we also envision utilizing the SRR function directly between the client SF and the media function SF, as outlined in the figure below.
                                                  +--------+
                                                  |        |
     |-------------------------|------------------+  SRR   +
     |                         |                  |        |
     |                         |                  +---/|\--+
     |                         |                       |
+---\|/--+   +---------+   +--\|/--+   +------+   +----+---+
|        |   |         |   |       |   |      |   |        |
+ Client +-->+   SRR   +-->+ Media +-->+ SRR  +-->+ Media  +
|        |   |         |   |  Fn1  |   |      |   |  Fn2   |
+--------+   +---------+   +-------+   +------+   +--------+
Figure 3: General SFC with SRR Flexible Chaining, Initiated via SRR Chained Client
For our considerations, we assume that each SF is realized by one or more service function endpoints (SFEs). Hence, instead of looking at "chaining" as a concept that connects specific SFEs along a well-defined SFP, we propose to look at "chaining" at the level of "named" service functions rather than their specific endpoints. With this in mind, the SRR service function lifts the relationship between the connecting SFs to the level of "logical" service functions rather than their specific realizing endpoints. Instead of relying on dynamic re-chaining whenever the relationship between specific SFEs changes, the SRR provides the selection of suitable SFEs while maintaining the logical relationship between the SFs. In Section 6.3, we will present the necessary extensions to the SFP concept to support this higher abstraction of "chaining" via "named" logical SFs. The SRR introduces flexibility in routing service requests from the client to specific SFEs. In the edge network, where users are moving and service end points may also change, having the flexibility to decide and steer service requests directly helps in guaranteeing the same latency to user applications. This is achieved by reducing the switching time from one SF to another. As the service end point changes, the routing function makes an immediate decision to route the request to the appropriate media server.
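The idea of chaining at the level of named SFs can be sketched in Python as follows. This is a minimal illustration, not the SRR design itself: the class and method names, the latency-based selection rule and the addresses (from the 192.0.2.0/24 documentation range) are all assumptions made for the example.

```python
from typing import Dict


class ServiceRequestRouter:
    """Maps a named SF to its currently known SFEs (hypothetical API)."""

    def __init__(self) -> None:
        # SF name -> {SFE address -> measured latency in ms}
        self._endpoints: Dict[str, Dict[str, float]] = {}

    def register(self, sf_name: str, sfe: str, latency_ms: float) -> None:
        self._endpoints.setdefault(sf_name, {})[sfe] = latency_ms

    def withdraw(self, sf_name: str, sfe: str) -> None:
        self._endpoints.get(sf_name, {}).pop(sfe, None)

    def select(self, sf_name: str) -> str:
        """Pick an SFE for the named SF; lowest latency is an assumed rule."""
        candidates = self._endpoints.get(sf_name)
        if not candidates:
            raise LookupError("no endpoint registered for " + sf_name)
        return min(candidates, key=candidates.get)


srr = ServiceRequestRouter()
srr.register("www.foo.com", "192.0.2.10", latency_ms=20.0)
srr.register("www.foo.com", "192.0.2.20", latency_ms=5.0)

before = srr.select("www.foo.com")      # nearest SFE
srr.withdraw("www.foo.com", "192.0.2.20")
after = srr.select("www.foo.com")       # remaining SFE, same SF name
print(before, after)
```

The caller keeps addressing "www.foo.com" throughout; only the SFE behind the name changes, so the logical chain does not have to be rebuilt.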
The possible improvements of using SRR within an SFC framework are listed below:
As a first proposed extension to the SFC framework, we introduce the notion of an "HTTP-based transport" utilizing URLs as the addressing scheme. With that, we can create SFPs as shown in Fig 4, i.e., "192.168.x.x -> www.foo.com -> 192.168.x.x -> www.foo2.com -> 192.168.x.x -> … -> www.fooN.com". It is this "name-based" relationship that we see possibly realized through specific replicated instances, where in turn the routing towards those specific instances is realized by the SRR.
                                                  +--------+
                                                  |        |
     |-------------------------|------------------+  SRR   +
     |                         |                  |        |
     |                         |                  +---/|\--+
     |                         |                       |
+---\|/--+   +---------+   +--\|/--+   +------+   +----+---+
|        |   |         |   |       |   |      |   |        |
+ Client +-->+   SRR   +-->+ Media +-->+ SRR  +-->+ Media  +
|        |   |         |   |  Fn1  |   |      |   |  Fn2   |
+--------+   +---------+   +-------+   +------+   +--------+

SFP:192.168.x.x-->www.foo.com-->192.168.x.x-->www.foo2.com-->192.168.x.x-->www.fooN.com
Figure 4: SFP with new HTTP-based Transport option
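One way to picture such a name-based SFP is as a list of hops, each tagged as either a locator or a name. The Python sketch below parses a path string written in the notation of Figure 4. The rule used to tell the two kinds of hop apart (a hop starting with a digit is a locator) is an assumption made only for this example, and the "x.x" placeholders are kept exactly as they appear in the figure.

```python
from typing import List, Tuple


def parse_sfp(text: str) -> List[Tuple[str, str]]:
    """Split an SFP string into (kind, value) hops."""
    hops = []
    for token in text.split("-->"):
        token = token.strip()
        # Assumption: hops that start with a digit are IP locators.
        kind = "locator" if token[0].isdigit() else "name"
        hops.append((kind, token))
    return hops


def named_functions(text: str) -> List[str]:
    """Return only the named SFs, i.e. the hops the SRR would resolve."""
    return [value for kind, value in parse_sfp(text) if kind == "name"]


sfp = "192.168.x.x-->www.foo.com-->192.168.x.x-->www.foo2.com"
print(named_functions(sfp))  # ['www.foo.com', 'www.foo2.com']
```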
In a pure SFC architectural framework, the Classifier function may interact with the SRR to obtain an SE (Service Encapsulation). For example, the Classifier function may look into the network locator map in Fig 4 and determine that the next SF is www.foo.com. It provides this information to the SRR to obtain the next hop information. The SRR returns the SE for the next hop, which can be “bitfield” information that is used in the overlay routing for this part of the SFP. The Classifier function uses this SE to route the incoming packet directly at the transport network level.
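The interaction between the Classifier and the SRR can be sketched as follows. This is a hedged illustration only: the class names, the link names, the per-link bit positions and the way the bitfield is built (one bit per link on the overlay path) are assumptions, since the encoding of the SE is not specified here.

```python
from typing import Dict, List


class SRR:
    """Hypothetical SRR that answers a name lookup with a bitfield SE."""

    def __init__(self, link_bit: Dict[str, int],
                 path_to: Dict[str, List[str]]) -> None:
        self.link_bit = link_bit  # link name -> bit position (assumed)
        self.path_to = path_to    # SF name -> links on the overlay path

    def service_encapsulation(self, sf_name: str) -> int:
        se = 0
        for link in self.path_to[sf_name]:
            se |= 1 << self.link_bit[link]
        return se


class Classifier:
    """Looks up the next named SF and asks the SRR for its SE."""

    def __init__(self, srr: SRR, locator_map: Dict[str, str]) -> None:
        self.srr = srr
        self.locator_map = locator_map  # current locator -> next SF name

    def classify(self, current_locator: str) -> int:
        next_sf = self.locator_map[current_locator]
        return self.srr.service_encapsulation(next_sf)


srr = SRR(
    link_bit={"nap1-sw1": 0, "sw1-sw2": 1, "sw2-nap3": 4},
    path_to={"www.foo.com": ["nap1-sw1", "sw1-sw2", "sw2-nap3"]},
)
classifier = Classifier(srr, locator_map={"client": "www.foo.com"})
se = classifier.classify("client")
print(format(se, "08b"))  # 00010011
```

If the SRR later selects a different endpoint for www.foo.com, only the entry in `path_to` changes; the Classifier keeps asking for the same name.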
Assuming such introduction of an HTTP-level transport notion, the SRR function can be decomposed further as shown in Fig 5.
+--------+ | | |-------------------------|------------------+ SRR + | | | | | | +---/|\--+ | | | +---\|/--+ +---------+ +--\|/--+ +------+ +----+----+ | | | | | | | | | | + Client +-->+ SRR +-->+Service+-->+ SRR +-->+ Service + | | | | | Fn1 | | | | Fn2 | +--------+ +---------+ +-------+ +------+ +---------+ / \ / \ / \ +--------------------------------------+ | +------------------+ | | | +-----+ +----+ | +-----+ | |---> | SFC | |NAP | | |NAP |-----> | | |Proxy| | | | | | | | | +-----+ +----+ | +-/|\-+ | | | Use Proxy if NAP| | | | | is not SFC | | | | | enabled | | | | +-------/|\--------+ | | | | | | | | | | | | +----------+ | | | |->| tSFF1 |----- | | +---/|\----+ | | | | | | | | +----------+ | | | | | | | | + PCE +---- | | | | | | +----------+ | | | +--------------------------------------+
Figure 5: SRR decomposition
Another option for the two functions routing via the SRR could be entirely link-local, i.e., there’s another simple tSFF2 between client and SRR as well as SF1 and SRR that is simply a link-local transport. The following figure describes this alternate option.
+--------+ | | |-------------------------|------------------+ SRR + | | | | | | +---/|\--+ | | | +---\|/--+ +---------+ +--\|/--+ +------+ +----+---+ | | | | | | | | | | + Client +-->+ SRR +-->+Service+-->+ SRR +-->+Service + | | | | | Fn1 | | | | Fn2 | +--------+ +---------+ +-------+ +------+ +--------+ / \ / \ / \ +-----+ +---------------------------------+ |tSFF2|--------->+----+ +-----+ | +--------+ +-----+ | |NAP | |NAP |----->| tSFF2 |--> | | | | | | +--------+ | +----+ +-/|\-+ | | | | | | | | | | | | | | | | | | | +-------+ | | | |---->| tSFF1 |--- | | +--/|\--+ | | | | | | | | +-------+ | | | | | | | | + PCE +--- | | | | | | +-------+ | | | +---------------------------------+
Figure 6: SRR decomposition using link-local client/function communication
The SRR function may be composed of the following functions:
+---------+ +---------+ | | | | 10010011 00000010 +--------+ +IP only +---+ ICN +--------| | ICN | |reciever | | NAP1 | | |-----------| NAP3 | |UE | +---------+ | | +---||---+ +---------+ | | || +----------+ +----------+ |-----||-------| | | | | | Cloud | |SDN Switch|---|SDN Switch| | | | | | | |---||---| +----------+ +----------+ || | || +---------+ +---------+ | +----||------+ | | | | | | | +IP only +---+ ICN +---------| + IP only + |sender UE| | NAP2 | 10100011 | Server | +---------+ +---------+ +------------+
Figure 7: Illustration of Bitfield-based Forwarding using SDN
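A forwarding decision based on such a bitfield can be sketched in a few lines of Python. The sketch assumes that each output port of a switch is assigned one bit position and that a packet is sent out on every port whose bit is set in the bitfield carried by the packet. The port names and bit assignments are hypothetical; only the example bit patterns are taken from Figure 7.

```python
from typing import Dict, List


def output_ports(bitfield: int, port_bit: Dict[str, int]) -> List[str]:
    """Return every port whose assigned bit is set in the bitfield."""
    return [port for port, bit in port_bit.items() if bitfield & (1 << bit)]


# Hypothetical port-to-bit assignment for one SDN switch.
switch_ports = {"to_nap3": 0, "to_cloud": 1, "to_nap2": 5}

print(output_ports(0b10010011, switch_ports))  # ['to_nap3', 'to_cloud']
print(output_ports(0b00000010, switch_ports))  # ['to_cloud']
```

Under this assumption the switch needs no per-flow state: the decision is a bitwise AND per port, which could be expressed as a wildcard match rule in an SDN switch.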
For the operations outlined in the previous section, we foresee that the following protocol changes are required:
This document requests no IANA actions.
TBD.