Internet DRAFT - draft-cao-dataplane-acceleration-framework
draft-cao-dataplane-acceleration-framework
Internet Engineering Task Force Z. Cao
Internet-Draft Q. Fu
Intended status: Experimental L. Deng
Expires: January 5, 2015 China Mobile
July 4, 2014
Data Plane Processing Acceleration Framework
draft-cao-dataplane-acceleration-framework-01
Abstract
It is getting popular to running data applications over general
purpose hardware/chipsets, instead of customized and dedicated
hardware/chipset. This way further decouples the software functions
from the hardware. But moving data processing intensive applications
to general purpose hardware is still challenging, although the
industry has supplied some proprietary solutions. This document
discusses the problems of data plane acceleration and proposes its
framework.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 5, 2015.
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
Cao, et al. Expires January 5, 2015 [Page 1]
Internet-Draft DPA Framework July 2014
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Requirements Language . . . . . . . . . . . . . . . . . . . . 3
3. DPA Framework . . . . . . . . . . . . . . . . . . . . . . . . 3
3.1. Framework . . . . . . . . . . . . . . . . . . . . . . . . 3
3.2. Components . . . . . . . . . . . . . . . . . . . . . . . 4
3.3. Protocol Portfolio . . . . . . . . . . . . . . . . . . . 5
4. Existing Work - Intel DPDK . . . . . . . . . . . . . . . . . 5
5. Fast Path across (Virtual) Network Functions . . . . . . . . 7
5.1. ForCES . . . . . . . . . . . . . . . . . . . . . . . . . 8
6. Open Questions to IETF . . . . . . . . . . . . . . . . . . . 8
7. Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 9
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9
9. Security Considerations . . . . . . . . . . . . . . . . . . . 9
10. Informative References . . . . . . . . . . . . . . . . . . . 9
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 10
1. Introduction
The need of running network data processing functions over general
purpose hardware/chipset (e.g., X86, PPC, etc) is multi-folded.
1. Decoupling software functions from hardware. Traditional network
devices are built upon dedicated or deep customized hardware and
chipsets. This way restricts the flexibility of both service
providers and network operators.
2. Network Function Virtualization (NFV). NFV is an initiative of
ETSI to virtualize the network functions to the overlay on top of
the virtualization layer. It provides network elasticity in that
the network functions can be scaled up/down according to the
traffic load. NFV solutions often bundle with the virtual
switches to provide VM-VM communications. Theses virtual
switches are running on top of the servers that bear the network
functions. Therefore, the need to accelerate the data processing
efficiency is indispensable.
3. Service Time-to-Market . Via the software and hardware
decoupling, the speed to provide new services (TTM) is greatly
enhanced. Since more and more services would like to have the
most convenient time to market, they would also like to move data
processing functions on top of general purpose hardware/chipsets.
Cao, et al. Expires January 5, 2015 [Page 2]
Internet-Draft DPA Framework July 2014
4. Capex and Opex pressure. Having the network functions running
over general purpose device will help operators to cut down their
Capex and Opex.
5. Cost-performance targets: software development, debug and
integration is simplified; processor resource utilization is
improved because the control plane and data plane can be
distributed among cores with greater flexibility; development
schedule risk is minimized and software maintenance is much
easier with a common code base and a single development
environment.
2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
3. DPA Framework
NF (Network Function): A functional building block within an
operator's network infrastructure, which has well-defined external
interfaces and a well-defined functional behaviour. Note that the
totality of all network functions constitutes the entire network and
services infrastructure of an operator/service provider. In
practical terms, a Network Function is today often a network node or
physical appliance. [Quoted from ETSI NFV]
3.1. Framework
The framework is depicted in Figure 1. Framework.
Cao, et al. Expires January 5, 2015 [Page 3]
Internet-Draft DPA Framework July 2014
+--------------------+ +-+
|Buffer Management | | |
|Queue Management | |A|===== App
|Memory Management |==|P|===== App
|Flow Classification | |I|===== App
|Other techniques | | |
+--------------------+ +-+
|| || ||
+----------------------------------------------+
| Hardware Abstraction Layer |
+----------------------------------------------+
User Space
---------------------------------------------------------
|| || || Kernel Space
+--------------------------------------------------+
| +--------------------------+ |
| |Hardware Abstraction Layer| OS Kernel |
| +--------------------------+ |
+--------------------------------------------------+
|| || ||
+----------------------------------------------+
| Platform Hardware |
+----------------------------------------------+
Figure 1
3.2. Components
The DPA may include the following components.
Memory/Buffer Manager. The Memory/Buffer Manager is responsible for
allocating NUMA-aware pools of objects in memory and balancing memory
bandwidth utilization across the channels. Such management can
significantly reduces the amount of time the operating system must
spend allocating and de-allocating buffers.
Queue Manager. The Queue Manager is responsible for queue
scheduling. The ultimate goal of the Queue Manager is to allow
different software components to process packets, while avoiding
unnecessary wait times.
Flow Classification. The Flow Classification component is an
efficient mechanism for generating a hash used to quickly combine
packets into flows, which enables faster processing and greater
throughput.
Poll Mode Drivers. The Poll Mode Drivers is capable of speeding up
the packet pipeline for 1 GbE and 10 GbE ethernet controllers by
Cao, et al. Expires January 5, 2015 [Page 4]
Internet-Draft DPA Framework July 2014
receiving and transmitting packets without the use of asynchronous,
interrupt- based signaling mechanisms, which have a lot of overhead.
Environment Abstraction Layer. The Environment Abstraction Layer
provides an abstraction to platform-specific initialization code,
which eases application porting effort. The EAL provides access to
low-level resources (hardware, memory space, logical cores, etc.)
through a generic interface that hides the environment specifics from
the applications and libraries.
3.3. Protocol Portfolio
On one hand, for the data plane, DPA should provide an efficient
stack for common protocols utilized by various internet applications,
including but not limited to:
1. Link layer: Layer 2 switch, VLAN.
2. Network layer: IPv4 and IPv6 for packet routing; MPLS and GRE/GTP
for tunneled routing; IPsec, TLS/DTLS, NAT and QoS support for
security and management features.
3. Transport layer: SCTP/MPTCP as well as TCP and UDP, for multi-
homing/stream traffic.
4. Application layer: SSL termination for remote administration of
virtualized device.
On the other hand, for the control plane, DPA should provide an
efficient stack for common protocols utilized by various network
devices/ISPs for improved operation and Management, including:
NetFlow, sFlow, IPFIX, SPAN, RSPAN for VM traffic monitory, LACP, STP
and openflow for L2/L3 management.
4. Existing Work - Intel DPDK
This section introduces DPDK [DPDK].
Intel Data Plane Development Kit (DPDK) is a set of libraries and
drivers for fast packet processing on x86 platforms. It runs mostly
in Linux userland.The idea of DPDK has significantly advanced the
concept of consolidation of data and control planes on a general
purpose processor. Such idea greatly boosts packet processing
performance and throughput by providing Intel architecture-optimized
libraries to accelerate L3 forwarding, yielding performance that
scales linearly with the number of cores, in contrast to native
Linux.
Cao, et al. Expires January 5, 2015 [Page 5]
Internet-Draft DPA Framework July 2014
The Intel DPDK contains a growing number of libraries, whose source
code is available for developers to use and/or modify in a production
network element. Likewise, there are various usecase examples, such
as L3 forwarding, load balancing, and timers, that help reduce
development time. The libraries can be used to build applications
based on "run-to completion" or "pipeline" models, enabling the
equipment provider's application to maintain complete control.
the Intel DPDK software is also available to aid in the development
of I/O intensive applications running in a virtualized environment.
This combination allows application developers to achieve near-native
performance.
The Intel DPDK provides a simple framework for fast packet processing
in data plane application. Developers may use the code to understand
some of the techniques employed, to build upon for prototyping, or to
add their own protocol stacks. SR-IOV features are also used for
hadware-based I/O sharing in I/O virtualization (IOV) mode.
Therefore, it is possible to partition intel 82599 10 Gb Ethernet
controller NIC resources logically and expose them to a VM as a
virtual function
Furthermore, 6WIND has developed a number of value-added enhancements
to the Intel DPDK library that provide increased system functionality
and performance compared to the baseline software. These value-added
enhancements include the following aspects.
Hige-performance software crypto support, implemented via the Intel
Advanced Encryption Standard New Instructions (Intel AES-NI) in the
Intel Xeon processor E5600 series and E5-2600 v2 series.
Device monitoring and statistics functions,such as Linux Ethtool MTU
support, full RX/TX queue statistics and CRC error statistics, which
enable improved system-level profiling, analysis and debug.
Support for additional Network Interface Cards(NICs), such as the
Intel 82571EB Gibabit Ethernet controller, beyond those supported in
the baseline Intel DPDK library.
6WIND also provides a range of optional add-on extensions to the
Intel DPDK designed to improve the cost/performance of both physical
and virtual networking appliances while enabling the use of the intel
DPDK in software-defined networks. These optional add-ons include:
IPsec acceleration, achieved through integration of the Intel Multi-
buffer Crypto for IPSec library;
Cao, et al. Expires January 5, 2015 [Page 6]
Internet-Draft DPA Framework July 2014
Crypto acceleration via support of an external accelerator, the Intel
Communications Chipset 89xx series, which is part of Intel's next-
generation communications platform,codenamed "Crystal Forest"
Virtualization-related enhancements that maximize system performance
by removing key I/O and communication bottlenecks include:
1. I/O Virtualization(IOV), an industry-standard approach for
increasing the performance of virtual network appliances by
bypassing the virtual switch within the hypervisor, thus removing
the I/O performance constraints imposed by the virtual switch.
2. A virtual NIC(vNIC) driver that leverages communication between
virtual machines via the virtual switch, enabling the efficient
development and provisioning of systems with multiple VMs and
significant East-West network traffic.
3. For system that require the ultimate level of performance for
East- West traffic between VMs, a VM-to-VM driver enables direct
VM-to-VM communication, bypassing the virtual switch while
remaining fully compatible with industry-standard hypervisors.
These Intel DPDK enhancements and optional add-ons are maintained by
6WIND as private branch, regularly synchronized with Intel's on-going
releases of the baseline library. They are delivered to customers
either as a stand alone library or, for applications that also
require high- performance packet processing software, and integrated
within the 6WINDGate software solution.
The 6WINDGate packet processing software is designed to solve the
problem of exploiting the potential packet processing performance of
multicore processor through a fast pth-based architecture, while
incorporating a comprehensive set of high performance networking
protocols fully optimized for intel Xeon processor-based platforms.
5. Fast Path across (Virtual) Network Functions
Previous sections basically talk about the data path acceleration on
one device with multiple threads/VMs sharing the physical resource.
This section will talk about the data plane acceleration across
multiple (virtual) network functions.
In NFV, layer 4-7 network functions are virtualized on top of the
computing nodes. But sometimes, these vNFs are only used for session
estabalishment, after which the packets can be handled by the L2/3
devices. Given that he higher layer the packet is being processed,
the more challenge to its performance. So in some scenarios, it is
desirable to offload the packet processing to the L2/3 fabrics,
Cao, et al. Expires January 5, 2015 [Page 7]
Internet-Draft DPA Framework July 2014
eliminating the burden on the higher layer NFs. The scenario is
depicted in Figure 2.
One vivid example is the ACL or Parental Control services. The ACL
Network Function will determine the forwarding rules configured by
its user, say, IP 5 tuples. After the session has been established,
the ACL NF can inform the L2/3 devices about the forwarding rule in a
control message. And the followed packets will be handled according
to the logics.
+---------+ +---------+
| L4-7 NF | | L4-7 NF |
,__ +---------+__. ,____+---------+__.
/ \ / \__
/ \ / \
+--------+ +--------+ +--------+
| L2/3 |_______________| L2/3 |__________________| L2/3 |
| Fabric | | Fabric | | Fabric |
+--------+ +--------+ +--------+
Figure 2: Fast Path across devices
5.1. ForCES
Forwarding and Control Element Separation (ForCES)
[RFC5810][I-D.ietf-forces-protoextension] defines an architectural
framework and associated protocols to standardize information
exchange between the control plane and the forwarding plane .
In the Fast path offload senario described above, the ForCES
protocols could be used or extended to serve as the communicaton
protocols between the NF and L2/3 fabrics.
6. Open Questions to IETF
IETF has been design Layer 2&3 protocols, and most of them are
dedicated to data plane processing. The efficient implementation of
protocol and tailoring them for specific hardware/chipsets have not
been considered as main-stream IETF work (there are indeed some
thread anyway, e.g. tailor for M2M). But to make IETF protocols as
efficient as possible is definitely within the scope of IETF. Below
are some discussion of open questions to IETF w.r.t. the data plane
process acceleration topic.
1. Importance. The game changing initiatives already started. NFV
and further virtualization and decoupling practices are
happening. Before the questions have been ported to specialized
Cao, et al. Expires January 5, 2015 [Page 8]
Internet-Draft DPA Framework July 2014
hardware, but now the industry is changing the game. Do it need
the standardization collaboration?
2. Relevance. As we authors believe it, to make IETF protocols as
efficient as possible is definitely within the scope of IETF.
Although implementation techniques are mostly software
engineering practice and have no business with any SDOs, the
abstract API design and exposure of lower layer capability will
definitely benefit the data plane processing efficiency.
3. Necessity. Now that DPDK is already open source. But the
experience in DPDK can feedback to IETF on how to improve the
protocol design in promoting data plane acceleration
effectiveness.
7. Acknowledgement
This work was inspired by the DPDK open source project.
Thank you for the discussion with Hui Deng, Dapeng Liu, and Lingli
Deng on how to improve and promote this document.
8. IANA Considerations
To be specified.
9. Security Considerations
TBD.
10. Informative References
[DPDK] "Packet Processing - Intel DPDK, https://01.org/packet-
processing/overview/dpdk-detail", .
[I-D.ietf-forces-protoextension]
Salim, J., "ForCES Protocol Extensions", draft-ietf-
forces-protoextension-02 (work in progress), June 2014.
[NFVE2E] "Network Functions Virtualisation: End to End
Architecture, http://docbox.etsi.org/ISG/NFV/70-
DRAFT/0010/NFV-0010v016.zip", .
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
Cao, et al. Expires January 5, 2015 [Page 9]
Internet-Draft DPA Framework July 2014
[RFC5810] Doria, A., Hadi Salim, J., Haas, R., Khosravi, H., Wang,
W., Dong, L., Gopal, R., and J. Halpern, "Forwarding and
Control Element Separation (ForCES) Protocol
Specification", RFC 5810, March 2010.
Authors' Addresses
Zhen Cao
China Mobile
Xuanwumenxi Ave. No. 32
Beijing 100053
China
Email: zehn.cao@gmail.com, caozhen@chinamobile.com
Qiao Fu
China Mobile
Xuanwumenxi Ave. No. 32
Beijing 100053
China
Email: fuqiao@chinamobile.com
Lingli Deng
China Mobile
Xuanwumenxi Ave. No. 32
Beijing 100053
China
Email: denglingli@chinamobile.com
Cao, et al. Expires January 5, 2015 [Page 10]