Network Working Group T. Kim
Internet-Draft E. Paik
Intended status: Informational KT
Expires: September 22, 2016 March 21, 2016
Considerations for Benchmarking High Availability of NFV Infrastructure
draft-kim-bmwg-ha-nfvi-01
Abstract
This document lists additional considerations and strategies for
benchmarking the high availability of NFV infrastructure when network
functions are virtualized and run on NFV infrastructure.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 22, 2016.
Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
Kim & Paik Expires September 22, 2016 [Page 1]
Internet-Draft Benchmarking High Availability of NFV Inf. March 2016
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Considerations for Benchmarking High Availability of NFV
Infrastructure
2.1. Definitions for High Availability Benchmarking Test
2.2. Configuration Parameters for Benchmarking Test
3. High Availability Benchmarking test strategies
3.1. Single Point of Failure Check
3.2. Failover Time Check
4. Security Considerations
5. IANA Considerations
6. Normative References
Authors' Addresses
1. Introduction
As both the volume and the variety of traffic increase massively,
operators are adopting SDN and NFV, a new networking paradigm, in
order to secure scalability and flexibility. Service providers and
vendors are developing SDN and NFV solutions to reduce CAPEX and
OPEX, focusing on increasing the scalability and flexibility of the
network through programmable networking.
As VNFs and NFVI replace legacy network devices and operators select
the best fit among various products, operators face several issues,
such as availability, resiliency, and failures that cannot be
measured. Above all, they want to ensure the availability of the VNF
products and their infrastructure.
Customers expect service availability as high as five 9s on any
infrastructure that operators offer, because the standard
availability of 4G mobile communication on operators' legacy physical
networks is already five 9s, that is, a downtime of 5.26 minutes per
year. Therefore, to meet customer expectations, services on the NFV
infrastructure also need to reach five 9s. Furthermore, the
availability of the NFV infrastructure itself needs to be almost six
9s, considering the impact of virtualization and of interoperability
among different vendor solutions and layers. From the operator's
point of view, availability is the most important feature, and
benchmarking tests for the high availability of NFV infrastructure
are correspondingly important.
This document investigates considerations for benchmarking tests of
the high availability of NFV infrastructure.
2. Considerations for Benchmarking High Availability of NFV
Infrastructure
This section defines and lists considerations which must be addressed
to benchmark the high availability of the NFV infrastructure.
2.1. Definitions for High Availability Benchmarking Test
Metrics for high availability benchmarking of NFV infrastructure are
as follows.
o Failure Detection Time : the time taken to detect a failure
o Failure Recovery Time : the time taken to recover from a failure
o Failure Rates : the frequency of failures
o Success Rates of Detection : the percentage of successful attempts
among all attempts to detect failures
o Success Rates of Recovery : the percentage of successful attempts
among all attempts to recover from failures
o Failure Impact Fraction : the fraction of the infrastructure
affected when a failure happens
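The metrics above could be derived from a series of failure-injection trials roughly as follows. This is only a sketch: the record layout and all field names are hypothetical, recovery time is assumed to run from detection to recovery, and a recovery attempt is assumed to happen only after a successful detection. None of these choices is prescribed by this document.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class FailureTrial:
    """One injected failure. All field names are hypothetical."""
    failed_at: float                # seconds, when the failure was injected
    detected_at: Optional[float]    # None if the failure was never detected
    recovered_at: Optional[float]   # None if recovery did not complete
    affected_units: int             # e.g. VMs impacted by the failure
    total_units: int                # e.g. VMs in the infrastructure under test


def summarize(trials: List[FailureTrial]) -> dict:
    """Aggregate per-trial records into the metrics listed above."""
    detected = [t for t in trials if t.detected_at is not None]
    # Assumption: recovery is only attempted for detected failures.
    recovered = [t for t in detected if t.recovered_at is not None]

    detection_times = [t.detected_at - t.failed_at for t in detected]
    # Assumption: recovery time runs from detection to recovery.
    recovery_times = [t.recovered_at - t.detected_at for t in recovered]
    impact = [t.affected_units / t.total_units for t in trials]

    return {
        "mean_detection_time": mean(detection_times) if detection_times else None,
        "mean_recovery_time": mean(recovery_times) if recovery_times else None,
        "detection_success_pct": 100.0 * len(detected) / len(trials) if trials else None,
        "recovery_success_pct": 100.0 * len(recovered) / len(detected) if detected else None,
        "mean_impact_fraction": mean(impact) if impact else None,
    }
```

Keeping undetected and unrecovered trials in the input, rather than dropping them, is what makes the two success rates computable from the same data set as the time metrics.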
Generally, availability and failure rates are defined as follows,
where MTBF stands for Mean Time Between Failures and MTTR stands for
Mean Time To Recovery.
Availability : MTBF / (MTBF + MTTR)
Failure Rates : 1 - Availability
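As a worked illustration of the two formulas, the following sketch computes availability from MTBF and MTTR and converts an availability level into yearly downtime. The function names and the 365-day year are assumptions made for this example only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year


def availability(mtbf: float, mttr: float) -> float:
    """Availability = MTBF / (MTBF + MTTR); both in the same time unit."""
    return mtbf / (mtbf + mttr)


def failure_rate(mtbf: float, mttr: float) -> float:
    """Failure Rates = 1 - Availability, as defined in this section."""
    return 1.0 - availability(mtbf, mttr)


def downtime_minutes_per_year(avail: float) -> float:
    """Expected downtime per year for a given availability level."""
    return (1.0 - avail) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Five 9s corresponds to roughly 5.26 minutes of downtime per year.
    print(round(downtime_minutes_per_year(0.99999), 2))
    # Six 9s leaves roughly half a minute per year.
    print(round(downtime_minutes_per_year(0.999999), 2))
```

Under these assumptions, five 9s works out to about 5.26 minutes per year, matching the figure quoted in the Introduction, while six 9s leaves only about 0.53 minutes per year.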
The failover procedure in an infrastructure proceeds as follows.
Failure -> Failure Detection -> Isolation -> Recovery
Failover time therefore starts at the moment the failure happens.
2.2. Configuration Parameters for Benchmarking Test
o Types of VNFs. Depending on the type of VNF, the following
differ.
1. What kind of operations they perform
2. How much CPU, memory, and storage they need
3. What kind of traffic pattern they usually face
o The specification of the physical machine on which the VMs run
o The mapping ratio of hardware resources to the VMs (virtual
machines) where VNFs run, such as vCPU:pCPU (virtual CPU to
physical CPU), vMEM:pMEM (virtual memory to physical memory), and
vNICs, as shown in the figure below.
o Types of hypervisor and the different limitations of their roles.
o Cloud Design Pattern of NFVI
o The composition of network functions in VNFs. For example, in
some vEPC implementations, the PGW (Packet Data Network Gateway)
and SGW (Serving Gateway) are combined, or the PGW, SGW, and MME
(Mobility Management Entity) are combined.
+---------------+ +---------------+
| vCPU for VNF1 | |               |
+---------------+ | vCPU for VNF2 |
+---------------+ |               | +---------------+
| vCPU for VNF2 | +---------------+ | vCPU for VNF1 |
+---------------+                   +---------------+
+---------------+ +---------------+ +---------------+ +---------------+
| vCPU for VNF3 | | vCPU for VNF2 | | vCPU for VNF3 | | vCPU for VNF3 |
+---------------+ +---------------+ +---------------+ +---------------+
+---------------+ +---------------+ +---------------+ +---------------+
|    pCPU 1     | |    pCPU 2     | |    pCPU 3     | |    pCPU 4     |
+---------------+ +---------------+ +---------------+ +---------------+
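The vCPU:pCPU mapping matters for high availability because it determines how many VNFs a single pCPU failure can affect. The following sketch uses a hypothetical placement table, loosely modeled on the figure above, to compute the overall mapping ratio and the set of VNFs sharing a given pCPU. The data layout and function names are illustrative assumptions.

```python
from typing import Dict, List, Set

# Hypothetical placement: pCPU -> VNFs owning a vCPU scheduled on it.
PLACEMENT: Dict[str, List[str]] = {
    "pCPU1": ["VNF1", "VNF2", "VNF3"],
    "pCPU2": ["VNF2", "VNF2"],
    "pCPU3": ["VNF1", "VNF3"],
    "pCPU4": ["VNF3"],
}


def vcpu_to_pcpu_ratio(placement: Dict[str, List[str]]) -> float:
    """Average number of vCPUs mapped onto each pCPU."""
    total_vcpus = sum(len(vnfs) for vnfs in placement.values())
    return total_vcpus / len(placement)


def vnfs_affected_by(placement: Dict[str, List[str]], failed_pcpu: str) -> Set[str]:
    """VNFs that lose at least one vCPU when the given pCPU fails."""
    return set(placement[failed_pcpu])


def impact_fraction(placement: Dict[str, List[str]], failed_pcpu: str) -> float:
    """Share of all placed VNFs touched by a single pCPU failure."""
    all_vnfs = {vnf for vnfs in placement.values() for vnf in vnfs}
    return len(vnfs_affected_by(placement, failed_pcpu)) / len(all_vnfs)
```

With this placement, a failure of pCPU 1 touches every VNF, while a failure of pCPU 4 touches only VNF3, so two test beds with the same overall ratio can still produce very different benchmark results.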
3. High Availability Benchmarking test strategies
This section discusses benchmarking test strategies for the high
availability of NFV infrastructure. For the continuity of services,
the following need to be considered.
3.1. Single Point of Failure Check
All devices and software can fail; therefore, redundancy is
mandatory. First, the redundancy implementation of every single
point of the NFV infrastructure must be tested, as listed below.
o Hardware
* Power supply
* CPU
* MEM
* Storage
* Network: NICs, ports, LAN cables, etc.
o Software
* The redundancy of VNFs
* The redundancy of VNF paths
* The redundancy of OvS
* The redundancy of vNICs
* The redundancy of VMs
+--------------------------------------------------------------+
|                       Physical Machine                       |
|                                                              |
|                                                              |
|  +--------------------------------------------------------+  |
|  |                Virtual Network Function                |  |
|  +--------------------------------------------------------+  |
|  +--------------------------------------------------------+  |
|  |                    Virtual Machine                     |  |
|  +--------------------------------------------------------+  |
|  +--------------------------------------------------------+  |
|  |                     Virtual Bridge                     |  |
|  +--------------------------------------------------------+  |
|  +--------------------------------------------------------+  |
|  |                       Hypervisor                       |  |
|  +--------------------------------------------------------+  |
|  +--------------------------------------------------------+  |
|  |                    Operating System                    |  |
|  +--------------------------------------------------------+  |
|  +--------------------------------------------------------+  |
|  |                    Generic Hardware                    |  |
|  +--------------------------------------------------------+  |
+--------------------------------------------------------------+
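A first pass over the checklist above can be automated by flagging every component type deployed with fewer than two instances. The sketch below assumes a hypothetical inventory dictionary; the component names and counts are examples, not measurements. Having two instances is only a precondition: each redundant pair still has to be exercised by an actual failure-injection test.

```python
from typing import Dict, List

# Hypothetical inventory: component type -> number of deployed instances.
INVENTORY: Dict[str, int] = {
    "power_supply": 2,
    "nic": 2,
    "storage": 1,
    "ovs": 1,
    "vnf": 2,
    "vm": 2,
}


def single_points_of_failure(inventory: Dict[str, int],
                             min_instances: int = 2) -> List[str]:
    """Return component types with fewer than min_instances instances."""
    return sorted(name for name, count in inventory.items()
                  if count < min_instances)


if __name__ == "__main__":
    for component in single_points_of_failure(INVENTORY):
        print(f"no redundancy configured for: {component}")
```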
3.2. Failover Time Check
Even though the components of the NFV infrastructure are redundant,
failover time can be long. For example, when a failure happens, the
failed VNF stops and should be replaced by a backup VNF, but the time
needed to switch over to the new VNF varies with the type of VNF,
stateless or stateful. In other words, redundancy does not guarantee
high availability; a short failover time is also required to reach
high availability. This section discusses a strategy for measuring
failover time.
In order to measure the failover time precisely, the time at which
the failure happens must be defined. The following are three
different criteria for that time.
1. The time when the failure actually happens
2. The time when the failure is detected by the manager or controller
3. The time when the failure event is reported to the operator
Because the actual operations in VNFs and the NFV infrastructure
start to change as soon as the failure happens, the precise time of
the failure must be criterion 1. The failover time is therefore
measured from the moment the failure happens at a point in the NFV
infrastructure or in the VNF itself.
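The effect of the starting criterion can be illustrated with a small sketch that computes the failover time from each of the three reference points. The event record and its field names are hypothetical, and all timestamps are assumed to come from a common, synchronized clock.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class FailoverEvent:
    """Timestamps in seconds on a common clock; field names are hypothetical."""
    failure_occurred: float   # criterion 1: the failure actually happens
    failure_detected: float   # criterion 2: manager or controller detects it
    operator_alerted: float   # criterion 3: the operator is notified
    service_restored: float   # the backup has taken over


def failover_times(event: FailoverEvent) -> Dict[str, float]:
    """Failover time measured from each of the three starting criteria."""
    return {
        "from_occurrence": event.service_restored - event.failure_occurred,
        "from_detection": event.service_restored - event.failure_detected,
        "from_alert": event.service_restored - event.operator_alerted,
    }


if __name__ == "__main__":
    event = FailoverEvent(100.0, 100.8, 101.5, 103.0)
    for criterion, seconds in failover_times(event).items():
        print(f"{criterion}: {seconds:.1f} s")
```

In this example, starting the clock at detection or at the operator alert hides the 0.8 s and 1.5 s during which service was already degraded, which is why measuring from the actual failure gives the only complete figure.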
4. Security Considerations
TBD.
5. IANA Considerations
No IANA Action is requested at this time.
6. Normative References
[NFV.REL001]
"Network Function Virtualization: Resiliency
Requirements", Group Specification ETSI GS NFV-REL 001
V1.1.1 (2015-01), January 2015.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<http://www.rfc-editor.org/info/rfc2119>.
Authors' Addresses
Taekhee Kim
KT
Infra R&D Lab. KT
17 Woomyeon-dong, Seocho-gu
Seoul 137-792
Korea
Phone: +82-2-526-6688
Fax: +82-2-526-5200
Email: taekhee.kim@kt.com
EunKyoung Paik
KT
Infra R&D Lab. KT
17 Woomyeon-dong, Seocho-gu
Seoul 137-792
Korea
Phone: +82-2-526-5233
Fax: +82-2-526-5200
Email: eun.paik@kt.com
URI: http://mmlab.snu.ac.kr/~eun/