Network Working Group Y. Zheng
Internet-Draft China Unicom
Intended status: Informational S. Xu
Expires: September 14, 2017 D. Dhody
Huawei Technologies
March 13, 2017
Usecases for Network Artificial Intelligence (NAI)
draft-zheng-opsawg-network-ai-usecases-00
Abstract
This document discusses the scope of Network Artificial Intelligence
(NAI), and the possible use cases that are able to demonstrate the
advantage of applying NAI.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 14, 2017.
Copyright Notice
Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Zheng, et al. Expires September 14, 2017 [Page 1]
Internet-Draft Usecases of NAI March 2017
Table of Contents
1. Introduction
2. NAI Architecture
3. NAI Use Cases
  3.1. Traffic Prediction and Re-Optimization/Adjustment
  3.2. Route Monitoring and Analytics
  3.3. Multilayer Fault Detection In NFV Framework
  3.4. Data Center Network Use Cases
    3.4.1. Service Function Chaining
4. Contributors
5. Security Considerations
6. IANA Considerations
7. Acknowledgement
8. References
  8.1. Normative References
  8.2. Informative References
Authors' Addresses
1. Introduction
Networks have become much more dynamic and complex, which poses new
challenges for network management and optimization. For example,
network management and optimization should be automated to reduce
human intervention and thus operational expense. Artificial
Intelligence (AI) and Machine Learning (ML) are promising approaches
to realizing such automation, and for some tasks may outperform human
operators. Furthermore, the popularity of the Software-Defined
Networking (SDN) paradigm makes the application of AI in networks
practical, since the SDN controller has complete knowledge of the
network status and can control the behavior of network nodes to
implement AI decisions.
AI and ML technologies can learn from historical data and make
predictions or decisions, rather than following strictly static
program instructions. They can dynamically adapt to a changing
situation and improve by learning from new data, which allows them
to learn and complete complicated tasks. They also have potential in
the network technology area, especially with SDN and Network Function
Virtualization (NFV).
This document presents the concept of Network Artificial
Intelligence (NAI). It first discusses the scope of NAI and then
describes some use cases that demonstrate the advantage of applying
NAI.
2. NAI Architecture
The architecture of NAI is defined in
[I-D.li-rtgwg-network-ai-arch]. In this architecture, the central
controller is the core part of Network Artificial Intelligence and
can be called the 'Network Brain'. Network Telemetry and Analytics
(NTA) engines can be introduced alongside the central controller. An
NTA engine includes a data collector, an analytics framework, data
persistence, and NAI applications.
                ^                                   ^
             (4)|                                   |(4)
+---------------|--------------+    +---------------|--------------+
| Domain 1      |              |    |               |     Domain 2 |
|        +------------+        |    |        +------------+        |
|        |  Central   |        |    |        |  Central   |        |
|     (1)| Controller |----------------------| Controller |(1)     |
|        |    with    |        |    |        |    with    |        |
|        |    NTA     |        |    |        |    NTA     |        |
|        +------------+        |    |        +------------+        |
|          /        \          |    |          /        \          |
|     (3)/            \        |    |        /            \(3)     |
|      /                \      |    |      /                \      |
| +--------+        +--------+ |    | +--------+        +--------+ |
| |        |        |        | |    | |        |        |        | |
| |Network | ...... |Network | |    | |Network | ...... |Network | |
| | Device |  (2)   | Device | |    | | Device |  (2)   | Device | |
| |   1    |        |   N    | |    | |   1    |        |   N    | |
| +--------+        +--------+ |    | +--------+        +--------+ |
|                              |    |                              |
+------------------------------+    +------------------------------+
Figure 1: An Architecture of Network Artificial Intelligence (NAI)
3. NAI Use Cases
3.1. Traffic Prediction and Re-Optimization/Adjustment
This subsection introduces the Path Computation Element (PCE)
[RFC4655] use cases in wide area networks (WANs). In the PCE
scenario, network data is collected through control plane protocols
such as the PCE Communication Protocol (PCEP) and BGP-LS [RFC7752],
and the data is passed to the PCE application. Through PCEP, the PCE
receives the state of Label Switched Paths (LSPs) from the network,
and through BGP-LS it receives the topology information. If network
telemetry is used, traffic information can also be received from the
network directly at the NTA engine, using protocols such as gRPC.
The PCE application (APP) maintains only the latest information. To
enable NAI, the history of all LSP and topology changes is stored in
an external data repository. Traffic monitoring data can also be
collected and stored if network telemetry is used. There are two use
cases in this scenario: (1) rerouting/re-optimization using
historical trends and predictions from AI; (2) traffic congestion
avoidance and AI-enabled auto-bandwidth adjustment.
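As a minimal sketch of the external data repository mentioned above,
the following Python fragment appends every LSP state report to a
history table instead of overwriting the latest state. The table
layout, the field names, and the helper functions are illustrative
assumptions; they are not defined by PCEP or by any PCE
implementation.

```python
import sqlite3


def open_history(path=":memory:"):
    """Open (or create) the hypothetical LSP history repository."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS lsp_history ("
        " ts REAL, lsp_name TEXT, oper_state TEXT, bandwidth_bps REAL)"
    )
    return db


def record_lsp_state(db, ts, lsp_name, oper_state, bandwidth_bps):
    """Append one state report; earlier rows are never updated or deleted."""
    db.execute(
        "INSERT INTO lsp_history VALUES (?, ?, ?, ?)",
        (ts, lsp_name, oper_state, bandwidth_bps),
    )
    db.commit()


def lsp_history(db, lsp_name):
    """Return all stored reports for one LSP, oldest first."""
    cur = db.execute(
        "SELECT ts, oper_state, bandwidth_bps FROM lsp_history"
        " WHERE lsp_name = ? ORDER BY ts",
        (lsp_name,),
    )
    return cur.fetchall()
```

An analytics component could then read the full sequence of reports
for an LSP with lsp_history() rather than only its current state.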
For use case (1), the analytics component in the NTA engine can use
the stored data to build models that predict the impact of network
events and the state of the LSPs. For example, it can use historical
trends to guide path computation to include or exclude specific
links. Finding correlations in the data, detecting anomalies, and
visualizing data are also possible.
The analytics component can also use the stored data to detect and
predict network events and request the PCE to take the necessary
actions. For example, it can use historical trends in network
bandwidth utilization to request re-optimization.
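One simple way to turn a historical trend into an include/exclude
hint is to extrapolate each link's utilization with a least-squares
line. The sketch below assumes equally spaced utilization samples in
the range 0 to 1; the function names, the 0.8 threshold, and the
horizon parameter are hypothetical, and a deployed NTA engine would
likely use a richer model.

```python
def linear_trend(samples):
    """Least-squares slope and intercept for equally spaced samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def links_to_exclude(history, horizon, threshold=0.8):
    """Return links whose extrapolated utilization reaches the threshold.

    history maps a link name to utilization samples, oldest first.
    horizon is the number of sample intervals to look ahead.
    """
    excluded = []
    for link, samples in history.items():
        slope, intercept = linear_trend(samples)
        predicted = intercept + slope * (len(samples) - 1 + horizon)
        if predicted >= threshold:
            excluded.append(link)
    return sorted(excluded)
```

The returned list could be passed to path computation as a set of
links to avoid, or used to decide when to request re-optimization.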
For use case (2), with network telemetry, the NTA engine can collect
per-link and per-LSP traffic flow data from the network using gRPC.
Such telemetry data includes statistics for tunnels, links, bandwidth
reservations, actual usage, delay, jitter, packet loss, etc. The NTA
engine also collects data about network events and their impact on
traffic flows. The analytics component can use telemetry data to
build traffic models that predict congestion ahead of recurring
events such as New Year or major sporting events. Based on the
congestion prediction, the PCE APP can reroute traffic to avoid
congested links. In addition, the NTA engine can perform prediction
and make the necessary changes to the network. In particular, the PCE
APP predicts bandwidth usage (i.e., bandwidth calendaring) by looking
at the historical trends of all sampled data instead of only the
latest sample. The collected data comes from the Traffic Engineering
Database (TEDB) and the LSP Database (LSP-DB), and can also include
scheduling information. The collected data also includes the
auto-bandwidth changes made under particular network events. Using
machine learning algorithms, the analytics component can correlate
such changes with the events and predict network events and their
impact.
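To illustrate the difference between using the whole sample history
and using only the latest sample, the following sketch derives a
bandwidth suggestion from a nearest-rank percentile of all samples.
The percentile choice (95) and the 10 percent headroom are
assumptions made for the example, and a percentile is a simple
stand-in for the machine learning models discussed above.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list, pct in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def suggested_bandwidth(samples, pct=95, headroom=1.1):
    """Suggest an LSP bandwidth from the full history of usage samples."""
    return percentile(samples, pct) * headroom
```

For a history such as [10, 12, 11, 40, 13, 12, 11, 10, 12, 9], the
latest sample (9) hides the earlier burst (40), while the
history-based suggestion still accounts for it.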
3.2. Route Monitoring and Analytics
This subsection introduces the BGP Monitoring Protocol (BMP)
[RFC7854] use case in wide area networks (WANs). BGP is known for its
flexibility and its ability to manage a large number of neighbors and
routes. It is also the basis for many overlay services such as L3VPN
and L2VPN. BMP can be used by the
controller to monitor BGP neighbor status and routing information on
the routers.
According to [RFC7854], the BMP client located in the router collects
BGP neighbor status, the routes for each neighbor, and events defined
by the user. It then passes this information through BMP to the
management station located on the controller. Based on BMP
monitoring of BGP, there are three use cases: (1) BGP route leak
monitoring; (2) BGP hijack monitoring; (3) traffic analytics.
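As a small illustration of the first step a management station
performs, the sketch below decodes the 6-byte BMP common header
(version, message length, message type) described in [RFC7854]. The
function name and the error handling are assumptions made for the
example; decoding of the per-peer header and the message body is
omitted.

```python
import struct

BMP_VERSION = 3
BMP_MSG_TYPES = {
    0: "Route Monitoring",
    1: "Statistics Report",
    2: "Peer Down Notification",
    3: "Peer Up Notification",
    4: "Initiation Message",
    5: "Termination Message",
    6: "Route Mirroring Message",
}
# Network byte order: 1-byte version, 4-byte length, 1-byte type.
COMMON_HEADER = struct.Struct("!BIB")


def parse_common_header(data):
    """Return (message_length, message_type_name) for one BMP message."""
    if len(data) < COMMON_HEADER.size:
        raise ValueError("truncated BMP common header")
    version, length, msg_type = COMMON_HEADER.unpack_from(data)
    if version != BMP_VERSION:
        raise ValueError(f"unsupported BMP version {version}")
    return length, BMP_MSG_TYPES.get(msg_type, f"Unknown ({msg_type})")
```

The message length covers the common header itself, so a collector
can use it to find the start of the next message in the TCP stream.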
Route leaks involve the illegitimate advertisement of prefixes
(blocks of IP addresses), which propagate across networks and lead to
incorrect or suboptimal routing. For case (1), NAI apps can analyze
the BGP data collected through BMP to detect route leaks.
For case (2), by manipulating BGP, an attacker can have data rerouted
in their favor, allowing them to intercept or modify traffic. If the
malicious announcement is more specific than the legitimate one, or
claims to offer a shorter path, the traffic may be directed to the
attacker. By broadcasting false announcements, the compromised router
may poison the RIB of its peers. After poisoning one peer, the
malicious routing information could propagate to other peers, to
other Autonomous Systems, and onto the Internet at large. Based on
monitoring BGP routes, ML algorithms can be trained to determine when
a hijack has taken place so that the necessary actions can be taken.
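The "more specific announcement" condition can be checked with a few
lines of Python. The sketch below is a rule-based check, not an ML
algorithm; its output could serve as a label or input feature for a
trained model. The table of expected origins, the function name, and
the result strings are hypothetical.

```python
import ipaddress


def classify_announcement(prefix, origin_as, expected):
    """Compare one announced route with a table of expected origins.

    expected maps a legitimate prefix (string) to its origin AS number.
    Returns "ok", "origin-mismatch", "suspicious-more-specific",
    or "unknown".
    """
    announced = ipaddress.ip_network(prefix)
    table = {ipaddress.ip_network(p): asn for p, asn in expected.items()}

    # Exact match: only the origin AS can differ.
    if announced in table:
        return "ok" if table[announced] == origin_as else "origin-mismatch"

    # Otherwise look for a known prefix that covers the announced one.
    for known, known_as in table.items():
        if known.version != announced.version:
            continue  # subnet_of() requires the same IP version
        if announced.subnet_of(known):
            # A longer prefix wins longest-match forwarding, so a
            # more-specific route from another origin attracts traffic.
            if origin_as == known_as:
                return "ok"
            return "suspicious-more-specific"
    return "unknown"
```

A more-specific route from the expected origin is treated as benign
here because operators announce such routes for traffic engineering;
that choice is an assumption of the sketch.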
For case (3), with BMP providing BGP changes and telemetry providing
network traffic information, NAI apps can analyze traffic trends,
predict traffic changes, and optimize traffic.
3.3. Multilayer Fault Detection In NFV Framework
The high reliability and high availability required for carrier-class
applications are a big challenge in virtualized, software-based
environments, where failures are normal. The interdependence between
NFV's abstraction levels and virtual resources is complex, as shown
in Figure 2. The dynamic characteristics of the resources in the
cloud environment make it difficult to locate a fault. Multilayer
fault detection for NFV networks and cloud environments is therefore
very useful.
                 +--------------------+
                 |      Central       |
                 |     Controller     |
                 |        with        |
                 |        NTA         |
                 +--------------------+
                    |       |       |
                    |       |       |
                    |       |       |
                    V       V       V
+-------------------------------------------------------+
|                                                       |
|      +-----------+  +-----------+  +-----------+      |
|      |   VNF1    |  |   VNF2    |  |   VNF3    |      |
|      +-----|-----+  +-----|-----+  +-----|-----+      |
|            |              |    VN-NF     |            |
|    +-------|--------------|--------------|-------+    |
|    |                    NFVI                     |    |
|    | +-----------+  +-----------+  +-----------+ |    |
|    | |  Virtual  |  |  Virtual  |  |  Virtual  | |    |
|    | | Computing |  |  Storage  |  |  Network  | |    |
|    | +-----------+  +-----------+  +-----------+ |    |
|    | +-----------------------------------------+ |    |
|    | |          VIRTUALIZATION LAYER           | |    |
|    | +--------------------|--------------------+ |    |
|    |                VI-Ha |                      |    |
|    |+---------------------|---------------------+|    |
|    ||            Hardware Resources             ||    |
|    ||+-----------+  +-----------+  +-----------+||    |
|    ||| Computing |  |  Storage  |  |  Network  |||    |
|    ||| Hardware  |  | Hardware  |  | Hardware  |||    |
|    ||+-----------+  +-----------+  +-----------+||    |
|    |+-------------------------------------------+|    |
|    +---------------------------------------------+    |
|                                                       |
+-------------------------------------------------------+
Figure 2: NAI in Multi-layer NFV Framework
For the virtualization layer, CPU performance, memory usage,
interface bandwidth, and other KPIs can be monitored. At the same
time, the resource occupancy and life cycle of NFV software processes
can also be monitored. Through NAI, the relevant statistical data at
multiple levels can be analyzed, and models can be set up to locate
the root cause of a possible fault in the multi-layer environment.
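A minimal sketch of multi-layer fault localization is given below,
under two stated assumptions: each KPI has a baseline of past values,
and an anomaly in a lower layer is the more likely root cause of
anomalies seen in the layers above it. The layer names, the z-score
test, and the threshold of 3.0 are illustrative stand-ins for the
models that NAI would build.

```python
from statistics import mean, pstdev

# Lowest layer first; the ordering is an assumption of this sketch.
LAYERS = ["hardware", "virtualization", "vnf"]


def is_anomalous(baseline, current, z_threshold=3.0):
    """Flag a value that deviates strongly from its baseline samples."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        return current != mu
    return abs(current - mu) / sigma > z_threshold


def candidate_root_cause(baselines, current):
    """Return (layer, kpi) for the lowest layer with an anomalous KPI.

    baselines maps layer -> kpi -> list of past values.
    current maps layer -> kpi -> latest value.
    Returns None when no KPI is anomalous.
    """
    for layer in LAYERS:
        for kpi in sorted(current.get(layer, {})):
            if is_anomalous(baselines[layer][kpi], current[layer][kpi]):
                return layer, kpi
    return None
```

With this ordering, a VNF latency spike that coincides with a burst
of NIC errors is attributed to the hardware layer first.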
3.4. Data Center Network Use Cases
Traditionally, data center networks have comprised a large number of
switches and routers that direct traffic based on the limited view of
each device. With the help of SDN/NFV, data center networks can adapt
more quickly to changing usage and traffic patterns. Real-time
traffic and usage data can be used to make data center management and
operations intelligent.
Various protocols such as sFlow and IPFIX can be used to obtain port
statistics as well as traffic samples. Over time this information
can help build traffic usage models on a per-port and per-flow basis.
With historical data as the base, the NTA engine can predict traffic
usage and issue the necessary instructions to the SDN controller or
NFV orchestrator. Such an instruction could be to reroute a flow to
avoid a congested port, or to bring another switch into service to
share load based on the predicted traffic demand.
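The prediction-then-instruction loop can be sketched with a seasonal
forecast: the next value for a port is estimated as the average of
the samples observed at the same position in earlier periods (for
example, the same hour on previous days). The function names, the
action strings, and the two thresholds are assumptions chosen for
illustration only.

```python
def seasonal_forecast(samples, period):
    """Forecast the next sample from the same slot in earlier periods."""
    if len(samples) < period:
        raise ValueError("need at least one full period of samples")
    next_index = len(samples)
    # Walk backwards one period at a time from the slot being forecast.
    past = samples[next_index - period::-period]
    return sum(past) / len(past)


def plan_action(forecast_util, reroute_at=0.7, scale_at=0.9):
    """Map a forecast port utilization (0 to 1) to a hypothetical action."""
    if forecast_util >= scale_at:
        return "add-capacity"
    if forecast_util >= reroute_at:
        return "reroute"
    return "none"
```

The returned action stands for the instruction that the NTA engine
would hand to the SDN controller or NFV orchestrator.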
The NTA engine should find correlations among the various network
data to build models and predict the impact of network events,
congestion, network utilization patterns, etc. Furthermore, the NTA
engine could detect anomalies based on historical patterns and help
in root cause analysis. The policy framework can be enhanced to take
the analytics into account.
The NTA engine could also obtain usage and health information from
the hosts (servers). Correlating this information with the
information received from the network could help in finding security
flaws and anomalies when the two do not match.
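As a simple example of such a cross-check, the sketch below compares
the byte count each host reports having sent with the byte count the
network observed on the switch port facing that host, and flags hosts
where the two views disagree. The data layout and the 10 percent
tolerance are assumptions made for the example.

```python
def mismatched_hosts(host_tx_bytes, port_rx_bytes, tolerance=0.1):
    """Return hosts whose self-reported traffic disagrees with the network.

    host_tx_bytes: host name -> bytes the host reports having sent.
    port_rx_bytes: host name -> bytes received on the port facing it.
    tolerance: allowed relative difference (0.1 means 10 percent).
    """
    flagged = []
    for host, reported in host_tx_bytes.items():
        observed = port_rx_bytes.get(host)
        if observed is None:
            flagged.append(host)  # no network-side data to compare with
            continue
        largest = max(reported, observed)
        if largest > 0 and abs(reported - observed) / largest > tolerance:
            flagged.append(host)
    return sorted(flagged)
```

A flagged host is only a candidate for further analysis; counter
resets or sampling differences can also produce a mismatch.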
3.4.1. Service Function Chaining
This subsection describes how to apply NAI to the SFC scenario to
intelligently reroute/re-optimize service chains, increase the
utilization of both Service Functions (SFs) and the network, and
intelligently select the Service Function Path (SFP) based on data
traffic trends.
As per [RFC7665], service function chaining (SFC) enables the
creation of composite (network) services that consist of an ordered
set of SFs that must be applied for specific treatment of received
packets and/or frames and/or flows selected as a result of
classification. The SFs of a chain are connected using a Service
Function Forwarder (SFF), which is responsible for forwarding traffic
to one or more connected SFs according to information carried in the
SFC encapsulation, as well as handling traffic coming back from the
SF.
Network telemetry information such as delay, jitter, and packet loss
from the network, and CPU/memory utilization from the SFs, can be
collected using sFlow/gRPC and stored in a persistent data
repository. The analytics component in the NTA engine can use the
stored data to build statistical models that predict the impact of
network events, traffic, and the state of the SFPs on the various
Service Function Paths, and instruct the SDN controller to take the
necessary actions. The SDN controller can calculate new paths or
reroute the SFC path to avoid congested ports/SFFs or overloaded SFs.
This correlation of application analytics from the SFs and network
analytics from the SFFs could enhance the intelligent management of
service chains for operators.
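The path selection step can be sketched as follows, assuming the
analytics component has already produced a predicted load for each SF
and a predicted delay for each SFF. Candidate paths that traverse an
overloaded SF are skipped, and the remaining path with the lowest
total SFF delay is chosen. All names, the data layout, and the 0.8
load limit are hypothetical.

```python
def path_cost(path, sf_load, sff_delay_ms, max_load=0.8):
    """Return the total predicted SFF delay, or None if an SF is overloaded."""
    if any(sf_load[sf] > max_load for sf in path["sfs"]):
        return None
    return sum(sff_delay_ms[sff] for sff in path["sffs"])


def select_sfp(candidates, sf_load, sff_delay_ms):
    """Pick the name of the usable candidate SFP with the lowest cost."""
    best = None
    for name, path in candidates.items():
        cost = path_cost(path, sf_load, sff_delay_ms)
        if cost is not None and (best is None or cost < best[1]):
            best = (name, cost)
    return best[0] if best else None
```

A None result means that every candidate path crosses an overloaded
SF, in which case the SDN controller would need to compute new paths.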
Usage and traffic patterns observed over time can help increase the
utilization of the SFs as well as of the underlay network.
4. Contributors
The following people have substantially contributed to the use cases
of NAI:
Lizhao You
Huawei
Email: youlizhao@huawei.com
Kalyankumar Asangi
Huawei
Email: kalyana@huawei.com
5. Security Considerations
TBD
6. IANA Considerations
This document has no actions for IANA.
7. Acknowledgement
Thanks to Li Zhenbin and Liu Shucheng for their comments and
contribution.
8. References
8.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<http://www.rfc-editor.org/info/rfc2119>.
8.2. Informative References
[I-D.li-rtgwg-network-ai-arch]
Li, Z. and J. Zhang, "An Architecture of Network
Artificial Intelligence(NAI)", draft-li-rtgwg-network-ai-
arch-00 (work in progress), October 2016.
[RFC4655] Farrel, A., Vasseur, J., and J. Ash, "A Path Computation
Element (PCE)-Based Architecture", RFC 4655,
DOI 10.17487/RFC4655, August 2006,
<http://www.rfc-editor.org/info/rfc4655>.
[RFC7665] Halpern, J., Ed. and C. Pignataro, Ed., "Service Function
Chaining (SFC) Architecture", RFC 7665,
DOI 10.17487/RFC7665, October 2015,
<http://www.rfc-editor.org/info/rfc7665>.
[RFC7752] Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and
S. Ray, "North-Bound Distribution of Link-State and
Traffic Engineering (TE) Information Using BGP", RFC 7752,
DOI 10.17487/RFC7752, March 2016,
<http://www.rfc-editor.org/info/rfc7752>.
[RFC7854] Scudder, J., Ed., Fernando, R., and S. Stuart, "BGP
Monitoring Protocol (BMP)", RFC 7854,
DOI 10.17487/RFC7854, June 2016,
<http://www.rfc-editor.org/info/rfc7854>.
Authors' Addresses
Yi Zheng
China Unicom
No.9, Shouti Nanlu, Haidian District
Beijing 100048
China
Email: zhengyi39@chinaunicom.cn
Xu Shiping
Huawei Technologies
Huawei Bld., No.156 Beiqing Rd.
Beijing 100095
P.R. China
Email: xushiping7@huawei.com
Dhruv Dhody
Huawei Technologies
Divyashree Techno Park, Whitefield
Bangalore, Karnataka 560066
India
Email: dhruv.ietf@gmail.com