Network Machine Learning Research Group S. Jiang
Internet-Draft Huawei Technologies Co., Ltd
Intended status: Informational October 28, 2016
Expires: May 1, 2017
Network Machine Learning
draft-jiang-nmlrg-network-machine-learning-02
Abstract
This document briefly introduces background information on machine
learning, then explores the potential of machine learning techniques
for networks. It serves as a white paper of the (proposed) IRTF
Network Machine Learning Research Group.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 1, 2017.
Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Jiang Expires May 1, 2017 [Page 1]
Internet-Draft Network Machine Learning October 2016
Table of Contents
1. Introduction
2. Terminology
3. Brief Background of Machine Learning
3.1. Machine Learning Categories
3.2. Machine Learning Approaches
3.3. Successful Applications
3.4. Precondition of Applying Machine Learning Approach
3.5. Limitation of Machine Learning Mechanism
4. Network Machine Learning Research Group in IRTF
5. Use Cases Study of Applying Machine Learning in Network
5.1. Network Traffic
6. Security Considerations
7. IANA Considerations
8. Acknowledgements
9. Change log [RFC Editor: Please remove]
10. Informative References
Author's Address
1. Introduction
Machine learning techniques help to make predictions or decisions by
learning from historical data. Because machine learning mechanisms
can dynamically adapt to varying situations and improve by learning
from new data, they are more flexible in handling complicated tasks
than strictly static program instructions. Therefore, machine
learning techniques have been widely applied in image analysis,
pattern recognition, language recognition, conversation simulation,
etc.
With deeper exploration, machine learning techniques could shed light
on studies of autonomic networking, because they can be adapted to
learn the various environments of networks and to react to dynamic
situations.
The proposed Network Machine Learning Research Group (NMLRG) was
formed within the IRTF (Internet Research Task Force) in October
2015. As a matter of procedure, the IRTF currently requires a
one-year provisional period. After this period, the proposed
research group may become a formal research group if there is a
steady research community. The NMLRG provides a forum for
researchers to explore the potential of machine learning techniques
for networks.
This document first briefly provides background information on
machine learning, then explores the potential of machine learning
techniques for network functions, such as network control, network
management, and supplying network data for upper-layer applications.
Author's note: this document is at a preliminary stage. It is an
ongoing document for the proposed Network Machine Learning Research
Group. For now, it is not clear whether it will be published.
2. Terminology
The following terminology is defined in this document.
Machine Learning A computational mechanism that analyzes and learns
from input data, either historic data or real-time feedback data,
following a designed model or pattern. It can be used to make
predictions or decisions, rather than following strictly static
program instructions.
3. Brief Background of Machine Learning
3.1. Machine Learning Categories
Machine learning mechanisms are typically classified into three broad
categories, depending on the nature of the learning "signal" or
"feedback" available:
Supervised learning The machine learning mechanism is given labeled
inputs and the corresponding desired outputs. From these, the
mechanism learns a general rule that maps inputs to outputs.
Unsupervised learning The given inputs are not labeled. It is left
to the machine learning mechanism itself to find structure in its
input.
Reinforcement learning The machine learning mechanism interacts with
dynamic environments in which it performs a certain task and
receives feedback from its action.
Between supervised and unsupervised learning, there is semi-
supervised learning, in which input data are partially labeled.
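The difference between the first two categories can be illustrated with a small sketch. It is not taken from any NMLRG material: the packet sizes, the labels "interactive" and "bulk", and the function names are all hypothetical, and a nearest-centroid rule and a two-group k-means are only simple stand-ins for real supervised and unsupervised mechanisms.

```python
from statistics import mean

def train_nearest_centroid(samples, labels):
    """Supervised: learn one centroid (mean value) per label from labeled inputs."""
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in groups.items()}

def predict(centroids, x):
    """Map a new input to the label whose centroid is closest."""
    return min(centroids, key=lambda y: abs(centroids[y] - x))

def two_means(samples, iterations=10):
    """Unsupervised: split unlabeled 1-D inputs into two groups (k-means, k=2).

    Assumes the samples contain at least two distinct values.
    """
    lo, hi = min(samples), max(samples)  # initial centroids
    for _ in range(iterations):
        left = [x for x in samples if abs(x - lo) <= abs(x - hi)]
        right = [x for x in samples if abs(x - lo) > abs(x - hi)]
        lo, hi = mean(left), mean(right)
    return sorted(left), sorted(right)

# Hypothetical packet sizes in bytes, with hypothetical labels.
sizes = [60, 64, 72, 1400, 1460, 1500]
labels = ["interactive"] * 3 + ["bulk"] * 3

model = train_nearest_centroid(sizes, labels)   # uses the labels
print(predict(model, 80), predict(model, 1200))
print(two_means(sizes))                         # never sees the labels
```

In the supervised case the output classes are fixed by the labels; in the unsupervised case the two groups emerge from the data alone and carry no names.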
3.2. Machine Learning Approaches
There are a few basic machine learning approaches. They can be mixed
together to complete complicated tasks.
Classification With training data that has been labeled into a
number of classes, the machine learning mechanism can assign new
unlabeled data to one or more of these classes. An example is spam
filtering, in which emails are classified into "spam" or "not
spam" classes.
Clustering Without labeled training data, the machine learning
mechanism divides data into groups. The learning mechanism itself
decides the number or structure of the output groups.
Regression It estimates the relationships among variables. The
outputs are continuous.
Anomaly detection It detects data items that do not conform to an
expected pattern or to the other items in a data set.
Density estimation The machine learning mechanism estimates the
distribution of the input data.
Dimensionality reduction The machine learning mechanism could
simplify inputs by mapping them into a lower-dimensional space.
Decision tree learning The learning output is structured into a
decision tree as a predictive model.
Association rule learning The learning delivers potential relations
between variables.
Artificial neural networks Also called "neural networks". They are
inspired by the structure and functions of biological neural
networks. A network consists of a number of interconnected
computational "neurons", each of which makes an independent
decision. The connections have numeric weights that can be tuned
according to feedback and trends, making neural nets adaptive to
inputs and capable of learning.
Reinforcement learning It is inspired by behaviorist psychology.
The mechanism takes actions in an environment so as to maximize
cumulative reward.
Similarity and metric learning It learns from training data a
similarity function that measures how similar or related two
objects are.
Representation learning Also called feature learning. It learns a
feature, that is, a transformation of raw input data into a
representation that can be effectively exploited in machine learning
tasks.
This is not an exhaustive list of machine learning approaches.
Other approaches include support vector machines, Bayesian
networks, inductive logic programming, sparse dictionary learning,
genetic algorithms, etc.
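As a rough illustration of two of the approaches listed above, the following sketch shows regression as an ordinary least-squares line fit and anomaly detection as a simple standard-deviation test. Both are deliberately minimal stand-ins chosen for this example, and the function names, the threshold of 2.0, and the sample values are assumptions, not part of the original text.

```python
from statistics import mean, pstdev

def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def find_anomalies(values, threshold=2.0):
    """Anomaly detection: values further than `threshold` standard
    deviations from the mean of the data set."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical link load (Mbit/s) sampled at hours 0..3.
slope, intercept = fit_line([0, 1, 2, 3], [10, 12, 14, 16])
print(slope, intercept)

# Hypothetical round-trip times in milliseconds.
print(find_anomalies([10, 11, 9, 10, 12, 10, 11, 95]))
```

The regression output is continuous (a slope and an intercept that can predict any future value), while the anomaly detector only separates the data items that deviate from the rest.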
Editor's note: the basic algorithms that machine learning approaches
use may be listed in future work. Such a list may be too detailed
and too long to be included here.
3.3. Successful Applications
Machine learning approaches have been successfully applied in many
areas, such as human behavior analysis, image analysis, natural
language recognition (including speech and handwriting processing),
conversation simulation, medical diagnosis, structural health
monitoring, stock market analysis, biological analysis and
classification, loan and insurance evaluation, game playing, and many
other applications.
Network applications such as search engines, spam filtering,
adaptive websites, Internet fraud detection, and online advertising
have all benefited greatly from machine learning. However, most of
those success stories are at the application layer from a network
perspective.
3.4. Precondition of Applying Machine Learning Approach
Although it is different from big data or data mining, machine
learning also needs data. However, machine learning can be applied
with a small set of data or with dynamic feedback from the
environment. The quality of the data determines the efficiency and
accuracy of machine learning.
There is no generic machine learning mechanism that is suitable for
all or most use cases. For each use case, developers need to design
a specific analysis path, which may combine multiple approaches or
algorithms. Feature design and analysis path design are the key
factors in machine learning applications.
To achieve autonomic decisions or to minimize human intervention,
there should be an evaluation system for the results of the machine
learning mechanism. The evaluation could be a measurement taken
after the results of the machine learning mechanism have been
executed. Together, the evaluation system and the machine learning
mechanism can form a closed decision loop for autonomic decisions.
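One way to picture such a decision loop is the sketch below. It is only an illustration under stated assumptions: the candidate values, the `measure` callback standing in for the evaluation system, and the simple "try every candidate once, then keep the best average" rule are all hypothetical and much simpler than a real learning mechanism.

```python
def closed_loop(candidates, measure, rounds=20):
    """Alternate between deciding (learner) and measuring (evaluation system).

    `measure(choice)` is a hypothetical callback that applies the choice
    and returns a score, where higher is better.
    """
    totals = {c: 0.0 for c in candidates}
    counts = {c: 0 for c in candidates}
    history = []
    for _ in range(rounds):
        untried = [c for c in candidates if counts[c] == 0]
        if untried:
            choice = untried[0]  # explore: every candidate gets one trial
        else:
            # exploit: pick the candidate with the best average score so far
            choice = max(candidates, key=lambda c: totals[c] / counts[c])
        score = measure(choice)   # evaluation system measures the outcome
        totals[choice] += score   # feedback closes the loop
        counts[choice] += 1
        history.append(choice)
    return history

# Hypothetical example: choose a queue threshold; the simulated
# evaluation scores a threshold by its distance from an unknown optimum.
history = closed_loop([10, 20, 40, 80], measure=lambda q: -abs(q - 40))
print(history)
```

No human confirms any individual decision here; the measured score is the only feedback, which is why the quality of the evaluation system bounds the quality of the autonomic decisions.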
3.5. Limitation of Machine Learning Mechanism
So far, machine learning mechanisms do not perform very well where
exact results are required. In most successful cases, they are used
as assistant analysis tools. Their results are usually accepted in
fault-tolerant environments or after further human confirmation.
4. Network Machine Learning Research Group in IRTF
The Network Machine Learning Research Group (NMLRG) was formed as a
proposed research group of the IRTF in October 2015. (As a matter of
procedure, a proposed research group may become a formal research
group after a one-year provisional period.) It provides a forum for
researchers to explore the potential of machine learning techniques
for networks. In particular, the NMLRG will work on potential
approaches that apply machine learning techniques to network
control, network management, and supplying network data for
upper-layer applications.
The initial focus of the NMLRG will be on higher-layer concepts where
machine learning mechanisms could be applied in order to enhance
network establishment, control, and management, as well as network
applications and customer services. This includes mechanisms to
acquire knowledge from existing networks so that new networks can be
established with minimum effort; the potential to use machine
learning mechanisms for routing control and optimization; using
machine learning mechanisms in network management to predict future
network status; using machine learning mechanisms for autonomic and
dynamic network management; using machine learning mechanisms to
analyze network faults and support recovery; learning network
attacks and their behaviors, so that protection mechanisms can adapt
themselves; and unifying the data structure and the communication
interface between networks or network devices and customers, so that
upper-layer applications can easily obtain relevant network
information. The NMLRG is expected to identify and document
requirements, to survey possible approaches, to provide
specifications for proposed solutions, and to prove concepts with
prototype implementations that can be tested in real-world
environments.
The more knowledge we have, the more intelligent we are. It is the
same for networks and network management. Up to now, the only
available network knowledge is usually the current network status
inside a given device or relevant current status from other devices.
However, historic knowledge is very helpful to make correct
decisions, in particular to reduce network oscillation or to manage
network resources over time. Transplantable knowledge from other
networks can be helpful to initially set up a new network or new
network devices. Knowledge of relationships between network events
and network configuration may help a network to decide the best
parameters according to real performance feedback. In addition to
such historic knowledge, powerful data analytics of current network
conditions may also be a valuable source of knowledge that can be
exploited directly. Machine learning is the corresponding mechanism
for learning and applying such knowledge intelligently.
5. Use Cases Study of Applying Machine Learning in Network
In 2016, the NMLRG is focusing on collecting and studying use cases
that apply machine learning mechanisms to the network area. More
use cases are still being collected.
5.1. Network Traffic
Network traffic is one of the most important objects that need to be
managed in the network/Internet area.
Network traffic meets the preconditions for applying machine
learning mechanisms. It is full of data: the network traffic itself
is a data source, and many properties of network traffic are
measurable, such as latency, number of packets, duration, etc.
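To make "measurable properties" concrete, the following sketch turns raw packet records into per-flow features that a learning mechanism could consume. The record layout (timestamp in seconds, flow identifier, size in bytes), the feature names, and the function name are assumptions made for this example only.

```python
from collections import defaultdict

def flow_features(packets):
    """Summarize (timestamp_s, flow_id, size_bytes) records into per-flow features."""
    flows = defaultdict(list)
    for ts, flow_id, size in packets:
        flows[flow_id].append((ts, size))
    features = {}
    for flow_id, records in flows.items():
        times = [ts for ts, _ in records]
        sizes = [size for _, size in records]
        features[flow_id] = {
            "packets": len(records),                 # number of packets
            "bytes": sum(sizes),                     # total volume
            "duration_s": max(times) - min(times),   # first to last packet
            "mean_size": sum(sizes) / len(sizes),    # average packet size
        }
    return features

# Hypothetical capture with two flows, "a" and "b".
capture = [(0.0, "a", 100), (0.5, "b", 1500), (1.0, "a", 300), (2.0, "a", 200)]
for flow_id, feats in flow_features(capture).items():
    print(flow_id, feats)
```

Which features are worth extracting depends on the use case; this choice is part of the feature design that Section 3.4 identifies as a key factor.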
Network traffic is complicated. Its characteristics are often
beyond the awareness of human operators. Machine learning would
greatly help to discover knowledge regarding network traffic.
Network traffic is always changing dynamically. It shows both
regularity and irregularity. Quick response to real-time network
traffic is a big challenge for network management and is beyond the
ability of human operators. Rigid management has already become a
bottleneck of current networks. Machine learning could form a quick
and adaptive automatic response management system.
There are many different types of network traffic. In April 2016,
the second NMLRG meeting, held at IETF 95, was organized around the
theme of network traffic. Multiple use cases were presented: HTTPS
traffic classification; machine learning in the router - learn from
and act on network traffic; applications of machine learning to
flow-based monitoring; malicious domains: automatic detection with
DNS traffic analysis; machine-learning based policy derivation and
evaluation in broadband networks; and predicting interface failures
for better traffic management.
The NMLRG is currently working on a dedicated document for this
theme. That document may become an RG document and be published as
an RFC in the future.
6. Security Considerations
This document focuses on higher-layer concepts of applying machine
learning in networks, including, of course, applying machine
learning to network security. Therefore, it does not itself create
any new security issues.
7. IANA Considerations
This memo includes no request to IANA.
8. Acknowledgements
The author would like to acknowledge the valuable comments made by
participants in the IRTF Network Machine Learning Research Group,
with particular thanks to Lars Eggert, Brian Carpenter, Albert
Cabellos,
Shufan Ji, Panagiotis Demestichas, Jerome Francois, Susan Hares,
Rudra Saha, Dacheng Zhang and Bing Liu.
This document was produced using the xml2rfc tool [RFC2629].
9. Change log [RFC Editor: Please remove]
draft-jiang-nmlrg-network-machine-learning-01: adding brief
description of network traffic and ML into use case study, 2016-4-23.
draft-jiang-nmlrg-network-machine-learning-00: original version,
2015-10-19.
10. Informative References
[RFC2629] Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629,
DOI 10.17487/RFC2629, June 1999,
<http://www.rfc-editor.org/info/rfc2629>.
Author's Address
Sheng Jiang
Huawei Technologies Co., Ltd
Q14, Huawei Campus, No.156 Beiqing Road
Hai-Dian District, Beijing, 100095
P.R. China
Email: jiangsheng@huawei.com