Internet DRAFT - draft-pedro-nmrg-ai-framework
draft-pedro-nmrg-ai-framework
NMRG P. Martinez-Julia, Ed.
Internet-Draft NICT
Intended status: Informational S. Homma
Expires: 24 April 2024 NTT
D. R. Lopez
TID
22 October 2023
Artificial Intelligence Framework for Network Management
draft-pedro-nmrg-ai-framework-04
Abstract
The adoption of artificial intelligence (AI) in network management
(NM) solutions is the way to resolve many of the complex management
problems arising from the adoption of NFV and SDN technologies. The
AINEMA framework, as discussed in this document, includes the
functions, capabilities, and components that MUST be provided by AI
modules and models to be successfully applied to NM. This is
enhanced by the consideration of seamless integration of different
services, including the ability of having multiple AI models working
in parallel, as well as the ability of complex reasoning and event
processing. In addition, disparate sources of information are put
together without increasing complexity, through the definition of a
control and management service bus. It allows, for instance, to
involve external events in NM operations. Using all available
sources of information --- namely, intelligence sources --- allows NM
solutions to apply proper intelligence processes that provide
explainable results instead of simple AI-based guesses. Such
processes are highly based in reasoning and formal and target-based
intelligence analysis and decision --- providing evidence-based
conclusions and proofs for the decisions --- in the form of AI method
outputs with explanations. This will allow computer and network
system infrastructures --- and user demands --- to grow in complexity
while keeping dependability.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Martinez-Julia, et al. Expires 24 April 2024 [Page 1]
Internet-Draft AI Framework for NM October 2023
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 24 April 2024.
Copyright Notice
Copyright (c) 2023 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. Background . . . . . . . . . . . . . . . . . . . . . . . . . 4
3.1. Virtual Computer and Network Systems . . . . . . . . . . 4
3.2. SDN and NFV . . . . . . . . . . . . . . . . . . . . . . . 5
3.3. Management and Control . . . . . . . . . . . . . . . . . 5
3.4. Network Slice Controller (NSC) . . . . . . . . . . . . . 6
3.5. Artificial Intelligence and Machine Learning . . . . . . 7
4. AI Framework for NM . . . . . . . . . . . . . . . . . . . . . 7
4.1. Target Challenges . . . . . . . . . . . . . . . . . . . . 8
4.2. Intent Support . . . . . . . . . . . . . . . . . . . . . 8
4.3. Network Digital Twin . . . . . . . . . . . . . . . . . . 9
4.4. AINEMA Operation . . . . . . . . . . . . . . . . . . . . 9
4.4.1. Overview . . . . . . . . . . . . . . . . . . . . . . 10
4.4.2. Monitoring / Executing . . . . . . . . . . . . . . . 12
4.4.3. Deliberation . . . . . . . . . . . . . . . . . . . . 13
4.4.4. Involving Data Fabric and Fog . . . . . . . . . . . . 14
4.4.5. Involving External Event Detector . . . . . . . . . . 15
4.5. Effectiveness . . . . . . . . . . . . . . . . . . . . . . 16
4.6. Closed Loop . . . . . . . . . . . . . . . . . . . . . . . 18
4.7. Network Intelligence: From Data to Wisdom . . . . . . . . 18
4.8. External Event Detectors . . . . . . . . . . . . . . . . 19
4.9. Anticipation of Network Requirements . . . . . . . . . . 20
4.10. Intelligent Reasoning and Explainable AI . . . . . . . . 21
5. Gaps and Standardization Issues . . . . . . . . . . . . . . . 22
Martinez-Julia, et al. Expires 24 April 2024 [Page 2]
Internet-Draft AI Framework for NM October 2023
6. AINEMA Information Model . . . . . . . . . . . . . . . . . . 23
6.1. Tree Structure . . . . . . . . . . . . . . . . . . . . . 23
6.1.1. event-payloads . . . . . . . . . . . . . . . . . . . 23
6.1.1.1. basic . . . . . . . . . . . . . . . . . . . . . . 24
6.1.1.2. seismometer . . . . . . . . . . . . . . . . . . . 24
6.1.1.3. big-data . . . . . . . . . . . . . . . . . . . . 24
6.1.2. external-events . . . . . . . . . . . . . . . . . . . 24
6.1.3. notifications/event . . . . . . . . . . . . . . . . . 25
6.2. YANG Module . . . . . . . . . . . . . . . . . . . . . . . 25
7. The Autonomic Resource Control Architecture (ARCA) . . . . . 27
8. ARCA Integration With ETSI-NFV-MANO . . . . . . . . . . . . . 29
8.1. Functional Integration . . . . . . . . . . . . . . . . . 29
8.2. Target Experiment and Scenario . . . . . . . . . . . . . 32
8.3. OpenStack Platform . . . . . . . . . . . . . . . . . . . 34
8.4. Initial Results . . . . . . . . . . . . . . . . . . . . . 35
9. Relation to Other IETF/IRTF Initiatives . . . . . . . . . . . 38
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 38
11. Security Considerations . . . . . . . . . . . . . . . . . . . 38
12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 38
13. References . . . . . . . . . . . . . . . . . . . . . . . . . 38
13.1. Normative References . . . . . . . . . . . . . . . . . . 38
13.2. Informative References . . . . . . . . . . . . . . . . . 39
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 41
1. Introduction
The current network ecosystem is quickly evolving from an almost
rigid environment based on static network to a highly flexible,
powerful, and somehow hybrid system. Software Defined Networking
(SDN) and Network Function Virtualization (NFV) provide the basis for
such evolution. Thus, the management and control of such systems
MUST be automated. It has motivated the move towards autonomic
networking (ANIMA) and the inclusion of AI solutions alongside the
management plane of the network, enough justified by the increasing
size and complexity of the network, which exposes complex problems
that MUST be resolved in scales that escape human possibilities.
Therefore, in order to allow current computer and network system
infrastructures to constantly grow in complexity, in parallel to the
demands of users, the AI solutions MUST work together with other
network management solutions.
However, exploiting the possibilities of AI is not an easy task.
There has been a lot of effort to make Machine Learning (ML)
solutions reliable and acceptable but, at the same time, other
mechanisms have been forgotten. It is the particular case of
reasoning. Although it can provide enormous benefits to management
solutions by, for example, inferring new knowledge and applying
different kind of rules (e.g. logical) to choose from several
Martinez-Julia, et al. Expires 24 April 2024 [Page 3]
Internet-Draft AI Framework for NM October 2023
actions, it has received little attention. While ML solutions work
with data, so they only require to retrieve data from the network
infrastructure, reasoning solutions MUST work in collaboration to the
network they are managing. This makes the challenges arisen from
intelligent reasoning to be a key for the evolution of network
management towards the full adoption of AI.
The present document aims to define the AINEMA framework, gathering
the necessary information for getting the most benefits from the
application of intelligent reasoning to bring explainable AI methods
to network management, including, but not limited to, defining the
gaps that MUST be covered for reasoning to be correctly integrated
into network management solutions.
The construction and maintenance of AINEMA-compatible components MUST
consider the existence several mechanisms, which are extended beyond
ML. For instance, intelligent reasoning is a key aspect of AINEMA
that MUST be taken into account by autonomic management components
and solutions. It will provide enormous benefits to NM solutions by,
for example, inferring new knowledge and applying different kind of
rules (e.g. logical) to choose from several actions. While ML
solutions work with data, so they only require to retrieve data from
the network infrastructure, AINEMA modules MUST work in collaboration
to the network it is managing. This makes the challenges arisen from
intelligent reasoning essential for the evolution of NM. They will
be addressed within the context of AINEMA.
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
3. Background
3.1. Virtual Computer and Network Systems
The continuous search for efficiency and cost reduction to get the
most optimum exploitation of available resources (e.g. CPU power and
electricity) has conducted current physical infrastructures to move
towards virtualized infrastructures. Also, this trend enables end
systems to be centralized and/or distributed, so that they are
deployed to best accomplish customer requirements in terms of
resources and features.
A high degree of flexibility and reliability is thus required --- and
it is provided by computing and network virtualization. Both
characteristics are subject to the underlying technologies but, while
Martinez-Julia, et al. Expires 24 April 2024 [Page 4]
Internet-Draft AI Framework for NM October 2023
the latter has been always enforced to computer and network systems,
flexibility is relatively new and MUST be achieved with support from
virtualization and cloud technologies.
3.2. SDN and NFV
SDN and NFV are conceived to bring high degree of flexibility and
conceptual centralization qualities to the network. On the one hand,
with SDN, the network can be programmed to implement a dynamic
behavior that changes its topology and overall features. Moreover,
with NFV the functions that are typically provided by physical
network equipment are now implemented as virtual appliances that can
be deployed and linked together to provide customized network
services. SDN and NFV complement to each other to actually implement
the network aspect of the aforementioned virtual computer and network
systems.
Although centralization can lead us to think of the single-point-of-
failure concept, it is not the case for these technologies.
Conceptual centralization highly differs from centralized deployment.
It brings all benefits of having a single point of decision but
retaining the benefits of distributed systems. For instance, control
decisions in SDN can be centralized while the mechanisms that enforce
such decisions into the network (SDN controllers) can be implemented
as highly distributed systems. The same approach can be applied to
NFV. Network functions can be implemented in a central computing
facility, but they can also take advantage of several replication and
distribution techniques to achieve the properties of distributed
systems. Nevertheless, NFV also allows the deployment of functions
on top of distributed systems, so they benefit from both distribution
alternatives at the same time.
3.3. Management and Control
The introduction of virtualization into the computer and network
system landscape has increased the complexity of both underlying and
overlying systems. On the one hand, virtualizing underlying systems
adds extra functions that MUST be managed properly to ensure the
correct operation of the whole system, which not just encompasses
underlying elements but also the virtual elements running on top of
them. Such functions are used to actually host the overlying virtual
elements, so there is an indirect management operation that involves
virtual systems. Moreover, such complexities are inherited by final
systems that get virtualized and deployed on top of those
virtualization infrastructures.
Martinez-Julia, et al. Expires 24 April 2024 [Page 5]
Internet-Draft AI Framework for NM October 2023
In parallel, virtual systems are empowered with additional, and
widely exploited, functionality that MUST be managed correctly. It
is the case, virtual resources MUST be dynamically adapted to their
operation environments, or even the composition of distributed
elements across heterogeneous underlying infrastructures, and
probably providers.
Taking both complex functions into account, either separately or
jointly, makes clear that management requirements are difficult to be
fulfilled by direct human action, so automation has become essential
to accomplish most common tasks.
3.4. Network Slice Controller (NSC)
Network slicing is a concept to provide connectivity coupled with a
set of specific commitments of network resources between a number of
endpoints, and the framework is defined in
[I-D.ietf-teas-ietf-network-slices]. The network slice controller
(NSC), also defined in[I-D.ietf-teas-ietf-network-slices], is the key
component for control and management of IETF network slices, and it
takes requests for IETF network slicing services and implements,
changes, or deletes the IETF network slices using suitable underlying
technologies. NSC has northbound interfaces for communicating with
higher level operation systems, and AINEMA may collect information of
IETF network slices, and control them via NSC.
Moreover, an NSC MUST be used to support handling services provided
on network slices in addition to controlling them because an NSC is
the edge node on an end-to-end network slice (E2E-NS).
Therefore, the NSC exposes the following requirements:
* Data plane for NSs as infrastructure.
* Control/management plane for NSs as infrastructure.
* Data plane for services on NSs.
* Control/management plane for services on NSs.
In summary, NSC provides the required functions for the enforcement
of AI decisions in multi-domain (and federated) network slices, so it
will play a key role in general network management.
Martinez-Julia, et al. Expires 24 April 2024 [Page 6]
Internet-Draft AI Framework for NM October 2023
3.5. Artificial Intelligence and Machine Learning
Intelligence does not directly imply intelligent behavior. On the
one hand, intelligence emphasizes data gathering and management,
which can be processed by systematic methods or intelligent methods.
On the other hand, intelligent behavior emphasizes the reasoning and
understanding of data to actually "posses" the intelligence.
The justification of applying AI in network (and) management is
sometimes overlooked. First, management decisions are more and more
complex. We have moved from asking simple questions ("Where is the
problem in my system?") to much more complex ones ("Where should I
migrate this VM to accomplish my goals?"). Moreover, operational
environments are more and more dynamic. On the one hand,
softwarization and programmability elevate flexibility and allow
networks to be totally adapted to their static and/or dynamic
requirements. On the other hand, network virtualization highly
enables network automation.
The new functions and possibilities allow network devices to become
autonomic. However, they take complex decisions by themselves,
without human intervention, realizing the "dream" of Zero-Touch
Networks, which exploit fully programmable elements and advanced
automation methods (ETSI ZSM). Nevertheless, we have to remember
that AI methods are just resources, not solutions. They will not
replace the human decisions, just complement and "automate" them.
Finally, ML is not AI. AI has a broader spectrum of methods, some of
them are already exploited in the network for a long time.
Perception, reasoning, and planning are still not fully exploited in
the network.
4. AI Framework for NM
The basic concept is to develop a generic, scalable, deployable and
trustworthy AI framework for network management (AINEMA). The
framework will be used to design, deploy, instantiate, scale and
validate AI models and related algorithms applied to network
operation, administration and management (OAM). AINEMA will
particularly target E2E network management. AINEMA is underpinned by
the principle that a generic and scalable AI framework will allow for
more general-purpose AI solutions to network OAM, which can be scaled
from one network domain to multiple network domains, and to multi-
site and multi-tenant scenarios.
In a broader way, the key components of AINEMA are:
Martinez-Julia, et al. Expires 24 April 2024 [Page 7]
Internet-Draft AI Framework for NM October 2023
* The data framework. It is responsible for acquiring, modeling,
storing, and distributing data, both historical, collected off-
line, and real-time, on-line, from different parts of a network in
a unified and efficient manner. It also provides the internal
communication layer to the AI framework and serves as the
communication path between the AI framework and network
orchestration entities.
* The AI modules. They contain the AI functions that individually
or collectively accomplish local, E2E or global intelligent tasks
for network OAM.
* The AI hub. It receives data, knowledge, and localized decisions
from AI modules and outputs desired actions as recommendations to
network management entities. The AI hub is also in charge of the
life cycle management of the AI modules.
4.1. Target Challenges
The document [I-D.irtf-nmrg-ai-challenges] presents the most
outstanding challenges for exploiting AI in NM. Most of them focus
on the evaluation of different methods and assessment of results.
However, other challenges target specific capabilities that must be
provided by AI methods, such as resolving action planning for AI
decisions and obtaining AI output explanation. These challenges are
aimed by AINEMA. The action planning challenge is resolved with the
definition of the capabilities that MUST be offered by the planner
(see below). Additionally, AINEMA proposes to obtain explanations of
AI decisions by defining the intelligent reasoning methods that a
solution for NM MUST include.
4.2. Intent Support
AINEMA operation, described below, is able to create and manage
network services. The creation process is guided by a set of
requirements, while the management cycle is guided by policies. Both
requirements and policies are established by network service
administrators or tenants. AINEMA supports different ways to specify
the requirements and policies, among which we highlight the network
intent defined in RFC 9316 [RFC9316].
Martinez-Julia, et al. Expires 24 April 2024 [Page 8]
Internet-Draft AI Framework for NM October 2023
The translation of network intents to requirements or policies is
supported by different AI mechanisms. For instance, to support
intents written in natural language, AINEMA proposes to use natural
language processing (NLP) tools with pre-trained models to parse
input sentences and understand their grammatical content. Other AI
mechanisms are needed to extract the important semantic concepts to
achieve overall understanding of an intent and, thus, set boundaries
to its application, following specifications gathered in RFC 9316
[RFC9316].
4.3. Network Digital Twin
Many situations occurring to network services cannot be easily
traceable to causes or be easily anticipated, as discussed below.
The former usually occurs because of the length of the cause-effect
chain hindered behind some situation. The latter usually occurs
because of the limitations of typical time-series analysis methods.
To overcome those problems, AINEMA includes the support for
constructing a network digital twin (NDT) that represents a network
service. The most basic structure of NDT will be a huge dictionary
of key-value pairs that represents a huge knowledge base of subject-
predicate-object triples. For it, AINEMA defines a dynamic ontology
for representing aspects of network services as well as processing
mechanisms that obtain the results of what-if scenarios. Those
mechanisms heavily rely on AI methods, such as ensembles of time-
series data projection and fault pattern learning and matching
algorithms. These processes require the collection of particular
telemetry data in a particular format, as considered in
[I-D.zcz-nmrg-digitaltwin-data-collection]. Therefore, AINEMA
collection, which is part of the monitor defined below, is based on
knowledge handling, understands high dimensional data, and includes
functions to processes data to obtain valid knowledge in the
monitoring sources, which largely supports the NDT operation.
4.4. AINEMA Operation
A monitor-analyze-plan-execute cycle ---MAPE cycle--- MUST be
implemented for the exploitation of AI in network management, as
shown by ARCA [TNSM-2018], among others. Thus, AINEMA defines the
MAPE functions that MUST be incorporated in separate elements, and
defines a management service bus that MUST be incorporated to enable
AINEMA compliant services to be plugged without requiring them to
support connecting to each other expressly. The service bus ---as
middleware--- MUST route messages between services and replicate them
when required.
Martinez-Julia, et al. Expires 24 April 2024 [Page 9]
Internet-Draft AI Framework for NM October 2023
4.4.1. Overview
+----------+ +----------+ +----------+ +----------------+
| Analyzer | | Planner | | AI Model | | Knowledge Base |
+----------+ +----------+ +----------+ +----------------+
| | | |
+--------------------------------------------------------------+
| Control and Mgmt Service Bus |
+--------------------------------------------------------------+
| | | | |
+----------+ +----------+ +-------+ +----------+ +-------------+
| Monitor | | Executer | | NDT | | Fabric | | Fog Monitor |
+----------+ +----------+ +-------+ +----------+ +-------------+
| | |
+-------------------------+ +-------------+
| VIM | | Fog Element |
+-------------------------+ +-------------+
Figure 1: AINEMA Architecture
In a specific way, as shown in Figure 1, the components that MUST be
implemented to realize AINEMA are:
* The control and management service bus provides a pub/sub system
for AINEMA components to exchange the information they need. It
shares the information items published by some elements to any
other element interested on it --- namely, subscribed to the
appropriate topic.
* Elements of the MAPE cycle:
- The monitor service retrieves monitoring data items from an
underlying virtual infrastructure manager (VIM) and other
telemetry sources, such as NFs, and communicates them, or a
more processed version of them in the form of monitoring
information items, to any other component requiring them
through the service bus, as shown in Figure 2.
- The analyzer service subscribes to monitoring data ---
telemetry data) and other data and state notifications, such as
data published by the telemetry data fabric (shown in
Figure 4), data published by the fog monitoring (also shown in
Figure 4), or external event notifications published by the
external event detector (EED, as shown in Figure 5). The
service receives all the data, analyzes it to find the new
state of the managed network service, and publishes a new state
through the service bus. As detailed below, the deliberation
process involves interacting with different elements, if
Martinez-Julia, et al. Expires 24 April 2024 [Page 10]
Internet-Draft AI Framework for NM October 2023
available --- and connected to the service bus. Although the
analysis includes the application of ML-based classification,
as well as the interaction with the AI model instances (as
detailed below), a key part of the analysis is the intelligent
reasoning. This implies the application of both logical
reasoning as well as higher-order reasoning. It also includes
the application of complex event processing (CEP) and the
exploitation of available theories to extract as much
conclusions as possible from provided information and
knowledge.
- The planner service subscribes to the messages regarding the
state of the managed network service. If the state is outside
the boundaries of a target state, as indicated by
administrators --- and stored in the knowledge base ---, the
planner service builds an action plan to correct the problem
and publishes the plan through the service bus.
- The executer service subscribes to action plans and translates
them to commands that the underlying VIM or others MUST
implement.
* The AI model service can appear multiple times in an AINEMA
compliant system. Each service instance subscribes to knowledge
objects published through the service bus and uses them to update
its structures. After that, the service publishes its predictions
through the service bus, so other elements, such as the analyzer,
can know them. Thus, multiple answers to the same stimulus can be
obtained and published throughout the service bus.
* The knowledge base service provides a knowledge-oriented database
that enables other components to get state persistence. It will
be subscribed to the topics that indicate new data and will
publish most of the data it has. In some scenarios, it is able to
dynamically publish some data under demand. The implementation
determines the efficiency of this operation.
* The network digital twin (NDT) service implements a digital
version of the managed network service and updates it with the new
information provided through the service bus (see
[I-D.zcz-nmrg-digitaltwin-data-collection]). When some new state
is found --- projected after running a "what-if" process --- it
will publish the new state for other elements such as the
analyzer.
* The fabric service connects the service bus, and the AINEMA
instance, to other --- generally bigger --- source of telemetry
data, information, and knowledge.
Martinez-Julia, et al. Expires 24 April 2024 [Page 11]
Internet-Draft AI Framework for NM October 2023
* The fog monitor service (see [I-D.bernardos-anima-fog-monitoring])
publishes monitoring data and telemetry knowledge objects (TKOs)
obtained external fog elements, which are not directly managed by
the AINEMA instance, but whose behavior will have some impact in
the operation of the managed network service. This is a
particular case of an EED.
4.4.2. Monitoring / Executing
+-------+ +---------+ +----------+ +---------+ +----------+
| VIM | | Monitor | | Analyzer | | Planner | | Executer |
+-------+ +---------+ +----------+ +---------+ +----------+
| | | | |
| Telemetry | | | |
| Data | | | |
+------------>| | | |
| | Telemetry | | |
| | Knowledge | | |
| | Object | | |
| +----------->| | |
| | | | |
| | | ... | |
| | | | |
| | | | Actions |
| | | +------------>|
| | | | |
| | | | Commands |
|<------------------------------------------------------+
| | | | |
Figure 2: AINEMA<->VIM Workflow
The overall monitoring and execution operation of a management cycle
is shown in Figure 2. This is the external point of attachment and
vaguely involves AI processes, although some AI tools are envisioned
to be used by the monitoring and execution processes to respectively
ensure collected data and planned commands are correct.
In the process, the monitor element retrieves monitoring data ---
telemetry data --- from the monitored sources and formats it
according to some ontology and semantic data format to get a
telemetry knowledge object (TKO), which is then sent through the bus
to the analyzer, and it will be retrieved by any other element
interested on it --- subscribed to this type of publications.
Martinez-Julia, et al. Expires 24 April 2024 [Page 12]
Internet-Draft AI Framework for NM October 2023
The resulting action plan of a deliberation, obtained as described
below, is sent through the service bus to the executer service. This
translates the plan to a set of commands and executes them in the
VIM.
4.4.3. Deliberation
+----------+ +-------+ +----------+ +---------+ +---------+
| Analyzer | | NDT | | AI Model | | K. Base | | Planner |
+----------+ +-------+ +----------+ +---------+ +---------+
| | | | |
| Knowl. +->| +->| | |
| Update / | / | | |
+--------+------------+------------------>| |
| | | | |
| NDT State | | | |
|<------------+ | | |
| | | | |
| Predictions | | | |
|<-------------------------+ | |
| | | | |
Analyze | | | |
State | | | |
| | | | |
| State +->| +->| | |
| Update / | / | | |
+--------+------------+------------------>| |
| | | | |
| Found or | | | |
| Predicted | | +->| |
| Exception | | / | |
+------------------------------------+----------------->|
| | | | |
| | | | Plan
| | | | Actions
| | | | |
Figure 3: AINEMA Deliberation Workflow
Martinez-Julia, et al. Expires 24 April 2024 [Page 13]
Internet-Draft AI Framework for NM October 2023
The deliberation operation is continuously run by the analyzer.
After some new data or knowledge objects are received, it creates a
processed knowledge update message and sends it through the bus to
the network digital twin (NDT)---see its location within AINEMA
above---, the AI models, and the knowledge base. The latter will
just store the knowledge update. Meanwhile, the NDT will use the
knowledge update to update its digital view of the managed network
service. Each AI model instance will also use the knowledge update
to update their views of the managed network service and obtain new
predictions.
The new NDT state and predictions are published by the NDT and AI
models, respectively. They are retrieved by the analyzer and put
together to generate a state update, which is published for all
previous elements to get an even newer view of the network --- which
includes both actual state and predicted and projected states.
If a problem is found, the analyzer publishes a message indicating
it. This is received and stored by the knowledge base. It is also
received by the planner, which uses additional AI mechanisms to get
an action plan that resolves the problem and publishes the action
plan through the service bus. The planner will generally use the
services from AI modules by publishing messages to the appropriate
topics and subscribing to the corresponding topics. The action plan
is retrieved by the executer, as discussed above.
4.4.4. Involving Data Fabric and Fog
Martinez-Julia, et al. Expires 24 April 2024 [Page 14]
Internet-Draft AI Framework for NM October 2023
+-------+ +--------+ +----------+ +----------+ +----------+
| VIM | | Fabric | | Fog Mon. | | Analyzer | | Executer |
+-------+ +--------+ +----------+ +----------+ +----------+
| | | | |
| | Telemetry | | |
| | Knowledge | | |
| | Object | | |
| +------------------------->| |
| | | | |
| | | Telemetry | |
| | | Knowledge | |
| | | Object | |
| | +------------>| |
| | | | |
| | | | ... |
| | | | |
| | | | Actions |
| | | +------------->|
| | | | |
| | | | Commands |
|<------------------------------------------------------+
| | | | |
Figure 4: AINEMA<->VIM<->Fog<->Fabric Workflow
The flexibility provided by the service bus to AINEMA is particularly
shown in the ability to seamlessly integrate disparate components
without adding complexity. For instance, AINEMA envisions the
connection of a fog monitor (see
[I-D.bernardos-anima-fog-monitoring]). It provides monitoring
information from fog elements that can be correlated with other
performance and external event notifications to get a better view of
the state of the managed network service. It is also the case of the
data fabric (ref ??), which interfaces with an external system that
processes telemetry and exposes the processing outputs to the service
bus, so that other elements of the corresponding AINEMA instance are
able to receive (consume) data provided by the fabric --- namely, the
analyzer.
4.4.5. Involving External Event Detector
Martinez-Julia, et al. Expires 24 April 2024 [Page 15]
Internet-Draft AI Framework for NM October 2023
+-------+ +----------+ +----------+ +-------+ +----------+
| EED | | Analyzer | | AI Model | | NDT | | Executer |
+-------+ +----------+ +----------+ +-------+ +----------+
| | | | |
| +->| +->| | |
| TKO / | / | | |
+-------+--------------+------------------>| |
| | | | |
| | ... | | |
| | | | |
| Analyze | | |
| State | | |
| | | | |
| | Actions | | |
| +----------------------------------------->|
| | | | |
| | | | Commands |
[To VIM and others through the bus]<------------------------+
| | | | |
Figure 5: AINEMA<->VIM<->EED Workflow
In order to get the best possible analysis results --- and plans ---,
AINEMA also considers the incorporation of an external event detector
(EED). It publishes to the service bus notifications of external
events. For example, it can notify of a physical event (e.g.,
earthquake) or it can provide information about a common
vulnerabilities and exposures report --- namely a CVE report. Other
services connected to the service bus, such as the analyzer, AI
models, or NDT, will receive such notifications and consider them in
their procedures.
4.5. Effectiveness
The correct and pertinent operation of AINEMA provides significant
benefits, mainly in terms of making complex management operations
feasible and improving the performance of typically expensive tasks.
By taking advantage of these benefits, the amount of data that can be
analyzed to make decisions on the network can be hugely increased.
As shown in Figure 5 AINEMA makes possible to leverage intelligence
for network management operations. Instead of just being focused on
the analysis of performance measurements retrieved from the managed
network and the subsequent decision (proaction or reaction), the
extension of management operation enabled by AINEMA encompasses
different sub-processes.
Martinez-Julia, et al. Expires 24 April 2024 [Page 16]
Internet-Draft AI Framework for NM October 2023
First, AINEMA has a sub-process for retrieving the performance
measurements from the managed network. This is the same found in
typical management processes. Moreover, AINEMA encourages the
application of the same ML techniques to obtain some insight of the
situation of the managed network.
Second, AINEMA incorporates a reasoning sub-process. It receives
both the output of the previous sub-process and additional context
information, which can be provided by an external event detector, as
described below. Then, this sub-process finds out and particularizes
the rules that are governing the situation described above. Such
rules are semantically constructed and will abstract the situation of
the network in terms of logical and other semantic concepts, together
with actions and transformations that can be applied to those rules.
All such constructions will be stored in the Intelligent Network
Management Knowledge Base (INMKB), which will follow a pre-determined
ontology and will also extend the knowledge by applying basic and
atomic logic inference statements.
Third, AINEMA defines the solving sub-process. It works as follows.
Once obtained the abstracted situation of the managed network and the
rules to it, the solving subprocess builds a graph with all semantic
constructions. It reflects the managed network, since all network
elements have their semantic counterpart, but it also has all
situations, rules, actions, and even the measurements. The solving
sub-process applies ontology transformations to find a graph that is
acceptable in terms of the associated situation and its adherence to
administrative goals.
Fourth, AINEMA incorporates the planning sub-process. It receives
the solution graph obtained by the previous sub-process and makes a
linear plan of actions to execute in order to enforce the required
changes into the network. The actions used by this planning sub-
process are the building blocks of the plan. Each block will be
defined with a precondition, invariant, and postcondition. A
planning algorithm should be used to obtain such plan of actions by
linking the building blocks so they can be enforced to finally adapt
the managed network to get the desired situation.
All these processes MUST be executed in parallel, using strong inter-
process communication and synchronization constraints. Moreover, the
requests to the underlying infrastructure for the adaptation of the
managed network will be sent to the corresponding controllers without
waiting for finishing the deliberation cycle. This way, the time
needed by the whole cycle is highly reduced. This can be possible
because of the assumptions and anticipations tied to AINEMA and the
intelligence it denotes.
Martinez-Julia, et al. Expires 24 April 2024 [Page 17]
Internet-Draft AI Framework for NM October 2023
4.6. Closed Loop
AINEMA follows the closed loop methodology to achieve and assess the
accomplishment of network management goals. It ensures that the
state of the network is continuously within the working parameters
desired by its administrators. This is enforced among all management
cycles along the full life-cycle of the network.
To obtain the benefits from integrating AI within the closed loop,
AINEMA processes MUST be re-wired to connect their outputs to their
inputs, so obtaining feedback analysis. Moreover, an additional
process MUST be defined for ensuring that the objectives defined in
the last steps of AINEMA are actually present in the near future
situation of the managed network.
In addition, the data plane elements, such as the NSC described
above, MUST provide some capabilities to make them coherent to the
closed control loop. Particularly, they MUST provide symmetric
enforcement and telemetry interfaces, so that the elements composing
the managed network can be modified and monitored using the same
identifiers and having the same assumptions about their topology and
context. For instance, NSC MUST be able to provide the needed
functionality to enable AINEMA to request NSC to set up and connect
the necessary structures for telemetry collection and request changes
in the network.
4.7. Network Intelligence: From Data to Wisdom
Enabling AINEMA with full intelligence process extends the analytics
and decision power beyond current AI and ML solutions. Instead of
just analyzing observations, the incorporation of intelligence
processes to AINEMA makes hypotheses on the current and projected
situation of the managed system and finds evidences to justify it.
This process is much faster and much more effective than relying on
data. This is because the hypotheses will be formally formulated
within the scope and policies established by administrators.
Several hypotheses can be formulated in parallel. After gathering
evidences for all of them, the one that has the strongest evidences
can be taken as real and the potential effects can be fixed or
prevented (anticipated) as discussed below. As AI methods gain
access to a huge amount of intelligence data from the systems they
manage, they become more and more able to take strategic decisions,
mainly deriving such data to knowledge towards wisdom. This supports
the well known DIKW process (Data, Information, Knowledge, Wisdom)
that enables elements to operate autonomously, subject to the goals
established by administrators.
Martinez-Julia, et al. Expires 24 April 2024 [Page 18]
Internet-Draft AI Framework for NM October 2023
In such way, AI methods can be guided by the events or situations
found in underlying networks in a constantly evolving model. We can
call it the Knowledge (and Intelligence) Driven Network. The
structure itself of the network results from reasoning on
intelligence data. The network adapts to new situations without
requiring human involvement but administrative policies are still
enforced to decisions. Nevertheless, intelligence data MUST be
managed properly to exploit all its potential. Data with high
accuracy and high frequency will be processed in real-time.
Meanwhile, fast and scalable methods for information retrieval and
decision enforcement become essential to the objectives of the
network.
To achieve such goals, AI algorithms MUST be adapted to work on
network problems. Joint physical and virtual network elements can
form a multi-agent system focused on achieving such system goals. It
can be applied to several use-cases. For instance, it can be used
for predicting traffic behavior, iterative network optimization, and
assessment of administrative policies.
4.8. External Event Detectors
Current mechanisms for automated management and control rely only on
the continuous monitoring of the resources they control or the
underlying infrastructure that host them. However, there are several
other sources of information that can be exploited to make the
systems more robust and efficient. It is the case of the
notifications that can be provided by physical or virtual elements or
devices that are watching for specific events, hence called external
event detectors.
The external event detectors are a huge source of intelligence data
that can be used as evidence to demonstrate the hypotheses formulated
by AINEMA. More specifically, although the notifications provided by
these external event detectors are related to successes that occur
outside the boundaries of the controlled system, such success
episodes can affect the typical operation of controlled systems. For
instance, a heavy rainfall or snowfall can be detected and correlated
to a huge increase in the amount of requests experienced by some
emergency support service. Therefore, the evidence they provide can
be even stronger than performance measurements obtained by the
managed system and, in general, they will be used for anticipating
requirements much more effectively.
Martinez-Julia, et al. Expires 24 April 2024 [Page 19]
Internet-Draft AI Framework for NM October 2023
4.9. Anticipation of Network Requirements
One of the main goals of the NFV MANO mechanisms is to ensure the
virtual computer and network system they manage meets the
requirements established by their owners and administrators. It is
currently achieved by observing and analyzing the performance
measurements obtained either by directly asking the resources forming
the managed system or by asking the controllers of the underlying
infrastructure that hosts such resources. Thus, under changing or
eventual situations, the managed system MUST be adapted to cope with
the new requirements, increasing the amount of resources assigned to
it, or to make efficient use of available infrastructures, reducing
the amount of resources assigned to it.
However, the time required by the infrastructure to make effective
the adaptations requested by the MANO mechanisms is longer than the
time required by client requests to overload the system and make it
discard further client requests. This situation is generally
undesired but particularly dangerous for some systems, such as the
emergency support system mentioned above. Therefore, in order to
avoid the disruption of the service, the change in requirements MUST
be anticipated to ensure that any adaptation has finished as soon as
possible, preferably before the target system gets overloaded or
underloaded.
AINEMA exploits ARCA (Section 7). ARCA is integrated to NFV-MANO to
implement a closed loop network management that integrates external
events. ARCA provides to AINEMA the ability to correlate previous
external events (causes) with current performance measurements
(effects). It also is able to find the preventive actions
(anticipated countermeasures) that MUST be enforced to avoid the
effects once the causes have been detected. Thus, AINEMA is able to
enforce the necessary adaptations to avoid failure of the managed
system beforehand, particularly before the system performance metrics
have actually changed.
The following abstract algorithm formalizes the workflow expected to
be followed by the different implementations of the operation
proposed here.
Martinez-Julia, et al. Expires 24 April 2024 [Page 20]
Internet-Draft AI Framework for NM October 2023
while TRUE do
event = GetExternalEventInformation()
if event != NONE then
anticipated_resource_amount = Anticipator.Get(event)
if IsPolicyCompliant(anticipated_resource_amount) then
current_resource_amount = anticipated_resource_amount
anticipation_time = NOW
end if
end if
anticipated_event = event
if anticipated_event != NONE and
(NOW - anticipation_time) > EXPIRATION_TIME then
current_resource_amount = DEFAULT_RESOURCE_AMOUNT
anticipated_event = NONE
end if
state = GetSystemState()
if not IsAcceptable(state, current_resource_amount) then
current_resource_amount = GetResourceAmountForState(state)
if anticipated_event is not NONE then
Anticipator.Set
(anticipated_event, current_resource_amount)
anticipated_event = NONE
end if
end if
end while
This algorithm considers both internal and external events to
determine the necessary control and management actions to achieve the
proper anticipation of resources assigned to the target system. We
propose the different implementations to follow the same approach so
they can guess what to expect when they interact. For instance, a
consumer, such as an Application Service Provider (ASP), can expect
some specific behavior of the Virtual Network Operator (VNO) from
which it is consuming resources. This helps both the ASP and VNO to
properly address resource fluctuations.
4.10. Intelligent Reasoning and Explainable AI
It is trivial for anybody to understand that the behavior of the
network results from user activity. For instance, more users means
more traffic. However, it is not commonly considered that user
activity has a direct dependency on events that occur outside the
boundaries of the networks they use. For example, if a video becomes
trendy, the load of the network that hosts the video increases, but
also the load of any network with users watching the video. In the
same way, if a natural incident occurs (e.g. heavy rainfall,
earthquake), people try to contact their relatives and the load of a
telephony network increases. From this we can easily find out that
Martinez-Julia, et al. Expires 24 April 2024 [Page 21]
Internet-Draft AI Framework for NM October 2023
there is a clear causality relation between events occurring in the
real and digital world and the behavior of the network.
Network management outcomes, in terms of system stability,
performance, reliability, etc., would greatly improve by exploiting
such causality relation. An easy and straightforward way to do so is
to apply AI reasoning methods. These methods can be used to "guess"
the effect for a given cause. Moreover, reasoning can be used to
choose the specific events that can impact the system, so being the
cause for some effect.
Meanwhile, reasoning on network behavior from performance
measurements and external events places some challenges. First,
external event information MUST cross the administrative domain of
the network to which it is relevant. This means that there MUST be
interfaces and security policies that regulate how information is
exchanged between the external event detector, which can be some
sensor deployed in some "smart" place (e.g. smart city, smart
building), and the management solution, which resides inside the
administrative domain of the managed network. This function MUST be
highly conformed and regulated, and the protocols used to achieve it
MUST be widely accepted and tested, in order for it to exploit the
overall potential of external events.
Second, enough meta-data MUST be associated to performance
measurements to clearly identify all aspects of the effects, so that
they can be traced back to the causes (events). Such meta-data MUST
follow an ontology (information model) that is somewhat common and
widely accepted or, at least, to be able to easily transform it among
the different formats and models used by different vendors and
software.
Third, the management ontology MUST be extended by all concepts from
the boundaries of the managed network, its external environment
(surroundings), and any entity that, albeit being far away, can
impact on the function of the managed network.
5. Gaps and Standardization Issues
Several gaps and standardization issues arise from applying AI and
reasoning to network management solutions:
* Methods from different providers/vendors MUST be able to coexist
and work together, either directly or by means of a translator.
They must, however, use the same concepts, albeit using different
naming, so they actually share a common ontology.
Martinez-Julia, et al. Expires 24 April 2024 [Page 22]
Internet-Draft AI Framework for NM October 2023
* Information retrieval MUST be assessed for quality so that the
outputs from AI reasoning, and thus management solutions, can be
reliable.
* Ontological concepts MUST be consistent so that the types and
qualities of information that is retrieved from a system or object
are as expected.
* The protocols used to communicate (or disseminate, or publish) the
information MUST respond to the constraints of their target usage.
6. AINEMA Information Model
In this section we introduce the basic information model that MUST be
supported by AINEMA modules to gain the ability of reasoning on
external events. It basically includes the concepts and structures
used to describe external events and notify (communicate) them to the
interested sink, the network controller/manager, through the control
and management plane, depending on the specific instantiation of the
system.
6.1. Tree Structure
module: ietf-nmrg-nict-ainema
+--rw events
+--rw event-payloads
+--rw external-events
notifications:
+---n event
The main models included in the tree structure of the module are the
events and notifications. On the one hand, events are structured in
payloads and the content of events itself (external-events). On the
other hand, there is only one notification, which is the event
itself.
6.1.1. event-payloads
+--rw event-payloads
+--rw event-payloads-basic
+--rw event-payloads-seismometer
+--rw event-payloads-bigdata
Martinez-Julia, et al. Expires 24 April 2024 [Page 23]
Internet-Draft AI Framework for NM October 2023
The event payloads are, for the time being, composed of three types.
First, we have defined the basic payload, which is intended to carry
any arbitrary data. Second, we have defined the seismometer payload
to carry information about seisms. Third, we have defined the big-
data payload that carries notifications coming from BigData sources.
6.1.1.1. basic
+--rw event-payloads-basic* [plid]
+--rw plid string
+--rw data? union
The basic payload is able to hold any data type, so it has a union of
several types. It is intended to be used by any source of events
that is (still) not covered by other model. In general, any source
of telemetry information (e.g. OpenStack [OPENSTACK] controllers)
can use this model as such sources can encode on it their
information, which typically is very simple and plain. Therefore,
the current model is tightly interrelated to a framework to retrieve
network telemetry (see RFC 9232 [RFC9232]).
6.1.1.2. seismometer
+--rw event-payloads-seismometer* [plid]
+--rw plid string
+--rw location? string
+--rw magnitude? uint8
The seismometer model includes the main information related to a
seism, such as the location of the incident and its magnitude.
Additional fields can be defined in the future by extending this
model.
6.1.1.3. big-data
+--rw event-payloads-bigdata* [plid]
+--rw plid string
+--rw description? string
+--rw severity? uint8
The big-data model includes a description of an event (or incident)
and its estimated general severity, unrelated to the system. The
description is an arbitrary string of characters that would normally
carry information that describes the event using some higher level
format, such as Turtle or N3 for carrying RDF knowledge items.
6.1.2. external-events
Martinez-Julia, et al. Expires 24 April 2024 [Page 24]
Internet-Draft AI Framework for NM October 2023
+--rw external-events* [id]
+--rw id string
+--rw source? string
+--rw context? string
+--rw sequence? int64
+--rw timestamp? yang:date-and-time
+--rw payload? binary
The model defined to encode external events, which encapsulates the
payloads introduced above, is completed with an identifier of the
message, a string describing the source of the event, a sequence
number and a timestamp. Additionally, it includes a string
describing the context of the event. It is intended to communicate
the required information about the system that detected the event,
its location, etc. As the description of the Big Data payload, this
field can be formatted with a high level format, such as RDF.
6.1.3. notifications/event
notifications:
+---n event
+--ro id? string
+--ro source? string
+--ro context? string
+--ro sequence? int64
+--ro timestamp? yang:date-and-time
+--ro payload? binary
The event notification inherits all the fields from the model of
external events defined above. It is intended to allow software and
hardware elements to send, receive, and interpret not just the events
that have been detected and notified by, for instance, a sensor, but
also the notifications issued by the underlying infrastructure
controllers, such as the OpenStack controller.
6.2. YANG Module
Preamble and group definitions:
Martinez-Julia, et al. Expires 24 April 2024 [Page 25]
Internet-Draft AI Framework for NM October 2023
module ietf-nmrg-nict-ainema {
namespace "urn:ietf:params:xml:ns:yang:ietf-nmrg-nict-ainema";
prefix rant;
import ietf-yang-types { prefix yang; }
grouping external-event-information {
leaf id { type string; }
leaf source { type string; }
leaf context { type string; }
leaf sequence { type int64; }
leaf timestamp { type yang:date-and-time; }
leaf payload { type binary; }
}
grouping event-payload-basic {
leaf plid { type string; }
leaf data { type union { type string; type binary; } }
}
grouping event-payload-seismometer {
leaf plid { type string; }
leaf location { type string; }
leaf magnitude { type uint8; }
}
grouping event-payload-bigdata {
leaf plid { type string; }
leaf description { type string; }
leaf severity { type uint8; }
}
}
Event notifications for EED:
Martinez-Julia, et al. Expires 24 April 2024 [Page 26]
Internet-Draft AI Framework for NM October 2023
notification event {
uses external-event-information;
}
container events {
container event-payloads {
list event-payloads-basic {
key "plid";
uses event-payload-basic;
}
list event-payloads-seismometer {
key "plid";
uses event-payload-seismometer;
}
list event-payloads-bigdata {
key "plid";
uses event-payload-bigdata;
}
}
list external-events {
key "id";
uses external-event-information;
}
}
}
7. The Autonomic Resource Control Architecture (ARCA)
ARCA is a reference implementation of AINEMA that includes the
minimum services to showcase and evaluate its benefits and
compatibility with other services, such as using OpenStack as VIM and
Open Source MANO (OSM) to provide orchestration throughout the
control and management service bus. As deeply discussed in TNSM 2018
[TNSM-2018] and ICIN 2018 [ICIN-2018], ARCA leverages the elastic
adaptation of resources assigned to virtual computer and network
systems by calculating or estimating their requirements from the
analysis of load measurements and the detection of external events.
These events can be notified by physical elements (things, sensors)
that detect changes on the environment, as well as software elements
that analyze digital information, such as connectors to sources or
analyzers of Big Data. For instance, ARCA is able to consider the
detection of an earthquake or a heavy rainfall to overcome the
damages it can make to the controlled system.
The policies that ARCA enforces are specified by administrators
during the configuration of the control/management engine. Then,
ARCA continues running autonomously, with no more human involvement
Martinez-Julia, et al. Expires 24 April 2024 [Page 27]
Internet-Draft AI Framework for NM October 2023
unless some parameter MUST be changed. ARCA and its modules MUST
adopt the control and management operations needed to adapt the
controlled system to the new situation or requirements. The main
goal of ARCA is thus to reduce the time needed for resource
adaptation from hours/minutes to seconds/milliseconds. With the
aforementioned statements, system administrators are able to specify
the general operational boundaries in terms of lower and upper system
load thresholds, as well as the minimum and maximum amount of
resources that can be allocated to the controlled system to overcome
any eventual situation, including the natural crossing of such
thresholds.
ARCA functional goal is to run autonomously while the performance
goal is to keep the resources assigned to the controlled resources as
close as possible to the optimum (e.g. 5 % from the optimum) while
avoiding service disruption as much as possible, keeping client
request discard rate as low as possible (e.g. below 1 %). To achieve
both goals, ARCA relies on the Autonomic Computing (AC) paradigm, in
the form of interconnected micro-services. Therefore, ARCA includes
the four main elements and activities defined by AC, incarnated as:
Collector Is responsible of gathering and formatting the
heterogeneous observations that will be used in the control
cycle.
Analyzer Correlates the observations to each other in order to find
the situation of the controlled system, especially the
current load of the resources allocated to the system and
the occurrence of an incident that can affect to the normal
operation of the system, such as an earthquake that
increases the traffic in an emergency-support system, which
is the main target scenario studied in this paper.
Decider Determines the necessary actions to adjust the resources to
the load of the controlled system.
Enforcer Requests the underlying and overlying infrastructure, such
as OpenStack, to make the necessary changes to reflect the
effects of the decided actions into the system.
Being a micro-service architecture means that the different
components are executed in parallel. This allows such components to
operate in two ways. First, their operation can be dispatched by
receiving a message from the previous service or an external service.
Second, the services can be self-dispatched, so they can activate
some action or send some message without being previously stimulated
by any message. The overall control process loops indefinitely and
it is closed by checking that the expected effects of an action are
Martinez-Julia, et al. Expires 24 April 2024 [Page 28]
Internet-Draft AI Framework for NM October 2023
actually taking place. The coherence among the distributed services
involved in the ARCA control process is ensured by enforcing a common
semantic representation and ontology to the messages they exchange.
ARCA semantics are built with the Resource Description Framework
(RDF) and the Web Ontology Language (OWL), which are well known and
widely used standards for the semantic representation and management
of knowledge. They provide the ability to represent new concepts
without needing to change the software, just plugin extensions to the
ontology. ARCA stores all its knowledge is stored in the Knowledge
Base (KB), which is queried and kept up-to-date by the analyzer and
decider micro-services. It is implemented by Apache Jena Fuseki,
which is a high-performance RDF data store that supports SPARQL
through an HTTP/REST interface. Being de-facto standards, both
technologies enable ARCA to be easily integrated to virtualization
platforms like OpenStack.
8. ARCA Integration With ETSI-NFV-MANO
In this section we describe how to fit ARCA on a general SDN/NFV
underlying infrastructure and introduce a showcase experiment that
demonstrates its operation on an OpenStack-based experimentation
platform. We first describe the integration of ARCA with the NFV-
MANO reference architecture. We contextualize the significance of
this integration by describing an emergency support scenario that
clearly benefits from it. Then we proceed to detail the elements
forming the OpenStack platform and finally we discuss some initial
results obtained from them.
8.1. Functional Integration
The most important functional blocks of the NFV reference
architecture promoted by ETSI (see ETSI-NFV-MANO [ETSI-NFV-MANO]) are
the system support functions for operations and business (OSS/BSS),
the element management (EM) and, obviously. the Virtual Network
Functions (VNFs). But these functions cannot exist without being
instantiated on a specific infrastructure, the NFV infrastructure
(NFVI), and all of them MUST be coordinated, orchestrated, and
managed by the general NFV-MANO functions.
Martinez-Julia, et al. Expires 24 April 2024 [Page 29]
Internet-Draft AI Framework for NM October 2023
Both the NFVI and the NFV-MANO elements are subdivided into several
sub-components. The NFVI has the underlying physical computing,
storage, and network resources, which are assigned and virtualized to
conform the virtual computing, storage, and network resources that
will host the VNFs. In addition, the NFV-MANO is subdivided in the
NFV Orchestrator (NFVO), the VNF manager (VNFM) and the Virtual
Infrastructure Manager (VIM). As their name indicates, all high-
level elements and sub-components have their own and very specific
objective in the NFV architecture.
During the design of ARCA we enforced both operational and
interfacing aspects to its main objectives. From the operational
point of view, ARCA processes observations to manage virtual
resources, so it plays the role of the VIM mentioned above.
Therefore, ARCA has been designed with appropriate interfaces to fit
in the place of the VIM. This way, ARCA provides the NFV reference
architecture with the ability to react to external events to adapt
virtual computer and network systems, even anticipating such
adaptations as performed by ARCA itself. However, some interfaces
were extended to fully enable ARCA to perform its work within the NFV
architecture.
Once ARCA is placed in the position of the VIM, it enhances the
general NFV architecture with its autonomic management capabilities.
In particular, it discharges some responsibilities from the VNFM and
NFVO, so they can focus on their own business while the virtual
resources are behaving as they expect (and request). Moreover, ARCA
improves the scalability and reliability of the managed system in
case of disconnection from the orchestration layer due to some
failure, network split, etc. It is also achieved by the autonomic
capabilities, which, as described above, are guided by the rules and
policies specified by the administrators and, here, communicated to
ARCA through the NFVO. However, ARCA will not be limited to such
operation so, more generally, it will accomplish the requirements
established by the Virtual Network Operators (VNOs), which are the
owners of the virtual resources that is managed by a particular
instance of NFV-MANO, and therefore ARCA.
In addition to the operational functions, ARCA incorporates the
necessary mechanisms to engage the interfaces that enable it to
interact with other elements of the NFV-MANO reference architecture.
More specifically, ARCA is bound to the Or-Vi (see ETSI-NFV-IFA-005
[ETSI-NFV-IFA-005]) and the Nf-Vi (see ETSI-NFV-IFA-004
[ETSI-NFV-IFA-004] and ETSI-NFV-IFA-019 [ETSI-NFV-IFA-019]). The
former is the point of attachment between the NFVO and the VIM while
the latter is the point of attachment between the NFVI and the VIM.
In our current design we decided to avoid the support for the point
of attachment between the VNFM and the VIM, called Vi-Vnfm (see
Martinez-Julia, et al. Expires 24 April 2024 [Page 30]
Internet-Draft AI Framework for NM October 2023
ETSI-NFV-IFA-006 [ETSI-NFV-IFA-006]). We leave it for future
evolutions of the proposed integration, that MUST be enabled by
solutions that provide the functions of the VNFM needed by ARCA.
Through the Or-Vi, ARCA receives the instructions it will enforce to
the virtual computer and network system it is controlling. As
mentioned above, these are specified in the form of rules and
policies, which are in turn formatted as several statements and
embedded into the Or-Vi messages. In general, these will be high-
level objectives, so ARCA will use its reasoning capabilities to
translate them into more specific, low-level objectives. For
instance, the Or-Vi can specify some high-level statement to avoid
CPU overloading and ARCA will use its innate and acquired knowledge
to translate it to specific statements that specify which parameters
it has to measure (CPU load from assigned servers) and which are
their desired boundaries, in the form of high threshold and low
threshold. Moreover, the Or-Vi will be used by the NFVO to specify
which actions can be used by ARCA to overcome the violation of the
mentioned policies.
All information flowing the Or-Vi interface is encoded and formatted
by following a simple but highly extensible ontology and exploiting
the aforementioned semantic formats. This ensures that the
interconnected system is able to evolve, including the replacement of
components, updating (addition or removal) the supported concepts to
understand new scenarios, and connecting external tools to further
enhance the management process. The only capability that MUST be
implemented to ensure this feature is the support for the mentioned
ontology and semantic formats. Although it is not a finished task,
the development of semantic technologies allows the easy adaptation
and translation of existing information formats, so it is expected
that more and more software pieces become easily integrable with the
ETSI-NFV-MANO [ETSI-NFV-MANO] architecture.
In contrast to the Or-Vi interface, the Nf-Vi interface exposes more
precise and low-level operations. Although this makes it easier to
be integrated to ARCA, it also makes it to be tied to specific
implementations. In other words, building a proxy that enforces the
aforementioned ontology to different interface instances to
homogenize them adds undesirable complexity. Therefore, new
components have been specifically developed for ARCA to be able to
interact with different NFVIs. Nevertheless, this specialization is
limited to the collector and enforcer. Moreover, it allows ARCA to
have optimized low-level operations, with high improvement of the
overall performance. This is the case of the specific
implementations of the collector and enforcer used with Mininet and
Docker, which are used as underlying infrastructures in previous
experiments described in ICIN 2017 [ICIN-2017]. Moreover, as
Martinez-Julia, et al. Expires 24 April 2024 [Page 31]
Internet-Draft AI Framework for NM October 2023
discussed in the following section, this is also the case of the
implementations of the collector and enforcer tied to OpenStack
telemetry and compute interfaces, respectively. Hence it is
important to ensure that telemetry is properly addressed, so a common
framework MUST be adopted in such endpoint (see RFC 9232 [RFC9232]).
Although OpenStack still lacks some functionality regarding the
construction of specific virtual networks, we use it as the NFVI
functional block in the integrated approach. Therefore, OpenStack is
the provider of the underlying SDN/NFV infrastructure and we
exploited its APIs and SDK to achieve the integration. More
specifically, in our showcase we use the APIs provided by Ceilometer,
Gnocchi, and Compute services as well as the SDK provided for Python.
All of them are gathered within the Nf-Vi interface. Moreover, we
have extended the Or-Vi interface to connect external elements, such
as the physical or environmental event detectors and Big Data
connectors, which is becoming a mandatory requirement of the current
virtualization ecosystem and it conforms our main extension to the
NFV architecture.
8.2. Target Experiment and Scenario
From the beginning of our work on the design of ARCA we are targeting
real-world scenarios, so we get better suited requirements. In
particular we work with a scenario that represents an emergency
support service that is hosted on a virtual computer and network
system, which is in turn hosted on the distributed virtualization
infrastructure of a medium-sized organization. The objective is to
clearly represent an application that requires high dynamicity and
high degree of reliability. The emergency support service
accomplishes this by being barely used when there is no incident but
also being heavily loaded when there is an incident.
Both the underlying infrastructure and virtual network share the same
topology. They have four independent but interconnected network
domains that form part of the same administrative domain
(organization). The first domain hosts the systems of the
headquarters (HQ) of the owner organization, so the VNFs it hosts
(servants) implement the emergency support service. We defined them
as ``servants'' because they are Virtual Machine (VM) instances that
work together to provide a single service by means of backing the
Load Balancer (LB) instances deployed in the separate domains. The
amount of resources (servants) assigned to the service will be
adjusted by ARCA, attaching or detaching servants to meet the load
boundaries specified by administrators.
Martinez-Julia, et al. Expires 24 April 2024 [Page 32]
Internet-Draft AI Framework for NM October 2023
The other domains represent different buildings of the organization
and will host the clients that access to the service when an incident
occurs. They also host the necessary LB instances, which are also
VNFs that are controlled by ARCA to regulate the access of clients to
servants. All domains will have physical detectors to provide
external information that can (and will) be correlated to the load of
the controlled virtual computer and network system and thus will
affect to the amount of servants assigned to it. Although the
underlying infrastructure, the servants, and the ARCA instance are
the same as those those used in the real world, both clients and
detectors will be emulated. Anyway, this does not reduce the
transferability of the results obtained from our experiments as it
allows to expand the amount of clients beyond the limits of most
physical infrastructures.
Each underlying OpenStack domain will be able to host a maximum of
100 clients, as they will be deployed on a low profile virtual
machine (flavor in OpenStack). In general, clients will be
performing requests at a rate of one request every ten seconds, so
there would be a maximum of 30 requests per second. However, under
the simulated incident, the clients will raise their load to reach a
common maximum of 1200 requests per second. This mimics the shape
and size of a real medium-size organization of about 300 users that
perform a maximum of four requests per second when they need some
support.
The topology of the underlying network is simplified by connecting
the four domains to the same, high-performance switch. However, the
topology of the virtual network is built by using direct links
between the HQ domain and the other three domains. These are
complemented by links between domains 2 and 3, and between domains 3
and 4. This way, the three domains have three paths to reach the HQ
domain: a direct path with just one hop, and two indirect paths with
two and three hops, respectively.
During the execution of the experiment, the detectors notify the
incident to the controller as soon as it happens. However, although
the clients are stimulated at the same time, there is some delay
between the occurrence of the incident and the moment the network
service receives the increase in the load. One of the main targets
of our experiment is to study such delay and take advantage of it to
anticipate the amount of servants required by the system. We discuss
it below.
Martinez-Julia, et al. Expires 24 April 2024 [Page 33]
Internet-Draft AI Framework for NM October 2023
In summary, this scenario highlights the main benefits of ARCA to
play the role of VIM and interacting with the underlying OpenStack
platform. This means the advancement towards an efficient use of
resources and thus reducing the CAPEX of the system. Moreover, as
the operation of the system is autonomic, the involvement of human
administrators is reduced and, therefore, the OPEX is also reduced.
8.3. OpenStack Platform
The implementation of the scenario described above reflects the
requirements of any edge/branch networking infrastructure, which are
composed of several distributed micro-data-centers deployed on the
wiring centers of the buildings and/or storeys. We chose to use
OpenStack to meet such requirements because it is being widely used
in production infrastructures and the resulting infrastructure will
have the necessary robustness to accomplish our objectives, at the
time it reflects the typical underlying platform found in any SDN/NFV
environment.
We have deployed four separate network domains, each one with its own
OpenStack instantiation. All domains are totally capable of running
regular OpenStack workload, i.e. executing VMs and networks, but, as
mentioned above, we designate the domain 1 to be the headquarters of
the organization. The different underlying networks required by this
(quite complex) deployment are provided by several VLANs within a
high-end L2 switch. This switch represents the distributed network
of the organization. Four separated VLANs are used to isolate the
traffic within each domain, by connecting an interface of OpenStack's
controller and compute nodes. These VLANs therefore form the
distributed data plane. Moreover, other VLAN is used to carry the
control plane as well as the management plane, which are used by the
NFV-MANO, and thus ARCA. It is instantiated in the physical machine
called ARCA Node, to exchange control and management operations in
relation to the collector and enforcer defined in ARCA. This VLAN is
shared among all OpenStack domains to implement the global control of
the virtualization environment pertaining to the organization.
Finally, other VLAN is used by the infrastructure to interconnect the
data planes of the separated domains and also to allow all elements
of the infrastructure to access the Internet to perform software
installation and updates.
Installation of OpenStack is provided by the Red Hat OpenStack
Platform, which is tightly dependent on the Linux operating system
and closely related to the software developed by the OpenStack Open
Source project. It provides a comprehensive way to install the whole
platform while being easily customized to meet our specific
requirements, while it is also backed by operational quality support.
Martinez-Julia, et al. Expires 24 April 2024 [Page 34]
Internet-Draft AI Framework for NM October 2023
The ARCA node is also based on Linux but, since it is not directly
related to the OpenStack deployment, it is not based on the same
distribution. It is just configured to be able to access the control
and management interfaces offered by OpenStack, and therefore it is
connected to the VLAN that hosts the control and management planes.
On this node we deploy the NFV-MANO components, including the micro-
services that form an ARCA instance.
In summary, we dedicate nine physical computers to the OpenStack
deployment, all are Dell PowerEdge R610 with 2 x Xeon 5670 2.96 GHz
(6 core / 12 thread) CPU, 48 GiB RAM, 6 x 146 GiB HD at 10 kRPM, and
4 x 1 GE NIC. Moreover, we dedicate an additional computer with the
same specification to the ARCA Node. We dedicate a less powerful
computer to implement the physical router because it will not be
involved in the general execution of OpenStack nor in the specific
experiments carried out with it. Finally, as detailed above, we
dedicate a high-end physical switch, an HP ProCurve 1810G-24, to
build the interconnection networks.
8.4. Initial Results
Using the platform described above we execute an initial but long-
lasting experiment based on the target scenario introduced at the
beginning of this section. The objective of this experiment is
twofold. First, we aim to demonstrate how ARCA behaves in a real
environment. Second, we aim to stress the coupling points between
ARCA and OpenStack, which will raise the limitations of the existing
interfaces.
With such objectives in mind, we define a timeline that will be
followed by both clients and external event detectors. It forces the
virtualized system to experience different situations, including
incidents of many severities. When an incident is found in the
timeline, the detectors notify it to the ARCA-based VIM and the
clients change their request rates, which will depend on the severity
of the incident. This behavior is widely discussed in ICIN 2018
[ICIN-2018], remarking how users behave after occurring a disaster or
another similar incident.
The ARCA-based VIM will know the occurrence of the incident from two
sources. First, it will receive the notification from the event
detectors. Second, it will notice the change of the CPU load of the
servants assigned to the target service. In this situation, ARCA has
different opportunities to overcome the possible overload (or
underload) of the system. We explore the anticipation approach
deeply discussed in ICIN 2018 [ICIN-2018]. Its operation is enclosed
in the analyzer and decider and it is based on an algorithm that is
divided in two sub-algorithms.
Martinez-Julia, et al. Expires 24 April 2024 [Page 35]
Internet-Draft AI Framework for NM October 2023
The first sub-algorithm reacts to the detection of the incident and
ulterior correlation of its severity to the amount of servants
required by the system. This sub-algorithm hosts the regression of
the learner, which is based on the SVM/SVR technique, and predicts
the necessary resources from two features: the severity of the
incident and the time elapsed from the moment it happened. The
resulting amount of servants is established as the minimum amount
that the VIM can use.
The second sub-algorithm is fed with the CPU load measurements of the
servants assigned to the service, as reported by the OpenStack
platform. With this information it checks whether the system is
within the operating parameters established by the NFVO. If not, it
adjusts the resources assigned to the system. It also uses the
minimum amount established by the other sub-algorithm as the basis
for the assignation. After every correction, this algorithm learns
the behavior by adding new correlation vectors to the SVM/SVR
structure.
When the experiment is running, the collector component of the ARCA-
based VIM is attached to the telemetry interface of OpenStack by
using the SDK to access the measurement data generated by Ceilometer
and stored by Gnocchi. In addition, it is attached to the external
event detectors in order to receive their notifications. On the
other hand, the enforcer component is attached to the Compute
interface of OpenStack by also using its SDK to request the
infrastructure to create, destroy, query, or change the status of a
VM that hosts a servant of the controlled system. Finally, the
enforcer also updates the lists of servers used by the load balancers
to distribute the clients among the available resources.
During the execution of the experiment we make the ARCA-based VIM to
report the severity of the last incident, if any, the time elapsed
since it occurred, the amount of servants assigned to the controlled
system, the minimum amount of servants to be assigned, as determined
by the anticipation algorithm, and the average load of all servants.
In this instance, the severities are spread between 0 (no incident)
and 4 (strongest incident), the elapsed times are less than 35
seconds, and the minimum server assignation (MSA) is below 10,
although the hard maximum is 15.
Martinez-Julia, et al. Expires 24 April 2024 [Page 36]
Internet-Draft AI Framework for NM October 2023
With such measurements we illustrate how the learned correlation of
the three features (dimensions) mentioned above is achieved. Thus,
when there is no incident (severity = 0), the MSA is kept to the
minimum. In parallel, regardless of the severity level, the
algorithm learned that there is no need to increase the MSA for the
first 5 or 10 seconds. This shows the behavior discussed in this
paper, that there is a delay between the occurrence of an event and
the actual need for updated amount of resources, and it forms one
fundamental aspect of our research.
By inspecting the results, we know that there is a burst of client
demands that is centered (peak) around 15 seconds after the
occurrence of an incident or any other change in the accounted
severity. We also know that the burst lasts longer for higher
severities, and it fluctuates a bit for the highest severities.
Finally, we can also notice that for the majority of severities, the
increased MSA is no longer required after 25 seconds from the time
the severity change was notified.
All that information becomes part of the knowledge of ARCA and it is
stored both by the internal structures of the SVM/SVR and, once
represented semantically, in the semantic database that manages the
knowledge base of ARCA. Thus, it is used to predict any future
behavior. For instance, is an incident of severity 3 has occurred 10
seconds ago, ARCA knows that it will need to set the MSA to 6
servants. In fact, this information has been used during the
experiment, so we can also know the accuracy of the algorithm by
comparing the anticipated MSA value with the required value (or even
the best value). However, the analysis of such information is left
for the future.
While preparing and executing the experiment we found several
limitation intrinsic to the current OpenStack platform. First,
regardless of the CPU and memory resources assigned to the underlying
controller nodes, the platform is unable to record and deliver
performance measurements at a lower interval than every 10 seconds,
so it is currently not suitable for real time operations, which is
important for our long-term research objectives. Moreover, we found
that the time required by the infrastructure to create a server that
hosts a somewhat heavy servant is around 10 seconds, which is too far
from our targets. Although these limitations can be improved in the
future, they clearly justify that our anticipation approach is
essential for the proper working of a virtual system and, thus, the
integration of external information becomes mandatory for future
system management technologies, especially considering the
virtualization environments.
Martinez-Julia, et al. Expires 24 April 2024 [Page 37]
Internet-Draft AI Framework for NM October 2023
Finally, we found it difficult for the required measurements to be
pushed to external components, so we had to poll for them.
Otherwise, some component of ARCA MUST be instantiated along the main
OpenStack components and services so it has first-hand and prompt
access to such features. This way, ARCA could receive push
notifications with the measurements, as it is for the external
detectors. This is a key aspect that affects the placement of the
NFV-VIM, or some subpart of it, on the general architecture.
Therefore, for future iterations of the NFV reference architecture,
an integrated view between the VIM and the NFVI could be required to
reflect the future reality.
9. Relation to Other IETF/IRTF Initiatives
TBD
10. IANA Considerations
This memo includes no request to IANA.
11. Security Considerations
As with other AI mechanisms, the major security concern for the
adoption of intelligent reasoning on external events to manage SDN/
NFV systems is that the boundaries of the control and management
planes are crossed to introduce information from outside. Such
communications MUST be highly and heavily secured since some
malfunction or explicit attacks might compromise the integrity and
execution of the controlled system. However, it is up to
implementers to deploy the necessary countermeasures to avoid such
situations. From the design point of view, since all operations are
performed within the control and/or management planes, the security
level of reasoning solutions is inherited and thus determined by the
security measures established by the systems conforming such planes.
12. Acknowledgements
TBD
13. References
13.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
Martinez-Julia, et al. Expires 24 April 2024 [Page 38]
Internet-Draft AI Framework for NM October 2023
[RFC9232] Song, H., Qin, F., Martinez-Julia, P., Ciavaglia, L., and
A. Wang, "Network Telemetry Framework", RFC 9232,
DOI 10.17487/RFC9232, May 2022,
<https://www.rfc-editor.org/info/rfc9232>.
[RFC9316] Li, C., Havel, O., Olariu, A., Martinez-Julia, P., Nobre,
J., and D. Lopez, "Intent Classification", RFC 9316,
DOI 10.17487/RFC9316, October 2022,
<https://www.rfc-editor.org/info/rfc9316>.
13.2. Informative References
[ETSI-NFV-IFA-004]
ETSI NFV GS NFV-IFA 004, "Network Functions Virtualisation
(NFV); Acceleration Technologies; Management Aspects
Specification", 2016.
[ETSI-NFV-IFA-005]
ETSI NFV GS NFV-IFA 005, "Network Functions Virtualisation
(NFV); Management and Orchestration; Or-Vi reference point
- Interface and Information Model Specification", 2016.
[ETSI-NFV-IFA-006]
ETSI NFV GS NFV-IFA 006, "Network Functions Virtualisation
(NFV); Management and Orchestration; Vi-Vnfm reference
point - Interface and Information Model Specification",
2016.
[ETSI-NFV-IFA-019]
ETSI NFV GS NFV-IFA 019, "Network Functions Virtualisation
(NFV); Acceleration Technologies; Management Aspects
Specification; Release 3", 2017.
[ETSI-NFV-MANO]
ETSI NFV GS NFV-MAN 001, "Network Functions Virtualisation
(NFV); Management and Orchestration", 2014.
[I-D.bernardos-anima-fog-monitoring]
Bernardos, C. J., Mourad, A., and P. Martinez-Julia,
"Autonomic setup of fog monitoring agents", Work in
Progress, Internet-Draft, draft-bernardos-anima-fog-
monitoring-07, 4 July 2023,
<https://datatracker.ietf.org/doc/html/draft-bernardos-
anima-fog-monitoring-07>.
[I-D.ietf-teas-ietf-network-slices]
Farrel, A., Drake, J., Rokui, R., Homma, S., Makhijani,
K., Contreras, L. M., and J. Tantsura, "A Framework for
Martinez-Julia, et al. Expires 24 April 2024 [Page 39]
Internet-Draft AI Framework for NM October 2023
Network Slices in Networks Built from IETF Technologies",
Work in Progress, Internet-Draft, draft-ietf-teas-ietf-
network-slices-25, 14 September 2023,
<https://datatracker.ietf.org/doc/html/draft-ietf-teas-
ietf-network-slices-25>.
[I-D.irtf-nmrg-ai-challenges]
François, J., Clemm, A., Papadimitriou, D., Fernandes, S.,
and S. Schneider, "Research Challenges in Coupling
Artificial Intelligence and Network Management", Work in
Progress, Internet-Draft, draft-irtf-nmrg-ai-challenges-
01, 10 July 2023, <https://datatracker.ietf.org/doc/html/
draft-irtf-nmrg-ai-challenges-01>.
[I-D.zcz-nmrg-digitaltwin-data-collection]
Zhou, C., Chen, D., Martinez-Julia, P., and Q. Ma, "Data
Collection Requirements and Technologies for Digital Twin
Network", Work in Progress, Internet-Draft, draft-zcz-
nmrg-digitaltwin-data-collection-03, 9 July 2023,
<https://datatracker.ietf.org/doc/html/draft-zcz-nmrg-
digitaltwin-data-collection-03>.
[ICIN-2017]
P. Martinez-Julia, V. P. Kafle, and H. Harai, "Achieving
the autonomic adaptation of resources in virtualized
network environments, in Proceedings of the 20th ICIN
Conference (Innovations in Clouds, Internet and Networks,
ICIN 2017). Washington, DC, USA: IEEE, 2018, pp. 1--8",
2017.
[ICIN-2018]
P. Martinez-Julia, V. P. Kafle, and H. Harai,
"Anticipating minimum resources needed to avoid service
disruption of emergency support systems, in Proceedings of
the 21th ICIN Conference (Innovations in Clouds, Internet
and Networks, ICIN 2018). Washington, DC, USA: IEEE, 2018,
pp. 1--8", 2018.
[OPENSTACK]
The OpenStack Project, "http://www.openstack.org/", 2018.
[TNSM-2018]
P. Martinez-Julia, V. P. Kafle, and H. Harai, "Exploiting
External Events for Resource Adaptation in Virtual
Computer and Network Systems, in IEEE Transactions on
Network and Service Management. Vol. 15, n. 2, pp. 555--
566, 2018.", 2018.
Martinez-Julia, et al. Expires 24 April 2024 [Page 40]
Internet-Draft AI Framework for NM October 2023
Authors' Addresses
Pedro Martinez-Julia (editor)
NICT
4-2-1, Nukui-Kitamachi, Koganei, Tokyo
184-8795
Japan
Phone: +81 42 327 7293
Email: pedro@nict.go.jp
Shunsuke Homma
NTT
Japan
Email: shunsuke.homma.ietf@gmail.com
Diego R. Lopez
Telefonica I+D
Don Ramon de la Cruz, 82
28006 Madrid
Spain
Email: diego.r.lopez@telefonica.com
Martinez-Julia, et al. Expires 24 April 2024 [Page 41]