Delay-Tolerant Networking E. Birrane
Internet-Draft Johns Hopkins Applied Physics Laboratory
Intended status: Informational August 20, 2015
Expires: February 21, 2016

Asynchronous Management Architecture
draft-birrane-dtn-ama-01

Abstract

This document describes the motivation, desirable properties, system model, roles/responsibilities, and component models associated with an asynchronous management architecture (AMA) suitable for providing application-level network management services in a challenged networking environment. Challenged networks are those that require fault protection, configuration, and performance reporting while unable to provide human-in-the-loop operations centers with synchronous feedback in the context of administrative sessions. In such a context, networks must exhibit behavior that is both deterministic and autonomous while maintaining compatibility with existing network management protocols and operational concepts.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on February 21, 2016.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


1. Introduction

This document presents an Asynchronous Management Architecture (AMA) providing application-layer network management services over links where delivery delays prevent timely communications between a network operator and a managed device. These delays may be caused by long signal propagations or frequent link disruptions (such as described in [RFC4838]) or by non-environmental delay drivers such as unavailability of network operators, administrative delays, or delays caused by quality-of-service prioritizations and service-level agreements.

1.1. Purpose

This document describes the motivation, rationale, desirable properties, and roles/responsibilities associated with an asynchronous management architecture (AMA) suitable for providing network management services in a challenged networking environment. These descriptions should be of sufficient specificity such that an implementing Network Management Protocol (NMP) conformant with this architecture will operate successfully in a challenged networking environment.

An AMA is necessary as the assumptions inherent to the architecture and design of synchronous management tools and techniques fail in challenged network scenarios. Absent an asynchronous management approach, network operators must either adapt to scaling outages of common network management functionality or, more often, must invest time and resources to evolve a challenged network into a well-connected, low-latency network. In some cases such evolution is merely a costly way to over-resource a network. In other cases, such evolution is impossible given physical limitations imposed by signal propagation delays, power, transmission technologies, and other phenomena. The ability to asynchronously manage asynchronous networks enables the large-scale deployment of such networks providing both enhanced technical capabilities and reduced deployment and operations costs. This document presents six sections that, together, describe an AMA suitable for enterprise management of asynchronous networks: motivation, service definitions, desirable properties, roles/responsibilities, system model, and logical component model. The purpose of each section is as follows.

1.2. Scope

It is assumed that any challenged network where network management would be usefully applied supports basic services such as naming, addressing, security, fragmentation, and traditional network/session layer functions. Therefore, these items are not covered in this architectural document.

While it is likely that a challenged network will interface with a non-challenged network, this architecture does not address the concept of network management compatibility with traditional, non-challenged network management approaches. Implementing NMPs conformant with this architecture should examine compatibility with existing approaches as part of supporting nodes acting as gateways between network types.

1.3. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. Terminology

This section identifies those terms critical to understanding the proper operation of the AMA. Whenever possible, these terms align in both word selection and meaning with their analogs from other management protocols.

3. Motivation

The characteristics of challenged networks, to include those networks challenged by administrative or policy delays, do not conform to several assumptions made by current network management approaches. These assumptions include high-rate, highly available data, round-trip data exchange, and operator-in-the-loop operation. The inability of current approaches to provide network management services in a challenged network motivates the need for a new network management architecture focused on asynchronous, open-loop, autonomous control of network components.

3.1. Challenged Networks

A growing variety of link-challenged networks support packetization to increase data communications reliability without otherwise guaranteeing a simultaneous end-to-end path. Examples of such networks include Mobile Ad-Hoc Networks (MANets), Vehicular Ad-Hoc Networks (VANets), space-terrestrial internetworks, and heterogeneous networking overlays. Links in such networks are often unavailable due to attenuations, propagation delays, occultation, and other limitations imposed by energy and mass considerations. Data communications in such networks rely on store-and-forward and other queueing strategies to wait for the connectivity necessary to usefully advance a packet along its route.

Similarly, there also exist well-resourced networks that incur high message delivery delays due to non-environmental limitations. Examples include networks whose operations centers are understaffed, or where data volume and management requirements exceed the real-time cognitive load of operators or the capabilities of the associated operations console software. Networks where policy prevents certain data users from utilizing existing bandwidth also become delayed and disrupted environments, with administratively controlled periods of no communication.

Regardless of the reason, during periods of no communications nodes must rely on fault-management and other autonomous mechanisms to ensure the safe operation of the node and its ability to usefully re-join the network at a later time. In cases of sparsely-populated networks, there may never be a practical concept of "the connected network" as most nodes may be disconnected most of the time. In such environments, defining a network in terms of instantaneous connectivity becomes impractical or impossible.

Specifically, challenged networks exhibit the following properties that may violate assumptions built into current approaches to network management.

3.2. Current Management Approaches

Network management in non-challenged networks provides mechanisms for communicating locally-collected data from Agents to associated Managers, typically using a "pull" mechanism where data must be explicitly requested in order to be transmitted.

A near-ubiquitous method for management in non-challenged networks today is the Simple Network Management Protocol (SNMP) [RFC3416]. SNMP utilizes a request/response model to set and retrieve data values such as host identifiers, link utilizations, error rates, counters, etc., between application software on Agents and Managers. Data may be directly sampled or consolidated into representative statistics. Additionally, SNMP supports a model for asynchronous notification messages, called traps, based on predefined triggering events. Thus, Managers can query Agents for status information, send new configurations, and be informed when specific events have occurred. Traps and query-able data are defined in one or more Management Information Bases (MIBs) which define the information for a particular data standard, protocol, device, or application.

In challenged networks, the request/response method of data collection is neither efficient nor, at times, possible as it relies on sessions, round-trip latency, message retransmission, and ordered delivery. Adaptive modifications to SNMP to support challenged networks would alter the basic function of the protocol (data models, control flows, and syntax) so as to be functionally incompatible with existing SNMP installations. While a standard for networking, extending SNMP into this new domain is no more plausible than extending IP routing protocols into this domain.

The Network Configuration Protocol (NETCONF) provides device-level configuration capabilities [RFC6241] so as to replace vendor-specific command line interface (CLI) configuration software. The XML-based protocol provides a remote procedure call (RPC) syntax such that any exposed functionality on an Agent can be exercised via a software application interface. NETCONF places no specific functional requirements or constraints on the capabilities of the Agent, which makes it a very flexible tool for configuring a homogeneous network of devices. However, NETCONF does place specific constraints on any underlying transport protocol: namely, a long-lived, reliable, low-latency sequenced data delivery session. This is a fundamental requirement given the RPC nature of the operating concept, and it is unsustainable in a challenged network.

3.3. Limitations of Current Approaches

Ultimately, management approaches that rely on timely data exchange, such as those that rely on negotiated sessions or other synchronized acknowledgment, do not function in challenged network environments. Familiar examples of TCP/IP-based management via closed-loop, synchronous messaging do not work when network disruptions increase in frequency and severity. While no protocol delivers data in the absence of a networking link, protocols that eliminate or drastically reduce overhead and end-point coordination require smaller transmission windows and continue to function when confronted with scaling delays and disruptions in the network.

Just as the concept of a loosely-confederated set of nodes changes the definition of a network, it also changes the operational concept of what it means to manage a network. When a network stops being a single entity exhibiting a single behavior, "network management" becomes large-scale "node management". Individual nodes must share the burden of implementing desirable behavior without reliance on a single oracle of configuration or other coordinating function such as an operator-in-the-loop.

4. Service Definitions

This section identifies the type of services that must exist between Managers and Agents within an AMA. These services include configuration, reporting, parameterized control, and administration.

4.1. Configuration

Configuration services update local information held by an Agent as it relates to managed applications and protocols. Such information refers to the data necessary to configure behavior in response to state and time changes on these devices. The local information configured through these services includes the data definitions from Application Data Models (ADMs), the specification of parameters associated with these models, and tactical data definitions defined by operators in the network.

New configurations received by a node must be validated to ensure that they do not conflict with other configurations at the node, or prevent the node from effectively working with other nodes in its region. In challenged networks there may not be sufficient time to prevent an erroneous or stale configuration from harming the flow of data through the network.

Examples of configuration service behavior include the following.

4.2. Reporting

Reporting services collect state information from an Agent, such as performance information, and send this information to one or more Managers. The term "reporting" is used in place of the term "monitoring" as challenged networks cannot support closed-loop monitoring. Reports sent by an Agent provide best-effort information to Managers.

Since a Manager is not actively "monitoring" an Agent, the Agent must make its own determination on when to send what reports based on its own local time and state information. Agents should produce reports of varying fidelity and with varying frequency based on thresholds and other information set as part of configuration services.

Examples of reporting service behavior include the following.

4.3. Parameterized Control

Control services provide mechanisms for an Agent to change its behavior using pre-defined, pre-configured responses from a Manager. By setting autonomous actions on Agents, Managers can "manage" the node asynchronously during periods of no communication. Agents must understand a finite set of pre-programmed functions related to the protocols and applications managed on the device. As such, controls comprise the basic autonomy mechanism within the AMA.

Similar to reporting services, controls are run based on the Agent's notion of time and state in accordance with directives provided by configuration services.

Examples of potential control service behavior include the following.

4.4. Administration

Administration services enforce the potentially complex mapping of configuration, reporting, and control services amongst Agents and Managers in the network. Fine-grained access control specifying which Managers may apply which services to which Agents may be necessary in networks dealing with multiple administrative entities or overlay networks crossing multiple administrative boundaries. Whitelists, blacklists, shared keys, PKI, or other schemes may be used for this purpose.

Examples of administration service behavior include the following.

5. Desirable Properties

As discussed, realizing necessary service definitions given the characteristics of challenged networks cannot be performed using current network management approaches and operational concepts. This section describes those desirable properties of an AMA that enable the implementation of service definitions in such networks. These properties include open-loop, intelligent push, asynchronous mechanisms that can scale as message delivery delays scale. Ultimately, a useful AMA MUST be built around the following five design principles.

5.1. Intelligent Push of Information

Pull management mechanisms require that a Manager send a query to an Agent and then wait for the response to that query. This practice both implies a control session between entities and increases the overall message traffic in the network. Challenged networks cannot guarantee timely round-trip data exchange and, in extreme cases, consist solely of uni-directional links. Therefore, pull mechanisms must be avoided in favor of push mechanisms.

Push mechanisms, in this context, refer to Agents making their own determinations relating to the information that should be sent to Managers. Such mechanisms do not require round-trip communications as Managers do not request each reporting instance; Managers need only request once, in advance, that information be produced in accordance with a pre-determined schedule or in response to a pre-defined state on the Agent. In this way, information is "pushed" from Agents to Managers and the push is "intelligent" because it is based on some internal evaluation performed by the Agent.
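
As a rough illustration of intelligent push, the following sketch shows an Agent deciding what to report using only its own clock and state. The names ProductionRule, Agent, and tick, and the sample values, are assumptions made for this example and are not defined by this architecture.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ProductionRule:
    """Hypothetical rule: report data_id on a period or when a predicate holds."""
    data_id: str
    period: Optional[float] = None          # seconds between reports
    predicate: Optional[Callable[[Dict[str, float]], bool]] = None
    last_sent: Optional[float] = None       # Agent-local time of the last report

    def due(self, now: float, state: Dict[str, float]) -> bool:
        if self.predicate is not None:
            return self.predicate(state)    # state-based rule
        if self.last_sent is None:
            return True                     # time-based rule, never sent yet
        return now - self.last_sent >= self.period


@dataclass
class Agent:
    state: Dict[str, float]
    rules: List[ProductionRule] = field(default_factory=list)

    def tick(self, now: float) -> List[Tuple[str, float]]:
        """Evaluate every rule locally and return the reports to push."""
        reports = []
        for rule in self.rules:
            if rule.due(now, self.state):
                reports.append((rule.data_id, self.state[rule.data_id]))
                rule.last_sent = now
        return reports


agent = Agent(state={"AD1": 4.0, "AD2": 7.0})
agent.rules.append(ProductionRule("AD1", period=1.0))
agent.rules.append(ProductionRule("AD2", predicate=lambda s: s["AD2"] > 10))

first = agent.tick(now=0.0)     # AD1 is due immediately; AD2 predicate is false
second = agent.tick(now=0.5)    # nothing is due yet
agent.state["AD2"] = 12.0
third = agent.tick(now=1.0)     # AD1 period elapsed and AD2 predicate now holds
```

The Manager's only involvement is installing the two rules in advance; every later decision to send is made by the Agent.
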

5.2. Minimize Message Size Not Node Processing

Protocol designers must balance message size versus message processing time at sending and receiving nodes. Verbose representations of data simplify node processing whereas compact representations require additional activities to generate/parse the compacted message. There is no asynchronous management advantage to minimizing node processing time in a challenged network. However, there is a significant advantage to smaller message sizes in such networks. Compact messages require smaller periods of viable transmission for communication, incur less re-transmission cost, and consume fewer resources when persistently stored en route in the network. AMPs should minimize PDUs whenever practical, to include packing and unpacking binary data, variable-length fields, and pre-configured data definitions.
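
The trade-off can be seen in a small sketch that encodes the same report twice. The report fields and the fixed binary layout are assumptions chosen for illustration; this architecture does not prescribe either encoding.

```python
import json
import struct

# Hypothetical report: a data identifier, a timestamp, and one counter value.
report = {"id": 1, "timestamp": 1440028800, "value": 42}

# Verbose, self-describing encoding: simple to parse, larger on the wire.
verbose = json.dumps(report).encode("utf-8")

# Compact encoding: both ends agree in advance on the field layout
# (network byte order: 1-byte id, 4-byte timestamp, 2-byte value).
LAYOUT = struct.Struct("!BIH")
compact = LAYOUT.pack(report["id"], report["timestamp"], report["value"])

# The receiver does extra work to unpack, but fewer bytes are stored and sent.
decoded_id, decoded_ts, decoded_value = LAYOUT.unpack(compact)
```

The compact form occupies 7 bytes, several times smaller than the JSON text, at the cost of requiring a pre-configured layout on both nodes.
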

5.3. Specific Data Identification

Elements within the management system must be uniquely identifiable so that they can be individually manipulated. Identification schemes that are relative to system configuration make data exchange between Agents and Managers difficult as system configurations may change faster than nodes can communicate. For example, SNMP-managed systems often approximate associative array lookups by (1) querying a list of known array keys, (2) making a key-index map, and (3) then querying a specific index into the array based on that map. Ignoring the inefficiency of two pull requests, this mechanism fails when the Agent changes its key-index mapping between the first and second query. AMPs must find a way to uniquely identify such data that does not rely on system configuration, perhaps through parameterization of the initial query.
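
The failure mode described above can be sketched as follows. The interface table and the helper functions are hypothetical; they stand in for any lookup whose indices depend on the current configuration of the Agent.

```python
# Hypothetical Agent-side table of per-interface error counts.
interfaces = {"eth0": 3, "radio1": 17}


def key_index_map(table):
    """Index-based scheme: positions depend on the current configuration."""
    return {name: i for i, name in enumerate(sorted(table))}


def query_by_index(table, index):
    return table[sorted(table)[index]]


def query_by_parameter(table, name):
    """Parameterized scheme: the identifier carries the key itself."""
    return table.get(name)


# The Manager learns the mapping, then the Agent is reconfigured before
# the second query arrives.
stale_map = key_index_map(interfaces)
interfaces["beacon0"] = 99                     # new entry shifts the indices

by_index = query_by_index(interfaces, stale_map["eth0"])   # reads the wrong row
by_parameter = query_by_parameter(interfaces, "eth0")      # still correct
```

In this sketch the index-based query returns the value for "beacon0" rather than "eth0", while the parameterized query is unaffected by the reconfiguration.
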

5.4. Custom, Tactical Data Definition

Tactical definition of new data from existing data (such as through data fusion, averaging, sampling, or other mechanisms) provides the ability to communicate desired information in as compact a form as possible. Specifically, an Agent should not be required to transmit a large data set for a Manager that only wishes to calculate a smaller, inferred data set. The Agent should calculate the smaller data set on its own and transmit that instead. Since the identification of these smaller data sets is likely both tactical and in the context of a specific network deployment, AMPs must provide a mechanism for their definition.
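
A minimal sketch of this idea, assuming a hypothetical Agent that holds raw samples for AD1 and accepts named reductions from a Manager, might look like the following. The functions define and produce are illustrative names only.

```python
from statistics import mean

# Hypothetical raw samples held by the Agent (e.g. link utilization readings).
samples = {"AD1": [0.61, 0.64, 0.58, 0.90, 0.62]}

# Custom definitions registered by a Manager: name -> (source, reducer).
custom_definitions = {}


def define(name, source, reducer):
    """Register a custom data item computed from existing data."""
    custom_definitions[name] = (source, reducer)


def produce(name):
    """Compute the custom item on the Agent, so only the result is sent."""
    source, reducer = custom_definitions[name]
    return reducer(samples[source])


define("CD1", "AD1", mean)   # average utilization
define("CD2", "AD1", max)    # peak utilization

report = {name: produce(name) for name in custom_definitions}
```

The Agent transmits two values instead of the five raw samples; with larger sample sets the saving grows accordingly.
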

5.5. Autonomous Operation

AMA network functions must be achievable using only knowledge local to the Agent. Performance data production, reconfiguration, and other activity must be autonomously evaluated and implemented by the impacted node. Managers, rather than directing an Agent, configure the autonomy engine of the Agent to take its own action under the appropriate conditions in accordance with the Agent's notion of local state and time.
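
One way to picture such an autonomy engine is a table of pre-configured condition/control pairs evaluated against local state. The controls, thresholds, and state fields below are assumptions invented for this sketch.

```python
# Hypothetical controls the Agent is pre-programmed to understand.
def reduce_tx_power(state):
    state["tx_power"] = max(state["tx_power"] - 1, 0)


def enter_safe_mode(state):
    state["mode"] = "safe"


CONTROLS = {"reduce_tx_power": reduce_tx_power, "enter_safe_mode": enter_safe_mode}

# Rules configured in advance by a Manager: (condition on local state, control).
rules = [
    (lambda s: s["battery"] < 20, "reduce_tx_power"),
    (lambda s: s["battery"] < 5, "enter_safe_mode"),
]


def evaluate(state):
    """Run every control whose condition holds; no Manager contact is needed."""
    invoked = []
    for condition, name in rules:
        if condition(state):
            CONTROLS[name](state)
            invoked.append(name)
    return invoked


state = {"battery": 15, "tx_power": 3, "mode": "nominal"}
invoked = evaluate(state)
```

With the battery at 15, only the first rule fires: transmit power drops by one step and the node stays in nominal mode.
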

6. Roles and Responsibilities

By definition, Agents reside on managed devices and Managers reside on managing devices. This section describes how these roles participate in the network management functions outlined in the prior section.

6.1. Agent Responsibilities

Application Data Model (ADM) Support

Agents MUST collect all data, execute all controls, and provide all reports and operations required by each ADM which the Agent claims to support. Agents MUST enumerate supported ADMs so that Managers in a network understand what information is understood by that Agent.

Local Data Collection

Agents MUST collect from local firmware (or other on-board mechanisms) and report all atomic data defined in all ADMs for which they have been configured.

Autonomous Control

Agents MUST determine, without Manager intervention, whether a configured control should be invoked. Agents MUST periodically evaluate the conditions associated with configured controls and invoke those controls based on local state. Agents MAY also invoke controls on other devices for which they act as proxy.

User Data Definition

Agents MUST provide mechanisms for operators in the network to use configuration services to create customized data, reports, macro definitions and other information specific to a particular operator need in the context of a specific network or network use-case. Agents MUST allow for the creation, listing, and removal of such data definitions in accordance with whatever security models are deployed within the particular network.

Where applicable, Agents MUST verify the validity of custom data definitions when they are configured and respond in a way consistent with the logging/error-handling policies of the Agent and the network.

Autonomous Reporting

Agents MUST determine, without Manager intervention, whether and when to populate and transmit a given data report targeted to one or more Managers in the network.

Consolidate Messages

Agents SHOULD produce as few messages as possible when sending information. For example, rather than sending multiple report messages to a Manager, an Agent SHOULD prefer to send a single message containing multiple reports.
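
The saving from consolidation can be sketched with a hypothetical wire format in which every message carries a fixed header. The header and report layouts below are assumptions for illustration only.

```python
import struct

HEADER = struct.Struct("!HB")   # hypothetical: 2-byte Agent id, 1-byte report count
REPORT = struct.Struct("!BI")   # hypothetical: 1-byte data id, 4-byte value


def single_messages(agent_id, reports):
    """One message per report: the header cost is paid every time."""
    return [HEADER.pack(agent_id, 1) + REPORT.pack(d, v) for d, v in reports]


def consolidated_message(agent_id, reports):
    """One message carrying all reports: the header cost is paid once."""
    body = b"".join(REPORT.pack(d, v) for d, v in reports)
    return HEADER.pack(agent_id, len(reports)) + body


reports = [(1, 100), (2, 250), (3, 7)]
separate_bytes = sum(len(m) for m in single_messages(7, reports))
combined_bytes = len(consolidated_message(7, reports))
```

For three reports this sketch sends 18 bytes in one message rather than 24 bytes across three, and the single message needs only one transmission opportunity.
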

Regional Proxy

Agents MAY perform any of their responsibilities on behalf of other network nodes that, themselves, do not have an Agent. In such a configuration, the Agent acts as a proxy for these other network nodes.

6.2. Manager Responsibilities

Agent/ADM Mapping

Managers MUST understand what ADMs are supported by the various Agents with which they communicate. Managers SHOULD NOT attempt to request, invoke, or refer to ADM information for ADMs unsupported by an Agent.

Data Collection

Managers MUST receive information from Agents by asynchronously configuring the production of data reports and then waiting for, and collecting, responses from Agents over time. Managers MAY try to detect conditions where Agent information has not been received within operationally relevant timespans and react in accordance with network policy.
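
Detection of overdue Agents could be as simple as the following sketch, which assumes the Manager records the arrival time of each report on its own clock. The function names and the 600-second threshold are hypothetical.

```python
# Hypothetical bookkeeping on a Manager: last time a report arrived per Agent.
last_report_time = {}


def record_report(agent, received_at):
    last_report_time[agent] = received_at


def overdue_agents(now, max_silence):
    """Agents silent for longer than max_silence seconds, by the Manager's clock."""
    return sorted(a for a, t in last_report_time.items() if now - t > max_silence)


record_report("agent-a", received_at=100.0)
record_report("agent-b", received_at=940.0)
overdue = overdue_agents(now=1000.0, max_silence=600.0)
```

What counts as an operationally relevant timespan depends on the network; the reaction to an overdue Agent is a matter of network policy.
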

Custom Definitions

Managers SHOULD provide the ability to define custom data and report definitions. Any defined custom definitions MUST be transmitted to appropriate Agents and these definitions MUST be remembered to interpret the reporting of these custom values from Agents in the future.

Data Translation

Managers SHOULD provide some interface to other network management protocols, such as SNMP. Managers MAY accomplish this by accumulating a repository of push-data from high-latency parts of the network from which data may be pulled by low-latency parts of the network.
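
Such a repository might be sketched as below, assuming reports carry an Agent-assigned timestamp so that late or out-of-order deliveries can be handled. The class and method names are hypothetical.

```python
class ReportRepository:
    """Hypothetical store of the newest pushed value per (agent, data id)."""

    def __init__(self):
        self._latest = {}

    def push(self, agent, ident, value, timestamp):
        """Accept a report that may arrive late or out of order."""
        key = (agent, ident)
        current = self._latest.get(key)
        if current is None or timestamp > current[1]:
            self._latest[key] = (value, timestamp)

    def pull(self, agent, ident):
        """Answer a synchronous query from the low-latency side."""
        return self._latest.get((agent, ident))


repo = ReportRepository()
repo.push("agent-a", "AD1", 12, timestamp=200)
repo.push("agent-a", "AD1", 9, timestamp=100)    # delayed, older report is ignored
answer = repo.pull("agent-a", "AD1")
missing = repo.pull("agent-b", "AD1")
```

A pull returns the value together with its timestamp so that the requester can judge how stale the data is.
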

Data Fusion

Managers MAY support the fusion of data from multiple Agents with the purpose of transmitting fused data results to other Managers within the network. Managers MAY receive fused reports from other Managers pursuant to appropriate security and administrative configurations.

7. System Model

This section describes the notional data flows and control flows that illustrate how Managers and Agents within an AMA cooperate to perform network management services.

7.1. Data Flows

The AMA identifies three significant data flows: control flows from Managers to Agents, report flows from Agents to Managers, and fusion reports from Managers to other Managers. These data flows are illustrated in Figure 1.

AMA Data Flows

    
 +---------+       +------------------------+      +---------+        
 | Node A  |       |         Node B         |      |  Node C |
 |         |       |                        |      |         |
 |+-------+|       |+-------+      +-------+|      |+-------+|
 ||       ||=====>>||Manager|====>>|       ||====>>||       ||
 ||       ||<<=====||   B   |<<====|Agent B||<<====||       ||
 ||       ||       |+--++---+      +-------+|      ||Manager||
 || Agent ||       +---||-------------------+      ||   C   ||              
 ||   A   ||           ||                          ||       ||
 ||       ||<<=========||==========================||       ||
 ||       ||===========++========================>>||       ||
 |+-------+|                                       |+-------+|
 +---------+                                       +---------+
             

Figure 1

In this data flow, the Agent on node A receives configurations from Managers on nodes B and C, and replies with reports back to these Managers. Similarly, the Agent on node B interacts with the local Manager on node B and the remote Manager on node C. Finally, the Manager on node B may fuse data reports received from Agents at nodes A and B and send these fused reports back to the Manager on node C.

From this figure it is clear that there exist many-to-many relationships amongst Managers, amongst Agents, and between Agents and Managers. Note that Agents and Managers are roles, not necessarily differing software applications. Node A may represent a single software application fulfilling only the Agent role, whereas node B may have a single software application fulfilling both the Agent and Manager roles. The specifics of how these roles are realized are an implementation matter.

7.2. Control Flow by Role

This section describes three common configurations of Agents and Managers and the flow of messages between them. These configurations involve local and remote management and data fusion.

7.2.1. Notation

The notation outlined in Table 1 describes the types of control messages exchanged between Agents and Managers.

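Terminology

+----------------------+-----------------------------------+--------------------------+
| Term                 | Definition                        | Example                  |
+----------------------+-----------------------------------+--------------------------+
| AD#                  | Atomic data definition, from ADM. | AD1                      |
+----------------------+-----------------------------------+--------------------------+
| CD#                  | Custom data definition.           | CD1 = AD1 + CD0.         |
+----------------------+-----------------------------------+--------------------------+
| DEF([ACL], ID, EXPR) | Define id from expression. Allow  | DEF([*], CD1, AD1 + AD2) |
|                      | managers in access control list   |                          |
|                      | (ACL) to request this id.         |                          |
+----------------------+-----------------------------------+--------------------------+
| PROD(P, ID)          | Produce ID according to predicate | PROD(1s, AD1)            |
|                      | P. P may be a time period (1s) or |                          |
|                      | an expression (AD1 > 10).         |                          |
+----------------------+-----------------------------------+--------------------------+
| RPT(ID)              | A report identified by ID.        | RPT(AD1)                 |
+----------------------+-----------------------------------+--------------------------+

Table 1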
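
For readers who prefer code to notation, the three message types can be modeled as small record types. This is only a sketch of the notation; the class names and string fields are assumptions and do not define a message format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Def:
    """DEF([ACL], ID, EXPR): define ident from expr, visible to Managers in acl."""
    acl: Tuple[str, ...]     # ("*",) means any Manager
    ident: str
    expr: str

    def __str__(self):
        return f"DEF([{','.join(self.acl)}], {self.ident}, {self.expr})"


@dataclass(frozen=True)
class Prod:
    """PROD(P, ID): produce ident whenever predicate P is satisfied."""
    predicate: str           # a period such as "1s" or an expression
    ident: str

    def __str__(self):
        return f"PROD({self.predicate}, {self.ident})"


@dataclass(frozen=True)
class Rpt:
    """RPT(ID): a report identified by ident."""
    ident: str

    def __str__(self):
        return f"RPT({self.ident})"


messages = [Def(("*",), "CD1", "AD1 + AD2"), Prod("1s", "CD1"), Rpt("CD1")]
rendered = [str(m) for m in messages]
```
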

7.2.2. Serialized Management

This is a nominal configuration of network management where a Manager interacts with a set of Agents. The control flows for this are outlined in Figure 2.

Serialized Management Control Flow

    
 +----------+            +---------+           +---------+              
 |  Manager |            | Agent A |           | Agent B |
 +----+-----+            +----+----+           +----+----+
      |                       |                     |
      |-----PROD(1s, AD1)---->|                     |(Step 1)
      |----------------------------PROD(1s, AD1)--->|                    
      |                       |                     |
      |                       |                     |
      |<-------RPT(AD1)-------|                     |(Step 2)
      |<-----------------------------RPT(AD1)-------|
      |                       |                     |
      |                       |                     |
      |<-------RPT(AD1)-------|                     |
      |<-----------------------------RPT(AD1)-------|
      |                       |                     |
      |                       |                     |
      |<-------RPT(AD1)-------|                     |
      |<-----------------------------RPT(AD1)-------|
      |                       |                     |
                 

In a simple network, a Manager interacts with multiple Agents.

Figure 2

In this figure, the Manager configures Agents A and B to produce atomic data AD1 every second in (Step 1). At some point in the future, upon receiving and configuring this message, Agents A and B then build a report containing AD1 and send those reports back to the Manager in (Step 2).

7.2.3. Multiplexed Management

Networks spanning multiple administrative domains may require multiple Managers (for example, one per domain). When a Manager defines custom reports/data to an Agent, that definition may be tagged with an access control list (ACL) to limit which other Managers will be privy to this information. Managers in such networks SHOULD synchronize with those other Managers granted access to their custom data definitions. When Agents generate messages, they MUST only send messages to Managers according to these ACLs, if present. The control flows in this scenario are outlined in Figure 3.

Multiplexed Management Control Flow

    
 +-----------+            +-------+            +-----------+              
 | Manager A |            | Agent |            | Manager B |
 +-----+-----+            +---+---+            +-----+-----+
       |                      |                      |
       |--DEF(A,CD1,AD1*2)--->|<--DEF(B, CD2, AD2*2)-|(Step 1)
       |                      |                      |
       |---PROD(1s, CD1)----->|<---PROD(1s, CD2)-----|(Step 2)
       |                      |                      |
       |<-------RPT(CD1)------|                      |(Step 3)
       |                      |--------RPT(CD2)----->|
       |<-------RPT(CD1)------|                      |
       |                      |--------RPT(CD2)----->|
       |                      |                      |
       |                      |<---PROD(1s, CD1)-----|(Step 4)
       |                      |                      |
       |                      |--ERR(CD1 no perm.)-->|   
       |                      |                      |
       |--DEF(*,CD3,AD3*3)--->|                      |(Step 5)
       |                      |                      |
       |---PROD(1s, CD3)----->|                      |(Step 6)
       |                      |                      |
       |                      |<---PROD(1s, CD3)-----|
       |                      |                      |
       |<-------RPT(CD3)------|--------RPT(CD3)----->|(Step 7)
       |<-------RPT(CD1)------|                      |
       |                      |--------RPT(CD2)----->|
       |<-------RPT(CD3)------|--------RPT(CD3)----->|
       |<-------RPT(CD1)------|                      |
       |                      |--------RPT(CD2)----->|
                 

Complex networks require multiple Managers interfacing with Agents.

Figure 3

In more complex networks, Managers may choose to define custom reports and data definitions, and Agents may need to accept such definitions from multiple Managers. Custom data definitions may include an ACL that describes who may query and otherwise understand the custom definition. In (Step 1), Manager A defines CD1 only for A while Manager B defines CD2 only for B. Managers may, then, request the production of reports containing these custom definitions, as shown in (Step 2). Agents produce different data for different Managers in accordance with configured production rules, as shown in (Step 3). If a Manager requests an operation, such as a production rule, for a custom data definition for which the Manager has no permissions, a response consistent with the configured logging policy on the Agent should be implemented, as shown in (Step 4). Alternatively, as shown in (Step 5), a Manager may define custom data with no restrictions allowing all other Managers to request and use this definition. This allows all Managers to request the production of reports containing this definition, shown in (Step 6) and have all Managers receive this and other data going forward, as shown in (Step 7).
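
The steps of Figure 3 can be traced with a small simulation. The Agent class, its methods, and the atomic values are hypothetical, and returning an error string is only one possible response consistent with a logging policy.

```python
class Agent:
    """Hypothetical Agent that enforces per-definition access control lists."""

    def __init__(self, atomic):
        self.atomic = dict(atomic)      # atomic data values, e.g. {"AD1": 5}
        self.definitions = {}           # id -> (acl, function of atomic data)
        self.productions = []           # accepted (manager, id) pairs

    def define(self, acl, ident, func):
        self.definitions[ident] = (set(acl), func)

    def produce(self, manager, ident):
        acl, _ = self.definitions[ident]
        if "*" not in acl and manager not in acl:
            return f"ERR({ident} no perm.)"
        self.productions.append((manager, ident))
        return None

    def reports(self):
        """Build (manager, id, value) reports for every accepted production."""
        return [(m, i, self.definitions[i][1](self.atomic))
                for m, i in self.productions]


agent = Agent({"AD1": 5, "AD2": 8, "AD3": 2})
agent.define(["A"], "CD1", lambda d: d["AD1"] * 2)      # Step 1
agent.define(["B"], "CD2", lambda d: d["AD2"] * 2)
agent.produce("A", "CD1")                               # Step 2
agent.produce("B", "CD2")
error = agent.produce("B", "CD1")                       # Step 4: rejected
agent.define(["*"], "CD3", lambda d: d["AD3"] * 3)      # Step 5
agent.produce("A", "CD3")                               # Step 6
agent.produce("B", "CD3")
sent = agent.reports()                                  # Step 7
```

Manager B never receives CD1, while both Managers receive CD3 because its ACL is unrestricted.
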

7.2.4. Data Fusion

In some networks, Agents do not individually transmit their data to a Manager, preferring instead to fuse reporting data with local nodes prior to transmission. This approach reduces the number and size of messages in the network and reduces overall transmission energy expenditure. DTNMP supports fusion of NM reports by co-locating Agents and Managers on nodes and offloading fusion activities to the Manager. This process is illustrated in Figure 4.

Data Fusion Control Flow

    
 +-----------+        +-----------+      +---------+      +---------+               
 | Manager A |        | Manager B |      | Agent B |      | Agent C |
 +---+-------+        +-----+-----+      +----+----+      +----+----+
     |                      |                 |                |
     |--DEF(A,CD0,AD1+AD2)->|                 |                |(Step 1)
     |--PROD(AD1&AD2, CD0)->|                 |                |
     |                      |                 |                |
     |                      |--PROD(1s,AD1)-->|                |(Step 2)
     |                      |-------------------PROD(1s, AD2)->|
     |                      |                 |                |
     |                      |<---RPT(AD1)-----|                |(Step 3)
     |                      |<-------------------RPT(AD2)------|
     |                      |                 |                |
     |<-----RPT(A,CD0)------|                 |                |(Step 4)
     |                      |                 |                |
                 

Data fusion occurs amongst Managers in the network.

Figure 4

In this example, Manager A requires the production of a computed data set, CD0, from node B, as shown in (Step 1). The manager role understands what data is available from what agents in the subnetwork local to B, understanding that AD1 is available locally and AD2 is available remotely. Production messages are produced in (Step 2) and data collected in (Step 3). This allows the manager at node B to fuse the collected data reports into CD0 and return it in (Step 4). While a trivial example, the mechanism of associating fusion with the manager function rather than the agent function scales with fusion complexity, though it is important to reiterate that agent and manager designations are roles, not individual software components. There may be a single software application running on node B implementing both Manager B and Agent B roles.
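The fusion step performed by the manager role at node B could look roughly like the sketch below. The class name FusingManager and its methods are hypothetical; the only detail taken from Figure 4 is that CD0 is defined as AD1+AD2.

```python
class FusingManager:
    """Collects reports from Agents and fuses them once all have arrived."""

    def __init__(self, inputs, fuse):
        self.inputs = set(inputs)   # names of the reports to wait for
        self.fuse = fuse            # function combining the collected values
        self.collected = {}

    def on_report(self, name, value):
        """Record one report; return the fused value when inputs are complete."""
        if name in self.inputs:
            self.collected[name] = value
        if self.inputs.issubset(self.collected):
            fused = self.fuse(self.collected)
            self.collected = {}     # start a fresh collection cycle
            return fused
        return None


# CD0 is defined as AD1 + AD2, as in DEF(A,CD0,AD1+AD2).
manager_b = FusingManager({"AD1", "AD2"}, lambda r: r["AD1"] + r["AD2"])

first = manager_b.on_report("AD1", 3)    # Step 3: report from Agent B
second = manager_b.on_report("AD2", 4)   # Step 3: report from Agent C
```

Here `first` is None because AD2 has not yet arrived, and `second` carries the fused CD0 value that would be returned to Manager A in (Step 4).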

8. Logical Data Model

This section enumerates the different kinds of information present in an asynchronously-managed network and describes how this information should be communicated in the context of an ADM.

8.1. Data Representation

8.1.1. Types

The AMA notionally supports four basic types of information: Values, Controls, Literals, and Operators:

Values
Values consist of information collected by an Agent and reported to Managers. This includes definitions from an ADM, derived values as configured by Managers, and reports, which are collections of data elements.
Controls
Controls consist of actions that may be invoked on Agents and Managers to change behavior in response to some external event (such as local state changes or time). Controls include application-specific functions specified as part of an ADM and macros which are collections of these controls.
Literals
Literals are constant numerical values that may be used in the evaluation of expressions and predicates.
Operators
Operators are those mathematical functions that operate on series of Values and Literals, such as addition, subtraction, multiplication, and division.

8.1.2. Classification

The AMA notionally defines three data classifications that describe the origins and multiplicity of data types within the system. These classifications are atomic, computed, and collection.

Atomic

The Atomic classification contains those data types that are directly specified by an ADM and not otherwise derived from other identified data. Atomic data is, therefore, not subject to change. Atomic component identifiers MUST be unique and SHOULD be managed by a registration authority, perhaps similar to the mechanisms used to assign Object Identifiers (OIDs).
Computed

The Computed classification contains those data types whose definition is computed from some operation applied to one or more pieces of data. Computed data items may be formally specified in an ADM (and therefore not subject to change) or may be defined by users/operators within a system and therefore subject to change in accordance with configuration services. Specifically, the definition of a computed component MAY be dynamically defined by Managers and communicated to one or more Agents in a network. The definition of a computed component may include other computed components or other atomic components. The identifier of a computed component is not guaranteed to be universally unique but MUST be unique within the context of a particular network or internetwork.
Collection

The Collection classification represents a set of information, including atomic data, computed data, and other collections.

8.2. Data Model

Each component of the DTNMP data model can be identified as a combination of type and classification, as illustrated in Table 2. In this table, type/classification combinations that are unsupported are listed as N/A. Specifically, DTNMP does not support user-defined controls, literals, or operators; any value that specifies action on an agent MUST be pre-configured as part of an ADM.

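 +------------+-----------------+---------+----------+----------+
 |            | Data            | Action  | Literals | Operator |
 +------------+-----------------+---------+----------+----------+
 | Atomic     | Primitive Value | Control | Literal  | Operator |
 | Computed   | Computed Value  | Rule    | N/A      | N/A      |
 | Collection | Report          | Macro   | N/A      | N/A      |
 +------------+-----------------+---------+----------+----------+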
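As an illustration only, the combinations in the table above can be captured as a lookup keyed by classification and type. The names DATA_MODEL and element_for are hypothetical; the entries are copied from the table, with unsupported (N/A) combinations simply absent.

```python
# Keys are (classification, type); values are the element names from the table.
DATA_MODEL = {
    ("Atomic", "Data"): "Primitive Value",
    ("Atomic", "Action"): "Control",
    ("Atomic", "Literals"): "Literal",
    ("Atomic", "Operator"): "Operator",
    ("Computed", "Data"): "Computed Value",
    ("Computed", "Action"): "Rule",
    ("Collection", "Data"): "Report",
    ("Collection", "Action"): "Macro",
}


def element_for(classification, kind):
    """Return the element name, or None for an unsupported (N/A) combination."""
    return DATA_MODEL.get((classification, kind))


report = element_for("Collection", "Data")
unsupported = element_for("Computed", "Literals")
```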

The eight elements of the AMA logical data model are described as follows.

8.2.1. Primitive Values, Computed Values, and Reports

Fundamental to any performance reporting function is the ability to measure state on the Agent. Measurement may be accomplished through direct sampling of hardware, query against in-situ data stores, or other mechanisms that provide the initial quantification of state.

Primitive values serve as the "lingua franca" of the management system: the unit of information that cannot be otherwise created. As such, this information serves as the basis for any user-defined (computed) values in the system.

AMPs MAY consider the confidence of a primitive value as a function of time, for example, to determine the point at which a measurement should be considered stale and be re-measured before acting on the associated data. One approach to mitigating this concern is to measure values on demand. Another approach is to populate a centralized data store of values and refresh that data store according to some pre-defined period.
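One way to picture time-based confidence is a cached value that is re-measured once it exceeds a maximum age. The sketch below is an assumption-laden illustration: CachedValue, max_age_s, and the injected clock are hypothetical names, and a real AMP may define staleness differently.

```python
import time


class CachedValue:
    """Holds a primitive value and re-measures it once it becomes stale."""

    def __init__(self, sample, max_age_s, clock=time.monotonic):
        self.sample = sample          # callable performing the measurement
        self.max_age_s = max_age_s    # age beyond which the value is stale
        self.clock = clock            # injectable for testing
        self._value = None
        self._taken_at = None

    def get(self):
        now = self.clock()
        if self._taken_at is None or now - self._taken_at > self.max_age_s:
            self._value = self.sample()
            self._taken_at = now
        return self._value


# Simulated clock and sensor so the example is deterministic.
ticks = iter([0.0, 1.0, 10.0])
readings = iter([20.5, 21.0])
temperature = CachedValue(lambda: next(readings), max_age_s=5.0,
                          clock=lambda: next(ticks))

a = temperature.get()   # t=0: no value yet, so the sensor is sampled
b = temperature.get()   # t=1: value is 1 s old, still fresh
c = temperature.get()   # t=10: value is 10 s old, so it is re-measured
```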

While primitives provide the full, raw set of information available to Managers and Agents, pre-computing frequently reused combinations of these values is a performance optimization. Computing new values as a function of measured values simplifies operator specifications and prevents Agent implementations from re-calculating the same value each time it is used in a given time period.

For example, consider a sensor node which wishes to report a temperature averaged over the past 10 measurements. An Agent may either transmit all 10 measurements to a Manager, or calculate locally the average measurement and transmit the "fused" data. Clearly, the decision to reduce data volume is highly coupled to the nature of the science and the resources of the network. For this reason, the ability to define custom computations per deployment is necessary.
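The averaging example above can be sketched as a computed value maintained on the Agent. RollingAverage is a hypothetical name, and the window of 10 follows the example in the text.

```python
from collections import deque


class RollingAverage:
    """Computed value: mean of the most recent `window` measurements."""

    def __init__(self, window=10):
        # A bounded deque silently discards the oldest sample when full.
        self.samples = deque(maxlen=window)

    def add(self, measurement):
        self.samples.append(measurement)

    def value(self):
        if not self.samples:
            return None
        return sum(self.samples) / len(self.samples)


avg_temp = RollingAverage(window=10)
for measurement in range(1, 13):   # twelve measurements: 1, 2, ..., 12
    avg_temp.add(measurement)

fused = avg_temp.value()           # mean of the last ten: 3 through 12
```

Transmitting only `fused` rather than the ten underlying samples is the data-volume reduction the paragraph describes.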

Periodically, or in accordance with local state changes, Agents must collect a series of measured values and computed values and communicate them back to Managers. This ordered collection of value information is noted in this architecture as a "report". In support of hierarchical definitions, reports may, themselves, contain other reports. It would be incumbent on an AMP implementation to guard against circular reference in report definitions.
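A guard against circular report definitions could be implemented as a depth-first search over report membership. The function below is one possible sketch; the dictionary layout (report name mapped to a list of member names, with names not in the dictionary treated as plain values) is an assumption made for the example.

```python
def has_cycle(reports, start):
    """Return True if the report `start` directly or indirectly contains itself."""
    visiting = set()   # reports on the current search path
    done = set()       # reports already fully explored

    def visit(name):
        if name not in reports or name in done:
            return False           # plain value, or already checked
        if name in visiting:
            return True            # came back to a report on the path
        visiting.add(name)
        found = any(visit(member) for member in reports[name])
        visiting.discard(name)
        done.add(name)
        return found

    return visit(start)


reports = {
    "R1": ["AD1", "R2"],   # hierarchical but acyclic
    "R2": ["CD0"],
    "R3": ["R4"],          # R3 and R4 reference each other
    "R4": ["R3"],
}

ok = has_cycle(reports, "R1")
bad = has_cycle(reports, "R3")
```

An Agent could run such a check when a report definition is received and reject the definition if a cycle is found.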

8.2.2. Controls and Macros

Just as traditional network management approaches provide well-known identifiers for values, the AMA must provide a similar service for taking action on a node. Whereas low-latency, high-availability networks can use mechanisms such as remote procedure calls (RPCs), challenged networks cannot provide a similar function: Managers cannot be in the processing loop of an Agent when the Agent is not in communication with the Manager.

Controls in a system combine a well-known operation that can be taken by an Agent with any parameters that are necessary for the proper execution of that function. For specific applications or protocols, a control specification (as a series of opcodes) can be published such that any implementing AMP accepts these opcodes and understands that sending the opcodes to an Agent supporting the application or protocol will properly execute the associated function. Parameters to such functions are either provided in real time by Managers requesting that a control be run, pre-configured, or auto-populated by the Agent in-situ.

Often, a series of controls must be executed in concert to achieve a particular function, especially when controls represent more primitive operations for a particular application/protocol. In such scenarios, an ordered collection of controls can be specified as a "macro". In support of the hierarchical build-up of functionality, macros may, themselves, contain other macros, though it would be incumbent on an AMP implementation to guard against excessive recursion or other resource-intensive nesting.
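Macro expansion with a nesting limit might be sketched as follows. The function name, the max_depth parameter, and the control names are hypothetical; the depth limit is just one simple way to bound recursion.

```python
def expand_macro(macros, name, max_depth=8, _depth=0):
    """Flatten a macro into its ordered list of controls.

    `macros` maps a macro name to an ordered list of member names; any name
    not present in `macros` is treated as a control.
    """
    if _depth > max_depth:
        raise RecursionError(f"macro nesting exceeds {max_depth} levels")
    if name not in macros:
        return [name]
    controls = []
    for member in macros[name]:
        controls.extend(expand_macro(macros, member, max_depth, _depth + 1))
    return controls


macros = {
    "SAFE_MODE": ["STOP_TX", "LOW_POWER"],   # a macro containing a macro
    "LOW_POWER": ["DIM_LEDS", "SLEEP_RADIO"],
}

sequence = expand_macro(macros, "SAFE_MODE")
```

A self-referencing macro, such as one defined as {"LOOP": ["LOOP"]}, exhausts the depth budget and raises RecursionError instead of running without bound.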

8.2.3. Rules

Stimulus-response autonomy systems provide a way to pre-configure responses to anticipated events. Such a mapping from events to responses is advantageous in a challenged network for a variety of reasons, as listed below.

  • Distributed Operation - The concept of pre-configuration allows the Agent to operate without regular contact with Managers in the system. Configuration opportunities will be sporadic in any challenged network making bootstrapping of the system difficult, but this is a fundamental problem in any network scenario and any autonomy approach.
  • Deterministic Behavior - Where the mapping of stimulus to response is stable, the behavior of the Agent across a variety of in-situ states also remains stable. This stable behavior is necessary in critical operational systems where the actions of a platform must be well understood even in the absence of an operator in the loop.
  • Engine-Based Behavior - Several operational systems are unable to deploy "mobile code" based solutions due to network bandwidth, memory or processor loading, or security concerns. The benefit of engine-based approaches is that the configuration inputs to the engine can be flexible without incurring a set of problematic requirements or concerns.

The logical unit of stimulus-response autonomy proposed in the AMA is a "rule" of the form:
IF stimulus THEN response
Where the set of such rules, when evaluated in some prioritized sequence, provides the full set of autonomous behavior for an Agent. Stimulus in such a system would be a function of relative time, absolute time, or some mathematical expression comprising one or more values (measurement values or computed values).

Notably, in such a system, stimuli and responses from multiple applications and protocols may be combined to provide an expressive capability.
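A prioritized evaluation of "IF stimulus THEN response" rules could be sketched as below. The Rule fields, the convention that a lower number means higher priority, and the state keys are all assumptions for illustration; responses are represented as control names rather than executed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    priority: int                       # assumption: lower runs first
    stimulus: Callable[[dict], bool]    # predicate over Agent state
    response: str                       # name of a control or macro


def fire_rules(rules, state):
    """Return the responses of all rules whose stimulus holds, by priority."""
    ordered = sorted(rules, key=lambda rule: rule.priority)
    return [rule.response for rule in ordered if rule.stimulus(state)]


rules = [
    Rule(2, lambda s: s["queue_depth"] > 100, "PURGE_EXPIRED"),
    Rule(1, lambda s: s["battery_pct"] < 20, "SAFE_MODE"),
    Rule(3, lambda s: s["uptime_s"] % 3600 == 0, "SEND_REPORT"),
]

state = {"queue_depth": 150, "battery_pct": 15, "uptime_s": 4000}
responses = fire_rules(rules, state)
```

Because the rule set is plain configuration data evaluated by a fixed engine, it can be updated without delivering new code to the Agent.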

8.2.4. Operators and Literals

Computing values and evaluating the expressions that comprise a stimulus in a rule both require applying mathematical operations to data known to the management system.

Operators in the AMA represent enumerated mathematical operations applied to primitive and computed data in the AMA for the purpose of creating new values. Operations may be simple binary operations such as "A + B" or more complex functions such as sin(A) or avg(A,B,C,D).

Literals represent pre-configured constants in the AMA, such as well-known mathematical numbers (e.g., PI, E), or other useful data such as Epoch times. Literals also represent asserted primitive values used in expressions. For example, considering the expression (A = B + 10), A would be a computed value, B would be either a computed value or a primitive value, + would be an operator, and 10 would be a literal.
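The roles of operators and literals in an expression such as (A = B + 10) can be illustrated with a small evaluator. The tuple-based expression encoding and the names OPERATORS, LITERALS, and evaluate_expr are assumptions for this sketch, not an encoding defined by the AMA.

```python
import math
import operator

# Enumerated operators: a fixed table, not user-definable.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "sin": math.sin,
    "avg": lambda *xs: sum(xs) / len(xs),
}

# Pre-configured named literals.
LITERALS = {"PI": math.pi, "E": math.e}


def evaluate_expr(expr, values):
    """Evaluate a number, a name, or a tuple of the form (operator, arg, ...)."""
    if isinstance(expr, (int, float)):
        return expr                      # asserted literal such as 10
    if isinstance(expr, str):
        if expr in LITERALS:
            return LITERALS[expr]        # named literal such as PI
        return values[expr]              # primitive or computed value
    op, *args = expr
    return OPERATORS[op](*(evaluate_expr(arg, values) for arg in args))


values = {"B": 5}
a = evaluate_expr(("+", "B", 10), values)       # A = B + 10
mean = evaluate_expr(("avg", 1, 2, 3, 4), values)
```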

8.3. Application Data Model

Application data models (ADMs) specify the data associated with a particular application/protocol. The purpose of the ADM is to provide a guaranteed interface for the management of an application or protocol independent of the nuances of its software implementation. In this respect, the ADM is conceptually similar to the Management Information Base (MIB) used by SNMP, but contains additional information relating to command opcodes and more expressive syntax for automated behavior.

Within the AMA, an ADM MUST define all well-known items necessary to manage the specific application or protocol. This includes the definitions of primitive values, computed values, reports, controls, macros, rules, literals, and operators.

9. IANA Considerations

At this time, this protocol has no fields registered by IANA.

10. Security Considerations

Security within an AMA MUST exist in two layers: transport-layer security and access control.

Transport-layer security addresses the questions of authentication, integrity, and confidentiality associated with the transport of messages between and amongst Managers and Agents in the AMA. This security is applied before any particular Actor in the system receives data and, therefore, is outside of the scope of this document.

Finer-grained application security is provided by ACLs, which are defined via configuration messages and are implementation-specific.

11. Informative References

[BIRRANE1] Birrane, E. and R. Cole, "Management of Disruption-Tolerant Networks: A Systems Engineering Approach", 2010.
[BIRRANE2] Birrane, E., Burleigh, S. and V. Cerf, "Defining Tolerance: Impacts of Delay and Disruption when Managing Challenged Networks", 2011.
[BIRRANE3] Birrane, E. and H. Kruse, "Delay-Tolerant Network Management: The Definition and Exchange of Infrastructure Information in High Delay Environments", 2011.
[I-D.irtf-dtnrg-dtnmp] Birrane, E. and V. Ramachandran, "Delay Tolerant Network Management Protocol", Internet-Draft draft-irtf-dtnrg-dtnmp-01, December 2014.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.
[RFC3416] Presuhn, R., "Version 2 of the Protocol Operations for the Simple Network Management Protocol (SNMP)", STD 62, RFC 3416, DOI 10.17487/RFC3416, December 2002.
[RFC4838] Cerf, V., Burleigh, S., Hooke, A., Torgerson, L., Durst, R., Scott, K., Fall, K. and H. Weiss, "Delay-Tolerant Networking Architecture", RFC 4838, April 2007.
[RFC6241] Enns, R., Bjorklund, M., Schoenwaelder, J. and A. Bierman, "Network Configuration Protocol (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011.

Author's Address

Edward J. Birrane
Johns Hopkins Applied Physics Laboratory
EMail: Edward.Birrane@jhuapl.edu