OPSA                                                             F. Zhao
Internet-Draft                                                    Huawei
Intended status: Informational                            21 August 2023
Expires: 22 February 2024

                   Near Data Processing for Telemetry


   As the scale of IP networks and the importance of the services they
   carry continue to grow, the volume of data collected by telemetry is
   increasing exponentially.  Digital Twin technology for data
   communication networks is increasingly discussed as a way to support
   network simulation, traffic optimization, and risk detection.
   Real-time synchronization between controllers and devices is a
   critical feature of a Digital Twin Network, and it requires a much
   larger volume of telemetry data to be exchanged between controllers
   and devices.  All of this puts more pressure on network bandwidth,
   device CPU, and controller CPU.  This document proposes a method to
   optimize the

1.  Introduction

   To implement more accurate, reliable, and timely network control,
   more and more telemetry data is collected from network devices and
   sent to controllers.  Looking ahead, the proposal and application of
   digital twin networks place higher requirements on real-time
   consistency between network devices and controllers.

   Meanwhile, with the wide deployment of heterogeneous computing
   hardware and AI technologies, network devices will have more and
   more computing resources.  We can consider migrating the analysis,
   data association, and even closed-loop decision-making processes
   currently implemented by controllers to the network element (NE)
   side to make full use of distributed computing

   Similar frameworks are already mature in the automotive self-driving
   field, where bird's-eye view (BEV) technologies are used to merge
   multi-dimensional data (such as camera, radar, and position data)
   and make decisions locally on the vehicle.

2.  Traditional telemetry is widely applied

   Network devices collect information from multiple dimensions,
   including flow information, configuration, events, alarms, logs,
   dynamic topology and routes, and device status (including CPU,
   memory, and hardware health).  To implement accurate network
   management, a large amount of this data is sent to the controller
   for analysis and processing.  This poses great challenges to the
   controller in terms of bandwidth and CPU resources.

   Many methods have been proposed to mitigate this problem, such as
   sketches, variable-frequency sampling, and data compression.
   However, it is difficult for these techniques to achieve a good
   balance between the integrity and the amount of data
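
   As an illustration of the trade-off described above, the following
   is a minimal sketch of one well-known sketch structure, the
   Count-Min sketch, written in Python.  It is not taken from this
   document: the class name, the table dimensions, and the flow keys
   are assumptions chosen for illustration.  The structure keeps
   per-flow byte counts in a fixed amount of memory, at the cost of
   possibly over-estimating a counter when different flows collide.

```python
import hashlib


class CountMinSketch:
    """Approximate per-key counters in fixed memory.

    Estimates can be too high (hash collisions) but never too low.
    """

    def __init__(self, width: int = 64, depth: int = 4) -> None:
        # width and depth are illustrative defaults, not recommended values.
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row: int, key: str) -> int:
        # Salt the key with the row number to get one hash function per row.
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, key: str, count: int = 1) -> None:
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += count

    def estimate(self, key: str) -> int:
        # The smallest counter across rows is the one least affected by collisions.
        return min(
            self.rows[row][self._index(row, key)] for row in range(self.depth)
        )


# Hypothetical flow records (flow key, bytes), for illustration only.
records = [
    ("10.0.0.1->10.0.0.2", 1500),
    ("10.0.0.3->10.0.0.4", 64),
    ("10.0.0.1->10.0.0.2", 1500),
]

sketch = CountMinSketch()
exact = {}
for flow, size in records:
    sketch.add(flow, size)
    exact[flow] = exact.get(flow, 0) + size

heavy = "10.0.0.1->10.0.0.2"
estimated_bytes = sketch.estimate(heavy)
print(f"exact={exact[heavy]} estimated={estimated_bytes}")
```

   The memory used is fixed by width and depth regardless of how many
   flows are observed, which reduces the amount of data to export, but
   the per-flow detail of the original records is lost.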

3.  Near data processing and distributed computing method

   We consider a near-data, distributed computing scheme that makes
   full use of the computing power of network devices to realize
   real-time sensing and control of network status.  It is also a
   foundation for realizing the future digital twin network.  The
   following figure shows a framework for addressing the problem:

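                                    +--------------+
                                    |              |
                                    |  Controller  |
                                    |              |
                                    +------^-------+
                                           | result
                +--------------------------|--------------+
                |                    +----------------+   |
                | Flow Data     ---->|                |   |
                |                    |                |   |
                | Event         ---->|  NDP Operation |   |
                |                    |                |   |
                | Configuration ---->|                |   |
                |                    +----------------+   |
                +-----------------------------------------+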

   Device side: The CPU and AI computing capabilities of devices are
   used to aggregate and process raw device data.  For example, flow
   data, events, and network topology can be analyzed together to
   determine the root cause of route flapping.  Only the results are
   then sent to the controller.  In this way, the original

   data and result data can be sent.  How to define and design a new
   result-based telemetry model needs further consideration in the
   future.
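
   The device-side processing described in this section can be
   sketched as follows.  This is a minimal illustration under stated
   assumptions, not a definition of the NDP operation: the Event
   record, the event kinds, the flap threshold, the time window, the
   prefix-to-interface mapping, and the shape of the result are all
   hypothetical.  The sketch correlates route events with link events
   on the device and returns only a compact result for the controller.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float  # seconds since an arbitrary start
    kind: str         # hypothetical kinds: "route_withdraw", "route_announce", "link_down"
    subject: str      # prefix for route events, interface name for link events


def ndp_route_flap_analysis(events, prefix_to_interface,
                            flap_threshold=3, window=60.0):
    """Correlate route and link events locally and return compact results.

    flap_threshold and window are illustrative values, not recommendations.
    """
    if not events:
        return []

    # Only consider events inside the most recent time window.
    latest = max(e.timestamp for e in events)
    recent = [e for e in events if latest - e.timestamp <= window]

    route_changes = Counter(
        e.subject for e in recent
        if e.kind in ("route_withdraw", "route_announce")
    )
    link_downs = Counter(e.subject for e in recent if e.kind == "link_down")

    results = []
    for prefix, changes in sorted(route_changes.items()):
        if changes < flap_threshold:
            continue  # not flapping under the assumed threshold
        interface = prefix_to_interface.get(prefix)
        if interface is not None and link_downs[interface] > 0:
            cause = f"link instability on {interface}"
        else:
            cause = "unknown"
        results.append({
            "prefix": prefix,
            "route_changes": changes,
            "suspected_cause": cause,
        })
    return results


# Hypothetical raw data held on the device.
events = [
    Event(0.0, "link_down", "eth0"),
    Event(1.0, "route_withdraw", "192.0.2.0/24"),
    Event(5.0, "route_announce", "192.0.2.0/24"),
    Event(9.0, "link_down", "eth0"),
    Event(10.0, "route_withdraw", "192.0.2.0/24"),
    Event(20.0, "route_announce", "198.51.100.0/24"),
]
prefix_to_interface = {"192.0.2.0/24": "eth0"}

results = ndp_route_flap_analysis(events, prefix_to_interface)
print(results)
```

   In this example six raw events are reduced to a single result
   record, which is the only data the device would need to send to the
   controller for this analysis.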

4.  Summary

   This draft describes a lightweight and efficient method to
   synchronize data between devices and the controller in real time.
   Going forward, how to describe the computing model on the device and
   the interface needs to be studied in order to extend the draft.

