OPSAWG                                                      H. Song, Ed.
Internet-Draft                                                   J. Gong
Intended status: Informational                                   H. Chen
Expires: December 17, 2017                  Huawei Technologies Co., Ltd
                                                           June 15, 2017


     Requirements for Interactive Query with Dynamic Network Probes
                       draft-song-opsa-dnp4iq-00

Abstract

   This document discusses the motivation and requirements for
   supporting interactive network queries and data collection through a
   mechanism called Dynamic Network Probes (DNP).  Network applications
   and OAM have various data requirements from the data plane.  The
   unpredictable and interactive nature of network data analytics
   queries calls for dynamic, on-demand data collection capabilities.
   As the user-programmable data plane becomes a reality, it can be
   enhanced to support interactive queries through DNPs.  DNPs support
   node-based, path-based, and flow-based data preprocessing and
   collection.  For example, in-situ OAM (iOAM) with user-defined
   flow-based data collection can be programmed and configured through
   DNPs.  DNPs serve as a building block of an integrated network data
   telemetry and analytics platform which involves the network data
   plane as an active component for user-defined data collection and
   preparation.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 17, 2017.







Song, et al.            Expires December 17, 2017               [Page 1]

Internet-Draft          IQ with DNP Requirements               June 2017


Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Motivation for Interactive Query with DNP
   3.  Use Cases
     3.1.  In-Situ OAM with User Defined Data Collection
     3.2.  DDoS Detection
     3.3.  Elephant Flow Identification
     3.4.  Network Congestion Monitoring
   4.  Enabling Technologies for DNP
   5.  Dynamic Network Probes
     5.1.  DNP Types
       5.1.1.  Node Based
       5.1.2.  Path Based
       5.1.3.  Flow Based
   6.  Interactive Query Architecture
   7.  Requirements for IQ with DNP
   8.  Considerations for IQ with DNP
     8.1.  Technical Challenges
     8.2.  Standard Consideration
   9.  Security Considerations
   10. IANA Considerations
   11. Acknowledgments
   12. Informative References
   Authors' Addresses

1.  Introduction

   Network service providers' pain points are often due to a lack of
   network visibility.  For example, network congestion collapse could
   be avoided in many cases if it were known exactly when and where
   congestion is happening or, even better, if it could be precisely
   predicted well before any impact is made; sophisticated network
   attacks could be prevented through stateful and distributed network
   behavior analysis.

   In order to provide better application-centric services, user flows
   and their interaction with networks need to be tracked and
   understood.

   The emerging trend of network automation aims to keep people out of
   the OAM and control loop to the greatest extent for automated health
   prediction, fault recovery, demand planning, network optimization,
   and intrusion prevention, based on big data analytics and machine
   learning technologies.

   These applications need all kinds of network data, either passing
   through networks or generated by network devices.  For such
   applications to be effective, the data of interest needs to be
   retrieved in real time and on demand in an interactive and iterative
   fashion.  Continuous streaming data is often required.  Therefore, it
   is valuable to build a unified and general-purpose network telemetry
   and analytics platform with integrated data plane support to provide
   complete network visibility at minimum data bandwidth.  This is in
   contrast to piecemeal solutions that each deal with only a single
   problem at a time.

   We propose two ideas to enable such a vision.  First, we devise the
   Dynamic Network Probe (DNP) as a flexible and dynamic means for data
   plane data collection and preprocessing, which can prepare data for
   data analytics applications (Note that most of the DNPs are so common
   that it makes perfect sense to predefine the standard data models for
   them such that the conventional data plane devices can still be
   designed and configured to support them).  Second, we show the
   possibility to build a universal network telemetry and analytics
   platform with an Interactive Query (IQ) interface to the data plane
   which can compile and deploy DNPs at runtime (or configure DNPs
   dynamically based on standard data models).  In such a system,
   network devices play an integral and active role.  We show a layered
   architecture based on a programmable data plane which supports
   interactive queries on network data.

   In this document, we discuss requirements, use cases, work items,
   and challenges, in the hope of stimulating community interest in
   developing the corresponding technologies and standards.

2.  Motivation for Interactive Query with DNP

   Network applications, such as traffic engineering, network security,
   network health monitoring, troubleshooting, and fault diagnosis,
   require different types of data collection.  The data are either
   normal traffic packets that are filtered, sampled, or digested, or
   metadata generated by network devices to convey network states and
   status.  Broadly speaking, there are three types of data to be
   collected from the network data plane: path-based, flow-based, and
   node-based.  Path-based data is usually collected through dedicated
   probing packets (e.g., ping and traceroute); flow-based data
   collection designates user flows to carry data of interest (e.g.,
   in-situ OAM [I-D.brockners-inband-oam-requirements]); node-based
   data is directly retrieved from selected network devices (e.g.,
   IPFIX [RFC7011]).

   Some data is considered atomic or primitive.  For example, a packet's
   arrival timestamp at a particular node cannot be further decomposed.
   Atomic data can be used to generate synthetic and combinational
   data.  For example, a packet's latency on a path can be calculated
   from the packet's timestamps at the two ends of the path.  Depending
   on the application, either kind of data may be required.  If the
   application's real intent is the latter, it makes sense to provide
   such data directly to reduce the data transfer bandwidth, at the cost
   of a small processing overhead in the data plane and/or control
   plane.  Some synthetic and combinational data can be acquired through
   multiple data types, but the most efficient way is preferred for a
   specific network.  Similarly, to reduce data traffic, applications
   may not need the "raw" data all the time.  Instead, they may want
   data that is sampled and filtered, or data that is reported only
   when some predefined condition is met.  In any case, applications'
   data requirements are diverse and unpredictable.  Applications may
   need some data that is not readily available at the time of request.

   Some applications are interactive or iterative.  After analyzing the
   initial data, these applications may quickly shift interests to new
   data or need to keep refining the data to be collected based on
   previous observations (e.g., an elephant flow detector continues to
   narrow down the flow granularity and gather statistics).  The control
   loop algorithms of these applications continuously interact with the
   data plane and modify the data source and content in a highly dynamic
   manner.

   Ideally, to support all potential applications, we need full
   visibility to know any state, anytime and anywhere, in the entire
   network data plane.  In reality, this is extremely difficult if not
   impossible.  A strawman option is to mirror all the raw traffic to
   servers where the data analytics engine is running.  This brute-force
   method requires doubling the device port count and the traffic
   bandwidth, and imposes enormous computing and storage costs.  As a
   tradeoff, a Test Access Port (TAP) or Switch Port Analyzer (SPAN) is
   used to selectively mirror only a portion of the overall traffic.
   A Network Packet Broker (NPB) is deployed along with the TAP or SPAN
   to process and distribute the raw data to various data analytics
   tools.  There are some other solutions (e.g., sFlow [RFC3176] and
   IPFIX [RFC7011]) which can provide sampled and digested packet data
   and some traffic statistics.  Meanwhile, network devices also
   generate various log files to record miscellaneous events in the
   system.

   When aggregating all these solutions together, we can gain a
   relatively comprehensive view of the network.  However, the main
   problem is the lack of a unified platform to deal with the general
   network telemetry problem and the highly dynamic and unpredictable
   data requirements.  Moreover, each piecemeal solution inevitably
   loses information due to data plane resource limitations which makes
   the data analytical results suboptimal.

   Trying to design an omnipotent system to support all possible runtime
   data requests is also unviable because the resources required are
   prohibitive (e.g., even a simple counter per flow is impossible in
   practice).  An alternative is to reprogram or reconfigure the data
   plane device whenever an unsupported data request appears.  This is
   possible thanks to the recently available programmable chips and the
   trend to open the programmability to service providers.
   Unfortunately, the static programming approach cannot meet the real
   time requirements due to the latency incurred by the programming and
   compiling process.  The reprogramming process also risks breaking the
   normal operation of network devices.

   The viable solution that remains is this: whenever applications
   request data that is not yet available in the data plane, the data
   plane can be configured in real time to return the requested data.
   That is, we do not attempt to make the network data plane provide
   all data all the time.  Instead, we only need to ensure that any
   application can acquire the necessary data instantly whenever it
   actually needs it.  This data-on-demand model can effectively
   support omni network visibility.  Note that data collection is meant
   to be passive and should not change the network forwarding behavior.
   Active modification of forwarding behavior is out of the scope of
   this draft.

   Data can be customized dynamically and polled or pushed based on
   the application's request.  Moderate data preprocessing and
   preparation by data plane devices may be needed.  Such "in-network"
   processing capability can be realized through DNP.

3.  Use Cases

3.1.  In-Situ OAM with User Defined Data Collection

   In-situ OAM [I-D.brockners-inband-oam-requirements] collects data on
   user traffic's forwarding path.  From the control and management
   plane point of view, each data collection task is a query from the
   OAM application.  In case the data collection function is not
   hard-coded in network devices, DNPs can be dynamically deployed to
   support in-situ OAM.

   While the current in-situ OAM drafts only concern the data plane
   packet format and use cases, the applications still need a control
   and management interface to dynamically enable and disable the in-
   situ OAM functions, which involves the tasks such as choosing the
   source and destination nodes on the path, the flow to carry the OAM
   data, and the way to handle the data at the path end.  These
   configuration tasks can be done through DNP.

   More importantly, in-situ OAM [I-D.brockners-inband-oam-data] may
   collect user-defined data which are not available at device
   configuration time.  In this case, the data can be defined by DNP.
   DNP can further help to preprocess the data before sending it to the
   subscribing application.  This can help to reduce the OAM header
   size and the application's workload.

3.2.  DDoS Detection

   In a data center, a security application wants to find the servers
   under possible DDoS attack, indicated by a suspiciously large number
   of connections.  It can deploy DNPs on all the portal switches to
   periodically report the number of unique flows targeting the set of
   protected servers.  Once the queried data are collected, it is
   easy to aggregate the data to find potential DDoS attacks.
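
   The aggregation step can be sketched as follows.  This is only an
   illustration under stated assumptions, not a defined DNP interface:
   the report format (one mapping of server address to unique-flow
   count per portal switch), the function name find_ddos_suspects, and
   the threshold value are all hypothetical.  The sketch also assumes
   that each flow enters through exactly one portal switch, so that
   per-switch counts can simply be added.

```python
from collections import Counter


def find_ddos_suspects(reports, threshold):
    """Sum per-switch unique-flow counts and flag servers above a threshold.

    reports: iterable of dicts {server_ip: unique_flow_count}, one dict per
             portal switch (hypothetical report format).
    """
    totals = Counter()
    for report in reports:
        totals.update(report)  # adds the counts key by key
    return sorted(ip for ip, count in totals.items() if count > threshold)


# Example reports from two portal switches (made-up numbers).
reports = [
    {"10.0.0.1": 120, "10.0.0.2": 15},
    {"10.0.0.1": 950, "10.0.0.3": 40},
]
suspects = find_ddos_suspects(reports, threshold=1000)
```

   In practice the threshold would be tuned per server or derived from
   a baseline; a fixed value is used here only to keep the sketch short.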

3.3.  Elephant Flow Identification

   An application wants to query the network-wide top-n flows.  Various
   algorithms have been developed to detect local elephant flows at
   each network device.  These algorithms can be defined as DNPs.  A
   set of network devices is chosen to deploy the DNPs so that each
   will periodically report its local elephant flows.  The application
   will aggregate the data to find the global elephant flows.
   Elephant flow identification can be an interactive process.  The
   application may need to adjust the existing DNPs or deploy new DNPs
   to refine the detection results.

   In some cases, the local resources in a network device are not
   sufficient to monitor the entire flow space.  We can partition the
   flow space and configure each network device in a group with a DNP
   that tracks only a subset of the flows, assuming that each device
   can see all the flows.
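
   A minimal sketch of the partitioned variant is shown below.  The
   hash-based partition rule (owner_of), the byte-count metric, and the
   report format are assumptions chosen for illustration; the draft
   does not prescribe any of them.  Because each flow is tracked by
   exactly one device, the application can merge the local reports
   without double counting.

```python
import zlib
from collections import Counter


def owner_of(flow, num_devices):
    """Hypothetical partition rule: hash the flow ID to one device index."""
    return zlib.crc32(flow.encode()) % num_devices


def local_top_n(device_id, num_devices, traffic, n):
    """Local DNP: count bytes only for flows in this device's partition."""
    counts = Counter()
    for flow, nbytes in traffic:
        if owner_of(flow, num_devices) == device_id:
            counts[flow] += nbytes
    return counts.most_common(n)


def global_top_n(local_reports, n):
    """Application side: merge local reports into a network-wide top-n."""
    merged = Counter()
    for report in local_reports:
        for flow, nbytes in report:
            merged[flow] += nbytes
    return merged.most_common(n)


# Every device sees the same traffic, as assumed in the text.
traffic = [("a", 100), ("b", 5), ("c", 70), ("a", 50), ("d", 1)]
reports = [local_top_n(i, 2, traffic, n=2) for i in range(2)]
top2 = global_top_n(reports, n=2)
```

   As long as each device reports at least n local flows, the global
   top-n is contained in the union of the local reports, so the merge
   is exact for this partitioned case.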

3.4.  Network Congestion Monitoring

   Network congestion is reflected by packet drops at routers or
   switches.  While it is easy to get the packet drop count at each
   network device, it is difficult to gain insight into the victims,
   hot spots, and lossy paths.  We can deploy DNPs to acquire such
   information.  DNPs are deployed on all network devices to collect
   detailed information about each dropped packet, such as its
   signature and the port at which it was dropped.  Based on the
   collected data, the application can generate a report on the top
   victims, hot spots, and the most lossy paths.
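
   The report generation might look like the following sketch.  The
   drop record layout (node, port, flow signature) and the function
   name summarize_drops are assumptions made for this example.  Ranking
   lossy paths would additionally require path information for each
   flow, which is omitted here.

```python
from collections import Counter


def summarize_drops(drop_reports, top=3):
    """Rank victim flows and hot spots from per-packet drop records.

    drop_reports: iterable of (node, port, flow_signature) tuples, one
                  per dropped packet (hypothetical record layout).
    """
    victims = Counter()    # flow signature -> dropped packets
    hot_spots = Counter()  # (node, port)   -> dropped packets
    for node, port, flow in drop_reports:
        victims[flow] += 1
        hot_spots[(node, port)] += 1
    return victims.most_common(top), hot_spots.most_common(top)


drop_reports = [
    ("s1", 1, "f1"),
    ("s1", 1, "f2"),
    ("s2", 3, "f1"),
    ("s1", 1, "f1"),
]
top_victims, top_hot_spots = summarize_drops(drop_reports, top=1)
```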

4.  Enabling Technologies for DNP

   The network data plane is becoming user programmable.  This means
   that network operators are in control of customizing a network
   device's function and forwarding behavior.  Figure 1 shows the
   industry trend, which shapes new forms of network devices and
   inspires innovative ways to use them.

                  +-------+        +-------+           +-------+
                  |       |        |       |           |       |
                  |  NOS  |        |  APP  |           |  APP  |
                  |       |        |       |           |       |
                  +-------+        +-------+           +-------+
                      ^               ^                    ^
                      |               |                    | runtime
            decouple  |    ------>    | config time --->   | interactive
                      |               | programming        | programming
                      V               V                    V
                  +----------+     +-------------+  +-------------+
                  | box with |     | box with    |  | box with    |
                  | fixed    |     | programmable|  | interactive |
                  | function |     | chip        |  | programmable|
                  | ASIC     |     |             |  | chip        |
                  +----------+     +-------------+  +-------------+


              Figure 1: Towards User Programmable Data Plane

   The first trend is led by the OCP networking project, which advocates
   the decoupling of the network operating system and the network device
   hardware.  A common Switch Abstraction Interface (SAI) allows
   applications to run on heterogeneous substrate devices.  However,
   such devices are built with fixed-function ASICs, which provide
   limited flexibility for application customization.

   The second trend builds upon the first one yet makes a big leap.
   Chip and device vendors are working on opening the programmability of
   NPU, CPU, and FPGA-based network devices to network operators.
   Most recently, programmable ASICs have been proven feasible.
   High-level languages such as P4 [DOI_10.1145_2656877.2656890] have
   been developed to make network device programming easy and fast.
   Now a network device can be programmed into different functional
   boxes depending on the program installed.

   However, such a programming process is considered static.  Even a
   minor modification to the existing application requires recompiling
   the updated source code and reinstalling the application.  This
   incurs long deployment latency and may also temporarily break the
   normal data plane operation.

   The user-programmable data plane should be stretched further to
   support runtime interactive programming in order to extend its scope
   of usability, as proposed in POF [DOI_10.1145_2491185.2491190].
   Dynamic application requirements cannot be foreseen at design time,
   and runtime data plane modifications are required to be done in real
   time (for an agile control loop) and on demand (to meet data plane
   resource constraints).  Meanwhile, data plane devices are capable of
   doing more complex things, such as stateful processing, without
   always resorting to a controller for state tracking.  This allows
   network devices to offload a significant portion of the data
   processing task and hand off only the preprocessed data to the
   data-requesting applications.

   We can still use static programming with high-level languages such as
   P4 to define the main data plane processing and forwarding function.
   But at runtime, whenever an application needs to modify the data
   plane, we deploy the incremental modification directly through the
   runtime control channel.  The key to making this dynamic and
   interactive programming work is to maintain a unified interface to
   devices for both configuration and runtime control, because both
   programming paths share the same data plane abstraction and use the
   same back-end adapting and mapping method.

   NPU-based network devices and virtual network devices running on
   CPU/GPU can easily support static and runtime in-service data plane
   programmability.  ASIC and FPGA-based network devices may have
   difficulty supporting runtime programming and update natively.
   However, for telemetry data collection tasks, the device local
   controller (or even remote servers) can be used in conjunction with
   the forwarding chip to complete the data preprocessing and
   preparation.  After all, applications do not care how the data probes
   are implemented as long as the same API is maintained.

5.  Dynamic Network Probes

   Network probes are passive monitors which are installed at specific
   forwarding data path locations to process and collect specific data.
   DNPs are probes that applications dynamically deploy and revoke at
   runtime.  The customizable DNPs can collect simple statistics or
   conduct more complex data preprocessing.  Since DNPs may require
   actively modifying the existing data path pipeline beyond simple flow
   entry manipulation, these operations need to be done through an
   interactive programming process.  When a DNP is revoked, the shared
   resources involved are automatically recycled and returned to the
   global resource pool.

   DNPs can be deployed at various data path locations including port,
   queue, buffer, table, and table entry.  When the data plane
   programmability is extended to cover other components (e.g., CPU
   load, fan speed, GPS coordinates, etc.), DNPs can be deployed to
   collect the corresponding data as well.  A few data plane objects can
   be composed to form probes.  These objects are counter, meter,
   timer, timestamp, register, and table.  Combining these with the
   packet filter through flow table entry configuration, one can easily
   monitor and catch arbitrary states on the data plane.

   In practice, DNP can be considered a virtual concept.  Its deployment
   can be done through either configuration or programming.  For less
   flexible platforms, probes can be predefined but support on-demand
   runtime activation.  Complex DNP functions can also be achieved
   through collaboration between data plane and control plane.  Most
   common DNPs can be modeled for easy implementation.  The goal is to
   make DNP implementation transparent to upper layer applications.

   The simplest probe is just a counter.  The counter can be configured
   to count bytes or packets and the counting can be conditional.  The
   more complex probes can be considered as Finite State Machines (FSM)
   which are configured to capture specific events.  FSMs essentially
   preprocess the raw stream data and only report the necessary data to
   subscribing applications.

   Applications can use poll mode or push mode to access probes and
   collect data.  Normal counter probes are often accessed via poll
   mode.  Applications decide when and how often the counter value
   is read.  On the other hand, complex FSM probes are usually
   accessed in push mode.  When the target event is triggered, a report
   is generated and pushed to the application.

   Timer is a special global resource.  A timer can be configured to
   link to some action.  When the time is up, the corresponding action
   is executed.  For example, to get notification when a port load
   exceeds some threshold, we can set a timer with a fixed time-out
   interval, and link the timer to an action which reads the counter and
   generates the report packet if the condition is triggered.  This way,
   the application avoids the need to keep polling statistics from the
   data plane.
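
   The timer-driven port load example can be sketched as below.  The
   class and method names, the callback standing in for a report
   packet, and the use of bits per second as the load unit are
   assumptions for illustration only.  The time-out is invoked
   explicitly here; a real device would fire it from a hardware or
   software timer.

```python
class PortLoadProbe:
    """Timer-driven probe: on each time-out, read the byte counter and
    report only if the load over the last interval exceeds a threshold."""

    def __init__(self, interval_s, threshold_bps, report):
        self.interval_s = interval_s
        self.threshold_bps = threshold_bps
        self.report = report      # callback standing in for a report packet
        self.byte_counter = 0     # updated by the data path
        self._last_read = 0       # counter value at the previous time-out

    def on_packet(self, nbytes):
        self.byte_counter += nbytes

    def on_timeout(self):
        delta = self.byte_counter - self._last_read
        self._last_read = self.byte_counter
        load_bps = delta * 8 / self.interval_s
        if load_bps > self.threshold_bps:
            self.report(load_bps)


reports = []
probe = PortLoadProbe(interval_s=1.0, threshold_bps=1000,
                      report=reports.append)
probe.on_packet(100)   # 800 bit/s over the interval: below threshold
probe.on_timeout()
probe.on_packet(500)   # 4000 bit/s over the interval: above threshold
probe.on_timeout()
```

   Only the second interval produces a report, so the application
   receives one pushed message instead of polling the counter twice.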

   With the use of global registers and state tables, more complex FSM
   probes can be implemented.  For example, to monitor half-open TCP
   connections, for each SYN request, we store the flow signature in a
   state table.  Then for each ACK packet, the state table is checked
   and the matched entry is removed.  The state table can be
   periodically polled to acquire the list of half-open connections.
   The application can also choose to only retrieve the counter of half-
   open connections.  When the counter exceeds some threshold, further
   measures can be taken to examine if a SYN flood attack is going on.
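
   A minimal sketch of this half-open connection probe follows.  The
   4-tuple flow signature, the representation of TCP flags as a set of
   strings, and the class name are assumptions for illustration.  The
   sketch treats a pure SYN as a connection request and a pure ACK in
   the same direction as the completion of the handshake; the server's
   SYN-ACK travels in the reverse direction and is ignored.

```python
class HalfOpenProbe:
    """State-table probe that tracks half-open TCP connections."""

    def __init__(self):
        self.state_table = set()   # flow signatures awaiting the final ACK

    def on_packet(self, src, sport, dst, dport, flags):
        flow = (src, sport, dst, dport)
        if flags == {"SYN"}:
            self.state_table.add(flow)       # new connection request
        elif flags == {"ACK"}:
            self.state_table.discard(flow)   # handshake completed

    def half_open_count(self):
        return len(self.state_table)


probe = HalfOpenProbe()
probe.on_packet("c1", 1000, "s", 80, {"SYN"})
probe.on_packet("c2", 2000, "s", 80, {"SYN"})
probe.on_packet("s", 80, "c1", 1000, {"SYN", "ACK"})  # reverse direction
probe.on_packet("c1", 1000, "s", 80, {"ACK"})
```

   An application could poll state_table for the full list or read
   only half_open_count() and compare it against a threshold.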

   Registers can be considered mini state tables which are good to track
   a single flow and a few state transitions.  For example, to get the
   duration of a particular flow, when the flow is established, the
   state and the timestamp are recorded in a register; when the flow is
   torn down, the flow duration can be calculated with the old timestamp
   and the new timestamp.  In another example, we want to monitor a
   queue by setting a low water mark and a high water mark for the fill
   level.  Every time an enqueue or dequeue event happens, the queue
   depth is compared with the marks and a report packet is generated
   when a mark is crossed.
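
   The queue water mark example can be sketched as follows.  The class
   name, the callback standing in for a report packet, and the exact
   definition of "crossed" (rising above the high mark or falling
   below the low mark) are assumptions; the draft does not define
   these details.

```python
class QueueWatermarkProbe:
    """Register-style probe: report when the queue depth rises above the
    high water mark or falls below the low water mark."""

    def __init__(self, low, high, report):
        self.low, self.high = low, high
        self.report = report   # callback standing in for a report packet
        self.depth = 0         # current fill level kept in a register

    def _update(self, new_depth):
        old, self.depth = self.depth, new_depth
        if old <= self.high < new_depth:
            self.report(("high", new_depth))
        elif old >= self.low > new_depth:
            self.report(("low", new_depth))

    def on_enqueue(self):
        self._update(self.depth + 1)

    def on_dequeue(self):
        self._update(max(self.depth - 1, 0))


events = []
probe = QueueWatermarkProbe(low=1, high=3, report=events.append)
for _ in range(4):
    probe.on_enqueue()   # depth 1, 2, 3, 4: crosses the high mark at 4
for _ in range(4):
    probe.on_dequeue()   # depth 3, 2, 1, 0: crosses the low mark at 0
```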

   Some probes are essentially packet filters which are used to filter
   out a portion of the traffic and mirror that traffic to the
   application or some other target port for further processing.  There
   are two ways to implement a packet filter: use a flow table that
   matches on the filtering criteria and specify the associated action;
   or directly make a decision in the action.  An example of the former
   case is to filter all packets with a particular source IP address.
   An example of the latter case is to filter all TCP FIN packets at the
   edge.  Although we can always use a flow table to filter traffic,
   sometimes it is more efficient and convenient to directly work on the
   action.  As programmed by the application, the filtered traffic
   can be further processed before being sent.  The two most common
   processes are digest and sample, both aiming to reduce the quantity
   of raw data.  The digest process prunes the unnecessary data from the
   original packet and packs only the useful information in the digest
   packet.  The sample process picks a subset of the filtered traffic to
   send based on some predefined sampling criteria.  The two processes
   can be used jointly to maximize the data reduction effect.
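
   The combination of filter, sample, and digest can be sketched as
   follows.  Packets are modeled as plain dictionaries, and the factory
   name make_filter_probe, the field names, and the 1-in-N interval
   sampling rule are all assumptions for illustration.

```python
def make_filter_probe(match, digest_fields, sample_every):
    """Build a probe that filters, samples 1-in-N, then digests packets."""
    seen = 0   # number of packets that passed the filter so far

    def probe(packet):
        nonlocal seen
        if not match(packet):
            return None                 # filtered out
        seen += 1
        if seen % sample_every != 0:
            return None                 # not selected by the sampler
        # Digest: keep only the fields the application asked for.
        return {field: packet[field] for field in digest_fields}

    return probe


# Filter TCP FIN packets, sample every 2nd match, report src/dst only.
probe = make_filter_probe(
    match=lambda p: p["proto"] == "tcp" and "FIN" in p["flags"],
    digest_fields=("src", "dst"),
    sample_every=2,
)
packets = [
    {"src": "a", "dst": "x", "proto": "tcp", "flags": {"FIN"}},
    {"src": "b", "dst": "x", "proto": "udp", "flags": set()},
    {"src": "c", "dst": "x", "proto": "tcp", "flags": {"FIN", "ACK"}},
]
digests = [d for d in map(probe, packets) if d is not None]
```

   Probability-based sampling, which the node-based DNP list also
   mentions, could replace the modulo test with a random draw.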



   An application may need to install multiple DNPs in one device or
   across multiple devices to finish one data analytical task.  For
   example, to measure the latency of any link in a network, we install
   a DNP on the source node to generate probe packets with a timestamp.
   We install another DNP at the sink node to capture the probe packets
   and report both the source timestamp and the sink timestamp to the
   application for link latency calculation.  The probe packets are also
   dropped by the sink DNP.  The source DNP can be configured to
   generate probe packets at any rate.  It can also generate just one
   probe packet per application request.
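
   The cooperation between the source DNP, the sink DNP, and the
   application can be sketched as below.  The function names, the
   probe packet fields, and the use of integer microsecond timestamps
   are assumptions for illustration.  The sketch also assumes that the
   source and sink clocks are synchronized, which the latency
   calculation depends on.

```python
def source_dnp(probe_id, now_us):
    """Source node: generate a probe packet carrying the send timestamp."""
    return {"probe_id": probe_id, "src_ts": now_us}


def sink_dnp(packet, now_us):
    """Sink node: capture the probe packet and build the report.

    The probe packet is dropped here: only the report is returned and
    nothing is forwarded further.
    """
    return {
        "probe_id": packet["probe_id"],
        "src_ts": packet["src_ts"],
        "sink_ts": now_us,
    }


def link_latency_us(report):
    """Application side: latency is sink timestamp minus source timestamp."""
    return report["sink_ts"] - report["src_ts"]


packet = source_dnp(probe_id=1, now_us=1_000)
report = sink_dnp(packet, now_us=1_250)
latency = link_latency_us(report)
```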

   Using a similar idea, we can deploy DNPs to measure the end-to-end
   flow latency or trace exact flow paths.  In this case, the DNPs can
   be deployed to enable the corresponding iOAM in-situ data collection
   service.  At the path end, the DNP calculates the desired output
   based on the collected data.

   Applications could have many such custom data requests.  Each request
   lasts for a different length of time and consumes different network
   resources.  Dynamic probe configuration or programming is not only
   efficient but also necessary.  In summary, DNP is a versatile tool to
   prepare and generate just-in-time telemetry data for data analytical
   applications.

5.1.  DNP Types

   DNPs can be roughly grouped into three types: node-based, path-based,
   and flow-based.  The following is a list of DNPs.  Some are atomic
   and the others can be derived from the atomic ones.  Note that the
   list is by no means comprehensive.  The list does not include the
   device state and status data that is steadily available.  Depending
   on the device capability, more complex DNPs can be implemented.
   Applications can subscribe to data from multiple DNPs to meet their
   needs.  The flow-based data can be directly provided by iOAM data or
   derived from iOAM data.

5.1.1.  Node Based

   o  Streaming Packets

      *  Filter flow by user-defined flow definition.

      *  Sample with user-defined sample rate.  The sample can be based
         on interval or probability.

      *  Generate packet digest with user-defined format.

   o  Flow Counter

      *  Associate poll-mode counter for user-defined flow.

      *  Associate push-mode counter for user-defined flow.  The counter
         value is pushed at user-defined threshold or interval.

   o  Flow Meter

      *  Associate poll-mode meter for user-defined flow.

      *  Associate push-mode meter for user-defined flow.  The meter
         value is pushed at user-defined threshold or interval.

   o  Queue

      *  Queue depth for designated queue is polled or pushed at user-
         defined threshold or interval.

      *  Designated buffer depth is polled or pushed at user-defined
         threshold or interval.

   o  Time

      *  Time gap between user-defined flow packets is polled or pushed
         in streaming data or at user-defined threshold.

      *  Arrival/Departure/Sojourn time of user-defined flow packets is
         polled or pushed in streaming data or at user-defined
         threshold.

   o  Statistics

      *  Number of active flows, elephant flows, and mice flows.

5.1.2.  Path Based

   o  Number of active flows per node on the path.

   o  Path latency.

   o  Round trip time of the path.

   o  Node ID and ingress/egress port of the path.

   o  Hop count of the path.

   o  Buffer/queue depth of the nodes on the path.

   o  Workload of the nodes on the path.

5.1.3.  Flow Based

   o  Flow Latency: Latency at each hop or cumulative E2E latency for
      user-defined flow.

   o  Flow Jitter: Jitter at each hop or on the entire path for user-
      defined flow.

   o  Flow Bandwidth: Bandwidth at each hop or the bottleneck bandwidth
      on the entire path for user-defined flow.

   o  Flow Path Trace: Port and Node ID, and other data of the path for
      user-defined flow.

   o  Proof of Transit (PoT) for a particular set of nodes.
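
   As an illustration of how the flow-based items above could be
   derived from per-hop data, the sketch below computes them from a
   list of hop records.  The HopRecord fields and function names are
   hypothetical.  It assumes that node clocks are synchronized (needed
   for the end-to-end figure) and defines jitter as the mean absolute
   difference between consecutive latency samples, which is only one of
   several possible definitions.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class HopRecord:
    """Per-hop data for one packet of a user-defined flow (assumed fields)."""
    node_id: str
    ingress_ns: int        # arrival timestamp at the node
    egress_ns: int         # departure timestamp from the node
    bandwidth_mbps: float  # bandwidth available to the flow at this hop


def hop_latencies(trace):
    """Latency (sojourn time) at each hop, in nanoseconds."""
    return [h.egress_ns - h.ingress_ns for h in trace]


def e2e_latency(trace):
    """Cumulative latency from first ingress to last egress."""
    return trace[-1].egress_ns - trace[0].ingress_ns


def bottleneck_bandwidth(trace):
    """Smallest per-hop bandwidth on the path."""
    return min(h.bandwidth_mbps for h in trace)


def path_trace(trace):
    """Ordered list of node IDs the packet traversed."""
    return [h.node_id for h in trace]


def jitter(latencies):
    """Mean absolute difference between consecutive latency samples."""
    if len(latencies) < 2:
        return 0.0
    return mean(abs(b - a) for a, b in zip(latencies, latencies[1:]))
```

   The same jitter function can be applied to the per-hop latencies of
   successive packets at one node or to their end-to-end latencies.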

6.  Interactive Query Architecture

   In the past, network data analytics has been considered a function
   separate from the network.  Analytics applications consume raw data
   extracted from networks through piecemeal protocols and interfaces.
   With the advent of the user programmable data plane, we expect a
   paradigm shift that makes the data plane an active component of the
   data analytics solution.  Programmable in-network data preprocessing
   offloads some light-weight data processing to the data plane
   efficiently and flexibly, through dynamic data plane programming or
   configuration.  A universal network data analytics platform built on
   top of this enables a tight and agile network control and OAM
   feedback loop.

   DNP itself is a passive data plane data collection mechanism, so
   applications need a query interface through which to use DNPs for
   data analytics.  A proposed dynamic network data analytics system
   architecture is illustrated in Figure 2.  An application translates
   its data requirements into dynamic transactional queries.  The
   queries are then compiled into a set of DNPs targeting a subset of
   data plane devices (on a less flexible target with predefined
   models, the DNPs are configured instead).  After the DNPs are
   deployed, each DNP conducts in-network data preprocessing and feeds
   the preprocessed data to the collector.  The collector completes the
   data post-processing and presents the results to the data-requesting
   application.

                  +------------------------------------+
                  |network data analytics applications |
                  +----------- ------------------------+
                              ^
                              V
                  +------------------------------------+
                  |dynamic and interactive query       |
                  +------------------------------------+
                         ^                   |
                         |                   V
                  +---------------+ +------------------+
                  |post process   | |DNP compile/config|
                  +---------------+ +------------------+
                         ^                   |
                         |                   V
                  +---------------+    +---------------+
                  |data collection|    |DNP deployment |
                  +---------------+    +---------------+
                      ^   ^   ^            |   |   |
                      |   |   |            V   V   V
                  +------------------------------------+
                  |network data plane                  |
                  |(in-network data preprocessing)     |
                  +------------------------------------+

                   Figure 2: Architecture of IQ with DNP
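
   The downward (compile and deploy) and upward (collect and post-
   process) paths of Figure 2 can be sketched as below.  This is a
   hedged illustration only: the Query and DNP record layouts, the
   PROBE_FOR_METRIC table, and the report tuple format are all
   assumptions made for this example, not an interface defined by this
   document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """An application's data requirement (hypothetical layout)."""
    metric: str       # e.g. "packet_count"
    flow: str         # user-defined flow definition
    nodes: tuple      # target data plane devices
    interval_ms: int  # push interval requested by the application


@dataclass(frozen=True)
class DNP:
    """One probe to be deployed on one node (hypothetical layout)."""
    node: str
    kind: str
    flow: str
    interval_ms: int


# Assumed mapping from query metrics to probe types.
PROBE_FOR_METRIC = {
    "packet_count": "flow_counter",
    "rate": "flow_meter",
    "queue_depth": "queue",
}


def compile_query(query):
    """Compile one query into a set of DNPs, one per targeted node."""
    kind = PROBE_FOR_METRIC.get(query.metric)
    if kind is None:
        raise ValueError(f"unsupported metric: {query.metric}")
    return [DNP(node, kind, query.flow, query.interval_ms)
            for node in query.nodes]


def post_process(reports):
    """Collector side: group (node, flow, value) reports by flow."""
    result = {}
    for node, flow, value in reports:
        result.setdefault(flow, {})[node] = value
    return result
```

   A real compiler would also have to place probes according to device
   capability; here every targeted node receives the same probe type.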

   A query can be either continuous or one-shot.  A continuous query
   may require the application to refine the existing DNPs or deploy new
   DNPs.  When an application revokes its queries, the idle DNP
   resources are released.  Since multiple applications may subscribe
   to the same DNP, the runtime system needs to keep track of the active
   DNPs.
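
   One simple way to track active DNPs is to record the set of
   subscribing applications per DNP and to release a DNP only when the
   last subscriber revokes it.  The sketch below assumes this approach;
   the DNPRegistry class and its method names are hypothetical.

```python
class DNPRegistry:
    """Tracks which applications subscribe to which DNP (hypothetical)."""

    def __init__(self):
        self._subscribers = {}  # dnp_id -> set of application names

    def subscribe(self, app, dnp_id):
        """Register a subscription.

        Returns True if the DNP had no subscriber before, meaning it
        must be newly deployed; False if an existing DNP is shared.
        """
        subs = self._subscribers.setdefault(dnp_id, set())
        newly_deployed = not subs
        subs.add(app)
        return newly_deployed

    def revoke(self, app, dnp_id):
        """Remove a subscription.

        Returns True if the DNP became idle and can be released.
        """
        subs = self._subscribers.get(dnp_id)
        if subs is None or app not in subs:
            return False
        subs.discard(app)
        if subs:
            return False
        del self._subscribers[dnp_id]
        return True

    def active(self):
        """Sorted list of DNP identifiers that still have subscribers."""
        return sorted(self._subscribers)
```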

7.  Requirements for IQ with DNP

   This section lists the requirements for interactive query with DNP:

   o  Applications should conduct interactive queries through a
      standard interface (i.e., an API).  The system is responsible for
      compiling each IQ into DNPs and deploying the DNPs to the
      corresponding network nodes.

   o  DNPs can be deployed through standard southbound interfaces and
      protocols such as gRPC and NETCONF.

   o  The interactive query should not modify the forwarding behavior.
      The API should provide the necessary isolation.

   o  The deployed DNP should not lower the forwarding performance of
      the data plane devices.  If the DNP would affect the forwarding
      performance, the query should be denied.

   o  The system should support multiple parallel queries from multiple
      applications.

   o  One application can deploy different DNPs to a set of network
      nodes, and these DNPs work jointly to complete a function.

   o  A DNP may be revoked or preempted by the controller due to
      resource conflict and application priority.
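
   The denial and preemption requirements above can be combined into a
   single admission check.  The sketch below is one possible policy,
   not a prescribed one.  It assumes that each DNP carries a pre-
   profiled abstract resource cost and that a device has a fixed budget
   that can be spent on probes without lowering forwarding performance;
   the Deployed record and the admit function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Deployed:
    """A DNP with its profiled cost and owner priority (assumed fields)."""
    dnp_id: str
    cost: int      # abstract resource units, profiled in advance
    priority: int  # higher value means more important


def admit(deployed, new, budget):
    """Decide whether `new` fits within `budget` on one device.

    Returns (accepted, revoked_ids).  If the free budget is too small,
    strictly lower-priority DNPs are preempted, lowest priority first,
    until the new DNP fits.  If it still does not fit, the request is
    denied and nothing is revoked.  `deployed` is not modified.
    """
    free = budget - sum(d.cost for d in deployed)
    if new.cost <= free:
        return True, []
    candidates = sorted(
        (d for d in deployed if d.priority < new.priority),
        key=lambda d: d.priority,
    )
    revoked = []
    for victim in candidates:
        revoked.append(victim.dnp_id)
        free += victim.cost
        if new.cost <= free:
            return True, revoked
    return False, []
```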

8.  Considerations for IQ with DNP

8.1.  Technical Challenges

   Some technical issues need to be addressed to realize interactive
   query with DNP on a general network data plane:

   o  Allowing applications to modify the data plane has security and
      safety risks (e.g., DoS attack).  The countermeasure is to supply
      a standard and safe API that segregates applications from the
      runtime system and gives applications limited access to the data
      plane.  Each API call can be easily compiled and mapped to
      standard DNPs.  An SQL-like query language adapted to stream
      processing systems might be feasible for the applications.

   o  When multiple correlated DNPs are deployed across multiple network
      devices or function blocks, or when multiple applications request
      the same DNPs, the deployment consistency needs to be guaranteed
      for correctness.  This requires a robust runtime compiling and
      management system which keeps track of the subscription to DNPs
      and controls the DNP execution time and order.

   o  The performance impact of DNPs must be evaluated before deployment
      to avoid unintentionally reducing the forwarding throughput.
      Fortunately, the resource consumption and performance impact of
      standard DNPs can be accurately profiled in advance.  A device is
      usually over-provisioned and is capable of absorbing extra
      functions up to a limit.  Moreover, a programmable data plane
      allows users to tailor their forwarding application to the bare
      bones so more resources can be reserved for probes.  The runtime
      system needs to evaluate the resulting throughput performance
      before committing a DNP.  If it is unacceptable, either some old
      DNPs need to be revoked or the new request must be denied.

   o  While DNP is relatively easy to implement on software-based
      platforms (e.g., NPU and CPU), it is harder on ASIC-based
      programmable chips.  Architectural and algorithmic innovations are
      needed to support a more flexible pipeline which allows new
      pipeline stages, new tables, and new custom actions to be inserted
      at runtime through hitless in-service updates.  An architecture
      with shared memory and flexible processor cores might be viable to
      meet these requirements.  Alternatively, DNPs can be implemented
      in an "out-of-band" fashion.  That is, the slow path processor is
      engaged in conjunction with the forwarding chip to complete the
      DNP function.
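
   For the deployment consistency issue listed above, one conceivable
   approach is all-or-nothing deployment with rollback: a set of
   correlated DNPs is either installed on every targeted device or on
   none.  The sketch below illustrates that idea only.  FakeDevice is a
   stand-in for a real device reached over a southbound protocol, and
   all names are hypothetical.

```python
class DeployError(Exception):
    """Raised when a device rejects a DNP."""


class FakeDevice:
    """In-memory stand-in for a data plane device (illustration only)."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail
        self.installed = set()

    def install(self, dnp_id):
        if self.fail:
            raise DeployError(f"{self.name} rejected {dnp_id}")
        self.installed.add(dnp_id)

    def remove(self, dnp_id):
        self.installed.discard(dnp_id)


def deploy_all(devices, dnp_id):
    """Install `dnp_id` on every device, or on none of them.

    Devices are handled in order.  On the first failure, the DNP is
    removed again from the devices already done, in reverse order, and
    False is returned.
    """
    done = []
    try:
        for dev in devices:
            dev.install(dnp_id)
            done.append(dev)
    except DeployError:
        for dev in reversed(done):
            dev.remove(dnp_id)
        return False
    return True
```

   This does not address the window in which some devices already run
   the DNP while others do not; controlling execution time and order
   would need additional coordination.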

8.2.  Standard Consideration

   The query API can potentially be standardized.  The actual DNP
   deployment interface may reuse or extend IETF standards and drafts
   such as gRPC [I-D.talwar-rtgwg-grpc-use-cases] and NETCONF
   [RFC6241].  We may also define standard telemetry YANG [RFC6020]
   models for common DNPs so these DNPs can be used in a configurable
   way.

9.  Security Considerations

   Allowing applications to modify the data plane has security and
   safety risks (e.g., DoS attack).  The countermeasure is to supply a
   standard and safe API that segregates applications from the runtime
   system and gives applications limited access to the data plane.
   Each API call can be easily compiled and mapped to standard DNPs.
   An SQL-like query language adapted to stream processing systems
   might be feasible and secure for the applications.
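
   Limited accessibility could, for example, be enforced by checking
   each request against a per-application policy before anything is
   compiled.  The sketch below shows such a check under assumed policy
   fields (permitted probe kinds, permitted nodes, and a probe quota);
   none of these names come from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppPolicy:
    """What one application may request (hypothetical policy fields)."""
    allowed_kinds: frozenset
    allowed_nodes: frozenset
    max_probes: int


def authorize(policy, requested, already_deployed=0):
    """Check a list of (kind, node) probe requests against a policy.

    Returns a list of human-readable reasons for refusal; an empty
    list means the whole request is permitted.
    """
    reasons = []
    if already_deployed + len(requested) > policy.max_probes:
        reasons.append("probe quota exceeded")
    for kind, node in requested:
        if kind not in policy.allowed_kinds:
            reasons.append(f"probe kind not permitted: {kind}")
        if node not in policy.allowed_nodes:
            reasons.append(f"node not permitted: {node}")
    return reasons
```

   A quota of this kind is one way to bound the resources a single
   application can consume, which limits the DoS risk noted above.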

10.  IANA Considerations

   This memo includes no request to IANA.

11.  Acknowledgments

   The authors would like to thank Frank Brockners, Carlos Pignataro,
   Tom Tofigh, Bert Wijnen, Stewart Bryant, James Guichard, and Tianran
   Zhou for their valuable comments and advice.

12.  Informative References

   [DOI_10.1145_2491185.2491190]
              Song, H., "Protocol-oblivious forwarding", Proceedings of
              the second ACM SIGCOMM workshop on Hot topics in software
              defined networking - HotSDN '13,
              DOI 10.1145/2491185.2491190, 2013.

   [DOI_10.1145_2656877.2656890]
              Bosshart, P., Varghese, G., Walker, D., Daly, D., Gibb,
              G., Izzard, M., McKeown, N., Rexford, J., Schlesinger, C.,
              Talayco, D., and A. Vahdat, "P4", ACM SIGCOMM Computer
              Communication Review Vol. 44, pp. 87-95,
              DOI 10.1145/2656877.2656890, July 2014.

   [I-D.brockners-inband-oam-data]
              Brockners, F., Bhandari, S., Pignataro, C., Gredler, H.,
              Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov,
              P., and R. Chang, "Data Formats for In-situ OAM", draft-
              brockners-inband-oam-data-02 (work in progress), October
              2016.

   [I-D.brockners-inband-oam-requirements]
              Brockners, F., Bhandari, S., Dara, S., Pignataro, C.,
              Gredler, H., Leddy, J., Youell, S., Mozes, D., Mizrahi,
              T., Lapukhov, P., and R. Chang,
              "Requirements for In-situ OAM", draft-brockners-inband-
              oam-requirements-02 (work in progress), October 2016.

   [I-D.talwar-rtgwg-grpc-use-cases]
              Specification, g., Kolhe, J., Shaikh, A., and J. George,
              "Use cases for gRPC in network management", draft-talwar-
              rtgwg-grpc-use-cases-01 (work in progress), January 2017.

   [RFC3176]  Phaal, P., Panchen, S., and N. McKee, "InMon Corporation's
              sFlow: A Method for Monitoring Traffic in Switched and
              Routed Networks", RFC 3176, DOI 10.17487/RFC3176,
              September 2001, <http://www.rfc-editor.org/info/rfc3176>.

   [RFC6020]  Bjorklund, M., Ed., "YANG - A Data Modeling Language for
              the Network Configuration Protocol (NETCONF)", RFC 6020,
              DOI 10.17487/RFC6020, October 2010,
              <http://www.rfc-editor.org/info/rfc6020>.

   [RFC6241]  Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed.,
              and A. Bierman, Ed., "Network Configuration Protocol
              (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011,
              <http://www.rfc-editor.org/info/rfc6241>.

   [RFC7011]  Claise, B., Ed., Trammell, B., Ed., and P. Aitken,
              "Specification of the IP Flow Information Export (IPFIX)
              Protocol for the Exchange of Flow Information", STD 77,
              RFC 7011, DOI 10.17487/RFC7011, September 2013,
              <http://www.rfc-editor.org/info/rfc7011>.

Authors' Addresses

   Haoyu Song (editor)
   Huawei Technologies Co., Ltd
   2330 Central Expressway
   Santa Clara, 95050
   USA

   Email: haoyu.song@huawei.com


   Jun Gong
   Huawei Technologies Co., Ltd
   156 Beiqing Road
   Beijing, 100095
   P.R. China

   Email: gongjun@huawei.com


   Hongfei Chen
   Huawei Technologies Co., Ltd
   156 Beiqing Road
   Beijing, 100095
   P.R. China

   Email: chenhongfei@huawei.com