IPPM | H. Song, Ed. |
Internet-Draft | Futurewei |
Intended status: Standards Track | T. Zhou |
Expires: March 12, 2020 | Z. Li |
Huawei | |
J. Shin | |
SK Telecom | |
K. Lee | |
LG U+ | |
September 9, 2019 |
Postcard-based On-Path Flow Data Telemetry
draft-song-ippm-postcard-based-telemetry-05
Postcard-Based Telemetry (PBT) allows network OAM applications to directly collect and export telemetry data about any user packet at each node on the forwarding path. PBT has two variations. One requires inserting an instruction header into user packets to guide the data collection. This variation has been recast as an independent IOAM option mode, Direct Export, and is described in a standalone document. This document describes the second variation, mark-triggered PBT, or PBT-M. PBT-M only marks the user packets or configures a flow filter to invoke the data collection and postcard export. It complements IOAM by addressing several specific implementation and deployment challenges.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on March 12, 2020.
Copyright (c) 2019 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
In order to gain detailed data plane visibility to support effective network OAM, it is important to be able to examine the trace of user packets along their forwarding paths. Such on-path flow data reflect the state and status of each user packet's real-time experience and provide valuable information for network monitoring, measurement, and diagnosis.
The telemetry data include, but are not limited to, the detailed forwarding path, the timestamp/latency at each network node, and, in case of packet drop, the drop location and reason. The emerging programmable data plane devices allow user-defined data collection [I-D.song-opsawg-dnp4iq] or conditional data collection based on trigger events. Such on-path flow data come from and describe the live user traffic, and they complement the data acquired through other passive and active OAM mechanisms such as IPFIX and ICMP.
In-band Network Telemetry (INT) was designed to cater to this need (note that although INT has been widely used, the term "in-band" here does not comply with the IETF's definition; "on-path" or "in-situ" may be more accurate terms). In-situ OAM (IOAM) represents the related standardization efforts. In essence, INT augments user packets with instructions that tell each network node on their forwarding paths what data to collect. The requested data are inserted into and travel along with the user packets. Some end nodes are responsible for stripping off the data trace and exporting it to a data collector for processing.
While the concept is simple and straightforward, INT faces several technical challenges:
The above issues are inherent to the INT-based solutions. Nevertheless, the on-path data acquired by INT are valuable for network operators. Therefore, alternative approaches that can collect the same data but avoid or mitigate the above issues are desired. This document provides a new approach named Postcard-Based Telemetry (PBT) with two different implementation variations, each having its own trade-offs and addressing some or all of the above issues. The basic idea of PBT is simple: at each node, instead of inserting the collected data into the user packets, the data are directly exported through dedicated OAM packets. This "postcard" approach is in contrast to the "passport stamps" approach adopted by INT [DOI_10.1145_2342441.2342453]. The OAM packets, or postcards, can be generated by the node's slow path and transported in band or out of band, independent of the original user packets.
The variation that requires an instruction header has been recast as an IOAM option mode named Direct Export (DX). DX is described in another document and the reference will be included later. This document only covers the second variation.
This section describes the variation of PBT which triggers the postcard export with a mark in user packets. PBT-M aims to address the challenges of INT listed above and introduce some new benefits. We first list all the design requirements of PBT-M.
In light of the above discussion, the sketch of the proposed solution, PBT-M, is as follows. The user packet, if its path-associated data need to be collected, is marked at the path head node. At each PBT-aware node, if the mark is detected, a postcard (i.e., the dedicated OAM packet triggered by a marked user packet) is generated and sent to a collector. The postcard contains the data requested by the management plane. The requested data are configured by the management plane through data set templates (as in IPFIX). Once the collector receives all the postcards for a single user packet, it can infer the packet's forwarding path and analyze the data set. The path end node is configured to unmark the packets and restore them to their original format if necessary.
The overall architecture of PBT-M is depicted in Figure 1.
                     +------------+        +-----------+
                     | Network    |        | Telemetry |
                     | Management |(-------| Data      |
                     |            |        | Collector |
                     +-----:------+        +-----------+
                           :                     ^
                           :configurations       |postcards (OAM pkts)
                           :                     |
            ...............:.....................|........
            :             :               :      |       :
            :   +---------:---+-----------:---+--+-------:---+
            :   |         :   |           :   |          :   |
            V   |         V   |           V   |          V   |
         +------+-+     +-----+--+     +------+-+     +------+-+
usr pkts | Head   |     | Path   |     | Path   |     | End    |
    ====>| Node   |====>| Node   |====>| Node   |====>| Node   |====>
         |        |     | A      |     | B      |     |        |
         +--------+     +--------+     +--------+     +--------+
        gen postcards  gen postcards  gen postcards  gen postcards
        mark usr pkts                                unmark usr pkts
Figure 1: Architecture of PBT-M
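As an illustration only, the packet walk of Figure 1 can be sketched in a few lines of Python. The names Packet, Collector, and forward, as well as the postcard fields, are hypothetical and not defined by this document; the sketch assumes the mark is a single boolean and that every node on the path is PBT-aware.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    flow_id: str
    ttl: int
    marked: bool = False  # stands in for the PBT-M mark bit (assumption)


@dataclass
class Collector:
    postcards: list = field(default_factory=list)

    def receive(self, postcard: dict) -> None:
        self.postcards.append(postcard)


def forward(packet: Packet, path: list, collector: Collector, should_mark: bool) -> Packet:
    """Walk `packet` along `path` (node IDs, head node first)."""
    for hop, node_id in enumerate(path):
        if hop == 0 and should_mark:
            packet.marked = True  # head node marks the user packet
        if packet.marked:
            # Every PBT-aware node exports a postcard for a marked packet.
            collector.receive(
                {"node": node_id, "flow": packet.flow_id, "ttl": packet.ttl}
            )
        if hop == len(path) - 1:
            packet.marked = False  # end node restores the original format
        packet.ttl -= 1  # ordinary per-hop TTL decrement
    return packet


collector = Collector()
forward(Packet("flow-1", ttl=64), ["head", "A", "B", "end"], collector, should_mark=True)
for postcard in collector.postcards:
    print(postcard)
```

The user packet itself carries no telemetry data in this sketch; only the mark travels with it, and the collected data leave each node as separate postcards.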
Although PBT-M solves the issues of INT, it introduces a few new challenges.
To address the above challenges, we propose several design details of PBT-M.
To trigger the path-associated data collection, usually a single bit from some header field is sufficient. When no such bit is available, other packet marking techniques are needed. We discuss three possible application scenarios.
By default, all PBT-aware nodes are configured to react to the marked packets by exporting some basic data such as node ID and TTL before a data set template for that flow is configured. This way, the management plane can learn the flow path dynamically.
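A minimal sketch of this default behavior, assuming a node keeps its local state in a dictionary and the management plane installs per-flow templates as tuples of field names. BASIC_FIELDS, build_postcard, and the field names are hypothetical; this document does not define them.

```python
# Fields exported when no data set template is configured for the flow.
# The text names node ID and TTL as examples of such basic data.
BASIC_FIELDS = ("node_id", "ttl")


def build_postcard(node_state: dict, flow_id: str, templates: dict) -> dict:
    """Build the postcard a node exports for one marked packet.

    `templates` maps a flow ID to the field names requested by the
    management plane; flows without an entry get BASIC_FIELDS.
    """
    fields = templates.get(flow_id, BASIC_FIELDS)
    postcard = {"flow_id": flow_id}
    for name in fields:
        # A field the node cannot supply is exported as None.
        postcard[name] = node_state.get(name)
    return postcard


state = {"node_id": "A", "ttl": 63, "queue_depth": 12}
print(build_postcard(state, "flow-1", templates={}))
print(build_postcard(state, "flow-1", templates={"flow-1": ("node_id", "queue_depth")}))
```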
If the management plane wants to collect the path-associated data for some flow, it configures the head node(s) with a probability or time interval for the flow packet marking. When the first marked packet is forwarded in the network, the PBT-aware nodes will export the basic data to the collector. Hence, the flow path is identified. If other types of data need to be collected, the management plane can further configure the data set template to the target nodes on the flow's path. The PBT-aware nodes would collect and export data accordingly if the packet is marked and a data set template is present.
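The two head-node marking policies mentioned above, a time interval and a probability, could look roughly as follows. IntervalMarker and ProbabilityMarker are hypothetical names, and passing the current time in as an argument is an assumption made to keep the sketch deterministic.

```python
import random


class IntervalMarker:
    """Mark at most one packet per flow every `interval_s` seconds."""

    def __init__(self, interval_s: float):
        self.interval_s = interval_s
        self._last_mark = {}  # flow_id -> time of the last marked packet

    def should_mark(self, flow_id: str, now: float) -> bool:
        last = self._last_mark.get(flow_id)
        if last is None or now - last >= self.interval_s:
            self._last_mark[flow_id] = now
            return True
        return False


class ProbabilityMarker:
    """Mark each packet independently with a fixed probability."""

    def __init__(self, probability: float, rng=None):
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be within [0, 1]")
        self.probability = probability
        self._rng = rng if rng is not None else random.Random()

    def should_mark(self, flow_id: str, now: float) -> bool:
        # random() returns a value in [0.0, 1.0), so 0.0 never marks
        # and 1.0 always marks.
        return self._rng.random() < self.probability


marker = IntervalMarker(interval_s=1.0)
for t in (0.0, 0.5, 1.0, 1.2):
    print(t, marker.should_mark("flow-1", now=t))
```

An interval policy bounds the postcard rate per flow, while a probability policy scales the postcard rate with the flow's packet rate; which one fits better depends on the deployment.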
If the flow path changes for any reason, the new path nodes can be learned immediately by the collector, so the management plane controller can be informed to configure the new path nodes. The outdated configuration can be automatically timed out or explicitly revoked by the management plane controller.
The collector needs to correlate all the OAM packets for a single user packet. Once this is done, the TTL (or the timestamp, if the network time is synchronized) can be used to infer the flow forwarding path. The key issue here is to correlate all the postcards for the same user packet.
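One way to picture the timeout and revocation of outdated configuration is a per-node template table with soft-state entries. This is only a sketch under assumptions: TemplateTable, its methods, and the fixed lifetime are hypothetical, and this document does not specify how a timeout would be signaled or refreshed.

```python
class TemplateTable:
    """Per-node data set templates that expire unless reconfigured."""

    def __init__(self, lifetime_s: float):
        self.lifetime_s = lifetime_s
        self._entries = {}  # flow_id -> (fields, expiry time)

    def configure(self, flow_id: str, fields, now: float) -> None:
        # Configuring again refreshes the expiry time.
        self._entries[flow_id] = (tuple(fields), now + self.lifetime_s)

    def revoke(self, flow_id: str) -> None:
        # Explicit revocation by the management plane controller.
        self._entries.pop(flow_id, None)

    def lookup(self, flow_id: str, now: float):
        """Return the template fields, or None if absent or timed out."""
        entry = self._entries.get(flow_id)
        if entry is None:
            return None
        fields, expiry = entry
        if now >= expiry:
            del self._entries[flow_id]  # automatic timeout
            return None
        return fields


table = TemplateTable(lifetime_s=30.0)
table.configure("flow-1", ["node_id", "queue_depth"], now=0.0)
print(table.lookup("flow-1", now=10.0))
print(table.lookup("flow-1", now=31.0))
```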
The first possible approach is to include the flow ID plus the user packet ID in the OAM packets. The flow ID can be the 5-tuple taken from the IP and transport headers of the user traffic. The user packet ID can be some unique information pertaining to a user packet (e.g., the sequence number of a TCP packet).
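A collector following this first approach might be sketched as below. The function infer_paths and the postcard field names (flow_id, packet_id, node_id, ttl) are assumptions for illustration. The sketch relies on the TTL decreasing by one at each hop, so sorting a packet's postcards by descending TTL yields the forwarding order even when postcards arrive out of order.

```python
from collections import defaultdict


def infer_paths(postcards):
    """Group postcards per user packet and order each group by hop.

    Each postcard is a dict with `flow_id` (e.g. a 5-tuple), `packet_id`
    (e.g. a TCP sequence number), `node_id`, and `ttl`.
    """
    groups = defaultdict(list)
    for postcard in postcards:
        key = (postcard["flow_id"], postcard["packet_id"])
        groups[key].append(postcard)

    paths = {}
    for key, cards in groups.items():
        # Higher TTL means closer to the head node.
        cards.sort(key=lambda pc: pc["ttl"], reverse=True)
        paths[key] = [pc["node_id"] for pc in cards]
    return paths


flow = ("10.0.0.1", "10.0.0.2", 6, 50000, 443)
received = [
    {"flow_id": flow, "packet_id": 1001, "node_id": "B", "ttl": 62},
    {"flow_id": flow, "packet_id": 1001, "node_id": "head", "ttl": 64},
    {"flow_id": flow, "packet_id": 2002, "node_id": "head", "ttl": 64},
    {"flow_id": flow, "packet_id": 1001, "node_id": "A", "ttl": 63},
]
print(infer_paths(received))
```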
If the packet marking interval is large enough, then the flow ID itself is enough to identify the user packet. That is, we can assume all the exported OAM packets for the same flow during a short period of time belong to the same user packet.
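Under that assumption, a collector could group postcards by flow ID and arrival time alone. The sketch below is illustrative: group_by_window, the rx_time field (collector receive time in seconds), and the choice of anchoring the window at the first postcard of each group are all assumptions, and the window must be shorter than the marking interval for the grouping to be meaningful.

```python
def group_by_window(postcards, window_s: float) -> dict:
    """Split each flow's postcards into per-packet groups by arrival time.

    A postcard joins the flow's current group if it arrives within
    `window_s` seconds of that group's first postcard; otherwise it
    starts a new group, i.e. it is attributed to a new marked packet.
    """
    groups = {}  # flow_id -> list of groups, each a list of postcards
    for postcard in sorted(postcards, key=lambda pc: pc["rx_time"]):
        flow_groups = groups.setdefault(postcard["flow_id"], [])
        if flow_groups and postcard["rx_time"] - flow_groups[-1][0]["rx_time"] <= window_s:
            flow_groups[-1].append(postcard)
        else:
            flow_groups.append([postcard])
    return groups


received = [
    {"flow_id": "flow-1", "node_id": "head", "rx_time": 0.00},
    {"flow_id": "flow-1", "node_id": "A", "rx_time": 0.01},
    {"flow_id": "flow-1", "node_id": "head", "rx_time": 5.00},
    {"flow_id": "flow-1", "node_id": "A", "rx_time": 5.01},
]
for group in group_by_window(received, window_s=0.5)["flow-1"]:
    print([pc["node_id"] for pc in group])
```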
Alternatively, if the network is synchronized, then the flow ID plus the timestamp at each node can also be used to infer the postcard affiliation. However, errors may occur under some circumstances. For example, if two consecutive user packets from the same flow are both marked but one exported postcard from a node is lost, then it is difficult for the collector to decide which user packet the remaining postcard belongs to. In many cases, such a rare error has no catastrophic consequence and is therefore tolerable.
It is possible to avoid marking user packets yet still allow in-band flow data collection. We could simply configure an Access Control List (ACL) to match the set of target flows. This approach has two potential issues: (1) Since the packet forwarding path is unknown in advance, one needs to configure all the nodes in a network to filter the flows and capture the complete data set. This wastes precious ACL resources and is not scalable. (2) If a node cannot collect data for all the filtered packets of a flow, it needs to determine which packets to sample independently, so the collector may not be able to receive the full set of postcards for the same user packet.
Nevertheless, since this approach does not require touching the user packets at all, it has its unique merits: (1) Users can freely choose any nodes as vantage points for data collection; (2) There is no need to worry about "modified" user packets leaking out of the PBT domain; (3) It has minimal impact on the forwarding of the user traffic.
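The flow filter used in this mode can be pictured as a 5-tuple match with wildcards. The sketch below is a simplification under assumptions: FiveTuple and FlowFilter are hypothetical names, and only exact or wildcard matching is shown, whereas real ACLs typically also support address prefixes and port ranges.

```python
from typing import NamedTuple, Optional


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


class FlowFilter(NamedTuple):
    """ACL-like entry; a field left as None is a wildcard."""

    src_ip: Optional[str] = None
    dst_ip: Optional[str] = None
    protocol: Optional[int] = None
    src_port: Optional[int] = None
    dst_port: Optional[int] = None

    def matches(self, packet: FiveTuple) -> bool:
        # Both tuples list their fields in the same order, so zip pairs
        # each filter field with the corresponding packet field.
        return all(want is None or want == got for want, got in zip(self, packet))


https_to_server = FlowFilter(dst_ip="10.0.0.2", protocol=6, dst_port=443)
print(https_to_server.matches(FiveTuple("10.0.0.1", "10.0.0.2", 6, 50000, 443)))
print(https_to_server.matches(FiveTuple("10.0.0.1", "10.0.0.2", 6, 50000, 80)))
```

A node in this mode would export a postcard for every packet, or a sampled subset of packets, that matches a configured filter, without any change to the packet itself.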
No data plane standard is required to support this mode, except the postcard format.
Postcards can use the same data export format as that used by IOAM. [I-D.spiegel-ippm-ioam-rawexport] proposes a raw format that can be interpreted by IPFIX.
Several security issues need to be considered.
No requirement for IANA is identified.
TBD.
TBD.
[DOI_10.1145_2342441.2342453] | Handigol, N., Heller, B., Jeyakumar, V., Mazières, D. and N. McKeown, "Where is the debugger for my software-defined network?", Proceedings of the first workshop on Hot topics in software defined networks - HotSDN '12, DOI 10.1145/2342441.2342453, 2012. |
[I-D.brockners-inband-oam-requirements] | Brockners, F., Bhandari, S., Dara, S., Pignataro, C., Gredler, H., Leddy, J., Youell, S., Mozes, D., Mizrahi, T., Lapukhov, P. and R. Chang, "Requirements for In-situ OAM", Internet-Draft draft-brockners-inband-oam-requirements-03, March 2017. |
[I-D.brockners-inband-oam-transport] | Brockners, F., Bhandari, S., Govindan, V., Pignataro, C., Gredler, H., Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov, P. and R. Chang, "Encapsulations for In-situ OAM Data", Internet-Draft draft-brockners-inband-oam-transport-05, July 2017. |
[I-D.brockners-ippm-ioam-geneve] | Brockners, F., Bhandari, S., Govindan, V., Pignataro, C., Gredler, H., Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov, P. and R. Chang, "Geneve encapsulation for In-situ OAM Data", Internet-Draft draft-brockners-ippm-ioam-geneve-01, June 2018. |
[I-D.bryant-mpls-synonymous-flow-labels] | Bryant, S., Swallow, G., Sivabalan, S., Mirsky, G., Chen, M. and Z. Li, "RFC6374 Synonymous Flow Labels", Internet-Draft draft-bryant-mpls-synonymous-flow-labels-01, July 2015. |
[I-D.clemm-netconf-push-smart-filters-ps] | Clemm, A., Voit, E., Liu, X., Bryskin, I., Zhou, T., Zheng, G. and H. Birkholz, "Smart filters for Push Updates - Problem Statement", Internet-Draft draft-clemm-netconf-push-smart-filters-ps-00, October 2017. |
[I-D.ietf-ippm-alt-mark] | Fioccola, G., Capello, A., Cociglio, M., Castaldelli, L., Chen, M., Zheng, L., Mirsky, G. and T. Mizrahi, "Alternate Marking method for passive and hybrid performance monitoring", Internet-Draft draft-ietf-ippm-alt-mark-14, December 2017. |
[I-D.ietf-ippm-ioam-data] | Brockners, F., Bhandari, S., Pignataro, C., Gredler, H., Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov, P., Chang, R. and D. Bernier, "Data Fields for In-situ OAM", Internet-Draft draft-ietf-ippm-ioam-data-00, September 2017. |
[I-D.ietf-netconf-udp-pub-channel] | Zheng, G., Zhou, T. and A. Clemm, "UDP based Publication Channel for Streaming Telemetry", Internet-Draft draft-ietf-netconf-udp-pub-channel-01, November 2017. |
[I-D.ietf-netconf-yang-push] | Clemm, A., Voit, E., Prieto, A., Tripathy, A., Nilsen-Nygaard, E., Bierman, A. and B. Lengyel, "YANG Datastore Subscription", Internet-Draft draft-ietf-netconf-yang-push-12, December 2017. |
[I-D.ietf-sfc-ioam-nsh] | Brockners, F., Bhandari, S., Govindan, V., Pignataro, C., Gredler, H., Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov, P. and R. Chang, "NSH Encapsulation for In-situ OAM Data", Internet-Draft draft-ietf-sfc-ioam-nsh-00, May 2018. |
[I-D.ietf-sfc-nsh] | Quinn, P., Elzur, U. and C. Pignataro, "Network Service Header (NSH)", Internet-Draft draft-ietf-sfc-nsh-28, November 2017. |
[I-D.sambo-netmod-yang-fsm] | Sambo, N., Castoldi, P., Fioccola, G., Cugini, F., Song, H. and T. Zhou, "YANG model for finite state machine", Internet-Draft draft-sambo-netmod-yang-fsm-00, October 2017. |
[I-D.song-ippm-ioam-data-extension] | Song, H. and T. Zhou, "In-situ OAM Data Type Extension", Internet-Draft draft-song-ippm-ioam-data-extension-00, October 2017. |
[I-D.song-ippm-ioam-tunnel-mode] | Song, H., Li, Z., Zhou, T. and Z. Wang, "In-situ OAM Processing in Tunnels", Internet-Draft draft-song-ippm-ioam-tunnel-mode-00, June 2018. |
[I-D.song-mpls-extension-header] | Song, H., Li, Z., Zhou, T. and L. Andersson, "MPLS Extension Header", Internet-Draft draft-song-mpls-extension-header-01, August 2018. |
[I-D.song-opsawg-dnp4iq] | Song, H. and J. Gong, "Requirements for Interactive Query with Dynamic Network Probes", Internet-Draft draft-song-opsawg-dnp4iq-01, June 2017. |
[I-D.spiegel-ippm-ioam-rawexport] | Spiegel, M., Brockners, F., Bhandari, S. and R. Sivakolundu, "In-situ OAM raw data export with IPFIX", Internet-Draft draft-spiegel-ippm-ioam-rawexport-01, October 2018. |
[I-D.talwar-rtgwg-grpc-use-cases] | Specification, g., Kolhe, J., Shaikh, A. and J. George, "Use cases for gRPC in network management", Internet-Draft draft-talwar-rtgwg-grpc-use-cases-01, January 2017. |
[I-D.weis-ippm-ioam-gre] | Weis, B., Brockners, F., crhill@cisco.com, c., Bhandari, S., Govindan, V., Pignataro, C., Gredler, H., Leddy, J., Youell, S., Mizrahi, T., Kfir, A., Gafni, B., Lapukhov, P. and M. Spiegel, "GRE Encapsulation for In-situ OAM Data", Internet-Draft draft-weis-ippm-ioam-gre-00, March 2018. |
[RFC2925] | White, K., "Definitions of Managed Objects for Remote Ping, Traceroute, and Lookup Operations", RFC 2925, DOI 10.17487/RFC2925, September 2000. |
[RFC6241] | Enns, R., Bjorklund, M., Schoenwaelder, J. and A. Bierman, "Network Configuration Protocol (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011. |
[RFC7011] | Claise, B., Trammell, B. and P. Aitken, "Specification of the IP Flow Information Export (IPFIX) Protocol for the Exchange of Flow Information", STD 77, RFC 7011, DOI 10.17487/RFC7011, September 2013. |