Network Working Group A. Choudhary
Internet-Draft Cisco Systems
Intended status: Standards Track May 12, 2018
Expires: November 13, 2018
QoS Telemetry Requirements
draft-asechoud-rtgwg-qos-telemetry-req-00
Abstract
This document discusses QoS requirements for data model based network
telemetry. QoS configuration and operational models have been
defined as part of [I-D.asechoud-rtgwg-qos-model] and
[I-D.asechoud-rtgwg-qos-oper-model] respectively. This document
describes the requirement to extend the models to support QoS
Telemetry.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 13, 2018.
Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
Choudhary Expires November 13, 2018 [Page 1]
Internet-Draft QoS Telemetry Requirements May 2018
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Motivation
2. Terminology
3. Requirements
  3.1. Granularity and Completeness
  3.2. Scale
  3.3. Cadence
  3.4. Time-stamping
  3.5. Grouping
  3.6. Filtering
  3.7. Aggregation
  3.8. Threshold
4. Security Considerations
5. Acknowledgement
6. Normative References
Author's Address
1. Motivation
Network visibility is an important aspect of network availability.
QoS counters provide good insight into network device performance,
congestion, and security. Continuous monitoring of every QoS resource
may not always be desired, so a mechanism to monitor selected data
sets of QoS resources is needed. This document sets out the
requirements for such a mechanism.
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
3. Requirements
3.1. Granularity and Completeness
Statistics passed for a QoS resource should be complete and, at the
same time, granular enough to avoid sending undesired information. A
particular counter in isolation may not provide sufficient
information about a QoS resource. For example, the tail-drop counter
of a queue is not sufficient to define the state of the queue unless
other counters, such as maximum queue size and average queue size,
are also known. Hence, it is important to get the complete context of
a resource, based on device capability.
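As an illustration of the completeness requirement, the following is
a minimal Python sketch of a queue record that carries the tail-drop
counter together with its context. The record layout and all field
names are hypothetical and are not taken from the referenced YANG
models; a device would report only what its capability allows.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class QueueSample:
    """One telemetry record for a queue (all field names are hypothetical)."""
    queue_id: str
    tail_drop_pkts: Optional[int] = None
    max_queue_size_bytes: Optional[int] = None
    avg_queue_size_bytes: Optional[int] = None


def missing_counters(sample: QueueSample) -> List[str]:
    """Return the names of the counters that were not reported."""
    return [f.name for f in fields(sample) if getattr(sample, f.name) is None]


# A tail-drop counter alone leaves the queue state undefined.
partial = QueueSample("q0", tail_drop_pkts=120)
complete = QueueSample("q0", 120, 65536, 60000)
```

A collector could use a check such as missing_counters() to decide
whether a record gives enough context to interpret the drops.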
3.2. Scale
It is important to have visibility into the vast number of QoS
resources in a network device on a regular basis. Some devices, for
example, may support millions of queues. To be able to scale, it is
desirable to define data sets of important resources and monitor
them based on need.
3.3. Cadence
Cadence defines the frequency of data collection from the forwarding
path. Cadence can be limited by device capability as well as by the
amount of data requested. It may also be desirable to collect data
for a resource at a higher cadence when it is in critical condition
than when it is not.
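One possible way to express a condition-dependent cadence is sketched
below in Python. The function name, its parameters, and the idea of a
single device-imposed minimum interval are assumptions made for
illustration only.

```python
def collection_interval(critical: bool,
                        normal_s: float,
                        critical_s: float,
                        device_min_s: float) -> float:
    """Pick a collection interval in seconds (hypothetical helper).

    A shorter interval (higher cadence) is requested while the resource
    is in critical condition, but the result is never shorter than the
    smallest interval the device can sustain.
    """
    wanted = critical_s if critical else normal_s
    return max(wanted, device_min_s)
```

For example, with a normal interval of 30 seconds, a critical
interval of 1 second, and a device minimum of 5 seconds, the sketch
yields 30 seconds in the normal case and 5 seconds in the critical
case.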
3.4. Time-stamping
Time-stamping records the time when the data was collected from the
data path. Time stamps help in calculating various traffic rates and
in identifying traffic patterns.
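The rate calculation mentioned above can be sketched as follows,
assuming two samples of a monotonically increasing byte counter, each
stamped with its collection time in seconds. The function name and
parameters are hypothetical, and counter wrap is deliberately not
handled.

```python
def rate_bps(t0_s: float, bytes0: int, t1_s: float, bytes1: int) -> float:
    """Average rate in bits per second between two time-stamped samples.

    Assumes the byte counter did not wrap or reset between the samples.
    """
    if t1_s <= t0_s:
        raise ValueError("samples must be in increasing time order")
    return (bytes1 - bytes0) * 8 / (t1_s - t0_s)
```

Using the collection time stamps rather than the arrival time at the
collector keeps the computed rate independent of transport delay.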
3.5. Grouping
There may be multiple collectors of the same telemetry data. The
purpose and focus of each collector may be different. By defining
the right set of groupings, a collector may be able to easily fetch
the desired data. For example, a network slice may define a set of
QoS resources on each interface. A collector interested in a
particular network slice may request the data accordingly.
Similarly, queue data on an interface or a set of interfaces can be
defined as a group.
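A grouping can be thought of as a named set of QoS resources. The
Python sketch below assumes resources are identified by a
hypothetical (interface, queue) pair; the group names and sample
values are invented for illustration.

```python
from typing import Dict, Set, Tuple

Resource = Tuple[str, str]  # (interface, queue); hypothetical identifier

# Named groups of resources; names and members are illustrative only.
GROUPS: Dict[str, Set[Resource]] = {
    "slice-a": {("eth0", "q1"), ("eth1", "q1")},
    "eth0-queues": {("eth0", "q0"), ("eth0", "q1")},
}


def select_group(samples: Dict[Resource, int], group: str) -> Dict[Resource, int]:
    """Return only the samples whose resource belongs to the named group."""
    members = GROUPS[group]
    return {res: val for res, val in samples.items() if res in members}


samples = {("eth0", "q0"): 5, ("eth0", "q1"): 7, ("eth1", "q1"): 9}
```

Note that one resource may belong to several groups, so that
collectors with different purposes can each request their own view.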
3.6. Filtering
Many times a collector is interested in specific data, e.g., a
real-time queue on an egress interface or metering ([RFC2697] and
[RFC2698]) data for best-effort traffic on an ingress interface.
Effective filtering can be performed either in the network device or
by the collector.
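A minimal sketch of attribute-based filtering follows. The record
keys ("interface", "direction", "queue", "drops") are hypothetical;
the same predicate could in principle run on the device or in the
collector.

```python
from typing import Any, Dict, List


def filter_records(records: List[Dict[str, Any]],
                   **criteria: Any) -> List[Dict[str, Any]]:
    """Keep the records whose fields equal every given criterion."""
    return [r for r in records
            if all(r.get(k) == v for k, v in criteria.items())]


# Illustrative records; keys and values are assumptions.
records = [
    {"interface": "eth0", "direction": "egress", "queue": "real-time", "drops": 3},
    {"interface": "eth0", "direction": "egress", "queue": "best-effort", "drops": 40},
    {"interface": "eth1", "direction": "ingress", "queue": "best-effort", "drops": 0},
]
```

Filtering on the device reduces the volume of exported data, whereas
filtering in the collector keeps the device logic simpler.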
3.7. Aggregation
Sometimes aggregation of data becomes important to give the data
meaning. For example, consider a QoS policy applied on various
ingress interfaces. An ongoing DDoS attack can be better understood
when all the traffic to a particular destination coming through
various interfaces is summed up. Aggregation can also be done for
multiple QoS resources within a policy to save valuable hardware
counter resources.
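The per-destination summation described above can be sketched as
follows, assuming each sample is a hypothetical (interface,
destination, bytes) tuple. The addresses used are from the
documentation range and carry no meaning.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Sample = Tuple[str, str, int]  # (ingress interface, destination, bytes)


def bytes_per_destination(samples: Iterable[Sample]) -> Dict[str, int]:
    """Sum byte counts per destination across all ingress interfaces."""
    totals: Counter = Counter()
    for _interface, destination, nbytes in samples:
        totals[destination] += nbytes
    return dict(totals)


# Illustrative data only.
observed = [
    ("eth0", "192.0.2.1", 4000),
    ("eth1", "192.0.2.1", 6000),
    ("eth1", "192.0.2.9", 500),
]
```

Traffic that looks unremarkable on each interface in isolation may
stand out once the per-destination totals are compared.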
3.8. Threshold
A collector may not be interested in the data of a QoS resource until
it is in critical condition, e.g., when a tail-drop is seen on a
particular best-effort queue or a queue carrying critical data under
WFQ builds up. Many times this follows a pattern; for example, at
9 am, when trading starts, drops are seen on a particular queue, but
otherwise there are no drops. So, it becomes important to observe a
resource while it is in critical condition and to avoid observing it
otherwise. Defining a threshold helps the collector and the device
alike. Also, it is important to define how long a resource will be
monitored once it is out of critical condition.
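Threshold-driven monitoring with a hold time after the critical
condition clears could look like the Python sketch below. The class,
its parameters, and the choice of "value at or above threshold" as
the critical condition are assumptions made for illustration.

```python
from typing import Optional


class ThresholdMonitor:
    """Report a resource while critical and for hold_s seconds afterwards."""

    def __init__(self, threshold: int, hold_s: float) -> None:
        self.threshold = threshold
        self.hold_s = hold_s
        self.last_critical_s: Optional[float] = None

    def should_report(self, now_s: float, value: int) -> bool:
        """Return True if the sample taken at now_s should be exported."""
        if value >= self.threshold:
            # Resource is in critical condition: remember when and report.
            self.last_critical_s = now_s
            return True
        if self.last_critical_s is None:
            # Never critical so far: nothing to report.
            return False
        # Out of critical condition: keep reporting during the hold time.
        return now_s - self.last_critical_s <= self.hold_s
```

For instance, with a threshold of one drop and a hold time of 60
seconds, a queue that last dropped packets at t=10 would still be
reported at t=70 but no longer at t=71.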
4. Security Considerations
5. Acknowledgement
6. Normative References
[I-D.asechoud-rtgwg-qos-model]
Choudhary, A., Jethanandani, M., Strahle, N., Aries, E.,
and I. Chen, "YANG Model for QoS", draft-asechoud-rtgwg-
qos-model-05 (work in progress), March 2018.
[I-D.asechoud-rtgwg-qos-oper-model]
Choudhary, A. and I. Chen, "YANG Model for QoS Operational
Parameters", draft-asechoud-rtgwg-qos-oper-model-01 (work
in progress), May 2018.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC2697] Heinanen, J. and R. Guerin, "A Single Rate Three Color
Marker", RFC 2697, DOI 10.17487/RFC2697, September 1999,
<https://www.rfc-editor.org/info/rfc2697>.
[RFC2698] Heinanen, J. and R. Guerin, "A Two Rate Three Color
Marker", RFC 2698, DOI 10.17487/RFC2698, September 1999,
<https://www.rfc-editor.org/info/rfc2698>.
Author's Address
Aseem Choudhary
Cisco Systems
170 W. Tasman Drive
San Jose, CA 95134
US
Email: asechoud@cisco.com