Network Working Group | A. Clark |
Internet-Draft | Telchemy Incorporated |
Intended status: Best Current Practice | B. Claise |
Expires: January 29, 2012 | Cisco Systems, Inc. |
July 28, 2011 |
Guidelines for Considering New Performance Metric Development
draft-ietf-pmol-metrics-framework-12
This document describes a framework and a process for developing Performance Metrics of protocols and applications transported over IETF-specified protocols, and that can be used to characterize traffic on live networks and services.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 29, 2012.
Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.
Many networking technologies, applications, or services, are distributed in nature, and their performance may be impacted by IP impairments, server capacity, congestion and other factors. It is important to measure the performance of applications and services to ensure that quality objectives are being met and to support problem diagnosis. Standardized metrics help to ensure that performance measurement is implemented consistently and facilitate interpretation and comparison.
There are at least three phases in the development of performance standards. They are:
During the development of metrics, it is often useful to define performance objectives and expected value ranges. However, this is not defined as part of the metric specification.
The intended audience for this document includes, but is not limited to, IETF participants who write Performance Metrics documents in the IETF, reviewers of such documents, and members of the Performance Metrics Directorate.
Previous IETF work related to reporting of application Performance Metrics includes the "Real-time Application Quality-of-Service Monitoring (RAQMON) Framework" [RFC4710], which extends the remote network monitoring (RMON) family of specifications to allow real-time quality-of-service (QoS) monitoring of various applications that run on devices such as IP phones, pagers, Instant Messaging clients, mobile phones, and various other handheld computing devices. Furthermore, the "RTP Control Protocol Extended Reports (RTCP XR)" [RFC3611] and the "SIP RTCP Summary Report Protocol" [RFC6035] are protocols that support the real-time reporting of Voice over IP and other applications running on devices such as IP phones and mobile handsets.
The IETF is also actively involved in the development of reliable transport protocols, such as TCP [RFC0793] or SCTP [RFC4960], which would affect the relationship between IP performance and application performance.
Thus there is a gap in the currently chartered coverage of IETF Working Groups (WG): development of Performance Metrics for protocols above and below the IP-layer that can be used to characterize performance on live networks.
Similarly to the "Guidelines for Considering Operations and Management of New Protocols and Protocol Extensions" [RFC5706], which is the reference document for the IETF Operations Directorate, this document should be consulted as part of the new Performance Metric review by the members of the Performance Metrics Directorate.
This document is divided into two major sections beyond the "Purpose and Scope" section. The first is a definition and description of a Performance Metric and its key aspects. The second defines a process to develop these metrics that is applicable to the IETF environment.
The Performance Metrics Directorate is a directorate that provides guidance for Performance Metrics development in the IETF.
The Performance Metrics Directorate should be composed of experts in the performance community, potentially selected from the IPPM, BMWG, and PMOL WGs.
Quality of Service (QoS) is defined similarly to the ITU "Quality of Service (QoS)" E.800 [E.800], i.e.: "Totality of characteristics of a telecommunications service that bear on its ability to satisfy stated and implied needs of the user of the service."
Quality of Experience (QoE) is defined in a similar way to the ITU "QoS experienced/perceived by customer/user (QoE)" E.800 [E.800], i.e.: "a statement expressing the level of quality that customers/users believe they have experienced."
NOTE 1 - The level of QoS experienced and/or perceived by the customer/user may be expressed by an opinion rating.
NOTE 2 - QoE has two main components: quantitative and qualitative. The quantitative component can be influenced by the complete end-to-end system effects (including user devices and network infrastructure).
NOTE 3 - The qualitative component can be influenced by user expectations, ambient conditions, psychological factors, application context, etc.
NOTE 4 - QoE may also be considered as QoS delivered, received, and interpreted by a user with the pertinent qualitative factors influencing his/her perception of the service.
A quantitative measure of performance, specific to an IETF-specified protocol or specific to an application transported over an IETF-specified protocol. Examples of Performance Metrics are: the FTP response time for a complete file download, the DNS response time to resolve the IP address, a database logging time, etc.
The purpose of this document is to define a framework and a process for developing Performance Metrics for protocols above and below the IP-layer (such as IP-based applications that operate over reliable or datagram transport protocols), that can be used to characterize traffic on live networks and services. As such, this document does not define any Performance Metrics.
The scope of this document covers guidelines for the Performance Metrics Directorate members for considering new Performance Metrics, and suggests how the Performance Metrics Directorate will interact with the rest of the IETF. However, this document is not intended to supersede existing working methods within WGs that have existing chartered work in this area.
This process is not intended to govern Performance Metric development in existing IETF WGs that are focused on metrics development, such as IPPM and BMWG. However, this guidelines document may be useful in these activities, and MAY be applied where appropriate. A typical example is the development of Performance Metrics to be exported with the IPFIX protocol [RFC5101], with specific IPFIX information elements [RFC5102], which would benefit from the framework in this document.
The framework in this document applies to Performance Metrics derived from both active and passive measurements.
Network QoS deals with network and network protocol performance, while QoE deals with the assessment of a user's experience in a context of a task or a service. As a result, the topic of application-specific Performance Metrics includes the measurement of performance at layers between IP and the user. For example, network QoS metrics (packet loss, delay, and delay variation [RFC5481]) can be used to estimate application-specific Performance Metrics (de-jitter buffer size and RTP-layer packet loss), then combined with other known aspects of a VoIP application (such as codec type) to estimate a Mean Opinion Score (MOS) [P.800]. However, the QoE for a particular VoIP user depends on the specific context, such as a casual conversation, a business conference call, or an emergency call. Finally, QoS and application-specific Performance Metrics are quantitative, while QoE is qualitative. Also network QoS and application-specific Performance Metrics can be directly or indirectly evident to the user, while the QoE is directly evident.
This section provides key definitions and qualifications of Performance Metrics.
Many of the aspects of metric definition and reporting, even the selection or determination of the essential metrics, depend on who will use the results, and for what purpose. For example, the metric description SHOULD include use cases and example reports that illustrate service quality monitoring and maintenance or identification and quantification of problems.
All documents defining Performance Metrics SHOULD identify the primary audience and its associated requirements. The audience can influence both the definition of metrics and the methods of measurement.
The key areas of variation between different metric users include:
A Performance Metric is a measure of an observable behavior of a networking technology, an application, or a service. Most of the time, the Performance Metric can be directly measured; sometimes, however, the Performance Metric value is computed. The process for determining the value of a metric may assume some implicit or explicit underlying statistical process. In this case, the Performance Metric is an estimate of a parameter of this process, assuming that the statistical process closely models the behavior of the system.
A Performance Metric should serve some defined purpose. This may include the measurement of capacity, quantifying how bad some problem is, measurement of service level, problem diagnosis or location and other such uses. A Performance Metric may also be an input to some other process, for example the computation of a composite Performance Metric or a model or simulation of a system. Tests of the "usefulness" of a Performance Metric include:
For example, consider a distributed application operating over a network connection that is subject to packet loss. A Packet Loss Rate (PLR) Performance Metric is defined as the mean packet loss ratio over some time period. If the application performs poorly over network connections with high packet loss ratio and always performs well when the packet loss ratio is zero then the PLR Performance Metric is useful to some degree. Some applications are sensitive to short periods of high loss (bursty loss) and are relatively insensitive to isolated packet loss events; for this type of application there would be very weak correlation between PLR and application performance. A "better" Performance Metric would consider both the packet loss ratio and the distribution of loss events. If application performance is degraded when the PLR exceeds some rate then a useful Performance Metric may be a measure of the duration and frequency of periods during which the PLR exceeds that rate (as for example in RFC3611).
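The contrast between a mean PLR and a burst-aware metric can be sketched as follows. This is an illustrative sketch only: the function names, the fixed-size window, and the threshold value are assumptions made for this example and are not taken from RFC 3611.

```python
from typing import List, Tuple


def packet_loss_ratio(lost: List[bool]) -> float:
    """Mean packet loss ratio over the whole trace (the PLR of the text)."""
    return sum(lost) / len(lost) if lost else 0.0


def high_loss_periods(lost: List[bool], window: int,
                      threshold: float) -> List[Tuple[int, int]]:
    """Return (first_window_index, length_in_windows) for each run of
    consecutive fixed-size windows whose loss ratio exceeds `threshold`.
    The fixed window is an assumption of this sketch."""
    periods = []
    run_start = None
    n_windows = len(lost) // window
    for w in range(n_windows):
        chunk = lost[w * window:(w + 1) * window]
        above = sum(chunk) / window > threshold
        if above and run_start is None:
            run_start = w                      # a high-loss period begins
        elif not above and run_start is not None:
            periods.append((run_start, w - run_start))
            run_start = None                   # the period has ended
    if run_start is not None:                  # period still open at the end
        periods.append((run_start, n_windows - run_start))
    return periods


# Two traces of 100 packets with the same PLR but different loss patterns.
isolated = [i % 10 == 0 for i in range(100)]   # one loss every 10 packets
bursty = [40 <= i < 50 for i in range(100)]    # ten consecutive losses
```

Both traces yield the same packet_loss_ratio, while only the bursty trace produces a high-loss period, which is the distinction that matters to a burst-sensitive application.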
Some Performance Metrics may not be measured directly, but can be composed from base metrics that have been measured. A composed Performance Metric is derived from other metrics by applying a deterministic process or function (e.g., a composition function). The process may use metrics that are identical to the metric being composed, or metrics that are dissimilar, or some combination of both types. Usually the base metrics have a limited scope in time or space, and they can be combined to estimate the performance of some larger entities.
Some examples of composed Performance Metrics and composed Performance Metric definitions are:
In the context of flow records in IP Flow Information eXport (IPFIX), the IPFIX Mediation: Framework [RFC6183] also discusses some aspects of the temporal and spatial composition.
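A minimal sketch of composition functions is given below. The specific functions (summing per-segment delays, combining per-segment loss ratios under an independence assumption, and pooling loss counts over consecutive intervals) are illustrative choices made for this example; they are not definitions from this document, and all names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def compose_delay_spatial(segment_delays_ms: Sequence[float]) -> float:
    """Spatial composition: estimate the end-to-end delay as the sum of
    the delays measured on consecutive path segments."""
    return sum(segment_delays_ms)


def compose_loss_spatial(segment_loss_ratios: Sequence[float]) -> float:
    """Spatial composition: estimate the end-to-end loss ratio from
    per-segment ratios, ASSUMING losses on segments are independent."""
    return 1.0 - math.prod(1.0 - p for p in segment_loss_ratios)


def compose_loss_temporal(intervals: Sequence[Tuple[int, int]]) -> float:
    """Temporal composition: combine (lost, expected) packet counts from
    consecutive intervals into one loss ratio for the longer period."""
    lost = sum(l for l, _ in intervals)
    expected = sum(e for _, e in intervals)
    return lost / expected if expected else 0.0
```

For instance, two segments that each lose 10% of packets compose to roughly 19% end-to-end under the independence assumption, not 20%.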
An Index is a metric for which the output value range has been selected for convenience or clarity, and the behavior of which is selected to support ease of understanding; for example the R Factor [G.107]. The deterministic function for an index is often developed after the index range and behavior have been determined.
The informative part of a Performance Metric specification is intended to support the implementation and use of the metric. This part SHOULD provide the following data:
(i) Implementation
The implementation description MAY be in the form of text, algorithm or example software. The objective of this part of the metric definition is to assist implementers to achieve consistent results.
(ii) Verification
The Performance Metric definition SHOULD provide guidance on verification testing. This may be in the form of test vectors, a formal verification test method or informal advice.
(iii) Use and Applications
The use and applications description is intended to assist the "user" to understand how, when and where the metric can be applied, and what significance the value range for the metric may have. This MAY include a definition of the "typical" and "abnormal" range of the Performance Metric, if this was not apparent from the nature of the metric. The description MAY include information about the influence of extreme measurement values, i.e. if the Performance Metric is sensitive to outliers. The Use and Application section SHOULD also include the security implications in the description.
For example:
(a) it is fairly intuitive that a lower packet loss ratio would equate to better performance. However the user may not know the significance of some given packet loss ratio,
(b) the speech level of a telephone signal is commonly expressed in dBm0. If the user is presented with:
Speech level = -7 dBm0
this is not intuitively understandable, unless the user is a telephony expert. If the metric definition explains that the typical range is -18 to -28 dBm0, a value higher than -18 means the signal may be too high (loud) and less than -28 means that the signal may be too low (quiet), it is much easier to interpret the metric.
(iv) Reporting Model
The reporting model definition is intended to make any relationship between the metric and the reporting model clear. There are often implied relationships between the method of reporting metrics and the metric itself; however, these are often not made apparent to the implementer. For example, if the metric is a short-term running average packet delay variation (e.g., the interarrival jitter in [RFC3550]) and this value is reported at intervals of 6-10 seconds, the resulting measurement may have limited accuracy when packet delay variation is non-stationary.
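The running-average behavior mentioned in the reporting model example can be sketched as follows. The update rule follows the interarrival jitter estimator of [RFC3550] (gain of 1/16); the function name and the representation of packets as (send_time, arrival_time) pairs are assumptions of this sketch.

```python
from typing import Iterable, List, Tuple


def interarrival_jitter(packets: Iterable[Tuple[float, float]]) -> List[float]:
    """Return the running jitter estimate after each packet.

    `packets` holds (send_time, arrival_time) pairs in the same time unit.
    For consecutive packets, D is the change in transit time, and the
    estimate is updated as J = J + (|D| - J) / 16.
    """
    jitter = 0.0
    prev_transit = None
    estimates = []
    for send_time, arrival_time in packets:
        transit = arrival_time - send_time
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0
        prev_transit = transit
        estimates.append(jitter)
    return estimates
```

Because older deviations decay geometrically, a value sampled once per reporting interval mostly reflects the packets received shortly before the sample was taken, which is why a report every few seconds may miss earlier variation.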
Normative
Informative
A Performance Metric definition MUST have a normative part that defines what the metric is and how it is measured or computed and SHOULD have an informative part that describes the Performance Metric and its application.
The normative part of a Performance Metric definition MUST define at least the following:
(i) Metric Name
Performance Metric names are RECOMMENDED to be unique within the set of metrics being defined for the protocol layer and context. While strict uniqueness may not be attainable (see the IPPM registry [RFC6248] for an example of an IANA metric registry failing to provide sufficient specificity), broad review must be sought to avoid naming overlap. Note that the Performance Metrics Directorate can help with suggestions for IANA metric registration for unique naming. The Performance Metric name MAY be descriptive.
(ii) Metric Description
The Performance Metric description MUST explain what the metric is, what is being measured and how this relates to the performance of the system being measured.
(iii) Method of Measurement or Calculation
The method of measurement or calculation MUST define what is being measured or computed and the specific algorithm to be used. Does the measurement involve active or only passive measurements? Terms such as "average" should be qualified (e.g. running average or average over some interval). Exception cases SHOULD also be defined with the appropriate handling method. For example, there are a number of commonly used metrics related to packet loss; these often don't define the criteria by which a packet is determined to be lost (vs very delayed) or how duplicate packets are handled. For example, if the average packet loss rate during a time interval is reported, and a packet's arrival is delayed from one interval to the next then was it "lost" during the interval during which it should have arrived or should it be counted as received?
Some methods of calculation might require discarding some of the data collected (due to outliers) so as to make the measurement parameters meaningful. One example is burstable billing, which sorts the 5-minute samples and discards the top 5 percent.
Some parameters linked to the method MAY also be reported, so that the Performance Metric can be fully interpreted: for example, the time interval, the load, the minimum packet loss, the potential measurement errors and their sources, the attainable accuracy of the metric (e.g., +/- 0.1), and the method of calculation.
(iv) Units of measurement
The units of measurement MUST be clearly stated.
(v) Measurement Point(s)
If the measurement is specific to a measurement point, this SHOULD be defined. The measurement domain MAY also be defined. Specifically, if measurement points are spread across domains, the measurement domain (intra-, inter-) is another factor to consider.
The Performance Metric definition should discuss how the Performance Metric value might vary depending on which measurement point is chosen. For example, the time between a SIP request [RFC3261] and the final response can be significantly different at the User Agent Client (UAC) or User Agent Server (UAS).
In some cases, the measurement requires multiple measurement points: all measurement points SHOULD be defined, including the measurement domain(s).
(vi) Measurement timing
The acceptable range of timing intervals or sampling intervals for a measurement and the timing accuracy required for such intervals MUST be specified. Short sampling intervals or frequent samples provide a rich source of information that can help to assess application performance but may lead to excessive measurement data. Long measurement or sampling intervals reduce the amount of reported and collected data such that it may be insufficient to understand application performance or service quality insofar as the measured quantity may vary significantly with time.
In case of multiple measurement points, the potential requirement for synchronized clocks must be clearly specified. In the specific example of the IP delay variation application metric, the different aspects of synchronized clocks are discussed in [RFC5481].
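As an illustration of a method of calculation that discards collected data, the burstable billing example given under "Method of Measurement or Calculation" can be sketched as follows. The function name and the rounding convention for the number of discarded samples are assumptions of this sketch; actual billing practices may differ in detail.

```python
from typing import Sequence


def burstable_billing_rate(samples: Sequence[float],
                           discard_percent: int = 5) -> float:
    """Sort the per-interval samples (e.g., 5-minute averages), discard
    the top `discard_percent` percent, and return the highest remaining
    sample. The number of discarded samples is rounded down (assumption).
    """
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    discard = len(ordered) * discard_percent // 100
    kept = ordered[:len(ordered) - discard]
    return kept[-1]
```

With 20 samples of which one is an extreme peak, the peak is discarded and does not influence the reported value, which is the intent of such outlier handling.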
The example used is the loss rate metric as specified in RFC 3611 [RFC3611].
Metric Name: LossRate
Metric Description: The fraction of RTP data packets from the source lost since the beginning of reception.
Method of measurement or calculation: This value is calculated by dividing the total number of packets lost (after the effects of applying any error protection such as FEC) by the total number of packets expected, multiplying the result of the division by 256, limiting the maximum value to 255 (to avoid overflow), and taking the integer part.
Units of Measurement: This metric is expressed as a fixed point number with the binary point at the left edge of the field. For example, a metric value of 12 means a loss rate of approximately 5%.
Measurement Point(s): This metric is made at the receiving end of the RTP stream sent during a Voice over IP call.
Measurement Timing: This metric can be used over a wide range of time intervals. Using time intervals of longer than one hour may prevent the detection of variations in the value of this metric due to time-of-day changes in network load. Timing intervals should not vary in duration by more than +/- 2%.
Implementation: The numbers of duplicated packets and discarded packets do not enter into this calculation. Since receivers cannot be required to maintain unlimited buffers, a receiver MAY categorize late-arriving packets as lost. The degree of lateness that triggers a loss SHOULD be significantly greater than that which triggers a discard.
Verification: The metric value ranges between 0 and 255.
Use and Applications: This metric is useful for monitoring VoIP calls and, more precisely, for detecting the VoIP loss rate in the network. This loss rate, along with the rate of packets discarded due to jitter, has some effect on the quality of the voice stream.
Reporting Model: This metric needs to be associated with a defined time interval, which could be defined by fixed intervals or by a sliding window. In the context of RFC 3611, the metric is measured continuously from the start of the RTP stream, and the value of the metric is sampled and reported in RTCP XR VoIP Metrics reports.
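The calculation described under "Method of measurement or calculation" in the example above can be sketched as follows. The function names and the handling of a zero packet count are assumptions of this sketch and are not specified in the example.

```python
def loss_rate_field(packets_lost: int, packets_expected: int) -> int:
    """Encode the loss rate as an 8-bit fixed-point value: divide lost by
    expected, multiply by 256, cap at 255, and take the integer part."""
    if packets_expected <= 0:
        return 0  # assumption: report 0 when nothing was expected
    lost = max(0, packets_lost)  # assumption: clamp negative counts to 0
    return min(255, (lost * 256) // packets_expected)


def loss_rate_fraction(field: int) -> float:
    """Decode the fixed-point field (binary point at the left edge)."""
    return field / 256.0
```

For 5 packets lost out of 100 expected, loss_rate_field returns 12, and loss_rate_fraction(12) is 0.046875, i.e., approximately 5% as stated under "Units of Measurement".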
This section introduces several Performance Metrics dependencies, which the Performance Metric designer should keep in mind during the Performance Metric development. These dependencies, and any others not listed here, SHOULD be documented in the Performance Metric specifications.
The accuracy of the timing of a measurement may affect the accuracy of the Performance Metric. This may not materially affect a sampled-value metric; however, it would affect an interval-based metric. Some metrics, for example the number of events per time interval, would be directly affected; for example, a 10% variation in time interval would lead directly to a 10% variation in the measured value. Other metrics, such as the average packet loss ratio during some time interval, would be affected to a lesser extent.
If it is necessary to correlate sampled values or intervals, then it is essential that the accuracy of sampling time and interval start/stop times is sufficient for the application (for example +/- 2%).
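The effect of interval timing error on the two kinds of metric can be illustrated with a small worked sketch. It assumes a steady event rate and a steady loss ratio over the interval, which is a simplification; the function names are hypothetical.

```python
def reported_event_rate(true_rate_per_s: float, nominal_interval_s: float,
                        timing_error: float) -> float:
    """Events per second as reported when the interval actually lasts
    nominal * (1 + timing_error) but is assumed to last `nominal`."""
    actual_interval_s = nominal_interval_s * (1.0 + timing_error)
    events_counted = true_rate_per_s * actual_interval_s
    return events_counted / nominal_interval_s


def reported_loss_ratio(true_loss_ratio: float, packet_rate_per_s: float,
                        nominal_interval_s: float,
                        timing_error: float) -> float:
    """Loss ratio over the same mistimed interval: lost and expected
    counts both scale with the actual interval, so the error cancels
    under the steady-state assumption."""
    actual_interval_s = nominal_interval_s * (1.0 + timing_error)
    expected = packet_rate_per_s * actual_interval_s
    lost = true_loss_ratio * expected
    return lost / expected
```

With a 10% timing error, a true rate of 100 events per second is reported as about 110, while the loss ratio is unchanged in this idealized case; with non-stationary loss the ratio would shift, though typically by less than the count-based metric.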
Performance Metric definitions may explicitly or implicitly rely on factors that may not be obvious. For example, the recognition of a packet as being "lost" relies on having some method to know the packet was actually lost (e.g. RTP sequence number), and some time threshold after which a non-received packet is declared as lost. It is important that any such dependencies are recognized and incorporated into the metric definition.
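The dependency of a "lost" decision on a time threshold can be sketched as follows. The representation used here (a per-sequence-number expected arrival time and a single lateness threshold) is an assumption made for illustration, not a method defined in this document.

```python
from typing import Dict, List


def packets_declared_lost(expected_times: Dict[int, float],
                          arrival_times: Dict[int, float],
                          threshold_s: float) -> List[int]:
    """Return the sequence numbers declared lost: packets that never
    arrived, or that arrived more than `threshold_s` seconds after the
    time at which they were expected."""
    lost = []
    for seq in sorted(expected_times):
        arrival = arrival_times.get(seq)
        if arrival is None or arrival - expected_times[seq] > threshold_s:
            lost.append(seq)
    return lost


expected = {1: 0.00, 2: 0.02, 3: 0.04, 4: 0.06}
arrived = {1: 0.001, 2: 0.50, 4: 0.061}   # packet 3 never arrives
```

With a 0.2 second threshold, packets 2 and 3 are declared lost; with a 1.0 second threshold, only packet 3 is. The same trace therefore yields different loss values unless the threshold is part of the metric definition.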
Lower layer Performance Metrics may be used to compute or infer the performance of higher layer applications, potentially using an application performance model. The accuracy of this will depend on many factors including:
(i) The completeness of the set of metrics - i.e. are there metrics for all the input values to the application performance model?
(ii) Correlation between input variables (being measured) and application performance
(iii) Variability in the measured metrics and how this variability affects application performance
Presence of a middlebox [RFC3303], e.g., proxy, network address translation (NAT), redirect server, session border controller (SBC, [RFC5853]), and application layer gateway (ALG) may add variability to or restrict the scope of measurements of a metric. For example, an SBC that does not process RTP loopback packets may block or locally terminate this traffic rather than pass it through to its target.
The IPPM Framework [RFC2330] organizes the results of metrics into three related notions:
Performance Metrics MAY use this organization for the results, with or without the term names used by the IPPM WG. Section 11 of RFC 2330 [RFC2330] should be consulted for further details.
Metrics are completely defined when all options and input variables have been identified and considered. These variables are sometimes left unspecified in a metric definition, and their general name indicates that the user must set them and report them with the results. Such variables are called "parameters" in the IPPM metric template. The scope of the metric, the time at which it was conducted, the length interval of the sliding window measurement, the settings for timers and the thresholds for counters are all examples of parameters.
All documents defining Performance Metrics SHOULD identify all key parameters for each Performance Metric.
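One way to make parameters explicit is to list them in the metric definition and check that every reported result carries a value for each of them. The sketch below is illustrative only; the class names, field names, and parameter names are hypothetical and are not part of the IPPM metric template.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MetricDefinition:
    """A metric together with the parameters the user must set."""
    name: str
    units: str
    parameters: Tuple[str, ...]


@dataclass
class MetricReport:
    """A measured value reported with the parameter settings used."""
    definition: MetricDefinition
    value: float
    parameter_values: Dict[str, object] = field(default_factory=dict)

    def missing_parameters(self) -> List[str]:
        """Names of defined parameters that the report does not state."""
        return [p for p in self.definition.parameters
                if p not in self.parameter_values]


loss_ratio = MetricDefinition(
    name="ExampleLossRatio",            # hypothetical metric name
    units="ratio",
    parameters=("interval_s", "loss_threshold_s"),
)
report = MetricReport(loss_ratio, 0.02, {"interval_s": 10})
```

Here report.missing_parameters() returns ["loss_threshold_s"], flagging a result that cannot be fully interpreted because one parameter was not reported.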
This process is intended to add additional considerations to the processes for adopting new work as described in RFC 2026 [RFC2026] and RFC 2418 [RFC2418]. Note that new Performance Metrics work item proposals SHALL be approved using the existing IETF process. The following entry criteria will be considered for each proposal.
Proposals SHOULD be prepared as Internet Drafts, describing the Performance Metric and conforming to the qualifications above as much as possible. Proposals SHOULD be deliverables of the corresponding protocol development WG charters. As such, the Proposals SHOULD be vetted by that WG prior to discussion by the Performance Metrics Directorate. This aspect of the process includes an assessment of the need for the Performance Metric proposed and assessment of the support for their development in IETF.
Proposals SHOULD include an assessment of interaction and/or overlap with work in other Standards Development Organizations. Proposals SHOULD identify additional expertise that might be consulted.
Proposals SHOULD specify the intended audience and users of the Performance Metrics. The development process encourages participation by members of the intended audience.
Proposals SHOULD identify any security and IANA requirements. Security issues could potentially involve revealing of user identifying data or the potential misuse of active test tools. IANA considerations may involve the need for a Performance Metrics registry.
Each Performance Metric SHOULD be assessed according to the following list of qualifications:
The Performance Metrics Directorate SHALL provide guidance to the related protocol development WG when considering an Internet Draft that specifies Performance Metrics for a protocol. A sufficient number of individuals with expertise must be willing to consult on the draft. If the related WG has concluded, comments on the proposal should still be sought from key RFC authors and former chairs.
A formal review is recommended by the time the document is reviewed by the Area Directors or an IETF Last Call is being conducted, in the same way that expert reviews are performed by other directorates.
Existing mailing lists SHOULD be used; however, a dedicated mailing list MAY be initiated if necessary to facilitate work on a draft.
In some cases, it will be appropriate to have the IETF session discussion during the related protocol WG session, to maximize visibility of the effort to that WG and expand the review.
The Performance Metrics Directorate will assist with the progression of RFCs along the Standards Track. See [I-D.bradner-metricstest]. This may include the preparation of test plans to examine different implementations of the metrics to ensure that the metric definitions are clear and unambiguous (depending on the final form of the draft above).
This document makes no request of IANA.
Note to RFC EDITOR: this section may be removed on publication as an RFC.
In general, the existence of a framework for Performance Metric development does not constitute a security issue for the Internet. Performance Metric definitions may introduce security issues and this framework recommends that those defining Performance Metrics should identify any such risk factors.
The security considerations that apply to any active measurement of live networks are relevant here. See [RFC4656].
The security considerations that apply to any passive measurement of specific packets in live networks are relevant here as well. See the security considerations in [RFC5475].
The authors would like to thank Al Morton, Dan Romascanu, Daryl Malas and Loki Jorgenson for their comments and contributions. The authors would like to thank Aamer Akhter, Yaakov Stein, Carsten Schmoll, and Jan Novak for their reviews.
[RFC2026]  Bradner, S., "The Internet Standards Process -- Revision 3", BCP 9, RFC 2026, October 1996.
[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2418]  Bradner, S., "IETF Working Group Guidelines and Procedures", BCP 25, RFC 2418, September 1998.
[RFC4656]  Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and M. Zekauskas, "A One-way Active Measurement Protocol (OWAMP)", RFC 4656, September 2006.