Internet-Draft | Performance Measurement for SR-MPLS | October 2024 |
Gandhi, et al. | Expires 19 April 2025 | [Page] |
This document specifies the application of the MPLS loss and delay measurement techniques, originally defined in RFC 6374, RFC 7876, and RFC 9341, within Segment Routing (SR) networks that utilize the MPLS data plane. Segment Routing enables the forwarding of packets through an ordered list of instructions, known as segments, which are imposed at the ingress node. By applying the mechanisms from RFC 6374, RFC 7876, and RFC 9341 to SR-MPLS networks, this document facilitates accurate measurement of packet loss and delay for Segment Routing paths. It defines the procedures and extensions necessary to perform performance measurement and fault management in SR-MPLS environments, ensuring that network operators can effectively measure and maintain the quality of service across their SR-based MPLS networks. This includes coverage of links and end-to-end SR-MPLS paths, as well as SR Policies.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 19 April 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Segment Routing (SR), as specified in [RFC8402], leverages the source routing paradigm and applies to both the Multiprotocol Label Switching (SR-MPLS) and IPv6 (SRv6) data planes. SR takes advantage of Equal-Cost Multipaths (ECMPs) between source and transit nodes, between transit nodes, and between transit and destination nodes. SR Policies, defined in [RFC9256], are used to steer traffic through specific, user-defined paths using a list of segments.¶
A comprehensive SR Performance Measurement toolset is one of the essential requirements for measuring network performance to provide Service Level Agreements (SLAs).¶
[RFC6374] specifies protocol mechanisms to enable efficient and accurate measurement of packet loss, one-way and two-way delay, as well as related metrics such as delay-variation in MPLS networks.¶
[RFC7876] specifies mechanisms for sending and processing out-of-band responses over a UDP return path when receiving query messages defined in [RFC6374]. These mechanisms can be applied to SR-MPLS networks.¶
[RFC9341] defines the Alternate-Marking Method using block number as a data correlation mechanism for packet loss measurement.¶
This document utilizes the mechanisms from [RFC6374], [RFC7876], and [RFC9341] for delay and loss measurements in SR-MPLS networks. This includes coverage of links and end-to-end SR-MPLS paths, as well as SR Policies.¶
This document defines Return Path and Block Number TLV extensions for [RFC6374] for delay and loss measurement in SR-MPLS networks. These TLV extensions also apply to MPLS Label Switched Paths (LSPs) [RFC3031]. However, the procedure for delay and loss measurement of MPLS LSPs is outside the scope of this document.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
ACH: Associated Channel Header.¶
DM: Delay Measurement.¶
ECMP: Equal-Cost Multipath.¶
G-ACh: Generic Associated Channel.¶
GAL: Generic Associated Channel (G-ACh) Label.¶
LM: Loss Measurement.¶
LSE: Label Stack Entry.¶
MPLS: Multiprotocol Label Switching.¶
PSID: Path Segment Identifier.¶
SID: Segment Identifier.¶
SL: Segment List.¶
SR: Segment Routing.¶
SR-MPLS: Segment Routing with MPLS data plane.¶
TC: Traffic Class.¶
TE: Traffic Engineering.¶
TTL: Time-To-Live.¶
URO: UDP Return Object.¶
In the Reference Topology shown in Figure 1, the querier node Q1 initiates a query message, and the responder node R1 transmits a response message for the query message received. The response message may be sent back to the querier node Q1 on the same path (same set of links and nodes) or a different path in the reverse direction from the path taken towards the responder R1.¶
T1 is a transmit timestamp, and T4 is a receive timestamp, both added by node Q1. T2 is a receive timestamp, and T3 is a transmit timestamp, both added by node R1.¶
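As a rough illustration of how the four timestamps combine, the sketch below applies the delay formulas of Section 2.4 of [RFC6374]. The function name and the use of plain integer nanoseconds are assumptions made for illustration; real DM messages carry timestamps in the formats defined by [RFC6374].

```python
def delays(t1: int, t2: int, t3: int, t4: int) -> dict:
    """Combine DM timestamps, all expressed in the same unit (e.g., ns).

    t1: query transmit at Q1      t2: query receive at R1
    t3: response transmit at R1   t4: response receive at Q1
    """
    return {
        # One-way values are meaningful only if the Q1 and R1 clocks
        # are synchronized.
        "forward_one_way": t2 - t1,
        "reverse_one_way": t4 - t3,
        # Two-way delay subtracts the responder's residence time (t3 - t2),
        # so it does not depend on clock synchronization between the nodes.
        "two_way": (t4 - t1) - (t3 - t2),
    }


sample = delays(t1=1_000, t2=1_450, t3=1_500, t4=1_980)
```

With these sample values, the responder holds the message for 50 units, which is excluded from the two-way result.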
SR is enabled with the MPLS data plane on nodes Q1 and R1. The nodes Q1 and R1 are connected via a channel (Section 2.9.1 of [RFC6374]). The channel may be a directly connected link enabled with MPLS or a Point-to-Point (P2P) SR-MPLS path [RFC8402]. The link may be a physical interface, a virtual link, a Link Aggregation Group (LAG) [IEEE802.1AX], or a LAG member link. The SR-MPLS path may be an SR-MPLS Policy [RFC9256] on node Q1 (called the head-end) with node R1 as its destination (called the tail-end).¶
In this document, the procedures defined in [RFC6374], [RFC7876], and [RFC9341] are utilized for delay and loss measurement in SR-MPLS networks. Specifically, the one-way, two-way, and round-trip delay measurements described in Section 2.4 of [RFC6374] are further elaborated for application within SR-MPLS networks. Similarly, the packet loss measurement procedures outlined in Section 2.2 of [RFC6374] are extended for use in SR-MPLS networks.¶
Packet loss measurement using the Alternate-Marking Method defined in [RFC9341] may employ the Block Number for data correlation. This is achieved by utilizing the Block Number TLV extension defined in this document.¶
In SR-MPLS networks, the query and response messages defined in [RFC6374] are transmitted as follows:¶
If it is desired in SR-MPLS networks that the same path (i.e., the same set of links and nodes) between the querier and responder be used in both directions of the measurement, this can be achieved by using the Return Path TLV extension defined in this document.¶
The performance measurement procedures for links can be used to compute extended Traffic Engineering (TE) metrics for delay and loss, as described herein. These metrics are advertised in the network using the routing protocol extensions defined in [RFC7471], [RFC8570], and [RFC8571].¶
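A minimal sketch of how a set of link delay samples and a pair of traffic counters might be reduced to the extended TE metrics (average, minimum, and maximum delay, delay variance, and packet loss) is shown below. The function names are hypothetical, and computing the variance as the population variance of the samples is an assumption made here; the routing protocol extensions may define the advertised delay variation differently.

```python
from statistics import mean, pvariance


def link_delay_metrics(samples_us: list) -> dict:
    """Summarize delay samples (microseconds) collected over one interval."""
    if not samples_us:
        raise ValueError("at least one delay sample is required")
    return {
        "average": mean(samples_us),
        "minimum": min(samples_us),
        "maximum": max(samples_us),
        # ASSUMPTION: population variance of the samples.
        "variance": pvariance(samples_us),
    }


def link_loss_percent(tx_packets: int, rx_packets: int) -> float:
    """Packet loss over one interval as a percentage of transmitted packets."""
    if tx_packets <= 0:
        raise ValueError("tx_packets must be positive")
    return 100.0 * (tx_packets - rx_packets) / tx_packets


metrics = link_delay_metrics([100.0, 120.0, 110.0, 130.0])
loss = link_loss_percent(tx_packets=1000, rx_packets=990)
```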
The query message, as defined in [RFC6374], is sent over the links for both delay and loss measurement. In each Label Stack Entry (LSE) [RFC3032] in the MPLS label stack, the TTL value MUST be set to 255 [RFC5082].¶
An SR-MPLS Policy Candidate-Path may contain a number of Segment Lists (SLs) (i.e., a stack of MPLS labels) [RFC9256]. For delay and/or loss measurement for an end-to-end SR-MPLS Policy, the query messages MUST be transmitted for every SL of the SR-MPLS Policy Candidate-Path, by creating a separate session for each SL. Each query message of a session contains an SR-MPLS label stack of the Candidate-Path, with the G-ACh Label (GAL) at the bottom of the stack (with S=1) as shown in Figure 2. In each LSE in the MPLS label stack, the TTL value MUST be set to 255 [RFC5082].¶
The fields "0001", Version, Reserved, and Channel Type shown in Figure 2 are specified in [RFC5586].¶
The SR-MPLS label stack can be empty in the case of a one-hop SR-MPLS Policy with an Implicit NULL label.¶
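The label stack of a query message can be sketched as follows, assuming the standard LSE bit layout of [RFC3032] and the GAL value 13 from [RFC5586]. The helper names and the example SID label values are hypothetical.

```python
import struct

GAL_LABEL = 13    # G-ACh Label value reserved by RFC 5586
PROBE_TTL = 255   # TTL required in every LSE of the query message


def encode_lse(label: int, tc: int = 0, s: int = 0, ttl: int = PROBE_TTL) -> bytes:
    """Pack one 32-bit LSE: Label (20) | TC (3) | S (1) | TTL (8)."""
    if not 0 <= label < 2**20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8 or s not in (0, 1) or not 0 <= ttl <= 255:
        raise ValueError("TC, S, or TTL out of range")
    word = (label << 12) | (tc << 9) | (s << 8) | ttl
    return struct.pack("!I", word)


def build_query_label_stack(segment_list: list) -> bytes:
    """Labels of one Segment List, followed by the GAL with S=1."""
    stack = b"".join(encode_lse(label) for label in segment_list)
    return stack + encode_lse(GAL_LABEL, s=1)


two_hop = build_query_label_stack([16001, 16002])  # example SID labels
one_hop = build_query_label_stack([])              # Implicit NULL case
```

In the one-hop case the SR-MPLS part of the stack is empty, so only the GAL entry remains.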
For an SR-MPLS Policy, to ensure that the query message is processed by the intended responder, the Destination Address TLV (Type 129) [RFC6374] containing the address of the responder can be sent in the query messages. The responder that supports this TLV MUST return Success in "Control Code" [RFC6374] if it is the intended destination for the query. Otherwise, it MUST return 0x15: Error - Invalid Destination Node Identifier [RFC6374].¶
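The responder-side check can be sketched as below. The Control Code values are those of [RFC6374]; the function name and the idea of passing the node's local addresses as a set of strings are assumptions for illustration only.

```python
import ipaddress

CC_SUCCESS = 0x01
CC_ERR_INVALID_DESTINATION = 0x15  # Error - Invalid Destination Node Identifier


def destination_control_code(destination: str, local_addresses: set) -> int:
    """Return the response Control Code for a Destination Address TLV."""
    target = ipaddress.ip_address(destination)
    # Parse both sides so that different textual forms of the same
    # address compare equal.
    local = {ipaddress.ip_address(addr) for addr in local_addresses}
    return CC_SUCCESS if target in local else CC_ERR_INVALID_DESTINATION
```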
In one-way measurement mode defined in Section 2.4 of [RFC6374], the querier can receive "out-of-band" response messages with an IP/UDP header by properly setting the UDP Return Object (URO) TLV in the query message. The URO TLV (Type=131) is defined in [RFC7876] and includes the UDP-Destination-Port and IP Address. When the querier sets an IP address and a UDP port in the URO TLV, the response message MUST be sent to that IP address as the destination address and UDP port as the destination port. In addition, the "Control Code" in the query message MUST be set to "out-of-band response requested" [RFC6374].¶
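The following sketch encodes a URO TLV, assuming the layout of [RFC7876]: a one-byte Type, a one-byte Length (6 for IPv4, 18 for IPv6), a two-byte UDP-Destination-Port, and the address. The function name, the example address, and the port number are hypothetical.

```python
import ipaddress
import struct

URO_TLV_TYPE = 131                # UDP Return Object, RFC 7876
CC_OUT_OF_BAND_RESPONSE = 0x1     # query Control Code, RFC 6374


def encode_uro_tlv(address: str, udp_port: int) -> bytes:
    """Encode the URO TLV carried in a query message."""
    if not 0 < udp_port <= 0xFFFF:
        raise ValueError("UDP port out of range")
    ip = ipaddress.ip_address(address)
    # Value = port (2 bytes) + packed IPv4 (4 bytes) or IPv6 (16 bytes).
    value = struct.pack("!H", udp_port) + ip.packed
    return struct.pack("!BB", URO_TLV_TYPE, len(value)) + value


uro_v4 = encode_uro_tlv("192.0.2.1", 50000)
uro_v6 = encode_uro_tlv("2001:db8::1", 50000)
```

A querier that includes this TLV would also set the query Control Code to `CC_OUT_OF_BAND_RESPONSE`.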
In two-way measurement mode defined in Section 2.4 of [RFC6374], the response messages SHOULD be sent back in-band on the same link or the same end-to-end SR-MPLS path (same set of links and nodes) in the reverse direction to the querier.¶
For links, the response message as defined in [RFC6374] is sent back on the same incoming link where the query message is received. In this case, the "Control Code" in the query message MUST be set to "in-band response requested" [RFC6374].¶
For end-to-end SR-MPLS paths, the responder transmits the response message (example as shown in Figure 2) on a specific return SR-MPLS path. The querier can request in the query message for the responder to send the response message back on a given return path using the MPLS Label Stack sub-TLV in the Return Path TLV defined in this document.¶
The loopback measurement mode defined in Section 2.8 of [RFC6374] is used to measure round-trip delay for a bidirectional circular SR-MPLS path. In this mode for SR-MPLS, the received query messages are not punted out of the fast path in forwarding (i.e., to the slow path or control plane) at the responder. In other words, the responder does not process the payload or generate response messages. The loopback function simply returns the received query message to the querier without responder modifications [RFC6374].¶
The loopback mode is done by generating "queries" with the Response flag set to 1 and adding the Loopback Request object (Type 3) [RFC6374]. In this case, the label stack in the query messages carries both the forward and reverse paths in the MPLS header, with the GAL still carried at the bottom of the label stack (with S=1), as in the example shown in Figure 2.¶
As defined in [RFC6374], MPLS Delay Measurement (DM) query and response messages use the Associated Channel Header (ACH) (value 0x000C for delay measurement), which identifies the message type and the message payload, defined in Section 3.2 of [RFC6374], that follows the ACH. For delay measurement, the same ACH value is used for both links and end-to-end SR-MPLS Policies.¶
The Loss Measurement (LM) protocol can perform two distinct kinds of loss measurement, direct and inferred, as described in Section 2.9.8 of [RFC6374].¶
As defined in [RFC6374], MPLS LM query and response messages use the Associated Channel Header (ACH) (value 0x000A for direct loss measurement or value 0x000B for inferred loss measurement), which identifies the message type and the message payload, defined in Section 3.1 of [RFC6374], that follows the ACH. For loss measurement, the same ACH value is used for both links and end-to-end SR-MPLS Policies.¶
As defined in [RFC6374], Combined DM+LM query and response messages use the Associated Channel Header (ACH) (value 0x000D for direct loss and delay measurement or value 0x000E for inferred loss and delay measurement), which identifies the message type and the message payload, defined in Section 3.3 of [RFC6374], that follows the ACH. For combined loss and delay measurement, the same ACH value is used for both links and end-to-end SR-MPLS Policies.¶
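The ACH values listed above can be collected in one place and packed into the 32-bit header, as in this sketch. It assumes the ACH layout of [RFC5586] (the "0001" nibble, a 4-bit Version, an 8-bit Reserved field, and a 16-bit Channel Type); the enum and function names are hypothetical.

```python
import struct
from enum import IntEnum


class PmChannelType(IntEnum):
    DIRECT_LM = 0x000A
    INFERRED_LM = 0x000B
    DM = 0x000C
    DIRECT_LM_DM = 0x000D
    INFERRED_LM_DM = 0x000E


def encode_ach(channel_type: PmChannelType, version: int = 0) -> bytes:
    """Pack the ACH: 0001 | Version (4) | Reserved (8) | Channel Type (16)."""
    if not 0 <= version <= 0xF:
        raise ValueError("version must fit in 4 bits")
    first_byte = (0b0001 << 4) | version
    return struct.pack("!BBH", first_byte, 0, int(channel_type))


dm_ach = encode_ach(PmChannelType.DM)
```

The same header value applies whether the session runs over a link or an end-to-end SR-MPLS Policy.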
The Path Segment Identifier (PSID) [RFC9545] MUST be carried in the received data packet for the traffic flow under measurement for accounting received traffic on the egress node of the SR-MPLS Policy. In direct mode, the PSID in the received query message, as shown in Figure 3, can be used to associate the received traffic counter on the responder to detect the transmit packet loss for the end-to-end SR-MPLS Policy.¶
In inferred mode, the PSID in the received query messages, as shown in Figure 3, can be used to count the received query messages on the responder to detect the transmit packet loss for an end-to-end SR-MPLS Policy.¶
The fields "0001", Version, Reserved, and Channel Type shown in Figure 3 are specified in [RFC5586].¶
Different values of PSID can be used per Candidate-Path for accounting received traffic to measure packet loss at the Candidate-Path level. Similarly, different values of PSID can be used per Segment List of the Candidate-Path for accounting received traffic to measure packet loss at the Segment List level. The same value of PSID can be used for all Segment Lists of the SR-MPLS Policy to measure packet loss at the SR-MPLS Policy level.¶
The packet loss measurement using the Alternate-Marking Method defined in [RFC9341] may use the block number for data correlation for the traffic flow under measurement. As defined in Section 3.1 of [RFC9341], the block number is used to divide the traffic flow into consecutive blocks and count the number of packets transmitted and received in each block for loss measurement.¶
As described in Section 4.3 of [RFC9341], a protocol-based distributed solution can be used to exchange values of counters on the nodes for loss measurement. That solution is further described in this document using the LM messages defined in [RFC6374].¶
The querier node assigns a block number to the block of data packets of the traffic flow under measurement. The querier counts the number of packets transmitted in each block. The mechanism for the assignment of the block number is a local decision on the querier and is outside the scope of this document.¶
As an example, the querier can use the procedure defined in [I-D.ietf-mpls-inband-pm-encapsulation] for alternate marking of the data packets of the traffic flow under measurement. The responder counts the number of received packets in each block based on the marking in the received data packets. The querier and responder maintain separate sets of transmit and receive counters for each marking. The marking can be used as a block number or a separate block number can be incremented when the marking changes. Other methods can be defined for alternate marking of the data packets of the traffic flow under measurement to assign a block number for the counters.¶
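The per-block counting described above can be sketched as follows. The class and function names are hypothetical, and the sketch keeps counters in memory on a single host purely for illustration; in practice the querier and responder each hold their own counters and compare them for a block only after its marking is no longer in use.

```python
from collections import Counter


class FlowCounters:
    """Packet counters keyed by block number, kept separately on each node."""

    def __init__(self) -> None:
        self.per_block = Counter()

    def count_packet(self, block: int) -> None:
        self.per_block[block] += 1


def block_loss(tx: FlowCounters, rx: FlowCounters, block: int) -> int:
    """Loss for one completed block: packets sent minus packets received."""
    return tx.per_block[block] - rx.per_block[block]


querier_tx = FlowCounters()
responder_rx = FlowCounters()
# (block number, packets sent, packets received)
for block, sent, received in [(0, 5, 4), (1, 3, 3)]:
    for _ in range(sent):
        querier_tx.count_packet(block)
    for _ in range(received):
        responder_rx.count_packet(block)

loss_block_0 = block_loss(querier_tx, responder_rx, 0)
loss_block_1 = block_loss(querier_tx, responder_rx, 1)
```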
The LM query and response messages defined in [RFC6374] are used to measure packet loss for the block of data packets transmitted with the previous marking while data packets carry alternate marking. Specifically, LM query and response messages carry the transmit and receive counters (which are currently not incrementing) along with their block number to correlate for loss measurement.¶
The assumption stated in Section 4.3 of [RFC9341], that the measurement nodes are time synchronized for the block number mechanism, is not necessary here, as the block number on the responder can be synchronized based on the received LM query messages.¶
In two-way measurement mode, the responder may transmit the response message on a specific return path, for example, in an ECMP environment. The querier can request in the query message for the responder to send a response message back on a given return path (e.g., co-routed bidirectional path). This allows the responder to avoid creating and maintaining additional states (containing return paths) for the sessions.¶
The querier may not be directly reachable from the responder in a network. The querier in this case MUST send its reachability path information to the responder using the Return Path TLV.¶
[RFC6374] defines query and response messages that can include one or more optional TLVs. A new TLV Type (TBA1) is defined in this document for the Return Path TLV to carry return path information in query messages. The Return Path TLV is specific to a measurement session. The format of the Return Path TLV is shown in Figure 4:¶
The Length is a one-byte field and is equal to the length of the Return Path Sub-TLV and the Reserved field in bytes. Length MUST NOT be 0 or 1.¶
The Return Path TLV is defined in the Mandatory TLV Type registry space [RFC6374]. The querier MUST only insert one Return Path TLV in the query message. The responder that supports this TLV MUST only process the first Return Path TLV and ignore the other Return Path TLVs if present. The responder that supports this TLV also MUST send the response message back on the return path specified in the Return Path TLV. The responder also MUST NOT add a Return Path TLV in the response message. The Reserved field MUST be set to 0 and MUST be ignored on the receive side.¶
The Return Path TLV contains a Sub-TLV to carry the return path. The format of the MPLS Label Stack Sub-TLV is shown in Figure 5. The label entries in the Sub-TLV MUST be in network order. The MPLS Label Stack Sub-TLV in the Return Path TLV is of the following type:¶
The MPLS Label Stack contains a list of 32-bit LSEs, each of which includes a 20-bit label value, a 3-bit TC value, a 1-bit EOS (S) field, and an 8-bit TTL value. An MPLS Label Stack Sub-TLV may carry a stack of labels or a Binding SID label [RFC8402] of the Return SR-MPLS Policy.¶
The Length is a one-byte field and is equal to the length of the label stack field and the Reserved field in bytes. Length MUST NOT be 0 or 1.¶
The Return Path TLV MUST carry only one Return Path Sub-TLV. The MPLS Label Stack in the Return Path Sub-TLV MUST contain at least one MPLS Label. The responder that supports this Sub-TLV MUST only process the first Return Path Sub-TLV and ignore the other Return Path Sub-TLVs if present. The responder that supports this Sub-TLV MUST send the response message back on the return path specified in the Return Path Sub-TLV. The Reserved field MUST be set to 0 and MUST be ignored on the receive side.¶
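A responder might decode the label stack field of the Sub-TLV along the lines of the sketch below. It covers only the label stack bytes, not the Sub-TLV header, and assumes the standard LSE bit layout of [RFC3032]; the type and function names are hypothetical.

```python
import struct
from typing import NamedTuple


class Lse(NamedTuple):
    label: int
    tc: int
    s: int
    ttl: int


def decode_label_stack(data: bytes) -> list:
    """Split the label stack field into LSEs, in network byte order."""
    # The stack must hold at least one label, and each LSE is 4 bytes.
    if len(data) == 0 or len(data) % 4 != 0:
        raise ValueError("label stack must contain one or more 4-byte LSEs")
    entries = []
    for (word,) in struct.iter_unpack("!I", data):
        entries.append(
            Lse(word >> 12, (word >> 9) & 0x7, (word >> 8) & 0x1, word & 0xFF)
        )
    return entries


return_path = decode_label_stack(bytes.fromhex("03e810ff03e821ff"))
```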
[RFC6374] defines query and response messages that can include one or more optional TLVs. A new TLV Type (value TBA2) is defined in this document to carry the Block Number (8-bit) of the traffic counters in the LM query and response messages. The format of the Block Number TLV is shown in Figure 6:¶
The Length is a one-byte field and is equal to 2 bytes.¶
The Block Number TLV is defined in the Mandatory TLV Type registry space [RFC6374]. The querier MUST only insert one Block Number TLV in the query message to identify the Block Number for the traffic counters in the forward direction. The responder that supports this TLV MUST only insert one Block Number TLV in the response message to identify the Block Number for the traffic counters in the reverse direction. The responder also MUST return the first Block Number TLV from the query message and ignore the other Block Number TLVs if present. The Block Number TLV is specific to a measurement session. The R flag is used to indicate the query and response message direction associated with the Block Number. The R flag MUST be clear in the query message for the Block Number associated with Counter 1 and Counter 2, and set in the response message for the Block Number associated with Counter 3 and Counter 4. The Reserved field MUST be set to 0 and MUST be ignored on the receive side.¶
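The encoding of the Block Number TLV can be sketched as below. The byte layout used here is an assumption, since it is taken from the stated Length of 2 rather than from the figure: one byte holding the Reserved bits with the R flag as its low-order bit, followed by the 8-bit Block Number. The TLV Type is passed as a parameter because its value (TBA2) is not yet assigned; the value 0x7F in the example is a placeholder only.

```python
import struct


def encode_block_number_tlv(tlv_type: int, block_number: int, is_response: bool) -> bytes:
    """Encode Type | Length=2 | Reserved+R | Block Number (assumed layout)."""
    if not 0 <= tlv_type <= 0xFF:
        raise ValueError("TLV type must fit in 8 bits")
    if not 0 <= block_number <= 0xFF:
        raise ValueError("block number must fit in 8 bits")
    # ASSUMPTION: R is the low-order bit of the byte after Length.
    # R is clear in queries (Counters 1 and 2), set in responses (3 and 4).
    flags = 0x01 if is_response else 0x00
    return struct.pack("!BBBB", tlv_type, 2, flags, block_number)


PLACEHOLDER_TYPE = 0x7F  # hypothetical stand-in for TBA2
query_tlv = encode_block_number_tlv(PLACEHOLDER_TYPE, 5, is_response=False)
response_tlv = encode_block_number_tlv(PLACEHOLDER_TYPE, 5, is_response=True)
```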
The SLs of an SR-MPLS Policy can have ECMPs between the source and transit nodes, between transit nodes, and between transit and destination nodes. Usage of node SID [RFC8402] by the SLs of an SR-MPLS Policy can result in ECMP paths. In addition, usage of Anycast SID [RFC8402] by the SLs of an SR-MPLS Policy can result in ECMP paths via transit nodes that are part of that Anycast group. The query and response messages SHOULD be sent to traverse different ECMP paths to measure the delay of each ECMP path of an SL.¶
The forwarding plane has various hashing functions available to forward packets on specific ECMP paths. For end-to-end SR-MPLS Policy delay measurement, different entropy label [RFC6790] values can be used in query and response messages to take advantage of the hashing function in the forwarding plane to influence the ECMP path taken by them.¶
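One simple way to obtain a set of distinct entropy label values for successive query messages is sketched below. It only generates candidate values; where the entropy label is placed in the SR-MPLS label stack, and which ECMP path a given value maps to, depend on the forwarding plane and are not modeled here. The function name and the optional seed parameter are hypothetical.

```python
import random

MIN_ENTROPY_LABEL = 16        # values 0-15 are reserved labels
MAX_ENTROPY_LABEL = 2**20 - 1  # largest 20-bit label value


def entropy_label_candidates(count: int, seed=None) -> list:
    """Pick `count` distinct 20-bit entropy label values."""
    rng = random.Random(seed)
    return rng.sample(range(MIN_ENTROPY_LABEL, MAX_ENTROPY_LABEL + 1), count)


candidates = entropy_label_candidates(8, seed=1)
```

Sending one query per candidate value gives the hashing function a chance to spread the probes over the available ECMP paths, although coverage of every path is not guaranteed.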
The considerations for loss measurement for different ECMP paths of an SR-MPLS Policy are outside the scope of this document.¶
The extended TE metrics for link delay (namely, average delay, minimum delay, maximum delay and delay-variance) and packet loss can be computed using the performance measurement procedures described in this document and advertised in the routing domain as follows:¶
The procedures defined in this document are backwards compatible with the procedures defined in [RFC6374] at both the querier and responder. If the responder does not support the new Mandatory TLV Types defined in this document, it MUST return Error 0x17: Unsupported Mandatory TLV Object as per [RFC6374].¶
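The backwards-compatible behavior of a responder can be sketched as follows, assuming the [RFC6374] convention that TLV Types 0 through 127 are mandatory and 128 through 255 are optional. The function name and the way supported types are passed in are hypothetical.

```python
CC_SUCCESS = 0x01
CC_ERR_UNSUPPORTED_MANDATORY_TLV = 0x17
FIRST_OPTIONAL_TLV_TYPE = 128  # types below this value are mandatory


def response_control_code(received_types: list, supported_types: set) -> int:
    """Choose the response Control Code from the TLV types in a query."""
    for tlv_type in received_types:
        is_mandatory = tlv_type < FIRST_OPTIONAL_TLV_TYPE
        if is_mandatory and tlv_type not in supported_types:
            return CC_ERR_UNSUPPORTED_MANDATORY_TLV
        # Unsupported optional TLVs are simply skipped.
    return CC_SUCCESS
```

A responder that predates this document would therefore reject a query carrying a Return Path or Block Number TLV rather than silently ignoring it.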
The manageability considerations described in Section 7 of [RFC6374] and Section 6 of [RFC7876] are applicable to this specification.¶
The security considerations specified in [RFC6374], [RFC7471], [RFC8570], [RFC8571], [RFC7876], and [RFC9341] also apply to the procedures described in this document.¶
The procedure defined in this document is intended for deployment in a single operator administrative domain. As such, the querier node, the responder node, and the forward and return paths are provisioned by the operator for the probe session. It is assumed that the operator has verified the integrity of the forward and return paths of the probe packets.¶
The "Return Path" TLV extensions defined in this document may be used for potential address spoofing. For example, a query message may carry a return path that has a destination that is not local at the querier. To prevent such possible attacks, the responder may drop the query messages when it cannot determine whether the return path has the destination local at the querier. The querier may send a proper source address in the "Source Address" TLV that the responder can use to make that determination, for example, by checking the access control list provisioned by the operator.¶
IANA is requested to allocate values for the following Mandatory TLV Types for [RFC6374] from the "MPLS Loss/Delay Measurement TLV Object" registry contained within the "Generic Associated Channel (G-ACh) Parameters" registry set:¶
Value | Description | Reference |
---|---|---|
TBA1 | Return Path TLV | This document |
TBA2 | Block Number TLV | This document |
The Block Number TLV is carried in the query and response messages, and the Return Path TLV is carried in the query messages.¶
IANA is requested to create a registry for "Return Path Sub-TLV Type." All code points in the range 1 through 175 in this registry shall be allocated according to the "IETF Review" procedure as specified in [RFC8126]. Code points in the range 176 through 239 in this registry shall be allocated according to the "First Come, First Served" procedure as specified in [RFC8126]. Remaining code points are allocated according to Table 2:¶
Value | Description | Reference |
---|---|---|
1 - 175 | IETF Review | This document |
176 - 239 | First Come First Served | This document |
240 - 251 | Experimental Use | This document |
252 - 254 | Private Use | This document |
This document defines the following values in the Return Path Sub-TLV Type registry:¶
Value | Description | Reference |
---|---|---|
0 | Reserved | This document |
1 | MPLS Label Stack of the Return Path | This document |
255 | Reserved | This document |
The authors would like to thank Thierry Couture and Ianik Semco for the discussions on the use cases for performance measurement in segment routing networks. The authors would like to thank Patrick Khordoc, Ruby Lin, and Haowei Shi for implementing the mechanisms defined in this document. The authors would like to thank Greg Mirsky and Xiao Min for providing many useful comments and suggestions. The authors would also like to thank Stewart Bryant, Sam Aldrin, Tarek Saad, and Rajiv Asati for their review comments. Thanks to Huaimo Chen, Yimin Shen, and Xufeng Liu for MPLS-RT expert review, Zhaohui Zhang for RTGDIR early review, Tony Li for shepherd's review, Ned Smith for SECDIR review, Roni Even for Gen-ART review, Marcus Ihlar for TSV-ART review, Dhruv Dhody for OPSDIR review, Gunter Van de Velde, Paul Wouters, and Roman Danyliw for IESG review.¶
Sagar Soni
Cisco Systems, Inc.
Email: sagsoni@cisco.com

Zafar Ali
Cisco Systems, Inc.
Email: zali@cisco.com

Pier Luigi Ventre
CNIT
Italy
Email: pierluigi.ventre@cnit.it