PCE Working Group | H. Pouyllau |
Internet-Draft | Alcatel-Lucent |
Updates: 5440 (if approved) | R. Theillaud |
Intended status: Standards Track | Marben Products |
Expires: August 11, 2020 | J. Meuric |
Orange | |
H. Zheng (Editor) | |
X. Zhang | |
Huawei Technologies | |
February 8, 2020 |
Extensions to the Path Computation Element Communication Protocol for Enhanced Errors and Notifications
draft-ietf-pce-enhanced-errors-07
This document defines new error and notification TLVs for the PCE Communication Protocol (PCEP) specified in RFC5440, and will update it. It identifies the possible PCEP behaviors in case of error or notification. Thus, this draft defines types of errors and how they are disclosed to other PCEs in order to support predefined PCEP behaviors.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 11, 2020.
Copyright (c) 2020 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
PCE terminology is defined in [RFC4655].
PCEP Peer: An element involved in a PCEP session (i.e. a PCC or a PCE).
Source PCC: the PCC, for a given path computation query, initiating the first PCEP request, which may then trigger a chain of successive requests.
Target PCE: the PCE that can compute a path to the destination without having to query any other PCE.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
The PCE Communication Protocol [RFC5440] is designed to be flexible and extensible in order to allow future evolutions or specific constraint support such as proposed in [RFC7470]. When crossing different PCE implementations (e.g. from different providers or due to different releases), a PCEP request may encounter unknown errors or notification messages. In such a case, [RFC5440] specifies that a specific error code be sent to the PCEP peer. This document updates [RFC5440] by introducing a mechanism to propagate the error message and by specifying error and notification TLVs.
In the context of path computation crossing different routing domains or autonomous systems, the number of different PCE system specificities is potentially high, thus possibly leading to divergent and unstable situations. Such phenomena can also occur in homogeneous cases since PCE systems have their own policies that can introduce differences in request treatment even for requests having the same destination. In order to generalize PCEP behaviors in the case of heterogeneous PCE systems, new objects have to be defined. Dealing with heterogeneity is a major challenge considering PCE applicability, particularly in multi-layer, multi-domain and H-PCE contexts [I-D.ietf-pce-stateful-hpce]. Thus, extending such error codes and PCEP behaviors accordingly would improve interoperability among different PCEP implementations and would solve some of these issues. However, some of them would still remain (e.g. the divergences in request treatment introduced by different policies).
The purpose of this draft is to identify and specify new optional TLVs and objects in order to generalize PCEP behaviors.
The two following scenarios underline the need for a normalization of the PCEP behaviors according to existing error or notification types.
PCE(i-1) has sent a request to PCE(i), which has in turn sent a request to PCE(i+1). PCE(i-1) and PCE(i+1) have the same error semantics but PCE(i) does not. If PCE(i+1) raises an error whose type and value are unknown to PCE(i), PCE(i) could adopt any other behavior and send back to PCE(i-1) an error of type 2 (Capability not supported), 3 (Unknown Object) or 4 (Not supported Object), for instance. As a consequence, the path request would be cancelled but the error would have no meaning for PCE(i-1), whereas if PCE(i) had simply forwarded the error sent by PCE(i+1), it would have been understood by PCE(i-1).
PCE(i-1) has sent a request to PCE(i), which has in turn sent a request to PCE(i+1), but PCE(i+1) is overloaded. Without extensions, PCE(i+1) should send a notification of type 2 and a value flag giving its estimated congestion duration. PCE(i) can choose to stop the path computation and send a NO-PATH reply to PCE(i-1). Hence, PCE(i-1) remains unaware of the congestion duration on PCE(i+1) and could still solicit it for further requests.
One of the purposes of the PCE architecture is to compute paths across networks, but an added value is to compute such paths in inter-area/layer/domain environments. The PCE Communication Protocol [RFC5440] is based on the Transmission Control Protocol (TCP). Thus, to compute a path within the PCE architecture, several TCP/PCEP sessions have to be set up, in a peer-to-peer manner, along a set of identified PCEs.
When the PCEP session is up for two PCEP peers, the PCC of the first PCE System (the source PCC) sends a PCReq message. If the PCC does not receive any reply before the dead timer is out, then it goes back to the idle state. A PCC can expect two kinds of replies: a PCRep message containing one or more valid paths (EROs) or a negative PCRep message containing a NO-PATH object.
Besides PCReq and PCRep messages, notification and error messages, named respectively PCNtf and PCErr, can be sent. There are two types of notification messages: type 1 is for cancelling pending requests and type 2 is for signaling congestion of the PCE. Several error values are described in [RFC5440]. The error types concerning the session phase begin at 2; error type 1 values are dedicated to the initialization phase.
As the PCE Communication Protocol is built to work in a peer-to-peer manner (i.e. supported by a TCP Connection), it supposes that the "deadtimer" of the source PCC is long enough to support the end-to-end distributed path computation process.
The exchange of messages in the PCE Communication Protocol is described in detail in [RFC5440] for the OpenWait and KeepWait states. When the session is up, message exchange is also defined in [RFC5440]. [RFC5441] describes the Backward Recursive Path Computation (BRPC) procedure and, because it considers an inter-domain path computation, gives a bigger picture of the possible behaviors when the session is up. Detailed behavior is mostly left to each specific implementation. The following sections identify the PCEP behaviors in case of error or notification and also introduce the requirement of PCEP peer identification in both cases.
[RFC5440] specifies that "a PCEP Error message is sent in several situations: when a protocol error condition is met or the request is not compliant with the PCEP specification". On this basis, and according to the other RFCs, the identified PCEP behaviors are the following:
The high level of criticality has been extracted from [RFC5440], which associates such a behavior with error-type 1 (errors raised during the PCEP session establishment). Hence, such errors are quite specific. For the sake of completeness, they have been included in this document.
Notification messages can be employed in two different manners: during the treatment of a PCEP request, or independently from it to advertise information (in [RFC5440], the request ID list within a PCNtf message is optional). Hence, three different types of behaviors can be identified:
The propagation of errors and notifications affects the state of the PCEP peers along the chain. In some cases, for instance a notification that a PCE is overloaded, the identification of the PCEP peer - or the fact that the sender PCE is not the direct neighbor - might be important information for the PCEP peers receiving the message. The ID of the sender PCE is not carried in the error TLVs, but can be obtained via the speaker entity ID TLV during state synchronization. An example can be found in [RFC8232].
This section describes extensions to support errors and notifications with respect to the PCEP behavior description defined in Section 4. This document does not intend to modify error and notification types previously defined in existing documents (e.g. [RFC5440], [RFC5441], etc.). Error-related TLVs are specified in this section, while the notification functionality can be achieved by using a PCNtf message with an RP object, with no need to define further notification types.
To support the propagation behavior mentioned in Section 4.1 and Section 4.2, a new optional TLV is defined, which can be carried in PCEP-ERROR and NOTIFICATION objects, to indicate whether a message has to be propagated or not. The allocation from the "PCEP TLV Type Indicators" sub-registry will be assigned by IANA and the request is documented in Section 10.
The description is "Propagation", the length value is 2 bytes and the value field is 1 byte. If the value field is set to 0, the message MUST NOT be propagated. If the value field is set to 1, the message MUST be propagated. Section 5.4 specifies how the destination is determined and how the number of messages is limited.
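As a rough illustration of the wire encoding, the sketch below packs a "Propagation" TLV using the generic PCEP TLV layout of [RFC5440] (16-bit Type, 16-bit Length covering only the value, value padded to a 4-byte boundary). The type code is still to be assigned by IANA, so PROPAGATION_TLV_TYPE is a made-up placeholder, and the function names are illustrative assumptions rather than part of this specification.

```python
import struct
from typing import Tuple

# Placeholder only: the real code point is TBD (to be assigned by IANA).
PROPAGATION_TLV_TYPE = 0xFF01


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Pack a PCEP TLV: Type (16 bits), Length (16 bits), padded value."""
    padding = (-len(value)) % 4  # padding bytes are not counted in Length
    header = struct.pack("!HH", tlv_type, len(value))
    return header + value + b"\x00" * padding


def encode_propagation(propagate: bool) -> bytes:
    """Value 1: the message MUST be propagated; value 0: it MUST NOT be."""
    return encode_tlv(PROPAGATION_TLV_TYPE, bytes([1 if propagate else 0]))


def decode_tlv(data: bytes) -> Tuple[int, bytes]:
    """Return (type, value) for the single TLV at the start of data."""
    tlv_type, length = struct.unpack("!HH", data[:4])
    return tlv_type, data[4:4 + length]
```

With this layout, the 1-byte value is followed by three padding bytes, so the encoded TLV occupies 8 bytes on the wire.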
To support the shutdown behavior mentioned in Section 4.1, we extend the PCEP-ERROR object by creating a new optional TLV to indicate whether an error is recoverable or not. The allocation from the "PCEP TLV Type Indicators" sub-registry will be assigned by IANA and the request is documented in Section 10.
The description is "Error-criticality", the length value is 2 bytes and the value field is 1 byte. If the value field is set to 0, the error has a low level of criticality (so further messages can be expected for this request). If the value field is set to 1, the error has a medium level of criticality and requests whose identifiers appear in the same message MUST be cancelled (so no further messages can be expected for these requests). If the value field is set to 2, the error has a high level of criticality and the connection for this PCEP session is closed by the sender PCE peer.
The propagation behavior MAY be combined with all criticality levels, thus leading to 6 different behaviors. In the case of a criticality level of 2, the session is closed by the PCE peer which sends the message. Hence, the criticality level is purely informative for the PCE peer which receives the message. If it is combined with a propagation behavior, then the PCE propagating the message MUST indicate the same level of criticality if it closes the session. Otherwise, it MUST use a criticality level of 1 if it does not close the session.
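The rule for re-advertising the criticality level when an error is propagated can be sketched as follows. This is a minimal illustration: the function name and constants are hypothetical, and leaving levels 0 and 1 unchanged on propagation is an assumption, since the text above only constrains level 2.

```python
LOW, MEDIUM, HIGH = 0, 1, 2


def criticality_to_propagate(received: int, closes_session: bool) -> int:
    """Pick the Error-criticality value to place in the propagated PCErr.

    A received level of 2 is kept only if the propagating PCE also closes
    its own session; otherwise it is downgraded to 1. Lower levels are
    passed through unchanged (assumption, not stated in the text).
    """
    if received not in (LOW, MEDIUM, HIGH):
        raise ValueError("unknown criticality level: %r" % received)
    if received == HIGH and not closes_session:
        return MEDIUM
    return received
```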
For a PCErr message, all the possible behaviors described in Section 4.1 can be covered with TLVs included in a PCEP-ERROR object. The following table captures all combinations of error behaviors:
+--------------------+----------------+----------------+
| Error-criticality  | Propagation 0  | Propagation 1  |
| value              | (No            | (Propagation   |
|                    | Propagation)   | Required)      |
+--------------------+----------------+----------------+
| 0 (low)            | Type 1         | Type 4         |
| 1 (medium)         | Type 2         | Type 5         |
| 2 (high)           | Type 3         | Type 6         |
+--------------------+----------------+----------------+
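The table above reduces to simple arithmetic on the two TLV values. The helper below is only a sketch of that mapping; its name is hypothetical.

```python
def error_behavior_type(criticality: int, propagation: int) -> int:
    """Map (Error-criticality, Propagation) TLV values to behavior Type 1..6."""
    if criticality not in (0, 1, 2):
        raise ValueError("criticality must be 0, 1 or 2")
    if propagation not in (0, 1):
        raise ValueError("propagation must be 0 or 1")
    # Types 1-3 are the non-propagated behaviors, types 4-6 the propagated ones.
    return 3 * propagation + criticality + 1
```

For example, a medium-criticality error that must be propagated (criticality 1, propagation 1) corresponds to Type 5.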
In order to limit the propagation of errors and notifications, the following mechanisms SHOULD be used:
Such mechanisms SHOULD be used jointly or independently depending on the error or notification behaviors they are associated with. The conditions of use for the TTL and DIFFUSION-LIST TLVs are described in the sections below.
The TTL value is set to any integer value to indicate the number of PCEP peers that will recursively receive the message. The TTL TLV SHOULD be used with propagated errors or notifications ("Propagation" TLV with value 1 in PCEP-ERROR or NOTIFICATION objects). Each PCEP peer MUST decrement the TTL value before propagating the message. When the TTL value becomes 0, the message is no longer propagated.
If the message to be propagated is request-specific and there is no TTL or DIFFUSION-LIST TLVs included, the message MUST reach the source PCC (or alternatively the target PCE).
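The TTL handling at each PCEP peer can be sketched as below. The function name is hypothetical, and the sketch assumes one reading of the text: a peer decrements the received TTL first and does not forward the message if the result is 0, so that an initial TTL of n lets exactly n peers receive the message.

```python
from typing import Optional


def next_ttl(received_ttl: int) -> Optional[int]:
    """Return the TTL to place in the propagated copy, or None to stop.

    The received value is decremented before propagation; once it
    reaches 0 the message is not propagated any further.
    """
    new_ttl = received_ttl - 1
    return new_ttl if new_ttl > 0 else None


def count_receivers(initial_ttl: int) -> int:
    """Count the peers that receive a message sent with initial_ttl."""
    receivers = 0
    ttl: Optional[int] = initial_ttl if initial_ttl > 0 else None
    while ttl is not None:
        receivers += 1  # this peer receives the message ...
        ttl = next_ttl(ttl)  # ... and decides whether to pass it on
    return receivers
```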
The DIFFUSION-LIST TLV can be carried within either the error object of a PCErr message or the notification object of a PCNtf message. It can be used in a message sent by a PCC to a PCE or vice versa. The DIFFUSION-LIST MAY be used with propagated errors ("Propagation" TLV at value 1 in the PCEP-ERROR object).
The format of the DIFFUSION-LIST object body is as follows:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             Type              |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
//                        (Sub-objects)                        //
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type (16 bits): restricts the diffusion to certain peers. The following values are currently defined:
The value of DIFFUSION-LIST is made of sub-objects similar to the IRO defined in [RFC5440]. The following sub-object types are supported.
Type   Sub-object
 1     IPv4 address
 2     IPv6 address
 4     Unnumbered Interface ID
 5     4-byte AS number
 6     OSPF area ID
 7     IS-IS Area ID
 32    Autonomous System number
 33    Explicit eXclusion Route Sub-object (EXRS)
If the error or notification codes target specific PCEP peers, a DIFFUSION-LIST TLV avoids flooding all PCEP peers. Any PCEP peer receiving a PCErr or PCNtf message containing a PCEP-ERROR or a NOTIFICATION object with a "Propagation" TLV at value 1 and a DIFFUSION-LIST MUST remove the addresses of the PCEP peers from the DIFFUSION-LIST before sending the message to any other PCEP peers. This is performed by adding the PCEP peer addresses to the Explicit eXclusion Route Sub-object of the DIFFUSION-LIST. If a DIFFUSION-LIST value is empty, the PCEP peer MUST NOT propagate the message to any peer.
Note that a Diffusion-List could contain strict or loose addresses to refer to a network domain (e.g. an Autonomous System number, an OSPF area, an IP address). Hence, the PCEP peers targeted by the message would be the PCEP peers covering the corresponding domain. If an address is loose, each time a PCEP peer forwards a message to another PCEP peer of this address, it MUST add its own address to the Explicit eXclusion Route Sub-object (EXRS) of the Diffusion-List for any forwarded messages. Hence, a PCE SHOULD avoid forwarding the same message repeatedly to the same set of peers. Finally, when an address is loose, the forwarding SHOULD be restrained by indicating what type of PCEP peers are targeted (i.e. PCE and/or PCC).
Many existing normative references define errors (see for instance [RFC5440], [RFC5441], [RFC5455], [RFC5521], [RFC5557], [RFC5886], [RFC8231], [RFC8232], [RFC8253], [RFC8281], [RFC8306], [RFC8408], [RFC8697]). This section provides processing rules for handling existing error types, as a recommendation. According to the definitions provided in this document, the following rules are applicable:
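The forwarding decision driven by a DIFFUSION-LIST can be sketched as below. This is a deliberately simplified model and not the normative procedure: targets are treated as exact peer addresses (no prefix, area or AS matching for loose entries), the EXRS is modelled as a plain set of addresses, only the forwarding peer's own address is added to it, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class DiffusionList:
    targets: FrozenSet[str]                 # peers the message is aimed at
    excluded: FrozenSet[str] = frozenset()  # simplified stand-in for the EXRS


def plan_forwarding(
    dl: DiffusionList, own_address: str, session_peers: Iterable[str]
) -> Tuple[List[str], DiffusionList]:
    """Return (next hops, updated list) for a message with Propagation = 1."""
    if not dl.targets:
        # An empty DIFFUSION-LIST means the message MUST NOT be propagated.
        return [], dl
    # Record this peer in the exclusion set before forwarding, so that
    # downstream peers do not send the same message back to it.
    updated = DiffusionList(dl.targets, dl.excluded | {own_address})
    next_hops = sorted(
        peer
        for peer in session_peers
        if peer in updated.targets and peer not in updated.excluded
    )
    return next_hops, updated
```

Because each forwarding peer ends up in the exclusion set, a message that travels from one targeted peer to the next is never returned to a peer that has already handled it.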
Error and Notification handling in this document should be considered in PCE documents that include new errors and notifications. A requirement for the authors of these drafts is to evaluate the applicability of the procedure in this document and provide details about the "Error-criticality" TLV and "Propagation" TLV for errors and notifications defined in the draft. Examples of this can be found in section 5.4.3 of this document.
A backward compatibility issue arises if PCEs along the chain differ in their understanding of error messages. In a scenario where PCE(i) propagates the error message to PCE(i+1), PCE(i+1) may be unable to interpret the message correctly; such an error message would then be ignored and not propagated further.
There are potential approaches to avoid this problem, such as recognizing the incapable PCE and avoiding propagation to it. However, such approaches are outside the scope of this document.
[Editor Note] This section will be moved to appendix for publication.
This section provides some examples depicting how the errors described above can be used in a PCEP session. The origin of the errors or notifications is only illustrative and has no normative purpose. Sometimes the PCE features behind them may be implementation-specific (e.g. detection of flooding). This section does not provide scenarios for errors with a high level of criticality (i.e., Error behaviors 3 and 6) since such errors are very specific and until now have been normalized only during the session establishment (error-type of 1).
In this example, a PCC attempts to establish a second PCEP session with the same PCE for another request. Consequently the PCE sends back an error message with error-type 9. This error stays local and does not affect the former session. The second session is ignored. If the "Propagation" TLV and "Error-criticality" TLV are used, they should be both set to value 0.
                   +-+-+                  +-+-+
                   |PCC|                  |PCE|
                   +-+-+                  +-+-+
1) Path computation  |                      |
   event             |                      |
2) PCE selection     |----- Open Message--->|
                     |<--- Open message ----|
3) Path computation  |---- PCReq message--->|
   request X sent to |                      |4) Path computation
   the selected PCE  |                      |   request queued
                     |                      |
5) Path computation  |                      |
   event             |                      |
6) PCE selection     |                      |
                     |----- Open Message--->|8) Session already
                     |                      |   opened
                     |<--- PCErr message----| Error-type=9
                     |                      |
In this example, the PCC sends a DiffServ-aware path computation request. If the PCE receiving the request does not support the indicated class-type, it thus sends back a PCErr message with error-type=12 and error-value=1. If the "Propagation" TLV and "Error-criticality" TLV are present, they should carry value 0 and value 1 respectively. Consequently, the request is cancelled.
                   +-+-+                  +-+-+
                   |PCC|                  |PCE|
                   +-+-+                  +-+-+
1) Path computation  |                      |
   event             |                      |
2) PCE selection     |                      |
3) Path computation  |---- PCReq message--->|
   request X sent to |                      |4) Path computation
   the selected PCE  |                      |   request queued
                     |                      |
                     |                      |5) DiffServ class-type
                     |                      |   not supported
                     |                      |6) Path computation
                     |                      |   request X
                     |                      |   cancelled
                     |<--- PCErr message----| Error-type=12
                     |                      |
In this example, a PCC sends a path computation request with no P flag set (e.g. END-POINT object with P-flag cleared). This is detected by another PCE in the sequence. The path computation request can thus be treated but the P-Flag will be ignored. Hence, this error is not critical but the source PCC should be informed of this fact. So, a PCErr message with error-type 10 ("Reception of an invalid object") is sent. The PCEP-ERROR object of the message contains a "Propagation" TLV at value 1 and an "Error-criticality" TLV at value 0. It is hence propagated backward to the source PCC.
 +-+-+               +-+-+-+-+                +-+-+
 |PCC|               |PCE|PCC|                |PCE|
 +-+-+               +-+-+-+-+                +-+-+
   |---- PCReq message-->|                      |
   |                     |                      |
   |                     |---- PCReq message--->|
   |                     |                      |
   |                     |                      |1) Parameter is
   |                     |                      |   not supported
   |                     |                      |
   |                     |<--- PCErr message----| Error-type=10
   |<--- PCErr message---|                      |
   |                     |                      |
In this example, PCEs are using the BRPC procedure to treat a path computation request [RFC5441]. However, one of the PCEs does not support a parameter of the request. Hence, a PCErr message with error-type 4 and error-value 4 is sent by this PCE and has to be forwarded to the source PCC. The PCEP-ERROR object includes a "Propagation" TLV at value 1 and an "Error-criticality" TLV at value 1, and the message is propagated backward to the source PCC. Consequently, the request is cancelled.
 +-+-+               +-+-+-+-+                +-+-+
 |PCC|               |PCE|PCC|                |PCE|
 +-+-+               +-+-+-+-+                +-+-+
   |---- PCReq message-->|                      |
   |                     |                      |
   |                     |---- PCReq message--->|
   |                     |                      |
   |                     |                      |1) Unsupported
   |                     |                      |   Parameter BRPC
   |                     |                      |2) Path
   |                     |                      |   computation
   |                     |                      |   request X
   |                     |                      |   cancelled
   |                     |<--- PCErr message----| Error-type=4
   |<--- PCErr message---|                      |
   |                     |                      |
Among the introduced TLVs, the "Propagation" TLV affects PCEP security considerations since it forces propagation behaviors. Thus, a PCEP implementation SHOULD activate a stateful mechanism when receiving a PCEP-ERROR or NOTIFICATION object including this TLV in order to avoid DoS attacks.
IANA maintains a registry of PCEP parameters. This includes a sub-registry for PCEP Objects.
IANA is requested to make an allocation from the sub-registry as follows. The values here are suggested for use by IANA.
As described in Section 5.4, the newly defined TLVs allow a PCE to enforce specific error and notification behaviors within PCEP-ERROR and NOTIFICATION objects. IANA is requested to make the following allocations from the "PCEP TLV Type Indicators" sub-registry.
Value   Description         Reference
TBD     Propagation         this document
TBD     Error-criticality   this document
Type Value   Meaning                                Reference
0            Any PCEP peers                         this document
1            PCEs but excludes PCC-only peers       this document
2            PCEs and PCCs with which a session     this document
             is still opened

Subobjects                                          Reference
1:  IPv4 prefix                                     this document
2:  IPv6 prefix                                     this document
4:  Unnumbered Interface ID                         this document
5:  4-byte AS number                                this document
6:  OSPF area ID                                    this document
7:  IS-IS Area ID                                   this document
32: Autonomous system number                        this document
33: Explicit Exclusion Route subobject (EXRS)       this document
[I-D.ietf-pce-stateful-hpce] Dhody, D., Lee, Y., Ceccarelli, D., Shin, J. and D. King, "Hierarchical Stateful Path Computation Element (PCE)", Internet-Draft draft-ietf-pce-stateful-hpce-15, October 2019.
[RFC4655] Farrel, A., Vasseur, J. and J. Ash, "A Path Computation Element (PCE)-Based Architecture", RFC 4655, DOI 10.17487/RFC4655, August 2006.
[RFC7470] Zhang, F. and A. Farrel, "Conveying Vendor-Specific Constraints in the Path Computation Element Communication Protocol", RFC 7470, DOI 10.17487/RFC7470, March 2015.