CCAMP Working Group Yi Lin
Internet Draft Huawei Technologies
Intended status: Standards Track November 3, 2019
Expires: May 2020
RSVP-TE Extensions in Support of Proactive Protection
draft-lin-ccamp-gmpls-proactive-protection-00.txt
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
This Internet-Draft will expire on May 3, 2020.
Copyright Notice
Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Simplified BSD License text as described in
Yi Lin Expires May 3, 2020 [Page 1]
Internet-Draft GMPLS Proactive Protection November 2019
Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Simplified BSD License.
Abstract
This document describes protocol-specific procedures and extensions
for Generalized Multi-Protocol Label Switching (GMPLS) Resource
ReSerVation Protocol - Traffic Engineering (RSVP-TE) signaling to
support Label Switched Path (LSP) Proactive Protection, which creates
the protection LSP after a failure is predicted and before it
becomes a real failure.
Table of Contents
1. Introduction
2. Conventions used in this document
3. Overview of Predicted Failure and Related Recovery Methods
   3.1. Predicted Failure
   3.2. Proactive Protection
4. Modified PROTECTION Object Format
5. Extension to ERROR_SPEC Object
   5.1. New Error Code / Sub-code
   5.2. New TLV in ERROR_SPEC Object
6. End-to-end Proactive Protection
   6.1. Creation of the Protected LSP
   6.2. Notification of Predicted Failure Event
   6.3. Tearing Down of the Protection LSP
7. Proactive Segment Protection
   7.1. Creation of the Protected LSP
   7.2. Notification of Predicted Failure Event
   7.3. Tearing Down of the Segment Recovery LSP
   7.4. Priority and Resource Pre-emption
8. Consideration of Backward Compatibility
9. Security Considerations
10. IANA Considerations
11. References
   11.1. Normative References
   11.2. Informative References
12. Authors' Addresses
1. Introduction
[RFC4872] and [RFC4873] describe protocol-specific procedures and
extensions for GMPLS RSVP-TE signaling to support end-to-end LSP
recovery (including protection and restoration) and segment LSP
recovery, respectively.
Traditional protection solutions (e.g., 1+1 or 1:1 protection)
provide very fast protection switching after a failure happens, but
consume twice the network resources for the whole lifetime of the
LSP. Traditional restoration solutions, on the other hand, use
network resources much more efficiently, but recover the LSP much
more slowly because additional signaling time is needed to create
the restoration LSP.
To reduce the resources reserved for recovery while keeping very
fast protection switching, one approach is to use failure prediction
technologies and to create 1+1 or 1:1 protection only when a
potential failure is predicted. This approach is referred to as
"Proactive Protection" in this document.
This document extends the RSVP-TE protocol to support the control of
Proactive Protection.
2. Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
3. Overview of Predicted Failure and Related Recovery Methods
3.1. Predicted Failure
In most cases, some indications appear before a physical failure
happens in a network. Examples include abnormal fluctuation of the
noise of a lightpath, a rising BER (Bit Error Rate) before error
correction, and a rising transponder temperature.
Therefore, by monitoring certain physical parameters and analyzing
their trends using, for example, Machine Learning (ML) or other
technologies, a node can predict whether a failure will happen in an
upcoming period of time.
Note that a predicted failure is different from a Signal Degrade in
that:
- When Signal Degrade happens to a connection, the connection is
still available but the quality of the signal carried by this
connection has declined and is lower than the predetermined
threshold. For example, the BER of a connection rises and is out
of tolerance.
- When a predicted failure of a connection is inferred, neither a
failure nor a degradation is present yet, but the trend indicates
that a failure, which will cause Signal Fail or Signal Degrade,
will probably happen after a period of time.
The methods to predict failures are outside the scope of this
document.
3.2. Proactive Protection
"Proactive Protection" refers to an LSP protection approach that
creates the protection LSP after a failure is predicted and before
it becomes a real failure. Both end-to-end protection (defined in
[RFC4872]) and segment protection (defined in [RFC4873]) are
applicable to Proactive Protection.
The main procedure of Proactive Protection is shown in Figure 1:
   |-> Predicted failure notification received
   |   |-> Proactive Protection path created
   |   |               |-> Real failure happens
   |   |               | |-> Protection switch finished
   |   |               | |
   |   |               | |     Protection path deleted <-|
   |   |               | |     if no failure happened    |
   |   |               | |                               |
   |   |        t3     | |                          t6   |
---+---+--------+======x=+==========================+----+---> t
   t1  t2       |     t4 t5                         |    t7
                |                                   |
                |<--Predicted failure time period-->|
Figure 1: Overview of Proactive Protection
- t1: The protection source node of an LSP is notified that a
failure will probably happen during t3~t6, so it starts to create
1+1 or 1:1 protection of the connection. Here the protection
source node can be the source node of the LSP (for end-to-end
protection case), or a branch node located between the source node
and the predicted failure point of the LSP (for segment protection
case).
- t2: The 1+1 or 1:1 protecting path is created between the
protection source node and the protection destination node. Here
the protection destination node can be the destination node of the
LSP (for end-to-end protection case), or a merge node located
between the predicted failure point and the destination node of
the LSP (for segment protection case).
- t4: If real failure happens as predicted, the 1+1 or 1:1
protection switch will be triggered.
- t5: Protection switch finished and the service in the connection
is recovered.
- t7: If the predicted failure does not in fact happen and no further
predicted failure notification is received, the protection source
node MAY tear down the protecting path after t6, in order to save
network resources.
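The timeline in Figure 1 can be read as a small state machine at the
protection source node. The following Python sketch is illustrative
only: the state names, event names, and the rule that unlisted events
leave the state unchanged are assumptions made for this example and
are not defined by this document.

```python
from enum import Enum, auto


class State(Enum):
    UNPROTECTED = auto()   # before t1: only the protected LSP exists
    CREATING = auto()      # t1..t2: protecting path is being signaled
    PROTECTED = auto()     # from t2: protecting path is in place
    SWITCHED = auto()      # after t5: traffic uses the protecting path


class Event(Enum):
    PREDICTED_FAILURE = auto()   # t1: predicted failure notification
    PATH_CREATED = auto()        # t2: protecting path established
    REAL_FAILURE = auto()        # t4..t5: failure occurs, switch completes
    WAIT_EXPIRED = auto()        # t7: no failure happened, path deleted


# (state, event) -> next state. Pairs not listed leave the state
# unchanged (an assumption of this sketch).
TRANSITIONS = {
    (State.UNPROTECTED, Event.PREDICTED_FAILURE): State.CREATING,
    (State.CREATING, Event.PATH_CREATED): State.PROTECTED,
    (State.PROTECTED, Event.REAL_FAILURE): State.SWITCHED,
    (State.PROTECTED, Event.WAIT_EXPIRED): State.UNPROTECTED,
}


def step(state, event):
    """Return the next state for a single event."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.UNPROTECTED):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

For example, the sequence predicted failure, path created, real
failure ends in SWITCHED, while replacing the last event with an
expired wait returns the node to UNPROTECTED.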
4. Modified PROTECTION Object Format
This document modifies the PROTECTION object (C-Type=2) by adding
two new bits T and A in reserved fields, as shown in Figure 2 below:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Length             | Class-Num(37) | C-Type (2)    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S|P|N|O|T|  Res.   | LSP Flags |     Reserved      | Link Flags|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|I|R|A|  Reserved   | Seg.Flags |           Reserved            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2: The modified PROTECTION object (C-Type=2)
- T (Triggered End-to-end Proactive Protection): 1 bit. When set
(1), it indicates that end-to-end Proactive Protection is
required.
Note that if the T bit is set (1), the LSP Flags SHOULD be one of:
   0x04 1:N Protection with Extra-Traffic
   0x08 1+1 Unidirectional Protection
   0x10 1+1 Bidirectional Protection
- A (proActive Segment Protection): 1 bit. When set (1), it
indicates that Proactive Segment Protection is required.
Note that if the A bit is set (1), the Seg. Flags SHOULD be one of:
   0x04 1:N Protection with Extra-Traffic
   0x08 1+1 Unidirectional Protection
   0x10 1+1 Bidirectional Protection
See [RFC4872] and [RFC4873] for the definition of other fields.
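As an illustration of the layout in Figure 2, the following Python
sketch packs and parses a PROTECTION object (C-Type=2) carrying the
proposed T and A bits. It is a minimal sketch, not a reference
implementation: the function names are hypothetical, only the fields
relevant to this document are handled (the S, P, N, O, I and R bits
are left at zero), and the bit positions are those proposed in
Figure 2.

```python
import struct

CLASS_NUM = 37        # PROTECTION object class
C_TYPE = 2
OBJECT_LENGTH = 12    # 4-octet object header plus two 32-bit words

# Bit 0 in Figure 2 is the most significant bit of each 32-bit word.
T_BIT = 1 << 27       # fifth bit of the first word (proposed here)
A_BIT = 1 << 29       # third bit of the second word (proposed here)

# Values that LSP Flags / Seg. Flags SHOULD take when T / A is set.
PROACTIVE_TYPES = {0x04, 0x08, 0x10}


def pack_protection(lsp_flags=0, seg_flags=0, link_flags=0,
                    t=False, a=False):
    """Build the 12-octet object; unused bits are sent as zero."""
    word1 = ((T_BIT if t else 0)
             | ((lsp_flags & 0x3F) << 16)
             | (link_flags & 0x3F))
    word2 = (A_BIT if a else 0) | ((seg_flags & 0x3F) << 16)
    return struct.pack("!HBBII", OBJECT_LENGTH, CLASS_NUM, C_TYPE,
                       word1, word2)


def unpack_protection(data):
    """Parse the fields used by Proactive Protection."""
    length, class_num, c_type, word1, word2 = struct.unpack(
        "!HBBII", data[:OBJECT_LENGTH])
    if (length, class_num, c_type) != (OBJECT_LENGTH, CLASS_NUM, C_TYPE):
        raise ValueError("not a PROTECTION object with C-Type 2")
    return {
        "T": bool(word1 & T_BIT),
        "lsp_flags": (word1 >> 16) & 0x3F,
        "link_flags": word1 & 0x3F,
        "A": bool(word2 & A_BIT),
        "seg_flags": (word2 >> 16) & 0x3F,
    }


def flags_consistent(fields):
    """Check the SHOULD-level rules tying T and A to the flag values."""
    ok_t = not fields["T"] or fields["lsp_flags"] in PROACTIVE_TYPES
    ok_a = not fields["A"] or fields["seg_flags"] in PROACTIVE_TYPES
    return ok_t and ok_a
```

Because the flag constraints are stated with SHOULD, a receiver built
along these lines would probably log an inconsistent combination
rather than reject the message outright; that choice is left to the
implementation.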
5. Extension to ERROR_SPEC Object
5.1. New Error Code / Sub-code
A new Error Sub-code under Error Code "25 - Notify Error" is defined
in this document, which is used to notify the event of a predicted
failure:
Error Code = 25: "Notify Error" (see [RFC3209])
Error Sub-code = TBA: "Notify Error/LSP Local Predicted Failure"
5.2. New TLV in ERROR_SPEC Object
When a failure is predicted, the time by which the failure is
expected to happen may also be predicted. This time information
tells the source node how long it should wait for the predicted
failure to become a real failure, and helps it decide when it is
safe to tear down the protection LSP if the predicted failure does
not happen.
A new TLV in the IPv4/IPv6 IF_ID ERROR_SPEC object is defined in
this document to indicate the time before which the predicted
failure will probably become a real failure. The format of this new
TLV is shown in Figure 3 below:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Type = TBA           |          Length = 8           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Time                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3: New TLV (type=TBA) in ERROR_SPEC Object
- Type: TBA
- Length: 8
- Time: A relative time measured in seconds, which indicates within
how many seconds (from the current time) the predicted failure
will probably become a real failure.
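The TLV in Figure 3 can be encoded and decoded as sketched below in
Python. This is an illustrative sketch under stated assumptions: the
Type value is still TBA, so TLV_TYPE_PLACEHOLDER is an arbitrary
stand-in and not an assigned code point; the Time field is treated
as an unsigned 32-bit integer, which this document does not state
explicitly; and the function names are hypothetical.

```python
import struct

# Placeholder only: the real Type value is TBA and not yet assigned.
TLV_TYPE_PLACEHOLDER = 0xFFFF
TLV_LENGTH = 8   # 4-octet Type/Length header plus 4-octet Time field


def encode_time_tlv(seconds, tlv_type=TLV_TYPE_PLACEHOLDER):
    """Encode the relative time (in seconds) as an 8-octet TLV."""
    if not 0 <= seconds <= 0xFFFFFFFF:
        raise ValueError("Time must fit in 32 bits")
    return struct.pack("!HHI", tlv_type, TLV_LENGTH, seconds)


def decode_time_tlv(data, tlv_type=TLV_TYPE_PLACEHOLDER):
    """Return the Time value in seconds, or raise ValueError."""
    if len(data) < TLV_LENGTH:
        raise ValueError("truncated TLV")
    found_type, length, seconds = struct.unpack(
        "!HHI", data[:TLV_LENGTH])
    if found_type != tlv_type or length != TLV_LENGTH:
        raise ValueError("not a predicted-failure time TLV")
    return seconds
```

With these assumptions, a prediction that the failure will probably
occur within five minutes is carried as Time = 300.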
6. End-to-end Proactive Protection
6.1. Creation of the Protected LSP
To create an LSP with recovery type of "End-to-end Proactive
Protection", the source node of the LSP generates a Path message
with a PROTECTION object included. The T bit in the PROTECTION
object MUST be set to 1 (End-to-end Proactive Protection), so that
all other nodes along the LSP can start the failure prediction
function on related links/nodes.
Note that the N bit in the PROTECTION object indicates whether the
control plane message exchange is only used for notification or for
protection-switching purposes after a real failure happens (see
[RFC4872]). In other words, the N bit has nothing to do with the
notification of a predicted failure before a real failure happens.
To allow the predicted failure event to be reported to the source
node by the Notify message, the NOTIFY_REQUEST object MUST also be
included in the Path message (see [RFC3473]), where the "Notify Node
Address" SHOULD be the address of the source node of the LSP.
6.2. Notification of Predicted Failure Event
When an intermediate node on an LSP infers that a failure will
happen and will affect the LSP, it sends a Notify message to the
source node of the LSP to report the predicted failure event. The
new error code/sub-code "Notify Error/LSP Local Predicted Failure"
is used in the ERROR_SPEC object or IF_ID_ERROR_SPEC object in the
Notify message.
The Notify message MAY also include a TLV (type = TBA) in the IPv4
or IPv6 IF_ID_ERROR_SPEC object, to indicate the time before which
the predicted failure will probably become a real failure.
On receiving the Notify message with error code/sub-code "Notify
Error/LSP Local Predicted Failure", the source node of the LSP
SHOULD trigger the procedure to create the protection LSP, according
to the protection type indicated in the "LSP Flags" field of the
PROTECTION object in the Path message for the protected LSP. The
procedures for creating the protection LSP and for protection
switching after a real failure happens are described in [RFC4872].
6.3. Tearing Down of the Protection LSP
After the protection LSP is created, the source node MAY start a
timer T_wait and wait for the predicted failure to become a real
failure. If no real failure happens and no further notification of
a predicted failure is received before T_wait expires, the source
node MAY trigger the procedure to tear down the protection LSP,
according to local policy. See [RFC4872] for the process of tearing
down a protection LSP.
Implementations SHOULD allow this policy to be configured to provide
a default across all LSPs on a node, but SHOULD also allow it to be
configured per LSP.
Note that T_wait MUST be longer than the time indicated in the TLV
(type=TBA) in the ERROR_SPEC object in the Notify message, if the
TLV exists.
Note also that, apart from this constraint, the value of T_wait is
a local matter of the source node and is outside the scope of this
document.
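The T_wait handling described above can be sketched as follows in
Python. This is one possible local policy, not a procedure defined
by this document: the margin and default values, the rule that a
further notification can only extend the deadline, and all names are
assumptions made for illustration. Times are plain numbers of
seconds so that the logic can be exercised without a real clock.

```python
def choose_t_wait(predicted_seconds=None, margin_seconds=30,
                  default_seconds=600):
    """Return T_wait in seconds.

    If the Notify message carried a Time value, T_wait is strictly
    longer than it; otherwise a locally configured default is used.
    """
    if margin_seconds <= 0:
        raise ValueError("margin must be positive so T_wait > Time")
    if predicted_seconds is None:
        return default_seconds
    return predicted_seconds + margin_seconds


class ProtectionTimer:
    """Tracks when the protection LSP may be torn down."""

    def __init__(self):
        self.deadline = None   # absolute time, or None if not waiting

    def restart(self, now, predicted_seconds=None):
        """Start T_wait, or extend it on a further notification."""
        candidate = now + choose_t_wait(predicted_seconds)
        if self.deadline is None or candidate > self.deadline:
            self.deadline = candidate

    def on_real_failure(self):
        """A real failure happened: the LSP is in use, stop waiting."""
        self.deadline = None

    def may_tear_down(self, now):
        """True once T_wait has expired without a real failure."""
        return self.deadline is not None and now >= self.deadline
```

In this sketch restart() would be called when the protection LSP is
created and again for each further predicted failure notification,
and may_tear_down() would be polled by whatever applies the
tear-down policy.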
7. Proactive Segment Protection
7.1. Creation of the Protected LSP
To create an LSP with recovery type of "Proactive Segment
Protection", the source node of the LSP generates a Path message,
where:
- A PROTECTION object is included, where the A bit MUST be set to 1
(Proactive Segment Protection), so that all nodes along the
protected LSP can start the failure prediction function on related
links/nodes if supported. The "Seg. Flags" are used to indicate
the protection type of the Proactive Segment Protection.
- One or more SERO objects MAY be included (i.e., explicit Proactive
Segment Protection), indicating the branch node and the merge node
of each segment recovery LSP. If no SERO object is included, the
dynamic Proactive Segment Protection method is used.
- A NOTIFY_REQUEST object is included, where the "Notify Node
Address" SHOULD be the address of the source node of the LSP.
For explicit Proactive Segment Protection, when a branch node
receives a Path message with A bit set to 1 in the PROTECTION
object, the branch node follows [RFC4873] to process the Path
message, except that the Path message for the recovery LSP is not
generated or sent at this stage. Also, one more NOTIFY_REQUEST
object SHOULD be added to the Path message of the protected LSP,
carrying the address of this branch node.
For dynamic Proactive Segment Protection, when an intermediate node
receives a Path message with the A bit set to 1 in the PROTECTION
object, the node determines whether it is able to act as a branch
node, as described in Section 6.2 of [RFC4873]. If so, it follows
the same procedure as a branch node in the case of explicit
Proactive Segment Protection, as described above. If not, the node
only follows the standard procedure to create the protected LSP.
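The branch-node decision for the explicit and dynamic cases can be
summarized by the Python sketch below. It is a simplified
illustration, not the full [RFC4873] processing: node identifiers
are plain strings, the SEROs are reduced to the list of branch nodes
they name, and the function and parameter names are hypothetical.

```python
def is_proactive_branch(node, a_bit, sero_branches, can_branch):
    """True if `node` acts as a proactive branch node for this Path.

    sero_branches: branch nodes named by SEROs (empty for dynamic).
    can_branch:    local ability to act as a branch node.
    """
    if not a_bit:
        return False                      # not Proactive Segment Protection
    if sero_branches:
        return node in sero_branches      # explicit case
    return can_branch                     # dynamic case


def process_path(node, a_bit, sero_branches, can_branch, notify_nodes):
    """Return (Notify Node Address list, recovery-Path-deferred flag)."""
    if is_proactive_branch(node, a_bit, sero_branches, can_branch):
        # The recovery LSP Path message is not sent yet; the node adds
        # its own address so that it is told about predicted failures.
        return notify_nodes + [node], True
    return list(notify_nodes), False
```

The returned list models the Notify Node Addresses carried by the
NOTIFY_REQUEST objects in the Path message forwarded downstream.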
7.2. Notification of Predicted Failure Event
When an intermediate node between a pair of branch and merge nodes
on an LSP infers that a failure will happen and will affect the LSP,
it sends a Notify message to the nearest branch node in the upstream
direction of the LSP to report the predicted failure event. The
error code/sub-code "Notify Error/LSP Local Predicted Failure" is
used in the ERROR_SPEC object or IF_ID_ERROR_SPEC object in the
Notify message.
As with End-to-end Proactive Protection, the time before which the
predicted failure may occur MAY also be included in the Notify
message.
On receiving the Notify message with error code/sub-code "Notify
Error/LSP Local Predicted Failure", the branch node on the protected
LSP SHOULD generate a new Path message, and send this new Path
message along the recovery LSP between the branch and the merge
nodes. The procedures for generating the new Path message and
creating the recovery LSP are the same as those described in
[RFC4873], except that the A bit in the PROTECTION object of this
new Path message MUST be set to 1.
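Selecting the "nearest branch node in the upstream direction" can be
illustrated with the Python sketch below. It assumes that the node
already knows the ordered hop list of the LSP and the (branch,
merge) pairs of its protected segments; how a node learns this
information is not covered here, and the function name and data
model are hypothetical.

```python
def nearest_upstream_branch(hops, segments, failing_link):
    """Return the branch node to notify, or None if none applies.

    hops:         nodes of the protected LSP, source first.
    segments:     list of (branch, merge) pairs on that LSP.
    failing_link: (upstream node, downstream node) of adjacent hops.
    """
    up, down = failing_link
    i = hops.index(up)
    if i + 1 >= len(hops) or hops[i + 1] != down:
        raise ValueError("failing_link is not a hop of this LSP")
    best = None
    for branch, merge in segments:
        b, m = hops.index(branch), hops.index(merge)
        # The segment must cover the link with the predicted failure.
        if b <= i and m >= i + 1:
            if best is None or b > hops.index(best):
                best = branch     # closer to the predicted failure
    return best
```

With the topology used later in Figure 4 (hops S, B, N-4, M, D and
one segment from B to M), a predicted failure between N-4 and M is
reported to B.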
7.3. Tearing Down of the Segment Recovery LSP
After the segment recovery LSP is created, the branch node MAY start
a timer T_wait and wait for the predicted failure to become a real
failure. If no real failure happens and no further notification of
a predicted failure is received before T_wait expires, the branch
node MAY trigger the procedure to tear down the segment recovery
LSP, according to local policy. See [RFC4873] for the process of
tearing down a segment recovery LSP.
Implementations SHOULD allow this policy to be configured to provide
a default across all LSPs on a node, but SHOULD also allow it to be
configured per LSP.
Note that T_wait MUST be longer than the time indicated in the TLV
(type=TBA) in the ERROR_SPEC object in the Notify message, if the
TLV exists.
Note also that the value of T_wait is a local matter of the branch
node, and is outside the scope of this document.
7.4. Priority and Resource Pre-emption
It is possible that, after the segment recovery LSP is created and
before the predicted failure becomes a real failure, another real
failure happens on the LSP outside the protected segment. In this
case, the source node (or an intermediate node upstream of the real
failure) may start a restoration procedure to recover the LSP. For
the same protected LSP, recovering from a real failure always has
higher priority than protecting against a predicted failure that
has not yet happened, so the restoration LSP can pre-empt the
resources of the segment recovery LSP.
As shown in Figure 4, assume that node B (branch node) was notified
of a predicted failure event between N-4 and M (merge node), and has
created the segment recovery LSP along B, N-1, N-2, N-3 and M. If
another failure between S (source node) and B happens before the
predicted failure becomes a real failure, node S will try to create
the restoration LSP. If resources are limited, the restoration LSP
can pre-empt the resources of the segment recovery LSP between N-1
and N-3.
The nodes along the segment recovery LSP have enough information to
determine whether pre-emption is allowed, because these nodes know
that:
- The current segment recovery LSP is used for Proactive Segment
Protection, as indicated by the A bit in the PROTECTION object;
- The segment recovery LSP and the restoration LSP protect the same
LSP, as indicated by their association relationship.
                   |<------ Pre-emption ------>|
                   |                           |
  ***************************************************************
  *+---+         +---+         +---+         +---+         +---+*
  *|   +---------+N-1+---------+N-2+---------+N-3+---------+   |*
  *+-+-+         +-+-+         +---+         +-+-+         +-+-+*
  *  |             |###########################|             |  *
  *  |             |#                         #|             |  *
  *  |             |#                         #|             |  *
  *+-+-+         +-+-+         +---+         +-+-+         +-+-+*
***| S +----X----+ B +---------+N-4+----?----+ M +---------+ D |***
   +---+         +---+         +---+         +---+         +---+
===================================================================
S: Source node      D: Destination node
B: Branch node      M: Merge node
X: Real failure     ?: Predicted failure (has not happened yet)
=====: Protected LSP
#####: Segment Recovery LSP
*****: Restoration LSP
Figure 4: Resource pre-emption by restoration LSP
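The pre-emption check that a node on the segment recovery LSP could
apply is sketched below in Python. It is a minimal illustration of
the two conditions listed above: the LspInfo record and its fields
are hypothetical, and the association between LSPs is reduced to a
single integer identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LspInfo:
    association_id: int           # identifies the protected LSP
    a_bit: bool = False           # Proactive Segment Protection LSP
    is_restoration: bool = False  # created to recover from a real failure


def may_preempt(requester, holder):
    """True if `requester` may take the resources held by `holder`."""
    same_protected_lsp = (
        requester.association_id == holder.association_id)
    return (requester.is_restoration
            and holder.a_bit
            and same_protected_lsp)
```

In the scenario of Figure 4, the restoration LSP signaled by S and
the segment recovery LSP created by B share the same association, so
the check succeeds on the nodes between N-1 and N-3.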
8. Consideration of Backward Compatibility
TBD.
[Editor's note]: A description of interworking with legacy nodes
that do not support failure prediction and reporting will be added.
9. Security Considerations
TBD.
10. IANA Considerations
IANA assigns values to RSVP protocol parameters. Within the current
document, a new Error code/sub-code value is defined:
Error Code = 25: "Notify Error" (see [RFC3209])
o "Notify Error/LSP Local Predicted Failure" (TBA)
11. References
11.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, DOI
10.17487/RFC2119, March 1997.
[RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
Tunnels", RFC 3209, December 2001.
[RFC3473] Berger, L., Ed., "Generalized Multi-Protocol Label
Switching (GMPLS) Signaling Resource ReserVation Protocol-
Traffic Engineering (RSVP-TE) Extensions", RFC 3473,
January 2003.
[RFC4872] Lang, J., Ed., Rekhter, Y., Ed., and D. Papadimitriou,
Ed., "RSVP-TE Extensions in Support of End-to-End
Generalized Multi-Protocol Label Switching (GMPLS)
Recovery", RFC 4872, May 2007.
[RFC4873] Berger, L., Bryskin, I., Papadimitriou, D., and A. Farrel,
"GMPLS Segment Recovery", RFC 4873, May 2007.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017.
11.2. Informative References
[RFC4426] Lang, J., Ed., Rajagopalan, B., Ed., and D. Papadimitriou,
Ed., "Generalized Multi-Protocol Label Switching (GMPLS)
Recovery Functional Specification", RFC 4426, March 2006.
12. Authors' Addresses
Yi Lin
Huawei Technologies
F3 R&D Center, Huawei Industrial Base,
Bantian, Longgang District,
Shenzhen 518129 P.R.China
Email: yi.lin@huawei.com