Network Working Group E. Osborne
Internet-Draft Cisco
Intended status: Standards Track F. Zhang
Expires: February 7, 2014 ZTE
Y. Weingarten
August 6, 2013
MPLS-TP 1toN Protection
draft-ietf-mpls-tp-1ton-protection-02.txt
Abstract
There is a requirement for Multiprotocol Label Switching Transport
Profile (MPLS-TP) to support 1:n linear protection for transport
paths. This requirement is further elaborated in RFC6372
[SurvivFwk]. The basic protocol for linear protection, specified in
RFC6378 [LinProt], is limited to 1+1 and 1:1 protection. This
document extends that protocol to address the additional
functionality necessary to support scenarios where a single
protection path is preconfigured to provide protection of multiple
transport paths between two common endpoints.
This document is a product of a joint Internet Engineering Task Force
(IETF) / International Telecommunication Union Telecommunication
Standardization Sector (ITU-T) effort to include an MPLS Transport
Profile within the IETF MPLS and PWE3 architectures to support the
capabilities and functionalities of a packet transport network as
defined by the ITU-T.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on February 7, 2014.
Copyright Notice
Osborne, et al. Expires February 7, 2014 [Page 1]
Internet-Draft MPLS-TP LP August 2013
Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November
10, 2008. The person(s) controlling the copyright in some of this
material may not have granted the IETF Trust the right to allow
modifications of such material outside the IETF Standards Process.
Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other
than English.
Table of Contents
1. Introduction
  1.1. 1:n Protection architecture
  1.2. Locking operation
  1.3. Non-Locking
  1.4. Path priority
  1.5. Preemption
  1.6. Contributing authors
2. Conventions used in this document
  2.1. Acronyms
  2.2. Definitions and Terminology
3. Use cases and scenarios
  3.1. Non-locking use case: Per-node label space
  3.2. Locking use-case:
  3.3. PSC Scenarios
    3.3.1. Unidirectional failure cases
    3.3.2. Bidirectional fault scenarios
    3.3.3. Preemption scenarios
4. Changes to PSC
  4.1. PSC
  4.2. Changes to PSC Payload
    4.2.1. Locking (L) flag
    4.2.2. Fault path (FPath) field
    4.2.3. Data path (Path) field
  4.3. Changes to PSC Operation
    4.3.1. Basic operation
    4.3.2. Two-phased operation
    4.3.3. Acknowledge message
    4.3.4. Wait for Acknowledge (WFA) timer
    4.3.5. Additional PSC State
5. IANA Considerations
6. Security Considerations
7. Acknowledgements
8. References
  8.1. Normative References
  8.2. Informative References
Appendix A. PSC state machine tables
Authors' Addresses
1. Introduction
The MPLS Transport Profile (MPLS-TP) Requirements document [TPReq]
specifies the survivability tools required for MPLS-based transport
networks. Network survivability is the ability of a network to
recover traffic delivery following failure or degradation of network
resources. Requirement 67 lists various types of 1:n protection
architectures that are required for MPLS-TP.
The MPLS-TP Survivability Framework [SurvivFwk] is a framework for
survivability in MPLS-TP networks, and describes recovery elements,
types, methods, and topological considerations, focusing on
mechanisms for recovering MPLS-TP Label Switched Paths (LSPs).
Linear protection in mesh networks - networks with arbitrary
interconnectivity between nodes - is described in Section 4.7 of
[SurvivFwk]. Linear protection provides rapid and simple protection
switching. In a mesh network, linear protection provides a very
suitable protection mechanism because it can operate between any pair
of points within the network. It can protect against a defect in an
intermediate node, a span, a transport path segment, or an end-to-end
transport path.
[LinProt] defines a Protection State Coordination (PSC) protocol that
supports the different 1+1 and 1:1 architectures described in
[SurvivFwk]. The PSC protocol is a single-phased protocol that
allows the two endpoints of the protection domain to coordinate the
protection switching operation when a switching condition is detected
on the transport paths of the protection domain.
This document extends the PSC protocol to support a protection domain
that includes multiple working transport paths, between common end
points, protected by a single protection transport path. The
protection transport path is pre-allocated with resources to
transport the traffic normally carried by any one of the working
transport paths. This is the architecture described in [SurvivFwk]
as 1:n protection, and is the generalization of the 1:1 protection
architecture already supported by PSC.
1.1. 1:n Protection architecture
Linear protection switching is a fully allocated survivability
mechanism in the sense that the route and bandwidth of the protection
path is reserved for a set of working paths. For 1:n protection the
protection path is allocated to protect any one of n working paths
between the two endpoints of the protection domain.
+-----+                             +-----+
|     |=============================|     |
|LER-A|     Working Path #1         |LER-Z|
|     |                             |     |
|     |=============================|     |
|     |     Working Path #2         |     |
|     |                             |     |
|     |=============================|     |
|     |     Working Path #3         |     |
|     |                             |     |
|     |             ooo             |     |
|     |                             |     |
|     |=============================|     |
|     |     Working Path #N         |     |
|     |                             |     |
|     |     Protection Path         |     |
|     |*****************************|     |
|     |                             |     |
+-----+                             +-----+
    |--------Protection Domain--------|
Figure 1: 1:n Protection domain
Figure 1 shows a protection domain with N working transport paths and
a single protection path. In 1:n protection, the protection path may
transport the traffic of only a single working path at any particular
time. The identity of the working path that is being protected must
be communicated between the two endpoints.
Unless otherwise specified, all examples will be based on the network
topology in Figure 1, with the working paths referenced as Wi (for
1<=i<=N) and the protection path referenced as P. The end-points of
the protection domain will be referred to as LER-A and LER-Z.
The different working paths may be disjoint at the intermediary
points on the path between LER-A and LER-Z and may also have
different resource requirements. In addition, each of the working
paths may be assigned a priority that could be used to decide which
working path would be protected in cases of conflict (see more on
this topic in Section 1.5). It is usually advised to arrange these
protection groups in a way that would minimize any potential conflict
situation.
1:n protection in MPLS supports two modes of operation - locking and
non-locking. The locking mode mirrors the behavior that is used by
many transport protection mechanisms, and is necessary in some cases
but may incur increased latency (and thus packet loss), as a result
of prolonged switching time, in comparison to the non-locking case.
Non-locking 1:n can be used in many MPLS networks and affords a lower
rate of packet loss as compared to locking mode, but must be used
with care - since incorrect use of non-locking can lead to
misconnectivity.
1.2. Locking operation
The high-level functionality of the locking operation mode of 1:n
protection follows these basic steps:
o LER-A detects a unidirectional failure of W1 and stops sending
traffic on W1.
o LER-A transmits a PSC SF message to LER-Z indicating that W1 has
failed and its traffic should be redirected to P. No traffic is
sent on P at this point.
o LER-Z receives the PSC message from LER-A, begins transmitting
W1 traffic in P, and sends a PSC message to LER-A indicating that
W1 is now being protected by P. At this point LER-A receives the
normal data traffic intended for W1 from P, while LER-Z both
selects W1 data traffic from P and bridges W1 data traffic into P.
o LER-A receives the PSC message from LER-Z and begins transporting
W1 traffic in P -- that is, LER-A bridges W1 into P.
It should be clear from this description that no traffic is sent over
P until LER-Z processes the PSC message from LER-A, and that traffic
is only sent unidirectionally (Z->A) until LER-A processes the
"reply" PSC message from LER-Z. As the message processing time is
expected to be dwarfed by the propagation delay between LER-A and
LER-Z, it can be said that there is complete traffic loss between the
endpoints for the duration of the one-way propagation delay from
LER-A to LER-Z, and bidirectional traffic flow is not fully
restored until after 1xRTT of the protection path.
This operation mode is referred to as "locking" because the sequence
of processing the PSC messages includes periods where the protection
path is locked from carrying protected traffic, while the two end-
points verify that both are ready to process the W1 traffic that is
received on P. More detailed information on this mode of operation
will be supplied later in the document when considering different
scenarios.
1.3. Non-Locking
In non-locking protection operation mode, LER-A switches data traffic
onto P immediately upon failure detection. This minimizes traffic
loss, but at the cost of temporary asymmetry of packet flow. At a
high level, it works like this:
o LER-A detects the failure of W1 and stops sending traffic on W1.
o LER-A immediately begins to transport W1's data traffic over the
protection path P.
o Simultaneously LER-A transmits a PSC message to LER-Z indicating
that W1 has failed and is currently being protected in P.
o LER-Z receives the PSC message from LER-A, switches all W1 data
traffic to P, and transmits a PSC message to LER-A indicating that
W1 is now protected in P.
o LER-A receives the PSC message from LER-Z and needs to take no
action, as the protection switch had already been completed.
In the non-locking case, the packet loss between the endpoints is
minimized. Packet loss may occur in the A->Z direction only for the
duration of the failure detection time, which is assumed, for this
document, to be negligible. Packet loss in the Z->A direction is
almost entirely the result of the one-way propagation delay of the
PSC message from LER-A to LER-Z. Assuming the transport path from
A->Z has the same delay as that from Z->A, it can be said that the
packet loss in the non-locking case is roughly half that of the
locking case.
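
The comparison between the two modes can be made concrete with a small
calculation. The following Python sketch is illustrative only: the
function name restoration_times and its result keys are hypothetical
and not part of PSC. It assumes, as the text does, negligible failure
detection and message processing time and the same one-way delay in
both directions.

```python
def restoration_times(owd_ms: float, locking: bool) -> dict:
    """Approximate time (ms) after LER-A detects the failure until each
    direction carries W1 traffic on P again. Hypothetical helper."""
    if owd_ms < 0:
        raise ValueError("one-way delay must be non-negative")
    if locking:
        # LER-Z starts sending on P once the first PSC message arrives.
        z_to_a = owd_ms
        # LER-A bridges W1 into P only after LER-Z's reply comes back.
        a_to_z = 2 * owd_ms
    else:
        # LER-A bridges W1 into P immediately on failure detection.
        a_to_z = 0.0
        # LER-Z switches to P when the PSC message from LER-A arrives.
        z_to_a = owd_ms
    return {"a_to_z": a_to_z, "z_to_a": z_to_a, "full": max(a_to_z, z_to_a)}


if __name__ == "__main__":
    for mode in (True, False):
        label = "locking" if mode else "non-locking"
        print(label, restoration_times(10.0, mode))
```

With a one-way delay of 10 ms, this model restores full bidirectional
flow after 20 ms in locking mode and after 10 ms in non-locking mode,
which is the "roughly half" relationship described above.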
1.4. Path priority
As the 1:n architecture requires the ability for one working path to
preempt the traffic of another in the event of multiple failures (see
Section 1.5), there must be an indication of priority between the
different working paths so that an implementation can decide whether
a new failure should be allowed to preempt a protection switch
already in place. The priority for a given Working path is
determined by the value used to represent that path in the FPath
field of the PSC packet. When comparing two Working paths to
determine priority, the numerically lower FPath value is the winner.
That is, Wi>Wj if i<j.
As described in Section 4.2.2, valid FPath values for Working paths
are in the range 1-128.
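
The priority rule can be sketched in a few lines of Python. The
function name outranks is hypothetical; the comparison itself and the
1-128 range are taken from the text above.

```python
MIN_FPATH, MAX_FPATH = 1, 128  # valid FPath range for Working paths, per the text


def outranks(fpath_a: int, fpath_b: int) -> bool:
    """Return True if Working path fpath_a has strictly higher priority
    than Working path fpath_b (numerically lower FPath wins)."""
    for value in (fpath_a, fpath_b):
        if not MIN_FPATH <= value <= MAX_FPATH:
            raise ValueError(f"FPath {value} is outside {MIN_FPATH}-{MAX_FPATH}")
    return fpath_a < fpath_b


if __name__ == "__main__":
    print(outranks(1, 2))  # W1 compared with W2
    print(outranks(3, 2))  # W3 compared with W2
```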
1.5. Preemption
Preemption occurs, for example, when the protection path is being
used to transport traffic and is then required to transport traffic
for a working path with higher priority. At this point, the current
traffic that is being transported on the protection path needs to be
interrupted to allow the transport of the protected traffic.
There are two basic scenarios for preemption of traffic -
1. When the protection path is used to transport "extra traffic".
While this practice is discouraged by [TPReq], it is still not
precluded. When the protection domain triggers a protection
switch, the extra traffic should be preempted to allow the
transport of the protected traffic from the working path that
triggered the switching operation. The subsequent treatment of
the interrupted service is out of the scope of this document.
2. When the protection path is transporting traffic from a working
path and a second working path triggers a switching condition.
This second trigger may take precedence either because it is a
request with a higher priority (e.g. FS after a SF) or because
the operator had assigned a higher priority to the working path
of the second trigger. At this point, the traffic for the lower
priority working path will be interrupted, and the higher
priority traffic will be transmitted on the protection path. The
preempted traffic will resume transmission only when either its
working path recovers or the higher priority traffic relinquishes
control of the protection path.
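
The two scenarios can be expressed as a single decision function. The
Python sketch below is a simplification under stated assumptions: the
name should_preempt, the Request tuple, and the REQUEST_RANK table are
hypothetical, and the table contains only the FS-over-SF ordering
given as an example above rather than the complete PSC request
priority list. Extra traffic is modelled as "no protected path on P".

```python
from typing import Optional, Tuple

# Assumption: only the ordering quoted in the text (FS outranks SF).
REQUEST_RANK = {"FS": 2, "SF": 1}

# (request type, FPath of the working path that raised it)
Request = Tuple[str, int]


def should_preempt(current: Optional[Request], new: Request) -> bool:
    """Decide whether a new trigger takes over the protection path."""
    if current is None:
        # Scenario 1: P is idle or carries only extra traffic,
        # which always yields to protected traffic.
        return True
    cur_type, cur_fpath = current
    new_type, new_fpath = new
    if REQUEST_RANK[new_type] != REQUEST_RANK[cur_type]:
        # Scenario 2a: the request types differ, higher rank wins.
        return REQUEST_RANK[new_type] > REQUEST_RANK[cur_type]
    # Scenario 2b: same request type, numerically lower FPath wins.
    return new_fpath < cur_fpath


if __name__ == "__main__":
    print(should_preempt(("SF", 2), ("SF", 1)))  # W1 fails while W2 is protected
    print(should_preempt(("SF", 1), ("SF", 2)))  # W2 fails while W1 is protected
```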
1.6. Contributing authors
Nurit Sprecher (NSN)
2. Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
2.1. Acronyms
This draft uses the following acronyms:
Ack       Acknowledge
DNR       Do not revert
FS        Forced Switch
LER       Label Edge Router
LO        Lockout of protection
MPLS-TP   Transport Profile for MPLS
MS        Manual Switch
NR        No Request
P2P       Point-to-point
P2MP      Point-to-multipoint
PSC       Protection State Coordination Protocol
SD        Signal Degrade
SF        Signal Fail
WFA       Wait for Acknowledge
WTR       Wait-to-Restore
2.2. Definitions and Terminology
The terminology used in this document is based on the terminology
defined in [RFC4427] and further adapted for MPLS-TP in [SurvivFwk].
In addition, we use the term LER to refer to an MPLS Network Element,
whether it is an LSR, LER, T-PE, or S-PE.
3. Use cases and scenarios
This section presents some use-cases and scenarios that should
elucidate the use of PSC for 1:n protection.
3.1. Non-locking use case: Per-node label space
Non-locking protection can be used when the payload that is received
from the protection path is unambiguous and can be properly forwarded
without the need to explicitly establish selector and bridge
configuration at the time of failure. One example where this applies
is when the endpoints of the protection domain are using per-platform
label space [RFC3031].
In per-node or per-platform label space, the LIB is established on a
node such that it can properly switch any labeled packet regardless
of input interface.
Consider, as an example, the protection topology as shown in Figure 1
with four working paths - W1, W2, W3, W4 and a single protection
path, P, that connect LER-A and LER-Z. Each packet
transported from LER-A to LER-Z is labelled by LER-A depending upon
the path used to transmit the packet. From there the packet will
traverse the relevant path and have its label manipulated by the
intermediate LSRs until it arrives at LER-Z, at which point, the LER
will pop the label for the path used within the protection domain and
process the next label down to determine how to forward the packet
payload. The following table gives the label assigned by LER-A and
the one expected by LER-Z for each of the transport paths:
+------+----------------+-----------------+
| Path | Label at LER-A | Label for LER-Z |
+------+----------------+-----------------+
| W1   | 100            | 105             |
| W2   | 200            | 205             |
| W3   | 300            | 305             |
| W4   | 400            | 405             |
| P    | 500            | 505             |
+------+----------------+-----------------+
If there is a pseudowire (PW) that needs to be carried over one of
these transport paths between LER-A and LER-Z, whose label is
allocated from the per-platform label space on both LER-A and LER-Z
(e.g. label 888), then when a packet for this PW is transported over
W2, the label stack that will be sent from LER-A will be [200|888|..]
and it will arrive at LER-Z with a label stack [205|888|..]. If W2
were to report a failure that triggers a protection switch and LER-A
would redirect a packet for this PW to P, it would be transported
with a label stack of [500|888|..] and be received by LER-Z with a
label stack [505|888|..]. Since the PW label is drawn from per-node
label space, when LER-Z pops the path label it will be able to
process the PW label regardless of the transport path that was used
between LER-A & LER-Z.
Since the forwarding behavior is preestablished, there is no need to
ensure that LER-A and LER-Z coordinate the bridge/selector functions
as part of the protection protocol. This is true for any underlying
label assigned from per-node space. The label can be allocated by
LDP, MPLS VPNs, PWs, TE tunnels, or any other application. As long
as the label is preprogrammed in the receiving node's label space,
coordination of the bridge/selection functions is unnecessary.
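
The per-platform behavior can be illustrated with a short Python
sketch. The path labels and PW label 888 come from the example above;
the dictionary names, the function forward_at_ler_z, and the
forwarding result string are hypothetical.

```python
# Path labels expected by LER-Z, taken from the table in this section.
PATH_LABELS_AT_LER_Z = {105: "W1", 205: "W2", 305: "W3", 405: "W4", 505: "P"}

# Per-platform LIB: one entry per label, independent of the incoming path.
# The forwarding result string is a placeholder.
PLATFORM_LIB = {888: "PW 888"}


def forward_at_ler_z(label_stack):
    """Pop the path label, then look up the next label in the single
    per-platform table. The arrival path plays no part in the lookup."""
    path_label, inner_label = label_stack[0], label_stack[1]
    if path_label not in PATH_LABELS_AT_LER_Z:
        raise KeyError(f"unknown path label {path_label}")
    return PLATFORM_LIB[inner_label]


if __name__ == "__main__":
    print(forward_at_ler_z([205, 888]))  # PW packet arriving over W2
    print(forward_at_ler_z([505, 888]))  # same PW packet arriving over P
```

Both calls resolve to the same forwarding entry, which is why no
bridge/selector coordination is needed in this case.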
3.2. Locking use-case:
Locking protection must be used when the payload that is received on
the protection path is ambiguous; that is, the switching behavior for
the payload of the protection path must be established at the time of
failure. One such example where this applies is when the endpoints
of the protection domain are using per-interface label space, where
the Working and Protect LSPs are instantiated as interfaces.
In per-interface label space, a node may use the same label value to
represent different switching behaviors on different interfaces. For
example, the label value 100 when received on LSP W1 may be treated
differently than the label value 100 when received on LSP W2. Since
either W1 or W2 may be protected in P, LSP P must ensure that it has
the proper forwarding behavior defined for label 100. Using the
wrong forwarding behavior (e.g. programming P's label space with W1's
entry for label 100 when P is protecting W2) is likely to lead to
misconnectivity.
Consider, as an example, the protection topology as shown in Figure 1
and in Section 3.1. There are four working paths - W1, W2, W3, W4 -
and a single protection path, P, that connect LER-A and
LER-Z. Section 3.1 shows a table with the receive labels [105, 205,
305, 405, 505] at LER-Z, and those do not change. What changes is
the payload of those labels. Section 3.1 gives the example of a PW
drawn from per-platform label space which uses the label 888 - this
label receives the same forwarding behavior no matter which LSP is
used to carry it from LER-A to LER-Z.
In per-interface label space, each W-LSP has its own label space.
For this example, consider a PW switched over W1 with the outgoing
label 900. Thus, the label stack when leaving LER-A is [100|900] and
when arriving at LER-Z is [105|900]. There is also a PW defined over
W2 which also uses label 900, but with a different forwarding
behavior. The per-interface label switching tables on LER-Z look
like this:
+-----------------+-------+--------------------------------+
| Input Interface | Label | Switching behavior             |
+-----------------+-------+--------------------------------+
| W1              | 900   | Switch to Access Circuit #1    |
| W2              | 900   | Switch to Access Circuit #2    |
| W3              | 900   | Switch to Access Circuit #3    |
| W4              | 900   | Switch to Access Circuit #4    |
| P               | 900   | none defined (drop, log error) |
+-----------------+-------+--------------------------------+
The label space for P is established at the time of failure, using
PSC. When there is no failure, there is no switching behavior
defined for the P LSP's contents.
When the protection domain has determined that W2 has failed and
needs to be switched, it coordinates this protection, using PSC,
between LER-A and LER-Z. Part of the coordination is to establish
the proper receive behavior on LER-Z, i.e. setting the switching
behavior on input interface P for Label 900 to "Switch to Access
Circuit #2". If, instead, W1 fails and preempts W2, the switching
behavior on LER-Z is changed to be "Switch to Access Circuit #1".
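
The per-interface case can be modelled as one table per input
interface, with the table for P filled in only when a protection
switch is coordinated. The Python sketch below is illustrative: the
class LerZPerInterfaceLib and its method names are hypothetical, and
the table contents mirror the example above.

```python
class LerZPerInterfaceLib:
    """Hypothetical model of the per-interface label tables on LER-Z."""

    def __init__(self):
        self.tables = {
            "W1": {900: "Access Circuit #1"},
            "W2": {900: "Access Circuit #2"},
            "W3": {900: "Access Circuit #3"},
            "W4": {900: "Access Circuit #4"},
            "P": {},  # nothing defined for P while there is no failure
        }

    def program_protection(self, working: str) -> None:
        """Give P the forwarding behavior of the protected working path.
        In the protocol this step is coordinated using PSC."""
        if working == "P" or working not in self.tables:
            raise ValueError(f"{working!r} is not a working path")
        self.tables["P"] = dict(self.tables[working])

    def lookup(self, interface: str, label: int):
        """Return the switching behavior, or None (drop, log error)."""
        return self.tables[interface].get(label)


if __name__ == "__main__":
    lib = LerZPerInterfaceLib()
    print(lib.lookup("P", 900))   # before any failure
    lib.program_protection("W2")
    print(lib.lookup("P", 900))   # W2 protected in P
    lib.program_protection("W1")
    print(lib.lookup("P", 900))   # W1 preempts W2
```

A packet with label 900 that arrives on P between the two
program_protection calls is forwarded using W2's entry, which is the
misconnectivity risk if the endpoints have not yet agreed on which
path is protected.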
Clearly it is imperative that there be no misconnectivity. This
requirement means that there must be a "lock" on P established, such
that there are no packets transmitted on an LSP until both ends agree
on the switching behavior for that LSP. The details of the behavior
in the locking use cases are explored further in Section 3.3 of this
document.
3.3. PSC Scenarios
This section discusses the message exchange necessary to perform both
non-locking and locking PSC options for 1:n protection. There are
several examples presented here that attempt to cover all the
combinations of failure and preemption, unidirectional and
bidirectional protection for the two modes of operation. This is a
non-exhaustive set of scenarios; they were chosen to highlight the
main features of the proposal.
It is not the intent of this document to spell out all the
combinations of preemption, directionality and locking behavior which
can occur. That is not how one builds a robust protocol. This
document spells out a state machine which reacts appropriately in all
possible cases, and as part of that walks through some of the failure
cases as examples. PSC is, at its heart, a simple protocol. A node
is aware of both its local status and the status of the remote node,
and transitions to the appropriate state and takes appropriate action
based on the combination of these two states. Preemption, which as
noted is only relevant in 1:n, does not increase the complexity of
the protocol. The examples are detailed, but the behavior is quite
simple.
All of these examples assume a protection domain consisting of four
working paths [W1, W2, W3, W4] with priority in decreasing order,
i.e. W1 > W2 etc. There is a single protection path, P. These
examples use the notation "B = x" to indicate the working LSP whose
contents are bridged into the protect LSP. For example, if W3 has
failed and is currently protected, B = 3. If no protection is in
place, B = n/a. All examples end with the REQ(FPath, Path) and B
values for each node in each example.
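
For readers who want to process the examples mechanically, the
REQ(FPath, Path) shorthand can be parsed as shown in the Python sketch
below. This handles only the notation used in the scenario tables; it
is not the PSC packet format. The names PscNotation and parse are
hypothetical.

```python
import re
from dataclasses import dataclass

# Matches strings such as "SF(1,1)" or "NR(0,1)".
_NOTATION = re.compile(r"^([A-Z]+)\((\d+),(\d+)\)$")


@dataclass(frozen=True)
class PscNotation:
    request: str  # e.g. "NR" or "SF"
    fpath: int    # path reported as faulty (0 when none)
    path: int     # working path carried on P (0 when none)

    def __str__(self) -> str:
        return f"{self.request}({self.fpath},{self.path})"


def parse(text: str) -> PscNotation:
    """Parse the REQ(FPath, Path) shorthand used in the examples."""
    match = _NOTATION.match(text.replace(" ", ""))
    if match is None:
        raise ValueError(f"not in REQ(FPath,Path) form: {text!r}")
    return PscNotation(match.group(1), int(match.group(2)), int(match.group(3)))


if __name__ == "__main__":
    message = parse("SF(1,1)")
    print(message.request, message.fpath, message.path)
    print(parse("NR(0, 1)"))
```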
The non-locking cases assume that both LER-A and LER-Z have
preestablished per-node label spaces, as per the use case above.
All cases assume that on-box operations such as bridging or selecting
are instantaneous. The one-way delay between nodes is abbreviated
OWD, and the round trip time is RTT (i.e. RTT = 2 x OWD).
3.3.1. Unidirectional failure cases
The examples in this section provide the message flow between LER-A
and LER-Z for the scenario where a unidirectional fault is detected
by LER-A on working path W1. The message flow is described as a
sequence along a timeline.
3.3.1.1. Non-locking
For a protection domain operating in non-locking mode, the event
timeline is as follows:
+--------------------------------------------------------------------+
|Time|             Event Description             |LER-A PSC|LER-Z PSC|
|    |                                           | Bridge  | Bridge  |
+----+-------------------------------------------+---------+---------+
| t0 | Traffic is being transported on W1, P is  | NR(0,0) | NR(0,0) |
|    | not carrying any traffic. Both LER-A and  | B = n/a | B = n/a |
|    | LER-Z transmitting PSC NR(0,0) message.   |         |         |
+----+-------------------------------------------+---------+---------+
| t1 | LER-A detects SF on W1, bridges W1 into P | SF(1,1) | NR(0,0) |
|    | and sends SF(1,1). LER-A enters the WFA   | B = 1   | B = n/a |
|    | (Wait for Acknowledge) state. LER-A still |         |         |
|    | selects the traffic from W1. This is      |         |         |
|    | admittedly of little use when LER-A sees  |         |         |
|    | SF, but may be useful when LER-A          |         |         |
|    | encounters a partial failure such as SD.  |         |         |
+----+-------------------------------------------+---------+---------+
| t2 | LER-Z receives SF(1,1). LER-Z enters      | SF(1,1) | NR(0,1) |
|    | PF:W:R state. LER-Z switches W1 onto P    | B = 1   | B = 1   |
|    | and sends NR(0,1). At this point traffic  |         |         |
|    | for W1 is protected in both directions.   |         |         |
+----+-------------------------------------------+---------+---------+
| t3 | LER-A receives NR(0,1), which it takes as | SF(1,1) | NR(0,1) |
|    | an Ack from LER-Z. LER-A transits from    | B = 1   | B = 1   |
|    | WFA to PF:W:L state. Switch is complete.  |         |         |
+----+-------------------------------------------+---------+---------+
Figure 2: Unidirectional non-locking
Note: Between t1 and t2, LER-A transports the data traffic on P while
LER-Z continues transporting it on W1, and there is temporary path
asymmetry. After t2, the data traffic is in P in both directions.
In this case, LER-A loses traffic for the OWD time, as it does not
receive any traffic from LER-Z on P until LER-Z bridges W1 into P.
LER-Z does not lose any traffic due to the immediate bridging on
LER-A.
3.3.1.2. Locking
When examining the similar scenario for a protection domain that is
using the Locking mode of operation, we have the following time
sequence:
+--------------------------------------------------------------------+
|Time|             Event Description             |LER-A PSC|LER-Z PSC|
|    |                                           | Bridge  | Bridge  |
+----+-------------------------------------------+---------+---------+
| t0 | Traffic is being transported on W1, P is  | NR(0,0) | NR(0,0) |
|    | not carrying any traffic. Both LER-A and  | B = n/a | B = n/a |
|    | LER-Z transmitting PSC NR(0,0) message.   |         |         |
+----+-------------------------------------------+---------+---------+
| t1 | LER-A detects SF on W1, enters the WFA    | SF(1,0) | NR(0,0) |
|    | state and sends SF(1,0). LER-A still      | B = n/a | B = n/a |
|    | transports and selects the traffic from   |         |         |
|    | W1. This allows traffic to get through if |         |         |
|    | the failure is truly unidirectional.      |         |         |
+----+-------------------------------------------+---------+---------+
| t2 | LER-Z receives SF(1,0). LER-Z enters      | SF(1,0) | NR(0,1) |
|    | PF:W:R state. LER-Z bridges W1 into P and | B = 1   | B = 1   |
|    | sends NR(0,1) but continues to select     |         |         |
|    | traffic from W1.                          |         |         |
+----+-------------------------------------------+---------+---------+
| t3 | LER-A receives NR(0,1), which it takes as | SF(1,1) | NR(0,1) |
|    | an Ack from LER-Z. LER-A completely       | B = 1   | B = 1   |
|    | switches W1 traffic onto P. LER-A transits|         |         |
|    | from WFA to PF:W:L state. Switch complete.|         |         |
+----+-------------------------------------------+---------+---------+
| t4 | LER-Z receives SF(1,1). LER-Z selects W1  | SF(1,1) | NR(0,1) |
|    | traffic from P and sends NR(0,1).         | B = 1   | B = 1   |
+----+-------------------------------------------+---------+---------+
Figure 3: Unidirectional locking
Note: At t1, LER-A stops sending traffic to LER-Z. At t3, it
resumes. Since the majority of the time delay at both t1 and t2 is
the one-way transmission delay between LER-A and LER-Z, there is a
total of 1xRTT traffic loss at both endpoints.
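
The two-phase exchange of Figure 3 can be traced with a small model.
The Python sketch below is a simplification under stated assumptions:
the class LockingEndpoint and its fields are hypothetical, it covers
only this unidirectional SF scenario, and it is not the state machine
defined later in this document. The state names and messages follow
the event descriptions in the figure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LockingEndpoint:
    """One end of the protection domain in locking mode (illustrative)."""
    state: str = "Normal"
    tx: str = "NR(0,0)"             # PSC message currently transmitted
    bridge: Optional[int] = None    # working path copied into P
    selector: Optional[int] = None  # working path whose traffic is taken from P

    def local_sf(self, path: int) -> None:
        # Phase 1, near end: report the fault but do not use P yet.
        self.tx = f"SF({path},0)"
        self.state = "WFA"

    def receive(self, request: str, fpath: int, path: int) -> None:
        if self.state == "Normal" and request == "SF" and path == 0:
            # Phase 1, far end: bridge into P, keep selecting from W.
            self.bridge = fpath
            self.tx = f"NR(0,{fpath})"
            self.state = "PF:W:R"
        elif self.state == "WFA" and request == "NR" and path != 0:
            # Phase 2, near end: the Ack arrived, complete the switch.
            self.bridge = path
            self.selector = path
            self.tx = f"SF({path},{path})"
            self.state = "PF:W:L"
        elif self.state == "PF:W:R" and request == "SF" and path == self.bridge:
            # Phase 2, far end: the near end now uses P, select from it.
            self.selector = path


if __name__ == "__main__":
    ler_a, ler_z = LockingEndpoint(), LockingEndpoint()
    ler_a.local_sf(1)            # t1
    ler_z.receive("SF", 1, 0)    # t2
    ler_a.receive("NR", 0, 1)    # t3
    ler_z.receive("SF", 1, 1)    # t4
    print(ler_a)
    print(ler_z)
```

In this model LER-A places no W1 traffic on P until it has processed
the reply from LER-Z, which is the "lock" that distinguishes this mode
from non-locking operation.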
3.3.2. Bidirectional fault scenarios
The examples above focused on unidirectional failures in order to
illustrate the basic principles of 1:n protection. However, most
failures in carrier networks are bidirectional in nature.
Bidirectionality includes not only the failure of both the tx and rx
physical path (e.g. a fiber cut) but also a unidirectional failure
made bidirectional by mechanisms outside of PSC such as CC-V or LDI.
Both ends of a protection domain may not see the bidirectional
failure at the same instant. In the case of a true bidirectional
fiber cut, the cut may be physically closer to one end of the domain
than the other, and thus the end which is farther away takes longer
to notice the failure. This is referred to as "asymmetric
notification delay" in this document. Similarly, a unidirectional
failure seen by one endpoint which triggers an LDI notification to
the far endpoint will not be recognized by this far end until after
it has been noticed at the near endpoint.
There are a number of scenarios that constitute bidirectional
failure, and the variety of triggers and notification delays mean
that it is impossible to document them all here. The scenario used
in this case is of a true bidirectional failure, on working path W1,
with asymmetric notification delay, as described above. Both the
case of Non-locking and Locking operation modes are presented.
It is important to understand that a node, when reacting to a
failure, simply reacts either to its local LSP status (e.g. SF on
the underlying fiber) or to the status of the remote node (e.g. the
remote node sending SF(x,y)). A node neither knows nor cares whether
the failure is bidirectional; it simply reacts to inputs to its local
state machine. Consequently, no special states are needed for
unidirectional vs. bidirectional error handling.
3.3.2.1. Non-Locking
First we present the scenario when operating in non-locking mode:
+--------------------------------------------------------------------+
|Time| Event Description |LER-A PSC|LER-Z PSC|
| | | Bridge | Bridge |
+----+-------------------------------------------+---------+---------+
| t0 | Traffic is being transported on W1, P is | NR(0,0) | NR(0,0) |
| | not carrying any traffic. Both LER-A and | B = n/a | B = n/a |
| | LER-Z transmitting PSC NR(0,0) message. | | |
+----+-------------------------------------------+---------+---------+
| t1 | LER-A detects SF on W1, bridges W1 into P | SF(1,1) | NR(0,0) |
| | and sends SF(1,1). LER-A enters into WFA | B = 1 | B = n/a |
| | state and continues to select the traffic | | |
| | from W1. | | |
+----+-------------------------------------------+---------+---------+
| t2 | LER-Z detects the SF on W1. LER-Z enters | SF(1,1) | SF(1,1) |
| | WFA state, bridges W1 into P and | B = 1 | B = 1 |
| | transmits SF(1,1). At this point | | |
| | traffic for W1 is protected in both | | |
| | directions, however the endpoints are | | |
| | still not coordinated | | |
+----+-------------------------------------------+---------+---------+
| t3 | LER-Z receives the SF(1,1) from LER-A and | SF(1,1) | SF(1,1) |
| | considers it an Ack and transits from WFA | B = 1 | B = 1 |
| | to PF:W:L state | | |
+----+-------------------------------------------+---------+---------+
| t4 | LER-A receives SF(1,1), which it takes as | SF(1,1) | SF(1,1) |
| | an Ack from LER-Z and transits from WFA | B = 1 | B = 1 |
| | to PF:W:L state. Switch is complete. | | |
+----+-------------------------------------------+---------+---------+
Figure 4: Bidirectional non-locking
It is perhaps instructive to note that the only differences between
the unidirectional non-locking and bidirectional non-locking
scenarios are the trigger at t2 which causes Z to send SF(1,1) and
the state Z finally enters (PF:W:L rather than PF:W:R). All other
actions before and after this point are identical between the two
cases.
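The observation that a node reacts only to its own inputs can be
illustrated with a small transition function. The sketch below is a
simplified model of the non-locking exchange in Figure 4 and of the
unidirectional case it is compared against. The function name, the
tuple encoding of states and events, and the restriction to SF
triggers are assumptions made for illustration; they are not part of
the protocol definition.

```python
def step(state, event):
    """Return (next_state, message) for one endpoint in non-locking mode.

    state  : (name, x) where name is "Normal", "WFA", "PF:W:L" or
             "PF:W:R" and x is the protected working path (0 if none)
    event  : ("local_sf", w) or ("remote", request, fpath, path)
    message: (request, fpath, path) to start sending, or None if the
             transmitted message does not change
    """
    name, x = state
    if event[0] == "local_sf":
        w = event[1]
        # Local SF: bridge W into P, announce SF(w,w), wait for the far end.
        return ("WFA", w), ("SF", w, w)
    _, request, fpath, path = event
    if name == "WFA" and path == x:
        # The far end reports the same working path on P: treat as an Ack.
        return ("PF:W:L", x), None
    if name == "Normal" and request == "SF" and fpath != 0:
        # The far end reports a failure not seen locally: follow it.
        return ("PF:W:R", fpath), ("NR", 0, fpath)
    return state, None


# Bidirectional failure (Figure 4): both ends detect SF on W1.
a, msg_a = step(("Normal", 0), ("local_sf", 1))
z, msg_z = step(("Normal", 0), ("local_sf", 1))
z, _ = step(z, ("remote",) + msg_a)
a, _ = step(a, ("remote",) + msg_z)
print(a, z)
```

In this model the same function serves both cases: the endpoint ends
in PF:W:L when it saw the failure itself and in PF:W:R when it only
heard of it from the far end.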
3.3.2.2. Locking
We now follow the scenario for the locking mode of operation:
+--------------------------------------------------------------------+
|Time| Event Description |LER-A PSC|LER-Z PSC|
| | | Bridge | Bridge |
+----+-------------------------------------------+---------+---------+
| t0 | Traffic is being transported on W1, P is | NR(0,0) | NR(0,0) |
| | not carrying any traffic. Both LER-A and | B = n/a | B = n/a |
| | LER-Z transmitting PSC NR(0,0) message. | | |
+----+-------------------------------------------+---------+---------+
| t1 | LER-A detects SF on W1 and sends SF(1,0). | SF(1,0) | NR(0,0) |
| | LER-A enters WFA and continues to bridge | B = n/a | B = n/a |
| | and select the traffic from W1. This | | |
| | allows traffic to get through if the | | |
| | failure is really unidirectional. | | |
+----+-------------------------------------------+---------+---------+
| t2 | LER-Z detects the SF on W1. LER-Z enters | SF(1,0) | SF(1,0) |
| | WFA state and continues to bridge and | B = n/a | B = n/a |
| | select traffic from W1 while transmitting | | |
| | SF(1,0). | | |
+----+-------------------------------------------+---------+---------+
| t3 | LER-Z receives the SF(1,0) from LER-A and | SF(1,0) | SF(1,1) |
| | bridges traffic from W1 to P remaining in | B = n/a | B = 1 |
| | WFA state now transmitting a SF(1,1) | | |
+----+-------------------------------------------+---------+---------+
| t4 | LER-A receives the SF(1,0) from LER-Z and | SF(1,1) | SF(1,1) |
| | bridges traffic from W1 to P remaining in | B = 1 | B = 1 |
| | WFA state now transmitting a SF(1,1) | | |
+----+-------------------------------------------+---------+---------+
| t5 | LER-A receives the SF(1,1) from LER-Z and | SF(1,1) | SF(1,1) |
| | considers it an Ack and transits from WFA | B = 1 | B = 1 |
| | to PF:W:L state | | |
+----+-------------------------------------------+---------+---------+
| t6 | LER-Z receives SF(1,1), which it takes as | SF(1,1) | SF(1,1) |
| | an Ack from LER-A and transits from WFA | B = 1 | B = 1 |
| | to PF:W:L state. Switch is complete. | | |
+----+-------------------------------------------+---------+---------+
Figure 5: Bidirectional locking
As with non-locking, the major differences between the unidirectional
and bidirectional scenarios of this failure are the alarm which
causes LER-Z to take action and the final state LER-Z enters as a
result.
3.3.3. Preemption scenarios
In addition to a bidirectional failure, it is also necessary to
consider preemption. When protecting n entities, e.g. [W1, W2, W3], it
is possible for multiple working LSPs to simultaneously fail.
Consider the case where LSP W1 fails and starts to use the protection
LSP. After this failure, LSP W2 fails before W1 has been restored.
If W2 is of a lower relative priority than W1, there is no
preemption. However, if W2 has a higher priority than W1, when W2
fails it preempts W1 from the protection LSP. Preemption is not an
issue in 1:1 or 1+1, as with only a single working LSP there is
nothing to preempt.
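The preemption rule can be sketched as a selection among the working
paths that are currently failed. This is an illustrative sketch only.
The function name and the convention that a smaller number means a
higher priority are assumptions made for the example; this document
does not define how priorities are represented.

```python
from typing import Iterable, Mapping, Optional


def select_protected(failed: Iterable[int],
                     priority: Mapping[int, int]) -> Optional[int]:
    """Pick the working path whose traffic should use the protection LSP.

    failed   : indices of the working paths currently in a failed state
    priority : working path index -> priority (assumption: a smaller
               number means a higher priority)
    Returns None when no working path has failed.
    """
    candidates = list(failed)
    if not candidates:
        return None
    return min(candidates, key=lambda w: priority[w])


priority = {1: 1, 2: 2, 3: 3}
print(select_protected([2], priority))     # W2 alone uses the protection LSP
print(select_protected([2, 1], priority))  # W1 fails later and preempts W2
```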
There are multiple scenarios of preemption depending on where the
failures were detected. In addition to the combinations of failure
directionality and preemption, it is also necessary to consider how
these combinations behave in both the locking and non-locking modes
of operation.
First, consider the two flavors of preemption due to multiple
unidirectional failures.
The difference between Locking and Non-Locking modes is that a node
can continue to send traffic on the P-LSP during the preemption
process, when operating in Non-Locking mode. The P-LSP contents may
momentarily disagree (A may send W1 on P, Z may send W2 on P) but in
the non-locking case there is no risk of misconnectivity as explained
in the previous discussion. For this reason, the identity of the
path from which the endpoints are selecting incoming traffic is
irrelevant. In a sense there is no selector; each node is able to
properly process arbitrary data on the P-LSP.
However, WFA state is still necessary in order to ensure that the
endpoints converge on the identity of the working path whose traffic
is being transported on the P-LSP. Failure to converge is a problem
that should be flagged to the operator.
The scenarios start after the two endpoints have converged on
protecting a unidirectional SF condition that was detected on W2,
when a new SF condition is detected on W1 (with higher priority):
3.3.3.1. Unidirectional non-locking
First, consider the event sequence for unidirectional faults in a
domain in non-locking mode:
+--------------------------------------------------------------------+
|Time| Event Description |LER-A PSC|LER-Z PSC|
| | | Bridge | Bridge |
+----+-------------------------------------------+---------+---------+
| t0 | Traffic from W2 is being transported on P | SF(2,2) | NR(0,2) |
| | and both endpoints are coordinated | B = 2 | B = 2 |
+----+-------------------------------------------+---------+---------+
| t1 | LER-A detects SF on W1 and sends SF(1,1). | SF(1,1) | NR(0,2) |
| | LER-A enters into WFA, blocks the W2 | B = 1 | B = 2 |
| | traffic and begins transporting W1 traffic| | |
| | on P. (Since W1 has higher priority) | | |
+----+-------------------------------------------+---------+---------+
| t2 | LER-Z receives the SF(1,1) from LER-A and | SF(1,1) | NR(0,1) |
| | bridges traffic from W1 to P remaining in | B = 1 | B = 1 |
| | PF:W:R now transmitting a NR(0,1) | | |
+----+-------------------------------------------+---------+---------+
| t3 | LER-A receives the NR(0,1) from LER-Z and | SF(1,1) | NR(0,1) |
| | considers it an Ack and transits from WFA | B = 1 | B = 1 |
| | to PF:W:L state. Coordination complete | | |
+----+-------------------------------------------+---------+---------+
Figure 6: Preemption unidirectional non-locking
As mentioned, in steady state LER-A is sending SF(2,2) and LER-Z is
sending NR(0,2). If LER-A detects an SF on W1, W1 must preempt W2 in
its use of the protection LSP. What the network subsequently does
with W2 is outside the scope of PSC, but likely recovery actions may
include rerouting W2, alerting W2's clients as to the unprotected
failure status of W2, and so forth.
3.3.3.2. Unidirectional locking
In locking operation mode, when A detects an SF on W1, it needs to
alert the far-end, LER-Z, that the W2 traffic must be preempted.
LER-A does this by indicating an SF on the higher priority LSP and by
emptying the protection LSP. The following table presents the
sequence for this scenario (we include the indication of the working
path that is expected by each endpoint to be on the protection path,
shown as "S = n"):
+--------------------------------------------------------------------+
|Time| Event Description |LER-A PSC|LER-Z PSC|
| | | Bridge | Bridge |
| | |Selector | Selector|
+----+-------------------------------------------+---------+---------+
| t0 | Traffic from W2 is being transported on P | SF(2,2) | NR(0,2) |
| | and both endpoints are coordinated | B = 2 | B = 2 |
| | | S = 2 | S = 2 |
+----+-------------------------------------------+---------+---------+
| t1 | LER-A detects SF on W1 and sends SF(1,0). | SF(1,0) | NR(0,2) |
| | LER-A enters WFA and blocks all traffic | B = n/a | B = 2 |
| | on the protection path | S = n/a | S = 2 |
+----+-------------------------------------------+---------+---------+
| t2 | LER-Z receives the SF(1,0) from LER-A and | SF(1,0) | NR(0,1) |
| | bridges traffic from W1 to P (higher | B = n/a | B = 1 |
| | priority), and begins transmitting NR(0,1)| S = n/a | S = 2 |
| | At this point W1 traffic is flowing Z->A | | |
| | but not A->Z | | |
+----+-------------------------------------------+---------+---------+
| t3 | LER-A receives NR(0,1) from LER-Z and | SF(1,1) | NR(0,1) |
| | considers it an Ack and transits from WFA | B = 1 | B = 1 |
| | to PF:W:L state and transmits SF(1,1) | S = 1 | S = 2 |
+----+-------------------------------------------+---------+---------+
| t4 | LER-Z receives SF(1,1), and begins | SF(1,1) | NR(0,1) |
| | selecting the protected traffic as W1 data| B = 1 | B = 1 |
| | Switch is complete. | S = 1 | S = 1 |
+----+-------------------------------------------+---------+---------+
Figure 7: Preemption unidirectional locking
Traffic loss is asymmetric. Loss A->Z starts at t1 and ends at t4,
roughly 1.5xRTT. Loss Z->A starts at t1 and ends at t3, roughly
0.5xRTT.
3.3.3.3. Bidirectional non-locking
We now look, similarly, at the implications of preemption for the
basic scenarios of bidirectional faults on multiple working paths.
Both of the operating modes, i.e. non-locking and locking, are
presented. The scenarios begin at the point where W2 traffic is being
transported on the protection path in a coordinated fashion, when an
SF on W1 is detected by both endpoints of the 1:n protection domain.
W1 traffic has a higher priority than W2 traffic and, therefore,
will preempt the currently protected traffic.
The following presents the scenario in non-locking operation:
+--------------------------------------------------------------------+
|Time| Event Description |LER-A PSC|LER-Z PSC|
| | | Bridge | Bridge |
+----+-------------------------------------------+---------+---------+
| t0 | Traffic from W2 is being transported on P | SF(2,2) | NR(0,2) |
| | and both endpoints are coordinated | B = 2 | B = 2 |
+----+-------------------------------------------+---------+---------+
| t1 | LER-A detects SF on W1, bridges W1 into P | SF(1,1) | NR(0,2) |
| | and sends SF(1,1). LER-A enters into WFA | B = 1 | B = 2 |
| | state and continues to select the | | |
| | protected traffic from P that is for W2. | | |
+----+-------------------------------------------+---------+---------+
| t2 | LER-Z detects the SF on W1. LER-Z enters | SF(1,1) | SF(1,1) |
| | WFA state, bridges W1 into P and | B = 1 | B = 1 |
| | transmits SF(1,1). At this point | | |
| | traffic for W1 is protected in both | | |
| | directions, however the endpoints are | | |
| | still not coordinated | | |
+----+-------------------------------------------+---------+---------+
| t3 | LER-Z receives the SF(1,1) from LER-A and | SF(1,1) | SF(1,1) |
| | considers it an Ack and transits from WFA | B = 1 | B = 1 |
| | to PF:W:L state | | |
+----+-------------------------------------------+---------+---------+
| t4 | LER-A receives SF(1,1), which it takes as | SF(1,1) | SF(1,1) |
| | an Ack from LER-Z and transits from WFA | B = 1 | B = 1 |
| | to PF:W:L state. Switch is complete. | | |
+----+-------------------------------------------+---------+---------+
Figure 8: Preemption bidirectional non-locking
3.3.3.4. Bidirectional locking
When considering the locking mode of operation, we must consider that
the protection path, P, must be cleared of all traffic during the
transition of traffic caused by preemption. The bidirectional case
will be similar to the scenario for a unidirectional fault with the
major difference being the final state of the two endpoints. The
following would be the sequence of events:
+--------------------------------------------------------------------+
|Time| Event Description |LER-A PSC|LER-Z PSC|
| | | Bridge | Bridge |
| | |Selector | Selector|
+----+-------------------------------------------+---------+---------+
| t0 | Traffic from W2 is being transported on P | SF(2,2) | NR(0,2) |
| | and both endpoints are coordinated | B = 2 | B = 2 |
| | | S = 2 | S = 2 |
+----+-------------------------------------------+---------+---------+
| t1 | LER-A detects SF on W1 and sends SF(1,0). | SF(1,0) | NR(0,2) |
| | LER-A enters WFA and blocks all traffic | B = n/a | B = 2 |
| | on the protection path | S = n/a | S = 2 |
+----+-------------------------------------------+---------+---------+
| t2 | LER-Z detects the SF on W1. LER-Z enters | SF(1,0) | SF(1,0) |
| | WFA state and blocks all traffic on the | B = n/a | B = n/a |
| | protection path while transmitting SF(1,0)| S = n/a | S = n/a |
+----+-------------------------------------------+---------+---------+
| t3 | LER-Z receives the SF(1,0) from LER-A and | SF(1,0) | SF(1,1) |
| | bridges traffic from W1 to P (higher | B = n/a | B = 1 |
| | priority). At this point W1 traffic is | S = n/a | S = n/a |
| | flowing Z->A but not A->Z | | |
+----+-------------------------------------------+---------+---------+
| t4 | LER-A receives SF(1,1) from LER-Z and | SF(1,1) | SF(1,1) |
| | considers it an Ack and transits from WFA | B = 1 | B = 1 |
| | to PF:W:L state | S = 1 | S = n/a |
+----+-------------------------------------------+---------+---------+
| t5 | LER-Z receives SF(1,1), and begins | SF(1,1) | SF(1,1) |
| | selecting the protected traffic as W1 data| B = 1 | B = 1 |
| | Switch is complete. | S = 1 | S = 1 |
+----+-------------------------------------------+---------+---------+
Figure 9: Preemption bidirectional locking
4. Changes to PSC
The Protection State Coordination protocol (PSC) is defined in
[LinProt]. This includes both the format of the G-ACh based message
as well as a description of the operations and the state transition
logic of the protocol. The extension to cover 1:n protection
includes changes to both aspects of PSC.
The changes to the message structure include both the addition of
new information and extension of the semantics of some of the
existing fields of the message. These changes will be described in
Section 4.2.
The changes relative to the behavior of the base PSC protocol will be
described in Section 4.3.
4.1. PSC
Base PSC (as defined in [LinProt]) is a single-phased protocol, i.e.
the endpoints perform protection switching without waiting for
acknowledgement from the far end LER. The protocol messages are
transmitted using the G-ACh and the format is described in Figure 10.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0 0 0 1|Version|   Reserved    |        PSC-CT = 0x0024        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Ver|Request|PT |R|  Reserved1  |     FPath     |     Path      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         TLV Length            |          Reserved2            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                         Optional TLVs                         ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 10: Format of basic PSC packet with a G-ACh header
With regard to the G-ACh header, the extensions for 1:n protection
make no changes; i.e., the channel type field will
continue to use the PSC-CT value defined in [LinProt]. The PSC
payload fields affected by this document are the Ver field, the
Reserved1 field, and the FPath and Path fields.
4.2. Changes to PSC Payload
In order to support 1:n protection there is a need to make one small
change to the format of the PSC payload (see Figure 11). In
particular, we have added a new flag (L), taken from the Reserved1
space, that is used to indicate whether the protection domain is
operating in locking or non-locking mode. In addition, the semantics
of the FPath and Path fields are adjusted to indicate an index of the
multiple working paths. The details of these changes are supplied in
the following subsections.
Due to the significance of these changes, the value of the Ver field
(in the PSC payload) for a 1:n protection domain MUST be set to 2.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Ver|Request|PT |R|L| Reserved1 |     FPath     |     Path      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         TLV Length            |          Reserved2            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                         Optional TLVs                         ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 11: Format of 1:n PSC message payload
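As an informal aid, the fixed part of the payload in Figure 11 could
be encoded and decoded as sketched below. The bit positions are read
from the figure (Ver 2 bits, Request 4, PT 2, R 1, L 1, Reserved1 6,
FPath 8, Path 8, followed by the 16-bit TLV Length and Reserved2).
The type and function names are hypothetical, and the Request value
used in the example is only a placeholder; the assigned Request codes
are those of [LinProt].

```python
import struct
from typing import NamedTuple


class PscPayload(NamedTuple):
    ver: int             # 2 bits, MUST be 2 for a 1:n protection domain
    request: int         # 4 bits
    pt: int              # 2 bits
    r: int               # 1 bit
    l: int               # 1 bit, Locking flag
    fpath: int           # 8 bits
    path: int            # 8 bits
    tlv_length: int = 0  # 16 bits


def pack_psc(p: PscPayload) -> bytes:
    """Encode the fixed 8 octets of the payload in network byte order."""
    word = (
        ((p.ver & 0x3) << 30)
        | ((p.request & 0xF) << 26)
        | ((p.pt & 0x3) << 24)
        | ((p.r & 0x1) << 23)
        | ((p.l & 0x1) << 22)
        # bits 21..16 are Reserved1 and are sent as zero
        | ((p.fpath & 0xFF) << 8)
        | (p.path & 0xFF)
    )
    # Reserved2 is sent as zero.
    return struct.pack("!IHH", word, p.tlv_length, 0)


def unpack_psc(data: bytes) -> PscPayload:
    """Decode the fixed 8 octets; reserved bits are ignored."""
    word, tlv_length, _reserved2 = struct.unpack("!IHH", data[:8])
    return PscPayload(
        ver=(word >> 30) & 0x3,
        request=(word >> 26) & 0xF,
        pt=(word >> 24) & 0x3,
        r=(word >> 23) & 0x1,
        l=(word >> 22) & 0x1,
        fpath=(word >> 8) & 0xFF,
        path=word & 0xFF,
        tlv_length=tlv_length,
    )


# Placeholder Request code 10; locking mode, FPath = 1, Path = 0.
msg = PscPayload(ver=2, request=10, pt=0, r=0, l=1, fpath=1, path=0)
raw = pack_psc(msg)
print(raw.hex(), unpack_psc(raw) == msg)
```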
4.2.1. Locking (L) flag
The Locking flag is used to indicate that the end-point is configured
for Locking mode (see Section 1.2).
If the value is 1, the protection domain is operating in locking
mode; if the value is 0, it is operating in non-locking mode.
The Locking flag must be the same on both ends; if the two endpoints
of a protection domain have different L-flag settings, this MUST
raise an error to the network operator.
4.2.2. Fault path (FPath) field
The FPath field indicates which path is identified to be in a fault
condition or affected by an administrative command. The following
are the possible values:
o 0: indicates that the anomaly condition is on the protection path
o 1-128: indicates that the anomaly condition is on a working path
whose index is indicated.
o 129-255: for future extensions or experimental use.
4.2.3. Data path (Path) field
The Path field indicates which data is being transmitted on the
protection path. Under normal conditions, the protection path does
not need to carry any user data traffic, but may carry extra traffic.
If there is a failure/degrade condition on one of the working paths,
then that working path's data traffic will be transmitted over the
protection path. The following are the possible values:
o 0: indicates that the protection path is not transporting user
data traffic.
o 1-128: indicates that the protection path is transmitting user
traffic replacing the use of the working path indexed.
o 129-255: for future extensions or experimental use.
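The value ranges of the FPath and Path fields can be summarized in a
small helper. This is a sketch for illustration; the function names
and the returned strings are assumptions and carry no protocol
meaning.

```python
def classify_index(value: int) -> str:
    """Classify an FPath or Path value into the ranges listed above."""
    if not 0 <= value <= 255:
        raise ValueError("FPath and Path are 8-bit fields")
    if value == 0:
        return "zero"
    if value <= 128:
        return "working"
    return "reserved"  # 129-255: future extensions or experimental use


def describe_fpath(value: int) -> str:
    kind = classify_index(value)
    if kind == "zero":
        return "anomaly on the protection path"
    if kind == "working":
        return f"anomaly on working path {value}"
    return "reserved value"


def describe_path(value: int) -> str:
    kind = classify_index(value)
    if kind == "zero":
        return "protection path carries no user data traffic"
    if kind == "working":
        return f"protection path carries traffic of working path {value}"
    return "reserved value"


print(describe_fpath(1))
print(describe_path(0))
```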
4.3. Changes to PSC Operation
In all of the following subsections, assume a protection domain
between LER-A and LER-Z, using working paths 1-N and the protection
path as shown in Figure 1.
A basic premise of this protection architecture is that both
endpoints of the protection domain MUST be configured to associate
the indices of the working paths with the proper LSP identifiers. If
this condition is not met then the protection scheme will cause
inconsistencies in traffic transmission.
4.3.1. Basic operation
Protection of the N working paths is based on the operational
principles outlined in [LinProt] and will employ the same basic
Protection State Coordination Protocol (PSC) outlined in that
document. However, as can be expected, due to certain basic
differences in the architecture of the protection domain, a small set
of differences in operation is necessary. The following
subsections will highlight these differences and explain their effects
on the PSC state machine.
4.3.2. Two-phased operation
PSC, as presented in [LinProt], is a single-phased protocol. This
means that when an endpoint receives a trigger to perform a
protection switch, the LER switches traffic and then notifies the far
end of the switch, without waiting for acknowledgement. When
addressing the situation in a 1:n protection domain, the endpoint
that receives the trigger must first verify that the protection path
is available to transmit the protected traffic. This may involve
interrupting the traffic that is currently being transmitted on the
protection path by both endpoints.
In general, after the LER has detected a trigger for protection
switching, e.g. an FS operator command or an SF indication for one of
the working paths, the LER SHALL transmit the appropriate PSC message
as described in [LinProt] with the following changes:
o If the protection domain is currently in either Protecting
administrative or Protecting failure state, then the endpoint
SHALL verify that the new trigger has a higher priority than the
currently protected traffic. If the new trigger has a lower
priority then it MUST be ignored.
o The PSC message SHALL set the FPath value to the index of the
working path that generated the trigger. The Path value SHOULD be
set to 0, unless the protection path was previously transporting
traffic from another working path (as indicated by the value of
the Path field.)
o If the protection path is currently transporting protected traffic
and the protection domain is operating in locking mode, then the
endpoint SHALL block all traffic of the protected working path.
o The endpoint SHALL transit to WFA state (see below).
o Upon reception of the switching PSC message, the far end LER SHALL
verify that the received request is of higher priority than the
known current traffic on the protection path, and if so SHALL
interrupt the current traffic on the protection path, perform the
switch to the requested protected traffic, and send a PSC message
with the Path field set to the index of the current protected
working path.
o Upon reception of the PSC message, the initiating LER SHALL verify
that the Path field is set to the index of the working path of the
highest priority. If the Path field matches the highest priority
path the LER SHALL perform the protection switch and transmit the
appropriate PSC message, with the FPath field indicating the index
of the working path that triggered the protection switch and the
Path field set to the index of the working path whose traffic is
being transported on the protection path.
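The last two steps of the list above, i.e. the far-end check and the
verification by the initiating LER, can be sketched as follows. This
is a simplified illustration, not a normative procedure. The function
names and the priority map (a smaller number meaning a higher
priority) are assumptions, as is the choice to keep advertising the
current path when the request does not have a higher priority.

```python
from typing import Mapping


def far_end_reply_path(requested: int, current: int,
                       priority: Mapping[int, int]) -> int:
    """Path value the far-end LER advertises after a switching request.

    requested: FPath of the received request (index of a working path)
    current  : working path now carried on the protection path, 0 if none
    priority : working path index -> priority (assumption: a smaller
               number means a higher priority)
    """
    if current == 0 or priority[requested] < priority[current]:
        # Interrupt the current traffic and switch to the requested path.
        return requested
    # Assumption: otherwise the current traffic stays in place.
    return current


def initiator_may_switch(reply_path: int, highest_priority_path: int) -> bool:
    """The initiator switches only if the reply names the expected path."""
    return reply_path == highest_priority_path


priority = {1: 1, 2: 2}
reply = far_end_reply_path(requested=1, current=2, priority=priority)
print(reply, initiator_may_switch(reply, highest_priority_path=1))
```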
4.3.3. Acknowledge message
As stated above, the endpoint that detected a switching trigger MUST
wait for an Acknowledge message before performing the protection
switch. There are two types of message that will be considered as an
Acknowledge message:
1. A reply message with the Request field reflecting the state of
the far end, and the Path field set to the index of the working
path that triggered the switching condition. For example, if
there is a Forced Switch command detected by LER-Z on working
path W4, then LER-Z will have sent an FS(4,0) message to LER-A.
Then when LER-Z receives a message such as NR(0,4)Ack this should
be considered acknowledgement of the switching and that the
protection path is available to switch the traffic from working
path W4.
2. A remote message with the same Request field and FPath field as
that transmitted by the LER in the WFA state. For example, if
there is a bidirectional Signal fault detected by LER-A on
working path W4, then LER-A will enter WFA state and transmit a
SF(4,0) message. When it receives the SF(4,0) message from
LER-Z, that has also detected the SF condition, it should be
considered an acknowledgement of the switching and that the
protection path is available to switch the traffic from working
path W4.
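The two kinds of Acknowledge message can be expressed as a predicate
over the message sent by the LER in WFA state and the message it
receives. The sketch below uses hypothetical names and represents a
PSC message as a (Request, FPath, Path) triple. It assumes that WFA
state is only entered for a trigger on a working path, so that the
FPath sent is never 0.

```python
from typing import NamedTuple


class PscMsg(NamedTuple):
    request: str  # e.g. "SF", "FS", "NR"
    fpath: int
    path: int


def is_acknowledge(sent: PscMsg, received: PscMsg) -> bool:
    """True if `received` acknowledges the request `sent` from WFA state."""
    if sent.fpath == 0:
        return False  # assumption: WFA is not used for the protection path
    # Type 1: the reply carries the triggering working path in its Path field.
    if received.path == sent.fpath:
        return True
    # Type 2: the far end reports the same Request and the same FPath.
    return received.request == sent.request and received.fpath == sent.fpath


print(is_acknowledge(PscMsg("FS", 4, 0), PscMsg("NR", 0, 4)))
print(is_acknowledge(PscMsg("SF", 4, 0), PscMsg("SF", 4, 0)))
```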
4.3.4. Wait for Acknowledge (WFA) timer
The protection system MUST include a timer called the Wait for
Acknowledge (WFA) timer that SHALL be started when the LER enters WFA
state and reset when the Acknowledge message is received. The length
of the WFA timer SHOULD be configured to allow protection switching
within the normal time constraints. The WFA timer will expire only
if no Acknowledge message was received by the LER in WFA state. The
WFA Expires local input should have a priority just below that of the
WTRExpires signal.
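One way to picture the WFA timer is as a deadline that is armed on
entry to WFA state and cleared when the Acknowledge message arrives.
The class below is only a sketch under that reading: the class name,
the use of seconds, and the interpretation of "reset" as stopping the
timer are assumptions. The current time is passed in by the caller so
that the behavior is easy to follow.

```python
from typing import Optional


class WfaTimer:
    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self._deadline: Optional[float] = None

    def start(self, now: float) -> None:
        """Arm the timer when the LER enters WFA state."""
        self._deadline = now + self.timeout_s

    def reset(self) -> None:
        """Clear the timer when the Acknowledge message is received."""
        self._deadline = None

    def expired(self, now: float) -> bool:
        """True only if the timer is armed and its deadline has passed."""
        return self._deadline is not None and now >= self._deadline


timer = WfaTimer(timeout_s=1.0)
timer.start(now=10.0)
print(timer.expired(now=10.5), timer.expired(now=11.0))
```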
4.3.5. Additional PSC State
As described above and demonstrated in the scenarios in Section 3.3,
there is a need, in some scenarios, for the endpoint that is
reporting a trigger for protection-switching to delay the actual
switch-over until an acknowledgement is received from the far end LER.
In order to facilitate this wait period it is necessary to define a
new PSC State - Wait for Acknowledge (WFA) state. WFA is used in
both the Locking and Non-Locking cases. It is more essential to the
Locking mode of operation, as agreement is the mechanism to establish
and release the lock on the protection LSP. However, it is necessary
for the Non-Locking mode as a persistent disagreement on the contents
of the protection LSP indicates an error in the network devices and
WFA is the method used to detect this error.
In the locking mode, WFA comes into play when a failed LSP preempts
another LSP. This is highlighted in the scenarios presented in
Figure 7 and Figure 9.
When a working path is preempted, the protection domain must
transition the contents of the protecting path from the preempted
working path to the preempting working path. In the locking case,
the protecting path must temporarily be blocked (that is, nothing is
being protected) in order to ensure that there is no misconnectivity.
In the case where W1 preempts W2, the protection path
transitions from transporting W2 traffic to not carrying any traffic
before beginning to transport W1 traffic.
The following sub-section will describe the actions to be taken when
an LER is in the WFA state.
4.3.5.1. Wait for Acknowledge (WFA) State
An LER will enter the Wait for Acknowledge state before transitioning
into a protection state, i.e. either Protecting administrative or
Protecting failure state. The LER SHALL remain in this state until
either receiving an Acknowledge message, or until a WFA timer
expires. Normally, the Acknowledge message will be a remote PSC
input. The following describes how the LER, in WFA state, should
react to a new local input:
o A local Clear SHALL cause the LER to go into Normal state if the
LER is in WFA state due to either a FS or MS trigger and transmit
an NR(0,0) PSC message. If the LER is in WFA state due to a SF
trigger then the local Clear SHALL be ignored.
o A local LO SHALL cause the LER to go into Unavailable state and
begin to transmit LO(x, 0) [where x indicates the index of the
working path that triggered the WFA state].
o A local FS SHALL cause the LER to remain in WFA state and transmit
the FS(x, 0) message [where x indicates the index of the protected
working path]. If the LER is in WFA state due to a FS from a
different working path, then the working path with the higher
priority SHALL be the protected working path. If the LER is in
WFA state due to any other switching trigger, then the working
path that is identified in this FS will be the protected working
path.
o A local SF SHALL cause the LER to remain in WFA state. If the LER
is in WFA state due to an existing FS trigger, then ignore the
local SF and continue to transmit the FS(x, 0) PSC message. If
the LER is in WFA state due to an existing SF trigger then
transmit the SF(x, 0) PSC message [where x indicates the index of
protected working path, i.e. the highest priority working path
indicating an SF condition]. If the LER is in WFA state due to
any other trigger, then begin transmitting a SF(x, 0) PSC message
[where x indicates the index of the working path that is
generating the SF condition].
o A local ClearSF indication where the working path is the same as
the path that triggered the LER into WFA state SHALL cause the LER
to go into WTR state (note: 1:N protection is always revertive)
and to transmit the WTR(0, 0) message. If the ClearSF indicates a
different index from the protected working path or indicates the
protection path then the indication SHALL be ignored.
o A local MS operator command SHALL cause the LER to remain in WFA
state. If the LER is in WFA state due to an existing MS trigger,
then the node continues to transmit MS(x, 0) messages [where x
indicates the index of the protected working path, i.e. the
highest priority working path indicating the MS condition]. If
the LER is in WFA state due to any other trigger, ignore the MS
command and continue transmitting the current message.
o If the WFA timer expires, i.e. the LER did not receive the
Acknowledge message from the far end in a timely manner, then the
LER SHALL go to Unavailable state, i.e. it assumes that there is a
problem on the protection path (where all PSC traffic is
transmitted) and send an error notification to the management
system. The LER SHALL continue transmitting the current PSC
message with Path field set to 0.
o All other local indications SHALL be ignored.
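Part of the list above can be written as a transition function. The
sketch below covers only the local Clear, LO, ClearSF and WFA timer
expiry inputs; the FS, SF and MS inputs depend on path priorities and
are deliberately left out. The function name, the event names and the
(Request, FPath, Path) message triple are assumptions made for
illustration.

```python
from typing import Optional, Tuple

Message = Tuple[str, int, int]  # (Request, FPath, Path)


def wfa_local_input(trigger: str, x: int, event: str,
                    event_path: Optional[int] = None
                    ) -> Tuple[str, Optional[Message]]:
    """React to a local input while in WFA state.

    trigger   : "FS", "MS" or "SF", the trigger that caused WFA state
    x         : index of the working path that triggered WFA state
    event     : name of the new local input
    event_path: path index carried by a ClearSF indication
    Returns (next_state, message), with message None if unchanged.
    """
    if event == "Clear":
        if trigger in ("FS", "MS"):
            return "Normal", ("NR", 0, 0)
        return "WFA", None  # Clear is ignored for an SF trigger
    if event == "LO":
        return "Unavailable", ("LO", x, 0)
    if event == "ClearSF":
        if event_path == x:
            return "WTR", ("WTR", 0, 0)
        return "WFA", None  # other path or protection path: ignored
    if event == "WFAExpires":
        # Keep sending the current request, with Path set to 0.
        return "Unavailable", (trigger, x, 0)
    if event in ("FS", "SF", "MS"):
        raise NotImplementedError("priority handling is omitted here")
    return "WFA", None  # all other local indications are ignored


print(wfa_local_input("FS", 4, "Clear"))
print(wfa_local_input("SF", 1, "WFAExpires"))
```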
The following details the reactions of the LER in WFA state to remote
messages:
o Any remote message with the Acknowledge flag set to 1 and the Path
field set to the index of the protected working path SHALL cause
the LER to change state. If the trigger was either FS or MS
command, the LER enters Protecting administrative state. The LER
transmits the appropriate message according to the trigger (i.e.
FS(x,x) for FS command and MS(x,x) for the MS command). If the
trigger was a SF condition, then the LER enters the Protecting
failure state and begins to transmit the appropriate SF(x, x)
message. A remote message with the Acknowledge flag set to 1 but
where the Path field does not match, according to the description
above, SHALL be ignored.
o A remote LO message SHALL cause the LER to go into Unavailable
state and transmit the appropriate message for the trigger that
caused the WFA state.
o A remote FS message indicating the same working path as the local
FS command that triggered the WFA state SHALL be considered an
Acknowledge message, even if the Acknowledge flag is not set. The
LER SHALL perform the protection switch, and begin transmitting
the FS(x, x) message [where x indicates the index of the protected
working path]. If the remote FS message indicates a different
index than the one indicated in the local FS and if the remote FS
message indicates a lower priority working path than the working
path in the local FS trigger then the LER SHALL ignore the remote
FS message and remain in WFA state. If the remote FS message
indicates an index of higher priority or the LER is in WFA state
as a result of a SF or MS trigger, then the LER SHALL perform the
protection switch for the protected working path indicated by the
remote FS message, and SHALL go to Protecting administrative state
and transmit the appropriate message for the local trigger with
the Path field set to the index of the remote message and the
Acknowledge flag set to 1.
o A remote SF message indicating an error on the protection path
SHALL cause the LER to go into Unavailable state and transmit the
appropriate message for the trigger that caused the WFA state.
o A remote SF message indicating an error on the same working path
as the local SF condition that triggered the WFA state SHALL be
considered an Acknowledge message (even if the Acknowledge flag is
not set). The LER SHALL perform the protection switch, go to
Protecting failure state and transmit the SF(x, x) message [where
x is the index of the protected working path]. If the remote SF
message indicates a different index than the one indicated in the
local SF, then if the local SF indicates a higher priority
working path the LER SHALL ignore the remote SF message and remain
in WFA state. If the remote SF message indicates an index of
higher priority or the LER is in WFA state as a result of a MS
trigger, then the LER SHALL perform the protection switch for the
protected working path indicated by the remote SF message, and
SHALL go to Protecting failure state and transmit the appropriate
message for the local trigger with the Path field set to the index
of the remote message and the Acknowledge flag set to 1. If the
LER is in WFA state due to a local FS command, then it SHALL
ignore the remote message and remain in WFA state.
o A remote MS message indicating the same working path
as the local MS that triggered the WFA state SHALL be considered
an Acknowledge message (even if the Acknowledge flag is not set).
The LER SHALL perform the protection switch, go to Protecting
administrative state and transmit the MS(x, x) message [where x is
the index of the protected working path]. If the remote MS
message indicates a different index than the one indicated in the
local MS, then if the local command indicates a higher priority
working path or the LER is in WFA due to either a FS or SF
trigger, the LER SHALL ignore the remote MS message and remain in
WFA state. If the remote MS message indicates an index of higher
priority, then the LER SHALL perform the protection switch for the
protected working path indicated by the remote MS message, and
SHALL go to Protecting administrative state and transmit an NR(0,
y) with the Path field set to the index of the remote message and
the Acknowledge flag set to 1.
o All other remote messages SHOULD be ignored.
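The acknowledgement rules in the list above can be sketched as
follows. This is a non-normative illustration in Python that covers
only two cases: the explicit Acknowledge (flag set and Path field
matching) and the implicit Acknowledge (a remote FS, SF or MS message
of the same type and for the same working path as the local trigger).
The priority comparisons and the LO and SF-on-protection rules are
deliberately left out. The function name, the parameter names, and the
use of the FPath field to identify the working path indicated by the
remote message are assumptions made for this example.

```python
def on_remote_in_wfa(trigger, x, request, fpath, path, ack):
    """Sketch of Acknowledge handling in WFA state (illustrative only).

    trigger: local trigger that caused WFA ("FS", "SF" or "MS").
    x: index of the protected working path.
    request, fpath, path, ack: fields of the received remote message.
    Returns (next_state, message), or None when the case is not covered
    by this sketch (e.g. priority comparison between different indices).
    """
    if ack:
        if path != x:
            # Acknowledge flag set but Path field does not match: ignored.
            return "WFA", None
        switched = True
    else:
        # Assumption: a remote message of the same type whose FPath equals
        # x counts as an Acknowledge even without the flag.
        switched = request == trigger and fpath == x
    if not switched:
        return None
    if trigger in ("FS", "MS"):
        state = "Protecting administrative"
    else:
        state = "Protecting failure"
    return state, f"{trigger}({x},{x})"
```

For example, an LER in WFA state because of a local SF on working
path 3 that receives a message with the Acknowledge flag set and
Path = 3 moves to Protecting failure state and transmits SF(3,3).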
5. IANA Considerations
This document does not include any required IANA considerations.
6. Security Considerations
The generic security considerations for the data-plane of MPLS-TP are
described in the security framework document [SecureFwk] together
with the required mechanisms needed to address them. The security
considerations for the generic associated control channel are
described in [RFC5586]. The security considerations for protection
and recovery aspects of MPLS-TP are addressed in [SurvivFwk].
The extensions to the protocol described in this document are
extensions to the protocol defined in [LinProt] and do not
introduce any new security risks.
7. Acknowledgements
The authors would like to thank everyone involved in the definition
and specification of protection mechanisms for MPLS Transport Profile
(MPLS-TP).
8. References
8.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[TPReq] Niven-Jenkins, B., Brungard, D., Betts, M., Sprecher, N.,
and S. Ueno, "Requirements of an MPLS Transport Profile",
RFC 5654, September 2009.
[LinProt] Bryant, S., Sprecher, N., Osborne, E., Fulignoli, A., and
Y. Weingarten, "Multi-protocol Label Switching Transport
Profile Linear Protection", RFC 6378, October 2011.
8.2. Informative References
[RFC5586] Vigoureux, M., Bocci, M., Swallow, G., Aggarwal, R., and
D. Ward, "MPLS Generic Associated Channel", RFC 5586,
June 2009.
[RFC4427] Mannie, E. and D. Papadimitriou, "Recovery Terminology for
Generalized Multi-Protocol Label Switching", RFC 4427,
March 2006.
[RFC3031] Rosen, E., Viswanathan, A., and R. Callon,
"Multiprotocol Label Switching Architecture", RFC 3031,
January 2001.
[SurvivFwk]
Sprecher, N., Farrel, A., and H. Shah, "Multi-protocol
Label Switching Transport Profile Survivability
Framework", RFC 6372, September 2011.
[SecureFwk]
Fang, L., Niven-Jenkins, B., Mansfield, S., Zhang, R.,
Bitar, N., Daikoku, M., and L. Wang, "MPLS-TP Security
Framework",
ID draft-ietf-mpls-tp-security-framework-07.txt, January 2013.
Appendix A. PSC state machine tables
Note/Disclaimer: This state machine is not currently in sync with the
text of the document and will be updated in a future revision.
The full PSC state machine is described in [LinProt], both in textual
and tabular form. This appendix highlights the changes to the basic
PSC state machine. In the event of a mismatch between these tables
and the text either in [LinProt] or in this document, the text is
authoritative. Note that this appendix is intended to be a
functional description, not an implementation specification.
The tables here use the same format and state descriptions used in
the Linear Protection document, with the addition of the WFA state
and the WFA Expires input, and with the changes in behavior noted.
Each state corresponds to the transmission of a particular set of
Request, FPath and Path bits. The table below lists the message that
is generally sent in each particular state. If the message to be
sent in a particular state deviates from the table below, it is noted
in the footnotes to the state-machine table.
State     REQ(FP,P)
-------   ---------
N         NR(0,0)
UA:LO:L   LO(0,0)
UA:P:L    SF(0,0)
UA:LO:R   NR(0,0)
UA:P:R    NR(0,0)
PF:W:L    SF(1,1)
PF:W:R    NR(0,1)
PA:F:L    FS(1,1)
PA:M:L    MS(1,1)
PA:F:R    NR(0,1)
PA:M:R    NR(0,1)
WTR       WTR(0,1)
DNR       DNR(0,1)
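As a non-normative illustration, the table above can be held as a
simple lookup from state to the message that is generally sent. The
Python names DEFAULT_MESSAGE and default_message are assumptions made
for this example; the values are copied from the table.

```python
# Illustrative only: default PSC message per state, copied from the table.
# Each value is (Request, FPath, Path).
DEFAULT_MESSAGE = {
    "N":       ("NR",  0, 0),
    "UA:LO:L": ("LO",  0, 0),
    "UA:P:L":  ("SF",  0, 0),
    "UA:LO:R": ("NR",  0, 0),
    "UA:P:R":  ("NR",  0, 0),
    "PF:W:L":  ("SF",  1, 1),
    "PF:W:R":  ("NR",  0, 1),
    "PA:F:L":  ("FS",  1, 1),
    "PA:M:L":  ("MS",  1, 1),
    "PA:F:R":  ("NR",  0, 1),
    "PA:M:R":  ("NR",  0, 1),
    "WTR":     ("WTR", 0, 1),
    "DNR":     ("DNR", 0, 1),
}


def default_message(state):
    """Return the message generally sent in `state`, e.g. 'SF(1,1)'."""
    request, fpath, path = DEFAULT_MESSAGE[state]
    return f"{request}({fpath},{path})"
```

Deviations from these defaults are described in the footnotes to the
state-machine table and would have to be handled separately.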
The top row in each table is the list of possible inputs. The local
inputs are:
NR No Request
OC Operator Clear
LO Lockout of protection
SF-P Signal Fail on protection path
SF-W Signal Fail on working path
FS Forced Switch
SFc Clear Signal Fail
MS Manual Switch
WTRExp WTR Expired
and the remote inputs are:
LO remote LO message
SF-P remote SF message indicating protection path
SF-W remote SF message indicating working path
FS remote FS message
MS remote MS message
WTR remote WTR message
DNR remote DNR message
NR remote NR message
Section 4.3.3 refers to some states as 'remote' and some as 'local'.
By definition, all states listed in the table of local sources are
local states, and all states listed in the table of remote sources
are remote states. For example, section 4.3.3.1 says "A local
Lockout of protection input SHALL cause the LER to go into local
Unavailable State". As the trigger for this state change is a local
one, 'local Unavailable State' is by definition displayed in the
table of local sources. Similarly, "A remote Lockout of protection
message SHALL cause the LER to go into remote Unavailable state"
means that the state represented in the Unavailable rows in the table
of remote sources is by definition a remote Unavailable state.
Each cell in the table below contains either a state, a footnote, or
the letter 'i'. 'i' stands for Ignore, and is an indication to
continue with the current behavior. See section 4.3.3. The
footnotes are listed below the table.
Part 1: Local input state machine
| OC | LO | SF-P | FS | SF-W | SFc | MS | WTRExp
--------+-----+-------+------+------+------+------+------+-------
N | i |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i |PA:M:L| i
UA:LO:L | N | i | i | i | i | i | i | i
UA:P:L | i |UA:LO:L| i | i | i | [5] | i | i
UA:LO:R | i |UA:LO:L| [1] | i | [2] | [6] | i | i
UA:P:R | i |UA:LO:L|UA:P:L| i | [3] | [6] | i | i
PF:W:L | i |UA:LO:L|UA:P:L|PA:F:L| i | [7] | i | i
PF:W:R | i |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i | i | i
PA:F:L | N |UA:LO:L|UA:P:L| i | i | i | i | i
PA:M:L | N |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i | i | i
PA:F:R | i |UA:LO:L|UA:P:L|PA:F:L| [4] | [8] | i | i
PA:M:R | i |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i |PA:M:L| i
WTR | i |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i |PA:M:L| [9]
DNR | i |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i |PA:M:L| i
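As a non-normative illustration, the Part 1 table can be transcribed
into a lookup structure and each cell classified as a state, a
footnote, or Ignore. The Python names LOCAL_INPUTS, LOCAL_TABLE and
lookup_local are assumptions made for this example; the cell values
are copied from the table above.

```python
# Illustrative only: Part 1 (local inputs) transcribed row by row.
# 'i' means Ignore; '[n]' refers to footnote n of the tables.
LOCAL_INPUTS = ["OC", "LO", "SF-P", "FS", "SF-W", "SFc", "MS", "WTRExp"]

LOCAL_TABLE = {
    "N":       ["i", "UA:LO:L", "UA:P:L", "PA:F:L", "PF:W:L", "i",   "PA:M:L", "i"],
    "UA:LO:L": ["N", "i",       "i",      "i",      "i",      "i",   "i",      "i"],
    "UA:P:L":  ["i", "UA:LO:L", "i",      "i",      "i",      "[5]", "i",      "i"],
    "UA:LO:R": ["i", "UA:LO:L", "[1]",    "i",      "[2]",    "[6]", "i",      "i"],
    "UA:P:R":  ["i", "UA:LO:L", "UA:P:L", "i",      "[3]",    "[6]", "i",      "i"],
    "PF:W:L":  ["i", "UA:LO:L", "UA:P:L", "PA:F:L", "i",      "[7]", "i",      "i"],
    "PF:W:R":  ["i", "UA:LO:L", "UA:P:L", "PA:F:L", "PF:W:L", "i",   "i",      "i"],
    "PA:F:L":  ["N", "UA:LO:L", "UA:P:L", "i",      "i",      "i",   "i",      "i"],
    "PA:M:L":  ["N", "UA:LO:L", "UA:P:L", "PA:F:L", "PF:W:L", "i",   "i",      "i"],
    "PA:F:R":  ["i", "UA:LO:L", "UA:P:L", "PA:F:L", "[4]",    "[8]", "i",      "i"],
    "PA:M:R":  ["i", "UA:LO:L", "UA:P:L", "PA:F:L", "PF:W:L", "i",   "PA:M:L", "i"],
    "WTR":     ["i", "UA:LO:L", "UA:P:L", "PA:F:L", "PF:W:L", "i",   "PA:M:L", "[9]"],
    "DNR":     ["i", "UA:LO:L", "UA:P:L", "PA:F:L", "PF:W:L", "i",   "PA:M:L", "i"],
}


def lookup_local(state, local_input):
    """Classify a cell as ('ignore', None), ('footnote', n) or ('state', name)."""
    cell = LOCAL_TABLE[state][LOCAL_INPUTS.index(local_input)]
    if cell == "i":
        return ("ignore", None)
    if cell.startswith("["):
        return ("footnote", int(cell.strip("[]")))
    return ("state", cell)
```

For example, lookup_local("N", "FS") yields the state PA:F:L, while
lookup_local("WTR", "WTRExp") points to footnote 9.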
Part 2: Remote messages state machine
| LO | SF-P | FS | SF-W | MS | WTR | DNR | NR
--------+-------+------+------+------+------+------+------+------
N |UA:LO:R|UA:P:R|PA:F:R|PF:W:R|PA:M:R| i | i | i
UA:LO:L | i | i | i | i | i | i | i | i
UA:P:L | [10] | i | i | i | i | i | i | i
UA:LO:R | i | i | i | i | i | i | i | [16]
UA:P:R |UA:LO:R| i | i | i | i | i | i | [16]
PF:W:L | [11] | [12] |PA:F:R| i | i | i | i | i
PF:W:R |UA:LO:R|UA:P:R|PA:F:R| i | i | [14] | [15] | N
PA:F:L |UA:LO:R|UA:P:R| i | i | i | i | i | i
PA:M:L |UA:LO:R|UA:P:R|PA:F:R| [13] | i | i | i | i
PA:F:R |UA:LO:R|UA:P:R| i | i | i | i | i | [17]
PA:M:R |UA:LO:R|UA:P:R|PA:F:R| [13] | i | i | i | N
WTR |UA:LO:R|UA:P:R|PA:F:R|PF:W:R|PA:M:R| i | i | [18]
DNR |UA:LO:R|UA:P:R|PA:F:R|PF:W:R|PA:M:R| i | i | i
The following are the footnotes for the table:
[1] Remain in the current state (UA:LO:R) and transmit SF(0,0)
[2] Remain in the current state (UA:LO:R) and transmit SF(1,0)
[3] Remain in the current state (UA:P:R) and transmit SF(1,0)
[4] Remain in the current state (PA:F:R) and transmit SF(1,1)
[5] If the SF being cleared is SF-P, transition to N. If it is SF-W,
ignore the clear.
[6] Remain in the current state (UA:x:R); if the SFc corresponds to a
previous SF, then begin transmitting NR(0,0).
[7] If the domain is configured for revertive behavior, transition to
WTR; otherwise, transition to DNR.
[8] Remain in PA:F:R and transmit NR(0,1)
[9] Remain in WTR, send NR(0,1)
[10] Transition to UA:LO:R and continue sending SF(0,0)
[11] Transition to UA:LO:R and send SF(1,0)
[12] Transition to UA and send SF(1,0)
[13] Transition to PF:W:R and send NR(0,1)
[14] Transition to WTR state and continue to send the current
message.
[15] Transition to DNR state and continue to send the current
message.
[16] If the local input is SF-P then transition to UA:P:L. If the
local input is SF-W then transition to PF:W:L. Otherwise, transition
to N state and continue to send the current message.
[17] If the local input is SF-W then transition to PF:W:L. Otherwise,
transition to N state and continue to send the current message.
[18] If the receiving LER's WTR timer is running, maintain current
state and message. If the WTR timer is stopped, transition to N.
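Footnotes [16], [17] and [18], which all apply to a remote NR message,
can be sketched as a small decision function. This is a non-normative
illustration in Python; the function name resolve_nr_footnote and its
parameters are assumptions made for this example. Only the next state
is returned; the message to transmit is not modelled.

```python
def resolve_nr_footnote(footnote, current_state, local_input=None,
                        wtr_timer_running=False):
    """Return the next state for footnotes [16], [17] and [18].

    local_input: the currently active local input, e.g. "SF-P" or "SF-W",
    or None when there is none (assumed representation).
    """
    if footnote == 16:
        if local_input == "SF-P":
            return "UA:P:L"
        if local_input == "SF-W":
            return "PF:W:L"
        return "N"
    if footnote == 17:
        return "PF:W:L" if local_input == "SF-W" else "N"
    if footnote == 18:
        # While the local WTR timer runs, the remote NR changes nothing.
        return current_state if wtr_timer_running else "N"
    raise ValueError(f"footnote [{footnote}] is not handled by this sketch")
```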
Authors' Addresses
Eric Osborne
Cisco
United States
Email: eosborne@cisco.com
Fei Zhang
ZTE
China
Email: zhang.fei3@zte.com.cn
Yaacov Weingarten
34 Hagefen St
Karnei Shomron, 4485500
Israel
Email: wyaacov@gmail.com