Internet Engineering Task Force                         M. Binderberger
Internet-Draft                                                 N. Akiya
Intended status: Standards Track                          Cisco Systems
Expires: November 08, 2013                                 May 07, 2013
Redundant BFD sessions
draft-mbind-bfd-redundancy-01
This document defines a second, "shadow" BFD session that runs alongside an existing "primary" BFD session, providing resiliency against BFD session failures that do not reflect a real failure of the monitored path.
It also discusses scenarios in which a shadow BFD session is beneficial for high availability.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 08, 2013.
Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Bidirectional Forwarding Detection [RFC5880] is used to detect network failures. Link failures and peer system outages are examples of failures that BFD can detect. Although undesirable, BFD may falsely declare a failure in some scenarios: a BFD process crash, an FPGA reset on hardware-based BFD, or the failure or accidental removal of a card running the BFD functionality. In all these cases, the forwarding path being monitored by BFD may remain functional. Unnecessary rerouting of traffic, while not a problem per se, can become a problem when false BFD triggers occur at large scale, e.g. across tens of thousands of traffic paths. A more serious outcome may be seen if a network outage occurs in a time window in which BFD is not detecting failures. For example, during software updates an extended timer value may be used, leaving the system and its peer "blind" to any real liveness problem until the BFD functionality is restored.
This draft proposes to run a second, "shadow" BFD session in parallel with the existing "primary" BFD session. This additional session will have its own unique discriminator value(s). The method used to differentiate primary and shadow sessions for packets carrying a discriminator value of zero is discussed in the following sections.
BFD requires continuous transmission of control packets in both directions. The rate at which both systems are required to transmit these packets varies with operational requirements and configuration, such as BFD mode and interval. If the BFD module on one system is unable to transmit BFD control packets for an amount of time greater than the negotiated failure detection time, then the BFD module on the other system will declare a session failure. Sometimes the cause of such a session failure is not related to the functionality of the path being monitored by BFD.
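As a rough illustration of the timing involved, the sketch below computes the asynchronous-mode detection time as described in section 6.8.4 of [RFC5880]: the Detect Mult received from the remote system multiplied by the greater of the local Required Min RX Interval and the remote Desired Min TX Interval. The function and parameter names are illustrative assumptions and are not defined by this document.

```python
def detection_time_us(remote_detect_mult: int,
                      local_required_min_rx_us: int,
                      remote_desired_min_tx_us: int) -> int:
    """Asynchronous-mode detection time in microseconds (RFC 5880, 6.8.4).

    All names here are illustrative; intervals are in microseconds, as in
    the BFD control packet.
    """
    # The remote system transmits no faster than the slower of the two
    # advertised rates, so the agreed interval is the larger value.
    agreed_tx_interval_us = max(local_required_min_rx_us,
                                remote_desired_min_tx_us)
    return remote_detect_mult * agreed_tx_interval_us


# Example: 50 ms intervals on both sides and a multiplier of 3.
example_us = detection_time_us(3, 50_000, 50_000)
```

A peer that stays silent for longer than this time is declared down, whether or not the monitored path itself has failed.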
Some failure scenarios which can exhibit such behaviors are described in this section.
Failure scenarios are not limited to the ones described above. In all cases, the reliability of BFD sessions increases significantly if a second, fully active BFD instance exists. It is possible to address some, or potentially all, failure scenarios locally. However, multiple proprietary solutions would likely be required to cover the wide range of problem areas. The result may not be desirable from an operator's perspective, as the expected behavior will vary from failure to failure and from device to device. Therefore, this specification defines a simple and consistent redundancy mechanism that can be used with a wide range of local failure scenarios.
For a single target monitored by BFD, a system needs to run two BFD sessions: a primary session and a shadow session. This requires every BFD control packet to indicate whether it belongs to the primary or the shadow session.
When looking at the BFD version 1 packet in [RFC5880], there are no unused bits left to store a shadow flag to distinguish the primary from the shadow session. One could take away a bit from e.g. the Diag, the Multiplier or the Length field, even claiming the least significant bit from one of the interval fields. But none of these proposals would be safe against interoperability problems with BFD speakers not supporting this draft.
That leaves three possible options.
Option b redefines the BFD packet contents. Although it is a clean solution, this approach can have a significant impact on existing BFD implementations. Introducing BFD redundancy at such a cost is considered undesirable, so this option is not recommended. However, if a new version of the BFD packet contents is ever defined, adding a redundancy capability to it would be recommended. Option c creates dependencies on current and future BFD RFCs, since each would need to define how a shadow session is specified. Therefore, this option is also not recommended. That leaves option a as the recommended choice.
BFD version 2 packets follow exactly the definition given in [RFC5880] and other BFD-related RFCs, with the one difference that the version field contains the value "2". The packet format is the same as described in section 4.1 of [RFC5880]. Implementations following this draft MUST be able to receive BFD packets with the version field values "1" and "2" and MUST drop BFD packets with any other version value.
BFD packets with a version value of "1" are named "primary" packets while BFD packets with a version value of "2" are named "shadow" packets within this document. The primary session MUST only transmit and receive primary packets. The shadow session MUST only transmit and receive shadow packets.
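The receive-side rule above can be sketched as follows. The sketch relies on the control packet layout in section 4.1 of [RFC5880], where the version occupies the three most significant bits of the first octet. The names Role and classify_packet are hypothetical, and the full set of reception checks required by [RFC5880] is deliberately left out.

```python
from enum import Enum
from typing import Optional


class Role(Enum):
    PRIMARY = 1  # packets with version field value 1
    SHADOW = 2   # packets with version field value 2


def classify_packet(packet: bytes) -> Optional[Role]:
    """Return the session role a received packet belongs to.

    Returns None when the packet must be dropped (empty packet or a
    version other than 1 or 2). Illustrative only: the remaining
    RFC 5880 reception checks are not performed here.
    """
    if not packet:
        return None
    # Vers is the top 3 bits of the first octet; Diag is the low 5 bits.
    version = packet[0] >> 5
    if version == 1:
        return Role.PRIMARY
    if version == 2:
        return Role.SHADOW
    return None
```

A demultiplexer built this way would hand the packet only to sessions of the returned role, so that a primary session never processes a shadow packet and vice versa.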
As primary and shadow sessions operate independently, they have different My Discriminator values. My Discriminator values assigned to BFD sessions are unique per system, across the combined set of primary and shadow sessions. In other words, a system has one discriminator pool that is used for both primary and shadow sessions, not one pool per session type.
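One possible way to realize a single shared pool is sketched below. The class name, the role strings and the choice of random allocation are assumptions made for illustration; the only properties taken from the text are that values are nonzero 32-bit numbers and unique across primary and shadow sessions together.

```python
import random


class DiscriminatorPool:
    """One system-wide pool shared by primary and shadow sessions.

    Hypothetical helper: this document does not prescribe an
    allocation strategy.
    """

    def __init__(self) -> None:
        self._in_use = {}  # discriminator -> role ("primary" or "shadow")

    def allocate(self, role: str) -> int:
        # Draw nonzero 32-bit values until an unused one is found.
        while True:
            candidate = random.randint(1, 0xFFFFFFFF)
            if candidate not in self._in_use:
                self._in_use[candidate] = role
                return candidate

    def release(self, discriminator: int) -> None:
        # Releasing an unknown value is silently ignored.
        self._in_use.pop(discriminator, None)


pool = DiscriminatorPool()
primary_disc = pool.allocate("primary")
shadow_disc = pool.allocate("shadow")
```

Because both roles draw from the same table, a primary and a shadow session can never end up with the same My Discriminator value.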
A shadow BFD session is associated with exactly one primary BFD session. The parameters used by a shadow session SHOULD be the same as those of the associated primary session. The purpose is to ensure that the two sessions operate using the same mode, interval and failure detection time. This allows the two sessions to behave as similarly as possible, reducing the chance that they reach different states in valid failure scenarios.
When the BFD shadow capability is enabled for a target, two session instances for that target are created: primary and shadow. Logic SHOULD be applied to decide where in the system to host the two sessions. The logic should maximize the validity of failure detection by minimizing the chance that both sessions are impacted by a single local failure. For example, if there are multiple CPU instances, it is more beneficial to run the two sessions on different CPU instances. The details of this logic, however, are outside the scope of this document.
Both the primary and the shadow session operate as specified in the other BFD RFCs. What differs is the relationship between state changes of the two sessions and the action taken when reachability of the BFD-enabled target changes. The function that handles this is referred to as the state consolidation module from here onward. Its purpose is to consolidate the states of the primary and the shadow session and to produce a final state on which the system acts. The logic of the state consolidation module is as follows:
Final state is UP when the state of the primary session is UP or the state of the shadow session is UP.
Final state is DOWN when both the state of the primary session is DOWN and the state of the shadow session is DOWN.
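The two rules can be written as a small function, shown here as a minimal sketch. The function name and the string state values are illustrative. Combinations that the rules do not cover (for example, one session in INIT and the other DOWN) are reported as None; that choice is an assumption of this sketch, not something this document specifies.

```python
from typing import Optional


def consolidate(primary_state: str, shadow_state: str) -> Optional[str]:
    """Combine primary and shadow session states into a final state.

    Returns "UP" if either session is UP, "DOWN" if both are DOWN, and
    None for any combination the two rules leave unspecified
    (assumption of this sketch).
    """
    if primary_state == "UP" or shadow_state == "UP":
        return "UP"
    if primary_state == "DOWN" and shadow_state == "DOWN":
        return "DOWN"
    return None
```

With this logic, a failure of only one session, such as a crash of the process hosting the primary session, leaves the final state UP and triggers no rerouting.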
This specification aims to introduce the BFD redundancy concept to the various flavors of BFD while minimizing disruption to existing implementations. There is, however, one additional change required in order to support the LSP ping bootstrapped BFD sessions described in [RFC5884].
This specification defines a new optional TLV to be carried in an LSP ping packet.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Discriminator                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
This TLV has a length of 4. The value contains the 4-byte local discriminator that the LSR, sending the LSP ping message, associates with the shadow BFD session. TBD: IANA to assign optional type.
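A minimal sketch of encoding and decoding this TLV follows, assuming the usual LSP ping TLV layout of a 2-octet Type, a 2-octet Length and then the Value, all in network byte order. The type code is not yet assigned, so SHADOW_DISCRIMINATOR_TLV_TYPE below is a hypothetical placeholder, and the function names are illustrative.

```python
import struct

# Hypothetical placeholder: the real type is to be assigned by IANA.
SHADOW_DISCRIMINATOR_TLV_TYPE = 0xFFFF
SHADOW_DISCRIMINATOR_TLV_LENGTH = 4


def encode_shadow_tlv(discriminator: int) -> bytes:
    """Build the TLV carrying the sender's shadow session discriminator."""
    if not 0 < discriminator <= 0xFFFFFFFF:
        raise ValueError("discriminator must be a nonzero 32-bit value")
    # Type (2 octets), Length (2 octets), Discriminator (4 octets).
    return struct.pack("!HHI", SHADOW_DISCRIMINATOR_TLV_TYPE,
                       SHADOW_DISCRIMINATOR_TLV_LENGTH, discriminator)


def decode_shadow_tlv(data: bytes) -> int:
    """Return the discriminator from a TLV, or raise ValueError."""
    if len(data) < 8:
        raise ValueError("TLV too short")
    tlv_type, length, discriminator = struct.unpack("!HHI", data[:8])
    if tlv_type != SHADOW_DISCRIMINATOR_TLV_TYPE:
        raise ValueError("not a shadow discriminator TLV")
    if length != SHADOW_DISCRIMINATOR_TLV_LENGTH:
        raise ValueError("unexpected TLV length")
    return discriminator
```

The receiver would use the decoded value as Your Discriminator for the shadow session it creates.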
Upon reception of this optional TLV, the egress LSR creates a shadow session for the specified FEC, if local constraints allow, with Your Discriminator set to the value specified in the TLV. This TLV MAY be included in the LSP ping that carries the BFD discriminator TLV of the corresponding primary session, or it MAY be carried in a separate LSP ping packet that does not carry the BFD discriminator TLV of the corresponding primary session. In both cases, the egress LSR MUST associate the primary and shadow sessions in the state consolidation module.
Enabling the shadow BFD capability makes the BFD module more resilient. However, the total number of BFD sessions hosted on the system increases by the number of shadow BFD sessions, so more system resources are used for the same number of BFD-monitored targets. Solving this scale issue is outside the scope of this document. However, some techniques that can be considered are listed below:
IANA to assign optional type for new LSP ping TLV.
This document does not introduce any additional security issues, and the security mechanisms defined in [RFC5880] apply to this document.
The authors would like to thank Aswatnarayan Raghuram from AT&T for providing requirements and helpful comments.
The authors would like to thank Gregory Mirsky and Alexander Vainshtein for providing insightful comments.
The authors would like to thank Srihari Raghavan and Mallik Mudigonda from Cisco Systems for providing valuable comments regarding LSP ping bootstrapped sessions.
[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection (BFD)", RFC 5880, June 2010.
[RFC5881]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881, June 2010.
[RFC5883]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection (BFD) for Multihop Paths", RFC 5883, June 2010.
[RFC5884]  Aggarwal, R., Kompella, K., Nadeau, T., and G. Swallow, "Bidirectional Forwarding Detection (BFD) for MPLS Label Switched Paths (LSPs)", RFC 5884, June 2010.
[RFC5885]  Nadeau, T. and C. Pignataro, "Bidirectional Forwarding Detection (BFD) for the Pseudowire Virtual Circuit Connectivity Verification (VCCV)", RFC 5885, June 2010.
[RFC6428]  Allan, D., Ed., Swallow, G., Ed., and J. Drake, Ed., "Proactive Connectivity Verification, Continuity Check, and Remote Defect Indication for the MPLS Transport Profile", RFC 6428, November 2011.