Internet DRAFT - draft-mbind-bfd-redundancy
draft-mbind-bfd-redundancy
Internet Engineering Task Force M. Binderberger
Internet-Draft N. Akiya
Intended status: Standards Track Cisco Systems
Expires: November 08, 2013 May 07, 2013
Redundant BFD sessions
draft-mbind-bfd-redundancy-01
Abstract
This document defines a second or "shadow" BFD session to an existing
"primary" BFD session, providing resiliency against BFD failures that
are not legitimate.
Scenarios will be discussed on how presence of a shadow BFD session
will be beneficial in the context of high availability.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 08, 2013.
Copyright Notice
Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
Binderberger & Akiya Expires November 08, 2013 [Page 1]
Internet-Draft Redundant BFD sessions May 2013
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Failure scenarios . . . . . . . . . . . . . . . . . . . . . . 3
3. Differentiating primary and shadow sessions . . . . . . . . . 5
4. BFD version 2 packets . . . . . . . . . . . . . . . . . . . . 6
5. BFD discriminators . . . . . . . . . . . . . . . . . . . . . 6
6. Using primary and shadow BFD sessions . . . . . . . . . . . . 6
7. LSP ping bootstrapped BFD sessions . . . . . . . . . . . . . 7
8. Scale aspect . . . . . . . . . . . . . . . . . . . . . . . . 8
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8
10. Security Considerations . . . . . . . . . . . . . . . . . . . 8
11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 8
12. Normative References . . . . . . . . . . . . . . . . . . . . 8
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 9
1. Introduction
Bidirectional Forwarding Detection [RFC5880] is used to detect
network failures. Link failures and peer system outages are some
examples of failures which can be detected with BFD technology.
Although undesirable, the BFD technology may falsely declare failure
in some scenarios: BFD process crash, FPGA reset on hardware based
BFD, or a card running the BFD functionality fails or gets removed
accidentally. In all these cases, the forwarding being monitored by
BFD may remain functional. Unnecessary rerouting of traffic, while
not a problem per-se, can be a problem at a large scale of false BFD
triggers, e.g. tens of thousands of traffic path. A serious outcome
may be seen if a network outage occurs in a time window in which BFD
is not detecting failures. For example, during software updates an
extended timer value may be used, leaving the system and it's peer
"blind" for any real liveliness problem until the BFD functionality
is restored.
This draft proposes to run a second "shadow" BFD session, in parallel
to the existing "primary" BFD session. This additional session will
have it's own unique discriminator value(s). The method used to
differentiate discriminator zero primary and shadow sessions is
discussed in the following sections.
Binderberger & Akiya Expires November 08, 2013 [Page 2]
Internet-Draft Redundant BFD sessions May 2013
2. Failure scenarios
BFD technology requires continuous transmission of control packets in
both directions. The rate at which both systems are required to
transmit these packets will vary depending on operational
requirements and configurations: BFD mode and interval. If a BFD
module on one system is unable to transmit BFD control packets for
amount of time greater than the negotiated failure detection time,
then the BFD module on the other system will declare a session
failure. Sometimes the cause of such a session failure is not
related to the functionality of the path being monitored by BFD.
Some failure scenarios which can exhibit such behaviors are described
in this section.
1. Software based BFD: BFD process crash - Software entity handling
BFD packets may crash unexpectedly. Time it takes for same, or
possibly alternative software entity, to become functional is a
time window where BFD packets will not be handled. If this time
window is larger than negotiated failure detection time, sessions
will be declared as failure even though monitored paths may still
be valid. If there existed another software entity, running on
same CPU or different CPU, validating same paths, false failure
can be avoided as long as two software entities do not crash
around same time.
2. Software based BFD: CPU starvation - CPU starvation may cause BFD
packets from being handled in timely manner. During this period,
packets may not get transmitted or received packets may not get
processed. If length of time CPU starvation affecting BFD
software entity is larger than negotiated failure detection time,
sessions will be declared as failure even though monitored paths
may still be valid. If there existed another software entity,
running on different CPU, validating same paths, false failure
for this scenario can be avoided as long as two software entities
do not become CPU starved around same time.
3. Hardware based BFD: FPGA reset - In a scenario where hardware BFD
and actual forwarding are performed on separate chips, it may be
desirable to reset just FPGA which runs BFD. Planned such FPGA
reset can be handled locally. Sessions can be migrated to
another chip set, failure detection times can be extended during
absence of local BFD functionality, combination of both or some
other means. However, any solution will require additional
proprietary logics to be implemented. Users, operating multiple
products, may need to understand expected behavior of each. In
addition, extending failure detection times mean that system can
no longer detect true failure within desired failure detection
Binderberger & Akiya Expires November 08, 2013 [Page 3]
Internet-Draft Redundant BFD sessions May 2013
times. A consistent solution which does not compromise
configured failure detection time is desired.
4. System using centralized BFD architecture: Route processor card
fault - A product with redundant route processor card could
implement a standby BFD entity to run on the other route
processor card. Implementation may set BFD entity on standby
route processor to be partially active or dormant until it is
determined to be active. In both cases, data synchronization
between the two entities is essential to ensure standby "take
over" happens seamlessly. Additionally, "take over" detection
and "take over" procedures themselves becoms essential, as any
slowness in such may cause remote peers to take down sessions.
If there existed two fully active BFD entities, one on active
route processor and another on standby route processor,
validating same paths, potentially complex "take over" logics can
be avoided.
5. System using distributed BFD architecture: Linecard fault - BFD
may run on logical interfaces which are comprised of physical
interfaces spanning multiple linecards. BFD may run on paths
which are comprised of nexthops hosted on multiple linecards.
BFD may run on logical interfaces or paths which nexthops change
dynamically, jumping from one linecard to another. In all cases,
a linecard hosting a certain BFD session may not be hosting
actual outgoing interface corresponding to that BFD session at
any given time. In such cases, failure of a linecard may not
have any impact to the paths being monitored by some or all
hosted BFD sessions. One implementation may attempt to solve
this problem by trying to move BFD sessions to a linecard where
nexthops reside. Unfortunately this only solves subset of the
problem since it will not cover the scenario where there are
valid multiple nexthops hosted on multiple linecards (ex: LAG,
ECMP). Another implementation may attempt to solve this problem
by running a standby BFD entity on another linecard. However,
this solution has same issues as described in the centralized BFD
architecture section. Again, if there existed two fully active
BFD entities, running on different linecards, validating same
paths, potentially complex synchronization, "take over" or
"migration" logics can be avoided.
Failure scenarios are not limited to the ones described above. In
all cases, the reliability of BFD sessions will increase
significantly if a second fully active BFD instance existed. It is
possible to address some, or potentially all, failure scenarios
locally. However, multiple proprietary solutions are likely required
to cover wide problematic areas. Result may not be desirable from
operator perspective, as expected behavior will deviate from a
Binderberger & Akiya Expires November 08, 2013 [Page 4]
Internet-Draft Redundant BFD sessions May 2013
failure to failure, and from a device to device. Therefore, this
specification defines a simple and consistent redundancy mechanism
which can be used with wide range of local failure scenarios.
3. Differentiating primary and shadow sessions
For a single target monitored by BFD, a system needs to run two
instances of the BFD sessions: a primary session and a shadow
session. This requires BFD control packets to have an indication on
which role they belong. In other words, every control packet needs
to have an indication on whether it belongs to the primary or the
shadow session.
When looking at the BFD version 1 packet in [RFC5880], there are no
unused bits left to store a shadow flag to distinguish the primary
from the shadow session. One could take away a bit from e.g. the
Diag, the Multiplier or the Length field, even claiming the least
significant bit from one of the interval fields. But none of these
proposals would be safe against interoperability problems with BFD
speakers not supporting this draft.
That leaves three possible options.
a. Use of existing BFD version 1 control packet definition will
indicate a primary BFD session. Shadow BFD sessions will use
version 2 in the BFD packets. Besides usage of different version
number, all operation will conform to the behaviors described in
BFD RFCs. Shadow BFD sessions only handle version 2 BFD packets.
Primary BFD sessions only handle version 1 BFD packets as
specified in section 6.8.6 of [RFC5880].
b. Define a new BFD packet header for version 2. This new version
is to include bits to indicate the session type: primary session
or shadow session. Shadow BFD sessions only handle version 2 BFD
packets with shadow bits set. Primary BFD sessions handle
version 1 BFD packets or version 2 BFD packets with primary bits
set.
c. Use information outside the BFD packet. For IP/UDP encapsulated
BFD packets this could be a UDP destination port different from
the well-known ports defined in [RFC5881] and [RFC5883]. For BFD
over Pseudo Wires [RFC5885] or BFD for MPLS-TP OAM [RFC6428] new
type values could be used in the PW-ACH and G-ACH to
differentiate shadow BFD packets from the primary BFD session
packets.
Option b redefines the BFD packet contents. Although it is a clean
solution, this approach can have a significant impact to existing BFD
Binderberger & Akiya Expires November 08, 2013 [Page 5]
Internet-Draft Redundant BFD sessions May 2013
implementations. Introduction of BFD redundancy capability at
significant costs is thought to be undesirable, thus this option is
not recommended. However, when there is a discussion on defining new
version of BFD packet contents, addition of redundancy capability
would be recommended. Option c will create dependencies with current
and future BFD RFCs since each will need to define a way shadow
session can be specified. Therefore, this option is also not
recommended. That leaves option a as the recommended choice.
4. BFD version 2 packets
BFD version 2 packets follow exactly the definition given in
[RFC5880] and other BFD-related RFCs, with one difference that the
version field contains the value "2". The packet format is the same
as described in section 4.1 of [RFC5880]. Implementations following
this draft MUST be able to receive BFD packets with the version field
values "1" and "2" and MUST drop BFD packets with any other version
value.
BFD packets with a version value of "1" are named "primary" packets
while BFD packets with a version value of "2" are named "shadow"
packets within this document. The primary session MUST only transmit
and receive primary packets. The shadow session MUST only transmit
and receive shadow packets.
5. BFD discriminators
As primary sessions and shadow sessions are operating independently,
they have different my discriminator values. My discriminator values
assigned to BFD sessions are unique per system, across the combined
set of primary and shadow sessions. In other words, a system will
have one discriminator pool to be used for both primary and shadow
sessions, not a pool per session type.
6. Using primary and shadow BFD sessions
A shadow BFD session is associated to exactly one primary BFD
session. The parameters used by shadow sessions SHOULD be the same
as the parameters of associated primary session. Purpose for such is
to ensure that two sessions operate using the same mode, interval and
failure detection time. This allows for the two sessions to behave
as similar as possible to reduce the chance of them concluding
deviating state in valid failure scenarios.
When the BFD shadow capability is enabled to a target, two session
instances to that target are created: primary and shadow. A logic
SHOULD be applied to identify where in the system to host the two
sessions. The logic should maximize the failure detection validity
Binderberger & Akiya Expires November 08, 2013 [Page 6]
Internet-Draft Redundant BFD sessions May 2013
by minimizing the chances of both sessions being impacted by a single
local failure. For example, if there are multiple CPU instances,
there will be more benefits to run the two sessions on different CPU
instances. Details of this logic, however, is outside the scope of
this document.
Both the primary and the shadow session are to operate as per
specified in other BFD RFCs. A differentiator comes into play
between state changes of the two sessions and the action taken when
reachability of the BFD enabled target changes. This differentiator
will be referred as the state consolidation module from here onward.
The purpose of the state consolidation module is to consolidate the
state of the primary and the shadow session, and to produce a final
state to be used by the system to take action on. The logic of the
state consolidation module is as follows:
Final state is UP when the state of the primary session is UP or the
state of the shadow session is UP.
Final state is DOWN when both the state primary session is DOWN and
the state of the shadow sessions is DOWN.
7. LSP ping bootstrapped BFD sessions
This specification aims to introduce BFD redundancy concept to
various flavors of BFD while minimizing disruption to existing
implementations. There is, however, one additional change required
in order to support LSP ping bootstrapped BFD sessions described by
[RFC5884].
This specification defines a new optional TLV to be carried in LSP
ping packet.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Discriminator |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
This TLV has a length of 4. The value contains the 4-byte local
discriminator that the LSR, sending the LSP ping message, associates
with the shadow BFD session. TBD: IANA to assign optional type.
Upon reception of this optional TLV, LSP egress is to create a shadow
session for specified FEC, if local constraints allow, with your
discriminator set to value specified in the TLV. This TLV MAY be
included in the LSP ping which carries BFD discriminator TLV of
Binderberger & Akiya Expires November 08, 2013 [Page 7]
Internet-Draft Redundant BFD sessions May 2013
corresponding primary session, or this TLV MAY be carried in a
separate LSP ping packet which does not carry BFD discriminator TLV
of corresponding primary session. In both cases, egress LSR MUST
associate both primary and shadow sessions in the state consolidation
module.
8. Scale aspect
The BFD module becomes more resilient by enabling the shadow BFD
capability. However, when the shadow BFD capability is enabled on a
system, the total number of BFD sessions hosted on a system will be
increased by the number of shadow BFD sessions. For the same number
of BFD monitored targets, more system resources will be used.
Solving a scale issue is outside the scope of this document.
However, below lists some techniques which can be considered:
1. Reduce the configured BFD intervals of some or all BFD sessions.
2. Allow an implementation to run shadow sessions at a slower rate.
9. IANA Considerations
IANA to assign optional type for new LSP ping TLV.
10. Security Considerations
This document does not introduce any additional security issues and
the security mechanisms defined in [RFC5880] apply in this document.
11. Acknowledgements
Authors would like to thank Aswatnarayan Raghuram from AT&T for
providing requirements and helpful comments.
Authors would like to thank Gregory Mirsky and Alexander Vainshtein
for providing insightful comments.
Authors would like to thank Srihari Raghavan and Mallik Mudigonda
from Cisco Systems for providing valuable comments regarding LSP ping
bootstrapped sessions.
12. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC5880] Katz, D. and D. Ward, "Bidirectional Forwarding Detection
(BFD)", RFC 5880, June 2010.
Binderberger & Akiya Expires November 08, 2013 [Page 8]
Internet-Draft Redundant BFD sessions May 2013
[RFC5881] Katz, D. and D. Ward, "Bidirectional Forwarding Detection
(BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881, June
2010.
[RFC5883] Katz, D. and D. Ward, "Bidirectional Forwarding Detection
(BFD) for Multihop Paths", RFC 5883, June 2010.
[RFC5884] Aggarwal, R., Kompella, K., Nadeau, T., and G. Swallow,
"Bidirectional Forwarding Detection (BFD) for MPLS Label
Switched Paths (LSPs)", RFC 5884, June 2010.
[RFC5885] Nadeau, T. and C. Pignataro, "Bidirectional Forwarding
Detection (BFD) for the Pseudowire Virtual Circuit
Connectivity Verification (VCCV)", RFC 5885, June 2010.
[RFC6428] Allan, D., Swallow Ed. , G., and J. Drake Ed. , "Proactive
Connectivity Verification, Continuity Check, and Remote
Defect Indication for the MPLS Transport Profile", RFC
6428, November 2011.
Authors' Addresses
Marc Binderberger
Cisco Systems
Email: mbinderb@cisco.com
Nobo Akiya
Cisco Systems
Email: nobo@cisco.com
Binderberger & Akiya Expires November 08, 2013 [Page 9]