Internet Engineering Task Force | D. Katz |
Internet-Draft | Juniper Networks |
Intended status: Standards Track | D. Ward |
Expires: June 1, 2019 | Cisco Systems |
S. Pallagatti, Ed. | |
Rtbrick | |
G. Mirsky, Ed. | |
ZTE Corp. | |
November 28, 2018 |
BFD Multipoint Active Tails
draft-ietf-bfd-multipoint-active-tail-10
This document describes active tail extensions to the Bidirectional Forwarding Detection (BFD) protocol for multipoint networks.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on June 1, 2019.
Copyright (c) 2018 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
This application of BFD is an extension to Multipoint BFD [I-D.ietf-bfd-multipoint], which allows tails to notify the head of the lack of multipoint connectivity. As a further option, heads can request a notification from the tails by means of a polling mechanism. Notification to the head can be enabled for all tails, or for only a subset of the tails.
The goal of this application is for the head to learn, reasonably rapidly, which tails have lost connectivity from the head.
Since scaling is a primary concern (particularly state explosion toward the head), it is required that the head be in control of all timing aspects of the mechanism, and that BFD packets from the tails to the head not be synchronized.
Throughout this document, the term "multipoint" is defined as a mechanism by which one or more systems receive packets sent by a single sender. This specifically includes such things as IP multicast and point-to-multipoint MPLS.
The term "connectivity" in this document is not used in the sense of connectivity verification in a transport network but as an alternative to "continuity", i.e., the existence of a path between the sender and the receiver.
This document effectively modifies and adds to Sections 5.12 and 5.13 of the base BFD multipoint document [I-D.ietf-bfd-multipoint].
BFD Bidirectional Forwarding Detection
c-poll Composite Poll
m-poll Multipoint Poll
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
A head may wish to be alerted to the tails' connectivity (or lack thereof), and there are a number of options to achieve that. First, if all that is needed is a best-effort failure notification, as discussed in Section 5.2.1, the tails can send unsolicited unicast BFD Control packets to the head when the path fails, as described in Section 6.4.
If the head wishes to know of the active tails on the multipoint path, it may send a multipoint BFD Control packet with the Poll (P) bit set, which will induce the tails to return a unicast BFD Control packet with the Final (F) bit set (detailed description in Section 5.2.2). The head can then create BFD session state for each of the tails that have multipoint connectivity. If the head sends such a packet on occasion, it can keep track of which tails answer, thus providing a more deterministic mechanism for detecting which tails fail to respond (implying a loss of multipoint connectivity). In this document, this method is referred to as Multipoint Poll (m-poll).
If the head wishes to have a definite indication of the tails' connectivity, it may do all of the above, but if it detects that a tail did not answer the previous multipoint poll, it may initiate a Demand mode Poll Sequence as a unicast to that tail (detailed description in Section 5.2.3). This covers the case where either the multipoint poll or the single reply is lost in transit. If desired, the head may Poll one or more tails proactively to track the tails' connectivity. In this document, this method, which combines the use of multipoint and unicast polling of tails by the head, is referred to as Composite Poll (c-poll).
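The selection step of the composite poll can be sketched as follows. This is a minimal Python illustration and not part of the specification: the function name tails_needing_unicast_poll and the use of address strings as tail identifiers are assumptions made only for the example.

```python
def tails_needing_unicast_poll(known_tails, responders):
    """Return tails that answered an earlier multipoint Poll but not the latest one.

    known_tails: set of tail identifiers that replied to a previous m-poll.
    responders:  set of tail identifiers that replied with Final to the latest m-poll.
    """
    # A missing reply is ambiguous (the Poll or the reply may have been lost),
    # so these tails are candidates for a unicast Poll Sequence (c-poll).
    return sorted(known_tails - responders)


known = {"192.0.2.1", "192.0.2.2", "192.0.2.3"}
latest = {"192.0.2.1", "192.0.2.3", "192.0.2.4"}  # 192.0.2.4 is newly discovered
candidates = tails_needing_unicast_poll(known, latest)
```

Here only the tail that was previously known and did not answer the latest multipoint Poll is selected; a newly discovered tail is not affected.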
If the awareness of the state of some nodes is more important for the head, in the sense that the head needs to detect the lack of multipoint connectivity to a subset of tails at a different rate, the head may transmit unicast BFD Polls to that subset of tails. In this case, the timing may be independent on a tail-by-tail basis.
Individual tails may be configured so that they never send BFD control packets to the head. Such tails will never be known to the head, but will still be able to detect multipoint path failures from the head.
It is worth analyzing how this protocol reacts to various scenarios. There are three path components present, namely, the multipoint path, the forward unicast path (from head to a particular tail), and the reverse unicast path (from a tail to the head). There are also four options as to how the head is notified about failures from the tail. For the different modes described below, the settings of new state variables are given even though these variables are only introduced later in the document (see Section 6.3).
In this scenario, only the multipoint path is used and none of the others matter. A failure in the multipoint path will result in the tail noticing the failure within a detection time, and the head will remain ignorant of the tail state. This mode emulates the behavior described in [I-D.ietf-bfd-multipoint]. In this mode, bfd.SessionType is MultipointTail and the variable bfd.SilentTail (see Section 6.3.1) MUST be set to 1. If bfd.SessionType is MultipointHead or MultipointClient, bfd.ReportTailDown MUST be set to 0. The head MAY set bfd.RequiredMinRxInterval to zero and thus suppress tails sending any BFD Control packets.
In these scenarios, the tail sends unsolicited or solicited BFD packets in response to the detection of a multipoint path failure. All these scenarios have common settings:
In this scenario, the tail sends unsolicited BFD packets in response to the detection of a multipoint path failure. It uses the reverse unicast path, but not the forward unicast path.
If the multipoint path fails but the reverse unicast path stays up, the tail will detect the failure within a detection time, and the head will know about it within one reverse packet time (since the notification is delayed).
If both the multipoint path and the reverse unicast paths fail, the tail will detect the failure but the head will remain unaware of it.
In this scenario, the head sends occasional multipoint Polls in addition to (or in lieu of) non-Poll multipoint BFD Control packets, expecting the tails to reply with Final. This also uses the reverse unicast path, but not the forward unicast path.
If the multipoint path fails but the reverse unicast path stays up, the tail will detect the failure within a detection time, and the head will know about it within one reverse packet time (the notification is delayed to avoid synchronization of the tails).
If both the multipoint path and the reverse unicast paths fail, the tail will detect the failure but the head will remain unaware of this fact.
If the reverse unicast path fails but the multipoint path stays up, the head will see the BFD session fail, but the state of the multipoint path will be unknown to the head. The tail will continue to receive multipoint data traffic.
If either the multipoint Poll or the unicast reply is lost in transit, the head will see the BFD session fail, but the state of the multipoint path will be unknown to the head. The tail will continue to receive multipoint data traffic.
In this scenario, the head sends occasional multipoint Polls in addition to (or in lieu of) non-Poll multipoint BFD Control packets, expecting the tails to reply with Final. If a tail that had previously replied to a multipoint Poll fails to reply (or if the head simply wishes to verify tail connectivity), the head issues a unicast Poll Sequence to the tail. This scenario makes use of all three paths. In this mode, for a session with bfd.SessionType of MultipointTail, the variable bfd.SilentTail (see Section 6.3.1) MUST be set to 0.
If the multipoint path fails but the two unicast paths stay up, the tail will detect the failure within a detection time, and the head will know about it within one reverse packet time (since the notification is delayed). Note that the reverse packet time may be smaller in this case if the head has previously issued a unicast Poll (since the tail will not delay transmission of the notification in this case).
If both the multipoint path and the reverse unicast paths fail (regardless of the state of the forward unicast path), the tail will detect the failure but the head will remain unaware of this fact. The head will detect a BFD session failure to the tail but cannot make a determination about the state of the tail's multipoint connectivity.
If the forward unicast path fails but the reverse unicast path stays up, the head will detect a BFD session failure to the tail if it happens to send a unicast Poll sequence, but cannot make a determination about the state of the tail's multipoint connectivity. If the multipoint path to the tail fails prior to any unicast Poll being sent, the tail will detect the failure within a detection time, and the head will know about it within one reverse packet time (since the notification is delayed).
If the multipoint path stays up but the reverse unicast path fails, the head will see the particular MultipointClient session fail if it happens to send a Poll Sequence, but the state of the multipoint path will be unknown to the head. The tail will continue to receive multipoint data traffic.
If the multipoint path and the reverse unicast path both stay up but the forward unicast path fails, neither side will notice this failure so long as a unicast Poll Sequence is never sent by the head. If the head sends a unicast Poll Sequence, the head will detect the failure in the forward unicast path. The state of the multipoint path will be determined by multipoint Poll. The tail will continue to receive multipoint data traffic.
This section describes the operation of BFD Multipoint active tail in detail. This section modifies Section 4 of [I-D.ietf-bfd-multipoint] as follows:
If the head is keeping track of some or all of the tails, it has a session of type MultipointClient per tail that it cares about. All of the MultipointClient sessions for tails on a particular multipoint path are associated with the MultipointHead session to which the clients are listening. A BFD Poll Sequence may be sent over a MultipointClient session to a tail if the head wishes to verify connectivity. These sessions receive any BFD Control packets sent by the tails, and MUST NOT transmit periodic BFD Control packets other than Poll Sequences (since periodic transmission is always done by the MultipointHead session). Note that the settings of all BFD variables in a MultipointClient session for a particular tail override the corresponding settings in the MultipointHead session.
If a MultipointClient session receives a BFD Control packet from the tail with state Down or AdminDown, the head reliably knows that the tail has lost multipoint connectivity. If the Detection Time expires on a MultipointClient session, it is ambiguous as to whether the multipoint connectivity failed or whether there was a unicast path problem in one direction or the other, so the head does not reliably know the tail's state.
BFD Multipoint active tail introduces new state variables and modifies the usage of a few existing ones defined in section 4.4 of [I-D.ietf-bfd-multipoint].
A few state variables are added in support of Multipoint BFD active tail.
A new state variable value is added to:
bfd.SessionType
Some state variables defined in section 6.8.1 of [RFC5880] need to be initialized or manipulated differently depending on the session type. The values of some of these variables relate to those of the same variables of a MultipointHead session (see section 4.4.2 of [I-D.ietf-bfd-multipoint]).
The state variables defined above are used to choose which operational options are active.
The most basic form of the operation of BFD in multipoint networks is explained in [I-D.ietf-bfd-multipoint]. In this scenario, BFD Control packets flow only from the head, and no tracking of tail state at the head is desired. That can be accomplished by setting bfd.ReportTailDown to 0 in the MultipointHead session (Section 5.1).
If the head wishes to know of the active tails, it sends multipoint Polls as needed. Previously known tails that don't respond to the Polls will be detected (as per Section 5.2.2).
If the head wishes to request a notification from the tails when they lose connectivity, it sets bfd.ReportTailDown to 1 in either the MultipointHead session (if such notification is desired from all tails) or in the MultipointClient session (if notification is desired from a particular tail). Note that the setting of this variable in a MultipointClient session for a particular tail overrides the setting in the MultipointHead session.
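The override rule for bfd.ReportTailDown can be illustrated with a short sketch. The function name and the convention of passing None when no MultipointClient session exists for a tail are assumptions of this example, not something the specification defines.

```python
from typing import Optional


def effective_report_tail_down(head_value: int, client_value: Optional[int]) -> int:
    """Pick the bfd.ReportTailDown value that applies to one tail.

    head_value:   setting in the MultipointHead session (applies to all tails).
    client_value: setting in that tail's MultipointClient session, or None when
                  no MultipointClient session exists (assumption of this sketch).
    """
    # The MultipointClient setting, when present, overrides the MultipointHead one.
    return head_value if client_value is None else client_value
```

For instance, with the MultipointHead session set to 0 and one MultipointClient session set to 1, only the tail behind that MultipointClient session is asked to report.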
If the head wishes to verify the state of a tail on an ongoing basis, it sends a Poll Sequence from the MultipointClient session associated with that tail as needed. This has the effect of eliminating the initial delay, described in Section 6.13.3, that the tail would otherwise insert prior to transmission of the packet. Thus, the head may be notified of the session failure more quickly than with the use of m-poll.
If a tail wishes to operate silently (sending no BFD Control packets to the head) it sets bfd.SilentTail to 1 in the MultipointTail session. This allows a tail to be silent independent of the settings on the head.
Though the state transitions for the state machine, as defined in Section 5.5 of [I-D.ietf-bfd-multipoint], for a session of type MultipointHead are only administratively driven, the state machine for a session of type MultipointClient is the same, and the diagram is applicable.
If BFD Control packets are received at the head, they are demultiplexed to sessions of type MultipointClient, which represent the set of tails that the head is interested in tracking. These sessions will typically also be established dynamically based on the receipt of BFD Control packets. The head has broad latitude in choosing which tails to track, if any, without affecting the basic operation of the protocol. The head directly controls whether or not tails are allowed to send BFD Control packets back to the head by setting bfd.RequiredMinRxInterval to zero in a MultipointHead or a MultipointClient session.
When the tails send BFD Control packets to the head from the MultipointTail session, the contents of Your Discriminator (the discriminator received from the head) will not be sufficient for the head to demultiplex the packet, since the same value will be received from all tails on the multicast tree. In this case, the head MUST demultiplex packets based on the source address and the value of Your Discr, which together uniquely identify the tail and the multipoint path.
When the head sends unicast BFD Control packets to a tail from a MultipointClient session, the value of Your Discriminator will be valid, and the tail MUST demultiplex the packet based solely on Your Discr.
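The head-side demultiplexing rule can be sketched as a lookup keyed on the pair (source address, Your Discr). The class below is a toy illustration under assumed names (HeadDemux, add_client, lookup); sessions are represented by plain strings purely for demonstration.

```python
class HeadDemux:
    """Toy lookup table for MultipointClient sessions at the head."""

    def __init__(self):
        # Key: (source address, Your Discr). Your Discr alone is the same for
        # every tail on the tree, so it cannot identify the tail by itself.
        self._clients = {}

    def add_client(self, source_addr, your_discr, session):
        self._clients[(source_addr, your_discr)] = session

    def lookup(self, source_addr, your_discr):
        # Returns None when no MultipointClient session matches.
        return self._clients.get((source_addr, your_discr))


demux = HeadDemux()
demux.add_client("192.0.2.1", 7, "client-A")
demux.add_client("192.0.2.2", 7, "client-B")
```

Both tails echo the same discriminator value (7 here), yet they resolve to different MultipointClient sessions because the source address is part of the key.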
As the fan-in from the tails to the head may be very large, it is critical that the flow of BFD Control packets from the tails is controlled.
The head always operates in Demand mode. This means that no tail will send an asynchronous BFD Control packet as long as the session is Up.
The value of Required Min Rx Interval received by a tail in a unicast BFD Control packet, if any, always takes precedence over the value received in Multipoint BFD Control packets. This allows the packet rate from individual tails to be controlled separately as desired by sending a BFD Control packet from the corresponding MultipointClient session. This also eliminates the random delay, as discussed in Section 6.13.3, prior to transmission from the tail that would otherwise be inserted, reducing the latency of reporting a failure to the head.
If the head wishes to suppress traffic from the tails when they detect a session failure, it MAY set bfd.RequiredMinRxInterval to zero, which is a reserved value that indicates that the sender wishes to receive no periodic traffic. This can be set in the MultipointHead session (suppressing traffic from all tails) or it can be set in a MultipointClient session (suppressing traffic from only a single tail).
Any tail may be provisioned to never send *any* BFD Control packets to the head by setting bfd.SilentTail to 1. This provides a mechanism by which only a subset of tails reports their session status to the head.
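The controls described above can be combined into a single decision at the tail. The following is a minimal sketch with assumed names and microsecond units; it covers only periodic failure notifications and does not model replies to Polls.

```python
from typing import Optional


def tail_may_send_periodic(silent_tail: int,
                           multipoint_min_rx_us: int,
                           unicast_min_rx_us: Optional[int] = None) -> bool:
    """Decide whether a tail may send periodic BFD Control packets to the head.

    silent_tail:          value of bfd.SilentTail at the tail.
    multipoint_min_rx_us: Required Min Rx Interval received in multipoint packets.
    unicast_min_rx_us:    Required Min Rx Interval received in a unicast packet,
                          or None if no unicast packet was received (assumption).
    """
    if silent_tail == 1:
        return False  # a silent tail never sends anything to the head
    # A value received in a unicast packet takes precedence over the multipoint one.
    if unicast_min_rx_us is None:
        effective = multipoint_min_rx_us
    else:
        effective = unicast_min_rx_us
    # Zero is the reserved value meaning "send me no periodic traffic".
    return effective != 0
```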
If the head wishes to know of the active tails, the MultipointHead session can send a BFD Control packet as specified in Section 6.13.3, with the Poll (P) bit set to 1. This will cause all of the tails to reply with a unicast BFD Control Packet, randomized across one packet interval.
The decision as to when to send a multipoint Poll is outside the scope of this specification. However, it MUST NOT be sent more often than the regular multipoint BFD Control packet. Since the tail will treat a multipoint Poll like any other multipoint BFD Control packet, Polls may be sent in lieu of non-Poll packets.
Soliciting the tails also starts the Detection Timer for each of the associated MultipointClient sessions, which will cause those sessions to time out if the associated tails do not respond.
Note that for this mechanism to work properly, the Detection Time (which is equal to bfd.DesiredMinTxInterval) MUST be greater than the round trip time of BFD Control packets from the head to the tail (via the multipoint path) and back (via a unicast path). See Section 6.11 for more details.
If the head wishes to verify connectivity to a specific tail, the corresponding MultipointClient session can send a BFD Poll Sequence to said tail. This might be done in reaction to the expiration of the Detection Timer (the tail didn't respond to a multipoint Poll), or it might be done on a proactive basis.
The interval between transmitted packets in the Poll Sequence MUST be calculated as specified in the base BFD specification [RFC5880] (the greater of bfd.DesiredMinTxInterval and bfd.RemoteMinRxInterval).
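As a one-line illustration of this calculation (the function name and the microsecond units are assumptions of the sketch):

```python
def poll_sequence_interval_us(desired_min_tx_us: int, remote_min_rx_us: int) -> int:
    """Interval between Poll Sequence packets: the greater of
    bfd.DesiredMinTxInterval and bfd.RemoteMinRxInterval."""
    return max(desired_min_tx_us, remote_min_rx_us)
```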
The value transmitted in Required Min RX Interval will be used by the tail (rather than the value received in any multipoint packet) when it transmits BFD Control packets to the head notifying it of a session failure and the transmitted packets will not be delayed. This value can potentially be set much lower than in the multipoint case, in order to speed up a notification to the head, since the value will be used only by the single tail. This value (and the lack of delay) are "sticky", in that once the tail receives it, it will continue to use it indefinitely. Therefore, if the head no longer wishes to single out the tail, it SHOULD reset the timer to the default by sending a Poll Sequence with the same value of Required Min Rx Interval as is carried in the multipoint packets, or it MAY reset the tail session by sending a Poll Sequence with state AdminDown (after the completion of which the session will come back up).
Note that a failure of the head to receive a response to a Poll Sequence does not necessarily mean that the tail has lost multipoint connectivity, though a reply to a Poll Sequence does reliably indicate connectivity or lack thereof (by virtue of the tail's state not being Up in the BFD Control packet).
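The asymmetry between a reply and a missing reply can be captured in a small classifier. This is an illustrative sketch; the function name, the string labels it returns, and the use of None for "no reply" are assumptions of the example.

```python
from typing import Optional


def classify_poll_outcome(reply_state: Optional[str]) -> str:
    """Interpret the result of a unicast Poll Sequence at the head.

    reply_state: BFD state carried in the tail's reply ("Up", "Down",
                 "AdminDown", "Init"), or None if no reply arrived.
    """
    if reply_state is None:
        # No reply: the multipoint path, either unicast path, or the packets
        # themselves may be at fault, so nothing can be concluded.
        return "unknown"
    if reply_state == "Up":
        return "connected"
    # Any reply whose state is not Up reliably signals lost connectivity.
    return "lost"
```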
MultipointClient sessions at the head are always in the Demand mode, and as such only care about detection time in two cases. First, if a Poll Sequence is being sent on a MultipointClient session, the detection time on this session is calculated according to the base BFD specification [RFC5880], that is, the transmission interval multiplied by bfd.DetectMult. Second, when a multipoint Poll is sent to solicit tail replies, the detection time on all associated MultipointClient sessions that aren't currently sending Poll Sequences is set to a value greater than or equal to bfd.RequiredMinRxInterval (one packet time). This value can be made arbitrarily large in order to ensure that the detection time is greater than the round trip time of a BFD Control packet between the head and the tail with no ill effects, other than delaying the detection of unresponsive tails. Note that a detection time expiration on a MultipointClient session at the head, while indicating a BFD session failure, cannot be construed to mean that the tail is not hearing multipoint packets from the head.
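The two detection time cases can be sketched as follows. The function names, the microsecond units, and the extra_margin_us parameter (standing in for whatever additional allowance an implementation chooses) are assumptions of this example.

```python
def poll_sequence_detection_time_us(tx_interval_us: int, detect_mult: int) -> int:
    """Case 1: a Poll Sequence is in progress on the MultipointClient session.
    Detection time is the transmission interval multiplied by bfd.DetectMult."""
    return tx_interval_us * detect_mult


def multipoint_poll_detection_time_us(required_min_rx_us: int,
                                      extra_margin_us: int = 0) -> int:
    """Case 2: a multipoint Poll was sent to solicit replies.
    Detection time is at least bfd.RequiredMinRxInterval (one packet time);
    extra_margin_us is a hypothetical allowance for the round trip time."""
    if extra_margin_us < 0:
        raise ValueError("margin must not reduce the time below one packet time")
    return required_min_rx_us + extra_margin_us
```

A larger margin only delays detection of unresponsive tails; it does not otherwise affect the protocol.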
If the MultipointHead session is in Down/AdminDown state (which only happens administratively), all associated MultipointClient sessions SHOULD be destroyed as they are superfluous.
If a MultipointClient session goes down due to the receipt of an unsolicited BFD Control packet from the tail with state Down or AdminDown (not in response to a Poll), and tail connectivity verification is not being done, the session MAY be destroyed. If verification is desired, the session SHOULD send a Poll Sequence and the session SHOULD be maintained.
If the tail replies to a Poll Sequence with state Down or AdminDown, it means that the tail session is definitely down. In this case, the session MAY be destroyed.
If the Detection Time expires on a MultipointClient session (meaning that the tail did not reply to a Poll Sequence) the session MAY be destroyed.
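The session maintenance rules above can be summarized as a decision function. This is a rough sketch: the Event names and the returned action labels are invented for the example, and the "may-destroy" result reflects a MAY, so an implementation remains free to keep the session.

```python
from enum import Enum


class Event(Enum):
    HEAD_SESSION_DOWN = 1       # MultipointHead is Down/AdminDown
    UNSOLICITED_DOWN = 2        # tail sent Down/AdminDown, not in reply to a Poll
    POLL_REPLY_DOWN = 3         # tail replied to a Poll Sequence with Down/AdminDown
    DETECTION_TIME_EXPIRED = 4  # tail did not reply to a Poll Sequence


def client_session_action(event: Event, verification_desired: bool = False) -> str:
    """Suggest what to do with a MultipointClient session after an event."""
    if event is Event.HEAD_SESSION_DOWN:
        return "destroy"        # SHOULD be destroyed as superfluous
    if event is Event.UNSOLICITED_DOWN and verification_desired:
        return "poll-and-keep"  # SHOULD send a Poll Sequence and keep the session
    return "may-destroy"        # MAY be destroyed in the remaining cases
```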
The following sections are meant to extend the corresponding sections in the base BFD for Multipoint Networks specification [I-D.ietf-bfd-multipoint].
The following procedure modifies parts of Section 5.13.1 of [I-D.ietf-bfd-multipoint].
When a BFD Control packet is received, the procedure defined in Section 5.13.1 of [I-D.ietf-bfd-multipoint] MUST be followed, in the order specified. If the packet is discarded according to these rules, processing of the packet MUST cease at that point. In addition to that, if tail tracking is desired by the head, the following procedure MUST be applied.
This section is part of the addition to Section 5.13.2 of [I-D.ietf-bfd-multipoint], separated for clarity.
A system MUST NOT periodically transmit BFD Control packets if bfd.SessionType is MultipointClient and a Poll Sequence is not being transmitted.
If bfd.SessionType value is MultipointTail and the periodic transmission of BFD Control packets is just starting (due to Demand mode not being active on the remote system), the first packet to be transmitted MUST be delayed by a random amount of time between zero and (0.9 * bfd.RemoteMinRxInterval).
If a BFD Control packet is received with the Poll (P) bit set to 1, the receiving system MUST transmit a BFD Control packet with the Poll (P) bit clear and the Final (F) bit set, without respect to the transmission timer or any other transmission limitations, without respect to the session state, and without respect to whether Demand mode is active on either system. A system MAY limit the rate at which such packets are transmitted. If rate limiting is in effect, the advertised value of Desired Min TX Interval MUST be greater than or equal to the interval between transmitted packets imposed by the rate limiting function. If the Multipoint (M) bit is set in the received packet, the packet transmission MUST be delayed by a random amount of time between zero and (0.9 * bfd.RemoteMinRxInterval). Otherwise, the packet MUST be transmitted as soon as practicable.
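The randomized delay that keeps tail replies from arriving at the head in lockstep can be sketched as below. The function name, the microsecond units, and the injectable rng parameter are assumptions of this example; a uniform distribution is also an assumption, since the text only requires a random amount within the stated range.

```python
import random


def reply_delay_us(remote_min_rx_us: int, multipoint_bit: bool, rng=random) -> float:
    """Delay before a tail transmits its reply to a received Poll.

    multipoint_bit: True if the Multipoint (M) bit was set in the received packet.
    rng:            object providing uniform(a, b); defaults to the random module.
    """
    if not multipoint_bit:
        # Reply to a unicast Poll: send as soon as practicable.
        return 0.0
    # Reply to a multipoint Poll: spread replies over 0 .. 0.9 * interval.
    return rng.uniform(0.0, 0.9 * remote_min_rx_us)
```

The same range applies to the first packet sent when a MultipointTail session starts periodic transmission.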
A system MUST NOT set the Demand (D) bit if bfd.SessionType is MultipointClient unless bfd.DemandMode is 1, bfd.SessionState is Up, and bfd.RemoteSessionState is Up.
Content of the transmitted packet MUST be as explained in section 5.13.3 of [I-D.ietf-bfd-multipoint].
If head notification is to be used, it is assumed that a multipoint BFD packet encapsulation contains enough information so that a tail can address a unicast BFD packet to the head.
If head notification is to be used, it is assumed that there is bidirectional unicast communication available (at the same protocol layer within which BFD is being run) between the tail and head.
For the head to know reliably that a tail has lost multipoint connectivity, the unicast paths in both directions between that tail and the head must remain operational when the multipoint path fails. It is thus desirable that unicast paths not share fate with the multipoint path to the extent possible if the head wants more definite knowledge of the tail state.
Since the normal BFD three-way handshake is not used in this application, a tail transitioning from state Up to Down and back to Up again may not be reliably detected at the head.
Section 7 of [RFC5880] includes the requirements for implementation of a congestion control mechanism when BFD is used across multiple hops, and the mechanism to use congestion detection to reduce the amount of BFD packets the system generates. These requirements are also applicable to this specification. When this specification is used in the mode with no head notifications by tails, as discussed in Section 5.1, the head MUST limit the packet transmission rate to no higher than one BFD packet per second (Section 6 of [I-D.ietf-bfd-multipoint]). When BFD uses one of the mechanisms for notification of the head by tails described in Section 5.2, Min RX Interval can be used by the tail to control the packet transmission rate of the head. The exact mechanism for processing changes in the Min RX Interval value received from the tail in its response to a multicast or unicast Poll BFD packet is outside the scope of this document.
As noted in Section 7 of [RFC5880], "any mechanism that increases the transmit or receive intervals will increase the Detection Time for the session".
This document has no actions for IANA.
The same security considerations as those described in [RFC5880] and [I-D.ietf-bfd-multipoint] apply to this document.
Additionally, implementations that create MultipointClient sessions dynamically upon receipt of a BFD Control packet from a tail MUST implement protective measures to prevent the number of MultipointClient sessions from growing out of control. Below are listed some points to be considered in such implementations.
This specification does not raise any additional security issues beyond those of the specifications referred to in the list of normative references.
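One possible protective measure for dynamically created MultipointClient sessions is a hard cap on their number. The sketch below is only an illustration of that idea, not a measure this document prescribes; the class name, the max_sessions parameter, and the session representation are all hypothetical.

```python
class ClientSessionTable:
    """Toy table of dynamically created MultipointClient sessions with a cap."""

    def __init__(self, max_sessions: int):
        self.max_sessions = max_sessions  # hypothetical configured limit
        self.sessions = {}

    def get_or_create(self, source_addr, your_discr):
        key = (source_addr, your_discr)
        if key in self.sessions:
            return self.sessions[key]
        if len(self.sessions) >= self.max_sessions:
            # Refuse to create more state; the packet would be discarded.
            return None
        session = {"tail": source_addr, "your_discr": your_discr}
        self.sessions[key] = session
        return session


table = ClientSessionTable(max_sessions=2)
first = table.get_or_create("192.0.2.1", 7)
second = table.get_or_create("192.0.2.2", 7)
third = table.get_or_create("192.0.2.3", 7)  # over the cap
```

Existing tails continue to be served once the cap is reached; only the creation of new state is refused.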
Rahul Aggarwal of Juniper Networks and George Swallow of Cisco Systems provided the initial idea for this specification and contributed to its development.
The authors would also like to thank Nobo Akiya, Vengada Prasad Govindan, Jeff Haas, Wim Henderickx, and Mingui Zhang, who have greatly contributed to this document.
[I-D.ietf-bfd-multipoint] | Katz, D., Ward, D., Pallagatti, S. and G. Mirsky, "BFD for Multipoint Networks", Internet-Draft draft-ietf-bfd-multipoint-18, June 2018. |
[RFC2119] | Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997. |
[RFC5880] | Katz, D. and D. Ward, "Bidirectional Forwarding Detection (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010. |
[RFC7880] | Pignataro, C., Ward, D., Akiya, N., Bhatia, M. and S. Pallagatti, "Seamless Bidirectional Forwarding Detection (S-BFD)", RFC 7880, DOI 10.17487/RFC7880, July 2016. |
[RFC8174] | Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017. |