Internet Engineering Task Force                                J. Durand
Internet-Draft                                                     CISCO
Intended status: Standards Track                       February 22, 2015
Expires: August 26, 2015
Path validation toward BGP next-hop
draft-jdurand-idr-next-hop-liveliness-00.txt
This proposal introduces a new BGP attribute that BGP routers can use to advertise their capability to support any kind of host liveliness checking protocol (for example BFD). This attribute can be used to avoid the black-hole scenarios seen when the BGP next-hop is not the peer, in particular at Internet eXchange Points (IXPs) implementing BGP Route-Servers. IXP member routers can exchange their capability to implement a given host liveliness checking
A placeholder to list general observations about this document.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [1].
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 26, 2015.
Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Internet eXchange Points (IXPs) implement BGP Route-Servers (RS) [4] so that connected members do not need to configure BGP peerings with every other member to exchange routes. Through a single peering with the RS, a member will receive routes of all the other members peering with the RS. The RS redistributes routes and could simply be described as a Route-Reflector for eBGP peerings.
Usually, deployed RSs do not modify the BGP next-hop of exchanged routes, so traffic exchanged between IXP members does not pass through the RS, which keeps a control-plane role only, exactly like a BGP Route-Reflector (RR). The drawback is that the peerings between the members and the route-server may stay up while there is no connectivity between the members themselves. This results in black holes for members, with no easy troubleshooting: usually, when such a problem occurs, a member just shuts down its connectivity to the IXP. This situation has happened several times at many different IXPs, and many members do not want to peer with route-servers for this reason.
   eBGP UP----> RS <-------eBGP UP
   |            |                |
   |            |                |
----------------IXP LAN---------------------
   |  |                       |  |
   V  |                       |  V
Member 1 <================> Member 2
        BROKEN CONNECTIVITY
Figure 1
This proposal intends to solve this situation with a new BGP attribute that can be used by BGP routers to advertise their capability to support any kind of host liveliness checking protocols (for example BFD).
Solution involves 3 different mechanisms:
Host liveliness checking mechanisms have existed for years (BFD...) and are out of the scope of this proposal. This document focuses on how BGP routers can advertise their mutual liveliness checking mechanisms and on what actions to take depending on the actual liveliness.
The solution should be as simple as possible and should, if possible, avoid the creation of a new protocol. As the goal is to make sure that BGP next-hops can check their mutual liveliness, BGP itself can easily be adapted to announce the liveliness checking protocol capability.
The solution should be independent of the liveliness checking protocol. It should be possible to integrate future protocols without changing the main aspects of the solution.
BGP next-hops should not rely on any implementation on the IXP route-server to exchange their liveliness checking capability. In other words, the RS should be transparent to next-hops when they advertise their liveliness checking capabilities.
Connected members will announce their capability to implement host liveliness mechanisms (for example BFD) through a newly proposed BGP attribute called NH_REACHABLE_CAPABILITY. This attribute needs to be optional transitive so that it can be re-advertised by a route-server that may not support it. Upon reception of routes carrying this attribute, a member learns the capability of the advertising next-hop and may decide to start probing its reachability.
In the previous example, if members 1 and 2 support BFD, they would send their routes to the RS with the NH_REACHABLE_CAPABILITY attribute containing a TLV describing the BFD capability. The RS will redistribute the routes with the attribute untouched (as this is an optional transitive attribute). Upon reception of the routes, member 1 will know that the next-hop of member 2 is BFD-capable, and vice versa. They may start probing each other and detect when connectivity between them is broken. When that occurs, they will be able to decide what to do with the corresponding routes (withdraw, change local preference...).
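The member behaviour described above can be illustrated with a minimal sketch. It is not part of the proposal: the `Route` class, the function names, the `withdraw`/`depref` policy labels, and the replacement local preference are hypothetical, and the result of the liveliness probes (for example BFD session state) is assumed to be supplied by some other component.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    nh_bfd_capable: bool   # learned from NH_REACHABLE_CAPABILITY, if present
    local_pref: int = 100


def next_hops_to_probe(routes: Iterable[Route]) -> Set[str]:
    """Next-hops that advertised a liveliness capability and may be probed."""
    return {r.next_hop for r in routes if r.nh_bfd_capable}


def react_to_probes(routes: Iterable[Route],
                    probe_up: Dict[str, bool],
                    policy: str = "withdraw",
                    low_pref: int = 0) -> List[Route]:
    """Apply a local policy to routes whose probed next-hop is down.

    probe_up maps a next-hop address to the last probe result.
    Routes whose next-hop is not probed are left untouched.
    """
    if policy not in ("withdraw", "depref"):
        raise ValueError("unknown policy: " + policy)
    result = []
    for r in routes:
        down = r.nh_bfd_capable and probe_up.get(r.next_hop) is False
        if not down:
            result.append(r)
        elif policy == "depref":
            # Keep the route but make it less preferred.
            result.append(replace(r, local_pref=low_pref))
        # policy == "withdraw": the route is simply not kept.
    return result
```

With the `withdraw` policy a route is removed from the usable set as soon as its next-hop fails the probe; with `depref` it is kept as a last resort. Which action is appropriate is a local decision.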
The NH_REACHABLE_CAPABILITY attribute has the following format, in full conformance with the BGP specification [2]:
 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
+---------------+---------------+---------------+---------------+
|  Attr. Flags  |Attr. Type Code| Attr. Length  |               |
+---------------+---------------+---------------+               |
|                         List of TLVs                          |
.                                                               .
.                                                               .
Figure 2
The fields of the NH_REACHABLE_CAPABILITY attribute are described in the following sub-sections.
Attribute flags are following:
Attribute type code is to be provided by IANA.
Represents the total attribute length (in bytes) and is dependent on used TLVs.
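As an illustration, the three header octets shown in Figure 2 could be encoded as in the sketch below. Several points are assumptions rather than statements of this proposal: the type code 255 is a placeholder (the real value is to be assigned by IANA), the flag constants are the optional and transitive bits defined by BGP [2], the length is taken to count the octets of the TLV list as in BGP [2], and the extended-length form of BGP attributes is not handled.

```python
import struct

# Attribute flag bits as defined by BGP-4 (RFC 4271).
ATTR_FLAG_OPTIONAL = 0x80
ATTR_FLAG_TRANSITIVE = 0x40

# Placeholder only: the real type code is to be assigned by IANA.
PLACEHOLDER_TYPE_CODE = 255


def encode_attribute(tlvs: bytes, type_code: int = PLACEHOLDER_TYPE_CODE) -> bytes:
    """Prefix an already-encoded TLV list with flags, type code and length."""
    if len(tlvs) > 255:
        raise ValueError("TLV list too long for a one-octet length field")
    flags = ATTR_FLAG_OPTIONAL | ATTR_FLAG_TRANSITIVE
    return struct.pack("!BBB", flags, type_code, len(tlvs)) + tlvs


def decode_attribute(data: bytes):
    """Return (flags, type_code, tlv_bytes) from an encoded attribute."""
    flags, type_code, length = struct.unpack("!BBB", data[:3])
    body = data[3:3 + length]
    if len(body) != length:
        raise ValueError("attribute is truncated")
    return flags, type_code, body
```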
Each associated TLV indicates a host liveliness capability. A TLV data structure is used so that different protocols can be used to check BGP next-hop liveliness. For the time being, only the BFD TLV is envisaged and therefore described in this document. TLVs have the following format:
 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
+---------------+---------------+---------------+----------- - -
|     Type      |    Length     |     Value
+---------------+---------------+---------------+----------- - -
Figure 3
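A receiver has to walk the list of TLVs and skip the types it does not understand. The sketch below shows one way to do so. It assumes that the Length octet counts only the octets of the Value field, which this document does not state explicitly; the function names are hypothetical.

```python
from typing import Iterator, Tuple


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Encode one TLV with a one-octet Type and a one-octet Length."""
    if not 0 <= tlv_type <= 255:
        raise ValueError("Type must fit in one octet")
    if len(value) > 255:
        raise ValueError("Value too long for a one-octet Length")
    return bytes([tlv_type, len(value)]) + value


def iter_tlvs(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (type, value) pairs from a concatenated list of TLVs."""
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated TLV header")
        tlv_type, length = data[offset], data[offset + 1]
        end = offset + 2 + length
        if end > len(data):
            raise ValueError("truncated TLV value")
        yield tlv_type, data[offset + 2:end]
        offset = end
```

Because each TLV carries its own length, a router that only implements the BFD TLV can ignore TLVs of future liveliness protocols without failing to parse the attribute.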
The BFD TLV is used to indicate the BFD capability of the BGP router. Its Type is set to the numerical value 1. The Value field of the BFD TLV has the following format:
 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
+---------------+---------------+---------------+--------------+
|   BFD Flags   |             Next-Hop IP address              |
+---------------+                                              |
|                                                              |
.                                                              .
Figure 4
The high-order bit of the flag field is the BFD-Capable flag. It MUST be set to 1 in case the next-hop is BFD capable. It is set to 0 otherwise.
All the other bits are left unused.
Contains the IP address (IPv4 or IPv6) of the router that can be probed with BFD. It MUST be the IP address used to advertise the route (i.e. the BGP next-hop).
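The Value field of the BFD TLV can be built and parsed as in the following sketch. It assumes that the address family is inferred from the length of the Value field (1 flag octet plus 4 octets for IPv4 or 16 octets for IPv6), which this document does not specify; the function names are hypothetical.

```python
import ipaddress

BFD_TLV_TYPE = 1          # Type value given in this document
BFD_CAPABLE_FLAG = 0x80   # high-order bit of the BFD Flags octet


def encode_bfd_value(next_hop: str, capable: bool = True) -> bytes:
    """Build the Value field: one flags octet followed by the next-hop address."""
    addr = ipaddress.ip_address(next_hop)
    flags = BFD_CAPABLE_FLAG if capable else 0
    return bytes([flags]) + addr.packed


def decode_bfd_value(value: bytes):
    """Return (capable, address) parsed from a BFD TLV Value field."""
    if len(value) not in (5, 17):
        raise ValueError("unexpected BFD TLV value length")
    capable = bool(value[0] & BFD_CAPABLE_FLAG)
    addr = ipaddress.ip_address(value[1:])
    return capable, addr
```

The unused flag bits are ignored on reception and sent as zero in this sketch.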
This section details router operations with the aforementioned BGP attribute.
When a router wants to advertise that it supports a host liveliness protocol, it SHOULD attach the NH_REACHABLE_CAPABILITY attribute, with the appropriate TLVs, to the prefixes it advertises.
A router MUST NOT attach the NH_REACHABLE_CAPABILITY attribute if it is not announcing itself as the BGP next-hop. For example, BGP route-servers and BGP route-reflectors MUST NOT attach NH_REACHABLE_CAPABILITY to the routes they relay.
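The two rules above amount to a simple check before the attribute is attached. The following sketch is illustrative only; the function name and its parameters are hypothetical.

```python
import ipaddress
from typing import Iterable


def should_attach(next_hop: str,
                  local_addresses: Iterable[str],
                  liveliness_enabled: bool) -> bool:
    """Attach the attribute only when the router announces itself as next-hop."""
    nh = ipaddress.ip_address(next_hop)
    local = {ipaddress.ip_address(a) for a in local_addresses}
    return liveliness_enabled and nh in local
```

A route-server relaying a route keeps the next-hop of the originating member, so the check fails and the attribute already present on the route is passed on unchanged.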
A BGP router will most likely attach the attribute to all the prefixes it advertises. There is no apparent reason why some prefixes would be checked against router liveliness while others would not benefit from this mechanism. Nevertheless, the attribute structure imposes no protocol restriction: the attribute can be attached to only a subset of the advertised routes.
To limit the number of bytes sent in each BGP transaction, it is important that routes are grouped in BGP messages so that the attribute is transmitted once for all the impacted prefixes, as the BGP protocol [2] allows.
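Since a BGP UPDATE message carries a single set of path attributes for all the prefixes it announces, the grouping can be sketched as follows. The function name and the representation of a route as a (prefix, encoded attributes) pair are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_prefixes_by_attributes(
        routes: Iterable[Tuple[str, bytes]]) -> Dict[bytes, List[str]]:
    """Group prefixes sharing identical encoded path attributes.

    Each resulting group can be sent in one UPDATE, so the attribute
    bytes are transmitted once per group instead of once per prefix.
    """
    groups = defaultdict(list)
    for prefix, attributes in routes:
        groups[attributes].append(prefix)
    return dict(groups)
```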
As the attribute is optional transitive, it will be received by downstream BGP routers. Any router implementing NH_REACHABLE_CAPABILITY MUST perform the following actions, in the following order:
While the primary focus of the authors is to solve the issue met with BGP route-servers on IXPs described in Section 1, the proposed solution may also apply to the following use cases:
To avoid attaching the attribute to all prefixes and needlessly polluting downstream routers, a single "magic prefix" carrying this attribute could be sufficient to declare the host liveliness checking capability of the peer.
At first glance, the most relevant "magic prefix" would be the host address of the next-hop. A BGP router would announce its own next-hop address (/32 for IPv4 and /128 for IPv6) in addition to all its regular prefixes. Nevertheless, this approach goes against the filtering policies usually applied on IXPs [5] and cannot be selected here.
Another solution would be to reserve a new special-use address block and have a unique, well-known "magic prefix" across the Internet. This raises other problems, such as security, wasted address space, and modification of the BGP best path selection algorithm to interpret this well-known magic prefix differently.
At the time of writing this document such an optimization needs to be further studied.
The authors would like to thank the following people for their comments and support: [TBD].
A new BGP Attribute Type Code is requested from IANA for the new NH_REACHABLE_CAPABILITY attribute.
As the proposed attribute is optional transitive, it will be passed onward by all routers. There is no way to keep the attribute local to the IXP.
The attribute may contain the IP address of an advertising router (this is the case when the BFD TLV is used, for instance). Any downstream BGP router can therefore learn that the route has transited through that router and that the router is capable of supporting some host liveliness protocol. This may be used by an attacker aware of vulnerabilities in such a protocol.
[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[2] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, January 2006.
[3] Katz, D. and D. Ward, "Bidirectional Forwarding Detection (BFD)", RFC 5880, June 2010.
[4] "Internet Exchange Route Server".
[5] Durand, J., Pepelnjak, I., and G. Doering, "BGP Operations and Security", BCP 194, RFC 7454, February 2015.