Internet Engineering Task Force J. Durand
Internet-Draft CISCO
Intended status: Standards Track February 22, 2015
Expires: August 26, 2015

Path validation toward BGP next-hop
draft-jdurand-idr-next-hop-liveliness-00.txt

Abstract

This proposal introduces a new BGP attribute that BGP routers can use to advertise their capability to support any kind of host liveliness checking protocol (for example BFD). This attribute can be used to avoid the black-hole scenarios seen when the BGP next-hop is not the peer, in particular on Internet eXchange Points (IXPs) implementing BGP Route-Servers. IXP member routers can exchange their capability to implement a given host liveliness checking protocol.

Foreword

A placeholder to list general observations about this document.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [1].

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on August 26, 2015.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


1. Introduction

Internet eXchange Points (IXPs) implement BGP Route-Servers (RS) [4] so that connected members do not need to configure BGP peerings with every other member to exchange routes. Through a single peering with the RS, a member receives the routes of all the other members peering with the RS. The RS redistributes routes and could simply be described as a Route-Reflector (RR) for eBGP peerings.

Usually, deployed RSs do not modify the BGP next-hop of exchanged routes, so traffic exchanged between IXP members does not pass through the RS, which keeps only a control-plane role, exactly as a BGP RR does. The drawback is that the peerings between the members and the route-server may stay up while there is no connectivity between the members themselves. This results in black holes for members, with no easy troubleshooting: when facing such a problem, a member usually just shuts down its connectivity to the IXP. This situation has happened several times on many different IXPs, and many members do not want to peer with route-servers for this reason.

                    eBGP UP---->  RS <-------eBGP UP
                    |              |               |
                    |              |               |
               ----------------IXP LAN---------------------
                    |   |                       |  |
                    V   |                       |  V
                  Member 1 <================> Member 2
                                  BROKEN
                               CONNECTIVITY

Figure 1

This proposal intends to solve this situation with a new BGP attribute that BGP routers can use to advertise their capability to support any kind of host liveliness checking protocol (for example BFD).

2. Definitions and Acronyms

3. Solution Requirements

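The solution involves three different mechanisms: a host liveliness checking mechanism, a way for BGP routers to advertise their liveliness checking capabilities to each other, and the actions to take depending on the actual liveliness.

Host liveliness checking mechanisms have existed for years (BFD...) and are outside the scope of this proposal. This document focuses on how BGP routers can advertise their mutual liveliness checking mechanisms and what actions to take depending on the actual liveliness.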

The solution should be as simple as possible and, if possible, avoid the creation of a new protocol. As the goal is to make sure that BGP next-hops can check their mutual liveliness, it quickly appears that BGP can easily be adapted to announce the liveliness checking protocol capability.

The solution should be independent of the liveliness checking protocol. It should be possible to integrate future protocols without changing the main aspects of the solution.

BGP next-hops should not rely on any implementation on the IXP route-server to exchange their liveliness checking capability. In other words, the RS should be transparent for next-hops when they advertise their liveliness checking capabilities.

4. Solution Overview

Connected members will announce their capability to implement host liveliness mechanisms (for example BFD) through a new proposed BGP attribute called NH_REACHABLE_CAPABILITY. This attribute needs to be optional transitive so that it can be re-advertised by a route-server which may not support it. Upon reception of routes with this attribute, a given member learns the capability of the advertising next-hop and may decide to start probing its reachability.

In the previous example, if members 1 and 2 support BFD, they send their routes to the RS with the NH_REACHABLE_CAPABILITY attribute carrying a TLV describing the BFD capability. The RS redistributes the routes with the attribute untouched (as this is an optional transitive attribute). Upon reception of the routes, member 1 knows that the next-hop of member 2 is BFD-capable and vice-versa. They may start probing each other and detect when connectivity between them is broken. When that occurs, they are able to decide what to do with the corresponding routes (withdraw, change local preference...).

5. NH_REACHABLE_CAPABILITY attribute description

The NH_REACHABLE_CAPABILITY attribute has the following format, in full conformance with the BGP specification [2]:


     1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
    +---------------+---------------+---------------+---------------+
    | Attr. Flags   |Attr. Type Code| Attr. Length  |               |
    +---------------+---------------+---------------+               |
    |                         List of TLVs                          |
    .                                                               .  
    .                                                               .

            

Figure 2

The fields of the NH_REACHABLE_CAPABILITY attribute are described in the following sub-sections.

5.1. Attribute Flags

The attribute flags are as follows:

  • bit 0 - Optional bit: MUST be 1 as the NH_REACHABLE_CAPABILITY attribute is optional and may not be implemented on all BGP routers.
  • bit 1 - Transitive bit: MUST be 1 as the attribute has to pass at least through the BGP RS, which may not implement NH_REACHABLE_CAPABILITY attribute processing.
  • bit 2 - Partial bit. It MUST be 1 if the router attaching the NH_REACHABLE_CAPABILITY attribute is not the originator. It MUST be 0 if the router attaching the NH_REACHABLE_CAPABILITY attribute is the originator.
  • bit 3 - Extended Length bit: MUST be 0 as attribute length is 1 octet.
  • The lower-order four bits of the Attribute Flags octet are unused and MUST be 0.
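The flag rules above can be sketched in Python as follows. This is an illustration only: the constant and function names are hypothetical, and the sketch assumes the usual BGP convention that bit 0 is the high-order bit of the Attribute Flags octet.

```python
# Bit 0 is assumed to be the high-order bit of the Attribute Flags octet.
OPTIONAL = 0x80         # bit 0
TRANSITIVE = 0x40       # bit 1
PARTIAL = 0x20          # bit 2
EXTENDED_LENGTH = 0x10  # bit 3
UNUSED_MASK = 0x0F      # lower-order four bits


def build_flags(is_originator: bool) -> int:
    """Return the flags octet a sender would set for NH_REACHABLE_CAPABILITY."""
    flags = OPTIONAL | TRANSITIVE
    if not is_originator:
        # Partial bit is 1 when the attaching router is not the originator.
        flags |= PARTIAL
    return flags


def flags_are_valid(flags: int) -> bool:
    """Check a received flags octet against the constraints listed above."""
    if not 0 <= flags <= 0xFF:
        return False
    if not (flags & OPTIONAL and flags & TRANSITIVE):
        return False  # Optional and Transitive bits MUST both be 1
    if flags & EXTENDED_LENGTH:
        return False  # Extended Length bit MUST be 0
    return (flags & UNUSED_MASK) == 0
```

With these definitions, an originator would send 0xC0 and a non-originator 0xE0; the Partial bit is deliberately not checked on reception because both values are legal.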

5.2. Attribute Type Code

Attribute type code is to be provided by IANA.

5.3. Attribute Length

Represents the total attribute length (in bytes) and depends on the TLVs used.
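As an illustration only, the attribute header of Figure 2 could be assembled as in the following Python sketch. The type code 255 is a placeholder chosen for this example, since the real value is to be provided by IANA. The sketch also assumes that Attribute Length counts the octets of the TLV list that follows the length field. The function names are hypothetical.

```python
from typing import Tuple

# Placeholder only: the real Attribute Type Code is to be provided by IANA.
PLACEHOLDER_TYPE_CODE = 255


def encode_attribute(tlvs: bytes, is_originator: bool = True,
                     type_code: int = PLACEHOLDER_TYPE_CODE) -> bytes:
    """Prepend flags, type code and a 1-octet length to an encoded TLV list."""
    if len(tlvs) > 0xFF:
        # Extended Length is not used, so the length must fit in one octet.
        raise ValueError("TLV list does not fit in a 1-octet Attribute Length")
    flags = 0xC0 if is_originator else 0xE0  # Optional + Transitive (+ Partial)
    return bytes([flags, type_code, len(tlvs)]) + tlvs


def decode_attribute(data: bytes) -> Tuple[int, int, bytes]:
    """Split an encoded attribute into (flags, type code, TLV list)."""
    if len(data) < 3:
        raise ValueError("truncated attribute header")
    flags, type_code, length = data[0], data[1], data[2]
    if len(data) != 3 + length:
        raise ValueError("Attribute Length does not match the TLV list size")
    return flags, type_code, data[3:]
```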

5.4. TLV Definition

Each associated TLV indicates a host liveliness capability. A TLV data structure is used so that different protocols can be used to check BGP next-hop liveliness. For the time being, only a BFD TLV is envisaged and therefore described in this document. TLVs have the following format:


     1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
    +---------------+---------------+---------------+----------- - -
    |     Type      |     Length    |               Value           
    +---------------+---------------+---------------+----------- - -

            	

Figure 3
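The TLV layout of Figure 3 can be illustrated with the following Python sketch. It assumes that the Length octet counts only the octets of the Value field, which this document does not state explicitly. The function names are hypothetical.

```python
from typing import Iterator, Tuple


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Encode one TLV: 1-octet Type, 1-octet Length, then the Value."""
    if not 0 <= tlv_type <= 0xFF:
        raise ValueError("Type must fit in one octet")
    if len(value) > 0xFF:
        raise ValueError("Value too long for a 1-octet Length")
    return bytes([tlv_type, len(value)]) + value


def parse_tlvs(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (type, value) pairs from a concatenated list of TLVs."""
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated TLV header")
        tlv_type, length = data[offset], data[offset + 1]
        end = offset + 2 + length
        if end > len(data):
            raise ValueError("truncated TLV value")
        yield tlv_type, data[offset + 2:end]
        offset = end
```

A receiver written this way can skip TLV types it does not know, which is what allows future liveliness checking protocols to be added without changing the attribute itself.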

5.4.1. BFD TLV

The BFD TLV is used to indicate the BFD capability of the BGP router. Its Type is set to the numerical value 1. The Value field of the BFD TLV has the following format:


     1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
    +---------------+---------------+---------------+--------------+
    |   BFD Flags   |             Next-Hop IP address              |
    +---------------+                                              |
    |                                                              |
    .                                                              .

              		

Figure 4

5.4.1.1. BFD Flags

The high-order bit of the flag field is the BFD-Capable flag. It MUST be set to 1 in case the next-hop is BFD capable. It is set to 0 otherwise.

All the other bits are left unused.

5.4.1.2. Next-Hop IP address

Contains the IP address (IPv4 or IPv6) of the router that can be probed with BFD. It MUST be the IP address used to advertise the route (i.e. the BGP next-hop).
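A possible encoding of the BFD TLV Value field is sketched below in Python, using the standard ipaddress module. The sketch assumes that the address family is inferred from the length of the Value field (5 octets for IPv4, 17 octets for IPv6), which this document does not specify. The function names are hypothetical.

```python
import ipaddress
from typing import Tuple, Union

BFD_CAPABLE = 0x80  # high-order bit of the BFD Flags octet

IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def encode_bfd_value(next_hop: str, bfd_capable: bool = True) -> bytes:
    """Build the Value field: one BFD Flags octet, then the next-hop address."""
    address = ipaddress.ip_address(next_hop)
    flags = BFD_CAPABLE if bfd_capable else 0x00
    return bytes([flags]) + address.packed


def decode_bfd_value(value: bytes) -> Tuple[bool, IPAddress]:
    """Return (BFD-capable flag, next-hop address) from a Value field."""
    if len(value) not in (5, 17):
        # Assumption: 1 flags octet + 4 (IPv4) or 16 (IPv6) address octets.
        raise ValueError("unexpected BFD TLV Value length")
    capable = bool(value[0] & BFD_CAPABLE)
    return capable, ipaddress.ip_address(value[1:])
```

For example, encode_bfd_value("192.0.2.1") yields the five octets 80 c0 00 02 01 (hexadecimal).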

6. Protocol description

This section details router operations with the aforementioned BGP attribute.

6.1. Generating and sending the attribute

When a router wants to advertise that it supports a host liveliness protocol, it SHOULD attach the NH_REACHABLE_CAPABILITY attribute with the appropriate TLVs to the prefixes it advertises.

A router MUST NOT attach the NH_REACHABLE_CAPABILITY attribute if it is not announcing itself as the BGP next-hop. For example, BGP route-servers and BGP route-reflectors MUST NOT attach NH_REACHABLE_CAPABILITY to routes they relay.

A BGP router will most likely attach the attribute to all the prefixes it advertises. There is no apparent reason why some prefixes would be checked against router liveliness while others would not benefit from this mechanism. However, the attribute structure places no protocol restriction on attaching the attribute to only a subset of the advertised routes.

To limit the number of bytes sent in each BGP transaction, it is important that routes are grouped in BGP messages so that the attribute is transmitted once for all the impacted prefixes, as the BGP protocol [2] allows.
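The grouping recommended above can be pictured with the following Python sketch, in which prefixes sharing the same encoded attribute are collected so that the attribute is sent once per group. The data model (prefix strings paired with encoded attribute octets) and the function name are assumptions made for this illustration. A real BGP implementation groups on the complete set of path attributes.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_attribute(
    routes: Iterable[Tuple[str, bytes]]
) -> Dict[bytes, List[str]]:
    """Map each distinct encoded attribute to the prefixes that carry it."""
    groups: Dict[bytes, List[str]] = defaultdict(list)
    for prefix, attribute in routes:
        groups[attribute].append(prefix)
    return dict(groups)
```

Each key of the returned dictionary then corresponds to one message carrying the attribute once, followed by all the prefixes in the associated list.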

6.2. Upon reception of the NH_REACHABLE_CAPABILITY BGP attribute

As the attribute is optional transitive, it will be received by downstream BGP routers. Any router implementing NH_REACHABLE_CAPABILITY MUST perform the following actions in the following order:

  • If the router is a BGP route-server or a BGP route-reflector, it MUST NOT process or change this attribute.
  • If the router is not a BGP RR or RS, and has no BGP route whose BGP next-hop corresponds to the address embedded in NH_REACHABLE_CAPABILITY, then it MUST remove the attribute from subsequent advertisements to avoid useless downstream propagation of this attribute.
  • If the router is not a BGP RR or RS, and has at least one BGP route whose BGP next-hop corresponds to the address embedded in NH_REACHABLE_CAPABILITY, then:
    • It MAY start the advertised host liveliness checking mechanisms. The choice of parameters for the mechanism is out of the scope of this document. For example, when a BFD TLV is received, the routers negotiate these parameters with BFD control packets as described in [3].
    • It SHOULD NOT run more than one host liveliness checking mechanism with a given next-hop. If multiple routes are received with the same NH_REACHABLE_CAPABILITY, a single host liveliness checking "session" is sufficient to validate the reachability of the BGP next-hop.
    • It MAY take any action for a received route based on the host liveliness reported by that mechanism. It is important to understand that, while an ordinary BGP session is shut down when the remote peer is detected as dead, the action here has to occur at the route level as there is no BGP peering with the probed router. For example, the router MAY withdraw the route, change its local preference, add a NO_EXPORT community...
    • It MUST remove the attribute from subsequent advertisements to avoid useless propagation of this attribute.
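The ordered steps above can be sketched in Python as follows. This is a simplified model and not a BGP implementation: the class and method names are hypothetical, next-hops are compared as plain strings, and the sketch chooses to always start probing even though this is only a MAY. A set of next-hops is used so that at most one liveliness "session" exists per next-hop.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Dict, List, Set


class Outcome(Enum):
    RELAY_UNTOUCHED = auto()  # RS or RR: attribute is neither processed nor changed
    STRIP_ONLY = auto()       # no route uses the embedded next-hop: remove it
    PROBE_AND_STRIP = auto()  # start liveliness checking, then remove it


@dataclass
class Receiver:
    is_rs_or_rr: bool = False
    routes: Dict[str, str] = field(default_factory=dict)  # prefix -> BGP next-hop
    sessions: Set[str] = field(default_factory=set)       # next-hops being probed

    def on_attribute(self, embedded_next_hop: str) -> Outcome:
        """Apply the reception rules, in order, to one received attribute."""
        if self.is_rs_or_rr:
            return Outcome.RELAY_UNTOUCHED
        if embedded_next_hop not in self.routes.values():
            return Outcome.STRIP_ONLY
        # Adding to a set keeps a single session per next-hop.
        self.sessions.add(embedded_next_hop)
        return Outcome.PROBE_AND_STRIP

    def on_next_hop_down(self, next_hop: str) -> List[str]:
        """Return the prefixes on which a route-level action may be taken."""
        return sorted(p for p, nh in self.routes.items() if nh == next_hop)
```

The prefixes returned by on_next_hop_down are the ones a router could withdraw or de-prefer. The action is taken per route because there is no BGP session with the probed next-hop to tear down.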

7. Possible other use cases

While the primary focus of the authors is to solve the issue met with BGP route-servers on IXPs described in Section 1, the proposed solution may also apply to the following use cases:

  • iBGP route-reflector: the scenario described for a BGP route-server could also apply to an iBGP route-reflector. The solution described in this draft could be used to validate received iBGP routes against the real reachability of the BGP next-hop (a router in the same AS if next-hop-self is used, or the eBGP next-hop announcing the route).
  • Any eBGP peering: the proposed solution would enable auto-deployment of host liveliness protocols on every eBGP peering. Peers would just exchange their BGP parameters, and the host liveliness protocol would automatically "harden" the peering without the need for any additional configuration.

8. Future Work

8.1. Possible Optimization

To avoid attaching the attribute to all prefixes and uselessly polluting downstream routers, a single "magic prefix" carrying this attribute could be sufficient to declare the host liveliness checking capability of the peer.

At first glance, the most relevant "magic prefix" would be the host address of the next-hop. A BGP router would announce its own next-hop address (/32 for IPv4 and /128 for IPv6) in addition to all its other regular prefixes. Nevertheless, this approach goes against the filtering policies usually applied on IXPs [5] and cannot be selected here.

Another solution would be to reserve a new special-use address block and have a unique well-known "magic prefix" across the Internet. This raises other problems such as security, wasted address space, and the need to modify the BGP best path selection algorithm so that it interprets this well-known magic prefix differently.

At the time of writing this document, such an optimization needs to be studied further.

9. Acknowledgements

The authors would like to thank the following people for their comments and support: [TBD].

10. IANA Considerations

A new BGP Attribute Type Code is requested from IANA for the new NH_REACHABLE_CAPABILITY attribute.

11. Security Considerations

As the proposed attribute is transitive optional, it will be passed onward by all routers. There is no way to keep the attribute local to the IXP.

The attribute may contain the IP address of an advertising router (this is the case if the BFD TLV is used, for instance). Any downstream BGP router can then learn that the route has transited through that router and that the router is capable of supporting some host liveliness protocol. An attacker aware of vulnerabilities in such a protocol may use this information.

12. References

12.1. Normative References

[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[2] Rekhter, Y., Li, T. and S. Hares, "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, January 2006.
[3] Katz, D. and D. Ward, "Bidirectional Forwarding Detection (BFD)", RFC 5880, June 2010.
[4] "Internet Exchange Route Server".

12.2. Informative References

[5] Durand, J., Pepelnjak, I. and G. Doering, "BGP Operations and Security", BCP 194, RFC 7454, February 2015.

Author's Address

Jerome Durand
CISCO Systems, Inc.
11 rue Camille Desmoulins
Issy-les-Moulineaux, 92782 CEDEX
FR

EMail: jerduran@cisco.com