Internet Engineering Task Force (IETF)                           E. Chen
Internet Draft                                              P. Mohapatra
Intended Status: Informational                             Cisco Systems
Expiration Date: March 18, 2013                       September 17, 2012
Reduction of BGP Alternate Paths from Inter-Exchange Points
draft-chen-bgp-path-reduction-00.txt
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
This Internet-Draft will expire on March 18, 2013.
Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
draft-chen-bgp-path-reduction-00.txt [Page 1]
Internet Draft draft-chen-bgp-path-reduction-00.txt Sept 17, 2012
Abstract
In this document we present a mechanism that enhances the "IGP-metric
based MED" approach so that load balancing is maintained while
limiting the number of BGP alternate paths carried in a network. The
mechanism involves the use of a "scale factor" to scale down the IGP
metrics for the purpose of setting MEDs, and is thus termed "scaled
IGP-metric based MED".
1. Introduction
The BGP sessions [RFC4271] between service providers are typically
established and maintained at multiple inter-exchange points for the
purpose of routing redundancy, load balancing, and traffic
localization. For a particular prefix (i.e., destination) there
would exist multiple routes with different nexthops (corresponding to
different peering points) in a network. Given the large number of
routes, and the number of inter-exchange points, the challenge is to
utilize these peering connections efficiently while maintaining
operational simplicity.
One common practice is to direct packets received on an ingress
router to the "closest" (in terms of the IGP metric to the nexthop)
inter-exchange points. Thus from the network point of view, traffic
destined to a particular prefix would be distributed among the inter-
exchange points from which the routes for the prefix are received.
The scheme is commonly known as "shortest-exit routing", or "hot-
potato routing".
As described in [RFC4271], the first few steps in BGP route selection
involve comparisons of the LOCAL_PREF value, the AS-PATH length, the
MED value, and the IGP metric to the nexthop.
Given that the AS-PATH length is typically network-topology dependent
and agnostic to the peering locations, a common implementation of the
"shortest-exit routing" is to set the LOCAL_PREF value and the MED
value to a constant value, respectively, for the routes received from
all these peering points, thus using the IGP metrics as the tie-
breaker in the BGP route selection. This scheme offers fast routing
convergence, consumes minimal network bandwidth for a particular
network, and requires little coordination and cooperation between
providers.
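The selection order described above can be sketched as follows. This
is a simplified illustration, not the full decision process of
[RFC4271]: the Path class, its field names, and the sample values are
assumptions made for the example, and the MED is compared here only
because all paths are taken to come from the same neighboring AS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """A candidate BGP path as seen by one ingress router (hypothetical model)."""
    exit_point: str
    local_pref: int
    as_path_len: int
    med: int
    igp_metric: int  # IGP metric from this router to the path's nexthop


def best_path(paths):
    """Pick the preferred path using only the first few comparison steps.

    Higher LOCAL_PREF wins, then shorter AS-PATH, then lower MED, then
    lower IGP metric to the nexthop. Later tie-breakers are omitted.
    """
    return min(
        paths,
        key=lambda p: (-p.local_pref, p.as_path_len, p.med, p.igp_metric),
    )


# With LOCAL_PREF and MED held constant, the IGP metric decides the exit.
candidates = [
    Path("exit-west", local_pref=100, as_path_len=2, med=0, igp_metric=40),
    Path("exit-east", local_pref=100, as_path_len=2, med=0, igp_metric=15),
]
print(best_path(candidates).exit_point)  # prints "exit-east"
```

Because every ingress router evaluates its own IGP metric, different
routers may choose different exits for the same prefix, which is what
spreads traffic across the peering points.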
However, the number of alternate paths carried in a network, in
particular on the route reflectors [RFC4456], grows linearly with the
number of peering locations. A large number of alternate paths from
the peering locations could become a scaling issue as described in
[NANOG46].
Clearly one alternative is for the service providers to use the IGP
metric as the MED in route advertisement, and accept the MEDs from
each other. This approach (hereafter referred to as "IGP-metric based
MED") is straightforward both technically and operationally. The
amount of coordination between the providers would also be minimal.
However, compared with the "shortest-exit routing", the "IGP-metric
based MED" approach has the drawbacks of slower routing re-
convergence as only the paths with the lowest MED are readily
available in the network. In addition, only one peering point may be
used for traffic to a given destination, which may potentially impact
load balancing across all peering locations.
In this document we present a mechanism that enhances the "IGP-metric
based MED" approach so that load balancing is maintained while
limiting the number of BGP alternate paths carried in a network. The
mechanism involves the use of a "scale factor" to scale down the IGP
metrics for the purpose of setting MEDs, and is thus termed "scaled
IGP-metric based MED".
2. Scaled IGP-Metric Based MED
The "Scaled IGP-Metric based MED" approach consists of the following
procedures:
o Conceptually divide the network and the inter-exchange points
with a particular provider into multiple topological regions.
There should usually be more than one inter-exchange point in
a region so that the traffic destined toward that region will
be load balanced across the inter-exchange points within that
region.
o For a route sourced (either internally or received from EBGP
peers) within a region, advertise it with an identical MED
across all the BGP sessions with that provider in the region;
and advertise it with less preferred MEDs across BGP sessions
with that provider in other regions.
Note that only the paths with more preferred MEDs are carried in the
network, and the external paths with less preferred MEDs would not be
further advertised by the peering routers to the internal peers.
The number of alternate paths (for a prefix) carried by the other
provider that accepts such MEDs will be controlled, roughly, by the
number of BGP sessions inside the region that sources the prefix.
Due to the presence of more than one path in the network, the fast
routing re-convergence will be maintained. In addition, multiple
peering locations will be used for traffic destined to the prefix.
Operationally this approach can be easily implemented by setting the
MED based on a scaled IGP metric (i.e., divide the IGP metric by a
scale factor). The scale factor can be set as one plus the maximum
IGP metric (or diameter) between the peering routers and other
routers within the region. Clearly the "scale factor" implementation
would work better in a network with differentiated IGP metric values
for the "inter-regional" links vs "intra-regional" links.
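As a minimal sketch of this computation, the following assumes that
the division discards the remainder (the text above does not state a
rounding rule). The function names and sample metrics are hypothetical.

```python
def scale_factor(intra_region_metrics):
    """One plus the maximum IGP metric between the peering routers and
    other routers within the region."""
    return 1 + max(intra_region_metrics)


def scaled_med(igp_metric, factor):
    """MED derived from the IGP metric; integer division is an assumption."""
    return igp_metric // factor


factor = scale_factor([0, 5, 10])  # region diameter of 10 gives 11
print(scaled_med(10, factor))      # source inside the region: 0
print(scaled_med(40, factor))      # source in another region: 3
```

Any metric inside the region is smaller than the scale factor, so a
route sourced in the region receives the same MED at every peering
router of that region, while routes sourced farther away receive
larger, less preferred MEDs.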
This approach can also be implemented by attaching "location
communities" for routes sourced from different locations, and then
setting the MED based on the communities.
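A sketch of the community-based variant follows. The community
values, region names, and MED values are hypothetical and chosen only
for illustration; this document does not define them. Treating a
route without a known location community as less preferred is also an
assumption.

```python
# Hypothetical location communities attached where a route is sourced.
REGION_OF_COMMUNITY = {
    "65000:101": "west",
    "65000:102": "east",
}

PREFERRED_MED = 0
LESS_PREFERRED_MED = 100


def med_for(route_communities, advertising_region):
    """Return the MED to set when advertising a route from a given region.

    Routes sourced in the advertising router's own region get the
    preferred MED; routes sourced elsewhere, or carrying no known
    location community, get the less preferred one.
    """
    for community in route_communities:
        if REGION_OF_COMMUNITY.get(community) == advertising_region:
            return PREFERRED_MED
    return LESS_PREFERRED_MED


print(med_for(["65000:101"], "west"))  # sourced locally: 0
print(med_for(["65000:101"], "east"))  # sourced in the other region: 100
```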
It is also noted that both the "shortest-exit routing" and the
"IGP-metric based MED" schemes can be considered special cases of this
"scaled IGP-metric based MED" scheme, with the scale factor being the
largest 32-bit unsigned integer and 1, respectively.
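The two special cases can be checked numerically. The sketch below
assumes integer division and IGP metrics smaller than the largest
32-bit unsigned integer; the function name and sample metrics are
hypothetical.

```python
MAX_U32 = 2**32 - 1


def scaled_med(igp_metric, factor):
    """Scaled MED, assuming the remainder of the division is discarded."""
    return igp_metric // factor


metrics = [0, 10, 40, 60]

# Largest scale factor: every MED collapses to the same constant, so the
# IGP metric to the nexthop breaks the tie ("shortest-exit routing").
print([scaled_med(m, MAX_U32) for m in metrics])  # [0, 0, 0, 0]

# Scale factor of 1: the MED equals the IGP metric ("IGP-metric based MED").
print([scaled_med(m, 1) for m in metrics])        # [0, 10, 40, 60]
```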
3. Example
In the following figure, A1, A2, A3, and A4 are the peering routers
in SP1; P1, P2, P3, and P4 are prefixes/routes sourced at A1, A2, A3,
and A4, respectively. They are advertised at all the peering points.
B1, B2, B3, and B4 are the peering routers in SP2; RR1, RR2, RR3, and
RR4 are route reflectors in different clusters in SP2.
The number above a horizontal line is the IGP metric assigned to that
link.
(P1)              (P2)                 (P3)             (P4)
 |                 |                    |                |
 |       10        |         30         |       20       |
 A1 -------------- A2 ----------------- A3 ------------- A4
 |                 |                    |                |
 |                 |                    |                |
 |                 |                    |                |
 B1 -------------- B2 ----------------- B3 ------------- B4
 |                 |                    |                |
 |                 |                    |                |
 RR1               RR2                  RR3              RR4
 |                 |                    |                |
To use the "scaled IGP-metric based MED" scheme, SP1 can conceptually
organize the networks into two regions, one with the inter-exchange
peerings of A1/B1 and A2/B2, and another with A3/B3 and A4/B4. The
scale factor for the first region would be 11, and the scale factor
for the second region would be 21.
As a result, the number of alternate paths would be 2 on the route
reflectors in SP2 for each of the prefixes/routes P1 - P4.
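The count above can be reproduced with a short calculation. The
sketch assumes that each SP1 peering router divides its IGP metric to
the sourcing router by the scale factor of its own region, discarding
the remainder, and that SP2 carries only the paths with the lowest
MED. The function names are hypothetical.

```python
# Linear topology A1 - A2 - A3 - A4 with the link metrics from the figure.
ROUTERS = ["A1", "A2", "A3", "A4"]
LINK_METRICS = [10, 30, 20]

# Scale factor used by each peering router, per the two regions above.
SCALE_FACTOR = {"A1": 11, "A2": 11, "A3": 21, "A4": 21}


def igp_metric(a, b):
    """IGP metric between two routers on the line (sum of link metrics)."""
    i, j = sorted((ROUTERS.index(a), ROUTERS.index(b)))
    return sum(LINK_METRICS[i:j])


def advertised_meds(source):
    """MED set by each SP1 peering router for a prefix sourced at `source`."""
    return {r: igp_metric(r, source) // SCALE_FACTOR[r] for r in ROUTERS}


def lowest_med_exits(source):
    """Peering routers whose paths carry the lowest MED."""
    meds = advertised_meds(source)
    best = min(meds.values())
    return [r for r in ROUTERS if meds[r] == best]


for prefix, source in zip(["P1", "P2", "P3", "P4"], ROUTERS):
    print(prefix, advertised_meds(source), lowest_med_exits(source))
```

Under these assumptions P1 and P2 are advertised with the lowest MED
at A1 and A2 only, and P3 and P4 at A3 and A4 only, so two paths per
prefix remain in SP2.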
4. IANA Considerations
This document requires no action from the IANA.
5. Security Considerations
This document does not introduce any new security issues.
6. Acknowledgments
We would like to thank Eric Rosen and Saikat Ray for their review and
suggestions.
7. References
7.1. Normative References
[RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
Border Gateway Protocol 4 (BGP-4)", RFC 4271, January
2006.
[RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route
Reflection: An Alternative to Full Mesh Internal BGP
(IBGP)", RFC 4456, April 2006.
7.2. Informative References
[NANOG46] McPherson, D., S. Amante, and L. Zhang, "BGP Scalability
Considerations - The Intra-domain BGP Scaling Problem",
NANOG-46, June 2009.
8. Authors' Addresses
Enke Chen
Cisco Systems, Inc.
Email: enkechen@cisco.com
Pradosh Mohapatra
Cisco Systems, Inc.
Email: pmohapat@cisco.com