Internet DRAFT - draft-bgp-path-marking
draft-bgp-path-marking
Network Working Group C. Camilo Cardona
Internet-Draft P. Pierre Francois
Intended status: Standards Track IMDEA Networks
Expires: January 12, 2014 S. Ray
K. Patel
P. Paolo Lucente
Cisco Systems
P. Mohapatra
Cumulus Networks
July 11, 2013
BGP Path Marking
draft-bgp-path-marking-00
Abstract
The potential advertisement of non-best paths by a BGP speaker
supporting the add-path or the best-external extensions makes it
difficult for other BGP speakers to identify the paths that have been
selected as best by those who advertise them. This information is
required for proper operation of some applications. Towards that
end, this document proposes marking the paths using extended
communities that encode the path type.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 12, 2014.
Copyright Notice
Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.
Camilo Cardona, et al. Expires January 12, 2014 [Page 1]
Internet-Draft Path-Marking July 2013
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November
10, 2008. The person(s) controlling the copyright in some of this
material may not have granted the IETF Trust the right to allow
modifications of such material outside the IETF Standards Process.
Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other
than English.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. The BGP Path Type Community . . . . . . . . . . . . . . . . . 4
3. Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
4. Operational Considerations . . . . . . . . . . . . . . . . . 6
5. Applications . . . . . . . . . . . . . . . . . . . . . . . . 7
5.1. Avoiding suboptimal routing in Inter-AS VPN . . . . . . . 7
5.2. Monitoring applications . . . . . . . . . . . . . . . . . 9
5.3. SDN applications . . . . . . . . . . . . . . . . . . . . 9
5.4. Selective Best-path . . . . . . . . . . . . . . . . . . . 10
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10
7. Security Considerations . . . . . . . . . . . . . . . . . . . 10
8. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 10
9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 10
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 11
10.1. Normative References . . . . . . . . . . . . . . . . . . 11
10.2. Informative References . . . . . . . . . . . . . . . . . 11
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 11
1. Introduction
When there are multiple paths for a given address prefix, BGP chooses
one of the paths as the "best-path" according to the best-path
selection rules prescribed in [RFC4271] and installs the best-path in
its forwarding table. Classically, each BGP speaker advertises only
Camilo Cardona, et al. Expires January 12, 2014 [Page 2]
Internet-Draft Path-Marking July 2013
the best-path to its peers. So when a BGP speaker receives a path
from one of its peers, it is assured that the path is used by the
peer for forwarding and all other peers have received the same path
from this peer. This leads to consistent routing in a BGP network.
The classical advertisement rule of sending only the best-path does
not convey the full routing state of a destination present on a BGP
speaker to its peers.
o In order to improve link bandwidth utilization, most BGP
implementations choose additional paths, that satisfy certain
conditions, as "multi-path", and install them in the forwarding
table. Incoming packets for that destination are load-balanced
across the best-path and the multi-path(s). I.e., there may be
paths installed in the forwarding table that are not advertised to
the peers.
o When an Autonomous System (AS) deploys a route-reflector
([RFC4456]) instead of using full IBGP mesh, the BGP speakers
receive only the route reflector's best-path and therefore lose
information about the best-paths of other IBGP peers.
o If an IBGP path is chosen as the best-path by a non-route-
reflector BGP speaker, then the best-path is not sent to its IBGP
peers. Thus the IBGP peers learn nothing from this BGP speaker
even though it might have other EBGP paths for that destination.
o Even when a BGP speaker selects an EBGP path as the best-path and
advertises it to its peers, it may have additional EBGP paths for
the destination. Should those paths be advertised a priori, they
could be used by the peers in the event of loss of reachability of
the best-path resulting in faster convergence.
There are extensions to the classical BGP advertisement rule to
provide additional information about the routing state of a
destination. A BGP speaker supporting the best-external
[I-D.ietf-idr-best-external] extension sends its best external path
to its IBGP peers when the best-path is an IBGP path. A BGP speaker
supporting the add-path [I-D.ietf-idr-add-paths] extension advertises
multiple paths for a given address prefix.
With best-external or add-path extensions in use, when a BGP speaker
receives a path from a peer, that path may not be the best-path, or
it may not be installed in the peer's forwarding table. In some
scenarios, knowledge of the path type - i.e., whether the path is the
best-path, or whether the path is installed in the forwarding table -
is essential.
Camilo Cardona, et al. Expires January 12, 2014 [Page 3]
Internet-Draft Path-Marking July 2013
For instance, in a typical dual-homed VPN in primary-backup
configuration, the backup path is created by advertising the best-
external path from the backup PE with worse LOCAL_PREF. However,
when the customer adds a site in another AS, the LOCAL_PREF
information does not reach that site. As a result, data traffic
coming from that site may incorrectly be forwarded over the backup
link instead of the primary link.
Similarly when an add-path enabled peer receives multiple paths from
a peer, it does not know which one among those paths is the best-path
and which ones are installed in the forwarding table. An exogenous
monitoring system, e.g., would require that information to properly
tweak the policies on the router to effect desired forwarding
optimization.
This draft proposes marking the advertised paths by an extended
community, called Path Type community, that encodes the path type.
The path type provides the necessary information to the BGP speakers
about how the path is used by the sender when add-path or best-
external extensions are in use.
2. The BGP Path Type Community
The BGP Path Type Community is an IPv4 Address Extended Community
([RFC4360]) defined as follows:
Type Field:
The value of the high-order octet of the extended Type Field is
0x01, which indicates that it is transitive. The value of low-
order octet of the extended type field for this community is TBD.
Value Field:
The Value field contains two sub-fields, described below:
+---------------------+
| Router-ID (4 octet) |
+---------------------+
| Path type (2 octet) |
+---------------------+
The Router-ID field contains the BGP identifier of the BGP speaker
that adds the Path Type community to a path.
Camilo Cardona, et al. Expires January 12, 2014 [Page 4]
Internet-Draft Path-Marking July 2013
The Path type field contains a bitfield where each bit encodes a
specific role of the path. Multiple bits may be set when a path is
used in multiple roles.
+--------+--------------------+
| Value | Path type |
+--------+--------------------+
| 0x0000 | Unknown |
| 0x0001 | Best-path |
| 0x0002 | Best-external path |
| 0x0004 | Multi-path |
| 0x0008 | Backup path |
| 0x0010 | Uninstalled path |
| 0x0020 | Unreachable path |
+--------+--------------------+
Table 1: Path Type Values
The best-path is defined in [RFC4271] and the best-external path is
defined in [I-D.ietf-idr-best-external].
A multi-path is not the best-path but installed in the forwarding
table and used for forwarding packets. We use the convention that
the best-path is not considered a multi-path.
A backup path is installed in the forwarding table, but it is not
used for forwarding until all multipath(s) and the best-path become
unreachable. Backup paths are used for fast convergence in the event
of failures.
All other reachable paths are marked as 'Uninstalled'.
Lastly, all paths that are considered unreachable are marked as
'Unreachable'. Unreachable paths may be sent only in special cases
(such as to a monitoring application).
3. Rules
o A BGP speaker MAY add the Path Type community to an originated
path.
o When a BGP speaker receives a path from a peer and propagates it
without changing the NEXT_HOP to self:
* If the path contained a Path Type community, it MUST be
retained in the propagated path.
Camilo Cardona, et al. Expires January 12, 2014 [Page 5]
Internet-Draft Path-Marking July 2013
* If the path did not contain a Path Type community, the speaker
MAY add a Path Type community with 'Unknown' value.
o When a path received from a peer is propagated after changing the
NEXT_HOP to self:
* If the path did not contain a Path Type community, the Path
Type community indicating the path role MAY be added.
* If the path contained a Path Type community:
+ If data traffic entering the router for the given
destination may be forwarded over other paths (e.g., for
doing load balancing), then the existing Path Type community
MUST be removed. The BGP speaker MAY add its own Path Type
community.
+ If data traffic entering the router for the given
destination is forwarded only along the given path, then the
existing Path Type community MAY be retained.
In all cases, when a BGP speaker adds its own Path Type community, it
sets its own router-id in the community. Note that BGP router-id
need not be unique across ASes.
The above rule-set prevents a route reflector from modifying the Path
Type community set by its client (unless the route reflector is
changing the NEXT_HOP to self).
When a peer is capable of sending only one path for a given address
prefix and it sends the path without any Path Type community, the
path MAY be considered as the best-path of the peer. In all other
cases, a path without any Path Type community SHOULD be considered to
have an 'Unknown' Path type.
A local policy might modify the above rules. For instance, if a
monitoring application peers with a BGP speaker with add-path
capability for the sole purpose of learning its paths and their
types, then the speaker may always add its own Path Type community
when it advertises the paths to that peer even if it does not change
the NEXT_HOP to self. Such overriding policies should be used with
caution if the advertised paths may impact forwarding decisions in
the network.
4. Operational Considerations
If a speaker receives a path with a Path Type community with an
invalid combination of bits (e.g., both 'Multi-path' and 'Backup'
Camilo Cardona, et al. Expires January 12, 2014 [Page 6]
Internet-Draft Path-Marking July 2013
bits are set), the path MUST NOT be considered invalid. Such error
cases SHOULD be logged through other means.
An implementation SHOULD provide a configurable option for the user
to indicate whether a path should be readvertised when its type is
changed. If the user does not configure the option, the BGP speaker
MUST NOT readvertise a path just to update its Path Type community
(e.g., when a path type changes from 'Multi-path' to 'Uninstalled'
due to a change in IGP metric).
An implementation SHOULD provide a configurable option for removing
Path Type communities from paths that are advertised to untrusted
peers.
An implementation SHOULD mark all paths for a given address prefix
consistently. If one of the paths is marked, then all other paths
SHOULD be marked.
An implementation MAY modify its best-path selection algorithm to
take path type information into account. For instance, paths with
type 'Best-path' MAY be preferred over paths of other types.
Similarly, paths of type 'Best-external' MAY be considered ineligible
for being a multipath.
5. Applications
In this section, we illustrate some applications that benefit from
the Path Type community proposed in this draft.
5.1. Avoiding suboptimal routing in Inter-AS VPN
(RD1)A/B +---+ +---+
LP=200 |RR1| |RR2|
+---+ ,,-+---+-.. _.-+---+-._
,|PE1|' `. / \ (RD3)A/B
+---+,' +---+ +---+ +---+ +---+ -> PE1 (LP=100)
A/B |CE1|. | AS1 |AR1|---|AR2| AS2 |PE3| -> PE2 (LP=100)
+---+ \ +---+ +---+ +---+ +---+
>|PE2|._ _,' `. ,'
+---+ `-....,-' `--...--'
(RD2)A/B
LP=150
Figure 1: Inter-AS VPN scenario
Figure 1 depicts an L3VPN network that spans two ASes: AS1 and AS2.
The ASes may be connected using either Option-B or Option-C
Camilo Cardona, et al. Expires January 12, 2014 [Page 7]
Internet-Draft Path-Marking July 2013
techniques [RFC4364]. A customer site with equipment CE1 is dual-
homed in AS1, connected to PE1 and PE2. For prefix A/B, the customer
prefers to use the link between CE1 and PE1. This routing preference
is expressed by setting the LOCAL_PREF of the prefix advertised by
PE1 to a higher value than that of the prefix advertised by PE2.
This causes PE2 to use PE1's route as the best-path and its own EBGP
path becomes the best-external path. PE2 is configured to advertise
its best-external path. Therefore, both PEs continue to advertise
their own EBGP path. The provider uses unique route-distinguishers
for its VPNs. So PE1 and PE2 advertises different VPN prefixes:
(RD1)A/B and (RD2)A/B. Both these prefixes are advertised to PE3 in
AS2. PE3 imports both paths to its own VPN with route-distinguisher
RD3.
Existing behavior:
Since LOCAL_PREF is not sent across AS boundary, both paths on
PE3 have the default LOCAL_PREF of 100. As a result the best-
path selection on PE3 may boil down to tie breaking steps and
the path towards PE2, which is the best-external path, may be
chosen. Alternately, the path from PE2 may be chosen as the
multipath and may be used for load-balancing. Therefore, some
or all data traffic entering PE3 would reach CE1 via PE2, which
is not what the customer desired.
Behavior with Path Type Community:
When PE2 advertises its path, it adds the best-external Path
Type community. This community is preserved across AS
boundary. If option C is used, then RR1 or RR2 does not change
the NEXT_HOP and hence the community is preserved according to
the rule-set (Section 3). If option B is used, then the
community reaches AR1 since RR1 does not change the NEXT_HOP.
At AR1, (RD2)A/B has only one path and forwarding traffic
entering AR1 from AR2 for this destination (determined by the
outer label) would use this path. Therefore, AR1 retains the
Path Type community set by PE2. The same applies to AR2. So
at PE3, the path to PE2 has the best-external Path Type
community and therefore PE3 can choose to not use this path for
forwarding.
If the best-path algorithm takes the Path Type community values into
account, it eliminates the need for setting LOCAL_PREF to deprefer
the bext-external path even within a single AS. This simplifies the
network design and management.
Instead of using Path Type communities, it is possible to use
policies on the border routers (AR1 and AR2 for option B, or RR1 and
Camilo Cardona, et al. Expires January 12, 2014 [Page 8]
Internet-Draft Path-Marking July 2013
RR2 for option C) to recreate the LOCAL_PREF in AS2 (e.g., by
matching on the RD and the prefix). However, the recreated
LOCAL_PREF may interfere with the local policies set in AS2 (e.g., if
there are other paths in AS2 for A/B that the customer wants to use
as secondary paths). In addition, such policies are error-prone and
complex to manage, especially when the customer is allowed to change
the primary/backup relationships between PE1 and PE2 on its own. The
standardized mechanism of Path Type community is free from such
drawbacks.
5.2. Monitoring applications
A modern Service Provider (SP) network may contain thousands of BGP
routers. For planning, proper engineering and operation of a
backbone, it is a good practice to continuously monitor the routers'
states and perhaps keep a history. Many Network Management Systems
(NMS) establish IBGP sessions with BGP speakers to collect the paths
the speaker has. When the speaker supports add-path (or best-
external), the NMS receives non-best-paths. There are also
monitoring protocols such as BMP [I-D.ietf-grow-bmp] that similarly
receives all paths from a speaker.
When an NMS receives multiple paths for a destination, it is
important for its operation to know which path is the best-path,
which paths are installed in forwarding table, which path is used as
a backup, etc. The NMS system may run the best-path algorithm on
those paths on its own. However, its information, especially on IGP
metric, local policies, etc., may be incomplete and hence its own
calculations may not match that of the router's. It is also noted
that even if the NMS system collected additional information to run
the best-path algorithm from the point-of-view of the router, it
would have to do so for every router in the network, which would
impose a very high computational burden on the NMS.
When Path Type community is in use, the router provides the required
information directly, thus avoiding computational load on the NMS as
well as potential discrepancies between the point-of-view of the
router and that of the NMS.
5.3. SDN applications
Similar to the monitoring applications, a "Software Defined
Networking" application monitors the routing state and based on it,
may change the policies on the router, or inject additional paths, to
influence the forwarding. When a BGP speaker supports Path Type
communities and add-path, an SDN application can simply peer with the
router to receive its routing state in real-time even if the router
does not provide vendor-specific APIs for doing the same.
Camilo Cardona, et al. Expires January 12, 2014 [Page 9]
Internet-Draft Path-Marking July 2013
5.4. Selective Best-path
When the classical BGP advertisement rule is followed, all paths a
BGP speaker considers for best-path are already installed in the
forwarding table of the peer. However, when add-path, or best-
external extensions are used, that no longer holds. If the BGP
speakers support the Path Type communities, then the classical
behavior can be reinstated by considering only those paths in the
best-path algorithm that are marked as best-path or multi-path.
Detailed discussions on the rules and benefits of such an approach
are outside the scope of this draft.
6. IANA Considerations
Section 2 defines an IPv4 Address specific transitive extended
community called the Path Type extended community. IANA is requested
to assign a sub-type value for the Path Type extended community. The
last 2 bytes of the value field of the Path Type extended community
contains a bitfield that encodes the type of the advertised path.
IANA is expected to maintain a registry for these bits. Section 2
defines 6 of those bits. The rest of the bits are to be assigned by
IANA using the "IETF Consensus" policy defined in [RFC2434].
7. Security Considerations
This document introduces no new security concerns to BGP or other
specifications referenced in this document.
8. Contributors
Adam Simpson
Alcatel-Lucent
600 March Road
Ottawa, Ontario K2K 2E6
Canada
Email: adam.simpson@alcatel-lucent.com
Roberto Fragassi
Alcatel-Lucent
600 Mountain Avenue
Murray Hill, New Jersey
USA
Email: roberto.fragassi@alcatel-lucent.com
9. Acknowledgments
We would like to thank Bruno Decraene for his feedback on this work.
Camilo Cardona, et al. Expires January 12, 2014 [Page 10]
Internet-Draft Path-Marking July 2013
10. References
10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an
IANA Considerations Section in RFCs", BCP 26, RFC 2434,
October 1998.
[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
Networks (VPNs)", RFC 4364, February 2006.
10.2. Informative References
[I-D.ietf-grow-bmp]
Scudder, J., Fernando, R., and S. Stuart, "BGP Monitoring
Protocol", draft-ietf-grow-bmp-07 (work in progress),
October 2012.
[I-D.ietf-idr-add-paths]
Walton, D., Retana, A., Chen, E., and J. Scudder,
"Advertisement of Multiple Paths in BGP", draft-ietf-idr-
add-paths-08 (work in progress), December 2012.
[I-D.ietf-idr-best-external]
Marques, P., Fernando, R., Chen, E., Mohapatra, P., and H.
Gredler, "Advertisement of the best external route in
BGP", draft-ietf-idr-best-external-05 (work in progress),
January 2012.
[RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
Protocol 4 (BGP-4)", RFC 4271, January 2006.
[RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
Communities Attribute", RFC 4360, February 2006.
[RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route
Reflection: An Alternative to Full Mesh Internal BGP
(IBGP)", RFC 4456, April 2006.
Authors' Addresses
Camilo Cardona, et al. Expires January 12, 2014 [Page 11]
Internet-Draft Path-Marking July 2013
Camilo Cardona
IMDEA Networks
Avenida del Mar Mediterraneo
Leganes 28919
Spain
Email: juancamilo.cardona@imdea.org
Pierre Francois
IMDEA Networks
Avenida del Mar Mediterraneo
Leganes 28919
Spain
Email: pierre.francois@imdea.org
Saikat Ray
Cisco Systems
170 W. Tasman Drive
San Jose, CA 95134
USA
Email: sairay@cisco.com
Keyur Patel
Cisco Systems
170 W. Tasman Drive
San Jose, CA 95134
USA
Email: keyupate@cisco.com
Paolo Lucente
Cisco Systems
170 W. Tasman Drive
San Jose, CA 95134
USA
Email: plucente@cisco.com
Camilo Cardona, et al. Expires January 12, 2014 [Page 12]
Internet-Draft Path-Marking July 2013
Pradosh Mohapatra
Cumulus Networks
140 C. Whisman Rd.
Mountain View, CA 94041
USA
Email: pmohapat@cumulusnetworks.com
Camilo Cardona, et al. Expires January 12, 2014 [Page 13]