Internet DRAFT - draft-alcaide-calabria-idr-bgp-prefix-priority
IDR working group Juan Alcaide
Internet Draft Cisco
Intended status: Standards Track Fernando Calabria
Expires: February 2012 Cisco
September 2, 2011
BGP prefix priority
draft-alcaide-calabria-idr-bgp-prefix-priority-00
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with
the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Simplified BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Simplified BSD License.
Calabria - Alcaide Expires February 2 , 2012 [Page 1]
Internet-Draft draft-idr-bgp-prefix-priorization-00.txt September 2011
Abstract
This document defines a set of extended communities to carry
priority information. This information provides a mechanism for
assigning a processing preference to the routes that carry it. It
also provides a scheme for processing routes with strict priority
order during update reception, best-path computation, and update
transmission.
Table of Contents
1. Introduction
2. Conventions Used in this Document
3. Definitions of Commonly Used Terms
4. Scope
5. Solution Specification
5.1. Network Wide Prefix Priority
5.2. Network Wide Prefix Priority in a "Trusted" Environment
5.3. Network Wide Prefix Priority in a "Non-Trusted" Environment
5.4. Prioritizing Reception of Routes
5.5. Prioritizing Local and Outbound Processing of Routes
5.6. Change of priority
5.7. Interaction with Neighbors not Supporting Route Prioritization
6. Rationale behind network wide priorities
7. Security Considerations
8. IANA Considerations
9. References
10. Authors' Addresses
1. Introduction
BGP scale has been growing in recent years, in terms of both
neighbors and routes. This impacts convergence times after, for
example, a BGP re-initialization event. One solution is a continuous
upgrade of the hardware used by BGP speakers, by adding faster CPUs
and additional memory. This approach, however, is expensive and
cannot reduce convergence times indefinitely. It is desirable to have
a software-based solution in which a BGP speaker can prioritize some
selected routes. In other words, there is a need for a QoS-like
mechanism in the BGP control plane.
Processing of routes with a given priority SHOULD be performed
before any lower priority ones. This process SHOULD be performed in
a preemptive manner. Thus, the convergence times obtained for high
priority routes would be the same as if there were no lower priority
routes at all. Implementations are not expected to reach this
theoretical limit, but to approach it closely.
Priority information is signaled by adding to the route an extended
community hereby named PEC (Priority Extended Community). A PEC is
meant to have network wide significance and to be transparent to speakers
that do not understand it. It MAY be set at the origination of the
route and propagated across the network, thus greatly reducing
management burden, but it can also be set by a policy if required.
Route processing during reception of routes is based on the priority
assigned to the received path; while the remaining tasks are based
on the priority of the computed best-path. This document also covers
provisions to prevent a change in the priorities associated with a
path from resulting in misordered routes.
The design of how a given priority marking is honored is twofold: a
given speaker SHOULD process the reception of a path with the
priority that the received path has; and it SHOULD process any local
or transmission task with the priority associated with the best-path
of the net. Thus, the design supports different paths being
originated with different priority marking; and it deals with the
conflict by aggregating these markings during best-path computation
and propagating them downstream. Thus, aggregated marking is honored
as close to the source of this aggregation as possible.
2. Conventions Used in this Document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in BCP 14, RFC 2119
[RFC2119]. RFC 2119 defines the use of these key words to help make
the intent of standards track documents as clear as possible. While
this document uses these keywords, this document is not a standards
track document.
3. Definitions of Commonly Used Terms
The definitions below are used throughout this document. Some
terms are well-known, some terms are defined to avoid confusion and
some (those marked with a "*") are defined for the purpose of this
implementation (and thus referenced by other sections throughout the
entire document).
BGP process: internal implementation of a BGP speaker. The
router may implement the BGP process as one or more OS
processes or threads.
net: BGP prefix, including all the paths received from all
the neighbors.
path: BGP prefix received from a particular neighbor.
Multiple paths can be associated to a given net.
BGP table: database where all the BGP routes are kept. It's a
set of nets, each of them with their associated paths.
RIB (Routing Information Base): database where all the
forwarding information is kept. It's a set of nets with their
associated forwarding paths (more than one if it's a
multipath net). Nets can be learned from different routing
protocols; in particular, they can have a corresponding entry
from the BGP table, and the forwarding path used will be the
BGP best-path for that net (plus additional ones if it's a
multipath net).
upstream/downstream directions: When routes flow in a given
direction, a BGP speaker receives routes from upstream and
advertises them downstream.
receiving peer/sending peer: When routes flow in a given
direction between two speakers, the BGP speaker that sends
the routes is the sending peer and the BGP speaker that
receives them is the receiving peer.
PEC* (Priority Extended Community): extended-community
associated to a BGP path that is an indication of the path-
priority for that path. PEC=priority denotes that a given PEC
indicates that priority. PEC=NULL indicates that no PEC is
actually sent in an update message.
strict priority: method of servicing several tasks. Tasks
with a given priority are processed before any
other task with lower priority. In the context of this
document, they SHOULD also preempt the processing of any
lower priority task.
route priority*: integer from 0 to 7 associated to a route.
It indicates the priority or urgency with which this route is
processed. Priority=0 indicates the lowest urgency, and
priority=7 indicates the highest urgency. It is a generic
term that can actually have a different value based on the
specific task a BGP process is performing:
in-message-priority*: priority associated to a received BGP
message as it is received from the TCP session. It's derived
by calculating the maximum of all path-priorities in a given
update message. It determines the priority for message
processing during reception.
path-priority*: priority associated to a BGP path. It's
calculated by looking at the PEC associated to the path. It
determines the priority of a path during reception, after it
has been parsed from a message. It is also used to calculate
the rest of priorities.
max-path-priority*: priority associated to a BGP net. It's
derived by calculating the maximum path-priority for all the
paths of a given net. It determines the processing priority
for best-path computation.
net-priority*: priority associated to a BGP net. It's derived
from the path-priority of the best-path. It
determines the processing priority for any further local
processing (after best-path computation) and advertisement of
routes.
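The relationship between the derived priorities defined above can be
sketched in a few lines of Python. This is only an illustration: the
Path class, its field names, and the choice of returning 0 for an
empty input are assumptions made for the example and are not part of
this document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Path:
    """One path of a net (hypothetical model, not defined by this document)."""
    neighbor: str            # hypothetical neighbor identifier
    path_priority: int       # 0 (lowest urgency) .. 7 (highest urgency)
    is_best: bool = False    # result of best-path computation (not modeled here)


def in_message_priority(path_priorities: List[int]) -> int:
    """Maximum path-priority over all paths carried in one update message."""
    # Assumption: an empty list is treated as priority 0.
    return max(path_priorities, default=0)


def max_path_priority(paths: List[Path]) -> int:
    """Maximum path-priority over all paths of a net."""
    return max((p.path_priority for p in paths), default=0)


def net_priority(paths: List[Path]) -> Optional[int]:
    """Path-priority of the best-path, or None while no best-path is known."""
    for p in paths:
        if p.is_best:
            return p.path_priority
    return None


# A net whose best-path is not its highest-priority path.
paths = [Path("peer-a", 2, is_best=True), Path("peer-b", 6)]
print(max_path_priority(paths), net_priority(paths))
```

In this example, best-path computation for the net would be scheduled
using the value 6 (max-path-priority), while later local and outbound
processing would use the value 2 (net-priority).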
4. Scope
As mentioned before, this document focuses on the following:
- A scheme that assigns and signals priority values on a prefix
basis.
- Proposing a solution for processing prioritized routes during
update reception.
- Proposing a solution for processing prioritized routes during
best-path computation, and update transmission.
- Proposing a solution for managing prefixes whose priority is
changed by an administrative task.
- Guidelines to "interact" with speakers that do not (fully or
partially) support prefix prioritization.
5. Solution Specification
5.1. Network Wide Prefix Priority
Priority for a prefix is set by the assignment of a BGP extended
community attribute, in order to indicate preference of processing.
This community is hereby named PEC (Priority Extended Community),
and MUST contain priority values from 0 to 7. PECs are defined as a
new transitive extended-community of experimental use as defined by
[RFC4360] and [RFC3692].
The extended community type is 0x80FE. Its value is encoded as a
sequence of 5 zero bytes followed by a last byte whose 3 most
significant bits carry the priority value, resulting in:
Highest priority (7) : 0x80FE:0000000000E0
Lowest priority (0) : 0x80FE:000000000000
and all the pertinent values in between.
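As an illustration of this encoding, the following Python sketch
builds and parses the 8-byte extended community (2 bytes of type
followed by 6 bytes of value). The function names encode_pec and
decode_pec are hypothetical; only the type 0x80FE and the bit layout
come from the text above.

```python
from typing import Optional

PEC_TYPE = 0x80FE  # extended community type given in the text


def encode_pec(priority: int) -> bytes:
    """Return the 8-byte extended community for a priority in 0..7."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must be in the range 0..7")
    # 5 zero bytes, then the priority in the 3 most significant bits
    # of the last byte.
    value = bytes(5) + bytes([priority << 5])
    return PEC_TYPE.to_bytes(2, "big") + value


def decode_pec(community: bytes) -> Optional[int]:
    """Return the priority carried by a PEC, or None if it is not a PEC."""
    if len(community) != 8:
        return None
    if int.from_bytes(community[:2], "big") != PEC_TYPE:
        return None
    return community[7] >> 5


for prio in (7, 3, 0):
    print(prio, encode_pec(prio).hex())
```

The sketch ignores the lower 5 bits of the last byte when decoding;
how an implementation treats non-zero values in those bits is not
addressed here.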
In a trusted environment, PEC is set by the speaker originating the
route and has network wide significance. This approach greatly
reduces the management burden of mapping routes to priorities. If
PECs are not trusted, they MAY be changed by any other speaker
downstream based on its policy.
PECs are propagated on a per path basis. The correlation between
paths and nets for a given priority is as follows:
- Path-priority is associated to a BGP path upon receiving it,
typically based on PECs.
- Net-priority is assigned to the net, and corresponds to the
path-priority of the best-path for that prefix.
- Net-priority is signaled when the route is advertised,
typically by PECs.
5.2. Network Wide Prefix Priority in a "Trusted" Environment
In a trusted environment, priority signaling is based on the
advertisement of one single PEC by the originator of the route. In
particular:
- Path-priority for a path is based on the PEC received with that
BGP path.
- If multiple PECs are received for the same prefix, the speaker
SHOULD use the PEC that indicates a higher priority.
- If no PEC is received (PEC=NULL), the speaker SHOULD explicitly
set path-priority=0.
- When advertising updates, all PECs are removed and one single
PEC is advertised, corresponding to the net-priority of the
advertised net. In particular, if net-priority=0 an explicit
PEC=0 SHOULD be sent.
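The rules above can be summarized with a minimal Python sketch. It
assumes that the received PECs have already been decoded into integer
priorities; the function names are hypothetical and chosen only for
this example.

```python
from typing import List


def path_priority_from_pecs(pecs: List[int]) -> int:
    """Derive path-priority from the PEC priorities received with a path.

    An empty list stands for PEC=NULL and maps to path-priority 0.
    With several PECs, the one indicating the higher priority is used.
    """
    return max(pecs) if pecs else 0


def pecs_to_advertise(net_priority: int) -> List[int]:
    """All received PECs are dropped; exactly one PEC is advertised."""
    # Even net-priority 0 is signaled with an explicit PEC=0.
    return [net_priority]


print(path_priority_from_pecs([]))        # PEC=NULL
print(path_priority_from_pecs([2, 5]))    # multiple PECs received
print(pecs_to_advertise(0))               # explicit PEC=0
```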
5.3. Network Wide Prefix Priority in a "Non-Trusted" Environment
In a non-trusted environment, it's possible to change the above
procedures by local configuration. In particular:
- Path-priority can be overwritten when receiving a route.
- PECs transmitted can be overwritten when advertising a route.
5.4. Prioritizing Reception of Routes
Processing routes during reception involves tasks like reading
update messages, parsing the prefixes inside those messages, and
installing them in the BGP table as a path belonging to the neighbor
associated to the session the message was received from.
These tasks SHOULD be performed in strict priority order based on
the path-priority set by a speaker or by local configuration.
Using path-priority to select the priority for inbound processing
presents a challenge, since path-priority is unknown until inbound
processing itself is performed. The following solutions to this
challenge are presented:
- After reading an update message from the TCP session, inspect
the message and calculate an in-message-priority, which
corresponds to the highest path-priority of all the prefixes
present in the message. Any further processing of the message,
like detailed parsing, is performed in strict priority order
based on in-message-priority.
- Calculating in-message-priority itself is not a task that can
be prioritized, and therefore it should be a light-weight task.
For the most common case, where path-priority is determined based
on PEC, this consideration does not apply. Statically assigning a
path-priority to a given session is a task that requires no
processing at all. On the other side of the spectrum, if path-
priority is determined by the prefix itself (i.e. prefixes in the
same update can have different path-priority), the task becomes
non-trivial. Furthermore, some prefixes may get a preferential
treatment (if their in-message-priority is higher than their
path-priority).
- After path-priority is computed for a route, any further inbound
processing of the route can be performed based on path-priority.
This may involve tasks like installing the route into the BGP
table.
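One possible way to service received update messages in strict
priority order is a heap keyed on in-message-priority, with an
arrival counter that preserves FIFO order among messages of equal
priority. The Python sketch below is an assumption about how this
could be done, not a description of any particular implementation;
the class name and the message labels are hypothetical, and
preemption of a task already in progress is not modeled.

```python
import heapq
import itertools
from typing import List, Tuple


class StrictPriorityQueue:
    """Pops the highest-priority item first; FIFO within one priority."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, object]] = []
        self._arrival = itertools.count()

    def push(self, priority: int, item: object) -> None:
        # heapq is a min-heap, so the priority is negated (7 is most urgent).
        heapq.heappush(self._heap, (-priority, next(self._arrival), item))

    def pop(self) -> object:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


def in_message_priority(path_priorities: List[int]) -> int:
    """Highest path-priority among the prefixes of one update message."""
    return max(path_priorities, default=0)


rx = StrictPriorityQueue()
rx.push(in_message_priority([0, 0]), "update-1")
rx.push(in_message_priority([7]), "update-2")
rx.push(in_message_priority([3, 5]), "update-3")
rx.push(in_message_priority([7, 1]), "update-4")

order = [rx.pop() for _ in range(len(rx))]
print(order)
```

Note that update-4 carries a prefix with path-priority 1 that is
nevertheless processed at priority 7, which is the preferential
treatment mentioned in the list above.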
A path MUST be discarded (and not installed in the BGP table) if it
has been received before a path for the same prefix and TCP session
that already exists in the BGP table. This non-FIFO scenario is
possible when receiving the same prefix with different priorities.
If the second prefix received has a higher in-message-priority or
path-priority, the first prefix could be a candidate to be installed
in the BGP table after the second has actually already been
installed. Note that with these modifications, the sequence of
routes installed in the BGP table could be different than it would
be without the use of priorities. This change of behavior is
acceptable under BGP protocol rules ([RFC4271]).
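The discard rule can be sketched as follows, under the assumption
that each path is tagged with a per-session arrival sequence number
when its message is read from the TCP session. The BgpTable class,
the arrival_seq parameter, and the attribute dictionaries are
hypothetical names introduced for this example.

```python
from typing import Dict, Optional, Tuple


class BgpTable:
    """Keeps at most one path per (prefix, session); hypothetical model."""

    def __init__(self) -> None:
        # (prefix, session) -> (arrival sequence number, attributes)
        self._paths: Dict[Tuple[str, str], Tuple[int, dict]] = {}

    def install(self, prefix: str, session: str,
                arrival_seq: int, attrs: dict) -> bool:
        """Install a path unless a later-received one is already present."""
        key = (prefix, session)
        current = self._paths.get(key)
        if current is not None and current[0] > arrival_seq:
            # The candidate was received earlier than the installed path:
            # it is stale and is discarded.
            return False
        self._paths[key] = (arrival_seq, attrs)
        return True

    def lookup(self, prefix: str, session: str) -> Optional[dict]:
        entry = self._paths.get((prefix, session))
        return entry[1] if entry else None


table = BgpTable()
# The path received second (seq 2) has a higher priority and is
# processed first; the path received first (seq 1) arrives at the
# table later and is discarded.
print(table.install("192.0.2.0/24", "session-1", 2, {"pec": 7}))
print(table.install("192.0.2.0/24", "session-1", 1, {"pec": 0}))
print(table.lookup("192.0.2.0/24", "session-1"))
```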
Any received BGP messages that are not update messages SHOULD be
processed in strict priority order, based on a higher priority than
the maximum in-message-priority.
5.5. Prioritizing Local and Outbound Processing of Routes
After a path has been installed in the BGP table, the processing
priority of all the tasks that correspond to the associated prefix
no longer depends on the priority of the path itself (path-priority),
but on that of the net it belongs to, namely net-priority. However,
net-priority cannot be known until the best-path is resolved, so
max-path-priority is used to prioritize the task that resolves the
best-path. Max-path-priority is defined as the maximum path-priority
of all the paths associated with a given net, including the
path-priority of any new path that triggered the best-path
computation.
Calculating max-path-priority itself is a task that SHOULD be
processed in strict priority order, based on the path-priority of
the path that triggers best-path computation.
Best-path processing is a local task that SHOULD be processed in
strict priority order, based on max-path-priority.
Further local processing of routes includes tasks like installation
of the net in the routing table. Outbound processing includes tasks
like formatting nets into update messages and transmitting them
through the TCP session. All these tasks SHOULD be performed in
strict priority order based on net-priority.
Note that the rules above force all the prefixes in a given message
to share the same net-priority (if the transmission of update
messages is to be prioritized based on the common net-priority). This
is already a constraint if PECs are used to signal priorities to
downstream peers.
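The packing constraint can be illustrated with a short Python sketch
that groups nets by net-priority and emits the resulting update
messages from highest to lowest priority. The function name
pack_updates and the max_prefixes limit are hypothetical, and the
sketch assumes that all nets share every other path attribute, which
real update packing would also have to check.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def pack_updates(nets: List[Tuple[str, int]],
                 max_prefixes: int = 2) -> List[Tuple[int, List[str]]]:
    """Group (prefix, net-priority) pairs into per-priority updates.

    Returns a list of (net-priority, prefixes) in transmission order.
    """
    by_priority: Dict[int, List[str]] = defaultdict(list)
    for prefix, net_priority in nets:
        by_priority[net_priority].append(prefix)

    updates: List[Tuple[int, List[str]]] = []
    for priority in sorted(by_priority, reverse=True):
        prefixes = by_priority[priority]
        # Split each priority group into messages of bounded size.
        for i in range(0, len(prefixes), max_prefixes):
            updates.append((priority, prefixes[i:i + max_prefixes]))
    return updates


nets = [
    ("10.0.0.0/8", 0),
    ("192.0.2.0/24", 7),
    ("198.51.100.0/24", 7),
    ("203.0.113.0/24", 7),
    ("172.16.0.0/12", 3),
]
for priority, prefixes in pack_updates(nets):
    print(priority, prefixes)
```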
Any transmitted BGP messages that are not update messages SHOULD be
processed in strict priority order, based on a higher priority than
the maximum net-priority.
5.6. Change of priority
As previously described, the advertisement of routes is done with a
priority based on net-priority (assigned to a given prefix). There
are no conflicts as long as, over time, net-priority remains the
same for a given prefix. However, net-priority derives from path-
priority, and therefore it may change. Without any further
mechanisms, the order in which routes are advertised would be
incorrect, and inconsistencies across the BGP tables of the sending
and receiving peers would appear.
This non-FIFO scenario is possible when advertising the same prefix
with different priorities. If the second prefix that needs to be
advertised to a given neighbor has a higher net-priority than a
first one already scheduled for transmission, the second one could
actually be transmitted before the first one.
Advertisements for a given prefix sent through the BGP session MUST,
in all cases, either keep the same order as they would have without
route prioritization (i.e., FIFO-like processing) or consist of only
the last advertisement. In other words, a route computed as
best-path MUST NOT be transmitted over a BGP session before a route
that was computed previously as best-path. Note that the offending
scenarios are only possible when increasing net-priority. If
net-priority decreases, the problem does not happen. How an
implementation deals with this situation is outside the scope of
this document. However, these two general approaches are discussed:
- One obvious option is making sure that any previous low-
priority route is not actually advertised (and thus it's
discarded). This option has the drawback of complexity (updates
already scheduled for transmission may have to be reformatted).
Note also that the sequence of routes transmitted could be
different than it would be without the use of priorities. This
change of behavior is acceptable under BGP protocol rules
([RFC4271]).
- A second option is that, whenever net-priority needs to
increase, the BGP speaker simply waits for all the routes with
lower net-priority to be transmitted across all sessions.
After they are transmitted, net-priority can be safely
increased. While net-priority has not transitioned, any task
depending on net-priority for that route is processed as usual,
considering the old net-priority. Note that this may imply
sending two updates upon a transition, if attributes
transmitted (like PEC) depend on net-priority. The drawback of
this approach is that it introduces a delay in how priority
information is propagated across the network (indefinitely in a
worst case scenario, if a prefix is constantly flapping at a
high rate).
The same considerations apply to any other local processing tasks
whose implementation makes them susceptible to misordered
execution.
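The first approach (making sure that a previously scheduled
advertisement for the same prefix is never sent) can be sketched with
a per-neighbor outbound queue that records only the latest scheduled
advertisement per prefix and skips superseded entries when they reach
the head of the queue. This is one possible design under stated
assumptions, not a prescribed implementation; the OutboundQueue
class and its method names are hypothetical, and the reformatting of
already-built update messages mentioned above is not modeled.

```python
import heapq
import itertools
from typing import Dict, List, Optional, Tuple


class OutboundQueue:
    """Strict-priority outbound queue that sends only the last
    advertisement scheduled for each prefix (hypothetical model)."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._seq = itertools.count()
        # prefix -> (sequence number of latest advertisement, attributes)
        self._latest: Dict[str, Tuple[int, dict]] = {}

    def schedule(self, prefix: str, net_priority: int, attrs: dict) -> None:
        seq = next(self._seq)
        # Recording the new entry supersedes any pending one for the prefix.
        self._latest[prefix] = (seq, attrs)
        heapq.heappush(self._heap, (-net_priority, seq, prefix))

    def next_advertisement(self) -> Optional[Tuple[str, dict]]:
        while self._heap:
            _, seq, prefix = heapq.heappop(self._heap)
            latest = self._latest.get(prefix)
            if latest is not None and latest[0] == seq:
                del self._latest[prefix]
                return prefix, latest[1]
            # Otherwise the entry was superseded (or already sent): skip it.
        return None


q = OutboundQueue()
q.schedule("10.0.0.0/8", 0, {"pec": 0})      # old best-path, low priority
q.schedule("192.0.2.0/24", 3, {"pec": 3})
q.schedule("10.0.0.0/8", 7, {"pec": 7})      # net-priority increased
sent = []
while (adv := q.next_advertisement()) is not None:
    sent.append(adv)
print(sent)
```

In this run the low-priority advertisement for 10.0.0.0/8 is never
transmitted, so the neighbor cannot receive the older route after the
newer one.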
5.7. Interaction with Neighbors not Supporting Route Prioritization
When not all the BGP speakers involved in the propagation of a
network event support route prioritization, priority routes will not
be treated with the preference they would have otherwise. It is
possible, however, to minimize the effects of this scenario based on
the following considerations:
- Priority management is transparent across speakers and domains not
supporting route prioritization. This is because PEC is defined as a
transitive extended-community.
- If priority of received paths is not marked with a PEC, the same
effect can be achieved by local configuration.
- Reception of routes from a neighbor not supporting route priority
does not change. The routes are received with the preference that
in-message-priority indicates.
- Advertisement of nets towards a neighbor not supporting route
priority does not change. The routes are advertised with the
preference that net-priority indicates.
Note that if routes are advertised in the order determined by their
net-priority to a downstream speaker not supporting route
prioritization, there is a high probability that this speaker will
process those routes in the same (or approximately the same) order
in which it received them, since it will most likely treat them in a
FIFO or quasi-FIFO fashion. Thus, introducing a single speaker
supporting route prioritization upstream in the network can
significantly improve the overall prioritization across the entire
route propagation path.
6. Rationale behind network wide priorities
This proposal develops a comprehensive use of a network wide
priority as a method to give preferential treatment to some routes.
Out of all the possible design alternatives, the choices were based
on flexibility, performance and stability. Amongst them, the
following ones can be pointed out:
- PECs can be used to signal path-priorities for unreachable NLRIs
(aka withdrawals). In an implementation without priorities, any
attributes are meaningless when associated with unreachable NLRIs,
but there is nothing in the BGP protocol rules ([RFC4271]) to
prevent their use. Note that implementations could use other
attributes (besides PECs) associated with unreachable NLRIs.
- An implementation SHOULD send one and only one PEC, but it SHOULD
also accept multiple PECs or no PECs at all. With only
well-behaved implementations and configurations, this precaution is
not necessary; but the design makes provision for it under the
philosophy "be liberal with what you receive, be conservative
with what you send".
- When a net with net-priority=0 is sent, the options are to set
PEC explicitly (PEC=0) or implicitly (PEC=NULL). Both options are
equally valid and there is no chance of confusion. Consider,
however, the case where nets come from two speakers, one
supporting route priority and one not supporting it. They traverse
a transparent speaker (i.e. one that just forwards nets with the
PECs it received). In this case, confusion is possible: a router
downstream using route prioritization won't be able to distinguish
the two sets of routes (and it's possible that its requirements
dictate differentiating both cases). The drawbacks of using an
explicit PEC=0 are that some extra bytes need to be added to the
update messages of the lowest net-priority routes, and that more
update messages might be transmitted (consider the case above,
where a transparent speaker sends routes with both PEC=0 and
PEC=NULL: these routes cannot be packed in the same message).
- It's desirable for a given prefix to have the same priority across
the network. Propagating the priority of the best-path maximizes
the chances of this happening. There is no absolute guarantee,
however, since not all the speakers have to select the same best-
path, according to BGP propagation and best-path selection rules
([RFC4271]).
- When path-priorities are different for a given net, a different
approach could have been chosen to determine net-priority (other
than using the path-priority of the best-path). An alternative
method, however, could potentially create a chicken-and-egg
situation. Consider, for instance, a proposal that chooses as
net-priority the highest path-priority of all the paths. Consider
also the case of two speakers back to back, mutually advertising
routes for a given prefix between them, neither of them using the
other's route as best-path. The mutually advertised routes could
have a higher priority than the best-paths. This would be a
self-sustained state that would remain no matter what other PECs
are received from other peers.
7. Security Considerations
This document introduces no new security concerns to BGP or other
specifications referenced in this document.
8. IANA Considerations
N/A
9. References
[RFC4271] Rekhter, Y., and T. Li, "A Border Gateway Protocol 4 (BGP-
4)", RFC 4271, January 2006.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
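[RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
Communities Attribute", RFC 4360, February 2006.
[RFC3692] Narten, T., "Assigning Experimental and Testing Numbers
Considered Useful", BCP 82, RFC 3692, January 2004.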
[RFC2234] Crocker, D. and Overell, P.(Editors), "Augmented BNF for
Syntax Specifications: ABNF", RFC 2234, Internet Mail Consortium and
Demon Internet Ltd., November 1997.
Copyright (c) 2011 IETF Trust and the persons identified as authors of
the code. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, is permitted pursuant to, and subject to the license
terms contained in, the Simplified BSD License set forth in Section
4.c of the IETF Trust's Legal Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info).
10. Authors' Addresses
Juan Alcaide
Cisco
7025 Kit Creek Rd RTP-NC 27709
jalcaide@cisco.com
USA
Fernando Calabria
Cisco
7025 Kit Creek Rd RTP-NC 27709
fcalabri@cisco.com
USA