Internet DRAFT - draft-mjsraman-pce-inter-as-p2mp-protect
draft-mjsraman-pce-inter-as-p2mp-protect
PCE Working Group Shankar Raman
Internet draft Balaji Venkat
Intended Status: Experimental RFC Gaurav Raina
Expires: July 2013 I.I.T, Madras
January 25, 2013
Constructing protection paths for inter-AS, inter-sub-AS P2MP TE-LSPs
draft-mjsraman-pce-inter-as-p2mp-protect-03
Abstract
Constructing primary and backup explicit path Point-to-Multipoint
Label Switched Paths is important from the point of view of providing
protection switching in case the primary fails.
It is absolutely essential that the backup P2MP LSPs constructed do
not share risk with any of the links and nodes of the primary path.
In the case of inter-AS P2MP TE-LSPs or in the case of inter-sub-AS
(in the case of BGP-Confederations being deployed) P2MP TE-LSPs where
BGP confederations are deployed within an AS, such protection
switching can be provided by calculating primary and backup multicast
distribution trees (read P2MP TE-LSPs) that dont intersect with each
other.
In this paper we propose a method by which inter-sub-AS P2MP TE-LSPs
(hence even inter-AS P2MP TE-LSPs) can be constructed by first
finding the AS level topology of the network (be it inter-AS or
inter-sub-AS within a single AS) in question and secondly to compute
the paths in such a way that they dont intersect or if necessary in
the worst case partially intersect each other. The proposed scheme is
explained with an example and subsequent discussion is done to
elucidate its benefits to multicast in particular.
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
Shankar Raman.et.al Expires July 2013 [Page 1]
INTERNET DRAFT Inter-AS/sub-AS disjoint P2MP TE-LSPs January 2013
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
Copyright and License Notice
Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Methodology of the proposal . . . . . . . . . . . . . . . . . . 3
3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.1 Discussion of Algorithms to be used . . . . . . . . . . . . 9
4 Security Considerations . . . . . . . . . . . . . . . . . . . . 11
5 IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 11
6 References . . . . . . . . . . . . . . . . . . . . . . . . . . 11
6.1 Normative References . . . . . . . . . . . . . . . . . . . 11
6.2 Informative References . . . . . . . . . . . . . . . . . . 12
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 13
Shankar Raman.et.al Expires July 2013 [Page 2]
INTERNET DRAFT Inter-AS/sub-AS disjoint P2MP TE-LSPs January 2013
1 Introduction
In the context of P2MP TE-LSPs being constructed for carriage of
multicast traffic across multiple areas within a given AS employing
BGP confederations where such an autonomous system may comprise of
more than one BGP confederation instance, and if such multicast
traffic is mission critical traffic then it would be most useful to
have backup protection trees (where such trees represent entire P2MP
TE-LSPs rooted at the same BGP confederation instance as the primary)
or sub-trees which cover certain areas of the primary P2MP TE-LSP.
The intent is to provide a solution whereby a backup protection P2MP
TE-LSP can be laid out in such a way that the set of BGP
confederation instances traversed by the primary (except say for the
egress BGP confederation instances) inter-sub-AS P2MP TE-LSP is not
traversed at all by the backup P2MP TE-LSP. It is also possible that
if such a totally distinct backup TE-LSP is not possible, then an
effort be made where possible to construct a partially disjoint
backup P2MP TE-LSP which tries to avoid to the extent possible those
BGP confederation instances which fall in the primary.
While this type of a solution is available in the unicast world
applying to inter-AS cases such a solution doesn't seem to have
been proposed for the multicast world for inter-AS P2MP TE-LSPs or
for cases within an autonomous system in which BGP conferderations
are deployed.
1.1 Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
2 Methodology of the proposal
Assume that a centralized path calculation engine is available as a
facility in the head end BGP confederation instance of the P2MP TE-
LSP. And this centralized PCE has a backup PCE that does pretty much
the same work as the primary PCE in the head end BGP confederation
instance except that it is not contacted for path calculation
requests while the active is still up. At the active PCE the
following happens. The intention at the active PCE is to gather all
routes advertised by BGP peers internal and external/internal (a.k.a
EIBGP used for confederations) and collect from each of these routes
the AS path information attribute (which is a mandatory well-known)
and from each of these AS path info attributes which we will call
strands from now on, an entire BGP-confederation-instance-Level
Shankar Raman.et.al Expires July 2013 [Page 3]
INTERNET DRAFT Inter-AS/sub-AS disjoint P2MP TE-LSPs January 2013
topology of the AS involved (in the case of inter-AS the internet) is
captured to the extent possible. These strands give us information
about sections of a BGP-confederation-instance level graph, where
each BGP-confederation-instance is considered to be a node and the
edge is connectivity between two or more such confederation-
instances. Where a set of BGP-confederation-instances are consecutive
in a strand they are assumed to be connected to each other.
It is to be noted that the information about such strands may come
from iBGP, eiBGP, MBGP (in the case of inter-AS level) (sometimes
known as Multi-protocol BGP used for multicast) if it carries AS path
information. After the BGP-confederation-instance level topology of
the immediate neighborhood of the ingress BGP-confederation instance
and the egress BGP-confederation-instances to which the multicast
subscribers connect to, has been constructed, the P2MP TE-LSP
calculation algorithm is run, that takes into account a path in the
topology that is most optimal to help and connect the source to the
destinations / receivers. It is possible that source and receivers
who may be distributed across multiple BGP-confederation-instances in
the neighborhood are multihomed in that they have more than one BGP-
confederation instance to which they are connected to. This is
particularly of interest to the algorithm presented in this invention
(in summary) because the backup P2MP TE-LSP needs to be spread out
across links that are disjoint from the primary P2MP TE-LSP to the
extent possible and multi-homing in this case helps a great deal.
Once the primary has been built and the secondary is sought to be
built, the invention proposed would take into account multi-homing
characteristics of the egress BGP-confederation instances to their
respective receivers (which may be enterprise networks / sites) and
build a totally disjoint P2MP TE-LSP or if that is not possible a
partially disjoint P2MP TE-LSPs. It is also possible that manual
policies may be stated in a suitable format to avoid certain BGP-
confederation-instances in the calculation of either a totally
disjoint or partially disjoint backup P2MP TE-LSP.
It is assumed that RSVP-TE would be used to build such primary and
secondary P2MP TE-LSPs be they disjoint or partially disjoint.
Building a BGP-confederation-instance topology of the immediate
neighborhood of the ingress and egress BGP-confederation-instance for
a multicast tree to carry one or more streams of mcast traffic
through such a tree.
Shankar Raman.et.al Expires July 2013 [Page 4]
INTERNET DRAFT Inter-AS/sub-AS disjoint P2MP TE-LSPs January 2013
Here the term AS refers to a BGP-confederation-instance.
Prefix-A: AS-Path Info: AS1 AS2 AS4
Prefix-B: AS-Path Info: AS2 AS3 AS4
Prefix-C: AS-Path Info: AS2 AS5 AS6 and so on.
Figure 1.0 Strand obtained from a subgraph of the AS having
confederations
Given the above three strands of information we could build out the
following graph.
AS5---------> AS6
|
AS1 -----> AS2 --------> AS4
| /
--------> AS3
Figure 2.0 Strands put together to form a graph of the AS with
confederations
If we extrapolate this to a number of such prefixes and their
respective AS paths we could end up discovering the BGP-
confederation-instance level topology for a greater radius than is
otherwise possible with the ingress BGP-confederation-instance to
which the source is connected as the center.
Multi-homing links appear as duplicate edges between receivers /
source to their respective egress / ingress BGP-confederation-
instances. Thus building out disjoint and partially disjoint P2MP TE-
LSPs using RSVP-TE is a helpful technique in protecting mission
critical multicast traffic on the primary by obviating the need for
fate-sharing of the path that such traffic takes to the extent
possible.
Assume that the following BGP-confederation-instance topology in the
vicinity of the sender / senders is computed.
Shankar Raman.et.al Expires July 2013 [Page 5]
INTERNET DRAFT Inter-AS/sub-AS disjoint P2MP TE-LSPs January 2013
+----------------+
/ V
/ +----> (AS2) ------> (AS3)<--(RcvrB)
/ / \ |
/ / \ +------+
/ / \V
(source/s)--->(AS1)------> (AS5) ------> (AS4)<-(RcvrA)
\ /\ /
\ ----------+ \ /
\ / \ /
(AS6)----> (AS7) ------> (AS8)<+
Figure 3.0 Strands put together to form a graph of the AS with
confederations
In the above diagram you can see that the source/sources are
connected using a multi homed connections to the same ISP through
BGP-confederation instance AS1 and AS2. Similarly there are two
Receiver sites RcvrA and RcvrB that are multihomed to TWO BGP-
confederation-instances for RcvrB to AS3 and AS4 and for RcvrA to AS4
and AS8 respectively.
Given that the path calculation engine in AS1 BGP-confederation-
instance is given this picture as it absorbs the AS paths and builds
out the BGP-confederation-instances level topology, there are at
least 2 possible completely (to the maximum extent possible) disjoint
P2MP TE-LSPs possible. Note here of course that the egress BGP-
confederation-instance sometimes prevents a totally disjoint P2MP TE-
LSP since the two up-until-the-egress-BGP-confederation-instance
totally P2MP TE-LSPs may be need to merge to enter the receiver
domains / sites.
They are illustrated below.
Primary P2MP TE-LSP advised could be as illustrated by the dotted
lines...
Shankar Raman.et.al Expires July 2013 [Page 6]
INTERNET DRAFT Inter-AS/sub-AS disjoint P2MP TE-LSPs January 2013
+----------------+
/ V
/ +----> (AS2) ------> (AS3)<--(RcvrB)
/ / \ |
/ / \ +------+
/ / \V
(source/s)...>(AS1)------> (AS5) ------> (AS4)<-(RcvrA)
. .\ /
. ........... \ /
. . \ /
(AS6).....>(AS7) .......>(AS8)<+
Figure 4.0 Strands put together to form a graph of the AS with
confederations
Secondary P2MP TE-LSP advicsed could be as illustrated by the dotted
lines...
..................
. V
. +----> (AS2).......> (AS3)<--(RcvrB)
. / . |
. / . +------+
. / .V
(source/s)--->(AS1)------> (AS5) ------> (AS4)<-(RcvrA)
\ /\ /
\ ----------+ \ /
\ / \ /
(AS6)----> (AS7) ------> (AS8)<+
It is important to note that multicast P2MP TE-LSPs constructed are
considered to be stitched sections of unicast paths and may include
multicast topologies reserved for multicast use alone that may have
been advertised by Multi-protocol BGP just for multicast purposes (in
the case of inter-AS). The normal mechanism of discovering receivers
could be employed.
It is to be noted that while constructing the BGP-confederation-
instance-Level topology the routes which have been advertised in BGP
that carry multicast NLRIs can be preferred and such inter-BGP-
confederation-instance links can be given a higher priority for
constructing the P2MP TE-LSPs. Please read the term AS in the
following paras to represent a BGP-confederation instance.
Shankar Raman.et.al Expires July 2013 [Page 7]
INTERNET DRAFT Inter-AS/sub-AS disjoint P2MP TE-LSPs January 2013
It is to be noted that this applies to centralized schemes of path
calculations. What is being calculated is a tree of ASes that form a
P2MP tree where each node can conceptualized as an AS and each edge
the border link connecting one AS to another to carry multicast
traffic from several sources located at the head end ingress AS to
several receiver nodes connected to egress ASes. We will call this
calculated tree as a P2MP AS tree. Once the tree is calculated the
PCE in the head end / ingress AS through which sources connect
calculates the intra-AS P2MP path (the literal P2MP TE-LSP within the
AS) within that ingress AS and hands off the elements in the P2MP AS
tree in suitable form to the branch ASes which branch off from the
ingress AS. This is followed as the information about the elements of
the P2MP AS tree are handed off at branches from the PCE of one AS to
another until the final inter-AS P2MP literal TE-LSP is complete. The
backup P2MP AS tree is constructed as well and the same procedure is
followed. This completes the construction of the primary and the
backup P2MP AS Trees.
The failover mechanism could be derived from any one of the available
mechanisms in existence or a novel method for failover could also be
deduced. But that is outside the scope of this document.
This method could be applied for multicast traffic to be transported
through L3VPNs and L2VPNs as well through such inter-AS tree
construction of a P2MP AS tree. Also this method of constructing
inter-AS disjoint P2MP AS trees is independent of how the egress ASes
and the ingress AS are discovered. The method of such discovery is
left to existing mechanisms. The primary input to the invention
proposed is a list of ingress AS and their respective egress ASes.
The other input to the construction of P2MP AS tree part of the
invention of course is the BGP-confederation-AS-instance level
topology (in case of inter-AS the AS level topology of the
neighborhood).
It is also possible to apply policies in such a way that the
construction of the P2MP AS tree is done in a partially disjoint
fashion. That is if there exist certain policies that dictate that a
given AS in the path of the P2MP AS tree is to be present in both the
primary and backup P2MP AS trees, then it would be applied at the PCE
and the resultant tree would contain that AS as policy dictates. Thus
the mechanism could churn out disjoint (totally) P2MP AS trees or
partially disjoint P2MP AS trees depending on policy dictats.
3 Discussion
A centralized picture of a AS level topology of the immediate
vicinity of the egress and ingress ASes of a multicast tree to be
Shankar Raman.et.al Expires July 2013 [Page 8]
INTERNET DRAFT Inter-AS/sub-AS disjoint P2MP TE-LSPs January 2013
built using a P2MP TE LSP, helps build a backup P2MP TE-LSP without
traversing wherever possible the ASes in the primary for the backup
Backup / protection switching is a lot more reliable since the fate-
sharing is precluded to the maximum extent possible.
Also the multi-homing characteristics of the sites / receivers as
well as the multi-homing characteristics of the source to its ingress
PEs which may be located in different ISP ASes helps in building out
a disjoint path that may well help in avoid in fate-sharing if an
edge between two ASes or an AS itself goes down.
Whether it be OPTIONS A,B or C as listed in the inter-AS RFCs the
method we propose can be used to calculate the P2MP AS tree and the
actual construction left to those methods applying to those options
respectively.
Multicast NLRI information could be used to give priority to routes
learnt in that fashion and AS path information used to construct the
P2MP AS tree may be favourably passed through ASes listed within it.
Thus if there exist in-congruent multicast and unicast topologies
then this method could use the appropriate multicast topology hints
apart from using the unicast topology information for constructing
P2MP AS trees that are more suited to multicast topology hints
provided.
AS sequences which are not ordered ASes representing AS paths toward
a destination may be dropped from consideration while constructing
the topology of ASes in the immediate neighborhood or in toto.
This proposal will also be applicable to ASes which are independent
administrative domains in a literal sense (as against BGP-
confederation instances) as outlined.
3.1 Discussion of Algorithms to be used
The basic first cut algorithm that can be used for implementing this
scheme is to account or collate the list of ASes the primary P2MP TE-
LSP has passed through. These set of ASes can be enclosed in a set
EXCLUDE_ASes and the algorithm for the secondary P2MP TE-LSP run with
the constraint that any of the elements / ASes in this set are to be
avoided. The set EXCLUDE_ASes is ordered with the branch points in
order of traversal from the egress AS to the source ASes where the
receivers are located. So the first check is applied for the elements
/ ASes in the beginning of the EXCLUDE_ASes set working backwards
from the receivers to the source ASes. Only in cases where disjoint
paths are not possible will the overlap occur.
Shankar Raman.et.al Expires July 2013 [Page 9]
INTERNET DRAFT Inter-AS/sub-AS disjoint P2MP TE-LSPs January 2013
Another way is to weight the edges in the primary P2MP TE-LSP with
higher weights and simply run the algorithm on the topology again. If
the CSPF algorithm finds edges which are high it can be made to avoid
such nodes and edges thus excluding the ASes in the primary path. The
weights that can be assigned for the primary P2MP tree edges is 1/n
where n was the lower weight for these edges when the primary P2MP
LSP was constructed. All the edges in the graph are assigned 1/n as
well. This way the lowest weights of the previous run become the
higher weights of the secondary run. Once this is done one could use
graph clipping and rejoining as explained below. This algorithm too
can run backwards from the egress ASes to the source AS.
Another way is to use graph clipping and rejoining without coupling
with the previous algorithm. The primary path ASes are simply clipped
along with the edges from the graph at the PCE and the algorithm run
again to build a P2MP TE-LSP covering all receivers. Notably if the
receivers are multi-homed the primary P2MP TE-LSP ASes can be removed
first for such cases. If they are single-homed then it would be
worthwhile retaining such egress ASes for the second run. The re-run
can then try to build the tree, and if incrementally such an attempt
fails those ASes in the Primary P2MP TE-LSP which are necessarily
needed in the secondary P2MP tree can be added to the secondary tree
till the completion of the secondary P2MP TE-LSP occurs. So the CSPF
would operate on the graph clipping first and then rejoining those
ASes without which a secondary cannot be built. This again could be
run backwards from the egress AS to the source AS.
An intermediary meeting point akin to the Rendezvous point can be
chosen such that the primary RP AS is disjoint from the secondary RP
AS. Working backwards this will in most cases produce the optimal
result. Further proof of this algorithm will be furnished in future
versions of this document.
Shankar Raman.et.al Expires July 2013 [Page 10]
INTERNET DRAFT Inter-AS/sub-AS disjoint P2MP TE-LSPs January 2013
4 Security Considerations
When receiving BGP updates from BGP-confederation-instances the PCE
or the head end which is expected to do the calculation of the
totally disjoint or partially disjoint P2MP TE-LSP paths MUST ensure
the authenticity of the updates received from other BGP-
confederation-instances through existing mechanisms that assure that
the updates are not spoofed thus misleading the receiver of such
updates. No additional security hole is seen under these
circumstances apart from the above.
5 IANA Considerations
No IANA requirements exist as of the time of writing of this draft.
6 References
6.1 Normative References
[KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC1776] Crocker, S., "The Address is the Message", RFC 1776, April
1 1995.
[TRUTHS] Callon, R., "The Twelve Networking Truths", RFC 1925,
April 1 1996.
[RFC4875] Aggarwal, R., Ed., Papadimitriou, D., Ed., and S.
Yasukawa, Ed., "Extensions to Resource Reservation
Protocol - Traffic Engineering (RSVP-TE) for Point-to-
Multipoint TE Label Switched Paths (LSPs)", RFC 4875, May
2007.
[RFC5151] Farrel, A., Ed., Ayyangar, A., and JP. Vasseur, "Inter-
Domain MPLS and GMPLS Traffic Engineering -- Resource
Reservation Protocol-Traffic Engineering (RSVP-TE)
Extensions", RFC 5151, February 2008.
[RFC4920] Farrel, A., Ed., Satyanarayana, A., Iwata, A., Fujita, N.,
and G. Ash, "Crankback Signaling Extensions for MPLS and
GMPLS RSVP-TE", RFC 4920, July 2007.
[RFC5920] Fang, L., Ed., "Security Framework for MPLS and GMPLS
Networks", RFC 5920, July 2010.
Shankar Raman.et.al Expires July 2013 [Page 11]
INTERNET DRAFT Inter-AS/sub-AS disjoint P2MP TE-LSPs January 2013
[RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
Tunnels", RFC 3209, December 2001.
6.2 Informative References
[EVILBIT] Bellovin, S., "The Security Flag in the IPv4 Header",
RFC 3514, April 1 2003.
[RFC5513] Farrel, A., "IANA Considerations for Three Letter
Acronyms", RFC 5513, April 1 2009.
[RFC5514] Vyncke, E., "IPv6 over Social Networks", RFC 5514, April 1
2009.
[RFC4726] Farrel, A., Vasseur, J.-P., and A. Ayyangar, "A Framework
for Inter-Domain Multiprotocol Label Switching Traffic
Engineering", RFC 4726, November 2006.
[RFC4206] Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP)
Hierarchy with Generalized Multi-Protocol Label Switching
(GMPLS) Traffic Engineering (TE)", RFC 4206, October 2005.
[RFC5150] Ayyangar, A., Kompella, K., Vasseur, JP., and A. Farrel,
"Label Switched Path Stitching with Generalized
Multiprotocol Label Switching Traffic Engineering (GMPLS
TE)", RFC 5150, February 2008.
[RFC5331] Aggarwal, R., Rekhter, Y., and E. Rosen, "MPLS Upstream
Label Assignment and Context-Specific Label Space",
RFC 5331, August 2008.
Shankar Raman.et.al Expires July 2013 [Page 12]
INTERNET DRAFT Inter-AS/sub-AS disjoint P2MP TE-LSPs January 2013
Authors' Addresses
Shankar Raman
Department of Computer Science and Engineering
I.I.T Madras,
Chennai - 600036
TamilNadu,
India.
EMail: mjsraman@cse.iitm.ac.in
Balaji Venkat Venkataswami
Department of Electrical Engineering,
I.I.T Madras,
Chennai - 600036,
TamilNadu,
India.
EMail: balajivenkat299@gmail.com
Prof.Gaurav Raina
Department of Electrical Engineering,
I.I.T Madras,
Chennai - 600036,
TamilNadu,
India.
EMail: gaurav@ee.iitm.ac.in
Shankar Raman.et.al Expires July 2013 [Page 13]