Internet DRAFT - draft-kurapati-dynamicrp-bgpmvpn
L3VPN P. Kurapati
Internet-Draft M. Rodrigues
Intended status: Standards Track K. Windisch
Expires: November 25, 2013 Juniper Networks
S. Asif
AT&T LABS
May 24, 2013
Dynamic RP encodings in BGP based MVPNs
draft-kurapati-dynamicrp-bgpmvpn-00.txt
Abstract
PIM Group-to-RP mappings are distributed dynamically using protocols
such as BSR or Auto-RP. The BGP-MVPN specification provides for this
information to be encapsulated in an I-PMSI or S-PMSI provider tunnel
between the PEs in an MVPN environment. Since this is control
information, it is desirable to signal this information in BGP
between PEs, similar to carrying other customer control state such as
C-Multicast routes. This document specifies the mechanisms and
procedures to carry bootstrap information via BGP to provide true
control and data plane separation.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 25, 2013.
Copyright Notice
Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
Kurapati, et al. Expires November 25, 2013 [Page 1]
Internet-Draft Dynamic RP encodings in BGPMVPN May 2013
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November
10, 2008. The person(s) controlling the copyright in some of this
material may not have granted the IETF Trust the right to allow
modifications of such material outside the IETF Standards Process.
Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other
than English.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3
3. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . 4
3.1. Unified BGP based control plane . . . . . . . . . . . . . 4
3.2. Manageability at a common provider location . . . . . . . 4
3.3. Avoiding unnecessary Provider Tunnels . . . . . . . . . . 4
4. MCAST-VPN-BSR NLRI . . . . . . . . . . . . . . . . . . . . . 5
4.1. BSR Parameters NLRI . . . . . . . . . . . . . . . . . . . 5
4.2. BSM Group Parameters NLRI . . . . . . . . . . . . . . . . 6
4.3. BSM RP Parameters NLRI . . . . . . . . . . . . . . . . . 7
4.4. BSR-BGP Path Attribute . . . . . . . . . . . . . . . . . 8
5. Protocol Overview . . . . . . . . . . . . . . . . . . . . . . 9
5.1. Handling Bootstrap messages . . . . . . . . . . . . . . . 9
5.1.1. Data for BSR Parameters Route . . . . . . . . . . . . 9
5.1.2. Fragmented BSMs . . . . . . . . . . . . . . . . . . . 10
5.1.3. Mappings arriving in different BGP UPDATES . . . . . 10
5.2. Mapping individual groups to RP . . . . . . . . . . . . . 11
5.3. Triggered BSMs by egress PE . . . . . . . . . . . . . . . 11
5.4. Reverse Path Forwarding for Dynamic RP advertisements . . 11
5.5. Interoperation with tunneled BSR . . . . . . . . . . . . 12
5.6. Route Targets for Group-to-RP mapping routes . . . . . . 13
5.7. BSR multihomed . . . . . . . . . . . . . . . . . . . . . 13
6. Protocol Details . . . . . . . . . . . . . . . . . . . . . . 13
6.1. Originating Group-to-RP Mapping route for bootstrap
messages received on a VRF . . . . . . . . . . . . . . . 13
6.2. Handling changes in BSR messages . . . . . . . . . . . . 14
6.2.1. Missing RP or Group mapping entry . . . . . . . . . . 14
6.2.2. Missing BSM . . . . . . . . . . . . . . . . . . . . . 15
6.2.3. Change of Elected BSR . . . . . . . . . . . . . . . . 16
6.3. Receiving Group-to-RP Mapping routes for BSR . . . . . . 17
7. Security Considerations . . . . . . . . . . . . . . . . . . . 17
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18
9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 18
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 18
10.1. Normative Reference . . . . . . . . . . . . . . . . . . 18
10.2. Informative Reference . . . . . . . . . . . . . . . . . 19
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 19
1. Introduction
In PIM-SM [RFC4601], the Group-to-RP mapping information is
distributed dynamically through methods such as BSR [RFC5059] or
Auto-RP. When multicast is deployed in a VPN environment, PEs in
the provider space need to carry this information transparently
across the provider core so that CEs in all the sites can access this
RP information. The MVPN specification [RFC6513] defines a mechanism
in Section 5.3.4 by which BSR messages can be transmitted in the
provider space over PMSI tunnels. However, carrying control messages
like BSR in the data tunnels is not always desirable. The BGP
encodings in the BGP-MVPN specification [RFC6514] already define
mechanisms to carry C-Multicast route information in BGP. This
document specifies how to carry BSR Group-to-RP mapping information
through BGP. The Auto-RP mechanism is out of scope for this
specification.
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
This document uses the following terms:
"MVPN"
"Multicast in MPLS/BGP IP VPNs" [RFC6513] includes two different
methods, BGP and PIM, for exchanging customer's multicast control
information. This document only deals with BGP for exchanging the
customer multicast control information. MVPN in the following
sections refers to BGP-MVPN.
"C-Multicast routes"
The MVPN customer's multicast routing information that is carried in
BGP is referred to in this document as C-Multicast routes.
"RP-Set"
RP-Set is the Group-to-RP mapping information distributed by BSR.
3. Motivation
3.1. Unified BGP based control plane
In BGP-based MVPN, PE auto-discovery and the exchange of PIM
Join/Prune state are part of the customer multicast control plane and
are accomplished via advertisements of BGP MVPN NLRI. The BSR and
Auto-RP protocols also carry a type of customer multicast control
information. Carrying this information in BGP is not only logical but
also provides a unified BGP-based control plane for all customer-space
control messages.
3.2. Manageability at a common provider location
A provider of MVPN services would be interested in knowing the
customer's RP topology and the detailed mappings. This information
is critical for operations staff to troubleshoot and manage the
customer's multicast deployments. As with the C-Multicast routes,
it is desirable to obtain this information at a centralized location
such as a BGP Route Reflector. Under the current MVPN specifications,
a provider that wants such information must obtain it from the
customer VRFs on all PEs importing these customer VRFs. Moreover,
polling each PE at intervals risks missing changes that occur between
polls, especially with soft-state control protocols like PIM. Since
the RP-Set information is not carried in BGP, it cannot be obtained
from a BGP Route Reflector.
3.3. Avoiding unnecessary Provider Tunnels
With the current specification, the RP-Set is carried via PMSI
tunnels. There may be deployments that use only S-PMSIs. Carrying
BSR information through an S-PMSI implies creating tunnels just to
carry control information. Also, if the BSR is located at a
receiver-only site, an S-PMSI tunnel needs to be created with the PE
at the receiver site as root. This PE has to set up an S-PMSI tunnel
for a bursty source solely to distribute control messages; as a
receiver-only PE, it would otherwise not need to set up such a tunnel.
4. MCAST-VPN-BSR NLRI
This document defines a new NLRI called the MCAST-VPN-BSR NLRI. The
format of the MCAST-VPN-BSR NLRI is identical to the NLRI format
defined in BGP-MVPN [RFC6514]:
+-------------------------------------+
| Route Type (1 octet)                |
+-------------------------------------+
| Length (1 octet)                    |
+-------------------------------------+
| Route Type specific (variable)      |
+-------------------------------------+
The following 3 Route Types are defined for MCAST-VPN-BSR NLRI.
1 - BSR Parameters
2 - BSM Group parameters
3 - BSM RP Parameters
The MCAST-VPN-BSR NLRI is carried in BGP [RFC4271] with an AFI of 1
or 2 and a SAFI of MCAST-VPN-BSR. In order for two BGP speakers to
exchange MCAST-VPN-BSR NLRIs, they must use a BGP Capabilities
Advertisement to ensure that they both are capable of properly
processing such an NLRI. This is done as specified in [RFC4760], by
using capability code 1 (multiprotocol BGP) with an AFI of 1 or 2 and
a SAFI of MCAST-VPN-BSR.
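The common NLRI framing above can be sketched in a few lines of Python.
This is only an illustration, not part of the specification: the
function and constant names are hypothetical, and the sketch assumes,
as in [RFC6514], that the Length field counts the octets of the Route
Type specific field only.

```python
import struct
from typing import Tuple

# Route Type values listed in this section.
BSR_PARAMETERS = 1
BSM_GROUP_PARAMETERS = 2
BSM_RP_PARAMETERS = 3
_KNOWN_TYPES = (BSR_PARAMETERS, BSM_GROUP_PARAMETERS, BSM_RP_PARAMETERS)


def encode_nlri(route_type: int, body: bytes) -> bytes:
    """Prefix a route-type-specific body with 1-octet type and length."""
    if route_type not in _KNOWN_TYPES:
        raise ValueError("unknown route type")
    if len(body) > 255:
        raise ValueError("body does not fit a 1-octet Length field")
    return struct.pack("!BB", route_type, len(body)) + body


def decode_nlri(data: bytes) -> Tuple[int, bytes, bytes]:
    """Return (route_type, body, remaining bytes) for the first NLRI."""
    if len(data) < 2:
        raise ValueError("truncated NLRI header")
    route_type, length = struct.unpack("!BB", data[:2])
    body = data[2:2 + length]
    if len(body) != length:
        raise ValueError("truncated NLRI body")
    return route_type, body, data[2 + length:]
```

Returning the remaining bytes lets a caller walk several NLRIs packed
back to back in one UPDATE.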
4.1. BSR Parameters NLRI
A BSR Parameters Route Type specific MCAST-VPN-BSR NLRI consists of
the following:
+-------------------------------------+
| RD (8 octets)                       |
+-------------------------------------+
| len of BSR Address (1 octet)        |
+-------------------------------------+
| BSR Address (Variable length)       |
+-------------------------------------+
| Hash Mask Length (1 octet)          |
+-------------------------------------+
| BSR Priority (1 octet)              |
+-------------------------------------+
| Originating PE's IP Address         |
+-------------------------------------+
The RD is encoded as described in [RFC4364].
BSR Address is the Elected or Candidate BSR's IP address extracted
from the Bootstrap message. The "len of BSR Address" field indicates
whether the address is IPv4 or IPv6: a value of 32 means an IPv4
address, and a value of 128 means an IPv6 address.
BSR Priority is a 1-octet value as defined in the BSR specification
[RFC5059].
Hash Mask Length is a 1-octet value that is used for RP selection by
the routers running BSR. This value MUST be taken from the BSM and
copied into the BSR Parameters route by the originating router.
Originating PE's IP Address field is set to the IP address that the
PE places in the Global Administrator field of the VRF Route Import
Extended Community of the VPN-IP routes advertised by the PE. For a
given MVPN, a single such IP address MUST be used, and that same IP
address MUST be used as the originating PE's IP address in all route
types of the MCAST-VPN-BSR NLRI that the PE transmits.
The usage and details of this NLRI are discussed in the "Protocol
Details" section.
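As an illustration of the field layout in this section, the following
Python sketch builds the Route Type specific part of a BSR Parameters
NLRI. The function name is hypothetical. The text does not state how
the Originating PE's IP Address is delimited, so the sketch assumes it
is appended as a packed 4-octet or 16-octet address with no length
prefix.

```python
import ipaddress
import struct


def encode_bsr_parameters(rd: bytes, bsr: str, hash_mask_len: int,
                          priority: int, originating_pe: str) -> bytes:
    """Build the body of a BSR Parameters (Type-1) NLRI."""
    if len(rd) != 8:
        raise ValueError("RD must be 8 octets")
    bsr_addr = ipaddress.ip_address(bsr)
    pe_addr = ipaddress.ip_address(originating_pe)
    return (
        rd
        # "len of BSR Address" is in bits: 32 for IPv4, 128 for IPv6.
        + struct.pack("!B", bsr_addr.max_prefixlen)
        + bsr_addr.packed
        + struct.pack("!BB", hash_mask_len, priority)
        # Assumption: packed address, no explicit length octet.
        + pe_addr.packed
    )
```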
4.2. BSM Group Parameters NLRI
A BSM Group Parameters Route Type specific MCAST-VPN-BSR NLRI
consists of the following:
+-------------------------------------+
| RD (8 octets)                       |
+-------------------------------------+
| len of BSR Address (1 octet)        |
+-------------------------------------+
| BSR Address (Variable length)       |
+-------------------------------------+
| len of Group Address (1 Octet)      |
+-------------------------------------+
| Group Prefix (Encoded Group format) |
+-------------------------------------+
| Originating PE's IP Address         |
+-------------------------------------+
Group Prefix can contain an IPv4 or IPv6 address. As with the RP
Address, the address family is indicated by the "len of Group
Address" field: a value of 32 indicates an IPv4 Group prefix and 128
indicates an IPv6 Group prefix. Note that this length describes only
the Group address and not the encoded group prefix.
The Encoded Group prefix format is defined in the PIM-SM
specification [RFC4601]; it is extracted from the BSM and placed in
this NLRI. Along with the group prefix, the encoded group format also
contains important information, such as 'BiDir' and Admin Scope zone,
which needs to be carried across to the egress PEs.
The usage and details of this NLRI are discussed in the "Protocol
Details" section.
4.3. BSM RP Parameters NLRI
A BSM RP Parameters Route Type specific MCAST-VPN-BSR NLRI consists
of the following:
+-------------------------------------+
| RD (8 octets)                       |
+-------------------------------------+
| len of BSR Address (1 octet)        |
+-------------------------------------+
| BSR Address (Variable length)       |
+-------------------------------------+
| len of RP Address (1 octet)         |
+-------------------------------------+
| RP Address (Encoded unicast addr)   |
+-------------------------------------+
| len of Group Address (1 octet)      |
+-------------------------------------+
| Group Prefix (Encoded Group format) |
+-------------------------------------+
| RP Hold Time (2 Octets)             |
+-------------------------------------+
| RP Priority (1 Octet)               |
+-------------------------------------+
| Originating PE's IP Address         |
+-------------------------------------+
RP Address is in the Encoded unicast format defined in the PIM-SM
specification [RFC4601]; it is extracted from the BSM and placed in
this NLRI.
RP Hold Time is a 2-octet value in seconds, as defined in the BSR
specification [RFC5059]. It is a per-RP value that is taken from the
BSM and filled into the NLRI by the originating router.
The usage and details of this NLRI are discussed in the "Protocol
Details" section.
4.4. BSR-BGP Path Attribute
This document defines and uses a new BGP attribute called the "BSR-
BGP attribute". This is an optional transitive BGP attribute. The
format of this attribute is defined as follows:
+-------------------------------------+
| Fragment Tag (2 Octets)             |
+-------------------------------------+
| Group Count (1 Octet)               |
+-------------------------------------+
| RP Count (1 Octet)                  |
+-------------------------------------+
All MCAST-VPN-BSR NLRI route types carry the BSR-BGP Path attribute.
However, which fields are set depends on the route type.
Fragment Tag MUST be set for all 3 route types. The usage of Fragment
Tag is discussed in the "Protocol Details" section.
A BSR Parameters NLRI also sets the Group Count field, indicating the
number of Groups this BSM is carrying. RP Count is not used by the
originator of a BSR Parameters NLRI and SHOULD be ignored by the
receiver of a BSR Parameters NLRI.
A BSM Group Parameters NLRI sets the RP Count field, indicating the
number of RPs that this Group carries. The Group Count for this NLRI
is not set and SHOULD be ignored by the receiver of a BSM Group
Parameters NLRI.
A BSM RP Parameters NLRI sets only the Fragment Tag. The Group Count
and RP Count fields are not set by the originator and SHOULD be
ignored by the receiver of a BSM RP Parameters NLRI.
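The per-route-type rules above can be sketched as follows. This is a
minimal illustration with hypothetical names; it covers only the
4-octet attribute value (not the BGP attribute flags, type code, or
length), and it assumes that fields described as "not set" are sent as
zero, which the text does not state explicitly.

```python
import struct
from typing import NamedTuple, Optional

BSR_PARAMETERS, BSM_GROUP_PARAMETERS, BSM_RP_PARAMETERS = 1, 2, 3


class BsrBgpAttr(NamedTuple):
    fragment_tag: int
    group_count: Optional[int]  # meaningful only for Type-1 routes
    rp_count: Optional[int]     # meaningful only for Type-2 routes


def build_attr(route_type: int, fragment_tag: int, count: int = 0) -> bytes:
    """Pack the attribute value; 'count' is placed per route type."""
    group_count = count if route_type == BSR_PARAMETERS else 0
    rp_count = count if route_type == BSM_GROUP_PARAMETERS else 0
    return struct.pack("!HBB", fragment_tag, group_count, rp_count)


def parse_attr(route_type: int, value: bytes) -> BsrBgpAttr:
    """Unpack the value, ignoring fields the route type does not use."""
    fragment_tag, group_count, rp_count = struct.unpack("!HBB", value)
    return BsrBgpAttr(
        fragment_tag,
        group_count if route_type == BSR_PARAMETERS else None,
        rp_count if route_type == BSM_GROUP_PARAMETERS else None,
    )
```

Returning None for an unused field makes it harder for a receiver to
act on a value it is supposed to ignore.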
5. Protocol Overview
This section gives an overview of the mechanism to carry BSR
information through BGP. The full specification is given in the next
section. This specification does not change the processing of BSR
messages sent or received on PE-CE links; that is, BSR election based
on BSMs and the RP selection procedure continue to follow the BSR
specification. This document only describes how the information
carried in BSR is redistributed into BGP at the ingress PEs, and how
the remote PEs redistribute this information from BGP back into BSR.
The same applies when the PE itself is configured as a C-RP or BSR
for the given VRF. When the PE is a C-RP, the RP advertisements are
unicast to the elected BSR as described in the BSR specification.
When the PE is configured as the BSR for a specific MVPN VRF, it MUST
originate the BSMs as it normally would and send them on its
CE-facing interfaces. The PE also originates MCAST-VPN-BSR NLRIs from
the RP-Set it created by virtue of being a BSR and advertises them to
the remote PEs.
5.1. Handling Bootstrap messages
BSR specification can be broadly classified into 3 stages:
(a) BSR election
(b) Candidate RP advertisements
(c) Elected BSR (E-BSR) advertising the RP-Set information
Candidate RP advertisements are unicast, and hence it is not
necessary to carry them in PMSI tunnels or convert them into BGP
routes. On the other hand, both the empty BSMs and the BSMs sent by
the Elected BSRs (E-BSR) need to be converted into BGP routes.
5.1.1. Data for BSR Parameters Route
The Bootstrap Router (BSR) election is based on the Bootstrap
Messages (BSMs) that the candidate BSRs transmit with their BSR
priority. Whenever a PE receives a BSM on a PE-CE link, it needs to
originate a BSR Parameters Route with the BSR Address and Priority
extracted from the BSM. The BSR election on the PEs is done based on
the BSR Parameters Routes received. The BSR Parameters Route also
MUST contain the BSR-BGP Path Attribute with the "Fragment Tag" and
"Group Count" fields set. "Group Count" is the number of Group
prefixes this BSM is carrying. The "Group Count" field is not present
in the original BSR specification and needs to be populated by the
originating PE. The Fragment Tag field can be any locally generated
unique value at the originating PE. The usage of this field is
discussed in the next section.
5.1.2. Fragmented BSMs
The BSR specification also provides a way to deal with fragmentation:
if the Group-to-RP mappings exceed the packet size, semantic
fragmentation is performed. The 'Fragment Tag' in the BSM identifies
the fragments of the same BSM. If the fragmentation boundary falls
within a group prefix, the difference between "RP Count" and "Frag RP
Cnt" in the BSM determines how many more RPs are to come. When the
fragmentation boundary falls at a group prefix (i.e., a group range
fits entirely into a BSM fragment), there is no way to determine
whether more such fragments are coming in other BSMs. An originating
PE SHOULD wait for all the fragmented BSMs to arrive before
propagating them in BGP if it is known, based on the "Frag RP Cnt"
value, that more fragments are coming. Otherwise, the PE SHOULD start
converting the mappings into BGP NLRIs and advertise the routes as
soon as the first BSM is received. The receiving PE will start
assembling the RP-Set based on the received Group-to-RP mapping
routes and send the resulting BSM to the CE. If the originating PE
receives subsequent fragments of the same BSM, it MUST re-advertise
the "BSR Parameters" NLRI with the BSR-BGP path attribute changed to
reflect the modified "Group Count". However, the existing "BSM Group
Parameters" and "BSM RP Parameters" routes are still considered
valid, and the new routes are to be treated as incremental routes by
the egress PE. This is possible because the Fragment Tag value is the
same for the new mappings in the subsequent fragment. The details are
discussed in the next section.
5.1.3. Mappings arriving in different BGP UPDATES
Even without fragmentation, all the Group-to-RP mappings may not fit
in a single BGP UPDATE message at the originating PE. The NLRIs can
be split and sent in multiple BGP UPDATE messages. In such a
situation, egress PEs can determine whether all the mappings have
been received by using the "Group Count" and the "RP Count" in the
BSR-BGP Path attributes carried with the BSR Parameters and BSM Group
Parameters NLRIs, respectively. The BGP Graceful Restart (GR)
specification [RFC4724] also defines an End-of-RIB marker that can be
used for non-GR purposes. PEs MAY use the End-of-RIB marker to
indicate the completion of all the route updates to the peer. The
peer of the ingress PE can be a BGP route reflector. If negotiated
for the MCAST-VPN-BSR family, a BGP peer SHOULD wait for the
End-of-RIB marker from the peer before advertising the routes to its
other clients. An egress PE SHOULD wait for the End-of-RIB marker
before considering the routes for the BSM Group-to-RP mapping
calculation and informing its corresponding CEs.
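The count-based completeness check described above might look like the
following Python sketch. The function name and the dictionary layout
are assumptions made for illustration; an implementation would key
these tables by the actual NLRI fields.

```python
from typing import Dict, Optional, Set


def rp_set_complete(group_count: Optional[int],
                    rp_counts: Dict[str, int],
                    rps_by_group: Dict[str, Set[str]]) -> bool:
    """Check whether all mappings for one BSR have been received.

    group_count  -- "Group Count" from the Type-1 route, None if that
                    route has not arrived yet
    rp_counts    -- group prefix -> "RP Count" from each Type-2 route
    rps_by_group -- group prefix -> RP addresses seen in Type-3 routes
    """
    if group_count is None or len(rp_counts) != group_count:
        return False
    return all(len(rps_by_group.get(group, set())) == expected
               for group, expected in rp_counts.items())
```

An egress PE would run a check of this kind each time a route arrives
(or after the End-of-RIB marker, if that is in use) and generate a BSM
only once it returns True.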
5.2. Mapping individual groups to RP
For a router or management system to determine the RP mapping for an
individual multicast stream, the "PIM Group-to-Rendezvous-Point
Mapping" specification [RFC6226] requires that the following be
evaluated in addition to any dynamic Group-to-RP mappings: source-
specific multicast (SSM) group ranges, dense mode (DM) group-ranges,
embedded-RP encoded in the IPv6 group address, and statically
configured Group-to-RP mappings. This specific Group-to-RP mapping
given by the algorithm in RFC6226 determines the RP that a router
would use for joining PIM shared trees or sending PIM Register
messages for individual streams.
In addition to the dynamic RP group-range-to-RP mappings obtainable
from BGP, a router or management system that needs to make these
specific group-to-RP mapping decisions for individual streams is
assumed to have knowledge of the same information required by RFC6226
as all the routers in the multicast domain. Specifically, it must
know the source-specific multicast (SSM) group ranges, dense mode
(DM) group-ranges, embedded-RP encoded in the IPv6 group address, and
statically configured group-range-to-RP mappings. How it learns
these is outside the scope of this document.
5.3. Triggered BSMs by egress PE
Unlike regular BSR implementations, where the BSM is flooded, an
egress PE under this specification acts as a proxy and generates
BSMs. An egress PE MUST generate BSMs periodically from the mapping
information taken from the MCAST-VPN-BSR NLRIs. In addition to the
periodic BSMs, whenever there is a change in the BSM mappings, a
triggered BSM MUST be generated, which then refreshes the information
on the CE. The ingress PE and egress PE may not be in sync with each
other in terms of timing. Assume that the egress PE sends a periodic
BSM at t=0 and then receives a change in BGP routes with one RP
withdrawn. In this case, the egress PE generates a new BSM without
waiting for BS_Period to expire. There MUST, however, be a minimum of
BS_Min_Interval between successive BSMs, as noted in [RFC5059]. This
causes an extra BSM to be generated towards the CE whenever there is
a change in the Group-to-RP mappings at the egress PE.
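The timing rule for periodic and triggered BSMs can be illustrated
with a small Python sketch. The function name is hypothetical, and the
default values of 60 and 10 seconds are the BS_Period and
BS_Min_Interval defaults as understood from [RFC5059]; they should be
checked against that document.

```python
def next_bsm_send_time(now: float, last_sent: float, changed: bool,
                       bs_period: float = 60.0,
                       bs_min_interval: float = 10.0) -> float:
    """Return the earliest time at which the next BSM may be sent.

    A change in the mappings triggers a BSM immediately, but never
    sooner than BS_Min_Interval after the previous one. Without a
    change, the next BSM is the regular periodic one.
    """
    if changed:
        return max(now, last_sent + bs_min_interval)
    return last_sent + bs_period
```

For the example in the text, a periodic BSM sent at t=0 followed by a
withdrawal seen at t=3 yields a triggered BSM at t=10 rather than at
t=60.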
5.4. Reverse Path Forwarding for Dynamic RP advertisements
BSR relies on variations of Reverse Path Forwarding (RPF) to ensure
that advertisement messages do not loop through the network. RPF
delivery semantics must also be maintained across the BGP-MVPN
service provider core for dynamic RP advertisements encoded in BGP.
The BGP NLRI defined for dynamic RP advertisements includes an
Originating PE's IP Address field, which PEs receiving advertisements
via BGP can use to conduct RPF checks when handling these
advertisements. When a PE receives a MCAST-VPN-BSR NLRI for a
particular MVPN from some other PE, the PE accepts the message only
if the 'Originating PE's IP Address' field identifies the selected
upstream PE for the IP address of the Bootstrap router. Otherwise,
the PE simply discards the update.
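The RPF acceptance rule above reduces to a single comparison, sketched
here in Python. The callback upstream_pe_for is a hypothetical
stand-in for whatever upstream PE selection the implementation already
performs for the MVPN; it is assumed to return None when there is no
route towards the BSR.

```python
import ipaddress
from typing import Callable, Optional


def accept_bsr_route(originating_pe: str, bsr_address: str,
                     upstream_pe_for: Callable[[str], Optional[str]]) -> bool:
    """Return True if the route passes the RPF check for its BSR."""
    upstream = upstream_pe_for(bsr_address)
    if upstream is None:
        return False  # no upstream PE selected for this BSR: discard
    # Compare parsed addresses so textual variants still match.
    return (ipaddress.ip_address(upstream)
            == ipaddress.ip_address(originating_pe))
```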
5.5. Interoperation with tunneled BSR
In order for this specification to be incrementally deployable in a
network, PEs that implement this specification must be able to
interoperate with PEs that do not. Such PEs that are not capable of
advertising dynamic RP information in BGP will send tunneled BSR
messages.
For example, it is possible that a BSR router could be multihomed to
multiple PEs, some of which advertise dynamic RP mappings in BGP and
some of which encapsulate the native packets. In such a topology,
it's possible that each of the PEs connecting the site of the BSR
sender will forward the redundant advertisements for the same sender
to the other PEs across the core via the different protocol
mechanisms. Further, it is possible that different senders are
connected to PEs with differing capabilities and unique
advertisements will arrive from the core at PEs via different
protocol mechanisms.
In these scenarios, in which there are non-capable PEs in the
network, PEs sending dynamic RP advertisements via BGP may also
choose to encapsulate the same advertisements as native BSR packets
tunneled via the PMSIs of the BGP-MVPN for delivery to receiving PEs
that are not capable of handling the dynamic RP advertisements from
BGP. However, when it is known that all PEs are capable of dynamic RP
advertisements in BGP, PEs should filter multicast BSR messages such
that they are not encapsulated in PMSI tunnels.
PEs receiving dynamic RP advertisements from the service provider
core must apply RPF rules to the received advertisements regardless
of the mechanism of delivery. Propagation of native BSR encapsulated
advertisements by receiving PEs enabled for dynamic RP advertisements
in BGP should occur as if these advertisements were received from
BGP, and as specified in this document.
5.6. Route Targets for Group-to-RP mapping routes
By default, the Group-to-RP mapping routes SHOULD have the same Route
Targets as the VPN-IP unicast routes towards the BSR/Mapping
Agent/C-RP carried in these routes. An implementation SHOULD allow
this default to be modified via configuration. With the use of Route
Target Constraint [RFC4684], the distribution of these routes can be
restricted to only those PEs that have the RT configured.
5.7. BSR multihomed
When a BSR is multihomed, say to two PEs, both PEs will originate
the MCAST-VPN-BSR NLRIs. In such a case, egress PEs SHOULD accept the
NLRIs from the PE chosen by the single forwarder selection procedure
described in Section 9.1.2 of [RFC6513].
6. Protocol Details
6.1. Originating Group-to-RP Mapping route for bootstrap messages
received on a VRF
When a PE router receives a BSM on a CE-facing interface that is the
RPF interface towards the BSR, or when it has an RP configured
locally and is the elected BSR, the router adds the mappings to its
local copy of the Group-to-RP set. The PE router then forms BGP NLRIs
as described in the previous section based on the received BSM.
The PE router determines whether there were any previous
advertisements from the same BSR and whether there is any change in
the BSM content. If the routes have already been advertised and are
not changed as a result of the BSM, they are not re-advertised in
BGP. Refer to Section 3.4 of the BSR specification [RFC5059] for
forwarding the received BSMs. Unless the BSR implementation requires
a particular BSM to be blocked, BSMs need to be forwarded via BGP to
the egress PEs.
When a new BSR comes up, it generates a BSM with empty content. In
such a case, the PE generates only a "BSR Parameters" route, with the
BSR address and priority fields filled in. Even if the BSR that sent
the empty BSM is not the preferred BSR (the current E-BSR is better),
a "BSR Parameters" NLRI MUST be generated with this BSR address and
sent to the remote PEs in order to maintain consistency with the BSR
implementation. The local BSR implementation will take care of
choosing the right BSR.
If a PE is rebooted or newly added, it may receive a BSM with the
"No-Forward" bit set, or a unicast BSM, from the CE with which it has
formed a PIM neighborship. In either case, the PE MUST originate the
required NLRIs from the BSM and forward them to the remote PEs. There
is no need to carry the "No-Forward" bit in BGP for this scenario.
As discussed in Section 5.1.2, if the BSM is fragmented and the
fragmentation boundary is at a group prefix, there is no way to tell
whether more fragments will arrive. Hence, the BSR routes are
advertised as soon as the BSM is received. If another BSM with the
same "Fragment Tag" value is received at a later time, the BSR
implementation treats it as part of the same BSM that was received
earlier. Hence, these BSMs are converted to the respective BSR NLRIs
and advertised to the BGP peers.
6.2. Handling changes in BSR messages
A PE router may need to withdraw a Group-to-RP mapping for which it
has originated an advertisement, based on several conditions. If a
BSM is received from a CE with a holdtime of zero for the mapping, or
if the local PE is the BSR and an RP is unconfigured, then the
advertisement MUST be withdrawn immediately. In addition, scenarios
such as a BSM missing an RP mapping entry or BSMs missing entirely
may necessitate withdrawal of advertised mappings. A change may also
happen to a group: a new group may be added or an existing group may
be removed. These changes need to be propagated accordingly through
BGP.
6.2.1. Missing RP or Group mapping entry
Assume a scenario where a given Group prefix had 100 RPs in the BSM
received from a BSR. In the next periodic update after the BS_Period
interval, only 99 RPs are present for that group. This can happen
when an RP did not go down gracefully (i.e., it did not advertise
with Hold Time = 0). The BSR implementation on the PE will continue
to keep the RP mapping until the "RP Hold Time" expires. However,
this change needs to be communicated via BGP.
In this scenario, the originating PE keeps the "BSR Parameters" route
unchanged. The "BSM Group Parameters" route for the respective group
is re-advertised with a changed "RP Count" value in the BSR-BGP path
attribute. The "Fragment Tag" field in the path attribute MUST NOT be
changed. Along with that, the respective "BSM RP Parameters" route
(Type-3) MUST be withdrawn.
An egress PE receiving a changed Type-2 route (BSM Group Parameters)
MUST wait until the RP count matches before generating the BSM
towards the CE with this change. If the "End-of-RIB" marker is
negotiated, this check SHOULD be performed only after the
"End-of-RIB" marker is received. The RP will not be removed from the
local Group-to-RP mapping table until the RP hold time expires. The
BSM that is advertised towards the CE is changed to reflect the
missing RP. This procedure also handles any out-of-sequence messages.
For example, if the Type-3 (BSM RP Parameters) withdrawal arrives
before the changed Type-2 (BSM Group Parameters) route, the RP count
check fails, making the egress PE wait until the update is complete.
Another scenario to consider is when an RP for a group is removed and
a new RP is added. In this case, the "RP Count" remains the same, but
the BSM RP Parameters route (Type-3) corresponding to the old RP is
withdrawn, and a Type-3 route for the new RP is advertised. Even if
these two arrive in different BGP updates, the corresponding checks
at the egress PE ensure that a BSM is triggered only when the number
of Type-3 routes for the group matches the RP count in the
corresponding BSM Group Parameters route (Type-2).
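The RP count check described above can be sketched as follows. This
is a minimal illustration and not part of the specification: the
GroupState class, its field names, and the example addresses are
hypothetical, and the sketch assumes that the egress PE tracks the
live Type-3 routes of a group as a set of RP addresses.

```python
from dataclasses import dataclass, field


@dataclass
class GroupState:
    """Egress PE view of one group prefix (hypothetical structure)."""
    rp_count: int = 0                            # "RP Count" from the Type-2 route
    type3_rps: set = field(default_factory=set)  # RPs with a live Type-3 route

    def ready(self) -> bool:
        # A BSM may be generated only when the Type-3 routes held for
        # the group match the "RP Count" advertised in the Type-2 route.
        return len(self.type3_rps) == self.rp_count


# Out-of-sequence case: the Type-3 withdrawal arrives before the
# changed Type-2 route.
g = GroupState(rp_count=3, type3_rps={"10.0.0.1", "10.0.0.2", "10.0.0.3"})
g.type3_rps.discard("10.0.0.3")   # Type-3 withdrawn first
waiting = g.ready()               # check fails: 2 routes, RP Count still 3
g.rp_count = 2                    # changed Type-2 route arrives
settled = g.ready()               # check passes

# RP replacement: "RP Count" unchanged, old Type-3 withdrawn, new one
# advertised in a later BGP update.
g.type3_rps.discard("10.0.0.2")
mid_swap = g.ready()              # check fails until the new Type-3 arrives
g.type3_rps.add("10.0.0.9")
after_swap = g.ready()            # check passes
```

In both cases the egress PE holds back the BSM while the check fails,
regardless of the order in which the BGP updates arrive.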
Consider another scenario where initially the Group Count was 100.
The new BSM received has the same Group Count, but with one group
removed and a new group added. In this situation, an ingress PE need
not generate a new "BSR Parameters" (Type-1) route since the group
count did not change. For the group that was removed from the BSM,
the corresponding Type-2 route and its Type-3 routes MUST be
withdrawn, and for the new group a BSM Group Parameters route
(Type-2) with its BSM RP Parameters routes (Type-3) MUST be
advertised. At the egress PE, as soon as the Type-2 withdrawal
arrives, all the corresponding RP entries (Type-3) are marked
inactive and MUST NOT be considered, irrespective of whether a
withdrawal for those routes has been received. However, PEs SHOULD
keep those routes until the actual withdrawals arrive. The "Group
Count" and the "RP Count" of each group are to be matched before a
BSM is generated towards the CE.
Lastly, consider a scenario where a new Group is added to the
existing entries. In this case, the "BSR Parameters" route is
re-advertised with a modified "Group Count" value, keeping the
"Fragment Tag" the same. In addition, "BSM Group Parameters" and
"BSM RP Parameters" routes are generated for the new Group and its
RPs. Again, the egress PE MUST wait for the values to match before
generating the BSM towards the CE.
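The combined "Group Count" and "RP Count" check for a whole BSR can
be sketched as below. This is an illustrative sketch under stated
assumptions, not a definitive implementation: the function name
bsm_ready and the dictionary layout are hypothetical. Type-3 entries
whose group no longer has a Type-2 route are simply ignored, which
models the "inactive" state described above.

```python
def bsm_ready(group_count, type2, type3):
    """Return True when the egress PE may generate a BSM.

    group_count: "Group Count" from the Type-1 route, or None if the
                 Type-1 route is absent.
    type2: dict mapping group prefix -> advertised "RP Count".
    type3: dict mapping group prefix -> set of RP addresses.

    Type-3 entries for a group without a Type-2 route are inactive
    and are not considered, even if they have not been withdrawn yet.
    """
    if group_count is None or len(type2) != group_count:
        return False
    return all(len(type3.get(g, set())) == n for g, n in type2.items())


# Group swap with an unchanged Group Count of 2.
type2 = {"239.1.0.0/16": 1, "239.2.0.0/16": 1}
type3 = {"239.1.0.0/16": {"10.0.0.1"}, "239.2.0.0/16": {"10.0.0.2"}}
before = bsm_ready(2, type2, type3)

del type2["239.2.0.0/16"]                 # Type-2 withdrawal arrives
after_withdraw = bsm_ready(2, type2, type3)

type2["239.3.0.0/16"] = 1                 # Type-2 for the new group
after_new_type2 = bsm_ready(2, type2, type3)

type3["239.3.0.0/16"] = {"10.0.0.3"}      # Type-3 for the new group
after_new_type3 = bsm_ready(2, type2, type3)
```

Note that the final check passes even though the stale Type-3 entry
for 239.2.0.0/16 is still held, since it has no matching Type-2 route.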
6.2.2. Missing BSM
Like a missing mapping, an entire BSM can also go missing for several
reasons. The BSR may have gone down ungracefully, or a BSM may have
been lost due to congestion. The BSR specification [RFC5059] defines
a timer, BS_Timeout (default 2*BS_Period + 10 seconds), after which a
BSR is declared dead and a new BSR is elected.
While most of the scenarios are taken care of by the local BSR
implementation on the PEs, the missing BSM still needs to be
communicated between the PEs through BGP. When a BSM is missed, the
corresponding "BSR Parameters" (Type-1) route is withdrawn. However,
the corresponding "BSM Group Parameters" (Type-2) and "BSM RP
Parameters" (Type-3) routes MUST NOT be withdrawn by the ingress PE.
An egress PE receiving a withdrawal of the "BSR Parameters" route
(Type-1) MUST still keep the corresponding Type-2 and Type-3 entries.
However, it MUST NOT advertise the BSM to the CE without the Type-1
route present. As soon as the Type-1 route is withdrawn, the
BS_Timeout period has to be started at the egress PE and, upon its
expiry, all the Type-2 and Type-3 entries MUST be deleted.
As an example, say the egress PE generated a BSM at t=0. At t=1,
BS_Period expired at the ingress PE without the periodic BSM having
arrived, so the ingress PE withdraws the Type-1 (BSR Parameters)
route. The egress PE had already generated a BSM just before the
Type-1 withdrawal was received. The egress PE skips the next
periodic BSM towards the CE, but the CE is by now "off" by a
BS_Period interval. Once BS_Timeout expires, the egress PE removes
all the Type-2 and Type-3 entries, whereas the CEs connected to the
egress PE would remove the same entries a whole BS_Period later. To
avoid this, once BS_Timeout expires, an egress PE MUST generate a new
BSM towards the CE with the RP hold time set to "0" for all the
Type-2 and Type-3 entries. This keeps the CEs in sync with the PEs.
After generating the BSM, the PE removes all the Type-2 and Type-3
entries as stated above.
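The egress PE behaviour on a Type-1 withdrawal can be sketched as
follows. This is a minimal sketch under stated assumptions: the
EgressBsrState class and its method names are hypothetical, time is
modelled as plain seconds, a BSM is modelled as a dictionary of
group -> {RP: hold time}, and BS_PERIOD uses the 60 second default
from [RFC5059].

```python
BS_PERIOD = 60                    # seconds, default in RFC 5059
BS_TIMEOUT = 2 * BS_PERIOD + 10   # 130 seconds


class EgressBsrState:
    """Hypothetical per-BSR state kept by an egress PE."""

    def __init__(self, mappings):
        self.mappings = dict(mappings)   # group -> {rp: hold time}
        self.type1_present = True
        self.timeout_at = None

    def on_type1_withdrawn(self, now):
        # Keep the Type-2/Type-3 entries, but start BS_Timeout.
        self.type1_present = False
        self.timeout_at = now + BS_TIMEOUT

    def periodic_bsm(self):
        """BSM to send at BS_Period, or None while Type-1 is absent."""
        if not self.type1_present:
            return None
        return {g: dict(rps) for g, rps in self.mappings.items()}

    def on_tick(self, now):
        """On BS_Timeout expiry, return a final BSM with every hold
        time set to 0 and delete the entries; otherwise return None."""
        if self.timeout_at is None or now < self.timeout_at:
            return None
        final = {g: {rp: 0 for rp in rps} for g, rps in self.mappings.items()}
        self.mappings.clear()
        self.timeout_at = None
        return final


state = EgressBsrState({"239.1.0.0/16": {"10.0.0.1": 150}})
state.on_type1_withdrawn(now=100)
suppressed = state.periodic_bsm()      # None: no BSM without Type-1
early = state.on_tick(now=229)         # None: BS_Timeout not yet expired
final_bsm = state.on_tick(now=230)     # hold time 0 for every RP
```

Sending the zero hold time BSM before deleting the entries is what
keeps the CEs from retaining the mappings for an extra BS_Period.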
After the BSR is timed out (after BS_Timeout), when a new BSM comes
from the same BSR, a new "Fragment Tag" MUST be generated by the
ingress PE.
6.2.3. Change of Elected BSR
As per the BSR specification [RFC5059], when a preferred BSM is
received, the current 'Elected BSR' will transfer its state to
'Candidate BSR' and forward the received BSM. All the routers will
also change the elected BSR based on the preferred BSM. When an
originating PE's local bootstrap module elects a new BSR, all the old
Group-to-RP mapping entries advertised by the previous BSR MUST be
withdrawn.
6.3. Receiving Group-to-RP Mapping routes for BSR
The PEs receiving the BGP Group-to-RP Mapping route NLRIs will act as
a proxy. The first step is to check whether the received routes are
valid. If the "Fragment Tag" present in the "BSR Parameters" route
does not match that of the "BSM Group Parameters" (Type-2) and "BSM
RP Parameters" (Type-3) routes, then those entries are considered
invalid. Similarly, if the length of the BSR/RP/Group Address field
contains any value other than "0", "32" or "128", the message MUST be
considered malformed and MUST be discarded. The PE MUST also run an
RPF check for the BSR IP address to verify that the originating PE is
the router through which the BSR is reachable. If the preferred
route to the BSR is through the core, the RPF check is done as per
the MVPN upstream multicast hop (UMH) selection described in the MVPN
specification [RFC6513]. If the advertising PE is not the PE
matching UMH selection, or if the preferred route to the BSR is
through one of the CE interfaces, the RPF check fails and the routes
MUST be ignored.
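The validation steps above can be sketched as follows. This is an
illustration only: the function name, the dictionary keys, and the
per-route handling of malformed address lengths are assumptions made
for the sketch, and the UMH selection itself is assumed to be
computed elsewhere (per [RFC6513]) and passed in as umh_pe.

```python
VALID_ADDR_LENGTHS = {0, 32, 128}


def validate_routes(type1, others, umh_pe, bsr_via_ce):
    """Return the Type-2/Type-3 routes an egress PE would accept.

    type1:  dict with "fragment_tag", "bsr_addr_len", "originator".
    others: list of dicts with "fragment_tag" and "addr_len".
    umh_pe: PE chosen by UMH selection for the BSR address
            (assumed to be computed elsewhere).
    bsr_via_ce: True if the preferred route to the BSR is through a
                CE interface.
    """
    if type1["bsr_addr_len"] not in VALID_ADDR_LENGTHS:
        return []                 # malformed: discard
    if bsr_via_ce or type1["originator"] != umh_pe:
        return []                 # RPF check failed: ignore the routes
    return [
        r for r in others
        if r["addr_len"] in VALID_ADDR_LENGTHS
        and r["fragment_tag"] == type1["fragment_tag"]
    ]


type1 = {"fragment_tag": 7, "bsr_addr_len": 32, "originator": "192.0.2.1"}
others = [
    {"fragment_tag": 7, "addr_len": 32},   # valid
    {"fragment_tag": 8, "addr_len": 32},   # Fragment Tag mismatch
    {"fragment_tag": 7, "addr_len": 24},   # malformed address length
]
accepted = validate_routes(type1, others, umh_pe="192.0.2.1", bsr_via_ce=False)
rpf_failed = validate_routes(type1, others, umh_pe="192.0.2.2", bsr_via_ce=False)
via_ce = validate_routes(type1, others, umh_pe="192.0.2.1", bsr_via_ce=True)
```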
The receiving PE collects the Group-to-RP mapping routes per BSR IP
address and makes an entry in its BSR Group-to-RP mapping table.
From the Group-to-RP mapping per BSR, the egress PE forms a BSM. In
order to generate the BSM, the PE needs to construct certain fields,
such as the Checksum, which are not available in the advertised
NLRIs.
Unlike CE PIM routers, the PE receiving Group-to-RP mapping routes
via BGP will not receive periodic soft-state refreshes of the
mappings every BS_Period. The receiving PE MUST generate periodic
BSMs every BS_Period as specified in the BSR specification [RFC5059].
When there is a change in the corresponding Group-to-RP mapping
routes, a fresh BSM MUST be triggered once the "Group Count" and "RP
Count" checks match. The egress PE MUST also ensure that there is a
minimum period of BS_Min_Interval between successive BSMs sent
towards the CE, as noted in the BSR specification [RFC5059].
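The BS_Min_Interval spacing of triggered BSMs can be sketched as
below. This is a minimal sketch, assuming time in plain seconds and
the 10 second BS_Min_Interval default from [RFC5059]; the BsmPacer
class and its method names are hypothetical.

```python
BS_MIN_INTERVAL = 10   # seconds, default in RFC 5059


class BsmPacer:
    """Hypothetical helper that spaces BSMs sent towards a CE."""

    def __init__(self):
        self.last_sent = None

    def next_send_time(self, now):
        """Earliest time, at or after `now`, at which a BSM may be sent."""
        if self.last_sent is None:
            return now
        return max(now, self.last_sent + BS_MIN_INTERVAL)

    def mark_sent(self, when):
        self.last_sent = when


pacer = BsmPacer()
first = pacer.next_send_time(now=0)     # nothing sent yet: send at once
pacer.mark_sent(first)
delayed = pacer.next_send_time(now=3)   # change at t=3 is held until t=10
later = pacer.next_send_time(now=25)    # interval already elapsed
```

A change that arrives inside the interval is deferred rather than
dropped, so the CE still receives the updated mappings.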
When a "BSR Parameters" (Type-1) route is received with a "Group
Count" of 0, the PE MUST treat it as an empty BSM. An empty BSM MUST
be formed and sent to the CEs with the other relevant fields
populated. Other aspects of electing a BSR based on the BSR priority
MUST be the same as specified in the BSR specification [RFC5059].
7. Security Considerations
Since a BSR message allows semantic fragmentation, a message can be
very large, with many mappings, thereby causing the PE to generate a
large number of Group-to-RP mapping route NLRIs. An implementation SHOULD be
able to restrict the number of Groups and RP mappings allowed on a
VRF or interface level so that the number of BGP routes generated for
the mappings is controlled.
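One possible form of such a restriction is sketched below. The
function name and the two limit parameters are hypothetical; the
document does not define specific configuration knobs, and an
implementation may choose to truncate rather than reject.

```python
def admit_bsm(groups, max_groups, max_rps_per_group):
    """Return True if a received BSM fits within the configured limits.

    groups: dict mapping group prefix -> list of RP addresses.
    max_groups, max_rps_per_group: hypothetical per-VRF or
    per-interface limits.
    """
    if len(groups) > max_groups:
        return False
    return all(len(rps) <= max_rps_per_group for rps in groups.values())


bsm = {"239.1.0.0/16": ["10.0.0.1", "10.0.0.2"], "239.2.0.0/16": ["10.0.0.3"]}
within_limits = admit_bsm(bsm, max_groups=2, max_rps_per_group=2)
too_many_rps = admit_bsm(bsm, max_groups=2, max_rps_per_group=1)
too_many_groups = admit_bsm(bsm, max_groups=1, max_rps_per_group=2)
```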
8. IANA Considerations
This document defines a new NLRI, called MCAST-VPN-BSR, to be carried
in BGP using multiprotocol extensions. It requires assignment of a
new SAFI.
This document defines a new BGP optional transitive attribute, called
BSR-BGP.
9. Acknowledgments
The authors would like to thank Huajin Jeng (AT&T), Jeffrey Haas
(Juniper), Yakov Rekhter (Juniper) and Eric Rosen (Cisco) for their
valuable review and feedback.
10. References
10.1. Normative Reference
[RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
Protocol 4 (BGP-4)", RFC 4271, January 2006.
[RFC4601] Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
"Protocol Independent Multicast - Sparse Mode (PIM-SM):
Protocol Specification (Revised)", RFC 4601, August 2006.
[RFC4684] Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk,
R., Patel, K., and J. Guichard, "Constrained Route
Distribution for Border Gateway Protocol/MultiProtocol
Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual
Private Networks (VPNs)", RFC 4684, November 2006.
[RFC4724] Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y.
Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724,
January 2007.
[RFC5015] Handley, M., Kouvelas, I., Speakman, T., and L. Vicisano,
"Bidirectional Protocol Independent Multicast (BIDIR-
PIM)", RFC 5015, October 2007.
[RFC5059] Bhaskar, N., Gall, A., Lingard, J., and S. Venaas,
"Bootstrap Router (BSR) Mechanism for Protocol Independent
Multicast (PIM)", RFC 5059, January 2008.
[RFC6513] Rosen, E. and R. Aggarwal, "Multicast in MPLS/BGP IP
VPNs", RFC 6513, February 2012.
[RFC6514] Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
Encodings and Procedures for Multicast in MPLS/BGP IP
VPNs", RFC 6514, February 2012.
[RFC6515] Aggarwal, R. and E. Rosen, "IPv4 and IPv6 Infrastructure
Addresses in BGP Updates for Multicast VPN", RFC 6515,
February 2012.
[RFC6226] Joshi, B., Kessler, A., and D. McWalter, "PIM Group-to-
Rendezvous-Point Mapping", RFC 6226, May 2011.
10.2. Informative Reference
[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
Networks (VPNs)", RFC 4364, February 2006.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
Authors' Addresses
Pavan Kurapati
Juniper Networks
1194 N. Mathilda Ave.
Sunnyvale, CA 94089
USA
Email: kurapati@juniper.net
URI: http://www.juniper.net/
Marco Rodrigues
Juniper Networks
1194 N. Mathilda Ave.
Sunnyvale, CA 94089
USA
Email: mprodrigues@juniper.net
URI: http://www.juniper.net/
Kurt Windisch
Juniper Networks
1194 N. Mathilda Ave.
Sunnyvale, CA 94089
USA
Email: kurtw@juniper.net
URI: http://www.juniper.net/
Saud Asif
AT&T LABS
200 S Laurel Ave.
Middletown, NJ 07748
USA
Email: sasif@att.com
URI: http://www.att.com/