Network Working Group X. Xu
Internet-Draft China Mobile
Intended status: Standards Track S. Hegde
Expires: 17 February 2024 S. Sangli
Juniper
S. Zhang
J. Dong
Huawei
16 August 2023
BGP Route Broker for Hyperscale SDN
draft-xu-idr-bgp-route-broker-03
Abstract
This document describes an optimized BGP route reflector mechanism,
referred to as a BGP route broker, so as to use BGP-based IP VPN as
an overlay routing protocol in a scalable way for hyperscale data
center network virtualization environments, also known as Software-
Defined Network (SDN) environments.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 17 February 2024.
Copyright Notice
Copyright (c) 2023 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
Xu, et al. Expires 17 February 2024 [Page 1]
Internet-Draft BGP Route Broker August 2023
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Problem Statement
1.1. Requirements Language
2. Solution Overview
3. Route Target Membership Advertisement Process
4. Route Distribution Process
5. Deployment Considerations
6. IANA Considerations
7. Security Considerations
8. Acknowledgements
9. References
9.1. Normative References
9.2. Informative References
Authors' Addresses
1. Problem Statement
BGP/MPLS IP VPN has been successfully deployed in service provider
networks worldwide for two decades and has therefore proven to be
scalable enough for large-scale networks. Here, BGP/MPLS IP VPN
means both BGP/MPLS IPv4 VPN [RFC4364] and BGP/MPLS IPv6 VPN
[RFC4659]. In addition, the BGP/MPLS IP VPN-based data center network
virtualization approaches described in [RFC7814], and especially the
virtual PE model described in [I-D.ietf-bess-virtual-pe], have been
widely deployed in small to medium-sized data centers for network
virtualization, also known as Software-Defined Networking (SDN).
Examples include, but are not limited to, OpenContrail.
Hyperscale cloud data centers typically house tens of thousands of
servers, which in turn are virtualized as Virtual Machines (VMs) or
containers. If the virtual PE model mentioned above (a.k.a. a host-
based network virtualization model) is used, this usually means at
least tens of thousands of virtual PEs, millions of VPNs, and tens of
millions of VPN routes from the network virtualization perspective.
This poses a significant challenge to both the BGP session capacity
and the VPN routing table capacity of any given BGP router.
The route reflection mechanism should undoubtedly be considered in
order to address the BGP scaling issues mentioned above. Assuming
that a typical one-level route reflector architecture is used, it is
straightforward to partition all the VPN routes supported
by a data center among multiple route reflectors, with each route
reflector being preconfigured with a block of route targets
associated with a subset of the VPNs. In other words, there is no
need for a single route reflector to maintain all the VPN routes
supported by the data center. For redundancy, more than one route
reflector SHOULD be preconfigured with the same block of route
targets so as to form an RR cluster.
If each virtual PE is attached to at least one VPN corresponding to a
given route reflector, that route reflector has to establish BGP
sessions with all virtual PEs, which places a heavy BGP session load
on route reflectors. Now assume that another level (bottom-level) of
route reflectors is introduced between the existing level (top-level)
of route reflectors and the virtual PEs. Each top-level route
reflector then establishes BGP sessions with all bottom-level route
reflectors rather than with all virtual PEs. In addition, each
bottom-level route reflector only needs to establish BGP sessions
with a subset of the virtual PEs. As a result, the scaling issue of
the BGP session capacity is solved through the above partition
mechanism.
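The effect of the two-level hierarchy on per-node session counts can
be illustrated with simple arithmetic. The following Python sketch is
illustrative only: the function names and the example figures are
assumptions and are not taken from this document. It assumes the
worst case described above for the one-level design, and it assumes
that virtual PEs are spread evenly over the bottom-level route
reflectors, with each virtual PE peering with exactly one of them.

```python
import math


def one_level_sessions_per_rr(num_vpes: int) -> int:
    """Worst case: a route reflector peers with every virtual PE."""
    return num_vpes


def two_level_sessions(num_vpes: int, num_bottom_rrs: int,
                       num_top_rrs: int) -> dict:
    """Per-node BGP session counts in a two-level hierarchy.

    Assumption: each virtual PE peers with exactly one bottom-level
    route reflector, and virtual PEs are spread evenly.
    """
    vpes_per_bottom = math.ceil(num_vpes / num_bottom_rrs)
    return {
        # A top-level RR peers with every bottom-level RR only.
        "top_level_rr": num_bottom_rrs,
        # A bottom-level RR peers with its share of virtual PEs
        # plus every top-level RR.
        "bottom_level_rr": vpes_per_bottom + num_top_rrs,
    }


print(one_level_sessions_per_rr(50_000))
print(two_level_sessions(50_000, 100, 10))
```

With these hypothetical figures, a one-level route reflector would
carry 50000 sessions, whereas a top-level route reflector carries 100
and a bottom-level route reflector carries 510.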
In the above two-level RR hierarchy within hyperscale data centers,
deploying the Route Target Constraint (RTC) mechanism defined in
[RFC4684] would have at least the following two drawbacks: 1) it is
hard to partition all the VPN routes supported by the data center
among multiple top-level RRs; 2) virtual PEs would have to receive RT
membership NLRIs corresponding to all of the route targets supported
by the data center, which unnecessarily wastes CPU and RAM resources
on virtual PEs.
1.1. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
2. Solution Overview
Borrowing from widely adopted high-performance message queuing
mechanisms (e.g., RabbitMQ), the bottom-level route reflectors,
referred to as route brokers in the following text, work as follows:
they only need to maintain the route target membership information of
their IBGP peers and reflect VPN routes among them on demand. In
short, route brokers act as the message brokers/exchanges of the
message queuing system, while top-level route reflectors, referred to
as route collection servers, and virtual PEs, referred to as route broker
clients, act as both message publishers/producers and subscribers/
consumers of the message queuing system.
3. Route Target Membership Advertisement Process
Route collection servers advertise route target membership
information according to the block of route targets preconfigured on
each of them. As such, route brokers learn which VPNs are assigned to
each route collection server.
Route brokers advertise default route target membership information
to their own route broker clients so as to collect the VPN routes
originated by those clients and then reflect them to the
corresponding route collection servers.
Route broker clients advertise route target membership information
according to the block of route targets that is dynamically
configured on them. Upon receiving such an advertisement, route
brokers dispatch the received route target membership information
towards the corresponding route collection servers whose
preconfigured blocks of route targets cover the advertised route
targets.
The advertisement of route target membership information is built on
the Route Target Outbound Route Filtering (ORF) mechanism defined in
[I-D.xu-idr-route-target-orf].
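The membership handling described in this section can be sketched as
follows. This Python fragment is a minimal illustration under stated
assumptions and is not a definition of the mechanism: the class name,
method names, peer names, and the DEFAULT_RT placeholder are all
hypothetical, route targets are modelled as plain strings, and the
Route Target ORF encoding itself is not modelled.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for default route target membership.
DEFAULT_RT = "default"


@dataclass
class RouteBroker:
    # Route collection server -> its preconfigured block of RTs.
    server_blocks: dict = field(default_factory=dict)
    # Any IBGP peer -> the RTs that peer has advertised interest in.
    membership: dict = field(default_factory=dict)

    def on_server_membership(self, server, route_targets):
        """Learn which route targets a route collection server covers."""
        self.server_blocks[server] = set(route_targets)
        self.membership[server] = set(route_targets)

    def membership_for_client(self, client):
        """Default membership sent to a client to collect its routes."""
        return {DEFAULT_RT}

    def on_client_membership(self, client, route_targets):
        """Record a client's membership and return, per server, the
        route targets to dispatch towards that server."""
        rts = set(route_targets)
        self.membership.setdefault(client, set()).update(rts)
        dispatch = {}
        for server, block in self.server_blocks.items():
            covered = rts & block
            if covered:
                dispatch[server] = covered
        return dispatch


broker = RouteBroker()
broker.on_server_membership("rcs1", {"RT:1", "RT:2"})
broker.on_server_membership("rcs2", {"RT:3"})
to_dispatch = broker.on_client_membership("vpe1", {"RT:2", "RT:3"})
```

In this example, the membership for RT:2 is dispatched only towards
rcs1 and the membership for RT:3 only towards rcs2, so neither route
collection server learns about route targets outside its block.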
4. Route Distribution Process
Upon receiving a route update message from a route collection server
which contains VPN routes for a given VPN, if those VPN routes are
selected as best routes, route brokers store them in their local RIBs
and then reflect them to their route broker clients which are
associated with that VPN. Meanwhile, the cluster ID of route brokers
SHOULD be prepended when reflecting the above VPN routes.
Upon receiving a route update message from a route broker client
which contains VPN routes for a given VPN, if those VPN routes are
selected as best routes, route brokers store them in their local RIBs
and then reflect them to the other IBGP peers (including route
collection servers and other route broker clients) which are
associated with that VPN. Meanwhile, the cluster ID of route brokers
SHOULD be prepended when reflecting the above VPN routes.
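The two reflection rules above can be sketched as follows. This
Python fragment is illustrative only and rests on several
assumptions: the type and function names are hypothetical, best-route
selection is taken as already done, a VPN is identified by a single
route target string, and the cluster ID is prepended to a
CLUSTER_LIST-like tuple. Discarding a route whose cluster list
already contains the local cluster ID follows conventional route
reflection behaviour and is not stated in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VpnRoute:
    prefix: str
    route_target: str
    cluster_list: tuple = ()


def reflect(route, sender, membership, servers, cluster_id):
    """Return {peer: route_to_send} for a best route from `sender`.

    membership: peer name -> set of route targets of interest
    servers:    names of peers that are route collection servers
    """
    if cluster_id in route.cluster_list:
        return {}  # assumed loop check, as in ordinary reflection
    reflected = VpnRoute(route.prefix, route.route_target,
                         (cluster_id,) + route.cluster_list)
    from_server = sender in servers
    out = {}
    for peer, rts in membership.items():
        if peer == sender or route.route_target not in rts:
            continue
        if from_server and peer in servers:
            continue  # server routes go to route broker clients only
        out[peer] = reflected
    return out


membership = {
    "rcs1": {"RT:1"}, "rcs2": {"RT:1"},
    "vpe1": {"RT:1"}, "vpe2": {"RT:1"}, "vpe3": {"RT:2"},
}
servers = {"rcs1", "rcs2"}
route = VpnRoute("10.0.0.0/24", "RT:1")
from_client = reflect(route, "vpe1", membership, servers, "1.1.1.1")
from_server = reflect(route, "rcs1", membership, servers, "1.1.1.1")
```

Here a route received from vpe1 is reflected to rcs1, rcs2, and vpe2,
whereas the same route received from rcs1 is reflected to vpe1 and
vpe2 only. In both cases vpe3 receives nothing because it is not
associated with that VPN.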
Upon receiving an implicit route request for all the VPN routes for
one or more VPNs (via the route target membership information
advertisement) from a route broker client, route brokers SHOULD
respond with the corresponding VPN routes stored in their local RIBs
to that route broker client.
Upon receiving an implicit route request for all the VPN routes for
one or more VPNs (via the route target membership information
advertisement) from a route collection server, route brokers SHOULD
respond to that route collection server with the corresponding VPN
routes stored in their local RIBs which were learnt from their own
route broker clients.
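The handling of implicit route requests can be sketched as follows.
This Python fragment is a minimal illustration, not a normative
procedure: the function name, the tuple layout of the RIB entries,
and the peer names are hypothetical. Not returning a route to the
peer it was learnt from is an added assumption based on ordinary BGP
behaviour.

```python
def routes_for_request(rib, requester, requested_rts, clients):
    """Select the prefixes to send in response to a membership
    advertisement acting as an implicit route request.

    rib:     list of (prefix, route_target, learned_from) tuples
    clients: names of this route broker's own route broker clients
    """
    requester_is_client = requester in clients
    reply = []
    for prefix, route_target, learned_from in rib:
        if route_target not in requested_rts:
            continue
        if learned_from == requester:
            continue  # assumption: do not echo a route to its source
        if not requester_is_client and learned_from not in clients:
            continue  # servers only get routes learnt from clients
        reply.append(prefix)
    return reply


rib = [
    ("10.0.1.0/24", "RT:1", "vpe1"),
    ("10.0.2.0/24", "RT:1", "rcs1"),
    ("10.0.3.0/24", "RT:2", "vpe2"),
]
clients = {"vpe1", "vpe2", "vpe3"}
to_client = routes_for_request(rib, "vpe3", {"RT:1"}, clients)
to_server = routes_for_request(rib, "rcs2", {"RT:1"}, clients)
```

In this example, the client vpe3 receives both RT:1 routes, while the
route collection server rcs2 receives only the RT:1 route that was
learnt from a route broker client.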
5. Deployment Considerations
To simplify VPN route distribution control, each VPN SHOULD be
assigned a globally unique export route target value.
Since the advertisement of multiple paths for a given VPN prefix is
needed in data center SDN environments, virtual PEs SHOULD be
assigned different Route Distinguishers (RDs).
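The reason different RDs preserve multiple paths can be sketched as
follows. This Python fragment is illustrative only: the function
names and example values are hypothetical, and "first path wins" is
merely a stand-in for best-route selection. It models a reflector
that keeps one path per VPN NLRI, where the NLRI is the RD combined
with the prefix.

```python
def vpn_nlri(rd: str, prefix: str) -> str:
    """A VPN route is keyed by its RD together with the IP prefix."""
    return f"{rd}:{prefix}"


def paths_kept(adverts):
    """Keep one path per VPN NLRI.

    adverts: iterable of (rd, prefix, next_hop) tuples
    """
    kept = {}
    for rd, prefix, next_hop in adverts:
        # Stand-in for best-route selection: the first path wins.
        kept.setdefault(vpn_nlri(rd, prefix), next_hop)
    return kept


same_rd = paths_kept([
    ("65000:1", "10.0.0.0/24", "vpe1"),
    ("65000:1", "10.0.0.0/24", "vpe2"),
])
different_rds = paths_kept([
    ("65000:1", "10.0.0.0/24", "vpe1"),
    ("65000:2", "10.0.0.0/24", "vpe2"),
])
```

With a shared RD, the two advertisements collapse into a single NLRI
and only one next hop survives. With per-PE RDs, both paths are kept
as distinct NLRIs.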
To prevent VPN routes learnt from a given route collection server
from being reflected to another route collection server, route
collection servers SHOULD be configured with the same cluster ID.
Virtual PEs SHOULD NOT establish BGP sessions with more than one
cluster of route brokers which are configured with the same cluster
ID.
6. IANA Considerations
TBD
7. Security Considerations
TBD
8. Acknowledgements
The authors would like to thank Robert Raszuk for their valuable
comments and suggestions on this document.
9. References
9.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
2006, <https://www.rfc-editor.org/info/rfc4364>.
[RFC4659] De Clercq, J., Ooms, D., Carugi, M., and F. Le Faucheur,
"BGP-MPLS IP Virtual Private Network (VPN) Extension for
IPv6 VPN", RFC 4659, DOI 10.17487/RFC4659, September 2006,
<https://www.rfc-editor.org/info/rfc4659>.
[RFC4684] Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk,
R., Patel, K., and J. Guichard, "Constrained Route
Distribution for Border Gateway Protocol/MultiProtocol
Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual
Private Networks (VPNs)", RFC 4684, DOI 10.17487/RFC4684,
November 2006, <https://www.rfc-editor.org/info/rfc4684>.
[RFC5291] Chen, E. and Y. Rekhter, "Outbound Route Filtering
Capability for BGP-4", RFC 5291, DOI 10.17487/RFC5291,
August 2008, <https://www.rfc-editor.org/info/rfc5291>.
[RFC7814] Xu, X., Jacquenet, C., Raszuk, R., Boyes, T., and B. Fee,
"Virtual Subnet: A BGP/MPLS IP VPN-Based Subnet Extension
Solution", RFC 7814, DOI 10.17487/RFC7814, March 2016,
<https://www.rfc-editor.org/info/rfc7814>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
9.2. Informative References
[I-D.ietf-bess-virtual-pe]
Fang, L., Fernando, R., Napierala, M., Bitar, N. N., and
B. Rijsman, "BGP/MPLS VPN Virtual PE", Work in Progress,
Internet-Draft, draft-ietf-bess-virtual-pe-00, 12 November
2014, <https://datatracker.ietf.org/doc/html/draft-ietf-
bess-virtual-pe-00>.
[I-D.xu-idr-route-target-orf]
Xu, X., Hegde, S., Sangli, S., Shunwan, and Jie, "Route
Target ORF", Work in Progress, Internet-Draft, draft-xu-
idr-route-target-orf-00, 16 August 2023,
<https://datatracker.ietf.org/api/v1/doc/document/draft-
xu-idr-route-target-orf/>.
Authors' Addresses
Xiaohu Xu
China Mobile
Email: xuxiaohu_ietf@hotmail.com
Shraddha Hegde
Juniper
Email: shraddha@juniper.net
Srihari Sangli
Juniper
Email: ssangli@juniper.net
Shunwan
Huawei
Email: zhuangshunwan@huawei.com
Jie
Huawei
Email: jie.dong@huawei.com