Internet DRAFT - draft-li-idr-congestion-status-extended-community
draft-li-idr-congestion-status-extended-community
IDR Z. Li
Internet-Draft China Mobile
Updates: 4271, 4360, 7153 (if approved) J. Dong
Intended status: Standards Track Huawei Technologies
Expires: September 4, 2018 March 3, 2018
Carry congestion status in BGP community
draft-li-idr-congestion-status-extended-community-07
Abstract
To aid BGP receiver to steer the AS-outgoing traffic among the exit
links, this document introduces a new BGP community, congestion
status community, to carry the link bandwidth and utilization
information, especially for the exit links of one AS. If accepted,
this document will update RFC4271, RFC4360 and RFC7153.
The introducd congestion status community is not used to impact the
decision process of BGP specified in section 9.1 of RFC4271, but can
be used by route policy to impact the data forwarding behavior.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 4, 2018.
Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
Li & Dong Expires September 4, 2018 [Page 1]
Internet-Draft congestion status community March 2018
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Requirements Language . . . . . . . . . . . . . . . . . . . . 4
3. Previous Work . . . . . . . . . . . . . . . . . . . . . . . . 4
4. Solution Alternative 1: Extended Community . . . . . . . . . 4
5. Solution Alternative 2: Large Community . . . . . . . . . . . 6
6. Solution Alternative 3: Community Container . . . . . . . . . 6
7. Deployment Considerations . . . . . . . . . . . . . . . . . . 8
8. Security Considerations . . . . . . . . . . . . . . . . . . . 9
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9
10. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 9
11. References . . . . . . . . . . . . . . . . . . . . . . . . . 9
11.1. Normative References . . . . . . . . . . . . . . . . . . 10
11.2. Informative References . . . . . . . . . . . . . . . . . 10
Appendix A. Bandwidth Values . . . . . . . . . . . . . . . . . . 11
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 12
1. Introduction
Knowing the congestion status (bandwidth and utilization) of the AS
exit links is useful for traffic steering, especially for steering
the AS outgoing traffic among the exit links. Section 7 of
[I-D.gredler-idr-bgplu-epe] explicitly specifies this kind of
requirement, which is also needed in our field network.
The following figure is used to illustrate the benefits of knowing
the congestion status of the AS exit links. AS A has multiple exit
links connected to AS B. Both AS A and B has exit link to AS C, and
AS B provides transit service for AS A. Due to cost or some other
reasons, AS A prefers using AS B to transmit its' traffic to AS C,
not the directly connected link between AS A and C. If the exit
routers, Router 7 and 8, in AS A tell their iBGP peers the congestion
status of the exit links, the peers in turn can steer some outgoing
traffic toward the less loaded exit link. If AS A knows the link
between AS B and AS C is congested, it can steer some traffic towards
AS C from AS B to the directly connected link by applying some route
policies.
Li & Dong Expires September 4, 2018 [Page 2]
Internet-Draft congestion status community March 2018
+-------------------------------------------+
| AS C |
| +----------+ +----------+ |
+--| Router 1 |---------------| Router 2 |--+
+----------+ +----------+
| |
| |
| +----------+
| +--------| Router 3 |----------+
| | +----------+ |
| | AS B |
| | +----------+ +----------+ |
| +-| Router 4 |----| Router 5 |-+
| +----------+ +----------+
| | |
| | |
+----------+ +----------+ +----------+
+--| Router 6 |--------| Router 7 |----| Router 8 |-+
| +----------+ +----------+ +----------+ |
| AS A |
+---------------------------------------------------+
This document introduces new BGP extensions to deliver the congestion
status of the exit link to other BGP speakers. The BGP receiver can
then use this community to deploy route policy, thus steer AS
outgoing traffic according to the congestion status of the exit
links. This mechanisum can be used by both iBGP and eBGP.
In this verion, we provide three solution alternatives according to
the discussion in the face to face meetings and mail list. After
adoption, one solution will be selected as the final solution based
on the working group consensus.
In a network deployed SDN (Software Defined Network) controller,
congestion status extended community can be used by the controller to
steer the AS outgoing traffic among all the exit links from the
perspective of the whole network.
For the network with Route Reflectors (RRs) [RFC4456], RRs by default
only advertise the best route for a specific prefix to their clients.
Thus RR clients has no opportunity to compare the congestion status
among all the exit links. In this situation, to allow RR clients
learning all the routes for a specific prefix from all the exit
links, RRs are RECOMMENDED to enable add-path functionality
[RFC7911].
Li & Dong Expires September 4, 2018 [Page 3]
Internet-Draft congestion status community March 2018
To emphasize, the introduced new BGP extensions have no impact on the
decision process of BGP specified in section 9.1 of [RFC4271], but
can be used by route policy to impact the data forwarding behavior.
2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
3. Previous Work
In [constrained-multiple-path], authors from France Telecom also
specified the requirement to know the congestion status of a link.
To aid a router to perform unequal cost load balancing, experts from
Cisco introduced Link Bandwidth Extended Community in
[link-bandwidth-community] to carry the cost to reach the external
BGP neighbor. The cost can be either configured per neighbor or
derived from the bandwidth of the link that connects the router to a
directly connected external neighbor. This document was accepted by
the IDR working group, but expired in 2013.
Link Bandwidth Extended Community only carries the link bandwidth of
the exit link. The method provided in our document can carry the
link bandwidth together with the link utilization information. What
the BGP receiver needs to impact its traffic steering policy is the
up-to-date unused link bandwith, which can be derived from the link
bandwith and link utilization. Since Link Bandwidth Extended
Community is expired, the BGP speaker who receives update message
with both Link Bandwidth Extended Community and Congestion Status
Community SHOULD ignore the Link Bandwidth Extended Community and use
the Congestion Status Community.
4. Solution Alternative 1: Extended Community
As described in [RFC4360], the extended community attribute is an
8-octet value with the first one or two octets to indicate the type
of this attribute. Since congestion status community needs to be
delivered from on AS to other ASes, and used by the BGP speakers both
in other ASes and within the same AS as the sender, it MUST be a
transitive extended community, i.e. the T bit in the first octet MUST
be zero.
We only define the congestion status community for four-octet AS
number [RFC6793], since all the BGP speakers can handle four-octet AS
number now and the two-octet AS numbers can be mapped to four-octet
Li & Dong Expires September 4, 2018 [Page 4]
Internet-Draft congestion status community March 2018
AS numbers by setting the two high-order octets of the four-octet
field to zero, as per [RFC6793].
Congestion status community is a sub-type allocated from Transitive
Four-Octet AS-Specific Extended Community Sub-Types defined in
section 5.2.4 of [RFC7153]. Its format is as Figure 1.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type =0x02 | Sub-Type | Sender AS Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender AS Number (cont.) | Bandwidth | Utilization |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 1: Congestion status extended community
Type: 1 octet. This field MUST be 0x02 to indicate this is a
Transitive Four-Octet AS-Specific Extended Community.
Sub-Type: 1 octet. It is used to indicate this is a Congestion
Status Extended Community. Its value is to be assigned by IANA.
Sender AS Number: 4 octets. Its value is the AS number of the BGP
speaker who generates this congestion status extended community.
If the generator has 2-octct AS number, it MUST encode its AS
number in the last (low order) two bytes and set the first (high
order) two bytes to zero, as per [RFC6793].
Bandwidth: 1 octet. Its value is the bandwidth of the exit link
in unit of 10 gbps (gigabits per second). The link with bandwidth
less than 10 gbps is not suitable to use this feature. To reflect
the practice that sometimes the traffic is rate limited to a
capacity smaller than the physical link, the value of the
bandwidth can be the configured capacity of the link. The
available configured capacity can be calculated from this field
together with Utilization field. Zero means the bandwidth is
unknown or is not advertised to other peers.
Utilization: 1 octet. Its value is the utilization of the exit
link in unit of percent. A value bigger than 100 means the
incoming traffic is higher than the link capacity. We can use the
"Utilization" field together with the "Bandwidth" field to
calculate the traffic load that we can further steer to this exit
link.
Li & Dong Expires September 4, 2018 [Page 5]
Internet-Draft congestion status community March 2018
5. Solution Alternative 2: Large Community
As described in [RFC8092], the BGP large community attribute is an
optional transitive path attribute of variable length, consisting of
12-octet values. The BGP large community attribute is mainly used to
extend the size of BGP Community [RFC1997] and Extened Community
[RFC4360], thus to accommodate at least two four-octet ASNs
[RFC6793]. As shown in the following figure, the format of the
12-octet BGP Large Community value is not suitable to be used to
define new type for congestion status community.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Global Administrator |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Local Data Part 1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Local Data Part 2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2
Global Administrator: A four-octet namespace identifier.
Local Data Part 1: A four-octet operator-defined value.
Local Data Part 2: A four-octet operator-defined value.
6. Solution Alternative 3: Community Container
As described in [I-D.ietf-idr-wide-bgp-communities], the BGP
Community Container has flexible encoding format, which we can use to
define the congestion status community.
A new type of the BGP Community Container is defined for the
congestion status community, which has the same common header as the
BGP Community Container with the following encoding format.
Li & Dong Expires September 4, 2018 [Page 6]
Internet-Draft congestion status community March 2018
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Flags |C|T| Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | Sender AS Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender AS Number (cont.) | Bandwidth |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Bandwidth (cont.) | Utilization | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3
Type: 2 octets. Its value is to be assigned by IANA from the
registry "BGP Community Container Types" to indicate this is the
Congestion Status Community.
Flags: 1 octet. C and T bits MUST be set to indicate the
Congestion Status Community is transitive across confederation and
AS boundaries. The other bits in Flags field MUST be set to zero
when originated and SHOULD be ignored upon receipt.
Reserved: Reserved fields are reserved for future definition,
which MUST be set to zero when originated and SHOULD be ignored
upon receipt.
Length: 2 octets. This field represents the total length of a
given container's contents in octets.
Sender AS Number: 4 octets. Its value is the AS number of the BGP
speaker who generates this congestion status community. If the
generator has 2-octct AS number, it MUST encode its AS number in
the last (low order) two bytes and set the first (high order) two
bytes to zero, as per [RFC6793].
Bandwidth: 4 octets. Its value is the bandwidth of the exit link
in IEEE floating point format (see [IEEE.754.1985]), expressed in
bytes per second. Zero means the bandwidth is unknown or is not
advertised to other peers. Appendix A lists some typical
bandwidth values, most of which are extracted from Section 3.1.2
of [RFC3471].
To reflect the practice that sometimes the traffic is rate limited
to a capacity smaller than the physical link, the value of the
bandwidth can be the configured capacity of the link. The
available configured capacity can be calculated from this field
together with Utilization field.
Li & Dong Expires September 4, 2018 [Page 7]
Internet-Draft congestion status community March 2018
Utilization: 1 octet. Its value is the utilization of the exit
link in unit of percent. A value bigger than 100 means the
incoming traffic is higher than the link capacity. We can use the
"Utilization" field together with the "Bandwidth" field to
calculate the traffic load that we can further steer to this exit
link.
7. Deployment Considerations
o To avoid route oscillation
The exit router SHOULD set a threshold. When the utilization
change reaches the threshold, the exit router SHOULD generate a
BGP update message with congestion status community.
Implementations SHOULD further reduce the BGP update messages
trigered by link utilization change using the method similar to
BGP Route Flap Damping [RFC2439]. When link utilization change
by small amounts that fall under thresholds that would cause
the announcement of BGP update message, implementations SHOULD
suppress the announcement and set the penalty value
accordingly.
To reduce the update churn introduced, when one BGP router
needs to re-advertise a BGP path due to attribute changes, it
SHOULD update its Congestion Status Community at the same time.
Supposing there are N ASes on the way from the far end egress
BGP speaker to the final ingress BGP speaker, this allows
reducing the update churn as the final ingress BGP speaker will
receive a single UPDATE refreshing the N communities, rather
than N UPDATEs, each refreshing one community.
o To avoid traffic oscillation
Traffic oscillation means more traffic than expected is
attracted to the low utilized link, and some traffic has to be
steered back to other links.
Route policy is RECOMMENDED to be set at the exit router.
Congestion status community is only conveyed for some specific
routes or only for some specific BGP peers.
Congestion status community can also be used in a SDN network.
The SDN controller uses the exit link utilization information
to steer the Internet access traffic among all the exit links
from the perspective of the whole network.
o Other Conserns
Li & Dong Expires September 4, 2018 [Page 8]
Internet-Draft congestion status community March 2018
To avoid forwarding loops incremental deployment issues,
complications in error handling, the reception of such
community over IBGP session SHOULD NOT influence routing
decision unless tunneling is used to reach the BGP Next-Hop.
8. Security Considerations
This document defines a new BGP community to carry the congestion
status of the exit link. It is up to the BGP receiver to trust the
congestion status communities or not. Following deployment models
can be considered.
The BGP receiver may choose to only trust the congestion status
communities generated by some specific ASes or containing
bandwidth greater than a specific value.
You can filter the congestion status communities at the border of
your trust/administrative domain. Hence all the ones you receive
are trusted.
You can record the communities received over time, monitor the
congestion e.g. via probing, detect inconsistency and choose to
not trust anymore the ASes which advertise fake news.
9. IANA Considerations
For solution alternative 1, one sub-type is solicited to be assigned
from Transitive Four-Octet AS-Specific Extended Community Sub-Types
registry to indicate the Congestion Status Community defined in this
document.
For solution alternative 3, one community value is solicited to be
assigned from the registry "Registered Type 1 BGP Wide Community
Community Types" to indicate the Congestion Status Community defined
in this document.
10. Acknowledgments
We appreciate the constructive suggestions received from Bruno
Decraene. Many thanks to Rudiger Volk, Susan Hares, John Scudder,
Randy Bush for their review and comments to improve this document.
11. References
Li & Dong Expires September 4, 2018 [Page 9]
Internet-Draft congestion status community March 2018
11.1. Normative References
[I-D.ietf-idr-wide-bgp-communities]
Raszuk, R., Haas, J., Lange, A., Decraene, B., Amante, S.,
and P. Jakma, "BGP Community Container Attribute", draft-
ietf-idr-wide-bgp-communities-04 (work in progress), March
2017.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
Border Gateway Protocol 4 (BGP-4)", RFC 4271,
DOI 10.17487/RFC4271, January 2006,
<https://www.rfc-editor.org/info/rfc4271>.
[RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,
February 2006, <https://www.rfc-editor.org/info/rfc4360>.
[RFC7153] Rosen, E. and Y. Rekhter, "IANA Registries for BGP
Extended Communities", RFC 7153, DOI 10.17487/RFC7153,
March 2014, <https://www.rfc-editor.org/info/rfc7153>.
[RFC8092] Heitz, J., Ed., Snijders, J., Ed., Patel, K., Bagdonas,
I., and N. Hilliard, "BGP Large Communities Attribute",
RFC 8092, DOI 10.17487/RFC8092, February 2017,
<https://www.rfc-editor.org/info/rfc8092>.
11.2. Informative References
[constrained-multiple-path]
Boucadair, M. and C. Jacquenet, "Constrained Multiple BGP
Paths", October 2010, <https://www.ietf.org/archive/id/
draft-boucadair-idr-constrained-multiple-path-00.txt>.
[I-D.gredler-idr-bgplu-epe]
Gredler, H., Vairavakkalai, K., R, C., Rajagopalan, B.,
Aries, E., and L. Fang, "Egress Peer Engineering using
BGP-LU", draft-gredler-idr-bgplu-epe-11 (work in
progress), October 2017.
Li & Dong Expires September 4, 2018 [Page 10]
Internet-Draft congestion status community March 2018
[link-bandwidth-community]
Mohapatra, P. and R. Fernando, "BGP Link Bandwidth
Extended Community", January 2013,
<https://www.ietf.org/archive/id/
draft-ietf-idr-link-bandwidth-06.txt>.
[RFC1997] Chandra, R., Traina, P., and T. Li, "BGP Communities
Attribute", RFC 1997, DOI 10.17487/RFC1997, August 1996,
<https://www.rfc-editor.org/info/rfc1997>.
[RFC2439] Villamizar, C., Chandra, R., and R. Govindan, "BGP Route
Flap Damping", RFC 2439, DOI 10.17487/RFC2439, November
1998, <https://www.rfc-editor.org/info/rfc2439>.
[RFC3471] Berger, L., Ed., "Generalized Multi-Protocol Label
Switching (GMPLS) Signaling Functional Description",
RFC 3471, DOI 10.17487/RFC3471, January 2003,
<https://www.rfc-editor.org/info/rfc3471>.
[RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route
Reflection: An Alternative to Full Mesh Internal BGP
(IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006,
<https://www.rfc-editor.org/info/rfc4456>.
[RFC6793] Vohra, Q. and E. Chen, "BGP Support for Four-Octet
Autonomous System (AS) Number Space", RFC 6793,
DOI 10.17487/RFC6793, December 2012,
<https://www.rfc-editor.org/info/rfc6793>.
[RFC7911] Walton, D., Retana, A., Chen, E., and J. Scudder,
"Advertisement of Multiple Paths in BGP", RFC 7911,
DOI 10.17487/RFC7911, July 2016,
<https://www.rfc-editor.org/info/rfc7911>.
Appendix A. Bandwidth Values
Some typical bandwidth values encoded in 32-bit IEEE floating point
format are enumerated below.
Li & Dong Expires September 4, 2018 [Page 11]
Internet-Draft congestion status community March 2018
Link Type Bit-rate Bandwidth Value (Bytes/Sec)
(Mbps) (32-bit IEEE Floating point)
--------------- --------------- ---------------------------------
E1 2.048 0x487A0000
Ethernet 10.00 0x49989680
Fast Ethernet 100.00 0x4B3EBC20
OC-3/STM-1 155.52 0x4B9450C0
OC-12/STM-4 622.08 0x4C9450C0
GigE 1000.00 0x4CEE6B28
OC-48/STM-16 2488.32 0x4D9450C0
OC-192/STM-64 9953.28 0x4E9450C0
10GigE 10000.00 0x4E9502F9
OC-768/STM-256 39813.12 0x4F9450C0
100GigE 100000.00 0x503A43B7
Authors' Addresses
Zhenqiang Li
China Mobile
No.32 Xuanwumenxi Ave., Xicheng District
Beijing 100032
P.R. China
Email: li_zhenqiang@hotmail.com
Jie Dong
Huawei Technologies
Huawei Campus, No.156 Beiqing Rd.
Beijing 100095
P.R. China
Email: jie.dong@huawei.com
Li & Dong Expires September 4, 2018 [Page 12]