Internet DRAFT - draft-nalawade-softwire-nhop
Network Working Group Gargi Nalawade
Internet Draft Pradosh Mohapatra
June 2006 Francois Le Faucheur
Ruchi Kapoor
Pranav Mehta
David Ward
Simon Barber
Cisco Systems
J. Wu
Y. Cui
X. Li
Tsinghua University
BGP Softwire Nexthop Attribute
draft-nalawade-softwire-nhop-00.txt
1. Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
2. Copyright Notice
draft-nalawade-softwire-nhop-00.txt [Page 1]
Internet Draft draft-nalawade-softwire-nhop-00.txt June 2006
Copyright (C) The Internet Society (2006). All Rights Reserved.
3. Abstract
The current [MP-BGP] extensions carry routing information by
associating the same network layer protocol with both the NLRI and
the next hop. However, in certain scenarios, it is desirable or
required to advertise next hop information associated with a
different network layer protocol to the one associated with the NLRI.
Similarly, when data traffic for a network layer protocol's NLRIs
needs to be forwarded over an underlying tunnel, the BGP update for
a given NLRI is required to indicate the tunnel endpoint to be used
for forwarding.
This document specifies a new BGP attribute, called the SW_NEXT_HOP
attribute, which can optionally be used in a BGP Update message to
advertise next hop information associated with a different network
layer protocol than that of the NLRIs or to convey information about
the tunnel endpoint.
4. Introduction
[MP-BGP] defines extensions to BGP-4 to enable it to carry routing
information for multiple network layer protocols (e.g. IPv4-VPN,
IPv6, IPv6-VPN). This is achieved by encoding the next hop and the
NLRI as defined by the network layer protocol in an MP_REACH_NLRI
attribute and including the network layer protocol identifiers. Since
the same network layer protocol is associated with both the next hop
information and the NLRI, [MP-BGP] extensions do not allow
advertisement of next hop information from a different network layer
protocol to the one of the NLRI.
However, there are situations where the next hop information to be
advertised is indeed from a different network layer protocol to the
one of the NLRI.
In a number of such situations, the [MP-BGP] limitation has been
circumvented by mapping the actual nexthop to an encoded value so as
to match the network layer protocol format of the NLRIs. [MPLS-VPN]
is an example of this since it calls for advertisement of IPv4 next
hop information with IPv4-VPN NLRI. This is achieved by prepending a
Null Route Distinguisher to the IPv4 Next Hop address. [BGP-V6-TUNN]
is another example that requires advertisement of IPv4 next hop
information along with IPv6 NLRI. The next hop is encoded as an
IPv4-mapped IPv6 address. [IPv6-VPN] is yet another example that
requires advertisement of IPv4 or IPv6 next hop information along
with IPv6-VPN NLRI, which is achieved by prepending a Null Route
Distinguisher to the next hop address and, when the meaningful next
hop is IPv4, by encoding it as an IPv4-mapped IPv6 address.
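The three workaround encodings described above can be illustrated with a minimal Python sketch. The function names are hypothetical and chosen only for this illustration; the sketch assumes the Null Route Distinguisher is 8 octets of zero and that an IPv4-mapped IPv6 address has the form ::ffff:a.b.c.d.

```python
import ipaddress

NULL_RD = bytes(8)  # a Route Distinguisher is 8 octets; the Null RD is all zeros


def vpn_ipv4_next_hop(ipv4: str) -> bytes:
    """Null RD followed by the 4-octet IPv4 address (12 octets in total)."""
    return NULL_RD + ipaddress.IPv4Address(ipv4).packed


def ipv4_mapped_ipv6_next_hop(ipv4: str) -> bytes:
    """IPv4 address embedded as ::ffff:a.b.c.d (16 octets)."""
    return bytes(10) + b"\xff\xff" + ipaddress.IPv4Address(ipv4).packed


def vpn_ipv6_next_hop_from_ipv4(ipv4: str) -> bytes:
    """Null RD followed by the IPv4-mapped IPv6 address (24 octets in total)."""
    return NULL_RD + ipv4_mapped_ipv6_next_hop(ipv4)
```

In each case the real next hop has to fit inside the address format of the NLRI's own network layer protocol, which is why a 16-octet IPv6 next hop cannot be advertised this way with IPv4 NLRIs.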
These workarounds do not suffice if the actual next hop address
cannot be embedded in the next hop information field as defined by the
network layer protocol of the NLRIs. One such example is the need to
advertise IPv6 next hop information with IPv4 NLRIs to be able to
carry routing information for IPv4 islands over a native IPv6 core.
Also, in cases where the network layer protocols of the next hop and
NLRI are different, the transport protocol is different from the
payload. This calls for the payload to be tunneled through the ISP
core. The establishment of the tunnels as well as the selection of
the tunnel type(s) to be used from an ingress router to a given
egress router can be statically controlled by configuration.
Alternatively the tunneling capabilities and preferences as well as
the individual tunnel attributes [BGP-TUN] can be dynamically
established via various mechanisms such as the BGP IPv4/IPv6 Tunnel
SAFI [BGP-TUN-SAFI] or IGP based discovery of TE tunnels [IGP-TE].
In some cases, the same tunnel can be used for all NLRIs advertised
by the egress router. The tunnel can then be selected by the ingress
router based on its local configuration as well as the information
that may have been advertised by the egress router about tunneling
capabilities and preferences for example via [BGP-TUN-SAFI].
In other cases, different NLRIs may need to be carried over different
tunnels. For example, some NLRIs may require transport over IPsec
tunnels while the other NLRIs may be more efficiently transported
without IPsec protection over MPLS LSPs. In these cases there is a
requirement for the egress router to advertise which tunnel ought to
be used for a particular set of NLRIs. The ingress router needs an
indication in the BGP update for these NLRIs, as to which tunnel to
use to reach the egress router.
This document describes a new BGP attribute, called the SW_NEXT_HOP
attribute, that can optionally be carried in BGP Update messages to
signal the actual next hop independently of the network layer
protocol of the NLRIs and also to signal which tunnel to use for a
given set of NLRIs.
5. Softwire Nexthop Attribute
This document defines an optional transitive attribute. The attribute
carries the nexthop address of the remote peer and the Tunnel
information needed to reach that nexthop address.
The attribute is encoded as shown below:
+---------------------------------------------------------+
| Address Family Identifier (2 octets) |
+---------------------------------------------------------+
| Subsequent Address Family Identifier (1 octet) |
+---------------------------------------------------------+
| Length of Next Hop Network Address (1 octet) |
+---------------------------------------------------------+
| Network Address of Next Hop (variable) |
+---------------------------------------------------------+
| TLVs (variable length) |
+---------------------------------------------------------+
The use and meaning of these fields are as follows:
Address Family Identifier (AFI):
This field in combination with the Subsequent Address Family
Identifier field identifies the Network Layer protocol
associated with the Next Hop address. Presently defined values
for the Address Family Identifier field are specified in
RFC1700 (see the Address Family Numbers section).
Subsequent Address Family Identifier (SAFI):
This field in combination with the Address Family Identifier
field identifies the Network Layer protocol associated with the
Next Hop address.
Length of Next Hop Network Address:
A 1 octet field whose value expresses the length of the
"Network Address of Next Hop" field in octets.
Network Address of Next Hop:
A variable length field that contains the Network Address of
the Next Hop.
TLV:
The variable length TLV field of the Softwire Nexthop attribute
contains one or more tuples of the form:
+----------------+------------------+
| Type (1 octet) | Length (1 octet) |
+----------------+------------------+
| Value (as specified by Type) |
+-----------------------------------+
where,
Type: specifies the 'Type' of the data contained in the Value
field.
Length: specifies the length of the 'Value' field in octets.
Value: The contents and format of the Value field are defined by
the Type field.
The following Types are defined:
Type 1: indicates that the Value field in the TLV contains a
2-octet Tunnel Identifier which uniquely identifies a
Tunnel on the egress BGP router.
Type 2: indicates that the Value field in the TLV contains a
2-octet Multicast Tree Identifier which uniquely identifies
a Multicast Tree on the advertising BGP router.
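As an illustration of the encoding above, the following Python sketch builds and parses the value of the attribute. It is a sketch under stated assumptions, not a definitive implementation: the class and constant names are hypothetical, multi-octet fields are assumed to be in network byte order (as is usual in BGP), and the enclosing path attribute header (flags, type code, length) is omitted because the attribute type code is not yet assigned.

```python
import struct
from dataclasses import dataclass, field

TLV_TUNNEL_ID = 1       # Type 1: 2-octet Tunnel Identifier
TLV_MCAST_TREE_ID = 2   # Type 2: 2-octet Multicast Tree Identifier


@dataclass
class SwNextHop:
    """Value portion of the SW_NEXT_HOP attribute (hypothetical helper)."""
    afi: int
    safi: int
    next_hop: bytes
    tlvs: list = field(default_factory=list)  # list of (type, value-bytes)

    def encode(self) -> bytes:
        # AFI (2 octets), SAFI (1 octet), next hop length (1 octet)
        out = struct.pack("!HBB", self.afi, self.safi, len(self.next_hop))
        out += self.next_hop
        for tlv_type, value in self.tlvs:
            out += struct.pack("!BB", tlv_type, len(value)) + value
        return out

    @classmethod
    def decode(cls, data: bytes) -> "SwNextHop":
        if len(data) < 4:
            raise ValueError("truncated SW_NEXT_HOP header")
        afi, safi, nh_len = struct.unpack_from("!HBB", data, 0)
        pos = 4
        if pos + nh_len > len(data):
            raise ValueError("next hop length exceeds attribute length")
        next_hop = data[pos:pos + nh_len]
        pos += nh_len
        tlvs = []
        while pos < len(data):
            if pos + 2 > len(data):
                raise ValueError("truncated TLV header")
            tlv_type, tlv_len = struct.unpack_from("!BB", data, pos)
            pos += 2
            if pos + tlv_len > len(data):
                raise ValueError("TLV length exceeds attribute length")
            tlvs.append((tlv_type, data[pos:pos + tlv_len]))
            pos += tlv_len
        return cls(afi, safi, next_hop, tlvs)
```

The parser keeps TLVs of unknown Type as opaque (type, value) pairs rather than rejecting them, which is one possible design choice; this document does not specify how unknown Types are to be handled.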
6. Operation
A BGP Speaker may want to advertise itself as the router that should be
used as the next hop to the destinations advertised in the NLRI field,
or in the MP_NLRI field of the MP_REACH_NLRI attribute, using one of
its Network Layer addresses for a Network Layer protocol that is
different from the Network Layer protocol of the NLRI destinations.
Alternatively, a BGP Speaker may want to explicitly advertise which
tunnel to itself ought to be used for particular NLRI destinations.
In both of the above cases, a BGP Speaker supporting the SW_NEXT_HOP
attribute SHOULD include the SW_NEXT_HOP attribute to convey this
Network Layer address.
A BGP Speaker supporting the SW_NEXT_HOP attribute which receives a
BGP advertisement containing a SW_NEXT_HOP attribute and which does
not modify the next hop information, SHOULD propagate the received
SW_NEXT_HOP attribute unchanged.
A BGP Speaker supporting the SW_NEXT_HOP attribute which receives a
BGP advertisement containing a SW_NEXT_HOP attribute and which
modifies next hop information MAY include an SW_NEXT_HOP
attribute in the generated advertisement. When it does so, the
Network Layer address contained inside the SW_NEXT_HOP attribute
MUST be one of its own addresses. In other words, in the case where
the BGP speaker modifies next hop information, it MUST NOT simply
propagate the received SW_NEXT_HOP unchanged.
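The two propagation rules above can be summarized in a small Python sketch. The function name and the build_own callback are hypothetical; the callback stands for whatever local logic produces an SW_NEXT_HOP carrying one of the speaker's own addresses.

```python
from typing import Callable, Optional


def outbound_sw_next_hop(
    received: Optional[bytes],
    modifies_next_hop: bool,
    build_own: Optional[Callable[[], bytes]] = None,
) -> Optional[bytes]:
    """Pick the SW_NEXT_HOP value (if any) for a re-advertised route."""
    if not modifies_next_hop:
        # Next hop left untouched: propagate the received attribute unchanged.
        return received
    if build_own is not None:
        # Next hop rewritten: MAY attach a new attribute, which MUST carry
        # one of the speaker's own addresses.
        return build_own()
    # Next hop rewritten and nothing to advertise: the received attribute
    # MUST NOT be passed on as-is, so it is dropped.
    return None
```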
When a BGP speaker supporting the SW_NEXT_HOP attribute receives a
BGP advertisement with next hop information encoded both in the
MP_REACH_NLRI and in the SW_NEXT_HOP, the BGP speaker SHOULD use the
next hop information encoded in the SW_NEXT_HOP, unless configured
to do otherwise.
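A minimal Python sketch of this selection rule follows. The function and parameter names are hypothetical, and the handling of the cases where only one of the two next hops is present is an assumption of this sketch, since the text above only covers the case where both are present.

```python
from typing import Optional


def effective_next_hop(
    mp_reach_next_hop: Optional[bytes],
    sw_next_hop: Optional[bytes],
    prefer_sw_next_hop: bool = True,  # local configuration knob (assumed)
) -> Optional[bytes]:
    """Choose which next hop information to use for a received route."""
    if sw_next_hop is None:
        return mp_reach_next_hop
    if mp_reach_next_hop is None:
        return sw_next_hop
    # Both present: SW_NEXT_HOP wins unless configured otherwise.
    return sw_next_hop if prefer_sw_next_hop else mp_reach_next_hop
```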
When a BGP Speaker sets itself as the nexthop and is advertising
Optional Tunnel TLVs using the SW_NEXT_HOP attribute, it
means that the BGP Speaker is terminating the Tunnels and is advertising
itself as a Tunnel endpoint.
Consider the case where an ingress router receives a BGP update for
NLRIs that will receive data traffic (e.g., IPv4/IPv6 unicast or
multicast, VPNv4/VPNv6). If this update contains an SW_NEXT_HOP
attribute carrying a Type, a Tunnel-ID and a Tunnel endpoint address,
the ingress router will use this information in the following manner
(the Tunnel endpoint address is the address contained in the 'Network
Address of Next Hop' field in the SW_NEXT_HOP attribute):
The Tunnel/Tree-ID and the Tunnel endpoint address will be used to
look up the appropriate tunnel in the Tunnel database to establish data
forwarding through this Tunnel. Data traffic for the NLRIs carried in
this BGP update will then be forwarded through this Tunnel.
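The lookup described above can be sketched in Python as follows. The TunnelDatabase class and the bind_nlris function are hypothetical names for this illustration, and the tunnel objects are represented by plain strings; how the database is populated (BGP Tunnel SAFI, IGP discovery, static configuration) is outside the sketch.

```python
import ipaddress
from typing import Dict, List, Optional, Tuple


class TunnelDatabase:
    """Tunnels keyed by (Tunnel/Tree-ID, Tunnel endpoint address)."""

    def __init__(self) -> None:
        self._tunnels: Dict[Tuple[int, str], str] = {}

    @staticmethod
    def _key(tunnel_id: int, endpoint: str) -> Tuple[int, str]:
        # Normalize the textual address so equivalent spellings match.
        return (tunnel_id, str(ipaddress.ip_address(endpoint)))

    def add(self, tunnel_id: int, endpoint: str, tunnel: str) -> None:
        self._tunnels[self._key(tunnel_id, endpoint)] = tunnel

    def lookup(self, tunnel_id: int, endpoint: str) -> Optional[str]:
        return self._tunnels.get(self._key(tunnel_id, endpoint))


def bind_nlris(db: TunnelDatabase, nlris: List[str],
               tunnel_id: int, endpoint: str) -> Dict[str, str]:
    """Map each NLRI of an update to the tunnel named by its SW_NEXT_HOP."""
    tunnel = db.lookup(tunnel_id, endpoint)
    if tunnel is None:
        return {}  # no matching tunnel known yet; nothing can be bound
    return {prefix: tunnel for prefix in nlris}
```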
Note that the Tunnels themselves are established by the respective Tunnel
protocols (e.g., mGRE, IPsec, L2TP).
As an example, if the BGP Tunnel SAFI is the mechanism used to discover
the Tunnels, then the Tunnel-ID:Tunnel endpoint address will be the NLRI
carried by the BGP Tunnel SAFI [BGP-TUN-SAFI] updates. The Tunnel
encapsulations will be carried in the BGP Tunnel attribute [BGP-TUN]
accompanying the BGP Tunnel SAFI update.
On the other hand, if IGP-based discovery of TE tunnels [IGP-TE] is
the mechanism used to discover TE tunnels, then the Tunnel-ID and
Tunnel endpoint address will identify the TE tunnel discovered
through this mechanism.
The same applies to other out-of-band Tunnel discovery mechanisms,
including static configuration.
7. Capability advertisement
A new capability [BGP-CAP] code (TBD) is defined for the BGP SW_NEXT_HOP
attribute. The Capability Length is set to zero. The SW_NEXT_HOP attribute
can only be sent to peers that have advertised this capability.
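The following Python sketch illustrates the capability and the send-side check. The capability code is TBD in this document, so the value 239 used below is an arbitrary placeholder and not an assigned code; the function names are hypothetical. The sketch assumes the usual capability layout of a 1-octet code, a 1-octet length and a variable value [BGP-CAP].

```python
import struct

# Placeholder only: the real capability code is TBD and must come from IANA.
SW_NEXT_HOP_CAP_CODE = 239


def encode_sw_next_hop_capability() -> bytes:
    """Capability code followed by a Capability Length of zero."""
    return struct.pack("!BB", SW_NEXT_HOP_CAP_CODE, 0)


def parse_capabilities(data: bytes) -> dict:
    """Parse a sequence of (code, length, value) capability triples."""
    caps = {}
    pos = 0
    while pos < len(data):
        if pos + 2 > len(data):
            raise ValueError("truncated capability header")
        code, length = struct.unpack_from("!BB", data, pos)
        pos += 2
        if pos + length > len(data):
            raise ValueError("capability length exceeds data")
        caps[code] = data[pos:pos + length]
        pos += length
    return caps


def may_send_sw_next_hop(peer_capabilities: bytes) -> bool:
    """SW_NEXT_HOP may only be sent to peers that advertised the capability."""
    return SW_NEXT_HOP_CAP_CODE in parse_capabilities(peer_capabilities)
```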
8. Applicability Statement
8.1. VPNv4 unicast traffic over a Tunnel
If VPNv4 unicast traffic has to be tunneled through an ISP core instead of
being MPLS switched as per RFC 4364, then the ingress PE needs to know what
Tunnel to connect to. The egress PE may use the SW_NEXT_HOP attribute to
signal this information. The Tunnel encapsulation itself could be statically
configured or discovered through various mechanisms such as IGP based
discovery of TE tunnels [IGP-TE] or a BGP Tunnel SAFI [BGP-TUN-SAFI].
If an ingress PE receives a BGP update for the VPNv4 prefix with a
SW_NEXT_HOP attribute, it would be able to connect to the appropriate
Tunnel. Using the Tunnel-ID and Tunnel endpoint address, the SW_NEXT_HOP
attribute will indicate which Tunnel is to be used to reach the VPNv4
destination.
For an IPv4 core, the contents of the SW_NEXT_HOP attribute can be
expressed as follows:
Address Family Identifier: 1 (IP version 4)
Subsequent Address Family Identifier: 1 (Unicast)
Length of Next Hop Network Address: 4
Network Address of Next Hop: IPv4 address of the egress PE
TLVs:
Type: 1
Length: 2
Value: Tunnel Identifier value as created by the egress PE
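As a worked example of the field values listed above, the following Python sketch produces the octets of the attribute value for an IPv4 core. The function name is hypothetical, network byte order is assumed, and the address 192.0.2.1 and Tunnel Identifier 100 in the usage note are illustrative values only.

```python
import ipaddress
import struct


def sw_next_hop_ipv4_unicast(egress_pe: str, tunnel_id: int) -> bytes:
    """SW_NEXT_HOP value for an IPv4 egress PE with one Type 1 TLV."""
    next_hop = ipaddress.IPv4Address(egress_pe).packed        # 4 octets
    header = struct.pack("!HBB", 1, 1, len(next_hop))         # AFI 1, SAFI 1, length 4
    tunnel_tlv = struct.pack("!BBH", 1, 2, tunnel_id)         # Type 1, Length 2, Tunnel ID
    return header + next_hop + tunnel_tlv
```

For egress PE 192.0.2.1 and Tunnel Identifier 100, the result is the 12 octets 00 01 01 04 c0 00 02 01 01 02 00 64.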
8.2. MVPN traffic over a default MDT Tunnel
A Multicast tunnel is set up between the PEs in one or more VPN
providers' networks, and PIM neighbors are created over this tunnel.
The IP address of the PIM neighbor that is seen over the Multicast
tunnel depends on the configured address of the Tunnel endpoint. This
can be either an address borrowed from a different interface
(unnumbered) or an address configured on the Tunnel itself. The PE
router that does an RPF check on a VPN source can find which Tunnel the
source is on, but may not know which PIM neighbor to target on that
tunnel. Therefore we need a way to connect
the BGP VPNv4 prefix to the PIM neighbor on the tunnel to allow the RPF
check to succeed.
Suppose PIM wants to join a source that is behind another VPN site. We
do an RPF lookup on the source address in the VPNv4 unicast table on
this PE. The RPF lookup will return a connected next-hop and interface
to use to reach the source. The returned next-hop may not be the neighbor
on the Multicast tunnel. This can be due to the next-hop being rewritten
by BGP Route Reflectors (RR) or by crossing AS boundaries. Therefore we
do not know which PIM neighbor to target as the upstream neighbor in
the PIM join.
This can be resolved by using the SW_NEXT_HOP attribute to carry that
information. The SW_NEXT_HOP attribute, when carried with Type 2, will
indicate what the default MDT tunnel endpoint's IP address is.
8.3. Multicast VPN traffic over Label-switched or other Multicast Tunnels
If a BGP Multicast Overlay SAFI [BGP-MOS] is used for signalling Multicast
Join/Prune Binding information, the downstream PE needs to know what
Multicast tree built by MLDP or what Tunnel to bind to. The Tunnel
encapsulation information itself could be provided by MLDP when
Multipoint LSPs are used in the core. Alternatively, the Tunnel
encapsulation could be provided by TE, or through the BGP Tunnel SAFI
[BGP-TUN-SAFI]. Either way, the downstream PE needs to know which
Tunnel to connect to in order to receive a Multicast stream
corresponding to a given PIM Join.
This can be achieved by the Upstream PE sending the Tunnel/P-MP LSP
binding information through the SW_NEXT_HOP attribute.
8.4. IPv4 Forwarding over IPv6 Networks
With the rapid deployment of IPv6 networks, there is a requirement for
IPv6 backbones to provide packet transport service to existing IPv4
access networks. One part of the control plane mechanism involves
carrying IPv4 NLRIs with the IPv6 network layer address as the next hop.
This can be achieved by the egress PE sending either an MP_REACH_NLRI
or BGP-4 Update message with the IPv4 NLRIs that carries an SW_NEXT_HOP
attribute containing the IPv6 next hop. For an IPv6 core, the contents
of the SW_NEXT_HOP attribute can be expressed as follows:
Address Family Identifier: 2 (IP version 6)
Subsequent Address Family Identifier: 1 (Unicast)
Length of Next Hop Network Address: 16
Network Address of Next Hop: IPv6 address of the egress PE
TLVs:
Type: 1
Length: 2
Value: Tunnel Identifier value as created by the egress PE
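The point of this case is that the AFI of the next hop is chosen from the egress PE's address and not from the NLRI. A minimal Python sketch of that choice is given below; the names are hypothetical, network byte order is assumed, and 2001:db8::1 is a documentation address used only for illustration.

```python
import ipaddress
import struct

AFI_BY_IP_VERSION = {4: 1, 6: 2}  # AFI 1 = IP version 4, AFI 2 = IP version 6
SAFI_UNICAST = 1
TLV_TUNNEL_ID = 1


def sw_next_hop_value(egress_pe: str, tunnel_id: int) -> bytes:
    """Build the attribute value; AFI and length follow the PE's address."""
    addr = ipaddress.ip_address(egress_pe)
    next_hop = addr.packed  # 4 octets for IPv4, 16 octets for IPv6
    header = struct.pack(
        "!HBB", AFI_BY_IP_VERSION[addr.version], SAFI_UNICAST, len(next_hop)
    )
    tunnel_tlv = struct.pack("!BBH", TLV_TUNNEL_ID, 2, tunnel_id)
    return header + next_hop + tunnel_tlv
```

With an IPv6 egress PE the value is 24 octets long: a 4-octet header carrying AFI 2, SAFI 1 and length 16, the 16-octet address, and a 4-octet Type 1 TLV. The IPv4 NLRIs themselves are carried elsewhere in the Update message and do not influence these fields.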
9. Route Reflector Considerations
If Route Reflectors (RR) reflect routes from BGP speakers supporting
the SW_NEXT_HOP attribute, they MUST support this new capability to be
able to validate the nexthop.
If the Route Reflectors are not in the forwarding path, they do not
need to perform nexthop resolution, so validating just the network
address portion of the SW_NEXT_HOP attribute is sufficient. Route
Reflectors not in the forwarding path may therefore choose not to
validate the TLV fields carried inside the SW_NEXT_HOP attribute, which
provide additional information to resolve the nexthop.
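A Route Reflector that is not in the forwarding path could limit itself to a check like the one in the Python sketch below. What "validating" means is not defined in this document, so the rule used here (the next hop length must match the AFI and fit inside the attribute) is an assumption of the sketch, as is the function name. The sketch only covers plain IPv4 and IPv6 next hop addresses and leaves any TLVs unparsed.

```python
import struct

EXPECTED_NEXT_HOP_LEN = {1: 4, 2: 16}  # AFI -> address length in octets


def address_portion_is_valid(value: bytes) -> bool:
    """Check only the AFI/SAFI/length/address part of an SW_NEXT_HOP value."""
    if len(value) < 4:
        return False
    afi, _safi, nh_len = struct.unpack_from("!HBB", value, 0)
    if 4 + nh_len > len(value):
        return False  # next hop would run past the end of the attribute
    expected = EXPECTED_NEXT_HOP_LEN.get(afi)
    # Octets after the address (the TLVs) are deliberately not examined.
    return expected is not None and nh_len == expected
```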
When data traffic for a network layer protocol's NLRIs needs to be
forwarded over an underlying tunnel, there are two possible ways to
carry nexthop information inside the SW_NEXT_HOP attribute. The nexthop
can be carried as an IPv4/IPv6 network address with additional tunnel
end-point information carried inside a TLV field. Alternatively, the
nexthop can be carried directly in the form of [BGP-TUN-SAFI], where
the tunnel end-point information and the nexthop address are embedded
together in the network address portion of the nexthop. In case of
Route Reflector partitioning, it is possible that tunnel end-point
information [BGP-TUN-SAFI] is exchanged via a different Route Reflector
from the one carrying the NLRIs forwarded over those tunnels. To
facilitate nexthop validation on such Route Reflectors, it is
recommended to carry nexthop information in the TLV format, that is,
as an IPv4/IPv6 network address with the tunnel end-point information
in a TLV field.
10. IANA Considerations
A BGP attribute code [BGP-4] and a Capability code [BGP-CAP] will need
to be obtained from IANA.
11. Security Considerations
This extension to BGP does not change the underlying security issues.
12. Acknowledgements
This specification combines and extends prior work on "BGP-4 NEXTHOP-v2
Attribute" by Francois Le Faucheur, Dan Tappan, and Gargi Nalawade, with
prior work on "BGP Connector Attribute" by Gargi Nalawade, Ruchi Kapoor,
and David Ward. The current authors wish to thank all these authors for
their contribution.
The authors would like to thank Dan Tappan, Chris Metz, Eric Rosen,
Christian Cassar and Scott Wainner for their feedback, review and comments.
13. Normative References
[BGP-4] Rekhter, Y. and T. Li (editors), "A Border Gateway Protocol
4 (BGP-4)", Internet Draft draft-ietf-idr-bgp4-26.txt, April 2005.
[BGP-CAP] Chandra, R., Scudder, J., "Capabilities Advertisement with
BGP-4", draft-ietf-idr-rfc2842bis-02.txt, April 2002.
[BGP-V6-TUNN] Ooms et al., "Connecting IPv6 Islands across IPv4 Clouds
with BGP", draft-ooms-v6ops-bgp-tunnel-00.txt, Work in Progress.
[BGP-TUN] Kapoor R., Nalawade G., "BGPv4 Tunnel Encapsulation Attribute",
June 2008, <draft-kapoor-nalawade-bgp-ssa-04.txt>, Work in Progress.
[BGP-TUN-SAFI] Nalawade G., Kapoor R., Tappan T., Wainner S. "BGPv4
Tunnel SAFI", June 2006,
<draft-nalawade-kapoor-bgp-tunnel-safi-04.txt>, Work in Progress.
[SW-MESH-FMWK] Metz, C. et al, "A Framework for Softwire Mesh
Signaling, Routing and Encapsulation across IPv4 and IPv6 Backbone
Networks", draft-wu-softwire-mesh-framework-00, June 2006.
[BGP-MOS] Nalawade G., Bhaskar N., Mehta P. "Multicast PE-PE Signaling
using BGP", June 2006,
draft-nalawade-bgp-mcast-signaling-001.txt, Work in Progress.
[MP-BGP] Bates et al., "Multiprotocol Extensions for BGP-4",
draft-ietf-idr-rfc2858bis-02.txt, Work in Progress.
[IGP-TE] Vasseur J., Psenak P., Yasukawa S., "OSPF MPLS Traffic
Engineering Capabilities", Feb 2004, Work in Progress.
[MPLS-VPN] Rosen E., Rekhter Y., "BGP/MPLS IP Virtual Private Networks
(VPNs)", RFC 4364.
14. Authors' Addresses
Gargi Nalawade
170 Tasman Drive
San Jose, CA, 95134
E-mail: gargi@cisco.com
Pradosh Mohapatra
170 Tasman Drive
San Jose, CA, 95134
E-mail: pmohapat@cisco.com
Francois Le Faucheur
Cisco Systems, Inc.
Village d'Entreprise Green Side - Batiment T3
400, Avenue de Roumanille
06410 Biot-Sophia Antipolis
France
E-mail: flefauch@cisco.com
Ruchi Kapoor
170 Tasman Drive
San Jose, CA, 95134
E-mail: ruchi@cisco.com
Pranav Mehta
170 Tasman Drive
San Jose, CA, 95134
E-mail: ruchi@cisco.com
David Ward
408 St Peter Street, Hamm Bldg
St Paul, MN, 55102
E-mail: wardd@cisco.com
Simon Barber
Cisco Systems, Inc
Email: sbarber@cisco.com
Jianping Wu
Tsinghua University
Department of Computer Science, Tsinghua University
Beijing 100084
P.R.China
Phone: +86-10-6278-5983
Email: jianping@cernet.edu.cn
Yong Cui
Tsinghua University
Department of Computer Science, Tsinghua University
Beijing 100084
P.R.China
Phone: +86-10-6278-5822
Email: cuiyong@tsinghua.edu.cn
Xing Li
Tsinghua University
Department of Electronic Engineering, Tsinghua University
Beijing 100084
P.R.China
Phone: +86-10-6278-5983
Email: xing@cernet.edu.cn
15. Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed
to pertain to the implementation or use of the technology
described in this document or the extent to which any license
under such rights might or might not be available; nor does it
represent that it has made any independent effort to identify any
such rights. Information on the procedures with respect to rights
in RFC documents can be found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use
of such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository
at http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention
any copyrights, patents or patent applications, or other
proprietary rights that may cover technology that may be required
to implement this standard. Please address the information to the
IETF at ietf-ipr@ietf.org.
16. Full Copyright Statement
"Copyright (C) The Internet Society (2006).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights."
Additional copyright notices are not permitted in IETF Documents
except in the case where such document is the product of a joint
development effort between the IETF and another standards development
organization or the document is a republication of the work of
another standards organization. Such exceptions must be approved on
an individual basis by the IAB.
"This document and the information contained herein are provided
on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR
ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE."
17. Expiration Date
This memo is filed as <draft-nalawade-softwire-nhop-00.txt>, and
expires December, 2006.