Network Working Group                                         P. Francis
Internet-Draft                                                   MPI-SWS
Intended status: Informational                                     X. Xu
Expires: September 03, 2011                                       Huawei
                                                              H. Ballani
                                                              Cornell U.
                                                               R. Raszuk
                                                                   Cisco
                                                                L. Zhang
                                                                    UCLA
                                                          March 02, 2011
Simple Virtual Aggregation (S-VA)
draft-ietf-grow-simple-va-02.txt
The continued growth in the Default Free Routing Table (DFRT) stresses the global routing system in a number of ways. One of the most costly stresses is FIB size: ISPs often must upgrade router hardware simply because the FIB has run out of space, and router vendors must design routers that have adequate FIB capacity. FIB suppression is an approach to relieving stress on the FIB by NOT loading selected RIB entries into the FIB. Simple Virtual Aggregation (S-VA) is a simple form of Virtual Aggregation (VA) that allows any and all edge routers to shrink their FIB requirements substantially and therefore increase their useful lifetime. S-VA does not change FIB requirements for core routers. S-VA is extremely easy to configure, considerably more so than the various tricks done today to extend the life of edge routers. S-VA can be deployed autonomously by an ISP (cooperation between ISPs is not required), and can co-exist with legacy routers in the ISP. There are no changes from the 01 version to this version.
This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 03, 2011.
Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
ISPs today manage constant DFRT growth in a number of ways. One way, of course, is for ISPs to upgrade their router hardware before DFRT growth outstrips the size of the FIB. This is too expensive for many ISPs. They would prefer to extend the lifetime of routers whose FIBs can no longer hold the full DFRT.
A common approach taken by lower-tier ISPs is to default route to their providers. Routes to customers and peer ISPs are maintained, but everything else defaults to the provider. This approach has several disadvantages. First, packets to Internet destinations may take longer-than-necessary AS paths. This problem can be mitigated through careful configuration of partial defaults, but this can require substantial configuration overhead. A second problem with defaulting to providers is that the ISP is no longer able to provide the full DFRT to its customers. Finally, provider defaults prevent the ISP from detecting martian packets. As a result, the ISP transmits packets that could otherwise have been dropped over its expensive provider links. Simple Virtual Aggregation (S-VA) solves these problems because the full DFRT is used by core routers.
An alternative is for the ISP to maintain full routes in its core routers, but to filter routes from edge routers that do not require a full DFRT. These edge routers can then default route to the core routers. This is often possible with edge routers that interface to customer networks. The problem with this approach is that it cannot be used for all edge routers. For instance, it cannot be used for routers that connect to transits. It should also not be used for routers that connect to customers which wish to receive the full DFRT.
This draft describes a very simple technique, called Simple Virtual Aggregation (S-VA), that allows any and all edge routers to have substantially reduced FIB requirements even while still advertising and receiving the full DFRT over BGP. The basic idea is as follows. Core routers in the ISP maintain the full DFRT in the FIB and RIB. Edge routers maintain the full DFRT in the RIB, but suppress certain routes from the FIB. Edge routers install a default route to core routers. Label Switched Paths (LSPs) are used to transmit packets from a core router, through the edge router, to the Next Hop remote Autonomous System Border Router (ASBR). ASBRs strip the tunnel header (MPLS or IP) before forwarding tunneled packets to the remote ASBR (in much the same way MPLS Penultimate Hop Popping (PHP) strips the LSP header before forwarding packets to the tunnel target).
S-VA requires no changes to BGP and no changes to MPLS forwarding mechanisms in routers. Configuration is extremely simple: S-VA must be enabled, and routers must be told whether they are FIB-suppressing routers or not. Everything else is automatic. ISPs can deploy FIB suppression autonomously and with no coordination with neighbor ASes.
The scope of this document is limited to intra-domain S-VA operation, that is, the case where a single ISP autonomously operates S-VA internally without any coordination with neighboring ISPs.
Note that this document assumes that the S-VA "domain" (i.e. the unit of autonomy) is the AS (that is, different ASes run S-VA independently and without coordination). For the remainder of this document, the terms ISP, AS, and domain are used interchangeably.
This document applies equally to IPv4 and IPv6.
S-VA may operate with a mix of upgraded routers and legacy routers. There are no topological restrictions placed on the mix of routers. In order to avoid loops between upgraded and legacy routers, however, legacy routers must be able to terminate tunnels.
Note that S-VA is a greatly simplified variant of "full VA" [I-D.ietf-grow-va]. With full VA, all routers (core or otherwise) can have reduced FIBs. However, full VA requires substantial new configuration and operational complexity compared to S-VA. Note that S-VA was formerly specified in [I-D.ietf-grow-va]. It has been moved to this separate draft to simplify its understanding.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
There are three types of routers in S-VA: FIB-Installing routers (FIR), FIB-Suppressing routers (FSR), and optionally legacy routers. While any router can be an FIR or an FSR (there are no topology constraints), the simplest form of deployment is for border routers to be configured as edge routers (FSRs), and for non-border routers (for instance the routers used as route reflectors) to be configured as core routers (FIRs). S-VA, however, does not mandate this deployment per se.
FIRs must originate a BGP route to NLRI 0/0 [RFC4271]. The ORIGIN is set to INCOMPLETE (value 2), the AS number of the FIR's AS is used in the AS_PATH, and the BGP NEXT_HOP is set to the router's own address. The ATOMIC_AGGREGATE and AGGREGATOR attributes are not included. The FIR MUST attach a NO_EXPORT Communities Attribute [RFC1997] to the route.
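As an illustration only, the attribute values listed above can be collected into a small Python structure. This is a minimal sketch and not part of the specification: the BgpRoute container and the make_fir_default_route helper are hypothetical names, only the IPv4 form of 0/0 is shown, and the AS number and address are documentation values.

```python
from dataclasses import dataclass
from typing import Optional

ORIGIN_INCOMPLETE = 2    # ORIGIN attribute code for INCOMPLETE [RFC4271]
NO_EXPORT = 0xFFFFFF01   # well-known NO_EXPORT community value [RFC1997]


@dataclass(frozen=True)
class BgpRoute:
    """Hypothetical container for the attributes named in the text."""
    prefix: str
    origin: int
    as_path: tuple
    next_hop: str
    communities: frozenset
    atomic_aggregate: bool = False      # ATOMIC_AGGREGATE is not included
    aggregator: Optional[tuple] = None  # AGGREGATOR is not included


def make_fir_default_route(local_as: int, router_address: str) -> BgpRoute:
    """Build the 0/0 route that a FIR originates (IPv4 form shown)."""
    return BgpRoute(
        prefix="0.0.0.0/0",
        origin=ORIGIN_INCOMPLETE,
        as_path=(local_as,),                 # the FIR's own AS number
        next_hop=router_address,             # the router's own address
        communities=frozenset({NO_EXPORT}),  # keeps the default inside the AS
    )


# Example values are assumptions chosen from documentation ranges.
route = make_fir_default_route(local_as=64500, router_address="192.0.2.1")
print(route)
```

The NO_EXPORT community is what keeps this default route from being advertised to neighboring ASes.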
FIRs must not FIB-suppress any routes.
FSRs must FIB-install a route to 0/0. When transmitting a packet to a FIR (i.e. based on a 0/0 FIB lookup), the packet must be tunneled. This is to prevent loops that would otherwise occur when a packet transits multiple FSRs on the way to the core, some of which have FIB-installed the route for the destination, and others of which have not. FSRs may FIB-install any other routes. They should install any routes for which their eBGP neighbor is the NEXT_HOP. There are a couple of reasons for this, which can be illustrated in the figure below. This figure shows an autonomous system with a FIR FIR1 and an FSR FSR1. FSR1 is an ASBR and is connected to two remote ASBRs, EP1 and EP2.
+------------------------------------------+
|            Autonomous System             |   +----+
|                                          |   |EP1 |
|                                      /---+---|    |
|  To    ----\ +----+          +----+ /    |   +----+
| Other       \|FIR1|----------|FSR1|/     |
|Routers      /|    |          |    |\     |
|        ----/ +----+          +----+ \    |   +----+
|                                      \---+---|EP2 |
|                                          |   |    |
|                                          |   +----+
+------------------------------------------+
Suppose that FSR1 does not FIB-install routes for which EP1 and EP2 are next hops. In this case, when EP2 sends a packet to FSR1 for which the next hop is EP1, FSR1 will first tunnel the packet to FIR1, which will tunnel it right back to FSR1. This trombone routing is avoided if local ASBRs FIB-install routes where their neighbor remote ASBRs are the BGP NEXT_HOP.
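The trombone effect can be reproduced with a toy forwarding model. The following Python sketch is illustrative only: the lookup and trace helpers, the ATTACHED_TO table, and the example prefixes are assumptions made for this example, and tunnels are modeled simply as a direct jump to the tunnel target.

```python
import ipaddress

# Remote ASBR -> local ASBR it attaches to (topology of the figure).
ATTACHED_TO = {"EP1": "FSR1", "EP2": "FSR1"}


def lookup(fib, dst):
    """Longest-prefix match. fib maps prefix -> BGP NEXT_HOP (or a FIR for 0/0)."""
    addr = ipaddress.ip_address(dst)
    matching = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in fib.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    return max(matching, key=lambda item: item[0].prefixlen)[1]


def trace(fibs, ingress, dst, max_lookups=4):
    """Routers visited from the ingress router until the packet leaves the AS."""
    path, router = [ingress], ingress
    for _ in range(max_lookups):
        next_hop = lookup(fibs[router], dst)
        if next_hop in ATTACHED_TO:
            # Exit tunnel: the local ASBR detunnels and forwards without a lookup.
            local_asbr = ATTACHED_TO[next_hop]
            if local_asbr != router:
                path.append(local_asbr)
            path.append(next_hop)
            return path
        # Default tunnel: the packet is carried to the FIR, which looks it up again.
        path.append(next_hop)
        router = next_hop
    raise RuntimeError("no exit found; the FIR is expected to hold the full FIB")


full_fib = {"198.51.100.0/24": "EP1", "203.0.113.0/24": "EP2"}
suppressed = {"FIR1": full_fib, "FSR1": {"0.0.0.0/0": "FIR1"}}
installed = {"FIR1": full_fib, "FSR1": {"0.0.0.0/0": "FIR1", **full_fib}}

print(trace(suppressed, "FSR1", "198.51.100.7"))  # ['FSR1', 'FIR1', 'FSR1', 'EP1']
print(trace(installed, "FSR1", "198.51.100.7"))   # ['FSR1', 'EP1']
```

In the first trace the packet received from EP2 visits FSR1 twice. In the second, FSR1 resolves the destination locally and forwards straight to EP1.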
In addition, FSR1 cannot filter source addresses using strict unicast Reverse Path Forwarding (uRPF) unless it FIB-installs the routes learned from the remote ASBR. Note, however, that FSRs cannot do loose uRPF. Rather, this must be done by FIRs.
The above observations lead to the following rules: FSRs that are ASBRs should FIB-install all routes for which the neighbor is the BGP NEXT_HOP. FSRs that are ASBRs must FIB-install any routes that are used for uRPF.
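One possible way to express these rules as a FIB selection function is sketched below in Python. The RibEntry type, the fsr_fib_prefixes function, and the way uRPF requirements are passed in (as a set of prefixes) are assumptions made for illustration. The draft does not prescribe any particular data structure.

```python
from typing import NamedTuple


class RibEntry(NamedTuple):
    prefix: str
    next_hop: str  # BGP NEXT_HOP of the best path


def fsr_fib_prefixes(rib, ebgp_neighbors, urpf_prefixes=frozenset(),
                     default_prefix="0.0.0.0/0"):
    """Return the set of prefixes an ASBR acting as an FSR installs in its FIB."""
    installed = {default_prefix}                 # must: route to 0/0
    for entry in rib:
        if entry.next_hop in ebgp_neighbors:     # should: neighbor is NEXT_HOP
            installed.add(entry.prefix)
        elif entry.prefix in urpf_prefixes:      # must: needed for uRPF
            installed.add(entry.prefix)
    return installed                             # everything else is suppressed


rib = [
    RibEntry("198.51.100.0/24", "EP1"),
    RibEntry("203.0.113.0/24", "EP2"),
    RibEntry("192.0.2.0/24", "10.0.0.9"),   # hypothetical route that exits elsewhere
]
fib = fsr_fib_prefixes(rib, ebgp_neighbors={"EP1", "EP2"})
print(sorted(fib))
```

The RIB still holds all three routes, so the full table can be advertised over BGP. Only the FIB is reduced.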
S-VA works with both MPLS and IP-in-IP tunnels. There are potentially up to two tunnels required for a packet to traverse an AS with S-VA. The first tunnel is that from an FSR to a FIR (for the 0/0 default). This is called the default tunnel. The second tunnel targets the remote ASBR which is the BGP NEXT_HOP, although the tunnel header is stripped by the local ASBR before transmitting to the remote ASBR. This is the exit tunnel. The start of the exit tunnel is an ingress local ASBR in the case where the ingress local ASBR has FIB-installed the associated route. Otherwise, the start of the exit tunnel is a FIR.
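The choice between one and two tunnels can be summarized in a few lines of Python. This is a sketch under simplifying assumptions: Tunnel and tunnels_for_packet are hypothetical names, and the case where the ingress router is itself the local ASBR attached to the remote ASBR (so that no tunnel is needed inside the AS) is not modeled.

```python
from typing import NamedTuple


class Tunnel(NamedTuple):
    kind: str    # "default" or "exit"
    start: str   # router that encapsulates the packet
    target: str  # address the tunnel is aimed at


def tunnels_for_packet(ingress, fir, remote_asbr, ingress_has_route):
    """List the tunnels a packet uses while crossing the AS."""
    tunnels = []
    if ingress_has_route:
        exit_start = ingress          # route is in the ingress FIB
    else:
        # Only 0/0 matched at the ingress FSR: default tunnel to the FIR.
        tunnels.append(Tunnel("default", ingress, fir))
        exit_start = fir
    # The exit tunnel targets the remote ASBR that is the BGP NEXT_HOP;
    # the local ASBR strips the header before the packet leaves the AS.
    tunnels.append(Tunnel("exit", exit_start, remote_asbr))
    return tunnels


print(tunnels_for_packet("FSR2", "FIR1", "EP1", ingress_has_route=False))
print(tunnels_for_packet("FSR2", "FIR1", "EP1", ingress_has_route=True))
```

FSR2 is a hypothetical second edge router used only to make the two cases visible.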
The target address of the default tunnel is always the FIR. If MPLS is used, the FIRs must initiate LSPs to themselves using the Label Distribution Protocol (LDP) [RFC5036]. RSVP-TE [RFC3209] may also be used.
If IP-in-IP tunnels are used, then the BGP Encapsulation Extended Community (BGPencap-Attribute) ([RFC5512]) is used to convey the ability to accept tunnels at the target address (the BGP NEXT_HOP).
For the exit tunnels, again either MPLS or IP-in-IP can be used. In the case of IP-in-IP, the inner label defined in [RFC4023] and signaled in BGP with [RFC3107] is used by the local ASBR to identify the remote ASBR which is the BGP NEXT_HOP for the packet. Specifically, when a local ASBR, which can be either an FSR or a FIR, advertises an eBGP-received route into iBGP, it sets the BGP NEXT_HOP as itself. It assigns a label to the route. This label is used as the inner label in packets tunneled to the local ASBR, and is used to identify the remote ASBR from which the route was received. When receiving a packet with this label, the local ASBR strips off the label, and forwards the native packet to the remote ASBR indicated by the label.
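The label handling at the local ASBR can be sketched as a pair of lookup tables. The Python below is illustrative only: the LocalAsbr class and its method names are hypothetical, and allocating one label per remote ASBR (rather than one per route) is an assumption that is consistent with the label identifying the remote ASBR.

```python
import itertools

FIRST_UNRESERVED_LABEL = 16  # label values 0-15 are reserved [RFC3032]


class LocalAsbr:
    """Hypothetical model of inner-label assignment at a local ASBR."""

    def __init__(self, address):
        self.address = address
        self._labels = itertools.count(FIRST_UNRESERVED_LABEL)
        self._label_of = {}      # remote ASBR -> inner label
        self._neighbor_of = {}   # inner label -> remote ASBR

    def advertise_into_ibgp(self, prefix, remote_asbr):
        """Return the iBGP advertisement for a route received from remote_asbr."""
        if remote_asbr not in self._label_of:
            label = next(self._labels)
            self._label_of[remote_asbr] = label
            self._neighbor_of[label] = remote_asbr
        return {
            "prefix": prefix,
            "next_hop": self.address,             # NEXT_HOP set to itself
            "label": self._label_of[remote_asbr],
        }

    def receive_tunneled(self, inner_label, native_packet):
        """Strip the inner label and pick the remote ASBR it identifies."""
        return self._neighbor_of[inner_label], native_packet


asbr = LocalAsbr("192.0.2.1")
adv = asbr.advertise_into_ibgp("198.51.100.0/24", "EP1")
print(adv)
print(asbr.receive_tunneled(adv["label"], b"native packet"))
```

Because the label alone selects the outgoing neighbor, the local ASBR needs no FIB entry for the destination in order to forward the detunneled packet.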
In the case of MPLS, the inner label may or may not be used. If it is used, then an LSP is established to the IP address of the local ASBR as described above for FIRs. The BGP NEXT_HOP is set to be itself (the same address that serves as the FEC in the LSP). The inner label is established as described in the previous paragraph for IP-in-IP tunnels, but with the encapsulation defined in [RFC3032].
If the inner label is not used, then the local ASBR must initiate a Downstream Unsolicited LSP for each remote ASBR. The FEC for the LSP is the remote ASBR address that is used in the BGP NEXT_HOP field. When a packet is received on one of these LSPs, the local ASBR strips the MPLS header, and forwards the packet to the remote ASBR indicated by the label.
S-VA may be operated with a mix of legacy and S-VA-upgraded routers. The legacy routers, however, must be able to forward tunneled packets. In the case of MPLS tunnels, this means that they must fully participate in MPLS signaling. If a legacy router is an ASBR, then it must also initiate tunnels to itself and be able to detunnel packets (without the inner label).
There are no IANA considerations.
The authors are not aware of any new security considerations due to S-VA.
The concept for S-VA comes from Robert Raszuk.