Internet Engineering Task Force | D.M. McPherson |
Internet-Draft | Verisign, Inc. |
Intended status: Informational | D.O. Oran |
Expires: January 16, 2014 | Cisco Systems |
D.T. Thaler | |
Microsoft Corporation | |
E.O. Osterweil | |
Verisign, Inc. | |
July 15, 2013 |
Architectural Considerations of IP Anycast
draft-iab-anycast-arch-implications-10
This memo discusses architectural implications of IP anycast, and provides some historical analysis of anycast use by various IETF protocols.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 16, 2014.
Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
IP anycast is a technique with a long legacy, with interesting engineering challenges associated with it. However, at its core it is a relatively simple concept. As described in BCP 126 [RFC4786], the general form of IP anycast is the announcement of a stable set of IP addresses from multiple topological locations in the network.
IP anycast is used for at least one critical Internet service, that of the Domain Name System [RFC1035] root servers. As of late 2007, at least 10 of the 13 root name servers were using IP anycast [RSSAC29]. Use of IP anycast is growing for other applications as well. It has been deployed for over a decade for DNS resolution services and is currently used by several DNS Top Level Domain (TLD) operators. IP anycast is also used for other services in operational environments, including Network Time Protocol (NTP) [RFC5905] services.
Anycast addresses are syntactically indistinguishable from unicast addresses. Allocation of anycast addresses typically follows a model similar to that of unicast allocation policies. Anycast addressing is largely equivalent to that of unicast in multiple locations, and leverages unicast's destination-based routing to deliver a packet to either zero or one interface among the set of interfaces asserting the reachability for the address. The expectation of delivery is to the "closest" instance as determined by unicast routing topology metric(s), and there is also a possibility that various load-balancing techniques (e.g., per-packet, per-microflow) may be used among multiple equal cost routes to distribute load for an anycasted prefix.
Unlike IP unicast, it is not considered an error to assert the same anycast address on multiple interfaces within the same or multiple systems.
Some consider anycast a "deceptively simple idea". That is, when IP anycast is employed, many pitfalls and subtleties exist with applications and transports, as well as for routing configuration and operation. In this document, we aim to capture many of the architectural implications of IP anycast.
BCP 126 [RFC4786] discusses several different deployment models with IP anycast. Two additional distinctions beyond that document involve "off-link anycast" and "on-link anycast". "Off-link anycast" takes advantage of routing protocol preferences and IP's hop-by-hop destination-based forwarding paradigm in order to direct packets to the "closest" destination. This is the traditional method of anycast largely considered in BCP 126 [RFC4786] and can be used for IPv4 and IPv6. "On-link anycast" is the formal support of anycast in the address resolution (duplicate address detection) protocol and is only standardized for IPv6, with the introduction of designated anycast addresses on the anycasted hosts, and the Override flag in Neighbor Discovery (ND) Neighbor Advertisements (NAs) [RFC4861]. There is no standardized mechanism for this in IPv4.
As of this writing, the term "anycast" appears in 176 RFCs and 144 active Internet-Drafts. The following sections capture some of the key appearances and discussion of anycasting within the IETF over the years.
The first formal specification of anycast was provided in "Host Anycasting Service" [RFC1546]. The authors of this document did a good job of capturing most of the issues that exist with IP anycast today.
One of the first documented uses of anycast was in 1994 for a "Video Registry" experiment [IMR9401]. In the experiment a UDP query was transmitted to an anycasted address to locate the topologically closest "supposedly equivalent network resource":
There is also discussion that ISPs began using anycast for DNS resolution services around the same time, although no public references to support this are available.
In 1997 the IAB clarified that IPv4 anycast addresses were pure "locators", and could never serve as an "identifier" (of a host, an interface, or anything else) [RFC2101].
In 1998 the IAB conducted a routing workshop [RFC2902]. Of the conclusions and output action items from the report, an Anycast section is contained in Section 2.10.3. Specifically called out is the need to describe the advantages and disadvantages of anycast, and the belief that local-scoped well-known anycast addresses will be useful to some applications. In the subsequent section, an action item was outlined that suggested a BOF should be held to plan work on anycast, and if a working group forms, a paper on the advantages and the disadvantages of anycast should be included as part of the charter.
As a result of the recommendation in [RFC2902], in November of 1999 an Anycast BOF [ANYCASTBOF] was held at IETF 46. A number of uses for anycast were discussed. No firm conclusion was reached regarding use of TCP with anycasted services, but it was observed that anycasting was useful for DNS, although it did introduce some new complexities. The use of global anycast was not expected to scale (see Section 4.1 below for more discussion), and hence was expected to be limited to a small number of key uses.
In 2001, the Multicast and Anycast Group Membership [MAGMA] WG was chartered to address host-to-router signaling, including initial authentication and access control issues for multicast and anycast group membership, but other aspects of anycast, including architecture and routing, were outside the group's scope.
SNTPv4 [RFC2030] defined how to use SNTP anycast for server discovery. This was extended in [RFC4330] as an NTP-specific "manycast" service, in which anycast was used for the discovery part.
IPv6 defined some reserved subnet anycast addresses [RFC2526] and assigned one to "Mobile IPv6 Home-Agents" [RFC3775] (obsoleted by [RFC6275]).
The original IPv6 transition mechanism [RFC2893] made use of IPv4 anycast addresses as tunnel endpoints for IPv6 encapsulated in IPv4, but this was later removed [RFC4213]. The 6to4 tunneling protocol [RFC3056] was augmented by a 6to4 relay anycast prefix [RFC3068] aiming to simplify the configuration of 6to4 routers. Incidentally, 6to4 deployment has shown a fair number of operational and security issues [RFC3964] that result from using anycast as a discovery mechanism. Specifically, one inference is that operational consideration is needed to ensure that anycast addresses get advertised and/or filtered in a way that produces the intended scope (e.g., only advertise a route for your 6to4 relay to ASes that conform to your own acceptable usage policy), an attribute that can easily become quite operationally expensive.
In 2002, DNS' use of anycast was first specified in "Distributing Authoritative Name Servers via Shared Unicast Addresses" [RFC3258]. It is notable that it used the term "shared unicast address" rather than "anycast address" for the service. This distinction was made due to the IPv6 distinction in the on-link model. "Shared unicast" addresses are unicast (not multicast) in the IPv6 model and, therefore, support the off-link anycast model (described earlier), but not the on-link anycast model. At the same time, site-local-scoped well-known addresses began being used for recursive resolvers [I-D.ietf-ipv6-dns-discovery], but this use was never standardized (see below in Section 3.4 for more discussion).
Anycast was used for routing to rendezvous points (RPs) for PIM [RFC4610].
"Operation of Anycast Services" BCP 126 [RFC4786] deals with how the routing system interacts with anycast services, and the operation of anycast services.
"Requirements for a Mechanism Identifying a Name Server Instance" [RFC4892] cites the use of anycast with DNS as a motivation to identify individual name server instances, and the Name Server ID (NSID) option was defined for this purpose [RFC5001]. One could view the addition of NSID as an incarnation of locator and identifier separation (where the anycast address is a locator and the NSID is an identifier).
The IAB's "Reflections on Internet Transparency" [RFC4924] briefly mentions how violating transparency can also damage global services that use anycast.
Originally, the IPv6 addressing architecture [RFC1884] [RFC2373] [RFC3513] severely restricted the use of anycast addresses. In particular, they provided that anycast addresses must not be used as a source address, and must not be assigned to an IPv6 host (i.e., only routers). These restrictions were later lifted in 2006 [RFC4291].
In fact, the recent "IPv6 Transition/Co-existence Security Considerations" [RFC4942] overview now recommends:
As discussed in the Overview, "on-link anycast" is employed expressly in IPv6 via ND NAs; see Section 7.2.7 of [RFC4861] for additional information.
"Distributed Authoritative Name Servers via Shared Unicast Addresses" [RFC3258] described how to reach authoritative name servers using multiple unicast addresses, each one configured on a different set of servers. It stated in Section 2.3:
In anycast more generally, most anycast benefits cannot be realized without route withdrawals since traffic will continue to be directed to the link with the failed server. When multiple unicast addresses are used with different sets of servers, a client can still fail over to using a different server address and hence a different set of servers. There can still be reliability problems, however, when each set contains a failed server.
Other assertions included:
"Unique Per-Node Origin ASNs for Globally Anycasted Services" [RFC6382] makes recommendations regarding the use of per-node unique origin ASNs for globally anycasted critical infrastructure services in order to provide routing system discriminators for a given anycasted prefix. The object was to allow network management and monitoring techniques, or other operational mechanisms to employ this new origin AS as a discriminator in whatever manner fits their operating environment, either for detection or policy associated with a given anycasted node.
"Operation of Anycast Services" BCP 126 [RFC4786] was a product of the IETF's GROW working group. The primary design constraint considered was that routing "be stable" for significantly longer than a "transaction time", where "transaction time" is loosely defined as "a single interaction between a single client and a single server". It takes no position on what applications are suitable candidates for anycast usage.
Furthermore, it views anycast service disruptions as an operational problem: "Operators should be aware that, especially for long running flows, there are potential failure modes using anycast that are more complex than a simple 'destination unreachable' failure using unicast."
The document primary deals with global Internet-wide services provided by anycast. Where internal topology issues are discussed they're mostly regarding routing implications, rather than application design implications. BCP 126 also views networks employing per-packet load balancing on equal cost paths as "pathological". This was also discussed in [RFC2991].
Preserving the integrity of a modular layered design for IP protocols on the Internet is critical to its continued success and flexibility. One such consideration is that of whether an application should have to adapt to changes in the routing system.
Higher layer protocols should make minimal assumptions about lower layer protocols. E.g., applications should make minimal assumptions about routing stability, just as they should make minimal assumptions about congestion and packet loss. When designing applications, it would perhaps be safe to assume that the routing system may deliver each packet to a different service instance, in any pattern, with temporal re-ordering being a not-so-rare phenomenon.
Stateful transport protocols (e.g., TCP), without modification, do not understand the properties of anycast and hence will fail probabilistically, but possibly catastrophically, when using anycast addresses in the presence of "normal" routing dynamics. Specifically, if datagrams associated with a given active transaction are routed to a new anycasted end system and that end system lacks state data associated with the active transaction, the session will be reset and hence need to be reinitiated.
Anycast addresses are "safe" to use as destination addresses for an application if the following design points are all met:
It is noteworthy, though, that even under the above circumstances ICMP messages against packets with anycast source addresses may be routed to servers other than those expected. In addition, the Path Maximum Transmission Unit daemon (PMTUD) will likely encounter complications when employed against anycast addresses, since iterations in the PMTU discovery process may have packets routed to different anycast service instances.
Anycast addresses are "safe" to use as source addresses for an application if all of the following design points are met:
Applications able to tolerate an extra round trip time (RTT) to learn a unicast destination address for multi-packet exchanges might safely use anycast destination addresses for service instance discovery. For example, "instance discovery" messages are sent to an anycast destination address, and a reply is subsequently sent from the unique unicast source address of the interface that received the discovery message, or a reply is sent from the anycast source address of the interface that received the message, containing the unicast address to be used to invoke the service. Only the latter of these will avoid potential NAT binding and stateful firewall issues.
One noteworthy historical item is that after the final round of revisions to [I-D.ietf-ipv6-dns-discovery] were made, [RFC4339] was published with a very similar focus, and overlapping content.
Section 3.3 of [RFC4339] proposes a "Well-known Anycast Address" for recursive DNS service configuration in clients to ease configuration and allow those systems to ship with these well-known addresses configured "from the beginning, as, say, factory default". This RFC discussed several options to address the need to configure DNS servers. As a result, the discussions of each (including the use of anycast) were framed with advantages and disadvantages. Accordingly, Section 3.3.1 and 3.3.2, of [RFC4339], each discuss the advantages and disadvantages (respectively) of using anycast in this scenario. During publication the IESG requested that the following "IESG Note" be contained in the document:
Widespread use of anycast for global Internet-wide services or inter- domain services has some scaling challenges. Similar in ways to multicast, each service generates at least one unique route in the global BGP routing system. As a result, additional anycast instances result in additional paths for a given prefix, which scales super- linearly as a function of denseness of inter-domain interconnection within the routing system (i.e., more paths result in more resources, more network interconnections result in more paths).
This is why the Anycast BOF concluded that "the use of global anycast addresses was not expected to scale and hence was expected to be limited to a small number of key uses."
However, one interesting note is that multiple anycast services can (unlike multicast) share a routing system entry if they are all located in a single announced prefix, and if all the servers of all the services are always collocated.
UDP is the "lingua franca" for anycast today. Stateful transports could be enhanced to be more anycast friendly. This was anticipated in Host Anycasting Services [RFC1546], specifically:
The reason for such considerations can be illustrated through an example: one operationally observed shortcoming of using the Transmission Control Protocol (TCP) [RFC0793] and anycast nodes in DNS is that even during the TCP connection establishment, IP control packets from a DNS client may initially be routed to one anycast instance, but subsequent IP packets may be delivered to a different anycast instance if (for example) a route has changed. In such a case, the TCP connection will likely elicit a connection reset, but will certainly result in the disruption of the connection.
Multi-address transports (e.g., SCTP) might be more amenable to such extensions than TCP.
The features needed for address discovery when doing multi-homing in the transport layer are similar to those needed to support anycast.
Middleboxes (e.g., NATs) and stateful firewalls cause problems when used in conjunction with some ways to use anycast. In particular, a server-side transition from an anycast source IP address to a unique unicast address may require new or additional session state, and this may not exist in the middlebox, as discussed previously in Section 3.4.
Anycast is often deployed to mitigate or at least localize the effects of distributed denial of service (DDOS) attacks. For example, with the Netgear NTP fiasco [RFC4085] anycast was used in a distributed sinkhole model [RFC3882] to mitigate the effects of embedded globally-routed Internet addresses in network elements.
"Internet Denial-of-Service Considerations" [RFC4732] notes: that "A number of the root nameservers have since been replicated using anycast to further improve their resistance to DoS".
"Operation of Anycast Services" BCP 126 [RFC4786] cites DoS mitigation, constraining DoS to localized regions, and identifying attack sources using spoofed addresses as some motivations to deploy services using anycast. Multiple anycast service instances such as those used by the root name servers also add resiliency when network partitioning occurs (e.g., as the result of transoceanic fiber cuts or natural disasters).
When using anycast, care must be taken not to simply withdraw an anycast route in the presence of a sustained DoS attack, since the result would simply move the attack to another service instance, potentially causing a cascaded failure. Anycast adds resiliency when such an attack is instead constrained to a single service instance.
It should be noted that there is a significant man in the middle (MITM) exposure in either variant of anycast discovery (see Section 3.4) that in many applications may necessitate the need for end to end security models [RFC4302] [RFC4303] [RFC4305], or even [RFC4033] that enable end systems to authenticate one another, or the data itself.
However, when considering the above suggestion of enabling end systems to authenticate each other, a potential complication can arise. if the service nodes of an anycast deployment are administered by separate authorities (as in the case of the DNS root servers), any server-side authentication credentials that are used must (necessarily) be shared across the administrative boundaries in the anycast deployment. This would likely also be the case with Secure Neighbor Discovery, described in [RFC5909].
Furthermore, as discussed earlier in this document, operational consideration needs to be given to ensure that anycast addresses get advertised and/or filtered in a way that produces intended scope (for example, only advertise a route to your 6to4 relay to ASes that conform to your own Acceptable Use Policy, AUP). This seems to be operationally expensive, and is often vulnerable to errors outside of the local routing domain, in particular when anycasted services are deployed with the intent to scope associated announcements within some local or regional boundary.
As previously discussed, [RFC6382] makes recommendations regarding the use of per-node unique origin ASNs for globally anycasted critical infrastructure services in order to provide routing system discriminators for a given anycasted prefix. Network management and monitoring techniques, or other operational mechanisms may then employ this new discriminator in whatever manner fits their operating environment, either for detection or policy associated with a given anycasted node.
Moreover, the use of per-node unique origin ASNs has the additional benefit of overcoming complications that might arise with the potential deployment of the Resource Public Key Infrastructure (RPKI) [RFC6480]. Without per-node unique origin ASNs, the cryptographic certificates needed to attest to the ROAs of a multi-administrative deployment of anycast would need to be shared. However, if each service instance has a separate ASN, then those ASNs can be managed separately in the RPKI.
Unlike multicast (but like unicast), anycast allows traffic stealing. That is, with multicast, joining a multicast group doesn't prevent anyone else who was receiving the traffic from continuing to receive the traffic. With anycast, adding an anycasted node to the routing system can prevent a previous recipient from continuing to receive traffic because it may now be delivered to the new node instead. As such, if one allows unauthorized anycast nodes onto the network, traffic can be diverted thereby triggering DoS or other attacks. Section 6.3 of BCP 126 [RFC4786] provides expanded discussion on "Service Hijacking" and "traffic stealing".
Unlike unicast (but like multicast), the desire is to allow applications to cause route injection (either directly or as a side effect of doing something else). This combination is unique to anycast and presents new security concerns which are why MAGMA [MAGMA] only got so far. The security concerns include:
These are two of the core issues that were part of the discussion during [RFC1884], the [ANYCASTBOF], and the MAGMA [MAGMA] chartering.
Additional security considerations are scattered throughout the list of references provided herein.
BCP 126 [RFC4786] provides some very solid guidance related to operations of anycasted services, and in particular DNS.
This document covers issues associated with the architectural implications of anycast. This document does not treat in any depth the fact that there are deployed services with TCP transport using anycast today. Evidence exists to suggest that such practice is not "safe" in the traditional and architectural sense (as described in Section 4.2). These sorts of issues are indeed relative, and we recognize sometimes unpredictability in the routing system beyond the local administrative domain can be manageable. That is, despite the inherent architectural problems in the use of anycast with stateful transport and connection-oriented protocols, there is expanding deployment (e.g., for content distribution networks) and situations exist where it may make sense (e.g., such as with service discovery, short-lived transactions, or in cases where dynamically directing traffic to topologically optimal service instances is required). In general, operators should consider the content and references provided herein, and evaluate the benefits and implications of anycast in their specific environments and applications.
In addition, (as noted in Section 2.3) the issue of whether to withdraw anycast routes when there is a service failure is only briefly broached in [RFC3258]. The advice given is that routes should not be withdrawn, in order to reduce operational complexity. However, the issue of route advertisements and service outages deserves greater attention.
There is an inherent tradeoff that exists between the operational complexity of matching service outages with anycast route withdrawals, and allowing anycast routes to persist for services that are no longer available. [RFC3258] maintains that DNS' inherent failure recovery mechanism is sufficient to overcome failed nodes, but even this advice enshrines the notion that these decisions are both application-specific and subject to the operational needs of each deployment. For example, the routing system plays a larger role in DNS when services are anycast. Therefore, operational consideration must be given to the fact that relying on anycast for DNS deployment optimizations means that there are operational tradeoffs related to keeping route advertisements (and withdrawals) symmetric with service availability. For example, in order to ensure that the DNS resolvers in a failed anycast instance's catchment [RFC4786] are able to fail over and reach a non-failed catchment, a route withdrawal is almost certainly required. On the other hand, instability of a DNS process that triggers frequent route advertisement and withdrawal might result in suppression of legitimate paths to available nodes, e.g., as a result of route flap damping [RFC2439].
Rather than prescribing advice that attempts to befit all situations, it should simply be recognized that when using anycast with network services that provide redundancy or resilience capabilities at other layers of the protocol stack, operators should carefully consider the optimal layer(s) at which to provide said functions.
No IANA actions are required.
In summary, operators and application vendors alike should consider the benefits and implications of anycast in their specific environments and applications, and also give forward consideration to how new network protocols and application functions may take advantage of anycast, or how they may be negatively impacted if anycasting is employed.
Many thanks to Kurtis Lindqvist for his early review and feedback on this document. Thanks to Brian Carpenter, Alfred Hoenes, and Joe Abley for their usual careful review and feedback as well as Mark Smith for his detailed review.
Bernard Aboba Jari Arkko Marc Blanchet Ross Callon Alissa Cooper Joel Halpern Russ Housley Eliot Lear Xing Li Andrew Sullivan Dave Thaler Hannes Tschofenig