Internet DRAFT - draft-chiappa-lisp-architecture
draft-chiappa-lisp-architecture
LISP Working Group J. N. Chiappa
Internet-Draft Yorktown Museum of Asian Art
Intended status: Informational July 16, 2012
Expires: January 17, 2013
An Architectural Perspective on the LISP
Location-Identity Separation System
draft-chiappa-lisp-architecture-01
Abstract
LISP upgrades the architecture of the IPvN internetworking system by
separating location and identity, current intermingled in IPvN
addresses. This is a change which has been identified by the IRTF as
a critically necessary evolutionary architectural step for the
Internet. In LISP, nodes have both a 'locator' (a name which says
_where_ in the network's connectivity structure the node is) and an
'identifier' (a name which serves only to provide a persistent handle
for the node). A node may have more than one locator, or its locator
may change over time (e.g. if the node is mobile), but it keeps the
same identifier.
This document gives additional architectural insight into LISP, and
considers a number of aspects of LISP from a high-level standpoint.
[NOTE: This is still a somewhat rough draft version; a few sections
at the end are just rough frameworks, but almost all the key
sections, and all the front part of the document, are here, and in
something like reasonably complete form.]
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. This document may not be modified,
and derivative works of it may not be created, except to format it
for publication as an RFC or to translate it into languages other
than English.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 17, 2013.
Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Goals of LISP
2.1. Reduce DFZ Routing Table Size
2.2. Deployment of New Namespaces
2.3. Future Development of LISP
3. Architectual Perspectives
3.1. Another Packet-Switching Layer
3.2. 'Double-Ended' Approach
4. Architectual Aspects
4.1. Critical State
4.2. Need for a Mapping System
4.3. Piggybacking of Control on User Data
5. Namespaces
5.1. LISP EIDs
5.1.1. Residual Location Functionality in EIDs
5.2. RLOCs
5.3. Overlapping Uses of Existing Namespaces
5.4. LCAFs
6. Scalability
6.1. Demand Loading of Mappings
6.2. Caching of Mappings
6.3. Amount of State
6.4. Scalability of The Indexing Subsystem
7. Security
7.1. Basic Philosophy
7.2. Design Guidance
7.2.1. Security Mechanism Complexity
7.3. Security Overview
7.3.1. Securing Lookups
7.3.2. Securing The Indexing Subsystem
7.3.3. Securing Mappings
7.4. Securing the xTRs
8. Robustness
9. Fault Discovery/Handling
10. Optimization
11. Open Issues
11.1. Local Open Issues
11.1.1. Missing Mapping Packet Queueing
11.1.2. Mapping Cache Management Algorithm
11.2. Systemic Open Issues
11.2.1. Mapping Database Provider Lock-in
11.2.2. Automated ETR Synchronization
11.2.3. EID Reachability
11.2.4. Detect and Avoid Broken ETRs
12. Acknowledgments
13. IANA Considerations
14. Security Considerations
15. References
15.1. Normative References
15.2. Informative References
Appendix A. Glossary/Definition of Terms
Appendix B. Other Appendices
1. Introduction
This document begins by introducing some high-level architectural
perspectives which have proven useful for thinking about the LISP
location-identity separation system. It then discusses some
architectural aspects of LISP (e.g. its namespaces). The balance
(and bulk) of the document contains architectural analysis of the
LISP system; that is, it reviews from a high-level standpoint various
aspects of that system; e.g. its scalability, security, robustness,
etc.
NOTE: This document assumes a fair degree of familiarity with LISP;
in particular, the reader should have a good 'high-level'
understanding of the overall LISP system architecture, such as is
provided by [Introduction], "An Introduction to the LISP System".
By "system architecture" above, the restricted meaning used there is:
'How the system is broken up into subsystems, and how those
subsystems interact; when does information flows from one to another,
and what that information is.' There is obviously somewhat more to
architecture (e.g. the namespaces of a system, in particular their
syntax and semantics), and that remaining architectural content is
covered here.
2. Goals of LISP
As previously stated in the abstract, broadly, the goal of LISP is to
be a practically deployable architectural upgrade to IPvN which
performs separation of location and identity. But what is the value
of that? What will it allow us to do?
The answer to that obviously starts with the things mentioned in the
"Initial Applications" section of [Introduction], but there are
other, longer-range (and broader) goals as well.
2.1. Reduce DFZ Routing Table Size
One of the main design drivers for LISP, as well as other location-
identity separation proposals, is to decrease the overhead of running
global routing system. In fact, it was this aspect that led the IRTF
Routing RG to conclude that separation of location and identity was a
key architectural underpinning needed to control the growth of the
global routing system. [RFC6115]
As noted in [Introduction], many of the practical needs of Internet
users are today met with techniques that increase the load on the
global routing system (Provider Independent addresses for the
provision of provider independence, multihoming, etc; more-specific
routes for TE; etc.) Provision of these capabilities by a mechanism
which does not involve extra load on the global routing system is
therefore very desirable.
A number of factors, including the use of these techniques, has led
to a great increase in the fragmentation of the address space, at
least in terms of routing table entries. In particular, the growth
in demand for multi-homing has been forseen as driving a large
increase in the size of the global routing tables.
In addition, as the IPv4 address space becomes fuller and fuller,
there will be an inevitable tendency to find use in smaller and
smaller 'chunks' of that space. [RFC6127] This too would tend to
increase the size of the global routing table.
LISP, if successful and widely deployed, offers an opportunity to use
separation of location and identity to control the growth of the size
of the global routing table. (A full examination of this topic is
beyond the scope of this document - see {{find reference}}.)
2.2. Deployment of New Namespaces
Once the mapping system is widely deployed and available, it should
make deployment of new namespaces (in the sense of new syntax, if not
new semantics) easier. E.g. if someone wishes in the future to
devise a system which uses native MPLS [RFC3031] for a data carriage
system joining together a large number of xTRs, it would easy enough
to arrange to have the mappings for destinations attached to those
xTRs abe some sort of MPLS-specific name.
More broadly, the existence of a binding layer, with support for
multiple namespace built into the interface on both sides (see
Section 5) is a tremendously powerful evolutionary tool; one can
introduce a new namespace (on one side) more easily, if it is mapped
to something which is already deployed (on the other). Then, having
taken that step, one can invert the process, and deploy yet another
new namespace, but this time on the other.
2.3. Future Development of LISP
Speculation about long-term future developments which are enabled by
the deployment of LISP is not really proper for this document.
However, interested readers may wish to consult [Future] for one
person's thoughts on this topic.
3. Architectual Perspectives
This section contains some high-level architectural perspectives
which have proven useful in a number of ways for thinking about LISP.
For one, when trying to think of LISP as a complete system, they
provide a conceptual structure which can aid analysis of LISP. For
another, they can allow the application of past analysis of, and
experience with, similar designs.
3.1. Another Packet-Switching Layer
When considering the overall structure of the LISP system at a high
level, it has proven most useful to think of it as another packet-
switching layer, run on top of the original internet layer - much as
the Internet first ran on top of the ARPANET.
All the functions that a normal packet switch has to undertake - such
as ensuring that it can reach its neighbours, and they they are still
up - the devices that make up the LISP overlay also have to do, along
the 'tunnels' which connect them to other LISP devices.
There is, however, one big difference: the fanout of a typical LISP
ITR will be much larger than most classic physical packet switches.
(ITRs only need to be considered, as the LISP tunnels are all
effectively unidirectional, from ITR to ETR - an ETR needs to keep no
per-tunnel state, etc.)
LISP is, fundamentally, a 'tunnel' based system. Tunnel system
designs do have their issues (e.g. the high inter-'switch' fan-out),
but it's important to realize that they also can have advantages,
some of which are listed below.
3.2. 'Double-Ended' Approach
LISP may be thought of as a 'double-ended' approach to enhancing the
architecture, in that it uses pairs of devices, one at each end of a
communication stream. In particular, to interact with the population
of 'legacy' hosts (which will be, inevitably, the vast majority, in
the early stages of deployment) it requires a LISP device at both
ends of the 'tunnel'.
This is in distinction to, say, NAT systems ([RFC1631]), which only
need a device deployed at one end: the host at the other end doesn't
need a matching device at its end to massage the packets, but can
simply consume them on its own, as any packets it receives are fully
normal packets. This allows any site which deploys such a 'single-
ended' device to get the full benefit, whilst acting entirely on its
own. [Wasserman]
The issue is not that LISP uses tunnels. Designs like HIP
([RFC4423]) and ILNP ([ILNP]), which do not involve tunnels, inhabit
a similar space to tunnel-based designs like LISP, in that unless
both ends are upgraded - or there is a proxy at the un-upgraded end -
one doen't get any benefits. So it's really not the tunnel which is
the key aspect, it's the 'all at one end' part which is key. Whether
the system is tunnel, versus non-tunnel, is not that important.
However, the double-ended approach of LISP does have advantages, as
well as costs. To put it simply, the 'feature' of the alternative
approach, that there's only a box at one end, has a 'bug': there's
only a box at one end. There are things which such a design cannot
accomplish, because of that.
To put it another way, does the fact that the packet thus necessarily
has only a single 'name' in it for the entities at each end (i.e. the
IPvN source and destination addresses), because it is a 'normal'
packet, present a limitation? Put that way, it would seem natural
that it should cause certain limits.
To compile a complete list of the things that can be done, when two
separate 'names' are in the packet, is beyond the scope of this
document. However, one example of the kind of thing that can be done
is mobility with open connections, without needing to 'triangle
route' the packets through some sort of 'base station' at the
original location. Another is that is possible to automatically
tunnel IPv6 traffic over IPv4 infrastructure, or vice versa,
invisibly to the hosts on both ends.
In the longer term, having having tunnel boxes will allow (and is
allowing) us to explore other kinds of wrappings. For example, we
can transport 'raw' local-network packets (such as Ethernet MAC
frames) across an IPvN infrastructure.
One could also wrap packets in non-IPvN formats: perhaps to take
direct advantage of the capabilities of underlying switching fabrics
(e.g. MPLS [RFC3031]); perhaps to deploy new carriage protocols,
etc, where non-standard packet formats will allow extended semantics.
4. Architectual Aspects
LISP does take some novel architectural approaches in a number of
ways: e.g. its use of a separate mapping system, etc, etc. This
section contains some commentary on some of the high-level
architectural aspects of LISP.
4.1. Critical State
LISP does have 'critical state' in the network (i.e. state which, if
if lost, causes the communication to fail). However, because LISP is
designed as an overall system, 'designing it in' allows for a
'systems' approach to its state issues. In LISP, this state has been
designed to be maintained in an 'architected' way, so it does not
produce systemic brittleness in the way that the state in NATs does.
For instance, throughout the system, provisions have been made to
have redundant copies of state, in multiple devices, so that the loss
of any one device does not necessarily cause a failure of an ongoing
connection.
4.2. Need for a Mapping System
LISP does need to have a mapping system, which brings design,
implementation, configuration and operational costs. Surely all
these costs are a bad thing? However, having a mapping system have
advantages, especially when there is a mapping layer which has global
visibility (i.e. other entities know that it is there, and have an
interface designed to be able to interact with it). This is unlike,
say, the mappings in NAT, which are 'invisible' to the rest of the
network.
In fact, one could argue that the mapping layer is LISP's greatest
strength. Wheeler's Axiom* ('Any problem in computer science can be
solved with another level of indirection') indicates that the binding
layer available with the LISP mapping system will be of great value.
Again, it is not the job of this document to list them all - and in
any event, there is no way to forsee them all.
The author of this document has often opined that the hallmark of
great architecture is not how well it does the things it was designed
to do, but how well it does things it was never expected to have to
handle. Providing such a powerful and generic binding layer is one
sure way to achieve the sort of lasting flexibility and power that
leads to that outcome.
[Footnote *: This Axiom is often mis-attributed to Butler Lampson,
but Lampson himself indicated that it came from David Wheeler.]
4.3. Piggybacking of Control on User Data
LISP piggybacks control transactions on top of user data packets.
This is a technique that has a long history in data networking, going
back to the early ARPANET. [McQuillan] It is now apparently regarded
as a somewhat dubious technique, the feeling seemingly being that
control and user data should be strictly segregated.
It should be noted that _none_ of the piggybacking of control
functionality in LISP is _architecturally fundamental_ to LISP. All
of the functions in LISP which are performed with piggybacking could
be performed almost equally well with separate control packets.
The "almost" is solely because it would cause more overhead (i.e.
control packets); neither the response time, robustness, etc would
necessarily be affected - although for some functions, to match the
response time observed using piggybacking on user data would need as
much control traffic as user data traffic.
This technique is particularly important, however, because of the
issue identified at the start of this section - the very large fanout
of the typical LISP switch. Unlike a typical router, which will have
control interactions with only a few neighbours, a LISP switch could
eventually have control interactions with hundreds, or perhaps even
thousands (for a large site) of neighbours.
Explicit control traffic, especially if good response times are
desired, could amount to a very great deal of overhead in such a
case.
5. Namespaces
One of the key elements in any architecture, or architectural
analysis, are the namespaces involved: what are their semantics and
syntax, what are the kinds of things they name, etc.
LISP has two key namespace, EIDs and RLOCs, but it must be emphasized
that on an architectural level, neither the syntax, or, to a lesser
degree, the semantics, of either are absolutely fixed. There are
certain core semantics which are generaly unchanging (such as the
notion that EIDs provide only identity, whereas RLOCs provide
location), but as we will see, there is a certain amount of
flexibility available for the long-term.
In particular, all of LISP's key interfaces always include an Address
Family Identifier (AFI) [AFI] for all names, so that new forms can be
introduced at any time the need is felt. Of course, in practise such
an introduction would not be a trivial exercise - but neither is is
impossibly painful, as is the case with IPv4's 32-bit addresses,
which are effectively impossible to upgrade.
5.1. LISP EIDs
A 'classic' EID is defined as a subset of the possible namespaces for
endpoints. [Chiappa] Like most 'proper' endpoint names, as proposed
there, they contain contain no information about the location of the
endpoint. EIDs are the subset of possible endpoint names which are:
fixed length, 'reasonably' short', binary (i.e. not intended for
direct human use), globally unique (in theory), and allocated in a
top-down fashion (to achieve the former).
LISP EIDs are, in line with the general LISP deployment philosophy, a
reuse of something already existing - i.e. IPvN addresses. For
those used as in LISP as EIDs, LISP removes much (or, in some cases,
all) of the location-naming function of IPvN addresses.
In addition, the goal is to have EIDs name hosts (or, more properly,
their end-end communication stacks), whereas the other LISP namespace
group (RLOCs) names interfaces. The idea is not just to have two
namespaces (with different semantics), but also to use them to name
_different classes of things_ - classes which currently do not have
clearly differentiated names. This should produce even more
functionality.
5.1.1. Residual Location Functionality in EIDs
LISP retains, especially in the early stages of the deployment, in
many cases some residual location-naming functionality in EIDs, This
is to allow the packet to be correctly routed/forwarded to the
destination node, once it has been unwrapped by the ETR - and this is
a direct result of LISP's deployment philosophy (see [Introduction],
Section "Deployment").
Clearly, if there are one or more unmodified routers between the ETR
and the desination node, those routers will have to perform a routing
step on the packet, for which it will need _some_ information as to
the location of the destination.
One can thus view such LISP EIDs, which retain 'stub' location
information, as 'addresses' (in the definition of the generic sense
of this term, as used here), but with the location information
restricted to a limited, local scope.
This retention of some location functionality in LISP EIDs, in some
cases, has led some people to argue that use of the name 'EID' is
improper. In response, it was suggested that LISP use the term
'LEID', to distinguish LISP's 'bastardized' EIDs from 'true' EIDs,
but this usage has never caught on.
It has also been suggested that one usage mode for LISP EIDs, in
existing software loads, is to assign them as the address on an
internal virtual interface; all the real interfaces would have RLOCs
only. [Templin] This would make such LISP EIDs functionally
equivalent to 'real' EIDs - they are names which are purely identity,
have no location information of any kind in them, and cannot be used
to make any routing decisions anywhere outside the host.
It is true that even in such cases, the EID is still not a 'pure'
EID, as it names an interface, not the end-end stack directly.
However, to do a perfect job here (or on separation of location and
identity) is impossible without modifying existing hosts (which are,
inevitably, almost always one end of an end-end communication) - and
that has been ruled out, for reasons of viable deployment.
The need for interoperation with existing unmodified hosts limits the
semantic changes one can impose, much as one might like to provide a
cleaner separation. (Future evolution can bring us toward that
state, however: see [Future].)
5.2. RLOCs
RLOCs are basically pure 'locators' [RFC1992], although their syntax
and semantics is restricted at the moment, because in practise the
only forms of RLOCs supported are IPv4 and IPv6.
5.3. Overlapping Uses of Existing Namespaces
It is in theory possible to have a block of IPvN namespace used as
both EIDs and RLOCs. In other words, EIDs from that block might map
to some other RLOCs, and that block might also appear in the DFZ as
the locators of some other ETRs.
This is obviously potentially confusing - when a 'bare' IPvN address
from one of these blocks, is it the RLOC, or the EID? Sometimes it
it obvious from the context, but in general one could not simply have
a (hypothetical) table which assigns all of the address space to
either 'EID' or 'RLOC'.
In addition, such usage will not allow interoperation of the sites
named by those EIDs with legacy sites, using the PITR mechanism
([Introduction], Section "Proxy Devices"), since that mechanisms
depends on advertizing the EIDs into the DFZ, although the LISP-NAT
mechanism should still work ([Introduction], Section "LISP-NAT").
Nevertheless, as the IPv4 namespace becomes increasingly used up,
this may be an increasingly attractive way of getting the 'absolute
last drop' out of that space.
5.4. LCAFs
{{To be written.}}
--- Key-ID
--- Instance-IDs
6. Scalability
As with robustness, any global communication system must be scalable,
and scalable up to almost any size. As previously mentioned (xref
target="Perspectives-Packet"/), the large fanouts to be seen with
LISP, due to its 'overlay' nature, present a special challenge.
One likely saving grace is that as the Internet grows, most sites
will likely only interact with a limited subset of the Internet; if
nothing else, the separation of the world into language blocks means
that content in, say, Chinese, will not be of interest to most of the
rest of the world. This tendency will help with a lot of things
which could be problematic if constant, full, N^2 connectivity were
likely on all nodes; for example the caching of mappings.
6.1. Demand Loading of Mappings
One question that many will have about LISP's design is 'why demand-
load mappings - why not just load them all'? It is certainly true
that with the growth of memory sizes, the size of the complete
database is such that one could reasonably propose keeping the entire
thing in each LISP device. (In fact, one proposed mapping system for
LISP, named NERD, did just that. [NERD])
A 'pull'-based system was chosen over 'push' for several reasons; the
main one being that the issue is not just the pure _size_ of the
mapping database, but its _dynamicity_. Depending on how often
mappings change, the update rate of a complete database could be
relatively large.
It is especially important to realize that, depending on what
(probably unforseeable) uses eventually evolve for the
identity->location mapping capability LISP provides, the update rate
could be very high indeed. E.g. if LISP is used for mobility, that
will greatly increase the update rate. Such a powerful and flexible
tool is likely be used in unforseen ways (Section 4.2), so it's
unwise to make a choice that would preclude any which raise the
update rate significantly.
Push as a mechanism is also fundamentally less desirable than pull,
since the control plane overhead consumed to load and maintain
information about unused destinations is entirely wasted. The only
potential downside to the pull option is the delay required for the
demand-loading of information.
(It's also probably worth noting that many issues that some people
have with the mapping approach of LISP, such as the total mapping
database size, etc are the same - if not worse - for push as they are
for pull.)
Finally, for IPv4, as the address space becomes more highly used, it
will become more fragmented - i.e. there will tend to be more,
smaller, entries. For a routing table, which every router has to
hold, this is problematic. For a demand-loaded mapping table, it is
not bad. Indeed, this was the original motivation for LISP
([RFC4984]) - although many other useful and desirable uses for it
have since been enumerated (see [Introduction], Section
"Applications").
For all of these reasons, as long as there is locality of reference
(i.e. most ITRs will use only a subset of the entire set), it makes
much more sense to use the a pull model, than the classic push one
heretofore seen widely at the internetwork layer (with a pull
approach thus being somewhat novel - and thus unsettling to many - to
people who work at that layer).
It may well be that some sites (e.g. large content providers) may
need non-standard mechanisms - perhaps something more of a 'push'
model. This remains to be determined, but it is certainly feasible.
6.2. Caching of Mappings
It should be noted that the caching spoken of here is likely not
classic caching, where there is a fixed/limited size cache, and
entries have to be discarded to make room for newly needed entries.
The economics of memory being what they are, there is no reason to
discard mappings once they have been loaded (although of course
implementations are free to chose to do so, if they wish to).
This leads to another point about the caching of mappings: the
algorithms for management of the cache are purely a local issue. The
algorithm in any particular ITR can be changed at will, with no need
for any coordination. A change might be for purposes of
experimentation, or for upgrade, or even because of environmental
variations - different environments might call for different cache
management strategies.
The local, unsynchronized replacability of the cache management
scheme is the architectural aspect of the design; the exact
algorithm, which is engineering, is not.
6.3. Amount of State
{{To be written.}} [Iannone]
-- Mapping cache size
--- Mention studies
-- Delegation cache size (in MRs)
--- Mention studies
-- Any others?
6.4. Scalability of The Indexing Subsystem
LISP initially used an indexing subsystem called ALT. [ALT] ALT was
relatively easy to construct from existing tools (GRE, BGP, etc), but
it had a number of issues that made it unsuitable for large-scale
use. ALT is now being superseded by DDT. [DDT]
The basic structure and operation of DDT is identical to that of
TREE, so the extensive simulation work done for TREE applies equally
to DDT, as do the conclusions drawn about TREE's superiority to ALT.
[Jakab]
From an architectural point of view, the main advantage of DDT is
that it enables client side caching of information about intermediate
nodes in the resolution hierarchy, and also enables direct
communication with them. As a result, DDT has much better scaling
properties than ALT.
The most important result of this change is that it avoids a
concentration of resolution request traffic at the root of the
indexing tree, a problem which by itself made ALT unsuitable for a
global-scale system. The problem of root concentration (and thus
overload) is almost unavoidable in ALT (even if masses of 'bypass'
links are created).
ALT's scalability also depends on enforcing an intelligent
organization that aincreases aggregation. Unfortunately, the current
backbone routing BGP system shows that there is a risk of an organic
growth of ALT, one which does not achieve aggregation. DDT does not
display this weakness, since its organization is inherently
hierarchical (and thus inherently aggregable).
The hierarchical organization of DDT also reduces the possibility for
a configuration error which interferes with the operation of the
network (unlike the situation with the current BGP DFZ). DDT
security mechanisms can also help produce a high degree of
robustness, both against misconfiguration, and deliberate attack.
The direct communication with intermediate nodes in DDT also helps to
quickly locate problems when they occur, resulting in better
operational characteristics.
Next, since in ALT mapping requests must be transmitted through an
overlay network, a significant share of requests can see
substantially increased latencies. Simulation results in the TREE
work clearly showed, and quantified, this effect.
The simulations also showed that the nodes composing the ALT and DDT
networks for a mapping database of full Internet size could have
thousands of neighbours. This is not an issue for DDT, but would
almost certainly have been problematic for ALT nodes, since handling
that number of simultaneous BGP sessions would likely to be
difficult.
7. Security
LISP does not yet have an overarching security architecture. Many
parts of the system have been hardened, but more on a case-by case
basis, rather than from an overall perspective. (This is in part due
to the 'just enough' approach to security initially taken in LISP;
see [Introduction], Section "Just Enough Security".)
This section represents an attempt to produce a more broadly-based
view of security in LISP; it mostly resulted from an attempt to add
security to the DDT indexing system ([DDT]), but the analysis is is
general enough to apply to LISP broadly.
The _good_ thing about the Internet is that it brings the world to
your doorstep - masses of information from all around the world are
instantly available on your computing device. The _bad_ thing about
the Internet is that it brings the world to your doorstep - including
legions of crackers, thieves, and general scum and villainy. Thus,
any node may be the target of fairly sophisticated attack - often
automated (thereby reducing the effort required of the attacker to
spread their attack as broadly as possible).
Security in LISP faces many of the same challenges as security for
other parts of the Internet: good security usually means work for the
users, but without good security, things are vulnerable.
The Internet has seen many very secure systems devised, only to see
them fail to reach wide adoption; the reasons for that are complex,
and vary, but being too much work to use is a common thread. It is
for this reason that LISP attempts to provide 'just enough' security
(see [Introduction], Section "Just Enough Security").
7.1. Basic Philosophy
To square this circle, of needing to have very good security, but of
it being too difficult to use very good security, the general concept
is for LISP to have a series of 'graded' security measures available,
with the 'ultimate' security mechanisms being very high-grade indeed.
The concept is to devise a plan in which LISP can simultaneously
attempt to have not just 'ultimate' security, but also one or more
'easier' modes, ones which will be easier to configure and use. This
'easier' mode can be both an interim system (with the full powered
system available for when it it needed), as well as the system used
in sections of the network where security is less critical (following
the general rule that the level of any security should generally be
matched to what is being protected).
The challenge is to do this in a way that does not make the design
more complex, since it has to include both the 'full strength'
mechanism(s), and the 'easier to configure' mechanism(s). This is
one of the fundamental tradeoffs to struggle with: it is easy to
provide 'easier to configure' options, but that may make the overall
design more complex.
As far as making it hard to implement to begin with (also something
of a concern initially, although obviously not for the long term): we
can make it 'easy' to deploy initially by simply not implementing/
configuring the heavy-duty security early on. (Provided, of course,
that the packet formats, etc, needed to support such security are all
included in the design to begin with.)
7.2. Design Guidance
In designing the security, there are a small number of key points
that will guide the design:
- Design lifetime
- Threat level
How long is the design intended to last? If LISP is successful, a
minimum of a 50-year lifetime is quite possible. (For comparison,
IPv4 is now 34 at the time of writing this, and will be around for at
least several decades yet, if not longer; DNS is 28, and will
probably last indefinitely.)
How serious are the threats it needs to meet? As mentioned above,
the Internet can bring the worst crackers from anywhere to any
location, in a flash. Their sophistication level is rising all the
time: as the easier holes are plugged, they go after others. This
will inevitably eventually require the most powerful security
mechanisms available to counteract their attacks.
Which is not to say that LISP needs to be that secure _right away_.
The threat will develop and grow over a long time period. However,
the basic design has to be capable of being _securable_ to the
expanded degree that will eventually be necessary. However,
_eventually_ it will need to be as securable as, say, DNS - i.e. it
_can_ be secured to the same level, although people may chose not to
secure their LISP infrastructure as well as DNSSEC potentially does.
[RFC4033]
In particular, it should be noted that historically many systems have
been broken into, not through a weakness in the algorithms, etc, but
because of poor operational mechanics. (The well-known 'Ultra'
breakins of the Allies were mostly due to failures in operational
procedure. [Welchman]) So operational capabilities intended to
reduce the chance of human operational failure are just as important
as strong algorithms; making things operationally robust is a key
part of 'real' security.
7.2.1. Security Mechanism Complexity
Complexity is bad for several reasons, and should always be reduced
to a minimum. There are three kinds of complexity cost: protocol
complexity, implementation complexity, and configuration complexity.
We can further subdivide protocol complexity into packet format
complexity, and algorithm complexity. (There is some overlap of
algorithm complexity, and implementation complexity.)
We can, within some limits, trade off one kind of complexity for
others: e.g. we can provide configuration _options_ which are simpler
for the users to operate, at the cost of making the protocol and
implementation complexity greater. And we can make initial (less
capable) implementations simpler if we make the protocols slightly
more complex (so that early implementations don't have to implement
all the features of the full-blown protocol).
It's more of a question of some operational convenience/etc issues -
e.g. 'How easy will it be to recover from a cryptosystem
compromise'. If we have two ways to recover from a security
compromise, one which is mostly manual and a lot of work, and another
which is more automated but makes the protocol more complicated, if
compromises really are very rare, maybe the smart call _is_ to go
with the manual thing - as long as we have looked carefully at both
options, and understood in some detail the costs and benefits of
each.
7.3. Security Overview
First, there are two different classes of attack to be considered:
denial of service (DoS, i.e. the ability of an intruder to simply
cause traffic not to successfully flow) versus exploitation (i.e. the
ability to cause traffic to be 'highjacked', i.e. traffic to be sent
to the wrong location).
Second, one needs to look at all the places that may be attacked.
Again, LISP is a relatively simple system, so there are not that many
parts to examine. The following are the things we need to secure:
- Lookups
- Indexing
- Mappings
7.3.1. Securing Lookups
{{To be written.}} Nonces, [SecurityReq]
7.3.2. Securing The Indexing Subsystem
It is envisioned that DDT will be highly securable, with all the
delegations cryptographiclly secured via public-private signatures,
very similar to the way DNS is ([RFC4033]).
The detailed mechanisms will be based on DNS's; this has the obvious
benefit that all the lessons of DNS's years of practical experience
with deployment, operations, etc, as well as the improvements to the
basic design of DNS Security to provide a secure but usable system
can be taken into account. However, DDT's security will also apply
the thinking above, about making a 'versio' which is easier to use
available.
{{To be written.}}
7.3.3. Securing Mappings
There are two approaches to securing the provision of mappings. The
first, which is of course not completely satisfactory, is to only
secure the channel between the ITR and the entities involved in
providing mappings for it. (See above, Section 7.3.1)
The second is to secure the mappings themselves, by signing them 'at
birth' (much the same way in which DNS Security operates).
[RFC4033]. There was an attempt early on to suggest such a system
for LISP ([SecurityAuth]), but it was not adopted (although the
particular proposal was rather complex).
In the long run, the latter approach would obviously be superior,
since it would be almost immune to any compromises of the mapping
distribution system. {{Tie-in to space allocation security}}
7.4. Securing the xTRs
--- Cache management
--- Unsoliticed Map-Replies are _very bad_ - must go through
mapping system to verify that the sender is authoritative for
that range of EIDs
8. Robustness
-- Depends on deployment as well as design
-- Architected, visible replication of state/data
-- Overlapping mechanisms (ref redundancy as key for robustness)
9. Fault Discovery/Handling
Any global communication system must be robust, and to be robust, it
must be able to discover and handle problems. LISP's general
philosophy of robustness is usually to have overlapping, simple
mechanisms to discover and repair problems.
10. Optimization
-- Philosophy
-- Piggybacking
-- 'Wiretapping' return mappings
--- Security is an issue on that
11. Open Issues
Although much work has been done on LISP, and it operates
satisfactorily in a reasonably large initial deployment, there are a
few potentially problematic issues which remain. It is not clear if
they will be issues which need to be dealt, since they have not
proven to be obstacles so far, but it is worth listing them.
We can divide them in _local_ issues, i.e. ones which can be solved
on a node-by-node basis, without requiring co-ordinated change, and
systemic issues, which are obviously more problematic, since they
could require co-ordinated changes to the protocols.
11.1. Local Open Issues
11.1.1. Missing Mapping Packet Queueing
Currently, some (all?) ITRs discard packets when they need a
mapping, but have not loaded one yet, thereby causing the applicaton
to have to retransmit their opening packet. True, many ARP
implementations use the same strategy, but the average APR cache will
only ever contain a few mappings, so it will not be so noticeable as
with the mapping cache in an ITR, which will likely contain
thousands.
Obviously, they could queue the packets while waiting to load the
mapping, but this presents a number of subtle implementation issues:
the ITR must make sure that it does not queue too many packets, etc.
In particular, if such packets are queued, this presents a potential
DoS attack vector, unless the code is carefully written with that
possibility in mind.
11.1.2. Mapping Cache Management Algorithm
Relatively little work has been done on sophisticated mapping cache
management algorithms; in particular, the issue of which mapping(s)
to drop if the cache reaches some maximum allowed size.
This particular issue has also been identified as another potential
DoS attack vector.
11.2. Systemic Open Issues
11.2.1. Mapping Database Provider Lock-in
This refers to the fact that if one does not like the entity which is
providing the indexing for the part of the address space which one's
EIDs are allocated out of, there isn't probably isn't any way to
switch to an alternative provider.
It is not clear that this is a real probem, though - the fact that
all DNS top-level zones only have a single registry has not been a
problem, nor has the fact that if one doesn't like the service the
registry offers, one can't take one's DNS name to another registry.
Doing anything about it would also be difficult. Although it is
_technically_ possible to duplicate any node in the delegation tree,
and in theory such duplicates could be provided by different
providers, it is not clear that such an arrangement would make
_business_ sense.
For instance, if the holder of 10.1.1/24 decides they do not like the
entity providing indexing for 10.1/16 (call them E1), and ask another
entity (E2) to provide alternative service for 10.1/16, two problems
arise. First, E1 is _still_ going to have to maintain the correct
data for 10.1.1/24, and response to queries asking about them.
Second, E2 will similarly have to maintain data for, and reply to
queries about, all the other space-holders in 10.1/16 - even though
they will likely not have any business relationship with them.
11.2.2. Automated ETR Synchronization
LISP requires that all the ETRs which are authoritative for the
mappings for a particular address block return the same mapping data.
In particular, their idea of the 'liveness' of all the ETRs should be
identical, and correct.
At the moment, this is mostly a manual process, although liveness
information can be currently be gathered from some IGPs.
11.2.3. EID Reachability
At the moment, LISP assumes that if an ETR is reachable from a given
ITR, all destination EIDs behind that ETR are reachable from that
ETR. There is no way to detect if any are not, nor to switch to an
alternate ETR.
It is not clear that this is a problem that needs attention. The
same has been true for all border routers for many years now, and
there does not seem to be any general mechanism to deal with it
(Although some BGP implementations may advertize changes in
reachability status if what they are seeing from their IGP changes.)
11.2.4. Detect and Avoid Broken ETRs
{{To be written}}
12. Acknowledgments
The author would like thank all the members of the core LISP group
for their willingness to allow him to add himself to their effort,
and for their enthusiasm for whatever assistance he has been able to
provide. He would also like to thank (in alphabetical order) Vina
Ermagan, Vince Fuller, and Joel Halpern for their careful review of,
and helpful suggestions for, this document. Grateful thanks also to
Vince Fuller for help with XML.
A final thanks is due to John Wrocklawski for the author's
organizational affiliation. This memo was created using the xml2rfc
tool
13. IANA Considerations
This document makes no request of the IANA.
14. Security Considerations
This memo does not define any protocol and therefore creates no new
security issues.
15. References
15.1. Normative References
[DDT] V. Fuller, D. Lewis, and D. Farinacci, "LISP
Delegated Database Tree", draft-fuller-lisp-ddt-01
(work in progress), March 2012.
[Future] J. N. Chiappa, "Potential Long-Term Developments With
the LISP System", draft-chiappa-lisp-evolution-00
(work in progress), July 2012.
[Introduction] J. N. Chiappa, "An Introduction to the LISP Location-
Identity Separation System",
draft-chiappa-lisp-introduction-00 (work in
progress), July 2012.
[SecurityAuth] R. Gagliano, "A Profile for Endpoint Identifier
Origin Authorizations (IOA)",
draft-rgaglian-lisp-iao-00 (work in progress),
March 2009.
[SecurityReq] F. Maino, V. Ermagan, A. Cabellos, D. Saucez, and
O. Bonaventure, "LISP-Security (LISP-SEC)",
draft-ietf-lisp-sec-02 (work in progress),
March 2012.
[AFI] IANA, "Address Family Indicators (AFIs)", Address
Family Numbers, January 2011, <http://www.iana.org/
assignments/address-family-numbers>.
15.2. Informative References
[RFC1631] K. Egevang and P. Francis, "The IP Network Address
Translator (NAT)", RFC 1631, May 1994.
[RFC1992] I. Castineyra, J. N. Chiappa, and M. Steenstrup, "The
Nimrod Routing Architecture", RFC 1992, August 1996.
[RFC3031] E. Rosen, A. Viswanathan, and R. Callon,
"Multiprotocol Label Switching Architecture",
RFC 3031, January 2001.
[RFC4033] R. Arends, R. Austein, M. Larson, D. Massey, and
S. Rose, "DNS Security: Introduction and
Requirements", RFC 4033, March 2005.
[RFC4423] R. Moskowitz and P. Nikander, "Host Identity Protocol
(HIP) Architecture", RFC 4423, May 2006.
[RFC4984] D. Meyer, L. Zhang, and K. Fall, "Report from the IAB
Workshop on Routing and Addressing", RFC 4984,
September 2007.
[RFC6115] T. Li, Ed., "Recommendation for a Routing
Architecture", RFC 6115, February 2011.
Perhaps the most ill-named RFC of all time; it
contains nothing that could truly be called a
'routing architecture'.
[RFC6127] J. Arkko and M. Townsley, "IPv4 Run-Out and IPv4-IPv6
Co-Existence Scenarios", RFC 6127, May 2011.
[ALT] D. Farinacci, V. Fuller, D. Meyer, and D. Lewis,
"LISP Alternative Topology (LISP-ALT)",
draft-ietf-lisp-alt-10 (work in progress),
December 2011.
[NERD] E. Lear, "NERD: A Not-so-novel EID to RLOC Database",
draft-lear-lisp-nerd-09 (work in progress),
April 2012.
[ILNP] R.J. Atkinson and S.N. Bhatti, "ILNP Architectural
Description", draft-irtf-rrg-ilnp-arch-05 (work in
progress), May 2012.
[Chiappa] J. N. Chiappa, "Endpoints and Endpoint Names: A
Proposed Enhancement to the Internet Architecture",
Personal draft (work in progress), 1999,
<http://www.chiappa.net/~jnc/tech/endpoints.txt>.
[Jakab] L. Jakab, A. Cabellos-Aparicio, F. Coras, D. Saucez,
and O. Bonaventure, "LISP-TREE: A DNS Hierarchy to
Support the LISP Mapping System", in 'IEEE Journal on
Selected Areas in Communications', Vol. 28, No. 8,
pp. 1332-1343, October 2010.
[Iannone] L. Iannone and O. Bonaventure, "On the Cost of
Caching Locator/ID Mappings", in 'Proceedings of the
3rd International Conference on emerging Networking
EXperiments and Technologies (CoNEXT'07)', ACM, pp.
1-12, December 2007.
[McQuillan] J. M. McQuillan, W. R. Crowther, B. P. Cosell,
D. C. Walden, and F. E. Heart, "Improvements in the
Design and Performance of the ARPA Network",
Proceedings AFIPS 1972 FJCC, Vol. 40, pp. 741-754.
[Templin] F. Templin, "LISP WG", LISP WG list
message, Message-ID: 39C363776A4E8C4A94691D2BD9D1C9A1
05B0AC71@XCH-NW-7V2.nw.nos.boeing.com, 13
March 2009,, <http://www.ietf.org/mail-archive/web/
lisp/current/msg00269.html>.
[Wasserman] M. Wasserman, "IPv6 networking: Bad news for small
biz", IETF list message, Message-Id:
D11C4A34-7362-423E-A60E-476FC5D61D37@lilacglade.org,
5 April 2012, <https://www.ietf.org/ibin/
c5i?mid=6&rid=49&gid=0&k1=933&k2=62733&
tid=1340933524>.
[Welchman] G. Welchman, "The Hut Six Story", Allen Lane,
London, pg. 3, 1982.
A truly monumental book; the ground it covers ranges
from his work helping break German codes in World War
II to his experience with securing data packet
networks!
Appendix A. Glossary/Definition of Terms
- Address
- Locator
- EID
- RLOC
- ITR
- ETR
- xTR
- PITR
- PETR
- MR
- MS
- DFZ
Appendix B. Other Appendices
-- Location/Identity Separation Brief History
-- LISP History
-- Old models (LISP 1, LISP 1.5, etc)
-- Different mapping distribution models (e.g. LISP-NERD)
-- Different mapping indexing models (LISP-ALT
forwarding/overlay model),
LISP-TREE DNS-based, LISP-CONS)
Author's Address
J. Noel Chiappa
Yorktown Museum of Asian Art
Yorktown, Virginia
USA
EMail: jnc@mit.edu