Network Working Group                                           S. Woolf
Internet-Draft                                             March 5, 2018
Intended status: Informational
Expires: September 6, 2018
Some Considerations on the Use of Domain Names Outside of the Global Public Domain Name System
draft-stw-whatsinaname-02
From time to time, networking protocols need to be able to name things used within the protocol, and resolve the names created or referenced. It's common for protocol designers to attempt to use domain names as the starting point for their systems of names, and the DNS protocol as the starting point for name resolution. Such re-use of DNS naming and resolution conventions can cause issues if not carefully defined and handled, as applications and infrastructure in the modern Internet tend to assume that a "domain name" is an identifier that follows certain composition and allocation rules and is to be resolved by the DNS protocol in the global default scope.
This document acknowledges this class of extensions to the shared domain namespace and considers a framework for the properties a naming and resolution convention should have in the internet protocol environment, including the avoidance of collision with other uses of the namespace. Depending on the answers to the suggested questions, the conclusion may be that domain names will not meet the constraints at hand.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 6, 2018.
Copyright (c) 2018 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
From time to time, networking protocols need to be able to name things used within the protocol, and resolve the names created or referenced. Such identifiers may also need to be persistent in time, across administrative and operational realms, or other transformations. Necessary operations tend to include creating, modifying, and deleting names, and accessing values and relationships that correspond to them.
It's common for protocol designers in this predicament to attempt to use domain names as the starting point for their systems of names, and the DNS as the starting point for name resolution. This is completely understandable-- domain names, and DNS resolution, are well-established in both the expectations of network users and developers, and well-supported by fielded software.
However, there are some risks when the protocol designer attempts to re-use domain names and DNS, even (or especially) with modifications, to support a specific use case or protocol design or deployment constraint. These have been touched upon in several RFCs, and in the long history of struggles to keep evolving DNS itself and the use of domain names as new needs and constraints appear. See in particular RFC 6055 ("IAB Thoughts on Encodings for Internationalized Domain Names"), RFC 6950 ("Architectural Considerations on Application Features in the DNS"), and RFC 6943 ("Issues in Identifier Comparison for Security Purposes").
Most recently, some of these questions have become prominent in the course of requests for new entries in the special use names registry as established by RFC 6761 ("Special Use Domain Names"). The topic raises contention in a number of areas, including risks of collision between different authorities and different uses of names within the abstract domain namespace, which have been considered in the DNSOP WG over the last few years and are cataloged in RFC 8244 ("Special-Use Domain Names Problem Statement") at greater length than this document will do.
This document deals principally with the questions a protocol designer or software developer should ask themselves about what behavior they want from the names they use in the context of a new protocol or scope for names. It also provides a basic framework for describing desired interactions for a naming and resolution context as part of a protocol design.
This approach is admittedly somewhat "DNS-centric," in that it's attempting to address the default assumption that domain names and DNS-like semantics are desirable or even necessarily acceptable for new naming needs. Depending on the answers to these questions, the designer may find that domain names will not meet the constraints at hand. For a different, and sometimes more comprehensive, view on some of the accumulated stresses on the DNS design, see also RFC 8324 ("DNS Privacy, Authorization, Special Uses, Encoding, Characters, Matching, and Root Structure: Time for Another Look?").
This discussion also owes a great deal to the RFCs already mentioned, especially RFC 6950, which "provides guidance to application designers and application protocol designers looking to use the DNS to support features in their applications." The consideration there of how to structure domain names and associated data is invaluable. This document takes a step in a different direction, however, in attempting to separate domain names from DNS protocol in the analysis of new protocol needs. RFC 6950 primarily assumes that the namespace, the database of instantiated names, and the protocol for lookup and retrieval are all of a piece, while it's become the case recently that people are attempting to separate the namespace from specific resolution protocol or even a specific instance of a database of names (namely, the global public Internet DNS), with varying degrees of drama and varying degrees of success.
Recommended reading includes draft-lewis-domain-names, [RFC1034], [RFC2826], [RFC2860], [RFC6950], [RFC6055], [RFC6943], [RFC6761], [RFC8244], and [RFC8324].
The Domain Name System is a critical part of the global internet infrastructure. From the protocol standards perspective, it's comprised of a number of standards-track documents and BCPs, but roughly speaking, it includes a description of naming syntax and semantics, some operational rules for constructing a globally shared database of such names, and a specification of a wire protocol for maintaining, querying, and generating responses from that database. It has always been the case that all three need to be maintained in a coordinated fashion for the DNS to function properly and the DNS database to remain useful.
In an even larger sense, however, domain names and the DNS protocol provide one answer to some fundamental questions for any computer system: naming and the manipulation of names are fundamental topics in computer science. Thus, DNS names and the DNS protocol exist as a common and highly useful solution to the basic need for naming "stuff" in certain applications and activities on the internet. We do occasionally have to notice, however, that they're not the final and complete solutions-- they have weaknesses-- even as they've proven so useful they tend to be re-used where possible.
Domain names considerably predate the Domain Name System. The set of domain names is, however, a superset of the DNS namespace, and the characteristics of the DNS namespace are inherited from it. In particular, part of the abstraction that describes domain names is a tree with an identified root and identified semantics for labeling nodes.
The basic structure of the domain namespace is a tree, with a domain name as a list of nodes in the tree. Such a tree must have a single root in order to maintain the uniqueness of each node. In 2000, the IAB wrote [RFC2826] to clarify that the existence of this root is inherent in the design of the DNS and requires coordination of changes to the root of the global namespace. This remains true as far as the mathematics goes, but is not as simple as it sounds. This abstract root domain isn't limited to names instantiated in the DNS namespace, accessible via the DNS protocol, but of both mathematical and operational necessity, includes them.
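As a rough illustration of the tree abstraction, a domain name can be modeled as the list of labels on the path from the root to a node. The following Python sketch assumes presentation-format names with dot-separated labels and ignores escaping, length limits, and IDNA; the function names are hypothetical and not taken from any specification.

```python
from typing import List


def labels_from_root(name: str) -> List[str]:
    """Return the labels of a presentation-format name, root-first."""
    # A single trailing dot denotes the root in a fully qualified name.
    if name.endswith("."):
        name = name[:-1]
    if not name:
        return []  # the root itself has no labels
    # ASCII labels compare case-insensitively, so fold case here.
    return [label.lower() for label in reversed(name.split("."))]


def is_subdomain(name: str, ancestor: str) -> bool:
    """True if `name` is at or below `ancestor` in the tree."""
    n, a = labels_from_root(name), labels_from_root(ancestor)
    # Compare whole labels, not string suffixes.
    return n[:len(a)] == a
```

Comparing label lists rather than string suffixes is what keeps "badexample.com" from being treated as a descendant of "example.com"; every name is a descendant of the single root, ".".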
For application and protocol designers, then, domain names come with desirable properties such as relatively straightforward structure and widespread conventions for interpretation (such as IDNA to internationalize a name in cases where human-friendliness is important).
This apparent ease of use has been increased in recent years by the publication of RFC 6761, which specifies a registry of domain names for special uses. In a case where a protocol uses domain names and a DNS-like protocol such as mDNS (see [RFC6762]), the registry marks a portion of the abstract domain name space as associated with that use. This allows a protocol- or application-specific node or subtree to be associated with a location in the global domain namespace, offering a degree of assurance that such names are globally unique-- also often a valuable property.
However, there are also risks in this approach. For all the useful properties that come with domain names, they can be tricky to use, and interoperation can be subtle. There's no historically accepted definition of "domain name," and in some cases people use more restricted subsets of domain names such as host names with idiosyncratic limitations of their own. There are security and interoperability risks in comparison of such identifiers (see RFC 6943). They allow people to think of domain name labels as "words" and other natural language analogs, but don't behave as people expect in such contexts.
The situation is complicated by the fact that many applications have their own resolution engines, and parse any input that looks like a sequence of ASCII-string labels separated by dots as a "domain name," with resolution to be requested of DNS by default.
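The kind of heuristic described above might look like the following Python sketch. It is an assumption about how a typical application behaves, not a description of any particular implementation; the label pattern and the function name are hypothetical.

```python
import re

# Assumed "hostname-like" label: ASCII letters, digits, and hyphens,
# 1 to 63 characters, with no hyphen at either end.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def looks_like_domain_name(text: str) -> bool:
    """Guess whether input would be treated as a domain name."""
    # Tolerate one trailing dot, as in a fully qualified name.
    candidate = text[:-1] if text.endswith(".") else text
    labels = candidate.split(".")
    # Two or more dot-separated labels, each matching the pattern.
    return len(labels) >= 2 and all(
        _LABEL.fullmatch(label) is not None for label in labels
    )
```

Under this heuristic "example.com", "printer.local", "foo.onion", and even "readme.txt" are all accepted, and nothing in the string itself says which resolution context was intended.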
Both the usefulness and the limitations of domain names are tied to the characteristics that are consistent and well-known about them, but also to the characteristics that are not well-defined or explicit and may not be consistent. The underlying complication is simply this: domain names are themselves identifiers, but they have no explicit scope or context. Because of the way the modern Internet has evolved and the role of the DNS, the assumed context of a domain name is the global DNS; if such a name occurs in most applications or inputs in the Internet, the default assumption is that it should be resolved in the default context of the DNS, which is usually global in scope and bootstraps the namespace, per the DNS protocol, by configuration of the default available DNS resolver with a public set of "root servers".
Having a default resolution context-- a namespace rooted at a public, widely available set of servers, which can be discovered or configured as part of the DNS protocol-- is very convenient in the case that the intention is to use that public namespace according to the rules of the DNS protocol. And indeed the public DNS is enormously useful in many, many ways.
However, domain names are so useful that people also try to use them in other contexts as well:
Thus any choice to re-use DNS namespace, even without the DNS protocol for resolving names, requires some decisions to be made about namespace management and potential collisions or overlaps between DNS namespace and others.
(Note: very much under construction; should be consistent with the cited RFCs to the degree possible.)
The primary references for this section will be RFC 7719, RFC 8244, draft-lewis-domain-names, and RFC 1034; the primary elements probably include:
This section will offer some questions that should be considered in analysis of a candidate naming scheme for a new or revised protocol.
For the protocol designer who thinks they want to use domain names, RFC 6761 lists a set of questions to be answered for a special use name, discussing how users, DNS name registries, and DNS operators should treat such a name in order to maintain compatibility with the public DNS. However, those questions largely leave undefined how to tell if a special use domain name is really what's required, or how to choose an appropriate string if it is, and don't touch at all on the underlying fundamentals of choosing a naming scheme in the first place.
In general, it's important to discuss separately:
Some questions follow, not yet in any particular order, about how the protocol will use names; they start with the assumption that domain names may be suitable, but may lead to the conclusion that domain names won't solve the problem at hand:
There are also some questions that arise, once a protocol has taken shape, in making a choice of what names are suitable. If the choice is domain names, some analysis still needs to be done. Of the extremely large set of possible domain names, the list of acceptable ones may be quite long, or quite short, depending on the constraints imposed by the protocol and the preferences of the protocol designers.
Such questions include:
Consideration of how to use names within a protocol leads quickly to a set of interrelated structures that have to be defined.
Again people tend to use DNS as a starting point, even though DNS as a protocol has not always been rigorously specified. This has in many ways been an advantage, as it's allowed for the DNS to be extended in useful ways without being too restricted by structure. However, this also causes confusion when considering naming in a new protocol or environment: designers and developers who are used to working with DNS as implemented in the Internet may think of the system as a "blob," but it's probably useful to be able to separate:
Any or all of these attributes can be incompatible with the DNS protocol, if the designer is attempting to modify it in part (mDNS and .local names) or simply to re-use strings that look like domain names in an entirely different way (Tor and .onion names). But doing this carelessly can result in ambiguity of resolution, leakage of names between resolution contexts, and other forms of pathological behavior.
Decades of experience with naming in computer programming and network protocols, and with the DNS and domain names in the internet, suggest a few observations that may be relevant for those looking for a suitable naming system and name resolution protocol for network applications and protocols.
As a starting point, most of them pertain to the challenges of using domain names and DNS conventions in internet protocols.
It seems to be increasingly common for protocol designers to denote a specific name resolution context for a domain in the domain namespace by using a special string, intended to be interpreted as a domain name and then used as a switch into another name resolution context. This is usually done by designating a string to be used as a "special use name" in the rightmost label in a domain name (presentation format) or the node closest to the root in a canonical FQDN. This solution may or may not involve a delegation for the name in the global DNS, or an expectation that the string will not be delegated. (See questions above regarding the assumptions made in a new protocol about potential collision between domain names in its context and domain names in the public DNS or elsewhere.)
As described above, this practice has some benefits. It allows the protocol to take advantage of a number of existing features of the internet environment, including widespread availability of libraries for parsing domain names and a reasonable degree of comfort that names in a subtree of the domain namespace are globally unique.
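The "switch" behavior described above can be sketched in Python as a lookup on the rightmost label. This is a minimal illustration, assuming the two examples mentioned earlier in this document (.local for mDNS and .onion for Tor); the table, the context names, and the function name are hypothetical.

```python
# Hypothetical table mapping a rightmost label to a resolution context.
SPECIAL_USE_CONTEXTS = {
    "local": "mdns",  # multicast DNS, see RFC 6762
    "onion": "tor",   # resolved inside Tor, not via DNS
}


def resolution_context(name: str, default: str = "dns") -> str:
    """Pick a resolution context from the label closest to the root."""
    # Drop one trailing root dot, then fold case before comparing.
    if name.endswith("."):
        name = name[:-1]
    rightmost = name.lower().split(".")[-1]
    # Anything not listed falls through to the global DNS by default.
    return SPECIAL_USE_CONTEXTS.get(rightmost, default)
```

Software that lacks such a table, or has a different one, sends the same name to the default context, which is how names leak from one resolution context into another.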
Potential users of reserved names tend to assume they need a human-readable, single-label domain name as the root of their namespace, and the process of designating such a reserved name is commonly referred to as obtaining or reserving "a TLD". This assumption carries overhead, however, and this apparently simple solution hides some risks. Problems with this approach include:
However, if the "human-readable, single label" constraints are slightly relaxed, the situation becomes a bit more tractable.
The policy problems associated with a single-label or root-level domain name are largely avoidable if a name elsewhere in the domain name hierarchy can be used instead. In particular, for a domain name reserved in an IETF standard, the IETF can direct IANA to reserve it under the .arpa TLD; the approval of the IAB is required as the administrator of record for .arpa under RFC 3172. This course of action is probably the simplest for a special use name that isn't required to be a TLD. If the name needs to be delegated in the DNS (as in the case of home.arpa, which is to be delegated so clients can rely on certain behavior with locally-administered DNSSEC) the IAB can direct IANA accordingly. The IAB can also commit that a domain name intended for resolution outside of the DNS under .arpa will not collide with a DNS name there.
It's also been proposed that a special use name be set aside specifically as the root domain label for "domain names not to be used in the DNS" so that protocol designers and implementers can be reasonably sure that names used in that domain will not collide with names in the global DNS namespace. However, this works only for names that are not required to be single labels. (Reference alt-tld draft.)
Another possible way to support ad hoc use of domain names while limiting the risk of name collision, in the DNS name space or the larger domain name space, would be to allow programmatic formation of random strings, such as the uses standardized for leading underscores or the prefix "xn--" for certain names in the DNS and other IETF protocols. This might be poorly suited to situations where humans were expected to see and assign meaning to the names, but might simplify the use of domain names in a machine-to-machine protocol.
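As a minimal sketch of programmatic label formation, the following Python function joins a fixed prefix to a random token. The prefix "_rnd-" is purely hypothetical and is not a registered or proposed convention; the only constraint taken from the DNS is the 63-octet limit on a label.

```python
import secrets


def random_label(prefix: str = "_rnd-", nbytes: int = 10) -> str:
    """Form a machine-oriented label from a prefix and a random token."""
    # token_hex returns 2 * nbytes lowercase hexadecimal characters.
    label = prefix + secrets.token_hex(nbytes)
    if len(label) > 63:
        raise ValueError("a DNS label is limited to 63 octets")
    return label
```

A label formed this way is unlikely to collide with a name chosen by a human, but it carries no meaning for one either, which is the trade-off noted above.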
An IETF standard cannot force a name to be resolved in a given context, or not. That authority belongs to the operators of name resolvers, for the DNS protocol and otherwise. In the case of DNS, DNS operators determine what names can and can't be resolved with the DNS protocol by users sending queries to their resolvers. In other words, having the IETF document in an RFC that a particular name is to be used for a particular purpose or protocol does not prevent network operators from using the same string as a name for other purposes or in other protocols. An RFC is accepted as guidance by many DNS operators and implementers, however.
RFC 6761 establishes a registry of names that the IETF has designated as "special use domain names." An entry in this registry does not prevent local operators from configuring their environments as they see fit, including allowing such names to leak into the global DNS even if they're not supposed to (often considered a privacy risk). An entry in this registry discourages others from attempting to re-use the same domain names for other purposes or protocols, particularly within the set of IETF protocols.
Concerns are frequently expressed that spurious queries into the DNS are to be avoided for three reasons: leakage of potentially sensitive information into the global internet, the difficulty of debugging once control over where such queries go has been given up, and extra load on the DNS root name servers. The first two concerns are well within the scope of operational concern. However, root name servers are provisioned for abnormal conditions rather than normal loads, so the extra load is probably not a big concern here. It's been the case for decades that most of the load on the root name servers is already spurious, in much the same way that load on email services is a concern only after one has considered that the vast bulk of email is spam.
Human-readable names may pose problems that random strings do not, such as internationalization and intellectual property concerns. "Human readable" is not a constraint to be added casually to the choice of domain names for a protocol or application.
Global uniqueness is also a constraint that comes at a higher price than may be obvious. The contents of the DNS root zone are evolving on a relatively short time scale, and the number of protocols and applications that assume their choices of strings will meet with universal respect from potentially colliding other uses seems to be growing.
This document has no action for IANA. It might, in fact, help make some possible future IANA actions unnecessary.
This document poses no specific security considerations. However, a poorly specified naming scheme at the base of a protocol poses significant security risks and should be avoided.
This draft is the outcome of many conversations over many months, including discussions in the DNSOP WG, the IAB, and the ICANN SSAC. Particular thanks to Ed Lewis, Wendy Seltzer, Ralph Droms, Lyman Chapin, Dave Thaler, Brian Trammell, Ted Lemon, David Conrad, Andrew Sullivan, Ted Hardie, John Klensin, and everyone who's expressed exasperation to the author with respect to the issues discussed here.