The syntax for allowed Top-Level Domain (TLD) labels in the Domain Name System (DNS) is not clearly applicable to the encoding of Internationalised Domain Names (IDNs) as TLDs.
This document provides a concise specification of TLD label syntax based on existing syntax documentation, extended minimally to accommodate IDNs.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”
This Internet-Draft will expire on October 24, 2010.
Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
1. Introduction
2. Background
3. TLD Label Syntax Specification
4. Policy Considerations
5. IANA Considerations
6. Security Considerations
7. References
7.1. Normative References
7.2. Informative References
Appendix A. Change History
A.1. draft-liman-tld-names-02
A.2. draft-liman-tld-names-01
A.3. draft-liman-tld-names-00
Authors' Addresses
1. Introduction
The syntax of Top-Level Domain (TLD) labels in the Domain Name System (DNS) was specified somewhat imprecisely in [RFC1123] (Braden, R., “Requirements for Internet Hosts - Application and Support,” October 1989.) which required that TLD names be "alphabetic". This is commonly interpreted as excluding the hyphen character and numeric digits from TLD labels. This restriction does not accommodate the US-ASCII encoding of Internationalised Domain Names (IDNs), as specified in [I‑D.ietf‑idnabis‑defs] (Klensin, J., “Internationalized Domain Names for Applications (IDNA): Definitions and Document Framework,” January 2010.). A more detailed discussion of the existing specifications can be found in Section 2 (Background).
This document updates the definition of allowable top-level domain names to support IDNs but places some restrictions on the choice of IDN labels, consistent with the existing specification for US-ASCII TLD labels. See Section 3 (TLD Label Syntax Specification) for the updated specification.
This document focuses narrowly on the issue of allowable labels in TLDs and does not (and is not intended to) make any other changes or clarifications to existing domain name syntax rules.
Note that the specification in this document is not the only factor in choosing TLD labels; many considerations external to the IETF also form part of that wider policy. See Section 4 (Policy Considerations) for further discussion.
2. Background
[RFC0952] (Harrenstien, K., Stahl, M., and E. Feinler, “DoD Internet host table specification,” October 1985.) defines a host name as follows:
'A "name" ... is a text string up to 24 characters drawn from the alphabet (A-Z), digits (0-9), minus sign (-), and period (.). Note that periods are only allowed when they serve to delimit components of "domain style names". (See RFC-921, "Domain Name System Implementation Schedule", for background). No blank or space characters are permitted as part of a name. No distinction is made between upper and lower case. The first character must be an alpha character. The last character must not be a minus sign or period.' [Unnumbered section titled "ASSUMPTIONS", first paragraph]
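The quoted rules can be illustrated with a minimal sketch. The function name is_rfc952_name is hypothetical and does not come from [RFC0952]. The sketch implements only the constraints stated in the quotation above, and it assumes that "periods ... delimit components" implies that empty components (two adjacent periods) are not allowed.

```python
import string

# Characters the quoted text permits: A-Z (case-insensitive), 0-9, "-" and ".".
_ALLOWED = set(string.ascii_letters + string.digits + "-.")


def is_rfc952_name(name: str) -> bool:
    """Check a name against the quoted RFC 952 constraints (a sketch only)."""
    # "a text string up to 24 characters"
    if not 1 <= len(name) <= 24:
        return False
    # Only letters, digits, minus sign and period; no blanks or spaces.
    if any(ch not in _ALLOWED for ch in name):
        return False
    # "The first character must be an alpha character."
    if name[0] not in string.ascii_letters:
        return False
    # "The last character must not be a minus sign or period."
    if name[-1] in "-.":
        return False
    # Assumption: periods only delimit components, so no empty components.
    return ".." not in name
```

Under these assumptions a name such as "SRI-NIC.ARPA" is accepted, while "3com" is rejected because its first character is a digit.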
[RFC1123] reaffirms this definition, making two additional changes to the syntax:
Neither [RFC0952] nor [RFC1123] explicitly states the reasons for these restrictions. Human-factors considerations may have been involved; [RFC1123] appears to suggest that one reason was to prevent confusion between dotted-decimal IPv4 addresses and host domain names. In any case, it is reasonable to believe that deployed software often assumes these restrictions, and that any change to the rules should be undertaken with caution.
The Internationalised Domain Names in Applications (IDNA) 2008 specification provides a protocol for encoding Unicode strings in DNS labels. The Unicode string used by applications is known as a U-Label; its corresponding encoding in the DNS is known as an A-Label. The terms A-Label and U-Label are used in this document as defined in [I‑D.ietf‑idnabis‑defs]. Valid A-Labels always contain non-alphabetic characters: every A-Label begins with the prefix "xn--", which includes two hyphens.
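The relationship between the two forms can be sketched with the Punycode codec in the Python standard library. The helper names to_a_label and to_u_label are hypothetical. This is only the encoding step: it performs none of the validation that IDNA 2008 requires of a U-Label, so it should not be read as an IDNA implementation.

```python
ACE_PREFIX = "xn--"  # prefix carried by every A-Label


def to_a_label(u_label: str) -> str:
    """Punycode-encode a U-Label and add the prefix (no IDNA validation)."""
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Strip the prefix and Punycode-decode (no IDNA validation)."""
    if not a_label.lower().startswith(ACE_PREFIX):
        raise ValueError("not an A-Label: missing 'xn--' prefix")
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


a_label = to_a_label("bücher")      # 'xn--bcher-kva'
u_label = to_u_label(a_label)       # 'bücher'
# The hyphens in the result are why an "alphabetic"-only rule rejects it.
has_non_alpha = any(not ch.isalpha() for ch in a_label)
```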
In order to accommodate the wish to express TLD names in scripts other than Latin (or rather, the US-ASCII subset of Latin), it is necessary to allow non-alphabetic characters in TLD DNS labels. To make the change as small as possible, we restrict the U-Label form of the label in ways functionally compatible with the restrictions (from [RFC0952] and [RFC1123]) on US-ASCII labels for TLDs. This restriction will not enable new A-Label TLDs to function with existing software that checks DNS top-level labels for conformance with the alphabetic restriction. It merely makes the same traditional rule work in a similar way for IDNA top-level labels.
3. TLD Label Syntax Specification
This document relaxes the existing specification to allow TLD labels to be well-formed A-Labels, but places restrictions on the corresponding U-Labels.
The ABNF expression that matches a valid TLD label is as follows:
   tldlabel = traditional-tld-label / idn-label
   traditional-tld-label = 1*63(ALPHA)
   idn-label = Restricted-A-Label
   ALPHA = %x41-5A / %x61-7A   ; A-Z / a-z
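The traditional-tld-label rule translates directly into a regular expression. The following is a minimal sketch; the function name is_traditional_tld_label is hypothetical, and the idn-label alternative is deliberately not covered here because it cannot be expressed as a pattern over US-ASCII characters alone.

```python
import re

# traditional-tld-label = 1*63(ALPHA), with ALPHA = A-Z / a-z
_TRADITIONAL_TLD_LABEL = re.compile(r"[A-Za-z]{1,63}")


def is_traditional_tld_label(label: str) -> bool:
    """True if the whole label is between 1 and 63 US-ASCII letters."""
    return _TRADITIONAL_TLD_LABEL.fullmatch(label) is not None
```

With this sketch, "com" and "COM" match, while "c0m" and "xn--bcher-kva" do not; the latter would have to qualify under the idn-label alternative instead.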
Restricted-A-Label is an A-Label as defined in [I‑D.ietf‑idnabis‑defs] converted from (and convertible to) a U-Label that is consistent with the definition in [I‑D.ietf‑idnabis‑defs] and that is further restricted to contain only Unicode characters with the derived property value PVALID and of General Category "L" or "Mn". Note that "L" contains several sub-categories, and that many characters of the "L" category do not have the derived property value PVALID. Most emphatically, this specification does not allow use of any character in a Restricted-A-Label that has derived property value of CONTEXT.
More specifically, a Restricted-A-Label consists of one or more Unicode characters such that all of the following statements are true:
This new specification reflects current practice in registration of TLD names by the IANA, extended to accommodate IDNs.
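Part of the Restricted-A-Label test can be sketched with the Python standard library. The function names below are hypothetical. The sketch checks only the round trip between the two label forms and the General Category restriction ("L" or "Mn"). It does NOT check the derived property value PVALID, which requires the IDNA 2008 tables and is assumed to be handled elsewhere, so a label that passes here is not necessarily a valid Restricted-A-Label.

```python
import unicodedata

ACE_PREFIX = "xn--"  # prefix carried by every A-Label


def has_only_letters_and_marks(u_label: str) -> bool:
    """True if every character has General Category L* or Mn."""
    return bool(u_label) and all(
        unicodedata.category(ch).startswith("L")
        or unicodedata.category(ch) == "Mn"
        for ch in u_label
    )


def passes_category_check(a_label: str) -> bool:
    """Partial Restricted-A-Label test; the PVALID check is NOT done here."""
    if len(a_label) > 63 or not a_label.startswith(ACE_PREFIX):
        return False
    try:
        u_label = a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
        round_trip = ACE_PREFIX + u_label.encode("punycode").decode("ascii")
    except ValueError:  # UnicodeError is a subclass of ValueError
        return False
    if u_label.isascii():
        # A U-Label contains at least one non-ASCII character.
        return False
    # "converted from (and convertible to)": the two forms must round-trip.
    return round_trip == a_label and has_only_letters_and_marks(u_label)
```

For example, "xn--bcher-kva" (the encoding of "bücher") passes this partial check, whereas the encoding of a U-Label containing a digit fails because digits have General Category "Nd".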
4. Policy Considerations
This document provides a technical specification for TLD label syntax; it does not encapsulate a complete policy under which TLD names may be chosen.
At the time of writing, the policy under which TLD names are chosen is developed and maintained by ICANN in consultation with a wide base of stakeholders. As the Internet continues to grow to serve new user communities, applications and services, it is to be expected that the corresponding policy will be changed accordingly.
5. IANA Considerations
This document makes no requests of the IANA.
This section needs IANA review. :-)
6. Security Considerations
This document is believed to have limited security implications.
The creation of new TLDs has the potential to conflict with software which (for example) does not accommodate TLD labels which did not exist at the time the software was written. Such software problems might in turn lead to security vulnerabilities; e.g., in the event that a DNS name specified by a user is truncated or otherwise misinterpreted, causing an application to interact with a different remote host from that which the user intended. It should be noted that this is not a new phenomenon, and has been observed following the creation of new TLDs prior to the publication of this document.
Characters that can be confused with each other pose a risk; this issue is discussed at length in the Security Considerations section of [I‑D.ietf‑idnabis‑defs].
7. References
7.1. Normative References
[RFC1123] Braden, R., “Requirements for Internet Hosts - Application and Support,” STD 3, RFC 1123, October 1989.
[I-D.ietf-idnabis-defs] Klensin, J., “Internationalized Domain Names for Applications (IDNA): Definitions and Document Framework,” draft-ietf-idnabis-defs-13 (work in progress), January 2010.
7.2. Informative References
[RFC0952] Harrenstien, K., Stahl, M., and E. Feinler, “DoD Internet host table specification,” RFC 952, October 1985.
Appendix A. Change History
This section (and sub-sections) should be removed before publication.
A.1. draft-liman-tld-names-02
Wordsmithing and rearrangement of text following discussions with Joe Abley, Tina Dam, Thomas Narten and Andrew Sullivan. Incorporated revised ABNF and associated specification from Patrik Faltstrom.
A.2. draft-liman-tld-names-01
Substantial comments and improvements supplied by Thomas Narten and John Klensin. Decided to go for a minimal change approach. Also noted that U-labels have to be letters due to jumping digit problem. Rewritten major parts.
A.3. draft-liman-tld-names-00
First cut. Prompted by Olafur Gudmundsson and Tina Dam.
Authors' Addresses
Lars-Johan Liman
Autonomica AB
Franzengatan 5
SE-112 51 Stockholm
Sweden
Email: liman@autonomica.se
URI: http://www.autonomica.se/

Joe Abley
ICANN
4676 Admiralty Way
Suite 330
Marina del Rey 90292
USA
Email: joe.abley@icann.org
URI: http://www.icann.org/