Internet Engineering Task Force M. Sivaraman
Internet-Draft Internet Systems Consortium
Intended status: Experimental S. Kerr
Expires: July 22, 2016 L. Song
Beijing Internet Institute
January 19, 2016
DNS message fragments
draft-muks-dnsop-dns-message-fragments-00
Abstract
This document describes a method to transmit DNS messages over
multiple UDP datagrams by fragmenting them at the application layer.
The objective is to allow authoritative servers to successfully reply
to DNS queries via UDP using multiple smaller datagrams, where larger
datagrams may not pass through the network successfully.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on July 22, 2016.
Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
Sivaraman, et al. Expires July 22, 2016 [Page 1]
Internet-Draft DNS message fragments January 2016
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
1.1. Background
1.2. Motivation
2. DNS Message Fragmentation Method
2.1. Client Behavior
2.2. Server Behavior
2.3. Other Notes
3. The ALLOW-FRAGMENTS EDNS(0) Option
3.1. Wire Format
3.2. Option Fields
3.2.1. Maximum Fragment Size
3.3. Presentation Format
4. The FRAGMENT EDNS(0) Option
4.1. Wire Format
4.2. Option Fields
4.2.1. Fragment Identifier
4.2.2. Fragment Count
4.3. Presentation Format
5. Network Considerations
5.1. Background
5.2. Implementation Requirements
6. Open Issues and Discussion
7. Security Considerations
8. IANA Considerations
9. Acknowledgements
10. References
10.1. Normative references
10.2. Informative references
Appendix A. Change History (to be removed before publication)
Authors' Addresses
1. Introduction
1.1. Background
[RFC1035] describes how DNS messages are to be transmitted over UDP.
A DNS query message is transmitted using one UDP datagram from client
to server, and a corresponding DNS reply message is transmitted using
one UDP datagram from server to client.
The upper limit on the size of a DNS message that can be transmitted
thus depends on the maximum size of the UDP datagram that can be
transmitted successfully from the sender to the receiver. Typically
any size limit only matters for DNS replies, as DNS queries are
usually small.
As a UDP datagram is transmitted in a single IP PDU, in theory the
size of a UDP datagram (including the headers of the lower internet
layers) can be as large as 64 KiB. In practice, if the datagram
size exceeds the path MTU, the datagram will either be
fragmented at the IP layer or, worse, dropped by a forwarder. In the
case of IPv6, packets are fragmented by the sender only. If a
packet's size exceeds the path MTU, the sender will receive a Packet
Too Big (PTB) ICMP message, but because DNS over UDP is stateless,
the sender has no indication that it should reply again with a
smaller message. In addition, IP-level fragmentation caused by large
DNS response packets introduces a risk of cache poisoning
[Fragment-Poisonous], in which the attacker can circumvent some
defense mechanisms (like port, IP, and query randomization
[RFC5452]).
As a result, a practical DNS payload size limitation is necessary.
[RFC1035] limited DNS message UDP datagram lengths to a maximum of
512 bytes. Although EDNS(0) [RFC6891] allows an initiator to
advertise the capability of receiving larger packets (up to 4096
bytes), this leads to fragmentation because in practice most packets
are limited to 1500 bytes by host Ethernet interfaces, or to
1280 bytes by the minimum IPv6 MTU in the IPv6 stack [RFC3542].
According to the DNS specifications [RFC1035], if the DNS response
message cannot fit within the packet's size limit, the response is
truncated and the initiator has to fall back to TCP and re-query
to receive the large response. However, apart from the high
setup cost that TCP introduces through additional round trips, some
firewalls and middle boxes block TCP/53, in which case no response
is received at all. This becomes a significant issue as DNS
response sizes inevitably increase with DNSSEC deployment.
In this memo, DNS message fragmentation attempts to work around
middle box misbehavior by splitting a single DNS message across
multiple UDP datagrams. Note that to avoid DNS amplification and
reflection attacks, the use of DNS cookies [I-D.ietf-dnsop-cookies]
is mandatory when using DNS message fragments.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
1.2. Motivation
The issue of large DNS packets (>512B) is not new
[I-D.ietf-dnsop-respsize]; it dates from the introduction of IPv6,
EDNS(0) [SAC016], and DNSSEC deployment [SAC035]. In current
production networks, using DNSSEC with longer DNSKEYs (ZSK>1024B and
KSK>2048B) will result in response packets no smaller than 1500B
[T-DNS]. In particular, during the KSK rollover process, responses to
queries for the DNSKEY RRset are enlarged as they contain both the
new and old KSK.
When possible, we should avoid dropped packets as this means the
client must wait for a timeout, which incurs a high cost. For
example, a validator behind a firewall has to wait until the
timeout with no response if the firewall drops large EDNS(0) packets
and IP fragments. In the extreme case of bad middle boxes that also
drop TCP/53, the consequences may be severe: the validator cannot
receive the response for a new trust anchor KSK.
Since UDP requires fewer packets on the wire and less state on
servers than TCP, in this memo we propose continuing to use UDP for
transmission but fragment the larger DNS packets into smaller DNS
packets at the application layer. We would like the fragments to
easily go through middle boxes and avoid falling back to TCP.
2. DNS Message Fragmentation Method
2.1. Client Behavior
Clients supporting DNS message fragmentation add an EDNS option to
their queries, which declares their support for this feature.
If a DNS reply is received that has been fragmented, it will consist
of multiple DNS message fragments (each transmitted in its own
UDP packet). Every fragment contains an EDNS option that gives the
total number of fragments and the identifier of the fragment
that the current packet represents. The client collects all of the
fragments and uses them to reconstruct the full DNS message. Clients
MUST maintain a timeout when waiting for the fragments to arrive.
Clients that support DNS message fragments MUST be able to reassemble
fragments into a DNS message of any size, up to the maximum of 64KiB.
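The collection step above can be illustrated with a minimal Python
sketch. It only tracks which fragments have arrived for each question
tuple and applies a timeout; it does not merge the fragments into a
single message, since this document does not yet define the wire
format. The class names, the 2-second default timeout, and the choice
to discard a message whose fragments announce different counts are
assumptions made for illustration, not requirements of this memo.

```python
import time
from dataclasses import dataclass, field


@dataclass
class PendingMessage:
    """Fragments collected so far for one outstanding query."""
    count: int                      # total fragments announced by the server
    deadline: float                 # clock value after which we give up
    fragments: dict = field(default_factory=dict)  # identifier -> datagram


class Reassembler:
    def __init__(self, timeout=2.0, clock=time.monotonic):
        self.timeout = timeout      # hypothetical default, in seconds
        self.clock = clock
        self.pending = {}           # (server_ip, port, message_id) -> entry

    def add(self, key, identifier, count, datagram):
        """Store one fragment; return the ordered list once complete."""
        self.expire()
        if not (1 <= identifier <= count <= 255):
            return None             # outside the range [1,255]
        entry = self.pending.get(key)
        if entry is None:
            entry = PendingMessage(count, self.clock() + self.timeout)
            self.pending[key] = entry
        if entry.count != count:
            del self.pending[key]   # assumption: inconsistent counts discard all
            return None
        entry.fragments[identifier] = datagram
        if len(entry.fragments) < entry.count:
            return None             # still waiting for more fragments
        del self.pending[key]
        return [entry.fragments[i] for i in range(1, count + 1)]

    def expire(self):
        """Drop partially received messages whose timeout has passed."""
        now = self.clock()
        for key in [k for k, e in self.pending.items() if e.deadline <= now]:
            del self.pending[key]
```

A caller would invoke add() for every received fragment and, once a
list is returned, hand the ordered fragments to whatever reassembly
procedure the final wire format defines.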
The client MAY save information about what sizes of packets have been
received from a given server. If saved, this information MUST have a
limited duration.
If TSIG [RFC2845] is used, then the signature must be checked on each
fragment separately.
Any DNSSEC validation is performed on the reassembled DNS message.
2.2. Server Behavior
Servers supporting DNS message fragmentation will look for the EDNS
option which declares client support for the feature. If not
present, the server MUST NOT use DNS message fragmentation. The
server MUST check that DNS cookies are supported. [**FIXME**]
Implementation of the first request case, where no existing
established cookie is available needs discussion; we want to avoid
additional round-trips here. Shane: don't cookies already handle
this case?
The server prepares the response DNS message normally. If the
message exceeds the maximum UDP payload size specified by the client,
then it should fragment the message into multiple UDP datagrams.
Each fragment contains an identical DNS header with TC=1, possibly
varying only in the section counts. Setting the TC flag in this way
ensures that clients which do not support DNS fragments can fall back
to TCP transparently.
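As an illustration of the header handling, the following Python
sketch sets the TC bit in a 12-octet DNS header as laid out in
[RFC1035]. The function name is hypothetical; the sketch leaves the
section counts untouched, so a real implementation would still adjust
them per fragment.

```python
import struct

TC_FLAG = 0x0200  # TC bit within the 16-bit flags word of the DNS header


def set_tc(header: bytes) -> bytes:
    """Return a copy of a 12-octet DNS header with the TC bit set."""
    if len(header) != 12:
        raise ValueError("a DNS header is exactly 12 octets")
    # ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT in network byte order
    msg_id, flags, qd, an, ns, ar = struct.unpack("!6H", header)
    return struct.pack("!6H", msg_id, flags | TC_FLAG, qd, an, ns, ar)
```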
As many RRs as possible are included in each fragment without
exceeding the desired size of the fragment. An EDNS option that
includes both the fragment identifier and the total number of
fragments is added to every fragment. Names are compressed in each
fragment separately.
An RRSET may be split across multiple fragments. RRSIG may be sent
in a separate fragment from the RRSET that it refers to.
The server needs to know how many total fragments there are to insert
into each fragment. A simple approach would be to generate all
fragments, and then count the total number at the end, and update the
previously-generated fragments with the total number of fragments.
Other techniques may be possible.
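The simple approach described above might look like the following
Python sketch, which groups RRs greedily in message order and fills in
the fragment count only once all groups exist. It works on RR sizes
rather than real wire data and treats the per-fragment overhead
(header, question, OPT RR) as a fixed number; both are simplifying
assumptions, since per-fragment name compression makes real sizes
depend on placement. The function name and the None return for a
message that cannot be fragmented are likewise illustrative.

```python
def pack_fragments(rr_sizes, max_fragment_size, overhead):
    """Greedily group RRs, in order, into fragments.

    rr_sizes: encoded size in octets of each RR, in message order.
    overhead: octets reserved per fragment (hypothetical fixed figure).
    Returns a list of (identifier, count, rr_indexes), or None when an
    RR cannot fit in one fragment or more than 255 fragments are needed.
    """
    budget = max_fragment_size - overhead
    groups, current, used = [], [], 0
    for index, size in enumerate(rr_sizes):
        if size > budget:
            return None                 # fragmentation occurs on RR boundaries
        if current and used + size > budget:
            groups.append(current)      # close the fragment that is full
            current, used = [], 0
        current.append(index)
        used += size
    if current:
        groups.append(current)
    if len(groups) > 255:
        return None                     # count is an unsigned 8-bit field
    count = len(groups)                 # known only after all fragments exist
    return [(i + 1, count, group) for i, group in enumerate(groups)]
```

For example, four RRs of 300, 300, 300 and 100 octets with a 512-octet
fragment size and 12 octets of overhead yield three fragments, the
last one carrying the final two RRs. This greedy grouping does not
attempt to find the minimum number of fragments.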
The server MUST limit the number of fragments that it uses in a
reply. Too many packets sent simultaneously can cause network
congestion or packet loss, either of which will degrade the overall
network performance. (See "Open Issues and Discussion" for remaining
work.)
The server should try to minimize the total number of packets sent.
However, since this is similar to the bin packing problem, which is
NP-hard, the server does not have to guarantee this. In
all cases, the server MUST NOT exceed the maximum fragment size
requested by a client.
If TSIG [RFC2845] is used, then the signature must be added to each
fragment separately.
If a message cannot be fragmented successfully or the server does not
wish to fragment a message, then the FRAGMENT EDNS(0) Option is not
included in the reply, and it is truncated as a normal, non-
fragmented message. (There are many reasons a server may not wish to
fragment a reply, for example if it would result in too many
fragments. Also, since fragmentation occurs on an RR boundary, any
RR that would cause a fragment to exceed the maximum message size
cannot be fragmented. A large TXT record can cause this behavior,
for example.)
2.3. Other Notes
o The FRAGMENT option MUST NOT be present in DNS query messages,
i.e., when QR=0. If a DNS implementation notices the FRAGMENT
option in a DNS query message, it MUST ignore it.
o In DNS reply messages, the FRAGMENT option MUST NOT be present in
datagrams when truncation is not done, i.e., when TC=0. If a DNS
implementation notices the FRAGMENT option in a DNS reply message
fragment datagram that is not truncated, i.e., when TC=0, it MUST
drop all DNS reply message fragment datagrams received so far
(awaiting assembly) for that message's corresponding question
tuple (server IP, port, message ID) without using any data from
them. [**FIXME**] Dropping fragments to be received yet will be
problematic for implementations, but dropping fragments received
so far ought to be sufficient.
o More than one FRAGMENT option MUST NOT be present in a DNS reply
message fragment datagram. If a DNS implementation notices
multiple FRAGMENT options in a DNS reply message fragment
datagram, it MUST drop all reply datagrams received for that
message's corresponding question tuple (server IP, port, message
ID) without using any data from them. [**FIXME**] Dropping
fragments to be received yet will be problematic for
implementations, but dropping fragments received so far ought to
be sufficient.
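The three rules above can be summarized as a small decision function.
The Python sketch below is one possible reading of them; the enum
names and the function signature are hypothetical, and the sketch
only classifies a datagram without performing the drop itself.

```python
from enum import Enum


class Action(Enum):
    IGNORE_OPTION = "ignore the FRAGMENT option"
    DROP_PENDING = "drop fragments held for this question tuple"
    ACCEPT_FRAGMENT = "treat the datagram as a fragment"
    NOT_FRAGMENTED = "process as an ordinary message"


def classify(qr: int, tc: int, fragment_options: int) -> Action:
    """Map the QR and TC bits and the number of FRAGMENT options to an action."""
    if fragment_options == 0:
        return Action.NOT_FRAGMENTED
    if qr == 0:
        return Action.IGNORE_OPTION      # FRAGMENT in a query is ignored
    if fragment_options > 1 or tc == 0:
        return Action.DROP_PENDING       # malformed reply fragment
    return Action.ACCEPT_FRAGMENT
```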
3. The ALLOW-FRAGMENTS EDNS(0) Option
ALLOW-FRAGMENTS is an EDNS(0) [RFC6891] option that a client uses to
inform a server that it supports fragmented responses.
3.1. Wire Format
TBD.
3.2. Option Fields
3.2.1. Maximum Fragment Size
The Maximum Fragment Size field is represented as an unsigned 16-bit
integer. This is the maximum size used by any given fragment the
server returns.
3.3. Presentation Format
As with other EDNS(0) options, the ALLOW-FRAGMENTS option does not
have a presentation format.
4. The FRAGMENT EDNS(0) Option
FRAGMENT is an EDNS(0) [RFC6891] option that assists a client in
gathering the various fragments of a DNS message from multiple UDP
datagrams. It is described in a previous section. Here, its syntax
is provided.
4.1. Wire Format
TBD.
4.2. Option Fields
4.2.1. Fragment Identifier
The Fragment Identifier field is represented as an unsigned 8-bit
integer. The first fragment is identified as 1. Values in the range
[1,255] can be used to identify the various fragments. Value 0 is
used for signalling purposes.
4.2.2. Fragment Count
The Fragment Count field is represented as an unsigned 8-bit integer.
It contains the number of fragments in the range [1,255] that make up
the DNS message. Value 0 is used for signalling purposes.
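Since the wire format is still TBD, the following Python sketch is
purely illustrative. It assumes a layout of one octet for the
Fragment Identifier followed by one octet for the Fragment Count,
wrapped in the generic OPTION-CODE / OPTION-LENGTH framing of
[RFC6891]. The option code 65001 is a placeholder taken from the
local/experimental range, not an assigned value, and the function
names are hypothetical. The encoder rejects the value 0 because this
document reserves it for signalling without defining that use.

```python
import struct

FRAGMENT_CODE = 65001  # placeholder; no option code has been assigned


def encode_fragment_option(identifier: int, count: int) -> bytes:
    """Encode OPTION-CODE, OPTION-LENGTH and two assumed data octets."""
    if not (1 <= identifier <= count <= 255):
        raise ValueError("need 1 <= identifier <= count <= 255")
    return struct.pack("!HHBB", FRAGMENT_CODE, 2, identifier, count)


def decode_fragment_option(option: bytes):
    """Return (identifier, count) from an option in the assumed layout."""
    code, length, identifier, count = struct.unpack("!HHBB", option)
    if code != FRAGMENT_CODE or length != 2:
        raise ValueError("not a FRAGMENT option in the assumed layout")
    return identifier, count
```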
4.3. Presentation Format
As with other EDNS(0) options, the FRAGMENT option does not have a
presentation format.
5. Network Considerations
5.1. Background
TCP-based application protocols co-exist well with competing traffic
flows in the internet due to congestion control methods such as in
[RFC5681] that are present in TCP implementations.
UDP-based application protocols have no restrictions in lower layers
to stop them from flooding datagrams into a network and causing
congestion. So applications that use UDP have to restrain themselves
to avoid causing congestion, so that their traffic is not disruptive.
In the case of [RFC1035], only one reply UDP datagram was sent per
request UDP datagram, and so the lock-step flow control automatically
ensured that UDP DNS traffic didn't lead to congestion. When DNS
clients didn't hear back from the server, and had to retransmit the
question, they typically paced themselves by using methods such as a
retransmission timer based on a smoothed round-trip time between
client and server.
Due to the message fragmentation described in this document, when a
DNS query causes multiple DNS reply datagrams to be sent back to the
client, there is a risk that without effective control of flow, DNS
traffic could cause problems to competing flows along the network
path.
Because UDP does not guarantee delivery of datagrams, there is a
possibility that one or more fragments of a DNS message will be lost
during transfer. This is especially a problem on some wireless
networks where a proportion of datagrams is continually lost due to
interference and other environmental factors. With larger numbers of
message fragments, the probability of fragment loss increases.
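The effect of the fragment count on delivery can be estimated with a
short calculation. The Python sketch below assumes that each datagram
is lost independently with the same probability, which is a
simplification (loss on real paths is often bursty); the function name
is hypothetical.

```python
def delivery_probability(fragments: int, loss_rate: float) -> float:
    """Probability that every fragment arrives, assuming independent loss."""
    if fragments < 1 or not 0.0 <= loss_rate <= 1.0:
        raise ValueError("need fragments >= 1 and 0 <= loss_rate <= 1")
    return (1.0 - loss_rate) ** fragments
```

Under this model, with a 2% per-datagram loss rate a single-datagram
reply arrives with probability 0.98, while a reply split into 8
fragments arrives complete with probability of roughly 0.85.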
5.2. Implementation Requirements
TBD.
6. Open Issues and Discussion
1. Resolver behavior
We need some more discussion of resolver behavior in general, at
least to the point of making things clear to an implementor.
2. What is the size of fragments?
Generally speaking, the number of fragments increases if the fragment
size is small (512 bytes, or another empirical value), which makes
the mechanism less efficient. If the size can be changed
dynamically according to negotiation or some detection, it will
introduce more cost and round-trip time.
3. We might set specific upper limits for number of fragments.
4. OPT-RR
Some OPT-RR seem to be oriented at the entire message, others
make more sense per packet. This needs to be sorted out. It
looks like only cookies need to be included per fragment.
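The trade-off raised in item 2 above can be made concrete with a rough
estimate of how many fragments a message needs at a given fragment
size. The Python sketch below assumes a fixed per-fragment overhead
for the header, question and OPT RR, and ignores RR boundaries, so it
gives a lower bound rather than an exact count. The function name and
the 50-octet overhead used in the example are illustrative
assumptions.

```python
def fragments_needed(message_size: int, fragment_size: int, overhead: int) -> int:
    """Lower bound on the number of fragments for a message of the given size."""
    payload = fragment_size - overhead   # octets left for RRs in each fragment
    if message_size < 0 or payload <= 0:
        raise ValueError("need message_size >= 0 and fragment_size > overhead")
    return max(1, -(-message_size // payload))  # ceiling division, at least 1
```

With 50 octets of overhead, a 4096-octet message needs at least 9
fragments of 512 octets, but only 4 fragments of 1280 octets.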
7. Security Considerations
To avoid DNS amplification or reflection attacks, DNS cookies
[I-D.ietf-dnsop-cookies] must be used. The DNS cookie EDNS option is
identical in all fragments that make up a DNS message. The
duplication of the same cookie values in all fragments that make up
the message is not expected to introduce a security weakness in the
case of off-path attacks.
8. IANA Considerations
The ALLOW-FRAGMENTS and FRAGMENT EDNS(0) options require option codes
to be assigned for them.
9. Acknowledgements
Thanks to Stephen Morris, JINMEI Tatuya, Paul Vixie, Mark Andrews,
and David Dragon for reviewing a pre-draft proposal and providing
support, comments and suggestions.
10. References
10.1. Normative references
[I-D.ietf-dnsop-cookies]
Eastlake, D. and M. Andrews, "Domain Name System (DNS)
Cookies", draft-ietf-dnsop-cookies-09 (work in progress),
January 2016.
[RFC1035] Mockapetris, P., "Domain names - implementation and
specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
November 1987, <http://www.rfc-editor.org/info/rfc1035>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<http://www.rfc-editor.org/info/rfc2119>.
[RFC2845] Vixie, P., Gudmundsson, O., Eastlake 3rd, D., and B.
Wellington, "Secret Key Transaction Authentication for DNS
(TSIG)", RFC 2845, DOI 10.17487/RFC2845, May 2000,
<http://www.rfc-editor.org/info/rfc2845>.
[RFC3542] Stevens, W., Thomas, M., Nordmark, E., and T. Jinmei,
"Advanced Sockets Application Program Interface (API) for
IPv6", RFC 3542, DOI 10.17487/RFC3542, May 2003,
<http://www.rfc-editor.org/info/rfc3542>.
[RFC5452] Hubert, A. and R. van Mook, "Measures for Making DNS More
Resilient against Forged Answers", RFC 5452,
DOI 10.17487/RFC5452, January 2009,
<http://www.rfc-editor.org/info/rfc5452>.
[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,
<http://www.rfc-editor.org/info/rfc5681>.
[RFC6891] Damas, J., Graff, M., and P. Vixie, "Extension Mechanisms
for DNS (EDNS(0))", STD 75, RFC 6891,
DOI 10.17487/RFC6891, April 2013,
<http://www.rfc-editor.org/info/rfc6891>.
10.2. Informative references
[Fragment-Poisonous]
Herzberg, A. and H. Shulman, "Fragmentation Considered
Poisonous", 2012.
[I-D.ietf-dnsop-respsize]
Vixie, P., Kato, A., and J. Abley, "DNS Referral Response
Size Issues", draft-ietf-dnsop-respsize-15 (work in
progress), February 2014.
[SAC016] ICANN Security and Stability Advisory Committee, "Testing
Firewalls for IPv6 and EDNS0 Support", 2007.
[SAC035] ICANN Security and Stability Advisory Committee, "DNSSEC
Impact on Broadband Routers and Firewalls", 2008.
[T-DNS] Zhu, L., Hu, Z., and J. Heidemann, "T-DNS: Connection-
Oriented DNS to Improve Privacy and Security (extended)",
2007, <http://www.isi.edu/~johnh/PAPERS/Zhu14b.pdf>.
Appendix A. Change History (to be removed before publication)
o draft-muks-dns-message-fragments-00
Initial draft.
o draft-muks-dnsop-dns-message-fragments-00
* Renamed to include dnsop in the name.
* Removed the increasing packet size approach. All packets are
now limited to the same size in a given fragmented message.
* Added a brief note about network congestion and packet loss if
sending too many fragments at once.
* TSIG behavior documented.
* Documented that RRSET may be split, and RRSIG sent separately
from RRSET.
* Documented behavior if a packet cannot be split, and gave some
text about how this can happen.
* Clarified name compression is per fragment.
Authors' Addresses
Mukund Sivaraman
Internet Systems Consortium
950 Charter Street
Redwood City, CA 94063
US
Email: muks@mukund.org
URI: http://www.isc.org/
Shane Kerr
Beijing Internet Institute
2/F, Building 5, No.58 Jinghai Road, BDA
Beijing 100176
CN
Email: shane@biigroup.cn
URI: http://www.biigroup.com/
Linjian Song
Beijing Internet Institute
2/F, Building 5, No.58 Jinghai Road, BDA
Beijing 100176
CN
Email: songlinjian@gmail.com
URI: http://www.biigroup.com/