Internet DRAFT - draft-arkko-abcd-distributed-resolver-selection
draft-arkko-abcd-distributed-resolver-selection
Network Working Group J. Arkko
Internet-Draft Ericsson
Intended status: Informational M. Thomson
Expires: September 11, 2020 Mozilla
T. Hardie
Google
March 10, 2020
Selecting Resolvers from a Set of Distributed DNS Resolvers
draft-arkko-abcd-distributed-resolver-selection-01
Abstract
This memo discusses the use of a set of different DNS resolvers to
reduce privacy problems related to resolvers learning the Internet
usage patterns of their clients.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 11, 2020.
Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
Arkko, et al. Expires September 11, 2020 [Page 1]
Internet-Draft Distributed Resolver Selection March 2020
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Operational Context . . . . . . . . . . . . . . . . . . . . . 4
3. Goals and Constraints . . . . . . . . . . . . . . . . . . . . 4
4. Query distribution strategies . . . . . . . . . . . . . . . . 6
4.1. Client-based . . . . . . . . . . . . . . . . . . . . . . 6
4.1.1. Analysis of client-based selection . . . . . . . . . 6
4.1.2. Enhancements to client-based selection . . . . . . . 7
4.2. Name-based . . . . . . . . . . . . . . . . . . . . . . . 7
4.2.1. Name reduction . . . . . . . . . . . . . . . . . . . 8
5. Early conclusions . . . . . . . . . . . . . . . . . . . . . . 9
5.1. Analysis conclusions . . . . . . . . . . . . . . . . . . 9
5.2. Recommendations . . . . . . . . . . . . . . . . . . . . . 9
5.3. Poor distribution strategies . . . . . . . . . . . . . . 9
6. Effects of query distribution . . . . . . . . . . . . . . . . 10
6.1. Caching considerations . . . . . . . . . . . . . . . . . 10
6.2. Consistency considerations . . . . . . . . . . . . . . . 10
6.3. Resolver load distribution and failover . . . . . . . . . 11
6.4. Query performance . . . . . . . . . . . . . . . . . . . . 11
6.5. Debugging . . . . . . . . . . . . . . . . . . . . . . . . 11
7. Further work . . . . . . . . . . . . . . . . . . . . . . . . 11
8. References . . . . . . . . . . . . . . . . . . . . . . . . . 12
8.1. Normative References . . . . . . . . . . . . . . . . . . 12
8.2. Informative References . . . . . . . . . . . . . . . . . 12
8.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Appendix A. Acknowledgements . . . . . . . . . . . . . . . . . . 13
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13
1. Introduction
The DNS [DNS] is a complex system with many different security
issues, challenges, deployment models and usage patterns. This
document focuses on one narrow aspect within DNS and its security.
Traditionally, systems are configured with a single DNS recursive
resolver, or a set of primary and alternate recursive resolvers.
Recursive resolver services are offered by organisations such as
enterprises, ISPs, and global providers. Even when clients use
alternate recursive resolvers, they are typically all provided by the
same organisation.
The resolvers will learn the Internet usage patterns of their
clients. A client might decide to trust a particular recursive
resolver with information about DNS queries. However, it is
Arkko, et al. Expires September 11, 2020 [Page 2]
Internet-Draft Distributed Resolver Selection March 2020
difficult or impossible to provide any guarantees about data handling
practices in the general case. And even if a service can be trusted
to respect privacy with respect to handling of query data, legal and
commercial pressures or surveillance activity could result in misuse
of data. Similarly, outside attacks may occur towards any DNS
services. For a service with many clients, these risks are
particularly undesirable.
This memo discusses whether DNS clients can improve their privacy
through the potential use of a set of multiple recursive resolver
services. The goal is indeed an improvement only. There is no
expectation that it would be possible to have no part of the DNS
infrastructure aware of what queries are being made, but perhaps
there are mitigations that would make possible information collection
from the DNS infrastructure harder.
It should be understood that this is a narrow aspect within a bigger
set of topics even within privacy issues around DNS, let alone other
security issues, deployment models, or the many protocol questions
within DNS. Some of these other topics include detecting the
tampering DNS query responses [DNSSEC], encrypting DNS queries [DOT]
[DOH], application-specific DNS resolution mechanisms, or centralised
deployment models. Those other topics are not covered in this memo
and need to be dealt with elsewhere.
Specifically, the scope of this memo is not limited to DNS-over-TLS
(DOT) or DNS-over-HTTPS (DOH) deployments nor does it take a stand on
operating system vs. application or local vs. centralized DNS
deployment models. This memo is intended to provide useful
information for those that wish to consider trustworthiness of their
recursive resolvers as a part of their privacy analysis.
Naturally, there are some interactions between different topics. For
instance, privacy is affected both by what happens to data in transit
and at the endpoints, so where privacy is a concern, one would expect
to consider both aspects, and lack of consideration on one probably
leads to issues that dwarf the problems that this memo can address.
Both issues are also important aspects when considering defense again
pervasive monitoring efforts [PMON].
The rest of this memo is organized as follows. Section 2 discusses
the operational context that we imagine the multiple recursive
resolver arrangement might be applied in. Section 3 specifies
security goals for a system that employs multiple recursive
resolvers.
One key aspect of a system using multiple resolvers would be how to
select which particular recursive resolver to use in a particular
Arkko, et al. Expires September 11, 2020 [Page 3]
Internet-Draft Distributed Resolver Selection March 2020
situation. This is discussed in Section 4. This section covers a
number of possible strategies and further considerations for this
selection, along with an analysis of the implications of choosing a
particular strategy. There are technical issues in the use of
multiple recursive resolvers, and there are both technical and non-
technical questions in deciding on what recursive resolvers should
even be in the set.
Some early recommendations are provided in Section 5. Section 6
discusses operational and other implications of the distributed
approach. Finally, Section 7 discusses potential further work in
this area.
2. Operational Context
Our perspective is that of a client, choosing to either distribute or
not distribute its queries to a set of different resolvers. And if
the client decides to use distribution, it can choose exactly how it
does that.
There are obviously additional operational aspect of this - such as
central configuration mechanisms, resolver selection application
choices, and so on. But these are not covered in this memo.
It should also be observed that the practices suggested in this memo
are currently not widely used. Operational and other issues may be
discovered, such as those outlined in Section 6.
Many of these issues need further work, but this memo aims to discuss
the concept and analyse its impacts before dwelling into the
technical arrangements for configuring and using this particular
approach.
3. Goals and Constraints
This document aims to reduce the concentration of information about
client activity by distributing DNS queries across different resolver
services, for all DNS queries in the aggregate and for DNS queries
made by individual clients. By distributing queries in this way, the
goal is to reduce the amount of information that any given DNS
resolver service can acquire about client activity. As such, creates
a benefit for the client, but also makes these resolvers less
valuable targets for attacks relating to that client's activity.
Any method for distributing queries from a single client needs to
consider these benefits with regards to the following constraints:
Arkko, et al. Expires September 11, 2020 [Page 4]
Internet-Draft Distributed Resolver Selection March 2020
o A careful selection of the set of trusted resolvers must be the
first priority. It does not make sense to add less trustworthy
resolvers merely for the sake of distribution. For instance,
there is no reason to mix resolvers with good reliability or high
degree of privacy regulation with other resolvers: just include
the best resolvers in the set.
o As the goal is to reduce the amount of information given to any
given resolver, a strategy that tells the same information to all
resolvers is a poor one. A design that results in replicating the
same query toward multiple services would thus be a net privacy
loss. This happens quite easily over a long period of time,
unless the distribution method is carefully designed.
More subtle leaks arise as a result of distributing queries for
sub-domains and even domains that are superficially unrelated,
because these could share a commonality that might be exploited to
link them. For instance, some web sites use names that are appear
unrelated to their primary name for hosting some kinds of content,
like static images or videos. If queries for these unrelated
names were sent to different services, that effectively allows
multiple resolvers to learn that the client accessed the web site.
A distribution scheme also needs to consider stability of query
routing over time. A resolver can observe the absence of queries and
infer things about the state of a client cache, which can reveal that
queries were made to other resolvers. In general, different queries
for the same resolution context, such as sub-resources for a web page
load, which have some probability of being related or linked, should
be sent to the same resolver. Failure to do so reveals information
to more than one resolver.
In effect, there are two goals in tension:
o split queries between as many different resolvers as possible; and
o reduce the spread of linkable queries across multiple resolvers.
The need to limit replication of private information about queries
eliminates naive distribution schemes, such as those discussed in
Section 5.3. The designs described in Section 4 all attempt to
balance these different goals using different properties from the
context of a query (Section 4.1) or the query name itself
(Section 4.2).
Arkko, et al. Expires September 11, 2020 [Page 5]
Internet-Draft Distributed Resolver Selection March 2020
4. Query distribution strategies
This section introduces and analyzes several potential strategies for
distributing queries to different resolvers. Each strategy is
formulated as an algorithm for choosing a resolver Ri from a set of n
resolvers {R1, R2, ..., Rn}.
The designs presented in Section 4 assume that the stub resolver
performing distribution of queries has varying degrees of contextual
information. In general, more contextual information allows for
finer-grained distribution of information between resolvers.
4.1. Client-based
The simplest strategy is to distribute each different client to a
different resolver. This reduces the number of users any particular
service will know about. However, this does little to protect an
individual user from the aggregation of information about queries at
the selected resolver.
In this design clients select and consistently use the same resolver.
This might be achieved by randomly selecting and remembering a
resolver. Alternatively, a resolver might be selected using
consistent hashing that takes some conception of client identity as
input:
i = h(client identity) % n
For the purposes of this determination, a client might be an entire
device, with the selection being made at the operating system level,
or it could be a selection made by individual applications. In the
extreme, an individual application might be able to partition its
activities in a way that allows it to direct queries to multiple
resolvers.
4.1.1. Analysis of client-based selection
This is a simple and effective strategy, but while it provides
distribution of DNS queries in the aggregate, it does little to
divide information about a particular client between resolvers. It
effectively only reduces the number of clients that each resolver can
acquire information about. This provides systemic benefit, but does
not provide individual clients with any significant advantage as
there is still some resolver service that has a complete view of the
user's DNS usage patterns.
In addition, there are specific issues where this selection method is
used in particular deployment modes. Where different applications
Arkko, et al. Expires September 11, 2020 [Page 6]
Internet-Draft Distributed Resolver Selection March 2020
make independent resolver selections, activities that involve
multiple applications can result in information about those
activities being exposed to multiple resolvers. For instance, an
application could open another application for the purposes of
handling a specific file type or to load a URL. This could expose
queries related to the activity as a whole to multiple resolvers.
Making different selections at the level of a device resolves this
issue. But of course it is still possible that an individual who
uses multiple devices might perform similar activities on those
devices, but have DNS queries distributed to different resolvers,
resulting in replicating that individual's information to multiple
resolvers. The individual may or may not be identifiable through
fingerprinting of the specific set of queries being made from the
devices.
4.1.2. Enhancements to client-based selection
Clients can break continuity of records by occasionally resetting
state so that a different resolver is selected. A client might
choose to do this when it moves to a new network location, and may
otherwise appear as a new client its current resolver. But it is
unclear if there's a sufficient advantage to breaking continuity, as
the potential benefits are offset by the client's information being
disclosed to several resolvers as part of performing a series of
resets. And it is possible that a particular individual's usage
patterns can be identified across network locations and periods of
using other resolvers.
Breaking continuity is less effective if any state, in particular
cached results, is retained across the change. If activities that
depend on DNS querying are continued across the change then it might
be possible for the old resolver to make inferences about the
activity on the new resolver, or the new resolver to make similar
guesses about past activity. As many modern applications provide
session continuity features across shutdowns and crashes, this can
mean that finding an appropriate point in time to perform a switch
can be difficult.
4.2. Name-based
Clients might also additionally attempt to distribute queries based
on the name being queried. This results in different names going to
different resolvers.
A naive algorithm for name distribution uses the target name as input
to a fixed hash:
Arkko, et al. Expires September 11, 2020 [Page 7]
Internet-Draft Distributed Resolver Selection March 2020
i = h(queried name) % n
However, this simplistic approach fails to prevent related queries
from being distributed to different resolvers in several ways. For
instance, queries that are executed after receiving a CNAME record in
a response will leak the same information as the original query that
resulted in the CNAME record. Services that use related domain names
- such as where "example.com" uses "static.example.com" or
"cdn.example.net" - might reveal the use of the combined service to a
resolver that receives a query for any associated name. In both
cases, sensitive information is effectively replicated across
multiple resolvers.
4.2.1. Name reduction
In order to reduce the effect of distributing similar names to
different servers, a grouping mechanism might be used. Leading
labels in names might be erased before being input to the hashing
algorithm. This requires that the part of the suffix that is shared
between different services can be identified. For the purposes of
ensuring that queries are consistently routed to the same resolver, a
weak signal is likely sufficient.
Several options for grouping domain names into equivalence sets might
be used:
o The public suffix list [1] provides a manually curated list of
shared domain suffixes. Names can be reduced to include one label
more than the list allows, referred to as effective top-level
domain plus one (eTLD+1). This reduces the number of cases where
queries for domains under the same administrative control are sent
to different resolvers.
o Services often relies on multiple domain names across different
eTLD+1 domains. Developing equivalence sets might be needed to
avoid broadcasting queries to servers. Mozilla maintains a
manually curated equivalence list [2] for web sites that aims to
maps the complete set of unrelated names used by services to a
single service name.
o Other technologies, such as the proposed first party sets [3] or
the abandoned DBOUND [DBOUND] provide domain owners a means to
declare some form of equivalence for different names.
Each of these techniques are imperfect in different ways. They may
also skew the distribution of queries in ways that might concentrate
information on particular resolvers. Moreover, resolver choice based
solely on the name to be resolved rather than per-client information
Arkko, et al. Expires September 11, 2020 [Page 8]
Internet-Draft Distributed Resolver Selection March 2020
reduces the anonymity set of queries sent to each resolver. In
contrast to a client-based strategy, attackers can predict the target
resolver for a given name using a name-based strategy. This may have
implications for on-path attacker attempts to identify otherwise
encrypted queries. Of course, even a name-based mechanism might use
some non-public information as well for its choice, which might
reduce these issues.
5. Early conclusions
5.1. Analysis conclusions
Both the client-based and more advanced name-based strategies may
provide benefits. The former may provide primarily a systemic
benefit, while the latter may provide also some privacy benefits to
each individual client. However, neither strategy is perfect, and
can leak the same information to multiple resolvers in some cases.
5.2. Recommendations
Both strategies are, however, likely generally beneficial in the
common cases, and can improve the overall privacy situation. And
they are certainly a considerable privacy improvement over a
situation where a large number of clients use a single resolver.
Their use may also reduce any pressures against specific resolvers,
as information available in these specific resolvers does not
constitute all information about all clients. As such, the use of
one of these distribution strategies is tentatively recommended,
subject to further testing, discussion, and resolving any remaining
operational issues.
The naive name-based strategy is, however, not recommended, and
neither are other, even simpler strategies listed in Section 5.3. It
should be noted that no technique presented in this memo can defend
against a situation where an actor such as a surveillance agency has
access to information from all resolvers.
5.3. Poor distribution strategies
Random allocation to a resolver might be implemented:
i = rand() % n
Similar drawbacks can be seen where clients iterate over available
resolvers:
i = counter++ % n
Arkko, et al. Expires September 11, 2020 [Page 9]
Internet-Draft Distributed Resolver Selection March 2020
Whether this choice is made on a per-query basis, these two methods
eventually provide information about all queries to all resolvers
over time. Domain names are often queried many times over long
periods, so queries for the same domain name will eventually be
distributed to all resolvers. Only one-off queries will avoid being
distributed.
Implementing either method at a much slower cadence might be
effective, subject to the constraints in Section 4.1.2. This only
slows the distribution of information about repeated queries to all
resolvers.
6. Effects of query distribution
Choosing to use more than one DNS resolver has broader implications
than just the effect on privacy. Using multiple resolvers is a
significant change from the assumed model where stub resolvers send
all queries to a single resolver.
6.1. Caching considerations
Using a common cache for multiple resolvers introduces the
possibility that a resolver could learn about queries that were
originally directed to another resolvers by observing the absence of
queries. Though this can reduce caching performance, clients can
address this by having a per-resolver cache and only using the cache
for the selected resolver.
6.2. Consistency considerations
Making the same query to multiple resolvers can result in different
answers. For instance, DNS-based load balancing can lead to
different answers being produced over time or for different query
origins. Or, different resolvers might have different policies with
respect to blocking or filtering of queries that lead to clients
receiving inconsistent answers.
In the extreme, an application might encounter errors as a result of
receiving incompatible answers, particularly if a server operator
(incorrectly) assumes that different DNS queries for the same client
always originate from the same source address. This is most likely
to occur if name-based selection is used, as queries could be related
based on information that the client does not consider.
Arkko, et al. Expires September 11, 2020 [Page 10]
Internet-Draft Distributed Resolver Selection March 2020
6.3. Resolver load distribution and failover
Any selection of resolvers that is based on random inputs will need
to account for available capacity on resolvers. Otherwise, resolvers
with less available query-processing capacity will receive too high a
proportion of all queries. Clients only need to be informed of
relative available capacity in order to make an appropriate
selection. How relative capacities of resolvers are determined is
not in scope for this document.
The choice of different resolvers would also need to work well with
whatever mechanisms exist for failover to alternate resolvers when
one is not responsive. The same is true of IPv4/IPv6 connectivity,
the availability of communications to specific ports, etc. And the
dynamic situation should obviously not lead to extensive leakage to
different resolvers, either.
6.4. Query performance
Distribution of queries between resolvers also means that clients are
exposed to greater variations in performance [HBSHF19]. In contrast,
using a single resolver, as would result from the client-based method
in Section 4.1, promotes use of a persistent connection.
6.5. Debugging
The use of multiple resolvers may complicate debugging.
7. Further work
Should there be interest in the deployment of ideas laid out in this
memo, further work is needed. There would have to be ways to
configure systems to use multiple resolvers, including for instance:
o Central configuration mechanisms to enable the use of multiple
resolvers, perhaps through usual network configuration mechanisms
or choices made by applications using resolver services directly.
It may also be necessary to employ discovery mechanisms, such as,
e.g., [I-D.schinazi-httpbis-doh-preference-hints] or
[I-D.pauly-dprive-adaptive-dns-privacy] (but see Section 3)
o Mechanisms to allow both failover to working resolvers when a
resolver is unreachable,
o Additional testing for potential operational issues discussed in
Section 2 would be beneficial.
Arkko, et al. Expires September 11, 2020 [Page 11]
Internet-Draft Distributed Resolver Selection March 2020
Finally, more work is needed to determine factors other than privacy
that could motivate having queries routed to the same resolver. The
choice between different approaches is often a combination of several
factors, and privacy is only one of those factors.
8. References
8.1. Normative References
[DNS] Mockapetris, P., "Domain names - implementation and
specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
November 1987, <https://www.rfc-editor.org/info/rfc1035>.
[DNSSEC] Arends, R., Austein, R., Larson, M., Massey, D., and S.
Rose, "DNS Security Introduction and Requirements",
RFC 4033, DOI 10.17487/RFC4033, March 2005,
<https://www.rfc-editor.org/info/rfc4033>.
[DOH] Hoffman, P. and P. McManus, "DNS Queries over HTTPS
(DoH)", RFC 8484, DOI 10.17487/RFC8484, October 2018,
<https://www.rfc-editor.org/info/rfc8484>.
[DOT] Hu, Z., Zhu, L., Heidemann, J., Mankin, A., Wessels, D.,
and P. Hoffman, "Specification for DNS over Transport
Layer Security (TLS)", RFC 7858, DOI 10.17487/RFC7858, May
2016, <https://www.rfc-editor.org/info/rfc7858>.
[PMON] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an
Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May
2014, <https://www.rfc-editor.org/info/rfc7258>.
8.2. Informative References
[DBOUND] Levine, J., "Publishing Organization Boundaries in the
DNS", draft-levine-dbound-dns-03 (work in progress), April
2019.
[HBSHF19] Austin Hounsel, ., Kevin Borgolte, ., Paul Schmidt, .,
Jordan Holland, ., and . Nick Feamster, "Analyzing the
Costs (and Benefits) of DNS, DoT, andDoH for the Modern
Web", ANRW '19, July 22, 2019, Montreal, QC, Canada ,
n.d., <https://dl.acm.org/doi/10.1145/3340301.3341129>.
[I-D.pauly-dprive-adaptive-dns-privacy]
Kinnear, E., Pauly, T., Wood, C., and P. McManus,
"Adaptive DNS: Improving Privacy of Name Resolution",
draft-pauly-dprive-adaptive-dns-privacy-01 (work in
progress), November 2019.
Arkko, et al. Expires September 11, 2020 [Page 12]
Internet-Draft Distributed Resolver Selection March 2020
[I-D.schinazi-httpbis-doh-preference-hints]
Schinazi, D., Sullivan, N., and J. Kipp, "DoH Preference
Hints for HTTP", draft-schinazi-httpbis-doh-preference-
hints-01 (work in progress), January 2020.
[MSCVUS] Wikipedia, ., "Microsoft Corp. v. United States",
https://en.wikipedia.org/wiki/
Microsoft_Corp._v._United_States , n.d..
8.3. URIs
[1] https://publicsuffix.org/
[2] https://github.com/mozilla-services/shavar-prod-
lists/blob/master/disconnect-entitylist.json
[3] https://github.com/krgovind/first-party-sets
Appendix A. Acknowledgements
The authors would like to thank Christian Huitema, Ari Keraenen, Mark
Nottingham, Stephen Farrell, Gonzalo Camarillo, Mirja Kuehlewind,
David Allan, Daniel Migault, Goran AP Eriksson, Christopher Wood, and
many others for interesting discussions in this problem space.
Authors' Addresses
Jari Arkko
Ericsson
Email: jari.arkko@piuha.net
Martin Thomson
Mozilla
Email: martin.thomson@gmail.com
Ted Hardie
Google
Email: ted.ietf@gmail.com
Arkko, et al. Expires September 11, 2020 [Page 13]