Internet DRAFT - draft-wood-linkable-identifiers
draft-wood-linkable-identifiers
Network Working Group C. Wood
Internet-Draft Apple Inc.
Intended status: Informational April 24, 2019
Expires: October 26, 2019
Linkable Identifiers
draft-wood-linkable-identifiers-01
Abstract
Rotating public identifiers is encouraged as best practice as a means
of protecting endpoint privacy. For example, regular MAC address
randomization helps mitigate device tracking across time and space.
Other protocols beyond those in the link layer also have public
identifiers or parameters that should rotate over time, in unison
with coupled protocol identifiers, and perhaps with application level
identifiers. This document surveys such privacy-related identifiers
exposed by common Internet protocols at various layers in a network
stack. It provides advice for rotating linked identifiers such that
privacy violations do not occur from rotating one identifier while
neglecting to rotate coupled identifiers.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on October 26, 2019.
Copyright Notice
Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
Wood Expires October 26, 2019 [Page 1]
Internet-Draft linkable-identifiers April 2019
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Sticky Protocol Identifiers . . . . . . . . . . . . . . . . . 3
2.1. Internet and Link Layer . . . . . . . . . . . . . . . . . 3
2.2. Transport and Session Layer . . . . . . . . . . . . . . . 4
2.3. Application Layer: . . . . . . . . . . . . . . . . . . . 5
3. Identifier Scope and Threat Model . . . . . . . . . . . . . . 6
4. Limiting Linkable Identifiers . . . . . . . . . . . . . . . . 6
4.1. Time and Path Linkability . . . . . . . . . . . . . . . . 7
5. Timing Considerations . . . . . . . . . . . . . . . . . . . . 8
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8
7. Security Considerations . . . . . . . . . . . . . . . . . . . 8
8. Privacy Considerations . . . . . . . . . . . . . . . . . . . 8
9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 8
10. Normative References . . . . . . . . . . . . . . . . . . . . 9
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 10
1. Introduction
[RFC6973] defines the correlation of information relevant to or
associated with a specific user as a significant attack on privacy.
Different layers of the network stack use identifiers to uniquely
address hosts or information flows. To mitigate the privacy concern,
many standards suggest randomizing or otherwise rotating such
identifiers on a regular basis. For example, a MAC address may be
used to link otherwise unrelated network packets to a single device.
Rotating the MAC address prevents this association at the link layer.
However, when multiple identifiers are simultaneously present on
different layers of the stack, breaking the association at any
individual layer might be insufficient to disassociate a host from
their network traffic. Linkability can also occur across protocols
and/or across layers. For example, TLS connections are commonly
preceded by DNS queries for a particular endpoint (host name), e.g.
example.com. Moreover, in the TLS handshake, this same host name is
sent in cleartext in the Server Name Indication extension. Thus,
observing either the DNS query or TLS SNI reveals information about
the other. Similarly, while an IP address of a device may rotate, if
web browser cookies do not, then, a website can track the various IP
addresses of a given cookie over time.
Wood Expires October 26, 2019 [Page 2]
Internet-Draft linkable-identifiers April 2019
Huitema et al. [I-D.ietf-dnssd-privacy] say, "it is important that
the obfuscation of instance names is performed at the right time, and
that the obfuscated names change in synchrony with other identifiers,
such as MAC Addresses, IP Addresses or host names." Consider the
following example where this advice is not followed, wherein an IP
address is changed yet the MAC address is not.
+---------------+ +---------------+
| ... | .... | ... |
+---------------+ +---------------+
| IP Address A <---//---> IP Address B |
+---------------+ +---------------+
| MAC Address A <--------> MAC Address A |
+---------------+ +---------------+
+---------------------------------------->
time
A network adversary may trivially link these packets based on their
common MAC address and continue to associate traffic with this
particular host based on IP address B even if the MAC address
eventually changes in the future. In this document, we outline
simple rules that SHOULD be followed by protocol implementations to
avoid such linkability. We then survey protocols developed inside
the IETF and out, and identify their sticky identifiers. Results
were obtained by analyzing protocol documentation and specifications,
and also scanning packet traces captured from protocols in practice
on common systems.
2. Sticky Protocol Identifiers
In this section, we survey existing protocols developed inside and
out of the IETF, and identify sticky protocol identifiers for each.
A sticky identifier is one that persists across logically grouped
data exchanges between a client and server. This may include state-
generating servers or, commonly, client algorithm, software
configuration, or device-specific fields. We categorize surveyed
protocols by the OSI layer at which they operate. Specifically, we
focus on Link, Internet, Transport, Session, and Application layers.
(Our taxonomy may not match traditional OSI models, though we
consider it sufficiently representative.)
2.1. Internet and Link Layer
o Ethernet, 802.11, and Bluetooth: MAC addresses are fixed to
specific devices. Unless frequently rotated, they are sticky
identifiers. Simply rotating the MAC address may or may not be
sufficient depending on other information sent at the protocol
Wood Expires October 26, 2019 [Page 3]
Internet-Draft linkable-identifiers April 2019
layer with the a (rotated) MAC address. For example, in 802.11,
frames have an incrementing sequence number and if the sequence
number is not reset in unison with a MAC address change, the
sequence number can be used to re-correlate randomized MAC
addresses.
o IPv4 and IPv6: Static or infrequently rotating addresses are
sticky identifiers when exposed on the network. Privacy
Extensions for Stateless Address Autoconfiguration [RFC4941]
enhance IPv6 client privacy by, e.g., issuing new IPv6 /64
prefixes every day. The 64-bit IID suffix remains random to deter
linkability.
o IKEv2: Initiator Security Parameters Indexes (SPIs) are used as
connection identifiers instead of IP addresses. They are required
to rotate for each new SA.
2.2. Transport and Session Layer
o TCP [RFC0793]: TCP source ports may be sticky if reused across
senders. For example, most operating systems allocate allocate
ephemeral (short lived) ports to each new connection. Per IANA
allocations, ephemeral ports range from 49152 to 65535 (2^15+2^14
to 2^16-1) [http://www.iana.org/assignments/port-numbers].
However, this does not prevent an application from re-using port
across connections. Destination are also intentionally sticky,
since they identify services offered by endpoints. Therefore,
reusing a destination port does not lead to decreased linkability.
Moreover, with TCP Fast Open (TFO) [RFC7413], servers give clients
plaintext cookies that must be re-used when resuming a TCP+TFO
connection. Clients do not modify these server cookies, which
therefore means they can be tracked.
o MPTCP [RFC6824]: Connection tokens or IDs are explicitly used to
link MPTCP subflows between IP address pairs. These tokens are
only exposed during flow management operations, e.g., when
creating new subflows. Normal data transfer uses TCP sequence
numbers to bypass middlebox interference and an additional data
sequence number (DSN) TCP option to allow receivers to deal with
out-of-order subflow packet arrival. The union of packet DSNs
across subflows should yield a contiguous packet number sequence.
o TLS [RFC5246] [RFC8446]: Prior to TLS 1.3, significant information
is exposed during TLS handshakes, including: session identifiers
(or re-used PSK identifiers in TLS 1.3), timestamps, random
nonces, supported ciphersuites, certificates, and extensions.
Many of these are common across all TLS clients - specifically,
ciphersuites, nonces, and timestamps. However, others may persist
Wood Expires October 26, 2019 [Page 4]
Internet-Draft linkable-identifiers April 2019
across active sessions, including: session identifiers (in TLS 1.2
and earlier versions) and re-used PSK identifiers (in TLS 1.3).
Without rotation, these re-used identifers are sticky.
o DTLS [RFC6347]: Datagram TLS is a slightly modified variant of TLS
aimed to run over datagram protocols such as UDP. In addition to
identifiers exposed via TLS, DTLS adds cookie-based denial-of-
service countermeasures. Servers issue stateless cookies to
clients during a handshake, which must be replayed in cleartext by
clients to prove ownership of its IP address. (This is similar to
TFO cookies described above.) Additionally, DTLS is considering
support of a static connection identifier (CID)
[I-D.ietf-tls-dtls-connection-id], which permits client address
mobility. CIDs are specifically designed to not change across
addresses.
o QUIC [I-D.ietf-quic-transport]: QUIC is another secure transport
protocol originally developed by Google and now being standardized
by the IETF. IETF-QUIC [I-D.ietf-quic-transport] uses TLS 1.3 for
its handshake. In addition to identifiers exposed by TLS 1.3,
QUIC has its own connection identifier (CID) used to permit
address mobility.
2.3. Application Layer:
o HTTP [RFC2616]: While HTTP is a stateless protocol, it enables
applications to define state-keeping mechanisms in header fields.
The fields might carry the state itself or tokens pointing to
state kept at the endpoints. The Cookie header field [RFC6265] is
de-facto the mechanism for web applications to uniquely identify
their clients by generating a token and instructing the client to
attach to any future requests. The ETag header field [RFC7232]
enables applications to uniquely reference a resource which the
client may cache. Applications may return unique reference tokens
to distinct clients.
o DNS [RFC1035]: SRV records often contain human-readable
information specific to particular devices, clients, or users.
For example, printers may advertise its services with SRV records
that contain a human-readable instance name. These are often not
rotated as services change.
o NTP [RFC5905]: By default, mode 3 for NTP - client to server -
sends several source-specific fields in the clear to NTP servers,
including: timestamps, poll, and precision. These fields should
be left empty or randomized as per
[I-D.ietf-ntp-data-minimization]. Other fields that may link to
Wood Expires October 26, 2019 [Page 5]
Internet-Draft linkable-identifiers April 2019
clients include: Stratum, Root Delay, Root Dispersion, Ref ID, Ref
Timestamp, Origin Timestamp, and Receive Timestamp.
3. Identifier Scope and Threat Model
Not all packet identifiers are visible end-to-end in a client-server
interaction. For example, MAC addresses are only visible to those
with physical access to the medium - the local subnet for Ethernet
and proximity for Wi-Fi; we will consider both of these "on-path" for
the sake of this analysis. IP addresses are only visible between
endpoints. (In systems such as Tor, source and destination addresses
change at each circuit hop.) Thus, identifier linkability depends on
the threat model under consideration. Off-path adversaries, e.g.
those without physical access to the medium, are not considered a
problem since they do not have access to packets in flight. On-path
adversaries may exist at various locations relative to an endpoint
(sender or receiver) on a path, e.g., in a local subnet, as an
intermediate router or middlebox between two endpoints, or as a TLS
terminating reverse proxy. In this document, we categorize these
three types of adversaries as follows:
1. Local: An on-path adversary belonging to the same local subnet as
an endpoint, e.g., a switch.
2. Intermediate: An on-path adversary that observes datagrams in
flight but does not terminate a (TCP or TLS) connection, e.g., a
middlebox or performance enhancing proxy (PEP).
3. Terminator: An on-path adversary that terminates a connection,
e.g., a TLS- terminating reverse proxy. Note that there can be
distinct terminators for individual layers of network stack.
E.g., one for TLS and another for HTTP.
The scope of an identifier includes are all other protocols and
layers observable by the same adversary.
4. Limiting Linkable Identifiers
The introductory example illustrating packet linkability using MAC
addresses is one of many possible ways in which an attacker may link
packets. As another hypothetical example, assume that IP address and
MAC addresses were properly rotated, whereas TLS session identifiers
were reused over time, as shown below.
Wood Expires October 26, 2019 [Page 6]
Internet-Draft linkable-identifiers April 2019
+---------------+ +---------------+
| TLS Session X <--------> TLS Session X |
+---------------+ +---------------+
| ... | .... | ... |
+---------------+ +---------------+
| IP Address A <---//---> IP Address B |
+---------------+ +---------------+
| MAC Address A <---//---> MAC Address C |
+---------------+ +---------------+
+------------------------------------------>
time
Despite rotating all protocol identifiers beneath TLS, a static
session identifier makes packet linkability trivial. Thus, a strict,
yet safe rule for removing packet linkability is to rotate all linked
identifiers in unison. Unfortunately, this strategy is problematic
in practice. It would imply terminating active connections whenever
an identifier changes (otherwise, linkability remains trivial). For
example, if MAC addresses are rotated on a regular basis, e.g., every
15 minutes, then connection lifetimes would be limited to this
window.
A more sensible policy would be to restrict identifier rotation to
layers which are exposed to the same adversary. For example, origin
MAC addresses may not be visible to the destination. In this case,
rotating IP addresses and TLS session identifiers is not required to
prevent packet linkability by an adversary who does not see the
origin MAC address. A realistic threat model is one in which IP- to
TLS-layer information is exposed to the same on-path adversary.
Identifiers beneath IP are visible to local adversaries, which may
not be an issue, and those above TLS are visible to authenticated
peers.
4.1. Time and Path Linkability
There are multiple dimensions along which identifiers may be linked:
(1) time, as identifiers are used and re-used by senders, and (2)
space, as identifiers are duplicated across multiple disjoint network
paths, possibly by different protocols. We refer to these dimensions
as time and path linkability, respectively.
Time linkability is arguably simpler to mitigate, since new
connections over time may opt to use new identifiers. For example,
instead of resuming a TLS session with an existing session ID, a
client may initiate a fresh handshake. As a simple rule, if an
identifier in the same scope changes, endpoints SHOULD use fresh
identifiers for all other protocols in that scope. This means that,
Wood Expires October 26, 2019 [Page 7]
Internet-Draft linkable-identifiers April 2019
for identifiers visible to intermediate adversaries, new TLS sessions
SHOULD be initiated from an endpoint with a fresh IP address and TCP
source port. Note that clients behind NATs may not need to generate
a fresh IP address, as they enjoy some measure of anonymity by
design. If local adversaries were considered part of the threat
model, then a fresh MAC address may also be needed.
In contrast, path linkability is more difficult to achieve, as it
requires using fresh identifiers for each protocol field. This may
not always be technically feasible. For example, DNS query names are
also intentionally used as the TLS SNI. Moreover, protocols such as
QUIC explicitly try to enable path linkability via connection-level
identifiers (CIDs) to support multihoming or mobile endpoints. This
makes path linkability impossible to mitigate. However, as multiple,
disjoint paths may be operated by different entities (e.g., ISPs),
collusion may be less common.
5. Timing Considerations
Advice in this document SHOULD NOT be interpreted as guarantees for
preventing linkability. Rather, it aims to increase linkability
complexity. It is difficult to prevent path-linkability without
modifying protocols above the layer at which identifiers rotate. For
example, assuming MPTCP subflows were unlinkable across paths, shared
transport state controlling the rate of data transmission may be
sufficient to link these flows.
6. IANA Considerations
This document has no request to IANA.
7. Security Considerations
This document does not introduce any new security protocol.
8. Privacy Considerations
This document describes considerations and suggestions for improving
privacy in the context of many IETF protocols. It does not introduce
any new features or protocol behavior that would adversely impact
privacy.
9. Acknowledgments
The authors thank Martin Thompson and Brian Trammell for comments on
earlier versions of this document.
Wood Expires October 26, 2019 [Page 8]
Internet-Draft linkable-identifiers April 2019
10. Normative References
[I-D.ietf-dnssd-privacy]
Huitema, C. and D. Kaiser, "Privacy Extensions for DNS-
SD", draft-ietf-dnssd-privacy-05 (work in progress),
October 2018.
[I-D.ietf-ntp-data-minimization]
Franke, D. and A. Malhotra, "NTP Client Data
Minimization", draft-ietf-ntp-data-minimization-04 (work
in progress), March 2019.
[I-D.ietf-quic-transport]
Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
and Secure Transport", draft-ietf-quic-transport-20 (work
in progress), April 2019.
[I-D.ietf-tls-dtls-connection-id]
Rescorla, E., Tschofenig, H., and T. Fossati, "Connection
Identifiers for DTLS 1.2", draft-ietf-tls-dtls-connection-
id-04 (work in progress), March 2019.
[RFC0793] Postel, J., "Transmission Control Protocol", STD 7,
RFC 793, DOI 10.17487/RFC0793, September 1981,
<https://www.rfc-editor.org/info/rfc793>.
[RFC1035] Mockapetris, P., "Domain names - implementation and
specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
November 1987, <https://www.rfc-editor.org/info/rfc1035>.
[RFC2508] Casner, S. and V. Jacobson, "Compressing IP/UDP/RTP
Headers for Low-Speed Serial Links", RFC 2508,
DOI 10.17487/RFC2508, February 1999, <https://www.rfc-
editor.org/info/rfc2508>.
[RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
Transfer Protocol -- HTTP/1.1", RFC 2616,
DOI 10.17487/RFC2616, June 1999, <https://www.rfc-
editor.org/info/rfc2616>.
[RFC4941] Narten, T., Draves, R., and S. Krishnan, "Privacy
Extensions for Stateless Address Autoconfiguration in
IPv6", RFC 4941, DOI 10.17487/RFC4941, September 2007,
<https://www.rfc-editor.org/info/rfc4941>.
Wood Expires October 26, 2019 [Page 9]
Internet-Draft linkable-identifiers April 2019
[RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security
(TLS) Protocol Version 1.2", RFC 5246,
DOI 10.17487/RFC5246, August 2008, <https://www.rfc-
editor.org/info/rfc5246>.
[RFC5905] Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch,
"Network Time Protocol Version 4: Protocol and Algorithms
Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010,
<https://www.rfc-editor.org/info/rfc5905>.
[RFC6265] Barth, A., "HTTP State Management Mechanism", RFC 6265,
DOI 10.17487/RFC6265, April 2011, <https://www.rfc-
editor.org/info/rfc6265>.
[RFC6347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer
Security Version 1.2", RFC 6347, DOI 10.17487/RFC6347,
January 2012, <https://www.rfc-editor.org/info/rfc6347>.
[RFC6824] Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
"TCP Extensions for Multipath Operation with Multiple
Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013,
<https://www.rfc-editor.org/info/rfc6824>.
[RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
Morris, J., Hansen, M., and R. Smith, "Privacy
Considerations for Internet Protocols", RFC 6973,
DOI 10.17487/RFC6973, July 2013, <https://www.rfc-
editor.org/info/rfc6973>.
[RFC7232] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
Protocol (HTTP/1.1): Conditional Requests", RFC 7232,
DOI 10.17487/RFC7232, June 2014, <https://www.rfc-
editor.org/info/rfc7232>.
[RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014,
<https://www.rfc-editor.org/info/rfc7413>.
[RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol
Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
<https://www.rfc-editor.org/info/rfc8446>.
Author's Address
Wood Expires October 26, 2019 [Page 10]
Internet-Draft linkable-identifiers April 2019
Christopher A. Wood
Apple Inc.
One Apple Park Way
Cupertino, California 95014
United States of America
Email: cawood@apple.com
Wood Expires October 26, 2019 [Page 11]