Network Working Group | A. Cooper
Internet-Draft | CDT
Intended status: Informational | H. Tschofenig
Expires: September 11, 2012 | Nokia Siemens Networks
| B. Aboba
| Microsoft Corporation
| J. Peterson
| NeuStar, Inc.
| J. Morris
| March 12, 2012
Privacy Considerations for Internet Protocols
draft-iab-privacy-considerations-02.txt
This document offers guidance for developing privacy considerations for IETF documents and aims to make protocol designers aware of privacy-related design choices.
Discussion of this document is taking place on the IETF Privacy Discussion mailing list (see https://www.ietf.org/mailman/listinfo/ietf-privacy).
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 11, 2012.
Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
[RFC3552] provides detailed guidance to protocol designers about both how to consider security as part of protocol design and how to inform readers of IETF documents about security issues. This document intends to provide a similar set of guidance for considering privacy in protocol design.
Whether any individual document will require a specific privacy considerations section will depend on the document's content. Documents whose entire focus is privacy may not merit a separate section (for example, [RFC3325]). For certain specifications, privacy considerations are a subset of security considerations and can be discussed explicitly in the security considerations section. The guidance provided here can and should be used to assess the privacy considerations of protocol, architectural, and operational specifications and to decide whether those considerations are to be documented in a stand-alone section, within the security considerations section, or throughout the document.
Privacy is a complicated concept with a rich history that spans many disciplines. Many sets of privacy principles and privacy design frameworks have been developed in different forums over the years. These include the Fair Information Practices (FIPs), a baseline set of privacy protections pertaining to the collection and use of data about individuals (see [OECD] for one example), and the Privacy by Design concept, which provides high-level privacy guidance for systems design (see [PbD] for one example). The guidance provided in this document is inspired by this prior work, but it aims to be more concrete, pointing protocol designers to specific engineering choices that can impact the privacy of the individuals that make use of Internet protocols.
Privacy as a legal concept is understood differently in different jurisdictions. The guidance provided in this document is generic and can be used to inform the design of any protocol to be used anywhere in the world, without reference to specific legal frameworks.
This document is organized as follows. Section 2 describes the extent to which the guidance offered is applicable within the IETF. Section 3 discusses threats to privacy as they apply to Internet protocols. Section 4 outlines privacy goals. Section 5 provides the guidelines for analyzing and documenting privacy considerations within IETF specifications. Section 6 examines the privacy characteristics of an IETF protocol to demonstrate the use of the guidance framework. Section 7 provides a concise glossary of terms used in this document, with a more complete discussion of some of the terms available in [I-D.iab-privacy-terminology].
The core function of IETF activity is building protocols. Internet protocols are often built flexibly, making them useful in a variety of architectures, contexts, and deployment scenarios without requiring significant interdependency between disparately designed components. Although some protocols assume particular architectures at design time, it is not uncommon for architectural frameworks to develop later, after implementations exist and have been deployed in combination with other protocols or components to form complete systems.
As a consequence, the extent to which protocol designers can foresee all of the privacy implications of a particular protocol at design time is significantly limited. An individual protocol may be relatively benign on its own, but when deployed within a larger system or used in a way not envisioned at design time, its use may create new privacy risks. The guidelines in Section 5 ask protocol designers to consider how their protocols are expected to interact with systems and information that exist outside the protocol bounds, but not to imagine every possible deployment scenario.
Furthermore, in many cases the privacy properties of a system are dependent upon API specifics, internal application functionality, database structure, local policy, and other details that are specific to particular instantiations and generally outside the scope of the work conducted in the IETF. The guidance provided here may be useful in making choices about those details, but its primary aim is to assist with the design, implementation, and operation of protocols. Privacy issues, even those related to protocol development, go beyond the technical guidance discussed herein.
As an example, consider HTTP [RFC2616], which was designed to allow the exchange of arbitrary data. A complete analysis of the privacy considerations for uses of HTTP might include what type of data is exchanged, how this data is stored, and how it is processed. Hence the analysis for an individual's static personal web page would differ from the analysis for the use of HTTP to exchange health records. A protocol designer working on HTTP extensions (such as WebDAV [RFC4918]) is not expected to describe the privacy risks derived from all possible usage scenarios, but rather the privacy properties specific to the extensions and any particular uses of the extensions that are expected and foreseen at design time.
Privacy harms come in a number of forms, including harms to financial standing, reputation, solitude, autonomy, and safety. A victim of identity theft or blackmail, for example, may suffer a financial loss as a result. Reputational harm can occur when disclosure of information about an individual, whether true or false, subjects that individual to stigma, embarrassment, or loss of personal dignity. Intrusion or interruption of an individual's life or activities can harm the individual's ability to be left alone. When individuals or their activities are monitored, exposed, or at risk of exposure, those individuals may be stifled from expressing themselves, associating with others, and generally conducting their lives freely. In cases where such monitoring is for the purpose of stalking or violence, it can put individuals in physical danger.
This section lists common privacy threats (drawing liberally from [Solove]), showing how each of them may cause individuals to incur privacy harms and providing examples of how these threats can exist on the Internet.
To understand attacks in the privacy-harm sense, it is helpful to consider the overall communication architecture and different actors' roles within it. Consider a protocol element that initiates communication with some recipient (an "initiator"). Privacy analysis is most relevant for protocols with use cases in which the initiator acts on behalf of a natural person (or different people at different times). It is this natural person -- the data subject -- whose privacy is potentially threatened.
Communications may be direct between the initiator and the recipient, or they may involve an intermediary (such as a proxy or cache) that is necessary for the two parties to communicate. In some cases this intermediary stays in the communication path for the entire duration of the communication; in others it is used only for communication establishment, for either inbound or outbound communication. In rare cases a series of intermediaries may be traversed.
Some communications tasks require multiple protocol interactions with different entities. For example, a request to an HTTP server may be preceded by an interaction between the initiator and an Authentication, Authorization, and Accounting (AAA) server or DNS resolver. In this case, the HTTP server is the recipient and the other entities are enablers of the initiator-to-recipient communication. Similarly, a single communication with the recipient may generate further protocol interactions between either the initiator or the recipient and other entities. For example, an HTTP request might trigger interactions with an authentication server or with other resource servers.
As a general matter, recipients, intermediaries, and enablers are usually assumed to be authorized to receive and handle data from initiators. As [RFC3552] explains, "we assume that the end-systems engaging in a protocol exchange have not themselves been compromised."
Although they may not generally be considered as attackers, recipients, intermediaries, and enablers may all pose privacy threats (depending on the context) because they are able to observe and collect privacy-relevant data. These entities are collectively described below as "observers" to distinguish them from traditional attackers. From a privacy perspective, one important type of attacker is an eavesdropper: an entity that passively observes the initiator's communications without the initiator's knowledge or authorization.
The threat descriptions in the next section explain how observers and attackers might act to harm data subjects' privacy. Different kinds of attacks may be feasible at different points in the communications path. For example, an observer could mount surveillance or identification attacks between the initiator and intermediary, or instead could surveil an enabler (e.g., by observing DNS queries from the initiator).
Some privacy threats are already considered in IETF protocols as a matter of routine security analysis. Others are more pure privacy threats that existing security considerations do not usually address. The threats described here are divided into those that may also be considered security threats and those that are primarily privacy threats.
Note that an individual's knowledge and authorization of the practices described below can greatly affect the extent to which they threaten privacy. If a data subject authorizes surveillance of his own activities, for example, the harms associated with it may be significantly mitigated.
Surveillance is the observation or monitoring of an individual's communications or activities. The effects of surveillance on the individual can range from anxiety and discomfort to behavioral changes such as inhibition and self-censorship to the perpetration of violence against the individual. The individual need not be aware of the surveillance for it to impact privacy -- the possibility of surveillance may be enough to harm individual autonomy.
Surveillance can be conducted by observers or eavesdroppers at any point along the communications path. Confidentiality protections (as discussed in [RFC3552] Section 3) are necessary to prevent surveillance of the content of communications. To prevent traffic analysis or other surveillance of communications patterns, other measures may be necessary, such as [Tor].
End systems that do not take adequate measures to secure stored data from unauthorized or inappropriate access expose individuals to potential financial, reputational, or physical harm.
By and large, protecting against stored data compromise is outside the scope of IETF protocols. However, a number of common protocol functions -- key management, access control, or operational logging, for example -- require the storage of data about initiators of communications. When requiring or recommending that information about initiators or their communications be stored or logged by end systems (see, e.g., RFC 6302), it is important to recognize the potential for that information to be compromised and to weigh that potential against the benefits of data storage. Any recipient, intermediary, or enabler that stores data may be vulnerable to compromise.
Intrusion consists of invasive acts that disturb or interrupt one's life or activities. Intrusion can thwart individuals' desires to be let alone, sap their time or attention, or interrupt their activities.
Unsolicited mail and denial-of-service attacks are the most common types of intrusion on the Internet. Intrusion can be perpetrated by any attacker that is capable of sending unwanted traffic to the initiator.
Correlation is the combination of various pieces of information about an individual. Correlation can defy people's expectations of the limits of what others know about them. It can increase the power that those doing the correlating have over individuals as well as correlators' ability to pass judgment, threatening individual autonomy and reputation.
Correlation is closely related to identification. Internet protocols can facilitate correlation by allowing data subjects' activities to be tracked and combined over time. The use of persistent or infrequently refreshed identifiers at any layer of the stack can facilitate correlation. For example, an initiator's persistent use of the same device ID, certificate, or email address across multiple interactions could allow recipients to correlate all of the initiator's communications over time.
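The effect of a persistent identifier can be illustrated with a minimal sketch. The observation records, identifier values, and the `correlate` function below are hypothetical and are not drawn from any particular protocol; they only show how a recipient that sees the same identifier across interactions could assemble a profile over time.

```python
from collections import defaultdict

# Hypothetical observations: (identifier, timestamp, activity).
# All values are illustrative assumptions.
observations = [
    ("device-1234", "2012-03-01T09:00", "fetch mail"),
    ("device-9999", "2012-03-01T09:05", "dns lookup"),
    ("device-1234", "2012-03-02T18:30", "location update"),
    ("device-1234", "2012-03-05T07:45", "presence publish"),
]


def correlate(records):
    """Group observed activities by the identifier that accompanied them."""
    profiles = defaultdict(list)
    for identifier, timestamp, activity in records:
        profiles[identifier].append((timestamp, activity))
    return dict(profiles)


profiles = correlate(observations)
for identifier, history in profiles.items():
    print(identifier, len(history), "linked interactions")
```

In this sketch, the three interactions that carry "device-1234" end up in one profile. Had the identifier been refreshed for each interaction, the same grouping step would yield three unrelated single-entry profiles.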
In theory any observer or attacker that receives an initiator's communications can engage in correlation. The extent of the potential for correlation will depend on what data the entity receives from the initiator and has access to otherwise. Often, intermediaries only require a small amount of information for message routing and/or security. In theory, protocol mechanisms could ensure that end-to-end information is not made accessible to these entities, but in practice the difficulty of deploying end-to-end security procedures, additional messaging or computational overhead, and other business or legal requirements often slow or prevent the deployment of end-to-end security mechanisms, giving intermediaries greater exposure to initiators' data than is strictly necessary.
Identification is the linking of information to a particular individual. In some contexts it is perfectly legitimate to identify individuals, whereas in others identification may potentially stifle individuals' activities or expression by inhibiting their ability to be anonymous or pseudonymous. Identification also makes it easier for individuals to be explicitly controlled by others (e.g., governments).
Many protocol identifiers, such as those used in SIP or XMPP, may allow for the direct identification of data subjects. Protocol identifiers may also contribute indirectly to identification via correlation. For example, a web site that does not directly authenticate users may be able to match its HTTP header logs with logs from another site that does authenticate users, rendering users on the first site identifiable.
As with correlation, any observer or attacker may be able to engage in identification depending on the information about the initiator that is available via the protocol mechanism or other channels.
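The log-matching scenario described above might look roughly like the following sketch. The log layout, the field names (`ip`, `user_agent`, `path`, `account`), and the choice of source address plus User-Agent value as the matching key are assumptions made for illustration; real logs and matching techniques vary.

```python
# Hypothetical log entries; field names and values are illustrative assumptions.
site_a_log = [  # site A does not authenticate users
    {"ip": "192.0.2.10", "user_agent": "ExampleBrowser/1.0", "path": "/forum/thread/7"},
    {"ip": "198.51.100.4", "user_agent": "OtherBrowser/2.3", "path": "/forum/thread/9"},
]
site_b_log = [  # site B authenticates users
    {"ip": "192.0.2.10", "user_agent": "ExampleBrowser/1.0", "account": "alice@example.com"},
]


def identify(anonymous_log, authenticated_log):
    """Attach account names to unauthenticated entries that share the same key."""
    known = {(e["ip"], e["user_agent"]): e["account"] for e in authenticated_log}
    matches = []
    for entry in anonymous_log:
        key = (entry["ip"], entry["user_agent"])
        if key in known:
            matches.append((known[key], entry["path"]))
    return matches


matches = identify(site_a_log, site_b_log)
print(matches)
```

Neither log identifies the visitor to site A on its own; the identification arises only when the two are combined, which is why it counts as indirect identification via correlation.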
Secondary use is the use of collected information without the data subject's consent for a purpose different from that for which the information was collected. Secondary use may violate people's expectations or desires. The potential for secondary use can generate uncertainty over how one's information will be used in the future, potentially discouraging information exchange in the first place.
One example of secondary use would be a network access server that uses an initiator's access requests to track the initiator's location. Any observer or attacker could potentially make unwanted secondary uses of initiators' data.
Disclosure is the revelation of truthful information about a person that affects the way others judge the person. Disclosure can violate people's expectations of the confidentiality of the data they share. The threat of disclosure may deter people from engaging in certain activities for fear of reputational harm.
Any observer or attacker that receives data about an initiator may choose to engage in disclosure. In most cases, there is nothing done at the protocol level to influence or limit disclosure, although there are some exceptions. For example, the GEOPRIV architecture [RFC6280] provides a way for users to express a preference that their location information not be disclosed beyond the intended recipient.
Exclusion is the failure to allow the data subject to know about the data that others have about him or her and to participate in its handling and use. Exclusion reduces accountability on the part of entities that maintain information about people and creates a sense of vulnerability about individuals' ability to control how information about them is collected and used.
The most common way for Internet protocols to be involved in limiting exclusion is through access control mechanisms. The presence architecture developed in the IETF is a good example where data subjects are included in the control of information about them. Using a rules expression language (e.g., Presence Authorization Rules [RFC5025]), presence clients can authorize the specific conditions under which their presence information may be shared.
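As a rough illustration of rule-based authorization, the sketch below filters presence information per watcher. It does not reproduce the XML format or the processing model of [RFC5025]; the rule structure, the field names, and the `authorized_view` function are hypothetical simplifications.

```python
# Hypothetical rules set by the data subject: which presence fields each
# watcher may see. This is not the RFC 5025 document format.
rules = [
    {"watcher": "bob@example.com", "allow": {"status", "note"}},
    {"watcher": "carol@example.com", "allow": {"status"}},
]


def authorized_view(presence, watcher, rules):
    """Return only the presence fields authorized for this watcher."""
    for rule in rules:
        if rule["watcher"] == watcher:
            return {k: v for k, v in presence.items() if k in rule["allow"]}
    return {}  # default deny: watchers without a rule see nothing


presence = {"status": "busy", "note": "in a meeting", "location": "Room 4"}
print(authorized_view(presence, "bob@example.com", rules))
print(authorized_view(presence, "mallory@example.com", rules))
```

The default-deny return value is a design choice of this sketch: a watcher the data subject has not explicitly authorized receives no presence information at all.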
Exclusion is primarily considered problematic when the recipient fails to involve the initiator in decisions about data collection, handling, and use. Eavesdroppers engage in exclusion by their very nature since their data collection and handling practices are covert.
Privacy is notoriously difficult to measure and quantify. The extent to which a particular protocol, system, or architecture "protects" or "enhances" privacy is dependent on a large number of factors relating to its design, use, and potential misuse. However, there are certain widely recognized privacy properties against which designs may be assessed for their potential to impact privacy. This section adapts these properties into four privacy goals for Internet protocols: (1) data minimization, (2) user participation, (3) accountability, and (4) security.
Data minimization refers to collecting, using, and storing the minimal data necessary to perform a task. The less data about data subjects that gets exchanged in the first place, the lower the chances of that data being used for privacy invasion.
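One simple way to picture data minimization is a filter applied before data is stored or forwarded. In the sketch below, the record, its field names, and the set of fields deemed necessary for the task are all hypothetical; deciding what is actually necessary is a judgment specific to each protocol.

```python
# Hypothetical set of fields the task needs; an assumption for illustration.
REQUIRED_FIELDS = {"resource", "timestamp"}


def minimize(record, required=REQUIRED_FIELDS):
    """Keep only the fields the task needs before storing or forwarding."""
    return {k: v for k, v in record.items() if k in required}


record = {
    "resource": "/weather/today",
    "timestamp": "2012-03-12T10:00",
    "device_id": "device-1234",
    "email": "alice@example.com",
}
stored = minimize(record)
print(stored)
```

Fields that are never stored or exchanged, such as the device identifier and email address here, cannot later be compromised, correlated, or put to secondary use.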
Data minimization comprises a number of mutually exclusive sub-goals:
As explained in Section 3.2.2.5, data collection and use that happens "in secret," without the data subject's knowledge, is apt to violate the subject's expectation of privacy and may create incentives for misuse of data. As a result, privacy regimes tend to include provisions to support informing data subjects about data collection and use and involving them in decisions about the treatment of their data. In an engineering context, supporting the goal of user participation usually means providing ways for users to control the data that is shared about them.
An entity that collects, uses, or stores data can undergird its commitments to the other privacy goals by providing mechanisms by which data subjects and third parties can hold the entity accountable for those commitments. These mechanisms usually allow for verification of what data is collected or stored and with whom it is shared, again helping to mitigate the threat of exclusion.
Keeping data secure at rest and in transit is another important component of privacy protection. As they are described in [RFC3552] Section 2, a number of security goals also serve to enhance privacy:
This section provides guidance for document authors in the form of a questionnaire about a protocol being designed. The questionnaire may be useful at any point in the design process, particularly after document authors have developed a high-level protocol model as described in [RFC4101].
Note that the guidance does not recommend specific practices. The range of protocols developed in the IETF is too broad to make recommendations about particular uses of data or how privacy might be balanced against other design goals. However, by carefully considering the answers to each question, document authors should be able to produce a comprehensive analysis that can serve as the basis for discussion of whether the protocol adequately protects against privacy threats.
The framework is divided into four sections that address each of the goals from Section 4, plus a general section. Security is not fully elaborated since substantial guidance already exists in [RFC3552].
[To be provided in a future version once the guidance is settled.]
This document describes privacy aspects that protocol designers should consider in addition to regular security analysis.
This document does not require actions by IANA.
We would like to thank the participants for the feedback they provided during the December 2010 Internet Privacy workshop co-organized by MIT, ISOC, W3C and the IAB.
[I-D.iab-privacy-terminology] Hansen, M., Tschofenig, H., and R. Smith, "Privacy Terminology", Internet-Draft draft-iab-privacy-terminology-00, January 2012.