Network Working Group                                       M. Kucherawy
Internet-Draft                                              July 5, 2020
Intended status: Experimental
Expires: January 6, 2021
Recognized Transformations of Messages Bearing DomainKeys Identified Mail (DKIM) Signatures
draft-kucherawy-dkim-transform-01
DomainKeys Identified Mail (DKIM) introduced a mechanism whereby a mail operator can affix a signature to a message that validates at the level of the signer's domain name. It specified two possible ways of converting the message body to a canonical form, one intolerant of changes and the other tolerant of simple changes to whitespace within the message body.
The provided canonicalization schemes do not tolerate changes in a message such as conversion between transfer encodings or addition of new message content. It is useful to have these capabilities to allow for transport through gateways, and also for transport through handlers (such as mailing list services) that might add content that would invalidate a signature generated using the existing canonicalization schemes.
This document presents a mechanism for declaring that a message underwent one of a handful of well-defined transformations prior to being re-signed by a mediator, so that a verifier might rewind such a modification and thereby confirm that the original signature still verifies against the original content.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 6, 2021.
Copyright (c) 2020 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
DomainKeys Identified Mail (DKIM) [RFC6376] defines a mechanism whereby a verified domain name can be attached to a message, or portion of a message, using a cryptographic signature. It presents two possible schemes for converting the header block to a canonical form, and similarly two schemes for canonicalizing the body. In each case, one scheme permits no changes whatsoever, and the other permits limited changes restricted to areas such as whitespace munging, case changing, and header field wrapping.
Some agents deliberately, but innocently, modify content in transit. A prime example of this is mailing lists, which might add a prefix to the Subject field of a message, add list-specific information to the header (in the form of new header fields), or append administrivia to the body of messages before they are re-mailed to the list subscribers. Use of mailing lists with respect to DKIM, and a discussion of related challenges, can be found in [RFC6377]. The urgency to solve this family of problems increased dramatically with the large-scale introduction of Domain-based Message Authentication, Reporting, and Conformance (DMARC) [RFC7489].
There is a desire to have DKIM signatures survive transit through lists. One way to do this is to make use of DKIM's "l=" tag which limits the portion of the body that is signed. This exposes an attack vector, however, since one can simply append any content to a partly-signed message and the signature will continue to verify. (See Section 8.2 of [RFC6376].)
This document defines an incremental mechanism to declare that a signature is being applied to message content after some number of a small set of well-defined, reversible content transformations. The message verifier can then reverse the effect of the claimed transformation and, theoretically, recover the original content and confirm its integrity relative to an original signature.
The utility of this mechanism is predicated on the notion that agents that modify signed messages will do so using only the known (registered) transformations, and that common transformations will be registered as they are developed.
Numerous terms used here, especially "Author" and "Mediator", are defined in [RFC5598].
For the purposes of this experiment, a transformation is "reversible" if at the time the message is received, the verifier has enough information to recover the pre-transformation content. For example, a transformation that removes a MIME part with an undesired media type or filename extension cannot be undone by the receiver because it cannot restore content it doesn't have; such a transformation is not reversible and thus not a candidate for consideration here. However, a transformation that adds a specific header field to a message is reversible because the verifier can simply remove the header field.
This section defines the 'tf' DKIM signature tag.
The presence of this tag is an indication to a verifier that the agent adding this signature transformed the original message between receipt (and verification of any previously-applied signature) and retransmission, and that such transformation was one of a set that are common, well-defined, and reversible.
The value of this tag is a comma-separated list of one or more of the transformations registered in the DKIM Message Transformations registry. See Section 12.
Using ABNF, as defined in [RFC5234]:
   sig-tf-tag       = %x74.66 [FWS] "=" [FWS] sig-tf-tag-trans
   sig-tf-tag-trans = Token *("," Token)
                      ; expected to be a list of one or more
                      ; transformation names found in the DKIM
                      ; Message Transformations registry
"Token" is imported from [RFC2045], and "FWS" is imported from [RFC6376].
A verifier finding a signature with the "tf" tag present but bearing a value it does not recognize ignores its presence (other than including it in hash computation).
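The handling of the tag value described above can be sketched in Python as follows. This is a minimal illustration, not part of the specification. The function names and the KNOWN_TRANSFORMS set are hypothetical; the set lists only the transformation names that appear in quotation marks in this document, where a real verifier would consult the registry. Two further assumptions are made: whitespace around commas is tolerated, and names are compared case-sensitively.

```python
import re

# RFC 2045 "token": printable US-ASCII except SPACE, CTLs, and tspecials.
TOKEN_RE = re.compile(r"[!#$%&'*+\-.0-9A-Z^_`a-z{|}~]+")

# Hypothetical stand-in for the registry contents.
KNOWN_TRANSFORMS = {"footer", "mimeify", "add-part", "mime-wrap"}


def parse_tf(value):
    """Split a tf= value into names; return None if it is malformed."""
    names = [name.strip() for name in value.split(",")]
    if not all(TOKEN_RE.fullmatch(name) for name in names):
        return None
    return names


def usable_transforms(value):
    """Return the list of names, or None if the tag should be ignored."""
    names = parse_tf(value)
    if names is None:
        return None
    if any(name not in KNOWN_TRANSFORMS for name in names):
        # Unrecognized value: the verifier ignores the tag for the
        # purpose of reversal (it still takes part in the hash).
        return None
    return names
```

A caller that receives None would proceed as an ordinary DKIM verifier, without attempting any reversal.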
In all cases, DKIM operations involving this tag begin with a message author generating content and submitting it to the appropriate Message Submission Agent (MSA). The MSA is presumed to have some kind of DKIM signature generation capability, and thus the message will have an author domain signature attached to it.
When a message arrives at a Mediator or other intermediary that wishes to distribute an altered form of the author's content, such as a Mailing List Manager (MLM) configured to do so, it generates an additional DKIM signature with the new form of the content as input. This second signature includes the "tf" tag, announcing which known transformation(s) were applied to the message prior to creation of the Mediator's signature. Importantly, the original signature is not removed from the message nor is it altered in any way.
Since DKIM-compliant verifiers ignore signature tags of which they are not aware, this is a purely incremental change as it will not interfere with the deployed DKIM infrastructure.
A DKIM verifier aware of this tag will first confirm that the Mediator's signature is valid. On doing so, it can then apply the reverse of the claimed transformation. This will restore the message to the form and content originally submitted by the Author, and the Author's signature will then be valid over the restored content.
This might be used to confirm that a message which passed through a Mediator can still be considered to have a valid Author signature, satisfying policy checks such as those described in [RFC7489].
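The verifier-side sequence described above (verify the Mediator's signature, reverse the claimed transformations, then verify the Author's signature) might be sketched as follows. Everything named here is hypothetical: "verify" stands for whatever DKIM verification routine the implementation already has, signatures are modeled as plain dictionaries, and "reversers" maps transformation names to functions that raise ValueError when reversal fails. Undoing multiple transformations in the reverse of their listed order is an assumption; this document does not state an order.

```python
def author_signature_survives(message, author_sig, mediator_sig,
                              verify, reversers):
    """Return True if the Author's signature verifies after reversal.

    verify(message, sig) -> bool  : hypothetical DKIM verification callable
    reversers[name](message)      : hypothetical reversal functions
    """
    # Step 1: the Mediator's signature must be valid over the message
    # exactly as received.
    if not verify(message, mediator_sig):
        return False

    # Step 2: undo each claimed transformation (assumed order: last
    # listed is undone first).
    restored = message
    for name in reversed(mediator_sig.get("tf", [])):
        reverse = reversers.get(name)
        if reverse is None:
            return False  # unknown transformation; nothing to recover
        try:
            restored = reverse(restored)
        except ValueError:
            return False  # reversal could not be computed

    # Step 3: the Author's signature must verify over the restored content.
    return verify(restored, author_sig)
```

A False result leaves the verifier where it would have been without this mechanism: the Author's signature is simply treated as not verified.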
Mailing list services commonly apply a "tag" to the Subject field of a message identifying the message as having been distributed as part of a list. By far the most common tag method is to prefix the Subject field with the name of the list in square brackets (ASCII 0x5b and 0x5d), possibly followed by a space and a sequence number. Accordingly, this transformation describes exactly such a mutation. Specifically, the mutation is the addition of a string to the beginning of the Subject field consisting of alphanumeric characters and a limited set of punctuation (possibly including spaces), surrounded by square brackets and followed by whitespace. In ABNF terms, the string is described by:
   s-punct = %x45 / %x5f / %x2f / %x20 / %x2e
   s-tag   = %x5b 1*( ALPHA / DIGIT / s-punct ) %x5d 1*FWS
Thus, the reverse operation is simply the removal of any such substring at the front of the Subject field.
If there is no Subject field prefix matching the above ABNF, then the transformation reversal cannot be computed and an error is returned.
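A minimal sketch of this reversal in Python follows. The function name is hypothetical. The character class follows the ABNF for s-punct exactly as written (note that hex 45 is the letter "E", which ALPHA already covers, so the class adds only underscore, slash, space, and period). FWS is simplified here to one or more spaces or tabs, and the function is assumed to receive the Subject value with the field name and any leading whitespace already removed.

```python
import re

# "[" 1*( ALPHA / DIGIT / s-punct ) "]" followed by whitespace.
S_TAG_RE = re.compile(r"\[[A-Za-z0-9_/ .]+\][ \t]+")


def strip_subject_tag(subject):
    """Remove one list tag from the front of a Subject value."""
    match = S_TAG_RE.match(subject)
    if match is None:
        # No matching prefix: the reversal cannot be computed.
        raise ValueError("no list tag at the start of the Subject field")
    return subject[match.end():]
```

The ValueError corresponds to the error case described above; a caller would treat it as a failed reversal rather than a malformed message.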
Mailing lists sometimes add a "footer" to a message, typically consisting of a small number of lines of text identifying the name of the list and some other administrivia, and usually including a URL where subscriptions can be managed or list archives can be found. Such trivial text edits are reversible, so these too are a candidate for this mechanism.
A "footer" for the purposes of this capability is all text below a trivial boundary marker. A boundary comprises a line of text made up solely of two or more hyphen or underscore (0x2d or 0x5f) characters. Therefore, reversing this transformation is accomplished by searching backwards, a line at a time, from the end of the message, until such a line is found. When found, the message is truncated such that the line and all lines after it are removed.
If no such line is found, then the transformation reversal cannot be computed and an error is returned.
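The backward search described above can be sketched as follows. The function name is hypothetical, and two points are assumptions: a boundary line may mix hyphens and underscores, and a line with any other character (including trailing whitespace) is not a boundary.

```python
import re

# A line made up solely of two or more "-" or "_" characters.
BOUNDARY_RE = re.compile(r"[-_]{2,}")


def strip_footer(body):
    """Truncate the body at the last boundary line, removing that line too."""
    lines = body.splitlines(keepends=True)
    # Search backwards, one line at a time, from the end of the message.
    for index in range(len(lines) - 1, -1, -1):
        if BOUNDARY_RE.fullmatch(lines[index].rstrip("\r\n")):
            return "".join(lines[:index])
    # No boundary line: the reversal cannot be computed.
    raise ValueError("no footer boundary found")
```

Because the search stops at the first boundary found from the end, a footer that itself contains a boundary line would be only partly removed, and the Author's signature would then fail to verify as it would have without this mechanism.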
The "mimeify" transformation takes a message that is not formatted according to Multipurpose Internet Mail Extensions (MIME) [RFC2045] and converts it to that form. This allows a Mediator to place the original content in one MIME part, and its own additional content in a second MIME part. The reverse transformation is to remove the second MIME part altogether, and then strip away all MIME structure, leaving only the original author content.
More specifically, the transformation follows these steps:
The reverse of this transformation is as follows:
If any step cannot be completed because the stated header field or content cannot be located, an error is returned.
The "add-part" transformation augments a multipart message that is already formatted according to MIME by appending an additional part that includes the content the Mediator wishes to add.
This transformation cannot be used unless the media type of the message as a whole (the one named in the Content-Type field in the header of the message itself) is "multipart/mixed". Simply put, a new part within the existing set of parts is added at the end, containing the Mediator's content.
More specifically, the transformation follows these steps:
The reverse of this transformation is as follows:
If any step cannot be completed because the stated MIME part cannot be located, an error is returned.
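As a rough illustration only, the reversal of "add-part" could look like the following Python sketch, which assumes the reversal amounts to removing the final part of a top-level "multipart/mixed" message, as the description above suggests. The function name is hypothetical. It operates on a parsed message using the standard library "email" package; re-serializing a parsed message is not guaranteed to be byte-identical to the original, so an actual verifier would most likely need to work on the raw message bytes instead.

```python
from email import message_from_string


def remove_last_part(raw_message):
    """Parse a message and drop the final part of its multipart/mixed body."""
    msg = message_from_string(raw_message)
    if msg.get_content_type() != "multipart/mixed" or not msg.is_multipart():
        raise ValueError("message is not multipart/mixed")
    parts = msg.get_payload()
    if len(parts) < 2:
        # Assumption: at least one original part must remain.
        raise ValueError("no added MIME part to remove")
    msg.set_payload(parts[:-1])
    return msg
```

The ValueError cases stand in for the error condition described above, where a required MIME part cannot be located.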
The "mime-wrap" transformation augments a message that is already formatted according to MIME by enclosing the existing MIME structure in a new layer. This new layer contains two parts: the original MIME structure in its first part, and the Mediator content in its second part.
More specifically, the transformation follows these steps:
This leaves the new message as a MIME message with two parts at the outermost layer; the original message appears as the first part, and the Mediator's content is the second part.
The reverse of this transformation is as follows:
If any step cannot be completed because the stated MIME part cannot be located, an error is returned.
Section 3.5 of [RFC6376] defined an optional DKIM signature tag ("z=") that can be used to reconstruct the header field set that was signed by the author. When a signature fails to verify, this information could conceivably be used to replay the correct (original) header fields through canonicalization and possibly yield a passing result.
Doing this augmented replay blindly would allow a signature to pass when it failed because some alteration correctly rendered the original content invalid or even dangerous. This is manifestly not an error. Identifying which mutations of the original content ought to be permissible necessarily relies on heuristics and possibly local knowledge. However, a mutation universally considered to be tolerable should become part of the canonicalization process rather than being identified and handled in this manner. Moreover, if two implementations apply different heuristics, the result of verification is no longer deterministic. As a result, [RFC6376] asserts that use of the "z=" content, if present, can only be used for diagnostic purposes.
In contrast, the proposal here enumerates a handful of specific mutations known to be safe, and in common use, that are also reversible, which means the Author's original content can be unambiguously recovered and subjected to the usual signature verification process even though the message has been legitimately modified by a Mediator.
It does not take much imagination to conceive of a legitimate message using the capability described here that fails some part of the process. For example, the "footer" transformation does not account for a footer block that itself contains a boundary marker, and so reversing that transformation as described would produce a wrong result. This is harmless, however, as the verifier then is no worse off than it was before in that it still doesn't have the original content, and thus operates as if none of this proposal was applied (i.e., the original signature still fails). The proposal is only incremental to what DKIM can provide when it actually does work to recover original content.
It is expected that the definitions of the known transformations will evolve over time as we gain community experience with what works.
Section 8 of [RFC6376] discusses numerous security considerations relevant to DKIM. Of particular interest here is Section 8.2, which discusses concerns regarding signatures that still verify in the presence of added message content.
Conceivably, some of these transformations or those registered in the future could be computationally expensive or require non-trivial ephemeral resource allocation (e.g., storage), especially for large or complex messages. As a denial-of-service attack, an attacker could send signatures claiming some or all of the known transformations on a message, which a participating verifier would then attempt to reverse in an effort to recover the original content.
There is little reason to believe that any given transformation might be applied more than once, or that certain combinations have any practical application (e.g., "footer" is unlikely to be useful when combined with any of the MIME transformations). This experimental document does not explicitly proscribe these, but implementers may choose to detect such strange requests and disregard them.
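An implementer choosing to detect such requests might apply a check like the one sketched below before attempting any reversal. The function name and the MIME_TRANSFORMS grouping are hypothetical, and the policy itself (reject repeats, reject "footer" combined with a MIME transformation) is only one possible reading of the paragraph above, not a requirement of this document.

```python
# Hypothetical grouping of the MIME-related transformations named here.
MIME_TRANSFORMS = {"mimeify", "add-part", "mime-wrap"}


def is_plausible(names):
    """Return False for transformation lists a verifier may wish to disregard."""
    # The same transformation claimed more than once.
    if len(set(names)) != len(names):
        return False
    # "footer" combined with any of the MIME transformations.
    if "footer" in names and MIME_TRANSFORMS.intersection(names):
        return False
    return True
```

Rejecting a list here also bounds the work a verifier does per message, which is relevant to the resource-exhaustion concern raised above.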
IANA is requested to create a new registry in the DomainKeys Identified Mail (DKIM) Parameters group called the "DKIM Transformations Registry". This registry will enumerate known reversible content transformations that might be made by Mediators to messages bearing DKIM signatures.
Entries in this registry include all of the following:
An entry may be added or updated in this registry only when it meets the requirements of the "Specification Required" rules found in [RFC5226]. The Designated Expert will confirm that the referenced specification is clear and complete, and that the transformation and its inverse are not ambiguous.
The initial entries in this registry are as follows, all with status "active":
[RFC2045]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, DOI 10.17487/RFC2045, November 1996.
[RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", RFC 5226, DOI 10.17487/RFC5226, May 2008.
[RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, DOI 10.17487/RFC5234, January 2008.
[RFC6376]  Crocker, D., Hansen, T. and M. Kucherawy, "DomainKeys Identified Mail (DKIM) Signatures", STD 76, RFC 6376, DOI 10.17487/RFC6376, September 2011.
[RFC5598]  Crocker, D., "Internet Mail Architecture", RFC 5598, DOI 10.17487/RFC5598, July 2009.
[RFC6377]  Kucherawy, M., "DomainKeys Identified Mail (DKIM) and Mailing Lists", BCP 167, RFC 6377, DOI 10.17487/RFC6377, September 2011.
[RFC7489]  Kucherawy, M. and E. Zwicky, "Domain-based Message Authentication, Reporting, and Conformance (DMARC)", RFC 7489, DOI 10.17487/RFC7489, March 2015.
TODO: Show at least one example.
The original team developing this concept included Michael Adkins and Wez Furlong.
The author wishes to acknowledge (names) for their comments during the development of this document.