Network Working Group | R. Van Rein |
Internet-Draft | ARPA2.net |
Intended status: Standards Track | September 25, 2017 |
Expires: March 29, 2018 |
Lenient DKIM
draft-vanrein-dkim-lenient-00
DKIM is a framework for signed messages, especially for email. While in transit, changes are sometimes made, and these break the DKIM-Signature. This specification makes DKIM more lenient, without changes to its core. It adds leniency towards MIME body rewrites, removal of alternatives and annotation with bits of text. The intention is to allow these changes such that they can be clearly shown to the user, while indicating that the remainder of the signed message is still intact.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on March 29, 2018.
Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
DKIM standardises a header that signs content and headers of (email) messages. This is a practical system to validate that a message, and its origin, have not changed; unfortunately, however, it breaks on a few patterns of communication that are in common use [RFCnnnn]. As a result, it is difficult to assign the full weight of evidence that DKIM could otherwise provide.
DKIM signs content based on their wire format, which may include a transfer-encoding when the MIME extensions are used. MTAs may have to alter this transfer-encoding to accommodate a downgrade of communication capabilities, and are therefore permitted to do so, but this is not compatible with current DKIM. This specification introduces a new canonicalisation algorithm to remedy this problem, based on fixating the binary content of a MIME body part.
Another common practice is the introduction of extra text in the Subject header of an email, or in the body or body parts. As a general principle, it need not be problematic to introduce extra data to an email, as long as this is clearly shown in the MUA. Given that email is usually rendered in a manner that has information fields in clearly distinct graphical framing, it should be possible for a MUA to unambiguously mark any additions made. This specification introduces a manner of locating the parts of a message that were not modified since it was signed, and leaves it to the MUA to decide on a method of distinct rendering for original and modified parts. The same mechanisms may be useful for MTAs for distinguishing common patterns from abusive ones. When unsupportive or incapable, a MUA or MTA is always free to not incorporate DKIM's confirmations of message content.
A more debatable practice is the removal of complete MIME attachments. This is sometimes done for content in an undesirable format, usually for operational considerations. Such removals can be clearly marked, and it is up to the MUA whether to accept the removal. When one part of an originally sent multipart/alternative is removed, it may not be as bad as when the composition is altered, as would be the case with other multipart types, such as multipart/form-data.
The general handling of a MIME multipart/* message body involves a number of independently encoded body parts. When this Content-Type is used, a separate DKIM-Signature will be made on each of the body parts, and a dedicated canonicalisation method will be used to compose the parts. This will list the hashes of the various body parts, thus enabling the detection of a removed part.
When changes are made to parts of the message, be it to headers or body, then a DKIM-Signature is invalidated. Although this is a clear sign of modification, it is not always a sign of a rogue change. Ideally, the changed content would be presented to the user, but in such a manner that original content can be easily distinguished from changed content. When the changed message is subsequently signed by an intermediate mailer, it is also clear where any concerns about the change should be redirected, which is helpful for reputation management.
The DKIM-Signed-Content header can be inserted after a DKIM-Signature, possibly at the origin or at a later stage of the message, to mark parts of the message that are incorporated into the signature as a whole header, or as a whole body. It may in turn be incorporated into a DKIM-Signature of a modified-forwarding party such as an email list, where the new DKIM-Signature implies taking responsibility over the changes made locally. But even when this is not done, the DKIM-Signed-Content header allows a receiving MTA, MUA or user to learn standard patterns of change by such intermediates, and hopefully trigger alarms after deviations from a pattern.
To detect extra text, the DKIM-Signed-Content header describes the range that was originally signed, describing the text in canonical representation through its size, a rolling checksum and a secure hash. The size and rolling checksum can be used to quickly pass through a text in search of the original text; the secure hash is then used to validate that it is indeed a good match.
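The search described above can be sketched as follows. This is only an illustration under stated assumptions: the rolling checksum has not been fixed by this specification, so a simple polynomial hash stands in for it, SHA-256 stands in for the secure hash, and the names poly_hash, describe and find_signed_range are hypothetical.

```python
import hashlib

# Assumptions for this sketch only: a polynomial rolling hash and SHA-256.
# The draft leaves the actual rolling hash open.
BASE = 257
MOD = (1 << 61) - 1


def poly_hash(data: bytes) -> int:
    """Polynomial hash of data, most significant byte first."""
    h = 0
    for byte in data:
        h = (h * BASE + byte) % MOD
    return h


def describe(data: bytes):
    """Return (size, rolling checksum, secure hash) for canonical text."""
    return len(data), poly_hash(data), hashlib.sha256(data).digest()


def find_signed_range(text: bytes, size: int, rolling: int, digest: bytes) -> int:
    """Return the offset of the originally signed range in text, or -1."""
    if not 0 < size <= len(text):
        return -1
    top = pow(BASE, size - 1, MOD)  # weight of the byte that leaves the window
    h = poly_hash(text[:size])
    for start in range(len(text) - size + 1):
        # Cheap rolling comparison first; the secure hash confirms a candidate.
        if h == rolling and hashlib.sha256(text[start:start + size]).digest() == digest:
            return start
        if start + size < len(text):
            # Slide the window one byte to the right.
            h = ((h - text[start] * top) * BASE + text[start + size]) % MOD
    return -1


# A list manager prefixed the Subject; the signed text is still found inside.
size, rolling, digest = describe(b"Quarterly report")
offset = find_signed_range(b"[list] Quarterly report", size, rolling, digest)
```

Bytes before the returned offset and after offset + size are the additions that a MUA could render distinctly from the signed text.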
Originators could insert DKIM-Signed-Content headers before or after inserting the DKIM-Signature header to allow non-compliant forwarders to be easily recognised. Forwarders who add their own DKIM-Signature on modified content are advised to retain the existing ones, validate or remove any DKIM-Signed-Content headers inserted after the last DKIM-Signature, add any missing DKIM-Signed-Content headers over what will be modified, then make the desired modifications and then insert the new DKIM-Signature header. This procedure ensures that the forwarder only takes responsibility over their own work, which helps in building their reputation.
DKIM-Signature and DKIM-Signed-Content headers MUST be inserted on top of the message, and MUST NOT be reordered in transit. This is a common and reasonable requirement for current messaging systems.
The DKIM-Signed-Content header works for complete headers or for complete bodies, and will fail to work when content is being removed. This is to protect users from information being held back; the one exception is in the handling of multipart/alternative, defined below.
The DKIM-Signed-Content header consists of one or more sets of the following four components, with whitespace to separate components: (1) the path to the content; (2) the decimal notation for the length in bytes; (3) the hexadecimal notation for a rolling hash; (4) the secure hash of the content.
The path to the content is either a header name or the special word "Body" to denote the message body. When multiple instances exist, an optional extension is permitted, namely a colon and a decimal sequence number, counting from 0. Headers count in the order of insertion, so in the opposite order of their occurrence in the message. Bodies count in the order in which they occur in the message, but this is only meaningful for multipart MIME-types. TODO:ALT:0 is the last value inserted before the DKIM-Signature; no going back. TODO:EXT:Allow header:index:header:index:...
The values that follow all apply to the canonical form of the information, be it a header or body, as specified in the "c" parameter of the DKIM-Signature that was last inserted into the message.
The rolling hash is TODO:AS_IN_RSYNC? cyclic polynomials looks best. Implementation note: either have the full message in memory or have two files open on it for reading.
The secure hash matches the representation and algorithm of the "bh" field in the related DKIM-Signature.
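A parser for the four-component sets described above might look like the sketch below. It assumes that the secure hash is a base64 string without embedded whitespace and that the sets simply follow one another, separated by whitespace; the names SignedContent and parse_signed_content are hypothetical, and the sample values are made up for illustration.

```python
from typing import List, NamedTuple, Optional


class SignedContent(NamedTuple):
    name: str              # header name, or the special word "Body"
    index: Optional[int]   # sequence number after the colon, if present
    length: int            # length in bytes of the canonical form
    rolling: int           # rolling hash, parsed from hexadecimal
    secure_hash: str       # same representation as the "bh" field


def parse_signed_content(value: str) -> List[SignedContent]:
    """Split a DKIM-Signed-Content value into its sets of four components."""
    tokens = value.split()
    if not tokens or len(tokens) % 4 != 0:
        raise ValueError("expected one or more sets of four components")
    entries = []
    for i in range(0, len(tokens), 4):
        path, length, rolling, secure_hash = tokens[i:i + 4]
        name, colon, number = path.partition(":")
        index = int(number) if colon else None
        entries.append(
            SignedContent(name, index, int(length), int(rolling, 16), secure_hash)
        )
    return entries


# Illustrative values only; they are not real checksums or hashes.
entries = parse_signed_content("Subject 16 1a2b3c4d AAAA= Body:1 120 00ff BBBB=")
```

The index is left as None when no sequence number is given, so that the caller decides how to treat an unnumbered path.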
When a body with MIME content is signed, then more headers should be considered for inclusion in the DKIM-Signature; notably, MIME-Version, Content-Type and Content-Disposition SHOULD be included; on the other hand, Content-Transfer-Encoding SHOULD NOT be included when the intention is to support changes to the transfer encoding. Listing an absent header in the DKIM-Signature header list prevents that header from being added later. TODO:CHECK
Special processing is required for the Content-Type header; it includes a field named boundary, whose contents MUST be set to an empty string, included in double quotes, before computing the header checksum. This assures that changes to the boundary tag are permitted while the message moves between MTAs. TODO:ALT: Move this parameter to the end of the header and take it out from the DKIM-Signed-Content as well as the DKIM-Signature; this requires MTAs and MUAs to recognise this missing information as correctly absent.
All body parts of a multipart MIME message SHOULD have a DKIM-Signature inserted. In this case, the Content-Type, Content-Disposition and Content-ID headers SHOULD be included, even when they are not presented as headers, in which case they could not be added later. Note that the remarks made above for the boundary tag in a Content-Type header may apply recursively.
When signing a multipart message, a DKIM-Signed-Content header SHOULD include the DKIM-Signature headers of the directly subordinate body parts. Indirect subordinates are included recursively by these nested DKIM-Signature headers.
This practice enables the detection of removed as well as added body parts. A common example is the removal of a text/html form from multipart/alternative, so as to leave the text/plain form. This may be tolerated on account of the semantics of multipart/alternative messages.
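The comparison can be sketched as below, assuming that the verifier has already collected the body hashes that the parent signed for its direct body parts, and has recomputed the hashes of the parts that are actually present. The function names and the tolerance policy are hypothetical; whether a removal is acceptable remains a local decision.

```python
from typing import List, Tuple


def diff_parts(signed_hashes: List[str], present_hashes: List[str]) -> Tuple[List[str], List[str]]:
    """Return (removed, added) body part hashes, keeping the original order."""
    signed = set(signed_hashes)
    present = set(present_hashes)
    removed = [h for h in signed_hashes if h not in present]
    added = [h for h in present_hashes if h not in signed]
    return removed, added


def removal_may_be_tolerated(content_type: str, signed_hashes: List[str],
                             present_hashes: List[str]) -> bool:
    """Hypothetical policy: only multipart/alternative may lose alternatives,
    nothing may be added, and at least one signed alternative must remain."""
    removed, added = diff_parts(signed_hashes, present_hashes)
    if added or not removed:
        return False
    return (content_type.strip().lower() == "multipart/alternative"
            and len(removed) < len(signed_hashes))


# The text/html alternative was dropped, the text/plain alternative remains.
removed, added = diff_parts(["hash-plain", "hash-html"], ["hash-plain"])
```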
MIME messages require special handling. The descriptions below define how headers and bodies are canonicalised for MIME handling. When a message has a MIME-Version header, the signer MAY use c=mime/mime in its DKIM-Signature headers.
This is an extension to the version of DKIM that is currently in use, but it is not a change to the header format, merely a matter of new canonicalisation rules and handling for contained body parts. This means that no new version number is defined for DKIM. The changes may however lead to interpretation problems in older software. To accommodate such software, DKIM signers can add not only a DKIM-Signature with c=mime/mime, but also signatures with other canonicalisation mechanisms. It is advised to plan an end date for any such policy.
Header canonicalisation for MIME headers is a modified form of the relaxed algorithm. Its added function is to change any boundary tag in a Content-Type header to an empty string, denoted as two double quotes.
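The added step can be sketched as below. The sketch covers only the boundary rewrite, not the rest of the relaxed algorithm, and it assumes a boundary parameter that is either a quoted string without escaped quotes or an unquoted token; the name blank_boundary is hypothetical.

```python
import re

# Matches "; boundary=" followed by a quoted string or an unquoted token.
_BOUNDARY = re.compile(r'(;\s*boundary\s*=\s*)(?:"[^"]*"|[^;\s]+)', re.IGNORECASE)


def blank_boundary(content_type_value: str) -> str:
    """Replace the boundary parameter value with an empty quoted string."""
    return _BOUNDARY.sub(r'\1""', content_type_value)


# Two renderings that differ only in their boundary canonicalise identically.
before = blank_boundary('multipart/mixed; boundary="frontier-1"')
after = blank_boundary('multipart/mixed; boundary="rewritten-by-mta"')
```

Values without a boundary parameter pass through unchanged, so the same function can be applied to the Content-Type header of any body part.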
When an MTA carries a binary message over a degraded link, it may need to change the encoding of the message body. The simple and relaxed canonicalisation algorithms both relate to the wire format of the message and will break on such valid changes of form. For MIME body parts however, this wire format is representative of an exactly reproduced binary format, which can serve as a canonical format for these body parts.
The body canonicalisation algorithm "mime" reduces the wire format of a MIME message body to the binary content that it represents, before the hashing and signing algorithms are applied. This should make the signature neutral to changes to the transfer encoding. Note that the Content-Transfer-Encoding header MUST NOT be included in the header list when the "mime" algorithm is used.
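The effect of this reduction can be sketched as below: the transfer encoding is undone before hashing, so the same binary content yields the same body hash whatever encoding an MTA chose. SHA-256 and the base64 rendering of the hash are assumptions borrowed from common DKIM practice, and the function names are hypothetical.

```python
import base64
import hashlib
import quopri


def mime_canonical_body(wire_body: bytes, transfer_encoding: str) -> bytes:
    """Reduce the wire format of a body part to the binary content it represents."""
    encoding = transfer_encoding.strip().lower()
    if encoding == "base64":
        return base64.b64decode(wire_body)
    if encoding == "quoted-printable":
        return quopri.decodestring(wire_body)
    if encoding in ("", "7bit", "8bit", "binary"):
        return wire_body
    raise ValueError("unsupported transfer encoding: " + transfer_encoding)


def body_hash(wire_body: bytes, transfer_encoding: str) -> str:
    """Hash the canonical binary content, rendered in base64."""
    content = mime_canonical_body(wire_body, transfer_encoding)
    return base64.b64encode(hashlib.sha256(content).digest()).decode("ascii")


# The same content, carried in three different transfer encodings.
payload = "Grüße aus Wien".encode("utf-8")
hashes = {
    body_hash(payload, "8bit"),
    body_hash(base64.encodebytes(payload), "base64"),
    body_hash(quopri.encodestring(payload), "quoted-printable"),
}
```

After this runs, hashes holds a single value, which is what allows an MTA to change the transfer encoding without invalidating the signature.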
The "mime" algorithm MUST NOT be used on multipart messages. For those, a zero-byte body is used with headers including a DKIM-Signed-Content header that references the various body parts individually.
Some MIME body parts include a URI reference, and it is possible for an MDA or MUA to replace a body part with such a reference to an external store. In such cases, a DKIM-Signature on the body part is bound to fail; it would however serve to validate the downloaded content together with the descriptive headers that are signed along with it. This means that components of the mail system that replace a body part by a URI reference should try to retain any DKIM-Signature header. Especially when a bh field is included, it would be easy to verify a downloaded body part.
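Checking a downloaded body part against a retained bh field could look like the sketch below. It assumes that the downloaded bytes are already the binary content of the body part, that the hash is SHA-256, and that the DKIM-Signature value is a list of tag=value pairs separated by semicolons; the function names are hypothetical.

```python
import base64
import hashlib
import hmac


def extract_bh(dkim_signature_value: str) -> str:
    """Return the bh tag of a DKIM-Signature value, with folding whitespace removed."""
    for tag in dkim_signature_value.split(";"):
        name, equals, value = tag.partition("=")
        if equals and name.strip() == "bh":
            return "".join(value.split())
    raise ValueError("no bh tag found")


def downloaded_part_matches(content: bytes, dkim_signature_value: str) -> bool:
    """Compare the hash of downloaded content with the retained bh value."""
    expected = extract_bh(dkim_signature_value)
    actual = base64.b64encode(hashlib.sha256(content).digest()).decode("ascii")
    return hmac.compare_digest(actual, expected)


# Build a retained signature value for some content, then check a download.
content = b"attachment bytes fetched from the external store"
bh = base64.b64encode(hashlib.sha256(content).digest()).decode("ascii")
retained = "v=1; a=rsa-sha256; c=mime/mime; d=example.org; bh=" + bh + "; b="
matches = downloaded_part_matches(content, retained)
```

A matching bh only shows that the content is the one that was hashed; the signature in the b field still has to be verified before the hash itself can be trusted.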
TODO:REGISTER:HEADER:DKIM-SIGNED-CONTENT
TODO:REGISTER:DKIMCANONALG:CONTENT
TODO: Parts before/after MIME body parts
TODO: Modified MIME separators that also appear in body parts.