Network Working Group M. S. Kucherawy, Ed.
Internet-Draft 28 December 2022
Intended status: Experimental
Expires: 1 July 2023
Replay-Resistant DomainKeys Identified Mail (DKIM) Signatures
draft-kucherawy-dkim-anti-replay-03
Abstract
DomainKeys Identified Mail (DKIM) provides a digital signature
mechanism for Internet messages, allowing a domain name owner to
affix its domain name in a way that can be cryptographically
validated.
DKIM signatures protect the integrity of the message header and body
only. By design, DKIM is decoupled from the transport and storage
mechanisms used to handle messages. This gives rise to a possible
replay attack, which the original DKIM specification acknowledged but
for which it provided no mitigation strategy. This document presents
an optional method for binding a signature to a specific recipient or
set of recipients so that broader replay attacks can be mitigated.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 1 July 2023.
Copyright Notice
Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved.
Kucherawy Expires 1 July 2023 [Page 1]
Internet-Draft DKIM Anti-Replay Canonicalization December 2022
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.1. Recommended Reading . . . . . . . . . . . . . . . . . . . 3
2.2. Requirements Language . . . . . . . . . . . . . . . . . . 3
3. The 'e' Tag . . . . . . . . . . . . . . . . . . . . . . . . . 3
3.1. Syntax . . . . . . . . . . . . . . . . . . . . . . . . . 3
3.2. General Definition . . . . . . . . . . . . . . . . . . . 3
3.2.1. Modified Algorithm . . . . . . . . . . . . . . . . . 4
3.3. Example . . . . . . . . . . . . . . . . . . . . . . . . . 5
4. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 6
4.1. Recipient Mutations . . . . . . . . . . . . . . . . . . . 7
4.2. Envelope Splitting . . . . . . . . . . . . . . . . . . . 7
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8
6. Security Considerations . . . . . . . . . . . . . . . . . . . 8
7. References . . . . . . . . . . . . . . . . . . . . . . . . . 8
7.1. Normative References . . . . . . . . . . . . . . . . . . 8
7.2. Informative References . . . . . . . . . . . . . . . . . 9
Appendix A. Multiple Signatures . . . . . . . . . . . . . . . . 9
Appendix B. Acknowledgments . . . . . . . . . . . . . . . . . . 10
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 10
1. Introduction
DomainKeys Identified Mail (DKIM) provides a digital signature
mechanism for Internet messages, allowing a domain name owner to
affix its domain name to a message in a way that can be
cryptographically validated so long as the integrity of the message
is preserved in transit.
[RFC4686] presents the original threat model DKIM was meant to
address, and the environment in which it was expected to work.
Notably, DKIM decoupled itself from the transport of the message.
The theory suggests it should be possible to validate a signature
whether a message is in situ (i.e., in an inbox on disk), in transit
between mail servers, or being retrieved through a mailbox access
protocol.
In particular, a DKIM signature can validate irrespective of what is
in the SMTP [RFC5321] envelope carrying the message, or even when
there is no envelope to consider. Consequently, a message and its
signature can be re-sent to anyone simply by changing the set of
recipients in the envelope and passing the message back to a Mail
Transport Agent (MTA) or Mail Submission Agent (MSA). As the message
itself is unaltered, any DKIM signature(s) on it will continue to
validate. This is a form of replay attack, and it relies for its
success on the perceived value (i.e., reputation) of the domain(s)
named in the signature(s).
This document describes a mechanism by which a signature and a
message can be coupled such that successful replays to other
recipient sets are not possible, as the signature will no longer
validate.
2. Definitions
2.1. Recommended Reading
Several terms used in this document are based on their definitions in
[RFC5598].
The term "envelope recipient" is, using the notation proposed in that
document, an RFC5321.RcptTo address.
2.2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
3. The 'e' Tag
3.1. Syntax
Using ABNF [RFC5234], the syntax for the new tag is:
sig-e-tag = %x65 [FWS] "=" %x79
3.2. General Definition
This section introduces the "e" (for "envelope") tag, a new DKIM
signature tag that a signer can use to indicate that the signature
will only validate for a specific envelope recipient set, namely the
one associated with the message at the time it was signed.
DKIM signers and verifiers to date have no reason to be interested in
any aspect of the envelope used to transport a message. This sort of
verification is not possible without that context being available,
which may prove to be a challenge to some operating environments.
Also, this will make it impossible to validate a DKIM signature using
this algorithm in a context where no envelope exists, such as when
retrieving a message from a mailbox.
The expected value of the tag is simply the character "y", though
other values may be introduced by future work. The value has no
particular meaning; the presence of the tag is the important signal.
[FOR DISCUSSION] Maybe this should be "r", indicating "recipients",
to allow later extensions to cover other parts of the envelope that
might be helpful to include.
The presence of this tag in a DKIM signature indicates that the
signer executed a modified version of the algorithm described in
Section 3.7 of [RFC6376], and the verifier MUST do the same. The
modification inserts the envelope recipients available at signing or
verification time into the data fed to the hash algorithm to either
produce or verify the DKIM signature.
3.2.1. Modified Algorithm
This section specifies the modified version of the algorithm defined
in Section 3.7 of [RFC6376].
The pseudo-code of "data-hash" is replaced as follows:
OLD:
data-hash = hash-alg (h-headers, D-SIG, body-hash)
NEW:
data-hash = hash-alg (recipients, h-headers, D-SIG, body-hash)
The definition of "data-hash" is replaced as follows:
OLD:
data-hash: is the output from using the hash-alg algorithm, to hash
the header including the DKIM-Signature header, and the
body hash.
NEW:
data-hash: is the output from using the hash-alg algorithm to hash
the recipients, the header including the DKIM-Signature
header field, and the body hash.
"recipients" is determined as follows:
1. Collect all envelope recipients into a list.
2. Remove any duplicate entries in the list.
3. Sort them in typical lexical ASCII order.
4. Format the list by concatenating them all in this sorted order,
separated by CRLF strings (ASCII 13 followed by ASCII 10), and
with the last one terminated by a CRLF.
The signing and verifying processes defined for DKIM are otherwise
unmodified.
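The four steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. The function name "format_recipients" is hypothetical, and the sketch assumes that addresses are ASCII-only, are supplied without angle brackets, and are de-duplicated by exact string comparison, none of which this document specifies.

```python
def format_recipients(envelope_recipients):
    """Build the "recipients" octets from a list of RCPT TO addresses."""
    # Steps 1 and 2: collect the addresses and drop exact duplicates.
    # Encoding to bytes first makes the later sort an octet-wise ASCII sort.
    unique = {addr.encode("ascii") for addr in envelope_recipients}
    # Step 3: sort in ASCII order.
    ordered = sorted(unique)
    # Step 4: concatenate, terminating every entry (including the last)
    # with CRLF.
    return b"".join(addr + b"\r\n" for addr in ordered)


example = format_recipients(
    ["bob@example.com", "alice@example.com", "bob@example.com"]
)
```

Here "example" holds the two distinct addresses in sorted order, each followed by CRLF. An empty recipient list yields an empty string of octets.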
3.3. Example
Consider the following SMTP transaction, wherein "C" denotes
something sent by an SMTP client, "S" denotes something sent by an
SMTP server, and terminating CRLFs in both directions are omitted:
C: MAIL FROM:<msk@example.net>
S: 250 Sender OK
C: RCPT TO:<bob@example.com>
S: 250 Recipient OK
C: RCPT TO:<alice@example.com>
S: 250 Recipient OK
C: DATA
S: 354 Go ahead
[message header omitted]
[message body omitted]
.
S: 250 Message delivered
In the absence of this tag, the signature would be generated and
verified by the standard process. With the tag present, the process
for the transaction above is the same, except that the content fed
to the hash algorithm is preceded by:
alice@example.com<CR><LF>
bob@example.com<CR><LF>
4. Discussion
Use of this tag guarantees that a signature will not verify unless
the message is sent to exactly the same set of envelope recipients as
was present in the envelope when the message was signed. The fact
that the recipient set is sorted allows verifiers to tolerate any
reordering of the envelope that may be done in transit. However, if
any original recipient is removed, or any new recipient is added, the
signature will not validate because the content passed to the hash
step at the verifier will differ from what was done at the signer.
Thus, in the replay scenario described in Section 1, the signature no
longer validates.
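The behavior described above can be illustrated with the following Python sketch. It is a simplification under stated assumptions: SHA-256 stands in for "hash-alg", the constant SIGNED_INPUT is a placeholder for the canonicalized header fields, DKIM-Signature field, and body hash that a real implementation would supply, and the function names are hypothetical. A real verifier would check a public-key signature over the digest rather than compare digests directly.

```python
import hashlib


def format_recipients(rcpts):
    # De-duplicate, sort in ASCII order, and CRLF-terminate each address.
    unique = {r.encode("ascii") for r in rcpts}
    return b"".join(addr + b"\r\n" for addr in sorted(unique))


def data_hash(rcpts, signed_input):
    # The recipients are fed to the hash ahead of the usual DKIM input.
    h = hashlib.sha256()
    h.update(format_recipients(rcpts))
    h.update(signed_input)
    return h.digest()


# Placeholder for the canonicalized data a real signer would hash.
SIGNED_INPUT = b"placeholder for h-headers, D-SIG, and body-hash"

at_signing = data_hash(["bob@example.com", "alice@example.com"], SIGNED_INPUT)
reordered = data_hash(["alice@example.com", "bob@example.com"], SIGNED_INPUT)
replayed = data_hash(["victim@example.org"], SIGNED_INPUT)
extended = data_hash(
    ["alice@example.com", "bob@example.com", "victim@example.org"],
    SIGNED_INPUT,
)
```

In this sketch "reordered" equals "at_signing", while "replayed" and "extended" do not, which mirrors the cases discussed in the paragraph above.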
Anecdotal evidence suggests that the bulk of Internet message traffic
is already single-recipient traffic, which would favor the success of
this proposal. However, since the messaging standards both permit
and even encourage the "common factoring" of multiple recipients into
a single envelope (see Section 4.5.4.1 of [RFC5321]), and this
evidence has not been broadly verified, it is appropriate to consider
all possibilities.
In the absence of an SMTP envelope in the verification environment,
the DKIM implementation SHOULD indicate that the signature cannot be
verified, as distinct from considering such validation to have
failed. Legacy implementations may not be capable of this, however.
If the ability to validate a signature from storage (without an
envelope) needs to be preserved, the signer can still add a second
signature not using this tag, which therefore does not need the
envelope context to verify. This, however, requires the verifier to
understand when it is appropriate to use which signature and how to
interpret their results. There may be a solution in this space via
use or extension of the Authentication-Results header field
[RFC8601].
Since [RFC6376] stipulates that unknown tags are to be ignored, there
will be a possibly substantial time period during which the tag is
unknown to receivers. Legacy verifiers will thus ignore the tag but
still process the signature, leading to a failure result. Operators
should thus expect these signatures to fail broadly during any early
deployment period, even for non-replay messages, and it may be some
time before meaningful signal begins to appear.
Note that this mechanism is fragile in the modern Internet message
ecosystem. Some scenarios that will yield false negatives with this
method are described in subsections below. Analysis has shown that
it is likely beneficial to include both a conventional DKIM signature
and one using this modification on a message. This produces
additional signal, rather than interfering with the signal previously
available. See Appendix A for further discussion.
4.1. Recipient Mutations
If a receiving MTA notes that one of the envelope recipients refers
to a mailbox in a domain for which it has administrative authority,
but is known to be an alias, it may rewrite that address into its
canonical form. For instance, if a receiving MTA is officially known
as the mail server for "example.com", but also accepts mail for its
users when addressed to "example.net", it may alter that latter
address in the envelope to refer to its canonical name. This alters
the recipient list, and thus alters the content passed to the hash
algorithm when validating the signature, leading to a failure.
Since hostnames are generally case-insensitive on the Internet, a
relay MTA might (improperly) fold a hostname to lowercase. This too
would invalidate a signature making use of this protocol.
[FOR DISCUSSION] A mitigation strategy here would be to pass the
domain part of the address after converting it all to lowercase.
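The mitigation floated above could look like the following Python sketch. It is hypothetical, since this document does not adopt the idea: the function name "normalize_recipient" is an assumption, and the sketch would be applied to each address before the list is de-duplicated and sorted. The local part is left untouched because only the domain part is treated as case-insensitive here.

```python
def normalize_recipient(addr):
    """Lowercase only the domain part of an envelope recipient."""
    # Split on the last "@" so that a quoted local part containing "@"
    # is not mistaken for the domain.
    local, sep, domain = addr.rpartition("@")
    if not sep:
        # No domain part present (e.g. "postmaster"); leave unchanged.
        return addr
    return local + "@" + domain.lower()


folded = normalize_recipient("Bob@EXAMPLE.Com")
```

With this normalization, a relay that folds "EXAMPLE.Com" to "example.com" no longer changes the hashed recipient data, although alias rewriting as described earlier in this section still would.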
4.2. Envelope Splitting
If a message contains envelope recipients at domains served by
separate MTAs, [RFC5321] compels the handling MTA to split the
message, creating multiple envelopes with different recipient subsets
yet identical header and body content. The first of these will be
addressed to one recipient and sent on its way; the second will be
addressed to another and sent via its own route; etc.
Upon arrival at a DKIM verifier, the recipient list has effectively
been altered since signing. This alters the content passed to the
hash algorithm when validating the signature, leading to a failure.
This can be avoided by arranging that no envelope ever has more than
a single recipient, but this renders useless an important "common
factoring" feature of SMTP. In the case of a mailing list server
that may need to distribute a single message to a very large number
of recipients, this method can impose significant compute or storage
costs.
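The splitting failure described in this section can be illustrated with a short Python sketch. It assumes, for simplicity, that a handling MTA splits strictly by recipient domain; real MTAs split by next-hop server, which may group several domains together. The function names are hypothetical.

```python
from collections import defaultdict


def format_recipients(rcpts):
    # De-duplicate, sort in ASCII order, and CRLF-terminate each address.
    unique = {r.encode("ascii") for r in rcpts}
    return b"".join(addr + b"\r\n" for addr in sorted(unique))


def split_by_domain(rcpts):
    # Group recipients into one envelope per destination domain
    # (a simplification of next-hop routing).
    groups = defaultdict(list)
    for rcpt in rcpts:
        domain = rcpt.rpartition("@")[2].lower()
        groups[domain].append(rcpt)
    return dict(groups)


signed_for = ["alice@example.com", "bob@example.org"]
signed_value = format_recipients(signed_for)

# What each downstream verifier would compute from its own envelope.
seen_by_verifiers = {
    domain: format_recipients(group)
    for domain, group in split_by_domain(signed_for).items()
}
```

Neither value in "seen_by_verifiers" matches "signed_value", so both copies fail verification even though no replay took place.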
5. IANA Considerations
IANA is asked to make the following entry in the "DKIM-Signature Tag
Specifications" sub-registry of the "DKIM Parameters" registry group:
Type: e
Reference: [this document]
Status: active
6. Security Considerations
All of the security considerations of [RFC6376] apply when applying
the modification described here.
A signer that is forced to generate independently signed messages for
each recipient in a situation where large recipient lists are common
could be exploited to mount a denial-of-service attack, simply because
the amount of work being done is amplified.
The loss of the ability to verify messages signed using this tag when
extracted from their mailboxes will have unknown security impact.
Although DKIM intentionally supports this capability, it is not known
whether it is widely used.
7. References
7.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", STD 68, RFC 5234,
DOI 10.17487/RFC5234, January 2008,
<https://www.rfc-editor.org/info/rfc5234>.
[RFC5321] Klensin, J., "Simple Mail Transfer Protocol", RFC 5321,
DOI 10.17487/RFC5321, October 2008,
<https://www.rfc-editor.org/info/rfc5321>.
[RFC6376] Crocker, D., Ed., Hansen, T., Ed., and M. Kucherawy, Ed.,
"DomainKeys Identified Mail (DKIM) Signatures", STD 76,
RFC 6376, DOI 10.17487/RFC6376, September 2011,
<https://www.rfc-editor.org/info/rfc6376>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119
Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
7.2. Informative References
[RFC4686] Fenton, J., "Analysis of Threats Motivating DomainKeys
Identified Mail (DKIM)", RFC 4686,
DOI 10.17487/RFC4686, September 2006,
<https://www.rfc-editor.org/info/rfc4686>.
[RFC5598] Crocker, D., "Internet Mail Architecture", RFC 5598,
DOI 10.17487/RFC5598, July 2009,
<https://www.rfc-editor.org/info/rfc5598>.
[RFC8601] Kucherawy, M., "Message Header Field for Indicating Message
Authentication Status", RFC 8601, DOI 10.17487/RFC8601,
May 2019, <https://www.rfc-editor.org/info/rfc8601>.
Appendix A. Multiple Signatures
The email ecosystem has seen broad adoption of DKIM to date. This
means validating signatures already provide useful signal in many
cases, and an important property of DKIM is that this signal survives
changes to the message envelope that might occur as described in
Section 4.
Switching to this proposal would solve the replay problem at the
expense of DKIM's broader success to date. Naturally, this is not
desirable.
Analysis suggests that a hybrid approach is possible. That is: A
signer affixes a typical "pure" DKIM signature and then in addition
adds one using this proposal. If we call these signatures A and B,
respectively, then there is no loss of signal, only a gain, as
follows:
+------+------+------------------------------------------------+
| A | B | Meaning |
+------+------+------------------------------------------------+
| fail | fail | No conclusions possible |
+------+------+------------------------------------------------+
| fail | pass | Should never occur |
+------+------+------------------------------------------------+
| pass | fail | Message arrived intact; may have been replayed |
+------+------+------------------------------------------------+
| pass | pass | Message arrived intact and was not replayed |
+------+------+------------------------------------------------+
In particular, if the experimental signature fails while the
conventional one does not, we cannot draw a conclusion about replay,
but all of the original signal provided by the conventional signature
is still available. However, if both signatures pass, we are certain
no replay occurred.
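A receiver could encode the table above as a small decision function, sketched here in Python. The function and parameter names are hypothetical; the returned strings are taken directly from the "Meaning" column, and how a receiver acts on each outcome is left to local policy.

```python
def interpret(conventional_passes, envelope_bound_passes):
    """Map results for signatures A (conventional) and B (using the "e"
    tag) to the meaning given in the table."""
    if conventional_passes and envelope_bound_passes:
        return "Message arrived intact and was not replayed"
    if conventional_passes:
        return "Message arrived intact; may have been replayed"
    if envelope_bound_passes:
        # B covers everything A covers plus the recipients, so B
        # passing while A fails is not an expected outcome.
        return "Should never occur"
    return "No conclusions possible"


outcome = interpret(conventional_passes=True, envelope_bound_passes=False)
```

In this usage, "outcome" is the third row of the table: the conventional signal is retained, and the failing signature B only adds the possibility of replay.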
Appendix B. Acknowledgments
The author wishes to thank Dave Crocker for his contributions to this
work.
Author's Address
Murray S. Kucherawy (editor)
Email: superuser@gmail.com