<?xml version="1.0" encoding="utf-8"?>
<?xml-model href="rfc7991bis.rnc"?>

<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>

<rfc
  xmlns:xi="http://www.w3.org/2001/XInclude"
  category="bcp"
  docName="draft-li-dnsop-resolver-resilience-02"
  ipr="trust200902"
  obsoletes=""
  updates=""
  submissionType="IETF"
  consensus="true"
  xml:lang="en"
  version="3">

  <front>
    <title abbrev="DNS Resolver Resilience">Best Current Practices for DNS Resolver Resilience Against Coordinated Amplification Attacks</title>

    <seriesInfo name="Internet-Draft" value="draft-li-dnsop-resolver-resilience-02"/>

    <author fullname="Xiang Li" initials="X.L." surname="Li">
      <organization>Nankai University</organization>
      <address>
        <postal>
          <street>38 Tongyan Road</street>
          <city>Tianjin</city>
          <region>Tianjin</region>
          <code>300355</code>
          <country>China</country>
        </postal>
        <email>lixiang@nankai.edu.cn</email>
      </address>
    </author>

    <author fullname="Yuqi Qiu" initials="Y.Q." surname="Qiu">
      <organization>Nankai University</organization>
      <address>
        <postal>
          <street>38 Tongyan Road</street>
          <city>Tianjin</city>
          <region>Tianjin</region>
          <code>300355</code>
          <country>China</country>
        </postal>
        <email>norahqiu@163.com</email>
      </address>
    </author>

    <date year="2026" month="February" day="28"/>

    <area>ops</area>
    <workgroup>dnsop</workgroup>

    <keyword>DNS</keyword>
    <keyword>DoS</keyword>
    <keyword>PDoS</keyword>
    <keyword>DNSBomb</keyword>
    <keyword>Amplification</keyword>
    <keyword>Resilience</keyword>

    <abstract>
      <t>This document describes an attack vector, exemplified by the "DNSBomb"
   attack, that leverages the emergent behavior of several widely
   implemented DNS resolver mechanisms. By combining query timeouts, query
   aggregation, and response timing, an attacker can turn a set of
   resolvers into powerful amplifiers for a Pulsing Denial-of-Service
   (PDoS) attack. This attack is difficult to detect due to its low
   average traffic rate but can be highly effective at overwhelming a
   target's resources.</t>
      <t>This document provides operational guidance and a set of best
   practices for DNS resolver implementers and operators to mitigate this
   threat. The goal is to harden the DNS ecosystem by reducing the
   potential for resolvers to be used in such a coordinated fashion,
   thereby improving the operational resilience of the DNS.</t>
    </abstract>

  </front>

  <middle>

    <section>
      <name>Introduction</name>
      <t>The Domain Name System (DNS) <xref target="RFC1034"/> <xref target="RFC1035"/> has long been used as a vector for reflection
   and amplification attacks <xref target="RFC5358"/>. A sophisticated variant, the
   Pulsing Denial-of-Service (PDoS) attack <xref target="Shrew"/>, uses
   intermittent, high-volume traffic bursts. This pattern makes PDoS
   attacks challenging to detect with conventional traffic analysis, yet
   they remain highly effective.</t>
      <t>The "DNSBomb" attack <xref target="DNSBomb"/> demonstrates a practical method for
   generating such bursts by exploiting the combined, emergent behavior
   of standard resolver features. The attack model does not rely on a
   single protocol vulnerability but on the operational ambiguity in how
   resolvers should handle a specific sequence of events: a large number
   of queries from a single source for a domain whose authoritative
   server is slow to respond.</t>
      <t>This document specifies best practices for resolver implementations
   and configurations to mitigate this and similar attack vectors. These
   practices are designed to limit the ability of an attacker to
   accumulate and concentrate responses without negatively impacting
   legitimate use cases.</t>
      <t>This document acknowledges that the diversity of DNS implementations is
   a strength and not a weakness. The exact mitigations detailed herein are
   provided as operational guidance and Best Current Practices, rather than
   rigid Internet Standards. Implementers are encouraged to adapt these
   mechanisms to suit their specific architectures.</t>

      <section>
        <name>Requirements Language</name>
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL",
          "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT
          RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
          interpreted as described in BCP 14 <xref target="RFC2119"/>
          <xref target="RFC8174"/> when, and only when, they appear in
          all capitals, as shown here.</t>
      </section>
    </section>

    <section>
      <name>Terminology</name>
      <dl newline="true">
        <dt>Pulsing DoS (PDoS) Attack:</dt>
        <dd>A Denial-of-Service attack characterized
   by intermittent, short bursts of high-volume traffic separated by
   periods of little or no attack traffic.</dd>
        <dt>Query Accumulation:</dt>
        <dd>An attack phase where a resolver receives and holds
   numerous queries, typically from a spoofed source IP address, while awaiting
   a delayed response from a malicious authoritative nameserver.</dd>
        <dt>Response Concentration:</dt>
        <dd>The near-simultaneous transmission of a large
   number of DNS responses from a resolver to a single target. This is
   the culmination of the attack, forming the traffic pulse.</dd>
        <dt>Response Pacing:</dt>
        <dd>A mitigation technique whereby a resolver
   deliberately de-synchronizes the transmission of a large batch of
   responses to a single client to prevent a traffic spike.</dd>
      </dl>
    </section>

    <section>
        <name>Attack Model</name>
        <t>The attack model assumes the adversary can send IP-spoofed DNS
   queries and controls an authoritative nameserver for a domain. The
   attack proceeds in three phases:</t>
        <ol type="1">
          <li><strong>Accumulation:</strong> The attacker sends a low-rate stream of
   queries for unique subdomains of their controlled domain to one or
   more recursive resolvers. The source IP address is spoofed to that of
   the victim. The attacker's authoritative server receives the upstream
   queries from the resolver but deliberately withholds its response. The
   resolver's query timeout window (potentially extended by IP fragment
   reassembly timeouts <xref target="RFC0791"/> <xref target="RFC8200"/>) becomes the accumulation
   period.</li>
          <li><strong>Amplification:</strong> The attacker leverages query aggregation within the
   resolver to minimize the upstream query load on their authoritative
   server. When the attacker finally responds, it sends a large response,
   using EDNS(0) <xref target="RFC6891"/> to maximize the payload size. This single
   large response will be used as the basis for responding to all
   accumulated queries.</li>
          <li><strong>Concentration:</strong> Upon receiving the single, delayed response, the
   resolver unblocks all pending client-side queries. Due to optimizations
   for low latency, many resolvers will transmit all of these responses
   to the victim's IP address nearly simultaneously, creating a powerful,
   concentrated traffic pulse.</li>
        </ol>
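        <t>The gap between the low average rate and the high peak rate of
   such a pulse can be illustrated with simple arithmetic. The following
   is a minimal sketch only: the function name and all four input values
   are assumptions chosen for illustration and are not measurements from
   any resolver or from the cited research.</t>
        <sourcecode type="python"><![CDATA[
```python
def pulse_profile(query_rate_qps: float,
                  accumulation_window_s: float,
                  response_bytes: int,
                  burst_window_s: float) -> dict:
    """Estimate the traffic the victim sees for one pulse.

    All inputs are illustrative assumptions, not measured values.
    """
    # Queries the resolver is holding when the delayed answer arrives.
    accumulated = int(query_rate_qps * accumulation_window_s)
    # Every held query is answered with a copy of the large response.
    total_bits = accumulated * response_bytes * 8
    # The responses are compressed into the short concentration window ...
    peak_bps = total_bits / burst_window_s
    # ... but averaged over the whole cycle the traffic looks modest.
    average_bps = total_bits / (accumulation_window_s + burst_window_s)
    return {
        "accumulated": accumulated,
        "peak_bps": peak_bps,
        "average_bps": average_bps,
    }


if __name__ == "__main__":
    # Hypothetical inputs: 100 qps held for 10 s, 1232-byte responses,
    # all released within 10 ms.
    print(pulse_profile(100.0, 10.0, 1232, 0.01))
```
]]></sourcecode>
        <t>In this simplified model, the peak-to-average ratio equals the
   sum of the accumulation and burst windows divided by the burst window.
   This is why both a long accumulation period and a tightly synchronized
   release contribute to the strength of the pulse.</t>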
    </section>

    <section>
      <name>Problem Statement</name>
      <t>This attack vector arises from an operational ambiguity in current
   DNS specifications. While features like query timeouts, aggregation,
   and fast response are individually beneficial for performance and
   resilience, their interaction under specific, maliciously crafted
   conditions is not well defined. Resolvers lack clear guidance on how
   to differentiate between a legitimate, large-scale query event (e.g.,
   from a large NAT) and a coordinated attack. This document aims to
   provide that guidance to reduce the potential for exploitation.</t>
    </section>

    <section>
      <name>Mitigation Strategies and Operational Guidance</name>
      <t>To mitigate this attack vector, this document recommends a set of
   interrelated strategies for resolver software and its operation.</t>
      <section>
        <name>Response Pacing</name>
        <t>The most direct mitigation for the response concentration phase is
   Response Pacing. When a resolver is about to send a large number of
   responses to a single client IP address in a short time window (e.g.,
   as a result of a single upstream answer), it SHOULD introduce a small,
   randomized delay (jitter) between successive response transmissions.
        </t>
        <t>This technique de-synchronizes the response burst, spreading it
   out over time and reducing its peak bandwidth. The total delay should
   be carefully calibrated to avoid a significant performance impact on
   legitimate clients.</t>
        <t><strong>Operational Trade-offs:</strong> This mechanism may introduce minor
   latency for legitimate clients behind large-scale NATs. The pacing
   algorithm should be configurable and potentially adaptive based on
   the number of responses in the queue.</t>
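        <t>One possible shape for such a pacing algorithm is sketched below.
   It is not taken from any existing resolver. The function name, the
   burst threshold, the per-gap jitter bound, and the total delay budget
   are all hypothetical parameters that an implementation would need to
   tune. Small batches are sent immediately; larger batches receive
   cumulative random gaps that shrink as the batch grows, so that the
   last response stays within the delay budget.</t>
        <sourcecode type="python"><![CDATA[
```python
import random
from typing import List, Optional


def pacing_offsets(batch_size: int,
                   burst_threshold: int = 32,
                   max_gap_s: float = 0.001,
                   max_total_delay_s: float = 0.2,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Return a send-time offset (seconds) for each response in a batch
    destined for one client address. All defaults are assumptions."""
    rng = rng or random.Random()
    # Small batches are normal traffic: send them without added delay.
    if batch_size <= burst_threshold:
        return [0.0] * batch_size
    # Adapt the jitter bound to the queue length so that the final
    # response is never delayed beyond the configured budget.
    gap_cap = min(max_gap_s, max_total_delay_s / batch_size)
    offsets = []
    t = 0.0
    for _ in range(batch_size):
        offsets.append(t)
        # A random gap de-synchronizes successive transmissions.
        t += rng.uniform(0.0, gap_cap)
    return offsets


if __name__ == "__main__":
    schedule = pacing_offsets(1000, rng=random.Random(7))
    print("last response delayed by", schedule[-1], "seconds")
```
]]></sourcecode>
        <t>A caller would hand each offset to its event loop or timer wheel
   rather than sleeping in-line. Bounding the total delay keeps the
   latency cost for clients behind large-scale NATs predictable.</t>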
      </section>
      <section>
        <name>Guidance on Timeout Values</name>
        <t>Long upstream query timeouts provide a larger window for query
   accumulation. It is RECOMMENDED that resolver operators configure
   shorter timeouts for queries to authoritative servers. A value
   between 1.5 and 3 seconds is generally sufficient to accommodate
   most network conditions without providing an excessive window for
   attackers.</t>
        <t>Resolver software MAY also implement adaptive timeouts. For example,
   if an authoritative server is consistently slow, the resolver could
   dynamically shorten the timeout for subsequent queries to it.</t>
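        <t>The adaptive behavior described above could be sketched as
   follows. This is an illustration under stated assumptions rather than
   a description of any deployed resolver: the class name, the method
   names, and the decay factor are hypothetical, while the 3-second
   starting value and 1.5-second floor reuse the range suggested in this
   section.</t>
        <sourcecode type="python"><![CDATA[
```python
class AdaptiveTimeout:
    """Per-server upstream timeout that shrinks while a server stays slow."""

    def __init__(self, base_s: float = 3.0, floor_s: float = 1.5,
                 decay: float = 0.8) -> None:
        self.base_s = base_s      # timeout for a server with no history
        self.floor_s = floor_s    # never go below this value
        self.decay = decay        # hypothetical shrink factor per slow event
        self._slow_streak = {}    # server address -> consecutive slow events

    def timeout_for(self, server: str) -> float:
        streak = self._slow_streak.get(server, 0)
        return max(self.floor_s, self.base_s * (self.decay ** streak))

    def record_slow(self, server: str) -> None:
        """Call when a query to `server` timed out or answered late."""
        self._slow_streak[server] = self._slow_streak.get(server, 0) + 1

    def record_answer(self, server: str) -> None:
        """Call when `server` answered promptly; restores the base value."""
        self._slow_streak.pop(server, None)


if __name__ == "__main__":
    timeouts = AdaptiveTimeout()
    server = "192.0.2.53"  # documentation address, used as a placeholder
    for _ in range(3):
        timeouts.record_slow(server)
    print(timeouts.timeout_for(server))
```
]]></sourcecode>
        <t>Resetting on the first prompt answer is one of several possible
   recovery policies; a more cautious implementation might restore the
   timeout gradually.</t>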
      </section>
      <section>
        <name>Limiting Query Accumulation</name>
        <t>Resolvers SHOULD implement a mechanism to limit the number of
   pending queries that can be accumulated per source IP address (or
   prefix). A configurable limit on the number of outstanding queries
   from a single source directly caps the scale of the accumulation
   phase.</t>
        <t>Once this limit is reached, the resolver SHOULD either drop new
   queries from that source or respond immediately with an appropriate
   error code (e.g., REFUSED) until some of the pending queries are
   resolved. This is preferable to holding an unbounded number of
   queries.</t>
        <t><strong>Operational Trade-offs:</strong> A limit that is too low could affect
   service for users behind large-scale NATs. This limit should be
   configurable by the operator.</t>
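        <t>A minimal sketch of such a limit is shown below, keyed on the
   source prefix rather than the individual address. The class name, the
   default limit of 100, and the /24 and /56 aggregation lengths are
   assumptions made for illustration only; this document does not
   prescribe specific values.</t>
        <sourcecode type="python"><![CDATA[
```python
import ipaddress
from collections import Counter


class PendingQueryLimiter:
    """Caps the number of outstanding queries per source prefix."""

    def __init__(self, max_pending: int = 100,
                 v4_prefix: int = 24, v6_prefix: int = 56) -> None:
        self.max_pending = max_pending
        self.v4_prefix = v4_prefix
        self.v6_prefix = v6_prefix
        self._pending = Counter()  # prefix -> outstanding query count

    def _key(self, source_ip: str):
        addr = ipaddress.ip_address(source_ip)
        plen = self.v4_prefix if addr.version == 4 else self.v6_prefix
        # strict=False masks the host bits to obtain the covering prefix.
        return ipaddress.ip_network(f"{addr}/{plen}", strict=False)

    def admit(self, source_ip: str) -> bool:
        """Return True if the query may be held; False means the caller
        should drop it or answer at once (e.g., with REFUSED)."""
        key = self._key(source_ip)
        if self._pending[key] >= self.max_pending:
            return False
        self._pending[key] += 1
        return True

    def release(self, source_ip: str) -> None:
        """Call when a held query has been answered or has timed out."""
        key = self._key(source_ip)
        if self._pending[key] > 0:
            self._pending[key] -= 1
            if self._pending[key] == 0:
                del self._pending[key]  # keep the table small
```
]]></sourcecode>
        <t>Keying on a prefix rather than a single address limits an attacker
   who rotates spoofed addresses within the victim's network, at the cost
   of sharing the budget among legitimate neighbors in that prefix.</t>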
      </section>
      <section>
        <name>EDNS(0) Buffer Size</name>
        <t>To limit the amplification factor, it is a standing best practice
   for resolver operators to configure a conservative EDNS(0) UDP buffer
   size. A value of 1232 bytes is RECOMMENDED, as this avoids IP
   fragmentation on most network paths. Operators SHOULD NOT configure
   larger values without a specific and compelling operational
   requirement.</t>
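        <t>The effect of a conservative buffer size on an individual UDP
   response can be sketched as follows. The function names are
   hypothetical. The 512-octet floor follows <xref target="RFC6891"/>,
   which requires advertised values below 512 to be treated as 512, and
   the 1232-byte ceiling is the value recommended in this section.</t>
        <sourcecode type="python"><![CDATA[
```python
from typing import Optional

MIN_UDP_PAYLOAD = 512    # limit when no OPT record is present, and the
                         # floor for any advertised value (RFC 6891)
RECOMMENDED_MAX = 1232   # operator-configured ceiling from this section


def effective_udp_limit(peer_advertised: Optional[int],
                        local_max: int = RECOMMENDED_MAX) -> int:
    """Largest UDP response, in octets, to send to this peer."""
    if peer_advertised is None:
        # Query carried no EDNS(0) OPT record.
        return MIN_UDP_PAYLOAD
    # Honor the peer's size, but never exceed the local ceiling.
    return min(max(peer_advertised, MIN_UDP_PAYLOAD), local_max)


def must_truncate(response_len: int,
                  peer_advertised: Optional[int],
                  local_max: int = RECOMMENDED_MAX) -> bool:
    """True if the response should be sent truncated (TC bit set) so
    that the peer retries over TCP instead of receiving a large datagram."""
    return response_len > effective_udp_limit(peer_advertised, local_max)


if __name__ == "__main__":
    # A peer advertising 4096 is still capped at the local ceiling.
    print(effective_udp_limit(4096), must_truncate(1400, 4096))
```
]]></sourcecode>
        <t>Because truncated responses push the client to TCP, which cannot
   be driven from a spoofed source address, the ceiling bounds the
   per-response payload available to an attacker.</t>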
      </section>
    </section>

    <section anchor="Security">
      <name>Security Considerations</name>
      <t>The practices described in this document are designed to mitigate a
   specific attack vector and are not a complete solution for all
   DNS-based DoS attacks. The effectiveness of these mitigations relies on
   their combined deployment.</t>
      <t>Source address validation remains the most fundamental defense
   against attacks requiring IP spoofing. Network operators are strongly
   urged to implement ingress filtering as described in BCP 38
   <xref target="RFC2827"/> and BCP 84 <xref target="RFC3704"/>.</t>
      <t>The mitigations proposed herein involve operational trade-offs
   between security and performance. For example, Response Pacing adds
   latency, and strict query accumulation limits may impact legitimate
   users. Operators must be able to configure these parameters to suit
   their specific environment. The default settings in resolver software
   should prioritize resilience.</t>
      <t>While these measures make individual resolvers more resilient, a
   sufficiently motivated attacker could still achieve a significant
   impact by coordinating a very large number of unpatched or misconfigured
   resolvers. Therefore, broad adoption of these best practices across
   the community is essential for improving the overall security posture
   of the DNS.</t>
    </section>

    <section anchor="IANA">
      <name>IANA Considerations</name>
      <t>This document has no IANA actions.</t>
    </section>

    <section anchor="Contributors" numbered="false">
      <name>Contributors</name>
      <t>The authors of the "DNSBomb" paper, Dashuai Wu, Haixin Duan, and Qi Li,
      provided the foundational research for the attack vector described in this
      document.</t>
    </section>

  </middle>

  <back>
    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>

        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>

      </references>

      <references>
        <name>Informative References</name>

        <reference anchor="DNSBomb" target="https://www.researchgate.net/publication/376355184">
          <front>
            <title>DNSBOMB: A New Practical-and-Powerful Pulsing DoS Attack Exploiting DNS Queries-and-Responses</title>
            <author initials="X." surname="Li"/>
            <author initials="D." surname="Wu"/>
            <author initials="H." surname="Duan"/>
            <author initials="Q." surname="Li"/>
            <date year="2024" month="May"/>
          </front>
        </reference>

        <reference anchor="Shrew">
          <front>
            <title>Low-rate TCP-targeted denial of service attacks</title>
            <author initials="A." surname="Kuzmanovic">
                <organization>Rice University</organization>
            </author>
            <author initials="E." surname="Knightly">
                <organization>Rice University</organization>
            </author>
            <date year="2003"/>
          </front>
           <seriesInfo name="ACM SIGCOMM Computer Communication Review" value="vol. 33, no. 4, pp. 75-86"/>
        </reference>

        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.0791.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.1034.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.1035.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2827.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3704.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5358.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6891.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8200.xml"/>

      </references>
    </references>
  </back>
</rfc>